I am seeing unexpected behavior from Cython (v. 0.29.13) when compiling OpenMP parallel code that is expected to perform a reduction:
import cython
from cython.parallel import prange, parallel

cpdef omp_test1(n):
    cdef int sum = 0, i, imax
    imax = <int>n
    for i in prange(imax, nogil=True):
        sum += 1
    return sum

cpdef omp_test2(n):
    cdef int sum = 0, i, imax
    imax = <int>n
    with nogil, parallel():
        for i in prange(imax):
            sum += 1
    return sum
When calling either function, I would expect the return value to be the input argument n. Instead, the returned value is n * num_threads. Surprisingly, on my Mac this behavior is observed only with GCC (v. 9.2.0_2); clang (v. 11.0.0) returns the expected value n.
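To make the symptom concrete, here is a small pure-Python model of the arithmetic. It is not the C code that Cython generates. The function names `expected_reduction` and `observed_reduction` are hypothetical and exist only for this illustration. The second function is just one reading that is consistent with a result of n * num_threads (every thread running all iterations before the partial sums are combined). I have not verified that this is what the GCC build actually does.

```python
def expected_reduction(n, num_threads):
    """Work-sharing model: the n iterations are split across the threads,
    each thread keeps a private partial sum, and the partial sums are added."""
    partial = [0] * num_threads
    for i in range(n):
        # Each iteration is executed by exactly one (simulated) thread.
        partial[i % num_threads] += 1
    return sum(partial)


def observed_reduction(n, num_threads):
    """Hypothetical model of the symptom: every (simulated) thread executes
    all n iterations, so the combined sum is n * num_threads."""
    partial = [0] * num_threads
    for t in range(num_threads):
        for _ in range(n):
            partial[t] += 1
    return sum(partial)


if __name__ == "__main__":
    n, num_threads = 10, 4
    print("expected:", expected_reduction(n, num_threads))
    print("observed:", observed_reduction(n, num_threads))
```

With n = 10 and 4 threads, the first model gives n and the second gives n * num_threads, which matches what I get from clang and GCC respectively.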
Am I doing something wrong, or is there a problem with the generated Cython code or with the GCC compiler?