Strange results using atomic in OMP (gfortran)

While testing an example that uses the atomic directive, I got a strange result.
program atomic
use omp_lib
implicit none
integer, parameter :: num_threads = 4, m = 1000000
integer :: thread_num
integer :: i, j, sum1 = 0, sum2 = 0, tic,toc, rate
real:: time
integer, external :: increment
thread_num = 0
!$ call omp_set_num_threads(num_threads)
!////////// ATOMIC ////////////////////////////////////////////////////////////
CALL system_clock(count_rate=rate)
call system_clock(tic)
!$omp parallel do private(thread_num, j) &
!$omp shared(sum1, sum2)
do i = 0 , m-1
!$ thread_num = omp_get_thread_num()
!$omp atomic
sum1 = sum1 + i
sum2 = sum2 + increment(thread_num, i)
end do
!$omp end paralleldo
print*, "sum 1 = ", sum1
print*, "sum 2 = ", sum2
call system_clock(toc)
time = real(toc-tic)/real(rate)
print*, "Time atomic: ", time, 's'
!////////// CRITICAL ////////////////////////////////////////////////////////////
sum1=0; sum2=0
CALL system_clock(count_rate=rate)
call system_clock(tic)
!$omp parallel do private(thread_num, j) &
!$omp shared(sum1, sum2)
do i = 0 , m-1
!$ thread_num = omp_get_thread_num()
!$omp critical
sum1 = sum1 + i
sum2 = sum2 + increment(thread_num, i)
!$omp end critical
end do
!$omp end paralleldo
print*, "sum 1 = ", sum1
print*, "sum 2 = ", sum2
call system_clock(toc)
time = real(toc-tic)/real(rate)
print*, "Time critical: ", time, 's'
end program atomic
integer function increment (thread_num, j)
implicit none
integer, intent(in) :: thread_num, j
! print*, "Function increment run by thread number: ", thread_num
increment = j
end function increment
Using 'm = 10000000' (7 zeros) I get:
sum 1 = -2014260032
sum 2 = -1146784608
Time atomic: 1.13900006 s
sum 1 = -2014260032
sum 2 = -2014260032
Time critical: 4.09000015 s
Using 'm=1000000' (6 zeros) I get:
sum 1 = 1783293664
sum 2 = 1576859165
Time atomic: 0.123999998 s
sum 1 = 1783293664
sum 2 = 1783293664
Time critical: 0.133000001 s
I have two questions:
Why do I get a negative output in the first case?
Why is sum1 not equal to sum2 in the atomic output?
It was compiled using:
gfortran -Wall -Wextra -fopenmp -O2 -Wall -o prog.exe prueba.f90
./prog.exe

Why do I get a negative output in the first case?
Because the sum overflows. A common definition of integer overflow reads:
In computer programming, an integer overflow occurs when an arithmetic
operation attempts to create a numeric value that is outside of the
range that can be represented with a given number of digits – either
higher than the maximum or lower than the minimum representable value
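For m = 10000000 the exact result is 49999995000000, which is bigger than the maximum value representable with a default Integer (32-bit, at most 2147483647) in gfortran. The m = 1000000 result has overflowed as well (the exact sum is 499999500000); it only happens to wrap around to a positive number.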
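The reported values can be checked by emulating 32-bit wraparound. This is a minimal Python sketch, not the original Fortran program. The helper names wrap_int32 and exact_sum are made up for illustration. It assumes that overflow wraps around in two's complement, which matches what gfortran printed here, although the Fortran standard does not guarantee that behaviour.

```python
def wrap_int32(value: int) -> int:
    """Reduce an exact integer to what a wrapping 32-bit signed integer would hold."""
    return (value + 2**31) % 2**32 - 2**31


def exact_sum(m: int) -> int:
    """Exact value of 0 + 1 + ... + (m - 1), computed without overflow."""
    return m * (m - 1) // 2


for m in (1_000_000, 10_000_000):
    total = exact_sum(m)
    # Wrapping once at the end gives the same result as wrapping after every
    # addition, because both are arithmetic modulo 2**32.
    print(m, total, wrap_int32(total))
```

Under that assumption the sketch prints 1783293664 for m = 1000000 and -2014260032 for m = 10000000, the same numbers that the critical version (and sum1 of the atomic version) reported.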
As for the second question:
Why is sum1 not equal to sum2 in the atomic output?
Because the atomic directive applies only to the single statement that immediately follows it:
sum1 = sum1 + i
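The update of sum2 on the next line is not protected, so threads can overwrite each other's updates. The first problem you can solve by using a data type that can represent a wider range of numbers, for example a 64-bit integer (integer(int64) from the iso_fortran_env module). The second problem you can solve by giving each update its own atomic directive: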
!$omp atomic
sum1 = sum1 + i
!$omp atomic
sum2 = sum2 + increment(thread_num, i)

Related

Loop optimization in QB64

I have a question about loop optimization in QB64. Given this loop:
DIM N AS DOUBLE, X(100000000) AS DOUBLE
T! = TIMER
FOR N = 1 to 100000000
IF X(N) THEN
PRINT X(N)
EXIT FOR
END IF
NEXT
PRINT TIMER - T!
is it any faster than:
DIM N AS DOUBLE, X(100000000) AS DOUBLE
T! = TIMER
FOR N = 1 to 100000000
IF X(N) <> 0 THEN
PRINT X(N)
EXIT FOR
END IF
NEXT
PRINT TIMER - T!
EDITED: 09-18-2018 to include variable types

I wrote this code to evaluate your test:
REM Delete REM to enable console runs
REM $CONSOLE:ONLY
REM _DEST _CONSOLE
DIM SHARED N AS DOUBLE, X(100000000) AS DOUBLE
S# = 0: ZC% = 0
T% = 10
IF COMMAND$ <> "" THEN
T% = VAL(COMMAND$)
END IF
IF T% > 999 THEN T% = 999
FOR I% = 1 TO T%
A# = TRYA
B# = TRYB
D# = A# - B#
PRINT USING "Case A ... : #.########"; A#
PRINT USING "Case B ... : #.########"; B#
PRINT USING "Diff ..... : #.########"; D#;
A$ = ""
IF ABS(D#) < 0.00000001 THEN
ZC% = ZC% + 1
A$ = "*"
END IF
S# = S# + A# - B#
PRINT A$
PRINT
REM INKEY$ doesn't work in console mode!
A$ = INKEY$
IF A$ = CHR$(27) THEN
I% = I% + 1: EXIT FOR
END IF
NEXT
PRINT USING "Avrg A - B : #.########"; S# / (I% - 1)
PRINT USING "0 diff:### on ### tryes"; ZC%, (I% - 1)
PRINT
PRINT "Hit a key to exit!"
REM INPUT$ doesn't work in console mode!
A$ = INPUT$(1)
SYSTEM
FUNCTION TRYA#
T# = TIMER
FOR N = 1 TO 100000000
IF X(N) THEN
PRINT X(N)
EXIT FOR
END IF
NEXT
A# = TIMER - T#
TRYA = A#
END FUNCTION
FUNCTION TRYB#
T# = TIMER
FOR N = 1 TO 100000000
IF X(N) <> 0 THEN
PRINT X(N)
EXIT FOR
END IF
NEXT
A# = TIMER - T#
TRYB = A#
END FUNCTION
The two routines are placed in two functions, TRYA and TRYB.
I ran the program with a loop that calls the functions 999 times, and the result is:
Avrg. A - B: 0.00204501
0 diff:359 on 999 tryes
Then I ran it with 10 iterations, and the result is:
Avrg. A - B: -.01640625
0 diff: 1 on 10 tryes
Then I ran it with 15 iterations, and the result is:
Avrg. A - B: 0.00026042
0 diff: 5 on 15 tryes
Because the program runs in a multitasking environment, I don't believe this is a very reliable test, but here are some observations:
In two of the runs, about a third of all iterations showed no difference (0 diff).
In two of the runs, the function TRYA seems slower.
In one run, the function TRYB seems slower.
Looking at these results, I think we may consider the two functions equivalent!
To run more than 10 iterations, start the program from the command line (or set the COMMAND$ parameter in the QB64 menu) like this:
# ./test n
Where n is the number of loops you desire.
The program was compiled using gcc with the -O3 optimization option. (To do this you have to modify the file [/opt/]qb64/internal/c/makeline_lnx.txt)

Recursion code in VBA

I am trying to run this code to calculate Q(n) at different Tn in Equation 16.4 in the attached picture, but it's not giving me the correct output. I would appreciate any help. Note: I have taken delta1 = delta2 = ... = deltan = dt = 1 here, and divided the S term by 10000 because in the equation it is given in basis points, i.e. hundredths of 1%.
Function Bootstrap(S As Range, Z As Range, L As Double) As Double
Dim j As Integer
Dim a As Variant
Dim b As Variant
Dim n As Integer
Dim Q() As Double
Dim sum As Double
Dim P As Double
Dim dt As Double
n = Application.WorksheetFunction.Max(S.Columns.Count, Z.Columns.Count)
a = S.Value
b = Z.Value
dt = 1
sum = 0
ReDim Q(0 To n)
Q(0) = 1
For j = 1 To n - 1
P = (b(1, j) * (L * Q(j - 1) - (L + dt * a(1, n) / 10000) * Q(j))) / (b(1, n) * (L + a(1, n) * dt / 10000)) + Q(n - 1) * L / (L + a(1, n) * dt / 10000)
sum = sum + P
Q(n) = sum
Next j
Bootstrap = sum
End Function

To solve a recursive function you can write it this way, for example
Function Factorial(n As Long) As Long
' n <= 1 also stops the recursion for n = 0 instead of recursing forever
If n <= 1 Then
Factorial = 1
Else
Factorial = n * Factorial(n - 1)
End If
End Function
Yes, a For...Next loop can also do the Factorial calculation, but in your case it's much easier to use a recursive solution.
Besides, Eq 16.4 is intentionally written as a recursive function. It is not written as a summation because that is harder to do. If you are given a summation, then you can apply the For...Next loop solution.
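The remark that a loop and a recursive function can compute the same factorial can be illustrated with a short sketch. It is written in Python rather than VBA, and the function names are chosen here for illustration only.

```python
def factorial_recursive(n: int) -> int:
    """Factorial defined in terms of itself, mirroring the recursive definition."""
    if n <= 1:
        return 1
    return n * factorial_recursive(n - 1)


def factorial_iterative(n: int) -> int:
    """The same value computed with a plain loop."""
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result


print(factorial_recursive(5), factorial_iterative(5))
```

Both calls return 120. The recursive form follows the mathematical definition line by line, which is why it tends to be the more natural choice when the formula itself is given as a recurrence.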
Hope this helps.
EDIT
Function Q(n as long) as double
If n = 1 Then
Q = 5
Else
Q = Z * ( L * Q_t - (L + d * S) * Q(n-1) ) / ( Z * ( L + d * S ) )
End If
End Function
Notice that the function Q keeps calling itself through Q(n-1) when n > 1. That is called a recursive solution.
(Check the formula. I may have copied it wrong.)

Time Complexity of dependant nested loop

I've had a look at similar questions that have been asked, and have asked my classmates for advice but I am questioning the answer.
What's the time complexity of this algorithm?
for (i = 1; i < n; i *= 2)
for (j = 1; j < i; j *= 2)
// c elementary operations
I have been told O(log(n)^2), but from what I've read and tried it looks like O(log(n)*log(log(n))). Any help?

The inner loop runs log_2(i) times for each iteration of the outer loop.
Let's sum that up then
(1) T(n) = log_2(1) + log_2(2) + log_2(4) + log_2(8) + ... + log_2(n)
(2) T(n) = sum { log_2(2^i) | i=0,1,..,log_2(n) }
(3) T(n) = sum { i * log_2(2) | i=0,1,...,log_2(n) }
(4) T(n) = 0 + 1 + ... + log_2(n)
(5) T(n) = (log_2(n) + 1)(log_2(n))/2
(6) T(n) is in O(log_2(n)^2)
Explanation:
(1) -> (2) is simply summation shorthand
(2) -> (3) is because log(a^b) = b*log(a)
(3) -> (4) log_2(2) = 1
(4) -> (5) Sum of arithmetic progression
(5) -> (6) is giving asymptotic notation
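The bound can be checked empirically by counting how often the innermost statement runs. The sketch below is in Python, and the function name count_inner_iterations is invented for this check. Because the loops use strict comparisons (i < n and j < i), the exact count for n = 2^k is k(k-1)/2, one term fewer than the sum in step (1), which does not change the O(log_2(n)^2) conclusion.

```python
def count_inner_iterations(n: int) -> int:
    """Count executions of the innermost statement of the two nested doubling loops."""
    count = 0
    i = 1
    while i < n:          # for (i = 1; i < n; i *= 2)
        j = 1
        while j < i:      # for (j = 1; j < i; j *= 2)
            count += 1
            j *= 2
        i *= 2
    return count


for k in (4, 10, 20):
    n = 2**k
    # Compare the measured count with the closed form k(k-1)/2.
    print(k, count_inner_iterations(n), k * (k - 1) // 2)
```

For k = 4, 10 and 20 the two printed columns agree (6, 45 and 190), so the count grows quadratically in log_2(n) rather than like log(n)*log(log(n)).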

OpenMP block gives false results

I would appreciate your view on where I might have gone wrong using OpenMP.
I parallelized this code in a pretty straightforward way, yet even with a single thread (i.e., call omp_set_num_threads(1)) I get wrong results.
I have checked with Intel Inspector and I do not have a race condition, yet the Inspector tool warned that a thread might access another thread's stack (I get this warning in other code of mine that runs well with OpenMP). I do not think this is the problem.
SUBROUTINE GR(NUMBER_D, RAD_D, RAD_CC, SPECT)
use TERM,only: DENSITY, TEMPERATURE, VISCOSITY, WATER_DENSITY, &
PRESSURE, D_HOR, D_VER, D_TEMP, QQQ, UMU
use SATUR,only: FF, A1, A2, AAA, BBB, SAT
use DELTA,only: DDM, DT
use CONST,only: PI, G
IMPLICIT NONE
INTEGER,INTENT(IN) :: NUMBER_D
DOUBLE PRECISION,INTENT(IN) :: RAD_CC(NUMBER_D), SPECT(NUMBER_D)
DOUBLE PRECISION,INTENT(INOUT) :: RAD_D(NUMBER_D)
DOUBLE PRECISION :: R3, DR3, C2, C0, P, Q, RAD_CR, SAT_CR, C4, A, &
C, D, CC, DD, CC2, DD2, RAD_ST, DRAA, DRA, DM, X1
INTEGER :: I
DDM = 0.0D0
!$OMP PARALLEL DO DEFAULT(SHARED) &
!$OMP PRIVATE(I,R3,DR3,C2,C0,P,Q,SAT,SAT_CR,C4,A) &
!$OMP PRIVATE (C,D,CC,DD,CC2,DD2,RAD_ST,DRAA,DRA,DM,RAD_CR,X1) &
!$OMP REDUCTION (+:DDM)
DO I=1,NUMBER_D
R3 = RAD_CC(I)**3
DR3 = RAD_D(I)**3-R3
IF(DR3.LT.1.0D-100) DR3 = 1.0D-100
C2 = -DSQRT(3.0D0*BBB*R3/AAA)
C0 = -R3
P = -0.3333333333D0*C2**2
Q = C0+0.074074074D0*C2**3
CALL CUBIC(P, Q, RAD_CR)
RAD_CR = RAD_CR - 0.3333333333D0*C2
SAT_CR = DEXP(AAA/RAD_CR-BBB*R3/(RAD_CR**3-R3))-1.0D0
DRA = DT*(SAT+1.0D0-DEXP(AAA/RAD_DROP(I)-BBB*R3/DR3))/ &
(FF*RAD_D(I))
IF(SAT.LT.SAT_CR) THEN
IF(DABS(SAT).LT.1.0D-10) THEN
P = -BBB*R3/AAA
Q = -R3
CALL CUBIC(P, Q, RAD_ST)
GO TO 22
END IF
C4 = DLOG(SAT+1.0D0)
A = -AAA/C4
C = (BBB-C4)*R3/C4
D = -A*R3
P = A*C-4.0D0*D
Q = -(A**2*D+C**2)
CALL CUBIC(P, Q, X1)
CC = DSQRT(A**2+4.D0*X1)
DD = DSQRT(X1**2-4.D0*D)
CC2 = 0.5D0*(A-CC)
IF(SAT.LT.0.0D0) THEN
DD2 = 0.5D0*(X1-DD)
RAD_ST = 0.5D0*(-CC2+DSQRT(CC2**2-4.0D0*DD2))
ELSE
DD2 = 0.5D0*(X1+DD)
RAD_ST = 0.5D0*(-CC2-DSQRT(CC2**2-4.0D0*DD2))
END IF
22 CONTINUE
DRAA = RAD_ST-RAD_D(I)
IF(ABS(DRAA).LT.ABS(DRA)) THEN
DRA = DRAA
DM = 1.3333333333333333D0*PI*WATER_DENSITY* &
(RAD_ST**3-RAD_D(I)**3)
ELSE
DM = 4.0D0*PI*WATER_DENSITY*RAD_D(I)**2*DRA
END IF
DDM = DDM+SPECT(I)*DM
RAD_D(I) = RAD_D(I) + DRA
ELSE
DM = 4.0D0*PI*WATER_DENSITY*RAD_D(I)**2*DRA
DDM = DDM+SPECT(I)*DM
RAD_D(I) = RAD_D(I) + DRA
END IF
END DO
!$OMP END PARALLEL DO
RETURN
END SUBROUTINE GR
SUBROUTINE CUBIC(P, Q, X)
IMPLICIT NONE
DOUBLE PRECISION,INTENT(IN) :: P, Q
DOUBLE PRECISION,INTENT(OUT) :: X
DOUBLE PRECISION :: DIS, PP, COSALFA,ALFA, QQ, U, V
DIS = (P/3.D0)**3+(0.5D0*Q)**2
IF(DIS.LT.0.0D0) THEN
PP = -P/3.0D0
COSALFA = -0.5D0*Q/DSQRT(PP**3)
ALFA = DACOS(COSALFA)
X = 2.0D0*DSQRT(PP)*DCOS(ALFA/3.0D0)
RETURN
ELSE
QQ = DSQRT(DIS)
U = -0.5D0*Q+QQ
V = -0.5D0*Q-QQ
IF(U.GE.0.0D0) THEN
U = U**0.333333333333333D0
ELSE
U = -(-U)**0.333333333333333D0
END IF
IF(V.GE.0.0D0) THEN
V = V**0.333333333333333D0
ELSE
V = -(-V)**0.333333333333333D0
END IF
X = U+V
END IF
RETURN
END SUBROUTINE CUBIC

Javascript converted to Excel VB Function Produces #NUM! error

Hey guys, I have a JavaScript function that produces the check digit of a 12-digit UPC code (based on the first 11 digits):
function ccc12(rawVal) {
factor = 3;
sum = 0;
rawVal = rawVal.toString();
if (rawVal.length!=11){
throw "The UCC-12 ID Number requires that you enter 11 digits.";
}
for (index = rawVal.length; index > 0; --index) {
sum = sum + rawVal.substring (index-1, index) * factor;
factor = 4 - factor;
}
return ((1000 - sum) % 10);
}
Assuming the above if I gave 84686400201 as the rawVal, then 2 would be the outcome returned.
This was then converted to
Function generateUPC(upcCode As Integer) As String
Dim upcCheckDigit, factor, sum As Integer
Dim upcString As String
factor = 3
sum = 0
For i = Len(upcCode) To 0 Step -1
sum = sum + Mid(upcCode, i - 1, 1) * factor
factor = 4 - factor
Next i
upcCheckDigit = ((1000 - sum) Mod 10)
upcString = upcCode & upcCheckDigit
generateUPC = upcString
End Function
This function should return the original string plus the last digit, but instead I get #NUM! in the worksheet when I put =generateUPC(84686400201) into the cell.
Any ideas? I've never really done VB macros before, so this is new to me.

I suggest changing upcCode to a string to avoid overflow and changing the indexes of your loop and within the Mid function to avoid out-of-bounds errors.
Function generateUPC(upcCode As String) As String
' upcCode is the parameter, so it must not be declared again with Dim
Dim upcCheckDigit As Integer, factor As Integer, sum As Integer
Dim i As Integer
Dim upcString As String
factor = 3
sum = 0
' Mid is 1-based, so walk from the last character down to position 1
For i = Len(upcCode) To 1 Step -1
sum = sum + CInt(Mid(upcCode, i, 1)) * factor
factor = 4 - factor
Next i
upcCheckDigit = ((1000 - sum) Mod 10)
upcString = upcCode & upcCheckDigit
generateUPC = upcString
End Function
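The weighting scheme itself can be cross-checked outside Excel. The following Python sketch mirrors the JavaScript loop (weights 3, 1, 3, ... starting from the rightmost digit). The function name upc_check_digit and the input validation are assumptions added for illustration, not part of the original code.

```python
def upc_check_digit(first_eleven: str) -> int:
    """Return the UPC-A check digit for the first 11 digits, given as a string."""
    if len(first_eleven) != 11 or not (first_eleven.isascii() and first_eleven.isdigit()):
        raise ValueError("expected exactly 11 digits")
    total = 0
    factor = 3
    # Walk from the rightmost digit to the leftmost, alternating weights 3 and 1.
    for ch in reversed(first_eleven):
        total += int(ch) * factor
        factor = 4 - factor
    return (1000 - total) % 10


print(upc_check_digit("03600029145"))  # -> 2
```

Passing the code as a string, as in the VBA version above, avoids any numeric range limit and keeps leading zeros intact.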

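VBA Integers range from -32,768 to 32,767.
VBA Longs range from -2,147,483,648 to 2,147,483,647.
Your 'upcCode' value is larger than the Long data type can hold, so I tried it with Double, which is a floating-point type but still stores an 11-digit whole number exactly:
Function generateUPC(upcCode As Double) As String
Dim upcCheckDigit As Long, factor As Long, sum As Long, i As Long
Dim digits As String, upcString As String
' Len() on a Double returns its storage size in bytes, so work on the digit text instead
digits = Format$(upcCode, "0")
factor = 3
sum = 0
' Mid is 1-based, so the loop must stop at position 1
For i = Len(digits) To 1 Step -1
sum = sum + CLng(Mid$(digits, i, 1)) * factor
factor = 4 - factor
Next i
upcCheckDigit = ((1000 - sum) Mod 10)
upcString = digits & upcCheckDigit
generateUPC = upcString
End Function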
