Linux Fortran OpenMP - accessing global variables from a subroutine called from an OpenMP task

Is it legal/valid to access program global variables from an internal subroutine called from an OpenMP task?
ifort 2021.7.0 20220726 doesn't report an error, but appears to produce random results depending on compiler options. Example:
program test1
  implicit none
  integer :: i, j, g
  g = 42
  !$OMP PARALLEL DEFAULT(SHARED)
  !$OMP SINGLE
  i = 0
  j = 1
  do while (j < 60)
    i = i + 1
    !$OMP TASK DEFAULT(SHARED) FIRSTPRIVATE(i,j)
    call sub(i,j)
    !$OMP END TASK
    j = j + j
  end do
  !$OMP END SINGLE
  !$OMP END PARALLEL
  stop
contains
  subroutine sub(i,j)
    implicit none
    integer i,j
    !$OMP CRITICAL(unit6)
    write(6,*) i,j,g
    !$OMP END CRITICAL(unit6)
  end subroutine sub
end program test1
Compiled with: ifort -o test1 test1.f90 -qopenmp -warn all -check all
Expected result:
5 16 42
4 8 42
6 32 42
3 4 42
2 2 42
1 1 42
Obtained result:
2 2 -858993460
5 16 -858993460
4 8 -858993460
6 32 -858993460
1 1 -858993460
3 4 -858993460
Note: the order of output lines doesn't matter --- just the number in the third column should be 42.
Different unexpected results are obtained by changing compiler options. For example, with "ifort -o test1 test1.f90 -qopenmp -warn all -O0", the third column is 256 and with "ifort -o test1 test1.f90 -qopenmp -O0" it is -740818552.
Of course g could be passed to sub() as an argument, but the program I'm helping with has dozens of shared global variables (that don't change in the parallel part), and the subroutine calls go several layers deep.
Thanks, Peter McGavin.

Please try the oneAPI compiler package 2022.2 or 2022.3. With ifx, the program prints 42 in the third column:
/iusers/xtian/temp$ ifx -qopenmp jimtest.f90
/iusers/xtian/temp$ ./a.out
2 2 42
1 1 42
3 4 42
5 16 42
4 8 42
6 32 42

Related

Is there a way to change the value of an eBPF map incrementally each time the function is called?

I'm currently using eBPF maps. I want the value associated with a key in a hash-table map to be incremented every time the eBPF program runs. When I try to set the value from a variable that I increment at the end of the program, the verifier throws an error:
invalid indirect read from stack R3 off -128+6 size 8
processed 188 insns (limit 1000000) max_states_per_insn 1 total_states 11 peak_states 11 mark_read 8
The main goal is to directly take the value and increment it every time the function is run.
I was under the impression that this would work.
bpf_spin_lock(&read_value->semaphore);
read_value->value += 1;
bpf_spin_unlock(&read_value->semaphore);
But this also throws the following error:
R1 type=inv expected=map_value
processed 222 insns (limit 1000000) max_states_per_insn 1 total_states 15 peak_states 15 mark_read 9

How to write multiple variables to a file using MPI I/O using Fortran

The code below writes the 4 variables as 4 rows of 10 columns, whereas I need 4 columns in 10 rows. I'm using hexdump to inspect the file.
program main
use mpi
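integer :: wsize,wrank,ierr,i,fh
! file offsets passed to MPI_File_write_at must be of kind MPI_OFFSET_KIND
integer(kind=MPI_OFFSET_KIND) :: offset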
integer , parameter :: count = 10
integer :: buf1(count),buf2(count),buf3(count),buf4(count)
integer , dimension(10,2) :: buf
call MPI_Init(ierr)
call MPI_Comm_rank(MPI_COMM_WORLD,wrank,ierr)
call MPI_Comm_size(MPI_COMM_WORLD,wsize,ierr)
offset = 0;
call MPI_File_open(MPI_COMM_WORLD, "test1.dat", MPI_MODE_RDWR + MPI_MODE_CREATE, MPI_INFO_NULL, fh, ierr)
do i = 1,count
buf1(i) = 1*i
buf2(i) = 2*i
buf3(i) = 3*i
buf4(i) = 4*i
end do
call MPI_FILE_WRITE_AT(fh, offset, (/buf1,buf2,buf3,buf4/), 4*count, MPI_INTEGER, mpi_status_ignore, ierr)
call MPI_File_close(fh,ierr)
call MPI_FINALIZE(ierr)
end program main
Using the hexdump command:
hexdump -v -e ' "%10d" ' -e ' "\n"' test1.dat > hextest1.dat
After conversion, hextest1.dat looks like this:
1 2 3 4 5 6 7 8 9 10
2 4 6 8 10 12 14 16 18 20
3 6 9 12 15 18 21 24 27 30
4 8 12 16 20 24 28 32 36 40

Calculate average of 1kb windows

My file looks like the following:
18 1600014 + CAA 0 3
18 1600017 - CTT 0 1
18 1600019 - CTC 0 1
18 1600020 + CAT 0 3
18 1600031 - CAA 0 1
18 1600035 - CAT 0 1
...
I am trying to calculate the average of column 6 in windows that each cover a range of 1000 in column 2: 1600001-1601000, 1601001-1602000, and so on. My values go from 1600000 to 1700000. Is there any way to do this in one step? My initial thought was to use grep to sort these values, but that would require many different commands. I am aware that awk can calculate the average, but can it iterate over each window?
The desired output would be something like this:
1600001-1601000 3.215
1601001-1602000 3.141
1602001-1603000 3.542

You can use GNU awk to gather the counts and sums. If I understand your problem correctly, you might need something like this:
BEGIN { mod = 1000
  # traverse array indices in ascending numeric order (GNU awk only)
  PROCINFO["sorted_in"] = "@ind_num_asc"
}
{
  # window index: positions 1..1000 -> 0, 1001..2000 -> 1, ...
  k = int( ($2 - 1) / mod )
  sum[ k ] += $6
  cnt[ k ]++
}
END {
  for( k in sum ) printf( "%d-%d\t%6.3f\n", k*mod + 1, (k+1)*mod, sum[k] / cnt[k] )
}
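For comparison, the same bucketing idea can be sketched in Python. This is only an illustration and not part of the original answer: the function name window_averages is hypothetical, and it assumes whitespace-separated input with the position in column 2 and the value in column 6, and windows that are 1-based and inclusive (1600001-1601000, 1601001-1602000, ...).

```python
from collections import defaultdict


def window_averages(rows, width=1000):
    """Average column 6 over fixed-width windows of the position in column 2.

    Hypothetical helper. Windows are 1-based and inclusive:
    1..width, width+1..2*width, and so on.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for line in rows:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or truncated lines
        pos = int(fields[1])
        value = float(fields[5])
        # Subtract 1 so that position 1601000 lands in 1600001-1601000.
        k = (pos - 1) // width
        sums[k] += value
        counts[k] += 1
    # One (start, end, average) tuple per window, in ascending order.
    return [(k * width + 1, (k + 1) * width, sums[k] / counts[k])
            for k in sorted(sums)]


# The six sample rows from the question.
sample = [
    "18 1600014 + CAA 0 3",
    "18 1600017 - CTT 0 1",
    "18 1600019 - CTC 0 1",
    "18 1600020 + CAT 0 3",
    "18 1600031 - CAA 0 1",
    "18 1600035 - CAT 0 1",
]

for start, end, avg in window_averages(sample):
    print(f"{start}-{end}\t{avg:.3f}")
```

All six sample rows fall in the window 1600001-1601000, so a single line is printed. For a real file, pass an open file object instead of the list.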

calculating delay cycles for hcs12

I am trying to calculate the number of instruction cycles and delay cycles for the HCS12. I have the following information about the HCS12:
The HCS12 uses the bus clock (E clock) as a timing reference.
The frequency of the E clock is half that of the onboard clock oscillator (oscillator 48 MHz, E clock 24 MHz).
Execution times of the instructions are also measured in E clock cycles.
Is 24 MHz the crystal frequency? If so, only half of the crystal oscillator's frequency is used for CPU instruction timing, so should it be halved?
There is an example, but I don't understand how the values 60000 and 40 are calculated:
How can I make a 100-ms time delay for a demo board with a 24-MHz bus clock?
In order to create a 100-ms time delay, we need to repeat the preceding instruction sequence 60,000 times [100 ms ÷ (40 ÷ 24,000,000) s = 60,000]. The following instruction sequence will create the desired delay:
ldx #60000
loop psha ; 2 E cycles
pula ; 3 E cycles
psha ; 2 E cycles
pula ; 3 E cycles
psha ; 2 E cycles
pula ; 3 E cycles
psha ; 2 E cycles
pula ; 3 E cycles
psha ; 2 E cycles
pula ; 3 E cycles
psha ; 2 E cycles
pula ; 3 E cycles
psha ; 2 E cycles
pula ; 3 E cycles
nop ; 2 E cycles
nop ; 3 E cycles
dbne x,loop

Your first section explains that if the internal oscillator (or external crystal) runs at 48 MHz, the E clock is 24 MHz. So a delay of 100 ms takes 24,000,000 * 100 / 1,000 = 2,400,000 E clock cycles.
The largest register available is 16 bits wide, so the loop counter value must be <= 65535.
Conveniently, 2,400,000 = 60,000 * 40, so the loop counter is 60,000 and the loop body is contrived to take 40 cycles. However, the timing comments on the last 3 lines are incorrect. They should be:
nop ; 1 E cycle
nop ; 1 E cycle
dbne x,loop ; 3 E cycles
With these values the loop body takes 7 * (2 + 3) + 1 + 1 + 3 = 40 cycles, as required.
Note that if you have interrupts or other processes running, this hard-coded method is not very accurate, and a timer interrupt would be better.
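The arithmetic can be checked with a short Python sketch. The cycle counts are the ones given in this answer (psha 2, pula 3, nop 1, dbne 3), the variable names are illustrative only, and the one-time cost of the initial ldx is ignored.

```python
# Bus (E) clock is half of the 48 MHz oscillator.
OSCILLATOR_HZ = 48_000_000
E_CLOCK_HZ = OSCILLATOR_HZ // 2

DELAY_MS = 100

# Total E clock cycles needed for the delay.
total_cycles = E_CLOCK_HZ * DELAY_MS // 1000

# Cycles for one pass through the loop body:
# seven psha/pula pairs, two nops, and the dbne.
PSHA, PULA, NOP, DBNE = 2, 3, 1, 3
cycles_per_pass = 7 * (PSHA + PULA) + 2 * NOP + DBNE

# Loop count to load into X; it must fit in a 16-bit register.
iterations = total_cycles // cycles_per_pass
fits_in_16_bits = iterations <= 0xFFFF

print(total_cycles, cycles_per_pass, iterations, fits_in_16_bits)
```

Changing DELAY_MS or the instruction mix shows quickly whether the resulting loop count still fits in 16 bits.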

how to read every ten lines of file using awk in loop?

I have a file in which lines are separated by a newline ("return"). I want to use two loops: one for reading every ten lines and one for doing a specific operation on those ten lines. How can I read the file ten lines at a time using awk?
The sample file is this:
1 2
3 4
5 6
7 8
9 10
9 10
7 8
6 5
4 3
2 1
2 1
4 3
5 4
6 5
7 6
8 7
9 8
0 9
1 2
3 4
5 6
7 8
9 10
9 10
7 8
6 5
4 3
2 1
2 1
4 3
5 4
6 5
7 6
8 7
9 8
0 9
I want to read ten lines at a time, then print the average of both numbers in those ten lines.
Thanks.

awk '
{sum1 += $1; sum2 += $2}
function output() {print sum1/10, sum2/10; sum1 = sum2 = 0}
NR % 10 == 0 {output()}
END {output()}
' input.file
outputs
5.3 5.7
4.5 4.9
5.5 5.5
3.5 3.9
The last group, handled in END, only has 6 lines of data but is still divided by 10. Please make your requirements more precise.
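If the short final group should be averaged over the lines it actually contains, one option is to keep a per-group line count and divide by that. The sketch below shows the idea in Python rather than awk. The function name chunk_averages is hypothetical, and it assumes two whitespace-separated numeric columns per line.

```python
def chunk_averages(lines, chunk_size=10):
    """Per-column averages over consecutive groups of chunk_size lines.

    Hypothetical helper. A short final group is divided by the number
    of lines it really contains, not by chunk_size.
    """
    averages = []
    sum1 = sum2 = 0.0
    count = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # ignore blank or malformed lines
        sum1 += float(fields[0])
        sum2 += float(fields[1])
        count += 1
        if count == chunk_size:
            averages.append((sum1 / count, sum2 / count))
            sum1 = sum2 = 0.0
            count = 0
    if count:  # leftover lines that did not fill a whole group
        averages.append((sum1 / count, sum2 / count))
    return averages


# The question's sample file is the same 18 lines repeated twice.
block = "1 2,3 4,5 6,7 8,9 10,9 10,7 8,6 5,4 3,2 1,2 1,4 3,5 4,6 5,7 6,8 7,9 8,0 9"
sample = block.split(",") * 2

for avg1, avg2 in chunk_averages(sample):
    print(f"{avg1:g} {avg2:g}")
```

On the sample data the first three groups match the output above, while the last group of 6 lines averages to about 5.83 and 6.5 instead of 3.5 and 3.9.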

One possible solution is to check a counter and output and reset the current sum per column if the counter reaches a multiple of 10. Note that this will swallow the last few records if the total number of lines is not a multiple of 10. If you are sure your file won't contain any blank lines, the code can be further simplified.
#!/usr/bin/awk -f
BEGIN {
chunk_size = 10;
sum_first = 0;
sum_second = 0;
record_counter = 0;
}
/[0-9]+\s+[0-9]+/ {
record_counter += 1;
sum_first += $1;
sum_second += $2;
if (record_counter % chunk_size == 0) {
printf("%16.9f %16.9f\n",
sum_first / chunk_size,
sum_second / chunk_size);
sum_first = 0;
sum_second = 0;
}
}
Output for your example data:
5.300000000 5.700000000
4.500000000 4.900000000
5.500000000 5.500000000

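As nu11po1n7er (sorry if I misspelled your name) has removed their answer, I am going to add a similar one. The END block only prints when a partial group is left over, which avoids a division by zero when the line count is a multiple of 10.
awk -v c=10 '{a+=$1+$2}!(--c){c=10;print a/c;a=0}END{if(c<10)print a/(10-c)}' file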
output
11
9.4
11
12.3333
For every ten lines, this prints the sum of fields one and two divided by the number of lines (which is what I gathered from the OP's post and comments).
If the file does not end on a multiple of 10, the last sum is divided by however many lines were left.

Situation 1: print a single average per 10 lines (both columns combined).
awk 'NR%10!=0{tmp=tmp+$1+ $2}NR%10==0{tmp = tmp+ $1+$2; print tmp/20; tmp=0}' 1.t
output:
5.5
4.7
5.5
Situation 2: print one average per column for every 10 lines.
awk 'NR%10!=0{tmp=tmp+$1; tmp2=tmp2+$2}NR%10==0{tmp = tmp+ $1; tmp2=tmp2+$2; print tmp/10, tmp2/10; tmp=tmp2=0}' 1.t
output:
5.3 5.7
4.5 4.9
5.5 5.5
