I've written my first program using allocatable arrays. It works as expected. But does it really? And more importantly, how can I set up a unit test to catch memory leaks?
The idea behind the program is to allocate a chunk of storage for my list of objects up front. Every time I add an element beyond the allocated size, I double the allocation. I do this to reduce the number of allocations and the subsequent copying of data from the old memory to the newly allocated memory.
I might be overcomplicating this, but I'd like to spend some time now understanding the pitfalls, rather than falling head first into them a year or two into the project.
The code is compiled with gfortran 8.3.0 on Linux, using pFUnit 4.1. The code below is an extract that tests only the allocation part.
Here's my test program:
program test_realloc
use class_test_malloc
integer :: i
real :: x, y
type(tmalloc) :: myobject
call myobject%initialize()
do i=1, 100
x = i * i
y = sqrt(x)
call myobject%add_nbcell(i, x, y)
end do
call myobject%dump()
end program test_realloc
array_reallocation.f:
!
! Simple test to see if my understanding of dynamic allocation
! of arrays is correct.
!
module class_test_malloc
use testinglistobj
implicit none
type tmalloc
integer :: numnbcells, maxnbcells
type(listobj), allocatable :: nbcells(:)
contains
procedure, public :: initialize => init
procedure, public :: add_nbcell ! Might be private?
procedure, private :: expand_nbcells
procedure, public :: dump
end type tmalloc
contains
subroutine init(this)
class(tmalloc), intent(inout) :: this
this%numnbcells = 0
this%maxnbcells = 4
allocate (this%nbcells(this%maxnbcells))
end subroutine init
subroutine add_nbcell(this, idx, x, y)
class(tmalloc), intent(inout) :: this
integer, intent(in) :: idx
real, intent(in) :: x, y
type(listobj) :: nbcell
if(this%numnbcells .eq. this%maxnbcells) then
call this%expand_nbcells()
print *,"Expanding"
endif
this%numnbcells = this%numnbcells + 1
nbcell%idx = idx
nbcell%x = x
nbcell%y = y
this%nbcells(this%numnbcells) = nbcell
print *,"Adding"
end subroutine add_nbcell
subroutine expand_nbcells(this)
class(tmalloc), intent(inout) :: this
type(listobj), allocatable :: tmpnbcells(:)
integer :: size
size = this%maxnbcells *2
allocate (tmpnbcells(size))
tmpnbcells(1:this%maxnbcells) = this%nbcells
call move_alloc( from=tmpnbcells, to=this%nbcells)
this%maxnbcells = size
end subroutine
subroutine dump(this)
class(tmalloc), intent(inout) :: this
integer :: i
do i=1, this%numnbcells
print*, this%nbcells(i)%x, this%nbcells(i)%y
end do
end subroutine
end module
listobj.f:
module testinglistobj
type listobj
integer :: idx
real :: x
real :: y
end type
end module testinglistobj
You will not get any memory leaks with this code. The reason, and this is fundamental to understanding allocatable arrays, is that from Fortran 95 onwards allocatable arrays without the save attribute are automatically deallocated when they go out of scope. The net result is that memory leaks for such arrays are impossible. This is one very good reason to prefer allocatable arrays to pointers. Related is the general software engineering principle of keeping the scope of variables as limited as possible, so that arrays are in memory for as short a period as possible.
Note this does not mean that you should never deallocate them, as an array may remain in scope long after it is actually useful. Here "manual" deallocation may be of use. But that is not a memory leak.
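A minimal sketch of both points, using hypothetical names (scope_demo, work, buf, tmp) that are not part of the question's code. It assumes a Fortran 2003 compiler for move_alloc. A local allocatable is unallocated on every entry because the previous call released it on return, and move_alloc leaves its from argument deallocated, so the temporary used in expand_nbcells does not leak either.

```fortran
program scope_demo
  implicit none
  call work(1000)
  call work(2000)
contains
  subroutine work(n)
    integer, intent(in) :: n
    real, allocatable :: buf(:), tmp(:)
    ! Local allocatables without SAVE are unallocated on every entry,
    ! because they were released automatically when the last call returned.
    print *, 'buf allocated on entry: ', allocated(buf)
    allocate (buf(n))
    buf = 1.0
    ! Grow the buffer the same way expand_nbcells does.
    allocate (tmp(2*n))
    tmp(1:n) = buf
    call move_alloc(from=tmp, to=buf)
    ! move_alloc hands the storage to buf and leaves tmp unallocated.
    print *, 'tmp allocated after move_alloc: ', allocated(tmp)
    print *, 'new size: ', size(buf)
    ! No deallocate needed here: buf is released on return.
  end subroutine work
end program scope_demo
```

Both allocated() checks should report false on every call. To confirm from outside the program, an external memory checker such as valgrind can be run on the compiled executable; that is a separate step from a pFUnit test.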
Related
I am having trouble writing an allocatable array nested in a derived type using namelists. A minimal example is shown below. How can I modify the program to have the allocatable array inside the derived type work as though it were not nested?
program test
implicit none
type struct_foo
integer, allocatable :: nested_bar(:)
end type struct_foo
integer, allocatable :: bar(:)
type(struct_foo) :: foo
! namelist / list / foo, bar
namelist / list / bar
allocate(bar(5))
bar = [1:5]
allocate(foo%nested_bar(5))
foo%nested_bar=[1:5]
write(*,list)
end program test
With the foo commented out of the namelist, it works just fine, producing the output:
&LIST
BAR = 1, 2, 3, 4, 5
/
With foo included, the program fails to compile:
>> ifort -traceback test_1.f90 -o test && ./test
test_1.f90(20): error #5498: Allocatable or pointer derived-type fields require a user-defined I/O procedure.
write(*,list)
--------^
compilation aborted for test_1.f90 (code 1)
As the error message states, you need to provide a user defined derived type I/O (UDDTIO) procedure. This is required for input/output of any object with an allocatable or pointer component.
How the object of derived type is formatted in the file is completely under the control of the UDDTIO procedure.
An example, using a very simple output format, is below. Typically a UDDTIO procedure implementing namelist output would use an output format that was consistent with the other aspects of namelist output and typically there would also be a corresponding UDDTIO procedure that was then able to read the formatted results back in.
module foo_mod
implicit none
type struct_foo
integer, allocatable :: nested_bar(:)
contains
procedure, private :: write_formatted
generic :: write(formatted) => write_formatted
end type struct_foo
contains
subroutine write_formatted(dtv, unit, iotype, v_list, iostat, iomsg)
class(struct_foo), intent(in) :: dtv
integer, intent(in) :: unit
character(*), intent(in) :: iotype
integer, intent(in) :: v_list(:)
integer, intent(out) :: iostat
character(*), intent(inout) :: iomsg
integer :: i
if (allocated(dtv%nested_bar)) then
write (unit, "(l1,i10,i10)", iostat=iostat, iomsg=iomsg) &
.true., &
lbound(dtv%nested_bar, 1), &
ubound(dtv%nested_bar, 1)
if (iostat /= 0) return
do i = 1, size(dtv%nested_bar)
write (unit, "(i10)", iostat=iostat, iomsg=iomsg) &
dtv%nested_bar(i)
if (iostat /= 0) return
end do
write (unit, "(/)", iostat=iostat, iomsg=iomsg)
else
write (unit, "(l1,/)", iostat=iostat, iomsg=iomsg) .false.
end if
end subroutine write_formatted
end module foo_mod
program test
use foo_mod
implicit none
integer, allocatable :: bar(:)
type(struct_foo) :: foo
namelist / list / foo, bar
allocate(bar(5))
bar = [1:5]
allocate(foo%nested_bar(5))
foo%nested_bar=[1:5]
write (*,list)
end program test
Use of UDDTIO obviously requires a compiler that implements this Fortran 2003 language feature.
I'm trying to write a Fortran subroutine that looks for local maxima in an array of length n.
The program takes a length-5 window of the original length-n array and checks whether the maximum is at position 3. It then moves forward to the next possible length-5 window in the length-n array.
I know this works; the problem is that I want to keep the maxima values, together with the corresponding wf and m parameters. So what I'm trying to do is write three files with this data and loop over them.
Here count_peak is an external variable, because I will loop this routine over several length-n arrays and I want to keep counting the peaks.
I know this is working; the problem I have is that when I read the files, the numbers and the length of the file seem to be wrong.
Here is the code:
subroutine local_max_bif_v2(valdata,size_data,compare_size,count_peak,debug)
!************************************************
! makes an dim(compare_size) array from data. if the maximum value is in the middle of the array (localdata(2)) saves the value to a binary file
!************************************************
use constants !Module that defines the changing parameters for the bifurcation
integer, intent(in) :: debug
integer, intent(in) :: size_data !input array length.
real(8), dimension(size_data), intent(in) :: valdata !input array
integer(4), intent(in) :: compare_size !odd number, 3 or greater (3 or 5 give fast, good results on soft data.)
integer(4), intent(inout) :: count_peak
real(8), dimension(compare_size) :: localdata
real(8) :: peak
integer(4) :: peak_index
integer(4), dimension(1) :: maxind
integer(4) :: midpoint
integer(4) :: j
if (debug.eq.1) then
print*
print*, '*************************************'
print*, 'debug info for local_max_bif: '
print*
!print*, 'local_max_biff goes from : ', size_data-evaluate_lenght, 'to ', size_data-compare_size
end if
midpoint=(1+compare_size)/2
OPEN(2,file='peak.in',form='unformatted',status='unknown',access='stream')
OPEN(3,file='varw.in',form='unformatted',status='unknown',access='stream')
OPEN(4,file='varm.in',form='unformatted',status='unknown',access='stream')
do j=1,size_data-compare_size
localdata=valdata(j:j+compare_size-1)
maxind=maxloc(localdata)
if (maxind(1).eq.(midpoint)) then
peak=localdata(midpoint)
count_peak=count_peak+1
peak_index=j+midpoint-1
WRITE(2) peak
WRITE(3) wf
WRITE(4) m
end if
end do
CLOSE(2)
CLOSE(3)
CLOSE(4)
if (debug.eq.1) then
print*, 'last peak index: ', peak_index
print*, 'amount of peaks', count_peak
print*, '*************************************'
print*
endif
return
!To do: need a way to test whether the new peak is repeated; if not, add the new one to the file
end subroutine
I kept getting the warning message:
forrtl: warning (402): fort: (1): In call to I/O Read routine, an array temporary was created for argument #1
when I run the following code. I have searched through old posts about the same issue on the forum (see here), but I am not sure the answer applies to my problem. Could anyone help me get rid of this warning message? Thanks.
program main
implicit none
integer :: ndim, nrow, ncol
real, dimension(:,:), allocatable :: mat
ndim = 13
ncol = 1
allocate( mat(ncol,ndim))
call read_mat( mat, 'matrix.txt' )
contains
subroutine read_mat( mat, parafile)
implicit none
real, dimension(:,:), intent(out) :: mat
character(len=*), intent(in) :: parafile
character(len=500) :: par
character(len=80) :: fname1
integer :: n, m, pcnt, iostat
m = size(mat,dim=2)
n = size(mat,dim=1)
mat = 0.
open( unit= 10, file=parafile, status='old', action='read', iostat=iostat )
if( iostat==0 )then
pcnt = 1
write(fname1,'(a,i3,a)') '(',m,'(f10.5))'
do
read( unit=10, fmt=*, iostat=iostat ) mat(pcnt,:)
pcnt = pcnt + 1
if( pcnt > n ) exit
enddo
else
print*, 'The file does not exist.'
endif
close( 10 )
end subroutine read_mat
end
It is the same underlying cause as in the linked question. The reference mat(pcnt,:) refers to array element storage that is not contiguous in memory, so to satisfy the compiler's internal requirements for the runtime code that actually does the read, it sets aside some storage that is contiguous, reads into that, then copies the elements out of that temporary storage into the final array.
Usually, in the context of a relatively slow IO statement, the overhead associated with the temporary is not significant.
The easiest way to prevent the warning is to disable the relevant runtime diagnostic with -check:noarg_temp_created or similar. But that is a bit of a sledgehammer — argument temporaries may be created that you do want to know about.
An alternative workaround is to explicitly designate the elements to be read using an io-implied-do. Something like:
read (unit=10, fmt=*, iostat=iostat) (mat(pcnt,i),i=1,m)
with an appropriate declaration of i.
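For completeness, here is a small self-contained sketch of the io-implied-do form. It reads from an internal file (a character array) rather than matrix.txt so that it runs without any input file; the names implied_do_read, lines and row are made up for illustration.

```fortran
program implied_do_read
  implicit none
  real :: mat(2,3)
  character(len=40) :: lines(2)
  integer :: i, row, ios

  ! Stand-in for two lines of the data file.
  lines(1) = '1.0 2.0 3.0'
  lines(2) = '4.0 5.0 6.0'

  do row = 1, 2
    ! Each element is named individually, so the runtime does not need
    ! to pass the strided section mat(row,:) as a single array argument.
    read (lines(row), *, iostat=ios) (mat(row,i), i=1,3)
    if (ios /= 0) stop 'read failed'
  end do

  print *, mat(2,:)
end program implied_do_read
```

The only change relative to the original read statement is the item list; the unit, format and iostat handling stay the same.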
I wanted to know the best way to write a large Fortran array (5000 x 5000 single-precision real numbers) to a file. I am trying to save the results of a numerical calculation for later use so they do not need to be repeated. At 5000 x 5000 x 4 bytes per number, the data is 100 MB; is it possible to save it in a form that is only 100 MB? Is there a way to save Fortran arrays as a binary file and read them back in for later use?
I've noticed that saving numbers to a text file produces a file much larger than the size of the data type being saved. Is this because the numbers are being saved as characters?
The only way I am familiar with to write to file is
open (unit=41, file='outfile.txt')
do i=1,len
do j=1,len
write(41,*) Array(i,j)
end do
end do
Although I'd imagine there is a better way to do it. If anyone could point me to some resources or examples to improve my ability to write and read larger files efficiently (in terms of memory), that would be great.
Thanks!
Write data files in binary, unless you're going to actually be reading the output - and you're not going to be reading a 25 million-element array.
The reasons for using binary are threefold, in decreasing importance:
Accuracy
Performance
Data size
Accuracy concerns may be the most obvious. When you are converting a (binary) floating point number to a string representation of the decimal number, you are inevitably going to truncate at some point. That's ok if you are sure that when you read the text value back into a floating point value, you are certainly going to get the same value; but that is actually a subtle question and requires choosing your format carefully. Using default formatting, various compilers perform this task with varying degrees of quality. This blog post, written from the point of view of a games programmer, does a good job of covering the issues.
Let's consider a little program which, for a variety of formats, writes a single-precision real number out to a string, and then reads it back in again, keeping track of the maximum error it encounters. We'll just go from 0 to 1, in units of machine epsilon. The code follows:
program testaccuracy
character(len=128) :: teststring
integer, parameter :: nformats=4
character(len=20), parameter :: formats(nformats) = &
[ '( E11.4)', '( E13.6)', '( E15.8)', '(E17.10)' ]
real, dimension(nformats) :: errors
real :: output, back
real, parameter :: delta=epsilon(output)
integer :: i
errors = 0
output = 0
do while (output < 1)
do i=1,nformats
write(teststring,FMT=formats(i)) output
read(teststring,*) back
if (abs(back-output) > errors(i)) errors(i) = abs(back-output)
enddo
output = output + delta
end do
print *, 'Maximum errors: '
print *, formats
print *, errors
print *, 'Trying with default format: '
errors = 0
output = 0
do while (output < 1)
write(teststring,*) output
read(teststring,*) back
if (abs(back-output) > errors(1)) errors(1) = abs(back-output)
output = output + delta
end do
print *, 'Error = ', errors(1)
end program testaccuracy
and when we run it, we get:
$ ./accuracy
Maximum errors:
( E11.4) ( E13.6) ( E15.8) (E17.10)
5.00082970E-05 5.06639481E-07 7.45058060E-09 0.0000000
Trying with default format:
Error = 7.45058060E-09
Note that even using a format with 8 digits after the decimal place - which we might think would be plenty, given that single precision reals are only accurate to 6-7 decimal places - we don't get exact copies back; they are off by approximately 1e-8. And this compiler's default format does not give us accurate round-trip floating point values; some error is introduced! If you're a video-game programmer, that level of accuracy may well be enough. If you're doing time-dependent simulations of turbulent fluids, however, that might absolutely not be ok, particularly if there's some bias to where the error is introduced, or if the error occurs in what is supposed to be a conserved quantity.
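If text output is unavoidable, one option is to pick a format with enough significant digits. The sketch below is an illustration, not part of the original test: it assumes IEEE single precision and a runtime that rounds correctly in both directions, in which case nine significant decimal digits are enough for a round trip. The E15.8 format above prints only eight significant digits (0.dddddddd), whereas ES16.8 prints one digit before the decimal point and eight after. The program name and the sample count are arbitrary.

```fortran
program roundtrip
  implicit none
  integer, parameter :: nsamples = 1000
  character(len=32) :: s
  real :: x(nsamples), back
  integer :: i, nbad

  call random_number(x)
  nbad = 0
  do i = 1, nsamples
    write (s, '(ES16.8)') x(i)   ! 9 significant decimal digits
    read (s, *) back
    ! Count values that do not come back bit-for-bit identical.
    if (back /= x(i)) nbad = nbad + 1
  end do
  print *, 'values that did not round-trip: ', nbad
end program roundtrip
```

A nonzero count would indicate that the assumptions above do not hold for the compiler in use, which is itself an argument for binary output.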
Note that if you try running this code, you'll notice that it takes a surprisingly long time to finish. That's because, maybe surprisingly, performance is another real issue with text output of floating point numbers. Consider the following simple program, which just writes out your example of a 5000 × 5000 real array as text and as unformatted binary:
program testarray
implicit none
integer, parameter :: asize=5000
real, dimension(asize,asize) :: array
integer :: i, j
integer :: time, u
forall (i=1:asize, j=1:asize) array(i,j)=i*asize+j
call tick(time)
open(newunit=u,file='test.txt')
do i=1,asize
write(u,*) (array(i,j), j=1,asize)
enddo
close(u)
print *, 'ASCII: time = ', tock(time)
call tick(time)
open(newunit=u,file='test.dat',form='unformatted')
write(u) array
close(u)
print *, 'Binary: time = ', tock(time)
contains
subroutine tick(t)
integer, intent(OUT) :: t
call system_clock(t)
end subroutine tick
! returns time in seconds from now to time described by t
real function tock(t)
integer, intent(in) :: t
integer :: now, clock_rate
call system_clock(now,clock_rate)
tock = real(now - t)/real(clock_rate)
end function tock
end program testarray
Here are the timing outputs, for writing to disk or to ramdisk:
Disk:
ASCII: time = 41.193001
Binary: time = 0.11700000
Ramdisk
ASCII: time = 40.789001
Binary: time = 5.70000000E-02
Note that when writing to disk, the binary output is 352 times as fast as ASCII, and to ramdisk it's closer to 700 times. There are two reasons for this - one is that you can write out data all at once, rather than having to loop; the other is that generating the string decimal representation of a floating point number is a surprisingly subtle operation which requires a significant amount of computing for each value.
Finally, there is data size; the text file in the above example comes out (on my system) to about 4 times the size of the binary file.
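To check the size claim directly, and to show the data being read back, here is a small sketch using Fortran 2003 stream access, which writes the array bytes with no record markers. The file name stream_demo.bin and the reduced array size are choices made for this illustration only; the file is deleted at the end.

```fortran
program stream_io
  implicit none
  integer, parameter :: n = 100
  real :: a(n,n), b(n,n)
  integer :: u, nbytes

  call random_number(a)

  ! Write the whole array in one statement, with no record markers.
  open (newunit=u, file='stream_demo.bin', access='stream', &
        form='unformatted', status='replace')
  write (u) a
  close (u)

  ! Size is reported in file storage units, which are bytes on common compilers.
  inquire (file='stream_demo.bin', size=nbytes)
  print *, 'file size:     ', nbytes
  print *, 'expected size: ', size(a) * storage_size(a) / 8

  ! Read it back and compare bit for bit.
  open (newunit=u, file='stream_demo.bin', access='stream', &
        form='unformatted', status='old')
  read (u) b
  close (u, status='delete')
  print *, 'identical after round trip: ', all(a == b)
end program stream_io
```

With default sequential unformatted access, as in the timing program above, the file is slightly larger because each record carries compiler-specific length markers.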
Now, there are real problems with binary output. In particular, raw Fortran (or, for that matter, C) binary output is very brittle. If you change platforms, or your data size changes, your output may no longer be any good. Adding new variables to the output will break the file format unless you always add new data at the end of the file, and you have no way of knowing ahead of time what variables are in a binary blob you get from your collaborator (who might be you, three months ago). Most of the downsides of binary output are avoided by using libraries like NetCDF, which write self-describing binary files that are much more "future proof" than raw binary. Better still, since it's a standard, many tools read NetCDF files.
There are many NetCDF tutorials on the internet; ours is here. A simple example using NetCDF gives similar times to raw binary:
$ ./array
ASCII: time = 40.676998
Binary: time = 4.30000015E-02
NetCDF: time = 0.16000000
but gives you a nice self-describing file:
$ ncdump -h test.nc
netcdf test {
dimensions:
X = 5000 ;
Y = 5000 ;
variables:
float Array(Y, X) ;
Array:units = "ergs" ;
}
and file sizes about the same as raw binary:
$ du -sh test.*
96M test.dat
96M test.nc
382M test.txt
The code follows:
program testarray
implicit none
integer, parameter :: asize=5000
real, dimension(asize,asize) :: array
integer :: i, j
integer :: time, u
forall (i=1:asize, j=1:asize) array(i,j)=i*asize+j
call tick(time)
open(newunit=u,file='test.txt')
do i=1,asize
write(u,*) (array(i,j), j=1,asize)
enddo
close(u)
print *, 'ASCII: time = ', tock(time)
call tick(time)
open(newunit=u,file='test.dat',form='unformatted')
write(u) array
close(u)
print *, 'Binary: time = ', tock(time)
call tick(time)
call writenetcdffile(array)
print *, 'NetCDF: time = ', tock(time)
contains
subroutine tick(t)
integer, intent(OUT) :: t
call system_clock(t)
end subroutine tick
! returns time in seconds from now to time described by t
real function tock(t)
integer, intent(in) :: t
integer :: now, clock_rate
call system_clock(now,clock_rate)
tock = real(now - t)/real(clock_rate)
end function tock
subroutine writenetcdffile(array)
use netcdf
implicit none
real, intent(IN), dimension(:,:) :: array
integer :: file_id, xdim_id, ydim_id
integer :: array_id
integer, dimension(2) :: arrdims
character(len=*), parameter :: arrunit = 'ergs'
integer :: i, j
integer :: ierr
i = size(array,1)
j = size(array,2)
! create the file
ierr = nf90_create(path='test.nc', cmode=NF90_CLOBBER, ncid=file_id)
! define the dimensions
ierr = nf90_def_dim(file_id, 'X', i, xdim_id)
ierr = nf90_def_dim(file_id, 'Y', j, ydim_id)
! now that the dimensions are defined, we can define variables on them,...
arrdims = (/ xdim_id, ydim_id /)
ierr = nf90_def_var(file_id, 'Array', NF90_REAL, arrdims, array_id)
! ...and assign units to them as an attribute
ierr = nf90_put_att(file_id, array_id, "units", arrunit)
! done defining
ierr = nf90_enddef(file_id)
! Write out the values
ierr = nf90_put_var(file_id, array_id, array)
! close; done
ierr = nf90_close(file_id)
return
end subroutine writenetcdffile
end program testarray
OPEN the file for reading and writing as "unformatted", and READ and WRITE the data without supplying a format, as demonstrated in the program below.
program xunformatted
integer, parameter :: n = 5000, inu = 20, outu = 21
real :: x(n,n)
integer :: i
character (len=*), parameter :: out_file = "temp_num"
call random_seed()
call random_number(x)
open (unit=outu,form="unformatted",file=out_file,action="write")
do i=1,n
write (outu) x(i,:) ! write one row at a time
end do
print*,"sum(x) =",sum(x)
close (outu)
open (unit=inu,form="unformatted",file=out_file,action="read")
x = 0.0
do i=1,n
read (inu) x(i,:) ! read one row at a time
end do
print*,"sum(x) =",sum(x)
end program xunformatted
My program reads an “unformatted” file in Fortran. Among other things, this file contains an array that my program does not need, but which can get quite large. I would like to skip this array.
If this is the program writing the data:
program write
real :: useless(10), useful=42
open(123, file='foo', form='unformatted')
write(123) size(useless)
write(123) useless
write(123) useful
end program write
Then this works for reading:
program read
integer :: n
real, allocatable :: useless(:)
real :: useful
open(123, file='foo', form='unformatted')
read(123) n
allocate(useless(n))
read(123) useless
read(123) useful
print*, useful
end program read
But I would like to avoid allocating the “useless” array. I discovered this
program read2
integer :: n, i
real :: useless
real :: useful
open(123, file='foo', form='unformatted')
read(123) n
do i=1,n
read(123) useless
end do
read(123) useful
print*, useful
end program read2
does not work (because of the record lengths being written to the file [EDIT, see francescalus' answer]).
It is not an option to change the format of the file.
It isn't a sin to read fewer file storage units than are in the record.
program read
real :: useful
open(123, file='foo', form='unformatted')
read(123)
read(123)
read(123) useful
print*, useful
end program read
Each "empty" read still advances the record for a file connected for sequential access.
As a further comment: the second attempt doesn't fail "because of the record lengths". It fails because of the attempt to read separate records. Examples of the significance of this difference can be found in many SO posts.
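The record behaviour can be seen in a small sketch that uses a scratch file, so nothing is left on disk. The program and variable names are made up for illustration. Each unformatted sequential read consumes one whole record, however few items it requests.

```fortran
program record_demo
  implicit none
  integer :: u, first, second

  open (newunit=u, form='unformatted', status='scratch')
  write (u) 1, 2, 3     ! record 1 holds three integers
  write (u) 42          ! record 2 holds one integer
  rewind (u)

  read (u) first        ! takes 1; the rest of record 1 is skipped
  read (u) second       ! starts at record 2, so this reads 42
  print *, first, second
  close (u)
end program record_demo
```

This is why read2 in the question fails: its loop asks for n separate records, while the useless array was written as a single record.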
Francescalus has shown how to skip over an entire record. If a record contains some data that should be skipped as well as some data to be read, you can read a dummy variable repeatedly to skip the unwanted data. Below is a program demonstrating this.
program write
real :: dummy,useless(10), useful=42
integer, parameter :: outu = 20, inu = 21
character (len=*), parameter :: fname = "foo"
integer :: n
call random_seed()
call random_number(useless)
open(outu, file=fname, form='unformatted', action = "write")
write(outu) size(useless)
write(outu) useless
write(outu) useful
close(outu)
open(inu, file=fname, form='unformatted', action = "read")
read (inu) n
read (inu) (dummy,i=1,n)
read (inu) useful
write (*,*) "useful = ",useful
end program write