Why am I getting the Fortran runtime error "End of file" when my input file is not over? [duplicate]

I am trying to read a 3x3 array and print it out, but I am getting an end-of-file error.
The text file contains the following array:
1 2 3
4 5 6
7 8 9
Here is my code:
program myfile
  implicit none
  ! Declare Variables
  integer i,j
  !real, dimension(1:3,1:3) :: A
  integer, parameter :: M = 3, N =3
  real, dimension(1:M,1:N) :: A
  ! Open and read data
  open(unit=10, file = 'test_file_cols.txt', status = 'old')
  do i =1,M
    do j =1,N
      read(unit=10,FMT=*) A(i,j)
      print *,A(i,j)
    end do
  end do
end program myfile
The error I am getting is below:
1.000000
4.000000
7.000000
forrtl: severe (24): end-of-file during read, unit 10, file C:\Users\M42141\Documents\mean_flow_file\test_file_cols.txt

As discussed briefly in the comments, all I/O in Fortran is record based by default. This is true for both formatted and unformatted files. The file is viewed as a set of records, and you can think of a record as a line in the file. These lines may be very, very long, especially in unformatted files, but the default Fortran I/O methodology still views the file as a set of lines.
The important thing is that, by default, the last thing every I/O statement (read, write, print) does is move from the record it is on to the next record; a write statement will write an end-of-record marker. This is why you automatically get a newline after a write statement, but it also means that for a read statement any remaining data in the record (on the line) gets skipped. This is what is happening to you. The first read reads record 1, so you get 1.0, and then moves to record 2. Your program then reads record 2, so you get 4.0, and automatically moves to record 3. This is then read (7.0) and the file pointer moves on to record 4. You then try to read this, but there isn't a record 4, so you get an end-of-file error.
Record structure is a bit strange when you first encounter it, but once you get used to it, it is very powerful and convenient. I'll show an example below. Another example: you could leave a comment at the end of each line saying what the line holds; finishing the read statement moves you to the next record, so the comment is skipped and you need no special action in your code to deal with it.
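To make the trace above concrete, here is a small model of the default advancing behavior. It is written in Python only because it needs no input file; it is not Fortran, and the function name read_first_value_per_record is made up for this illustration. Each simulated read takes the first value of the current record and then skips whatever is left on that line.

```python
def read_first_value_per_record(records, n_reads):
    """Model n_reads list-directed reads of ONE value each.

    After every read the position advances to the next record, so the
    rest of the current line is skipped. Running out of records raises
    EOFError, mirroring the end-of-file runtime error.
    """
    values = []
    position = 0  # index of the record the next read will start on
    for _ in range(n_reads):
        if position >= len(records):
            raise EOFError("end-of-file during read")
        values.append(float(records[position].split()[0]))
        position += 1  # advance past whatever is left on this line
    return values


records = ["1 2 3", "4 5 6", "7 8 9"]
print(read_first_value_per_record(records, 3))  # [1.0, 4.0, 7.0]
try:
    read_first_value_per_record(records, 4)
except EOFError as err:
    print("fourth read failed:", err)
```

Three reads succeed and return the first column; the fourth has no record left to start on, which is the situation the nested loop in the question runs into.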
Anyway, how to solve your case? There are three possible ways:
1. Read a whole record at a time. The comment suggests an implied do loop, but I think in this case an array section is much easier and more intuitive.
2. Read the whole array in one go. This works because when a read statement finishes a record and finds it still "needs" more data, it carries on to the next record and keeps reading. But note the end-of-line comment idea won't work here - can you work out why?
3. Non-advancing I/O. I don't recommend this at all in this case, but for completeness, this allows you to perform a read or write without moving on to the next record.
There may be others; you could probably use so-called stream I/O, but personally I prefer record-based I/O whenever possible, as I find it more convenient and powerful. Anyway, here is a program illustrating the three methods. Note I have also changed your input file: getting the original to work with non-advancing I/O is a pain, though the other two methods handle it fine, which is another reason not to use non-advancing I/O here.
ian@eris:~/work/stack$ cat readit.f90
Program readit
  Implicit None
  Real, Dimension( 1:3, 1:3 ) :: a
  Integer :: i, j
  ! one line per read
  Write( *, * ) 'Line at a time'
  Open( 10, file = 'in' )
  Do i = 1, 3
     Read ( 10, * ) a( i, : )
     Write( *, * ) a( i, : )
  End Do
  Close( 10 )
  ! All in one go
  Write( *, * ) 'All in one go'
  Open( 10, file = 'in' )
  Read ( 10, * ) a
  Write( *, * ) a
  Close( 10 )
  ! Non advancing I/O
  Write( *, * ) 'Non-advancing'
  Open( 10, file = 'in' )
  Do i = 1, 3
     Do j = 1, 3
        ! Non advancing I/O requires a 'proper' format
        Read ( 10, '( f3.1, 1x )', Advance = 'No' ) a( i, j )
        Write( *, '( f3.1, 1x )', Advance = 'No' ) a( i, j )
     End Do
     ! Move to next records (lines)
     Read ( 10, * )
     Write( *, * )
  End Do
  Close( 10 )
End Program readit
ian@eris:~/work/stack$ gfortran-8 -Wall -Wextra -pedantic -std=f2008 -fcheck=all -O readit.f90
ian@eris:~/work/stack$ cat in
1.0 2.0 3.00
4.0 5.0 6.00
7.0 8.0 9.00
ian@eris:~/work/stack$ ./a.out
Line at a time
1.00000000 2.00000000 3.00000000
4.00000000 5.00000000 6.00000000
7.00000000 8.00000000 9.00000000
All in one go
1.00000000 2.00000000 3.00000000 4.00000000 5.00000000 6.00000000 7.00000000 8.00000000 9.00000000
Non-advancing
1.0 2.0 3.0
4.0 5.0 6.0
7.0 8.0 9.0

Related

Passing a Character Array from VBA to Fortran DLL through a Type is corrupting the other Type members

Believe it or not, that title is about as short as I could make it and still describe the problem I'm having!
So here's the scenario: I'm calling a Fortran DLL from VBA, and the DLL uses user-defined types or whatever the Fortran name is for that (structs?) as an argument and copies the type back to the caller for validation.
The type has an array of fixed-length characters and some run of the mill integers.
I've noticed some funny behavior in any attributes defined after this character array that I'll go over below, right after I describe my boiled-down testing setup:
The Fortran Side:
Here's the main program:
SUBROUTINE characterArrayTest (simpleTypeIn, simpleTypeOut)
  use simpleTypeDefinition
  !GCC$ ATTRIBUTES STDCALL :: characterArrayTest
  type(simpleType), INTENT(IN) :: simpleTypeIn
  type(simpleType), INTENT(OUT) :: simpleTypeOut
  simpleTypeOut = simpleTypeIn
END SUBROUTINE characterArrayTest
And here is the simpleTypeDefinition module file:
Module simpleTypeDefinition
  Type simpleType
    character (len=1) :: CharacterArray(1)
    !The length of the array is one here, but modified in tests
    integer (kind=2) :: FirstInteger
    integer (kind=2) :: SecondInteger
    integer (kind=2) :: ThirdInteger
  End Type simpleType
End Module simpleTypeDefinition
The compilation step:
gfortran -c simpleTypeDefinition.f90 characterArrayTest.f90
gfortran -shared -static -o characterArrayTest.dll characterArrayTest.o
Note: This is the 32-bit version of gfortran, as I'm using the 32-bit version of Excel.
The VBA Side:
First, the mirrored simpleType and declare statements:
Type simpleType
CharacterArray(0) As String * 1
'The length of the array is one here, but modified in tests
FirstInteger As Integer
SecondInteger As Integer
ThirdInteger As Integer
End Type
Declare Sub characterArrayTest Lib "characterArrayTest.dll" _
Alias "characterarraytest_@8" _
(simpleTypeIn As simpleType, simpleTypeOut As simpleType)
Next, the calling code:
Dim simpleTypeIn As simpleType
Dim simpleTypeOut As simpleType
simpleTypeIn.CharacterArray(0) = "A"
'simpleTypeIn.CharacterArray(1) = "B"
'simpleTypeIn.CharacterArray(2) = "C"
'simpleTypeIn.CharacterArray(3) = "D"
simpleTypeIn.FirstInteger = 1
simpleTypeIn.SecondInteger = 2
simpleTypeIn.ThirdInteger = 3
Call Module4.characterArrayTest(simpleTypeIn, simpleTypeOut)
The Strange, Buggy Behavior:
Now that we're past the setup, I can describe what's happening:
(I'm playing around with the length of the character array, while leaving the length of the individual characters set to one. I match the character array parameters on both sides in all cases.)
Test case: CharacterArray length = 1
For this first case, everything works great, I pass in the simpleTypeIn and simpleTypeOut from VBA, the Fortran DLL accepts it and copies simpleTypeIn to simpleTypeOut, and after the call VBA returns simpleTypeOut with identical attributes CharacterArray, FirstInteger, and so forth.
Test case: CharacterArray length = 2
This is where things get interesting.
Before the call, simpleTypeIn was as defined. Right after the call, simpleTypeIn.ThirdInteger had changed from 3 to 65! Even weirder, 65 is the ASCII value for the character A, which is simpleTypeIn.CharacterArray(0).
I tested this relationship by changing "A" to "(", which has an ASCII value of 40, and sure enough, simpleTypeIn.ThirdInteger changed to 40. Weird.
In any case, one would expect that simpleTypeOut would be a copy of whatever weird thing simpleTypeIn has been morphed to, but not so! simpleTypeOut was a copy of simpleTypeIn except for simpleTypeOut.ThirdInteger, which was 16961!
Test case: CharacterArray length = 3
This case was identical to case 2, oddly enough.
Test case: CharacterArray length = 4
In this also weird case, after the call simpleTypeIn.SecondInteger changed from 2 to 65, and simpleTypeIn.ThirdInteger changed from 3 to 66, which is the ASCII value for B.
Not to be outdone, simpleTypeOut.SecondInteger came out as 16961 and simpleTypeOut.ThirdInteger was 17475. The other values copied successfully. (I uncommented the B, C, and D character assignments to match the array size.)
Observations:
This weird corruption seems to be linear with respect to the bytes in the character array. I did some testing that I'll catalogue if anyone wants on Monday with individual characters of length 2 instead of 1, and the corruption happened when the array had a size of 1, as opposed to waiting until the size was 2. It also didn't "skip" additional corruption when the size of the array was 3 like the size = 1 case did.
This is easily a hall of fame bug for me; I'm sure you can imagine how much concentrated fun this was to isolate in a large-scale program with a ton of Type attributes. If anyone has any ideas it'd be greatly appreciated!
If I don't get back to you right away it's because I'm calling it a day, but I'll try to monitor my inbox.
(This answer is based on an understanding of Fortran, but not VBA)
In this case, and in most cases, Fortran won't automatically resize arrays for you. When you reference the second element of the character array (with simpleTypeIn.CharacterArray(1) = "B"), that element doesn't exist and it isn't created.
Instead, the code attempts to set whatever memory would be at the location of the second element of the character array, if it were to exist. In this case, that memory appears to be used to store the integers instead.
You can see the same thing happening if you forget about VBA entirely. Here is a sample code entirely in Fortran to demonstrate similar behavior:
enet-mach5% cat main.f90
! ===== Module of types
module types_m
implicit none
type simple_t
character(len=1) :: CharacterArray(1)
integer :: int1, int2, int3
end type simple_t
end module types_m
! ===== Module of subroutines
module subroutines_m
use types_m, only : simple_t
implicit none
contains
! -- Subroutine to modify first character, this should work
subroutine sub1(s)
type(simple_t), intent(INOUT) :: s
s%CharacterArray(1) = 'A'
end subroutine sub1
! -- Subroutine to modify first and other (nonexistent) characters, should fail
subroutine sub2(s)
type(simple_t), intent(INOUT) :: s
s%CharacterArray(1) = 'B'
s%CharacterArray(2:8) = 'C'
end subroutine sub2
end module subroutines_m
! ===== Main program, drives test
program main
use types_m, only : simple_t
use subroutines_m, only : sub1, sub2
implicit none
type(simple_t) :: s
! -- Set values to known
s%int1 = 1
s%int2 = 2
s%int3 = 3
s%CharacterArray(1) = 'X'
! -- Write out values of s
write(*,*) 'Before calling any subs:'
write(*,*) 's character: "', s%CharacterArray, '"'
write(*,*) 's integers: ', s%int1, s%int2, s%int3
! -- Call first subroutine, should be fine
call sub1(s)
write(*,*) 'After calling sub1:'
write(*,*) 's character: "', s%CharacterArray, '"'
write(*,*) 's integers: ', s%int1, s%int2, s%int3
! -- Call second subroutine, should overflow character array and corrupt
call sub2(s)
write(*,*) 'After calling sub2:'
write(*,*) 's character: "', s%CharacterArray, '"'
write(*,*) 's integers: ', s%int1, s%int2, s%int3
write(*,*) 'complete'
end program main
In this case, I've put both modules and the main routine in the same file. Typically, one would keep them in separate files but it's ok for this example. I also had to set 8 elements of CharacterArray to manifest an error, but the exact sizing depends on the system, compiler, and optimization settings. Running this on my machine yields:
enet-mach5% gfortran --version
GNU Fortran (SUSE Linux) 4.8.3 20140627 [gcc-4_8-branch revision 212064]
Copyright (C) 2013 Free Software Foundation, Inc.
GNU Fortran comes with NO WARRANTY, to the extent permitted by law.
You may redistribute copies of GNU Fortran
under the terms of the GNU General Public License.
For more information about these matters, see the file named COPYING
enet-mach5% gfortran main.f90 && ./a.out
main.f90:31.20:
s%CharacterArray(2:8) = 'C'
1
Warning: Lower array reference at (1) is out of bounds (2 > 1) in dimension 1
Before calling any subs:
s character: "X"
s integers: 1 2 3
After calling sub1:
s character: "A"
s integers: 1 2 3
After calling sub2:
s character: "B"
s integers: 1128481603 2 3
complete
Gfortran is smart enough to flag a compile-time warning that s%CharacterArray(2) is out of bounds. You can see the character array isn't resized, and the value of int1 is corrupted instead. If I compile with more run-time checking, I get a full error instead:
enet-mach5% gfortran -fcheck=all main.f90 && ./a.out
main.f90:31.20:
s%CharacterArray(2:8) = 'C'
1
Warning: Lower array reference at (1) is out of bounds (2 > 1) in dimension 1
Before calling any subs:
s character: "X"
s integers: 1 2 3
After calling sub1:
s character: "A"
s integers: 1 2 3
At line 31 of file main.f90
Fortran runtime error: Index '2' of dimension 1 of array 's' outside of expected range (1:1)
Looks like I'm (Edit: not) collecting my own bounty today!
The root of this problem lies in the fact that VBA takes 2 bytes per character while Fortran expects 1 byte per character. The memory garbling is caused by the character array taking up more space in memory than Fortran expects. The way to send 1 byte characters over to Fortran is as such:
Type Definition:
Type simpleType
CharacterArray(3) As Byte
FirstInteger As Integer
SecondInteger As Integer
ThirdInteger As Integer
End Type
Conversion from VBA character to Byte values:
Dim tempByte() As Byte
tempByte = StrConv("A", vbFromUnicode)
simpleTypeIn.CharacterArray(0) = tempByte(0)
tempByte = StrConv("B", vbFromUnicode)
simpleTypeIn.CharacterArray(1) = tempByte(0)
tempByte = StrConv("C", vbFromUnicode)
simpleTypeIn.CharacterArray(2) = tempByte(0)
tempByte = StrConv("D", vbFromUnicode)
simpleTypeIn.CharacterArray(3) = tempByte(0)
This code successfully passes the strings given as arguments to the StrConv function. I tested that they translated to the proper ASCII characters in the Fortran DLL, and they did! Also, the integers are no longer passed back incorrectly! A hall of fame bug has been stamped out.
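The numbers reported in the question are consistent with character bytes ending up where 16-bit integers were expected. The following sketch only checks that arithmetic; it is written in Python as a neutral calculator, it does not model how VBA actually marshals a Type across the DLL boundary, and the helper name as_int16 is made up for this illustration.

```python
import struct


def as_int16(raw):
    """Reinterpret exactly two bytes as a little-endian signed 16-bit integer."""
    if len(raw) != 2:
        raise ValueError("need exactly two bytes")
    return struct.unpack("<h", raw)[0]


# One character costs 2 bytes in UTF-16 (VBA strings) but 1 byte in ASCII.
print(len("A".encode("utf-16-le")), len("A".encode("ascii")))  # 2 1

# A single UTF-16 "A" read as a 16-bit integer gives the 65 seen in the question.
print(as_int16("A".encode("utf-16-le")))  # 65

# Two 1-byte characters read as one 16-bit integer give the other values.
print(as_int16(b"AB"))  # 16961
print(as_int16(b"CD"))  # 17475
```

So 65 and 66 are single two-byte characters, while 16961 and 17475 are the one-byte pairs "AB" and "CD", each read as a single Integer.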

Write statement for a complex format / possibility to write more than once on the same excel line

I am presently working on a file to open one by one .txt documents, extract data, to finally fill a .excel document.
Because I did not know how it is possible to write multiple times on the same line of my Excel document after one write statement (because it jumps to the next line), I have created a string of characters which is filled time after time :
Data (data_limite(x),x=1,8)/10, 9, 10, 7, 9, 8, 8, 9/
do file_descr = 1,nombre_fichier,1
taille_data1 = data_limite(file_descr)
nvari = taille_data1-7
write (new_data1,"(A30,A3,A11,A3,F5.1,A3,A7,F4.1,<nvari>(A3))") description,char(9),'T-isotherme',char(9),T_trait,char(9),'d_gamma',taille_Gam,(char(9),i=1,nvari)
ecriture_descr = ecriture_descr//new_data1
end do
Main issue was I want to adapt char(9) amount with the data_limite value so I built a write statement with a variable amount of char(9).
At the end of the do-loop, I have a very complex format of ecriture_descr which has no periodic format due to the change of the nvari value
Now I want to add this to the first line of my .excel :
Open(Unit= 20 ,File='resultats.RES',status='replace')
write(20,100) 'param',char(9),char(9),char(9),char(9),char(9),'*',char(9),'nuances',char(9),'*',char(9),ecriture_descr
100 format (a5,5(a3),a,a3,a7,a,a3,???)
but I do not know how to write this format. It would have been easier if, at each iteration of the do-loop I could fill the first line of my excel and continue to fill the first line at each new new_data1 value.
EDIT : maybe adding advance='no' in my write statement would help me, I am presently trying to add it
EDIT 2 : it did not work with advance='no' but adding a '$' at the end of my format write statement disable the return of my function. By moving it to my do-loop, I guess I can solve my problem :). I am presently trying to add it
First of all, your line
ecriture_descr = ecriture_descr//new_data1
is almost certainly not doing what you expect it to do. I assume that both ecriture_descr and new_data1 are of type CHARACTER(len=<some value>), that is, fixed-length strings. If you assign anything to such a string, the value is cut to length (if it is too long) or padded with spaces (if it is too short):
program strings
  implicit none
  character(len=8) :: h
  h = "Hello"
  print *, "|" // h // "|"   ! Prints "|Hello   |"
  h = "Hello World"
  print *, "|" // h // "|"   ! Prints "|Hello Wo|"
end program strings
And this combination will work against you: ecriture_descr will already be padded to the max with spaces, so when you append new_data1 it will be just outside the range of ecriture_descr, a bit like this:
h = "Hello"        ! h is actually "Hello   "
h = h // "World"   ! equiv to h = "Hello   " // "World"
                   !            = "Hello   World"
                   !               ^^^^^^^^
                   ! Only this is assigned to h => no change
If you want a string aggregator, you need to use the trim function which removes all trailing spaces:
h = trim(h) // " World"
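The padding and truncation rule can be tried out without a compiler. The sketch below imitates it in Python; assign_fixed is a made-up helper standing in for assignment to a CHARACTER(len=8) variable, and rstrip(" ") stands in for trim. It is a model of the behavior described above, not Fortran itself.

```python
def assign_fixed(value, length):
    """Mimic assigning to CHARACTER(len=length): truncate or pad with blanks."""
    return value[:length].ljust(length)


h = assign_fixed("Hello", 8)
print("|" + h + "|")  # |Hello   |

# Appending to the padded string: the new text falls beyond position 8.
h = assign_fixed(h + "World", 8)
print("|" + h + "|")  # |Hello   | (unchanged)

# Stripping trailing blanks first, as trim() does, keeps the appended text.
h = assign_fixed(h.rstrip(" ") + " World", 8)
print("|" + h + "|")  # |Hello Wo|
```

The last result is still cut at eight characters, so an aggregator string also has to be declared long enough for everything appended to it.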
Secondly, if you want to write to a file, but don't want to have a newline, you can add the option advance='no' into the write statement:
do i = 1, 100
write(*, '(I4)', advance='no') i
end do
This should make your job a lot easier than to create one very long string in memory and then write it all out in one go.

MATLAB vs. GNU Octave Textscan disparity

I wish to read some data from a .dat file without saving the file first. In order to do so, my code looks as follows:
urlsearch= 'http://minorplanetcenter.net/db_search/show_object?utf8=&object_id=2005+PM';
url= 'http://minorplanetcenter.net/tmp/2005_PM.dat';
urlmidstep=urlread(urlsearch);
urldata=urlread(url);
received= textscan(urldata , '%5s %7s %1s %1s %1s %17s %12s %12s %9s %6s %6s %3s ' ,'delimiter', '', 'whitespace', '');
data_received = received{:}
urlmidstep's function is just to do a "search", in order to be able to create the temporary .dat file. This data is then stored in urldata, which is a long char array. When I then use textscan in MATLAB, I get 12 columns as desired, which are stored in a cell array data_received.
However, in Octave I get various warning messages: warning: strread: field width '%5s' (fmt spec # 1) extends beyond actual word limit (for various field widths). My question is, why is my result different in Octave and how could I fix this? Shouldn't Octave behave the same as MATLAB, as in theory any differences should be dealt with as bugs?
Surely specifying the width of the strings and leaving both the delimiter and whitespace input arguments empty should tell the function to only deal with width of string, allowing spaces to be a valid characters.
Any help would be much appreciated.
I think textscan works differently in MATLAB and Octave. To illustrate, let's simplify the example. The code:
test_line = 'K05P00M C2003 01 28.38344309 37 57.87 +11 05 14.9 n~1HzV645';
test = textscan(test_line,'%5s','delimiter','');
test{:}
would yield the following in MATLAB:
>> test{:}
ans =
'K05P0'
'0M C'
'2003 '
'01 28'
'.3834'
'4309 '
'37 57'
'.87 +'
'11 05'
'14.9 '
'n~1Hz'
'V645'
whereas in Octave, you get:
>> test{:}
ans =
{
[1,1] = K05P0
[2,1] = C2003
[3,1] = 01
[4,1] = 28.38
[5,1] = 37
[6,1] = 57.87
[7,1] = +11
[8,1] = 05
[9,1] = 14.9
[10,1] = n~1Hz
}
So it looks like Octave jumps to the next word and discards any remaining character in the current word, whereas MATLAB treats the whole string as one continuous word.
Why that is and which one is correct, I do not know, but hopefully it'll point you in the right direction for understanding what is going on. You can try adding the delimiter to see how it affects the results.
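The two interpretations can be written down as two tiny functions. This is a Python sketch of the guessed behaviors, not of either textscan implementation, and both function names are made up here. Note that the spacing of test_line as reproduced on this page may differ from the original record, so only the word-based variant is compared against a listing above.

```python
def words_truncated(line, width):
    """Split on whitespace, then keep at most `width` characters of each word."""
    return [word[:width] for word in line.split()]


def fixed_chunks(line, width):
    """Cut the line into consecutive pieces of `width` characters, spaces included."""
    return [line[i:i + width] for i in range(0, len(line), width)]


test_line = "K05P00M C2003 01 28.38344309 37 57.87 +11 05 14.9 n~1HzV645"
print(words_truncated(test_line, 5))
print(fixed_chunks(test_line, 5))
```

words_truncated reproduces the ten entries of the Octave listing (K05P0, C2003, 01, 28.38, ...), which supports the "jump to the next word and discard the rest" reading. fixed_chunks is the "one continuous word" reading attributed to MATLAB.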

`backspace` in gfortran (old and new)

Suppose I want to append a line to a file in Fortran. Using a recent version (4.7) of gfortran, I find that this works:
program test
integer :: lun=10, i=0
open(FILE='test.dat', UNIT=lun)
do
read(lun, *, END=20) i
print *, i
end do
20 backspace(lun)
write(lun, *), i+1
end program test
In gfortran 4.4 however, it overwrites the last line. To append, I find I need to use
20 continue
instead of backspace.
What is up with that? How would you handle this in a real program?
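program test
  integer :: lun=10, i=0, io
  ! Open positioned at the end of the file so that writes append
  open(FILE='test.dat', UNIT=lun, POSITION="append")
  ! Step back over the last record so it can be re-read
  backspace(lun, iostat=io)
  if (io==0) then
    ! Re-read the last value; this leaves the position at the end again
    read(lun,*) i
  else
    ! Backspace failed, so start counting from zero
    i = 0
  end if
  ! Append the next value as a new record
  write(lun, *) i+1
end program test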

Fortran read of data with * to signify similar data

My data looks like this
-3442.77 -16749.64 893.08 -3442.77 -16749.64 1487.35 -3231.45 -16622.36 902.29
.....
159*2539.87 10*0.00 162*2539.87 10*0.00
which means I start with either 7 or 8 reals per line and then (towards the end) have 159 values of 2539.87 followed by 10 values of 0 followed by 162 of 2539.87 etc. This seems to be a space-saving method as previous versions of this file format were regular 6 reals per line.
I am already reading the data into a string because of not knowing whether there are 7 or 8 numbers per line. I can therefore easily spot lines that contain *. But what then? I suppose I have to identify the location of each * and then identify the integer number before and real value after before assigning to an array. Am I missing anything?
Read the line. Split it into tokens delimited by whitespace. Replace the * with a space in any token that contains one. Then read one or two values from the token, depending on whether there was an asterisk or not. Sample code follows:
REAL, DIMENSION(big) :: data
CHARACTER(LEN=40) :: token
INTEGER :: iptr, count, idx
REAL :: val
iptr = 1
DO WHILE (there_are_tokens_left)
  ... ! Get the next token into "token"
  idx = INDEX(token, "*")
  IF (idx == 0) THEN
    READ(token, *) val
    count = 1
  ELSE
    ! Replace "*" with space and read two values from the string
    token(idx:idx) = " "
    READ(token, *) count, val
  END IF
  data(iptr:iptr+count-1) = val ! Add "val" "count" times to the list of values
  iptr = iptr + count
END DO
Here I have arbitrarily set the length of the token to be 40 characters. Adjust it according to what you expect to find in your input files.
BTW, for the sake of completeness, this method of compressing something by replacing repeating values with value/repetition-count pairs is called run-length encoding (RLE).
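The token-by-token expansion described above can be sketched compactly. The version below is Python rather than Fortran, purely to show the logic end to end; expand_tokens is a made-up name, and it assumes tokens are separated by whitespace and contain at most one asterisk.

```python
def expand_tokens(line):
    """Expand a line of plain values and count*value tokens into a flat list."""
    values = []
    for token in line.split():
        if "*" in token:
            # Run-length form: repeat count before the "*", value after it
            count_text, value_text = token.split("*", 1)
            values.extend([float(value_text)] * int(count_text))
        else:
            values.append(float(token))
    return values


line = "159*2539.87 10*0.00 162*2539.87 10*0.00"
values = expand_tokens(line)
print(len(values))               # 341
print(values[158], values[159])  # 2539.87 0.0
```

Plain tokens such as -3442.77 pass through the else branch unchanged, so the same function handles the ordinary lines at the top of the file.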
Your input data may have been written in a form suitable for list-directed input (where the format specification in the READ statement is simply *). List-directed input supports the r*c form that you see, where r is a repeat count and c is the constant to be repeated.
If the total number of input items is known in advance (perhaps it is fixed for that program, perhaps it is defined by earlier entries in the file) then reading the file is as simple as:
REAL :: data(size_of_data)
READ (unit, *) data
For example, for the last line shown in your example on its own, size_of_data would need to be 341, from 159+10+162+10.
With list directed input the data can span across multiple records (multiple lines) - you don't need to know how many items are on each line in advance - just how many appear in the next "block" of data.
List-directed input has a few other "features" like this, which is why it is generally not a good idea to use it to parse "arbitrary" input that hasn't been written with it in mind - use an explicit format specification instead (which may require creating the format specification on the fly to match the width of the input field if that is not known ahead of time).
If you don't know (or cannot calculate) the number of items in advance of the READ statement then you will need to do the parsing of the line yourself.
