Fortran Read Float From Text File - Odd behavior [duplicate] - io

This question already has answers here:
Precision problems of real numbers in Fortran [duplicate]
(4 answers)
Closed 6 years ago.
Can someone please explain how Fortran reads in data, particularly from text files? I thought I understood the behavior and the formatting options of I/O, but the example below has me puzzled.
I am attempting to read in three values from the text file Domain.txt, which contains the three lines shown below
221
500.0200
500.0000
This file is then read by my program below
program main
implicit none
integer :: N
real :: xmax,xmin
open(unit=1,file='Domain.txt')
read(1,*) N ! Read first line
read(1,*) xmax ! Read second line
read(1,*) xmin ! Read third line
print*, N
print*, xmax
print*, xmin
end program
The result of this program is
221
500.019989
500.000000
So my confusion arises with the second output for the xmax variable. Why would it read in the second line as 500.019989 and not 500.0200?
I have tried using a Fortran format, format(fm.d), in the read statement so that only the first two digits after the decimal point are read, but this never resolved the issue.
I am using gfortran 4.8.5. Any help would be appreciated. I also know this is somewhat a duplicate of the question asked here (Reading REAL's from file in FORTRAN 77 - odd results) but I do not have enough reputation to comment and ask a question about the solution.

The number 2/100 (which is the fractional part of 500.02)
does not have a finite representation in binary. It has
the infinite periodic representation
0.000001010001111010111000010100011110101110000101...
So the value is rounded to fit into the representation model
of floating point numbers when it is read. A default (single precision)
real carries only about 7 significant decimal digits, and the nearest
representable value to 500.02 prints as 500.019989.
This is how all languages (including Fortran, C, C++, Java...)
with a binary representation of floating point numbers behave.
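As a minimal sketch of this effect, the same text can be read into a single-precision and a double-precision variable. The program name and the internal read (used here in place of Domain.txt) are assumptions made for illustration only.

```fortran
program read_precision
  use, intrinsic :: iso_fortran_env, only: real32, real64
  implicit none
  character(len=16) :: text
  real(real32) :: x_single   ! same kind as default "real" on most compilers
  real(real64) :: x_double

  text = '500.0200'

  ! List-directed internal reads stand in for reading the file.
  read(text, *) x_single
  read(text, *) x_double

  ! Six decimals expose the rounding of the single-precision value.
  print '(a, f12.6)', 'single: ', x_single
  print '(a, f12.6)', 'double: ', x_double

  ! Printing with two decimals hides the representation error.
  print '(a, f12.2)', 'single, 2 decimals: ', x_single
end program read_precision
```

The single-precision line should reproduce the 500.019989 seen in the question. The double-precision value is still not exact, only much closer to 500.02, so choosing an output format with the number of decimals you actually need is usually the practical answer.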

Related

Read substrings from a string containing multiplication [duplicate]

This question already has answers here:
'*' and '/' not recognized on input by a read statement
(2 answers)
Closed 4 years ago.
I am a scientist programming in Fortran, and I came across a strange behaviour. In one of my programs I have a string containing several "words", and I want to read all words as substrings. The first word starts with an integer and a wildcard, like "2*something".
When I perform an internal read on that string, I expect to read all words, but instead, the READ statement repeatedly reads the first substring. I do not understand why, nor how to avoid this behaviour.
Below is a minimalist sample program that reproduces this behaviour. I would expect it to read the three substrings and to print "3*a b c" on the screen. Instead, I get "a a a".
What am I doing wrong? Can you please help me and explain what is going on?
I am compiling my programs under GNU/Linux x64 with Gfortran 7.3 (7.3.0-27ubuntu1~18.04).
PROGRAM testread
IMPLICIT NONE
CHARACTER(LEN=1024):: string
CHARACTER(LEN=16):: v1, v2, v3
string="3*a b c"
READ(string,*) v1, v2, v3
PRINT*, v1, v2, v3
END PROGRAM testread
You are using list-directed input (the * format specifier). In list-directed input, a number (n) followed by an asterisk means "repeat this item n times", so it is processed as if the input was a a a b c. You would need to have as input '3*a' b c to get what you want.
I will use this as another opportunity to point out that list-directed I/O is sometimes the wrong choice as its inherent flexibility may not be what you want. That it has rules for things like repeat counts, null values, and undelimited strings is often a surprise to programmers. I also often see programmers complaining that list-directed input did not give an error when expected, because the compiler had an extension or the programmer didn't understand just how liberal the feature can be.
I suggest you pick up a Fortran language reference and carefully read the section on list-directed I/O. You may find you need to use an explicit format or change your program's expectations.
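The behaviour described above can be sketched as follows. This reuses the string from the question; the program name and the fixed-width format in the last read are assumptions chosen to match that particular string.

```fortran
program repeat_count_demo
  implicit none
  character(len=1024) :: string
  character(len=16)   :: v1, v2, v3

  ! Undelimited: 3*a is taken as a repeat count, so all items receive "a".
  string = "3*a b c"
  read(string, *) v1, v2, v3
  print '(3(a, 1x))', trim(v1), trim(v2), trim(v3)

  ! Delimited: the apostrophes make 3*a an ordinary character value.
  string = "'3*a' b c"
  read(string, *) v1, v2, v3
  print '(3(a, 1x))', trim(v1), trim(v2), trim(v3)

  ! Explicit format: fixed-width fields, no list-directed rules apply.
  string = "3*a b c"
  read(string, '(a3, 1x, a1, 1x, a1)') v1, v2, v3
  print '(3(a, 1x))', trim(v1), trim(v2), trim(v3)
end program repeat_count_demo
```

The explicit format only works when the field widths are known in advance, which is the trade-off against the flexibility of list-directed input.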
Following the answer of @SteveLionel, here is the relevant part of the reference on list-directed sequential READ statements (in this case, for Intel Fortran, but you could find it for your specific compiler and it won't be much different).
A character string does not need delimiting apostrophes or quotation marks if the corresponding I/O list item is of type default character, and the following is true:
The character string does not contain a blank, comma (,), or slash ( / ).
The character string is not continued across a record boundary.
The first nonblank character in the string is not an apostrophe or a quotation mark.
The leading characters are not a string of digits followed by an asterisk.
A nondelimited character string is terminated by the first blank, comma, slash, or end-of-record encountered. Apostrophes and quotation marks within nondelimited character strings are transferred as is.
In total, there are 4 forms of sequential read statements in Fortran, and you may choose the option that best fits your need:
Formatted Sequential Read:
To use this you change the * to an actual format specifier. If you know the length of the strings in advance, this would be as easy as '(a3,a2,a2)'. Or, you could come up with a format specifier that matches your data, but this generally requires knowing the length or layout of the fields.
Formatted Sequential List-Directed:
You are currently using this option (the * format descriptor). As we already showed you, this kind of I/O comes with a lot of magic and surprising behavior. What is hitting you is the n*c form, which is interpreted as n repetitions of the constant c.
As said by Steve Lionel, you could put quotation marks around the problematic word, so it will be parsed as one piece. Or, as proposed by @evets, you could split or break your string using the intrinsics index or scan. Another option could be changing your wildcard from asterisk to anything else.
Formatted Namelist:
Well, that could be an option if your data was (or could be) presented in the namelist format, but I really think it's not your case.
Unformatted:
This may not apply to your case because you are reading from a character variable, and an internal READ statement can only be formatted.
Otherwise, you could split your string by means of a function instead of an I/O operation. There is no intrinsic for this, but you could come up with one without much trouble (see this thread for reference). As you may have noted already, manipulating strings in Fortran is... awkward, at least. There are some libraries out there (like this) that may be useful if you are doing lots of string stuff in Fortran.
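A minimal sketch of such a splitting routine, using only the intrinsics verify, index and len_trim, might look like this. The program and variable names are hypothetical, and it assumes that words are separated by blanks only.

```fortran
program split_words
  implicit none
  character(len=1024) :: string
  character(len=16)   :: words(3)
  integer :: pos, first, blank, last, n, i

  string = "3*a b c"
  n = 0
  pos = 1
  do while (n < size(words) .and. pos <= len_trim(string))
    ! Skip blanks: verify returns the offset of the first non-blank.
    first = pos + verify(string(pos:), ' ') - 1
    ! The word ends just before the next blank (or at the end of the string).
    blank = index(string(first:), ' ')
    if (blank == 0) then
      last = len(string)
    else
      last = first + blank - 2
    end if
    n = n + 1
    words(n) = string(first:last)
    pos = last + 1
  end do

  do i = 1, n
    print '(a)', trim(words(i))
  end do
end program split_words
```

Because no READ statement is involved, the asterisk has no special meaning here and "3*a" is kept as an ordinary word.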

FORTRAN 77 Read from .mtx file [duplicate]

This question already has answers here:
Is floating point math broken?
(31 answers)
Closed 6 years ago.
I've been trying to use Fortran for my research project, with the GNU Fortran compiler (gfortran), latest version,
but I've been encountering some problems in the way it processes real numbers. If you have for example the code:
program test
implicit none
real :: y = 23.234, z
z = y * 100000
write(*,*) y, z
end program
You'll get as output:
23.23999 2323400.0
I find this really strange.
Can someone tell me what's exactly happening here? Looking at z I can see that y does retain its precision, so for calculations that shouldn't be a problem I suppose. But why is the output of y not exactly the same as the value that I've specified, and what can I do to make it exactly the same?
This is not a problem - all you see is floating-point representation of the number in the computer. The computer cannot handle real numbers exactly, but only approximations of them. A good read about this can be found here: What Every Computer Scientist Should Know About Floating-Point Arithmetic.
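Simply by replacing real with double precision, you can increase the number of significant decimal places from about six to about 15 on most platforms. Note that the constant must then also be written in double precision (23.234d0); otherwise it is rounded to single precision before it is assigned.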
The general issue is not limited to Fortran; it concerns the representation of base 10 real numbers in another base with finite precision. This computer science question has been asked many times here.
For the specifically Fortran aspects, the declaration "real" will likely give you a single precision floating point variable, as will expressing a constant as "23.234" without a type qualifier. The constant "100000" without a decimal point is an integer, so the expression "y * 100000" causes an implicit conversion of an integer to a real because "y" is a real variable.
For some previous discussions of these issues see Extended double precision, Fortran: integer*4 vs integer(4) vs integer(kind=4) and Is There a Better Double-Precision Assignment in Fortran 90?
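The difference between the kind of the variable and the kind of the constant can be sketched as follows. The variable names are hypothetical; the value 23.234 is the one from the question.

```fortran
program literal_kinds
  implicit none
  real             :: y_single
  double precision :: y_widened, y_double

  y_single  = 23.234      ! single-precision constant
  y_widened = 23.234      ! still a single-precision constant, widened on assignment
  y_double  = 23.234d0    ! double-precision constant

  write(*, '(a, f20.15)') 'real, 23.234:               ', y_single
  write(*, '(a, f20.15)') 'double precision, 23.234:   ', y_widened
  write(*, '(a, f20.15)') 'double precision, 23.234d0: ', y_double

  ! The integer 100000 is converted to real before the multiplication.
  write(*, '(a, f20.8)')  'y * 100000:                 ', y_single * 100000
end program literal_kinds
```

The first two lines should show the same rounded value, since widening a single-precision constant cannot restore digits that were never stored. Only the third line carries the extra precision.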
The problem here is not with Fortran, in fact it is not a problem at all. This is just a feature of floating-point arithmetic. If you think about how you would represent 23.234 as a 'single float' in binary, you would see that the number has to be saved to only so many decimals of precision.
The thing to remember about floating-point numbers is: numbers that look round and even in base 10 probably won't in binary.
For a brief overview of floating-point topics, check the Wikipedia article. And for a VERY thorough explanation, check out the canonical paper by Goldberg (PDF).

Fortran `write (*, '(3G24.16)')` error

I have a Fortran file that must write these complicated numbers, basically I can't change these numbers:
File name: complicatedNumbers.f
implicit none
write (*,'(3G24.16)') 0.4940656458412465-323, 8.651144521298990, 495.6336980600139
end
It's then run with gfortran -o outa complicatedNumbers.f on my Ubuntu, but this error comes up:
Error: Expected expression in WRITE statement at (1)
I'm sure it has something to do with the complicated numbers because there are no errors if I change the three complicated numbers into simple numbers such as 11.11, 22.2, 33.3.
This is actually a stripped-down version of a complex Fortran file that contains many variables and links to other files. So ideally, the 3G24.16 should not be changed.
What does the 3G24.16 mean?
How can I fix it so that I can ultimately print out these numbers with ./outa?
There is nothing syntactically wrong in the snippet you've shown us. However, your use of a file name with the suffix .f makes me think that the compiler is assuming that your code is written in fixed form. That is the usual default behaviour of gfortran. If that is the case it probably truncates that line at about the last , which means that the compiler sees
write (*,'(3G24.16)') 0.4940656458412465-323, 8.651144521298990,
and raises the complaint you have shared with us. Either join us in the 21st Century and switch to free form source files, change .f to .f90 and see what fun ensues, or continue the line correctly with some character in column 6 of the next line.
As to what 3G24.16 means, refer to your favourite Fortran reference material under the heading of data edit descriptors, in particular the g data edit descriptor.
Oh, and if my bandying about of the terms fixed form source and free form source bamboozles you, read about them in your favourite Fortran reference material too.
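As a rough sketch of both points, here is a free-form program that uses the same format and a continued statement. The program name, the suggested file name and the first value (1.0d-10) are assumptions for illustration; the other two values are taken from the question with a d0 suffix so they are kept in double precision.

```fortran
! Free-form source, e.g. saved as complicatedNumbers.f90 (hypothetical name).
program g_descriptor_demo
  implicit none

  ! 3G24.16 = repeat count 3, field width 24, 16 significant digits.
  ! G editing picks fixed or exponent notation depending on the magnitude.
  ! In free form a trailing & continues the statement on the next line.
  write (*, '(3G24.16)') 1.0d-10, &
                         8.651144521298990d0, &
                         495.6336980600139d0
end program g_descriptor_demo
```

Compiled as free form, the statement may be split wherever it is convenient, so the 72-column limit of fixed form no longer applies.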
Three errors in your program:
as you clearly use Fortran fixed form, statements are limited to 72 characters per line (132 in free form)
the number 0.4940656458412465-323 is probably not written correctly. The exponent character is missing, so Fortran computes the subtraction and 0.4940656458412465-323 is replaced by -322.505934354159. Try 0.4940656458412465D-323 instead. Notice that I propose the exponent letter D (double precision). Writing 0.4940656458412465E-323 would not work because single precision cannot represent magnitudes that small (its smallest normal value is about 1.2E-38).
the other numbers should have an exponent D0 too because, in single precision, the number of significant digits does not exceed about 7.
Possible correction, always in fixed format :
      implicit none
      write (*,'(3G24.16)') 0.4940656458412465D-323,
     &                      8.651144521298990d0,
     &                      495.6336980600139d0
      end

Paraview "possible mismatch of datasize with declaration" error

Paraview (v4.1.0 64-bit, OSX 10.9.2) is giving me the following error:
Generic Warning: In /Users/kitware/Dashboards/MyTests/NightlyMaster/ParaViewSuperbuild-Release/paraview/src/paraview/VTK/IO/Legacy/vtkDataReader.cxx, line 1388
Error reading ascii data. Possible mismatch of datasize with declaration.
I'm not sure why. I've double-checked that fields are all of the expected lengths, and none of the values are NaN, inf, or otherwise extremely large. The issue starts with the output from timestep 16 (0-15 produces no error). Graphically, steps 0-15 produce plots of my data as expected; step 16 shows the "Y/Yc" series having an unexpectedly large point (0.5625, 2.86616e+36).
Is fine:
http://www.filedropper.com/ring0000015
Produces error:
http://www.filedropper.com/ring0000016
I have been facing the same problem for the last 6 months and have been struggling to find a solution. I was given the following reasons to explain the error (http://www.cfd-online.com/Forums/paraview/139451-error-while-reading-vtk-files-paraview.html#post503315):
1. It could be a problem due to the character used for the line ending (http://en.wikipedia.org/wiki/Newline)
In a nutshell:
a) On Windows, line transition is with CR+LF.
b) On Linux, line transition is with LF only.
c) On Mac, some older versions used CR only. Nowadays I guess it should use LF as well.
CR = "Carriage Return" byte
LF = "Line Feed" byte
2. There might be one or more values that are of type NaN or Inf or some other special computational numeric definition for non-real numbers. They might be readable on Linux, but not on Mac, perhaps because of the next possibility. If this is the case,
3. Location based numeric definitions, aka Locale, might be triggering situations where values are being stored with commas or with a strange scientific notation. For example, if a value "1.0002" is stored as "1,0002" or even perhaps "1.0002ES+000"
I have viewed other forums, and they have generally stated #2 and #3 and the possible solutions -- it has in general worked. However, none of the above seemed to solve my problem.
I noticed that some of the stored solution values in the ASCII files were as small as 10.e-34. I had a feeling that underflow conditions might be triggering problems. I put a check in my code for underflow conditions and rounded those values off to 0. This fixed the issue, with the solution being displayed at all times without error messages.
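A check of this kind could look like the sketch below. The threshold (the smallest normal single-precision value, obtained from the tiny intrinsic) and the sample values are assumptions; the original answer does not say which cut-off was used.

```fortran
program flush_small_values
  use, intrinsic :: iso_fortran_env, only: real32, real64
  implicit none
  real(real64) :: values(4), threshold
  integer :: i

  ! Hypothetical data: two entries are too small for a 32-bit float.
  values = [1.0d0, 2.5d-50, -3.0d-41, 4.0d-3]

  ! Assumed cut-off: smallest normal single-precision magnitude.
  threshold = real(tiny(1.0_real32), real64)

  ! Replace anything below the cut-off by exactly zero before writing.
  where (abs(values) < threshold) values = 0.0d0

  do i = 1, size(values)
    write(*, '(es14.6)') values(i)
  end do
end program flush_small_values
```

Applying the check just before the values are written keeps the computation itself unchanged and only affects what the reader of the ASCII file has to parse.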
This may not fix the Inf/NaN problems, but if the numbers in the vtk file are too large or too small (i.e. 1e-50, 1e45), this may cause the same error.
One solution in this case is to change the datatype specification. When I had this problem, I specified the datatype as "float", which uses a 32-bit floating point representation (same as "float32"). Changing it to "float64" uses a 64-bit double-precision representation, which is consistent with my C++ code that generated the vtk file that uses doubles. This may eliminate the problem.
If you are using Fortran, this problem also occurs when you write to a file but do not close it in your code.
For example:
do i=1,10
write(numb,'(i3)')i
open(unit=1, file='test'//numb//'.vtk')
write(1,*).......
enddo
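A corrected version of such a loop might look like the following sketch, which closes each file before opening the next one. The placeholder line written to the file, the zero-padded i3.3 format and the use of newunit are assumptions added for illustration; running it creates ten small files in the current directory.

```fortran
program write_vtk_steps
  implicit none
  integer :: i, u
  character(len=3) :: numb

  do i = 1, 10
    ! i3.3 zero-pads the number so the file name contains no blanks.
    write(numb, '(i3.3)') i
    open(newunit=u, file='test'//numb//'.vtk', status='replace', action='write')
    ! Placeholder content; a real program would write the VTK data here.
    write(u, '(a, i0)') 'placeholder data for step ', i
    ! Closing flushes the buffer so the file on disk is complete.
    close(u)
  end do
end program write_vtk_steps
```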

Fortran 77: output floats with variable widths

I need to output lots of (>20 million) float values to a text file from a Fortran 77 program. I'd like to keep the output file as small as possible. Therefore I would like to output the floats in a compact way, without resorting to binary.
I know the precision I need (usually two digits right of the decimal point), so in C I would use printf("%.2f %.2f", val1, val2); Is something like this possible in Fortran 77? All I found was that I have to set the field width explicitly (like in format (f8.2,x,f8.2)). This wastes lots of space, when I don't know the range of the output numbers beforehand.
If it is not possible in Fortran 77, do newer Fortran standards offer a way to do this?
The Fortran 2008 standard allows an edit descriptor such as f0.2, in response to which the output uses the smallest possible field width that holds the whole part of the number followed by a decimal point and two fractional digits. This zero-width form was introduced in Fortran 95, so it is not part of standard Fortran 77.
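A minimal sketch of the f0.2 descriptor, as a counterpart to the printf call in the question (the variable names follow the question, the values are made up):

```fortran
program compact_output
  implicit none
  real :: val1, val2

  val1 = 3.14159
  val2 = 12345.678

  ! f0.2: minimal field width, two digits after the decimal point.
  write(*, '(f0.2, 1x, f0.2)') val1, val2
end program compact_output
```

One caveat: for magnitudes below 1 the leading zero is optional, so some compilers print .50 rather than 0.50.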
If you have a number X with X >= 1, then INT(LOG10(X))+1 is the size of the integer part of your number (number of digits of the integer part). So, you just have to build a custom FORMAT for each of the values you want to print, leaving room for the decimal point and the fractional digits.
It is not very elegant, but I think it will help you achieve what you want.
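The idea can be sketched by building the format string at run time with an internal write. The names are hypothetical, the sketch assumes x >= 1, and it uses i0 for brevity; in strict Fortran 77 a fixed-width integer descriptor such as i2 would be needed instead.

```fortran
program variable_width
  implicit none
  real :: x
  integer :: width
  character(len=16) :: fmt

  x = 9894.36

  ! Digits before the decimal point (valid for x >= 1),
  ! plus the decimal point and two fractional digits.
  width = int(log10(x)) + 1 + 3

  ! Build a format such as (f7.2) and use it for the output.
  write(fmt, '(a, i0, a)') '(f', width, '.2)'
  write(*, fmt) x
end program variable_width
```

Values that round up to the next power of ten (999.999, for example) need one more column than this estimate gives, so a production version would have to allow for that, as well as for negative numbers.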
I know this might come across as pedantic and unhelpful, but hear me out. It sounds like you are doing bad science. If your instrument is spitting out numbers from 1000.00 to 0.01, then your instrument is probably only accurate to one part in a hundred. So the number 9894.36 ought to be rounded to 9900 (no decimal point). All the other digits are not significant. Why is that relevant and helpful? Because you are wasting storage space if you are storing 9894.36. So, the answer is to use the g edit descriptor, which outputs in scientific notation. Then all of your numbers will take up the same space.
