Fortran 77: output floats with variable widths

I need to output lots of (>20 million) float values to a text file from a Fortran 77 program. I'd like to keep the output file as small as possible. Therefore I would like to output the floats in a compact way, without resorting to binary.
I know the precision I need (usually two digits right of the decimal point), so in C I would use printf("%.2f %.2f", val1, val2). Is something like this possible in Fortran 77? All I found was that I have to set the field width explicitly (as in format (f8.2,1x,f8.2)), which wastes a lot of space when I don't know the range of the output numbers beforehand.
If it is not possible in Fortran 77, do newer Fortran standards offer a way to do this?

The Fortran 2008 standard allows an edit descriptor such as f0.2, for which the output field is the smallest width that holds the whole part of the number, followed by a decimal point and two fractional digits. I think this has been part of the language standard since Fortran 95 (Fortran 90 introduced the analogous I0 for integers).
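A minimal sketch of that in fixed-form source (the values are made up; any compiler supporting Fortran 95 or later accepts f0.2):

      implicit none
      real val1, val2
      val1 = 3.14159
      val2 = 9894.36
c     f0.2 emits the smallest field that holds the whole part,
c     the decimal point and two fractional digits: "3.14 9894.36"
      write (*,'(f0.2,1x,f0.2)') val1, val2
      end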

If you have a number, X, then INT(LOG10(X))+1 is the number of digits in the integer part of X. (This assumes X is positive: take ABS(X) first and reserve an extra column for the sign if values can be negative, and note that LOG10 is undefined for X = 0.) So you just have to build a custom FORMAT for each of the values you want to print.
It is not very elegant, but I think it will help you achieve what you want; see the sketch below.
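A rough sketch of that idea, assuming positive, non-zero values (variable names are illustrative):

      implicit none
      real x
      integer w
      character*12 fmt
      x = 9894.36
c     digits left of the point, plus the point and two decimals
      w = int(log10(x)) + 1 + 3
c     build the format at run time; blanks inside a format are
c     ignored, so "(f 7.2)" works like "(f7.2)"
      write (fmt,'(a,i2,a)') '(f', w, '.2)'
      write (*,fmt) x
      end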

I know this might come across as pedantic and unhelpful, but hear me out: it sounds like you are doing bad science. If your instrument is spitting out numbers from 1000.00 down to 0.01, then your instrument is probably only accurate to about one part in a hundred. So the number 9894.36 ought to be rounded to 9900 (no decimal point); all the other digits are not significant, and you are wasting storage space if you are storing 9894.36. So the answer is to use the g edit descriptor, which keeps a fixed number of significant digits and switches to scientific notation when the value is out of range. Then all of your numbers will take up the same space.
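For example (the widths are arbitrary), g10.3 keeps three significant digits in a constant ten-character field:

      implicit none
c     prints something like " 0.989E+04  0.100E-01":
c     large and small values take exactly the same width
      write (*,'(g10.3,1x,g10.3)') 9894.36, 0.01
      end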

Related

Are there example strings for unit tests? (e.g. with control characters)

I have to write several unit tests in a piece of software of ours. On many occasions I came across strings which are internally set, user set, or even persisted (e.g. to a database or XML).
Does anyone know of a character set or string(s) for simple tests of whether input is equal to output?
I know how to test, my question regards what to test.
For numbers you would test, e.g., values around 0, negative and positive numbers, or, if they are parsed from a string, both floating-point and integer numbers.
But for Text/String I'm not aware of standard sequences to test.
Here is the Big List of Naughty Strings (https://github.com/minimaxir/big-list-of-naughty-strings) - sounds like it might be what you need!
The Big List of Naughty Strings is an evolving list of strings which have a high probability of causing issues when used as user-input data.

SUM not working 'Invalid or missing field format'

I have an input file in this format: (length 20, 10 chars and 10 numerics)
jname1 0000500006
bname1 0000100002
wname1 0000400007
yname1 0000000006
jname1 0000100001
mname1 0000500012
mname2 0000700013
In my JCL I have defined my SYSIN data as such:
SYSIN DATA *
SORT FIELDS=(1,1,CH,A)
SUM FIELDS=(11,10,FD)
DATAEND
*
It works fine as long as I don't add the SUM statement, so I'm wondering if I'm using the wrong format for my numerics: I know they start at position 11 and have a length of 10, so the format is the only thing that could be wrong.
As you might have already realised, the point of this JCL is just to list the values, grouped by the first letter of the name (so for the example data and JCL I have given, it would sum the numerics for mname1 and mname2 together but leave the other records untouched).
I'm kind of new at this, so I was wondering what I need for the format if my numerics look like that in the input file.
If new to DFSORT, get hold of the DFSORT Getting Started guide for your version of DFSORT (http://www-01.ibm.com/support/docview.wss?uid=isg3T7000080).
This takes your through all the basic operations with many examples.
The DFSORT Application Programming Guide describes everything you need to know, in detail, again with examples. Appendix C of that document lists all the data-types available (note that FD, which you tried, is not a valid data-type, so that was probably a typo). There are tables throughout the document listing which data-types are available where, and any particular limits.
For advanced techniques, consult the DFSORT Smart Tricks publication here: http://www-01.ibm.com/support/docview.wss?uid=isg3T7000094
You also need to understand a bit more about the way data is stored on a mainframe.
Decimals (which can be "packed-decimal" or "zoned-decimal") do not contain a decimal-point. The decimal-point is implied. In high-level languages you tell the compiler where the decimal-point is (in a fixed position) and the compiler does the alignments for you. In Assembler, you do everything yourself.
Decimals are 100% accurate, as there are machine-instructions which act directly on packed-decimal data giving packed-decimal results.
A field which actually contains a decimal-point, cannot be directly used in arithmetic.
An unsigned field is treated as positive when used in any arithmetic.
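As an illustration (assuming EBCDIC), the unsigned value 0000500006 from the example input is stored one digit per byte, and the zone of the last byte is where a sign would live:

Value in file:  0000500006
EBCDIC bytes:   F0 F0 F0 F0 F5 F0 F0 F0 F0 F6
Sign:           the zone of the last byte is F, treated as positive;
                explicitly signed ZD carries C (+) or D (-) there instead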
The SUM statement supports a limited number of numeric definitions, and you have chosen the correct one. It does not matter that your data is unsigned.
If the format of the output from SUM is not what you want, look at OPTION ZDPRINT (or NOZDPRINT).
If you want further formatting, you can use OUTREC or OUTFIL.
As an option to using SUM, you can use OUTFIL reporting functions (especially, although not limited to, if you want a report). You can use SECTIONS and TRAILER3 with TOT/TOTAL.
Something to watch for with SUM (which is not a problem with the reporting features) is overflow: if any one (or more) of your SUMmed values exceeds the field size, you need to extend the field in INREC and then get SUM to use the new, sufficient size; see the sketch below.
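A sketch of that approach against the example input (the widths are illustrative; the 10-byte field at position 11 is widened to 15 bytes before summing):

* INREC runs before SORT/SUM, so they see the widened field
  SORT FIELDS=(1,1,CH,A)
  INREC BUILD=(1,10,11,10,ZD,TO=ZD,LENGTH=15)
  SUM FIELDS=(11,15,ZD)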
After some trial and error I finally found it: apparently the format I needed to use was ZD (zoned decimal, signed), so my SYSIN becomes this:
SYSIN DATA *
SORT FIELDS=(1,1,CH,A)
SUM FIELDS=(11,10,ZD)
DATAEND
*
even though my records don't contain any decimals and they are unsigned. I don't really get it, so if someone knows why it works, please go ahead and explain it to me.
For now the way I'm going to remember it is this: ZD = zoned decimal, which is simply how strings of digits are stored on the mainframe, with an implied decimal point (so plain integers qualify) and unsigned values treated as positive.

Why does the W3C XML Schema specification allow integers to have leading zeros?

Recently I had an example where, in an XML message, integer fields contained leading zeros. Unfortunately these zeros had relevance. One could argue about why integer was chosen in the schema definition, but that is not my question. I was a little surprised leading zeros were allowed at all, so I looked up the specs, which of course told me the supertype is decimal. But, as expected, specifications don't really tell you why certain choices were made. So my question really is: what is the rationale for allowing leading zeros at all? I mean, numbers generally don't have leading zeros.
On a side note, I guess the only way to add a restriction on leading zeros is by a pattern.
My recollection is that the XML Schema working group allowed leading zeroes in XSD decimals because they are allowed in normal decimal notation: 1, 01, 001, 0001, etc. all denote the same number in normal numerical notation. (But I don't actually remember that it was discussed at any length, so perhaps this is just my reason for believing it was the right thing to do and other WG members had other reasons for being satisfied with it.)
You are correct to suggest that the root of the problem is the use of xsd:integer as a type for a notation using strings of digits in which leading zeroes are significant (as for example in U.S. zip codes); I think you may be over-generous to say that one could argue about that decision. What possible arguments could one bring forward in favor of such an obviously erroneous choice?
Although numbers often don't have leading zeroes, number parsers almost always allow them.
You don't want to disallow leading zeroes for numbers completely, because you want the option to write a number like 0.12 and not only like .12. As you want to allow at least one leading zero for floating point numbers, it would feel a bit restrictive to only allow one leading zero, and only for floating point numbers.
Sometimes numbers do have leading zeroes, for example the components in a date in ISO8601 format; 2014-05-02. If you want to parse a component it's convenient if the leading zero is allowed, so that you don't have to write extra code to remove it before parsing.
The XML Schema specification just uses the same set of rules for parsing numbers that is generally used for most formats and in most programming languages.
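Coming back to the side note in the question: the lexical form can indeed be constrained with a pattern facet. A minimal sketch (the type name is made up):

<xs:simpleType name="intNoLeadingZeros">
  <xs:restriction base="xs:integer">
    <!-- 0 itself, or an optional sign followed by a nonzero first digit -->
    <xs:pattern value="[-+]?(0|[1-9][0-9]*)"/>
  </xs:restriction>
</xs:simpleType>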

Fortran `write (*, '(3G24.16)')` error

I have a Fortran file that must write these complicated numbers; basically, I can't change these numbers:
File name: complicatedNumbers.f
      implicit none
      write (*,'(3G24.16)') 0.4940656458412465-323, 8.651144521298990, 495.6336980600139
      end
It's then compiled with gfortran -o outa complicatedNumbers.f on my Ubuntu machine, but this error comes up:
Error: Expected expression in WRITE statement at (1)
I'm sure it has something to do with the complicated numbers because there are no errors if I change the three complicated numbers into simple numbers such as 11.11, 22.2, 33.3.
This is actually a stripped-down version of a complex Fortran file that contains many variables and links to other files. So ideally, the 3G24.16 should not be changed.
What does the 3G24.16 mean?
How can I fix it so that I can ultimately print out these numbers with ./outa?
There is nothing syntactically wrong in the snippet you've shown us. However, your use of a file name with the suffix .f makes me think that the compiler is assuming that your code is written in fixed form; that is the usual default behaviour of gfortran. If that is the case it probably truncates that line at column 72, at about the last comma, which means that the compiler sees
write (*,'(3G24.16)') 0.4940656458412465-323, 8.651144521298990,
and raises the complaint you have shared with us. Either join us in the 21st Century and switch to free form source files, change .f to .f90 and see what fun ensues, or continue the line correctly with some character in column 6 of the next line.
As to what 3G24.16 means, refer to your favourite Fortran reference material under the heading of data edit descriptors, in particular the g data edit descriptor.
Oh, and if my bandying about of the terms fixed form source and free form source bamboozles you, read about them in your favourite Fortran reference material too.
Three errors in your program:
as you clearly use Fortran fixed format, statements are limited to 72 characters per line (132 in free format)
the number 0.4940656458412465-323 is probably not written as intended: the exponent character is missing, so Fortran computes a subtraction and 0.4940656458412465-323 evaluates to -322.505934354159. Try 0.4940656458412465D-323 instead. Notice that I propose the exponent letter D (double precision): writing 0.4940656458412465E-323 cannot work because the smallest magnitude a single-precision number can hold is about 1E-38 (even as a double, a value that small is only representable as a subnormal)
the other numbers should carry a D0 exponent too because, in single precision, the number of significant digits does not exceed about seven
A possible correction, still in fixed format:
      implicit none
      write (*,'(3G24.16)') 0.4940656458412465D-323,
     & 8.651144521298990d0,
     & 495.6336980600139d0
      end

Why does System.Numerics.Complex use doubles instead of decimals?

I've been working with System.Numerics.Complex recently, and I've started to notice the typical floating-point "drift" where the value stored gets calculated a tenth of a millionth off or something like that, which is well-known and common with the float type and even the double type. I looked into the Complex struct, and sure enough, it used double variables. Why does it use double values to store its data and not decimal values, which are designed to prevent this? How do I work around this?
To answer your question:
doubles are several orders of magnitude faster, as operations are done at the hardware level
base-2 floats can actually be more accurate for large computations, as there is less "wobble" when shifting up and down exponents: 1 bit of precision is less than 1 decimal digit. Moreover, base-2 can use an implicit leading bit, which means they can represent more numbers than other bases.
complex numbers are typically used for scientific/engineering applications, where small relative errors of approximately 10^-16 are outweighed by other sources of error (e.g. due to measurement or the model).
decimals, on the other hand, are typically used for "accounting" type operations, where the typical operations (addition of small amounts, multiplication by integers, etc.) incur essentially no round-off.
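If the drift only matters when values are displayed, one common workaround (this snippet and its names are purely illustrative) is to keep Complex as-is and round at formatting time:

using System;
using System.Numerics;

class RoundingDemo
{
    static void Main()
    {
        // binary floating point cannot represent 0.1 or 0.2 exactly,
        // so arithmetic accumulates drift on the order of 1e-16
        Complex z = new Complex(0.1, 0.2);
        Complex w = (z * 3.0) / 3.0;

        // round only for display; the stored value keeps full precision
        Console.WriteLine("{0} + {1}i",
            Math.Round(w.Real, 12), Math.Round(w.Imaginary, 12));
    }
}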
