Is there a way to compare the format in which a line of a text file was written in Fortran? - string

I'm developing a Fortran program that must obtain some data from a text file and generate another text file using specific data from the first one.
The input file have many lines written in several specific formats which I know of. Although I know the formats, the lines in this file are generated in a "random way".
It would be much easier to generate the output file if I could compare the format in which each line was written, then I would know exactly what data I can get from that line of the input file to use it in the output file.
What I need is something like, for example, knowing that the format of the line read and stored in the LINHA variable is described in the FORMATO variable, do something like:
    
IF (FORMATO = '(1X, 15,3F8.1,2 (5A, 1X))') THEN
READ (LINHA, '(6X, F8.1)') my_variable
END IF
Because there might be another format such as
'(6A, 2F8.1, F8.6,2 (6A))'
in which, if I use the same READ statement, I will read an F8.1 variable in my_variable, however this value is not the correct one.

A (not so elegant) work-around that I can think of is to read the entire line using the advance = no option of read() and parse each character in the line separately. While doing so, you may count white spaces or other specific characters that you know of and then identify the different formats from there.
It would be helpful if you could give more specifications of the nature of the task.

The best option is to read without format, keeping each line in a character array. Then read the line variable as an internal file with the required format using the variable IOSTAT in order to check if the format is the correct.
INT max_size = 80
CHARACTER(LEN=max_size) :: line
READ(*,*) line
READ(line,'(1X, 15,3F8.1,2 (5A, 1X))',IOSTAT=ios) var1, var2, ...

Problem solved using a mixture of some of the suggestions posted.
I read each of the lines of the input file in an internal variable (RLINFILE) in the format '(A165)'. After that, I read all the contents of the string that I put in this internal variable in several dummy variables, using the format I knew of the lines from where I wanted to get some information (read all the information of the line in the desired and get IOSTAT = 0 guarantee that this is the correct line), so if the result of the reading is ok (IOSTAT = 0), it is because the line I just read was the correct one for the information I wanted, so I store the contents of some of the dummy variables that represent the values that interest me. In the code, the solution looked something like this:
OPEN(UNIT=LU1,FILE=RlinName,STATUS='OLD')
ilin = 0
formato = '(14X,A,1X,F7.1,1X,F7.1,5X,A,1X,A,1X,A,5X,A,I5,1X,A,I3,3F8.1,A,A,A,1X,A,2(1X,F8.2),1X,A,1X,A)'
DO WHILE (.TRUE.)
READ(LU1,'(A165)',END=300) RLINFILE
READ(RLINFILE,formato,IOSTAT=linhaok) dum2_a1,dum2_f1,dum2_f2,dum2_a2,dum2_a3,dum2_a4,dum2_a5,dum2_i1,dum2_a6,dum2_i2,dum2_f3,dum2_f4,dum2_f5,dum2_a7,dum2_a8,dum2_a9,dum2_a10,dum2_f6,dum2_f7,dum2_a11,dum2_a12
IF(linhaok.EQ.0) THEN
ilin = ilin+1
rlin_lshu(ilin) = dum2_a4
rlin_nbpa(ilin) = dum2_i1
rlin_ncir(ilin) = dum2_i2
rlin_ppij(ilin) = dum2_f3
rlin_pqij(ilin) = dum2_f4
rlin_tapn(ilin) = dum2_a7
END IF
END DO
300 CLOSE(UNIT=LU1)

The description of the problem you are trying to solve is a bit vague to me, but the simplest solutions that comes to my mind, given the description of the problem, is to modify the original code that generates the input data file, to write the used Fortran READ format before the data line in the input file. This way, you can read the format as a string and use it in the subsequent data IO in your second code.
If you describe the specific task your tryting to accomplish in more details, perhaps more experienced Fortranners could help.

Related

Can I format variables like strings in python?

I want to use printing command bellow in many places of my script. But I need to keep replacing "Survived" with some other string.
print(df.Survived.value_counts())
Can I automate the process by formating variable the same way as string? So if I want to replace "Survived" with "different" can I use something like:
var = 'different'
text = 'df.{}.value_counts()'.format(var)
print(text)
unfortunately this prints out "df.different.value_counts()" as as a string, while I need to print the value of df.different.value_counts()
I'm pretty sure alot of IDEs, have this option that is called refactoring, and it allows you to change a similar line of code/string on every line of code to what you need it to be.
I'm aware of VSCode's way of refactoring, is by selecting a part of the code and right click to select the option called change all occurances. This will replace the exact code on every line if it exists.
But if you want to do what you proposed, then eval('df.{}.value_counts()'.format(var)) is an option, but this is very unsecured and dangerous, so a more safer approach would be importing the ast module and using it's literal_eval function which is safer. ast.literal_eval('df.{}.value_counts()'.format(var)).
if ast.literal_eval() doesn't work then try this final solution that works.
def cat():
return 1
text = locals()['df.{}.value_counts'.format(var)]()
Found the way: print(df[var].value_counts())

Fortran90 cray writing unformatted array using "*"

I have a program, written in fortran90, that is writing an array to a file, but for some reason is using an asterix to represent multiple columns:
8*9, 4, 2*9, 4
later on reading from the file I am getting I/O errors:
lib-4190 : UNRECOVERABLE library error
A numeric input field contains an invalid character.
Encountered during a list-directed READ from unit 10 Fortran unit 10 is connected to a sequential formatted text file:
Does anyone have any idea why this is happening, and if there is a flag to feed to the compiler to prevent it. I'm using the cray fortran compiler, and the write statement looks like this:
write (lun,*) nsf_species(bundle%species(1:bundle%n_prim))
Update:
The line reading in the data file looks like:
read (lun,*) Info(ifile)%alpha_i(1:size)
I have checked to ensure that it is this line that is causing the problem.
This compression of list-directed output is a very useful feature of the Cray Compilation Environment when writing out large amounts of data. This compressed output will, however, not be read in correctly, as you point out (which is less useful).
You can modify this behaviour, not using a compiler flag but by using the "assign" command.
Consider this sample code:
PROGRAM test
IMPLICIT NONE
INTEGER :: u
OPEN(UNIT=u,FILE="f1",FORM="FORMATTED",STATUS="UNKNOWN")
WRITE(u,*) 0,0,0
CLOSE(u)
OPEN(UNIT=u,FILE="f2",FORM="FORMATTED",STATUS="UNKNOWN")
WRITE(u,*) 0,0,0
CLOSE(u)
END PROGRAM test
We first build with CCE and execute. Files f1 and f2 both contain the compressed output form:
$ ftn -o test.x test.F90
$ ./test.x
$ cat f1
3*0
$ cat f2
3*0
Now we will use "assign" to modify the format in file f2. First we need to define a filename to hold the assign information:
$ export FILENV=my_filenenv
Now we use assign to switch off the compressed output for file f2:
$ assign -y on f:f2
Now we rerun the experiment (without needing to recompile):
$ ./test.x
$ cat f1
3*0
$ cat f2
0, 0, 0
There are options to do this for all files, for certain filename patterns or many other cases.
There are other things that assign can do. See "man assign" with PrgEnv-cray loaded for more details.
The write statement is using list directed formatting (it is still a formatted output statement - "formatted" means "formatted such that a human can read it")- as specified by the * inside the parenthesised part of the statement. The rules for list directed output give a great deal of freedom to the compiler. Typically, if you actually care about the details of the output, you should provide an explicit format.
One of the rules that does apply is that the resulting output should generally be suitable for list directed input. But there are some rather surprising rules for what is permitted as input for list directed formatting. One such feature is that you can specify in the input text a repeat count for an input values using the syntax repeat*value.
The compiler has noticed that there are repeat values in the output, so it has used this repeat count feature.
I don't know why you get an error message when reading the file under list directed input - as the line you show is a valid input line for list directed input. Make sure that the line causing the error is actually the line that you show.
A simple workaround solution would be to change the write statement so that it does not use the compressed format. e.g. change to:
write (lun,'(*(I5))') nsf_species(bundle%species(1:bundle%n_prim))
The '*' allows an arbitrary number of repeats of the specified format and should suppress the compressed output format.
However, if the compiler outputs in compressed format than it should be able to read back in in the same compressed format. Hopefully the helpdesk will be able to get to the root of why that does not work.

need guidance with basic function creation in MATLAB

I have to write a MATLAB function with the following description:
function counts = letterStatistics(filename, allowedChar, N)
This function is supposed to open a text file specified by filename and read its entire contents. The contents will be parsed such that any character that isn’t in allowedChar is removed. Finally it will return a count of all N-symbol combinations in the parsed text. This function should be stored in a file name “letterStatistics.m” and I made a list of some commands and things of how the function should be organized according to my professors' lecture notes:
Begin the function by setting the default value of N to 1 in case:
a. The user specifies a 0 or negative value of N.
b. The user doesn’t pass the argument N into the function, i.e., counts = letterStatistics(filename, allowedChar)
Using the fopen function, open the file filename for reading in text mode.
Using the function fscanf, read in all the contents of the opened file into a string variable.
I know there exists a MATLAB function to turn all letters in a string to lower case. Since my analysis will disregard case, I have to use this function on the string of text.
Parse this string variable as follows (use logical indexing or regular expressions – do not use for loops):
a. We want to remove all newline characters without this occurring:
e.g.
In my younger and more vulnerable years my father gave me some advice that I've been turning over in my mind ever since.
In my younger and more vulnerableyears my father gave me some advicethat I’ve been turning over in my mindever since.
Replace all newline characters (special character \n) with a single space: ' '.
b. We will treat hyphenated words as two separate words, hence do the same for hyphens '-'.
c. Remove any character that is not in allowedChar. Hint: use regexprep with an empty string '' as an argument for replace.
d. Any sequence of two or more blank spaces should be replaced by a single blank space.
Use the provided permsRep function, to create a matrix of all possible N-symbol combinations of the symbols in allowedChar.
Using the strfind function, count all the N-symbol combinations in the parsed text into an array counts. Do not loop through each character in your parsed text as you would in a C program.
Close the opened file using fclose.
HERE IS MY QUESTION: so as you can see i have made this list of what the function is, what it should do, and using which commands (fclose etc.). the trouble is that I'm aware that closing the file involves use of 'fclose' but other than that I'm not sure how to execute #8. Same goes for the whole function creation. I have a vague idea of how to create a function using what commands but I'm unable to produce the actual code.. how should I begin? Any guidance/hints would seriously be appreciated because I'm having programmers' block and am unable to start!
I think that you are new to matlab, so the documentation may be complicated. The root of the problem is the basic understanding of file I/O (input/output) I guess. So the thing is that when you open the file using fopen, matlab returns a pointer to that file, which is generally called a file ID. When you call fclose you want matlab to understand that you want to close that file. So what you have to do is to use fclose with the correct file ID.
fid = open('test.txt');
fprintf(fid,'This is a test.\n');
fclose(fid);
fid = 0; % Optional, this will make it clear that the file is not open,
% but it is not necessary since matlab will send a not open message anyway
Regarding the function creation the syntax is something like this:
function out = myFcn(x,y)
z = x*y;
fprintf('z=%.0f\n',z); % Print value of z in the command window
out = z>0;
This is a function that checks if two numbers are positive and returns true they are. If not it returns false. This may not be the best way to do this test, but it works as example I guess.
Please comment if this is not what you want to know.

Same for loop, giving out two different results using .write()

this is my first time asking a question so let me know if I am doing something wrong (post wise)
I am trying to create a function that writes into a .txt but i seem to get two very different results between calling it from within a module, and writing the same loop in the shell directly. The code is as follows:
def function(para1, para2): #para1 is a string that i am searching for within para2. para2 is a list of strings
with open("str" + para1 +".txt", 'a'. encoding = 'utf-8') as file:
#opens a file with certain naming convention
n = 0
for word in para2:
if word == para1:
file.write(para2[n-1]+'\n')
print(para2[n-1]) #intentionally included as part of debugging
n+=1
function("targetstr". targettext)
#target str is the phrase I am looking for, targettext is the tokenized text I am
#looking through. this is in the form of a list of strings, that is the output of
#another function, and has already been 'declared' as a variable
when I define this function in the shell, I get the correct words appearing. However, when i call this same function through a module(in the shell), nothing appears in the shell, and the text file shows a bunch of numbers (eg: 's93161), and no new lines.
I have even gone to the extent of including a print statement right after declaration of the function in the module, and commented everything but the print statement, and yet nothing appears in the shell when I call it. However, the numbers still appear in the text file.
I am guessing that there is a problem with how I have defined the parameters or how i cam inputting the parameters when I call the function.
As a reference, here is the desired output:
‘She
Ashley
there
Kitty
Coates
‘Let
let
that
PS: Sorry if this is not very clear as I have very limited knowledge on speaking python
I have found the solution to issue. Turns out that I need to close the shell and restart everything before the compiler recognizes the changes made to the function in the module. Thanks to those who took a look at the issue, and those who tried to help.

Reading all but last few lines of a data file

I can easily skip the header of a data file using getline, but then when I parse through the data file and get to the footer of the file, I end up stuck in a loop because the program is trying to parse columns of data that no longer exist. Is there an easy way to stop reading when there is no longer data in the line? It looks like there is a blank line followed by some footer information, but I cannot guarantee that all of my data files will look like that (i.e. I need something pretty generic).
Looking at your existing code (edit your question and put it there, not in a comment), I see you have nested loops. But what you really want is one loop with two reasons to exit.
while ((q < 16) && (liness >> temp)) { ... }
Read the line into a string, parse if only if you see \n at the end.

Resources