Opening text file and replacing two strings in Matlab

I need to replace two separate strings in a text file and then save the altered version as a new text file.
So far I have the following code:
fid = fopen('original_file.txt','rt') ;
X = fread(fid) ;
fclose(fid) ;
X = char(X.') ;
Y = strrep(X, 'results1.csv', 'results2.csv') ;
Z = strrep(X, 'plot1', 'plot2') ;
fid2 = fopen('new_file.txt','wt') ;
fwrite(fid2,Y) ;
fwrite(fid2,Z) ;
fclose (fid2) ;
The problem with this code is that it doubles the length of the text file: new_file.txt has twice as many lines as original_file.txt.
First the content is written with results1.csv changed to results2.csv, then the same content is appended with plot1 changed to plot2.
Can someone point out what I'm missing here?

The problem is that you create two variables, Y and Z, each derived from the original X, and write both of them to new_file.txt. To apply both replacements to the same text, call strrep twice and feed the result of the first call into the second:
fid = fopen('original_file.txt','rt') ;
X = fread(fid) ;
fclose(fid) ;
X = char(X.') ;
Y = strrep(X, 'results1.csv', 'results2.csv') ;
Z = strrep(Y, 'plot1', 'plot2') ; % replace the second string, after the first replacement
fid2 = fopen('new_file.txt','wt') ;
fwrite(fid2,Z) ; % write just Z, with both replacements
fclose (fid2) ;
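For comparison, the same chaining idea can be sketched in Python (not Matlab): each replacement is applied to the result of the previous one, so only one final text is produced and written. The helper name replace_all and the sample string are made up for illustration.

```python
def replace_all(text, replacements):
    """Apply each (old, new) pair to the result of the previous replacement."""
    for old, new in replacements:
        text = text.replace(old, new)
    return text


# Hypothetical file content, standing in for original_file.txt.
original = "data = load('results1.csv')\nsavefig('plot1')\n"
updated = replace_all(original, [("results1.csv", "results2.csv"), ("plot1", "plot2")])
print(updated)
```

Because every step works on the output of the previous step, the result has the same number of lines as the input.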

Related

bug in python target of antlr

The Python code generated by antlr-4.9 has some syntax problems. For example, for the following ANTLR grammar:
e returns [ObjExpr v]
: a=e op=('*'|'/') b=e {
$v = ObjExpr($op.type)
$v.e1 = $a.v
$v.e2 = $b.v
}
| INT {
$v = ObjExpr(21)
$v.i = $INT.int
}
;
MUL : '*' ;
DIV : '/' ;
INT : [0-9]+ ;
NEWLINE:'\r'? '\n' ;
WS : [ \t]+ -> skip ;
The code generated is:
localctx.v = ObjExpr((0 if localctx.op is None else localctx.op.type())
localctx.v.e1 = localctx.a.v
localctx.v.e2 = localctx.b.v
Whereas, the correct code should be:
localctx.v = ObjExpr((0 if localctx.op is None else localctx.op.type))
localctx.v.e1 = localctx.a.v
localctx.v.e2 = localctx.b.v
i.e., the code indentation is wrong and the parentheses don't match. Manually editing the generated parser file to fix these errors makes the code run properly. How do I report this bug and get it fixed?

Conventions for metadata attributes in netCDF for compound data types

NetCDF (at least in its version 4 format, which is based on HDF5) allows creating compound data types, which are very similar to a C struct. Each component has a label, a type, and a position in the compound type. For example, for a data set of statistics we could use the compound type defined by [('min', 'float'), ('max', 'float'), ('avg', 'float'), ('std', 'float')], which has as its second component a float labeled max.
Now, netCDF also allows for adding metadata. These typically follow conventions, such as the NetCDF Climate and Forecast (CF) Metadata Conventions. This is useful so that other users of the generated netCDF file can easily understand the metadata.
But I have not found conventions specifically dealing with compound data types, e.g., to give metadata specifically for one component of the compound data.
Are there such conventions?
If not (or in addition), what is being used in practice?
If this is not used, what do you advise and why? (I was thinking of using multi-line attributes, so separated by \n, with a component-specific label to start each line, such as [avg] or #avg.)
To stay within the CF conventions you could create a separate variable for each member of the compound type and use the ancillary_variables attribute to indicate that they are related:
netcdf test {
dimensions:
time = 3 ;
lat = 36 ;
lon = 36 ;
variables:
double time(time) ;
time:long_name = "Time" ;
time:standard_name = "time" ;
time:units = "Days since 1970-01-01 00:00" ;
time:calendar = "standard" ;
float ctp(time, lat, lon) ;
ctp:_FillValue = -999.f ;
ctp:long_name = "Cloud Top Pressure" ;
ctp:standard_name = "air_pressure_at_cloud_top" ;
ctp:units = "Pa" ;
ctp:cell_methods = "time: mean" ;
ctp:ancillary_variables = "ctp_std ctp_min ctp_max" ;
float ctp_std(time, lat, lon) ;
ctp_std:_FillValue = -999.f ;
ctp_std:long_name = "Cloud Top Pressure Standard Deviation" ;
ctp_std:units = "Pa" ;
ctp_std:cell_methods = "time: standard_deviation" ;
float ctp_min(time, lat, lon) ;
ctp_min:_FillValue = -999.f ;
ctp_min:long_name = "Cloud Top Pressure Minimum" ;
ctp_min:units = "Pa" ;
ctp_min:cell_methods = "time: minimum" ;
float ctp_max(time, lat, lon) ;
ctp_max:_FillValue = -999.f ;
ctp_max:long_name = "Cloud Top Pressure Maximum" ;
ctp_max:units = "Pa" ;
ctp_max:cell_methods = "time: maximum" ;
}
You could then add metadata as usual via the variables' attributes. For example, the cell_methods attribute could be used to describe the applied statistics.
If you want to stick to the compound datatype, there is a ticket about vector quantities which might be related (although it is quite old): https://cf-trac.llnl.gov/trac/ticket/79
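If you do keep the compound type and go with the multi-line attribute idea from the question, a reader of the file needs to split the attribute back into per-component entries. The following Python sketch shows one way to do that. The "[label] text" line format and the function name parse_component_attribute are assumptions taken from the question's proposal, not part of the CF conventions or of any netCDF library.

```python
def parse_component_attribute(text):
    """Split a multi-line attribute into {component_label: description}.

    Assumes each line looks like "[label] free text" (a hypothetical format).
    """
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("[") or "]" not in line:
            continue  # skip lines that do not start with a [label]
        label, _, description = line[1:].partition("]")
        result[label.strip()] = description.strip()
    return result


# Hypothetical attribute value for a compound variable with four components.
long_name = "[min] minimum over time\n[max] maximum over time\n[avg] mean over time\n[std] standard deviation over time"
print(parse_component_attribute(long_name))
```

Generic CF-aware tools would see such an attribute as a single opaque string, which is the main argument for the separate-variable layout shown above.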

reading integers from a string

I want to read a line from a file, initialize an array from that line and then display the integers.
Why is it not reading the five integers in the line? I want to get the output 1 2 3 4 5, but I get 1 1 1 1 1.
open Array;;
open Scanf;;
let print_ints file_name =
let file = open_in file_name in
let s = input_line(file) in
let n = ref 5 in
let arr = Array.init !n (fun i -> if i < !n then sscanf s "%d" (fun a -> a) else 0) in
let i = ref 0 in
while !i < !n do
print_int (Array.get arr !i);
print_string " ";
i := !i + 1;
done;;
print_ints "string_ints.txt";;
My file is just: 1 2 3 4 5
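The original code prints 1 1 1 1 1 because every call to sscanf s "%d" starts scanning at the beginning of s, so each array element receives the first integer. You might want to try the following approach instead. Split your string into a list of substrings representing numbers. This answer describes one way of doing so. Then use the resulting function in your print_ints function.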
let ints_of_string s =
List.map int_of_string (Str.split (Str.regexp " +") s)
let print_ints file_name =
let file = open_in file_name in
let s = input_line file in
let ints = ints_of_string s in
List.iter (fun i -> print_int i; print_char ' ') ints;
close_in file
let _ = print_ints "string_ints.txt"
When compiling, pass str.cma or str.cmxa as an argument (see this answer for details on compilation):
$ ocamlc str.cma print_ints.ml
Another alternative would be the Scanf.bscanf function; this question contains an example (use with caution).
The Scanf.sscanf function may not be particularly suitable for this task.
An excerpt from the OCaml manual:
the scanf facility is not intended for heavy duty lexical analysis and parsing. If it appears not expressive enough for your needs, several alternative exists: regular expressions (module Str), stream parsers, ocamllex-generated lexers, ocamlyacc-generated parsers
There is though a way to parse a string of ints using Scanf.sscanf (which I wouldn't recommend):
let rec int_list_of_string s =
try
Scanf.sscanf s
"%d %[0-9-+ ]"
(fun n rest_str -> n :: int_list_of_string rest_str)
with
| End_of_file | Scanf.Scan_failure _ -> []
The trick here is to split the input string s into a part that is parsed as an integer (%d) and the rest of the string, captured with the range format %[0-9-+ ], which matches the rest of the string as long as it contains only decimal digits 0-9, the - and + signs, and whitespace.
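The difference between rescanning from the start and consuming the line once can be illustrated with a small Python sketch (Python rather than OCaml, purely for illustration). The first list mimics the question's behaviour by matching at the beginning of the same string five times; the second splits the line on whitespace. The sample line is the file content given in the question.

```python
import re

line = "1 2 3 4 5"

# Mirrors the question's bug: every attempt rescans from the start of the line.
rescanned = [int(re.match(r"\s*([-+]?\d+)", line).group(1)) for _ in range(5)]

# Splitting on whitespace consumes the line once, left to right.
parsed = [int(token) for token in line.split()]

print(rescanned)  # [1, 1, 1, 1, 1]
print(parsed)     # [1, 2, 3, 4, 5]
```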

SAS simplify the contents of a variable

In SAS, I have a variable V containing the following value:
V=1996199619961996200120012001
I'd like to create these 2 variables:
V1=19962001 (= different modalities)
V2=42 (= the first modality appears 4 times and the second one appears 2 times)
Any ideas?
Thanks for your help.
Luc
For your first question (if I understand the pattern correctly), you could extract the first four characters and the last four characters:
a = substr(variable, 1, 4);
b = substrn(variable, max(1, length(variable)-3), 4);
You could then concatenate the two.
c = cats(a, b);
For the second, the COUNT function can be used to count occurrences of a string within a string:
http://support.sas.com/documentation/cdl/en/lefunctionsref/63354/HTML/default/viewer.htm#p02vuhb5ijuirbn1p7azkyianjd8.htm
Hope this helps :)
* Make it a bit more general;
%let modeLength = 4;
%let maxOccur = 100; ** in the input **;
%let maxModes = 10; ** in the output **;
* Where does a certain occurrence start?;
%macro occurStart(occurNo);
&modeLength.*&occurNo.-%eval(&modeLength.-1)
%mend;
* Read the input;
data simplified ;
infile datalines truncover;
input v $%eval(&modeLength.*&maxOccur.).;
* Declare output and work variables;
format what $&modeLength..
v1 $%eval(&modeLength.*&maxModes.).
v2 $&maxModes..;
array w {&maxModes.}; ** what **;
array c {&maxModes.}; ** count **;
* Discover unique modes and count them;
countW = 0;
do vNo = 1 to length(v)/&modeLength.;
what = substr(v, %occurStart(vNo), &modeLength.);
do wNo = 1 to countW;
if what eq w(wNo) then do;
c(wNo) = c(wNo) + 1;
goto foundIt;
end;
end;
countW = countW + 1;
w(countW) = what;
c(countW) = 1;
foundIt:
end;
* Report results in v1 and v2;
do wNo = 1 to countW;
substr(v1, %occurStart(wNo), &modeLength.) = w(wNo);
substr(v2, wNo, 1) = put(c(wNo),1.);
put _N_= v1= v2=;
end;
keep v1 v2;
* The data I tested with;
datalines;
1996199619961996200120012001
197019801990
20011996199619961996200120012001
;
run;
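The logic of the data step above (cut the value into fixed-width chunks, remember each distinct chunk in order of first appearance, count occurrences) can be sketched compactly in Python. This is an illustration of the idea, not a SAS replacement; the function name simplify and the mode_length parameter are made up here. Note that the sample value from the question contains 2001 three times, so the sketch yields 43 for it rather than 42.

```python
def simplify(v, mode_length=4):
    """Return (v1, v2): the distinct chunks in first-seen order, and their counts."""
    chunks = [v[i:i + mode_length] for i in range(0, len(v), mode_length)]
    counts = {}  # dicts keep insertion order, so first-seen order is preserved
    for chunk in chunks:
        counts[chunk] = counts.get(chunk, 0) + 1
    v1 = "".join(counts)
    v2 = "".join(str(n) for n in counts.values())
    return v1, v2


for value in ("1996199619961996200120012001",
              "197019801990",
              "20011996199619961996200120012001"):
    print(value, simplify(value))
```

As in the SAS version, V2 is only unambiguous while every count is a single digit.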

How to read a C generated binary file in Lua

I want to read a binary file of 32-bit integers produced by another program. The file contains only integers and no other characters (like spaces or commas). The C code to read this file is as follows:
FILE* pf = fopen("C:/rktemp/filename.dat", "r");
int sz = width*height;
int* vals = new int[sz];
int elread = fread((char*)vals, sizeof(int), sz, pf);
for( int j = 0; j < height; j++ )
{
for( int k = 0; k < width; k++ )
{
int i = j*width+k;
labels[i] = vals[i];
}
}
delete [] vals;
fclose(pf);
But I don't know how to read this file into an array using Lua.
I've tried to read this file using io.read, but part of the array looks like this:
~~~~~~xxxxxxxxyyyyyyyyyyyyyyzzzzzzzz{{{{{{{{{|||||||||}}}}}}}}}}}~~~~~~~~~xxxxxxxyyyyyyyyyyyyyyzzzzzz{{{{{{{{{{|||||||||}}}}}}}}}}}~~~~~~~~~xxyyyyyyyyyyyyyzzzzz{{{{{{|||}}}yyyyyyyyyyyz{{{yyyyyyyyÞľūơǿȵɶʢ˺̤̼ͽаҩӱľǿجٴȵɶʢܷݸ˺໻⼼ӱľǿ
Also the Matlab code to read this file is like this:
row = image_size(1);
colomn = image_size(2);
fid = fopen(data_path,'r');
A = fread(fid, row * colomn, 'uint32')';
A = A + 1;
B = reshape(A,[colomn, row]);
B = B';
fclose(fid);
I've tried a function to convert bytes to integer, my code is like this:
function bytes_to_int(b1, b2, b3, b4)
if not b4 then error("need four bytes to convert to int",2) end
local n = b1 + b2*256 + b3*65536 + b4*16777216
n = (n > 2147483647) and (n - 4294967296) or n
return n
end
local sup_filename = '1.dat'
fid = io.open(sup_filename, "r")
st = bytes_to_int(fid:read("*all"):byte(1,4))
print(st)
fid:close()
But it still does not read this file properly.
You are only calling bytes_to_int once. You need to call it for every int you want to read. e.g.
fid = io.open(sup_filename, "rb")
while true do
local bytes = fid:read(4)
if bytes == nil then break end -- EOF
local st = bytes_to_int(bytes:byte(1,4))
print(st)
end
fid:close()
Since Lua 5.3 you can also call string.unpack, which offers many conversion options in its format string. The following options may be useful:
< sets little endian
> sets big endian
= sets native endian
i[n] a signed int with n bytes (default is native size)
I[n] an unsigned int with n bytes (default is native size)
The architecture of your PC is unknown, so I assume the data to read is unsigned and native-endian.
Since you are reading binary data from the file, you should use io.open(sup_filename, "rb").
The following code may be useful:
local fid = io.open(sup_filename, "rb")
local contents = fid:read("a")
fid:close()
local now = 1 -- position of the next unread byte
while now + 3 <= #contents do
local n
n, now = string.unpack("=I4", contents, now) -- second result is the next position
print(n)
end
see also: Lua 5.4 manual
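The same decode-four-bytes-at-a-time idea can be cross-checked with a short Python sketch using the standard struct module, which is handy for verifying what the file actually contains. It assumes signed, little-endian 32-bit values (typical for a file written by a C program on x86); the function names are made up for illustration, and the byte_order argument can be changed to ">" for big-endian data.

```python
import struct


def decode_int32s(data, byte_order="<"):
    """Decode consecutive 4-byte signed integers from a bytes object."""
    usable = len(data) - len(data) % 4  # ignore a trailing partial value
    return [value for (value,) in struct.iter_unpack(byte_order + "i", data[:usable])]


def read_int32s(path, byte_order="<"):
    """Read a whole file in binary mode and decode it as 32-bit integers."""
    with open(path, "rb") as f:
        return decode_int32s(f.read(), byte_order)


# In-memory sample standing in for the contents of the .dat file.
sample = struct.pack("<4i", 1, -2, 300, 70000)
print(decode_int32s(sample))  # [1, -2, 300, 70000]
```

As in the Lua answers, the file is opened in binary mode so that no byte is translated or treated as end-of-file.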
