Separate chars of a file in matlab - string

I have strings of 32 chars in a file (multiple lines).
What I want to do is to make a new file and put them there by making columns of 4 chars each.
For example I have:
00000000000FDAD000DFD00ASD00
00000000000FDAD000DFD00ASD00
00000000000FDAD000DFD00ASD00
....
and in the new file, I want them to appear like this:
0000 0000 000F DAD0 00DF D00A SD00
0000 0000 000F DAD0 00DF D00A SD00
Can you anybody help me? I am working for hours now and I can't find the solution.

First, open the input file and read the lines as strings:
infid = fopen(infilename, 'r');
C = textscan(infid, '%s', 'delimiter', '');
fclose(infid);
Then use regexprep to split the string into space-delimited groups of 4 characters:
C = regexprep(C{:}, '(.{4})(?!$)', '$1 ');
Lastly, write the modified lines to the output file:
outfid = fopen(outfilename, 'w');
fprintf(outfid, '%s\n', C{:});
fclose(outfid);
Note that this solution is robust enough to work on lines of variable length.

Import
fid = fopen('test.txt');
txt = textscan(fid,'%s');
fclose(fid);
Transform into a M by 28 char array, transpose and reshape to have a 4 char block on each column. Then add to the bottom a row of blanks and reshape back. Store each line in a cell.
txt = reshape(char(txt{:})',4,[]);
txt = cellstr(reshape([txt; repmat(' ',1,size(txt,2))],35,[])')
Write each cell/line to new file
fid = fopen('test2.txt','w');
fprintf(fid,'%s\r\n',txt{:});
fclose(fid);

Here's one way to do it in Matlab:
% read in file
fid = fopen('data10.txt');
data = textscan(fid,'%s');
fclose(fid);
% save new file
s = size(data{1});
newFid = fopen('newFile.txt','wt');
for t = 1:s(1) % format and save each row
line = data{1}{t};
newLine = '';
index = 1;
for k = 1:7 % seven sets of 4 characters
count = 0;
while count < 4
newLine(end + 1) = line(index);
index = index + 1;
count = count + 1;
end
newLine(end + 1) = ' ';
end
fprintf(newFid, '%s\n', newLine);
end
fclose(newFid);

Related

MATLAB: How load a list of filenames from .txt file

filelist.txt contains a list of files:
/path/file1.json
/path/file2.json
/path/fileN.json
Is there a (simple) MATLAB command that will accept filelist.txt and read each file as a string and store each string into a cell array?
Just use readtable, asking it to read each line in full.
>> tbl = readtable('filelist.txt','ReadVariableNames',false,'Delimiter','\n');
>> tbl.Properties.VariableNames = {'filenames'}
tbl =
3×1 table
filenames
__________________
'/path/file1.json'
'/path/file2.json'
'/path/fileN.json'
Then access the elements in a loop
for idx = 1:height(tbl)
this_filename = tbl.filenames{idx};
end
This problem is a bit to specific for a standard function. However, it is easily doable with the combination of two functions:
First, you have to open the file:
fid = fopen('filelist.txt');
Next you can read line by line with:
line_ex = fgetl(fid)
This function includes a counter. If you call the function the next time, it will read the second line and so on. You find more information here.
The whole code might look like this:
% Open file
fid = fopen('testabc');
numberOfLines = 3;
% Preallocate cell array
line = cell(numberOfLines, 1);
% Read one line after the other and save it in a cell array
for i = 1:numberOfLines
line{i} = fgetl(fid);
end
% Close file
fclose(fid);
For this replace the for loop with a while loop:
i=0;
while ~feof(fid)
i=i+1
line{1} = fgetl(fid)
end
Alternative to while loop: Retrieve the number of lines and use in Caduceus' for-loop:
% Open file
fid = fopen('testabc');
numberOfLines = numlinestextfile('testable'); % function defined below
% Preallocate cell array
line = cell(numberOfLines, 1);
% Read one line after the other and save it in a cell array
for i = 1:numberOfLines
line{i} = fgetl(fid);
end
% Close file
fclose(fid);
Custom function:
function [lineCount] = numlinestextfile(filename)
%numlinestextfile: returns line-count of filename
% Detailed explanation goes here
if (~ispc) % Tested on OSX
evalstring = ['wc -l ', filename];
% [status, cmdout]= system('wc -l filenameOfInterest.txt');
[status, cmdout]= system(evalstring);
if(status~=1)
scanCell = textscan(cmdout,'%u %s');
lineCount = scanCell{1};
else
fprintf(1,'Failed to find line count of %s\n',filenameOfInterest.txt);
lineCount = -1;
end
else
if (~ispc) % For Windows-based systems
[status, cmdout] = system(['find /c /v "" ', filename]);
if(status~=1)
scanCell = textscan(cmdout,'%s %s %u');
lineCount = scanCell{3};
disp(['Found ', num2str(lineCount), ' lines in the file']);
else
disp('Unable to determine number of lines in the file');
end
end
end

MATLAB string 2 number of table

I have a 3-year data in a string tableformat.txt. Three of its lines are given below:
12-13 Jan -10.5
14-15 Jan -9.992
15-16 Jan -8
How to change the 3rd column (-10.5, -9.992 and -8) of string to be (-10.500, -9.992 and -8.000) of number?
I have made the following script:
clear all; clc;
filename='tableformat.txt';
fid = fopen(filename);
N = 3;
for i = [1:N]
line = fgetl(fid)
a = line(10:12);
na = str2num(a);
ma(i) = na;
end
ma
which gives:
ma = -1 -9 -8
When I did this change: a = line(10:15);, I got:
Error message: Index exceeds matrix dimensions.
This will work for you.
clear all;
clc;
filename='tableformat.txt';
filename2='tableformat2.txt';
fid = fopen(filename);
fid2 = fopen(filename2,'w');
formatSpec = '%s %s %6.4f\n';
N = 3;
for row = [1:N]
line = fgetl(fid);
a = strsplit(line,' ');
a{3}=cellfun(#str2num,a(3));
fprintf(fid2, formatSpec,a{1,:});
end
fclose(fid);
fclose(fid2);

Replace a string in a text file

I'd like to replace a string in a text file in MATLAB.
To read the specified line I used the code :
fid = fopen(file_name, 'r');
tt = textscan(fid, '%s', 1, 'delimiter', '\n', 'headerlines', i);
ttt = str2num(tt{1}{1});
where file_name is the name of my file and ttt is a cell array that contains the i-th string converted in integer.
For example, ttt = 1 2 3 4 5 6 7 8
Now, I'd like to change ttt with ttt = 0 0 0 0 0 0 0 0
and write the new ttt at the i-th line in the file.
Does anybody have an idea to handle this?
A possible implementation is
fid = fopen('input.txt', 'r+');
tt = textscan(fid, '%s', 1, 'delimiter', '\n', 'headerlines', 1); % scan the second line
tt = tt{1}{1};
ttt = str2num(tt); %parse all the integers from the string
ttt = ttt*0; %set all integers to zero
fseek(fid, length(tt), 'bof'); %go back to the start of the parsed line
format = repmat('%d ', 1, length(ttt));
format = format(1:end-1); %remove the trailing space
fprintf(fid, format, ttt); %overwrite the text in the file
fclose(fid); %close the file
This will change the input.txt from:
your first line
1 2 3 4 5 6 7 8
5 5 5 5 5 5 5 5
into
your first line
0 0 0 0 0 0 0 0
5 5 5 5 5 5 5 5
Note that it is only possible to overwrite existing characters in the file. If you want to insert new characters, you have two rewrite the full file or at least from the position where you want to insert your new characters.
Or use something like this ...
fid = fopen('input_file.txt', 'r');
f = fread(fid, '*char')';
fclose(fid);
f = strrep(f, ' ', ''); % to remove
f = strrep(f, ' ', ' .'); % to replace
% save into new txt file
fid = fopen('output_file.txt', 'w');
fprintf(fid, '%s', f);
fclose(fid);

SAS simplify the contents of a variable

In SAS, I've a variable V containing the following value
V=1996199619961996200120012001
I'ld like to create these 2 variables
V1=19962001 (= different modalities)
V2=42 (= the first modality appears 4 times and the second one appears 2 times)
Any idea ?
Thanks for your help.
Luc
For your first question (if I understand the pattern correctly), you could extract the first four characters and the last four characters:
a = substr(variable, 1,4)
b = substrn(variable,max(1,length(variable)-3),4);
You could then concatenate the two.
c = cats(a,b)
For the second, the COUNT function can be used to count occurrences of a string within a string:
http://support.sas.com/documentation/cdl/en/lefunctionsref/63354/HTML/default/viewer.htm#p02vuhb5ijuirbn1p7azkyianjd8.htm
Hope this helps :)
Make it a bit more general;
%let modeLength = 4;
%let maxOccur = 100; ** in the input **;
%let maxModes = 10; ** in the output **;
Where does a certain occurrence start?;
%macro occurStart(occurNo);
&modeLength.*&occurNo.-%eval(&modeLength.-1)
%mend;
Read the input;
data simplified ;
infile datalines truncover;
input v $%eval(&modeLength.*&maxOccur.).;
Declare output and work variables;
format what $&modeLength..
v1 $%eval(&modeLength.*&maxModes.).
v2 $&maxModes..;
array w {&maxModes.}; ** what **;
array c {&maxModes.}; ** count **;
Discover unique modes and count them;
countW = 0;
do vNo = 1 to length(v)/&modeLength.;
what = substr(v, %occurStart(vNo), &modeLength.);
do wNo = 1 to countW;
if what eq w(wNo) then do;
c(wNo) = c(wNo) + 1;
goto foundIt;
end;
end;
countW = countW + 1;
w(countW) = what;
c(countW) = 1;
foundIt:
end;
Report results in v1 and v2;
do wNo = 1 to countW;
substr(v1, %occurStart(wNo), &modeLength.) = w(wNo);
substr(v2, wNo, 1) = put(c(wNo),1.);
put _N_= v1= v2=;
end;
keep v1 v2;
The data I testes with;
datalines;
1996199619961996200120012001
197019801990
20011996199619961996200120012001
;
run;

Search text file in MATLAB for a string to begin process

I have a fairly large text file and am trying to search for a particular term so that i can start a process after that point, but this doesn't seem to be working for me:
fileID = fopen(resfile,'r');
line = 0;
while 1
tline = fgetl(fileID);
line = line + 1;
if ischar(tline)
startRow = strfind(tline, 'OptimetricsResult');
if isfinite(startRow) == 1;
break
end
end
end
The answer I get is 9, but my text file:
$begin '$base_index$'
$begin 'properties'
all_levels=000000000000
time(year=000000002013, month=000000000006, day=000000000020, hour=000000000008, min=000000000033, sec=000000000033)
version=000000000000
$end 'properties'
$begin '$base_index$'
$index$(pos=000000492036, lin=000000009689, lvl=000000000000)
$end '$base_index$'
definitely doesn't have that in the first 9 rows?
If I ctrl+F the file, I know that OptimetricsResult only appears once, and that it's 6792 lines down
Any suggestions?
Thanks
I think your script somehow works, and you were just looking at the wrong variable. I assume that the answer you get is startRow = 9 and not line = 9. Check the variable line. By the way, note that you're not checking an End-of-File, so your while loop might run indefinitely the file doesn't contain your search string.
An alternative approach, (which is much simpler in my humble opinion) would be reading all lines at once (each one stored as a separate string) with textscan, and then applying regexp or strfind:
%// Read lines from input file
fid = fopen(filename, 'r');
C = textscan(fid, '%s', 'Delimiter', '\n');
fclose(fid);
%// Search a specific string and find all rows containing matches
C = strfind(C{1}, 'OptimetricsResult');
rows = find(~cellfun('isempty', C));
I can't reproduce your problem.
Are you sure you've properly closed the file before re-running this script? If not, the internal line counter in fgetl does not get reset, so you get false results. Just issue a fclose all on the MATLAB command prompt, and add a fclose(fileID); after the loop, and test again.
In any case, I suggest modifying your infinite-loop (with all sorts of pitfalls) to the following finite loop:
haystack = fopen(resfile,'r');
needle = 'OptimetricsResult';
line = 0;
found = false;
while ~feof(haystack)
tline = fgetl(haystack);
line = line + 1;
if ischar(tline) && ~isempty(strfind(tline, needle))
found = true;
break;
end
end
if ~found
line = NaN; end
fclose(fileID);
line
You could of course also leave the searching to more specialized tools, which come free with most operating systems:
haystack = 'resfile.txt';
needle = 'OptimetricsResult';
if ispc % Windows
[~,lines] = system(['find /n "' needle '" ' haystack]);
elseif isunix % Mac, Linux
[~,lines] = system(['grep -n "' needle '" ' haystack]);
else
error('Unknown operating system!');
end
You'd have to do a bit more parsing to extract the line number from C, but I trust this will be no issue.

Resources