MATLAB: populate cell array and format number - string

What's an efficient way to populate a cell array with values from a vector, each formatted differently?
For example:
V = [12.3 .004 3 4.222];
Desired cell array:
array = {'A = 12.30' 'B = 0.004' 'C = 3' 'D = 4'};
Is there an easier way than calling sprintf for each and every cell?
array = {sprintf('A = %.2f',V(1)) sprintf('B = %.3f',V(2)) ... }

There's no vectorized form of sprintf that supports different formats, so you're mostly stuck with a sprintf call per cell. But you could arrange the code to be easier to deal with in a loop.
V = [12.3 .004 3 4.222];
names = num2cell('A':'Z');
formats = { '%.2f' '%.3f' '%d' '%.0f' };
c = cell(size(V));
for i = 1:numel(V)
c{i} = sprintf(['%s = ' formats{i}], names{i}, V(i));
end
It would be tricky to get anything faster than the naive way without dropping down to Java or C, because it's still going to take a sprintf() call for each cell, and that's going to dominate the execution time.
If you have a large number of elements and a relatively small number of formats, you could use unique() to group them into one sprintf() call per format, using the vectorized version of sprintf and then splitting on a delimiter to get individual strings. That may or may not be faster, depending on your exact data set and implementation.
Or you could write a MEX file that pushes the loop down into C, looping over a call to C's sprintf. That would be faster once you get up to moderately large input sizes.
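As a rough sketch of the unique() grouping idea above: this assumes each element carries its own format string (the per-element formats cell and the two extra values in V are made up for illustration). It issues one sprintf call per distinct format and then splits on a newline delimiter. Whether it beats the plain loop would need to be timed on real data.

```matlab
V = [12.3 0.004 3 4.222 7.5 0.125];
% Hypothetical: one format string per element of V
formats = {'%.2f', '%.3f', '%d', '%.0f', '%.2f', '%.3f'};

[ufmt, ~, grp] = unique(formats);   % grp(i) is the index into ufmt used by V(i)
grp = grp(:).';                     % force a row so it lines up with V
c = cell(size(V));
nl = sprintf('\n');                 % newline character used as the delimiter
for k = 1:numel(ufmt)
    sel = (grp == k);                              % elements sharing this format
    joined = sprintf([ufmt{k} nl], V(sel));        % one sprintf call per format
    c(sel) = regexp(joined(1:end-1), nl, 'split'); % drop trailing newline, split
end
```

The name prefixes ('A = ', 'B = ', ...) could then be attached in one step with strcat.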

It can be done as follows:
V = [12.3 .004 3 4.222];
names = {'A', 'B', 'C', 'D'};
array = strcat(names(:), ' = ', ...
strtrim(mat2cell(num2str(V(:), '%.4f'), ones(1,numel(V))))).';
How this works:
num2str gives a char matrix, using a format specifier (you may want to change the one I used).
mat2cell converts that into a cell array, putting each row into a cell.
strtrim removes spaces.
strcat concatenates cell-wise.

Related

MATLAB cell to string

I am trying to read an Excel sheet and then find cells that are not empty and have date information in them, by finding two '/' in a string,
but MATLAB keeps erroring on handling the cell type:
"Undefined operator '~=' for input arguments of type 'cell'."
"Undefined function 'string' for input arguments of type 'cell'."
"Undefined function 'char' for input arguments of type 'cell'."
MyFolderInfo = dir('C:\');
filename = 'Export.xls';
[num,txt,raw] = xlsread(filename,'A1:G200');
for i = 1:length(txt)
if ~isnan(raw(i,1))
if sum(ismember(char(raw(i,1)),'/')) == 2
A(i,1) = raw(i,1);
end
end
end
please help fixing it
There are multiple issues with your code. Since raw is a cell array, you can't run isnan on it; isnan is for numerical arrays. And since all you're interested in is cells with text in them, you don't need to use raw at all: any blank cells will not be present in txt.
My approach is to create a logical array, has_2_slashes, and then use it to extract the elements from raw that have two slashes in them.
Here is my code. I generalized it to read multiple columns since your original code only seemed to be written to handle one column.
filename = 'Export.xls';
[~, ~, raw] = xlsread(filename, 'A1:G200');
[num_rows, num_cols] = size(raw);
has_2_slashes = false(num_rows, num_cols);
for row = 1:num_rows
for col = 1:num_cols
has_2_slashes(row, col) = sum(ismember(raw{row, col}, '/')) == 2;
end
end
A = raw(has_2_slashes);
cellfun(@numel,strfind(txt,'/'))
should give you a numerical array where the (i,j)th element contains the number of slashes. For example,
>> cellfun(@numel,strfind({'a','b';'/','/abc/'},'/'))
ans =
0 0
1 2
The key here is to use strfind.
Now you may want to expand a bit in your question on what you intend to do next with txt -- in other words, specify desired output more, which is always a good thing to do. If you intend to read the dates, it may be better to just read it upfront, for example by using regexp or datetime as opposed to getting an array which can then map to where the dates are. As is, using ans>=2 next gives you the logical array that can let you extract the matched entries.
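A minimal sketch of how the two ideas above could be combined when the cell array mixes text and numbers, as raw from xlsread typically does. The raw cell below is a made-up stand-in for the spreadsheet contents, and the variable names are only for illustration.

```matlab
% Hypothetical stand-in for the raw output of xlsread (text, numbers, NaN)
raw = {'01/02/2020', NaN, 'abc'; 5, '3/4', '12/31/1999'};

isText = cellfun(@ischar, raw);            % only text cells can hold slashes
nSlashes = zeros(size(raw));
nSlashes(isText) = cellfun(@(s) sum(s == '/'), raw(isText));

has2 = (nSlashes == 2);                    % logical mask of date-like cells
A = raw(has2);                             % extracted in column-major order
```

Counting only in the cells where ischar is true avoids calling string functions on numeric or NaN entries.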

How to assign multiple lines to a string variable in Matlab

I have a few lines of text like this:
abc
def
ghi
and I want to assign these multiple lines to a Matlab variable for further processing.
I am copying these from a very large text file and want to process the text in MATLAB directly, instead of saving it into a file and then reading it line by line for processing.
I tried to handle the above text lines as single string but am getting an error whilst trying to assign to a variable:
x = 'abc
def
ghi'
Error:
x = 'abc
|
Error: String is not terminated properly.
Any suggestions which could help me understand and solve the issue will be highly appreciated.
I frequently do this, namely copy text from elsewhere which I want to hard-code into a MATLAB script (in my case it's generally SQL code I want to manipulate and call from MATLAB).
To achieve this I have a helper function in clipboard2cellstr.m defined as follows:
function clipboard2cellstr
str = clipboard('paste');
str = regexprep(str, '''', ''''''); % Double any single quotes
strs = regexp(str, '\s*\r?\n\r?', 'split');
cs = sprintf('{\n''%s''\n}', strjoin(strs, sprintf('''\n''')));
clipboard('copy', cs);
disp(cs)
disp('(Copied to Clipboard)')
end
I then copy the text using Ctrl-c (or however) and run clipboard2cellstr. This changes the contents of the clipboard to something I can paste into the MATLAB editor using Ctrl-v (or however).
For example, copying this line
and this line
and this one, and then running the function generates this:
{
'For example, copying this line'
'and this line'
'and this one, and then running the function generates this:'
}
which is valid MATLAB which can be pasted directly in.
Your error is because you ended the line when MATLAB was expecting a closing quote character. You must use array notation to have multi-line or multi-element arrays.
You can assign like this if you use array notation
x = ['abc'
'def'
'hij']
>> x = 3×3 char array
Note: with this method, your rows must have the same number of characters, as you are really dealing with a character array. You can think of a character array like a numeric matrix, hence why it must be "rectangular".
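If you have MATLAB R2016b or newer, you can use the string data type. String literals use double quotes "..." rather than single quotes '...' (the double-quote syntax itself requires R2017a or newer), and a string array can be written across multiple lines. You must still use array notation: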
x = ["abc"
"def"
"hijk"]
>> x = 3×1 string array
We can have different numbers of characters in each line, as this is simply a 3 element string array, not a character array.
Alternatively, use a cell array of character arrays (or strings)
x = {'abc'
'def'
'hijk'}
>> x = 3×1 cell array
Again, you can have character arrays or strings of different lengths within a cell array.
In all of the above examples, a newline is simply for readability and can be replaced by a semi-colon ; to denote the next line of the array.
The option you choose will depend on what you want to do with the text. If you're reading from a file, I would suggest the string array or the cell array, as they can deal with different length lines. For backwards compatibility, use a cell array. You may find cellfun relevant for operating on cell arrays. For native string operations, use a string array.
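If a single variable holding all the lines is preferred, a small sketch along these lines may help; it builds one char vector with embedded newlines and converts it to and from a cell array of lines. The variable names are only for illustration.

```matlab
% Build one char vector that contains embedded newline characters
x = sprintf('abc\ndef\nghi');

% Split it into a cell array of lines (also tolerates Windows \r\n endings)
lines = regexp(x, '\r?\n', 'split');

% Rejoin the lines into a single char vector
y = strjoin(lines, sprintf('\n'));
```

Here lines is a 1x3 cell array, so the rows may have different lengths, and y is identical to x.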

Replace multiple substrings using strrep in Matlab

I have a big string (around 25M characters) where I need to replace multiple substrings of a specific pattern in it.
Frame 1
0,0,0,0,0,1,2,34,0
0,1,2,3,34,12,3,4,0
...........
Frame 2
0,0,0,0,0,1,2,34,0
0,1,2,3,34,12,3,4,0
...........
Frame 7670
0,0,0,0,0,1,2,34,0
0,1,2,3,34,12,3,4,0
...........
The substring I need to remove is the 'Frame #' and it occurs around 7670 times. I can give multiple search strings in strrep, using a cell array
strrep(text,{'Frame 1','Frame 2',..,'Frame 7670'},';')
However that returns a cell array, where in each cell, I have the original string with the corresponding substring of one of my input cell changed.
Is there a way to replace multiple substrings from a string, other than using regexprep? I noticed that it is considerably slower than strrep, that's why I am trying to avoid it.
With regexprep it would be:
regexprep(text,'Frame \d*',';')
and for a string of 25MB it takes around 47 seconds to replace all the instances.
EDIT 1: added the equivalent regexprep command
EDIT 2: added size of the string for reference, number of occurrences for the substring and timing of execution for the regexprep
Ok, in the end I found a way to go around the problem. Instead of using regexprep to change the substring, I remove the 'Frame ' substring (including whitespace, but not the number)
rawData = strrep(text,'Frame ','');
This results in something like this:
1
0,0,0,0,0,1,2,34,0
0,1,2,3,34,12,3,4,0
...........
2
0,0,0,0,0,1,2,34,0
0,1,2,3,34,12,3,4,0
...........
7670
0,0,0,0,0,1,2,34,0
0,1,2,3,34,12,3,4,0
...........
Then, I change all the commas (,) and newline characters (\n) into a semicolon (;), using again strrep, and I create a big vector with all the numbers
rawData = strrep(rawData,sprintf('\r\n'),';');
rawData = strrep(rawData,';;',';');
rawData = strrep(rawData,';;',';');
rawData = strrep(rawData,',',';');
rawData = textscan(rawData,'%f','Delimiter',';');
then I remove the unnecessary numbers (1,2,...,7670), since they are located at a specific point in the array (each frame contains a specific amount of numbers).
rawData{1}(firstInstance:spacing:lastInstance)=[];
And then I go on with my manipulations. It seems that the additional strrep calls and the removal of the values from the array are much, much faster than the equivalent regexprep. With a string of 25M chars, regexprep does the whole operation in about 47 seconds, while this workaround takes only 5 seconds!
Hope this helps somehow.
I think that this can be done using only textscan, which is known to be very fast. By specifying a 'CommentStyle', the 'Frame #' lines are stripped out. This may only work because these 'Frame #' lines are on their own lines. This code returns the raw data as one big vector:
s = textscan(text,'%f','CommentStyle','Frame','Delimiter',',');
s = s{:}
You may want to know how many elements are in each frame or even reshape the data into a matrix. You can use textscan again (or before the above) to get just the data for the first frame:
f1 = textscan(text,'%f','CommentStyle','Frame 1','Delimiter',',');
f1 = f1{:}
In fact, if you just want the elements from the first line, you can use this:
l1 = textscan(text,'%f,','CommentStyle','Frame 1')
l1 = l1{:}
However, the other nice thing about textscan is that you can use it to read in the file directly (it looks like you may be using some other means currently) using just fopen to get an FID. Thus the string data text doesn't have to be in memory.
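A small sketch of the reshaping step mentioned above, on a made-up two-frame input. It assumes every frame holds the same number of values (6 here); txt, perFrame and frames are hypothetical names used only for this illustration.

```matlab
% Hypothetical miniature input: two frames of 2 rows x 3 values each
txt = sprintf('Frame 1\n0,0,1\n2,3,4\nFrame 2\n5,6,7\n8,9,10');

% Treat everything from 'Frame' to the end of the line as a comment
s = textscan(txt, '%f', 'CommentStyle', 'Frame', 'Delimiter', ',');
s = s{1};

perFrame = 6;                      % assumed number of values per frame
frames = reshape(s, perFrame, []); % one column per frame
```

With a real file, numel(f1) from the single-frame textscan call could supply perFrame instead of a hard-coded value.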
Using regular expressions:
result = regexprep(text,'Frame [0-9]+','');
It's possible to avoid regular expressions as follows. I use strrep with suitable replacement strings that act as masks. The obtained strings are equal-length and are assured to be aligned, and can thus be combined into the final result using the masks. I've also included the ; you want. I don't know if it will be faster than regexprep or not, but it's definitely more fun :-)
% Data
text = 'Hello Frame 1 test string Frame 22 end of Frame 2 this'; %//example text
rep_orig = {'Frame 1','Frame 2','Frame 22'}; %//strings to be replaced.
%//May be of different lengths
% Computations
rep_dest = cellfun(@(s) char(zeros(1,length(s))), rep_orig, 'uni', false);
%//series of char(0) of same length as strings to be replaced (to be used as mask)
aux = cell2mat(strrep(text,rep_orig.',rep_dest.'));
ind_keep = all(double(aux)); %//keep characters according to mask
ind_semicolon = diff(ind_keep)==1; %//where to insert ';'
ind_keep = ind_keep | [ind_semicolon 0]; %// semicolons will also be kept
result = aux(1,:); %//for now
result(ind_semicolon) = ';'; %//include `;`
result = result(ind_keep); %//remove unwanted characters
With these example data:
>> text
text =
Hello Frame 1 test string Frame 22 end of Frame 2 this
>> result
result =
Hello ; test string ; end of ; this

Matlab: How do I compare two string arrays and then select out the number values associated with those strings?

I have one array of strings that I want to use to pull out samples from a larger matrix of data. Right now I have the one array of strings, 'sites', which is 1200x1. My actual data consists of 'names' (a 6855x1 string array that denotes what the values correspond to) and 'data', which is 6855x2.
This is what I came up with:
C = intersect(names,sites) %To find common strings
%To find where these strings are in my original dataset:
Q=zeros(length(C),1)
for i=1:length(C)
for j=1
while strcmp(C(i),names(j))==0
j=j+1
Q(i)=j
end
end
end
%To then use the above values to compile a new vector with the actual data values from 'data':
A=zeros(length(Q),1)
for i=1:length(Q)
A(i) = mock(Q(i),1)
end
The only problem is I am running the second set of loops I listed right now, and it is obvious that it will take several hours. I think there must be a quicker way without setting up three loops. Does anyone know a better method?
The first thing to note is that your loop over Q can be trivially accelerated, as:
A = mock(Q,1);
although I suspect that you meant data(Q,2).
If you store your name list in a cell array rather than a regular array, you should be able to accelerate things further. Assume the names are in a cell array, names{1:6855}, and the values are in a numeric list, numbers(1:6855).
A = zeros(length(C),1);
for i1=1:length(C)
A(i1)=numbers(strcmp(C(i1),names));
end
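As a possible loop-free alternative (not part of the answer above), the second output of ismember gives the position of each match directly. The tiny names, data and sites arrays below are made up to keep the sketch self-contained.

```matlab
% Hypothetical miniature versions of the arrays in the question
names = {'siteA'; 'siteB'; 'siteC'; 'siteD'};   % labels, one per data row
data  = [1 10; 2 20; 3 30; 4 40];               % values, same row order as names
sites = {'siteC'; 'siteX'; 'siteA'};            % labels to look up

% found(i) is true if sites{i} occurs in names; loc(i) is its row there (0 if absent)
[found, loc] = ismember(sites, names);

A = data(loc(found), :);   % rows of data for the sites that were found
```

Unlike intersect, this keeps the results in the order of sites rather than sorted order.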

How do I put variable values into a text string in MATLAB?

I'm trying to write a simple function that takes two inputs, x and y, and passes these to three other simple functions that add, multiply, and divide them. The main function should then display the results as a string containing x, y, and the totals.
I think there's something I'm not understanding about output arguments. Anyway, here's my (pitiful) code:
function a=addxy(x,y)
a=x+y;
function b=mxy(x,y)
b=x*y;
function c=dxy(x,y)
c=x/y;
The main function is:
function [d e f]=answer(x,y)
d=addxy(x,y);
e=mxy(x,y);
f=dxy(x,y);
z=[d e f]
How do I get the values for x, y, d, e, and f into a string? I tried different matrices and stuff like:
['the sum of' x 'and' y 'is' d]
but none of the variables are showing up.
Two additional issues:
Why is the function returning "ans 3" even though I didn't ask for the length of z?
If anyone could recommend a good book for beginners to MATLAB scripting I'd really appreciate it.
Here's how you convert numbers to strings, and join strings to other things (it's weird):
>> ['the number is ' num2str(15) '.']
ans =
the number is 15.
You can use fprintf/sprintf with familiar C syntax. Maybe something like:
fprintf('x = %d, y = %d \n x+y=%d \n x*y=%d \n x/y=%f\n', x,y,d,e,f)
Reading your comment, this is how you use your functions from the main program:
x = 2;
y = 2;
[d e f] = answer(x,y);
fprintf('%d + %d = %d\n', x,y,d)
fprintf('%d * %d = %d\n', x,y,e)
fprintf('%d / %d = %f\n', x,y,f)
Also for the answer() function, you can assign the output values to a vector instead of three distinct variables:
function result=answer(x,y)
result(1)=addxy(x,y);
result(2)=mxy(x,y);
result(3)=dxy(x,y);
and call it simply as:
out = answer(x,y);
As Peter and Amro illustrate, you have to convert numeric values to formatted strings first in order to display them or concatenate them with other character strings. You can do this using the functions FPRINTF, SPRINTF, NUM2STR, and INT2STR.
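For instance, a short sketch using sprintf to store the formatted text in a variable rather than printing it. The values of x and y and the name msg are arbitrary choices for illustration, and the three results are computed inline instead of through the helper functions.

```matlab
x = 6;
y = 3;
d = x + y;   % sum
e = x * y;   % product
f = x / y;   % quotient

% sprintf returns the formatted text instead of printing it
msg = sprintf('The sum of %d and %d is %d, the product is %d, and the quotient is %g.', ...
              x, y, d, e, f);
disp(msg)
```

The %g specifier is used for the quotient because it may not be an integer.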
With respect to getting ans = 3 as an output, it is probably because you are not assigning the output from answer to a variable. If you want to get all of the output values, you will have to call answer in the following way:
[out1,out2,out3] = answer(1,2);
This will place the value d in out1, the value e in out2, and the value f in out3. When you do the following:
answer(1,2)
MATLAB will automatically assign the first output d (which has the value 3 in this case) to the default workspace variable ans.
With respect to suggesting a good resource for learning MATLAB, you shouldn't underestimate the value of the MATLAB documentation. I've learned most of what I know on my own using it. You can access it online, or within your copy of MATLAB using the functions DOC, HELP, or HELPWIN.
I just realized why I was having so much trouble - in MATLAB you can't store strings of different lengths as an array using square brackets. Using square brackets concatenates strings of varying lengths into a single character array.
>> a=['matlab','is','fun']
a =
matlabisfun
>> size(a)
ans =
1 11
In a character array, each character in a string counts as one element, which explains why the size of a is 1X11.
To store strings of varying lengths as elements of an array, you need to use curly braces to save as a cell array. In cell arrays, each string is treated as a separate element, regardless of length.
>> a={'matlab','is','fun'}
a =
'matlab' 'is' 'fun'
>> size(a)
ans =
1 3
I was looking for something along what you wanted, but wanted to put it back into a variable.
So this is what I did
variable = ['hello this is x ' num2str(x) ', this is now y ' num2str(y) ', finally this is d: ' num2str(d)]
basically
variable = [str1 str2 str3 str4 str5 str6]

Resources