meshgrid equivalent for strings - string

I have two cells:
Months1 = {'F','G','H','J','K','M','N','Q','U','V','X','Z'};
Months2 = 2009:2014;
How do I generate all combinations without running a loop so that I achieve the following:
Combined = {'F09','F10','F11','',...,'G09',.....};
Basically all combinations of Months1 and Months2 as in meshgrid.

If you don't need cells and can use char arrays only, this can work:
Months1 = ['F','G','H','J','K','M','N','Q','U','V','X','Z']';
Months2 = num2str((2009:2014)');
[x, y] = meshgrid(1:12, 1:6);
Combined = strcat(Months1(x(:)), Months2(y(:),:));
and you can then reshape if required. I'm not yet sure how to do this with cells, though.
Inspired by this post.

My take on the problem would apply ndgrid, datestr (to handle any millennium) and strcat to do the work:
yearStrings = datestr(datenum(num2str(Months2(:)),'yyyy'),'yy');
[ii,jj] = ndgrid(1:numel(Months2),1:numel(Months1));
Combined = strcat(Months1(jj(:)).',yearStrings(ii(:),:)).'
Note: Years change faster than the prefixed letters, so Months2 goes first in ndgrid, then Months1. IMO, this is more intuitive behavior than meshgrid, which forces you to think in x,y space to predict how the outputs vary.
Or instead of the strcat line:
tmp = [Months1(jj(:)).',yearStrings(ii(:),:)].';
Combined = cellstr(reshape([tmp{:}],[],numel(ii)).').'

You can convert cell array to indices with grp2idx, then use meshgrid, then strcat to combine strings. Before you also need to convert numeric Months2 vector to cell array of strings.
[id1,id2] = meshgrid(grp2idx(Months1),Months2);
Months2cell = cellstr(num2str(id2(:)-2000,'%02d'))';
Combined = strcat( Months1(id1(:)), Months2cell );

Related

Finding indexes of strings in a string array in Matlab

I have two string arrays and I want to find where each string from the first array is in the second array, so i tried this:
for i = 1:length(array1);
cmp(i) = strfind(array2,array1(i,:));
end
This doesn't seem to work and I get an error: "must be one row".
Just for the sake of completeness, an array of strings is nothing but a char matrix. This can be quite restrictive because all of your strings must have the same number of elements. And that's what #neerad29 solution is all about.
However, instead of an array of strings you might want to consider a cell array of strings, in which every string can be arbitrarily long. I will report the very same #neerad29 solution, but with cell arrays. The code will also look a little bit smarter:
a = {'abcd'; 'efgh'; 'ijkl'};
b = {'efgh'; 'abcd'; 'ijkl'};
pos=[];
for i=1:size(a,1)
AreStringFound=cellfun(#(x) strcmp(x,a(i,:)),b);
pos=[pos find(AreStringFound)];
end
But some additional words might be needed:
pos will contain the indices, 2 1 3 in our case, just like #neerad29 's solution
cellfun() is a function which applies a given function, the strcmp() in our case, to every cell of a given cell array. x will be the generic cell from array b which will be compared with a(i,:)
the cellfun() returns a boolean array (AreStringFound) with true in position j if a(i,:) is found in the j-th cell of b and the find() will indeed return the value of j, our proper index. This code is more robust and works also if a given string is found in more than one position in b.
strfind won't work, because it is used to find a string within another string, not within an array of strings. So, how about this:
a = ['abcd'; 'efgh'; 'ijkl'];
b = ['efgh'; 'abcd'; 'ijkl'];
cmp = zeros(1, size(a, 1));
for i = 1:size(a, 1)
for j = 1:size(b, 1)
if strcmp(a(i, :), b(j, :))
cmp(i) = j;
break;
end
end
end
cmp =
2 1 3

Compare two arrays of strings

I have two lists of strings as a column in a table (PM25_spr{i}.MonitorID and O3_spr{i}.MonitorID). The lists are of different lengths. I want to compare the first 11 characters of each entry and pull out the index for each list where they are the same.
Example
List 1:
'01-003-0010-44201'
'01-027-0001-44201'
'01-051-0001-44201'
'01-073-0023-44201'
'01-073-1003-44201'
'01-073-1005-44201'
'01-073-1009-44201'
'01-073-1010-44201'
'01-073-2006-44201'
'01-073-5002-44201'
'01-073-5003-44201'
'01-073-6002-44201'
List 2:
'01-073-0023-88101'
'01-073-2003-88101'
'04-013-0019-88101'
'04-013-9992-88101'
'04-013-9997-88101'
'05-119-0007-88101'
'05-119-1008-88101'
'06-019-0008-88101'
'06-029-0014-88101'
'06-037-0002-88101'
'06-037-1103-88101'
'06-037-4002-88101'
'06-059-0001-88101'
'06-065-8001-88101'
'06-067-0010-88101'
'06-073-0003-88101'
'06-073-1002-88101'
'06-073-1007-88101'
'08-001-0006-88101'
'08-031-0002-88101'
I tried intersect, which isn't the right approach for what I want to do. I'm not sure how to use ismember given that I only want to look at the first 11 characters.
I tried strncmp, but Inputs must be the same size or either one can be a scalar.
chars2compare = length('18-097-0083');
strncmp(O3_spr{i}.MonitorID, PM25_spr{i}.MonitorID,chars2compare)
PM25_spr_MID = cell(length(years),1); % Preallocate cell array
for n = 1:length(PM25_spr{i}.MonitorID)
s = char(PM25_spr{i}.MonitorID(n)); % Convert string to char
PM25_spr_MID{i}(n) = cellstr(s(1:11)); % Pull out 1-11 characters and convert to cell
end
O3_spr_MID = cell(length(years),1); % Preallocate cell array
for n = 1:length(O3_spr{i}.MonitorID)
s = char(O3_spr{i}.MonitorID(n));
O3_spr_MID{i}(n) = cellstr(s(1:11));
end
[C, ia, ib] = intersect(O3_spr_MID{i}, PM25_spr_MID{i})
PerCap_spr_O3{i} = O3_spr{i}(ia,:);
PerCap_spr_PM25{i} = PM25_spr{i}(ib,:);
Assuming list1 and list2 to be the two input cell arrays, you can use few approaches.
I. Operate on cell arrays
With intersect -
%// Clip off after first 11 characters in each cell of the input cell arrays
list1_f11 = arrayfun(#(n) list1{n}(1:11),1:numel(list1),'uni',0)
list2_f11 = arrayfun(#(n) list2{n}(1:11),1:numel(list2),'uni',0)
%// Use intersect to find common indices in the input cell arrays
[~,idx_list1,idx_list2] = intersect(list1_f11,list2_f11)
With ismember -
%// Clip off after first 11 characters in each cell of the input cell arrays
list1_f11 = arrayfun(#(n) list1{n}(1:11),1:numel(list1),'uni',0)
list2_f11 = arrayfun(#(n) list2{n}(1:11),1:numel(list2),'uni',0)
%// Use ismember to find common indices in the input cell arrays
[LocA,LocB] = ismember(list1_f11,list2_f11);
idx_list1 = find(LocA)
idx_list2 = LocB(LocA)
II. Operate on char arrays
We can use char dierctly on the input cell arrays to get 2D char arrays as working with them could be faster than working withcells.
With intersect + 'rows' -
%// Convert to char arrays
list1c = char(list1)
list2c = char(list2)
%// Clip char arrays after first 11 columns
list1c_f11 = list1c(:,1:11)
list2c_f11 = list2c(:,1:11)
%// Use intersect with 'rows' option
[~,idx_list1,idx_list2] = intersect(list1c_f11,list2c_f11,'rows')
III. Operate on numeric arrays
We can convert the char arrays further to numeric arrays with just one column as that could lead to faster solutions.
%// Convert to char arrays
list1c = char(list1)
list2c = char(list2)
%// Clip char arrays after first 11 columns
list1c_f11 = list1c(:,1:11)
list2c_f11 = list2c(:,1:11)
%// Remove char columns of hyphens (3 and 7 for the given input)
list1c_f11(:,[3 7])=[];
list2c_f11(:,[3 7])=[];
%// Convert char arrays to numeric arrays
ncols = size(list1c_f11,2);
list1c_f11num = (list1c_f11 - '0')*(10.^(ncols-1:-1:0))'
list2c_f11num = (list2c_f11 - '0')*(10.^(ncols-1:-1:0))'
This point onwards you have three more approaches to work with that are listed next.
With ismember ( would be memory efficient, but maybe not fast across all datasizes) -
[LocA,LocB] = ismember(list1c_f11num,list2c_f11num);
idx_list1 = find(LocA)
idx_list2 = LocB(LocA)
With intersect (could be slow) -
[~,idx_list1,idx_list2] = intersect(list1c_f11num,list2c_f11num)
With bsxfun ( would be memory inefficient, but maybe fast for small to decent sized inputs) -
[idx_list1,idx_list2] = find(bsxfun(#eq,list1c_f11num,list2c_f11num'))

Matlab: Split string into multiple strings at the points where '_' exists

I have a string which is this:
a = 'Sound_impro_Act'
I want to divide this into multiple strings which are the words that are seperated by '_' and assign these to different variables.
The end result will be like this:
b = 'Sound'
c = 'impro'
d = 'act'
Thanks.
You may also use regexp which is faster than strsplit
a = 'Sound_impro_Act';
parts = regexp(a,'_','split');
[b,c,d] = deal(parts{:});
The last line comes from #Divakar's answer. Lots of thanks!
Use strsplit and then deal to put them into different variables -
split_strings = strsplit(a,'_')
[b,c,d] = deal(split_strings{:})
strsplit function can do it for you:
An example:
a = 'Sound_impro_Act';
b= strsplit(a, '_');
now you can access all splited values using b(1), b(2), b(3)

sort string according to first characters matlab

I have an cell array composed by several strings
names = {'2name_19surn', '3name_2surn', '1name_2surn', '10name_1surn'}
and I would like to sort them according to the prefixnumber.
I tried
[~,index] = sortrows(names.');
sorted_names = names(index);
but I get
sorted_names = {'10name_1surn', '1name_2surn', '2name_19surn', '3name_2surn'}
instead of the desired
sorted_names = {'1name_2surn', '2name_19surn', '3name_2surn','10name_1surn'}
any suggestion?
Simple approach using regular expressions:
r = regexp(names,'^\d+','match'); %// get prefixes
[~, ind] = sort(cellfun(#(c) str2num(c{1}), r)); %// convert to numbers and sort
sorted_names = names(ind); %// use index to build result
As long as speed is not a concern you can loop through all strings and save the first digets in an array. Subsequently sort the array as usual...
names = {'2name_2', '3name', '1name', '10name'}
number_in_string = zeros(1,length(names));
% Read numbers from the strings
for ii = 1:length(names)
number_in_string(ii) = sscanf(names{ii}, '%i');
end
% Sort names using number_in_string
[sorted, idx] = sort(number_in_string)
sorted_names = names(idx)
Take the file sort_nat from here
Then
names = {'2name', '3name', '1name', '10name'}
sort_nat(names)
returns
sorted_names = {'1name', '2name', '3name','10name'}
You can deal with arbitrary patterns using a regular expression:
names = {'2name', '3name', '1name', '10name'}
match = regexpi(names,'(?<number>\d+)\D+','names'); % created with regex editor on rubular.com
match = cell2mat(match); % cell array to struct array
clear numbersStr
[numbersStr{1:length(match)}] = match.number; % cell array with number strings
numbers = str2double(numbersStr); % vector of numbers
[B,I] = sort(numbers); % sorted vector of numbers (B) and the indices (I)
clear namesSorted
[namesSorted{1:length(names)}] = names{I} % cell array with sorted name strings

MATLAB empty cell(n,m) array of strings?

What is the quickest way to create an empty cell array of strings ?
cell(n,m)
creates an empty cell array of double.
How about a similar command but creating empty strings ?
Depends on what you want to achieve really. I guess the simplest method would be:
repmat({''},n,m);
Assignment to all cell elements using the colon operator will do the job:
m = 3; n = 5;
C = cell(m,n);
C(:) = {''}
The cell array created by cell(n,m) contains empty matrices, not doubles.
If you really need to pre populate your cell array with empty strings
test = cell(n,m);
test(:) = {''};
test(1,:) = {'1st row'};
test(:,1) = {'1st col'};
This is a super old post but I'd like to add an approach that might be working. I am not sure if it's working in an earlier version of MATLAB. I tried in 2018+ versions and it works.
Instead of using remat, it seems even more convenient and intuitive to start a cell string array like this:
C(1:10) = {''} % Array of empty char
And the same approach can be used to generate cell array with other data types
C(1:10) = {""} % Array of empty string
C(1:10) = {[]} % Array of empty double, same as cell(1,10)
But be careful with scalers
C(1:10) = {1} % an 1x10 cell with all values = {[1]}
C(1:10) = 1 % !!!Error
C(1:10) = '1' % !!!Error
C(1:10) = [] % an 1x0 empty cell array

Resources