Convert matrix to cell string with elements separated by delimiter

How can I convert a matrix
A=[1,2,3;4,5,6]
to a cell array of strings
A_str = {'1_2_3';'4_5_6'};

One approach could be this -
%// Input
A=[1,2,3;4,5,6]
%// Make a cell array in which each element is the string form of the corresponding element of A
cells = cellfun(@(x) num2str(x),num2cell(A),'Uni',0)
%// Join the cells of each row with strjoin, using `_` as the delimiter
A_str = arrayfun(@(n) strjoin(cells(n,:),'_'),1:size(cells,1),'Uni',0).'
Output -
A_str =
'1_2_3'
'4_5_6'

I found this solution, which seems faster:
A=[1,2,3;4,5,6]
A_str = cell(size(A,1),1);
for index_row = 1 : size(A,1)
    clear allOneString_temp
    allOneString_temp = sprintf('%.0f_' , A(index_row,:));
    A_str{index_row,:} = allOneString_temp(1:end-1);
end

Another approach, without loops:
A_str = num2str(A,'%i_');
A_str = mat2cell(A_str(:,1:end-1), ones(1,size(A_str,1)));
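
When rows produce strings of different lengths (say, a mix of 1-digit and 3-digit values), trimming a fixed final column may not remove the delimiter from every row. A minimal sketch, assuming integer-valued entries, that formats one row at a time with sprintf and strips the single trailing delimiter with regexprep. The sample matrix and the helper name row2str are made up for illustration:

```matlab
%// Hypothetical input whose entries have different widths
A = [1,20,3; 4,5,600];
%// Format one row at a time, then drop the single trailing '_'
row2str = @(r) regexprep(sprintf('%d_', A(r,:)), '_$', '');
A_str = arrayfun(row2str, (1:size(A,1)).', 'UniformOutput', false);
%// A_str is a 2x1 cell array holding '1_20_3' and '4_5_600'
```

Passing the row indices as a column vector makes arrayfun return a column cell array, which matches the shape asked for in the question.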

Related

Matlab String Conversion to Array

I have a cell array of strings of the form
sLine =
{
[1,1] = 13-Jul-16,10.46,100.63,15.7,54.4,55656465
[1,2] = 12-Jul-16,10.47,100.64,15.7,54.4,55656465
[1,3] = 11-Jul-16,10.48,100.65,15.7,54.4,55656465
[1,4] = 10-Jul-16,10.49,100.66,15.7,54.4,55656465
}
Each element is a string (for example, "13-Jul-16,10.46,100.63,15.7,54.4,55656465").
I need to convert this into 6 vectors, something like
[a b c d e f] = ...
so that, for example, the first column would be
a = [13-Jul-16;12-Jul-16;11-Jul-16;10-Jul-16]
I tried the cell2mat function, but instead of separating the fields into matrix elements, it concatenates everything into a single string:
cell2mat(sLine)
ans =
13-Jul-16,10.46,100.63,15.7,54.4,5565646512-Jul-16,10.47,100.64,15.7,54.4,5565646511-Jul-16,10.48,100.65,15.7,54.4,5565646510-Jul-16,10.49,100.66,15.7,54.4,55656465
So, how can I solve this?
Update
I obtained sLine with the following steps:
pFile = urlread('http://www.google.com/finance/historical?q=BVMF:PETR4&num=365&output=csv');
sLine = strsplit(pFile,'\n');
sLine(:,1)=[];
Update
Thanks to @Suever, I can now get the date column. The latest version of the code is
pFile = urlread('http://www.google.com/finance/historical?q=BVMF:PETR4&num=365&output=csv');
pFile=strtrim(pFile);
sLine = strsplit(pFile,'\n');
sLine(:,1)=[];
split_values = regexp(sLine, ',', 'split');
values = cat(1, split_values{:});
values(:,1)
Your data is all strings, therefore you will need to do some string manipulation rather than using cell2mat.
You will want to split each element at the , characters and then concatenate the results.
sLine = {'13-Jul-16,10.46,100.63,15.7,54.4,55656465',
'12-Jul-16,10.47,100.64,15.7,54.4,55656465',
'11-Jul-16,10.48,100.65,15.7,54.4,55656465',
'10-Jul-16,10.49,100.66,15.7,54.4,55656465'};
split_values = cellfun(@(x)strsplit(x, ','), sLine, 'uniformoutput', 0);
values = cat(1, split_values{:});
values(:,1)
% {
% [1,1] = 13-Jul-16
% [2,1] = 12-Jul-16
% [3,1] = 11-Jul-16
% [4,1] = 10-Jul-16
% }
If you want it to be more concise, you can use regexp instead of strsplit to do the split, since regexp accepts a cell array as input.
split_values = regexp(sLine, ',', 'split');
values = cat(1, split_values{:});
Update
The issue with the code you've posted is that the input ends with a trailing newline, so when you split on newlines, the last element of your sLine cell array is empty, which causes the problem. Use strtrim on pFile before creating the cell array to remove the trailing newline.
sLine = strsplit(strtrim(pFile), '\n');
sLine(:,1) = [];
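
Once the fields are split, the numeric columns can be converted in one step. A small sketch, assuming the first field is a date string and the remaining five are numbers, as in the sample rows. The names dates and nums are only illustrative:

```matlab
sLine = {'13-Jul-16,10.46,100.63,15.7,54.4,55656465', ...
         '12-Jul-16,10.47,100.64,15.7,54.4,55656465'};
split_values = regexp(sLine, ',', 'split');   % one 1x6 cell per line
values = cat(1, split_values{:});             % 2x6 cell array of strings
dates = values(:,1);                          % keep the dates as strings
nums  = str2double(values(:,2:end));          % 2x5 numeric matrix
```

str2double returns NaN for any field it cannot parse, which makes malformed rows easy to spot.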

Put a character in between every character of all strings in a cell-array

I have input_cell = {'ABCD', 'ABD', 'BCD'}. How can I insert the operator < between the characters of each string in input_cell?
The expected output should be:
output_cell = {'A<B<C<D', 'A<B<D', 'B<C<D'}
To insert a fixed character (<) between the characters of each string in a cell array, you can use regexprep as follows:
input_cell = {'ABCD', 'ABD', 'BCD'}; %// input cell array
c = '<'; %// character to be inserted
output_cell = regexprep(input_cell, '.(?=.)', ['$0' c]); %// output cell array
Result:
output_cell =
'A<B<C<D' 'A<B<D' 'B<C<D'
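
A variation that avoids regular expressions, offered as a sketch rather than as part of the answer above: split each string into single characters with num2cell and glue them back together with strjoin. It assumes a MATLAB version that ships strjoin:

```matlab
input_cell = {'ABCD', 'ABD', 'BCD'};   %// input cell array
c = '<';                               %// character to be inserted
%// num2cell turns 'ABCD' into {'A','B','C','D'}; strjoin joins the pieces with c
output_cell = cellfun(@(s) strjoin(num2cell(s), c), input_cell, 'UniformOutput', false);
%// output_cell is {'A<B<C<D', 'A<B<D', 'B<C<D'}
```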
Here's another way of doing it. You can append the required number of < characters to the end of each string with
t = strcat(input_cell{n}, repmat('<', 1, length(input_cell{n})-1));
and then you can simply rearrange the characters in each cell to place the < in the correct positions
output_cell{n}(1:2:length(t)) = t(1:ceil(length(t)/2));
output_cell{n}(2:2:length(t)) = t(1+ceil(length(t)/2):length(t));
Putting this all together gives
input_cell = {'ABCD', 'ABD', 'BCD'};
output_cell = cell(size(input_cell));
for n = 1:length(output_cell)
    t = strcat(input_cell{n}, repmat('<', 1, length(input_cell{n})-1));
    output_cell{n}(1:2:length(t)) = t(1:ceil(length(t)/2));
    output_cell{n}(2:2:length(t)) = t(1+ceil(length(t)/2):length(t));
end
which produces
>> output_cell
output_cell =
'A<B<C<D' 'A<B<D' 'B<C<D'
inputcell = {'ABCD', 'ABD', 'BCD'}; %// Initial cell
outputcell = cell(size(inputcell)); %// Initialise output
for ii = 1:numel(inputcell)
    tmp = inputcell{ii}; %// grab the iith cell
    tmp2=[]; %// Initialise empty collector
    tmp2(1:2:numel(tmp)*2)=tmp; %// Put characters on odd indices
    tmp2(tmp2==0)='<'; %// Fill the even indices with <
    outputcell{ii} = tmp2(1:end-1); %// Store the new string
    clear tmp2 %// Clear the temporary string
end
outputcell
outputcell =
'A<B<C<D' 'A<B<D' 'B<C<D'
This uses the fact that each entry of inputcell is a 1xN character array that you can index into. The characters are placed at the odd indices, the remaining positions are filled with <, and the trailing < is dropped before the string is stored. Thanks to @Daniel for the removal of the inner loop.

Compare two arrays of strings

I have two lists of strings as a column in a table (PM25_spr{i}.MonitorID and O3_spr{i}.MonitorID). The lists are of different lengths. I want to compare the first 11 characters of each entry and pull out the index for each list where they are the same.
Example
List 1:
'01-003-0010-44201'
'01-027-0001-44201'
'01-051-0001-44201'
'01-073-0023-44201'
'01-073-1003-44201'
'01-073-1005-44201'
'01-073-1009-44201'
'01-073-1010-44201'
'01-073-2006-44201'
'01-073-5002-44201'
'01-073-5003-44201'
'01-073-6002-44201'
List 2:
'01-073-0023-88101'
'01-073-2003-88101'
'04-013-0019-88101'
'04-013-9992-88101'
'04-013-9997-88101'
'05-119-0007-88101'
'05-119-1008-88101'
'06-019-0008-88101'
'06-029-0014-88101'
'06-037-0002-88101'
'06-037-1103-88101'
'06-037-4002-88101'
'06-059-0001-88101'
'06-065-8001-88101'
'06-067-0010-88101'
'06-073-0003-88101'
'06-073-1002-88101'
'06-073-1007-88101'
'08-001-0006-88101'
'08-031-0002-88101'
I tried intersect, which isn't the right approach for what I want to do. I'm not sure how to use ismember given that I only want to look at the first 11 characters.
I tried strncmp, but it fails with the error "Inputs must be the same size or either one can be a scalar."
chars2compare = length('18-097-0083');
strncmp(O3_spr{i}.MonitorID, PM25_spr{i}.MonitorID,chars2compare)
PM25_spr_MID = cell(length(years),1); % Preallocate cell array
for n = 1:length(PM25_spr{i}.MonitorID)
    s = char(PM25_spr{i}.MonitorID(n)); % Convert string to char
    PM25_spr_MID{i}(n) = cellstr(s(1:11)); % Pull out 1-11 characters and convert to cell
end
O3_spr_MID = cell(length(years),1); % Preallocate cell array
for n = 1:length(O3_spr{i}.MonitorID)
    s = char(O3_spr{i}.MonitorID(n));
    O3_spr_MID{i}(n) = cellstr(s(1:11));
end
[C, ia, ib] = intersect(O3_spr_MID{i}, PM25_spr_MID{i})
PerCap_spr_O3{i} = O3_spr{i}(ia,:);
PerCap_spr_PM25{i} = PM25_spr{i}(ib,:);
Assuming list1 and list2 to be the two input cell arrays, you can use a few approaches.
I. Operate on cell arrays
With intersect -
%// Clip off after first 11 characters in each cell of the input cell arrays
list1_f11 = arrayfun(@(n) list1{n}(1:11),1:numel(list1),'uni',0)
list2_f11 = arrayfun(@(n) list2{n}(1:11),1:numel(list2),'uni',0)
%// Use intersect to find common indices in the input cell arrays
[~,idx_list1,idx_list2] = intersect(list1_f11,list2_f11)
With ismember -
%// Clip off after first 11 characters in each cell of the input cell arrays
list1_f11 = arrayfun(@(n) list1{n}(1:11),1:numel(list1),'uni',0)
list2_f11 = arrayfun(@(n) list2{n}(1:11),1:numel(list2),'uni',0)
%// Use ismember to find common indices in the input cell arrays
[LocA,LocB] = ismember(list1_f11,list2_f11);
idx_list1 = find(LocA)
idx_list2 = LocB(LocA)
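
As a quick check of the cell-array approach, here is a sketch run on a few of the IDs from the question. It uses cellfun instead of arrayfun to clip the strings, which is only a stylistic variation, and it assumes every ID has at least 11 characters. The helper name first11 is hypothetical:

```matlab
list1 = {'01-003-0010-44201'; '01-073-0023-44201'; '01-073-2006-44201'};
list2 = {'01-073-0023-88101'; '01-073-2003-88101'; '04-013-0019-88101'};
%// Keep only the first 11 characters of every string in a cell array
first11 = @(c) cellfun(@(s) s(1:11), c, 'UniformOutput', false);
[common, idx_list1, idx_list2] = intersect(first11(list1), first11(list2));
%// common is {'01-073-0023'}; idx_list1 is 2 and idx_list2 is 1
```

Note that intersect reports each common key once, so repeated monitor IDs within a list would need ismember instead.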
II. Operate on char arrays
We can use char directly on the input cell arrays to get 2D char arrays, as working with them could be faster than working with cells.
With intersect + 'rows' -
%// Convert to char arrays
list1c = char(list1)
list2c = char(list2)
%// Clip char arrays after first 11 columns
list1c_f11 = list1c(:,1:11)
list2c_f11 = list2c(:,1:11)
%// Use intersect with 'rows' option
[~,idx_list1,idx_list2] = intersect(list1c_f11,list2c_f11,'rows')
III. Operate on numeric arrays
We can convert the char arrays further to numeric arrays with just one column as that could lead to faster solutions.
%// Convert to char arrays
list1c = char(list1)
list2c = char(list2)
%// Clip char arrays after first 11 columns
list1c_f11 = list1c(:,1:11)
list2c_f11 = list2c(:,1:11)
%// Remove char columns of hyphens (3 and 7 for the given input)
list1c_f11(:,[3 7])=[];
list2c_f11(:,[3 7])=[];
%// Convert char arrays to numeric arrays
ncols = size(list1c_f11,2);
list1c_f11num = (list1c_f11 - '0')*(10.^(ncols-1:-1:0))'
list2c_f11num = (list2c_f11 - '0')*(10.^(ncols-1:-1:0))'
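
To see what the digit-weighting step does, here is a small worked sketch on a single ID taken from the question's data. The variable names key, weights and value are only illustrative:

```matlab
key = '01-073-0023';                  %// first 11 characters of a monitor ID
key([3 7]) = [];                      %// remove the hyphens -> '010730023'
weights = 10.^(numel(key)-1:-1:0);    %// 1e8, 1e7, ..., 10, 1
value = (key - '0') * weights.';      %// digits times place values -> 10730023
```

Leading zeros disappear in the numeric form, which is harmless here because every key has the same number of digits.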
From this point on, there are three more approaches, listed next.
With ismember (memory efficient, but maybe not fast across all data sizes) -
[LocA,LocB] = ismember(list1c_f11num,list2c_f11num);
idx_list1 = find(LocA)
idx_list2 = LocB(LocA)
With intersect (could be slow) -
[~,idx_list1,idx_list2] = intersect(list1c_f11num,list2c_f11num)
With bsxfun (memory inefficient, but maybe fast for small to moderately sized inputs) -
[idx_list1,idx_list2] = find(bsxfun(@eq,list1c_f11num,list2c_f11num'))

Sort strings according to their first characters in MATLAB

I have a cell array made up of several strings
names = {'2name_19surn', '3name_2surn', '1name_2surn', '10name_1surn'}
and I would like to sort them according to the numeric prefix.
I tried
[~,index] = sortrows(names.');
sorted_names = names(index);
but I get
sorted_names = {'10name_1surn', '1name_2surn', '2name_19surn', '3name_2surn'}
instead of the desired
sorted_names = {'1name_2surn', '2name_19surn', '3name_2surn','10name_1surn'}
Any suggestions?
Simple approach using regular expressions:
r = regexp(names,'^\d+','match'); %// get prefixes
[~, ind] = sort(cellfun(@(c) str2num(c{1}), r)); %// convert to numbers and sort
sorted_names = names(ind); %// use index to build result
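
A slightly shorter variant of the same idea, sketched under the assumption that every name starts with at least one digit: the 'once' option makes regexp return the matched prefix directly, and str2double converts the whole cell array in one call.

```matlab
names = {'2name_19surn', '3name_2surn', '1name_2surn', '10name_1surn'};
prefix = regexp(names, '^\d+', 'match', 'once');   %// cell array of digit strings
[~, ind] = sort(str2double(prefix));               %// numeric sort of the prefixes
sorted_names = names(ind);
%// sorted_names is {'1name_2surn', '2name_19surn', '3name_2surn', '10name_1surn'}
```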
As long as speed is not a concern, you can loop through all the strings and save the leading digits in an array, then sort that array as usual:
names = {'2name_2', '3name', '1name', '10name'}
number_in_string = zeros(1,length(names));
% Read numbers from the strings
for ii = 1:length(names)
    number_in_string(ii) = sscanf(names{ii}, '%i');
end
% Sort names using number_in_string
[sorted, idx] = sort(number_in_string)
sorted_names = names(idx)
Take the file sort_nat from here
Then
names = {'2name', '3name', '1name', '10name'}
sort_nat(names)
returns
sorted_names = {'1name', '2name', '3name','10name'}
You can deal with arbitrary patterns using a regular expression:
names = {'2name', '3name', '1name', '10name'}
match = regexpi(names,'(?<number>\d+)\D+','names'); % created with regex editor on rubular.com
match = cell2mat(match); % cell array to struct array
clear numbersStr
[numbersStr{1:length(match)}] = match.number; % cell array with number strings
numbers = str2double(numbersStr); % vector of numbers
[B,I] = sort(numbers); % sorted vector of numbers (B) and the indices (I)
clear namesSorted
[namesSorted{1:length(names)}] = names{I} % cell array with sorted name strings

Is it possible to concatenate a string with a series of numbers?

I have a string (e.g. 'STA') and I want to make a cell array that concatenates my string with the numbers from 1 to X.
I want the code to do something like the for loop below:
for i = 1:Num
    a = [{a} {strcat('STA',num2str(i))}]
end
I want the end results to be in the form of {<1xNum cell>}
a = 'STA1' 'STA2' 'STA3' ...
(I want to set this to a uitable in the ColumnFormat array)
ColumnFormat = {{a},... % 1
'numeric',... % 2
'numeric'}; % 3
I'm not sure about starting with STA1, but this should get you a list that starts with STA (from which I guess you could remove the first entry).
N = 5;
[X{1:N+1}] = deal('STA');
a = genvarname(X);
a = a(2:end);
You can do it with a combination of the NUM2STR (converts numbers to strings), CELLSTR (converts strings to a cell array), STRTRIM (removes extra spaces) and STRCAT (combines with another string) functions.
You need (:) to make sure the numeric vector is a column.
x = 1:Num;
a = strcat( 'STA', strtrim( cellstr( num2str(x(:)) ) ) );
As an alternative for matrices with more dimensions, I have this helper function:
function c = num2cellstr(xx, varargin)
%Converts matrix of numeric data to cell array of strings
c = cellfun(@(x) num2str(x,varargin{:}), num2cell(xx), 'UniformOutput', false);
Try this:
N = 10;
a = cell(1,N);
for i = 1:N
    a(i) = {['STA',num2str(i)]};
end
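
The same result can also be sketched without an explicit loop by formatting each index with sprintf inside arrayfun. This is an illustrative variation on the answers above, with Num set to a made-up value:

```matlab
Num = 3;   %// hypothetical count
%// Build 'STA1', 'STA2', ... as a 1xNum cell array of strings
a = arrayfun(@(k) sprintf('STA%d', k), 1:Num, 'UniformOutput', false);
%// a is {'STA1', 'STA2', 'STA3'}
```

Because sprintf writes the number without padding, no strtrim step is needed.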

Resources