MATLAB generate combination from a string - string

I've a string like this "FBECGHD" and i need to use MATLAB and generate all the required possible permutations? In there a specific MATLAB function that does this task or should I define a custom MATLAB function that perform this task?

Use the perms function. A string in matlab is a list of characters, so it will permute them:
A = 'FBECGHD';
perms(A)
You can also store the output (e.g. P = perms(A)), and, if A is an N-character string, P is a N!-by-N array, where each row corresponds to a permutation.
If you are interested in unique permutations, you can use:
unique(perms(A), 'rows')
to remove duplicates (otherwise something like 'ABB' would give 6 results, instead of the 3 that you might expect).

As Richante answered, P = perms(A) is very handy for this. You may also notice that P is of type char and it's not convenient to subset/select individual permutation. Below worked for me:
str = 'FBECGHD';
A = perms(str);
B = cellstr(reshape(A,7,[])');
C = unique(B);
It also appears that unique(A, 'rows') is not removing duplicate values:
>> A=[11, 11];
>> unique(A, 'rows')
ans =
11 11
However, unique(A) would:
>> unique(A)
ans =
11
I am not a matlab pro by any means and I didn't investigate this exhaustively but at least in some cases it appears that reshape is not what you want. Notice that below gives 999 and 191 as permutations of 199 which isn't true. The reshape function as written appears to operate "column-wise" on A:
>> str = '199';
A = perms(str);
B = cellstr(reshape(A,3,[])');
C = unique(B);
>> C
C =
'191'
'199'
'911'
'919'
'999'
Below does not produce 999 or 191:
B = {};
index = 1;
while true
try
substring = A(index,:);
B{index}=substring;
index = index + 1;
catch
break
end
end
C = unique(B)
C =
'199' '919' '991'

Related

Issue with ASCii in Python3

I am trying to convert a string of varchar to ascii. Then i'm trying to make it so any number that's not 3 digits has a 0 in front of it. then i'm trying to add a 1 to the very beginning of the string and then i'm trying to make it a large number that I can apply math to it.
I've tried a lot of different coding techniques. The closest I've gotten is below:
s = 'Ak'
for c in s:
mgk = (''.join(str(ord(c)) for c in s))
num = [mgk]
var = 1
num.insert(0, var)
mgc = lambda num: int(''.join(str(i) for i in num))
num = mgc(num)
print(num)
With this code I get the output: 165107
It's almost doing exactly what I need to do but it's taking out the 0 from the ord(A) which is 65. I want it to be 165. everything else seems to be working great. I'm using '%03d'% to insert the 0.
How I want it to work is:
Get the ord() value from a string of numbers and letters.
if the ord() value is less than 100 (ex: A = 65, add a 0 to make it a 3 digit number)
take the ord() values and combine them into 1 number. 0 needs to stay in from of 65. then add a one to the list. so basically the output will look like:
1065107
I want to make sure I can take that number and apply math to it.
I have this code too:
s = 'Ak'
for c in s:
s = ord(c)
s = '%03d'%s
mgk = (''.join(str(s)))
s = [mgk]
var = 1
s.insert(0, var)
mgc = lambda s: int(''.join(str(i) for i in s))
s = mgc(s)
print(s)
but then it counts each letter as its own element and it will not combine them and I only want the one in front of the very first number.
When the number is converted to an integer, it
Is this what you want? I am kinda confused:
a = 'Ak'
result = '1' + ''.join(str(f'{ord(char):03d}') for char in a)
print(result) # 1065107
# to make it a number just do:
my_int = int(result)

Matlab, order cells of strings according to the first one

I have 2 cell of strings and I would like to order them according to the first one.
A = {'a';'b';'c'}
B = {'b';'a';'c'}
idx = [2,1,3] % TO FIND
B=B(idx);
I would like to find a way to find idx...
Use the second output of ismember. ismember tells you whether or not values in the first set are anywhere in the second set. The second output tells you where these values are located if we find anything. As such:
A = {'a';'b';'c'}
B = {'b';'a';'c'}
[~,idx] = ismember(A, B);
Note that there is a minor typo when you declared your cell arrays. You have a colon in between b and c for A and a and c for B. I placed a semi-colon there for both for correctness.
Therefore, we get:
idx =
2
1
3
Benchmarking
We have three very good algorithms here. As such, let's see how this performs by doing a benchmarking test. What I'm going to do is generate a 10000 x 1 random character array of lower case letters. This will then be encapsulated into a 10000 x 1 cell array, where each cell is a single character array. I construct A this way, and B is a random permutation of the elements in A. This is the code that I wrote to do this for us:
letters = char(97 + (0:25));
rng(123); %// Set seed for reproducibility
ind = randi(26, [10000, 1]);
lettersMat = letters(ind);
A = mat2cell(lettersMat, ones(10000,1), 1);
B = A(randperm(10000));
Now... here comes the testing code:
clear all;
close all;
letters = char(97 + (0:25));
rng(123); %// Set seed for reproducibility
ind = randi(26, [10000, 1]);
lettersMat = letters(ind);
A = mat2cell(lettersMat, 1, ones(10000,1));
B = A(randperm(10000));
tic;
[~,idx] = ismember(A,B);
t = toc;
fprintf('ismember: %f\n', t);
clear idx; %// Make sure test is unbiased
tic;
[~,idx] = max(bsxfun(#eq,char(A),char(B)'));
t = toc;
fprintf('bsxfun: %f\n', t);
clear idx; %// Make sure test is unbiased
tic;
[~, indA] = sort(A);
[~, indB] = sort(B);
idx = indB(indA);
t = toc;
fprintf('sort: %f\n', t);
This is what I get for timing:
ismember: 0.058947
bsxfun: 0.110809
sort: 0.006054
Luis Mendo's approach is the fastest, followed by ismember, and then finally bsxfun. For code compactness, ismember is preferred but for performance, sort is better. Personally, I think bsxfun should win because it's such a nice function to use ;).
This seems to be significantly faster than using ismember (although admittedly less clear than #rayryeng's answer). With thanks to #Divakar for his correction on this answer.
[~, indA] = sort(A);
[~, indB] = sort(B);
idx = indA(indB);
I had to jump in as it seems runtime performance could be a criteria here :)
Assuming that you are dealing with scalar strings(one character in each cell), here's my take that works even when you have not-commmon elements between A and B and uses the very powerful bsxfun and as such I am really hoping this would be runtime-efficient -
[v,idx] = max(bsxfun(#eq,char(A),char(B)'));
idx = v.*idx
Example -
A =
'a' 'b' 'c' 'd'
B =
'b' 'a' 'c' 'e'
idx =
2 1 3 0
For a specific case when you have no not-common elements between A and B, it becomes a one-liner -
[~,idx] = max(bsxfun(#eq,char(A),char(B)'))
Example -
A =
'a' 'b' 'c'
B =
'b' 'a' 'c'
idx =
2 1 3

Matlab. Find the indices of a cell array of strings with characters all contained in a given string (without repetition)

I have one string and a cell array of strings.
str = 'actaz';
dic = {'aaccttzz', 'ac', 'zt', 'ctu', 'bdu', 'zac', 'zaz', 'aac'};
I want to obtain:
idx = [2, 3, 6, 8];
I have written a very long code that:
finds the elements with length not greater than length(str);
removes the elements with characters not included in str;
finally, for each remaining element, checks the characters one by one
Essentially, it's an almost brute force code and runs very slowly. I wonder if there is a simple way to do it fast.
NB: I have just edited the question to make clear that characters can be repeated n times if they appear n times in str. Thanks Shai for pointing it out.
You can sort the strings and then match them using regular expression. For your example the pattern will be ^a{0,2}c{0,1}t{0,1}z{0,1}$:
u = unique(str);
t = ['^' sprintf('%c{0,%d}', [u; histc(str,u)]) '$'];
s = cellfun(#sort, dic, 'uni', 0);
idx = find(~cellfun('isempty', regexp(s, t)));
I came up with this :
>> g=#(x,y) sum(x==y) <= sum(str==y);
>> h=#(t)sum(arrayfun(#(x)g(t,x),t))==length(t);
>> f=cellfun(#(x)h(x),dic);
>> find(f)
ans =
2 3 6
g & h: check if number of count of each letter in search string <= number of count in str.
f : finally use g and h for each element in dic

Is it possible to concatenate a string with series of number?

I have a string (eg. 'STA') and I want to make a cell array that will be a concatenation of my sting with a numbers from 1 to X.
I want the code to do something like the fore loop here below:
for i = 1:Num
a = [{a} {strcat('STA',num2str(i))}]
end
I want the end results to be in the form of {<1xNum cell>}
a = 'STA1' 'STA2' 'STA3' ...
(I want to set this to a uitable in the ColumnFormat array)
ColumnFormat = {{a},... % 1
'numeric',... % 2
'numeric'}; % 3
I'm not sure about starting with STA1, but this should get you a list that starts with STA (from which I guess you could remove the first entry).
N = 5;
[X{1:N+1}] = deal('STA');
a = genvarname(X);
a = a(2:end);
You can do it with combination of NUM2STR (converts numbers to strings), CELLSTR (converts strings to cell array), STRTRIM (removes extra spaces)and STRCAT (combines with another string) functions.
You need (:) to make sure the numeric vector is column.
x = 1:Num;
a = strcat( 'STA', strtrim( cellstr( num2str(x(:)) ) ) );
As an alternative for matrix with more dimensions I have this helper function:
function c = num2cellstr(xx, varargin)
%Converts matrix of numeric data to cell array of strings
c = cellfun(#(x) num2str(x,varargin{:}), num2cell(xx), 'UniformOutput', false);
Try this:
N = 10;
a = cell(1,N);
for i = 1:N
a(i) = {['STA',num2str(i)]};
end

Equivalent of adding strings to a loop, for strings (matlab)?

How would I be able to do the equivalent of this with strings:
a = [1 2 3; 4 5 6];
c = [];
for i=1:5
b = a(1,:)+i;
c = [c;b];
end
c =
2 3 4
3 4 5
4 5 6
5 6 7
6 7 8
Basically looking to combine several strings into a Matrix.
You're growing a variable in a loop, which is a kind of sin in Matlab :) So I'm going to show you some better ways of doing array concatenation.
There's cell strings:
>> C = {
'In a cell string, it'
'doesn''t matter'
'if the strings'
'are not of equal lenght'};
>> C{2}
ans =
doesn't matter
Which you could use in a loop like so:
% NOTE: always pre-allocate everything before a loop
C = cell(5,1);
for ii = 1:5
% assign some random characters
C{ii} = char( '0'+round(rand(1+round(rand*10),1)*('z'-'0')) );
end
There's ordinary arrays, which have as a drawback that you have to know the size of all your strings beforehand:
a = [...
'testy' % works
'droop'
];
b = [...
'testing' % ERROR: CAT arguments dimensions
'if this works too' % are not consistent.
];
for these cases, use char:
>> b = char(...
'testing',...
'if this works too'...
);
b =
'testing '
'if this works too'
Note how char pads the first string with spaces to fit the length of the second string. Now again: don't use this in a loop, unless you've pre-allocated the array, or if there really is no other way to go.
Type help strfun on the Matlab command prompt to get an overview of all string-related functions available in Matlab.
You mean storing a string on each matrix position? You can't do that, since matrices are defined over basic types. You can have a CHAR on each position:
>> a = 'bla';
>> b = [a; a]
b <2x3 char> =
bla
bla
>> b(2,3) = 'e'
b =
bla
ble
If you want to store matrices, use a cell array (MATLAB reference, Blog of Loren Shure), which are kind of similar but using "{}" instead of "()":
>> c = {a; a}
c =
'bla'
'bla'
>> c{2}
ans =
bla

Resources