Combination of one of every string group in all possible combinations and orders in matlab - string

So I forgot a string, but I know it consists of three substrings, and I know a few possibilities for each substring. So all I need to do is go through all possible combinations and orders until I find the one I forgot. But since humans can only hold four items in their working memory (definitely an upper limit for me), I can't keep tabs on which ones I have already examined.
So say I have n sets of m strings, how do I get all strings that have a length of n substrings consisting of one string from each set in any order?
I saw an example of how to do it in a nested loop, but then I have to specify the order. The example is for n = 3 with different m's. I'm not sure how to make this more general:
first = {'Hoi','Hi','Hallo'};
second = {'Jij','You','Du'};
third = {'Daar','There','Da','LengthIsDifferent'};
for iF = 1:length(first)
    for iS = 1:length(second)
        for iT = 1:length(third)
            [first{iF}, second{iS}, third{iT}]
        end
    end
end
About this question: it does not solve this problem because it presumes that the order of the sets to choose from is known.

This generates the cartesian product of the indices using ndgrid.
Then uses some cellfun-magic to get all the strings. Afterwards it just cycles through all the permutations and appends those.
first = {'Hoi','Hi','Hallo'};
second = {'Jij','You','Du'};
third = {'Daar','There','Da','LengthIsDifferent'};
Vs = {first, second, third};
%% Create cartesian product
Indices = cellfun(@(X) 1:numel(X), Vs, 'uni', 0);
[cartesianProductInd{1:numel(Vs)}] = ndgrid(Indices{:});
AllStringCombinations = cellfun(@(A,I) A(I(:)), Vs, cartesianProductInd,'uni',0);
AllStringCombinations = cat(1, AllStringCombinations{:}).';%.'
%% Permute what we got
AllStringCombinationsPermuted = [];
permutations = perms(1:numel(Vs));
for i = 1:size(permutations,1)
    AllStringCombinationsPermuted = [AllStringCombinationsPermuted; ...
        AllStringCombinations(:,permutations(i,:))];
end
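As a cross-check outside MATLAB, the same two steps (Cartesian product of the groups, then every ordering of each pick) can be sketched in Python. This is only an illustration of the idea, not a translation of the answer above; the names `groups` and `candidates` are made up for the example.

```python
from itertools import permutations, product

first = ["Hoi", "Hi", "Hallo"]
second = ["Jij", "You", "Du"]
third = ["Daar", "There", "Da", "LengthIsDifferent"]
groups = [first, second, third]

candidates = []
# Outer loop: every ordering of the groups (3! = 6 orderings here).
for order in permutations(range(len(groups))):
    # Inner loop: one string from each group (3 * 3 * 4 = 36 picks).
    for combo in product(*groups):
        candidates.append("".join(combo[i] for i in order))

print(len(candidates))  # 216
```

The count is the product of the group sizes times n!, so it grows quickly as groups are added.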

Related

accessing specific intervals in list

I have a list containing the system time from a machine. Since the list contains only the milliseconds part, the values don't go beyond 1000.
For better viewing I want to add i*1000 to certain intervals in this list, once for each time the list wraps past 1000. For better understanding I will give my input list and what my output list should look like:
inputlist = [300,600,900,200,500,800,100,400]
etc, the output list should look like this:
outputlist = [300,600,900,1200,1500,1800,2100,2400]
Since I want the list to start at zero, I subtracted the first element of the list from each element, giving me a new input list:
inputlist_new = [0,300,600,-100,200,500,-200,100]
which should give me a new outputlist like:
outputlist_new = [0,300,600,900,1200,1500,1800,2100]
I tried creating a list containing the indices of each element < 0 to cut the list into intervals, to multiply the thousands on each interval but I am not able to do so. My code for this index list is this:
list_index = [i for i, j in enumerate(inputlist_new) if j < 0]
I actually found a solution myself: I copied the list I want to change into a new one, just for the sake of maybe using it later (so this is not necessary for the solution), and then used two for loops:
inputlist_new = [0,300,600,-100,200,500,-200,100]
inputlist2 = inputlist_new.copy()
for x in range(len(list_index)):
    # Every element from a wrap point onward gets another 1000 added.
    for y in range(list_index[x], len(inputlist2)):
        inputlist2[y] = inputlist2[y] + 1000
this gave me the wanted output
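The negative-value test above only works when every reading after a wrap happens to be smaller than the first element. A possibly more robust sketch detects a wrap wherever a value drops below its predecessor. It assumes at most one wrap between consecutive readings; the function name `unwrap_milliseconds` and the `period` parameter are made up for this example.

```python
def unwrap_milliseconds(values, period=1000):
    """Make a wrapped millisecond series increasing and start it at zero."""
    if not values:
        return []
    unwrapped = []
    offset = 0
    previous = values[0]
    for value in values:
        # A drop relative to the previous reading means the counter wrapped.
        if value < previous:
            offset += period
        unwrapped.append(value + offset - values[0])
        previous = value
    return unwrapped


print(unwrap_milliseconds([300, 600, 900, 200, 500, 800, 100, 400]))
# [0, 300, 600, 900, 1200, 1500, 1800, 2100]
```

This runs in a single pass over the list instead of re-adding 1000 to the tail once per wrap.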

Splitting the output obtained by Counter in Python and pushing it to Excel

I am using Counter to count every word in the descriptions of 20000 products and see how many times each word repeats; for example, 'pipette' repeats 1282 times. To do this I have split column A into several columns P, Q, R, S, T, U and V:
df["P"] = df["A"].str.split(n=10).str[0]
df["Q"] = df["A"].str.split(n=10).str[1]
df["R"] = df["A"].str.split(n=10).str[2]
df["S"] = df["A"].str.split(n=10).str[3]
df["T"] = df["A"].str.split(n=10).str[4]
df["U"] = df["A"].str.split(n=10).str[5]
df["V"] = df["A"].str.split(n=10).str[6]
This shows the split products.
I then count each of the columns individually and add the counts to get the total number of words.
d = Counter(df['P'])
e = Counter(df['Q'])
f = Counter(df['R'])
g = Counter(df['S'])
h = Counter(df['T'])
i = Counter(df['U'])
j = Counter(df['V'])
m = d+e+f+g+h+i+j
print(m)
This is the image of the output I obtained using Counter.
Now I want to transfer the output into an Excel sheet with the keys in one column and the values in another.
Am I using the right method to do so? If yes, how should I push them into different columns?
Note: the length of each key is different.
Also, I want to convert all the items of column 'A' to lower case so that the counter does not count the same word twice. How should I go about it?
I've been learning Python for just a couple of months, but I'll give it a shot. I'm sure there are better ways to perform the same action. Maybe we can both learn something from this question. Let me know how this turns out. Good luck!
import pandas as pd
num = len(m.keys())
df = pd.DataFrame(columns=['Key', 'Value'])
# One row per word: the key in the first column, its count in the second.
for i, j, k in zip(range(num), m.keys(), m.values()):
    df.loc[i] = [j, k]
df.to_csv('Your_Project.csv')
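The lower-casing and the fixed number of split columns can both be sidestepped by counting directly over the words of each description. The sketch below uses only the standard library; the `descriptions` list is a made-up stand-in for the values of column 'A', and `write_counts` is a hypothetical helper name.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in for the values of column 'A'.
descriptions = ["Pipette tip 10 ul", "pipette Tip box", "Glass beaker"]

word_counts = Counter()
for text in descriptions:
    # Lower-case first so 'Pipette' and 'pipette' are counted together.
    word_counts.update(text.lower().split())


def write_counts(counts, handle):
    """Write one row per word: key in the first column, count in the second."""
    writer = csv.writer(handle)
    writer.writerow(["Key", "Value"])
    writer.writerows(counts.most_common())


# An in-memory buffer keeps the example self-contained; a handle from
# open("word_counts.csv", "w", newline="") works the same way.
buffer = io.StringIO()
write_counts(word_counts, buffer)
print(buffer.getvalue())
```

Excel opens the resulting CSV with the keys and values in separate columns. With the DataFrame from the question, `df['A'].str.lower()` would give the lower-cased column.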

MATLAB: Write Dynamic matrix to Excel

I'm using MATLAB R2009a and following this example:
http://uk.mathworks.com/help/matlab/matlab_external/using-a-matlab-application-as-an-automation-client.html
I'd like to edit it so that I can write a matrix of unknown size into a column in an excel sheet, therefore not explicitly stating the range. I've attempted it this way:
%Put MATLAB data into the worksheet
Hop = [47; 53; 93; 10]; %Pretend I don't know what size this matrix is.
p = length(Hop);
p = strcat('A',num2str(p));
eActivesheetRange = e.Activesheet.get('Range','A1:p');
eActivesheetRange.Value = Hop;
However, this errors out. I've tried several variations of this to no avail. For example, using 'A:B' puts this array in columns A and B in excel and a NAN into every cell beyond my array. As I only want column A filled, using simple ('Range','A') errors out also.
Thanks in advance for any advice you can offer.
You're having issues because you're trying to use your variable p in a string directly
range = 'A1:p';
'A1:p'
This isn't going to work, you want to include the value of p. There are a number of ways you can do this.
In the code you have provided, you have already set p = 'A10' so if you wanted to append that to your range, you'd perform string concatenation
p = 'A10';
range = strcat('A1:', p);
I personally prefer to use sprintf to place the number directly into my strings rather than concatenating a bunch of strings.
p = 10;
range = sprintf('A1:A%d', p)
'A1:A10'
So if we adapt your code to use this we should get
range = sprintf('A1:A%d', numel(Hop));
eActivesheetRange = e.Activesheet.get('Range', range);
eActivesheetRange.Value = Hop;
Also just to be a little explicit, I would use numel rather than length as length is ambiguous. Also, I would flatten Hop into a column vector just to make sure that it's the proper dimension to be written to the spreadsheet.
eActivesheetRange.Value = Hop(:);
Essentially, the idea is to replace xx in 'B1:Bxx' with the number of elements in your matrix.
I tried this:
e = actxserver('Excel.Application');
eWorkbook = e.Workbooks.Add;
e.Visible = 1;
eSheets = e.ActiveWorkbook.Sheets;
eSheet1 = eSheets.get('Item',1);
eSheet1.Activate;
A = [1 2 3 4];
eActivesheetRange = e.Activesheet.get('Range','A1:A4');
eActivesheetRange.Value = A;
The above is taken directly from the link you shared. What you are trying fails because the p inside 'A1:p' is just the literal character p, not the value of your variable p. To avoid this, build the range string from the number of elements:
B = randi([0 10],10,1)
eActivesheetRange = e.Activesheet.get('Range',['B1:B' num2str(numel(B))]);
eActivesheetRange.Value = B;
Here, num2str(numel(B)) will pass in a string, which is the number of elements in B. This is variable in the sense that it depends on the number of elements in B.
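Both answers come down to building the range string from the element count. For comparison, here is the same string-building step as a small Python sketch; `column_range` is a made-up helper for illustration and does not touch Excel itself.

```python
def column_range(column, count, first_row=1):
    """Return an A1-style range such as 'A1:A4' covering `count` cells."""
    if count < 1:
        raise ValueError("count must be at least 1")
    last_row = first_row + count - 1
    return f"{column}{first_row}:{column}{last_row}"


hop = [47, 53, 93, 10]
print(column_range("A", len(hop)))  # A1:A4
```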

Switching positions of two strings within a list

I have another question that I'd like input on, of course no direct answers just something to point me in the right direction!
I have a string of numbers ex. 1234567890 and I want 1 & 0 to change places (0 and 9) and for '2345' & '6789' to change places. For a final result of '0678923451'.
First things I did was convert the string into a list with:
ex. original = '1234567890'
original = list(original)
original = ['1', '2', '3', etc....]
Now, I get you need to pull the first and last out, so I assigned
x = original[0]
and
y = original[9]
So: x, y = y, x (which gets me the result I'm looking for)
But how do I input that back into the original list?
Thanks!
The fact that you 'pulled' the data from the list in variables x and y doesn't help at all, since those variables have no connection anymore with the items from the list. But why don't you swap them directly:
original[0], original[9] = original[9], original[0]
You can use the slicing operator in a similar manner to swap the inner parts of the list.
But, there is no need to create a list from the original string. Instead, you can use the slicing operator to achieve the result you want. Note that you cannot swap the string elements as you did with lists, since in Python strings are immutable. However, you can do the following:
>>> a = "1234567890"
>>> a[9] + a[5:9] + a[1:5] + a[0]
'0678923451'
>>>
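If you do want to stay with the list, the slice-swap mentioned above could look like the following sketch. It relies on both slices having the same length (four items each), so the list does not change size.

```python
original = list("1234567890")

# Swap the first and last items in place.
original[0], original[9] = original[9], original[0]

# Swap the two inner blocks; the right-hand side is evaluated first,
# so both slices are read before either one is overwritten.
original[1:5], original[5:9] = original[5:9], original[1:5]

result = "".join(original)
print(result)  # 0678923451
```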

Subset String Array based on length

I have a vector with > 30000 words. I want to create a subset of this vector which contains only those words whose length is greater than 5. What is the best way to achieve this?
Basically, df contains multiple sentences.
So,
wordlist = df2;
wordlist = [strip(wordlist[i]) for i in [1:length(wordlist)]];
Now, I need to subset wordlist so that it contains only those words whose length is greater than 5.
sub(A,find(x->length(x)>5,A)) # => creates a view (most efficient way to make a subset)
EDIT: getindex() returns a copy of desired elements
getindex(A,find(x->length(x)>5,A)) # => makes a copy
You can use filter
wordlist = filter(x->islenatleast(x,6),wordlist)
and combine it with a fast condition such as islenatleast defined as:
function islenatleast(s,l)
    if sizeof(s)<l return false end
    # assumes each char takes at least a byte
    l==0 && return true
    p=1
    i=0
    while i<l
        if p>sizeof(s) return false end
        p = nextind(s,p)
        i += 1
    end
    return true
end
According to my timings islenatleast is faster than calculating the whole length (in some conditions). Additionally, this shows the strength of Julia, by defining a primitive competitive with the core function length.
But doing:
wordlist = filter(x->length(x)>5,wordlist)
will also do.
