Find index of cells containing my string

I have a cell array C which contains numbers and strings, like this:
1 0 'C:\user' 41.57
2 0 'C:\user' 46.25
3 0 'C:\user' 48
4 0 'C:\user' 48.33
I want to get the index of the cell that is equal to a specified name.
I tried something like this, but it didn't work:
idx=find(strcmp([C{:,:}],'C:\User\..'))
I need help please

To use strcmp, first convert the numeric cells to strings with num2str. Set UniformOutput to false, because C holds both numbers and strings and cellfun therefore has to return a cell array.
idx = find(strcmp(cellfun(@num2str, C, 'UniformOutput', false), 'C:\user'));
[row, col] = ind2sub(size(C), idx);

Related

If row contains string, find number in string that isn't 0

I'm trying to figure out how to find if a row contains a string and if it does, find if it doesn't contain 0, and ideally show the number or string containing the non-zero number.
Ex:
h i j k l .. ah
6 1 : Count of hread = 60 other
7 dir not found 0 : Count of hread = 60
Before I changed my output to show the count number, it was just a yes/no type of thing and my ah cell had a formula like: =IF(COUNTIF(H25:AE25,"hread = 60"),"YES","NO")
But now I am showing the count and it's showing as 0 but I don't care about ones with count of 0.
How do I write the formula so that column ah only shows the rows whose count isn't 0? If I can show the rows with a non-zero count together with the string that holds the count/output, that would be great. I was looking at regular expressions, but I'm not sure how to use them in Excel, or whether there's a better way.
This is using Excel 365/OneNote version, but older versions of excel formulas are fine.
As you can see, the string is 0 : Count of hread = 60, so I want to find the ones without the 0 :, and hopefully return the number before the colon in the non-zero cases.

Splitting a pandas column every n characters

I have a dataframe where some columns contain long strings (e.g. 30000 characters). I would like to split these columns every 4000 characters so that I end up with a range of new columns containing strings of length at most 4000. I have an upper bound on the string lengths so I know there should be at most 9 new columns. I would like there to always be 9 new columns, having None/NaN in columns where the string is shorter.
As an example (with n = 10 instead of 4000 and 3 columns instead of 9), let's say I have the dataframe:
df_test = pd.DataFrame({'id': [1, 2, 3],
                        'str_1': ['This is a long string', 'This is an even longer string', 'This is the longest string of them all'],
                        'str_2': ['This is also a long string', 'a short string', 'mini_str']})
id str_1 str_2
0 1 This is a long string This is also a long string
1 2 This is an even longer string a short string
2 3 This is the longest string of them all mini_str
In this case I want to get the result
   id     str_1_1     str_1_2     str_1_3     str_1_4     str_2_1     str_2_2     str_2_3
0   1  This is a   long strin           g         NaN  This is al  so a long       string
1   2  This is an   even long   er string         NaN  a short st        ring         NaN
2   3  This is th  e longest   string of     them all    mini_str         NaN         NaN
Here, I want e.g. first row, column str_1_3 to be a string of length 1.
I tried using
df_test['str_1'].str.split(r".{10}", expand=True, n=10)
but that didn't work. It gave this as result
0 1 2 3
0 g None
1 er string None
2 them all
where the first columns aren't filled.
I also tried looping through every row and inserting '|' every 10 characters and then splitting on '|' but that seems tedious and slow.
Any help is appreciated.
The answer is quite simple: insert a delimiter after every chunk of characters, then split on that delimiter.
For example, with | as the delimiter, a chunk length of 10, and at most n = 4 splits:
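import pandas as pd
series = pd.Series(['This is an even longer string', 'This is the longest string of them all'], name='str1')
name = series.name
# regex=True is needed because newer pandas versions treat the pattern as a literal string by default
cols = series.str.replace('(.{10})', r'\1|', regex=True).str.split('|', n=4, expand=True).add_prefix(f'{name}_')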
That is, use str.replace to add delimiter, use str.split to split them apart and use add_prefix to add the prefixes.
The output will be:
       str1_0      str1_1      str1_2      str1_3
0  This is an   even long   er string        None
1  This is th  e longest   string of     them all
The reason str.split('.{10}') doesn't work is that the pat parameter of str.split describes the delimiters to split on, not the pieces to keep in the result. With str.split('.{10}'), every run of 10 characters is consumed as a delimiter, so the result holds only empty strings and whatever is left over after the last full run.
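The same behaviour can be reproduced with the standard library's re module, which makes the difference easy to see on a single string. This is only a sketch of the idea, not the pandas code itself; the variable names are illustrative.

```python
import re

text = "This is a long string"  # 21 characters

# The pattern is the delimiter: each 10-character run is consumed,
# leaving only what sits between and after the matches.
leftovers = re.split(r".{10}", text)
print(leftovers)  # ['', '', 'g']

# Marking the end of each 10-character run and splitting on the marker
# keeps the text itself.
marked = re.sub(r"(.{10})", r"\1|", text)
chunks = marked.split("|")
print(chunks)  # ['This is a ', 'long strin', 'g']
```

Note that a string whose length is an exact multiple of 10 ends with a trailing delimiter, so the split then yields one extra empty string at the end.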
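UPDATE: Following the suggestion from @AKX, use \x1F (the ASCII unit separator) as a safer delimiter, since it is far less likely than | to occur in the text itself:
cols = series.str.replace('(.{10})', '\\1\x1F', regex=True).str.split('\x1F', n=4, expand=True).add_prefix(f'{name}_')
Note that the strings are no longer raw strings: in a regular string literal \x1F is interpreted as the control character, so the backslash of the \1 backreference has to be doubled.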
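The question also asks for a fixed number of output columns, padded with missing values for short strings. A minimal sketch of that part, in plain Python and using slicing instead of a delimiter; split_fixed and its parameter names are hypothetical, not part of pandas:

```python
def split_fixed(text, width, n_parts):
    """Cut text into pieces of at most `width` characters and pad the
    result with None so it always has exactly `n_parts` entries."""
    chunks = [text[i:i + width] for i in range(0, len(text), width)]
    if len(chunks) > n_parts:
        raise ValueError("text is longer than width * n_parts")
    return chunks + [None] * (n_parts - len(chunks))


print(split_fixed("This is a long string", 10, 4))
# ['This is a ', 'long strin', 'g', None]
```

Applied to every value of a column, the resulting lists all have the same length, so they could be turned into new columns, for example by passing them to pd.DataFrame with suitable column names.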

How to search for a specific string in cell array

I would like to search for a specific string in matlab cell. For example my cell contains a column of strings like this
variable(:,5) = {'10';'10;20';'20';'10;20';'10';'10';'20'};
I would like to search for all cells that have only '10' and delete them.
I tried using this statement for searching
is10 = ~cellfun(@isempty, strfind(variable(:,5), '10'));
But this returns all cells with '10' (including the ones with '10;20').
I would like to have just the cells with pure '10' values
What is the best way to do this?
It is not working as you expect because strfind allows for a partial string match. What you want is an exact match. You can do this using strcmp. Also, the input to strcmp can actually be a cell array of strings so you can use it the following way.
A = {'10';'10;20';'20';'10;20';'10';'10';'20'};
is10 = strcmp(A, '10');
%// 1 0 0 0 1 1 0
You could also use ismember to do the same thing.
is10 = ismember(A, '10');
%// 1 0 0 0 1 1 0
As a side note, most string functions (including strfind) can actually accept a cell array of strings as input. So in your initial post, the wrapping of strfind inside of cellfun is unnecessary.
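For comparison, the same partial-versus-exact distinction can be sketched in Python (not MATLAB), including the deletion step the question asks for. The variable names are illustrative only.

```python
cells = ['10', '10;20', '20', '10;20', '10', '10', '20']

# Substring test: also true for '10;20', like strfind.
partial = ['10' in c for c in cells]

# Equality test: true only for cells that are exactly '10', like strcmp.
exact = [c == '10' for c in cells]

# Drop the cells that are exactly '10'.
kept = [c for c, is_10 in zip(cells, exact) if not is_10]
print(kept)  # ['10;20', '20', '10;20', '20']
```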

Convert produced integers into a vertical list and reverse list order

This code converts decimal to binary, but the digits it prints still need to be reversed to display the correct number.
dec = int(input("Please enter number to convert to decimal: "))
while dec>0:
    quoteint = dec/2
    rem = dec%2
    print (int(rem))
    dec = int(dec/2)
I'm looking to get the numbers produced by the code above to be displayed on a single line. Eg
1 0 0 1 0 0
But the code currently prints each integer on its own line, like this.
1
0
0
1
0
0
I know I need to turn the integers into a list and then reverse the list to get it to display the correct binary number. Can someone explain how I could possibly do this?
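I think if you change the line
print (int(rem))
to
print (int(rem), end="")
that should do the job (use end=" " if you want spaces between the digits). Note that this only puts the digits on one line; they are still printed least-significant bit first, so they have to be reversed to read as the correct binary number.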
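The list-and-reverse approach mentioned in the question can be sketched as follows. The function name to_binary_digits is hypothetical, and integer division (//) is used in place of int(dec/2).

```python
def to_binary_digits(number):
    """Return the binary digits of a non-negative integer, most significant first."""
    if number == 0:
        return [0]
    remainders = []
    while number > 0:
        remainders.append(number % 2)  # least significant bit comes out first
        number //= 2
    remainders.reverse()  # put the most significant bit first
    return remainders


print(*to_binary_digits(36))  # 1 0 0 1 0 0
```

Unpacking the list with * lets print separate the digits with spaces on a single line.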

How to replace a digit in a string of numbers with a random number

I have a string which is a sequence of digits between 0 and 9. I would like to replace some of the digits in that string with random ones, but I'm unable to do this with either rand or randi, as I keep getting the following error:
Conversion to char from cell is not possible.
Error in work (line 58)
mut(i,7:14) = {mute_by};
Here's what I'm currently doing to try and alter some digits of my string:
% mutate probability 0.2
if (rand < 0.2)
    % pick best chromosome to mutate
    mut = combo{10};
    mute_by = rand([0,9]);
    for i = 1:5
        mut(i) = {mute_by};
    end
end
mut represents the string 110202132224154246176368198100
How would I go about doing this? I assumed it would be fairly simple but I've been going over the documentation for a while now and I can't find the solution.
What I would do is generate a logical array that is true at each position of your string that you want to replace and false otherwise. To build it, generate random floating-point numbers, one per character of your string, and check whether each value is < 0.2. Any position in your string that satisfies this constraint gets replaced with another random integer.
The best way to do this is to convert your string into actual numbers so that you can modify the positions numerically, create the logical array, then replace the values at the selected positions with random integers.
Something like this:
rng(123); %// Set seed for reproducibility
mut = '110202132224154246176368198100';
mut_num = double(mut) - 48;
n = numel(mut_num);
vec = rand(1, n) < 0.4;
num_elem = sum(vec);
mut_num(vec) = randi(10,1,num_elem) - 1;
mut_final = char(mut_num + 48);
Let's go through this code slowly. rng seeds the random number generator; I set the seed to 123 so that you can reproduce my results, since random number generation is, of course, random. I declare your number as a string, then use a nifty trick to turn each character of the string into a single element of a numeric array: cast to double, then subtract 48. Casting to double converts each character into its ASCII code, so '0' becomes 48, '1' becomes 49 and so on. Subtracting 48 therefore brings the values down to the range [0-9]. I then count how long your string is with numel, and decide which values to replace by generating the logical vector we talked about.
We then count how many elements need to change, generate a random integer vector of exactly that size, and use the logical vector to index into the converted string (stored as a numeric array), overwriting those positions with the random integers. Note that randi generates values from 1 up to whatever maximum you want. Because this starts at 1, I have to generate up to 10, then subtract 1. The output of randi gets placed into our numeric array, and then we convert the numeric array back into a string with char. Note that we need to add 48 to convert the numbers into their ASCII equivalents before creating the string.
I've changed the probability to 0.4 to actually see the changes better. Setting this to 0.2 I could barely notice any changes. mut_final would contain the changed string. Here is what they look like:
>> mut
mut =
110202132224154246176368198100
>> vec
vec =
Columns 1 through 13
0 1 1 0 0 0 0 0 0 1 1 0 0
Columns 14 through 26
1 1 0 1 1 0 0 0 0 0 0 0 1
Columns 27 through 30
1 1 1 0
>> mut_final
mut_final =
104202132444143248176368195610
vec contains those positions in the string you want to change, starting from the 2nd position, 3rd position, etc. The corresponding positions in mut change with respect to vec while the rest of the string is untouched and is finally stored in mut_final.
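For comparison, the same mask-and-replace idea can be sketched in Python (not MATLAB). The function name mutate_digits and its parameters are hypothetical, and because Python's random module is not MATLAB's generator, seed 123 will not reproduce the output shown above.

```python
import random


def mutate_digits(digits, prob, rng):
    """Replace each digit with a random digit 0-9 with probability `prob`.

    Returns the mutated string and the boolean mask of positions chosen
    for replacement (the counterpart of vec above).
    """
    mask = [rng.random() < prob for _ in digits]
    mutated = "".join(
        str(rng.randint(0, 9)) if chosen else ch
        for ch, chosen in zip(digits, mask)
    )
    return mutated, mask


rng = random.Random(123)  # fixed seed for reproducibility
original = "110202132224154246176368198100"
mutated, mask = mutate_digits(original, 0.4, rng)
print(original)
print(mutated)
```

As in the MATLAB version, a chosen position may receive the same digit it already had, so fewer characters may visibly change than the mask selects.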
