Extract words before a specific pattern - excel

I have a column with different names in the rows:
hello_world_xt_x_D3_m6
bye_bye_x_D1_m3
h1_man_xt_x_D3_m6
bonjour_no_x_D1_m12
I would like to remove the ending part which follows the pattern
_x_DN_mZ
where N is a number between 0 and 3 and Z is a number between 0 and 16.
I would like to have
hello_world_xt
bye_bye
h1_man_xt
bonjour_no
I think I should use a combination of search and trim/right, but I do not know how to apply it.
I have tried =substitute(a2, "_x_D2_m3",""), but I do not know how to extend it so that it works regardless of the numbers that follow D and m.

You could use wildcards in the search string: ? matches any single character and * matches any run of characters, so the number after m may have one or two digits.
Formula: =LEFT(A2,SEARCH("_x_D?_m*",A2)-1)
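For anyone who wants to check the expected results outside Excel, the same idea can be sketched in Python with a regular expression. This is only an illustration under stated assumptions: strip_suffix is a hypothetical helper name, and the pattern is stricter than the Excel wildcards because it only accepts D0 to D3 and m0 to m16 at the very end of the name.

```python
import re

# Hypothetical helper, not part of any answer above.
# Matches a trailing "_x_D<N>_m<Z>" with N in 0..3 and Z in 0..16.
SUFFIX = re.compile(r"_x_D[0-3]_m(?:1[0-6]|[0-9])$")

def strip_suffix(name: str) -> str:
    """Return name without the trailing _x_DN_mZ part, if present."""
    return SUFFIX.sub("", name)

names = [
    "hello_world_xt_x_D3_m6",
    "bye_bye_x_D1_m3",
    "h1_man_xt_x_D3_m6",
    "bonjour_no_x_D1_m12",
]
for n in names:
    print(strip_suffix(n))
```

Names that do not end in the pattern are returned unchanged, whereas the LEFT/SEARCH formula returns an error in that case.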

With data in column A, in B1 enter:
=MID(A1,1,FIND("_x_",A1)-1)
and copy downward.

Would this do?
=LEFT(A2,FIND("_x_",A2)-1)

Related

Replace part of string with values in other columns

I am trying to replace a part of string in "input1" with values in column "input2" to get the "output" as shown:
Any idea how to do this in MS excel?
Try the formula below:
=SUBSTITUTE(A2,"replace_this",B2)
In addition to the REPLACE and SUBSTITUTE functions, if you just need to add something before and/or after some other value (as shown in your picture), you could use simple concatenation:
="Text-before-value" & B2 & "Text-after-value"
    A                        B          C
1   Input1                   input2     output
2   abc_'replace_this' xyz   replace1   =REPLACE(A2,FIND("replace_this",A2,1),LEN("replace_this"),B2)

Retrieving part of a string in excel

I am trying to extract a part of a string in Excel (Excel for Mac ver.15.33) but I cannot figure out an appropriate formula structure.
Consider the following string in Excel cell A1:
Description:Guanine nucleotide-binding protein alpha-4 subunit:Gopi K. Podila:2006-05-06 Model Notes:editing needed -- 3' only editing needed at the middle portion of G protein alpha domain also:Gopi K. Podila:2006-05-06 Defline:Guanine nucleotide-binding protein alpha-4 subunit:Gopi K. Podila:2006-05-06 Literature:TITLE The genome sequence of Ustilago maydis:Gopi K. Podila:2006-02-10
I would like to extract everything between "Description:" and the first next ":" to appear.
I would also like to extract everything between "Defline:" and the first next ":" to appear.
Note that not every string I would like to perform this on will start with "Description:". The string can also start with "Defline:", "Model Notes:" or another label. The only constant is that whatever I would like to extract is placed between "A Word:" and ":".
Thank you very much in advance!
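With data in A1, in B1 enter the formula below. It replaces every ":" with 999 spaces, takes a 999-character window that contains only the second field, and trims the padding: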
=TRIM(MID(SUBSTITUTE($A1,":",REPT(" ",999)),2*999-998,999))
EDIT#1:
If "Description:" can occur anywhere in A1, then use:
=TRIM(MID(A1,FIND("Description:",A1)+LEN("Description:"),FIND(":",A1,FIND("Description:",A1)+LEN("Description:"))-(FIND("Description:",A1)+LEN("Description:"))))
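The FIND/MID logic above works for any label, so the same formula with "Defline:" in place of "Description:" should cover the second request. As a rough cross-check outside Excel, the steps can be sketched in Python. Here field_after is a hypothetical helper name, and the sample string is a shortened version of the one in the question.

```python
def field_after(text: str, label: str) -> str:
    """Return the text between label and the next ':' (hypothetical helper)."""
    start = text.find(label)
    if start == -1:
        return ""                    # label not present (Excel's FIND would error)
    start += len(label)              # first character after the label
    end = text.find(":", start)      # next colon after the label
    if end == -1:
        end = len(text)              # no closing colon: take the rest
    return text[start:end].strip()

sample = (
    "Description:Guanine nucleotide-binding protein alpha-4 subunit:"
    "Gopi K. Podila:2006-05-06 Defline:Guanine nucleotide-binding "
    "protein alpha-4 subunit:Gopi K. Podila:2006-05-06"
)
print(field_after(sample, "Description:"))
print(field_after(sample, "Defline:"))
```

Unlike the Excel formula, this sketch returns an empty string instead of an error when the label is missing.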

Converting bar-codes to specific format using IF, FIND & MID function of excel

In Excel I have a number of columns containing barcodes of different types such as:
WS-S5-S-L1-C31-F-U5-S9-P14 convert to 05-09-14
WS-S5-S-L1-C31-F-U5-S8-P1 convert to 05-08-01
WS-S5-N-L1-C29-V-U16-S6-P6 convert to 16-06-06
I want to convert these to 8 characters using the following rules:
remove the U and prefix 0 where appropriate
remove S and prefix 0 where appropriate
remove P and prefix 0 where appropriate
I believe there is a way to use the IF, FIND & MID functions to convert these in Excel but I don't know how to start. Any help will be much appreciated.
Well, this handles case 2 fine, but it does not deal with two-digit values such as P14 or U16, where the substituted 0 becomes an extra digit. You can take it further from here...
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(RIGHT(A1,LEN(A1)-FIND("U",A1,1)+1),"U",0),"S",0),"P",0)
This assumes the data is in cells A1 to A3; just drag the formula down. Very brute force though...
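To make the target rule explicit (take the numbers after U, S and P at the end of the barcode and zero-pad each to two digits), here is a small Python sketch. It is not a translation of the formula above; short_code is a hypothetical name, and it assumes the barcode always ends in -U<number>-S<number>-P<number>, as in the three examples.

```python
import re

def short_code(barcode: str) -> str:
    """Convert e.g. 'WS-S5-S-L1-C31-F-U5-S9-P14' to '05-09-14' (hypothetical helper)."""
    # Assumption: the U, S and P fields are the last three fields.
    m = re.search(r"-U(\d+)-S(\d+)-P(\d+)$", barcode)
    if m is None:
        raise ValueError(f"unexpected barcode format: {barcode!r}")
    # Zero-pad each number to two digits and join with hyphens.
    return "-".join(f"{int(g):02d}" for g in m.groups())

for code in (
    "WS-S5-S-L1-C31-F-U5-S9-P14",
    "WS-S5-S-L1-C31-F-U5-S8-P1",
    "WS-S5-N-L1-C29-V-U16-S6-P6",
):
    print(code, "->", short_code(code))
```

Padding each number separately is what lets P14 and U16 come out right, which the nested SUBSTITUTE approach cannot do.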

Find string (from table) in cell in matlab

I want to find the location of a string (which I take from a table) inside a cell array:
A is my table, and B is the cell array.
I have tried:
strncmp(A(1,8),B(:,1),1)
but it couldn't find the location.
I have tried many commands, such as:
ismember, strmatch, find(strcmp), find(strcmpi), find(ismember), strfind, etc., but they all give me errors, mostly because of the type of my data!
So please suggest a solution.
You want strfind:
>> strfind('0123abcdefgcde', 'cde')
ans =
7 12
If A is a table and B a cell array, you need to index this way:
strfind(B{1}, A.VarName{1});
For example:
>> A = cell2table({'cde'},'VariableNames',{'VarName'}); %// create A as table
>> B = {'0123abcdefgcde'}; %// create B as cell array of strings
>> strfind(B{1}, A.VarName{1})
ans =
7 12
Luis Mendo's answer is absolutely correct, but I want to add some general information.
Your problem is that the functions you tried (strfind, ...) expect plain strings as the search text, not a table or a cell. Indexed with parentheses as in your code snippet, A and B stay containers: A(1,8) is still a 1-by-1 table and B(:,1) is still a cell array. You need to use curly braces {} to "get rid of" the container and get the contained string. Luis Mendo shows how to do this.
Modified solution from a MathWorks forum, for the case of a single-column table with ragged strings:
find(strcmp('mystring',mytable{:,:}))
will give you the row number.

in excel, I want to count the number of cells that do not contain a specific character

In Excel, I want to count the number of cells that do not contain a specific character (in this case, a period ".").
I tried something like countif(A1:A10,"<>.*") but this is wrong and I can't seem to figure it out.
Say I have these data in column A:
D
N
P
.
.
A
N
.
P
.
And the count would be 6
For your example:
=COUNTIF(A1:A10,"<>.")
returns 6. It would be a different story if, say, you also wanted to exclude cells such as P. from the count.
Your data may not be quite what you think it is, however, because including the * should make no difference for your example.
Alternatively, subtract the number of period-only cells from the number of text cells, which leaves the non-period cells:
=COUNTIF(A1:A10,"*")-COUNTIF(A1:A10,"=.")
gives 6.
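If your data includes cells that contain a period along with other characters and you want to exclude those as well, use:
=COUNTA(A1:A10)-COUNTIF(A1:A10,"*.*")
This counts the cells with no period anywhere. It returns 5 if, for example, one of the P cells above is changed to "P."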
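The difference between "not exactly a period" and "no period anywhere" can be checked with a short Python sketch over the sample column. The function names are hypothetical, and the second list is an assumed variation of the sample data in which one P has become "P."

```python
def count_not_equal(cells, ch="."):
    """Cells whose whole content differs from ch, like COUNTIF(range,"<>.")."""
    return sum(1 for c in cells if c != ch)

def count_without(cells, ch="."):
    """Cells that do not contain ch anywhere, like COUNTA - COUNTIF(range,"*.*")."""
    return sum(1 for c in cells if ch not in c)

sample = ["D", "N", "P", ".", ".", "A", "N", ".", "P", "."]
# Assumed variation: the first "P" now has a trailing period.
mixed = ["D", "N", "P.", ".", ".", "A", "N", ".", "P", "."]

print(count_not_equal(sample), count_without(sample))
print(count_not_equal(mixed), count_without(mixed))
```

On the original sample both counts agree. They only diverge once a cell mixes a period with other characters. Note that in Excel the "<>." criterion also counts empty cells, which this sketch does not model because the sample range has none.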
