Extract number or text from One row data - excel-formula

I need a solution for this:
/Women/Dresses/Short-sleeved-Peplum-Dress/p/8503311?utm_extID=Dec10
I need to extract the data between /p/ and ?, i.e. 8503311.
The total length of this line is 67, but it varies and is not fixed.
I tried using the FIND function but could not get a solution.

If it's always after p/ and the number has a fixed length (7 in your example), this should work:
=MID(A1,SEARCH("p/",A1)+2,7)
This is assuming data in A1.

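In Python, it could look like this:
import re
x = '/Women/Dresses/Short-sleeved-Peplum-Dress/p/8503311?utm_extID=Dec10'
# raw string so that \d is passed to the regex engine unchanged
print(re.search(r'(?<=/p/)\d+', x).group())  # 8503311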

Assuming the data string is in A1, this should return everything between, but not including, the first instance of /p/ and the first ? thereafter:
=MID(A1,FIND("/p/",A1)+3,FIND("?",MID(A1,FIND("/p/",A1)+4,LEN(A1))))
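For comparison, the same "between two markers" logic can be sketched in Python with plain string searches instead of a regex. This is only a rough sketch: the function name extract_between and its parameters are made up for illustration, and unlike the Excel formula it returns the rest of the string when no "?" follows, rather than raising an error.

```python
from typing import Optional


def extract_between(text: str, start_marker: str = "/p/", end_marker: str = "?") -> Optional[str]:
    """Return the text between the first start_marker and the next end_marker.

    Hypothetical helper: returns None if start_marker is missing, and the
    rest of the string if end_marker does not appear after start_marker.
    """
    start = text.find(start_marker)
    if start == -1:
        return None
    start += len(start_marker)          # skip past "/p/"
    end = text.find(end_marker, start)  # first "?" after the marker
    if end == -1:
        end = len(text)
    return text[start:end]


url = "/Women/Dresses/Short-sleeved-Peplum-Dress/p/8503311?utm_extID=Dec10"
print(extract_between(url))
```

Because the end position is searched for, the length of the number does not need to be fixed.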

Related

I need a function for excel to convert to camel case

I need a function for excel which would convert
'random.text.random'
to
'randomTextRandom'
Assuming your value is in cell A1 ...
=LOWER(LEFT(A1,1)) & MID(SUBSTITUTE(PROPER(SUBSTITUTE(A1,"."," "))," ",""),2,LEN(A1))
... or if you're expected to have the first character capitalized ...
=UPPER(LEFT(A1,1)) & MID(SUBSTITUTE(PROPER(SUBSTITUTE(A1,"."," "))," ",""),2,LEN(A1))
Try the formula below. Note that it returns RandomTextRandom, with the first letter capitalized as well:
=SUBSTITUTE(PROPER(A1),".","")
Convert to Camel Case
=LEFT(A1,FIND(".",A1)-1)&SUBSTITUTE(PROPER(RIGHT(A1,LEN(A1)-FIND(".",A1))),".","") = randomTextRandom
To the left of the first dot:
LEFT(A1,FIND(".",A1)-1) = random
To the right of the first dot:
RIGHT(A1,LEN(A1)-FIND(".",A1)) = text.random
Make it proper:
PROPER(...) = Text.Random
Remove the remaining dot:
SUBSTITUTE(...,".","") = TextRandom
As I mentioned in the comments above, I am adding this as an answer as well. You can also wrap it in UPPER, LOWER or PROPER as required; the basic step is to remove the dots and concatenate the parts. Note that on its own this formula returns randomtextrandom, without capitalizing the inner words.
Formula used in cell B1:
=SUBSTITUTE(A1,".","")
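Outside Excel, the conversion described in the answers above (keep the first part lowercase, capitalize every later part, drop the dots) can be sketched in Python. The function name to_camel_case is an assumption for illustration and not part of any answer.

```python
def to_camel_case(dotted: str) -> str:
    """Convert 'random.text.random' style input to camelCase.

    Mirrors the LEFT/PROPER/SUBSTITUTE approach: the part before the
    first dot stays lowercase, each later part gets a capital first letter.
    """
    head, *tail = dotted.split(".")
    return head.lower() + "".join(word.capitalize() for word in tail)


print(to_camel_case("random.text.random"))
```

Like PROPER, str.capitalize also lowercases the remaining letters of each part.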

How to extract the characters from a string in Excel

Hi, I would like to dynamically extract the numbers from strings in Excel.
I have the following strings and I would like to take only the numbers before ".pdf" out of each string and put them into the next column.
As you can see, the number of characters varies from line to line.
I came up with something like this:
=MID(M20;SEARCH("_";M20);20)
But this returns everything from the first "_" onward, including ".pdf".
How can I get the result I want?
D:\Users\xxxx\Desktop\1610\ts25b_4462.pdf
D:\Users\xxx\Desktop\1610\ts02b_39522.pdf
D:\Users\xxxxx\Desktop\1610\ts02b_except_39511.pdf
D:\Users\xxxx\Desktop\1610\ts02b_except_39555.pdf
D:\Users\xxxx\Desktop\1610\ts22b_6118.pdf
So that I have just:
4462
39522
39511
39555
6118
and so on...
Thank you!!!
With VBA, try to do it like this:
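Public Function splitThings(strInput As String) As String
    Dim parts() As String
    parts = Split(strInput, "_") ' split on every underscore
    ' take the piece after the last underscore, then drop the extension
    splitThings = Split(parts(UBound(parts)), ".")(0)
End Function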
Concerning your formula, try =LEFT(MID(M20;SEARCH("_";M20)+1;20);K), where K is the number of characters to keep: the length of the MID result minus 4, the length of ".pdf".
Something like this should do the work:
=LEFT(MID(I3,SEARCH("_",I3)+1,LEN(I3)),LEN(MID(I3,SEARCH("_",I3),LEN(I3)))-5)
You can do it using an Excel formula. For example:
=SUBSTITUTE(LEFT(A1,FIND(".pdf",A1)-1),LEFT(A1,FIND("_",A1)),"")
Using the first line as an example, LEFT(A1,FIND(".pdf",A1)-1) gives D:\Users\xxxx\Desktop\1610\ts25b_4462 and LEFT(A1,FIND("_",A1)) gives D:\Users\xxxx\Desktop\1610\ts25b_. If you SUBSTITUTE the second part with "" inside the first, you are left with 4462.
Hope this can help.
With this formula, you should be able to get the numbers you want:
=MID(A1,FIND("|",SUBSTITUTE(A1,"_","|",LEN(A1)-LEN(SUBSTITUTE(A1,"_",""))))+1,FIND(".",A1)-FIND("|",SUBSTITUTE(A1,"_","|",LEN(A1)-LEN(SUBSTITUTE(A1,"_",""))))-1)
Basically, this is the initial formula:
=MID(A1,FIND("_",A1)+1,FIND(".",A1)-FIND("_",A1)-1)
But since there may be two _ in the string, this is the part that marks the last _ by replacing it with |:
=SUBSTITUTE(A1,"_","|",LEN(A1)-LEN(SUBSTITUTE(A1,"_","")))
Now just replace A1 in each FIND("_",A1) above with this SUBSTITUTE, search for "|" instead of "_", and you get that long formula. Hope this helps.
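To make the marker trick easier to follow, here is a rough Python sketch of the same steps: count the underscores, replace only the last one with "|", then take what lies between "|" and the first dot. The helper names substitute_nth and number_after_last_underscore are invented for this illustration, and the sketch assumes, as the formula does, that the path contains no "|" and no dot other than the one before the extension.

```python
def substitute_nth(text: str, old: str, new: str, n: int) -> str:
    """Replace only the n-th occurrence (1-based) of old,
    similar to Excel's SUBSTITUTE with an instance number."""
    if n < 1:
        return text
    start = 0
    pos = -1
    for _ in range(n):
        pos = text.find(old, start)
        if pos == -1:
            return text  # fewer than n occurrences: leave unchanged
        start = pos + len(old)
    return text[:pos] + new + text[pos + len(old):]


def number_after_last_underscore(path: str) -> str:
    count = len(path) - len(path.replace("_", ""))  # number of underscores
    marked = substitute_nth(path, "_", "|", count)  # mark the last one
    begin = marked.find("|") + 1                    # first char after the marker
    end = path.find(".")                            # first dot, before the extension
    return path[begin:end]


print(number_after_last_underscore(r"D:\Users\xxxxx\Desktop\1610\ts02b_except_39511.pdf"))
```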
This will return the number you want regardless of extension (could be .pdf, could be .xlsx, etc) and regardless of the number of underscores present in the filename and/or filepath:
=TRIM(LEFT(RIGHT(SUBSTITUTE(SUBSTITUTE(M20,".",REPT(" ",LEN(M20))),"_",REPT(" ",LEN(M20))),LEN(M20)*2),LEN(M20)))
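The padding trick in this last formula can be hard to read, so here is a small Python sketch that follows it step by step: every "." and "_" is replaced by a run of spaces as long as the whole string, which pushes each piece into its own window, and the second window from the right is then trimmed. The function name last_token_before_extension is an assumption for illustration only.

```python
def last_token_before_extension(path: str) -> str:
    n = len(path)
    pad = " " * n                                     # REPT(" ", LEN(M20))
    padded = path.replace(".", pad).replace("_", pad)
    tail = padded[-2 * n:]                            # RIGHT(..., LEN(M20)*2)
    return tail[:n].strip()                           # TRIM(LEFT(..., LEN(M20)))


paths = [
    r"D:\Users\xxxx\Desktop\1610\ts25b_4462.pdf",
    r"D:\Users\xxxxx\Desktop\1610\ts02b_except_39511.pdf",
]
for p in paths:
    print(last_token_before_extension(p))
```

One difference to keep in mind: Excel's TRIM also collapses repeated spaces inside the text, while str.strip only removes them at the ends. For these file names that makes no difference.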

MATLAB export data stored in a double array and cell array to a CSV file

I have a MATLAB structure with 19 fields. The main field is a 1 x 108033 double with all values numeric. It looks like this, basically 108033 numbers:
pnum: 5384940 5437561 5570271 5661637 5771155 ...
I have another field called inventors which is a 1 x 108033 cell value. Every cell contains a different number of strings. Columns 1 to 5 for example are
inventors: {2x1 cell} {4x1 cell} {1x1 cell} {1x1 cell} {1x1 cell}
For the first column value, the 2 x 1 cell consists of the following values
5012491-01 and 2035147-03 and so on.
I'd like to jointly export these two to a CSV file. The ideal outcome would repeat the number in pnum so that it establishes a clear link between the pnum and the inventors. Thus, the ideal outcome would look something like this (with the contents of what is in the inventors cell displayed).
pnum inventors
5384940 5012491-01
5384940 2035147-03
5437561 5437561-01
5437561 5437561-02
5437561 5437561-03
5437561 5012491-02
5570271 5437561-03
5661637 1885634-08
5771155 5012491-01
I asked a more complex version of this question before but it was not clear enough what the problem was. Hope it is now.
I'm assuming each cell in inventors is a cell array of strings. It wouldn't make sense for these to be actual floating point or integer numbers, because the dash would be interpreted as subtracting the two numbers on either side of it. Now, because you're writing to a CSV file, the easiest thing I can think of is to iterate over each number and cell, then repeat the ID number as many times as there are elements in the cell. First create the right headers, then write your results. Something like this comes to mind:
f = fopen('data.csv', 'w'); %// Open up data for writing
fprintf(f, 'pnum,inventors\n'); %// Write headers
for ii = 1 : numel(pnum) %// For each unique number
    inventor = inventors{ii};
    for jj = 1 : numel(inventor) %// For each inventor ID
        fprintf(f, '%d,%s\n', pnum(ii), inventor{jj}); %// Write the right combo to file
    end
end
fclose(f); %// Close the file
fopen here opens up a file called data.csv so we can write things to it. What is returned is a file pointer called f, which we use to write stuff to this file. After, we write the headers of the file, consisting of pnum and inventors. This is a CSV file so there's a comma separating the two. Now, for each unique number, we then access the right slot in inventors then for each unique inventor, add the same unique ID with the right inventor ID as a line in this file. I use fprintf to write things to file using the associated file pointer established earlier. Once we're done, close the file with fclose.
To double check that this works, I've used the small example you've provided in your post:
pnum = [5384940 5437561 5570271 5661637 5771155];
inventors = {{'5012491-01', '2035147-03'}.', {'5437561-01', '5437561-02', '5437561-03', '5012491-02'}.', {'5437561-03'}, {'1885634-08'}, {'5012491-01'}};
Bear in mind that I don't have access to your struct, so you'll have to access the right fields and assign them to the corresponding variables seen above. So if your struct is called something like data, then you'd do this before you run the above code:
pnum = data.pnum;
inventors = data.inventors;
Running the above code I just wrote and opening up the CSV file (which is called data.csv), I get this:
pnum,inventors
5384940,5012491-01
5384940,2035147-03
5437561,5437561-01
5437561,5437561-02
5437561,5437561-03
5437561,5012491-02
5570271,5437561-03
5661637,1885634-08
5771155,5012491-01
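If the data ever needs the same treatment outside MATLAB, the nested loop above translates fairly directly. The following is a minimal Python sketch under the assumption that pnum is a list of integers and inventors is a list of lists of strings; the function name write_pairs is made up for illustration, and it writes to any file-like object rather than to data.csv directly.

```python
import csv
import io


def write_pairs(pnum, inventors, out):
    """Write one CSV row per (pnum, inventor ID) pair, repeating pnum as needed."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["pnum", "inventors"])       # header row
    for number, ids in zip(pnum, inventors):     # one entry per patent number
        for inventor_id in ids:                  # one row per inventor ID
            writer.writerow([number, inventor_id])


pnum = [5384940, 5437561, 5570271, 5661637, 5771155]
inventors = [
    ["5012491-01", "2035147-03"],
    ["5437561-01", "5437561-02", "5437561-03", "5012491-02"],
    ["5437561-03"],
    ["1885634-08"],
    ["5012491-01"],
]

buffer = io.StringIO()
write_pairs(pnum, inventors, buffer)
print(buffer.getvalue())
```

To write to disk instead, pass a file opened with open("data.csv", "w", newline="") in place of the buffer.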

Separate number with symbol in Excel

I am looking for a function in Microsoft Excel that can separate a number from the characters that follow it. For example:
I want to extract the number from this code:
12345_ABCD
I am looking for a function that lets me get the number (in this case 12345) without the other characters (in this case _ABCD).
The problem is that the number of digits can vary. For example, it can be 12345_ABCD or 234_ABC or 34567_AB.
Please help me with this problem. Thanks for your time.
If your cells always have numbers in front and then an underscore you could use this:
=Left(A1;Find("_";A1)-1)
Here is a function to get the number alone (in your case):
Public Function segregatenumber(a As String) As String
    ' return everything before the first underscore
    segregatenumber = Left(a, InStr(a, "_") - 1)
End Function
Then you can use it as a function in Excel cells.
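The same "everything before the first underscore" idea, sketched in Python for comparison. The function name number_before_separator is hypothetical; unlike the Excel and VBA versions, which fail when no underscore is present, this sketch returns the whole input in that case.

```python
def number_before_separator(code: str, sep: str = "_") -> str:
    """Return the part of code before the first sep (the whole string if sep is absent)."""
    return code.partition(sep)[0]


for code in ["12345_ABCD", "234_ABC", "34567_AB"]:
    print(number_before_separator(code))
```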

xlsread with a file name taken from a string stored in an array element (not typed in directly)

I would like to read an Excel file with xlsread, but instead of typing the file name manually every time, I want to pass xlsread the file name stored in an array.
For example, my array B is:
B =
'john.xlsx'
'mais.xlsx'
'car.xlsx'
Then I would like to read the Excel file whose name is in the first element, that is, "john.xlsx".
How can I do this?
data = xlsread(B{1});
Or, if you want to read all of them:
for i=1:length(B)
data(i).nums = xlsread(B{i});
end
This assumes, of course, that B is a cell array. If it's not, it can't exist the way you described it, because all rows of a char array must have the same length. If all strings did have the same length (or were padded with spaces), B could be a char array, and you could split it into a cell array using
B = mat2cell(B,ones(size(B,1),1),size(B,2));
Strings of different lengths would have to be inside a cell array, whose elements you can access with curly braces {}. So, you can call xlsread on the first element this way:
names{1} = 'john.xlsx';
names{2} = 'mais.xlsx';
names{3} = 'car.xlsx';
num = xlsread(names{1});

Resources