Related
I am not sure where to begin with the formula as I have gotten myself so confused with everything. I have a cell the contains "PON " or "PON: " or "PON = " then the actual PON (Example: PON 123467) I want to formula to return 123467 in the cell.
Examples What I want returned
I have PON 123467 for shoes 123467
I have PON: 234567-AB for food 234567-AB
I have PON - 569874-Weird for accessories 569874-Weird
I have PON = DOG-564-987 for dog food DOG-564-987
I am currently using Excel 365
Filterxml() will give you best companion here in this case. Try-
=FILTERXML("<t><s>"&SUBSTITUTE(FILTERXML("<t><s>"&SUBSTITUTE(A1," for","</s><s>")&"</s></t>","//s[1]")," ","</s><s>")&"</s></t>","//s[last()]")
Using FILTERXML, and testing for a substring following PON, you can try:
=FILTERXML("<t><s>"&SUBSTITUTE(TRIM(A1)," ","</s><s>") & "</s></t>","//s[contains(.,'PON')]/following-sibling::*[string-length(.)>2][1]")
Note that FILTERXML solution will cause a PON that is solely numeric, but with a leading zero, to drop the leading zero. Unfortunately, the xPath implementation in that function does not include the string() function
If dropping the leading zero might be a problem, you can add a character to the node that will force the number to be seen as a string. In the modified formula below, I use the unicode zero-width space, but there are others you can use. Note that this will count as a character for the string=length function, so be sure to maintain the >2 parameter:
=FILTERXML("<t><s>"&SUBSTITUTE(TRIM(A1)," ","</s><s>"&UNICHAR(8203)) & "</s></t>","//s[contains(.,'PON')]/following-sibling::*[string-length(.)>2][1]")
Because of the variablity in your data, that sometimes there are extraneous space-separated substrings between PON and your desired extract, the xpath:
locates the substring PON
returns all subsequent siblings that have a string-length of more than two (adjust if necessary)
returns the first sibling that meets that criterion.
You might try this formula.
=TRIM(LEFT(MID(A2,FIND(#{1,2,3,4,5,6,7,8,9},A2),100),FIND(" ",MID(A2,FIND(#{1,2,3,4,5,6,7,8,9},A2),100))))
It extracts the text between the first number and the first space following that number. The size of that extract is limited to 100 characters.
So How to get substring from nth position up to the end of string?
Input at cell A1 Name: Thomas B.
Expected output: Thomas B.
I know some way to do it but I wonder if there are other elegant ways than them? (some kind of =RIGHT(A1, -6)....)
=MID(A1, 6, 999999) //999999 looks not so good
=MID(A1, 6, LEN(A1) - 5) //must calculate 2 times, first get len, then get substring, seems too much works?
REPLACE
As Dominique already wrote:
'Why don't you just replace the first six characters by an empty string?'
=REPLACE(A1,1,6,"")
I've done some time measuring, but the difference is less than a second at 50000 records (for LEFT, MID, REPLACE & SUSTITUTE). So I'm afraid ELEGANCE is all you're going to get.
A Small Study
I created this study due to the fact that when you say from the n-th character, your n-th character is 7 (your MID-s are wrong), but you want to remove the first n-1 (6) characters. So depending on how you formulate your question, you might have a different approach in RIGHT or MID, and you will remember REPLACE and SUBSTITUTE or you may not.
Small Study Formulas for A1 (*) and B1 (#, ?, *)
Get String From N-th Character to the End, e.g. 7
=RIGHT(A1,LEN(A1)-(B1-1))
=RIGHT(A1,LEN(A1)-B1+1)
=RIGHT(A1,LEN(A1)-6)
=MID(A1,B1,LEN(A1)-(B1-1))
=MID(A1,B1,LEN(A1)-B1+1)
=MID(A1,B1,LEN(A1))
=MID(A1,7,LEN(A1)-6)
=MID(A1,7,LEN(A1))
Remove N First Characters of a String, e.g. 6
=RIGHT(A1,LEN(A1)-B1)
=RIGHT(A1,LEN(A1)-6)
=MID(A1,B1+1,LEN(A1)-B1)
=MID(A1,B1+1,LEN(A1))
=MID(A1,7,LEN(A1)-6)
=MID(A1,7,LEN(A1))
Get String After a Character e.g. " "
=RIGHT(A1,LEN(A1)-(FIND(B1,A1)))
=RIGHT(A1,LEN(A1)-(FIND(" ",A1)))
=MID(A1,FIND(B1,A1)+1,LEN(A1)-FIND(B1,A1))
=MID(A1,FIND(B1,A1)+1,LEN(A1))
=MID(A1,FIND(" ",A1)+1,LEN(A1)-FIND(" ",A1))
=MID(A1,FIND(" ",A1)+1,LEN(A1))
Get String After a String e.g. ": "
=RIGHT(A1,LEN(A1)-(FIND(B1,A1)+LEN(B1))+1)
=RIGHT(A1,LEN(A1)-FIND(B1,A1)-LEN(B1)+1)
=RIGHT(A1,LEN(A1)-FIND(": ",A1)-LEN(": ")+1)
=MID(A1,FIND(B1,A1)+LEN(B1),LEN(A1)-(FIND(B1,A1)+LEN(B1))+1)
=MID(A1,FIND(B1,A1)+LEN(B1),LEN(A1)-FIND(B1,A1)-LEN(B1)+1)
=MID(A1,FIND(B1,A1)+LEN(B1),LEN(A1))
=MID(A1,FIND(": ",A1)+LEN(": "),LEN(A1)-FIND(": ",A1)-LEN(": ")+1)
=MID(A1,FIND(": ",A1)+LEN(": "),LEN(A1))
Back to Remove N First Characters of a String, e.g. 6
=SUBSTITUTE(A1,LEFT(A1,6),"",1)
=REPLACE(A1,1,6,"")
Well, both of your methods already work, but you could also use this one:
=RIGHT(A1,LEN(A1)-6)
(you nearly had this one in your own question)
or this one:
=TRIM(MID(A1,FIND(":",A1)+1,100))
(the FIND() function returns the numeric position of a search string, so is great for doing dynamic substrings)
Why don't you just replace the first six characters by an empty string?
=SUBSTITUTE(A1;LEFT(A1;6);"";1)
Another possibility is that you create a constant with the value 2^31-1 (=2147483647), which is the maximum signed integer value on 32-bit systems, and you give it a nice name, like MaxInt, then your first formula will be efficient and nice looking, too:
=MID(A1, 6, MaxInt)
You can add the Name with Ctrl+F3. If you are interested in fast calculations, giving it as 2147483647 rather than 2^31-1 may have some (very little) advantage.
I have Excel sheet which contains data similar to
Addresses
xyz,abc,olk
opn,opk,prt
we-ylj,tyf,uyfas
oiui,ytfy,tydry - We also work in bla,bla,bla
ytfyt,tyfyt,ghfyt
i-hgsd,gsdf-hgd,sdgh,- We also work in xxx,yy,zzz
ytsfgh,gfasdg,tydsfyt
I want to remove all substring which is next to the character "-" only if it's in the last position.
Result should be like
xyz,abc,olk
opn,opk,prt
we-ylj,tyf,uyfas
oiui,ytfy,tydry
ytfyt,tyfyt,ghfyt i-hgsd,gsdf-hgd,sdgh
ytsfgh,gfasdg,tydsfyt
I tried with =Substitute function but unable to replace data because of the last substring separated from "-" is not similar.
Going by your specifications, I would use two columns just so it's not a very long formula:
In B1:
=IFERROR(FIND(CHAR(1),SUBSTITUTE(A1,"-",CHAR(1),LEN(A1)-LEN(SUBSTITUTE(A1,"-",""))))-1,LEN(A1))
This gets the position of the last - or the full text length.
Then in C1:
=LEFT(A1,IF(FIND(",",A1)<B1,B1,LEN(A1)))
This checks if there's a , before the last -. If there is no ,, then the full text is taken.
EDIT: I only now noticed your edited comment. If it's just everything after - We, then I would use this:
=TRIM(LEFT(A1,IFERROR(FIND("- We",A1)-2,LEN(A1))))
I have an error in this excel formula and I can't just figure it out:
=LEFT(B3,FIND(",",B3&",")-1)&","&RIGHT(B3,LEN(B3)-FIND("&",B3&"&")),RIGHT(B3,LEN(B3)-SEARCH("#",SUBSTITUTE(B3," ","#",LEN(B3)-LEN(SUBSTITUTE(B3," ",""))))&", "&SUBSTITUTE(RIGHT(B3,LEN(B3)-FIND("&",B3&"&")-1),RIGHT(B3,LEN(B3)-SEARCH("#",SUBSTITUTE(B3," ","#",LEN(B3)-LEN(SUBSTITUTE(B3," ",""))))),""))
It may seem like a big formula, but all it's intended to do is if no ampersand is in a cell, return an empty cell, if no comma but ampersand exists, then return this, for example:
KNUD J & MARIA L HOSTRUP
into this:
HOSTRUP,MARIA L
Otherwise, there is no ampersand but there is a comma so we just return: LEFT(A1,FIND("&",A1,1)-1).
Seems basic, but formula has been giving me error message and doesn't point to the problem.
Your error is here:
=LEFT(B3,FIND(",",B3&",")-1)&","&RIGHT(B3,LEN(B3)-FIND("&",B3&"&")),
At this point, the comma doesn't apply to anthing, because the right operator has matching parens
As far as what you want? Let's break that up into what you actually asked for:
if no ampersand in a cell, return empty cell,
B4=Find("&", B3&"&")
B5=IF(B4>LEN(B3),"",B6)
if no comma but ampersand exists
B6=IF(FIND(",", B3&",")>LEN(B3),B8,B7)
then turn this, for example:
KNUD J & MARIA L HOSTRUP
into this:
HOSTRUP,MARIA L
I'm presuming you mean to put the last whole word? Let's mark the last whole word:
B9=SUBSTITUTE(B3," ","#",LEN(B3)-LEN(SUBSTITUTE(B3," ","")))
B10=RIGHT(B7,LEN(B9)-FIND("#",B9))
And the stuff between the ampersand and the last word
B11=TRIM(MID(B9,B4 + 1, LEN(B9)-FIND("#",B9)-1))
Then calculating it is easy
B7=B10&","&B11
Otherwise, there is no ampersand but there is a comma so we just return:
LEFT(A1,FIND("&",A1,1)-1).
Well, if you want that, let's just put that in B8
B8=LEFT(A1,FIND("&",A1,1)-1)
(But I think you actually mean B3 instead of A1)
B8=LEFT(B3,FIND("&",B3,1)-1)
And there you have it (B5 contains the information you're looking for) It took a few cells, but it's easier to debug this way. If you want to collapse it, you can (but doing so is more code, because we can reduce duplication by referencing a previously calculated cell on more than one occasion).
Summary:
B3=<Some Name with & or ,>
B4=FIND("&", B3&"&")
B5=IF(B4>LEN(B3),"",B6)
B6=IF(FIND(",", B3&",")>LEN(B3),B7,B8)
B7=B10&","&B11
B8=LEFT(B3,FIND("&",B3,1)-1)
B9=SUBSTITUTE(B3," ","#",LEN(B3)-LEN(SUBSTITUTE(B3," ","")))
B10=RIGHT(B9,LEN(B9)-FIND("#",B9))
B11=TRIM(MID(B9,B4 + 1, LEN(B9)-FIND("#",B9)-1))
When I put in "KNUD J & MARIA L HOSTRUP", I get "HOSTRUP,MARIA" in B5.
I have a set of data that shown below on excel.
R/V(208,0,32) YR/V(255,156,0) Y/V(255,217,0)
R/S(184,28,16) YR/S(216,128,0) Y/S(209,171,0)
R/B(255,88,80) YR/B(255,168,40) Y/B(255,216,40)
And I want to separate the data in each cell look like this.
R/V 208 0 32
R/S 184 28 16
R/B 255 88 80
what is the function in excel that I can use for this case.
Thank you in advance.
kennytm doesn't provide an example so here's how you do substrings:
=MID(text, start_num, char_num)
Let's say cell A1 is Hello.
=MID(A1, 2, 3)
Would return
ell
Because it says to start at character 2, e, and to return 3 characters.
In Excel, the substring function is called MID function, and indexOf is called FIND for case-sensitive location and SEARCH function for non-case-sensitive location. For the first portion of your text parsing the LEFT function may also be useful.
See all the text functions here: Text Functions (reference).
Full worksheet function reference lists available at:
Excel functions (by category)
Excel functions (alphabetical)
Another way you can do this is by using the substitute function. Substitute "(", ")" and "," with spaces.
e.g.
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1, "(", " "), ")", " "), ",", " ")
I believe we can start from basic to achieve desired result.
For example, I had a situation to extract data after "/". The given excel field had a value of 2rko6xyda14gdl7/VEERABABU%20MATCHA%20IN131621.jpg . I simply wanted to extract the text from "I5" cell after slash symbol. So firstly I want to find where "/" symbol is (FIND("/",I5). This gives me the position of "/". Then I should know the length of text, which i can get by LEN(I5).so total length minus the position of "/" . which is LEN(I5)-(FIND("/",I5)) . This will first find the "/" position and then get me the total text that needs to be extracted.
The RIGHT function is RIGHT(I5,12) will simply extract all the values of last 12 digits starting from right most character. So I will replace the above function "LEN(I5)-(FIND("/",I5))" for 12 number in the RIGHT function to get me dynamically the number of characters I need to extract in any given cell and my solution is presented as given below
The approach was
=RIGHT(I5,LEN(I5)-(FIND("/",I5))) will give me out as VEERABABU%20MATCHA%20IN131621.jpg . I think I am clear.
Update on 11/30/2022
With new excel functions, you can use the following in cell C1 for the input in A1:
=TEXTJOIN(" ",,TEXTSPLIT(A1,{"(",",",")"}))
Here is the output:
What about using Replace all?
Just replace All on bracket to space.
And comma to space. And I think you can achieve it.