extract certain text after certain characters - excel

what is the easiest way with an Excel formula to extract certain details from a cell? So for example, if this is in cell A1 column=""HMI_LOCATE"" px=""CLASS"" position=""99"" validation=""ROOM"" then I'm trying to extract just the data the falls in between the double "" after the px= so in this example, I need to extract just the letters CLASS and nothing else, what is the easiest way to extract that data, the part I'm trying to extract won't always be 5 characters long it could be much longer or shorter.

Do you want to achieve this?
With o365 you can use this formula
=FILTERXML("<t><s>"&SUBSTITUTE(A1,CHAR(34)&CHAR(34),"</s><s>")&"</s></t>","//s[position() mod 2 = 0]")
or for older EXCEL-versions
=IFERROR(INDEX(FILTERXML("<t><s>"&SUBSTITUTE($A$1,CHAR(34)&CHAR(34),"</s><s>")&"</s></t>","//s"),ROW(A1)*2),"-")
This splits the string at the quotation marks (CHAR(34)) and builds an array of elements. Then every second element is put out.
For tons of other possibilities have a look at this awesome guide by JvdV.
EDIT:
To get the element after px= no matter where it is, you can use
=LET(list,
FILTERXML("<t><s>"&SUBSTITUTE($A$1,CHAR(34)&CHAR(34),"</s><s>")&"</s></t>","//s"),
INDEX(list,MATCH("px=",list,0)+1)
)
The LET-function lets you assign functions to variables which then can be used for further calculations.

Related

How can I check that a string only contains a defined set of substrings by Excel formula?

I have a dictionary containing lots of words - I want the user to be able to input a list of substrings, and then a filtered list will be updated, containing only words that contain those substrings and nothing else. Any words that contain extra characters the user didn't specify, should not appear. Cell F3 will use a FILTER function to create the list. As in the mock-up below:
What I need is a formula that would generate the TRUE or FALSE flags from the yellow section (B3:B9), but I'm not sure how to go about this.
I'm sure this could be solved by VBA or Regex using Google Sheets, but I want to know if there's a way to do this by formula, as I don't want this to require a button press or script execution, and my spreadsheet can't be hosted on Google sheets due to its size. Any ideas?
You can also use a combination of ISNUMBER and SUMPRODUCT:
=ISNUMBER(SUMPRODUCT(MATCH(MID(A3,ROW(INDEX(A:A,1,1):INDEX(A:A,LEN(A3),1)),1),$D$3:$D$5,0)))
Adjusted formula:
=ISNUMBER(SUMPRODUCT(MATCH(MID(A3,ROW(A$1:INDEX(A:A,LEN(A3))),1),$D$3:$D$5,0)))
The result:
The test being ran below is subtracting each instance of your dictionary from the length of original string. If the result is 0, this returns TRUE. If not, this returns FALSE. This is not case sensitive - a & A will be treated equally here.
=NOT(LEN(A1)-(LEN(A1)-LEN(SUBSTITUTE(UPPER(A1),D1,"")))-(LEN(A1)-LEN(SUBSTITUTE(UPPER(A1),D2,"")))-(LEN(A1)-LEN(SUBSTITUTE(UPPER(A1),D3,""))))
The equation works fine although I don't know if it is an optimal solution for you, but posting as answer in case it is for somebody else. The issue with this approach is the equation gets longer and longer for each character you add to your dictionary. Depending on the size of dictionary and strings to test against, this can get sloppy and calc heavy really quick.
Have you considered a UDF in VBA?

Attempting to extract one of two possible words from a text string if matches a criteria, or input the string in it's entirety

I'm trying to bring back one of two possible words (Local or National) from a text string, and if neither of these words are in the text string, then bring back the string in the whole cell
The issue I have is that I can bring back either word when they appear, but I get an error when they do not
I'm currently using
=IFERROR(IF(SEARCH("*local*",B2,1),"Local"),IF(SEARCH("*national*",B2,1),"National"))
However this obviously this doesn't bring back if the words not exist
I'm sure it's easy and I'm missing something, but I just cannot figure it out. any help would be great
Cheers all
You can use:
Formula in B1:
=IF(ISNUMBER(SEARCH("*local*",A1)),"Local",IF(ISNUMBER(SEARCH("*national*",A1)),"national",A1))
Drag down
Note:
Notice that your wildcards make it that even a string with 'international' in it will return 'national'. If this is not what you want, you should remove the wildcards.
You can also use INDEX/AGGREGATE:
=IFERROR(INDEX({"local","national"},AGGREGATE(15,7,ROW($1:$2)/(ISNUMBER(SEARCH({"local";"national"},A1))),1)),A1)
This will allow one to replace the both hard coded arrays with a range of cells that contain the outputs. If Local and National were in D1:D2 then you can use:
=IFERROR(INDEX($D:$D,AGGREGATE(15,7,ROW($D$1:$D$2)/(ISNUMBER(SEARCH($D$1:$D$2,A1))),1)),A1)
That way if the list gets bigger the formula does not.
I'd suggest regular expressions.
=IF(REGEXMATCH(A2,"(Local|National)"),REGEXEXTRACT(A2,"(Local|National)"),A2)

Finding the right range from excel table

What is the best way to find the right column for the travelled miles using visual basic coding or some excel function and return the price from that column? HLOOKUP can't be used here because the lookup value isn't exact and the ranges in the table are also not with specific intervals (If they were, I could use e.g. FLOOR(travelled miles/100)*100 and find the price with HLOOKUP). Obviously, it's easy to find the price manually with a small table but with a big table computer will be faster.
Note that, if x is between a and b, then MEDIAN(x,a,b)=x. Combine this with some nested IFs:
=IF(MEDIAN(B5,B1,C1-1)=B5,B2,IF(MEDIAN(B5,C1,D1-1)=B5,C2,IF(MEDIAN(B5,D1,E1-1)=B5,D2)))
I'm on my phone, so just done the first three cases, but hopefully you can see how it continues.
(should note you need to remove the dashes for this to work)
Edit:
I also want to answer your question in the comments above. You can use the following to keep the dash, but get a number to work with.
Assume cell A1 has got the value 10-. We can use the FIND function to work out where the - occurs and then use the LEFT function to only return the characters from before the dash:
=LEFT(A1,FIND("-",A1)-1)
This will return the value 10, but it will return it as a string, not a number - basically Excel will think it is text. To force Excel to consider it as a number, we can simply multiply the value by one. Our formula above therefore becomes:
=(LEFT(A1,FIND("-",A1)-1))*1
You may also see people use a double minus sign, like this:
=--LEFT(A1,FIND("-",A1)-1)
I don't recommend this because it's a bit complex, but combining with the formula above would give:
=IF(MEDIAN(B5,--LEFT(B1,FIND("-",B1)-1),--LEFT(C1,FIND("-",C1)-1)-1)=B5,B2,IF(MEDIAN(B5,--LEFT(C1,FIND("-",C1)-1,--LEFT(D1,FIND("-",D1)-1-1)=B5,C2,IF(MEDIAN(B5,--LEFT(D1,FIND("-",D1)-1,--LEFT(E1,FIND("-",E1)-1-1)=B5,D2)))

How to modify numbers at the end of a cell using a formula

I have cells in excel containing data of the form v-1-2-1, v-1-2-10, v-1-2-100. I want to convert it to v-1-2-001, v-1-2-010,v-1-2-100. I have nearly 2000 entries
If all of the data follows the format shown then you could use FIND to return the position of '-'. There will be three instances of this character and you need to find the third one so use the position given by the first instance as the start position parameter of the second FIND and again for the third (essentially nesting FIND). Once you have the position of the third '-' you know where the final set of numbers are (from the returned third position+1 to the LEN of the string) and could use SUBSTITUTE or a combination of other excel string functions to configure the final portion as you need it.
I'm assuming that excel has your data formatted as text.
If you need further assistance I'm happy to knock up the formula in excel but I'm off to work now and won't be able to do so for around 9 hours.
Please try:
=LEFT(A1,6)&TEXT(MID(A1,7,10),"000")

how to find the characters between 2 strings in excel

I have an xml file imported into excel with the tags. How do i retrieve the value of the string between 2 strings.
Eg. "<"product_offer_group_id">"686819743"<"/product_offer_group_id">"
How do i retrieve 686819743 from this. To note the string length is varying and ranges from 1 to 20 digits.
you need to procced in excel? Not sure about possibility of usage of regular expressions(which are a pretty good solution for that case) in Excel standard functions, but with VBA You can for sure.
look here:
http://lispy.wordpress.com/2008/10/17/using-regex-functions-in-excel/
Alternativelly you can also try to play with standard Excel Text functions, like find, left, right etc.
If you want a solution without using VB script and only Excel functions, assuming your value is in cell A1, the following use of MID, FIND, and CHAR functions would work:
=MID(A1,FIND(CHAR(34)&">"&CHAR(34),A1,1)+3,FIND(CHAR(34)&"<"&CHAR(34),A1,FIND(CHAR(34)&">"&CHAR(34),A1,1)+1)-FIND(CHAR(34)&">"&CHAR(34),A1,1)-3)
The above searches for the first occurrence of the tag ">", and takes whatever is between that tag and the next occurring "<" tag.
The magic number 3 in the function is the length of these two searched tags and used to cut down on calling an additional LEN(CHAR(34)&">"&CHAR(34)) function.

Resources