How to find the characters between 2 strings in Excel

I have an XML file imported into Excel with its tags. How do I retrieve the value that sits between two strings?
E.g. "<"product_offer_group_id">"686819743"<"/product_offer_group_id">"
How do I retrieve 686819743 from this? Note that the length of the value varies, from 1 to 20 digits.

Do you need to do this in Excel? I'm not sure whether regular expressions (which are a pretty good solution for this case) can be used in standard Excel functions, but with VBA you certainly can.
Look here:
http://lispy.wordpress.com/2008/10/17/using-regex-functions-in-excel/
Alternatively, you can also try the standard Excel text functions, such as FIND, LEFT, RIGHT, etc.

If you want a solution without using VB script and only Excel functions, assuming your value is in cell A1, the following use of MID, FIND, and CHAR functions would work:
=MID(A1,FIND(CHAR(34)&">"&CHAR(34),A1,1)+3,FIND(CHAR(34)&"<"&CHAR(34),A1,FIND(CHAR(34)&">"&CHAR(34),A1,1)+1)-FIND(CHAR(34)&">"&CHAR(34),A1,1)-3)
The above searches for the first occurrence of the quoted delimiter ">" (a double quote, the angle bracket, and another double quote), and takes whatever is between that delimiter and the next quoted "<" delimiter.
The magic number 3 in the function is the length of each searched delimiter (quote, angle bracket, quote). It is hard-coded to avoid an additional LEN(CHAR(34)&">"&CHAR(34)) call.
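To check the position arithmetic outside Excel, here is a minimal Python sketch of the same logic. The function name between_quoted_brackets is hypothetical, and the sketch assumes the cell text contains the quoted angle brackets exactly as in the question's example.

```python
def between_quoted_brackets(cell: str) -> str:
    """Return the text between the first '">"' and the next '"<"'."""
    open_delim = '">"'   # CHAR(34) & ">" & CHAR(34)
    close_delim = '"<"'  # CHAR(34) & "<" & CHAR(34)

    start = cell.find(open_delim)
    if start == -1:
        return ""  # assumption: return an empty string when nothing is found

    # Skip the 3-character delimiter: this is the "+3" in the formula.
    value_start = start + len(open_delim)

    # Look for the closing delimiter after the opening one, as the nested FIND does.
    end = cell.find(close_delim, start + 1)
    if end == -1:
        return ""
    return cell[value_start:end]


sample = '"<"product_offer_group_id">"686819743"<"/product_offer_group_id">"'
print(between_quoted_brackets(sample))
```

Because the end position is searched for rather than assumed, the value can have any length, which covers the 1 to 20 digit range in the question.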

Related

Counting number of occurrences of a specific search string

I'm building a monitoring system that takes a log (where people register their work in a set format) and returns a counter, which I can use for analysis. The monitor and log are two separate workbooks. The log has entries like this: INITIALS;DATE;HOUR:RESULT|
Each cell can contain multiple entries.
My first attempt was to do a simple COUNTIF and look for a string (note that I use ; instead of , in formulas since I work with a Dutch version of Excel):
=COUNTIF('LOCATION'!Table[LOG];"*NB;??/??/????;??:??:#A*|*")
This worked fine, but the formula only counted the number of cells where this string was present, not the actual number of occurrences. I then tried this solution.
=SUM(LEN('LOCATION'!Tabel13[LOG])-LEN(SUBSTITUTE('LOCATION'!Tabel13[LOG];"NB";"")))
This indeed counted the number of times "NB" was present in the LOG. However, when I tried to use the original search string, this solution stopped working:
=SUM(LEN('LOCATION'!Tabel13[LOG])-LEN(SUBSTITUTE('LOCATION'!Tabel13[LOG];"*NB;??/??/????;??:??:#A*|*";"")))
It seems to me that SUM does not recognize symbols like ? or * which are necessary to define the correct search string. Where did I go wrong? Or can this be solved in another way? I can still look into VBA, but the workbooks are slow as hell already.
"?" and "*" are wildcards. Some functions support them (like COUNTIFS()), while others don't. As you found out, SUBSTITUTE() does not.
Here is one way to count, assuming Microsoft 365:
Formula in C1:
=REDUCE(0,A1:A2,LAMBDA(a,b,a+LET(X,SEQUENCE(LEN(b)),SUM(--(IFERROR(SEARCH("NB;??/??/????;??:??:#A*|*",b,X),0)=X)))))
Note: I removed the asterisk in front of "NB" so that every match starts at "NB". That way the position returned by SEARCH can be compared with the start position, which I called variable "X".
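The counting idea in that formula (test every start position and count the ones where the pattern matches right there) can be sketched in Python. The helper names and the sample log cell are hypothetical. The sketch assumes only the ? and * wildcards matter, and does not handle Excel's ~ escape character.

```python
import re


def wildcard_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate ? and * wildcards into a regex; everything else is literal."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")      # exactly one character
        elif ch == "*":
            parts.append(".*")     # any run of characters, possibly empty
        else:
            parts.append(re.escape(ch))
    # SEARCH is case-insensitive, so the regex is too.
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def count_match_starts(text: str, pattern: str) -> int:
    """Count positions in text where the wildcard pattern matches at that position."""
    rx = wildcard_to_regex(pattern)
    return sum(1 for i in range(len(text)) if rx.match(text, i))


# Hypothetical log cell with three entries, two of which are "NB ... #A" entries.
cell = "NB;01/02/2023;10:15:#A ok|JK;01/02/2023;11:00:#B|NB;03/02/2023;09:30:#A done|"
print(count_match_starts(cell, "NB;??/??/????;??:??:#A*|*"))
```

Summing count_match_starts over all cells plays the role that REDUCE plays in the formula.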

Extract certain text after certain characters

What is the easiest way to extract certain details from a cell with an Excel formula? For example, if cell A1 contains column=""HMI_LOCATE"" px=""CLASS"" position=""99"" validation=""ROOM"" then I'm trying to extract just the data that falls between the doubled "" after px=. In this example, I need to extract just the letters CLASS and nothing else. The part I'm trying to extract won't always be 5 characters long; it could be much longer or shorter.
Do you want to achieve this?
With Office 365 you can use this formula
=FILTERXML("<t><s>"&SUBSTITUTE(A1,CHAR(34)&CHAR(34),"</s><s>")&"</s></t>","//s[position() mod 2 = 0]")
or for older Excel versions
=IFERROR(INDEX(FILTERXML("<t><s>"&SUBSTITUTE($A$1,CHAR(34)&CHAR(34),"</s><s>")&"</s></t>","//s"),ROW(A1)*2),"-")
This splits the string at each pair of quotation marks (CHAR(34)&CHAR(34)) and builds an array of elements. Then every second element is returned.
For tons of other possibilities have a look at this awesome guide by JvdV.
EDIT:
To get the element after px= no matter where it is, you can use
=LET(list,
FILTERXML("<t><s>"&SUBSTITUTE($A$1,CHAR(34)&CHAR(34),"</s><s>")&"</s></t>","//s"),
INDEX(list,MATCH("px=",list,0)+1)
)
The LET function lets you assign names to intermediate results (here, list), which can then be reused in further calculations.
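The split-and-pick logic behind these formulas can be sketched in Python. The function names are hypothetical. Stripping whitespace from each element is an assumption of this sketch, made so that " px=" compares equal to "px=".

```python
from typing import List, Optional


def quoted_values(cell: str) -> List[str]:
    """Split on doubled quotes and keep every second element (the values)."""
    parts = cell.split('""')
    return parts[1::2]


def value_after(cell: str, key: str) -> Optional[str]:
    """Return the element that follows key, wherever key appears."""
    # Assumption: surrounding whitespace is not significant, so strip it.
    parts = [p.strip() for p in cell.split('""')]
    if key in parts:
        i = parts.index(key)
        if i + 1 < len(parts):
            return parts[i + 1]
    return None


sample = 'column=""HMI_LOCATE"" px=""CLASS"" position=""99"" validation=""ROOM""'
print(quoted_values(sample))
print(value_after(sample, "px="))
```

Looking up the key first, as the LET formula does with MATCH, keeps the result correct even if px= is not the second attribute in the cell.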

How to search for items with multiple "-" in excel or VBA?

I have a list of item numbers (100K) like this:
Some of the items (thousands of them) have a format like SAG571A-244-4. These need to be filtered out so I can delete them and keep only the items that have ONE hyphen per SKU. How can I isolate the items that have two instances of "-" in their SKU? I'm open to solutions within Excel or using VBA as well.
Native text filters don't seem to be capable of this. I'm stumped.
As per John Coleman's comment, "*-*-*" can be used to isolate strings that have at least two dashes in them.
I would add that if you're entering them as a custom text filter, you should lose the double quotes (so just *-*-*) as otherwise the field seems to interpret the quotes literally.
Seems to work for me.
If you want just an Excel formula to verify this and give you the number of hyphens (0, 1, or 2+), here is one:
=IF(ISERROR(SEARCH("-",A1)),"0",IF(ISERROR(SEARCH("-",A1,IFERROR(SEARCH("-",A1)+1,LEN(A1)))),"1","2+"))
Replace A1 with your relevant column, then fill down. This is a poor way to do it performance-wise, but you avoid using VBA and possibly xlsm files.
The formula first checks whether there is one hyphen; if there is, it checks whether there is another hyphen after the position where the first one was found. Looking for multiple hyphens in this manner is cumbersome and I don't recommend it.
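A less cumbersome way to count hyphens is the length-difference trick that LEN and SUBSTITUTE provide in Excel: remove the character and compare lengths. Here is a minimal Python sketch of that idea. The function name and every SKU in the list except SAG571A-244-4 are hypothetical.

```python
def hyphen_count(sku: str) -> int:
    """Count hyphens as the length lost when they are removed."""
    return len(sku) - len(sku.replace("-", ""))


# Hypothetical sample list; only SAG571A-244-4 comes from the question.
skus = ["SAG571A-244", "SAG571A-244-4", "ABC123"]

# Keep only the items with exactly one hyphen.
keep = [s for s in skus if hyphen_count(s) == 1]
print(keep)
```

This yields an exact count rather than the 0, 1, or 2+ buckets, so the same helper can also isolate the rows with two or more hyphens.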

How to modify numbers at the end of a cell using a formula

I have cells in Excel containing data of the form v-1-2-1, v-1-2-10, v-1-2-100. I want to convert it to v-1-2-001, v-1-2-010, v-1-2-100. I have nearly 2000 entries.
If all of the data follows the format shown, you could use FIND to return the position of '-'. There will be three instances of this character and you need the third one, so use the position of the first instance (plus 1) as the start position of the second FIND, and do the same again for the third (essentially nesting FIND). Once you have the position of the third '-', you know where the final set of digits is (from that position + 1 to the LEN of the string), and you could use SUBSTITUTE or a combination of other Excel string functions to format the final portion as you need it.
I'm assuming that Excel has your data formatted as text.
If you need further assistance I'm happy to knock up the formula in Excel, but I'm off to work now and won't be able to do so for around 9 hours.
Please try:
=LEFT(A1,6)&TEXT(MID(A1,7,10),"000")
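The formula above relies on the prefix being exactly 6 characters long, which holds for the sample data. As a sketch of the more general approach (find the last hyphen, then pad whatever follows it), here is a Python version. The function name and the default width of 3 are assumptions of this sketch.

```python
def pad_last_segment(code: str, width: int = 3) -> str:
    """Zero-pad the digits after the final hyphen to the given width."""
    head, sep, tail = code.rpartition("-")
    if not sep or not tail.isdigit():
        return code  # assumption: leave anything unexpected unchanged
    return head + sep + tail.zfill(width)


for value in ["v-1-2-1", "v-1-2-10", "v-1-2-100"]:
    print(pad_last_segment(value))
```

Splitting at the last hyphen means the prefix length no longer matters, so a value such as v-10-2-1 would also be handled.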

How do I get the last character of a string using an Excel function?

No need to apologize for asking a question! Try using the RIGHT function. It returns the last n characters of a string.
=RIGHT(A1, 1)
=RIGHT(A1)
is quite sufficient (where the string is contained in A1).
Similar in nature to LEFT, Excel's RIGHT function extracts a substring from a string starting from the right-most character:
SYNTAX
RIGHT( text, [number_of_characters] )
Parameters or Arguments
text: The string that you wish to extract from.
number_of_characters: Optional. It indicates the number of characters that you wish to extract starting from the right-most character. If this parameter is omitted, only 1 character is returned.
Applies To
Excel 2016, Excel 2013, Excel 2011 for Mac, Excel 2010, Excel 2007, Excel 2003, Excel XP, Excel 2000
Since number_of_characters is optional and defaults to 1 it is not required in this case.
However, trailing spaces are a common problem. If there is a risk of them and you want the last visible character, then in general:
=RIGHT(TRIM(A1))
might be preferred.
It looks like the answer above was a little incomplete. Try the following:
=RIGHT(A2,(LEN(A2)-(LEN(A2)-1)))
Obviously, this is for cell A2...
This uses a combination of RIGHT and LEN. LEN is the length of a string, and here LEN(A2)-(LEN(A2)-1) always evaluates to 1, so the formula is equivalent to RIGHT(A2,1). If you wanted the last two characters you'd change the -1 to -2, and so on.
Once the number of characters required has been determined, RIGHT returns that many characters from the end of the string.
This works well combined with an IF statement. I use it to find out whether the last character of a text string is a specific character, and to remove it if it is. See the example below, which strips a comma from the end of a text string:
=IF(RIGHT(A2,(LEN(A2)-(LEN(A2)-1)))=",",LEFT(A2,(LEN(A2)-1)),A2)
Just another way to do this:
=MID(A1, LEN(A1), 1)
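For comparison, the same three operations (last character, last visible character, and removing a trailing comma) can be sketched in Python. The function names are hypothetical, and only space characters are stripped when looking for the last visible character.

```python
def last_char(text: str) -> str:
    """Last character of text, or an empty string if text is empty."""
    return text[-1:]


def last_visible_char(text: str) -> str:
    """Last character after removing trailing spaces."""
    return text.rstrip(" ")[-1:]


def strip_trailing_comma(text: str) -> str:
    """Remove one trailing comma if present; otherwise return text unchanged."""
    return text[:-1] if text.endswith(",") else text


print(last_char("Excel"))
print(last_visible_char("Excel  "))
print(strip_trailing_comma("a,b,c,"))
```

Slicing with [-1:] rather than indexing with [-1] avoids an error on empty input.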

Resources