Separate second number from text using Excel formula - excel-formula

I have a project number in cell I132. The values look like this (just examples of what they can be):
654321 - 9000 Workshop
654321 - 2100 Subcontractor
654321 - 3500 Unrealistic
654321 - 6400 Flawless victory
I132 holds only one such value at a time, for example 654321 - 9000 Workshop. How can I extract the second number, the one after the dash (9000), using an Excel formula?
I have tried with no success:
=IF(ISERROR(FIND(" ";I132;FIND(" ";I132;1)+1));I132;LEFT(I132;FIND(" ";I132;FIND(" ";I132;1)+1)))

If all your data has this same format as your posted examples (6 digit number followed by space dash space), then use:
=MID(A1,10,4)
EDIT:
If the first number is not always 6 characters long, use:
=MID(A1,FIND(" - ",A1)+3,4)

To make it very generic, we can use the following version:
=LEFT(MID(A1,FIND("-",A1)+2,1000),FIND(" ",MID(A1,FIND("-",A1)+2,1000))-1)
This way it works even if the first and second numbers have more than 6 and 4 digits. It is based on the assumption that there is a dash between the numbers and a space after the second number.
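To check the index arithmetic of the generic formula outside Excel, here is a minimal Python sketch of the same steps. The function name second_number is hypothetical, and returning the rest of the string when no trailing space exists is an assumption (Excel would return #VALUE! in that case).

```python
def second_number(text):
    """Mirror LEFT(MID(A1, FIND("-", A1) + 2, 1000), FIND(" ", ...) - 1)."""
    dash = text.find("-")          # 0-based position of the first dash
    if dash == -1:
        return None                # no dash at all (Excel: #VALUE!)
    rest = text[dash + 2:]         # skip the dash and the space after it
    space = rest.find(" ")         # first space after the second number
    if space == -1:
        return rest                # assumption: nothing follows the number
    return rest[:space]


print(second_number("654321 - 9000 Workshop"))
```

The result is text, just as in Excel; wrap it in int() (or VALUE() in Excel) if a real number is needed.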

If the format is always the same, with 4-digit numbers, the formula should be:
=MID(I132, FIND("-", I132) + 2, 4)

Suppose your data has the same structure across the board, which is
random number + (space)-(space) + random number + (space) + word
You can use the following formula to find the second number:
=FILTERXML("<t><s>" & SUBSTITUTE(A1," ","</s><s>")&"</s></t>","//s[text()='-']/following-sibling::*[1]")
For the logic behind this formula you may give a read to this article: Extract Words with FILTERXML.
Cheers :)
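The idea behind the FILTERXML formula (split on spaces, find the "-" token, take the token right after it) can be sketched in Python as follows. This only illustrates the logic and is not how Excel evaluates XPath; the function name token_after_dash is hypothetical.

```python
def token_after_dash(text):
    """Return the space-separated token that directly follows a lone "-"."""
    tokens = text.split(" ")
    # Stop one short of the end so tokens[i + 1] always exists.
    for i, tok in enumerate(tokens[:-1]):
        if tok == "-":
            return tokens[i + 1]
    return None  # no standalone dash followed by another token


print(token_after_dash("654321 - 2100 Subcontractor"))
```

Unlike the fixed-width MID versions, this does not depend on how many digits either number has, only on the space-dash-space layout.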

Related

Excel Formula Extract any number greater than x characters from a string

I have a file which contains a list of data. Each cell holds a name, a number and a date. The date is either mm/yy, mm-yy, mm-yyyy, etc. (never the day, just month and year).
The number I need is always longer than 5 characters. Is there a way to get just that number from the string?
xx company holding - 96923432 -02-22. (number required 96923432)
yy Company (HOLDINGS) LTD - 131002204 - 02/2023 (number required 131002204)
ab HOLDINGS LIMITED / 115472907 / Feb-23 (number required 115472907)
... prior removed
=========UPDATE=========
This formula will work for you: it splits your data by space, converts each piece to a number, and then extracts the max. If some cells may not contain a number longer than 5 characters, adjust as needed by wrapping it in an IF().
=MAX(IFERROR(NUMBERVALUE(TEXTSPLIT(A2," ")),0))
This is interesting since you use two different delimiters. No worries, though: the following captures both cases. If you have more possible delimiters, just add them between the {} in both the TEXTBEFORE and TEXTAFTER functions:
=TEXTBEFORE(TEXTAFTER(A2, {"-","/"}), {"-","/"})
If you want to return nothing when the result is not longer than 5 characters, this should work:
=IF(LEN(TEXTBEFORE(TEXTAFTER(A1,{"-","/"}),{"-","/"}))>5,TEXTBEFORE(TEXTAFTER(A1,{"-","/"}),{"-","/"}),"")
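The split-and-take-the-max approach from the update can be checked against the sample strings with a small Python sketch. It assumes that only tokens made purely of digits count as numbers, which is stricter than NUMBERVALUE (Excel may coerce some other tokens); the function name largest_number is hypothetical.

```python
def largest_number(text):
    """Split on spaces, keep all-digit tokens, return the largest (0 if none)."""
    numbers = [
        int(tok)
        for tok in text.split(" ")
        if tok.isascii() and tok.isdigit()
    ]
    return max(numbers, default=0)


samples = [
    "xx company holding - 96923432 -02-22.",
    "yy Company (HOLDINGS) LTD - 131002204 - 02/2023",
    "ab HOLDINGS LIMITED / 115472907 / Feb-23",
]
for s in samples:
    print(largest_number(s))
```

Because it does not care which delimiter surrounds the number, this variant handles both the "-" and "/" layouts without listing them.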

Is there an excel formula to extract numbers from the end of a string in a cell, where the length is not always constant

I am trying to separate information copied from a PDF table. I'd usually use Text to Columns, but the only delimiter is the space, and that splits the data into multiple unusable columns.
The data comes like this:
Raw Data
A1 Company 0
Company2 40000
name a 1
name b 15
name c 184
Big 17 Company 1887
I need the output to be:
Company | Units
A1 Company | 0
Company2 | 40000
name a | 1
name b | 15
name c | 184
Big 17 Company | 1887
So the company name (which might contain numbers) is separated from the unit number (which can be 1-5 digits long).
I haven't been able to figure out an approach based on =LEN(), because neither the string length nor the number of trailing digits is constant.
I'm currently using:
=SUMPRODUCT(MID(0&A2, LARGE(INDEX(ISNUMBER(--MID(A2, ROW(INDIRECT("1:"&LEN(A2))), 1)) * ROW(INDIRECT("1:"&LEN(A2))), 0), ROW(INDIRECT("1:"&LEN(A2))))+1, 1) * 10^ROW(INDIRECT("1:"&LEN(A2)))/10)
This gives me all the numbers in the cell, which works for 90% of the data because most of the companies don't have numbers in their names. But for something like 'A1 Company 0' it gives 10 as the output, not just the 0. I then go and manually edit the small number of companies that this happens to.
I then use a mixture of =LEN() =LEFT and =RIGHT to split the information up as required for the further automated analysis.
I'd prefer a formula over VBA/macro
I can't provide the actual data, but I hope I've given enough examples in the table above to show the main problems (different company name lengths, companies with numbers in their names, different numbers of digits representing the units).
I'm using LibreOffice, but this formula returns everything after the last space in the cell:
=RIGHT(A1,LEN(A1)-FIND("#",SUBSTITUTE(A1," ","#",LEN(A1)-LEN(SUBSTITUTE(A1," ",""))),1))
Taken from: https://trumpexcel.com/find-characters-last-position/
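Splitting at the last space is the whole trick here. A minimal Python sketch of that split, useful for checking the expected output on the sample rows (the function name split_name_units is hypothetical, and returning an empty units part when there is no space is an assumption):

```python
def split_name_units(text):
    """Split a row at its last space into (company name, units)."""
    name, sep, units = text.rpartition(" ")
    if not sep:
        return text, ""   # assumption: no space means no units part
    return name, units


rows = ["A1 Company 0", "Company2 40000", "name a 1", "Big 17 Company 1887"]
for row in rows:
    print(split_name_units(row))
```

Digits inside the company name are left alone because only the position of the final space matters.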
FILTERXML() would be the best choice for this case. Try:
=FILTERXML("<t><s>"&SUBSTITUTE(A1:A6," ","</s><s>")&"</s></t>","//s[last()]")
Details about FILTERXML() from JvdV here.
See if the following works for you:
Formula in B2:
=LEFT(A2,LEN(A2)-1-LEN(C2))
In C2:
=-LOOKUP(1,-RIGHT(A2,ROW($1:$5)))
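The LOOKUP formula tries the last 1 to 5 characters in turn and keeps the longest tail that still converts to a number. A rough Python sketch of that loop follows; it approximates Excel's text-to-number coercion by stripping spaces and accepting digits only, and the function name trailing_units is hypothetical.

```python
def trailing_units(text, max_digits=5):
    """Try the last 1..max_digits characters; keep the last numeric tail."""
    result = None
    for n in range(1, max_digits + 1):
        tail = text[-n:].strip()   # like RIGHT(A2, n), leading space ignored
        if tail.isascii() and tail.isdigit():
            result = int(tail)     # later (longer) numeric tails win
    return result


for row in ["A1 Company 0", "Company2 40000", "name c 184"]:
    print(trailing_units(row))
```

For 'A1 Company 0' the tail "y 0" is not numeric, so the loop never reaches the 1 in A1, which is why this avoids the 10 produced by the SUMPRODUCT formula.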
For those users using ms365's newest functions:
=HSTACK(TEXTBEFORE(A2," ",-1),TEXTAFTER(A2," ",-1))

Extract a numeral with a specific number of digits from a string

Question relates to Excel (Office365):
I am seeking a solution that will extract a number with a length of 4 digits from a string.
A couple of examples of the type of strings I am referring to are:
"16016KT 9999 SCT030"
"PROB30 0500 FG BKN001"
"MOD TURB BLW 5000FT TILL302300"
"INTER 6000 SHRA SCT015"
In each of the above strings there are a combination of letters and numbers of varying lengths and no set pattern.
The characters I am interested in are the standalone 4-digit numbers (shown in bold in the original post), not the 5000 in 5000FT.
The sequence of 4 digits is unique to all the strings I will be evaluating.
Thanks!
You may use:
=IFERROR(TEXT(FILTERXML("<t><s>"&SUBSTITUTE(A1," ","</s><s>")&"</s></t>","//s[.*0=0][string-length()=4]"),"0000"),"Non found")
On more recent versions of Excel, you may try:
=RegexpFind(A1, "\b[0-9]{4}\b", 0)
See here for how to activate regex support in Excel.
another solution:
=IFERROR(TEXT(UNIQUE(SEQUENCE(9999)/(FIND(" " & TEXT(SEQUENCE(9999),"0000") &" ",A2)>0),,1),"0000"),"")
Another option
In B1, formula copied down :
=IFERROR(TEXT(0+MID(A1,SEARCH(" ???? ",A1)+1,4),"0000"),"not found")
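The rule shared by these answers (a space-delimited token that is exactly four digits) can be sketched in Python to check it against the sample strings. The function name four_digit_token is hypothetical, and returning None stands in for the "not found" text the formulas produce.

```python
def four_digit_token(text):
    """Return the first space-separated token made of exactly four digits."""
    for tok in text.split(" "):
        if len(tok) == 4 and tok.isascii() and tok.isdigit():
            return tok             # kept as text so a leading zero survives
    return None


samples = [
    "16016KT 9999 SCT030",
    "PROB30 0500 FG BKN001",
    "MOD TURB BLW 5000FT TILL302300",
    "INTER 6000 SHRA SCT015",
]
for s in samples:
    print(four_digit_token(s))
```

Keeping the result as text mirrors the TEXT(...,"0000") wrapper in the formulas, which exists so that 0500 is not shown as 500.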

Excel Formula to extract previous word (towards left) from a specific position

I have multiple records as below in an excel file say Col A:
Infogain India (P) Ltd. 3-6 yrs Noida
ROBOSPECIES TECHNOLOGIES PVT LTD 0-2 yrs New Delhi
Red Lemon 0-3 yrs Noida(Sector-7 Noida)
Within the data there is a range of years mentioned e.g. 3-6 yrs in the first list item.
I want to extract the data 3-6, 0-2, 0-3 etc from above 3 list items. I understand a search for " yrs " in all the strings will give me the end position. However, I am unable to determine how to find the starting position of the Number of years.
I require the excel formula which will give me the year range.
I do not want to use any VBA for the solution.
If there are no spaces between numbers then you can use following formula.
=TRIM(RIGHT(SUBSTITUTE(TRIM(LEFT(SUBSTITUTE(A3," yrs",REPT(" ",99)),99))," ",REPT(" ",99)),99))
Try,
=TRIM(RIGHT(REPLACE(A1, FIND(" yrs", A1), LEN(A1), TEXT(,)), 4))
Try the following, though I'm pretty sure it can be condensed. I have attempted to handle additional white space potentially being present and also the years being multi-digit in length, e.g. 12-15. It incorporates a method by Raystafarian to find the last occurrence of a character.
=RIGHT(TRIM(LEFT(TRIM(SUBSTITUTE(A1,CHAR(32)," ")),FIND("yrs",TRIM(SUBSTITUTE(A1,CHAR(32)," ")),1)-1)),LEN(TRIM(LEFT(TRIM(SUBSTITUTE(A1,CHAR(32)," ")),FIND("yrs",TRIM(SUBSTITUTE(A1,CHAR(32)," ")),1)-1)))-LOOKUP(9.9999999999E+307,FIND(" ",TRIM(LEFT(TRIM(SUBSTITUTE(A1,CHAR(32)," ")),FIND("yrs",TRIM(SUBSTITUTE(A1,CHAR(32)," ")),1)-1)),ROW($1:$1024))))
Try with below formula
=TRIM(RIGHT(SUBSTITUTE(LEFT(A1,SEARCH("yrs",A1)-1)," ","|",LEN(LEFT(A1,SEARCH("yrs",A1)-1))-LEN(SUBSTITUTE(LEFT(A1,SEARCH("yrs",A1)-1)," ",""))-1),LEN(SUBSTITUTE(LEFT(A1,SEARCH("yrs",A1)-1)," ","|",LEN(LEFT(A1,SEARCH("yrs",A1)-1))-LEN(SUBSTITUTE(LEFT(A1,SEARCH("yrs",A1)-1)," ",""))-1))-SEARCH("|",SUBSTITUTE(LEFT(A1,SEARCH("yrs",A1)-1)," ","|",LEN(LEFT(A1,SEARCH("yrs",A1)-1))-LEN(SUBSTITUTE(LEFT(A1,SEARCH("yrs",A1)-1)," ",""))-1))))
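All of these formulas do the same two things: cut the string just before "yrs", then take the last word of what remains. A minimal Python sketch of that logic, assuming "yrs" first appears where the experience range ends (the function name year_range is hypothetical):

```python
def year_range(text):
    """Return the word immediately to the left of the first "yrs"."""
    end = text.find("yrs")
    if end == -1:
        return None                # no "yrs" marker in the string
    words = text[:end].split()     # split() also absorbs extra spaces
    return words[-1] if words else None


records = [
    "Infogain India (P) Ltd. 3-6 yrs Noida",
    "ROBOSPECIES TECHNOLOGIES PVT LTD 0-2 yrs New Delhi",
    "Red Lemon 0-3 yrs Noida(Sector-7 Noida)",
]
for r in records:
    print(year_range(r))
```

Because the last word is taken whole, multi-digit ranges such as 12-15 need no special handling.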

How to build complex value from three variables?

I have an Excel spreadsheet with over 2000 entries:
Field B1: CustomerID as 000012345
Field B2: CustomerID as 0000432
Field C1: CustomerCountry as DE
Field C2: CustomerCountry as IT
I need to build codes 13 characters long consisting of "CustomerCountry" + "CustomerID" without leading zeros + a random number (it can be 6 digits, more or less, depending on the length of the CustomerID).
The results should be like this: D1 Code as DE12345967895 or D2 Code as IT43274837401
How to do it with Excel functions?
UPDATED:
I tried this one. My big problem is making the random number long enough to reach 13 characters in total. Sometimes the CustomerID is just 3 or 4 digits long, and the concatenation of the three variables can be just 9 or 10 characters. But the codes always have to be 13 characters long.
Use & to concatenate strings.
Use VALUE(CustomerID) to trim the leading zeroes from the ID
Use RAND() to add a random number between 0 and 1 or RANDBETWEEN(x,y) to create one between x and y.
Combine the above and there you are!
If you always want 13 digits you can use LEFT(INT(RAND()*10^13);(13-LEN(CustomerCountry)-LEN(VALUE(CustomerID)))) for the random number to ALWAYS be the right length.
total formula
= CustomerCountry
& VALUE(CustomerID)
& LEFT(INT(RAND()*10^13);(13-LEN(CustomerCountry)-LEN(VALUE(CustomerID))))
=C1 & TEXT(B1,"0") & RIGHT(TEXT(RANDBETWEEN(0,99999999999),"00000000000"),11 - LEN(TEXT(B1,"0")))
that should do it
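The length bookkeeping is the part that is easy to get wrong, so here is a hedged Python sketch of it: the random part gets exactly the number of digits left over after the country and the trimmed ID. The function name build_code and its parameters are hypothetical, and allowing leading zeros in the random part is an assumption.

```python
import random


def build_code(country, customer_id, total_len=13, rng=None):
    """Country + ID without leading zeros + random digits up to total_len."""
    rng = rng or random.Random()
    id_part = str(int(customer_id))          # like VALUE(): drops leading zeros
    pad = total_len - len(country) - len(id_part)
    if pad < 0:
        raise ValueError("country and ID are already longer than total_len")
    # Draw each digit separately so the random part always has pad digits.
    digits = "".join(rng.choice("0123456789") for _ in range(pad))
    return country + id_part + digits


print(build_code("DE", "000012345"))
print(build_code("IT", "0000432"))
```

Drawing digit by digit avoids the case where a random number happens to be shorter than expected, which is what the zero-padded TEXT(...) format guards against in the formula version.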
I don’t understand what is where and OP has accepted answer so have not bothered testing:
=LEFT(RIGHT(C1,2)&VALUE(MID(B1,15,13))&RANDBETWEEN(10^9,10^10),13)
(but I might revert to this if no one else picks the flaws in it first!)
