How do I find the last number in a string with excel formulas - excel-formula

I'm parsing strings in excel, and I need to return everything through the last number. For example:
Input: A00XX
Output: A00
In my case, I know the last number will be between index 3 and 5, so I'm brute-forcing it with:
=LEFT([#Point],
IF(SUM((MID([#Point],5,1)={"0","1","2","3","4","5","6","7","8","9"})+0),5,
IF(SUM((MID([#Point],4,1)={"0","1","2","3","4","5","6","7","8","9"})+0),4,
IF(SUM((MID([#Point],3,1)={"0","1","2","3","4","5","6","7","8","9"})+0),3,
))))
Unfortunately, I've run into some edge cases where the numbers extended beyond index 5. Is there a generic way to find the last number in a string using excel formulas?
Note:
I've tried =MAX(SEARCH(... but it returns the index of the first number, not the last.

As a starting point: if we know the position of the last number, we can use LEFT to get the string to that point. Suppose that the position is 5:
=LEFT(A1, 5)
But, we don't know the position of the last number. Now, what if the only valid number was 0, and it only appeared once: then we could use FIND to locate the position of the number:
=LEFT(A1, FIND(0, A1))
But, we have more than one valid number. Suppose that we had all the numbers from 0 through 9, but each number could only appear once — then we could use MAX on a FIND array, to tell us which of the numbers is the last one:
=LEFT(A1, MAX(FIND({0,1,2,3,4,5,6,7,8,9}, A1)))
Unfortunately, FIND will throw a #VALUE! error any number doesn't appear, which will then make MAX return the same error. So, we need to fix that with IFERROR:
=LEFT(A1, MAX(IFERROR(FIND({0,1,2,3,4,5,6,7,8,9}, A1), 0)))
However, numbers can appear more than once. As such, we need a method to find the last occurrence of a value in a string (since FIND and SEARCH will, by default, return the first occurrence).
The SUBSTITUTE function has 3 mandatory arguments — Initial String, Value to be Replaced, Value to Replace with — and one Optional argument — the occurrence to replace. Normally, this is omitted, so that all occurrences are replaced. But, if we know how many times a character appears in a string, then we can replace just the last instance with a special/uncommon sub-string to search for.
To count how many times a character appears in a String, just start with the length of the String, then subtract the length when you SUBSTITUTE all copies of that character for Nothing:
=LEN(A1) - LEN(SUBSTITUTE(A1, 0, ""))
This means we can now replace the last occurrence of the character with, for example, ">¦<", and then FIND that:
=FIND(">¦<", SUBSTITUTE(A1, 0, ">¦<", LEN(A1) - LEN(SUBSTITUTE(A1, 0, ""))))
Of course, we want to do this for all the numbers from 0 to 9, and take the MAX value (remembering our IFERROR), so we need to put the Array of values back in:
=MAX(IFERROR(FIND(">¦<", SUBSTITUTE(A1, {0,1,2,3,4,5,6,7,8,9}, ">¦<", LEN(A1) - LEN(SUBSTITUTE(A1, {0,1,2,3,4,5,6,7,8,9}, "")))), 0))
Then, we plug that all back into our initial LEFT function:
=LEFT(A1, MAX(IFERROR(FIND(">¦<", SUBSTITUTE(A1, {0,1,2,3,4,5,6,7,8,9}, ">¦<", LEN(A1) - LEN(SUBSTITUTE(A1, {0,1,2,3,4,5,6,7,8,9}, "")))), 0)))

An alternative, assuming that the length of the string in question will never be more than 9 characters (which seems a safe assumption based on your description):
=LEFT(A1,MATCH(0,0+ISERR(0+MID(A1,{1;2;3;4;5;6;7;8;9},1))))
This, depending on your version of Excel, may or may not require committing with CTRL+SHIFT+ENTER.
Note also that the separator within the array constant {1;2;3;4;5;6;7;8;9} is the semicolon, which, for English-language versions of Excel, represents the row-separator. This may require amending if you are using a non-English-language version.
Of course, we can replace this static constant with a dynamic construction. However, since we are already making the assumption that 9 is an upper limit on the number of characters for the string in question, this would not seem to be necessary.

If you have the newest version of Excel, you can try something like:
=LEFT(D1,
LET(x, SEQUENCE(LEN(D1)),
MAX(IF(ISNUMBER(NUMBERVALUE(MID(D1, SEQUENCE(LEN(D1)), 1))), x))))
For example:

Related

How to generate a random alphanumeric string with a formula in Excel (or Google Sheets or LibreOffice)

I'm trying to generate a random 8 character alphanumeric string in Excel (or Google Sheets or Libreoffice, which both have the same challenge) using a formula. I'd like to get something like this:
6n1a3pax
I've tried various formulae including ones like this which generate the ASCII characters for individual random numbers between an upper and lower number:
=CHAR(RANDBETWEEN(65,90)) & CHAR(RANDBETWEEN(65,90)) & CHAR(RANDBETWEEN(65,90)) &CHAR(RANDBETWEEN(65,90))& CHAR(RANDBETWEEN(65,90)) & CHAR(RANDBETWEEN(65,90)) & CHAR(RANDBETWEEN(65,90)) & CHAR(RANDBETWEEN(65,90))
However, they're lengthy, you have to repeat the RANDBETWEEN() function multiple times inside a formula, and you can't choose both "alpha" and "numeric" in the same RANDBETWEEN().
Is there any easy way to do this in Excel, Google Sheets or LibreOffice Calc? If a solution works in one and not in the others then great if you can mention which one(s).
(N.B. This is not a duplicate of questions about how to stop recalculation of randomisation functions in Excel)
in GS try:
=LAMBDA(x, x)(DEC2HEX(RANDBETWEEN(0, HEX2DEC("FFFFFFFF")), 8))
if that's not enough and you need
A-Z char 65-90
a-z char 97-122
0-9 char 48-58
=JOIN(, BYROW(SEQUENCE(8), LAMBDA(x, IF(COINFLIP(), IF(COINFLIP(),
CHAR(RANDBETWEEN(65, 90)), CHAR(RANDBETWEEN(97, 122))), RANDBETWEEN(0, 9)))))
frozen:
=LAMBDA(x, x)(JOIN(, BYROW(SEQUENCE(8), LAMBDA(x, IF(COINFLIP(), IF(COINFLIP(),
CHAR(RANDBETWEEN(65, 90)), CHAR(RANDBETWEEN(97, 122))), RANDBETWEEN(0, 9))))))
alternative (with better distribution):
=JOIN(, BYROW(SEQUENCE(8), LAMBDA(x, SINGLE(SORT(CHAR({
SEQUENCE(10, 1, 48);
SEQUENCE(26, 1, 65);
SEQUENCE(26, 1, 97)}),
RANDARRAY(62, 1), )))))
or frozen:
=LAMBDA(x, x)(JOIN(, BYROW(SEQUENCE(8), LAMBDA(x, SINGLE(SORT(CHAR({
SEQUENCE(10, 1, 48);
SEQUENCE(26, 1, 65);
SEQUENCE(26, 1, 97)}),
RANDARRAY(62, 1), ))))))
for more see: stackoverflow.com/questions/66201364
LibreOffice Calc 7.x:
A non-volatile option for LibreOffice Calc 7.x is the use of the RANDBETWEEN.NV() function:
Formula in A1:
=CONCAT(IF({1,2,3,4,5,6,7,8},MID("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789",RANDBETWEEN.NV(1,62),1),))
Note that using ROW(1:8) would still force recalculation when any value in rows 1-8 have been made (thus volatile):
=CONCAT(IF(ROW(1:8),MID("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789",RANDBETWEEN.NV(1,62),1),))
Excel ms365:
Unfortunately there is, AFAIK, not a non-volatile Excel equivalent to this function. If volatility is not a problem, then try:
=CONCAT(MID("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789",RANDARRAY(8,,1,62,1),1))
Here's my take, for Google Sheets:
=lambda(_,
_
)(
lambda(
numWords, wordLength, charRegex, ascii,
lambda(
alphabet,
map(
sequence(numWords),
lambda(_,
concatenate(
map(
sequence(wordLength),
lambda(_,
mid(alphabet, randbetween(1, len(alphabet)), 1)
)
)
)
)
)
)(concatenate(filter(ascii, regexmatch(ascii, charRegex))))
)(10, 8, "[0-9a-zA-Z]", arrayformula(char(sequence(127))))
)
The formula will generate 10 passwords of 8 characters each from an alphabet that includes lower and upper case letters, and digits.
To choose which characters to include in the alphabet, replace [0-9a-zA-Z] with another regex like [0-9a-z!#$%&/] or [-!#$%&/\w]. Note that you may need to \escape any regex special characters there.
The pattern avoids the non-uniform distribution issues that plague some of the solutions presented in this thread. The ones that use coinflip() or isodd(rand()*N) will give results that overrepresent smaller sub alphabets like 0-9. The ones that use sort() will not repeat any chars in the result, which is not optimal.
It's possible to do this in Excel using a combination of the following functions:
SEQUENCE() VSTACK() RANDARRAY() CHAR() INDEX() TEXTJOIN()
Unfortunately this doesn't work in LibreOffice (at the moment) as it does not have the SEQUENCE() function. It does not work in Google Sheets as the RANDARRAY() function only takes 2 parameters and the VSTACK() function does not exist, although you can use braces and a semicolon, e.g. {SEQUENCE(26,1,97,1);SEQUENCE(10,1,48,1)}.
Here's the formula you need:
Upper-case e.g "413BK5S0": =TEXTJOIN("",1,INDEX(CHAR(VSTACK(SEQUENCE(26,1,65,1),SEQUENCE(10,1,48,1))),RANDARRAY(8,1,1,36,TRUE)))
Lower-case e.g. "b8etbno8": =TEXTJOIN("",1,INDEX(CHAR(VSTACK(SEQUENCE(26,1,97,1),SEQUENCE(10,1,48,1))),RANDARRAY(8,1,1,36,TRUE)))
The following explanation for each function:
SEQUENCE() - a sequence of e.g. 26 numbers, in 1 column, starting at number 65, increasing by 1 each time (with the second incidence of the function being 10 numbers starting at 48)
VSTACK() - combine the 2 SEQUENCE() formulae into 1 array (sequence) of numbers
CHAR() - the ASCII character associated with a decimal ASCII number (where the decimal number is generated by the SEQUENCE() function) - see https://www.asciitable.com/
RANDARRAY() - an array of 8 random numbers, 1 column wide, minimum number 1, maximum 36
INDEX() - the value from each element within the sequence of characters, where each of 8 element numbers is provided by RANDARRAY()
TEXTJOIN() - join the values in an array together into one cell, with no separator and ignoring empty values
What do you think of something like this?
=CONCATENATE(BYROW(SEQUENCE(8),LAMBDA(e,IF(ISODD(ROUNDUP(RAND()*10)),CHAR(RANDBETWEEN(65,90)),ROUNDDOWN(RAND()*10)))))
If you want to include lower case, you can do a similar logic:
=CONCATENATE(BYROW(SEQUENCE(8),LAMBDA(e,IF(ISODD(ROUNDUP(RAND()*10)),IF(ISODD(ROUNDUP(RAND()*10)),CHAR(RANDBETWEEN(65,90)),CHAR(RANDBETWEEN(97,122))),ROUNDDOWN(RAND()*10)))))
The logic is the next one: what I'm doing is with ISODD(ROUNDUP(RAND()*10) generating a random number between 1 and 10 and checking if it's odd. If it is, it generates a letter or else it generates a number. With CONCATENATE(BYROW(SEQUENCE(8)... I'm doing this 8 times and concatenating them. What I just added was a second "random and odd" time when it's time to generate a letter so you can have upper and lower case

Excel get substring from nth position up to the end of string

So How to get substring from nth position up to the end of string?
Input at cell A1 Name: Thomas B.
Expected output: Thomas B.
I know some way to do it but I wonder if there are other elegant ways than them? (some kind of =RIGHT(A1, -6)....)
=MID(A1, 6, 999999) //999999 looks not so good
=MID(A1, 6, LEN(A1) - 5) //must calculate 2 times, first get len, then get substring, seems too much works?
REPLACE
As Dominique already wrote:
'Why don't you just replace the first six characters by an empty string?'
=REPLACE(A1,1,6,"")
I've done some time measuring, but the difference is less than a second at 50000 records (for LEFT, MID, REPLACE & SUSTITUTE). So I'm afraid ELEGANCE is all you're going to get.
A Small Study
I created this study due to the fact that when you say from the n-th character, your n-th character is 7 (your MID-s are wrong), but you want to remove the first n-1 (6) characters. So depending on how you formulate your question, you might have a different approach in RIGHT or MID, and you will remember REPLACE and SUBSTITUTE or you may not.
Small Study Formulas for A1 (*) and B1 (#, ?, *)
Get String From N-th Character to the End, e.g. 7
=RIGHT(A1,LEN(A1)-(B1-1))
=RIGHT(A1,LEN(A1)-B1+1)
=RIGHT(A1,LEN(A1)-6)
=MID(A1,B1,LEN(A1)-(B1-1))
=MID(A1,B1,LEN(A1)-B1+1)
=MID(A1,B1,LEN(A1))
=MID(A1,7,LEN(A1)-6)
=MID(A1,7,LEN(A1))
Remove N First Characters of a String, e.g. 6
=RIGHT(A1,LEN(A1)-B1)
=RIGHT(A1,LEN(A1)-6)
=MID(A1,B1+1,LEN(A1)-B1)
=MID(A1,B1+1,LEN(A1))
=MID(A1,7,LEN(A1)-6)
=MID(A1,7,LEN(A1))
Get String After a Character e.g. " "
=RIGHT(A1,LEN(A1)-(FIND(B1,A1)))
=RIGHT(A1,LEN(A1)-(FIND(" ",A1)))
=MID(A1,FIND(B1,A1)+1,LEN(A1)-FIND(B1,A1))
=MID(A1,FIND(B1,A1)+1,LEN(A1))
=MID(A1,FIND(" ",A1)+1,LEN(A1)-FIND(" ",A1))
=MID(A1,FIND(" ",A1)+1,LEN(A1))
Get String After a String e.g. ": "
=RIGHT(A1,LEN(A1)-(FIND(B1,A1)+LEN(B1))+1)
=RIGHT(A1,LEN(A1)-FIND(B1,A1)-LEN(B1)+1)
=RIGHT(A1,LEN(A1)-FIND(": ",A1)-LEN(": ")+1)
=MID(A1,FIND(B1,A1)+LEN(B1),LEN(A1)-(FIND(B1,A1)+LEN(B1))+1)
=MID(A1,FIND(B1,A1)+LEN(B1),LEN(A1)-FIND(B1,A1)-LEN(B1)+1)
=MID(A1,FIND(B1,A1)+LEN(B1),LEN(A1))
=MID(A1,FIND(": ",A1)+LEN(": "),LEN(A1)-FIND(": ",A1)-LEN(": ")+1)
=MID(A1,FIND(": ",A1)+LEN(": "),LEN(A1))
Back to Remove N First Characters of a String, e.g. 6
=SUBSTITUTE(A1,LEFT(A1,6),"",1)
=REPLACE(A1,1,6,"")
Well, both of your methods already work, but you could also use this one:
=RIGHT(A1,LEN(A1)-6)
(you nearly had this one in your own question)
or this one:
=TRIM(MID(A1,FIND(":",A1)+1,100))
(the FIND() function returns the numeric position of a search string, so is great for doing dynamic substrings)
Why don't you just replace the first six characters by an empty string?
=SUBSTITUTE(A1;LEFT(A1;6);"";1)
Another possibility is that you create a constant with the value 2^31-1 (=2147483647), which is the maximum signed integer value on 32-bit systems, and you give it a nice name, like MaxInt, then your first formula will be efficient and nice looking, too:
=MID(A1, 6, MaxInt)
You can add the Name with Ctrl+F3. If you are interested in fast calculations, giving it as 2147483647 rather than 2^31-1 may have some (very little) advantage.

PowerQuery M Conditional Column 'count' argument is out of range

I have a sheet with dates as MMDDYYY with no leading 0's if month number is single digit. For example, 1012018 or 12312018. Each record has a date, and each date is either 7 or 8 characters in length.
Here is the code I am using to convert the numbers to dates:
if Text.Length([ContractDate]) = 7
then
Text.Range([ContractDate],0,1)&"/"&Text.Range([ContractDate],1,2)&"/"&Text.Range([ContractDate],4,4)
else
Text.Range([ContractDate],0,2)&"/"&Text.Range([ContractDate],2,2)&"/"&Text.Range([ContractDate],4,4)
The code works fine for the "else" condition but I am getting error "Expression.Error: The 'count' argument is out of range. Details: 4" for all records where Text.Length() = 7. I verified this by adding a second column to get Length of ContractDate.
What am I missing?
EDIT: Problem Solved - I'm an idiot. I was getting an error because in the "then" condition, I am extracting a substring of (4,4) from a value that only has Len=7. I can't get 4 characters out of a 7 character string when starting at index of 4.
I know you found the issue with your code, but worth pointing out some things that might be good to know.
Text.Range with no character count will pull in all characters past the start point (so Text.Range([ContractDate], 4) would work for both).
Text.Middle operates like Text.Range but will not cause an error if you select a range that expands past the size of the string. This can be useful if for some reason you were dealing with variable size strings where you need a specific number of characters up to a limit past a certain position.
You could also use Text.PadStart([ContractDate], 8, "0") to pad the 7 length strings with a 0 at the start, and avoid the need for a conditional check all together.

Remove text from excel cell before first occurance of special character [duplicate]

Is there an efficient way to identify the last character/string match in a string using base functions? I.e. not the last character/string of the string, but the position of a character/string's last occurrence in a string. Search and find both work left-to-right so I can't think how to apply without lengthy recursive algorithm. And this solution now seems obsolete.
I think I get what you mean. Let's say for example you want the right-most \ in the following string (which is stored in cell A1):
Drive:\Folder\SubFolder\Filename.ext
To get the position of the last \, you would use this formula:
=FIND("#",SUBSTITUTE(A1,"\","#",(LEN(A1)-LEN(SUBSTITUTE(A1,"\","")))/LEN("\")))
That tells us the right-most \ is at character 24. It does this by looking for "#" and substituting the very last "\" with an "#". It determines the last one by using
(len(string)-len(substitute(string, substring, "")))\len(substring)
In this scenario, the substring is simply "\" which has a length of 1, so you could leave off the division at the end and just use:
=FIND("#",SUBSTITUTE(A1,"\","#",LEN(A1)-LEN(SUBSTITUTE(A1,"\",""))))
Now we can use that to get the folder path:
=LEFT(A1,FIND("#",SUBSTITUTE(A1,"\","#",LEN(A1)-LEN(SUBSTITUTE(A1,"\","")))))
Here's the folder path without the trailing \
=LEFT(A1,FIND("#",SUBSTITUTE(A1,"\","#",LEN(A1)-LEN(SUBSTITUTE(A1,"\",""))))-1)
And to get just the filename:
=MID(A1,FIND("#",SUBSTITUTE(A1,"\","#",LEN(A1)-LEN(SUBSTITUTE(A1,"\",""))))+1,LEN(A1))
However, here is an alternate version of getting everything to the right of the last instance of a specific character. So using our same example, this would also return the file name:
=TRIM(RIGHT(SUBSTITUTE(A1,"\",REPT(" ",LEN(A1))),LEN(A1)))
How about creating a custom function and using that in your formula? VBA has a built-in function, InStrRev, that does exactly what you're looking for.
Put this in a new module:
Function RSearch(str As String, find As String)
RSearch = InStrRev(str, find)
End Function
And your function will look like this (assuming the original string is in B1):
=LEFT(B1,RSearch(B1,"\"))
New Answer | 31-3-2022:
With even newer functions come even shorter answers. At time of writing in BETA, but probably widely available in the near future, we can use TEXTBEFORE():
=LEN(TEXTBEFORE(A2,B2,-1))+1
The trick here is that the 3rd parameter tells the function to retrieve the last occurence of the substring we give in the 2nd parameter. At time of writing this function is still case-sensitive by default which could be handeld by the optional 4th parameter.
Original Answer | 17-6-2020:
With newer versions of excel come new functions and thus new methods. Though it's replicable in older versions (yet I have not seen it before), when one has Excel O365 one can use:
=MATCH(2,1/(MID(A1,SEQUENCE(LEN(A1)),1)="Y"))
This can also be used to retrieve the last position of (overlapping) substrings:
=MATCH(2,1/(MID(A1,SEQUENCE(LEN(A1)),2)="YY"))
| Value | Pattern | Formula | Position |
|--------|---------|------------------------------------------------|----------|
| XYYZ | Y | =MATCH(2,1/(MID(A2,SEQUENCE(LEN(A2)),1)="Y")) | 3 |
| XYYYZ | YY | =MATCH(2,1/(MID(A3,SEQUENCE(LEN(A3)),2)="YY")) | 3 |
| XYYYYZ | YY | =MATCH(2,1/(MID(A4,SEQUENCE(LEN(A4)),2)="YY")) | 4 |
Whilst this both allows us to no longer use an arbitrary replacement character and it allows overlapping patterns, the "downside" is the useage of an array.
Note: You can force the same behaviour in older Excel versions through either
=MATCH(2,1/(MID(A2,ROW(A1:INDEX(A:A,LEN(A2))),1)="Y"))
Entered through CtrlShiftEnter, or using an inline INDEX to get rid of implicit intersection:
=MATCH(2,INDEX(1/(MID(A2,ROW(A1:INDEX(A:A,LEN(A2))),1)="Y"),))
tigeravatar and Jean-François Corbett suggested to use this formula to generate the string right of the last occurrence of the "\" character
=TRIM(RIGHT(SUBSTITUTE(A1,"\",REPT(" ",LEN(A1))),LEN(A1)))
If the character used as separator is space, " ", then the formula has to be changed to:
=SUBSTITUTE(RIGHT(SUBSTITUTE(A1," ",REPT("{",LEN(A1))),LEN(A1)),"{","")
No need to mention, the "{" character can be replaced with any character that would not "normally" occur in the text to process.
Just came up with this solution, no VBA needed;
Find the last occurance of "_" in my example;
=IFERROR(FIND(CHAR(1);SUBSTITUTE(A1;"_";CHAR(1);LEN(A1)-LEN(SUBSTITUTE(A1;"_";"")));0)
Explained inside out;
SUBSTITUTE(A1;"_";"") => replace "_" by spaces
LEN( *above* ) => count the chars
LEN(A1)- *above* => indicates amount of chars replaced (= occurrences of "_")
SUBSTITUTE(A1;"_";CHAR(1); *above* ) => replace the Nth occurence of "_" by CHAR(1) (Nth = amount of chars replaced = the last one)
FIND(CHAR(1); *above* ) => Find the CHAR(1), being the last (replaced) occurance of "_" in our case
IFERROR( *above* ;"0") => in case no chars were found, return "0"
Hope this was helpful.
You could use this function I created to find the last instance of a string within a string.
Sure the accepted Excel formula works, but it's much too difficult to read and use. At some point you have to break out into smaller chunks so it's maintainable. My function below is readable, but that's irrelevant because you call it in a formula using named parameters. This makes using it simple.
Public Function FindLastCharOccurence(fromText As String, searchChar As String) As Integer
Dim lastOccur As Integer
lastOccur = -1
Dim i As Integer
i = 0
For i = Len(fromText) To 1 Step -1
If Mid(fromText, i, 1) = searchChar Then
lastOccur = i
Exit For
End If
Next i
FindLastCharOccurence = lastOccur
End Function
I use it like this:
=RIGHT(A2, LEN(A2) - FindLastCharOccurence(A2, "\"))
Considering a part of a Comment made by #SSilk my end goal has really been to get everything to the right of that last occurence an alternative approach with a very simple formula is to copy a column (say A) of strings and on the copy (say ColumnB) apply Find and Replace. For instance taking the example: Drive:\Folder\SubFolder\Filename.ext
This returns what remains (here Filename.ext) after the last instance of whatever character is chosen (here \) which is sometimes the objective anyway and facilitates finding the position of the last such character with a short formula such as:
=FIND(B1,A1)-1
I'm a little late to the party, but maybe this could help. The link in the question had a similar formula, but mine uses the IF() statement to get rid of errors.
If you're not afraid of Ctrl+Shift+Enter, you can do pretty well with an array formula.
String (in cell A1):
"one.two.three.four"
Formula:
{=MAX(IF(MID(A1,ROW($1:$99),1)=".",ROW($1:$99)))} use Ctrl+Shift+Enter
Result: 14
First,
ROW($1:$99)
returns an array of integers from 1 to 99: {1,2,3,4,...,98,99}.
Next,
MID(A1,ROW($1:$99),1)
returns an array of 1-length strings found in the target string, then returns blank strings after the length of the target string is reached: {"o","n","e",".",..."u","r","","",""...}
Next,
IF(MID(I16,ROW($1:$99),1)=".",ROW($1:$99))
compares each item in the array to the string "." and returns either the index of the character in the string or FALSE: {FALSE,FALSE,FALSE,4,FALSE,FALSE,FALSE,8,FALSE,FALSE,FALSE,FALSE,FALSE,14,FALSE,FALSE.....}
Last,
=MAX(IF(MID(I16,ROW($1:$99),1)=".",ROW($1:$99)))
returns the maximum value of the array: 14
Advantages of this formula is that it is short, relatively easy to understand, and doesn't require any unique characters.
Disadvantages are the required use of Ctrl+Shift+Enter and the limitation on string length. This can be worked around with a variation shown below, but that variation uses the OFFSET() function which is a volatile (read: slow) function.
Not sure what the speed of this formula is vs. others.
Variations:
=MAX((MID(A1,ROW(OFFSET($A$1,,,LEN(A1))),1)=".")*ROW(OFFSET($A$1,,,LEN(A1)))) works the same way, but you don't have to worry about the length of the string
=SMALL(IF(MID(A1,ROW($1:$99),1)=".",ROW($1:$99)),2) determines the 2nd occurrence of the match
=LARGE(IF(MID(A1,ROW($1:$99),1)=".",ROW($1:$99)),2) determines the 2nd-to-last occurrence of the match
=MAX(IF(MID(I16,ROW($1:$99),2)=".t",ROW($1:$99))) matches a 2-character string **Make sure you change the last argument of the MID() function to the number of characters in the string you wish to match!
In newer versions of Excel (2013 and up) flash fill might be a simple and quick solution see: Using Flash Fill in Excel
.
For a string in A1 and substring in B1, use:
=XMATCH(B1,MID(A1,SEQUENCE(LEN(A1)),LEN(B1)),,-1)
Working from inside out, MID(A1,SEQUENCE(LEN(A1)),LEN(B1)) splits string A1 into a dynamic array of substrings, each the length of B1. To find the position of the last occurrence of substring B1, we use XMATCH with its Search_mode argument set to -1.
A simple way to do that in VBA is:
YourText = "c:\excel\text.txt"
xString = Mid(YourText, 2 + Len(YourText) - InStr(StrReverse(YourText), "\" ))
Very late to the party, but A simple solution is using VBA to create a custom function.
Add the function to VBA in the WorkBook, Worksheet, or a VBA Module
Function LastSegment(S, C)
LastSegment = Right(S, Len(S) - InStrRev(S, C))
End Function
Then the cell formula
=lastsegment(B1,"/")
in a cell and the string to be searched in cell B1 will populate the cell with the text trailing the last "/" from cell B1.
No length limit, no obscure formulas.
Only downside I can think is the need for a macro-enabled workbook.
Any user VBA Function can be called this way to return a value to a cell formula, including as a parameter to a builtin Excel function.
If you are going to use the function heavily you'll want to check for the case when the character is not in the string, then string is blank, etc.
If you're only looking for the position of the last instance of character "~" then
=len(substitute(String,"~",""))+1
I'm sure there is version that will work with the last instance of a string but I have to get back to work.
Cell A1 = find/the/position/of/the last slash
simple way to do it is reverse the text and then find the first slash as normal. Now you can get the length of the full text minus this number.
Like so:
=LEN(A1)-FIND("/",REVERSETEXT(A1),1)+1
This returns 21, the position of the last /

Excel: last character/string match in a string

Is there an efficient way to identify the last character/string match in a string using base functions? I.e. not the last character/string of the string, but the position of a character/string's last occurrence in a string. Search and find both work left-to-right so I can't think how to apply without lengthy recursive algorithm. And this solution now seems obsolete.
I think I get what you mean. Let's say for example you want the right-most \ in the following string (which is stored in cell A1):
Drive:\Folder\SubFolder\Filename.ext
To get the position of the last \, you would use this formula:
=FIND("#",SUBSTITUTE(A1,"\","#",(LEN(A1)-LEN(SUBSTITUTE(A1,"\","")))/LEN("\")))
That tells us the right-most \ is at character 24. It does this by looking for "#" and substituting the very last "\" with an "#". It determines the last one by using
(len(string)-len(substitute(string, substring, "")))\len(substring)
In this scenario, the substring is simply "\" which has a length of 1, so you could leave off the division at the end and just use:
=FIND("#",SUBSTITUTE(A1,"\","#",LEN(A1)-LEN(SUBSTITUTE(A1,"\",""))))
Now we can use that to get the folder path:
=LEFT(A1,FIND("#",SUBSTITUTE(A1,"\","#",LEN(A1)-LEN(SUBSTITUTE(A1,"\","")))))
Here's the folder path without the trailing \
=LEFT(A1,FIND("#",SUBSTITUTE(A1,"\","#",LEN(A1)-LEN(SUBSTITUTE(A1,"\",""))))-1)
And to get just the filename:
=MID(A1,FIND("#",SUBSTITUTE(A1,"\","#",LEN(A1)-LEN(SUBSTITUTE(A1,"\",""))))+1,LEN(A1))
However, here is an alternate version of getting everything to the right of the last instance of a specific character. So using our same example, this would also return the file name:
=TRIM(RIGHT(SUBSTITUTE(A1,"\",REPT(" ",LEN(A1))),LEN(A1)))
How about creating a custom function and using that in your formula? VBA has a built-in function, InStrRev, that does exactly what you're looking for.
Put this in a new module:
Function RSearch(str As String, find As String)
RSearch = InStrRev(str, find)
End Function
And your function will look like this (assuming the original string is in B1):
=LEFT(B1,RSearch(B1,"\"))
New Answer | 31-3-2022:
With even newer functions come even shorter answers. At time of writing in BETA, but probably widely available in the near future, we can use TEXTBEFORE():
=LEN(TEXTBEFORE(A2,B2,-1))+1
The trick here is that the 3rd parameter tells the function to retrieve the last occurence of the substring we give in the 2nd parameter. At time of writing this function is still case-sensitive by default which could be handeld by the optional 4th parameter.
Original Answer | 17-6-2020:
With newer versions of excel come new functions and thus new methods. Though it's replicable in older versions (yet I have not seen it before), when one has Excel O365 one can use:
=MATCH(2,1/(MID(A1,SEQUENCE(LEN(A1)),1)="Y"))
This can also be used to retrieve the last position of (overlapping) substrings:
=MATCH(2,1/(MID(A1,SEQUENCE(LEN(A1)),2)="YY"))
| Value | Pattern | Formula | Position |
|--------|---------|------------------------------------------------|----------|
| XYYZ | Y | =MATCH(2,1/(MID(A2,SEQUENCE(LEN(A2)),1)="Y")) | 3 |
| XYYYZ | YY | =MATCH(2,1/(MID(A3,SEQUENCE(LEN(A3)),2)="YY")) | 3 |
| XYYYYZ | YY | =MATCH(2,1/(MID(A4,SEQUENCE(LEN(A4)),2)="YY")) | 4 |
Whilst this both allows us to no longer use an arbitrary replacement character and it allows overlapping patterns, the "downside" is the useage of an array.
Note: You can force the same behaviour in older Excel versions through either
=MATCH(2,1/(MID(A2,ROW(A1:INDEX(A:A,LEN(A2))),1)="Y"))
Entered through CtrlShiftEnter, or using an inline INDEX to get rid of implicit intersection:
=MATCH(2,INDEX(1/(MID(A2,ROW(A1:INDEX(A:A,LEN(A2))),1)="Y"),))
tigeravatar and Jean-François Corbett suggested to use this formula to generate the string right of the last occurrence of the "\" character
=TRIM(RIGHT(SUBSTITUTE(A1,"\",REPT(" ",LEN(A1))),LEN(A1)))
If the character used as separator is space, " ", then the formula has to be changed to:
=SUBSTITUTE(RIGHT(SUBSTITUTE(A1," ",REPT("{",LEN(A1))),LEN(A1)),"{","")
No need to mention, the "{" character can be replaced with any character that would not "normally" occur in the text to process.
Just came up with this solution, no VBA needed;
Find the last occurance of "_" in my example;
=IFERROR(FIND(CHAR(1);SUBSTITUTE(A1;"_";CHAR(1);LEN(A1)-LEN(SUBSTITUTE(A1;"_";"")));0)
Explained inside out;
SUBSTITUTE(A1;"_";"") => replace "_" by spaces
LEN( *above* ) => count the chars
LEN(A1)- *above* => indicates amount of chars replaced (= occurrences of "_")
SUBSTITUTE(A1;"_";CHAR(1); *above* ) => replace the Nth occurence of "_" by CHAR(1) (Nth = amount of chars replaced = the last one)
FIND(CHAR(1); *above* ) => Find the CHAR(1), being the last (replaced) occurance of "_" in our case
IFERROR( *above* ;"0") => in case no chars were found, return "0"
Hope this was helpful.
You could use this function I created to find the last instance of a string within a string.
Sure the accepted Excel formula works, but it's much too difficult to read and use. At some point you have to break out into smaller chunks so it's maintainable. My function below is readable, but that's irrelevant because you call it in a formula using named parameters. This makes using it simple.
Public Function FindLastCharOccurence(fromText As String, searchChar As String) As Integer
Dim lastOccur As Integer
lastOccur = -1
Dim i As Integer
i = 0
For i = Len(fromText) To 1 Step -1
If Mid(fromText, i, 1) = searchChar Then
lastOccur = i
Exit For
End If
Next i
FindLastCharOccurence = lastOccur
End Function
I use it like this:
=RIGHT(A2, LEN(A2) - FindLastCharOccurence(A2, "\"))
Considering a part of a Comment made by #SSilk my end goal has really been to get everything to the right of that last occurence an alternative approach with a very simple formula is to copy a column (say A) of strings and on the copy (say ColumnB) apply Find and Replace. For instance taking the example: Drive:\Folder\SubFolder\Filename.ext
This returns what remains (here Filename.ext) after the last instance of whatever character is chosen (here \) which is sometimes the objective anyway and facilitates finding the position of the last such character with a short formula such as:
=FIND(B1,A1)-1
I'm a little late to the party, but maybe this could help. The link in the question had a similar formula, but mine uses the IF() statement to get rid of errors.
If you're not afraid of Ctrl+Shift+Enter, you can do pretty well with an array formula.
String (in cell A1):
"one.two.three.four"
Formula:
{=MAX(IF(MID(A1,ROW($1:$99),1)=".",ROW($1:$99)))} use Ctrl+Shift+Enter
Result: 14
First,
ROW($1:$99)
returns an array of integers from 1 to 99: {1,2,3,4,...,98,99}.
Next,
MID(A1,ROW($1:$99),1)
returns an array of 1-length strings found in the target string, then returns blank strings after the length of the target string is reached: {"o","n","e",".",..."u","r","","",""...}
Next,
IF(MID(I16,ROW($1:$99),1)=".",ROW($1:$99))
compares each item in the array to the string "." and returns either the index of the character in the string or FALSE: {FALSE,FALSE,FALSE,4,FALSE,FALSE,FALSE,8,FALSE,FALSE,FALSE,FALSE,FALSE,14,FALSE,FALSE.....}
Last,
=MAX(IF(MID(I16,ROW($1:$99),1)=".",ROW($1:$99)))
returns the maximum value of the array: 14
Advantages of this formula is that it is short, relatively easy to understand, and doesn't require any unique characters.
Disadvantages are the required use of Ctrl+Shift+Enter and the limitation on string length. This can be worked around with a variation shown below, but that variation uses the OFFSET() function which is a volatile (read: slow) function.
Not sure what the speed of this formula is vs. others.
Variations:
=MAX((MID(A1,ROW(OFFSET($A$1,,,LEN(A1))),1)=".")*ROW(OFFSET($A$1,,,LEN(A1)))) works the same way, but you don't have to worry about the length of the string
=SMALL(IF(MID(A1,ROW($1:$99),1)=".",ROW($1:$99)),2) determines the 2nd occurrence of the match
=LARGE(IF(MID(A1,ROW($1:$99),1)=".",ROW($1:$99)),2) determines the 2nd-to-last occurrence of the match
=MAX(IF(MID(I16,ROW($1:$99),2)=".t",ROW($1:$99))) matches a 2-character string **Make sure you change the last argument of the MID() function to the number of characters in the string you wish to match!
In newer versions of Excel (2013 and up) flash fill might be a simple and quick solution see: Using Flash Fill in Excel
.
For a string in A1 and substring in B1, use:
=XMATCH(B1,MID(A1,SEQUENCE(LEN(A1)),LEN(B1)),,-1)
Working from inside out, MID(A1,SEQUENCE(LEN(A1)),LEN(B1)) splits string A1 into a dynamic array of substrings, each the length of B1. To find the position of the last occurrence of substring B1, we use XMATCH with its Search_mode argument set to -1.
A simple way to do that in VBA is:
YourText = "c:\excel\text.txt"
xString = Mid(YourText, 2 + Len(YourText) - InStr(StrReverse(YourText), "\" ))
Very late to the party, but A simple solution is using VBA to create a custom function.
Add the function to VBA in the WorkBook, Worksheet, or a VBA Module
Function LastSegment(S, C)
LastSegment = Right(S, Len(S) - InStrRev(S, C))
End Function
Then the cell formula
=lastsegment(B1,"/")
in a cell and the string to be searched in cell B1 will populate the cell with the text trailing the last "/" from cell B1.
No length limit, no obscure formulas.
Only downside I can think is the need for a macro-enabled workbook.
Any user VBA Function can be called this way to return a value to a cell formula, including as a parameter to a builtin Excel function.
If you are going to use the function heavily you'll want to check for the case when the character is not in the string, then string is blank, etc.
If you're only looking for the position of the last instance of character "~" then
=len(substitute(String,"~",""))+1
I'm sure there is version that will work with the last instance of a string but I have to get back to work.
Cell A1 = find/the/position/of/the last slash
simple way to do it is reverse the text and then find the first slash as normal. Now you can get the length of the full text minus this number.
Like so:
=LEN(A1)-FIND("/",REVERSETEXT(A1),1)+1
This returns 21, the position of the last /

Resources