Excel 2007 - Generate unique ID based on text? - excel

I have a sheet with a list of names in Column B and an ID column in A. I was wondering if there is some kind of formula that can take the value in column B of that row and generate some kind of ID based on the text? Each name is also unique and is never repeated in any way.
It would be best if I didn't have to use VBA really. But if I have to, so be it.

Solution Without VBA.
Logic based on First 8 characters + number of character in a cell.
= CODE(cell) which returns Code number for first letter
= CODE(MID(cell,2,1)) returns Code number for second letter
= IFERROR(CODE(MID(cell,9,1)) If 9th character does not exist then return 0
= LEN(cell) number of character in a cell
Concatenating firs 8 codes + adding length of character on the end
If 8 character is not enough, then replicate additional codes for next characters in a string.
Final function:
=CODE(B2)&IFERROR(CODE(MID(B2,2,1)),0)&IFERROR(CODE(MID(B2,3,1)),0)&IFERROR(CODE(MID(B2,4,1)),0)&IFERROR(CODE(MID(B2,5,1)),0)&IFERROR(CODE(MID(B2,6,1)),0)&IFERROR(CODE(MID(B2,7,1)),0)&IFERROR(CODE(MID(B2,8,1)),0)&LEN(B2)

Sorry, I didn't found a solution with formula only even if this thread might help (trying to calculate the points in a scrabble game) but I didn't find a way to be sure the generated hash would be unique.
Yet, here is my solution, based on a UDF (Used-Defined Function):
Put the code in a module:
Public Function genId(ByVal sName As String) As Long
'Function to create a unique hash by summing the ascii value of each character of a given string
Dim sLetter As String
Dim i As Integer
For i = 1 To Len(sName)
genId = Asc(Mid(sName, i, 1)) * i + genId
Next i
End Function
And call it in your worksheet like a formula:
=genId(A1)
[EDIT] Added the * i to take into account the order. It works on my unit tests

May be OTT for your needs, but you can use a call to CoCreateGuid to get a real GUID
Private Declare Function CoCreateGuid Lib "ole32" (ID As Any) As Long
Function GUID() As String
Dim ID(0 To 15) As Byte
Dim i As Long
If CoCreateGuid(ID(0)) = 0 Then
For i = 0 To 15
GUID = GUID & Format(Hex$(ID(i)), "00")
Next
Else
GUID = "Error while creating GUID!"
End If
End Function
Test using
Sub testGUID()
MsgBox GUID
End Sub
How to best implement depends on your needs. One way would be to write a macro to get a GUID populate a column where names exist. (note, using it as a udf as is is no good, since it will return a new GUID when recalculated)
EDIT
See this answer for creating a SHA1 hash of a string

Do you just want an incrementing numeric id column to sit next to your values? If so, and if your values will always be unique, you can very easily do this with formulae.
If your values were in column B, starting in B2 underneath your headers for example, in A2 you would type the formula "=IF(B2="","",1+MAX(A$1:A1))". You can copy and paste that down as far as your data extends, and it will increment a numeric identifier for each row in column B which isn't blank.
If you need to do anything more complicated, like identify and re-identify repeating values, or make identifiers 'freeze' once they're populated, let me know. Currently, when you clear or add values to your list the identifers will toggle themselves up and down, so you need to be careful if your data changes.

Unique identifier based on the number of specific characters in text. I used an identifier based on vowels and numbers.
=LEN($J$14)-LEN(SUBSTITUTE($J$14;"a";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"e";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"i";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"j";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"o";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"u";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"y";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"1";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"2";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"3";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"4";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"5";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"6";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"7";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"8";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"9";""))&LEN($J$14)-LEN(SUBSTITUTE($J$14;"0";""))

You say you are confident that there are no duplicate values in your words. To push it further, are you confident that the first 8 characters in any word would be unique?
If so, you can use the below formula. It works by individually taking each character's ASCII code - 40 [assuming normal characters, this puts numbers at between 8 & 57, and letters at between 57 & 122], and multiplying that characters code by 10 ^ [that character's digit placement in the word]. Basically it takes that character code [-40], and concatenates each code onto the next.
EDIT Note that this code no longer requires that at least 8 characters exist in your word to prevent an error, as the actual word to be coded has 8 "0"'s appended to it.
=TEXT(SUM((CODE(MID(LOWER(RIGHT(REPT("0",8)&A3,8)),{1,2,3,4,5,6,7,8},1))-40)*10^{0,2,4,6,8,10,12,14}),"#")
Note that as this uses the ASCII values of the characters, the ID # could be used to identify the name directly - this does not really create anonymity, it just turns 8 unique characters into a unique number. It is obfuscated with the -40, but not really 'safe' in that sense. The -40 is just to get normal letters and numbers in the 2 digit range, so that multiplying by 10^0,2,4 etc. will create a 2 digit unique add-on to the created code.
EDIT FOR ALTERNATIVE METHOD
I had previously attempted to do this so that it would look at each letter of the alphabet, count the number of times it appears in the word, and then multiply that by 10*[that letter's position in the alphabet]. The problem with doing this (see comment below for formula) is that it required a number of 10^26-1, which is beyond Excel's floating point precision. However, I have a modified version of that method:
By limiting the number of allowed characters in the alphabet, we can get the max total size possible to 10^15-1, which Excel can properly calculate. The formula looks like this:
=RIGHT(REPT("0",15)&TEXT(SUM(LEN(A3)*10^{0,1,2,3,4,5,6,7,8,9,10,11,12,13,14}-LEN(SUBSTITUTE(A3,MID(Alphabet,{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15},1),""))*10^{0,1,2,3,4,5,6,7,8,9,10,11,12,13,14}),"#"),15)
[The RIGHT("00000000000000"... portion of the formula is meant to keep all codes the same number of characters]
Note that here, Alphabet is a named string which holds the characters: "abcdehilmnorstu". For example, using the above formula, the word "asdf" counts the instances of a, s, and d, but not 'f' which isn't in my contracted alphabet. The code of "asdf" would be:
001000000001001
This only works with the following assumptions:
The letters not listed (nor numbers / special characters) are not required to make each name unique. For example, asdf & asd would have the same code in the above method.
And,
The order of the letters is not required to make each name unique. For example, asd & dsa would have the same code in the above method.

Related

EXCEL: Unique alphanumeric code with certain characters excluded (without VBA / duplicates)

I am trying to create a list =5 alphanumeric characters.
They cannot contain 1, and i and there cannot be duplicates when dragging / copying the code down.
The characters that are allowed are:
023456789ABCDEFGHJKLMNOPQRSTUWVXYZ (Capital)
I have tried numerous of options but I can't seem to figure this one out.
Cheers
If your allowable character string is in cell A1 then the following formula will result in random codes that are each five characters in length:
=MID(A1,RANDBETWEEN(1,34),1) & MID(A1,RANDBETWEEN(1,34),1) & MID(A1,RANDBETWEEN(1,34),1) & MID(A1,RANDBETWEEN(1,34),1) & MID(A1,RANDBETWEEN(1,34),1)
But note that there is no guarantee that the codes will be unique.
As #ScottCraner pointed out... if you should happen to have Office 365, you can use this much shorter formula that takes advantage of two new functions only available in Excel 365:
=CONCAT(MID(A1,RANDARRAY(5,,1,34,TRUE),1))
But again, there is no guarantee that the resulting codes will be unique.
This formula will generate the codes in order
=SUBSTITUTE(SUBSTITUTE(BASE(K, 34,5),"1","Z"),"I","Y")
Here K can be 0, 1, 2, .... One way to generate the first ~1,048,576 K's is to use ROW()-1. You could get higher values of K by using something like K = 1048576*(COLUMN()-1) + ROW()-1.
The formula works by
(a) calling BASE(K, 34, 5) to get a 5-char long base-34 representation of K
(b) substituting Z for 1 since 1 is not a valid char
(c) substituting Y for I since I is not a valid char

search first x characters from excel range

I need to search an Excel range for specific text, and return how many are found, but want to search only on the first 4 characters. I have
=SUMPRODUCT(-- ISNUMBER(FIND("V",E4:E40)))
which does search the range, but it finds all V's and not those just in the first 4 characters. The data starts with a 4 character acronym and then has free form text.
I'm unable to get LEFT to work in this formula.
=SUMPRODUCT(-- (LEFT(E4:E40,4) = "VOID"))
If you only want to see if V is in the first 4 letters:
=SUMPRODUCT(--(FIND("V",E4:E40 & "XXXXV")<=4))
concatenating the & "XXXXV" on the end removes the possiblity of the error. So the Find will return a number. We then test whether that V is found in the first 4 characters.
We added fourX to the string to ensure that if the value has a length less than 4 we do not get false positives.

Remove text from excel cell before first occurance of special character [duplicate]

Is there an efficient way to identify the last character/string match in a string using base functions? I.e. not the last character/string of the string, but the position of a character/string's last occurrence in a string. Search and find both work left-to-right so I can't think how to apply without lengthy recursive algorithm. And this solution now seems obsolete.
I think I get what you mean. Let's say for example you want the right-most \ in the following string (which is stored in cell A1):
Drive:\Folder\SubFolder\Filename.ext
To get the position of the last \, you would use this formula:
=FIND("#",SUBSTITUTE(A1,"\","#",(LEN(A1)-LEN(SUBSTITUTE(A1,"\","")))/LEN("\")))
That tells us the right-most \ is at character 24. It does this by looking for "#" and substituting the very last "\" with an "#". It determines the last one by using
(len(string)-len(substitute(string, substring, "")))\len(substring)
In this scenario, the substring is simply "\" which has a length of 1, so you could leave off the division at the end and just use:
=FIND("#",SUBSTITUTE(A1,"\","#",LEN(A1)-LEN(SUBSTITUTE(A1,"\",""))))
Now we can use that to get the folder path:
=LEFT(A1,FIND("#",SUBSTITUTE(A1,"\","#",LEN(A1)-LEN(SUBSTITUTE(A1,"\","")))))
Here's the folder path without the trailing \
=LEFT(A1,FIND("#",SUBSTITUTE(A1,"\","#",LEN(A1)-LEN(SUBSTITUTE(A1,"\",""))))-1)
And to get just the filename:
=MID(A1,FIND("#",SUBSTITUTE(A1,"\","#",LEN(A1)-LEN(SUBSTITUTE(A1,"\",""))))+1,LEN(A1))
However, here is an alternate version of getting everything to the right of the last instance of a specific character. So using our same example, this would also return the file name:
=TRIM(RIGHT(SUBSTITUTE(A1,"\",REPT(" ",LEN(A1))),LEN(A1)))
How about creating a custom function and using that in your formula? VBA has a built-in function, InStrRev, that does exactly what you're looking for.
Put this in a new module:
Function RSearch(str As String, find As String)
RSearch = InStrRev(str, find)
End Function
And your function will look like this (assuming the original string is in B1):
=LEFT(B1,RSearch(B1,"\"))
New Answer | 31-3-2022:
With even newer functions come even shorter answers. At time of writing in BETA, but probably widely available in the near future, we can use TEXTBEFORE():
=LEN(TEXTBEFORE(A2,B2,-1))+1
The trick here is that the 3rd parameter tells the function to retrieve the last occurence of the substring we give in the 2nd parameter. At time of writing this function is still case-sensitive by default which could be handeld by the optional 4th parameter.
Original Answer | 17-6-2020:
With newer versions of excel come new functions and thus new methods. Though it's replicable in older versions (yet I have not seen it before), when one has Excel O365 one can use:
=MATCH(2,1/(MID(A1,SEQUENCE(LEN(A1)),1)="Y"))
This can also be used to retrieve the last position of (overlapping) substrings:
=MATCH(2,1/(MID(A1,SEQUENCE(LEN(A1)),2)="YY"))
| Value | Pattern | Formula | Position |
|--------|---------|------------------------------------------------|----------|
| XYYZ | Y | =MATCH(2,1/(MID(A2,SEQUENCE(LEN(A2)),1)="Y")) | 3 |
| XYYYZ | YY | =MATCH(2,1/(MID(A3,SEQUENCE(LEN(A3)),2)="YY")) | 3 |
| XYYYYZ | YY | =MATCH(2,1/(MID(A4,SEQUENCE(LEN(A4)),2)="YY")) | 4 |
Whilst this both allows us to no longer use an arbitrary replacement character and it allows overlapping patterns, the "downside" is the useage of an array.
Note: You can force the same behaviour in older Excel versions through either
=MATCH(2,1/(MID(A2,ROW(A1:INDEX(A:A,LEN(A2))),1)="Y"))
Entered through CtrlShiftEnter, or using an inline INDEX to get rid of implicit intersection:
=MATCH(2,INDEX(1/(MID(A2,ROW(A1:INDEX(A:A,LEN(A2))),1)="Y"),))
tigeravatar and Jean-François Corbett suggested to use this formula to generate the string right of the last occurrence of the "\" character
=TRIM(RIGHT(SUBSTITUTE(A1,"\",REPT(" ",LEN(A1))),LEN(A1)))
If the character used as separator is space, " ", then the formula has to be changed to:
=SUBSTITUTE(RIGHT(SUBSTITUTE(A1," ",REPT("{",LEN(A1))),LEN(A1)),"{","")
No need to mention, the "{" character can be replaced with any character that would not "normally" occur in the text to process.
Just came up with this solution, no VBA needed;
Find the last occurance of "_" in my example;
=IFERROR(FIND(CHAR(1);SUBSTITUTE(A1;"_";CHAR(1);LEN(A1)-LEN(SUBSTITUTE(A1;"_";"")));0)
Explained inside out;
SUBSTITUTE(A1;"_";"") => replace "_" by spaces
LEN( *above* ) => count the chars
LEN(A1)- *above* => indicates amount of chars replaced (= occurrences of "_")
SUBSTITUTE(A1;"_";CHAR(1); *above* ) => replace the Nth occurence of "_" by CHAR(1) (Nth = amount of chars replaced = the last one)
FIND(CHAR(1); *above* ) => Find the CHAR(1), being the last (replaced) occurance of "_" in our case
IFERROR( *above* ;"0") => in case no chars were found, return "0"
Hope this was helpful.
You could use this function I created to find the last instance of a string within a string.
Sure the accepted Excel formula works, but it's much too difficult to read and use. At some point you have to break out into smaller chunks so it's maintainable. My function below is readable, but that's irrelevant because you call it in a formula using named parameters. This makes using it simple.
Public Function FindLastCharOccurence(fromText As String, searchChar As String) As Integer
Dim lastOccur As Integer
lastOccur = -1
Dim i As Integer
i = 0
For i = Len(fromText) To 1 Step -1
If Mid(fromText, i, 1) = searchChar Then
lastOccur = i
Exit For
End If
Next i
FindLastCharOccurence = lastOccur
End Function
I use it like this:
=RIGHT(A2, LEN(A2) - FindLastCharOccurence(A2, "\"))
Considering a part of a Comment made by #SSilk my end goal has really been to get everything to the right of that last occurence an alternative approach with a very simple formula is to copy a column (say A) of strings and on the copy (say ColumnB) apply Find and Replace. For instance taking the example: Drive:\Folder\SubFolder\Filename.ext
This returns what remains (here Filename.ext) after the last instance of whatever character is chosen (here \) which is sometimes the objective anyway and facilitates finding the position of the last such character with a short formula such as:
=FIND(B1,A1)-1
I'm a little late to the party, but maybe this could help. The link in the question had a similar formula, but mine uses the IF() statement to get rid of errors.
If you're not afraid of Ctrl+Shift+Enter, you can do pretty well with an array formula.
String (in cell A1):
"one.two.three.four"
Formula:
{=MAX(IF(MID(A1,ROW($1:$99),1)=".",ROW($1:$99)))} use Ctrl+Shift+Enter
Result: 14
First,
ROW($1:$99)
returns an array of integers from 1 to 99: {1,2,3,4,...,98,99}.
Next,
MID(A1,ROW($1:$99),1)
returns an array of 1-length strings found in the target string, then returns blank strings after the length of the target string is reached: {"o","n","e",".",..."u","r","","",""...}
Next,
IF(MID(I16,ROW($1:$99),1)=".",ROW($1:$99))
compares each item in the array to the string "." and returns either the index of the character in the string or FALSE: {FALSE,FALSE,FALSE,4,FALSE,FALSE,FALSE,8,FALSE,FALSE,FALSE,FALSE,FALSE,14,FALSE,FALSE.....}
Last,
=MAX(IF(MID(I16,ROW($1:$99),1)=".",ROW($1:$99)))
returns the maximum value of the array: 14
Advantages of this formula is that it is short, relatively easy to understand, and doesn't require any unique characters.
Disadvantages are the required use of Ctrl+Shift+Enter and the limitation on string length. This can be worked around with a variation shown below, but that variation uses the OFFSET() function which is a volatile (read: slow) function.
Not sure what the speed of this formula is vs. others.
Variations:
=MAX((MID(A1,ROW(OFFSET($A$1,,,LEN(A1))),1)=".")*ROW(OFFSET($A$1,,,LEN(A1)))) works the same way, but you don't have to worry about the length of the string
=SMALL(IF(MID(A1,ROW($1:$99),1)=".",ROW($1:$99)),2) determines the 2nd occurrence of the match
=LARGE(IF(MID(A1,ROW($1:$99),1)=".",ROW($1:$99)),2) determines the 2nd-to-last occurrence of the match
=MAX(IF(MID(I16,ROW($1:$99),2)=".t",ROW($1:$99))) matches a 2-character string **Make sure you change the last argument of the MID() function to the number of characters in the string you wish to match!
In newer versions of Excel (2013 and up) flash fill might be a simple and quick solution see: Using Flash Fill in Excel
.
For a string in A1 and substring in B1, use:
=XMATCH(B1,MID(A1,SEQUENCE(LEN(A1)),LEN(B1)),,-1)
Working from inside out, MID(A1,SEQUENCE(LEN(A1)),LEN(B1)) splits string A1 into a dynamic array of substrings, each the length of B1. To find the position of the last occurrence of substring B1, we use XMATCH with its Search_mode argument set to -1.
A simple way to do that in VBA is:
YourText = "c:\excel\text.txt"
xString = Mid(YourText, 2 + Len(YourText) - InStr(StrReverse(YourText), "\" ))
Very late to the party, but A simple solution is using VBA to create a custom function.
Add the function to VBA in the WorkBook, Worksheet, or a VBA Module
Function LastSegment(S, C)
LastSegment = Right(S, Len(S) - InStrRev(S, C))
End Function
Then the cell formula
=lastsegment(B1,"/")
in a cell and the string to be searched in cell B1 will populate the cell with the text trailing the last "/" from cell B1.
No length limit, no obscure formulas.
Only downside I can think is the need for a macro-enabled workbook.
Any user VBA Function can be called this way to return a value to a cell formula, including as a parameter to a builtin Excel function.
If you are going to use the function heavily you'll want to check for the case when the character is not in the string, then string is blank, etc.
If you're only looking for the position of the last instance of character "~" then
=len(substitute(String,"~",""))+1
I'm sure there is version that will work with the last instance of a string but I have to get back to work.
Cell A1 = find/the/position/of/the last slash
simple way to do it is reverse the text and then find the first slash as normal. Now you can get the length of the full text minus this number.
Like so:
=LEN(A1)-FIND("/",REVERSETEXT(A1),1)+1
This returns 21, the position of the last /

Excel : Find only Hexa decimals from 1 cell

I'm a newbie on Excel.
So I have a list of some names ending with Hexa decimals. And some names, that doesn't have any.
My mission is to see only those names with Hexa decimals. (Mabye somehow filter them out)
Column:
BFAXSPOINTDEVBAUHOFLAN2AD
BFAXSQLBAUHOFLAN207
BFAXSQLDEVBAUHOFLAN27A
BFREPDEVBAUHOFLAN258
BFREPORTINGBAUHOFLAN20B
COBALTSEA02900
COBALTSEAVHOST900
DIRECTO8000
DIRECTO9000
DIRECTODCDIRECTOLA009
DYNAMAEBSSISE006
SURVEYEBSSISE006
KVMSRV00",
KVMSRV01",
KVMSRV02",
ASR
CACTI
DBSYNC",
DTV
and so on...
The Function HEX2DEC will help you achieve what you want - it attempts to convert a number as a hexidecimal, into a decimal. If it is not a valid Hex input, it will produce an error.
The key is understanding how many digits you expect your decimal to be - is it the last 5 characters; the last 10; etc. Also note that there is a risk that random text / numbers will be seen as hexidecimal when really that's not what it represents [but that's a problem with the question as you have laid it out; going solely based on the text provided, all we can see is whether a particular cell creates a valid Hexidecimal].
The full formula would look like this[assuming your data starts in A1, and that your Hexidecimal numbers are expected to be 6 characters long, this goes in B1 and is copied down]:
=ISERROR(HEX2DEC(RIGHT(A1,6)))
This takes the 6 rightmost characters of a cell, and attempts to convert it from Hex to Decimal. If it fails, it will produce TRUE [because of ISERROR]; if it succeeds, it will produce FALSE.
Then simply filter on your column to see the subset of results you care about.
Consider the following UDF:
Public Function EndsInHex(r As Range) As Boolean
Dim s As String, CH As String
s = r(1).Text
CH = Right(s, 1)
If CH Like "[A-F]" Or CH Like "[0-9]" Then
EndsInHex = True
Else
EndsInHex = False
End If
End Function
For the string to end in a hex, the last character must be a hex.

Excel: last character/string match in a string

Is there an efficient way to identify the last character/string match in a string using base functions? I.e. not the last character/string of the string, but the position of a character/string's last occurrence in a string. Search and find both work left-to-right so I can't think how to apply without lengthy recursive algorithm. And this solution now seems obsolete.
I think I get what you mean. Let's say for example you want the right-most \ in the following string (which is stored in cell A1):
Drive:\Folder\SubFolder\Filename.ext
To get the position of the last \, you would use this formula:
=FIND("#",SUBSTITUTE(A1,"\","#",(LEN(A1)-LEN(SUBSTITUTE(A1,"\","")))/LEN("\")))
That tells us the right-most \ is at character 24. It does this by looking for "#" and substituting the very last "\" with an "#". It determines the last one by using
(len(string)-len(substitute(string, substring, "")))\len(substring)
In this scenario, the substring is simply "\" which has a length of 1, so you could leave off the division at the end and just use:
=FIND("#",SUBSTITUTE(A1,"\","#",LEN(A1)-LEN(SUBSTITUTE(A1,"\",""))))
Now we can use that to get the folder path:
=LEFT(A1,FIND("#",SUBSTITUTE(A1,"\","#",LEN(A1)-LEN(SUBSTITUTE(A1,"\","")))))
Here's the folder path without the trailing \
=LEFT(A1,FIND("#",SUBSTITUTE(A1,"\","#",LEN(A1)-LEN(SUBSTITUTE(A1,"\",""))))-1)
And to get just the filename:
=MID(A1,FIND("#",SUBSTITUTE(A1,"\","#",LEN(A1)-LEN(SUBSTITUTE(A1,"\",""))))+1,LEN(A1))
However, here is an alternate version of getting everything to the right of the last instance of a specific character. So using our same example, this would also return the file name:
=TRIM(RIGHT(SUBSTITUTE(A1,"\",REPT(" ",LEN(A1))),LEN(A1)))
How about creating a custom function and using that in your formula? VBA has a built-in function, InStrRev, that does exactly what you're looking for.
Put this in a new module:
Function RSearch(str As String, find As String)
RSearch = InStrRev(str, find)
End Function
And your function will look like this (assuming the original string is in B1):
=LEFT(B1,RSearch(B1,"\"))
New Answer | 31-3-2022:
With even newer functions come even shorter answers. At time of writing in BETA, but probably widely available in the near future, we can use TEXTBEFORE():
=LEN(TEXTBEFORE(A2,B2,-1))+1
The trick here is that the 3rd parameter tells the function to retrieve the last occurence of the substring we give in the 2nd parameter. At time of writing this function is still case-sensitive by default which could be handeld by the optional 4th parameter.
Original Answer | 17-6-2020:
With newer versions of excel come new functions and thus new methods. Though it's replicable in older versions (yet I have not seen it before), when one has Excel O365 one can use:
=MATCH(2,1/(MID(A1,SEQUENCE(LEN(A1)),1)="Y"))
This can also be used to retrieve the last position of (overlapping) substrings:
=MATCH(2,1/(MID(A1,SEQUENCE(LEN(A1)),2)="YY"))
| Value | Pattern | Formula | Position |
|--------|---------|------------------------------------------------|----------|
| XYYZ | Y | =MATCH(2,1/(MID(A2,SEQUENCE(LEN(A2)),1)="Y")) | 3 |
| XYYYZ | YY | =MATCH(2,1/(MID(A3,SEQUENCE(LEN(A3)),2)="YY")) | 3 |
| XYYYYZ | YY | =MATCH(2,1/(MID(A4,SEQUENCE(LEN(A4)),2)="YY")) | 4 |
Whilst this both allows us to no longer use an arbitrary replacement character and it allows overlapping patterns, the "downside" is the useage of an array.
Note: You can force the same behaviour in older Excel versions through either
=MATCH(2,1/(MID(A2,ROW(A1:INDEX(A:A,LEN(A2))),1)="Y"))
Entered through CtrlShiftEnter, or using an inline INDEX to get rid of implicit intersection:
=MATCH(2,INDEX(1/(MID(A2,ROW(A1:INDEX(A:A,LEN(A2))),1)="Y"),))
tigeravatar and Jean-François Corbett suggested to use this formula to generate the string right of the last occurrence of the "\" character
=TRIM(RIGHT(SUBSTITUTE(A1,"\",REPT(" ",LEN(A1))),LEN(A1)))
If the character used as separator is space, " ", then the formula has to be changed to:
=SUBSTITUTE(RIGHT(SUBSTITUTE(A1," ",REPT("{",LEN(A1))),LEN(A1)),"{","")
No need to mention, the "{" character can be replaced with any character that would not "normally" occur in the text to process.
Just came up with this solution, no VBA needed;
Find the last occurance of "_" in my example;
=IFERROR(FIND(CHAR(1);SUBSTITUTE(A1;"_";CHAR(1);LEN(A1)-LEN(SUBSTITUTE(A1;"_";"")));0)
Explained inside out;
SUBSTITUTE(A1;"_";"") => replace "_" by spaces
LEN( *above* ) => count the chars
LEN(A1)- *above* => indicates amount of chars replaced (= occurrences of "_")
SUBSTITUTE(A1;"_";CHAR(1); *above* ) => replace the Nth occurence of "_" by CHAR(1) (Nth = amount of chars replaced = the last one)
FIND(CHAR(1); *above* ) => Find the CHAR(1), being the last (replaced) occurance of "_" in our case
IFERROR( *above* ;"0") => in case no chars were found, return "0"
Hope this was helpful.
You could use this function I created to find the last instance of a string within a string.
Sure the accepted Excel formula works, but it's much too difficult to read and use. At some point you have to break out into smaller chunks so it's maintainable. My function below is readable, but that's irrelevant because you call it in a formula using named parameters. This makes using it simple.
Public Function FindLastCharOccurence(fromText As String, searchChar As String) As Integer
Dim lastOccur As Integer
lastOccur = -1
Dim i As Integer
i = 0
For i = Len(fromText) To 1 Step -1
If Mid(fromText, i, 1) = searchChar Then
lastOccur = i
Exit For
End If
Next i
FindLastCharOccurence = lastOccur
End Function
I use it like this:
=RIGHT(A2, LEN(A2) - FindLastCharOccurence(A2, "\"))
Considering a part of a Comment made by #SSilk my end goal has really been to get everything to the right of that last occurence an alternative approach with a very simple formula is to copy a column (say A) of strings and on the copy (say ColumnB) apply Find and Replace. For instance taking the example: Drive:\Folder\SubFolder\Filename.ext
This returns what remains (here Filename.ext) after the last instance of whatever character is chosen (here \) which is sometimes the objective anyway and facilitates finding the position of the last such character with a short formula such as:
=FIND(B1,A1)-1
I'm a little late to the party, but maybe this could help. The link in the question had a similar formula, but mine uses the IF() statement to get rid of errors.
If you're not afraid of Ctrl+Shift+Enter, you can do pretty well with an array formula.
String (in cell A1):
"one.two.three.four"
Formula:
{=MAX(IF(MID(A1,ROW($1:$99),1)=".",ROW($1:$99)))} use Ctrl+Shift+Enter
Result: 14
First,
ROW($1:$99)
returns an array of integers from 1 to 99: {1,2,3,4,...,98,99}.
Next,
MID(A1,ROW($1:$99),1)
returns an array of 1-length strings found in the target string, then returns blank strings after the length of the target string is reached: {"o","n","e",".",..."u","r","","",""...}
Next,
IF(MID(I16,ROW($1:$99),1)=".",ROW($1:$99))
compares each item in the array to the string "." and returns either the index of the character in the string or FALSE: {FALSE,FALSE,FALSE,4,FALSE,FALSE,FALSE,8,FALSE,FALSE,FALSE,FALSE,FALSE,14,FALSE,FALSE.....}
Last,
=MAX(IF(MID(I16,ROW($1:$99),1)=".",ROW($1:$99)))
returns the maximum value of the array: 14
Advantages of this formula is that it is short, relatively easy to understand, and doesn't require any unique characters.
Disadvantages are the required use of Ctrl+Shift+Enter and the limitation on string length. This can be worked around with a variation shown below, but that variation uses the OFFSET() function which is a volatile (read: slow) function.
Not sure what the speed of this formula is vs. others.
Variations:
=MAX((MID(A1,ROW(OFFSET($A$1,,,LEN(A1))),1)=".")*ROW(OFFSET($A$1,,,LEN(A1)))) works the same way, but you don't have to worry about the length of the string
=SMALL(IF(MID(A1,ROW($1:$99),1)=".",ROW($1:$99)),2) determines the 2nd occurrence of the match
=LARGE(IF(MID(A1,ROW($1:$99),1)=".",ROW($1:$99)),2) determines the 2nd-to-last occurrence of the match
=MAX(IF(MID(I16,ROW($1:$99),2)=".t",ROW($1:$99))) matches a 2-character string **Make sure you change the last argument of the MID() function to the number of characters in the string you wish to match!
In newer versions of Excel (2013 and up) flash fill might be a simple and quick solution see: Using Flash Fill in Excel
.
For a string in A1 and substring in B1, use:
=XMATCH(B1,MID(A1,SEQUENCE(LEN(A1)),LEN(B1)),,-1)
Working from inside out, MID(A1,SEQUENCE(LEN(A1)),LEN(B1)) splits string A1 into a dynamic array of substrings, each the length of B1. To find the position of the last occurrence of substring B1, we use XMATCH with its Search_mode argument set to -1.
A simple way to do that in VBA is:
YourText = "c:\excel\text.txt"
xString = Mid(YourText, 2 + Len(YourText) - InStr(StrReverse(YourText), "\" ))
Very late to the party, but A simple solution is using VBA to create a custom function.
Add the function to VBA in the WorkBook, Worksheet, or a VBA Module
Function LastSegment(S, C)
LastSegment = Right(S, Len(S) - InStrRev(S, C))
End Function
Then the cell formula
=lastsegment(B1,"/")
in a cell and the string to be searched in cell B1 will populate the cell with the text trailing the last "/" from cell B1.
No length limit, no obscure formulas.
Only downside I can think is the need for a macro-enabled workbook.
Any user VBA Function can be called this way to return a value to a cell formula, including as a parameter to a builtin Excel function.
If you are going to use the function heavily you'll want to check for the case when the character is not in the string, then string is blank, etc.
If you're only looking for the position of the last instance of character "~" then
=len(substitute(String,"~",""))+1
I'm sure there is version that will work with the last instance of a string but I have to get back to work.
Cell A1 = find/the/position/of/the last slash
simple way to do it is reverse the text and then find the first slash as normal. Now you can get the length of the full text minus this number.
Like so:
=LEN(A1)-FIND("/",REVERSETEXT(A1),1)+1
This returns 21, the position of the last /

Resources