Find entire number between # and space at variable places within a cell - excel

How can I find an entire number between a "#" and a space when that combination could appear anywhere in a given cell?
Example cell contents:
"This is a #123 Test that I 45 like to run"
"This is a #45 Test that I 98 like to run"
I need to return "123" from the first one and "45" from the second one.
Using Mid(), I can return the "1", but the number between the # and the space can vary in length. There will generally be a #, then one or more digits, then a space.
As a secondary issue, there may be scenarios where there is no "#", but I still need to find the first numeric value in the cell and return it (e.g. "1", "34", "648").
Any advice on either of these challenges is greatly appreciated.

This should work as well:
=MID(A11,FIND("#",A11,1)+1,FIND(" ",A11,FIND("#",A11,1)+1)-FIND("#",A11,1)-1)
It works by locating the hash and the first space after it, then returning the characters in between. It does not cover the secondary question.

Since you've put the excel-vba tag on your question, here's a vba way of doing it using regular expressions that should satisfy both your primary and secondary issues:
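Sub tmp()
    ' Requires a reference to "Microsoft VBScript Regular Expressions 5.5"
    Dim regEx As New RegExp
    Dim mat As Object
    Dim i As Long
    ' Lazily skip ahead to the first run of digits, with an optional "#" in front
    regEx.Pattern = "^.*?\#?(\d+)"
    For i = 1 To Range("A" & Rows.Count).End(xlUp).Row
        Set mat = regEx.Execute(Cells(i, 1).Value)
        If mat.Count = 1 Then
            Cells(i, 2).Value = mat(0).SubMatches(0)
        End If
    Next i
End Sub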
The regular expression uses a non-greedy character search (the "?" at the end of ".*?" is what does that) to find the first pattern in the cell that matches either "#123" or just "123", where "123" stands for any sequence of digits.
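For anyone who wants to try the pattern outside Excel, here is a minimal Python sketch of the same idea. The function name first_number is made up for illustration, and Python's re module stands in for the VBScript RegExp object, so treat it as a way to test the pattern rather than a replacement for the macro.

```python
import re

# Same pattern as the macro: lazily skip text, allow an optional "#",
# then capture the first run of digits.
PATTERN = re.compile(r"^.*?#?(\d+)")

def first_number(text):
    """Return the first run of digits in text, or None if there is none."""
    match = PATTERN.match(text)
    return match.group(1) if match else None

print(first_number("This is a #123 Test that I 45 like to run"))
print(first_number("This is a #45 Test that I 98 like to run"))
```

Because the "#" is optional, a cell with no hash at all still yields its first number, which is what the secondary question asks for.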

This will return the first number in a string:
=--LEFT(MID(A1,AGGREGATE(15,6,FIND({1,2,3,4,5,6,7,8,9,0},A1),1),LEN(A1)),FIND(" ",MID(A1,AGGREGATE(15,6,FIND({1,2,3,4,5,6,7,8,9,0},A1),1),LEN(A1))))
AGGREGATE was introduced in Excel 2010. If you do not have it, you will need to use this array formula instead:
=--LEFT(MID(A1,MIN(IFERROR(FIND({1,2,3,4,5,6,7,8,9,0},A1),1E+99)),LEN(A1)),FIND(" ",MID(A1,MIN(IFERROR(FIND({1,2,3,4,5,6,7,8,9,0},A1),1E+99)),LEN(A1))))
Being an array formula, it needs to be confirmed with Ctrl-Shift-Enter instead of Enter when exiting edit mode. If done correctly, Excel will put {} around the formula.


Remove text from excel cell before first occurrence of special character [duplicate]

Is there an efficient way to identify the last character/string match in a string using base functions? I.e. not the last character/string of the string, but the position of a character/string's last occurrence in a string. SEARCH and FIND both work left-to-right, so I can't think how to apply them without a lengthy recursive algorithm. And this solution now seems obsolete.
I think I get what you mean. Let's say for example you want the right-most \ in the following string (which is stored in cell A1):
To get the position of the last \, you would use this formula:
That tells us the right-most \ is at character 24. It does this by substituting the very last "\" with a "#" and then looking for that "#". It determines which occurrence is the last one by counting the occurrences with
(len(string)-len(substitute(string, substring, "")))/len(substring)
In this scenario, the substring is simply "\", which has a length of 1, so you could leave off the division at the end and just use:
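The counting trick is easier to follow when it is spelled out step by step, so here is a rough Python sketch of it. All three function names are hypothetical, the marker character is an assumption (any character that cannot occur in the text will do), and the sample path is the one quoted elsewhere in this thread.

```python
def count_occurrences(text, sub):
    # Mirrors (LEN(text) - LEN(SUBSTITUTE(text, sub, ""))) / LEN(sub)
    return (len(text) - len(text.replace(sub, ""))) // len(sub)

def replace_nth(text, sub, new, n):
    # Mirrors SUBSTITUTE(text, sub, new, n): only the n-th match is replaced
    start = 0
    for _ in range(n):
        pos = text.find(sub, start)
        if pos == -1:
            return text
        start = pos + len(sub)
    return text[:pos] + new + text[pos + len(sub):]

def last_position(text, sub, marker="\x01"):
    # Replace the last occurrence with a marker, then look for the marker
    n = count_occurrences(text, sub)
    if n == 0:
        return 0
    return replace_nth(text, sub, marker, n).find(marker) + 1  # 1-based

print(last_position(r"Drive:\Folder\SubFolder\Filename.ext", "\\"))
```

The count of occurrences doubles as the instance number of the last one, which is the whole point of the formula.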
Now we can use that to get the folder path:
Here's the folder path without the trailing \
And to get just the filename:
However, here is an alternate version of getting everything to the right of the last instance of a specific character. So using our same example, this would also return the file name:
How about creating a custom function and using that in your formula? VBA has a built-in function, InStrRev, that does exactly what you're looking for.
Put this in a new module:
Function RSearch(str As String, find As String)
RSearch = InStrRev(str, find)
End Function
And your function will look like this (assuming the original string is in B1):
New Answer | 31-3-2022:
With even newer functions come even shorter answers. At time of writing in BETA, but probably widely available in the near future, we can use TEXTBEFORE():
The trick here is that the 3rd parameter tells the function to retrieve the last occurrence of the substring given in the 2nd parameter. At the time of writing this function is case-sensitive by default, which can be handled by the optional 4th parameter.
Original Answer | 17-6-2020:
With newer versions of Excel come new functions and thus new methods. Although it can be replicated in older versions (I have not seen it done before), with Excel O365 one can use:
This can also be used to retrieve the last position of (overlapping) substrings:
| Value | Pattern | Formula | Position |
|---|---|---|---|
| XYYZ | Y | =MATCH(2,1/(MID(A2,SEQUENCE(LEN(A2)),1)="Y")) | 3 |
| XYYYZ | YY | =MATCH(2,1/(MID(A3,SEQUENCE(LEN(A3)),2)="YY")) | 3 |
| XYYYYZ | YY | =MATCH(2,1/(MID(A4,SEQUENCE(LEN(A4)),2)="YY")) | 4 |
Whilst this removes the need for an arbitrary replacement character and also handles overlapping patterns, the "downside" is the usage of an array.
Note: You can force the same behaviour in older Excel versions through either
Entered through Ctrl+Shift+Enter, or using an inline INDEX to get rid of implicit intersection:
tigeravatar and Jean-François Corbett suggested to use this formula to generate the string right of the last occurrence of the "\" character
If the character used as separator is space, " ", then the formula has to be changed to:
Needless to say, the "{" character can be replaced with any character that would not "normally" occur in the text to process.
Just came up with this solution, no VBA needed;
Find the last occurrence of "_" in my example;
Explained inside out;
SUBSTITUTE(A1;"_";"") => remove every "_" (replace it with an empty string)
LEN( *above* ) => count the remaining chars
LEN(A1)- *above* => indicates amount of chars removed (= occurrences of "_")
SUBSTITUTE(A1;"_";CHAR(1); *above* ) => replace the Nth occurrence of "_" by CHAR(1) (Nth = amount of chars removed = the last one)
FIND(CHAR(1); *above* ) => find the CHAR(1), being the last (replaced) occurrence of "_" in our case
IFERROR( *above* ;"0") => in case no chars were found, return "0"
Hope this was helpful.
You could use this function I created to find the last instance of a string within a string.
Sure the accepted Excel formula works, but it's much too difficult to read and use. At some point you have to break out into smaller chunks so it's maintainable. My function below is readable, but that's irrelevant because you call it in a formula using named parameters. This makes using it simple.
Public Function FindLastCharOccurence(fromText As String, searchChar As String) As Integer
Dim lastOccur As Integer
lastOccur = -1
Dim i As Integer
i = 0
For i = Len(fromText) To 1 Step -1
If Mid(fromText, i, 1) = searchChar Then
lastOccur = i
Exit For
End If
Next i
FindLastCharOccurence = lastOccur
End Function
I use it like this:
=RIGHT(A2, LEN(A2) - FindLastCharOccurence(A2, "\"))
Considering part of a comment made by @SSilk ("my end goal has really been to get everything to the right of that last occurrence"), an alternative approach with a very simple formula is to copy a column (say A) of strings and apply Find and Replace to the copy (say column B). For instance, taking the example: Drive:\Folder\SubFolder\Filename.ext
This returns what remains (here Filename.ext) after the last instance of whatever character is chosen (here \), which is sometimes the objective anyway, and makes it easy to find the position of the last such character with a short formula such as:
I'm a little late to the party, but maybe this could help. The link in the question had a similar formula, but mine uses the IF() statement to get rid of errors.
If you're not afraid of Ctrl+Shift+Enter, you can do pretty well with an array formula.
String (in cell A1):
{=MAX(IF(MID(A1,ROW($1:$99),1)=".",ROW($1:$99)))} use Ctrl+Shift+Enter
Result: 14
ROW($1:$99) returns an array of integers from 1 to 99: {1,2,3,4,...,98,99}.
MID(A1,ROW($1:$99),1) returns an array of 1-length strings found in the target string, then returns blank strings after the length of the target string is reached: {"o","n","e",".",..."u","r","","",""...}
IF(MID(A1,ROW($1:$99),1)=".",ROW($1:$99)) compares each item in the array to the string "." and returns either the index of the character in the string or FALSE: {FALSE,FALSE,FALSE,4,FALSE,FALSE,FALSE,8,FALSE,FALSE,FALSE,FALSE,FALSE,14,FALSE,FALSE.....}
MAX(...) returns the maximum value of the array: 14
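The breakdown above maps onto ordinary list operations, which may help if array formulas feel opaque. This is only a Python sketch of the logic: the function name is invented, and the sample string is an assumption chosen to be consistent with the arrays shown (dots at positions 4, 8 and 14).

```python
def last_index_array_style(text, char, window=99):
    rows = range(1, window + 1)                      # ROW($1:$99)
    chars = [text[i - 1:i] for i in rows]            # MID(A1,ROW($1:$99),1); "" past the end
    hits = [i if c == char else False                # IF(...=".",ROW($1:$99))
            for i, c in zip(rows, chars)]
    # MAX(...) skips the FALSE entries; 0 is returned when nothing matched
    return max((h for h in hits if h is not False), default=0)

print(last_index_array_style("one.two.three.four", "."))
```

The window argument plays the role of the hard-coded 99, so a match beyond that position is missed in the sketch just as it is in the formula.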
Advantages of this formula are that it is short, relatively easy to understand, and doesn't require any unique characters.
Disadvantages are the required use of Ctrl+Shift+Enter and the limitation on string length. This can be worked around with a variation shown below, but that variation uses the OFFSET() function which is a volatile (read: slow) function.
Not sure what the speed of this formula is vs. others.
=MAX((MID(A1,ROW(OFFSET($A$1,,,LEN(A1))),1)=".")*ROW(OFFSET($A$1,,,LEN(A1)))) works the same way, but you don't have to worry about the length of the string
=SMALL(IF(MID(A1,ROW($1:$99),1)=".",ROW($1:$99)),2) determines the 2nd occurrence of the match
=LARGE(IF(MID(A1,ROW($1:$99),1)=".",ROW($1:$99)),2) determines the 2nd-to-last occurrence of the match
=MAX(IF(MID(I16,ROW($1:$99),2)=".t",ROW($1:$99))) matches a 2-character string. Make sure you change the last argument of the MID() function to the number of characters in the string you wish to match!
In newer versions of Excel (2013 and up), Flash Fill might be a simple and quick solution; see: Using Flash Fill in Excel
For a string in A1 and substring in B1, use:
Working from inside out, MID(A1,SEQUENCE(LEN(A1)),LEN(B1)) splits string A1 into a dynamic array of substrings, each the length of B1. To find the position of the last occurrence of substring B1, we use XMATCH with its Search_mode argument set to -1.
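To make the windowing idea concrete, here is a small Python sketch of the same two steps: cut the string into windows the length of the substring, then search them last-to-first. The function name is hypothetical, and it returns 0 where XMATCH would return #N/A.

```python
def last_substring_position(text, sub):
    # MID(A1,SEQUENCE(LEN(A1)),LEN(B1)): one window per start position
    windows = [text[i:i + len(sub)] for i in range(len(text))]
    # A search mode of -1 means: scan from the last window back to the first
    for pos in range(len(windows), 0, -1):
        if windows[pos - 1] == sub:
            return pos  # 1-based, like Excel
    return 0

for value, pattern in [("XYYZ", "Y"), ("XYYYZ", "YY"), ("XYYYYZ", "YY")]:
    print(value, pattern, last_substring_position(value, pattern))
```

Since every start position gets its own window, overlapping matches such as "YY" inside "YYY" are found without any special handling.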
A simple way to do that in VBA is:
YourText = "c:\excel\text.txt"
xString = Mid(YourText, 2 + Len(YourText) - InStr(StrReverse(YourText), "\"))
Very late to the party, but a simple solution is to use VBA to create a custom function.
Add the function to VBA in the workbook, worksheet, or a VBA module
Function LastSegment(S, C)
LastSegment = Right(S, Len(S) - InStrRev(S, C))
End Function
Then the cell formula
in a cell and the string to be searched in cell B1 will populate the cell with the text trailing the last "/" from cell B1.
No length limit, no obscure formulas.
The only downside I can think of is the need for a macro-enabled workbook.
Any user VBA Function can be called this way to return a value to a cell formula, including as a parameter to a builtin Excel function.
If you are going to use the function heavily you'll want to check for the case when the character is not in the string, the string is blank, etc.
If you're only looking for the position of the last instance of character "~" then
I'm sure there is a version that will work with the last instance of a string but I have to get back to work.
Cell A1 = find/the/position/of/the last slash
A simple way to do it is to reverse the text and then find the first slash as normal. Now you can get the position by taking the length of the full text minus this number, plus 1.
Like so:
This returns 21, the position of the last /
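The arithmetic is easy to get off by one, so here is a quick Python check of the reverse-and-find idea. The function name is made up for this sketch; the position is the text length minus the 1-based position of the first slash in the reversed text, plus one.

```python
def last_slash_position(text, sep="/"):
    # 1-based position of the first separator in the reversed text (0 if absent)
    first_in_reversed = text[::-1].find(sep) + 1
    if first_in_reversed == 0:
        return 0
    return len(text) - first_in_reversed + 1

print(last_slash_position("find/the/position/of/the last slash"))
```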

Prevent Partial Duplicates in Excel

I have a worksheet with products where the people in my office can add new positions. The problem we're running into is that the products have specifications but not everybody puts them in (or inputs them wrong).
"cool product 14C"
Is there a way to set up the Data Validation option so that it warns me when I enter "very cool product 14B" or anything that contains an already existing string of characters (say, longer than 4), like "cool produKt 14C" but also "good product 15" and so on?
I know that I can prevent 100% matches using COUNTIF and spot words that start/end in the same way using LEFT/RIGHT but I need to spot partial matches within the entries as well.
Thanks a lot!
If you want to cover typos, word wraps, figure permutations etc., maybe a SOUNDEX algorithm would suit your problem. Here's an implementation for Excel ...
So if you insert this as a user defined function, and create a column =SOUNDEX(A1) for each product row, upon entry of a new product name you can filter for all product rows with same SOUNDEX value. You can further automate this by letting user enter the new name into a dialog form first, do the validation, present them a Combo Box dropdown with possible duplicates, etc. etc. etc.
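To show what a SOUNDEX key looks like, here is a short Python sketch of the standard American Soundex rules. It is not the Excel implementation referred to above, just an illustration; the function name and the choice to ignore digits and spaces are assumptions of this sketch.

```python
def soundex(name):
    codes = {}
    for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                           ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in letters:
            codes[ch] = digit
    letters = [c for c in name.upper() if "A" <= c <= "Z"]  # drop digits, spaces
    if not letters:
        return ""
    result = letters[0]
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate equal codes
        code = codes.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]

print(soundex("cool product 14C"), soundex("cool produKt 14C"))
```

Both spellings get the same key, so filtering on it would flag the typo. Note that the key covers only the first few consonants, so long names that share an opening also collide.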
A small function to find parts of strings terminated by blanks in a range (in answer to your comment):
Function FindSplit(Arg As Range, LookRange As Range) As String
    Dim LookFor() As String, LookCell As Range
    Dim Idx As Long
    LookFor = Split(Arg)
    FindSplit = ""
    For Idx = 0 To UBound(LookFor)
        For Each LookCell In LookRange.Cells
            If InStr(1, LookCell, LookFor(Idx)) <> 0 Then
                If FindSplit <> "" Then FindSplit = FindSplit & ", "
                FindSplit = FindSplit & LookFor(Idx) & ":" & LookCell.Row
            End If
        Next LookCell
    Next Idx
    If FindSplit = "" Then FindSplit = "Cool entry!"
End Function
This is a bit crude ... but what it does is the following
split a single cell argument in pieces and put it into an array --> split()
process each piece --> For Idx = ...
search another range for strings that contain the piece --> For Each ...
add piece and row number of cell where it was found into a result string
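The four steps can be sketched in Python as follows, in case that makes the flow easier to see. Here a plain dict of row number to cell text stands in for the worksheet range, and both that representation and the sample data are assumptions made for illustration only.

```python
def find_split(arg, look_range):
    """look_range maps row number -> cell text (stand-in for an Excel range)."""
    hits = []
    for piece in arg.split(" "):                  # Split(Arg)
        for row, cell in look_range.items():      # For Each LookCell ...
            if piece in cell:                     # InStr(...) <> 0
                hits.append(f"{piece}:{row}")     # piece and row number
    return ", ".join(hits) if hits else "Cool entry!"

existing = {3: "asd fgh", 4: "qwe wer", 5: "zzz"}
print(find_split("asd wer", existing))
print(find_split("xyz", existing))
```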
You can enter/copy this as a formula next to each cell input and know immediately if you've done a cool input or not.
Value of cell D8 is [asd:3, wer:4]
Note the use of absolute addressing in the start of lookup range; this way you can copy the formula well down.
edit 17-Mar-2015
further to comment Joanna 17-Mar-2015, if the search argument is part of the range you're scanning, e.g. =FINDSPLIT(C5; C1:C12) you want to make sure that the If Instr(...) doesn't hit if LookCell and LookFor(Idx) are really the same cell as this would create a false positive. So you would rewrite the statement to
If InStr(1, LookCell, LookFor(Idx)) <> 0 And _
Not (LookCell.Row = Arg.Row And LookCell.Column = Arg.Column) _
Then
Do not use a complete column (e.g. $C:$C) as the second argument, as the function tends to become very slow without further precautions.

Excel Substrings

I have two unordered sets of data here:
blah blah:2020:50::7.1:45
movie blah:blahbah, The:1914:54:
I want to extract all the data to the left of the year (aka, 1915 and 1914).
What excel formula would I use for this?
I tried this formula
these were the results below:
: blahblah, The:1914:54::7
This is because there is a colon in the movie title.
The results I need consistently are:
Can someone help with this?
You can use Regular Expressions, make sure you include a reference for it in your VBA editor. The following UDF will do the job.
Function ExtractNumber(cell As Range) As String
ExtractNumber = ""
Dim rex As New RegExp
rex.Pattern = "(:\d{4}:\d{2}::\d\.\d:\d{2}::\d:\d:\d:\d:\d:\d:\d)"
rex.Global = True
Dim mtch As Object, sbmtch As Object
For Each mtch In rex.Execute(cell.Value)
ExtractNumber = ExtractNumber & mtch.SubMatches(0)
Next mtch
End Function
Without VBA:
In reality you don't want to find the ":"; you want to find either ":1" or ":2", since the year will start with either 1 or 2. This formula should do it:
Look for a four digit string, in a certain range, bounded by colons.
For example:
=MID(A1,MIN(FIND(":" &ROW(INDIRECT("1900:2100"))&":",A1 &":" &ROW(INDIRECT("1900:2100"))&":")),99)
Entered as an array formula (hold down Ctrl+Shift while hitting Enter), this matches only years in the range 1900 to 2100. Change those values as appropriate for your data. The 99 at the end is the maximum number of characters returned; it can be increased as required.
You can use the same approach to return just the left hand part, up to the colon preceding the year:
=LEFT(A1,-1+MIN(FIND(":" &ROW(INDIRECT("1900:2100"))&":",A1 &":" &ROW(INDIRECT("1900:2100"))&":")))
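Here is a rough Python equivalent of both formulas, mainly to show why the colon inside the movie title no longer matters. The function name and the tuple return are choices made for this sketch, and unlike the MID version it does not cap the right-hand part at 99 characters.

```python
def split_at_year(text, first=1900, last=2100):
    # FIND(":" & year & ":", text) for every candidate year; keep the hits
    hits = [text.find(f":{year}:") for year in range(first, last + 1)]
    hits = [h for h in hits if h != -1]
    if not hits:
        return text, ""          # no year found: everything stays on the left
    cut = min(hits)              # MIN(...) picks the earliest match
    return text[:cut], text[cut:]

print(split_at_year("movie blah:blahbah, The:1914:54:"))
print(split_at_year("blah blah:2020:50::7.1:45"))
```

The first element corresponds to the LEFT formula and the second to the MID formula.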
Here is a screen shot, showing the original data in B1:B2, with the results of the first part in B4:B5, and the formula for B4 showing in the formula bar.
The results for the 2nd part are in B7:B9

Calculate alphanumeric string to an integer in Excel

I have an issue that I've not been able to figure out even with many of the ideas presented in other posts. My data comes in Excel and here are examples of each manner that any given cell might have the data:
4days 4hrs 41mins 29seconds
23hrs 43mins 4seconds
2hrs 2mins
52mins 16seconds
The end result would be to calculate the total minutes while allowing seconds to be ignored, so that the previous values would end up as follows:
Would anyone have an idea how to go about that?
Thanks for the assistance!
Bit tedious (and assumes units are always plural - also produces results in different order to example) but, with formulae only, if your data is in column A, in B1 and copied down:
="="&SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1,"days","*1440+"),"hrs","*60+"),"mins","*1+"),"seconds","*0")," ","")&0
then Copy B and Paste Special values into C and apply Text to Columns to C with Tab as the delimiter.
This array formula** should also work:
=SUM(IFERROR(0+MID(REPT(" ",31)&SUBSTITUTE(A1&"dayhrminsecond"," ",REPT(" ",31)),FIND({"day","hr","min","second"},REPT(" ",31)&SUBSTITUTE(A1&"dayhrminsecond"," ",REPT(" ",31)))-31,31),0)*{1440,60,1,0})
**Array formulas are not entered in the same way as 'standard' formulas. Instead of pressing just ENTER, you first hold down CTRL and SHIFT, and only then press ENTER. If you've done it correctly, you'll notice Excel puts curly brackets {} around the formula (though do not attempt to manually insert these yourself).
The easiest option is probably VBA with a regular expression. You can then easily find each of the fields, and do the maths.
If you want to stick to "pure" Excel, then it seems the only option is to use SEARCH or FIND to find the position of each of "days", "hrs", "mins" in the text (you may have to check if they're always plural). Then use MID with the positions found above to extract the different components. See for similar examples.
But there's quite a bit of work to handle the cases where some components are missing, so either you'll use quite a few cells, or you'll get a very complex formula...
Here is a User Defined Function, written in VBA, which takes your string as the argument and returns the number of minutes. Only the first characters of the time interval names are checked (e.g. d, h, m) as this seems to provide sufficient discrimination.
To enter this User Defined Function (UDF), open the Visual Basic Editor.
Ensure your project is highlighted in the Project Explorer window.
Then, from the top menu, select Insert/Module and
paste the code below into the window that opens.
To use this User Defined Function (UDF), enter a formula like
in some cell.
Option Explicit
Function SumMinutes(S As String) As Long
    Dim RE As Object, MC As Object
    Dim lMins As Long
    Dim I As Long
    Set RE = CreateObject("vbscript.regexp")
    With RE
        ' One capture group per unit: days, hours, minutes (seconds are ignored)
        .Pattern = "(\d+)(?=\s*d)|(\d+)(?=\s*h)|(\d+)(?=\s*m)"
        .Global = True
        .ignorecase = True
        If .test(S) = True Then
            Set MC = .Execute(S)
            For I = 0 To MC.Count - 1
                With MC(I)
                    ' Only one group matches per hit; the others contribute nothing
                    lMins = lMins + _
                        .submatches(0) * 1440 + _
                        .submatches(1) * 60 + _
                        .submatches(2)
                End With
            Next I
        End If
    End With
    SumMinutes = lMins
End Function
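To sanity-check the totals outside Excel, the same idea can be sketched in a few lines of Python. Note that this version captures the unit's first letter instead of using three lookahead groups as the UDF does, and the names sum_minutes and UNIT_MINUTES are made up for the sketch.

```python
import re

UNIT_MINUTES = {"d": 1440, "h": 60, "m": 1}  # seconds are deliberately ignored

def sum_minutes(text):
    total = 0
    # A number followed by a unit whose first letter is d, h or m
    for number, unit in re.findall(r"(\d+)\s*([dhm])", text, flags=re.IGNORECASE):
        total += int(number) * UNIT_MINUTES[unit.lower()]
    return total

for sample in ("4days 4hrs 41mins 29seconds", "23hrs 43mins 4seconds",
               "2hrs 2mins", "52mins 16seconds"):
    print(sample, "->", sum_minutes(sample))
```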

