Excel VBA: extracting an 8-digit sequence from a string in a cell

Good day everyone,
I am trying to find a smart way to extract an 8-digit number (a unique ID) from a cell. The problem is that the cell contents can look like any of these:
112, 65478411, sale
746, id65478411, sale 12.50
999, 65478411
999, id65478411
Those are most of the cases, and probably all of them, so I basically need to find the 8 digits in the cell and extract them into a different cell. Does anyone have any ideas? I thought of removing the leading characters, then checking whether the remainder starts with "id" and removing that too, but I realized that this is not a smart approach.
Thank you for the insights.

Try this formula:
=--TEXT(LOOKUP(10^8,MID(SUBSTITUTE(A1," ","x"),ROW(INDIRECT("1:"&LEN(A1)-7)),8)+0),"00000000")
This returns the 8-digit number in the string as a numeric value.
To return it as text instead:
=TEXT(LOOKUP(10^8,MID(SUBSTITUTE(A1," ","x"),ROW(INDIRECT("1:"&LEN(A1)-7)),8)+0),"00000000")
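The formulas above slide an 8-character window along the string. As a rough VBA sketch of that window idea (the function name FirstEightDigitWindow is hypothetical, and unlike the formula this version returns the first all-digit window it meets rather than relying on LOOKUP):

```vba
Option Explicit

' Hypothetical helper, not part of the answer above.
' Returns the first 8-character window of txt that consists only of digits,
' or an empty string if there is none.
Public Function FirstEightDigitWindow(ByVal txt As String) As String
    Dim i As Long
    ' the last possible window starts at Len(txt) - 7
    For i = 1 To Len(txt) - 7
        ' in a Like pattern, # matches exactly one digit
        If Mid$(txt, i, 8) Like "########" Then
            FirstEightDigitWindow = Mid$(txt, i, 8)
            Exit Function
        End If
    Next i
End Function
```

Entered in a cell as =FirstEightDigitWindow(A1), it returns the ID as text, so leading zeros would be kept.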

You can also write a UDF to accomplish this task; an example is below:
Public Function GetMy8Digits(cell As Range) As String
    Dim s As String
    Dim i As Long
    Dim answer As String
    Dim counter As Long
    'get cell value
    s = cell.Value
    'set the counter
    counter = 0
    'loop through the entire string
    For i = 1 To Len(s)
        'check to see if the character is a numeric one
        If IsNumeric(Mid(s, i, 1)) Then
            'append it to the answer (& always concatenates, + can add)
            answer = answer & Mid(s, i, 1)
            counter = counter + 1
            'check to see if we have reached 8 digits
            If counter = 8 Then
                GetMy8Digits = answer
                Exit Function
            End If
        Else
            'was not numeric so reset counter and answer
            counter = 0
            answer = ""
        End If
    Next i
End Function

Here is an alternative:
=RIGHT(TRIM(MID(SUBSTITUTE(A1,",",REPT(" ",LEN(A1))),LEN(A1),LEN(A1))),8)
How it works:
Replace every comma with a run of spaces as long as the original string.
Take a MID slice that starts at the length of the original string and runs for that same length; this window contains only the second field of the padded string.
TRIM removes the surrounding spaces.
RIGHT(...,8) takes the last 8 characters, dropping any extra prefix (such as "id").
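The same steps (split on commas, take the second field, trim it, keep the last 8 characters) could be sketched as a UDF. The name SecondFieldId is hypothetical, and the sketch assumes, as the sample data suggests, that the ID is always the second comma-separated field:

```vba
Option Explicit

' Hypothetical helper mirroring the formula's steps.
' Assumes the ID is the second comma-separated field of txt.
Public Function SecondFieldId(ByVal txt As String) As String
    Dim fields() As String
    fields = Split(txt, ",")            ' zero-based array of fields
    If UBound(fields) >= 1 Then
        ' Trim drops the padding spaces; Right keeps the last 8 characters
        SecondFieldId = Right$(Trim$(fields(1)), 8)
    End If
End Function
```

If the cell has no comma at all, the function returns an empty string instead of guessing.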

Related

VBA split-column-at-certain-length?

I have a question that is almost exactly the same as the thread linked below:
Excel VBA Split Column at Certain Length into Multiple Rows
However, what if I'd like to look for the last delimiter before splitting?
Example: the length limit is 10.
My string is: 11, 3344, 5566 and the result should be:
Row 1: 11,3344,
Row 2: 5566
Basically the rows don't need to be exactly 10 characters long (10 characters at most), but the split has to happen at the last delimiter before the text reaches 10 characters. Kindly help, please.
Start at the character limit (1000 in this example) and loop backward through the text, checking whether the current character equals the delimiter you want. If it does not, keep looping backward; if it does, stop the loop and record the position.
Sub tester()
    Dim test As String
    Dim current_char As String
    Dim final_char As Long
    Dim limit As Long
    Dim my_delimiter As String
    Dim x As Long
    '' your limit on the text
    limit = 1000
    my_delimiter = ","
    test = ActiveSheet.Range("A1").Value
    '' loop backward through the string; stop at 1 because Mid raises an error for position 0
    For x = limit To 1 Step -1
        '' check the current char if it is equal to what you want
        current_char = Mid(test, x, 1)
        If current_char = my_delimiter Then
            '' get the current position of the loop
            final_char = x
            Exit For
        End If
    Next x
End Sub
After retrieving the position (final_char), use the Left function with it:
Left(test, final_char)
Alternatively, pass Left$(myString, 10) to the splitter. That way you only look at the first 10 characters when doing the split.
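The backward search described above can also be done with the built-in InStrRev function, which searches from a given start position toward the beginning of the string. A minimal sketch follows; the function name SplitAtLastDelimiter is hypothetical, and it assumes maxLen is at least 1 and that the delimiter is a single character:

```vba
Option Explicit

' Hypothetical helper: cuts txt into pieces of at most maxLen characters,
' ending each piece at the last delimiter that falls inside the window.
' Assumes maxLen >= 1 and a single-character delimiter.
Public Function SplitAtLastDelimiter(ByVal txt As String, ByVal maxLen As Long, _
                                     Optional ByVal delim As String = ",") As Collection
    Dim parts As New Collection
    Dim cutPos As Long

    Do While Len(txt) > maxLen
        ' search backward, starting at position maxLen
        cutPos = InStrRev(txt, delim, maxLen)
        ' no delimiter inside the window: fall back to a hard cut
        If cutPos = 0 Then cutPos = maxLen
        parts.Add Left$(txt, cutPos)
        txt = Mid$(txt, cutPos + 1)
    Loop
    If Len(txt) > 0 Then parts.Add txt

    Set SplitAtLastDelimiter = parts
End Function
```

For the sample string "11, 3344, 5566" with a limit of 10, this yields two pieces, "11, 3344," and " 5566" (the leading space is kept; wrap the pieces in Trim if that is not wanted).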

Extracting a substring from a cell at a certain index (different row length, multiple rows)

I work as an accountant, and I'm wondering whether there's a way to extract a certain string of characters, at a certain index, from cells of varying lengths. Sometimes I receive a statement of account as a PDF, which doesn't convert nicely to an Excel spreadsheet.
This is what I want to achieve
The strings are at the same index within each cell. I can't quite figure out how to isolate them, though, because each string is a different length, and the substring I want doesn't start with the same character. I've tried LEN, MID, etc. to extract them, but I'm not sure how to do it.
I was able to extract items that start with specific characters (like INV and "45") with:
=TRIM(MID(SUBSTITUTE(A1," ",REPT(" ",99)),MAX(1,FIND("XXX",SUBSTITUTE(A1," ",REPT(" ",99)))-50),99))
But I can't figure out how to get the strings I'm looking for (in this case, the amounts for the invoices).
If you don't mind some VBA, you can try this:
The basic problem with the data is that the dates included in the fields are of variable length and contain the only real delimiter (spaces) within those elements. Counting spaces from the right can be used to return your fields.
Credit to Sumit Bansal (https://trumpexcel.com) for the original LastPosition() function. However, that function only returns the last occurrence of a character.
With a couple of slight modifications, LastPositionN() now returns the nth occurrence of the search text counted from the right, with n specified as a third parameter.
Note: See here for the original function, and for information on pasting it into the VBA editor: https://trumpexcel.com/find-characters-last-position/
Function LastPosition(rCell As Range, rChar As String) As Long
    'This function gives the last position of the specified character
    'This code has been developed by Sumit Bansal (https://trumpexcel.com)
    Dim rLen As Long, i As Long
    rLen = Len(rCell)
    For i = rLen To 1 Step -1
        'compare the character at position i itself, so the final character
        'is checked and Mid is never called with position 0
        If Mid(rCell, i, 1) = rChar Then
            LastPosition = i
            Exit Function
        End If
    Next i
End Function
Function LastPositionN(rCell As Range, rChar As String, Optional nNthOccurance As Integer = 1) As Long
    'This function gives the nth-from-last position of the specified character
    'This code has been developed by Sumit Bansal
    ' (https://trumpexcel.com)
    ' and was modified by Jeff Bowman 9/11/2018 for nth position from right
    'Returns zero (0) if the nth occurrence is not found
    Dim rLen As Long, nNthCount As Long, i As Long
    nNthCount = 0
    rLen = Len(rCell)
    For i = rLen To 1 Step -1
        If Mid(rCell, i, 1) = rChar Then
            nNthCount = nNthCount + 1
            'only set the return value once the nth occurrence is reached
            If nNthCount = nNthOccurance Then
                LastPositionN = i
                Exit Function
            End If
        End If
    Next i
End Function
Now, with left(), right(), and find() we can isolate and parse the end of the data string.
=LEFT(RIGHT(A8,LastPositionN(A8," ",1)),FIND(" ",RIGHT(A8,LastPositionN(A8," ",1))))

text to columns: split at the first number in the value

I have one column with about 60 cells whose values are of different lengths. Each (or at least most) of the values contains numeric characters. I want to split the column's cells into more columns, which I would normally do with Excel's 'Text to Columns' function.
But this function has no advanced option for splitting the value at the first numeric character. Splitting on spaces, commas, etc. is possible, but that does not help me.
Is there any way to divide the cells into 2 columns at the first number in the cell value?
I have looked at numerous other questions, but none of them (nor other internet forums) helped me split the value at the first number in the cell value.
Thanks @quantum285 for the answer. This routine works if the string contains one number. I changed the test string to firstpart323secondpart;
part1 then returns 'firstpart32' and part2 returns 'secondpart'.
I tried to understand what happens in this code, please correct me if I'm wrong:
First, the length of the string is determined.
Second, each position in the string is checked to see whether it is numeric. But is this check done from right to left? In the case of firstpart323secondpart, the length is 22.
Then IsNumeric() checks every position from 1 to 22 and stops when it finds a number?
If so, part 1 is the text before position i, where i is the first position from right to left in the string where a number is found,
and part 2 is then the string to the right of that same position.
However, I am looking for a routine that finds the first position from left to right (or the last position from right to left) where a number occurs, ...
So I changed the routine, simply adjusting the For i = 1 To line:
Sub test()
    For j = 4 To Cells(Rows.Count, 4).End(xlUp).Row
        For i = Len(Cells(j, 4)) To 1 Step -1
            If IsNumeric(Mid(Cells(j, 4), i, 1)) Then
                Cells(j, 5) = Left(Cells(j, 4), i - 1)
                Cells(j, 6) = (Right(Cells(j, 4), Len(Cells(j, 4)) - i + 1))
            End If
        Next i
    Next j
End Sub
This works almost perfectly (except for a few cells that have multiple number combinations in the value, such as: soup 10g 20boxes). As there are only a few of these, I can adjust them by hand.
Thanks!
Sub test()
    testString = "firstpart3secondpart"
    For i = 1 To Len(testString)
        If IsNumeric(Mid(testString, i, 1)) Then
            part1 = Left(testString, i - 1)
            part2 = (Right(testString, Len(testString) - i))
        End If
    Next i
    MsgBox (part1)
    MsgBox (part2)
End Sub
Use something like this within your loop.
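Both routines above keep looping after a digit has been found, so the result depends on the loop direction. A sketch that stops at the first digit found from the left is shown below. The names FirstDigitPos and SplitAtFirstDigit are hypothetical, and the column layout (source in column D from row 4, output in columns E and F) is taken from the asker's routine:

```vba
Option Explicit

' Hypothetical helper: 1-based position of the first digit in txt, or 0 if none.
Public Function FirstDigitPos(ByVal txt As String) As Long
    Dim i As Long
    For i = 1 To Len(txt)
        ' in a Like pattern, # matches exactly one digit
        If Mid$(txt, i, 1) Like "#" Then
            FirstDigitPos = i
            Exit Function           ' stop at the first digit
        End If
    Next i
End Function

' Hypothetical driver using the same columns as the routine above.
Public Sub SplitAtFirstDigit()
    Dim r As Long, p As Long, txt As String
    For r = 4 To Cells(Rows.Count, 4).End(xlUp).Row
        txt = CStr(Cells(r, 4).Value)
        p = FirstDigitPos(txt)
        If p > 0 Then
            Cells(r, 5).Value = Left$(txt, p - 1)   ' text before the first digit
            Cells(r, 6).Value = Mid$(txt, p)        ' first digit and everything after it
        End If
    Next r
End Sub
```

Cells without any digit are left untouched, and a value such as soup 10g 20boxes is still split only once, at the 1.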

Deleting variable number of leading characters from a variable-length string

If I have G4ED7883666 and I want the output to be 7883666,
and I have to apply this to a range of cells that are not all the same length, where the only common thing is that I have to delete everything before the number that follows the last letter, how can I do it?
This formula finds the last number in a string, that is, all digits to the right of the last alpha character in the string.
=RIGHT(A1,MATCH(99,IFERROR(1*MID(A1,LEN(A1)+1-ROW($1:$25),1),99),0)-1)
Note that this is an array formula and must be entered with the Control-Shift-Enter keyboard combination.
How the formula works
Let's assume that the target string is fairly simple: "G4E78"
Working outward from the middle of the formula, the first thing to do is create an array with the elements 1 through 25. (Although this might seem to limit the formula to strings with no more than 25 characters, it actually places a limit of 25 digits on the size of the number that may be extracted by the formula.)
ROW($1:$25) = {1;2;3;4;5;6;7; etc.}
Subtracting this array from (1 + the length of the target string) produces a new array whose elements count down from the length of the string. The first five elements correspond to the positions of the characters of the string, in reverse order.
LEN(A1)+1-ROW($1:$25) = {5;4;3;2;1;0;-1;-2;-3;-4; etc.}
The MID function then creates a new array that reverses the order of the characters of the string.
For example, the first element of the new array is the result of MID(A1, 5, 1), the second of MID(A1, 4, 1) and so on. The #VALUE! errors reflect the fact that MID cannot evaluate 0 or negative values as the position of a string, e.g., MID(A1,0,1) = #VALUE!.
MID(A1,LEN(A1)+1-ROW($1:$25),1) = {"8";"7";"E";"4";"G";#VALUE!;#VALUE!; etc.}
Multiplying the elements of the array by 1 converts the digit characters to numbers and turns the non-digit characters into #VALUE! errors as well.
1*MID(A1,LEN(A1)+1-ROW($1:$25),1) = {8;7;#VALUE!;4;#VALUE!;#VALUE!;#VALUE!; etc.}
The IFERROR function then turns the #VALUE! errors into 99, which is just an arbitrary number greater than the value of a single digit.
IFERROR(1*MID(A1,LEN(A1)+1-ROW($1:$25),1),99) = {8;7;99;4;99;99;99; etc.}
Matching on the 99 gives the position of the first non-digit character counting from the right end of the string. In this case, "E" is the first non-digit in the reversed string "87E4G", at position 3. This is equivalent to saying that the number we are looking for at the end of the string, plus the "E", is 3 characters long.
MATCH(99,IFERROR(1*MID(A1,LEN(A1)+1-ROW($1:$25),1),99),0) = 3
So, for the final step, we take 3 - 1 = 2 characters (subtracting one for the "E") from the right of the string.
RIGHT(A1,MATCH(99,IFERROR(1*MID(A1,LEN(A1)+1-ROW($1:$25),1),99),0)-1) = "78"
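One more submission for you to consider. This VBA function returns the run of digits at the right end of the string, stopping at the first non-numeric character found when scanning from the right.
Public Function GetRightNumbers(str As String) As String
    Dim i As Long
    'scan from the end; stop at 1 because Mid raises an error for position 0
    For i = Len(str) To 1 Step -1
        If Not IsNumeric(Mid(str, i, 1)) Then
            Exit For
        End If
    Next i
    'i is the position of the last non-digit, or 0 if the string is all digits
    GetRightNumbers = Mid(str, i + 1)
End Function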
You can write some VBA to format the data (just start at the end and work back until you hit a non-number).
Or, if you're happy to get an add-in like Excelicious, you can use regular expressions to format the text via a formula. An expression like [0-9]+$ would return all the numbers at the end of a string, IIRC.
NOTE: This uses the regex pattern in James Snell's answer, so please upvote his answer if you find this useful.
Your best bet is to use a regular expression. You need to set a reference to Microsoft VBScript Regular Expressions 5.5 for this to work: Tools --> References...
Now you can use regex in your VBA.
This will find the numbers at the end of each cell. I am placing the result next to the original so that you can verify it is working the way you want. You can modify it to replace the cell as soon as you feel comfortable with it. The code works regardless of the length of the string you are evaluating, and will skip the cell if it doesn't find a match.
Sub GetTrailingNumbers()
    Dim ws As Worksheet
    Dim rng As Range
    Dim cell As Range
    Dim result As Object, results As Object
    Dim regEx As New VBScript_RegExp_55.RegExp
    Set ws = ThisWorkbook.Sheets("Sheet1")
    ' range is hard-coded here, but you can define
    ' it programmatically based on the shape of your data
    Set rng = ws.Range("A1:A3")
    ' pattern from James Snell's answer
    regEx.Pattern = "[0-9]+$"
    For Each cell In rng
        If regEx.Test(cell.Value) Then
            Set results = regEx.Execute(cell.Value)
            For Each result In results
                cell.Offset(, 1).Value = result.Value
            Next result
        End If
    Next cell
End Sub
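If setting a reference is inconvenient, the same pattern can be used with late binding through CreateObject. The sketch below wraps it in a UDF; the name TrailingDigits is hypothetical, and it assumes the VBScript regular expression library is available on the machine (it normally ships with Windows):

```vba
Option Explicit

' Hypothetical UDF: returns the run of digits at the end of txt,
' or an empty string if txt does not end in a digit.
Public Function TrailingDigits(ByVal txt As String) As String
    Dim re As Object, matches As Object
    ' late binding: no entry under Tools --> References is needed
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "[0-9]+$"
    If re.Test(txt) Then
        Set matches = re.Execute(txt)
        TrailingDigits = matches(0).Value
    End If
End Function
```

It can be called from a worksheet as =TrailingDigits(A1), which leaves the original cell untouched.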
Takes the last 4 characters of num:
num1 = Right(num, 4)
Takes the first 5 characters of num:
num1 = Left(num, 5)
Takes the first ten characters of num, then the last four of those:
num1 = Right(Left(num, 10), 4)
In your case:
num = "G4ED7883666"
num1 = Right(num, 7)

Excel 2010 VBA step through a string and place one char into each cell in sequence

I am used to string slicing in 'C' many, many years ago but I am trying to work with VBA for this specific task.
Right now I have created a string "this is a string" and created a new workbook.
What I need now is to use string slicing to put 't' in, say, A1, 'h' in A2, 'i' in A3 etc. to the end of the string.
After which my next string will go in, say B1 etc. until all strings are sliced.
I have searched but it seems most people want to do it the other way around (concatenating a range).
Any thoughts?
Use the MID function.
=MID($A$1,1,1)
The second argument is the start position, so you could replace it with something like the ROW or COLUMN function and then fill the formula down or across.
For example:
=MID($A$1,ROW(),1)
If you want to do it purely in VBA, the Mid function is available there too, so just loop through the string.
Dim str As String
Dim i As Long
str = Sheet1.Cells(1, 1).Value
For i = 1 To Len(str)
    'output string 1 character at a time in column C
    Sheet1.Cells(i, 3).Value = Mid(str, i, 1)
Next i
* edit *
If you want to do this with multiple strings from an array, you could use something like:
Dim str(1 To 2) As String
Dim i As Long, j As Long
str(1) = "This is a test string"
str(2) = "Some more test text"
For j = LBound(str) To UBound(str)
    For i = 1 To Len(str(j))
        'output strings 1 character at a time in columns A and B
        Sheet1.Cells(i, j).Value = Mid(str(j), i, 1)
    Next i
Next j
