Extract two substrings from a set of three separated by ampersands

I am trying to extract numbers from a text.
If I have an entry like 12&6&2014, how can I extract the 12 (the number that is before the first &) and 2014 (the number that occurs after the second &)?

To get first number:
=LEFT(A1, FIND("&", A1)-1)
To get last number after the second &:
=RIGHT(A1, 4)
Otherwise, if that's not always a year:
=MID(A1, FIND(CHAR(1), SUBSTITUTE(A1, "&", CHAR(1), 2))+1, LEN(A1))
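To check the logic of these formulas outside Excel, here is a minimal Python sketch. It is not part of the original answer, and the function name first_and_last is made up for illustration. It keeps everything before the first & and everything after the second &, which is what the LEFT and MID formulas do:

```python
def first_and_last(entry: str) -> tuple:
    """Return the text before the first '&' and the text after the second '&'."""
    # Split at the first two ampersands only, so anything after the
    # second '&' stays together (like MID(..., LEN(A1)) in the formula).
    first, _middle, last = entry.split("&", 2)
    return first, last


print(first_and_last("12&6&2014"))  # ('12', '2014')
```

Like the formulas, the sketch assumes the entry contains at least two ampersands; with fewer, the unpacking raises a ValueError.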

You can loop through each character in the string and check whether it is numeric:
Sub getNumberValues()
Dim s As String
Dim c As New Collection
Dim sNewString As String
s = "12&6&2014"
For v = 1 To Len(s)
If IsNumeric(Mid(s, v, 1)) Then
sNewString = sNewString & Mid(s, v, 1)
Else
c.Add sNewString
sNewString = ""
End If
Next v
'add the last entry
c.Add sNewString
sNewString = ""
For Each x In c
sNewString = sNewString & x & Chr(13)
Next
MsgBox sNewString
End Sub

If I understand correctly that both the separator characters and the number of digits vary, you might look into something like this:
Function CleanUp(Txt)
For x = 1 To 255
Select Case x
Case 45, 47, 65 To 90, 95, 97 To 122
' "" can be replaced with "&" to do a MID() using & as the delimiter
Txt = WorksheetFunction.Substitute(Txt, Chr(x), "")
End Select
Next x
CleanUp = Txt
End Function
If you can use VBA, this replaces the listed characters with an empty string. You could instead substitute your own delimiter character and then use your formulas to split on that delimiter.
The original code can be found here:
http://www.mrexcel.com/forum/excel-questions/380531-extract-only-numbers.html

Simplest might be Text to Columns with & as the delimiter, then delete the middle column. This overwrites the original data, so a copy might be appropriate.
Another simple way would be to create two copies, and for one Find what: &* and Replace with: nothing, for the other Find what: *& and Replace with: nothing.
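An alternative formula solution relies on Excel reading the text as a date, so it only works when the three numbers form a valid date and the result depends on your regional date order. It might be: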
=DAY(SUBSTITUTE($A1,"&","/"))
and
=YEAR(SUBSTITUTE($A1,"&","/"))

Related

How to format strings with numbers and special characters in Excel or Access using VBA?

I have a mathematical problem: these five strings are IDs for the same object. Because of these differences, objects appear multiple times in my Access table/query. There are many such variants, but I take these as an example.
76 K 6-18
76 K 6-18(2)
0076 K 0006/ 2018
0076 K 0006/2018
76 K 6/18
What would the VBA code have to look like to recognize that these strings stand for the same thing, i.e. a general formatting with "RegEx()", "Format()" or "Replace()"? It must not apply only to this example but to all IDs of this kind.
The common factor of these and all other mutations is always the following:
1) includes "-", no zeros left of "-", just 18 and not 2018 (year) at the end.
2) is like the first but with (2) (which can be dropped).
3) includes "/", zeros left of "/", and 2018 as year at the end.
4) is like third, but without space after "/".
5) is like the first one, but with a "/" instead of "-".
The character is always one single "K"! I suppose the best way would be to convert all 5 strings to 76 K 6 18, or in other cases, for example, to 1 K 21 20 or 123 K 117 20. Is this possible with one elegant piece of code or a formula? Thanks
Here is a fun alternative using a rather complex but intuitive regular expression:
^0*(\d+) (K) 0*(\d+)[-\/] ?\d{0,2}(\d\d)(?:\(\d+\))?$
See an online demo
^ - Start line anchor.
0* - 0+ zeros to catch any possible leading zeros.
(\d+) - A 1st capture group of 1+ digits ranging 0-9.
" " - A space character.
(K) - A 2nd capture group capturing the literal "K".
" " - A space character.
0* - Again 0+ zeros to catch any possible leading zeros.
(\d+) - A 3rd capture group of 1+ digits ranging 0-9.
[-\/] - Character class of either a hyphen or forward slash.
" ?" - An optional space character.
\d{0,2} - 0-2 digits ranging from 0-9.
(\d\d) - A 4th capture group holding exactly two digits.
(?:\(\d+\))? - An optional non-capture group holding 1+ digits inside literal parentheses.
$ - End line anchor.
Now just replace the whole string by the 4 capture groups with spaces in between.
Let's test this in VBA:
'A code-block to call the function.
Sub Test()
Dim arr As Variant: arr = Array("76 K 6-18", "76 K 6-18(2)", "0076 K 0006/ 2018", "0076 K 0006/2018", "76 K 6/18")
For x = LBound(arr) To UBound(arr)
Debug.Print Transform(CStr(arr(x)))
Next
End Sub
'The function that transform the input.
Function Transform(StrIn As String) As String
With CreateObject("vbscript.regexp")
.Global = True
.Pattern = "^0*(\d+) (K) 0*(\d+)[-\/] ?\d{0,2}(\d\d)(?:\(\d+\))?$"
Transform = .Replace(StrIn, "$1 $2 $3 $4")
End With
End Function
All the elements from the initial array will Debug.Print "76 K 6 18".
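For readers who want to try the pattern outside VBA, here is a minimal Python sketch of the same replacement. It is only an illustration: the names PATTERN and transform are made up here, and Python writes the capture-group references as \1 to \4 rather than $1 to $4.

```python
import re

# Same pattern as in the VBA answer, written as a Python raw string.
PATTERN = re.compile(r"^0*(\d+) (K) 0*(\d+)[-/] ?\d{0,2}(\d\d)(?:\(\d+\))?$")


def transform(text: str) -> str:
    """Rewrite a matching ID as 'number K number yy'; other input is returned unchanged."""
    return PATTERN.sub(r"\1 \2 \3 \4", text)


samples = ["76 K 6-18", "76 K 6-18(2)", "0076 K 0006/ 2018",
           "0076 K 0006/2018", "76 K 6/18"]
for sample in samples:
    print(transform(sample))  # each line prints: 76 K 6 18
```

Because the pattern is anchored with ^ and $, a string that does not fit the expected shape is left untouched rather than partially rewritten.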
Hope it helps, happy coding!
EDIT: If your goal is just to check whether your string matches the pattern, the pattern itself can be shortened a little and you can return a Boolean instead:
'A code-block to call the function.
Sub Test()
Dim arr As Variant: arr = Array("76 K 6-18", "76 K 6-18(2)", "0076 K 0006/ 2018", "0076 K 0006/2018", "76 K 6/18")
For x = LBound(arr) To UBound(arr)
Debug.Print Transform(CStr(arr(x)))
Next
End Sub
'The function that checks the input.
Function Transform(StrIn As String) As Boolean
With CreateObject("vbscript.regexp")
.Global = True
.Pattern = "^0*\d+ K 0*\d+[-\/] ?\d{2,4}(?:\(\d+\))?$"
Transform = .Test(StrIn)
End With
End Function
As @Vincent has suggested, look at using a custom function to convert all of the different data to a consistent form. Based on what you have described, the following seems to work:
Function fConvertFormula(strData As String) As String
On Error GoTo E_Handle
Dim astrData() As String
strData = Replace(strData, "/", " ")
strData = Replace(strData, "-", " ")
strData = Replace(strData, "  ", " ")
astrData = Split(strData, " ")
If UBound(astrData) = 3 Then
astrData(0) = CLng(astrData(0))
astrData(2) = CLng(astrData(2))
If InStr(astrData(3), "(") > 0 Then
astrData(3) = Left(astrData(3), InStr(astrData(3), "(") - 1)
End If
If Len(astrData(3)) = 4 Then
astrData(3) = Right(astrData(3), 2)
End If
fConvertFormula = Join(astrData, " ")
End If
fExit:
On Error Resume Next
Exit Function
E_Handle:
MsgBox Err.Description & vbCrLf & vbCrLf & "fConvertFormula", vbOKOnly + vbCritical, "Error: " & Err.Number
Resume fExit
End Function
It starts by replacing the "field" delimiters ("/" and "-") with spaces and then collapses double spaces into single ones. It then strips any leading zeroes from the first and third elements, drops the bracketed part of the last element if there is one, and finally shortens a 4-digit year to 2 digits before joining everything back up.
You may have other cases that you need to deal with, so I would suggest creating a query with the original data and the data converted by this function, and seeing what it throws out.
This function unifies the given string by the rules you defined in your question:
Public Function UnifyValue(ByVal inputValue As String) As String
'// Remove all from "(" on.
inputValue = Split(inputValue, "(")(0)
'// Replace / by blank
inputValue = Replace(inputValue, "/", " ")
'// Replace - by blank
inputValue = Replace(inputValue, "-", " ")
'// Replace double blanks by one blank
inputValue = Replace(inputValue, "  ", " ")
'// Split by blank
Dim splittedInputValue() As String
splittedInputValue = Split(inputValue, " ")
'// Create the resulting string
UnifyValue = CLng(splittedInputValue(0)) & _
" " & splittedInputValue(1) & _
" " & CLng(splittedInputValue(2)) & _
" " & Right(CLng(splittedInputValue(3)), 2)
End Function
For all of your sample values it returns 76 K 6 18.
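The split-based approach above (drop the bracket, turn "/" and "-" into spaces, split, strip leading zeros, keep a two-digit year) can also be sketched in Python to test it against more sample data. This is an illustrative sketch, not the answerer's code, and the function name unify is an assumption:

```python
def unify(value: str) -> str:
    """Normalise an ID such as '0076 K 0006/ 2018' to '76 K 6 18'."""
    # Drop a trailing "(2)"-style suffix.
    value = value.split("(")[0]
    # Turn both delimiters into spaces; split() with no argument
    # also collapses any run of spaces into one separator.
    parts = value.replace("/", " ").replace("-", " ").split()
    if len(parts) != 4:
        raise ValueError(f"unexpected format: {value!r}")
    number, letter, sequence, year = parts
    # int() strips leading zeros; the last two characters are the short year.
    return f"{int(number)} {letter} {int(sequence)} {year[-2:]}"


print(unify("0076 K 0006/ 2018"))  # 76 K 6 18
```

Raising an error for anything that does not split into four parts makes unexpected variants visible instead of silently passing them through.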

Excel VBA split after second space

I'm trying to split my column so that the names
James John Doe
Comes out as only
James John
I'm using the code below, but it only leaves the first name, whereas I want it to split at the second occurrence of a space.
Sub Split1()
Dim r As Range
For Each r In Range("A2:A" & Cells(Rows.count, "A").End(xlUp).Row).Cells.SpecialCells(xlCellTypeConstants)
r.Value = Split(r.Value, " ")(0)
Next r
End Sub
Can anyone help me out?
Thanks
There are a few ways you can turn a three-word phrase into the first two words.
Let's start with your Split() method.
This function returns an array. Your particular method of attempting to access the index will only return a single word.
You can place the result into an array, then just combine the array elements:
For Each r In Range(...)
Dim retVal() As String
retVal = Split(r.Value)
r.Value = retVal(0) & " " & retVal(1)
Next r
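You can remove the last word with Replace() (note that this leaves the trailing space in place and also removes any earlier occurrence of the same text):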
For Each r In Range(...)
r.Value = Replace(r.Value, Split(r.Value)(2), "")
Next
Or you can even use Regular Expressions:
With CreateObject("VBScript.RegExp")
.Pattern = "\s[^\s]+$"
For Each r in Range(...)
r.Value = .Replace(r.Value, "")
Next
End With
In regular expressions, \s matches a single whitespace character. A [^...] bracket is a negated character class, so [^\s] matches any non-whitespace character, and the + that follows means one or more times. Finally, $ anchors the match at the end of the string. In other words, you match a word [^\s]+ that sits at the end of the string $ and is preceded by a space \s, and remove it via the .Replace() method. You could also simply use the pattern \s\S+$, which is equivalent (a capitalized \S means any non-whitespace character).
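The same pattern behaves identically in other regex engines. As a quick illustration, here is a minimal Python sketch (the function name drop_last_word is made up for this example) that strips the final word:

```python
import re


def drop_last_word(text: str) -> str:
    # Match one whitespace character followed by the last run of
    # non-whitespace characters at the end of the string, and delete it.
    return re.sub(r"\s\S+$", "", text)


print(drop_last_word("James John Doe"))  # James John
```

A value with no whitespace at all, such as a single name, does not match the pattern and is returned unchanged.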
Instead of the Split function, you may try using Left and Find to obtain the string up to the second space.
Code modification:
Dim r As Range
Dim s As String, newText As String
Dim Length As Long
For Each r In Range("A2:A" & Cells(Rows.Count, "A").End(xlUp).Row).Cells.SpecialCells(xlCellTypeConstants)
s = r.Value
Length = Application.WorksheetFunction.Find(" ", s, Application.WorksheetFunction.Find(" ", s) + 1)
' Length is the position of the second space, so stop one character before it
r.Value = Left(s, Length - 1)
Next r
Further way to extract the 1st and 3rd token of a split array
This approach takes advantage of Application.Index, which can return the elements of an array in any chosen row or column order; the wanted columns are given by the last argument, the 1-based column array Array(1, 3):
Function GetFirstLast(s As String) As String
GetFirstLast = Join(Application.Index(Split(s), 0, Array(1, 3)))
End Function
Example call:
Debug.Print GetFirstLast("James John Doe")
resulting in
James Doe in the VB Editor's immediate window.

Replace the first space in a column with a delimiting character

I have a report that includes a bunch of text in one cell. The first part of the text is a product# but the length varies. The product number is separated from the other information by a space.
I'm looking to write a macro that will replace just the first space with a delimiting character. I usually use "~". This will then allow me to script a text-to-columns command that will isolate the product number in one column.
You can do this with a formula:
=LEFT(A1, FIND(" ", A1, 1)-1) & "~" & RIGHT(A1,LEN(A1) - FIND(" ", A1, 1))
Copy that down, then Copy/Paste Special > Values, and run Text to Columns on the result.
With VBA, the following approach is possible:
Locate the position of the first space and write it to a variable.
Take the left part of the string up to that position and append the replacement string.
Take the right part of the string from that position to the end and append it.
This is the function:
Public Function ReplaceFirstSpace(myInput As String, _
Optional replacement As String = "~") As String
Dim position As Long
position = InStr(1, myInput, " ")
If position = 0 Then
ReplaceFirstSpace = myInput
Else
ReplaceFirstSpace = Left(myInput, position - 1) & _
replacement & Right(myInput, Len(myInput) - position)
End If
End Function
And some tests:
Sub TestMe()
Debug.Print ReplaceFirstSpace("my name is")
Debug.Print ReplaceFirstSpace("slim shaddy")
Debug.Print ReplaceFirstSpace("tikitiki")
Debug.Print ReplaceFirstSpace(" taram")
Debug.Print ReplaceFirstSpace("tam ")
Debug.Print ReplaceFirstSpace("")
End Sub
Use REPLACE:
=REPLACE(A1,FIND(" ",A1),1,"~")
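For comparison, the "replace only the first space" rule can be written very compactly in Python, which makes it easy to test the edge cases from the VBA answer. This is a minimal sketch; the function name replace_first_space simply mirrors the VBA function:

```python
def replace_first_space(text: str, replacement: str = "~") -> str:
    # The third argument limits the replacement to the first occurrence,
    # so later spaces are left alone.
    return text.replace(" ", replacement, 1)


for value in ["my name is", "tikitiki", " taram", "tam ", ""]:
    print(repr(replace_first_space(value)))
```

As in the VBA function, a string without any space comes back unchanged. Note that the REPLACE formula, by contrast, returns #VALUE! in that case because FIND fails.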

Convert column number to column letter in VBA

I use the following VBA code to insert the column number from Cell C1 into Cell B1:
Sub AdressColumn()
Sheet1.Range("B1").Value = Sheet1.Range("C1").Column
End Sub
In this case the result on my spreadsheet looks like this:
    A    B    C
1        3
2
3
All this works fine so far.
However, instead of inserting the 3 I would prefer to insert the letter of the column. In this case the letter C should be inserted into Cell B1.
I also tried to go with the formula here but I could not make it work since in my case I do not use a given number. Instead I refer to a Column with the .Column function.
What do I have to change in my formula to make it work?
Split the $ out of an absolute cell address.
Sub AdressColumn()
Sheet1.Range("B1").Value = split(Sheet1.Range("C1").address, "$")(1)
End Sub
... or split the colon out of the relative full column address.
Sub AdressColumn()
Sheet1.Range("B2").Value = Split(Sheet1.Range("C1").EntireColumn.Address(0, 0), ":")(0)
End Sub
user4039065's approach is very close. Note, however, that when the address comes from a whole column, as with Columns(col).Address in the function below, the subscript at the end of the line has to be (2). With that, e.g., column 677 gives "ZA" and column 16384 gives "XFD":
Const MAX_COL_NUMBER = 16384
...
Function columnNumberToColumnString(col As Integer) As String
If col > MAX_COL_NUMBER Or col < 1 Then
columnNumberToColumnString = "ERROR": Exit Function
Else
columnNumberToColumnString = Split(Columns(col).Address, "$")(2)
End If
' delete code block below after seeing how Split works
msg = "Split <" & Columns(col).Address & ">"
For i = 0 To UBound(Split(Columns(col).Address, "$"))
msg = msg + Chr(13) & Chr(10) & "Substring " & i & _
" is <" & Split(Columns(col).Address, "$")(i) & ">"
Next
MsgBox msg
End Function
In fact, if I use (1) in place of my (2), for column 26 I get Z:, not just Z, as explained below.
The Split function, when used with a valid Excel column address, returns an array of length 3, showing in stages how the final result is arrived at.
For 256, for example, the results displayed by the msg code block are:
Address of column number 256 is <$IV:$IV>
Substring 0 is <> (since first $ is first character, strip it and all after)
Substring 1 is <IV:> (since second $ is after the :, strip it and all after)
Substring 2 is <IV> (since : above is a usual delimiter, strip it)
Split "Returns a zero-based, one-dimensional array containing ... 'all substrings' " (if the limit (third) argument is omitted) of the given expression (first argument).
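The three substrings are easy to reproduce outside Excel. The Python sketch below first splits the same address text on "$", and then, as a different technique from the answers above, converts a column number to letters arithmetically (bijective base 26) so the results can be cross-checked. The function name column_letter is made up for illustration:

```python
def column_letter(col: int) -> str:
    """Convert a 1-based column number to its letters (1 -> A, 27 -> AA)."""
    if not 1 <= col <= 16384:
        raise ValueError("column number out of range")
    letters = ""
    while col > 0:
        # Subtract 1 first because there is no "zero" letter: A=1 ... Z=26.
        col, remainder = divmod(col - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


print("$IV:$IV".split("$"))  # ['', 'IV:', 'IV']
print("$C$1".split("$"))     # ['', 'C', '1']
print(column_letter(256))    # IV
```

The two split results show why index (2) is needed for a whole-column address such as $IV:$IV, while index (1) already yields the letter for a single-cell address such as $C$1.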

How do I convert "3 days" to "3" in Excel?

I have one column filled with values like 3 days, 6 days, etc.
How can I strip out the text and force convert these to integers?
If you want to convert them "in-place", then select the cells and run this little macro:
Sub fixData()
Dim r As Range, v As String, i As Long
For Each r In Intersect(Selection, ActiveSheet.UsedRange)
v = r.Text
i = InStr(1, v, " ")
If i <> 0 Then
r.Value = Mid(v, 1, i - 1)
End If
Next r
End Sub
You could try SUBSTITUTE in a column next to your data:
=SUBSTITUTE(A1," days","")+0
SUBSTITUTE replaces the string part " days" with nothing (a zero-length string, "") and adding zero returns a number. This assumes your data is in column A. Drag the formula down and voilà.
If you don't mind changing the data, you could also select it all, press Ctrl+F, choose the Replace tab, enter " days" in the Find what field, leave Replace with blank, and hit "Replace All".
If it's VBA, you could actually use the Val function to do this.
If A1 has the data and you want this integer value in A2
So something like:
Range("A2").value = Val(Range("A1").value)
The returned value is actually a double. If a double isn't good enough,
Range("A2").value = CInt(Val(Range("A1").value))
Hope that helps.
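The idea behind these answers is to read the digits at the start of the text and ignore the rest. Here is a rough Python sketch of that idea; it is not a full reimplementation of VBA's Val (which also handles decimals, among other things), and the function name leading_int is made up for illustration:

```python
import re


def leading_int(text: str) -> int:
    """Return the integer at the start of text, or 0 if there is none."""
    # Allow leading whitespace, then capture the first run of digits.
    match = re.match(r"\s*(\d+)", text)
    return int(match.group(1)) if match else 0


print(leading_int("3 days"))  # 3
```

Returning 0 when no leading number is found matches what Val does for text that does not start with a number.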

Resources