I have a text document full of 9 digit numbers. I need Excel to either read the text (.txt) file or a text cell and add each 9 digit number to each cell in a column.
Example text file:
123456789, 987654321, 213454321 / 987656789, [098752739]
Excel result:
123456789
987654321
213454321
987656789
098752739
Any advice?
You can use the standard Excel feature for importing data from Text file. Ribbon tab Data -> From Text.
From your result, with each cell containing the 9 digit number:
Click Data -> Text to columns
Select "Fixed Width"
You will then need to set break lines between each of your nine
digits
Optional: Click "Next" to format each of the fields
Finally, click "Finish"
You should now see each of the nine digits in a separate column.
Assuming your data is in a single cell, then your task has three parts:
read data from text file
parse each cell
store the results in a column
This code addresses only the second part:
Sub ParseData()
Dim s1 As String
s1 = Range("A1").Text
s1 = Replace(s1, " ", "|")
s1 = Replace(s1, "/", "|")
s1 = Replace(s1, "[", "|")
s1 = Replace(s1, "]", "|")
s1 = Replace(s1, ",", "|")
s1 = CleanUp(s1, "|")
ary = Split(s1, "|")
i = 1
For Each a In ary
Cells(i, 2).NumberFormat = "#"
Cells(i, 2).Value = a
i = i + 1
Next a
End Sub
Public Function CleanUp(sIN As String, sep As String) As String
Dim temp As String, temp2 As String, i As Long, CH As String
temp = sIN
While Left(temp, 1) = sep
temp = Mid(temp, 2)
Wend
While Right(temp, 1) = sep
temp = Mid(temp, 1, Len(temp) - 1)
Wend
temp2 = ""
For i = 1 To Len(temp)
CH = Mid(temp, i, 1)
If temp2 = "" Then
temp2 = CH
ElseIf CH <> sep Then
temp2 = temp2 & CH
ElseIf Right(temp2, 1) <> sep Then
temp2 = temp2 & CH
End If
Next i
CleanUp = temp2
End Function
NOTES:
The code replaces the various field separators with a single pipe. The data is then split using the pipe. The resulting array is then stored in cells:
Related
I am trying to convert a string, which has a date in US format into UK format.
The following code seems to be hit or miss when it comes to a date that is single digits for both the day and the month:
X = 3
Do While strTimeStamp = 0
If InStr(WS2.Cells(lRow, lCol), "TIMESTAMP") <> 0 Then
strHPCStats = Split(WS2.Cells(lRow, lCol), " ")
'strHPCStats(X) = Mid(strHPCStats(X), 4, 6)
re.Pattern = "^(\d{2})(\d{2})(\d{4})$"
strHPCStats(X) = re.Replace(strHPCStats(X), "$3/$2/$1")
strHPCStats(X) = Format$(strHPCStats(X), "dd/mmm/yyyy")
strTimeStamp = strHPCStats(X)
WS2.Cells(lRow, lCol).EntireRow.Delete
lRow = lRow - 1
Else
WS2.Cells(lRow, lCol).EntireRow.Delete
lRow = lRow - 1
End If
lRow = lRow + 1
Loop
The typical string:
4:19:17 (application) TIMESTAMP 3/13/2022
The string where it is having trouble:
5:36:32 (cameo) TIMESTAMP 4/1/2022
d{2} will look for exactly 2 digits, so if your date has a month (or day) with only 1 digit, the regex doesn't match.
If you want to specify 1 or 2 digits, you can for example use d{1,2}, so the statement would be
e.Pattern = "^(\d{1,2})(\d{1,2})(\d{4})$"
Details: How to use Regular Expressions (Regex) in Microsoft Excel both in-cell and loops
There's no need to use regular expressions, given your expected data.
Just look for two slashes in a space-separated string:
Function us2ukDate(S As String) As Date
Dim v, w, x
v = Split(S, " ")
For Each w In v
If (Len(w) - Len(Replace(w, "/", ""))) = 2 Then
x = Split(w, "/")
us2ukDate = DateSerial(x(2), x(0), x(1))
Exit Function
End If
Next w
End Function
testing example
If, instead of just returning the date, you want to change the format within the string, you could do something like:
Sub convertStrings()
Const d1 = "4:19:17 (application) TIMESTAMP 3/13/2022"
Const d2 = "5:36:32 (cameo) TIMESTAMP 4/1/2022"
Dim sParts
sParts = Split(d1, " ")
sParts(UBound(sParts)) = Format(us2ukDate(sParts(UBound(sParts))), "dd-mmm-yyyy")
Debug.Print Join(sParts, " ")
sParts = Split(d2, " ")
sParts(UBound(sParts)) = Format(us2ukDate(sParts(UBound(sParts))), "dd-mmm-yyyy")
Debug.Print Join(sParts, " ")
End Sub
I have a comma separated lists in cells. All numbers are positive and between 1 and 10.
Example:
if I have in A1: (2,3,5,6), I would like to have missing numbers in B1:(1,4,7,8,9,10).
If A2: (1,10), then I would have in B2:(2,3,4,5,6,7,8,9)
If A3: (7), then I would have in B2:(1,2,3,4,5,6,8,9,10)
I searched for a solution online, but I couldn't find anything similar with comma separated numbers.
I'd be glad if I can have a solution here. Thanks.
Here is a user-defined function that should accomplish this... probably can be optimized.
Public Function MissingNumbers(ByVal numberList As String) As String
Dim temp As String
temp = Replace(numberList, "(", "")
temp = Replace(temp, ")", "")
Dim arr As Variant
arr = Split(temp, ",")
Dim newNumbers As String
newNumbers = "1,2,3,4,5,6,7,8,9,10,"
Dim i As Long
For i = LBound(arr) To UBound(arr)
newNumbers = Replace(newNumbers, arr(i) & ",", "")
Next
newNumbers = "(" & Left$(newNumbers, Len(newNumbers) - 1) & ")"
MissingNumbers = newNumbers
End Function
Just for fun demonstrating how to use negative filtering:
Function MissingList(ByVal numberList As String) As String
Dim given: given = Split(Mid(numberList, 2, Len(numberList) - 2), ",")
Dim series: series = GetSeries() ' i.e. numbers 1..10
Dim i As Long
For i = 0 To UBound(given)
series = Filter(series, given(i), False) ' << negative filtering
Next
MissingList = "(" & Replace(Join(series, ","), "0", "10") & ")"
End Function
As Filter executes a partial search in the 1..10 series, 10 has to be replaced temporarily by a unique 0.
Help function GetSeries()
Function GetSeries()
' Purpose: get numbers 1..10
Const LAST As Long = 10: Const FIRST = 1
Dim tmp: tmp = Application.Transpose(Evaluate("row(" & FIRST & ":" & LAST & ")"))
tmp(LAST) = 0 ' replace 10 by 0 as search item 1 would filter out value 10, too
GetSeries = tmp
End Function
Cell A2 is the data given which has different lines breaks. I separated the data before and after "-" as shown in B2 & C3. Then I sorted the data of C2 from lowest to largest in D2. The desire result is cell E2. I would like to have a user define function to get index B2 by matching D2 from C2. Please note A2 has four values in 4 lines breaks in one cell not in every cell there is a value, please find attached.
The UDF below will do what you want. Call it from the worksheet like =InexMatch(A2). Make sure that the cell you place it in has its WrapText property set to True.
Function InexMatch(Cell As Range) As String
' 003
Dim Arr As Variant
Dim Sp() As String
Dim Tmp As String
Dim Done As Boolean
Dim i As Integer
Arr = Split(Cell.Value, Chr(10))
For i = 0 To UBound(Arr)
Sp = Split(Arr(i), "-")
Arr(i) = Sp(1) & "-" & Sp(0)
Next i
Do
Done = True
For i = 0 To UBound(Arr) - 1
If Val(Arr(i + 1)) < Val(Arr(i)) Then
Tmp = Arr(i)
Arr(i) = Arr(i + 1)
Arr(i + 1) = Tmp
Done = False
End If
Next i
Loop While Not Done
On Error Resume Next
ReDim Sp(UBound(Arr))
For i = 0 To UBound(Arr)
Sp(i) = Split(Arr(i), "-")(1)
Next i
InexMatch = Join(Sp, Chr(10))
End Function
The function will return a null string if the referenced cell is blank. It can deal with cells that have fewer than 4 lines. It will fail if the CR isn't ANSII Chr(10) or the dash isn't a ANSII Chr(45) - a minus sign. It has no provision for incomplete lines within cells, meaning lines which don't have characters on both sides of a dash.
In order to restore the original format in the sorted string please delete all the lines below the end of the Do Loop in the code above, starting with On Error Resume Next, and replace them with the following.
For i = 0 To UBound(Arr)
Sp = Split(Arr(i), "-")
Arr(i) = Sp(1) & "-" & Sp(0)
Next i
InexMatchV2 = Join(Arr, Chr(10))
I am looking for some input and possible example for parsing a text file with the following format: (sorry not sure how to retain the formatting of the file in this text)
NAME ID FORMAT SHORT NAME
DESCRIPTION (this field is on the second row an indented by 5 spaces)
The first row (NAME, ID, FORMAT and SHORT NAME) always consist of just one row. The DESCRIPTION text may span multiple rows. In some cases, there is only a first row of NAME, ID, etc. without a corresponding DESCRIPTION row.
Here is an example of how the data looks in the file now:
NAME ID FORMAT SHORT NAME
DESCRIPTION
ABC 01 xx AB
abcdefg
hijklm
nopqrs
DEF 02 xx DE
abcedfg
hijklmnopqrst
GHI 03 xx.x GH
JKL 001 xx JKL
abcdef
ghijk
lmnopq
rstu
vwxyz
I would like to parse out the NAME, ID, FORMAT, SHORT NAME and DESCRIPTION into 5 separate columns in a csv or excel file for additional analysis. I don't care if the DESCRIPTION field is broken across multiple lines but it can also be concatenated into a single longer string.
Hope this all makes sense. Thanks in advance!
Providing the data for NAME,ID,FORMAT and SHORT NAME is aligned
beneath their header word then use those words on the first line
to calculate the start position and length of each field, then split
the lines into fields using Mid(). Join the description lines and write out to
the previous record before a new record is started. For example
Option Explicit
Sub ParseTextFile()
Const INFILE = "c:\temp\testfile.txt"
Const OUTFILE = "c:\temp\testfile.xlsx"
Dim wbOut As Workbook, ws As Worksheet, iRow As Long
Dim txt As String, ff As Integer, i As Integer, desc As String
Dim start(4) As Integer, length(4) As Integer
Dim count As Integer, msg As String
Set wbOut = Workbooks.Add
Set ws = wbOut.Sheets("Sheet1")
ws.Range("A1:E1") = Array("NAME", "ID", "FORMAT", "SHORT NAME", "DESCRIPTION")
ws.Columns("A:E").NumberFormat = "#"
iRow = 1
ff = FreeFile()
Open INFILE For Input As #ff
While Not EOF(ff)
count = count + 1
Line Input #ff, txt
If count = 1 Then
start(1) = InStr(1, txt, "NAME", vbTextCompare)
start(2) = InStr(1, txt, "ID", vbTextCompare)
start(3) = InStr(1, txt, "FORMAT", vbTextCompare)
start(4) = InStr(1, txt, "SHORT NAME", vbTextCompare)
For i = 1 To 3
length(i) = start(i + 1) - start(i)
Next
Else
If Left(txt, 1) = " " Then
desc = desc & Trim(txt) & " "
Else
' save the description from last record
ws.Cells(iRow, 5) = Trim(desc)
desc = ""
' new row
iRow = iRow + 1
length(4) = Len(txt) - start(4) + 1
For i = 1 To 4
ws.Cells(iRow, i) = Mid(txt, start(i), length(i))
Next
End If
End If
Wend
Close #ff
' final description
ws.Cells(iRow, 5) = Trim(desc)
' save result
ws.Columns("A:E").AutoFit
wbOut.Close True, OUTFILE
msg = count & " lines read from " & INFILE & vbCr & _
iRow - 1 & " rows written to " & OUTFILE
MsgBox msg, vbInformation
End Sub
I have thousands of addresses in this format:
123 Happy St. Kansas City, MO 64521
9812 Main Street Minneapolis, MN 62154
12 Virgina Ave, Apt 8, Dallas, TX 54334
I want to extract the address, city, state, zip into individual cells (without using VB if possible). I've tried a couple variations of other methods posted, but I can't quite get desired results.
Analyze your problem!
you want to split your address string at the comma
you then want to split the right fragment from (1) at the first blank
ad 1): you get the position of the comma using =FIND(",", A1), and use the result in a =LEFT(...) and a =RIGHT(...) - for the latter you also need the string length (=LEN(...))
B1: =LEFT(A1;FIND(",";A1)-1)
C1: =RIGHT(A1;LEN(A1)-LEN(B1)-2)
Now comes the fun part ... in your 3rd example we mustn't split on the first comma, but on the third comma ... or as a more general rule, we always must split on the last comma .... but how do we find how many commas we have in the string, to feed its position as an additional argument into the =FIND(...) function?
Quick answer: look at Stackoverflow (exactly here) ... very clever ... subtract the length of the string with all commas removed from the original length, and then replace the last occurence of the comma by something else, because =SUBSTITUTE(...) works on occurence, whilst =FIND() only works on position. If you incorporate all this this, you will have
B1: =LEFT(A1;FIND("#";SUBSTITUTE(A1;",";"#"; LEN(A1)-LEN(SUBSTITUTE(A1;",";""))))-1) --> full address
C1: (same as above)
Here we use "#" as a neutral substitution string for the final comma as we asume that no address uses the "#"
ad 2): you apply the above (with blank instead of comma) once again to the right part. You can use the simple first version of the formulae as it's clear you want to split at the first blank
D1: =LEFT(C1;FIND(" ";C1)-1) --> state
E1: =RIGHT(C1;LEN(C1)-LEN(D1)-1) --> ZIP code
This VBA function extracts Zip, State, City, Street1, and Street2 (Suite, Apt, etc.) into separate columns. Would need minor modification to remove commas.
Option Explicit
Function ParseAddress(ByVal varAddress As Variant, ByVal strAddressPart As String) As String
Dim aryAddressTokens() As String
Dim strCity As String
Dim intCtr As Integer
Dim intStreet2Tokens As Integer
Dim strStreet1, strStreet2 As String
If IsMissing(varAddress) Or varAddress = vbNullString Then
ParseAddress = ""
Else
aryAddressTokens = Split(Trim(varAddress), " ")
'
If strAddressPart = "Zip" Then
ParseAddress = aryAddressTokens(UBound(aryAddressTokens))
ElseIf strAddressPart = "State" Then
ParseAddress = UCase(aryAddressTokens(UBound(aryAddressTokens) - 1))
ElseIf strAddressPart = "City" Then
strCity = aryAddressTokens(UBound(aryAddressTokens) - 2)
If Right(strCity, 1) = "," Then strCity = Left(strCity, Len(strCity) - 1)
ParseAddress = strCity
ElseIf strAddressPart = "Street1" Or strAddressPart = "Street2" Then
'Find Street2 if present because Street1 output is dependent on it.
' Assume address never begins with a # or Suite.
intCtr = 1
strStreet2 = ""
intStreet2Tokens = 0
While (intCtr < UBound(aryAddressTokens) - 2) And strStreet2 = ""
If Left(aryAddressTokens(intCtr), 1) = "#" Then
If Len(aryAddressTokens(intCtr)) = 1 Then
strStreet2 = aryAddressTokens(intCtr) & aryAddressTokens(intCtr + 1)
intStreet2Tokens = 2
Else
strStreet2 = aryAddressTokens(intCtr)
intStreet2Tokens = 1
End If
ElseIf Left(aryAddressTokens(intCtr), 5) = "Suite" Then
If Len(aryAddressTokens(intCtr)) = 5 Then
strStreet2 = aryAddressTokens(intCtr) & " " & aryAddressTokens(intCtr + 1)
intStreet2Tokens = 2
Else
strStreet2 = aryAddressTokens(intCtr)
intStreet2Tokens = 1
End If
ElseIf Left(aryAddressTokens(intCtr), 3) = "Apt" Then
strStreet2 = aryAddressTokens(intCtr) & " " & aryAddressTokens(intCtr + 1)
intStreet2Tokens = 2
End If
intCtr = intCtr + 1
Wend
If Not IsEmpty(strStreet2) Then
If Right(strStreet2, 1) = "," Then strStreet2 = Left(strStreet2, Len(strStreet2) - 1)
End If
' Now Street1.
strStreet1 = ""
For intCtr = 0 To UBound(aryAddressTokens) - (3 + intStreet2Tokens)
strStreet1 = strStreet1 & " " & aryAddressTokens(intCtr)
Next
If Right(strStreet1, 1) = "," Then strStreet1 = Left(strStreet1, Len(strStreet1) - 1)
'Assign.
If strAddressPart = "Street1" Then
ParseAddress = Trim(strStreet1)
Else
ParseAddress = Trim(strStreet2)
End If
End If
End If
End Function