Excel VBA Extract Specific Data from Column of Mixed Data - excel

I have a list of Tyres form the internet, the list is 5,000 lines long in one column.
I need to extract from each line the data in BOLD ideally into the next column
EXAMPLE of TYRES
LS388 (145/70 R13 71T)
LS388 (155/65 R13 73T)
LS388 (155/65 R14 75T)
4-Seasons (155/65 R14 75T)
CT6 (155/70 R12 104N) 72EE
LS388 (155/70 R13 75T)
The problem is that the number can be between 59 and 120 and the letter could be H T V R N X Z and so on. Also the text could be anywhere within the line of data not always towards the end as shown.
There could be 100 variations to look for and
Rather than having one line of code to search for a LIKE 71T for each line of tyres, can I use a source table of these variations and reference them one by one in the code is some sort of loop? or other suggestions if in VBA appreciated
At the moment I have this VBA code for each possible variation, one line for each variant.
ElseIf ActiveCell.Value Like "*79S*" Then
ActiveCell.offset(0,1).Value = "79S"

Insert this formula in a cell it is assuming your string is present in column A, you can change it if it is not the case and check how many it extracts.
=MID(A1,SEARCH(" ",A1,SEARCH("R1?",A1))+1,SEARCH(")",A1)-SEARCH(" ",A1,SEARCH("R1?",A1))-1)
filter out the remaining ones, find some thing common in them and let me know and we can build another formula for those cells.

I recommend to use Regular Expressions for that if you need to do it with VBA. There is a pretty good explanation at
How to use Regular Expressions (Regex) in Microsoft Excel both in-cell and loops.
As pattern you could use something like .+\(.+ (.+)\).* (see https://regex101.com/r/jK1zKc/1/)
For a manual solution
Use Split text into different columns with the Convert Text to Columns Wizard to split into columns by the spaces.
Then do a simple replace ")" by "" in column D.
Or do the manual solution with VBA (assuming your data in column A):
Option Explicit
Sub SplitAndDelet()
Range("A:A").TextToColumns Destination:=Range("A1"), DataType:=xlDelimited, _
TextQualifier:=xlDoubleQuote, ConsecutiveDelimiter:=True, Tab:=False, _
Semicolon:=False, Comma:=False, Space:=True, Other:=True, OtherChar:= _
")", FieldInfo:=Array(Array(1, 1), Array(2, 1), Array(3, 1), Array(4, 1), Array(5, 1)) _
, TrailingMinusNumbers:=True
Range("A:C,E:E").Delete Shift:=xlToLeft
End Sub

If you want to do this in vba you could set up an array of tyres and loop through them for each cell. for example if you had this on your sheet;
you could do something like this;
Public Sub FindTyres()
' Column to Loop
Dim col As String
col = "B"
' rows to Loop
Dim startRow As String
Dim endRow As String
startRow = "2"
endRow = "7"
' Get list of Tyres
Dim tyresArr()
tyresArr = getTyresArr()
' Set Range to Loop
Dim rng As Range, cell As Range
Set rng = Range(col & startRow & ":" & col & endRow)
' Looping through Array params
Dim tyre As Variant
' Loop through Cells
For Each cell In rng
currentCellVal = cell.Value
' Loop through tyres
For Each tyre In tyresArr
Debug.Print tyre
' if you find it do something
If InStr(1, currentCellVal, CStr(tyre)) <> 0 Then
MsgBox "Value " & CStr(tyre) & " Contained in Cell " & cell.Address
Exit For
End If
Next tyre
Next cell
End Sub
Private Function getTyresArr()
Dim tyresArr(3)
tyresArr(0) = "71T"
tyresArr(1) = "73T"
tyresArr(2) = "75T"
tyresArr(3) = "104N"
getTyresArr = tyresArr
End Function
Please note this assumes you will only ever have one tyre code per line.
you may get some false positives if these strings exist for other reasons.
you would need to enter all the codes into the function that returns the array.

Related

Looping Text to Columns Fixed Width on Individual Rows

I am attempting to build a loop that will look at each row in a column of data and split based on the first instance of an " ". I can get this to work on one line but the loop never activates. I tried my best at formatting this code but could not find a tutorial on how to have the commands appear as different colors and whatnot.
Dim num
Dim RowCnt As Integer
Dim x As Integer
ActiveCell.Select ' the cell to split
RowCnt = Range(Selection, Selection.End(xlDown)).Rows.Count 'determines #rows in column to split
With ActiveCell ' with block
For x = 1 To RowCnt ' define # loops
.Offset(, -1).FormulaR1C1 = "=FIND("" "",RC[1],1)" ' determine first " "
num = .Offset(, -1).Value ' assign the number of chars to 'num'
Selection.TextToColumns Destination:=ActiveCell, DataType:=xlFixedWidth, _
FieldInfo:=Array(Array(0, 1), Array(num, 1)), TrailingMinusNumbers:=True ' splits once based on 'num'
.Offset(, -1).ClearContents ' clear
.Offset(1, 0).Activate
Next x
End With
End Sub
I was able to cheat the answer. The issue is the Text to Columns always referred to the first cell until the sub ended. My solution was to make the looped code its own sub and call it in a separate subs loop. That way it ends the sub each time before being called again.
Use this code instead (tested: works!)
Sub updated_delimitter()
start_cell = ActiveCell.AddressLocal
n = Range(start_cell, Range(start_cell).End(xlDown)).Rows.Count 'determines #rows in column to split
Application.ScreenUpdating = False
For x = 0 To n - 1 ' define # loops
this_cell = Range(start_cell).Offset(x).AddressLocal
Range(this_cell).Select
word_ = Range(this_cell).Value
split_at = InStr(word_, " ")
Range(this_cell).TextToColumns Destination:=Range(this_cell), DataType:=xlFixedWidth, _
FieldInfo:=Array(Array(0, 1), Array(split_at, 1)), TrailingMinusNumbers:=True ' splits once based on 'num'
Next
Application.ScreenUpdating = True
End Sub
original code had issues with referencing in relation to 'activecell' which you referenced in the text-to-columns section - removed the with statement and no need to insert num when you can simply store it within VB (getting rid of its placements also mean no code required to remove it...
You could achieve the same in 3 lines of code♦ (w/ for loop) using the following:
Sub test2()
'Range("d2").Select
With Selection
.Offset(, 3).Formula2R1C1 = _
"=LET(x_,RC[-3]:OFFSET(RC[-3],MATCH(0,IFERROR(SEARCH("""",RC[-3]:OFFSET(RC[-3],ROWS(C[-3])-ROWS(RC[-3])-1,0)),0),0)-1,0),IF(ISODD(SEQUENCE(1,2,1,1)),MID(x_,1,IFERROR(SEARCH("" "",x_)-1,LEN(x_))),IF(ISERROR(SEARCH("" "",x_)),"""",MID(x_,SEARCH("" "",x_)+2,LEN(x_)))))"
Range(.AddressLocal, .End(xlDown).Offset(, 1)).Value = Range(Replace(.Offset(, 3).AddressLocal, "$", "") & "#").Value
.Offset(, 3).ClearContents
End With
End Sub
This uses the function:
=LET(x_,D2:OFFSET(D2,MATCH(0,IFERROR(SEARCH("",D2:OFFSET(D2,ROWS(D:D)-ROWS(D2)-1,0)),0),0)-1,0),IF(ISODD(SEQUENCE(1,2,1,1)),MID(x_,1,IFERROR(SEARCH(" ",x_)-1,LEN(x_))),IF(ISERROR(SEARCH(" ",x_)),"",MID(x_,SEARCH(" ",x_)+2,LEN(x_)))))
... which is an array function that reproduces the original list with relevant cells split as req.
REVISED
Here for sample file (requires Microsoft Onedrive a/c - read only file avail.)
♦ Office 365 compatibility; '3 lines' ignoring with/end/sub/etc.
ta 💪

VBA: Impossible to Suppress Auto Conversion of Strings in Range.TextToColumns?

The TextToColumns method of the Range object automatically converts strings to numbers, but it would be nice to suppress this feature. The method has a TextQualifier parameter, but it doesn't seem to do what I'm looking for. The following data illustrates the issue. Strings in column A are delimited with semicolons that separate a text part from a number part. Note that the numbers all begin with zero, and that numbers in row 4-6 are prefixed with an apostrophe:
Column A
StringRow1;01000
StringRow2;02000
StringRow3;03000
StringRow4;'01000
StringRow5;'02000
StringRow6;'03000
The following macro splits the strings into a text part in Column B and a number part in Column C.
Sub TTC()
Application.DisplayAlerts = False
Dim rToSplit As Range
Set rToSplit = ThisWorkbook.Worksheets(1).Range("A1:A6")
rToSplit.TextToColumns _
Destination:=Range("B1"), _
DataType:=xlDelimited, _
Semicolon:=True, _
TextQualifier:=xlTextQualifierNone
End Sub
The last column illustrates the desired output:
Column A Column B Column C
Data Output Output Desired Output
StringRow1;01000 StringRow1 1000 01000
StringRow2;02000 StringRow2 2000 02000
StringRow3;03000 StringRow3 3000 03000
StringRow4;'01000 StringRow4 '01000 01000
StringRow5;'02000 StringRow5 '02000 02000
StringRow6;'03000 StringRow6 '03000 03000
I have tried formatting column C ahead of the split, like this:
rToSplit.Offset(, 3).NumberFormat = "#", but it has no effect. Switching the TextQualifier parameter to xlTextQualifierSingleQuote has the effect of treating rows 4-6 in the same way as rows 1-3.
Am I asking for the impossible? Or is there maybe some application level setting I'm not aware of? Or could I do something smart with the strings in column A?
(I could of course loop through Column C after the split and fix the issue, but for large data sets that's not attractive. For my particular application, the strings in column A are generated by code that searches for patterns in tens of thousands of text rows in several different text files; each match is added to a dictionary and then I use array() = Dictionary.Items and DestinationRange.Value = Application.WorksheetFunction.Transpose(array) to read the data to the worksheet. This is very fast. My workaround to deal with the issue described here is to save the number strings in a separate dictionary which is read to column C after the split. This works well, so I posted this out of curiosity to see what I can learn...)
You can use the FieldInfo property to set the data type for each column. You will need to know how many columns you have beforehand though, or know which column will contain the numbers.
The FieldInfo parameter takes an array of arrays, with each of the sub-arrays having 2 values. The first value represents the column number (starting at 1), and the second number is the XLColumnDataType you would like that column to be formatted as.
In this case, you'd like everything to be formatted as text (instead of a number, like it's currently doing), so you would use xlTextFormat (this is just a system defined constant equal to 2).
x.TextToColumns _
Destination:=Range("B1"), _
DataType:=xlDelimited, _
Semicolon:=True, _
TextQualifier:=xlTextQualifierNone, _
FieldInfo:=Array(Array(1, xlTextFormat), Array(2, xlTextFormat)) 'Format columns 1 and 2 as text
If you are willing to do it with a loop:
Sub TTC()
Dim row As Long, lastRow As Long, splitSpot As Integer, cellValue As String
Application.ScreenUpdating = False
With ThisWorkbook.Worksheets(1)
lastRow = .Cells(.Rows.Count, "A").End(xlUp).row
For row = 1 To lastRow
cellValue = CStr(.Range("A" & row).Value)
splitSpot = InStr(cellValue, ";")
.Range("B" & row & ":C" & row).NumberFormat = "#"
.Range("B" & row).Value = Left(cellValue, splitSpot - 1)
If Mid(cellValue, splitSpot, 1) = "'" Then
.Range("C" & row).Value = Right(cellValue, Len(cellValue) - splitSpot + 1)
Else
.Range("C" & row).Value = Right(cellValue, Len(cellValue) - splitSpot)
End If
Next
End With
End Sub
ss:

Macro for Text to columns

Sub Macro9()
'
' Macro9 Macro
'
'
Selection.TextToColumns Destination := Range("A3"), DataType := xlFixedWidth, _
FieldInfo := Array(Array(0,1),Array(60,1),Array(63,1),Array(68,1),Array(71,1), _
Array(85,1),Array(88,1),Array(93,1),Array(99,1),Array(107,1),Array(111,1),Array _
(120,1),Array(123,1),Array(127,1),Array(130,1),Array(134,1),Array(143,1),Array( _
147,1),Array(157,1),Array(162,1),Array(165,1),Array(170,1),Array(202,1),Array( _
233,1),Array(236,1),Array(238,1),Array(248,1),Array(251,1),Array(260,1),Array( _
265,1),Array(277,1),Array(283,1),Array(287,1),Array(291,1),Array(295,1),Array( _
299,1),Array(302,1),Array(306,1),Array(310,1),Array(322,1),Array(326,1),Array( _
332,1),Array(335,1),Array(338,1),Array(344,1),Array(348,1),Array(356,1),Array( _
360,1),Array(367,1),Array(373,1),Array(375,1),Array(384,1),Array(387,1),Array( _
394,1),Array(398,1),Array(403,1),Array(409,1),Array(413,1),Array(419,1),Array( _
424,1),Array(429,1),Array(432,1),Array(438,1),Array(444,1),Array(449,1),Array( _
454,1),Array(458,1),Array(463,1),Array(468,1),Array(474,1),Array(478,1),Array( _
481,1),Array(484,1),Array(489,1),Array(493,1),Array(524,1),Array(554,1),Array( _
557,1),Array(563,1),Array(565,1),Array(577,1),Array(594,1),Array(613,1),Array( _
616,1),Array(620,1),Array(626,1),Array(629,1),Array(634,1),Array(646,1),Array( _
654,1),Array(659,1),Array(667,1),Array(669,1),Array(675,1),Array(683,1),Array( _
689,1),Array(696,1),Array(699,1),Array(706,1),Array(714,1),Array(717,1),Array( _
721,1),Array(728,1),Array(730,1),Array(743,1),Array(751,1),Array(754,1),Array( _
758,1),Array(767,1),Array(774,1),Array(779,1),Array(787,1),Array(790,1),Array( _
798,1),Array(805,1),Array(808,1),Array(817,1),Array(822,1),Array(826,1),Array( _
835,1),Array(845,1),Array(853,1),Array(857,1),Array(864,1),Array(869,1),Array( _
877,1),Array(881,1),Array(891,1),Array(895,1),Array(903,1),Array(912,1),Array( _
916,1),Array(920,1),Array(927,1),Array(933,1),Array(937,1),Array(941,1),Array( _
End Sub
I have 800 words in cell A3 in sheet input1, i recorded above macro by using function "Text to columns" in Excel 2007 which is giving error "Too many line continuations".
Can someone tell me the exact code please, indeed I want to add all the 800 words in different individual cells as one word in each cell in the same row.
I do not believe it is possible to tell the Macro Recorder to create longer lines so I do not think TextToColumns can be made to record this code for you.
You are using the fixed width option so words are starting at position 0, 60, 63, 68, 71 and so on. The start positions for about 120 words have been have been recorded so, if you wanted to build an array like this, you will have a lot of typing.
You say "words". To me that implies variable length strings separated by spaces. If that is correct, try the code below. It uses function Split to split cell A3 into words by space. These are then spread out along row 4 with any gaps created by double or triple spaces ignored.
Option Explicit
Sub SplitCell()
Dim CellCrnt As Long
Dim InxW As Long
Dim Word() As String
With Worksheets("input1")
Word = Split(.Range("A3"), " ")
CellCrnt = 1
For InxW = LBound(Word) To UBound(Word)
' Any double spaces will cause empty entries in Word.
' Ignore these empty entries
If Word(InxW) <> "" Then
.Cells(4, CellCrnt).Value = Word(InxW)
CellCrnt = CellCrnt + 1
End If
Next
End With
End Sub

Excel VBA - Text into Columns

I have the below strings of data in column B. I would like to shift text into different columns using Excel VBA 2007. In the below example there are 4 groups of data, divided by "..". However, there can also be 2 or 3 groups of data.
Example 4 groups:
InstrumentBaseClass..Mega~Corporate~InstrumentBaseClass..Mid~Energy~InstrumentBaseClass..Micro~Oil Field Services
Example 2 groups:
InstrumentBaseClass..Mega~Corporate
How can I set text into different columns such that if there are only 2 sets of data, the two following columns say "none"? If there are 3 sets of data I need one column of "none" and so on.
An easy, no programmatic way is to simply fill all 4 columns with none and then let the text to columns feature over write any 'none's it needs to.
The following code uses hard coded inputs, but that could be replaced with range references or passed as varibles on calling the sub.
Sub TextDotColumns()
Dim sInput As String, sMid As String, rOutput As Object 'these should be passed when calling sub
Dim lBreak As Long, lColOff As Long, sCheck As String
sInput = "InstrumentBaseClass..Mega~Corporate~InstrumentBaseClass..Mid~Energy~InstrumentBaseClass..Micro~Oil Field Services"
sCheck = ".."
Set rOutput = Range("d10")
lBreak = InStr(1, sInput, sCheck, vbTextCompare)
rOutput.Select
lColOff = 1
Do While lBreak > 0
sMid = Mid(sInput, 1, lBreak - 1)
rOutput.Offset(1, lColOff).Value = sMid
sInput = Right(sInput, Len(sInput) - lBreak - 1)
lColOff = lColOff + 1
lBreak = InStr(1, sInput, sCheck, vbTextCompare)
Loop
rOutput.Offset(1, lColOff).Value = sInput
End Sub
Try This:
Sub Test()
'// Assumes your range is in column A
With Range("A2", Cells(Rows.Count, "A").End(xlUp))
'// Text to columns
.TextToColumns Destination:=.Offset(, 1), DataType:=xlDelimited, _
TextQualifier:=xlNone, ConsecutiveDelimiter:=True, Other:=True, OtherChar:="."
'// Fill blank spaces with "None"
.Offset(, 1).Resize(, 4).SpecialCells(xlCellTypeBlanks).Value = "None"
End With
End Sub

Selecting Cells Based on Criteria, then Copy and Paste Special (Transpose) - Macro Help

I was wondering if anyone could help me with the following problem. I have two excel workbooks. Workbook A contains bill data running from 1 to 1000. Each bill is on a different line in numerical order. Workbook B contains bill sponsor information. However, it is formatted as 1 sponsor per line, so 1 bill can occupy multiple rows. Also, the bill number is in column A, while the sponsor name is in column B. So, you have to select the names from column B based on the values from column A.
I would like to select the names of each sponsor for each bill from workbook B and paste special (transpose) them into workbook A for each bill. I can do this by hand, but it will take a very long time. Is there anyway to automate it? Thank you in advance.
The data look like this
Workbook A
Column A
1
2
3
4
5
Workbook B
Column A Column B
1 Name ID
1 Name ID
2 Name ID
2 Name ID
2 Name ID
2 Name ID
A possible solution is to use a user-defined formula, that when used as array formula, will return a comma-separated list of bill sponsors for each bill id. I posted the code for the UDF previously here. Once you have entered the code in a VBA module, enter the following formula in B2 in Workbook A:
=CCARRAY(IF(A2=[Workbook_B]Sheet_Name!$A$2:$A$2000,[Book2]Sheet_Name!$B$2:$B$2000),", ")
Press Ctrl+Shift+Enter to enter the formula as an array formula. Then fill-down for all Bill IDs.
Just to be clear, you'll need to insert the appropriate file and sheet names and adjust the number of rows to match your data. Also, since array formulas can be kind of computationally clunky, you'll probably want to copy column B and paste special 'Values only' back to column B.
Untested...
Sub Tester()
Dim Bills As Excel.Worksheet
Dim Sponsors As Excel.Worksheet
Dim c As Range, f As Range
Set Bills = Workbooks("WorkbookA").Sheets("Bills")
Set Sponsors = Workbooks("WorkbookB").Sheets("Sponsors")
Set c = Sponsors.Range("A2")
Do While c.Value <> ""
Set f = Bills.Range("A:A").Find(c.Value, , xlValues, xlWhole)
If Not f Is Nothing Then
Bills.Cells(f.Row, Bills.Columns.Count).End(xlToLeft).Offset(0, 1).Value = c.Offset(0, 1).Value
Else
c.Font.Color = vbRed
End If
Set c = c.Offset(1, 0)
Loop
End Sub
Here's a macro that will do the trick.
It does the work in memory variant arrays to provide resonable speed. Looping over the cells/rows would produce simpler code, but would run much slower.
It requires (and tests for) that all the BillIDs are present in the sponsor list
Also, it uses , to seperate the sponsors list, so , must not be in any of the sponsor names. If it is choose a different character
.
Sub GetSponsors()
Dim rngSponsors As Range, rngBills As Range
Dim vSrc As Variant
Dim vDst() As Variant
Dim i As Long, j As Long
' Assumes data starts at cell A2 and extends down with no empty cells
Set rngSponsors = Sheets("Sponsors").[A2]
Set rngSponsors = Range(rngSponsors, rngSponsors.End(xlDown))
' Count unique values in column A
j = Application.Evaluate("SUM(IF(FREQUENCY(" _
& rngSponsors.Address & "," & rngSponsors.Address & ")>0,1))")
ReDim vDst(1 To j, 1 To 2)
j = 1
' Get original data into an array
vSrc = rngSponsors.Resize(, 2)
' Create new array, one row for each unique value in column A
vDst(1, 1) = vSrc(1, 1)
vDst(1, 2) = "'" & vSrc(1, 2)
For i = 2 To UBound(vSrc, 1)
If vSrc(i - 1, 1) = vSrc(i, 1) Then
vDst(j, 2) = vDst(j, 2) & "," & vSrc(i, 2)
Else
j = j + 1
vDst(j, 1) = vSrc(i, 1)
vDst(j, 2) = "'" & vSrc(i, 2)
End If
Next
Set rngBills = Sheets("Bills").[A2]
Set rngBills = Range(rngBills, rngBills.End(xlDown))
' check if either list has missing Bill numbers
If UBound(vDst, 1) = rngBills.Rows.Count Then
' Put new data in sheet
rngBills.Resize(, 2) = vDst
rngBills.Columns(2).TextToColumns , _
Destination:=rngBills.Cells(1, 2), _
DataType:=xlDelimited, _
TextQualifier:=xlDoubleQuote, _
ConsecutiveDelimiter:=False, _
Tab:=False, _
Semicolon:=False, _
Comma:=True, _
Space:=False, _
Other:=False
ElseIf UBound(vDst, 1) < rngBills.Rows.Count Then
MsgBox "Missing Bills in Sponsors list"
Else
MsgBox "Missing Bills in Bills list"
End If
End Sub

Resources