I have a bunch of entries in a spreadsheet which I want to split into two different columns.
The data looks something like this:
102483 STEIN LOKK B4-702
102482 STEIN LOKK BF-701
102413 RINGFUGEKULL 352X353X214 POS 2 Å1
102412 RINGFUGEKULL 352X353X135 POS 1 ÅI
102388 STEIN ISOLER MOSCONI MSB-475 500x250x 76
102387 STEIN ISOLER MOSCONI MSB-475 500x250x152
102384 OVNSFUNDAMENT CRADLE
102383 STEIN PLATE HA-040 KVAL,HSU95
102382 STEIN PLATE HA-039 KVAL,HSU95
102376 OLJE SYNT. MITRA 220
102341 KULL BUNN ÅI/ÅIIC D 3365 x 550 x 490
102291 OLJE 10W-40 HAVOLINE FORMULA 3 DIESEL
102241 FETT MINERALSK PATRON STARPLEX EP 2
102231 OLJE FYRINGSOLJE NR.1 (F)
102211 CALDE SRRIX 14
102141 STEIN ISOLER AAM HIPOR 230X114X 76
102103 STAMPEMASSE ILDFAST AL-85-F
102102 STEIN BORGESTAD INSULATING FIREBRICKS
102101 STAMPEMASSE TYPE T-JUSTERT ELKEM
101964 PAKNING LEX THERMOSEAL PGF-1 LEX Ø12mm
101939 BOKS KOMPENSASJON F/OVN 4 OG 4B 1170
The delimiter is the bunch of spaces between the product number and name.
Trying to use Excel's text to columns function, there doesn't seem to be a way to specify more than one character as a delimiter, and if I only use one space it creates issues with splitting up the product name as well.
I wrote a small macro to do it for me (see below), but I feel like I may be making things overly complicated. Is there some simpler way to do this? Are there any obvious ways my approach can fail? I am not overly familiar with regexes, so I'm not sure if the pattern I've chosen is the best...
Sub split_column()
Dim ws As Worksheet
Dim regexp As Object
Dim reMatches As Object
Dim c As Range
Call deaktiver
Set regexp = CreateObject("VBScript.RegExp")
Set ws = År_2017
With regexp
.Global = False
.MultiLine = False
.IgnoreCase = False
.Pattern = "^(\d+)\s{2,}(.+)$"
End With
For Each c In ws.Range("A2:A" & ws.Range("A2").End(xlDown).Row)
Set reMatches = regexp.Execute(c.Value2)
If reMatches.Count > 0 Then
c = Trim(reMatches(0).SubMatches(0))
c.Offset(0, 1) = Trim(reMatches(0).SubMatches(1))
End If
Next c
Call reaktiver
End Sub
Private Sub deaktiver()
Application.EnableEvents = False
Application.ScreenUpdating = False
Application.DisplayStatusBar = False
Application.Calculation = xlCalculationManual
End Sub
Private Sub reaktiver()
Application.EnableEvents = True
Application.ScreenUpdating = True
Application.DisplayStatusBar = True
Application.Calculation = xlCalculationAutomatic
End Sub
Your data has a nice fixed format. Rather than TextToColumns or VBA, in cell B1 enter:
=LEFT(A1,6)
and in C1 use:
=MID(A1,20,99)
EDIT#1:
For non-regular data, use:
=LEFT(A1,FIND(" ",A1)-1)
=MID(TRIM(A1),FIND(" ",TRIM(A1))+1,99)
for B1 and C1 respectively.
EDIT#2:
Siddharth has a good point. It is better to use:
=MID(TRIM(A1),FIND(" ",TRIM(A1))+1,LEN(A1))
rather than:
=MID(TRIM(A1),FIND(" ",TRIM(A1))+1,99)
Yes, there is much simpler... and faster.
Dim input_, output_, i&
input_ = ws.Range("A2:A" & ws.Range("A2").End(xlDown).Row).Value2
ReDim output_(LBound(input_) To UBound(input_), 1 To 2)
For i = LBound(input_) To UBound(input_)
output_(i, 1) = Split(input_(i, 1), " ")(0)
output_(i, 2) = Split(input_(i, 1), " ")(1)
Next
ws.Range("A2:B" & ws.Range("A2").End(xlDown).Row) = output_
Related
Excel workbook consist of 10,000 rows and 25 columns and take 15 mins to complete this process. i need to reduce the runtime to complete this process into less than 1 min. kindly help me out from this situtaion.
For Each cl In rng.SpecialCells(2)
For i = Len(cl.Value) To 1 Step -1
If cl.Characters(i, 1).Font.Strikethrough Then
cl.Characters(i, 1).Delete
End If
Next i
Next cl
Very fast approach via xlRangeValueXMLSpreadsheet Value
Using the relatively unknown xlRangeValueXMLSpreadsheet Value, also referred to as ►.Value(11) solves the question by a very simple string replacement (though the xml string handling can reveal to be very complicated under special conditions).
This approach (quickly tested for 10000 rows) seems to be up to 90 times faster as Tim's valid solution refining the original code, but lasting 14 minutes :-)
Sub RemoveStrThr(rng As Range, Optional colOffset As Long = 1)
'a) Get range data as xml spreadsheet value
Dim xmls As String: xmls = rng.Value(xlRangeValueXMLSpreadsheet) ' //alternatively: xmls = rng.Value(11)
'b) find start position of body
Dim pos As Long: pos = InStr(xmls, "<Worksheet ")
'c) define xml spreadsheet parts and remove <S>-node sections in body
Dim head As String: head = Left(xmls, pos - 1)
Dim body As String: body = Mid(xmls, pos)
'remove strike throughs
Dim results: results = Split(Replace(body, "</S>", "^<S>"), "<S>")
results = Filter(results, "^", False) ' negative filtering of special char "^"
body = Join(results, "")
'd) write cleaned range back
rng.Offset(0, colOffset).Value(11) = head & body
End Sub
Example call
Sub TestRemove()
Application.ScreenUpdating = False
Dim t As Double
t = Timer
RemoveStrThr Sheet1.Range("A2:Z10000"), 27 ' << change to your needs
Debug.Print "done", Format(Timer - t, "0.00 secs")
Application.ScreenUpdating = True
End Sub
Help reference links
Excel XlRangeValueDataType enumeration
Excel Range Value
Addendum (due to #Tim' valuable comment)
Note that if the whole cell content should be struck out then this will not remove the struck-out content: there are no <S> or </S> tags in the element, since the strikethrough is applied via a Style rule (via the xml spreadsheet value head).
To meet this eventuality
"...you could add a second step using something like Application.FindFormat.Font.Strikethrough = True: rng.Replace What:="*", Replacement:="", LookAt:=xlPart, SearchOrder:=xlByRows, SearchFormat:=True: Application.FindFormat.Clear to take care of those cells."
Any use of the Characters collection tends to be kind of slow, so the best you can do (beyond turning off screenupdating) is get some minor improvements by (eg) ignoring cells with no strikethrough, checking for cases where all content is struck through, and batching your calls to Delete.
Sub tester()
Dim t
Range("C1:C3").Copy Range("A1:A999") 'creating some dummy cell values (no/mixed/all ST)
Application.ScreenUpdating = False
t = Timer
RemoveStrikeThrough Range("A1:A999")
Debug.Print "done", Timer - t
End Sub
Sub RemoveStrikeThrough(rng As Range)
Dim cl As Range, hasST, i As Long, pos As Long, st As Boolean
For Each cl In rng.Cells
'only process cells which have any strikethrough style applied
' hasST will be False (no ST), True (all ST) or Null (mixed ST)
hasST = cl.Font.Strikethrough
If TypeName(hasST) = "Boolean" Then
If hasST Then
cl.ClearContents 'all text is struck out, so clear the cell
Else
'Debug.Print "No strikethrough", cl.Address
End If
Else
'mixed - do char by char
For i = Len(cl.Value) To 1 Step -1
If cl.Characters(i, 1).Font.Strikethrough Then
If Not st Then 'new run?
st = True
pos = i
End If
Else
If st Then 'previous run?
cl.Characters(i + 1, pos - i).Delete
st = False
End If
End If
Next i
'remove last strikethough if any
If st Then cl.Characters(1, pos).Delete
st = False 'reset this
End If
Next cl
End Sub
I am having several columns with text. Furthermore, I have a column, which is called Replacement text. This columns contains strings that have markers in it, like [1], [2], etc.
I would like to replace the markers with the text that is in the marker`s row.
For example, Here you can find [5] becomes Here you can find b, because [5] is the markers column and in the row of the string b is the value for the marker.
I was thinking of creating a large if-else construct and substitute the text, which is extremely error-prone.
However, I kindly ask you if there is an easier solution?
I appreciate your input!
This answer is highly plagiarised from Thomas Inzina's first version of his answer, but a very simple way of performing the replacement would be:
Sub ReplaceText()
Dim r As Long
Dim c As Long
With ActiveSheet
For r = 3 To .Range("K" & .Rows.Count).End(xlUp).Row
For c = 1 To 10
.Cells(r, "K").Replace "[" & c & "]", .Cells(r, c).Value
Next
Next
End With
End Sub
The above code will attempt to do a substitution using all ten columns.
As Thomas has noted, the following code will only do the substitution if a substitution is necessary and therefore could be an order of magnitude faster, so is undoubtedly a better solution:
Sub ReplaceText()
Dim r As Long
Dim c As Long
With ActiveSheet
For r = 3 To .Range("K" & .Rows.Count).End(xlUp).Row
For c = 1 To 10
If Instr(.Cells(r, "K").Value, "[" & c & "]") > 0 Then
.Cells(r, "K").Replace "[" & c & "]", .Cells(r, c).Value
End If
Next
Next
End With
End Sub
(Many thanks to Thomas for his effort in performing speed tests on the two different methods.)
ReplaceBrackets1: uses RegEx to extract the column number. It takes 15.03 Seconds to processed 100K records.
Sub ReplaceBrackets1()
'http://analystcave.com/excel-regex-tutorial/
Dim c As Range
Dim Match As Object, Matches As Object, regex As Object
Set regex = CreateObject("VBScript.RegExp")
With regex
.Global = True
.Pattern = "\[(.*?)\]"
End With
For Each c In Range("K3", Range("K" & Rows.Count).End(xlUp))
If regex.Test(c.Text) Then
Set Matches = regex.Execute(c.Text)
For Each Match In Matches
c.Replace Match, c.EntireRow.Columns(CInt(Match.SubMatches(0)))
Next Match
End If
Next
End Sub
ReplaceBrackets2: loades the data into arrays, uses RegEx to extract the column number and only writes to the worksheet 1 time. It takes 1.27 seconds to process 100K records.
Sub ReplaceBrackets2()
'http://analystcave.com/excel-regex-tutorial/
Dim x As Long, column As Long
Dim arData, values
Dim Match As Object, Matches As Object, regex As Object
Set regex = CreateObject("VBScript.RegExp")
With regex
.Global = True
.Pattern = "\[(.*?)\]"
End With
values = Range("K3", Range("K" & Rows.Count).End(xlUp))
arData = Range("A3", "L" & UBound(values, 1) + 2)
For x = 1 To UBound(values, 1)
If regex.Test(values(x, 1)) Then
Set Matches = regex.Execute(values(x, 1))
For Each Match In Matches
column = Match.SubMatches(0)
values(x, 1) = arData(x, column)
Next Match
End If
Next
Range("K3", Range("K" & Rows.Count).End(xlUp)) = values
End Sub
After converting ReplaceBrackets1 into a UDF (getReplacedText) I was amazed to find that it only took 2.53 seconds to fill the formula in for a 100K records. I'm not sure way this would be faster that the original. But having that many formulas really slows down the spreadsheet.
getReplacedText: Uses a Static RegEx to parse the data.
Function getReplacedText(ReplacementText As String, Source As Range)
'http://analystcave.com/excel-regex-tutorial/
Dim Match As Object, Matches As Object
Static regex As Object
If regex Is Nothing Then
Set regex = CreateObject("VBScript.RegExp")
With regex
.Global = True
.Pattern = "\[(.*?)\]"
End With
End If
If regex.Test(ReplacementText) Then
Set Matches = regex.Execute(ReplacementText)
For Each Match In Matches
ReplacementText = Replace(ReplacementText, Match, Source.Columns(CInt(Match.SubMatches(0))))
Next Match
End If
getReplacedText = ReplacementText
End Function
I had a piece of code commissioned earlier this week (cheaper to have an expert write it than for me to spend a week trying to!). However, when putting it use I've hit a bit of a snag.
The macro looks at a name on one excel worksheet, matches it to a list of names and associated ID numbers on a different worksheet, then inserts the ID on the first worksheet. This was all fine until I started using it on real data.
Here's some sample data (all of this information is in one cell...):
WARHOL*, Andy
PETO, John F
D3 GRECO, Emilio -20th C
HASELTINE, William Stanley
D3 DALI, Salvador
D3 SOSNO, Sacha
WEGMAN**, WILLIAM
One asterisk means it's a print, two a photograph, D3 a sculpture, and nothing a painting.
When I run the code with this data, it sees * as a wildcard, and so will always insert the ID of the first variation of the artist in the sheet. What I need is a way for the macro to not read it as a wildcard.
I did some research, and found that inserting ~ before * negates the wildcard properties. How would I make my code do this? I've discovered the main issue of having code written by someone else... You might not understand it!
Here is the code:
Public Sub match_data()
'ctrl+r
On Error GoTo errh
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
Dim r1, r2, i, exc As Long
Dim fp As Range
Sheets("Data").Activate
r1 = Cells(Rows.Count, "B").End(xlUp).Row
r2 = Sheets("List").Cells(Sheets("List").Rows.Count, "B").End(xlUp).Row
'MsgBox r1 & r2
exc = 0
For i = 2 To r1
If Range("B" & i).Value <> "" Then
With Sheets("List").Range("B2:B" & r2)
Set fp = .Find(Range("B" & i).Value, LookIn:=xlValues, lookat:=xlWhole)
If Not fp Is Nothing Then
Range("B" & i).Interior.Color = xlNone
Range("A" & i).Value = Sheets("List").Range("A" & fp.Row).Value
Else
Range("B" & i).Interior.Color = xlNone
Range("B" & i).Interior.Color = vbYellow
exc = exc + 1
End If
End With
End If
Next i
MsgBox "There are " & exc & " exceptions."
errh:
If Err.Number > 0 Then
MsgBox Err.Description
End If
Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic
End Sub
Oh also, I would need to do this for the list of names and ID's wouldn't I? If so, that only needs doing once, so if you could give me a hint about that too, I'd be so grateful!
Thanks!
PS I know the system we are using at the moment absolutely sucks (definitely not 3rd form normalised!), but we are fast running out of time and money, and need to get our product up and running ASAP!
EDIT: To clarify, here is a pic of the spreadsheets I'm working with... Obviously in cells A14 and A15 I wanted the ID numbers 11 & 12 respectively
Here is one way to tell the stars from the planets:
Sub StaryNight()
Dim s As String, OneStar As String, TwoStar As String, ThreeStar As String
OneStar = "*"
TwoStar = "**"
ThreeStar = "***"
t = Range("A1").Text
ary = Split(t, ",")
s = ary(0)
If Right(s, 3) = ThreeStar Then
MsgBox "scupture"
Exit Sub
ElseIf Right(s, 2) = TwoStar Then
MsgBox "photograph"
Exit Sub
ElseIf Right(s, 1) = OneStar Then
MsgBox "print"
Exit Sub
End If
MsgBox "Painting"
End Sub
Okay, I have solved the problem! I had a play around with changing the variables in the Find and Replace box.
If I put ~* in both the find AND replace boxes, and uncheck Match entire cell contents, I can replace all of the * with ~* (really don't understand that but oh well!)
So I do this on the Data worksheet, but NOT on the List worksheet, run the macro as normal and the problem is solved!
I'm trying to get Excel to list every variation of a certain value.
If A1= ABC1904
& A2= ABC1910
I'd like column B to list.
ABC1904
ABC1905
ABC1906
ABC1907
ABC1908
ABC1909
ABC1910
This is the best I could do w/ a purely formula solution:
=LEFT(A$1,3) & MID(A$1,MIN(FIND({0,1,2,3,4,5,6,7,8,9},A$1&"0123456789")),LEN(A$1))+MIN(RIGHT($A$2,1)+0,ROWS(A$1:A1)-1)
It leaves a bit to be desired because you'll have a bunch of duplicates if you drag the formula too far down.
If you're not opposed to a VBA solution, you could give this a go:
Sub VariationOfValue()
Dim startNumber As Long, _
endNumber As Long, _
counter As Long
Dim leadingString As String
With Application
.ScreenUpdating = False
.Calculation = xlCalculationManual
End With
counter = 1
With Sheet1
leadingString = Left(Sheet1.Range("A1").Value, 3)
startNumber = Evaluate("=MID(A$1,MIN(FIND({0,1,2,3,4,5,6,7,8,9},A$1&""0123456789"")),LEN(A$1))") + 0
endNumber = Evaluate("=MID(A$2,MIN(FIND({0,1,2,3,4,5,6,7,8,9},A$2&""0123456789"")),LEN(A$2))") + 0
Do While startNumber <= endNumber
.Range("B" & counter).Value = leadingString & startNumber
counter = counter + 1
startNumber = startNumber + 1
Loop
End With
With Application
.ScreenUpdating = True
.Calculation = xlCalculationAutomatic
End With
End Sub
Does basically the same thing, but only until the last number is reached. Either way, hope one or both of these helps out a bit.
If you are willing to have a few columns to achieve this then it is really not very difficult.
Col A is your starting data. [ABC1904 and ABC1910}
Col B contains equations =VALUE(RIGHT(A1,LEN(A1)-3)) which gives the numeric part of the strings. [1904 and 1910]
Col C contains equations =IF(OR(C2=B$2,C2=""),"",C2+1) - except C1 which is just =b1
- this gives the series of numbers you want {1904 to 1910]
Col D contains equations =IF(C1="","",LEFT(A$1,3)&C1)
- this puts the text back on the front of the numbers [ABC1904 to ABC1910]
..this would be clearer with a screenshot but I apparently do not have enough reputation to post one
I have a set of VBA codes which work really perfectly with around of 20 000 x 16 cells.
However, I need to use the codes with max 80 000 x 16 cells.
I have identified two types of codes which run really slow:
c = 2 'this is the first row where your data will output
d = 2 'this is the first row where you want to check for data
Application.ScreenUpdating = False
Do Until Range("A" & c) = "" 'This will loop until column U is empty, set the column to whatever you want
'but it cannot have blanks in it, or it will stop looping. Choose a column that is
'always going to have data in it.
ws1.Range("U" & d).FormulaR1C1 = "=RC[-20] & RIGHT(""0000"" & RC[-14], 6)"
c = c + 1 'Advances a and b when there is a matching case
d = d + 1
Loop
Application.ScreenUpdating = True
End Sub
Sub OpenValue()
Dim l As Integer
Dim k As Integer
Dim m As Integer
m = Sheets("Input").Range("AC:AC").End(xlDown).Row
For l = 2 To m
If Range("AC" & l) = "Delievered" Then
Range("AD" & l) = 0
ElseIf Range("AC" & l) = "Cancelled" Then
Range("AD" & l) = 0
Else
Range("AD" & l) = Val(Range("Z" & l)) * Val(Range("J" & l))
End If
Next
End Sub
What can I do to poptimize them ....
The link provided by #GSerg is an awesome way to cut the running time of your script down. I found myself using:
Application.ScreenUpdating set to False
Application.Calculation set to xlCalculationManual
Application.EnableEvents set to False
Application.DisplayAlerts set to False
so often that I combined them into a single public subroutine. #Garys-Student provided the inspiration:
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'INPUT : True or False (i.e. fast or slow)
'DESCRIPTION : this sub turns off screen updating and alarms then
' sets calculations to manual
'
Public Sub GoFast(OnOrOff As Boolean)
Dim CalcMode As XlCalculation
CalcMode = Application.Calculation
With Application
.ScreenUpdating = Not OnOrOff
.EnableEvents = Not OnOrOff
.DisplayAlerts = Not OnOrOff
If OnOrOff Then
.Calculation = xlCalculationManual
Else
.Calculation = CalcMode
End If
End With
End Sub
In practice, you can now add the one-liner:
Call GoFast(True)
at the beginning of your script as part of the setup, then add:
Call GoFast(False)
at the end of your script as part of the teardown. Modify as you see fit!
The Do Until can be replaced with a one liner:
ws1.Range("A2", ws1.Range("A2").End(xlDown)).Offset(0,20).FormulaR1C1 = _
"=RC[-20] & RIGHT(""0000"" & RC[-14], 6)"
Note that this will fail if A3 is empty. If you have headers in row 1 you can change the second A2 to A1.
For the other Sub I'm not sure if you are doing something special with Val but if not you could change it to something similar:
Sub OpenValue()
Dim r As Range
Set r = Sheets("Input").Range("AD2:AD" & Sheets("Input").Range("AC1").End(xlDown).Row)
With r
.FormulaR1C1 = "=IF(OR(RC[-1]=""Delivered"",RC[-1]=""Cancelled""),0,RC10*RC26"
'If you want these as values uncomment the following lines
'.Calculate
'.Copy
'.PasteSpecial xlPasteValues
End With
End Sub
Sprinkle Application stuff around if needed (Calculation, ScreenUpdating, DisplayAlerts, EnableEvents).
Why is this faster:
To put it simply, VBA and Excel have to open a 'channel' to communicate between each other and this costs some time. So looping through a Range and adding formulas one-by-one is much slower for large ranges than doing it all at once since you'll only open the 'channel' once.