I have a data set with names and addresses in an Excel file in the following format.
Name1
134/47/1,
adrs1, adr2, country
Name2
adrs1, adrs2, country
Name3
107/c,
adrs3, adrs3, country
etc…
I want to split this data into multiple rows in the following format:
Name1
134/47/1,
adrs1,
adrs2,
country
Name2
No 134/63,
adrs1,
adrs2,
country
etc…
I tried the following, but it works for only a single cell.
Sub tst()
Dim X As Variant
X = Split(Range("A1").Value, ",")
Range("A1").Resize(UBound(X) - LBound(X) + 1).Value = Application.Transpose(X)
End Sub
The following macro might help you. You would have to select the very last cell in your table containing a multipart address. When you start the macro it will then work its way up to the top and insert address lines where needed (only in the current column) and then exit.
Option Base 1
Sub trnsfrm()
Dim i%, n%, ret(3, 1)
Set r = Selection
Do
a = Split(r, ",")
ret(1, 1) = Trim(a(0))
ret(2, 1) = Trim(a(1))
ret(3, 1) = Trim(a(2))
r.Range([a2], [a3]).Insert Shift:=xlDown
r.Range([a1], [a3]) = ret
If r.Row <= 4 Then Exit Do
Set r = r.Offset(-4)
Loop
End Sub
If you want to insert rows across the whole table, replace line 10 of the macro,
r.Range([a2], [a3]).Insert Shift:=xlDown
by
r.Range([a2], [a3]).EntireRow.Insert Shift:=xlDown
Assumptions / Warning
Since the macro changes your current table and 'undo' does not work after a macro has run, you should definitely save everything before you try it.
The macro assumes that each address block consists of exactly 4 lines. If an address has fewer or more lines, the macro will get out of sync and will very likely output garbage or halt.
I'm not sure whether the trailing commas on single values in your sample data are a typo or accurately represent your data, but they should be accounted for. A rogue trailing comma creates an extra (empty) element in the variant array returned by Split, which throws off any dimensions calculated from the UBound function.
Sub split_from_below_space()
Dim rw As Long, v As Long, vVALs As Variant
With Worksheets("Sheet1") 'set this worksheet reference properly!
For rw = .Cells(Rows.Count, 1).End(xlUp).Row To 1 Step -1
.Cells(rw, 1) = Trim(.Cells(rw, 1).Value2)
If CBool(InStr(1, .Cells(rw, 1).Value2, Chr(44) & Chr(32))) Then
vVALs = Split(.Cells(rw, 1).Value2, Chr(44) & Chr(32))
.Cells(rw + 1, 1).Resize(UBound(vVALs), 1).EntireRow.Insert
.Cells(rw, 1).Resize(UBound(vVALs) + 1, 1) = _
Application.Transpose(vVALs)
For v = UBound(vVALs) - 1 To LBound(vVALs) Step -1
.Cells(rw, 1).Offset(v, 0) = _
Trim(.Cells(rw, 1).Offset(v, 0).Value2) & Chr(44)
Next v
End If
Next rw
End With
End Sub
You will need to insert rows to accommodate the data and that method is almost always (as in this case) better performed by working from the bottom to the top.
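The trailing-comma issue mentioned above can be illustrated with a small helper. This is only a sketch: the function name SplitAddress is hypothetical and not part of either answer, and unlike the desired output in the question it drops the commas instead of keeping them at the end of each line.

```vba
Option Explicit

' Hypothetical helper (an assumption, not from the answers above):
' splits one address cell on commas and skips the empty element
' that a trailing comma would otherwise produce.
Function SplitAddress(ByVal addressText As String) As Collection
    Dim parts As Variant
    Dim piece As String
    Dim i As Long
    Dim result As New Collection

    parts = Split(addressText, ",")
    For i = LBound(parts) To UBound(parts)
        piece = Trim$(parts(i))
        If Len(piece) > 0 Then result.Add piece   ' ignore empty pieces
    Next i
    Set SplitAddress = result
End Function
```

With this, "134/47/1," yields one element rather than two, so the number of rows to insert can be taken from the collection's Count.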
My problem is that previous users who kept track of inventory labeled items with an ID range, for example ABC1234 - ABC1244. When we keep track of our items, we need each and every individual item to be properly accounted for, as each item has a unique ID that we track.
For the past half a year we have been slowly filling in everything by hand. Since a lot of the other information in each row is repeated, I was wondering if there is a way to write a VBA macro to expand and insert these rows of data.
So from this
ID
Description
ABC1234 - ABC1237
Screw type A
to this
ID
Description
ABC1234
Screw type A
ABC1235
Screw type A
ABC1236
Screw type A
ABC1237
Screw type A
I have tried using the macro recorder, but the result is not dynamic, which is not what I want: the database can change over time with the influx of new items, so I hope there is a way to complete this process dynamically. If anyone knows a solution, please help. I have been banging my head against a wall for a while now :'D
Not sure if this is what you are looking for.
I am assuming the ABC prefix is always the same and the only thing that changes is the last 4 digits.
Sub Formatting()
Dim xlwrks As Object
Dim xlrng As Object
Dim xlcell As Object
Dim lowerlimit As Integer, upperlimit As Integer
Set xlwrks = ThisWorkbook.Sheets("Sheet1")
Set xlrng = xlwrks.Range("A2", xlwrks.Range("A" & Rows.Count).End(xlUp).Address) 'from A2 to last cell in column A
For Each xlcell In xlrng 'iterate each cell in xlrng
'split the cell into an array of strings using the delimiter " - ",
'e.g. "ABC1234", "ABC1237" for A2 and "ABC1238", "ABC1242" for A3
'then keep only the last 4 digits by using Right
lowerlimit = Right(Split(xlcell.Value2, " - ")(0), 4)
upperlimit = Right(Split(xlcell.Value2, " - ")(1), 4)
Do Until lowerlimit = upperlimit + 1
xlwrks.Range("D" & Rows.Count).End(xlUp).Offset(1, 0).Value2 = "ABC" & lowerlimit
xlwrks.Range("D" & Rows.Count).End(xlUp).Offset(0, 1).Value2 = xlcell.Offset(0, 1).Value2
lowerlimit = lowerlimit + 1
Loop
Next xlcell
End Sub
Adding on to the requirement: as mentioned, you need to monitor the trend, so I wrote something that checks for the trend instead of eyeballing it manually. Note that the run time will be longer, because it loops through each cell to build the array and then loops through each character of each array element. Hope this helps, happy coding!
Sub FormattingByPrefix()
    Dim xlwrks As Object
    Dim xlrng As Object
    Dim xlcell As Object
    Dim lowerlimit As Long, upperlimit As Long
    Dim charpos As Integer, characters As String, ID As String
    Set xlwrks = ThisWorkbook.Sheets("Sheet1")
    Set xlrng = xlwrks.Range("A2", xlwrks.Range("A" & Rows.Count).End(xlUp).Address) 'from A2 to last cell in column A
    For Each xlcell In xlrng 'iterate each cell in xlrng
        'split the cell into an array of strings using the delimiter " - ",
        'e.g. "ABC1234", "ABC1237" for A2 and "ABC1238", "ABC1242" for A3
        'first collect the leading non-numeric characters as the ID prefix
        characters = Split(xlcell.Value2, " - ")(0)
        For charpos = 1 To Len(characters)
            If Not IsNumeric(Mid(characters, charpos, 1)) Then
                ID = ID & Mid(characters, charpos, 1)
            Else
                Exit For
            End If
        Next charpos
        'collect the digits of the first ID as the lower limit
        For charpos = 1 To Len(characters)
            If IsNumeric(Mid(characters, charpos, 1)) Then
                lowerlimit = CStr(lowerlimit) & CStr(Mid(characters, charpos, 1))
            End If
        Next charpos
        'collect the digits of the second ID as the upper limit
        characters = Split(xlcell.Value2, " - ")(1)
        For charpos = 1 To Len(characters)
            If IsNumeric(Mid(characters, charpos, 1)) Then
                upperlimit = CStr(upperlimit) & CStr(Mid(characters, charpos, 1))
            End If
        Next charpos
        'write one row per ID to columns D and E
        Do Until lowerlimit = upperlimit + 1
            xlwrks.Range("D" & Rows.Count).End(xlUp).Offset(1, 0).Value2 = ID & lowerlimit
            xlwrks.Range("D" & Rows.Count).End(xlUp).Offset(0, 1).Value2 = xlcell.Offset(0, 1).Value2
            lowerlimit = lowerlimit + 1
        Loop
        'reset for the next cell
        lowerlimit = 0
        upperlimit = 0
        ID = ""
    Next xlcell
End Sub
Honestly, I would not do this with VBA inside the spreadsheet. I would write a separate piece of VB or VBScript that reads the existing spreadsheet and produces a new altered copy of it.
When it reads a line in the original spreadsheet with just "ABC1234", it just copies that line to the new spreadsheet. When it reads a line that contains "ABC1234 - ABC1237", it recognizes the pattern and figures out how many lines it needs to generate in the new spreadsheet. In this case, it will generate four lines: one line for ABC1234, one line for ABC1235, one line for ABC1236, and one line for ABC1237.
I think this approach will be easier to deal with than a VBA script inside the spreadsheet. You will run it once, check the new spreadsheet, then rename the old one for safe-keeping, and rename the new one to give it the original sheet's name.
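The "recognize the pattern and generate the lines" step described above could look roughly like the following sketch. It stays in VBA for consistency with the rest of the thread. The function name ExpandIdRange is hypothetical, and the code assumes that every ID is a non-numeric prefix followed by digits, that both ends of a range share the same prefix, and that " - " is the separator; none of this is confirmed by the question beyond the sample shown.

```vba
Option Explicit

' Hypothetical helper (an assumption, not the answerer's code):
' returns the individual IDs covered by one cell value.
' "ABC1234" is passed through; "ABC1234 - ABC1237" is expanded.
Function ExpandIdRange(ByVal cellText As String) As Collection
    Dim result As New Collection
    Dim parts() As String
    Dim firstId As String, lastId As String
    Dim prefixLen As Long, digitCount As Long, n As Long

    parts = Split(cellText, " - ")
    If UBound(parts) < 1 Then
        ' single ID (or empty text): copy it through unchanged
        If Len(Trim$(cellText)) > 0 Then result.Add Trim$(cellText)
        Set ExpandIdRange = result
        Exit Function
    End If

    firstId = Trim$(parts(0))
    lastId = Trim$(parts(1))

    ' count the characters before the first digit
    Do While prefixLen < Len(firstId)
        If Mid$(firstId, prefixLen + 1, 1) Like "#" Then Exit Do
        prefixLen = prefixLen + 1
    Loop
    digitCount = Len(firstId) - prefixLen

    ' assumes at least one digit follows the prefix in both IDs
    For n = CLng(Mid$(firstId, prefixLen + 1)) To CLng(Mid$(lastId, prefixLen + 1))
        ' pad with zeros so the numeric part keeps its original width
        result.Add Left$(firstId, prefixLen) & Format$(n, String$(digitCount, "0"))
    Next n
    Set ExpandIdRange = result
End Function
```

The caller would then write one output row per item in the returned collection, copying the remaining columns of the source row each time.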
Some background: Each month I build a pivot table that has approximately 30 business units (along the y axis); let's call them groups. Each group has a number of GL accounts that change month to month. For example, Group 14 might have 10 GL accounts one month and then only 3 the next. For each group, we need the sum of the totals for the GL accounts that start with PL203000 and PL211010. Previously we had to total these GL accounts for each group by hand. This has been solved with the code I have displayed below.
The code works perfectly when each group has more than one GL account (See pic 1)
The problem I am facing is when there is only one GL account, the code doesn't sum the correct amounts (see 2nd pic).
Digging into my code, you can see that it sums the incorrect sections because I use Rows.Count.End(xlUp) to establish the range. If there is only one GL account, it skips to the next section, thereby building an incorrect formula.
Perhaps my code needs to be completely revamped in order to account for groups where there is only one GL account to sum? If so, what sort of If statement can I write so that it ignores groups that have only one GL account?
If not, then is the solution to have VBA count the range and, if it is less than 3, ignore the group and move on to the next?
'this section spits out the values needed to sum
For i = nRowMax To 4 Step -1
If Left(Cells(i, 1), 8) = "PL211010" Or Left(Cells(i, 1), 8) = "PL203000" Then
Cells(i, 4).Copy
Cells(i, 5).PasteSpecial xlPasteValues
Range(Cells(i, 1), Cells(i, 4)).Select
With Selection.Interior
.Pattern = xlSolid
.PatternColorIndex = xlAutomatic
.Color = 65535
.TintAndShade = 0
.PatternTintAndShade = 0
End With
End If
Next i
Application.CutCopyMode = False
'this section uses the values the first section specified to write the sum formula
'i believe the macro uses this section of code to write the first formula and the next section of code writes the formulas for the rest of the groups
Dim firstRow As Variant
Dim finalRow As Variant
finalRow = Range("E" & Rows.Count).End(xlUp).Row
firstRow = Cells(finalRow, 5).End(xlUp).Row
If IsNumeric(Cells(finalRow + 1, 5)) Then
Cells(firstRow, 6).Formula = "=SUM(D" & firstRow & ":D" & finalRow & ")"
End If
'this section goes through the whole sheet to sum each group
For y = firstRow To 5 Step -1
finalRow = Cells(y, 5).End(xlUp).Row
firstRow = Cells(finalRow, 5).End(xlUp).Row
If firstRow < 5 Then firstRow = 5
If IsNumeric(Cells(finalRow + 1, 5)) Then
Cells(firstRow, 6).Formula = "=SUM(D" & firstRow & ":D" & finalRow & ")"
End If
y = firstRow
'If firstRow = 5 Then Exit Sub
Next y
If your dataset is an accurate enough example, you can scan through your business units and pick out only what you need. I have some example code here that will build up your sum range by using the Union function and applying that to the SUM formula when the entire business unit has been scanned. Of course, this is only an example that fits the data shown. You'll have to expand it to fit situations that are not visible to me.
To simplify the logic, I've separated the code into a function that will start scanning rows for a business unit and will stop when it reaches the end of the business unit -- the test I'm using for detecting the start of the next BU is a line that does not start with "PL". This may or may not be correct for all your data.
Because this code is checking each line and accumulating the sum range using the Union, if you only have one cell, you'll still get a formula that says =SUM($D$30) but it works.
Option Explicit
Sub test()
Dim dataArea As Range
Set dataArea = ActiveSheet.Range("A1")
Do While Not IsEmpty(dataArea.Cells(1, 1))
Set dataArea = AddSums(dataArea)
Loop
End Sub
Private Function AddSums(ByRef businessUnitStart As Range) As Range
'--- loops through cells following the 'Start' range given,
' and accumulates the range of accounts to summarize
' RETURNS the start of the next business unit range
Dim accountRow As Range
Dim account As String
Set accountRow = businessUnitStart.Offset(1, 0)
Dim sumArea As Range
Do While Left$(accountRow.Cells(1, 1).Value2, 2) = "PL"
account = accountRow.Cells(1, 1).Value2
If (Left$(account, 8) = "PL211010") Or (Left$(account, 8) = "PL203000") Then
'--- add this account to the sum formula
If sumArea Is Nothing Then
Set sumArea = accountRow.Cells(1, 4)
Else
Set sumArea = Union(sumArea, accountRow.Cells(1, 4))
End If
End If
Set accountRow = accountRow.Offset(1, 0)
Loop
If Not sumArea Is Nothing Then
Dim accountSum As Range
Set accountSum = businessUnitStart.Offset(1, 6)
accountSum.Formula = "=SUM(" & sumArea.Address & ")"
End If
Set AddSums = accountRow
End Function
I have an Excel file and I want to compare the dates in columns A and D and delete the gap between them.
For example based on this picture
The time in column A starts at 14:56:23 and in column D at 14:56:18, so I want to delete all the data in column D up to 14:56:23 so that both columns start at the same time.
This problem will be repeated many times, so I want to develop a macro to do it.
Here is a small program that compares just the first two cells in columns A and D:
Sub Edit_Date_time()
Dim r As Range
Dim l As Range
Set r = Range("A2")
Set l = Range("D2")
If r.Value <> l.Value Then
Range("D2:E2").Select
Selection.Delete Shift:=xlUp
End If
End Sub
The problem is that the cells contain both a date and a time, so I cannot compare them as plain values. I also have to expand this code to cover the whole of columns A and D, not only the first two cells.
Your question has morphed many times, but I am going with the question "how to eliminate rows without matching timestamps" ...
Comparing timestamps can be tricky, even if they are correctly formatted. You would expect 2/17/2016 14:56:29 to be equal to 2/17/2016 14:56:29, but there may be a difference in milliseconds that you cannot see in the string or in the general format. Therefore, you should use a tolerance when determining <, >, or =.
Remember, with timestamps 1.0 = 1 day. So 1/10 of a second is (1/24/60/60/10).
I assume you want to delete pressure rows when those timestamps are earlier than the corresponding temperature timestamps, AND you want to delete temperature rows when those timestamps are earlier than the corresponding pressure timestamps.
This means, worst case, a loop will need to go through the entire data set twice.
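The tolerance idea described above can be isolated in a small comparison function. This is a minimal sketch: the names TENTH_SECOND and CompareStamps are hypothetical, and the one-tenth-of-a-second tolerance is simply the value suggested in the text.

```vba
Option Explicit

' One tenth of a second expressed in days (a date serial of 1.0 is one day).
Private Const TENTH_SECOND As Double = 1# / 24# / 60# / 60# / 10#

' Hypothetical helper (an assumption, not taken from the macro in this answer):
' returns 0 when the two timestamps are within the tolerance of each other,
' -1 when a is earlier than b, and 1 when a is later than b.
Function CompareStamps(ByVal a As Double, ByVal b As Double) As Long
    If Abs(a - b) < TENTH_SECOND Then
        CompareStamps = 0
    ElseIf a < b Then
        CompareStamps = -1
    Else
        CompareStamps = 1
    End If
End Function
```

Passing cell values such as Range("A2").Value2 and Range("D2").Value2 keeps the comparison on the underlying serial numbers rather than on formatted text.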
I tested code against this data ...
After processing, the yellow cells should align, the orange rows should be deleted. Here are the results I get ...
using this code ...
Sub ParseDateTime()
Dim TRange As Range, PRange As Range
Dim iLoop As Long, LoopEnd As Long
Dim theRow As Long, LastRow As Long
' set the range for the temperature data
LastRow = Sheets("Sheet1").Range("A" & Sheets("Sheet1").Rows.Count).End(xlUp).Row
Set TRange = Sheets("Sheet1").Range("A1:C" & LastRow)
LoopEnd = LastRow
' set the range for the pressure data
LastRow = Sheets("Sheet1").Range("D" & Sheets("Sheet1").Rows.Count).End(xlUp).Row
Set PRange = Sheets("Sheet1").Range("D1:E" & LastRow)
If LastRow > LoopEnd Then LoopEnd = LastRow
' loop through the range
theRow = 1
For iLoop = 2 To 2 * LoopEnd
theRow = theRow + 1
' stop searching when no more data
If TRange(theRow, 1) = "" And PRange(theRow, 1) = "" Then Exit For
' if out of temperature data, eliminate the rest of the pressure data
If TRange(theRow, 1) = "" Then
PRange.Rows(theRow).Delete Shift:=xlUp
theRow = theRow - 1
End If
' if out of pressure data, eliminate the rest of the temperature data
If PRange(theRow, 1) = "" Then
TRange.Rows(theRow).Delete Shift:=xlUp
theRow = theRow - 1
End If
' eliminate pressure rows where the time stamp is earlier than the temperature timestamp
If (TRange(theRow, 1).Value > PRange(theRow, 1).Value) And _
(Abs(TRange(theRow, 1).Value - PRange(theRow, 1).Value) >= (1# / 24# / 60# / 60# / 10#)) Then
PRange.Rows(theRow).Delete Shift:=xlUp
theRow = theRow - 1
Else
' eliminate temperature rows where the time stamp is earlier than the pressure timestamp
If (TRange(theRow, 1).Value < PRange(theRow, 1).Value) And _
(Abs(TRange(theRow, 1).Value - PRange(theRow, 1).Value) >= (1# / 24# / 60# / 60# / 10#)) Then
TRange.Rows(theRow).Delete Shift:=xlUp
theRow = theRow - 1
End If
End If
Next iLoop
End Sub
The redefinition(s) of this question makes it hard to deal with; particularly so as some (now removed) criteria would render some solutions to the current problem impractical.
I remember when your data came from multiple CSV files; some containing temperatures and some containing pressures. In fact there was so much data that it could conceivably 'spill' over to another worksheet. This fact alone renders individual worksheet value comparisons impractical. Even if it all fit on a single worksheet, comparing a million datetimes with a second set of a million datetimes and removing entries that do not fit both categories is going to be an arduous and time-consuming task.
Arduous and time-consuming tasks are best processed 'in-memory'. Repeatedly returning to the worksheet(s) to compare values is going to bog down processing and should be avoided unless absolutely necessary.
This seems like it should be an SQL question where two different sets of CSVs are loaded into two temporary but consolidated database tables and indexed on their respective datetimes. An INNER JOIN could then be performed to build a third table of matching records. Easy-peasy.
But this is an excel / vba question and should be answered in kind.
A VBA Scripting.Dictionary object is like an in-memory database table and comes with a unique primary 'index' called the key. It also has a single additional 'field' of the variant type which can receive any style of value or values that a variant can. Loading two dictionaries with the respective values (one for temperatures and another for the pressures) using the datetimes as the keys would seem to be the most efficient method of combining the two.
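Before the full program, the core idea (two dictionaries keyed on the datetime, combined like an INNER JOIN) can be shown on its own. This is only a sketch with made-up sample values; the names InnerJoin and InnerJoinDemo are hypothetical, and like the code that follows it relies on CreateObject("Scripting.Dictionary"), which is available on Windows.

```vba
Option Explicit

' Hypothetical helper (an assumption for illustration):
' returns a new dictionary containing only the keys found in both inputs;
' each item is Array(leftItem, rightItem).
Function InnerJoin(ByVal leftDict As Object, ByVal rightDict As Object) As Object
    Dim joined As Object, k As Variant

    Set joined = CreateObject("Scripting.Dictionary")
    For Each k In leftDict.Keys
        If rightDict.Exists(k) Then
            joined.Item(k) = Array(leftDict.Item(k), rightDict.Item(k))
        End If
    Next k
    Set InnerJoin = joined
End Function

Sub InnerJoinDemo()
    Dim temps As Object, press As Object, both As Object

    Set temps = CreateObject("Scripting.Dictionary")
    Set press = CreateObject("Scripting.Dictionary")

    ' made-up sample values, keyed on the raw date serial number
    temps.Item(CDbl(#2/17/2016 2:56:23 PM#)) = 21.5
    temps.Item(CDbl(#2/17/2016 2:56:24 PM#)) = 21.6
    press.Item(CDbl(#2/17/2016 2:56:24 PM#)) = 1.013

    Set both = InnerJoin(temps, press)
    Debug.Print both.Count   ' only the 14:56:24 entry exists in both
End Sub
```

Because the keys must match exactly, this kind of join treats timestamps that differ by even a fraction of a second as different records.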
Sample data
I started with several CSVs similar to the following.
Temperaturen-25.csv SPS-25.csv
Three temperature CSVs and three pressure CSVs totalled ~300K records (~150K each) with periods of intentionally missing datetimes from each.
Module2 (Code)
Option Explicit
'public constant dictating the maximum number of entries per worksheet (never set higher than Rows.Count-3)
Public Const iMAXROWS As Long = 50000
Sub main()
Dim fp As String, fn As String, tmp As Variant
Dim dt As Variant, tdic As Object, pdic As Object
Dim tpwb As Workbook, a As Long, d As Long, w As Long
'apptggl btggl:=false 'uncomment this when you have finished debugging
'create 2 dictionary objects to receive ALL of the data
Set tdic = CreateObject("Scripting.Dictionary")
Set pdic = CreateObject("Scripting.Dictionary")
tdic.CompareMode = vbBinaryCompare
pdic.CompareMode = vbBinaryCompare
'load the dictionaries using the overwrite method
fp = Environ("TMP") & Chr(92) & "TempPress"
fn = Dir(fp & Chr(92) & "*.csv", vbNormal)
Do While CBool(Len(fn))
Select Case True
Case LCase(fn) Like "*temperaturen*"
'debug.Print "found " & fn
loadTPDictionary CStr(fp & Chr(92) & fn), tdic, 3
Case LCase(fn) Like "*sps*"
'debug.Print "found " & fn
loadTPDictionary CStr(fp & Chr(92) & fn), pdic, 2
Case Else
'do nothing; not temperature or pressure
End Select
'debug.Print tdic.Count & ":" & pdic.Count
fn = Dir
Loop
'debug.Print tdic.Count
'debug.Print pdic.Count
'At this point you have two dictionary object; one for temps and one for pressures
'They have a unique indexed key on their datetime values
'Time to merge the two
'First load all matching pressures into the temperatures
For Each dt In tdic
If pdic.Exists(dt) Then
tdic.Item(dt) = Array(tdic.Item(dt)(0), tdic.Item(dt)(1), tdic.Item(dt)(2), _
pdic.Item(dt)(1), pdic.Item(dt)(0))
End If
Next dt
'Second, get rid of temps that had no matching pressure
For Each dt In tdic.Keys 'iterate a snapshot of the keys so that removing entries is safe
If UBound(tdic.Item(dt)) < 4 Then
tdic.Remove dt
End If
Next dt
'debug.Print tdic.Count
'debug.Print pdic.Count
'At this point the temp dictionary object contains a merged set of matching temps and pressures
'Time to put the values into one or more worksheets
'create a new target workbook and set up the first target worksheet
Set tpwb = Workbooks.Add
With tpwb
For w = 1 To Int(tdic.Count / iMAXROWS) + 1
a = 1: d = 1
'first load an array with the dictionary's values
ReDim tmp(1 To iMAXROWS, 1 To 5)
For Each dt In tdic
If d > (w * iMAXROWS) Then
Exit For
ElseIf d > ((w - 1) * iMAXROWS) Then
tmp(a, 1) = tdic.Item(dt)(0)
tmp(a, 2) = tdic.Item(dt)(1)
tmp(a, 3) = tdic.Item(dt)(2)
tmp(a, 4) = tdic.Item(dt)(3)
tmp(a, 5) = tdic.Item(dt)(4)
a = a + 1
End If
d = d + 1
Next dt
On Error GoTo bm_Need_Worksheet
With .Worksheets(w + 1) '<~~ ignore the original blank worksheet from the new workbook
'dump the values back into the worksheet
.Cells(2, 1).Resize(UBound(tmp, 1), UBound(tmp, 2)) = tmp
'format the datetimes
.Range("A2:A" & UBound(tmp, 1) + 1 & ",E2:E" & UBound(tmp, 1) + 1).NumberFormat = _
"[Color10]mm/dd/yyyy hh:mm:ss"
.Columns("A:E").AutoFit
End With
'clear the variant array
Erase tmp
Next w
'get rid of the original unprepped worksheet created with the new workbook
.Worksheets(1).Delete
'save as a binary workbook due to size considerations
.SaveAs Filename:=fp & Chr(92) & Format(Date, "\T\P\_yyyymmdd\_") & CLng(Timer), _
FileFormat:=xlExcel12, AddToMru:=True
'close savechanges:=false 'uncomment this after debugging
End With
'we got safely here; skip over worksheet creation
GoTo bm_Safe_Exit
bm_Need_Worksheet:
On Error GoTo 0
With tpwb.Worksheets.Add(After:=Sheets(Sheets.Count))
On Error GoTo bm_Need_Worksheet
.Range("A1:E1") = Array("Date and Time", "Temperature 1", "Temperature 2", _
"Pressure", "Date and Time (p)")
.Name = "Temperaturen & Pressure " & w
With .Parent.Windows(1)
.SplitColumn = 0: .SplitRow = 1
.FreezePanes = True
.Zoom = 75
End With
End With
Resume
bm_Safe_Exit:
'discard the dictionary objects
tdic.RemoveAll: Set tdic = Nothing
pdic.RemoveAll: Set pdic = Nothing
'restore the application environment
appTGGL
End Sub
Sub loadTPDictionary(fpn As String, ByRef dict As Object, flds As Long)
Dim f As Long, v As Long, vVALs As Variant, wb As Workbook
Workbooks.OpenText Filename:=fpn, StartRow:=1, DataType:=xlDelimited, _
ConsecutiveDelimiter:=False, _
Comma:=True, Tab:=False, Semicolon:=False, Space:=False, Other:=False, _
FieldInfo:=IIf(flds = 3, Array(Array(1, 3), Array(2, 1), Array(3, 1)), _
Array(Array(1, 3), Array(2, 1)))
With ActiveWorkbook
With Worksheets(1)
'Debug.Print .Cells(1, 1).Value
vVALs = .Range(.Cells(2, 1), .Cells(Rows.Count, flds).End(xlUp)).Value2
End With
.Close SaveChanges:=False
End With
If flds = 3 Then
For v = LBound(vVALs, 1) To UBound(vVALs, 1)
'fastest load method but overwrites duplicate datetime values with the last temp1, temp2
dict.Item(vVALs(v, 1)) = Array(vVALs(v, 1), vVALs(v, 2), vVALs(v, 3))
Next v
Else
For v = LBound(vVALs, 1) To UBound(vVALs, 1)
'fastest load method but overwrites duplicate datetime values with the last pressure
dict.Item(vVALs(v, 1)) = Array(vVALs(v, 1), vVALs(v, 2))
Next v
End If
Erase vVALs
End Sub
Public Sub appTGGL(Optional bTGGL As Boolean = True)
With Application
.ScreenUpdating = bTGGL
.EnableEvents = bTGGL
.DisplayAlerts = bTGGL
.AutoRecover.Enabled = bTGGL 'no interruptions with an auto-save
.Calculation = IIf(bTGGL, xlCalculationAutomatic, xlCalculationManual)
.CutCopyMode = False
.StatusBar = vbNullString
End With
Debug.Print Timer
End Sub
Refer to the in-code comments to follow the program flow. I seriously advise you to start with a smaller sample data set and work through the code using F8 and ctrl+F8. Set some watches on the vars. I've left many Debug.Print statements in that can be uncommented and their information observed through the VBE's Immediate window.
btw, my default workbooks are created with a single worksheet, not three like the default. You might want to adjust the code to remove everything but a single blank Worksheet Object immediately after the creation of a new target Workbook Object if you still open a new workbook with three blank worksheets. New worksheets are created to receive the data and are formatted on creation appropriately.
Results
While the results were produced quickly enough, I thought ~150K records (~135K after processing) were sufficient for testing. These results were split into multiple worksheets because of the iMAXROWS constant I set at 50K per worksheet.
TP_20160501_65489.xlsb
150K+150K processed into ~140K consolidated records in ~29 seconds.
You might also wish to seriously consider moving to a database solution.
See Highlight Duplicates and Filter by color alternative for pointers on dealing with large worksheets.
I have the piece of code below to remove duplicates from a sheet by looking at two columns (columns 3 and 5).
lRow = .Cells(Rows.Count, "A").End(xlUp).Row
'.Range("A1:BR" & lRow).RemoveDuplicates Columns:=Array(3, 5), Header:=xlYes
.Range("$A$1:$BR$" & lRow).RemoveDuplicates Columns:=Array(3, 5), Header:=xlYes
It works fine in Windows but unfortunately not on Mac.
Can anybody please suggest what I need to change here?
This piece of code creates a list of unique values and copies it to another location.
You have to specify where your list starts and where you want to copy it to. You can do this by changing the fromCell and toCell variables. I hope this helps.
Sub uniqueList()
fromCell = "A1"
toCell = "B1"
fromColumn = Mid(fromCell, 1, 1) 'This will resolve to A
toColumn = Mid(toCell, 1, 1) 'This will resolve to B
fromRow = Mid(fromCell, 2) 'This will resolve to 1
toRow = Mid(toCell, 2) 'This will resolve to 1
Dim cl As Range, UniqueValues As New Collection, uValue As Variant
Application.Volatile
numRows = Range(fromCell).End(xlDown).Row
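On Error Resume Next 'adding a duplicate key raises an error, which is ignored
For Each cl In Range(fromCell & ":" & fromColumn & numRows)
UniqueValues.Add cl.Value, CStr(cl.Value)
Next cl
On Error GoTo 0 'restore normal error handling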
y = toRow - 1
For Each uValue In UniqueValues
y = y + 1
Range(toColumn & y) = uValue
Next uValue
End Sub
I think the answers here are dated, so I'm updating in case someone else searches.
.RemoveDuplicates works in Excel for Mac. It should just be whatever your range is, followed by .RemoveDuplicates.
So this:
Range().RemoveDuplicates
Please be aware that I am working with a series of ~1000 line medical information databases. Due to the size of the databases, manual manipulation of the data is too time consuming. As such, I have attempted to learn VBA and code an Excel 2010 macro using VBA to help me accomplish parsing certain data. The desired output is to split certain characters from a provided string on each line of the database as follows:
99204 - OFFICE/OUTPATIENT VISIT, NEW
will need to be split into
ActiveRow, ActiveColumn = 99204
ActiveRow, ActiveColumn+3 = OFFICE/OUTPATIENT VISIT, NEW
I have researched this topic using Walkenbach's "Excel 2013: Power Programming with VBA" and a fair amount of web resources, including this awesome site, but have been unable to develop a fully-workable solution using VBA in Excel. The code for my current macro is:
Sub EasySplit()
Dim text As String
Dim a As Integer
Dim name As Variant
text = ActiveCell.Value
name = Split(text, "-", 2)
For a = 0 To 1
Cells(1, a + 3).Value = Trim(name(a))
Next a
End Sub
The code uses the "-" character as a delimiter to split the input string into two substrings (I have limited the output strings to 2, as there exists in some input strings multiple "-" characters). I have trimmed the second string output to remove leading spaces.
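To make the effect of the limit argument concrete, here is a small sketch that is separate from the macro above. The function name CodeAndDescription is hypothetical; it relies only on the standard Split(expression, delimiter, limit) behaviour, where a limit of 2 leaves any later hyphens inside the second piece.

```vba
Option Explicit

' Hypothetical helper (an assumption for illustration):
' returns a two-element array: the text before the first hyphen
' and everything after it, both trimmed.
Function CodeAndDescription(ByVal source As String) As Variant
    Dim parts() As String

    parts = Split(source, "-", 2)   ' at most two pieces
    If UBound(parts) < 1 Then
        ' no hyphen found: return the whole text and an empty description
        CodeAndDescription = Array(Trim$(source), "")
    Else
        CodeAndDescription = Array(Trim$(parts(0)), Trim$(parts(1)))
    End If
End Function
```

For the sample line, element 0 is "99204" and element 1 is "OFFICE/OUTPATIENT VISIT, NEW"; a second hyphen later in the description would be left untouched.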
The trouble that I am having is that the output is being presented at the top of the activesheet, instead of on the activerow.
Thank you in advance for any help. I have been working on this for 2 days and although I have made some progress, I feel that I have reached an impasse. I think that the issue is somewhere in the
Cells(1, a + 3).Value = Trim(name(a))
code, specifically with "Cells()".
Thank you Conrad Frix!
Yah.. funny enough. Just after I post I have a brainstorm.. and modify the code to read:
Sub EasySplit()
Dim text As String
Dim a As Integer
Dim name As Variant
text = ActiveCell.Value
name = Split(text, "-", 2)
For a = 0 To 1
ActiveCell.Offset(0, 3 + a).Value = Trim(name(a))
Next a
End Sub
Not quite the column1, column4 output that I want (it outputs to column3, column4), but it will work for my purpose.
Now I need to incorporate a loop so that the code runs on each successive cell in the column (downwards, step 1) skipping all bolded cells, until it hits an empty cell.
Modified answer to modified request.
This will start on row 1 and continue until a blank cell is found in column A. If you would like to start on a different row, perhaps row 2 if you have headers, change the
i = 1
line to
i = 2
I added a check on the upper bound of our variant before doing the output writes, in case the macro is run again on already formatted cells. (Does nothing instead of erroring out)
Sub EasySplit()
Dim initialText As String
Dim i As Long
Dim name As Variant
i = 1
Do While Trim(Cells(i, 1)) <> ""
If Not Cells(i, 1).Font.Bold Then
initialText = Cells(i, 1).text
name = Split(initialText, "-", 2)
If Not UBound(name) < 1 Then
Cells(i, 1) = Trim(name(0))
Cells(i, 4) = Trim(name(1))
End If
End If
i = i + 1
Loop
End Sub
Just add a variable to keep track of the active row and then use that in place of the constant 1.
e.g.
Dim iRow As Long
iRow = ActiveCell.Row
For a = 0 To 1
Cells(iRow , a + 3).Value = Trim(name(a))
Next a
Alternate method utilizing TextToColumns. This code also avoids using a loop, making it more efficient and much faster. Comments have been added to assist with understanding the code.
EDIT: I have expanded the code to make it more versatile by using a temp worksheet. You can then output the two columns to wherever you'd like. As stated in your original question, the output is now to columns 1 and 4.
Sub tgr()
Const DataCol As String = "A" 'Change to the correct column letter
Const HeaderRow As Long = 1 'Change to be the correct header row
Dim rngOriginal As Range 'Use this variable to capture your original data
'Capture the original data, starting in Data column and the header row + 1
Set rngOriginal = Range(DataCol & HeaderRow + 1, Cells(Rows.Count, DataCol).End(xlUp))
If rngOriginal.Row < HeaderRow + 1 Then Exit Sub 'No data
'We will be using a temp worksheet, and to avoid a prompt when we delete the temp worksheet we turn off alerts
'We also turn off screenupdating to prevent "screen flickering"
Application.DisplayAlerts = False
Application.ScreenUpdating = False
'Move the original data to a temp worksheet to perform the split
'To avoid having leading/trailing spaces, replace all instances of " - " with simply "-"
'Lastly, move the split data to desired locations and remove the temp worksheet
With Sheets.Add.Range("A1").Resize(rngOriginal.Rows.Count)
.Value = rngOriginal.Value
.Replace " - ", "-"
.TextToColumns .Cells, xlDelimited, Other:=True, OtherChar:="-"
rngOriginal.Value = .Value
rngOriginal.Offset(, 3).Value = .Offset(, 1).Value
.Worksheet.Delete
End With
'Now that all operations have completed, turn alerts and screenupdating back on
Application.DisplayAlerts = True
Application.ScreenUpdating = True
End Sub
You can do this in a single shot without looping using the VBA equivalent of entering this formula, then taking values only
as a formula
=IF(NOT(ISERROR(FIND("-",A1))),RIGHT(A1,LEN(A1)-FIND("-",A1)-1 ),A1)
code
Sub Quicker()
Dim rng1 As Range
Set rng1 = Range([a1], Cells(Rows.Count, "A").End(xlUp))
With rng1.Offset(0, 3)
.FormulaR1C1 = "=IF(NOT(ISERROR(FIND(""-"",RC[-3]))),RIGHT(RC[-3],LEN(RC[-3])-FIND(""-"",RC[-3])-1 ),RC[-3])"
.Value = .Value
End With
End Sub