I have a wide data set composed of identifiers and then a series of 20 groups of repeating columns, with the same 8 columns in each group. I would like restructure these data into rows such that the identifiers each repeat and each series of 8 represents a unique row.
What I currently have and what I'm trying to accomplish:
I have code that gets me most of the way there, it works if I just run it on the identifiers and the first two columns. It won't go through columns 3-8 of each group. Here is the code that successfully runs on the first two columns of each group.
Sub StackData()
Dim Key, Dic As Object, cl As Range, Data As Range, i&, n&
Set Dic = CreateObject("Scripting.Dictionary")
Dic.CompareMode = vbTextCompare
i = Cells(Rows.Count, "A").End(xlUp).Row
n = 1
Set Data = Range("F2:F" & i & "," & "N2:N" & i & "," & "V2:V" & i & "," & "AD2:AD" & i & "," & "AL2:AL" & i & "," & "AT2:AT" & i & "," & "BB2:BB" & i & "," & "BJ2:BJ" & i & "," & "BR2:BR" & i & "," & "BZ2:BZ" & i & "," & "CH2:CH" & i & "," & "CP2:CP" & i & "," & "CX2:CX" & i & "," & "DF2:DF" & i & "," & "DN2:DN" & i & "," & "DV2:DV" & i & "," & "ED2:ED" & i & "," & "EL2:EL" & i & "," & "ET2:ET" & i & "," & "FB2:FB" & i)
Dic.Add "|Name", "Var1|Var2|Var3|Var4|Var5|Var6|Var7|Var8"
For Each cl In Data
If Cells(cl.Row, "A") <> "" Then
Dic.Add n & "|" & Cells(cl.Row, "A"), cl.Text & "|" & cl.Offset(, 1).Text
n = n + 1
End If
Next cl
n = 1
For Each Key In Dic
Worksheets("Worksheet").Cells(n, "A") = Split(Key, "|")(1)
Worksheets("Worksheet").Cells(n, "B") = Split(Dic(Key), "|")(0)
Worksheets("Worksheet").Cells(n, "C") = Split(Dic(Key), "|")(1)
n = n + 1
Next Key
End Sub
When I add to the "For each Key in Dic" I get an error. Any input as to what I am doing wrong? Also open to different ways to approach this that might be cleaner than this kludgey approach.
You can also do this with Power Query, available in Excel 2010+
In the code below I did it using your provided example.
You'll need to make a few changes to adapt it to your actual data.
eg:
In the code, I selected the first two columns and Unpivoted other columns; in your actual data you'll have to select the first three
For the Integer/Division column, I divided by three; you'll probably need to divide by eight
In the code, I added three custom columns (one for each of the identifier columns, and one for the remaining. You'll need to add four custom columns
In any event:
MCode
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(Source, {"Name", "ID"}, "Attribute", "Value"),
#"Added Index" = Table.AddIndexColumn(#"Unpivoted Other Columns", "Index", 0, 1),
#"Inserted Integer-Division" = Table.AddColumn(#"Added Index", "Integer-Division", each Number.IntegerDivide([Index], 3), Int64.Type),
#"Removed Columns" = Table.RemoveColumns(#"Inserted Integer-Division",{"Attribute", "Index"}),
#"Grouped Rows" = Table.Group(#"Removed Columns", {"Integer-Division"}, {{"Grouped", each _, type table [Name=text, ID=number, Value=text, #"Integer-Division"=number]}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Name", each List.First(Table.Column([Grouped],"Name"))),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "ID", each List.First(Table.Column([Grouped],"ID"))),
#"Added Custom2" = Table.AddColumn(#"Added Custom1", "Custom", each Table.Column([Grouped],"Value")),
#"Extracted Values" = Table.TransformColumns(#"Added Custom2", {"Custom", each Text.Combine(List.Transform(_, Text.From), ";"), type text}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Extracted Values", "Custom", Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv), {"Custom.1", "Custom.2", "Custom.3"}),
#"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Custom.1", type text}, {"Custom.2", type text}, {"Custom.3", type text}}),
#"Removed Columns1" = Table.RemoveColumns(#"Changed Type",{"Integer-Division", "Grouped"}),
//Rename last columns
origColNames =List.Buffer(List.Range(Table.ColumnNames(#"Removed Columns1"),2)),
newNameNum = List.Generate(() => 1 , each _ <=List.Count(origColNames), each _ + 1),
//There has to be a better way to convert the numbers to strings
#"Converted to Table" = Table.FromList(newNameNum, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Changed Type1" = Table.TransformColumnTypes(#"Converted to Table",{{"Column1", type text}}),
newNames = Table.Column(#"Changed Type1","Column1"),
renameCols = Table.RenameColumns(#"Removed Columns1",List.Zip({origColNames,newNames}))
in
renameCols
Stack Data
Copy the code into a standard module (e.g. Module1).
Carefully adjust the values under Source, Target and Workbook.
The code is written for Thisworkbook, the workbook containing this code.
The Code
Sub stackData()
' Error Handler
Const Proc As String = "stackData"
On Error GoTo cleanError
' Source
Const srcName As String = "Sheet1" ' Worksheet Name
Const srcFirst As String = "A2" ' First Cell Address
Const LRCol As Long = 1 ' Last Row Column Number
Const IdentCols As Long = 3 ' Number of Identifier Columns
Const GroupCols As Long = 8 ' Number of Group Columns
Const GroupsCount As Long = 20 ' Number of Groups
' Target
Const tgtName As String = "Sheet2" ' Worksheet Name
Const tgtFirst As String = "A2" ' First Cell Address
' Workbook
Dim wb As Workbook: Set wb = ThisWorkbook
' Write values of Source Range to Source Array.
Dim ws As Worksheet: Set ws = wb.Worksheets(srcName)
Dim rng As Range
Set rng = ws.Columns(LRCol).Find("*", , xlValues, , , xlPrevious)
If rng Is Nothing Then Exit Sub
If rng.row < ws.Range(srcFirst).row Then Exit Sub
Dim LastRow As Long: LastRow = rng.row
Set rng = Nothing
Dim LastCol As Long: LastCol = IdentCols + GroupCols * GroupsCount
Dim Source As Variant
Source = ws.Range(ws.Range(srcFirst), ws.Cells(LastRow, LastCol))
Set ws = Nothing
Dim ubS As Long: ubS = UBound(Source)
' Write values of Source Array to Target Array.
Dim Target As Variant
ReDim Target(1 To ubS * GroupsCount, 1 To IdentCols + GroupCols)
Dim i As Long, j As Long, k As Long, m As Long
GoSub writeIdentifiers
GoSub writeGroups
' Write values of Target Array to Target Range.
Set ws = wb.Worksheets(tgtName)
ws.Range(tgtFirst).Resize(UBound(Target), UBound(Target, 2)).Value = Target
' Inform user.
MsgBox "Data stacked.", vbInformation, "Success"
Exit Sub
' Subroutines
writeIdentifiers:
m = 1
For i = 1 To ubS
For j = 1 To GroupsCount
For k = 1 To IdentCols
Target(m, k) = Source(i, k)
Next k
m = m + 1
Next j
Next i
Return
writeGroups:
m = 1
For i = 1 To ubS
For j = 1 To GroupsCount
For k = 1 To GroupCols
Target(m, k + IdentCols) = _
Source(i, k + IdentCols + (j - 1) * GroupCols)
Next k
m = m + 1
Next j
Next i
Return
' Error Handler
cleanError:
MsgBox "An unexpected error occurred in '" & Proc & "'." & vbCr _
& "Run-time error '" & Err.Number & "':" & vbCr & Err.Description _
, vbCritical, Proc & " Error"
End Sub
Related
I have a set of files, some have no suffix and some with different suffixes. I would like to segregate the file names irrespective of their suffixes and list them along with the range of their row indices in the same spreadsheet. Below is the example and my failed code. Also attached the spreadsheet snapshot. Can you please help? Any new code/logic is welcome.
Input:
Row index
Filename
1
File1
2
File2_a
3
File2_b
4
File2_c
5
File3_a
6
File3_b
Output:
Filename
Row indices range
File1
1
1
File2
2
4
File3
5
6
VBA code
Sub GetUniqueFiles()
Dim sh As Worksheet
Set sh = ThisWorkbook.Sheets("Sheet1")
lastrow = sh1.Range("A" & Rows.Count).End(xlUp).Row
SameFile = False ' Flag to compare 2 consecutive file names
i = 3: j = 3
While i <= (lastrow - 1)
name_curt = sh.Range("B" & i).Value
name_next = sh.Range("B" & i + 1).Value
file_curt = Split(name_curt, "_")(0)
file_next = Split(name_next, "_")(0)
If file_curt <> file_next Then
sh.Range("D" & j).Value = file_curt
k1 = i
sh.Range("E" & j).Value = k1
sh.Range("F" & j).Value = k2
i = i + 1: j = j + 1
ElseIf file_curt = file_next Then
SameFile = True
sh.Range("B" & j).Value = file_curt
k1 = i
While SameFile
i = i + 1
name_curt = sh.Range("B" & i).Value
name_next = sh.Range("B" & i + 1).Value
file_curt = Split(name_curt, "_")(0)
file_next = Split(name_next, "_")(0)
Wend
End If
Wend
End Sub
Try this:
Sub GetUniqueFiles()
Dim sh As Worksheet, m, indx, rw As Range, f As String
Dim r As Long
Set sh = ThisWorkbook.Sheets("Sheet1")
Set rw = sh.Range("A3:B3") 'first input row
r = 2 'start row for output
Do While Application.CountA(rw) = 2 'loop while have data
indx = rw.Cells(1).Value
f = Split(rw.Cells(2).Value, "_")(0) ' "base" file name
m = Application.Match(f, sh.Columns("D"), 0) 'see if already listed
If IsError(m) Then 'not already listed ?
sh.Cells(r, "D").Value = f 'write file name
sh.Cells(r, "E").Value = indx 'write "first seen" index
m = r
r = r + 1
End If
sh.Cells(m, "F").Value = indx 'write "last seen" index
Set rw = rw.Offset(1, 0) 'next input row
Loop
End Sub
You can solve this without VBA. Add something like this in column C:
=LEFT(B2,IFERROR(FIND("_",B2)-1,LEN(B2)+1))
It will cut the underscore and everything after it. Next step is counting the distinct values. I would go for a pivot table, but there are lots of other ways.
You can obtain your desired output using Power Query, available in Windows Excel 2010+ and Office 365 Excel
Select some cell in your original table
Data => Get&Transform => From Table/Range or From within sheet
When the PQ UI opens, navigate to Home => Advanced Editor
Make note of the Table Name in Line 2 of the code.
Replace the existing code with the M-Code below
Change the table name in line 2 of the pasted code to your "real" table name
Examine any comments, and also the Applied Steps window, to better understand the algorithm and steps
M Code
let
//Change table name in next line to actual table name in your workbook
Source = Excel.CurrentWorkbook(){[Name="Table22"]}[Content],
//split on the underscore and remove the splitted suffix
#"Split Column by Delimiter" = Table.SplitColumn(Source, "Filename",
Splitter.SplitTextByDelimiter("_", QuoteStyle.Csv), {"Filename", "Filename.2"}),
//set data types -- frequently a good idea in PQ
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{
{"Filename", type text}, {"Filename.2", type text}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type1",{"Filename.2"}),
//Group by file name and extract the lowest and highest rows
#"Grouped Rows" = Table.Group(#"Removed Columns", {"Filename"}, {
{"Start Row", each List.Min([Row index]), type number},
{"End Row", each List.Max([Row index]), type number}})
in
#"Grouped Rows"
For execution speed and resource conservation it's best to minimize interaction with the worksheet from VBA. For example the following references the worksheet precisely twice no matter how long the list of files. Don't underestimate the value of restricting worksheet interaction from VBA.
Sub GetUniqueFiles()
Dim c&, i&, a$, b$, vIn, vOut
Const FILES_IN$ = "b3"
Const FILES_OUT$ = "d3"
With Range(FILES_IN)
vIn = .Resize(.End(xlDown).Row - .Row + 1)
End With
ReDim vOut(1 To UBound(vIn), 1 To 3)
For i = 1 To UBound(vIn)
b = Split(vIn(i, 1), "_")(0)
If a <> b Then
a = b
c = c + 1
vOut(c, 1) = b
vOut(c, 2) = i
If c > 1 Then vOut(c - 1, 3) = i - 1
End If
Next
If c > 1 Then vOut(c, 3) = i - 1
Range(FILES_OUT).Resize(UBound(vIn), 3) = vOut
End Sub
I've managed to record all changes my team has done and in order to use it further I'll need to transpose the table into format supported by our systems. I was able to transpose one row but there are hundreds in my file so my code was useless.
My input table look like this
My goal for output looks like this
I've managed to transpose the whole table from tutorial (Essential Excel), however it doesn't fit my needs.
Sub TransposeTest()
Dim wks As Worksheet, wks2 As Worksheet
Dim MyArray As Variant
Dim LastRow As Integer, LastColumn As Integer
Dim StartCells As Range
Set wks = ThisWorkbook.Sheets("Sheet1")
Set wks2 = ThisWorkbook.Sheets("Transpose")
Set StartCell = wks.Range("A2")
LastRow = wks.Cells(wks.Rows.Count, StartCell.Column).End(xlUp).row
LastColumn = wks.Cells(StartCell.row, wks.Columns.Count).End(xlToLeft).Column
MyArray = wks.Range(StartCell, wks.Cells(LastRow, LastColumn)).Value2
MyArray = WorksheetFunction.Transpose(MyArray)
wks2.Range("a2", wks2.Cells(LastColumn, LastRow)).Value = MyArray
End Sub
I've been advised to use array for months and loop through each row however I unable to achieve it.
Please, try the next code. It should be very fast, even for large ranges. It uses arrays and works only in memory, the result being dropped at once:
Sub TransposeData()
Dim sh As Worksheet, shTr As Worksheet, lastR As Long, arr, arrfin, ArrH, i As Long, k As Long, j As Long
Set sh = ActiveSheet
Set shTr = sh.Next 'use here the sheet you need to return.
'if the next sheet is empty you can let the code as it is
lastR = sh.Range("A" & sh.rows.count).End(xlUp).row 'last row
arr = sh.Range("A1:Q" & lastR).value 'place the range in an array for faster iteration
ReDim arrfin(1 To UBound(arr) * 12 + 1, 1 To 7): k = 1 'redim the final array dimensions
ArrH = Split("Index,Person,Dept,Month,Sales,STMP,User", ",") 'create an array from the header strings
'place the headers in the first row of the final array:
For i = 0 To UBound(ArrH): arrfin(k, i + 1) = ArrH(i): Next: k = k + 1
'build the final array:
For i = 2 To UBound(arr)
For j = 1 To 12
arrfin(k + j - 1, 1) = arr(i, 1): arrfin(k + j - 1, 2) = arr(i, 2): arrfin(k + j - 1, 3) = arr(i, 3)
arrfin(k + j - 1, 4) = j & "." & Year(Date): arrfin(k + j - 1, 5) = arr(i, j + 3)
arrfin(k + j - 1, 6) = arr(i, 16): arrfin(k + j - 1, 7) = arr(i, 17)
Next j
k = k + j - 1 'reinitialize k variable for the next data row
Next i
'drop the final array content at once, and do some formatting:
With shTr.Range("A1").Resize(UBound(arrfin), UBound(arrfin, 2))
.value = arrfin
.rows(1).Font.Bold = True
.EntireColumn.AutoFit
For i = 7 To 9
.Borders(i).Weight = xlThin
.Borders.LineStyle = xlContinuous
Next
End With
MsgBox "Ready..."
End Sub
You can obtain your desired output using Power Query, available in Windows Excel 2010+ and Office 365 Excel
Select some cell in your original table
Data => Get&Transform => From Table/Range or From within sheet
When the PQ UI opens, navigate to Home => Advanced Editor
Make note of the Table Name in Line 2 of the code.
Replace the existing code with the M-Code below
Change the table name in line 2 of the pasted code to your "real" table name
Examine any comments, and also the Applied Steps window, to better understand the algorithm and steps
M Code
let
//Change table name in next row to the actual table name in your workbook
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
//set the data types
#"Changed Type" = Table.TransformColumnTypes(Source,{
{"Index", Int64.Type}, {"Person", Int64.Type}, {"Dept", Int64.Type},
{"Jan", Int64.Type}, {"Feb", Int64.Type}, {"Mar", Int64.Type},
{"Apr", Int64.Type}, {"May", Int64.Type}, {"Jun", Int64.Type},
{"Jul", Int64.Type}, {"Aug", Int64.Type}, {"Sep", Int64.Type},
{"Oct", Int64.Type}, {"Nov", Int64.Type}, {"Dec", Int64.Type},
{"Time", type datetime}, {"User", type text}},"en-150"),
//Unpivot the Month columns
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type",
{"Index", "Person", "Dept", "Time", "User"}, "Month", "Sales"),
//Transform "Month" column to "MonthNum.YearNum
//Not sure where the year should come from.
// for now will just hard code at as 2022
// but could use a different method.
mnthCol = Table.TransformColumns(#"Unpivoted Other Columns", {"Month", each
Date.ToText(Date.FromText("1-" & _ & "-2022"),"M.yyyy"),type text}),
//Reorder the columns
#"Reordered Columns" = Table.ReorderColumns(mnthCol,{"Index", "Person", "Dept", "Month", "Sales", "Time", "User"}),
//Rename the columns as per your example
rename = Table.RenameColumns(#"Reordered Columns",{
{"Time","STMP"},
{"Dept","Depr"}
})
in
rename
Original Data
Partial Results
I have 2 columns in excel as below:
CutNo Data
1 A
1 B
1 C
2 A
2 B
3 A
I want to concatenate data of column data if the Cut No is the same and put it in another Column Named Concatenate and Count the number of occurrences and put it in another column as below
CutNo Data Concatenate Occurrences
1 A A & B & C 1
1 B
1 C
2 A A & B 1
2 B
3 A A 1
I use the following code
Sub Unique()
Dim Rng, Cel As Range
Dim lr As Long
Dim x As Integer
Dim str As String
lr = Sheets("Report").Cells(Rows.count, 1).End(xlUp).Row
Set Rng = Sheets("Report").Range("A2:A" & lr)
For x = 1 To Rng.count
For Each Cel In Rng.Cells
If Cel.Value = x Then
str = str & Rng.Cells(x, 1).Offset(0, 7) & ","
End If
Next Cel
Rng.Cells(x, 1).Offset(0, 10).Value = str
Next x
End Sub
I did not get the proper result I need,
Appreciate your support
Thanks, Regards
Moheb Labib
If you have Excel O365 with the FILTER function, you don't need VBA:
(Note: I have asssumed that Occurrences can be calculated by just counting the number of rows of CutNo. If you mean something else, please clarify)
C2: =IF(AND(A2<>A1,A2<>""),TEXTJOIN(" & ",TRUE,FILTER($B:$B,$A:$A=A2)),"")
D2: =IF(AND(A2<>A1,A2<>""),COUNTIF($A:$A,A2),"")
and fill down.
You can also do this using Power Query available in Excel 2010+
Select the entire range to include
*cannot auto-select since there are blank rows
In Excel 2016+ : Data --> Get & Transform --> From Table/Range
I'm not sure about the earlier versions, where you would download a free MS add-in for this functionality.
When the PQ Editor opens, select Home --> Advanced Editor and paste the M Code below into the window that opens.
Change the Table name in Line 2 to be the name of the Table generated when you opened PQ.
For explanations, examine the items in the Applied Steps window. If you float your cursor over any of the i icons, you will see the comment associated; if you double click on a gear wheel, it will open a dialog window so you can examine what was done.
Close and Load to: I select the column next to the original data, but there are other ways to do this.
M Code
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"CutNo", Int64.Type}, {"Data", type text}}),
//make the grouping easier, else we'd have a group with the blank rows
#"Removed Blank Rows" = Table.SelectRows(#"Changed Type", each not List.IsEmpty(List.RemoveMatchingItems(Record.FieldValues(_), {"", null}))),
//Group by CutNo -- hence no need to sort
#"Grouped Rows" = Table.Group(#"Removed Blank Rows", {"CutNo"}, {{"Grouped", each _, type table [CutNo=nullable number, Data=nullable text]}}),
//add a blank row at the bottom of each grouped table (each CutNo group)
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "addBlankRow", each Table.InsertRows([Grouped],
Table.RowCount([Grouped]),
{[CutNo=null, Data=null]})),
//remove unneded columns
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"CutNo", "Grouped"}),
#"Added Custom1" = Table.AddColumn(#"Removed Columns", "Custom", each Table.Column([addBlankRow],"Data")),
//Concatenate the "Data"
#"Extracted Values" = Table.TransformColumns(#"Added Custom1", {"Custom", each Text.Combine(List.Transform(_, Text.From), " & "), type text}),
//Count the rows (subtract one since last row will be blank
#"Added Custom2" = Table.AddColumn(#"Extracted Values", "Custom.1", each Table.RowCount([addBlankRow])-1),
//Expand the Table column to put a blank row between each group of CutNo
#"Expanded addBlankRow" = Table.ExpandTableColumn(#"Added Custom2", "addBlankRow", {"CutNo"}, {"addBlankRow.CutNo"}),
//Add Index column so we can null out where there should be empty cells in the Concatenate Column
#"Added Index" = Table.AddIndexColumn(#"Expanded addBlankRow", "Index", 0, 1, Int64.Type),
#"Added Custom3" = Table.AddColumn(#"Added Index", "Concatenate", each
if [Index] = 0
then [Custom]
else if [addBlankRow.CutNo] = null
then null
else if [addBlankRow.CutNo] = #"Expanded addBlankRow"[addBlankRow.CutNo]{[Index]-1}
then null
else [Custom]),
//Blank cells in the Occurrence column if blank in the CutNo column
#"Added Custom4" = Table.AddColumn(#"Added Custom3", "Occurrences", each
if [Concatenate] = null then null
else [Custom.1]),
//Remove unneeded columns
#"Removed Columns1" = Table.RemoveColumns(#"Added Custom4",{"addBlankRow.CutNo", "Custom", "Custom.1", "Index"}),
//Remove bottom row which will be blank
#"Removed Bottom Rows" = Table.RemoveLastN(#"Removed Columns1",1)
in
#"Removed Bottom Rows"
First, your data in VBA form:
Cells.Clear
Cells(2, 1) = "1"
Cells(2, 2) = "A"
Cells(3, 1) = "1"
Cells(3, 2) = "B"
Cells(4, 1) = "1"
Cells(4, 2) = "C"
Cells(6, 1) = "2"
Cells(6, 2) = "A"
Cells(7, 1) = "2"
Cells(7, 2) = "B"
Cells(9, 1) = "3"
Cells(9, 2) = "A"
Second, your code reworked:
Dim rng As Range, Cel As Range
Dim lr As Long
Dim x As Integer, y As Integer
Dim str As String
lr = Cells(Rows.Count, 1).End(xlUp).Row
Set rng = Range("A2:A" & lr)
For x = 1 To rng.Count
str = ""
y = 0
For Each Cel In rng
If Cel.Value = x Then
str = str & rng.Cells(Cel.Row - 1, 2) & ","
If y = 0 Then y = Cel.Row - 1
End If
Next Cel
If y>0 Then rng.Cells(y, 4) = Left(str, Len(str) - 1)
Next x
Output:
Notes:
I have left 'occurrences' out as it seems vague.
Dim rng, Cel As Range should be Dim rng As Range, Cel As Range, otherwise rng is a
declared as a Variant.
Other than that, I have just trimmed bits off, and added a routine to calculate and format the data properly.
Previously, you were using Rng.Cells(x, 1), but the value of x doesn't change throughout the Cel loop, so you need to access the Cel.Row property to find out where the row in question is.
The y variable stores the first occurrence of x for display purposes.
Try the next code, please. Since you did not answer my clarification question, the code works with the assumption that occurrences means counting of each concatenated item:
Sub testConcatenateOnCriteria()
Dim sh As Worksheet, lastRow As Long, dict As New Scripting.Dictionary
Dim i As Long, count As Long, strVal As String, arr As Variant
Set sh = ActiveSheet 'use here your sheet
lastRow = sh.Range("A" & Rows.count).End(xlUp).Row
For i = 2 To lastRow
strVal = sh.Range("A" & i).Value
If sh.Range("B" & i).Value <> "" Then
If Not dict.Exists(strVal) Then
dict.Add strVal, Array(sh.Range("B" & i).Value, 1, i)
Else
dict(strVal) = Array(dict(strVal)(0) & sh.Range("B" & i).Value, dict(strVal)(1) + 1, dict(strVal)(2))
End If
End If
Next i
ReDim arr(1 To lastRow, 1 To 2)
arr(1, 1) = "Concatenate": arr(1, 2) = "Occurrences"
For i = 0 To dict.count - 1
arr(dict(dict.Keys(i))(2), 1) = dict(dict.Keys(i))(0): arr(dict(dict.Keys(i))(2), 2) = dict(dict.Keys(i))(1)
Next i
sh.Range("C1").Resize(UBound(arr), 2).Value = arr
End Sub
I am having a data sheet with approx 90k Rows and I need to transpose data in every row to 2 columns. To explain a bit more, I am having product id in the 1st column and the next few ( each product id is having different sizes and stock) cells are having the sizes and stock, but I want to have all that data in two columns only instead of only one row. For reference, I have attached the screenshot.
The first screenshot is about the raw data
The second screenshot is how I want the data to be placed
My code was
Sub TransposeInsertRows()
'UpdatebyExtendoffice20161125
Dim xRg As Range
Dim i As Long, j As Long, k As Long
Dim x As Long, y As Long
Set xRg = Application.InputBox _
(Prompt:="Range Selection...", _
Title:="Product base", Type:=8)
Application.ScreenUpdating = False
x = xRg(1, 1).Column + 2
y = xRg(1, xRg.Columns.Count).Column
For i = xRg(xRg.Rows.Count, 1).Row To xRg(1, 1).Row Step -1
If Cells(i, x) <> "" And Cells(i, x + 1) <> "" Then
k = Cells(i, x - 2).End(xlToRight).Column
If k > y Then k = y
For j = k To x + 1 Step -1
Cells(i + 1, 1).EntireRow.Insert
With Cells(i + 1, x - 2)
.Value = .Offset(-1, 0)
.Offset(0, 1) = .Offset(-1, 1)
.Offset(0, 2) = Cells(i, j)
End With
Cells(i, j).ClearContents
Next j
End If
Next i
Application.ScreenUpdating = True
End Sub
Before running code
After running code
If your "real raw" data consists of sequential rows, with the data for each id in adjacent columns, then you can (relatively) easily solve your problem using Power Query (available in Excel 2010+)
I am assuming your original data looks something like:
That being the case, using Power Query you can:
Select the id column and Unpivot other columns
Group by each pair of rows
Extract the values from the size-stock columns which gives you a comma separated pair of values
Group by the id which creates a column of Tables for each id
Add a blank row to the bottom of each table (gives you the blank rows in your results
Expand the tables, and create a column where only the first id entry for each set of stock-sizes is showing.
Along the way extra columns are removed, and data types get set.
If you paste the M-Code below into the Power Query Advanced Editor, and change the Table Name in Line 2 to reflect your data table, you should be able to follow along in the Applied Steps window to see what is going on.
In that window, double clicking the steps with a little gearwheel on the right will open a dialog window which will show useful information.
M-Code
let
Source = Excel.CurrentWorkbook(){[Name="Table8"]}[Content],
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(Source, {"id"}, "Attribute", "Value"),
#"Added Index" = Table.AddIndexColumn(#"Unpivoted Other Columns", "Index", 0, 1, Int64.Type),
#"Inserted Integer-Division" = Table.AddColumn(#"Added Index", "Integer-Division", each Number.IntegerDivide([Index], 2), Int64.Type),
#"Removed Columns" = Table.RemoveColumns(#"Inserted Integer-Division",{"Attribute", "Index"}),
#"Grouped Rows" = Table.Group(#"Removed Columns", {"Integer-Division", "id"}, {{"Grouped", each _, type table [id=nullable number, Value=any, #"Integer-Division"=number]}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Sizes Stock", each Table.Column([Grouped],"Value")),
#"Extracted Values" = Table.TransformColumns(#"Added Custom", {"Sizes Stock", each Text.Combine(List.Transform(_, Text.From), ";"), type text}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Extracted Values", "Sizes Stock", Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv), {"Sizes Stock.1", "Sizes Stock.2"}),
#"Removed Columns1" = Table.RemoveColumns(#"Split Column by Delimiter",{"Integer-Division", "Grouped"}),
#"Grouped Rows1" = Table.Group(#"Removed Columns1", {"id"}, {{"Grouped", each _, type table [id=nullable number, Sizes Stock.1=nullable text, Sizes Stock.2=nullable number]}}),
//Change List.Repeat count argument for number of blank rows between each id
#"Added Custom1" = Table.AddColumn(#"Grouped Rows1", "Custom", each Table.InsertRows([Grouped],Table.RowCount([Grouped]),
List.Repeat({[id = "", Sizes Stock.1 = "", Sizes Stock.2=""]},1))),
#"Removed Columns2" = Table.RemoveColumns(#"Added Custom1",{"id", "Grouped"}),
#"Expanded Custom" = Table.ExpandTableColumn(#"Removed Columns2", "Custom", {"id", "Sizes Stock.1", "Sizes Stock.2"}, {"id", "Sizes Stock.1", "Sizes Stock.2"}),
#"Added Index1" = Table.AddIndexColumn(#"Expanded Custom", "Index", 0, 1, Int64.Type),
#"Added Custom2" = Table.AddColumn(#"Added Index1", "ID.1", each if [Index]= 0 then [id] else
if #"Added Index1"[id]{[Index]-1} = [id]
or [id] = "" then null
else [id]),
#"Removed Columns3" = Table.RemoveColumns(#"Added Custom2",{"id"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns3",{{"ID.1", "id"}, {"Sizes Stock.1", "Size"}, {"Sizes Stock.2", "Stock"}}),
#"Reordered Columns" = Table.ReorderColumns(#"Renamed Columns",{"id", "Size", "Stock", "Index"}),
#"Changed Type3" = Table.TransformColumnTypes(#"Reordered Columns",{{"id", type text}, {"Size", type text}, {"Stock", type number}}),
#"Removed Columns4" = Table.RemoveColumns(#"Changed Type3",{"Index"})
in
#"Removed Columns4"
Results
*Note: if you want more than one blank row between each set of id values, a minor change in the code will do that.
You did not answer my clarification question...
Then, test the next code, please. It assumes that you have the sheet to be processed, prepared as you show it in the picture (processed with enough spaces between the rows with data:
Sub TESTSplitTransposeArrays()
Dim sh As Worksheet, lastRow As Long, i As Long, arr As Variant, arrFin As Variant
Dim lastCol As Variant, lastcolUR As Long, k As Long, j As Long, lastRlastCol As Long
Set sh = ActiveSheet 'use here your worksheet
lastRow = sh.Range("A" & Rows.count).End(xlUp).Row
lastcolUR = sh.UsedRange.Columns.count
arr = sh.Range(sh.Range("A1"), sh.cells(lastRow, lastcolUR)).Value
lastRlastCol = sh.cells(lastRow, Columns.count).End(xlToLeft).Column
ReDim arrFin(1 To lastRow + Int((lastRlastCol - 1) / 2) + 2, 1 To 2)
For i = 2 To lastRow
If sh.Range("A" & i) <> "" Then
lastCol = sh.cells(i, Columns.count).End(xlToLeft).Column
arr = sh.Range(sh.Range("B" & i), sh.cells(i, lastCol)).Value
For j = 1 To lastCol - 1
If j Mod 2 <> 0 Then
arrFin(i + k - 1, 1) = arr(1, j)
arrFin(i + k - 1, 2) = arr(1, j + 1)
k = k + 1
End If
Next j
k = 0
End If
Next i
sh.Range(sh.Range("B2"), sh.cells(lastRow, lastcolUR)).ClearContents
sh.Range("B2").Resize(UBound(arrFin), 2).Value = arrFin
End Sub
But, if the necessary blank rows have not been inserted, use the next code, please:
Sub testInsertMassRows()
Dim sh As Worksheet, lastR As Long, i As Long, roNo As Long
Set sh = ActiveSheet
lastR = sh.Range("A" & Rows.count).End(xlUp).Row
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
Application.EnableEvents = False
For i = lastR To 2 Step -1
roNo = sh.cells(i - 1, Columns.count).End(xlToLeft).Column - 1
sh.Rows(i & ":" & i + Int(roNo / 2) + 2).EntireRow.Insert xlShiftDown
Next i
Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic
Application.EnableEvents = True
End Sub
Edited:
If your workbook will not be consistent, meaning not having the necessary pairs to be extracted, the code will return an error (a half row cannot be inserted and an array cannot be Redim using decimals). In order to check this aspect, please, previously run the next checking Sub. It will send a message with all the problematic rows address. Correct them and run the other subs only after this check. If everything will run smooth from your point of view and you need often running such an application, I can transform this last Sub in a function to be called by the other two, to preliminary check and go on only if it will return True:
Sub testInconsistencyCheck()
Dim sh As Worksheet, lastR As Long, i As Long, ColNo As Long, strInc As String
Set sh = ActiveSheet
lastR = sh.Range("A" & Rows.count).End(xlUp).Row
For i = 2 To lastR
ColNo = sh.cells(i - 1, Columns.count).End(xlToLeft).Column - 1
If ColNo Mod 2 <> 0 Then
strInc = strInc & sh.Rows(i).Address & vbCrLf
End If
Next i
If Not strInc = "" Then
MsgBox "Wrong number of columns on row(s):" & vbCrLf & strInc
Else
MsgBox "Everything OK!"
End If
End Sub
I would first suggest you do not clear anything from your sheet named "Sheet1".
You will have problems as your transposed rows may erase the datas in "Sheet1".
Try building all your datas in arrays and then write all the values in a new sheet.
I have data in 2 columns. The data in column B is comma delimited. I need each instance to appear on a new row while retaining it's original ID from column A. I also need the data in 3 columns so Name is in B and Number in C. It appears as so:
A--------B
1--------Sam Jones, 1 hours, Chris Bacon, 2 hours
2--------John Jacob, 3 hours
3--------John Hancock, 4 hours, Brian Smith, .5 hours
I am able to get it as such using my code below:
A--------B
1--------Sam Jones, 1
1--------Chris Bacon, 2 hours
2--------John Jacob, 3 hours
3--------John Hancock, 4
3--------Brian Smith, .5 hours
I need it to be: (notice last value in string also has hours removed when added to new line)
A---------B------------------------C
1---------Sam Jones-----------1
1---------Chris Bacon----------2
2---------John Jacob-----------3
3---------John Hancock-------4
3---------Brian Smith----------.5
I have the following code started: (I can't manage to remove the "hours" from the last person in each delimited string and I can't get it into 3 columns)
Sub splitByColB()
Dim r As Range, i As Long, ar
Set r = Worksheets("Sheet1").Range("B2").End(xlDown)
Do While r.Row > 1
ar = Split(r.Value, " hours, ")
If UBound(ar) >= 0 Then r.Value = ar(0)
For i = UBound(ar) To 1 Step -1
r.EntireRow.Copy
r.Offset(1).EntireRow.Insert
r.Offset(1).Value = ar(i)
Next
Set r = r.Offset(-1)
Loop
End Sub
Something like this is what you're looking for:
Sub tgr()
Dim ws As Worksheet
Dim aData As Variant
Dim aTemp As Variant
Dim aResults(1 To 65000, 1 To 3) As Variant
Dim ResultIndex As Long
Dim i As Long, j As Long
Set ws = ActiveWorkbook.Sheets("Sheet1")
With ws.Range("B2", ws.Cells(ws.Rows.Count, "B").End(xlUp))
If .Row < 2 Then Exit Sub 'No data
aData = .Offset(, -1).Resize(, 2).Value
End With
For i = LBound(aData, 1) To UBound(aData, 1)
If Len(Trim(aData(i, 2))) = 0 Then
ResultIndex = ResultIndex + 1
aResults(ResultIndex, 1) = aData(i, 1)
Else
aTemp = Split(aData(i, 2), ",")
For j = LBound(aTemp) To UBound(aTemp) Step 2
ResultIndex = ResultIndex + 1
aResults(ResultIndex, 1) = aData(i, 1)
aResults(ResultIndex, 2) = Trim(aTemp(j))
aResults(ResultIndex, 3) = Trim(Replace(aTemp(j + 1), "hours", vbNullString, , , vbTextCompare))
Next j
End If
Next i
ws.Range("A2").Resize(ResultIndex, UBound(aResults, 2)).Value = aResults
End Sub
You can use Power Query. It is a free MS add-in in 2010, 2013 and included in 2016 where it is called Get & Transform
Split column 2 by delimiter custom --> hours,
Select the ID column and unpivot other columns
Select column 2 and split by delimiter = comma
Remove unnecessary column
Replace value "hours"
And if you add to the table, you can re-run the query
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Data", type text}}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Changed Type", "Data", Splitter.SplitTextByDelimiter("hours,", QuoteStyle.Csv), {"Data.1", "Data.2"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Data.1", type text}, {"Data.2", type text}}),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type1", {"ID"}, "Attribute", "Value"),
#"Split Column by Delimiter1" = Table.SplitColumn(#"Unpivoted Other Columns", "Value", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), {"Value.1", "Value.2"}),
#"Changed Type2" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"Value.1", type text}, {"Value.2", type text}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type2",{"Attribute"}),
#"Replaced Value" = Table.ReplaceValue(#"Removed Columns","hours","",Replacer.ReplaceText,{"Value.2"})
in
#"Replaced Value"
I would use a class with the name data
Option Explicit
Public Id As String
Public FullName As String
Public hours As String
and the following code
Option Explicit
Sub SplitIt()
Dim rg As Range
Dim col As New Collection
Dim dataLine As data
Set rg = Worksheets("Sheet1").Range("A1").CurrentRegion
Dim vDat As Variant
vDat = rg.Columns
Dim lDat As Variant
Dim i As Long, j As Long
For i = LBound(vDat) To UBound(vDat)
lDat = Split(vDat(i, 2), ",")
For j = LBound(lDat) To UBound(lDat) Step 2
Dim hDat As Variant
hDat = Split(Trim(lDat(j + 1)), " ")
Set dataLine = New data
dataLine.Id = vDat(i, 1)
dataLine.FullName = Trim(lDat(j))
dataLine.hours = hDat(0)
col.Add dataLine
Next j
Next i
' Print Out
For i = 1 To col.Count
Set dataLine = col(i)
rg.Cells(i, 1) = dataLine.Id
rg.Cells(i, 2) = dataLine.FullName
rg.Cells(i, 3) = dataLine.hours
Next i
End Sub
Why not split on hours to a) add a record delimiter and b) get rid of hours?
Option Explicit
Sub splitByColB()
Dim r As Long, i As Long, hrs As Variant, cms As Variant
With Worksheets("sheet1")
For r = .Cells(.Rows.Count, "B").End(xlUp).Row To 2 Step -1
hrs = Split(.Cells(r, "B").Value2 & ", ", " hours, ")
ReDim Preserve hrs(UBound(hrs) - 1)
If CBool(UBound(hrs)) Then _
.Cells(r, "A").Offset(1, 0).Resize(UBound(hrs), 1).EntireRow.Insert
For i = UBound(hrs) To LBound(hrs) Step -1
cms = Split(hrs(i), ", ")
.Cells(r, "A").Offset(i, 0) = .Cells(r, "A").Value
.Cells(r, "A").Offset(i, 1) = cms(0)
.Cells(r, "A").Offset(i, 2) = cms(1)
Next i
Next r
End With
End Sub