Restructuring massive Excel document

I have a task to restructure a two-column Excel sheet and expand it. Here is a picture to show what needs to be done: the data to the left of the green column is the original data and the data on the right is how it should look, but it's only done for the first entry. I need to replicate it for all 10,000 rows of data.
To explain in more depth, each CRD needs to be expanded to 160 rows, going from 1978 to 2018 and listing the quarters for each year. What is the best approach? Is it possible to write a macro to solve this?

The following expects the sheets to be named Sheet1 and Sheet2, and it generates 158 quarters per CRD.
Option Explicit
Sub doFromThru()
' clear contents
Sheets("Sheet2").Select
Cells.Select
Selection.ClearContents
Range("A1").Select
Cells(1, "A") = "CRD"
Cells(1, "B") = "Year"
Cells(1, "C") = "Quarter"
Cells(1, "D") = "QuarterNumerical"
Cells(1, "E") = "Disclosure"
Dim nOutRow As Long ' Long, not Integer: the output goes far beyond 32,767 rows
nOutRow = 1
' step thru all the rows on the input sheet
Dim nInRow As Long, maxInRow As Long, nInCRD As String, nInDisc As String
maxInRow = Sheets("Sheet1").Cells(Rows.Count, "A").End(xlUp).Row
For nInRow = 2 To maxInRow
nInCRD = Sheets("Sheet1").Cells(nInRow, "A")
nInDisc = Sheets("Sheet1").Cells(nInRow, "L")
' create the new rows on Sheet2
Dim dFrom As Date, nQtr As Integer
dFrom = DateValue("Oct 1978") ' starting from here
For nQtr = 1 To 158
nOutRow = nOutRow + 1
Sheets("Sheet2").Cells(nOutRow, "A") = nInCRD
Sheets("Sheet2").Cells(nOutRow, "B") = Format$(dFrom, "yyyy")
Sheets("Sheet2").Cells(nOutRow, "C") = Format$(dFrom, "Q")
Sheets("Sheet2").Cells(nOutRow, "D") = nQtr
Sheets("Sheet2").Cells(nOutRow, "E") = nInDisc
dFrom = DateAdd("Q", 1, dFrom)
Next nQtr
Next nInRow
End Sub
To see how far the macro has got, add some diagnostics. Put these lines after nOutRow = nOutRow + 1:
Sheets("Sheet2").Cells(1, "G") = nInRow
Sheets("Sheet2").Cells(1, "H") = nOutRow
Sheets("Sheet2").Cells(1, "I") = nQtr
Sheets("Sheet2").Cells(1, "J") = nInDisc

Untested.
The code assumes you want to start from Q4 of 1978 and writes 160 quarters per CRD (1978 Q4 plus the 159 quarters after it). If necessary, you can change this by changing the values of START_QUARTER and TOTAL_QUARTERS in the code.
You will need to change "Sheet1" in the code to whatever the name of your sheet is.
The code overwrites the contents of columns CH to CL on that sheet, so you might want to save a copy of your workbook before running it.
Code:
Option Explicit
Sub ExpandRows()
Const START_YEAR as long = 1978
Const START_QUARTER as long = 4
Const TOTAL_QUARTERS as long = 160
With thisworkbook.worksheets("Sheet1")
Dim lastRow as long
lastRow = .cells(.rows.count, "A").end(xlUp).row ' last used row in column A
Dim inputCRD() as variant
inputCRD = .range("A2:A" & lastRow).value2
Dim inputDisclosure() as variant
inputDisclosure = .range("L2:L" & lastRow).value2
Dim yearOffset as long
Dim quarterIndex as long
Dim numericalQuarterIndex as long
Dim totalRowCount as long
totalRowCount = (lastRow - 1) * TOTAL_QUARTERS ' -1 to skip first row
Dim outputArray() as variant
Redim outputArray(1 to totalRowCount, 1 to 5)
Dim readIndex as long
Dim writeIndex as long
For readIndex = lbound(inputCRD,1) to ubound (inputCRD,1)
' restart the calendar for each CRD
yearOffset = 0
quarterIndex = START_QUARTER
For numericalQuarterIndex = 1 to TOTAL_QUARTERS
writeIndex = writeIndex + 1
outputArray(writeIndex, 1) = inputCRD(readIndex, 1)
outputArray(writeIndex, 2) = START_YEAR + yearOffset
outputArray(writeIndex, 3) = quarterIndex
outputArray(writeIndex, 4) = numericalQuarterIndex
outputArray(writeIndex, 5) = inputDisclosure(readIndex, 1)
If quarterIndex < 4 then
quarterIndex = quarterIndex + 1
Else
yearOffset = yearOffset + 1
quarterIndex = 1
End if
Next numericalQuarterIndex
Next readIndex
.range("CH2").resize(ubound(outputArray,1), ubound(outputArray,2)).value2 = outputArray
End with
End sub
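Not part of either answer, just a sketch: the year/quarter roll-over that both macros do step by step can also be computed directly from the running quarter number. YearOfQuarter, QuarterOfYear and DemoQuarterMath are made-up names, and the sketch assumes the quarter number starts at 1.

```vba
Option Explicit

' Hypothetical helpers (not from the answers above).
' n is the 1-based running quarter number, i.e. the "QuarterNumerical" column.

Function YearOfQuarter(ByVal startYear As Long, ByVal startQuarter As Long, ByVal n As Long) As Long
    ' Quarters elapsed since Q1 of startYear, integer-divided by 4, give the year offset
    YearOfQuarter = startYear + (startQuarter - 1 + n - 1) \ 4
End Function

Function QuarterOfYear(ByVal startQuarter As Long, ByVal n As Long) As Long
    ' Wrap the running count back into the range 1..4
    QuarterOfYear = (startQuarter - 1 + n - 1) Mod 4 + 1
End Function

Sub DemoQuarterMath()
    Debug.Print YearOfQuarter(1978, 4, 1), QuarterOfYear(4, 1)      ' 1978  4
    Debug.Print YearOfQuarter(1978, 4, 160), QuarterOfYear(4, 160)  ' 2018  3
End Sub
```

So 160 quarters starting at 1978 Q4 end at 2018 Q3. One more thing to check before running any of this on the real data: 10,000 CRDs times 160 quarters is 1,600,000 rows, which is more than the 1,048,576 rows a single worksheet holds, so the output would probably have to be split across sheets.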

Related

Splitting this Cells into Multiple Rows

I am trying to split some cells into multiple rows. I want to break up the cells in column G (see the pictures below) so that each line in a cell gets its own row, with all the other data in the row repeated on the rows below. Is this possible?
Starting here:
and finishing with this:
If macro function is ok, here is sample code
Sub Split()
Dim meargedline As String
Dim rowNumber As Integer
rowNumber = 2
Dim splitData
For i = 2 To 4
meargedline = Cells(i, 3)
splitData = VBA.Split(meargedline, Chr(10))
For j = LBound(splitData, 1) To UBound(splitData, 1)
Cells(j + rowNumber, 4) = Cells(i, 1)
Cells(j + rowNumber, 5) = Cells(i, 2)
Cells(j + rowNumber, 6) = splitData(j)
Next j
rowNumber = rowNumber + UBound(splitData, 1) + 1
Next i
End Sub
This will do a basic loop through and generate an output. Note the constants and make sure you specify for your workbook. You can see an example of it here.
Sub runSplitter()
Const topRightCellAddress = "E20"
Const startCellToSetValues = "A1" 'where new rows will be placed
Const sheetOneName = "Start" 'make sure these match your sheet names
Const sheetTwoName = "Output"
Const codeOfLineSplitter = 10 'ASCII code of the line-feed character
Dim firstSheet As Worksheet, secondSheet As Worksheet
Set firstSheet = Sheets(sheetOneName)
Set secondSheet = Sheets(sheetTwoName)
Dim aCell As Range
Set aCell = firstSheet.Range(topRightCellAddress)
Dim aRR() As String
Dim r As Long
Do While Not IsEmpty(aCell)
aRR = Split(aCell.Value2, Chr(codeOfLineSplitter), -1)
Dim i As Long
With secondSheet.Range(startCellToSetValues)
For i = LBound(aRR) To UBound(aRR)
.Offset(r, 0).Value2 = aCell.Offset(0, -2).Value2
.Offset(r, 1).Value2 = aCell.Offset(0, -1).Value2
.Offset(r, 2).Value2 = aRR(i)
r = r + 1
Next i
End With
Set aCell = aCell.Offset(1, 0)
Loop
End Sub
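A small sketch of the splitting step on its own, in case it helps to test it away from the sheet. Both macros above split on Chr(10), the line feed that Alt+Enter puts into a cell; vbLf is the built-in constant for the same character. SplitCellLines and DemoSplitCellLines are made-up names.

```vba
Option Explicit

' Hypothetical helper: split a cell's text on the line feed that Alt+Enter inserts.
Function SplitCellLines(ByVal cellText As String) As String()
    ' vbLf is the built-in constant for Chr(10); Split returns a zero-based array
    SplitCellLines = Split(cellText, vbLf)
End Function

Sub DemoSplitCellLines()
    Dim parts() As String
    Dim i As Long
    parts = SplitCellLines("red" & vbLf & "green" & vbLf & "blue")
    For i = LBound(parts) To UBound(parts)
        Debug.Print i, parts(i)   ' 0 red, 1 green, 2 blue
    Next i
End Sub
```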
You can also use Power Query to achieve this. For reference, see https://www.youtube.com/watch?v=wJ6y2anloW4.

VBA code to not insert if data already in worksheet

I have the following macro, which is so close to what I need. The issue is that if the data is already in sheet2, it inserts a new line with the same data, whereas I don't want it duplicated. I have tried a few things but I can't quite get there.
'start with sheet 1
Sheets(1).Activate
Dim rowStartSheet1, rowStartSheet2, lastRowSheet1, lastRowSheet2 As Integer
'change this variable if your row doesn't start on 2 as in this example for sheet1 and sheet2
rowStartSheet1 = 2
rowStartSheet2 = 2
'gets you the last row in sheet 1
lastRowSheet1 = Cells(Rows.Count, 1).End(xlUp).Row
'this entire for block is to check if a data row in sheet 1 is in sheet 2 and if so, copy and paste the rest of the data points
For i = rowStartSheet1 To lastRowSheet1
'case 1 where column C matches column A first time around (no duplicates)
'change this variable if sheet 2 starts on a different row
Sheets(2).Activate
lastRowSheet2 = Cells(Rows.Count, 1).End(xlUp).Row
'loops through sheet 2 column A to check if it matches what we want in sheet1 Column C
For ii = rowStartSheet2 To lastRowSheet2
'inputs if found first time around
If Sheets(1).Cells(i, 3) = Cells(ii, 1) And Cells(ii, 7) = "" Then
Cells(ii, 7) = Sheets(1).Cells(i, 1)
Cells(ii, 8) = Sheets(1).Cells(i, 2)
Exit For
'if sheet2 column G already has info in it, create a new row
ElseIf Sheets(1).Cells(i, 3) = Cells(ii, 1) And Cells(ii, 7) <> "" Then
Rows(ii).Select
Selection.Insert Shift:=xlShiftDown
Cells(ii, 1) = Sheets(1).Cells(i, 3)
Cells(ii, 7) = Sheets(1).Cells(i, 1)
Cells(ii, 8) = Sheets(1).Cells(i, 2)
Exit For
End If
Next ii
Next i
End Sub
All help appreciated
SHEET1
SHEET2
In my code below I refer to columns by their name (like "A", "B") instead of their number as you have done. This isn't intended as criticism. On the contrary, I much prefer to use numbers and usually declare them in enumerations. However, I felt that you might find my code more readable with the syntax I chose.
Sub CopyUniqueItems()
' 09 Aug 2017
Const RsFirst As Long = 2
Const RtFirst As Long = 2
Const Lot As Long = 1
Const Part As Long = 2
Const Col As Long = 3
Dim WsS As Worksheet ' S = Source
Dim WsT As Worksheet ' T = Target
Dim Rng As Range
Dim Itm As Variant
Dim Rs As Long, RsLast As Long ' Row / last row in WsS
Dim Rt As Variant, RtLast As Long ' Row / last row in WsT
Set WsS = Worksheets(1) ' { better to call by name
Set WsT = Worksheets(2) ' { like Worksheets("Sheet2")
RsLast = WsS.Cells(WsS.Rows.Count, "C").End(xlUp).Row
Application.ScreenUpdating = False
For Rs = RsFirst To RsLast
With WsS
Itm = .Range(.Cells(Rs, "A"), .Cells(Rs, "C")).Value
End With
With WsT
RtLast = .Cells(.Rows.Count, "A").End(xlUp).Row
With .Columns("A")
Set Rng = .Range(.Cells(RtFirst), .Cells(RtLast))
End With
On Error Resume Next
Rt = Application.Match(Itm(1, Lot), Rng, 0)
If IsError(Rt) Then
' not found
Rt = Application.Max(RtLast + 1, RtFirst)
Else
' exists already
Rt = Rt + RtFirst - 1
Do
If (.Cells(Rt, "G").Value = Itm(1, Part)) And _
(.Cells(Rt, "H").Value = Itm(1, Col)) Then
Rt = 0
Exit Do
Else
Rt = Rt + 1
End If
Loop While .Cells(Rt, "A").Value = Itm(1, Lot)
If Rt Then .Rows(Rt).Insert Shift:=xlShiftDown ' Rt = 0 means the entry already exists
End If
If Rt Then
.Cells(Rt, "A").Value = Itm(1, Lot)
.Cells(Rt, "G").Value = Itm(1, Part)
.Cells(Rt, "H").Value = Itm(1, Col)
End If
End With
Next Rs
Application.ScreenUpdating = True
End Sub
BTW, Dim rowStartSheet1, rowStartSheet2, lastRowSheet1, lastRowSheet2 As Integer declares only lastRowSheet2 as Integer. The others have no declared type and are therefore Variants.
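A quick way to see this for yourself (a minimal sketch; DemoDimTypes and the variable names are made up for the demonstration):

```vba
Option Explicit

Sub DemoDimTypes()
    Dim a, b As Integer             ' a is a Variant, only b is an Integer
    Dim c As Long, d As Long        ' each name carries its own As clause
    Debug.Print TypeName(a)         ' Empty  (an uninitialised Variant)
    Debug.Print TypeName(b)         ' Integer
    Debug.Print TypeName(c), TypeName(d)   ' Long  Long
End Sub
```

For row numbers, Long is usually the safer choice anyway, since an Integer tops out at 32,767.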

Excel VBA: How to transform this kind of cells?

I am not sure if the title is correct. Please correct me if you have a better idea.
Here is my problem: Please see the picture.
This Excel sheet contains only one column, let's say column A. In column A some values repeat in consecutive cells two or three times (or even more).
I want the sheet transformed according to those repeated cells. For items that repeat three times or more, keep only two of them.
[Shown in the right part of the picture. There are three Bs originally; the target is to keep two Bs and delete the rest.]
It's a very difficult task for me. To make it easier, there is no need to delete the empty rows after the transformation.
Any kind of help will be highly appreciated. Thanks!
Update:
Please see the picture. Please don't delete the items if they show up again...
EDITED - SEE BELOW. Try this. Data is assumed to be in "Sheet1", and the reordered data is written to "Results". I named your repeated data (A, B, C, etc.) sMarker, and the values in between sInsideTheMarker. If the markers are not in consecutive cells, the code will fail.
Private Sub ReOrderData()
Dim lLastRow As Long
Dim i As Long ' Long, so sheets with more than 32,767 rows do not overflow
Dim a As Long
Dim j As Long
Dim k As Long
Dim sMarker As String
Dim sInsideTheMarker As String
'Get number of rows with data:
lLastRow = Worksheets("Sheet1").Cells(Rows.Count, 1).End(xlUp).Row
j = 0
k = 1
a = 2
'Scan all rows with data:
For i = 1 To lLastRow
If (Worksheets("Sheet1").Cells(i + 1, 1).Value = Worksheets("Sheet1").Cells(i, 1).Value) Then 'If two consecutive cells holds the same value
j = j + 1
If j = 1 Then
k = k + 1
a = 2
sMarker = Worksheets("Sheet1").Cells(i, 1).Value
Worksheets("Results").Cells(k, 1).Value = sMarker
End If
Else 'If not same values in consecutive cells
sInsideTheMarker = Worksheets("Sheet1").Cells(i, 1).Value
Worksheets("Results").Cells(k, a).Value = sInsideTheMarker
a = a + 1
j = 0
End If
Next i
End Sub
EDIT: If you want the results in the same sheet ("Sheet1"), and want to keep the empty rows so the results look exactly as in your question, try the following:
Private Sub ReOrderData()
Dim lLastRow As Long
Dim i As Long ' Long, so sheets with more than 32,767 rows do not overflow
Dim a As Long
Dim j As Long
Dim k As Long
Dim sMarker As String
Dim sInsideTheMarker As String
'Get number of rows with data:
lLastRow = Worksheets("Sheet1").Cells(Rows.Count, 1).End(xlUp).Row
j = 0
k = 1
a = 5
'Scan all rows with data:
For i = 1 To lLastRow
If (Worksheets("Sheet1").Cells(i + 1, 1).Value = Worksheets("Sheet1").Cells(i, 1).Value) Then 'If two consecutive cells holds the same value
j = j + 1
If j = 1 Then
k = i
a = 5
sMarker = Worksheets("Sheet1").Cells(i, 1).Value
Worksheets("Sheet1").Cells(k, 4).Value = sMarker
End If
Else 'If not same values in consecutive cells
sInsideTheMarker = Worksheets("Sheet1").Cells(i, 1).Value
Worksheets("Sheet1").Cells(k, a).Value = sInsideTheMarker
a = a + 1
j = 0
End If
Next i
End Sub
If you can delete the values that have more than two counts, then I suggest that this might work:
Sub count_macro()
Dim a As Integer
Dim b As Integer
a = 1
While Cells(a, 1) <> ""
b = WorksheetFunction.CountIf(Range("A1:A1000"), Cells(a, 1))
If b > 2 Then
Cells(a, 1).Delete Shift:=xlUp
Else
' only move on when nothing was deleted; after a delete the next value has moved up into row a
a = a + 1
End If
Wend
End Sub
This should do it. It takes input in column A starting in Row 2 until it ends, and ignores more than 2 same consecutive values. Then it copies them in sets and pastes them transposed. If your data is in a different column and row, change the sourceRange variable and the i variable accordingly.
Sub SETranspose()
Application.ScreenUpdating = False
Dim sourceRange As range
Dim copyRange As range
Dim myCell As range
Set sourceRange = range("A2", Cells(Rows.count, 1).End(xlUp))
Dim startCell As range
Set startCell = sourceRange(1, 1)
Dim i As Integer
Dim haveTwo As Boolean
haveTwo = True
For i = 3 To Cells(Rows.count, 1).End(xlUp).Row + 1
If Cells(i, 1).Value = startCell.Value Then
If haveTwo Then
range(startCell, Cells(i, 1)).Copy
startCell.Offset(0, 4).PasteSpecial Transpose:=True
Application.CutCopyMode = False
haveTwo = False
End If
End If
'if the letter changes or end of set, then copy the set over
'If LCase(Left(Cells(i, 1).Value, 1)) <> LCase(startCell.Value) Or _
'i = Cells(Rows.count, 1).End(xlUp).Row + 1 Then
If Len(Cells(i, 1).Value) > 1 Then
Set copyRange = Cells(i, 1)
copyRange.Copy
Cells(startCell.Row, Columns.count).End(xlToLeft).Offset(0, 1).PasteSpecial
Application.CutCopyMode = False
'Set startCell = sourceRange(i - 1, 1)
ElseIf Len(Cells(i, 1).Value) = 1 And Cells(i, 1).Value <> startCell.Value Then
Set startCell = sourceRange(i - 1, 1)
haveTwo = True
End If
Next i
'clear up data
Set sourceRange = Nothing
Set copyRange = Nothing
Set startCell = Nothing
Application.ScreenUpdating = True
End Sub
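If it helps to separate the two halves of the problem, here is a sketch of just the "keep at most two of each consecutive run" rule, working on a plain array instead of the sheet. It does not do the transposing. CapRuns is a made-up name, and the sketch assumes a non-empty one-dimensional array and maxRun of at least 1.

```vba
Option Explicit

' Hypothetical helper: copy items, keeping at most maxRun entries of each consecutive run.
' Assumes items is a non-empty one-dimensional array and maxRun >= 1.
Function CapRuns(items As Variant, ByVal maxRun As Long) As Variant
    Dim result() As Variant
    Dim i As Long, n As Long, runLength As Long
    ReDim result(0 To UBound(items) - LBound(items))
    n = -1
    For i = LBound(items) To UBound(items)
        If i > LBound(items) Then
            If items(i) = items(i - 1) Then
                runLength = runLength + 1   ' same value as the cell above: the run continues
            Else
                runLength = 1               ' value changed: a new run starts
            End If
        Else
            runLength = 1                   ' first item always starts a run
        End If
        If runLength <= maxRun Then
            n = n + 1
            result(n) = items(i)
        End If
    Next i
    ReDim Preserve result(0 To n)           ' trim the unused tail
    CapRuns = result
End Function

Sub DemoCapRuns()
    ' The third consecutive B is dropped; the later, separate B is kept
    Debug.Print Join(CapRuns(Array("A", "x", "B", "B", "B", "y", "B"), 2), ",")   ' A,x,B,B,y,B
End Sub
```

Because only consecutive repeats are counted, an item that shows up again later is left alone, which is what the update to the question asks for.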

How to change my code to run it more speedy?

I have a workbook with 170K rows, and I want to delete every row where the sum of two cells (columns K and L) is 0.
For this kind of operation I normally use the code below, but with 170K rows (about 90K of which will be deleted) it runs very slowly.
Does anyone know a faster way?
Thanks
Last = Cells(Rows.Count, "K").End(xlUp).Row
For i = Last To 2 Step -1
If (Cells(i, "K").Value + Cells(i, "L").Value) < 1 Then
Cells(i, "A").EntireRow.Delete
End If
Next i
As long as you're fine with putting the data on a new tab, the code below will do everything you need in 1.5 seconds.
Sub ExtractRows()
Dim vDataTable As Variant
Dim vNewDataTable As Variant
Dim vHeaders As Variant
Dim lastRow As Long
Dim i As Long, j As Long
Dim Counter1 As Long, Counter2 As Long
With Worksheets(1)
lastRow = .Cells(Rows.Count, "K").End(xlUp).row
vHeaders = .Range("A1:L1").Value2
vDataTable = .Range("A2:L" & lastRow).Value2
End With
For i = 1 To UBound(vDataTable)
If vDataTable(i, 11) + vDataTable(i, 12) > 0 Then
Counter1 = Counter1 + 1
End If
Next
ReDim vNewDataTable(1 To Counter1, 1 To 12)
For i = 1 To UBound(vDataTable)
If vDataTable(i, 11) + vDataTable(i, 12) > 0 Then
Counter2 = Counter2 + 1
For j = 1 To 12
vNewDataTable(Counter2, j) = vDataTable(i, j)
Next j
End If
Next
Worksheets.Add After:=Worksheets(1)
With Worksheets(2)
.Range("A1:L1") = vHeaders
.Range("A2:L" & Counter1 + 1) = vNewDataTable
End With
End Sub
Here is my approach to your problem, based on rwilson's idea.
I have already tested it, and it greatly reduces the execution time. Try it.
Sub deleteRow()
Dim newSheet As Worksheet
Dim lastRow As Long, newRow As Long, row As Long
Dim sheetname As String
Dim startTime As Double
sheetname = "sheetname"
With Sheets(sheetname)
Set newSheet = ThisWorkbook.Worksheets.Add(After:=Sheets(.Name))
'Firstly copy header
newSheet.Rows(1).EntireRow.Value = .Rows(1).EntireRow.Value
lastRow = .Cells(.Rows.Count, "K").End(xlUp).row
newRow = 2
For row = 2 To lastRow Step 1
If (.Cells(row, "K").Value + .Cells(row, "L").Value) >= 1 Then
newSheet.Rows(newRow).EntireRow.Value = .Rows(row).EntireRow.Value
newRow = newRow + 1
End If
Next row
End With
Application.DisplayAlerts = False
Sheets(sheetname).Delete
Application.DisplayAlerts = True
newSheet.Name = sheetname
End Sub
Here is a non-VBA option you can try:
In column M compute the sum of columns K and L
Highlight column M and then click Find & Select > Find
Type 0 in the Find what box, select Values in the Look in box, and tick Match entire cell contents so that values such as 10 are not matched
Select Find All and, in the box that lists the found items, select all entries (click in the box and press CTRL + A)
On the ribbon select Delete and then Delete sheet rows
Now manually delete column M
I haven't tried this with 170k+ rows but maybe worth assessing performance versus the VBA loop.
Thanks to all for your ideas, but the really fast code is this: use an array, populate it with the rows to keep (blanking the others), write it back over the whole table, and sort the table at the end:
Sub Macro13(control As IRibbonControl)
Dim avvio As Date
Dim arresto As Date
Dim tempo As Date
Application.ScreenUpdating = False
Application.Calculation = xlManual
avvio = Now()
Dim sh As Worksheet
Dim arng As Variant
Dim arrdb As Variant
Dim UR As Long, x As Long, y As Long
Dim MyCol As Integer
Set sh = Sheets("Rol_db")
MyCol = 1
sh.Select
UR = sh.Cells(Rows.Count, MyCol).End(xlUp).Row
ReDim arrdb(2 To UR, 1 To 12) As Variant
For x = 2 To UR
If Cells(x, 11) + Cells(x, 12) > 0 Then
For y = 1 To 12
arrdb(x, y) = Cells(x, y)
Next y
Else
For y = 1 To 12
arrdb(x, y) = ""
Next y
End If
Next x
sh.Range("A2:L" & UR) = arrdb
arresto = Now()
tempo = arresto - avvio
Debug.Print "Delete empty rows " & tempo
Range("A2:L" & UR).Sort key1:=Range("A2:L" & UR), _
order1:=xlAscending, Header:=xlNo
Range("A4").Select
ActiveWindow.FreezePanes = True
conclusioni:
Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic
End Sub
Time for my 170K-row sheet: 00:00:07.
As soon as I have a minute I will try looping over the columns.
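For comparing the variants in this thread, note that Now() only resolves to whole seconds. A sketch using the Timer function, which returns the seconds since midnight including a fractional part, might be more useful. ElapsedSeconds and TimeMacro are made-up names, and the call to the macro under test is left as a placeholder.

```vba
Option Explicit

' Hypothetical helper: seconds between two Timer readings.
Function ElapsedSeconds(ByVal startTimer As Double, ByVal endTimer As Double) As Double
    ' Timer counts seconds since midnight, so allow for a run that crosses midnight
    If endTimer < startTimer Then endTimer = endTimer + 86400#
    ElapsedSeconds = endTimer - startTimer
End Function

Sub TimeMacro()
    Dim t0 As Double
    t0 = Timer
    ' ... call the macro being measured here ...
    Debug.Print "Elapsed: " & Format$(ElapsedSeconds(t0, Timer), "0.00") & " s"
End Sub
```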

VBA Writes Extra Lines When Impossible To Do So

I was working on the piece of code below and had it functioning correctly. Somehow a line has changed, and now it fails to work.
I have a table of tags and weightings like so:
Tag | Weight
---------------
Sport | 1
Music | 1
And then another table of users, with tag + weight
User | Tag | Weight
The cell(j, "B") contains the username, as does the cell(2,"C") in the other worksheet
I am using the following code:
Sub swipeleft()
LastRowUser = Worksheets(13).Range("B65536").End(xlUp).Row
LastRowInput = Worksheets(14).Range("F65536").End(xlUp).Row
LastRowUser = LastRowUser + 1
newcount = 1
For j = 2 To LastRowUser
For k = 9 To LastRowInput
If Worksheets(14).Cells(k, "F") = Worksheets(13).Cells(j, "C") And Worksheets(13).Cells(j, "B") = Worksheets(14).Cells(2, "C") Then
Worksheets(13).Cells(j, "D") = Worksheets(13).Cells(j, "D") - Worksheets(14).Cells(k, "G")
ElseIf Not Worksheets(13).Cells(j, "B") = Worksheets(14).Cells(2, "C") Then
Worksheets(13).Cells(newcount + LastRowUser, "C") = Worksheets(14).Cells(k, "F")
Worksheets(13).Cells(newcount + LastRowUser, "D") = Worksheets(14).Cells(k, "G") * (-1)
Worksheets(13).Cells(newcount + LastRowUser, "B") = Worksheets(14).Cells(2, "C")
newcount = newcount + 1
End If
Next k
Next j
End Sub
This adds the rows when data is not present, but for some reason after the first run it keeps adding exponentially more rows, even though the second else condition is not met?
UPDATED FROM COMMENTS BELOW
Here is the user input page (Worksheet 14):
Here is the user database page (Worksheet 13):
On the user database page I would like it to add the two rows that don't exist (Music, Dance) and add the Sports tag weighting (-1) from the input page to the current value on the user database page.
Is this what you want?
Code:
Dim AR
Dim nWeight As Long
Dim wsI As Worksheet, wsO As Worksheet
Dim LRowWsI As Long, LRowWsO As Long, NewRowWsO As Long
Sub swipeleft()
Dim i As Long, j As Long
Set wsI = ThisWorkbook.Sheets(14)
Set wsO = ThisWorkbook.Sheets(13)
LRowWsI = wsI.Range("F" & wsI.Rows.Count).End(xlUp).Row
LRowWsO = wsO.Range("B" & wsI.Rows.Count).End(xlUp).Row
NewRowWsO = LRowWsO + 1
AR = wsI.Range("F9:G" & LRowWsI).Value
With wsO
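For i = LBound(AR) To UBound(AR)
If RecordExists(wsI.Range("C2").Value, AR(i, 1)) Then
' update only the row(s) that hold this user and tag
For j = 2 To LRowWsO
If .Range("B" & j).Value = wsI.Range("C2").Value And .Range("C" & j).Value = AR(i, 1) Then
.Range("D" & j).Value = AR(i, 2)
End If
Next j
Else
' append exactly one new row for this tag
.Range("B" & NewRowWsO).Value = wsI.Range("C2").Value
.Range("C" & NewRowWsO).Value = AR(i, 1)
.Range("D" & NewRowWsO).Value = AR(i, 2)
NewRowWsO = NewRowWsO + 1
End If
Next i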
End With
End Sub
Function RecordExists(sUser As Variant, sTag As Variant) As Boolean
Dim a As Long
With wsO
For a = 2 To LRowWsO
If .Range("B" & a).Value = sUser And .Range("C" & a).Value = sTag Then
RecordExists = True
Exit For
End If
Next
End With
End Function
Screenshot:
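A sketch of the lookup on its own, in case it is easier to reason about away from the sheets. The idea is to look up each input tag once and get back either the matching row or 0, so that each tag leads to exactly one update or one appended row. FindRecord is a made-up name, and the sketch assumes the users and tags have been read into two parallel one-dimensional arrays.

```vba
Option Explicit

' Hypothetical helper: return the 1-based position of the (user, tag) pair, or 0 if absent.
' users and tags are parallel one-dimensional arrays of the same size.
Function FindRecord(users As Variant, tags As Variant, ByVal user As String, ByVal tag As String) As Long
    Dim i As Long
    For i = LBound(users) To UBound(users)
        If users(i) = user And tags(i) = tag Then
            FindRecord = i - LBound(users) + 1
            Exit Function
        End If
    Next i
    FindRecord = 0   ' not found: the caller appends a single new row
End Function

Sub DemoFindRecord()
    Dim users As Variant, tags As Variant
    users = Array("ann", "ann", "bob")
    tags = Array("Sport", "Music", "Sport")
    Debug.Print FindRecord(users, tags, "bob", "Sport")   ' 3
    Debug.Print FindRecord(users, tags, "bob", "Dance")   ' 0
End Sub
```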

Resources