Delete duplicate rows and leave specific row behind - excel

I have data in the following example format:
ABC 001
ABC 002
ABC 003
ABC 004
I want to remove duplicate rows based on column A BUT keep the row with the highest value in column B (in this case 004). A simple duplicate removal doesn't give me control over which row is kept (unless I'm missing something).
This is part of a larger VBA code and therefore, I'd like to do it via VBA. I greatly appreciate any and all help.

Assuming that column B contains numeric values, you can use the code below to remove all duplicates except the one with the largest value. It works regardless of how the data is sorted, because it first loads the data into an array that tracks the largest column B value for each column A entry.
Sub RemoveDuplicates()
Dim sht As Worksheet
Dim NonDupArr() As Variant
Dim i As Long 'Long rather than Integer, so sheets with more than 32,767 rows don't overflow
Dim j As Long
Dim EntryFound As Boolean
Set sht = ActiveSheet
'Reads range into an array and retains the records with the largest value
For i = 2 To sht.Cells(sht.Rows.Count, 1).End(xlUp).Row Step 1
EntryFound = False
'If first entry
If i = 2 Then
ReDim Preserve NonDupArr(1 To 2, 1 To 1)
NonDupArr(1, 1) = sht.Cells(i, 1).Value
NonDupArr(2, 1) = sht.Cells(i, 2).Value
'For all other entries
Else
'Loops through array to see if entry already exists
For j = LBound(NonDupArr, 2) To UBound(NonDupArr, 2)
If sht.Cells(i, 1).Value = NonDupArr(1, j) Then
'If entry exists it replaces the value from column B if larger than
'the entry already in the array
If sht.Cells(i, 2).Value > NonDupArr(2, j) Then
NonDupArr(2, j) = sht.Cells(i, 2).Value
End If
EntryFound = True
Exit For
End If
Next j
'If no entry was found it is added to the array
If Not EntryFound Then
ReDim Preserve NonDupArr(1 To 2, 1 To UBound(NonDupArr, 2) + 1)
NonDupArr(1, UBound(NonDupArr, 2)) = sht.Cells(i, 1).Value
NonDupArr(2, UBound(NonDupArr, 2)) = sht.Cells(i, 2).Value
End If
End If
Next i
'Loops through the sheet and removes all rows that don't match rows in the array
For i = sht.Cells(sht.Rows.Count, 1).End(xlUp).Row To 2 Step -1
'Searches for match in array
For j = LBound(NonDupArr, 2) To UBound(NonDupArr, 2)
'If this is not the largest entry then the row is removed
If sht.Cells(i, 1).Value = NonDupArr(1, j) And sht.Cells(i, 2).Value <> NonDupArr(2, j) Then
sht.Cells(i, 1).EntireRow.Delete
Exit For
End If
Next j
Next i
End Sub
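As an alternative, here is a minimal sketch that leans on Excel's built-in tools instead of an array. It assumes that the row order does not need to be preserved, that row 1 holds headers, and that the data sits in columns A:B only; the sub name KeepMaxPerKey is made up for illustration. It sorts by column B in descending order and then calls Range.RemoveDuplicates, which keeps the first row of each duplicate group.

```vba
Sub KeepMaxPerKey()
    Dim sht As Worksheet
    Dim lastRow As Long
    Dim data As Range

    Set sht = ActiveSheet
    lastRow = sht.Cells(sht.Rows.Count, 1).End(xlUp).Row
    If lastRow < 2 Then Exit Sub    ' nothing below the header row

    ' Assumption: headers in row 1, data in columns A:B only
    Set data = sht.Range("A1:B" & lastRow)

    ' Largest column B values first, so the first row seen for each
    ' column A entry is the one with its maximum
    data.Sort Key1:=sht.Range("B1"), Order1:=xlDescending, Header:=xlYes

    ' RemoveDuplicates keeps the first row of each duplicate group in column A
    data.RemoveDuplicates Columns:=1, Header:=xlYes
End Sub
```

Note that this reorders the sheet, so the array-based code above is the safer choice when the original order matters.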

Related

Generate random numbers based on row and column totals

I would like to populate the blue area with random numbers.
The sum of C3 to R3 should equal the value in B3: 124.
Also, the sum of C3 to C26 should equal the value in C2: 705.
I tried to achieve it with the following code (originally posted by #Mech):
Sub RandomNumbersArray()
' dim your variables. this tells vba what type of variable it is working with
Dim lRow As Long
Dim wb As Workbook: Set wb = ThisWorkbook
Dim ws As Worksheet: Set ws = wb.Worksheets("SPLIT BY DAYS")
' find the last row in column b (2) in the above defined ws
lRow = ws.Cells(ws.Rows.Count, 2).End(xlUp).Row
' loop through rows 3 to last row
For i = 3 To lRow
' generate a random number between 0 and the row contents of column B (5)
ws.Cells(i, 3).Value = Int(Rnd() * (ws.Cells(i, 2).Value + 1))
' generate a random number between 0 and the difference between column B and column C
ws.Cells(i, 4).Value = Int(Rnd() * (ws.Cells(i, 2).Value - ws.Cells(i, 3).Value))
' subtract the difference between column B and the sum of column C and column D
ws.Cells(i, 5).Value = ws.Cells(i, 2).Value - (ws.Cells(i, 3).Value + ws.Cells(i, 4).Value)
' subtract the difference between column B and the sum of column C and column D and column E
ws.Cells(i, 6).Value = ws.Cells(i, 2).Value - (ws.Cells(i, 3).Value + ws.Cells(i, 4).Value + ws.Cells(i, 5).Value)
' subtract the difference between column B and the sum of column C and column D and column E and column F
ws.Cells(i, 7).Value = ws.Cells(i, 2).Value - (ws.Cells(i, 3).Value + ws.Cells(i, 4).Value + ws.Cells(i, 5).Value + ws.Cells(i, 6).Value)
Next i
' sum column C (column 3) and place the value in C2
ws.Cells(2, 3).Value = Application.WorksheetFunction.Sum(Range(Cells(3, 3), Cells(lRow, 3)))
' sum column D (column 4) and place the value in D2
ws.Cells(2, 4).Value = Application.WorksheetFunction.Sum(Range(Cells(3, 4), Cells(lRow, 4)))
' sum column E (column 5) and place the value in E2
ws.Cells(2, 5).Value = Application.WorksheetFunction.Sum(Range(Cells(3, 5), Cells(lRow, 5)))
' sum column F (column 6) and place the value in F2
ws.Cells(2, 6).Value = Application.WorksheetFunction.Sum(Range(Cells(3, 6), Cells(lRow, 6)))
' sum column G (column 7) and place the value in G2
ws.Cells(2, 7).Value = Application.WorksheetFunction.Sum(Range(Cells(3, 7), Cells(lRow, 7)))
End Sub
EDIT: Just to clarify, no negative numbers.
Here is something to try:
Set all cells to 0. Create a list of all cells (some kind of reference to each cell).
Now, randomly choose a cell from your list, and add 1 to that cell. The very first time, all cells will be 0, except for one, which will now be 1.
For this cell that you just incremented, add up the row and column and see if the sums have been reached. If either the row or the column sum has been reached, remove this cell reference from the list.
Repeat (randomly choose a cell from those remaining on the list) until the list is empty.
At each iteration you are randomly choosing one of the remaining cells in the reference list (not choosing from all the cells) and this list is getting smaller and smaller as column or row sums are reached.
Random cells will keep incrementing, and as long as the row and column totals are consistent with each other (they must add up to the same grand total), the reference list should empty out fairly quickly.
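The steps above could be sketched roughly as follows. This is only an illustration working on in-memory arrays: the names RandomGridFromTotals and DemoRandomGrid are made up, and the target arrays are assumed to be 1-based arrays of Long with the same grand total. One deliberate difference from the description: the sketch checks the row and column sums before incrementing rather than after, so a cell in an already full row or column is dropped from the list instead of pushing a total over its target.

```vba
Option Explicit

' Returns a grid of non-negative integers whose row and column sums
' match the given targets (hypothetical helper, 1-based arrays assumed).
Function RandomGridFromTotals(rowTargets() As Long, colTargets() As Long) As Long()
    Dim nRows As Long, nCols As Long
    nRows = UBound(rowTargets): nCols = UBound(colTargets)

    Dim grid() As Long, rowSum() As Long, colSum() As Long
    ReDim grid(1 To nRows, 1 To nCols)    ' all cells start at 0
    ReDim rowSum(1 To nRows)
    ReDim colSum(1 To nCols)

    ' Candidate list: cell (r, c) is stored as (r - 1) * nCols + c
    Dim candidates() As Long, remaining As Long, k As Long
    remaining = nRows * nCols
    ReDim candidates(1 To remaining)
    For k = 1 To remaining
        candidates(k) = k
    Next k

    Dim pick As Long, r As Long, c As Long
    Do While remaining > 0
        ' Choose one of the remaining candidates at random
        pick = Int(Rnd * remaining) + 1
        r = (candidates(pick) - 1) \ nCols + 1
        c = (candidates(pick) - 1) Mod nCols + 1
        If rowSum(r) >= rowTargets(r) Or colSum(c) >= colTargets(c) Then
            ' Row or column is full: remove the cell by overwriting it
            ' with the last candidate and shrinking the list
            candidates(pick) = candidates(remaining)
            remaining = remaining - 1
        Else
            grid(r, c) = grid(r, c) + 1
            rowSum(r) = rowSum(r) + 1
            colSum(c) = colSum(c) + 1
        End If
    Loop
    RandomGridFromTotals = grid
End Function

Sub DemoRandomGrid()
    ' Example targets: both lists add up to 12
    Dim rowTargets(1 To 2) As Long, colTargets(1 To 3) As Long
    rowTargets(1) = 5: rowTargets(2) = 7
    colTargets(1) = 4: colTargets(2) = 4: colTargets(3) = 4

    Randomize
    Dim grid() As Long
    grid = RandomGridFromTotals(rowTargets, colTargets)

    Dim r As Long
    For r = 1 To 2
        Debug.Print grid(r, 1), grid(r, 2), grid(r, 3)
    Next r
End Sub
```

Because a cell is only incremented while both its row and its column are below target, no total can be exceeded, and when the two target lists share the same grand total the loop should end with every total met. The result is random, but not necessarily uniformly distributed over all valid grids.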
I have a solution.
Answers so far have mostly been about finding values which are random, then fixing them to fit the totals.
I tried finding a calculated (non random) solution that fits the totals, then made a separate sub to randomize it. This way you can prevent the randomization from introducing negative values.
There are two procedures. This sub calls them both on the same Range:
Sub Call_Random_Array()
Dim wb As Workbook: Set wb = ThisWorkbook
Dim ws As Worksheet: Set ws = wb.Worksheets("SPLIT BY DAYS")
Dim RangeToFill as Range: Set RangeToFill = ws.Range("C3:R26") 'Edit this line to select whatever range you need to fill randomly
'Proportionately fill the array to fit totals:
Call ProportionateFillArray(RangeToFill)
'Randomize it x times
For x = 1 to 10 'increase this number for more randomisation
Call RandomizeValues(RangeToFill)
Next
End Sub
Proportionately fill the array to fit totals:
Sub ProportionateFillArray(rngAddress As Range)
Dim ws As Worksheet: Set ws = rngAddress.Worksheet
'Horizontal and Vertical target values as ranges:
Dim hTarg As Range, vTarg As Range
Set hTarg = rngAddress.Rows(1).Offset(-1, 0)
Set vTarg = rngAddress.Columns(1).Offset(0, -1)
'Check the totals match
If Not WorksheetFunction.Sum(hTarg) = WorksheetFunction.Sum(vTarg) Then
'totals don't match
MsgBox "Change the targets so both the horizontal and vertical targets add up to the same number."
Exit Sub
End If
With rngAddress
'Now fill rows and columns with integers
Dim Row As Long, Col As Long
For Row = 1 To .Rows.Count
For Col = 1 To .Columns.Count
.Cells(Row, Col) = Int( _
hTarg.Cells(Col) * vTarg.Cells(Row) / WorksheetFunction.Sum(hTarg) _
)
Next
Next
'Correct rounding errors
For Row = 1 To .Rows.Count
For Col = 1 To .Columns.Count
If Row = .Rows.Count Then
'Last row, so this column must be corrected come what may
.Cells(Row, Col) = .Cells(Row, Col) - WorksheetFunction.Sum(.Columns(Col)) + hTarg.Cells(Col)
ElseIf Col = .Columns.Count Then
'Last column, so must be corrected come what may
.Cells(Row, Col) = .Cells(Row, Col) - WorksheetFunction.Sum(.Rows(Row)) + vTarg.Cells(Row)
ElseIf _
(WorksheetFunction.Sum(.Rows(Row)) - vTarg.Cells(Row)) * _
(WorksheetFunction.Sum(.Columns(Col)) - hTarg.Cells(Col)) > 0 Then
'both row and column are incorrect in the same direction
.Cells(Row, Col) = .Cells(Row, Col) - WorksheetFunction.Max( _
WorksheetFunction.Sum(.Rows(Row)) - vTarg.Cells(Row), _
WorksheetFunction.Sum(.Columns(Col)) - hTarg.Cells(Col))
End If
Next
Next
End With
End Sub
Randomize an array without changing row or column totals:
Sub RandomizeValues(rngAddress As Range)
Dim ws As Worksheet: Set ws = rngAddress.Worksheet
Dim rngIncrease(1 To 2) As Range, rngDecrease(1 To 2) As Range, lDiff As Long
With rngAddress
'Select two cells to increase at random
For a = 1 To 2
Set rngIncrease(a) = .Cells(RndIntegerBetween(1, .Rows.Count), RndIntegerBetween(1, .Columns.Count))
Next
'Corresponding cells to decrease to make totals the same:
Set rngDecrease(1) = ws.Cells(rngIncrease(1).Row, rngIncrease(2).Column)
Set rngDecrease(2) = ws.Cells(rngIncrease(2).Row, rngIncrease(1).Column)
'Set the value to increase/decrease by - can't be more than the smallest rngDecrease Value, to prevent negative values
If Not WorksheetFunction.Min(rngDecrease(1).Value, rngDecrease(2).Value) > 1 Then
'Don't decrease a value below 1
Exit Sub
Else
lDiff = RndIntegerBetween(1, WorksheetFunction.Min(rngDecrease(1).Value, rngDecrease(2).Value) - 1)
End If
'Now apply the edits
For a = 1 To 2
rngIncrease(a) = rngIncrease(a) + lDiff
rngDecrease(a) = rngDecrease(a) - lDiff
Next
End With
End Sub
'The below is the Random Integer function, I also used it in my other answer
Function RndIntegerBetween(Min As Long, Max As Long) As Long
RndIntegerBetween = Int((Max - Min + 1) * Rnd + Min)
End Function
This code does what your code was trying to do, though not exactly what you described (see the comments). If this is what you were looking for, then your explanation was a bit off; otherwise, let me know what you meant.
Sub RandomNumbersArray()
Dim lRow As Long, lColumn As Long, remainingValue As Long
Dim wb As Workbook: Set wb = ActiveWorkbook
Dim ws As Worksheet: Set ws = wb.Worksheets("SPLIT BY DAYS")
lRow = ws.Cells(ws.Rows.Count, 2).End(xlUp).Row
lColumn = ws.Cells(2, ws.Columns.Count).End(xlToLeft).Column
For i = 3 To lRow 'loop through the rows
remainingValue = ws.Cells(i, 2).Value2
For j = 3 To lColumn 'loop through all the columns per row
' generate a random number between 0 and the row contents of column B - previous column
If j = lColumn Then 'last cell can't be random unless you want to extend the columns until the sum in B-column is met
ws.Cells(i, j).Value2 = remainingValue
Else
ws.Cells(i, j).Value2 = Int((remainingValue + 1) * Rnd)
End If
remainingValue = remainingValue - ws.Cells(i, j).Value2
Next j
Next i
For j = 3 To lColumn 'loop through the columns to set the sum
ws.Cells(2, j).Value2 = Application.WorksheetFunction.Sum(ws.Range(ws.Cells(3, j), ws.Cells(lRow, j)))
Next j
End Sub
However, in my tests no column beyond O has ever received a value above 0, because each random draw uses up about half of the remaining row total on average.

How to get code to correctly count items (a variable) from one spreadsheet and successfully display this information?

I need my VBA code to count all the "x"s on a certain spreadsheet (pc) and then transfer this information to a report (rp) that I am creating to display every individual's choices. At the moment the code identifies all the IDs on the sheet but only picks up the first 4 options for each individual, whereas some have many more than this. While working on this code I edited the options in columns K to Y, and I assume this is why the code only picks up the options that haven't been altered. I have made adaptations to the code but have no idea how to correct it so that all options are displayed.
Any help would be greatly appreciated!
[Image: Specific spreadsheet the code will read from]
[Image: Code report results]
rp.Cells(1, 1) = "Modules"
rp.Cells(1, 2) = "Student Count"
rp.Cells(1, 4) = "Students registered"
rp.Cells(1, 10) = "Students registered2" 'new
nRow = 2
For c = 2 To pc.Cells(1, Columns.Count).End(xlToLeft).Column
rp.Cells(nRow, 1) = pc.Cells(1, c)
rp.Cells(nRow, 2) = WorksheetFunction.CountIf(pc.Columns(c), "x")
nRow = nRow + 1
Next c
rp.Cells(1, 1).CurrentRegion.Borders.LineStyle = xlContinuous
If rp.Cells(2, 4).Text <> "" Then
rp.Cells(1, 4).CurrentRegion.Borders.LineStyle = xlContinuous
End If
rp.Rows(1).Font.Bold = True
rp.UsedRange.Columns.AutoFit
Although your code snippet is not sufficient to determine the cause of your problem, you would definitely gain by not interacting with the sheet while manipulating the data. Consider the example below as an alternative approach:
Option Explicit
Sub consolidate()
Dim arr, arrH
With Sheet1
arr = .Range("A1").CurrentRegion.Offset(1, 0).Value2 'get all data in memory
arrH = .Range(.Cells(1, 1), .Cells(1, UBound(arr, 2))).Value2 'get the header in an array
End With
Dim j As Long, i As Long, ii As Long: ii = 1
Dim arrC: ReDim arrC(1 To 1, 1 To UBound(arrH, 2)) '=> setup counter array
Dim arr2: ReDim arr2(1 To UBound(arr), 1 To UBound(arr, 2)) '=> setup new array to modify source data
For j = 1 To UBound(arr) 'traverse rows
For i = 1 To UBound(arr, 2) 'traverse columns
'here we can access each cell by referencing our array(<rowCounter>, <columnCounter>)
'e.g. arr(j,i) => if j = 1 and i = 1 we'll have the value of cell A2 (the range was offset one row to skip the header)
'we can dump these values anywhere in the activesheet, other sheet, other workbook, ..
'but to limit the number of interactions with our sheet object we can also use intermediate arrays
If arr(j, i) <> "" Then 'check if x
arr2(j, ii) = arrH(1, i) 'replace x with the value from the header
arr2(j, 1) = arr(j, 1) 'force the value in col1
ii = ii + 1 'increment consolidated counter
arrC(1, i) = arrC(1, i) + 1 'increment sum
End If
Next i
ii = 1 'reset consolidated counter for next line
Next j
'when we are ready with our data we dump it to the sheet
With Sheet2 'the With lets us re-use the sheet reference without typing it again
'the UBound function allows us to size the "range" to the same size as our array; once that's done we can just dump it to the sheet
.Range(.Cells(1, 1), .Cells(UBound(arrH, 2), 1)).Value2 = Application.WorksheetFunction.Transpose(arrH) 'transpose to get the col's in rows
.Range(.Cells(1, 2), .Cells(UBound(arrC, 2), 2)).Value2 = Application.WorksheetFunction.Transpose(arrC)
.Range(.Cells(1, 4), .Cells(UBound(arr2), UBound(arr2, 2) + 3)).Value2 = arr2
End With
End Sub

Use VBA to display select range values in Excel listbox

I have tried without success to display a non-contiguous range of row cells in a 2-column listbox in Excel 2016. The latest suggestion I'm using is to move the range values into an array and use the array to fill the listbox.
I used the FindAll function to search through a large named range to extract rows that met the string criteria. I end up with a 28 row x 4 column non-contiguous range (e.g. A3:D3, A7:D7, A15:D15, A25:D25, ...).
My issue is at the top of the 2nd loop cycle:
Dim result As Range, item As Range
Dim Arr() As String, sStr As String
sStr = "A3:D3","A7:D7","A15:D15","A25:D25"
Set result = Range(sStr)
i = 0
For Each item In result
ReDim Preserve Arr(i+1, 4)
Arr(i , 1) = item(1, 1).Value
Arr(i , 2) = item(1, 2).Value
Arr(i , 3) = item(1, 3).Value
Arr(i , 4) = item(1, 4).Value
i = i + 1
Next
The 1st loop iteration works, with correct values passed from item to Arr, but in the 2nd iteration it all goes awry. Looking at the watch list, the item values are not correct, and the loop fails at the ReDim statement with run-time error '9': Subscript out of range. The cause must be the range index and/or the ReDim statement.
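Does this work for you? It loops over result.Areas (one area per row block) instead of over the individual cells, and it sizes the array once before the loop, since ReDim Preserve can only resize the last dimension of an array.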
Dim result As Range, item As Range
Dim Arr() As String, sStr As String
sStr = "A3:D3,A7:D7,A15:D15,A25:D25"
Set result = Range(sStr)
ReDim Arr(result.Areas.Count, 4)
i = 0
For Each item In result.Areas
Arr(i, 1) = item(1, 1).Value
Arr(i, 2) = item(1, 2).Value
Arr(i, 3) = item(1, 3).Value
Arr(i, 4) = item(1, 4).Value
i = i + 1
Next
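To get from the array to the listbox itself, something along these lines might work. It is only a sketch: it assumes a UserForm containing a list box named ListBox1 (a hypothetical name), shows all four columns rather than two, and builds a zero-based Variant array because ListBox.List is zero-based.

```vba
' Goes in the code module of the UserForm that holds ListBox1 (hypothetical name)
Private Sub UserForm_Initialize()
    Dim result As Range, area As Range
    Dim arr() As Variant
    Dim i As Long, j As Long

    Set result = ActiveSheet.Range("A3:D3,A7:D7,A15:D15,A25:D25")

    ' One array row per area, four columns, all zero-based to match ListBox.List
    ReDim arr(0 To result.Areas.Count - 1, 0 To 3)
    For Each area In result.Areas
        For j = 0 To 3
            arr(i, j) = area.Cells(1, j + 1).Value
        Next j
        i = i + 1
    Next area

    With Me.ListBox1
        .ColumnCount = 4   ' set to 2 and copy only two columns for a 2-column list
        .List = arr
    End With
End Sub
```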

Architecture to grab range

My code mostly works but it's taking a while to debug so I am beginning to think my architecture may be flawed XD So how can I architect this better?
I have groups of data separated by a blank row. You can tell each group apart by the ID in column C in addition to the blank row. For each ID, I have various numbers in column B that I need to capture. Sometimes those numbers only start with 5, sometimes it starts with 7. I need to capture the 5 and the 7 separately.
With projWS
With .Range("C1:C6000")
Set f = .Find(cc, LookIn:=xlValues, lookat:=xlPart)
End With
If Not f Is Nothing Then 'first occurrence found
counter = 0
i = f.Row
Do
acct = .Cells(i, 2)
If (Len(projWS.Cells(i, 3)) < 1 Or Left(acct, 1) = "7") And done = False Then
acctStart = f.Row
acctRows = i - acctStart
Set acctRng = .Range(.Cells(acctStart, 2), .Cells(i - 1, 5))
Set amountRng = .Range(.Cells(acctStart, 7), .Cells(i - 1, 8))
done = True 'set flag to show range has been filled
End If
counter = counter + 1 'increment counter
i = i + 1 'move to next row
Loop Until Len(.Cells(i, 3)) < 1 'keep looping until blank row
End If
If counter - 1 > acctRows Then 'how we determine if there's a "7"
flag = True 'so we set flag to true
Set depreRng = Range(.Cells(acctStart + acctRows, 2), .Cells(i - 1, 8))
dep = depreRng.Value2 'store range into array
End If
End With
After capture, I need to drop it into another worksheet. This worksheet already has a block of 7 built in. Hence this is the loop I am using to drop the range of 7. There is no built in block for the 5.
For r = 112 To 120
For k = 1 To UBound(dep())
If .Cells(r, 1).Value2 = Trim(dep(k, 1)) Then
Debug.Print .Cells(r, 1).Value2
.Cells(r, 6) = dep(k, 6)
.Cells(r, 7) = dep(k, 7)
Exit For
Else
.Cells(r, 6) = 0
.Cells(r, 7) = 0
End If
Next k
Next r
I have debugged several errors already. The current one is that depreRng is breaking because my math is bad. Instead of debugging each error as I stumble onto it, how can I architect this better?
OK, my approach is different. First I use a filter to find the rows with the index you are looking for, then I loop over those filtered rows to find the 5xx and 7xx ranges. The code:
Sub Macro1()
Dim rng_5xx_start As Long, rng_5xx_stop As Long, rng_7xx_start As Long, rng_7xx_stop As Long
rng_5xx_start = 0
rng_5xx_stop = 0
rng_7xx_start = 0
rng_7xx_stop = 0
Dim range_5xx As String, range_7xx As String
'filter for the index you are looking for
'specify the maximum range; Field is the position of the column within that range, counted from column B (the first column of the range), so to filter on column C use 2; Criteria1 is the value to match
ActiveSheet.Range("$B$1:$H$6000").AutoFilter Field:=2, Criteria1:="b"
'the filter leaves visible only the rows with the specified index; now loop over those rows to find the 5xx and 7xx sub-ranges
For Each Row In ActiveSheet.Range("b1:b6000").SpecialCells(xlCellTypeVisible)
If Cells(Row.Row, 2).Value > 4999 And Cells(Row.Row, 2).Value < 6000 Then
'or any other test for whether the row is in the 5xx range; if you prefer strings, use something like Left(Cells(Row.Row, 2).Value, 1) = "5"
If rng_5xx_start = 0 Then 'found the first row with a 5xx value
rng_5xx_start = Row.Row 'set the start of the range to this row
End If
If rng_5xx_stop < Row.Row Then 'the current row is in the 5xx range and is greater than the end recorded so far
rng_5xx_stop = Row.Row 'update the end of the range; after the loop this holds the last row of the 5xx range
End If
End If
If Cells(Row.Row, 2).Value > 6999 And Cells(Row.Row, 2).Value < 8000 Then
'same as above but for 7xx range
If rng_7xx_start = 0 Then
rng_7xx_start = Row.Row
End If
If rng_7xx_stop < Row.Row Then
rng_7xx_stop = Row.Row
End If
End If
Next
If rng_5xx_start = 0 Then
'not found 5xx rows
range_5xx = "" 'or False, or what you prefer...
Else
range_5xx = "B" & rng_5xx_start & ":H" & rng_5xx_stop
End If
If rng_7xx_start = 0 Then
'not found 7xx rows
range_7xx = "" 'or False, or what you prefer...
Else
range_7xx = "B" & rng_7xx_start & ":H" & rng_7xx_stop
End If
End Sub
That's how I would imagine a macro for your job ;)
Edit 1:
I forgot that this leaves the filter switched on. Use ActiveSheet.ShowAllData to show all the rows again, not only the filtered ones.
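A small sketch of that clean-up step, under the assumption that the filtered sheet is still the active one (the sub name ClearFilterSafely is made up). The FilterMode check is there because ShowAllData raises an error when no rows are currently filtered out.

```vba
Sub ClearFilterSafely()
    With ActiveSheet
        ' ShowAllData fails when nothing is filtered, so check first
        If .FilterMode Then .ShowAllData
        ' Optional: also remove the filter drop-down arrows
        .AutoFilterMode = False
    End With
End Sub
```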
Edit 2:
The tests
If rng_5xx_stop < Row.Row Then
rng_5xx_stop = Row.Row
End If
and
If rng_7xx_stop < Row.Row Then
rng_7xx_stop = Row.Row
End If
are not necessary, since the loop visits the rows from top to bottom; it is enough to do rng_5xx_stop = Row.Row and rng_7xx_stop = Row.Row and drop the two If statements.
You are grouping cells based on the first number of the cell values in column B (I am assuming that they can never be letters). If that is the case, then you can create an array of 0 to 9 and store your ranges in there. Then go through the range.areas in order to get the groupings you're looking for (as highlighted in your screenshot).
To do this, something like this is all you need. I commented code to try to explain it more:
Sub tgr()
Dim wsData As Worksheet
Dim rColB As Range
Dim BCell As Range
Dim aRanges(0 To 9) As Range
Dim SubGroup As Range
Dim lRangeNum As Long
Dim i As Long
'Change to your actual worksheet
Set wsData = ActiveWorkbook.ActiveSheet
'Change to your actual column range, this is based off the sample data
Set rColB = wsData.Range("B1", wsData.Cells(wsData.Rows.Count, "B").End(xlUp))
'Loop through the column range
For Each BCell In rColB.Cells
'Make sure the cell is populated and the starting character is numeric
If Len(BCell.Value) > 0 And IsNumeric(Left(BCell.Value, 1)) Then
'Get the starting digit
lRangeNum = Val(Left(BCell.Value, 1))
'Check if any ranges have been assigned to that array index location
'If not, start a range at that array index
'If so, combine the ranges with Union
Select Case (aRanges(lRangeNum) Is Nothing)
Case True: Set aRanges(lRangeNum) = BCell
Case Else: Set aRanges(lRangeNum) = Union(aRanges(lRangeNum), BCell)
End Select
End If
Next BCell
'You can use any method you want to access the ranges, this just loops
'through the array indices and displays the range areas of each
For i = 0 To 9
If Not aRanges(i) Is Nothing Then
For Each SubGroup In aRanges(i).Areas
'Do what you want with it here
'This just selects the subgroup so you can see it found the groups properly
SubGroup.Select
MsgBox SubGroup.Address
Next SubGroup
End If
Next i
End Sub
I see you've already rewritten your code, but I'd like to offer how I would do it and would like to know your thoughts about it. Would this be inefficient? I guess it could be, because you have to read the first character of a cell 4 times for every increment, but I'm not sure if that is a big problem.
Dim start_row As Long
Dim end_row As Long
start_row = 1
end_row = 0
For i = 2 To Range("B" & Rows.Count).End(xlUp).Row
If Cells(i - 1, 2) = "" Then
start_row = i
ElseIf Left(Cells(i - 1, 2), 1) <> Left(Cells(i, 2), 1) Then
start_row = i
End If
If Cells(i + 1, 2) = "" Then
end_row = i
ElseIf Left(Cells(i + 1, 2), 1) <> Left(Cells(i, 2), 1) Then
end_row = i
End If
If end_row <> 0 Then
Call copy_range(start_row, end_row)
end_row = 0
End If
Next i
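Another approach that lets you read the character only once could be:
Dim start_row As Long
Dim end_row As Long
Dim last_row As Long
Dim i As Long
Dim char_above As String
Dim this_char As String
start_row = 1
last_row = Range("B" & Rows.Count).End(xlUp).Row
For i = 1 To last_row
If Cells(i, 2) = "" Then
'Blank row closes the current group, if there is one
end_row = i - 1
If char_above <> "" Then Call copy_range(start_row, end_row, char_above)
char_above = ""
start_row = i + 1
Else
this_char = Left(Cells(i, 2), 1)
If this_char <> char_above Then
'First character changed: close the previous group and start a new one
end_row = i - 1
If char_above <> "" Then Call copy_range(start_row, end_row, char_above)
start_row = i
End If
char_above = this_char
End If
Next i
'Close the final group, which has no blank row or character change after it
If char_above <> "" Then Call copy_range(start_row, last_row, char_above)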
Let me know your thoughts.

VBA Excel - Looping through a large data set and finding average of certain rows

I am very new to writing in VBA and am struggling with the following task.
I have a workbook with multiple sheets with a large set of data in each (10000 rows). I am able to remove the data I don't need quite easily and can sort the data. I am left with column 1 - a list of Parts, and columns 4 and 5 - Planned Time and Actual Time.
What I want to do with this data is to find the average of columns 4 and 5 for each unique value in column 1. I thought it would be easiest to do the following
Loop for each worksheet
Sort the data for "Part"
Create a variable array
Loop for each row
If the previous row "Part" is the same as the current row, then add that row's "Planned Time" and "Actual Time" to the variable array
If the previous row "Part" is different, calculate the average of the data in the variable array
Output the averages to a Results sheet with their unique "Part"
Any help would be appreciated. Mainly how to work with the variable array and how to perform the check to fill the array. Thank you.
Mark,
I've put this VBA macro together for you, which should do the trick. The script loops through all your worksheets and summarises the info into an array (as you asked). The array is then output to the results table.
Note: You will need to ensure that your workbook contains a sheet called "Results". The script will output the details you need to the "Results" sheet.
Option Explicit
Sub getResults()
'set variables
Dim ws As Worksheet
Dim lastRow As Long
Dim i As Long
Dim ii As Long
Dim partName As String
'set array to contain the parts/average data
Dim partsAverageArray() As Variant
ReDim partsAverageArray(1 To 4, 1 To 1)
'loop through each sheet in the workbook
For Each ws In ActiveWorkbook.Sheets
'ignore worksheet if its name is "Results"
If Not ws.Name = "Results" Then
'get last row in the sheet using column A (size of the table of parts)
lastRow = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row
'loop down the table of parts data starting at row 2 (assuming that row 1 contains the column headings)
i = 2
For i = 2 To lastRow
'get the part name
partName = ws.Cells(i, 1).Value
'check if the part does/does not exist within the array yet
'loop through the array to get this info
'check if array has any info in it yet
If partsAverageArray(1, 1) = "" Then
'array is blank so add the first part
'add part name
partsAverageArray(1, 1) = partName
'part occurrences
partsAverageArray(2, 1) = 1
'sum of time planned
partsAverageArray(3, 1) = ws.Cells(i, 4).Value
'sum of time taken (actual)
partsAverageArray(4, 1) = ws.Cells(i, 5).Value
Else
'array already exists so loop through it looking for a part match
ii = 1
For ii = 1 To UBound(partsAverageArray, 2)
'test for a part match
If partsAverageArray(1, ii) = partName Then
'match found
'so add/cumulate data into the array
'part occurrences (add 1)
partsAverageArray(2, ii) = partsAverageArray(2, ii) + 1
'sum of time planned (total)
partsAverageArray(3, ii) = partsAverageArray(3, ii) + ws.Cells(i, 4).Value
'sum of time taken (actual) (total)
partsAverageArray(4, ii) = partsAverageArray(4, ii) + ws.Cells(i, 5).Value
'stop the loop of the array
ii = UBound(partsAverageArray, 2)
Else
'part name does not match
'check if the end of the array has been reached
If ii = UBound(partsAverageArray, 2) Then
'the end of the array has been reached and the part not found
'therefore add one more column to the array and put the part's details into it
ReDim Preserve partsAverageArray(1 To 4, 1 To (UBound(partsAverageArray, 2) + 1))
'add part name
partsAverageArray(1, UBound(partsAverageArray, 2)) = partName
'part occurrences
partsAverageArray(2, UBound(partsAverageArray, 2)) = 1
'sum of time planned
partsAverageArray(3, UBound(partsAverageArray, 2)) = ws.Cells(i, 4).Value
'sum of time taken (actual)
partsAverageArray(4, UBound(partsAverageArray, 2)) = ws.Cells(i, 5).Value
'stop the loop of the array
ii = UBound(partsAverageArray, 2)
Else
'part name has not been found and the array has not looped to the end.
'therefore keep the array looping and do nothing
End If
End If
Next ii
End If
Next i
End If
Next ws
'--------------------------------------------------------
'output data from the array to the results sheet
'--------------------------------------------------------
Set ws = Sheets("Results")
'set the results table headings
ws.Cells(1, 1).Value = "Part"
ws.Cells(1, 2).Value = "Part Count"
ws.Cells(1, 3).Value = "Planned Time (Average)"
ws.Cells(1, 4).Value = "Actual Time (Average)"
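'clear the old results from the table before adding the new results
lastRow = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row
'only clear when there are old results; with lastRow = 1 the range "A2:D1" would also wipe the headings
If lastRow > 1 Then ws.Range("A2:D" & lastRow).ClearContents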
i = 1
For i = 1 To UBound(partsAverageArray, 2)
'part name
ws.Cells(i + 1, 1).Value = partsAverageArray(1, i)
'part count
ws.Cells(i + 1, 2).Value = partsAverageArray(2, i)
'average (planned)
ws.Cells(i + 1, 3).Value = partsAverageArray(3, i) / partsAverageArray(2, i)
'average (actual)
ws.Cells(i + 1, 4).Value = partsAverageArray(4, i) / partsAverageArray(2, i)
Next i
'view results
ws.Activate
End Sub
Hope this helps!
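For comparison, here is a rough sketch of the same summary built with a Scripting.Dictionary instead of searching the array for every row. This is a swapped-in technique, not part of the macro above: it assumes Windows (where CreateObject("Scripting.Dictionary") is available), the same sheet layout (part in column 1, planned time in column 4, actual time in column 5, headings in row 1), and the sub name getResultsWithDictionary is made up.

```vba
Option Explicit

Sub getResultsWithDictionary()
    Dim totals As Object    ' part name -> Array(count, planned sum, actual sum)
    Dim ws As Worksheet, out As Worksheet
    Dim lastRow As Long, i As Long
    Dim partName As String
    Dim entry As Variant, key As Variant

    Set totals = CreateObject("Scripting.Dictionary")

    For Each ws In ActiveWorkbook.Worksheets
        If ws.Name <> "Results" Then
            lastRow = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row
            For i = 2 To lastRow    ' row 1 is assumed to hold the headings
                partName = ws.Cells(i, 1).Value
                If totals.Exists(partName) Then
                    entry = totals(partName)
                Else
                    entry = Array(0&, 0#, 0#)
                End If
                entry(0) = entry(0) + 1
                entry(1) = entry(1) + ws.Cells(i, 4).Value
                entry(2) = entry(2) + ws.Cells(i, 5).Value
                ' The dictionary hands back a copy of the array, so store the update
                totals(partName) = entry
            Next i
        End If
    Next ws

    Set out = ActiveWorkbook.Worksheets("Results")
    ' Clear everything below the headings, then write headings and averages
    out.Range("A2:D" & out.Rows.Count).ClearContents
    out.Range("A1:D1").Value = Array("Part", "Part Count", _
        "Planned Time (Average)", "Actual Time (Average)")
    i = 2
    For Each key In totals.Keys
        entry = totals(key)
        out.Cells(i, 1).Value = key
        out.Cells(i, 2).Value = entry(0)
        out.Cells(i, 3).Value = entry(1) / entry(0)
        out.Cells(i, 4).Value = entry(2) / entry(0)
        i = i + 1
    Next key
End Sub
```

With around 10000 rows per sheet, the dictionary lookup avoids scanning the whole parts list for every row, and no sorting is needed beforehand.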
