VBA to count distinct dates based on condition - excel

I am currently using this formula to calculate the unique number of dates for each user in column R
=SUM(IF(Q1=$J$2:$J$39115, 1/(COUNTIFS($J$2:$J$39115, Q1, $H$2:$H$39115, $H$2:$H$39115)), 0))
This formula works, but it takes forever to calculate because of the large number of rows it has to work through. I am sure there must be a quicker way using a macro or VBA.
Column J contains the list of users (each user appears multiple times) and column H contains the dates.
I need to calculate the number of days worked by each user in column Q (column Q contains the unique list of users).

Add the following into a VBA module:
Public Function count_distinct(user As Range, userlist As Range, datelist As Range) As Long
Dim i As Long, j As Long, dates() As Variant, found As Boolean, this_date As Variant
Dim varUL As Variant, varDL As Variant, unique_count As Long
varUL = userlist.Value
varDL = datelist.Value
ReDim dates(0 To 0)
For i = 1 To UBound(varUL, 1)
If varUL(i, 1) = user Then
found = False
this_date = varDL(i, 1)
For j = 1 To unique_count
If this_date = dates(j) Then
found = True
Exit For
End If
Next j
If Not found Then
unique_count = unique_count + 1
ReDim Preserve dates(0 To unique_count)
dates(unique_count) = this_date
End If
End If
Next i
count_distinct = unique_count
End Function
Then replace your formula in column R with: =count_distinct(Q1,$J$2:$J$39115,$H$2:$H$39115)
It's not the quickest and I am sure it could be improved, but it may give you back a few minutes!!
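One possible improvement, sketched here under some assumptions, is to replace the inner loop over already-seen dates with a dictionary lookup. The function name CountDistinctDict is hypothetical, the late-bound Scripting.Dictionary is only available on Windows, and the sketch assumes both ranges are single columns of equal height with more than one row. Like the function above, it treats two values as the same date only when the whole cell values are equal.

```vba
Option Explicit

' Hypothetical alternative to count_distinct: a dictionary remembers the
' dates already seen, so each matching row costs one lookup instead of a scan.
Public Function CountDistinctDict(user As Variant, userList As Range, dateList As Range) As Long
    Dim users As Variant, dates As Variant
    Dim seen As Object
    Dim target As Variant
    Dim i As Long

    target = user                  ' a Range argument collapses to its value here
    users = userList.Value         ' read both columns once into arrays
    dates = dateList.Value
    Set seen = CreateObject("Scripting.Dictionary")

    For i = 1 To UBound(users, 1)
        If users(i, 1) = target Then
            ' Assigning to a key adds it only if it is not there yet
            seen(CStr(dates(i, 1))) = True
        End If
    Next i

    CountDistinctDict = seen.Count
End Function
```

It would be called the same way, for example =CountDistinctDict(Q1,$J$2:$J$39115,$H$2:$H$39115). Each call still reads every row once, so the gain is largest for users with many distinct dates.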

Related

Split time range in 1 hour intervals

I want to split a time range into 1 hour intervals.
The time range is given in cells A2 and A3. It will be changed several times, and on a (macro) click it should be split into 1 hour intervals.
Create an Hourly Sequence
Sub CreateHourlySequence()
Dim ws As Worksheet: Set ws = ActiveSheet ' improve!
Dim dt1: dt1 = ws.Range("A2").Value
Dim dt2: dt2 = ws.Range("A3").Value
Dim dfCell As Range: Set dfCell = ws.Range("C2")
dfCell.Resize(ws.Rows.Count - dfCell.Row + 1).ClearContents
Select Case False
Case IsDate(dt1), IsDate(dt2): Exit Sub
End Select
Dim dtDiff As Long: dtDiff = DateDiff("h", dt1, dt2)
Dim dtStart As Date, dStep As Long
Select Case dtDiff
Case Is > 0: dtStart = dt1: dStep = 1
Case Is < 0: dtStart = dt1: dStep = -1 ' count down from A2 towards A3
Case Else: dtStart = dt1 ' same hour: write the single start value
End Select
Dim rCount As Long: rCount = Abs(dtDiff) + 1
Dim Data() As Date: ReDim Data(1 To rCount, 1 To 1)
Dim d As Long, r As Long
If dStep = 0 Then
Data(1, 1) = dtStart
Else
For d = 0 To dtDiff Step dStep
r = r + 1
Data(r, 1) = DateAdd("h", d, dtStart)
Next d
End If
dfCell.Resize(rCount).Value = Data
End Sub
If you are ok with a non-VBA solution, then you have some options.
Option 1: SEQUENCE
In Excel versions that support it, you could use the SEQUENCE function as suggested by chris neilsen.
Example:
Let's assume that your data starts at A1 like this:
Then, in C2, you could have :
=SEQUENCE((A3-A2)/VALUE("01:00:00")+1,1,A2,VALUE("01:00:00"))
Note that VALUE("01:00:00") represents 1 hour (but you could also use TIME(1,,) as suggested by Mayukh Bhattacharya).
Option 2: Dynamic Array Formula
If you have an Excel version that supports dynamic arrays, you can use a dynamic array formula.
Example:
Making the same assumptions as option 1, for where the data is, you could use a formula like this one:
=(ROW(INDIRECT("1:"&(A3-A2)/VALUE("01:00:00")+1))-1)*VALUE("01:00:00")+A2
Explanations:
Using the INDIRECT function inside the ROW function is a neat trick to get an array of consecutive values. For instance, INDIRECT("1:9") returns a reference to rows 1 to 9, and passing it to ROW returns the row numbers as a column like this {1;2;3;4;5;6;7;8;9} (we get only one element per row).
Since we don't know in advance how many steps we will take, we calculate the number of elements using (A3-A2)/VALUE("01:00:00")+1 and concatenate it to "1:" to get a range of the size that we need.
When we have the sequential array, we just need to make sure it starts at zero, which is why we subtract 1 from all elements of the array like this:
ROW(INDIRECT("1:"&(A3-A2)/VALUE("01:00:00")+1))-1
Finally, we multiply each element of the array by the value corresponding to 1 hour and add the starting point in A2.
Option 3: Old array formula
Same idea as option 2, but entered as a legacy array formula. Basically, you'll have to confirm it with Ctrl+Shift+Enter.
A simple solution given your example (clearing the cells is your job :-)
It would be better to write to an array, but as an example it should be OK.
Option Explicit
Sub TimeToHour()
Dim startTime As Double, endTime As Double, i As Double, z As Double
startTime = Range("a2")
endTime = Range("a3")
Columns(3).NumberFormat = Range("a2").NumberFormat ' Column C
z = 2
For i = startTime To endTime Step 1 / 24
Cells(z, 3) = i ' write to column c starting in row 2
z = z + 1
Next
End Sub
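Because the loop above steps a Double by 1/24, accumulated rounding can in principle push the counter just past the end time and drop the last hour. A minimal sketch of one way around that is to loop over a whole number of hours instead. The name TimeToHourByCount and the half-second tolerance are assumptions made for illustration, not part of the answer above.

```vba
Option Explicit

' Sketch: same output as TimeToHour, but the loop counter is a whole number
' of hours, so repeated addition of 1/24 cannot drift past the end time.
Sub TimeToHourByCount()
    Dim startTime As Double, endTime As Double
    Dim hourCount As Long, k As Long

    startTime = Range("A2").Value
    endTime = Range("A3").Value
    If endTime < startTime Then Exit Sub

    ' Whole hours between start and end; the tolerance of half a second
    ' (expressed in hours, an arbitrary choice) absorbs rounding in the cells
    hourCount = Int((endTime - startTime) * 24 + 0.5 / 3600)

    Columns(3).NumberFormat = Range("A2").NumberFormat   ' column C
    For k = 0 To hourCount
        Cells(k + 2, 3).Value = startTime + k / 24       ' rows 2, 3, 4, ...
    Next k
End Sub
```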

Assign a variable to cells to compare multiple numbers

I have a data set where I need to compare the first number in each transect against each other.
For example, in the below data set I need to compare cells D2, D7, D12 and D17 and assign a value based on which one is the smallest number, the next smallest and so on. This will be used to assign the transect numbers in column A.
My issue is that the number of sections (in this example 4) and the number of transects (also 4 in this example) will vary. So the cells I need to compare will change.
I have written the code that calculates the number of transects, which is:
Dim tlength As Worksheet
Dim tb As Long 'tb=transect break
Sub tlength_start_stop_coords()
Dim ws As Worksheet
Set ws = ThisWorkbook.Worksheets("sheet1")
Set tlength = ThisWorkbook.Worksheets("transect lengths") 'assigns the sheet to a variable for efficient coding
tb = 0 'counter to calculate the number of transects
j = 2 'counter for row
Lastrow = Lastrow + 1 'add a row to last row so that the last row includes a blank line.
'the following for loop is used to calculate the number of transects
For i = 2 To Lastrow
If tlength.Range("D" & i) = vbNullString Then
If tlength.Range("D" & i - 1) <> vbNullString Then
tb = tb + 1 'updates the counter
End If
End If
Next i
tbtotal = tb 'stores the total counter in variable tbtotal for later use
I think I may need to use a loop. But I am stuck trying to figure out how to manage the unknown number of comparisons in changing cell locations.
The desired result is in the below screenshot of the expected outcome, with results in column A. To begin with, I only need to get the number for the first row of each transect. Once I have that, I can copy using xldown.
Expected outcome: [image]
Another possible data set might be: [image]
with an expected outcome of: [image]
Worked for me using your second set of sample data:
Sub Tester()
Dim tlength As Worksheet, i As Long, tNum As Long, v, vPrev, arr
Dim col As New Collection, e, c As Range, rng As Range
Set tlength = ThisWorkbook.Worksheets("transect lengths")
'collect all the Section 1 Latitudes and row numbers
For i = 2 To tlength.Cells(Rows.Count, "B").End(xlUp).Row
If tlength.Cells(i, "B") = 1 Then
col.Add Array(i, tlength.Cells(i, "D").Value) 'store start row and first Latitude
End If
Next i
SortCollection col, 2 'sort collection by second element in each array
tNum = 0
'loop over the sorted collection and assign the order of the transects
For Each e In col
tNum = tNum + 1
Set c = tlength.Cells(e(0), "B")
'following assumes all transects have at least 2 sections...
tlength.Range(c, c.End(xlDown)).Offset(0, -1).Value = tNum
Next e
End Sub
'Sort a collection of 0-based arrays by the n'th element of each array
'(descending as written; see the comment on the comparison to switch to ascending)
'Adapted from https://stackoverflow.com/a/3588073/478884
Sub SortCollection(col As Collection, n As Long)
Dim i As Long, j As Long, vTemp As Variant
For i = 1 To col.Count - 1 'Two loops to bubble sort
For j = i + 1 To col.Count
If col(i)(n - 1) < col(j)(n - 1) Then 'change to > for ascending sort
vTemp = col(j) 'store the item that belongs earlier in the order
col.Remove j 'remove it from its current position
col.Add Item:=vTemp, before:=i 're-add it before item i
End If
Next j
Next i
End Sub

Loop through name list and if names exist in selection start after last name

I apologize, this is my first crack at Excel VBA so excuse my lack of knowledge!
So I have a list of (currently) 3 names to assign to the days in column A in a repeating order in Excel.
Currently my VBA code allows it to populate the selected cells with the names in a repeating pattern (this part is good), however there are two pieces I need help with.
1- With the current code, once it reaches the bottom of the names it checks for the blank box that would end that list and starts over at the top as directed, but it puts a blank cell first (see screenshot). How can I have it put the next name without adding a blank cell first?
2- I want to be able to (once this gets going) select the entire D column through whatever dates need to be filled and:
- check the lowest non-blank box
- match it to the list and set the counter to the name below that, so it continues the name order from the last person who was assigned
This is code I have now:
Sub EXAMPLE()
Dim count As Integer
count = 0
For Each c In Selection
c.Value = Range("X1").Offset(count, 0).Value
If c.Value = "" Then count = -1 And c.Value = Range("x1").Offset(count, 0).Value
count = count + 1
Next c
End Sub
Sorry I know that was long, I hope this makes sense.
I think it's worth reading about arrays, as this task is ideally suited to their use. Your best bet would be to read the names into an array and then build a recurring array whose dimension is equal to the number of rows in your dates column (or selection, or however you want to define the size of the output range).
Code would look a little like this:
Dim v As Variant
Dim people() As Variant, output() As Variant
Dim rowCount As Long, i As Long, j As Long
Dim endRange As Range
'Read the list of names into an array.
'This just takes all data in column "X" -> amend as desired
With Sheet1
Set endRange = .Cells(.Rows.Count, "X").End(xlUp)
v = .Range(.Cells(1, "X"), endRange).Value
End With
'Sense check on the names data.
If IsEmpty(v) Then
MsgBox "No names in Column ""X"""
Exit Sub
End If
If Not IsArray(v) Then
ReDim people(1 To 1, 1 To 1)
people(1, 1) = v
Else
people = v
End If
'Acquire the number of rows for repeating list of names.
'This just takes all data in column "A" -> amend as desired
With Sheet1
Set endRange = .Cells(.Rows.Count, "A").End(xlUp)
rowCount = .Range(.Cells(3, "A"), endRange).Rows.Count
End With
'Sense check date data.
If endRange.Row < 3 Then
MsgBox "No dates in Column ""A"""
Exit Sub
End If
'Make a recurring array.
ReDim output(1 To rowCount, 1 To 1)
i = 1
Do While i <= rowCount
For j = 1 To UBound(people, 1)
output(i, 1) = people(j, 1)
i = i + 1
If i > rowCount Then Exit Do
Next
Loop
'Write the output to column "D"
Sheet1.Range("D3").Resize(UBound(output, 1)).Value = output
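The code above always starts the cycle at the first name. To continue from the last person who was assigned, one option is to offset the index with Mod. The following is a minimal sketch: BuildRotation, lastName and result are hypothetical names, people is assumed to be the 1-based single-column array built above, and rowCount is assumed to be at least 1.

```vba
Option Explicit

' Hypothetical helper: returns rowCount names cycling through people(),
' starting with the entry after lastName (or with the first entry when
' lastName is not in the list).
Function BuildRotation(people As Variant, lastName As String, rowCount As Long) As Variant
    Dim n As Long, startIdx As Long, i As Long
    Dim result() As Variant

    n = UBound(people, 1)           ' number of names in the list
    startIdx = 0                    ' zero-based offset of the first name to write
    For i = 1 To n
        If StrComp(CStr(people(i, 1)), lastName, vbTextCompare) = 0 Then
            startIdx = i Mod n      ' next slot after the match, wrapping to 0 at the end
            Exit For
        End If
    Next i

    ReDim result(1 To rowCount, 1 To 1)
    For i = 1 To rowCount
        ' Mod keeps the index inside 0..n-1; add 1 for the 1-based array
        result(i, 1) = people(((startIdx + i - 1) Mod n) + 1, 1)
    Next i
    BuildRotation = result
End Function
```

With the names A, B, C and lastName "B", the result starts C, A, B, C and so on. The returned array can be written out in one step, for example Sheet1.Range("D3").Resize(rowCount).Value = BuildRotation(people, "B", rowCount).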

Excel VBA Array Column Sort

I'm still trying to figure out VBA and I have a question about sorting.
I have a function called MatrixSort(matrix) that takes in an n1 x n2 matrix.
Does VBA have any sorting function that lets me sort the matrix columns by count, as seen below?
Inputting the box into MatrixSort and getting the output below:
I would appreciate it if anyone could enlighten me on this.
Thank you very much!
Edit:
Thanks to pEH for your code and logic. I have come up with this function based on your idea!
The code might not be efficient, but I realized that there isn't an easy way to use CountA here, since the function substitutes 0 for empty cells. I therefore had to write "" manually and have the counter ignore it.
'Sorts the Matrix into Descending Count Order
'Key Idea: Calculate count in each column and saves into ArrCount
'Then use Max(ArrCount) to find the max row count
'Use Match to get the column number with the max row count, then input this to first column under MatrixOut
'Kill the count that was copied under ArrCount(iMax) = -1 so that the next maximum count can be found
'Thanks to pEH from Stackoverflow for helping out
Function MatrixSort(matrix)
Dim MatrixTemp As Variant
Dim max_row As Integer
Dim max_col As Integer
Dim p As Object
Dim i As Integer
Dim j As Integer
Dim counter As Double 'Counts the number of filled range in matrix
Dim iMax As Integer 'Stores the max count for sorting phase
MatrixTemp = matrix
'To preserve empty cells as empty instead of 0
max_row = UBound(MatrixTemp, 1)
max_col = UBound(MatrixTemp, 2)
ReDim MatrixIn(1 To max_row, 1 To max_col)
For i = 1 To UBound(MatrixTemp, 1)
For j = 1 To UBound(MatrixTemp, 2)
If MatrixTemp(i, j) = "" Then
MatrixIn(i, j) = ""
Else
MatrixIn(i, j) = MatrixTemp(i, j)
End If
Next j
Next i
Set p = Application.WorksheetFunction
'Counting of Each Columns
ReDim ArrCount(1 To max_col) 'Counts filled rows in each column
ReDim column_extract(1 To max_row) 'For CountA to work by counting each column individually
For j = 1 To max_col
For i = 1 To max_row
If MatrixIn(i, j) <> "" Then
counter = counter + 1
End If
Next i
ArrCount(j) = counter 'Stores the total count
counter = 0 'Resets the counter before another loop
Next j
'Creation of Final Output Matrix
ReDim MatrixOut(1 To max_row, 1 To max_col) 'For the Final Output
'Column Sort
For j = 1 To max_col
iMax = p.Match(p.Max(ArrCount), ArrCount, False)
For i = 1 To max_row
MatrixOut(i, j) = MatrixIn(i, iMax)
Next i
ArrCount(iMax) = -1
Next j
MatrixSort = MatrixOut
End Function
Imagine the following data:
To sort it by the count of filled rows in each column you just need to calculate that count .CountA(RngIn.Columns(iCol)) for each column and save the results into an array ArrCount.
Then you can use .Max(ArrCount) to find the maximum row count and .Match to get the column number which is maximum. This is your first column so write it to the destination RngOut. Now we just need to kill the count that was already copied ArrCount(iMax) = -1 so the next maximum can be found and copied to the next destination column … and so on …
Option Explicit
Public Sub MatrixSortColumnsByRowCount()
'input range
Dim RngIn As Range
Set RngIn = Worksheets("Sheet1").Range("B2:F8")
'output range
Dim RngOut As Range
Set RngOut = Worksheets("Sheet1").Range("B12:F18")
'count filled rows in each column
ReDim ArrCount(1 To RngIn.Columns.Count) As Long
Dim iCol As Long
For iCol = 1 To RngIn.Columns.Count
ArrCount(iCol) = Application.WorksheetFunction.CountA(RngIn.Columns(iCol))
Next iCol
'sort columns
Dim iMax As Long
For iCol = 1 To RngIn.Columns.Count
iMax = Application.WorksheetFunction.Match(Application.WorksheetFunction.Max(ArrCount), ArrCount, False)
RngOut.Columns(iCol).Value = RngIn.Columns(iMax).Value
ArrCount(iMax) = -1
Next iCol
End Sub
The output then will be …

Comparing Data from Column A and Column B Once

I am currently in the process of analysing data from Excel, and would like to make comparisons between data in Column A and Column B, identifying duplicate data. I am using the following code:
Sub Find_Matches()
Dim CompareRange As Variant, x As Variant, y As Variant
Set CompareRange = Range("c2", Range("c2").End(xlUp))
For Each x In Selection
For Each y In CompareRange
If x = y Then x.Offset(0, 1) = x
Next y
Next x
End Sub
This code has been taken from MSDN, so if it finds a match in Column C against Column A, it will display the matched number in Column B. For the most part it does what I need. However I am looking to modify this code so it only matches a number in the list once.
Example of what the code currently does:
A2 B2 C2
1 1 1
1 1 2
1 1 3
So essentially, because the number 1 appears once in Column C, Column A keeps finding a match.
What I would like it to do is:
A2   B2   C2
1    1    1
1         2
1         3
So because the number 1 only appears in Column C once, it should only be matched once against the numbers in Column A.
I'm assuming this is probably something simple, but I can't seem to determine the logic. Could someone point me in the right direction please?
Testing for duplicates can be simple or complicated depending on how fast you want your procedure to be and how large the data sets are.
I personally favour the Collection object because it has a unique key and testing for the existence of that key is very fast, especially if the dataset is large. The unique test is done by seeing if the code throws an error when you interrogate the Collection for a particular key. Some are philosophically opposed to testing for errors - I have to say that I'm one, so I actually prefer the Dictionary object but for a task this mundane, I won't go through the steps to reference that.
You'll also see that the code below works with arrays rather than cells on the worksheet itself - again, just a matter of personal taste because it's quicker.
Const SOURCE_COL As String = "A"
Const SOURCE_START_ROW As Long = 2
Const COMPARE_COL As String = "C"
Const COMPARE_START_ROW As Long = 2
Const OUTPUT_COL As String = "B"
Dim ws As Worksheet
Dim sourceValues As Variant
Dim compareValues As Variant
Dim outputValues() As Variant
Dim sourceIndex As Long
Dim compareIndex As Long
Dim uniques As Collection
Dim val As Variant
Dim key As String
Dim exists As Variant
Set ws = ThisWorkbook.Worksheets("Sheet1")
sourceValues = ws.Range(ws.Cells(SOURCE_START_ROW, SOURCE_COL), _
ws.Cells(Rows.Count, SOURCE_COL).End(xlUp)).Value2
compareValues = ws.Range(ws.Cells(COMPARE_START_ROW, COMPARE_COL), _
ws.Cells(Rows.Count, COMPARE_COL).End(xlUp)).Value2
Set uniques = New Collection
ReDim outputValues(1 To UBound(sourceValues, 1), 1 To 1)
For sourceIndex = 1 To UBound(sourceValues, 1)
val = sourceValues(sourceIndex, 1)
key = CStr(val)
exists = Empty
On Error Resume Next
exists = uniques(key)
On Error GoTo 0
If IsEmpty(exists) Then
For compareIndex = 1 To UBound(compareValues, 1)
If val = compareValues(compareIndex, 1) Then
outputValues(sourceIndex, 1) = val
uniques.Add val, key
Exit For
End If
Next
End If
Next
ws.Cells(SOURCE_START_ROW, OUTPUT_COL).Resize(UBound(outputValues, 1)).Value = outputValues
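For comparison, here is a minimal sketch of the same "match each value only once" logic using the Dictionary object mentioned above. It is late bound through CreateObject("Scripting.Dictionary"), which is only available on Windows. The function name MatchOnce is hypothetical, and both arguments are assumed to be 1-based single-column arrays such as sourceValues and compareValues. One difference to be aware of: values are compared by their string form, so the number 1 and the text "1" count as equal here.

```vba
Option Explicit

' Hypothetical helper: returns a column the same height as sourceValues in
' which each source value found in compareValues appears only at its first
' occurrence; every other row is left empty.
Function MatchOnce(sourceValues As Variant, compareValues As Variant) As Variant
    Dim compareSet As Object, used As Object
    Dim result() As Variant
    Dim i As Long, key As String

    Set compareSet = CreateObject("Scripting.Dictionary")
    Set used = CreateObject("Scripting.Dictionary")

    ' Index the comparison column once so each lookup avoids an inner loop
    For i = 1 To UBound(compareValues, 1)
        compareSet(CStr(compareValues(i, 1))) = True
    Next i

    ReDim result(1 To UBound(sourceValues, 1), 1 To 1)
    For i = 1 To UBound(sourceValues, 1)
        key = CStr(sourceValues(i, 1))
        If compareSet.Exists(key) And Not used.Exists(key) Then
            result(i, 1) = sourceValues(i, 1)
            used(key) = True        ' later repeats of this value stay blank
        End If
    Next i
    MatchOnce = result
End Function
```

The result can be written to the output column in the same way as outputValues above.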
