Finding duplicates and checking for overlapping periods - excel

I have a sheet with three columns as illustrated below
Id
StartDate
EndDate
1
20201101
20210131
2
20200801
20201031
2
20201101
99991231
3
20200901
99991231
3
20210301
99991231
4
20200301
20200930
4
20201001
20210430
4
20210315
99991231
I want to test if there are overlapping time ranges within each group of duplicate ids.
For example, the two rows with id = 3 have overlapping time ranges. The same goes for the last two rows of id = 4.
So I imagine that whenever the loop(?) finds an overlap, it throws me a msgbox with the corresponding id.
How can I achieve this?
I'm thinking that it might require a nested loop? And to test for overlap I would use the following logic:
If StartDate1 <= EndDate2 And StartDate2 <= EndDate1 Then ...

You are rigth, a nested loop should do the job, and inside the loop you can test if the cells have the same ID
I think this code will solve your problem:
Sub overlap()
Dim wb As Workbook
Dim ws As Worksheet
Set wb = ThisWorkbook
Set ws = wb.Worksheets("Sheet1")
last_line = ws.Range("a100000").End(xlUp).Row
For i = 2 To last_line
StartDate1 = ws.Cells(i, 2)
EndDate1 = ws.Cells(i, 3)
ID1 = ws.Cells(i, 1)
For j = i + 1 To last_line
ID2 = ws.Cells(j, 1)
If ID1 = ID2 Then
StartDate2 = ws.Cells(j, 2)
EndDate2 = ws.Cells(j, 3)
If StartDate1 <= EndDate2 And StartDate2 <= EndDate1 Then
MsgBox "overlap at row: " & i & " (ID: " & ID1 & ")"
End If
End If
Next j
Next i
End Sub

Related

Excel VBA Sort not 100% correct

New to the forum and hope someone can help.
Create a function to loop through a set of data row by row for sorting.
Firstly need to check if column 1 is not equal to 9999. If not insert into the appropriate row using column 1 as the sort criteria. If it equals 9999 then insert into the appropriate spot using column 3 and column 2.
The problem I'm encountering is that some row and not sort. I think its because as I'm cutting and pasting the row is missed. Below is my code and the sample data
Sub insertionTableSort()
'PURPOSE: loop through all employee and apply the sort as follows:
'1) Seniority <> 9999 Seniority number
'2) Seniority = 9999 sort start date then employee number
Dim ws As Worksheet
Dim tbl As ListObject
Set ws = ThisWorkbook.Worksheets("Roster Applications")
Set tbl = ws.ListObjects("RosterRequest") '## modify to your table name.
Set tblRow = tbl.ListRows
'Loop Through Every Row in Table
For x = 2 To tbl.Range.Rows.Count - 1
'Debug.Print x & ", " & tbl.DataBodyRange(x, 1).Value, tbl.DataBodyRange(x, 2).Value, tbl.DataBodyRange(x, 4).Value
For y = 2 To tbl.Range.Rows.Count
'seniroity = 9999
If tbl.DataBodyRange(x, 1) = 9999 Then
'sort by start date then Staff Num
If tbl.DataBodyRange(x, 4) < tbl.DataBodyRange(y - 1, 4) And tbl.DataBodyRange(x, 2) < tbl.DataBodyRange(y - 1, 2) Then
tbl.ListRows(x).Range.Cut
tbl.ListRows(y - 1).Range.Insert
Exit For
End If
Else
'seniroity <> 9999
'sort by seniority
If tbl.DataBodyRange(x, 1) < tbl.DataBodyRange(y - 1, 1) Then
tbl.ListRows(x).Range.Cut
tbl.ListRows(y - 1).Range.Insert
Exit For
End If
End If
Next y
Next x
End Sub
sample data after running the above
sorts well until this point where row 2 should be before row 1 and there are other examples
Data initially sort by Register No.
Errors highlighted in yellow
Finished Sort

How to increment column in Excel VBA?

Hi I am trying to compare two sets of data by having indicators if they increased, decreased, or stayed the same. I was able to get it working on one column. My problem is I can't loop it on multiple columns.
Basically:
If A1 = C1 then D1.Value = 0
If A1 > C1 then D1.Value = 1
If A1 < C1 then D1.Value = 2
I've tried to do the "do while" to add increments on the columns but it did not work.
Sub ChangeIndicator2()
Dim i As Long
Dim a As Long
Dim b As Long
Dim x As Long
Dim y As Long
Dim ProgramCount As Long
i = 2
a = 8
b = 2
x = 0
y = 8
ProgramCount = 12
Do While y <= ProgramCount
For Each c In Worksheets("Sheet1").Range("A2:A20").Offset(x, y)
If Worksheets("Sheet1").Cells(i, a).Value = Worksheets("Sheet1").Cells(i, b).Value Then
c.Value = 0
ElseIf Worksheets("Sheet1").Cells(i, a).Value < Worksheets("Sheet1").Cells(i, b).Value Then
c.Value = 1
ElseIf Worksheets("Sheet1").Cells(i, a).Value > Worksheets("Sheet1").Cells(i, b).Value Then
c.Value = 2
End If
i = i + 1
Next c
a = a + 2
b = b + 2
y = y + 2
Loop
End Sub
Only the first column works, the second column only shows 0 values.
So basically, what you want to do is compare 2 columns which are 2 columns apart and repeat that on another pair of columns which is 8 columns from the first column. If my assumption is correct then have a go at this:
For i = 0 To (ProgramCount * 8) Step 8
With Worksheets("Sheet1").Range("A2:A20").Offset(, i + 3)
.FormulaR1C1 = "=IF(RC[-3]=RC[-1],0,IF(RC[-3]>RC[-1],1,2))"
.Value2 = .Value2
End With
Next
Adjust the offset to suit your needs (I may have misunderstood the actual columns you target to update). Hope this helps.

How to add values from different sheets, and retain the formula pointing to the sheets+cells?

I am trying to add values from different sheets (Sheet 2 to 5) into my main sheet (Sheet 1). In Sheet 1 I want the cells to contain the right formula pointing to the different sheets (if possible).
Typically like this:
='Sheet2'!D5+'Sheet3'!D165
All my sheets have different products, but some sheets contain same products. So I want to search through them all and ADD them in my Main Sheet (Sheet 1).
Sub UpdateMainSheet()
' Kode for å Oppdatere Plukkeliste Alle Artikler Summert
Dim AktivtArk As String
Dim AktivtArkNavn As String
Dim K As Integer
Dim Count As Integer
'Line of code to delete old data in Main Sheet:
Worksheets("Sheet1").Range("A2:H10000").Clear
AktivtArkOverskrift = "List of Articles from Sheet 2 to 5"
'Creates Headline in Main Sheet:
eRow = Sheet1.Cells(Rows.Count, 1).End(xlUp).Offset(1, 0).Row
Worksheets("Sheet1").Cells(eRow, 1) = AktivtArkOverskrift
Worksheets("Sheet1").Cells(eRow, 1).Font.Bold = True
'Script to check and gather data from the other sheets (Sheet 2, 3, 4 and 5):
For K = 2 To 5
'For loop to check each line in sheet "K"
For I = 2 To 1000
'If function to check if the cell I in column F is empty, if so it_jumps to next row and do the same check:
If Worksheets(K).Cells(I, 6) > 0 Then
Count = 0
'For loop to check if I already have a row in the Main Sheet with the article I'm checking:
For L = 2 To 1000
'If function to check if the articles have the same article number:
If Worksheets(K).Cells(I, 1) = Worksheets("Sheet1").Cells(L, 1) Then
'Line of code that are supposed to ADD the values that is currently in the Main Sheet, togheter with the value in Sheet K:
Worksheets("Sheet1").Cells(I, 4).Formula = Worksheets("Sheet1").Cells(I, 4) + Worksheets(K).Cells(L, 4)
End If
Next L
End If
Next I
Next K
End Sub
So what I need to fix in my code is this part (located furthest inside the For Loop):
Worksheets("Sheet1").Cells(I, 4).Formula = Worksheets("Sheet1").Cells(I, 4) + Worksheets(K).Cells(L, 4)
And make it create a formula in the wanted cell, that looks something like this:
='Sheet2'!D5+'Sheet3'!D165
It must be able to add another cell as well, since the Loop are running through several Sheets (Sheet 2 to 5) that may contain the same products.
I.e. I only want one line in my Main Sheet for each product.
I managed to find the solution in the end.
It seemed I had switched the L and I in som of the looping, which resulted in the values not to be added togheter.
The following code (I did not translate to English, but can do this if someone wants/need it) solved my issue, and gave me the values from Sheet 2 to 5 sorted by product in Sheet 1:
Sub OppdaterePlukkelisteSummert()
'Kode for å Oppdatere Plukkeliste Alle Artikler Summert
Dim AktivtArk As String
Dim AktivtArkNavn As String
Dim K As Integer
Dim Teller As Integer
Dim value1 As Integer
Dim value2 As Integer
'Sletter Plukklisten for å oppdatere og sortere på nytt:
Worksheets(1).Range("A2:H10000").Clear
'HENTING AV DATA FRA ARKET "K":
AktivtArk = "Artikler Summert fra Alle Ark"
AktivtArkOverskrift = "Artikler Summert fra Alle Ark"
'Setter inn Overskrift som Forteller kva ark utstyret kommer fra:
eRow = Sheets(1).Cells(Rows.Count, 1).End(xlUp).Offset(1, 0).Row
Worksheets(1).Cells(eRow, 1) = AktivtArkOverskrift
Worksheets(1).Cells(eRow, 1).Font.Bold = True
'Sjekker hvilke/hvilket rader/utstyr som skal plukkes, og legger det inn i "Ark1":
For K = 2 To 5
For I = 2 To 1000
If Worksheets(K).Cells(I, 6) > 0 Then
Teller = 0
For L = 2 To 1000
If Worksheets(K).Cells(I, 1) = Worksheets(1).Cells(L, 1) Then
value1 = Worksheets(1).Cells(L, 4)
value2 = Worksheets(K).Cells(I, 4)
Worksheets(1).Cells(L, 4) = value1 + value2
Worksheets(1).Cells(L, 6) = value1 + value2
Else
Teller = Teller + 1
End If
Next L
If Teller > 998 Then
eRow = Sheets(1).Cells(Rows.Count, 1).End(xlUp).Offset(1, 0).Row
For J = 1 To 11
Worksheets(1).Cells(eRow, J) = Worksheets(K).Cells(I, J)
Next J
Worksheets(1).Cells(eRow, 6).Font.Color = RGB(0, 150, 0)
Worksheets(1).Cells(eRow, 7).Font.Color = RGB(0, 150, 0)
End If
End If
Next I
Next K
Worksheets(1).Range("A2").Select
End Sub
I hope this can be useful for someone else :-)
All help and suggestion in the comments are appreciated!
I was going to illustrate with this simple example:
I = 1 'for example
For K = 2 To 5
Worksheets("Sheet1").Cells(I, 4).Value = Worksheets("Sheet1").Cells(I, 4).Value + _
WorksheetFunction.SumIf(Worksheets(K).Range("A:A"), "Bananas", Worksheets(K).Range("D:D"))
Next K

VBA-Excel and large data sets causes program to crash

First time poster and new to programming in general. I have a project in which i have to build a financial model to mine for data in excel. I have succeeded in building said model on VBA. I have ran tests on 3,000 line dataset and it was successful. I will briefly explain what it does.
I track a given stock on a given day on multiple exchanges. I download the data (roughly 935,000 lines) The first step is copy all the data for a given exchange (roughly 290,000) onto a new sheet (this takes roughly 8min), then I create a new column to log the bid ask spread (12secs), the next step is what Im having trouble with, I basically rank each line of data twice, one column for Bid size and one column for Ask size. I created a function which uses excel Percentile function and ranks based on where the given bid and ask size lands. As of right now, I have been running the Macro for the last 35min and has yet to execute. I cant attempt the other macros since each macro depends on the previous one.
So my basic issue is that since my data set is large, my model keeps crashing.The code seems to be fine when working with the test data, and it doesn't throw any errors when I run the program, but with the larger data set it just crashes. Does anyone have any suggestions? Is this normal with such large amounts of data?
Thanks in advance.
Sham
Here is the sub and function thats giving me the trouble, the sub takes in the required inputs to run the function and then pops into the assigned cell. The code is suppose to repeat the process for three separate sheets. For now, Id like it to work on one sheet, hence used the comments to not include the loop
Sub Bucketting()
Dim firstRow As Long
Dim lastRow As Long
Dim counter As Long
Dim bidRange As Range
Dim offerRange As Range
Dim bidScroll As Range
Dim offerScroll As Range
Dim Ex As String
Dim i As Integer
'For i = 1 To 1 Step 1 'Sheet Selection Process
' If i = 1 Then
' Ex = "Z"
' ElseIf i = 2 Then
' Ex = "P"
' Else
' Ex = "T"
' End If
Sheets("Z").Select 'Sheet selected
With ActiveSheet
firstRow = .UsedRange.Cells(1).Row + 1
lastRow = .UsedRange.Rows.Count
Set bidRange = .Range("F2:F" & lastRow)
Set offerRange = .Range("G2:G" & lastRow)
For counter = lastRow To firstRow Step -1
Set bidScroll = .Range("F" & counter)
Set offerScroll = .Range("G" & counter)
With .Cells(counter, "J")
.Value = DECILE_RANK(bidRange, bidScroll)
End With
With .Cells(counter, "K")
.Value = DECILE_RANK(offerRange, offerScroll)
End With
Next counter
End With
Range("J1").Select
ActiveCell = "Bid Rank"
ActiveCell.Offset(0, 1) = "Offer Rank"
'Next i
End Sub
Function DECILE_RANK(DataRange, RefCell)
'Credit: BJRaid
'DECILE_RANK(The Range of data)
'Declares the function that can be called in the spreadsheet cell - enter '=DECILE_RANK(A5:A50,A5)
'Using the percentile worksheet function calculate where the 10th, 20th etc percentile of the reference range are
DEC1 = Application.WorksheetFunction.Percentile(DataRange, 0.1)
DEC2 = Application.WorksheetFunction.Percentile(DataRange, 0.2)
DEC3 = Application.WorksheetFunction.Percentile(DataRange, 0.3)
DEC4 = Application.WorksheetFunction.Percentile(DataRange, 0.4)
DEC5 = Application.WorksheetFunction.Percentile(DataRange, 0.5)
DEC6 = Application.WorksheetFunction.Percentile(DataRange, 0.6)
DEC7 = Application.WorksheetFunction.Percentile(DataRange, 0.7)
DEC8 = Application.WorksheetFunction.Percentile(DataRange, 0.8)
DEC9 = Application.WorksheetFunction.Percentile(DataRange, 0.9)
' Calculate the Decile rank that the reference cell value sits within
If (RefCell <= DEC1) Then DECILE_RANK = 1
If (RefCell > DEC1) And (RefCell <= DEC2) Then DECILE_RANK = 2
If (RefCell > DEC2) And (RefCell <= DEC3) Then DECILE_RANK = 3
If (RefCell > DEC3) And (RefCell <= DEC4) Then DECILE_RANK = 4
If (RefCell > DEC4) And (RefCell <= DEC5) Then DECILE_RANK = 5
If (RefCell > DEC5) And (RefCell <= DEC6) Then DECILE_RANK = 6
If (RefCell > DEC6) And (RefCell <= DEC7) Then DECILE_RANK = 7
If (RefCell > DEC7) And (RefCell <= DEC8) Then DECILE_RANK = 8
If (RefCell > DEC8) And (RefCell <= DEC9) Then DECILE_RANK = 9
If (RefCell > DEC9) Then DECILE_RANK = 10
End Function
935,000 lines is a lot for excel. Like, really a lot. Barring saying using a real database, If your application is literally putting a =Percentile(...) in each cell, I would recommend Trying to use another tool for that. Perhaps something within VBA itself. More generally, use something outside of a cell - then store the result value in the cell. There is a lot of overhead in maintaining those formulas that are interdependent on 935k rows of data.
The problem is that your looping through each row individually, the Excel way is to try and work with whole ranges at once whenever possible. I would load the ranges into arrays, then modify your DECILE_RANK code to work with the items in the array.
Note that variant arrays that read ranges in are 2-D.
Here is the fully functioning code including my custom VBA array slicer. Note that it was only tested on a small dataset:
Sub Bucketting()
Dim lastRow As Long
Dim bidArray As Variant
Dim offerArray As Variant
Sheets("Sheet1").Select 'Sheet selected
With ActiveSheet
lastRow = .UsedRange.Rows.Count + 1
bidArray = .Range("F2:F" & lastRow)
offerArray = .Range("G2:G" & lastRow)
Range("J2:J" & lastRow).Value = GetArraySlice2D(DECILE_RANK(bidArray), "column", 1, 1, 0)
Range("K2:K" & lastRow).Value = GetArraySlice2D(DECILE_RANK(offerArray), "column", 1, 1, 0)
End With
Range("J1").Select
ActiveCell = "Bid Rank"
ActiveCell.Offset(0, 1) = "Offer Rank"
End Sub
Function DECILE_RANK(DataRange As Variant) As Variant
' Credit: BJRaid
' DECILE_RANK(The Range of data)
' Declares the function that can be called in the spreadsheet cell - enter '=DECILE_RANK(A5:A50,A5)
Dim DEC(0 To 10) As Variant
Dim i As Integer, j As Integer
'Using the percentile worksheet function calculate where the 10th, 20th etc percentile of the reference range are
DEC(0) = 0
For i = 1 To 9
DEC(i) = Application.WorksheetFunction.Percentile(DataRange, 0.1 * i)
Next i
DEC(10) = Application.WorksheetFunction.Max(DataRange)
' Calculate the Decile rank that the reference cell value sits within
For i = 1 To UBound(DataRange, 1)
For j = 1 To 10
If ((DataRange(i, 1) > DEC(j - 1)) And (DataRange(i, 1) <= DEC(j))) Then
DataRange(i, 1) = j
Exit For
End If
Next j
Next i
DECILE_RANK = DataRange
End Function
Public Function GetArraySlice2D(Sarray As Variant, Stype As String, Sindex As Integer, Sstart As Integer, Sfinish As Integer) As Variant
' this function returns a slice of an array, Stype is either row or column
' Sstart is beginning of slice, Sfinish is end of slice (Sfinish = 0 means entire
' row or column is taken), Sindex is the row or column to be sliced (NOTE:
' 1 is always the first row or first column)
' an Sindex value of 0 means that the array is one dimensional 3/20/09 Lance Roberts
Dim vtemp() As Variant
Dim i As Integer
On Err GoTo ErrHandler
Select Case Sindex
Case 0
If Sfinish - Sstart = UBound(Sarray) - LBound(Sarray) Then
vtemp = Sarray
Else
ReDim vtemp(1 To Sfinish - Sstart + 1)
For i = 1 To Sfinish - Sstart + 1
vtemp(i) = Sarray(i + Sstart - 1)
Next i
End If
Case Else
Select Case Stype
Case "row"
If Sfinish = 0 Or (Sstart = LBound(Sarray, 2) And Sfinish = UBound(Sarray, 2)) Then
vtemp = Application.WorksheetFunction.Index(Sarray, Sindex, 0)
Else
ReDim vtemp(1 To Sfinish - Sstart + 1)
For i = 1 To Sfinish - Sstart + 1
vtemp(i) = Sarray(Sindex, i + Sstart - 1)
Next i
End If
Case "column"
If Sfinish = 0 Or (Sstart = LBound(Sarray, 1) And Sfinish = UBound(Sarray, 1)) Then
vtemp = Application.WorksheetFunction.Index(Sarray, 0, Sindex)
Else
ReDim vtemp(1 To Sfinish - Sstart + 1)
For i = 1 To Sfinish - Sstart + 1
vtemp(i) = Sarray(i + Sstart - 1, Sindex)
Next i
End If
End Select
End Select
GetArraySlice2D = vtemp
Exit Function
ErrHandler:
Dim M As Integer
M = MsgBox("Bad Array Input", vbOKOnly, "GetArraySlice2D")
End Function
I'm not sure if this will directly address your problem, but have you considered using Application.ScreenUpdating = False? Don't forget to set it back to true once your data has processed.

Excel formula help

I have a excel sheet with 2 columns
Contact Status Max Probability
80-Opp Closed Won
0-NC Closed Won
40-Pending 30-Connect
10-Working 20- Engagement
80-Opportunity 30-Connect
40-Pending 10- Engagement
I need to check if the no in contact status ie the numeric value before '-' is less than numeric value in max probabilty before '-', then i need the whole value of max probability update the contact status field
For eg.
10-Working 20- Engagement
10-Working should be replaced by 20- Engagement
OR
We can have 20- Engagement in a new column if the above mentioned conditioned is satisfied, if replacing the contact Status column is difficult
How can achieve this?
This little macro will do the trick. Just change Row to the starting row and ColA/B to the two columns you want to use.
Note that GetNum returns -1 if there is no number at the start which, in this implementation, means no copying. It was unclear what you wanted to do tith Closed Won so I chose the safest option. If you do want it copied, just return a huge number instead of -1.
Option Explicit
Function GetNum(s As String) As Integer
If Mid(s, 1, 1) < "0" Or Mid(s, 1, 1) > "9" Then
GetNum = -1
Else
GetNum = Val(s)
End If
End Function
Sub Macro1()
Dim Row As String
Dim ColA As String
Dim ColB As String
ColA = "A"
ColB = "B"
Row = "2"
While Range(ColA & Row).Value <> ""
If GetNum(Range(ColA & Row).Value) < GetNum(Range(ColB & Row).Value) Then
Range(ColA & Row).Value = Range(ColB & Row).Value
End If
Row = CStr(Val(Row) + 1)
Wend
End Sub
This was tested on:
to generate:
This macro will do:
Sub a()
ColumnContSt=1
ColumnMaxProb=2
i = 1
While (Cells(i, 2) <> "")
colB = Cells(i, ColumnMaxProb)
colA = Cells(i, ColumnContSt)
posB = InStr(1, colB, "-")
posA = InStr(1, colA, "-")
If (posB <> 0 And posA <> 0) Then
intColB = CInt(Mid(colB, 1, posB - 1))
intColA = CInt(Mid(colA, 1, posA - 1))
If (intColA < intColB) Then
Cells(i, 1) = Cells(i, 2)
End If
End If
i = i + 1
Wend
End Sub
Just change ColumnContSt = 1 and ColumnMaxProb = 2 for your actual column NUMBERS ie. A=1, B=2, etc.
HTH!
If you are flexible about the actual data, the easiest way to do this is to make two columns (can be hidden in some far away sheet):
0 NC
10 Engagement
30 Connect
...
And then use VLOOKUP(CELL_WITH_PROBABILITY, REGION_WITH_TABLE, 2, true). So if you put my small 3 element table in the upper corner of a sheet and probability in C1, it'd look like VLOOKUP(A1:B3, C1, 2, true)

Resources