Delete rows based on semi-unique ID string - excel

I have a series of semi-unique IDs, similar to the below:
XX123456/01
XX123456/02
XX123456/03
XX122222/01
XX122222/02
XX122222/03
XX155555/01
XX155555/05
The lengths are always the same, and always split by a delimiter with a secondary unique ID following.
I want to run down a column and delete rows, leaving only the row with the largest secondary unique ID, e.g.:
XX123456/03
XX122222/03
XX155555/05
What sort of logic do I need to apply for this?
I know how to deal with standard duplicates by looping down the column and adding each ID into an array, and deleting a row if the ID is already in the array, but I am not sure how to deal with the secondary unique ID.

If this is a one-time job it would be safer and easier to do it without VBA. If you must do it using VBA, you could do the below:
Sub RowDelete()
    Dim Rng As Range
    Dim i As Long
    Set Rng = Range("C2") 'Change this to the column and cell where your data begins (below the header).
    Rng.Sort Key1:=Rng, Order1:=xlAscending, Header:=xlYes 'Sorts your data in ascending order and leaves the header row in place.
    'The For loop below is a little ugly, but it tests the ID to the left of the delimiter and finds
    'the largest value to the right of the delimiter. It fills a temp column 50 columns over with the value "DELETE".
    For i = 1 To Range(Rng, Rng.End(xlDown)).Rows.Count
        If Left(Rng.Offset(i, 0).Value, 8) = Left(Rng.Offset(i - 1, 0).Value, 8) _
            And Right(Rng.Offset(i, 0).Value, 2) > Right(Rng.Offset(i - 1, 0).Value, 2) Then Rng.Offset(i - 1, 50).Value = "DELETE"
    Next i
    'Loops through the temp column from the bottom up and deletes the row if its value is "DELETE"
    For i = Range(Rng, Rng.End(xlDown)).Rows.Count To 1 Step -1
        If Rng.Offset(i - 1, 50).Value = "DELETE" Then Rng.Offset(i - 1, 50).EntireRow.Delete
    Next i
End Sub
Please keep in mind that I don't know the size or importance of your project. I quickly threw this together in hopes that it might give you insight on ways to solve this problem.
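If sorting the sheet is not acceptable, the same logic can be done in two passes without a temp column. The following is only a rough sketch, not a tested solution: MaxSuffixByPrefix and KeepLargestSecondaryId are made-up names, the data is assumed to sit in column C from row 2 down, the part after "/" is assumed to be numeric, and Scripting.Dictionary is assumed to be available (Windows).

```vba
Option Explicit

' Hypothetical helper: maps each primary ID (text before "/") to the
' largest secondary ID (number after "/") found in ids.
Function MaxSuffixByPrefix(ids As Variant) As Object
    Dim best As Object
    Dim item As Variant
    Dim parts() As String

    Set best = CreateObject("Scripting.Dictionary")
    For Each item In ids
        If InStr(item, "/") > 0 Then
            parts = Split(CStr(item), "/")
            If Not best.Exists(parts(0)) Then
                best.Add parts(0), CLng(parts(1))
            ElseIf CLng(parts(1)) > best(parts(0)) Then
                best(parts(0)) = CLng(parts(1))
            End If
        End If
    Next item
    Set MaxSuffixByPrefix = best
End Function

' Assumes the IDs are in column C of the active sheet, starting at C2.
Sub KeepLargestSecondaryId()
    Dim ws As Worksheet
    Dim lastRow As Long, r As Long
    Dim best As Object
    Dim parts() As String

    Set ws = ActiveSheet
    lastRow = ws.Cells(ws.Rows.Count, "C").End(xlUp).Row
    If lastRow < 3 Then Exit Sub            ' fewer than two IDs: nothing to compare
    Set best = MaxSuffixByPrefix(ws.Range("C2:C" & lastRow).Value)

    ' Delete from the bottom up so row numbers stay valid.
    For r = lastRow To 2 Step -1
        If InStr(ws.Cells(r, "C").Value, "/") > 0 Then
            parts = Split(ws.Cells(r, "C").Value, "/")
            If CLng(parts(1)) < best(parts(0)) Then ws.Rows(r).Delete
        End If
    Next r
End Sub
```

Rows that share both the primary ID and the largest secondary ID are all kept, so exact duplicates would still need the usual duplicate handling.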

Related

EXCEL VBA: For Loop involving checking Duplicates and continuing serial

I am new at using VBA and I am trying to do something that seems "simple." I have my VBA code generate a string (CP20210100001) and I want my for loop to check if that string has already been used in that column. If already used, generate the next in the serial until the next unique value in the serial has been generated.
My boss wants to paste a different ID occasionally in the column and this disturbs the code. My code looks at the last row and adds one to the String + serial. This will result in duplicates.
I figured out through much googling to get the code to check the current value for duplicates but I can't figure out how to get it to check for future IDs in the series until it comes across a unique value.
Below you can see my column. I had 10 successful submissions and then my boss pasted 3 rows. With my VBA the next generated ID would be CP20210200004, but the last part of the code found it to be a duplicate, so it added 1 and entered CP20210200005. Ideally the VBA should loop until the next unused value in the serial shows up, in this case CP20210200011. This way, no matter how many times my boss disrupts my table, my ID sequence stays intact.
**Reference ID**
CP20210100000
CP20210200001
CP20210200002
CP20210200003
CP20210200004
CP20210200005
CP20210200006
CP20210200007
CP20210200008
CP20210200009
CP20210200010
JS20210200001
JS20210200002
JS20210200003
CP20210200005
Below is the VBA:
'Timestamp is part of the String + Serial Combo
Timestamp = Format(Year(Date)) + Format(Month(Date), "00")
'I found this online. Essentially if A2 is blank then input CP + Timestamp + 00001 (CP20210100001)
'It looks at the last row to find the old value (OVAL) and generate the new value (NVAL)
If Sheets(ws_output).Range("A2") = "" Then
    Sheets(ws_output).Range("A2").Value = "CP" & Timestamp + 1
Else
    lstrow = Sheets(ws_output).Cells(Rows.Count, "A").End(xlUp).Row
    Oval = Sheets(ws_output).Range("A" & lstrow)
    NVAL = "CP" & Timestamp & Format(Right(Oval, 4) + 1, "00000")
    'Here I am trying to see if NVAL is a duplicate value. If so add one to the serial.
    Count = Application.WorksheetFunction.Countif(Sheets(ws_output).Range("A2:A100000"), NVAL)
    Dim Cell As Range
    For Each Cell In Sheets(ws_output).Range("A2:A100000")
        If Count > 1 Then
            NXVAL = NVAL
        Else
            NXVAL = "CP" & Timestamp & Format(Right(NVAL, 4) + 1, "00000")
        End If
    Next
End If
Please please please help.
EDIT
I should clarify that all of this is triggered from a form. The module is connected to a submit button. Once the button is pressed, all the values in the form are written to a separate sheet. Reference ID is the only part that isn't on the form. Essentially, once the button is pressed, it triggers the query to write the next available reference ID. The next line in the query is
Sheets("Sheet2").Cells(next_row, 1).Value = NXVAL
I need the new Reference ID to equal a variable.
Your code seems to give you much grief and little comfort. The reason is that you didn't take a strictly logical approach. The tasks are:
1. Find the last used number. I suggest using VBA's own Find function.
2. Insert the next number. It consists of prefix, date and serial number.
So, you arrive at code like this:
Sub STO_66112119()
    ' 168
    Const NumClm As Long = 1                ' 1 = column A
    Dim Prefix As String
    Dim LastNumber As Long
    Dim Fnd As Range                        ' search result
    Prefix = "JS"                           ' you could get this from an InputBox to
                                            ' enable numbering for other prefixes
    With Columns(NumClm)
        On Error Resume Next                ' if column A is blank
        Set Fnd = .Find(What:=Prefix, _
                        After:=.Cells(1, 1), _
                        LookIn:=xlValues, _
                        Lookat:=xlPart, _
                        SearchOrder:=xlByRows, _
                        SearchDirection:=xlPrevious, _
                        MatchCase:=False)
    End With
    LastNumber = Val(Right(Fnd.Value, 5))
    On Error GoTo 0
    Cells(Rows.Count, NumClm).End(xlUp).Offset(1).Value = Prefix & Format(Date, "yyyymm") _
                                                          & Format(LastNumber + 1, "00000")
End Sub
You need to spend a moment on preparation, however.
Define the column to work in. I put this in the Const NumClm. It's at the top of the code so as to make maintenance easier (won't need to dig in the code to make a change).
My code shows Prefix = "JS". You want to change this to "CP". I inserted "JS" to show that you could use any prefix.
The above code will continue counting up in a new month and even a new year. If you want to start each year with a new series, change the way you handle the previous ID that was found. The Find function returns the cell where the prefix was last used, so you can examine that cell's value further.
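Note that the last cell containing the prefix is not necessarily the one with the highest serial, for instance when an older ID has been pasted at the bottom of the column. A possible variation, shown here only as a sketch, is to scan the whole column for the highest serial used with the prefix. NextReferenceId is a made-up name, and the sketch assumes every ID ends in a five-digit serial and that the serial keeps counting across months.

```vba
Option Explicit

' Hypothetical helper: builds the next ID for a prefix from the highest
' five-digit serial already used with that prefix anywhere in ids.
Function NextReferenceId(ids As Variant, prefix As String, stamp As String) As String
    Dim item As Variant
    Dim serial As Long, maxSerial As Long

    For Each item In ids
        If Not IsError(item) Then
            If Left$(CStr(item), Len(prefix)) = prefix And Len(CStr(item)) > 5 Then
                serial = CLng(Val(Right$(CStr(item), 5)))
                If serial > maxSerial Then maxSerial = serial
            End If
        End If
    Next item
    NextReferenceId = prefix & stamp & Format$(maxSerial + 1, "00000")
End Function

' Example call, assuming the IDs are in A2:A100000 of the active sheet.
Sub WriteNextReferenceId()
    Dim NXVAL As String
    NXVAL = NextReferenceId(ActiveSheet.Range("A2:A100000").Value, "CP", Format(Date, "yyyymm"))
    ActiveSheet.Cells(ActiveSheet.Rows.Count, 1).End(xlUp).Offset(1).Value = NXVAL
End Sub
```

Because the function returns a string, the result can be stored in a variable such as NXVAL before it is written to the sheet.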

How to sort multiple columns in Excel by matching value

I am doing data clean up. I collected data from multiple excel spreadsheets.
I am trying to sort on the basis of exact matches, i.e.:
**Original dataset**
| FirstName_Store1 | FirstName_Store2 | FirstName_Store3 |
|---|---|---|
| Aat | Cat | Dat |
| Cat | Bat | Zat |
| Cot | | |
| Eat | | |
**Result intended**
| FirstName_Store1 | FirstName_Store2 | FirstName_Store3 |
|---|---|---|
| Aat | | |
| | Bat | |
| Cat | Cat | |
| Cot | | |
| | | Dat |
| Eat | | |
| | | Zat |
The reason for the intended result is that I want to find out which records to delete. I want to delete the values from Store1 that are not in Store2 and Store3.
Original dataset contains about 4000 records. Normal alphabet sorting doesn't display exact column value from Store1 side by side with Store2 and Store3.
Things I tried so far:
Match function Match (Store1,Store2:Store3,0) and the result was an error.
I also tried Vlookup, however, Vlookup doesn't do the sorting.
Personally, I think this is an XY Problem where you are trying to solve a problem that has little if anything to do with your actual issue.
The reason for the intended result is that I want to find out which records to delete. I want to delete the values from Store1 that are not in Store2 and Store3.
This situation should be solved easily by looping through each cell in the FirstName_Store1 column from bottom to top and performing an Application.CountIf on the remaining columns.
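That bottom-to-top loop might look roughly like the sketch below. It is untested against the real workbook and makes several assumptions: the three name columns are A, B and C on the active sheet with headers in row 1, "not in store2 and store3" is read as "appears in neither column", and deleting means removing the cell and shifting the rest of column A up. The Sub name is made up.

```vba
Option Explicit

' Removes every Store1 value (column A) that occurs in neither Store2 nor
' Store3 (columns B:C). Assumes headers in row 1 of the active sheet.
Sub ClearStore1ValuesMissingElsewhere()
    Dim ws As Worksheet
    Dim lastRow As Long, r As Long

    Set ws = ActiveSheet
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row

    ' Work from the bottom up so deleting a cell does not skip the next one.
    For r = lastRow To 2 Step -1
        If Len(ws.Cells(r, "A").Value) > 0 Then
            If Application.CountIf(ws.Range("B:C"), ws.Cells(r, "A").Value) = 0 Then
                ws.Cells(r, "A").Delete Shift:=xlShiftUp
            End If
        End If
    Next r
End Sub
```

With the sample data above, Aat, Cot and Eat would be removed from column A and Cat would remain.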
With that said, here's one solution to the question you actually posed.
Sample data before sortAndSift sub procedure
sortAndSift code for public module code sheet
Option Explicit
Sub sortAndSift()
    Dim i As Long, j As Long, m As Variant, n As Variant
    With Worksheets("sheet6")
        With .Cells(1, 1).CurrentRegion
            With .Cells.Resize(.Rows.Count - 1, .Columns.Count).Offset(1, 0)
                'store the original values
                m = .Value2
            End With
            'put all values into column A
            For i = 2 To .Columns.Count
                With .Range(.Cells(2, i), .Cells(Rows.Count, i).End(xlUp))
                    .Parent.Cells(.Parent.Rows.Count, 1).End(xlUp).Resize(.Rows.Count, .Columns.Count).Offset(1, 0) = .Value
                End With
            Next i
        End With
        'reassert CurrentRegion since it probably changed
        With .Cells(1, 1).CurrentRegion
            With .Columns(1).Cells
                'remove duplicates from column A
                .RemoveDuplicates Columns:=1, Header:=xlYes
                'sort column A
                .Sort Key1:=.Columns(1), Order1:=xlAscending, _
                      Orientation:=xlTopToBottom, Header:=xlYes
            End With
            'put a copy of the expanded, de-duplicated and sorted column A
            ' in all other columns and make a copy of the values
            With .Cells.Resize(.Rows.Count - 1, .Columns.Count).Offset(1, 0)
                .FillRight
                n = .Value2
            End With
            'remove values from each 'column' in the array that were not in the original
            For i = LBound(n, 1) To UBound(n, 1)
                For j = LBound(n, 2) To UBound(n, 2)
                    If IsError(Application.Match(n(i, j), Application.Index(m, 0, j), 0)) Then
                        n(i, j) = vbNullString
                    End If
                Next j
            Next i
        End With
        'put values back on the worksheet
        .Cells(2, 1).Resize(UBound(n, 1), UBound(n, 2)) = n
    End With
End Sub
Sample data after sortAndSift sub procedure
A unique list can easily be created by putting your data into a PivotTable (how to do that seems off topic for SO). The results can be copied with Paste Special > Values, de-duplicated with Remove Duplicates, sorted alphabetically and then placed in column E, for example.
Then in F2 copied across and down to H8:
=IFERROR(IF(MATCH($E2,A:A,0),$E2),"")
Copy headings across.
Assumes Aat is in A2.

Automate averaging sets of columns in excel

I have to average sets of 3 columns.
EXAMPLE:
Blood_Patient1_0_R1, Blood_Patient1_0_R2, Blood_Patient1_0_R3
Their average is in a new column Blood_Patient1_0
Similarly, Blood_Patient1_3_5_R1, Blood_Patient1_3_5_R2, Blood_Patient1_3_5_R3
The average is in a new column Blood_Patient1_3_5
This process is being repeated for 8 such sets of columns.
Currently I am averaging using the formula: IF(ISERROR(AVERAGE(B7:D7)),"",AVERAGE(B7:D7)) and auto-filling 21,000 plus rows.
Since there is a pattern in column headings, I was thinking to automate the whole process.
This is what I have thought so far in terms of algorithm:
0, 3_5, 6_25 are time values in column headers.
at each time instant, there are 3 replicates R1, R2,R3 as part of column headers
for time array [3.5h, 6.25h, 9.5h, 11.5h, 16.5h, 25h, 49h, and 156h]
    create a new column
    for rows from 2 to 21458
        average over replicates from R1 to R3 using above formula
I do not know how to write this in excel. Any help would be appreciated.
Give this a go.
This solution assumes that you have a continuous data set, that is, no gaps between the columns you wish to search through.
Firstly, you will need to include this function. Paste it into the same module as the subroutine. The purpose of this function is to allow the string in each heading to be compared against an array of substrings, as opposed to the single substring permitted by the InStr function.
Function Time_Search(strCheck As String, ParamArray anyOf()) As Boolean
    Dim item As Long
    For item = 0 To UBound(anyOf)
        If InStr(1, strCheck, anyOf(item)) <> 0 Then
            Time_Search = True
            Exit Function
        End If
    Next
End Function
Next, paste in this subroutine. I have assumed that the dataset begins at cell A1. Also, I have allowed for a dynamic range, should the number of columns or rows ever change.
Sub Insert_Average_Columns()
    Dim HeaderRange As Range
    Dim LastRow As Long
    Dim c As Range
    Set HeaderRange = Range(Range("A1"), Range("A1").End(xlToRight))
    LastRow = Range("A" & Rows.Count).End(xlUp).Row
    For Each c In HeaderRange.Cells
        If Right(c.Value, 2) = "R3" Then
            If Time_Search(c.Value, "3_5", "6_25", "9_5", "11_5", "16_5", "25", "49", "156") Then
                c.Offset(0, 1).EntireColumn.Insert
                c.Offset(0, 1) = "Average of " & Left(c.Value, Len(c.Value) - 3)
                c.Offset(1, 1).FormulaR1C1 = "=IFERROR(AVERAGE(RC[-3]:RC[-1]),"""")"
                c.Offset(1, 1).AutoFill Range(c.Offset(1, 1).Address, Cells(LastRow, c.Offset(1, 1).Column))
            End If
        End If
    Next c
End Sub
There is one issue with your data. If you want the procedure to insert an average column for T = 25, then it will do so for all columns where T contains the string "25". If there are T= 8.25, 10.25, 15.25, etc, these will all have averages applied. The only way around it would be to include more of the heading string in the parameter array, but I presume you will be dealing with a variable Blood_Patient ID so that probably isn't an option.
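If the headers always follow the layout Tissue_PatientID_Time_Replicate (an assumption based on the examples in the question, such as Blood_Patient1_3_5_R3), one way to avoid the substring problem is to cut the time part out of the header and compare it exactly. The two functions below are a sketch of that idea with made-up names, and they could stand in for Time_Search in the subroutine above.

```vba
Option Explicit

' Hypothetical helper: "Blood_Patient1_3_5_R3" -> "3_5".
' Drops the first two underscore-separated fields and the last one.
Function HeaderTime(header As String) As String
    Dim parts() As String
    Dim i As Long

    parts = Split(header, "_")
    If UBound(parts) < 3 Then Exit Function     ' too few fields: returns ""
    HeaderTime = parts(2)
    For i = 3 To UBound(parts) - 1
        HeaderTime = HeaderTime & "_" & parts(i)
    Next i
End Function

' True only when the time part of the header equals one of the listed times.
Function Time_Match(header As String, ParamArray anyOf()) As Boolean
    Dim t As Variant
    Dim headerPart As String

    headerPart = HeaderTime(header)
    For Each t In anyOf
        If headerPart = CStr(t) Then
            Time_Match = True
            Exit Function
        End If
    Next t
End Function
```

With this, a header for T = 8.25 has the time part "8_25", which no longer matches "25".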

Identify and duplicate unique rows

I have files of data with the following format:
In column A, identifiers occur either twice (e.g. 302_60) or once (e.g. 310_58). Additional information is present in column B.
What I want to do is:
1. tag the rows that have single identifiers in column A with TRUE/FALSE in column C
2. for any TRUE tag, insert a line BELOW
3. copy into the inserted row the contents of the ENTIRE tagged row (here just columns A, B)
I solved #1 using =COUNTIF(A:A, A1)=1
I then wrote a VBA script to solve #2
Sub ins_below_and_copy()
    Dim c As Range
    For Each c In Range("C1:C100")
        If InStr(1, c, "TRUE", vbTextCompare) > 0 Then
            Rows(c.Offset(1, 0).Row & ":" & c.Offset(1, 0).Row).Insert Shift:=xlDown
        End If
    Next c
End Sub
Achieving the desired end result (#3) seems simple enough, right? I have been trying .Copy and .Paste commands, but keep getting type-mismatch errors, which do not make sense to me (I am not a competent VBA coder). Any ideas?
You have done all the hard work; filling in the gaps is easy. Select the two columns, then HOME > Editing > Find & Select > Go To Special..., choose Blanks, click OK, type =, press Up and then Ctrl+Enter.
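For anyone who prefers a macro, those manual steps correspond roughly to the sketch below. It assumes the two columns are selected before it runs and that no blank cell sits in row 1 (there would be nothing above it to refer to); the Sub name is made up.

```vba
Option Explicit

' Fills every blank cell in the current selection with a formula that
' points at the cell directly above it (same as Go To Special > Blanks, =, Up, Ctrl+Enter).
Sub FillBlanksFromAbove()
    Dim blanks As Range

    If TypeName(Selection) <> "Range" Then Exit Sub

    On Error Resume Next                ' SpecialCells raises an error when no blanks exist
    Set blanks = Selection.SpecialCells(xlCellTypeBlanks)
    On Error GoTo 0
    If blanks Is Nothing Then Exit Sub

    blanks.FormulaR1C1 = "=R[-1]C"
End Sub
```

The filled cells contain formulas, so copy the columns and use Paste Special > Values afterwards if plain values are needed.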
You can run this after you have your empty rows created.
Sub FillInsertedRows() 'the Sub name is arbitrary
    Dim sheet As String
    Dim lastRow As Long
    Dim r As Long
    sheet = "SheetName"
    lastRow = Sheets(sheet).Range("A" & Rows.Count).End(xlUp).Row
    For r = 2 To lastRow 'Assuming you have a Header Row
        If Sheets(sheet).Cells(r, 1) = "" Then
            Sheets(sheet).Cells(r - 1, 3) = "FALSE"
            Sheets(sheet).Cells(r, 1) = Sheets(sheet).Cells(r - 1, 1)
            Sheets(sheet).Cells(r, 2) = Sheets(sheet).Cells(r - 1, 2)
            Sheets(sheet).Cells(r, 3) = Sheets(sheet).Cells(r - 1, 3)
        End If
    Next r
End Sub

Getting excel to put together split strings

I'm trying to get excel to put together a series of text strings that haven't been formatted systematically, so that they end up split into different rows on a data sheet.
I'm aware this might've been solved elsewhere so sorry for that but I'm struggling to describe the issue, and I can't post images on it but basically it's
Column 1 with a list of the entries, and
Column 2 with text strings that are spread over 2 or more rows
Is it possible to write some kind of formula or macro that would be able to check the first column and then stitch together all entries in the second column going down until it found a new entry in the first column? I've got a feeling it might be possible using some sort of loop thing with index functions, but I've no idea where to start even.
Thanks,
Mike
Mike, give this a try.
Sub appendValues()
    'Loops down the sheet. Whenever column A is blank, the value in column B is appended to the row above and the blank row is deleted.
    Dim row As Long
    row = 1
    'Starts at row 1, but this can be changed at will. The first row must have an entry in column A.
    Do Until ThisWorkbook.Sheets("sheet1").Cells(row, 2).Value = ""
        'Keeps looping until the cell in column B is empty.
        If ThisWorkbook.Sheets("sheet1").Cells(row, 1).Value = "" Then
            ThisWorkbook.Sheets("sheet1").Cells(row - 1, 2).Value = ThisWorkbook.Sheets("sheet1").Cells(row - 1, 2).Value & ThisWorkbook.Sheets("sheet1").Cells(row, 2).Value
            ThisWorkbook.Sheets("sheet1").Rows(row).Delete
        Else
            'Only move on when nothing was deleted: after a delete, the next row has already moved up into the current position.
            row = row + 1
        End If
    Loop
End Sub

Resources