Delete a duplicated line (only if it's the next one!) - excel

Is it possible to use a function in Excel 2003 to delete an entire row if it's the same as the previous one? For example:
apple
apple
plum
vinegar
apple
banana
banana
banana
apple
I want to delete #2, #7 and #8, but I don't want #5 and #9 to be deleted. I want to delete a duplicated entry ONLY IF it's the next one. I hope that makes it clear.
If there isn't a function, how can I do that in VBA? Thanks in advance!

If it is a one-off, you can do it fairly simply without VBA by adding a formula in the next column - let's say your column is A:
in B2 put the following:
=IF(A1=A2,"DELETE","")
drag the formula down
add an autofilter on both columns
filter on "DELETE" in column B
delete all visible rows
remove column B

Here's something that might fit. It deletes every row that duplicates the row directly above it (the whole row physically, not only the cell data!), so keep in mind that it should be used only on one-column sheets. Otherwise you will lose data in the other columns.
Sub Unduplicate()
    Dim prev As String
    Dim sel As Range
    Dim i As Long ' Long, so sheets with more than 32,767 rows do not overflow
    Set sel = Selection.Cells(1) ' start from the first selected cell
    prev = sel.Offset(0).Value ' set prev as the first value - never will be deleted
    i = 1 ' start from 2nd row
    Do While sel.Offset(i).Value <> "" ' stop at the first empty cell
        If sel.Offset(i).Value = prev Then ' if duplicate - delete row
            sel.Offset(i).EntireRow.Delete
        Else ' else set new prev, and go further
            prev = sel.Offset(i).Value
            i = i + 1
        End If
    Loop
End Sub
After running this macro on your example (with the first cell of the list selected), I'm getting:
apple
plum
vinegar
apple
banana
apple
You can modify it to store the values in an array and then fill the column from that array instead of deleting the rows, but that should be easy now.
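A minimal sketch of that array variant, in case it helps. The names CollapseAdjacent and UnduplicateViaArray are made up for this example, and it assumes the list is a single selected column of plain values (no error cells) with at least two cells:

```vba
' Returns a 0-based array in which each run of equal adjacent values is collapsed to one.
' Assumes items is a non-empty one-dimensional array.
Function CollapseAdjacent(items As Variant) As Variant
    Dim result() As Variant
    Dim i As Long, n As Long

    ReDim result(0 To UBound(items) - LBound(items))
    result(0) = items(LBound(items))        ' the first value is always kept
    n = 1
    For i = LBound(items) + 1 To UBound(items)
        If items(i) <> items(i - 1) Then    ' keep a value only if it differs from the one before it
            result(n) = items(i)
            n = n + 1
        End If
    Next i
    ReDim Preserve result(0 To n - 1)
    CollapseAdjacent = result
End Function

' Reads the first column of the selection, collapses it, and writes it back.
Sub UnduplicateViaArray()
    Dim src As Range
    Dim data As Variant, flat() As Variant, kept As Variant
    Dim i As Long

    Set src = Selection.Columns(1)
    If src.Cells.Count < 2 Then Exit Sub    ' a single cell has nothing to collapse

    data = src.Value                        ' 2-D array (1 To rows, 1 To 1)
    ReDim flat(0 To UBound(data, 1) - 1)
    For i = 1 To UBound(data, 1)
        flat(i - 1) = data(i, 1)
    Next i

    kept = CollapseAdjacent(flat)
    src.ClearContents                       ' only this column is touched, no rows are deleted
    For i = 0 To UBound(kept)
        src.Cells(i + 1, 1).Value = kept(i)
    Next i
End Sub
```

Because no rows are deleted, data in other columns stays where it is, which may or may not be what you want.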

Just a quick note...Make sure you work your way from the bottom of the range if you are deleting entire rows. You may get some unexpected results if you work your way down from the top of the range. You may also want to consider clearing the cell value and then sorting, instead of deleting. I would write up an example for you but I am short on time. If you get stuck I can write it for you later.
Edit:
My original answer is not necessarily true as shown in makciook's solution below. In the past I have mistakenly approached the problem this way (DO NOT USE THIS AS A SOLUTION!!!):
Sub duplicates()
    Dim c As Range, rng As Range
    Set rng = Selection ''Select the entire list before running
    For Each c In rng
        If c.Value = c.Offset(-1, 0).Value Then c.EntireRow.Delete
    Next
End Sub
With this script, the cell range does not reset once a row is deleted and shifted up. So running this would not catch the second duplicate of banana in your list. As an alternative I usually find the last row and work my way up to the first row to account for the rows shifting up when a deletion occurs. I am going to give myself a -1 if possible.
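For what it's worth, the bottom-up approach could look roughly like this. It is only a sketch: it assumes the list sits in column A of the active sheet and starts in row 1, and the macro name is made up.

```vba
Sub DeleteAdjacentDuplicatesBottomUp()
    Dim ws As Worksheet
    Dim lastRow As Long, r As Long

    Set ws = ActiveSheet                                    ' assumption: data is on the active sheet
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row    ' assumption: values are in column A

    ' Walk upward so that deleting row r never shifts a row that has not been examined yet.
    For r = lastRow To 2 Step -1
        If ws.Cells(r, "A").Value = ws.Cells(r - 1, "A").Value Then
            ws.Rows(r).Delete
        End If
    Next r
End Sub
```

Each row is compared with the row directly above it, so a value that reappears later in the list (like the later apples in the question) is left alone.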

Related

Copy a range of 2 rows and paste in next available blank space

I'm new to VBA and am trying to create a window order form to fit a landscape 8x11.5 page. The rows are set up as follows:
Each cell has a different data validation dropdown menu. I'd like to be able to copy this template and paste it below Line #1.
In this case Line #2 would begin at A5.
This code does the trick very crudely:
Sub CopyTemplate()
Worksheets("WINDOW TEMPLATE").Range("A3:R4").Copy Range("A5:R6")
End Sub
As you can see, the job is done crudely but I'd have to create a macro for each new line manually.
Is there a way to replace the destination range with a variable that will find the next available empty row to create Line 2 like
In the case of Line 2 it would be A7 (and then every two lines would be the loop I guess).
I've seen several examples similar to mine but they never do what I need, or I'm too inexperienced to retrofit the code to my needs. As I understand it, .End finds the next blank cell, but all it seems to do is go to the bottom of the sheet.
Find the last numbered item in Col A then move down two rows:
Sub CopyTemplate()
    Dim c As Range
    With Worksheets("WINDOW TEMPLATE")
        Set c = .Cells(.Rows.Count, "A").End(xlUp).Offset(2) 'move down two rows from last numbered item
        .Range("A3:R4").Copy c
        c.Value = c.Offset(-2).Value + 1 'increment item#
    End With
End Sub

VBA For Each ... Loop skipping cells when clearing row

I know this is a common problem people have when trying to delete rows when looping through ranges but that is not the case here. Rather than deleting the rows, I am cutting the entire row and pasting into another worksheet, leaving a blank row. The issue is that the For Each ... Loop acts as if I deleted the row and skips the next cell in the loop. Below is a snippet of the code.
last_row = 100
Set search_rng = n_ws.Range(n_ws.Range("L2"), n_ws.Cells(last_row, 12))
For Each cell In search_rng
    find_amt = cell.Value * -1
    Set s_cell = search_rng.Find(find_amt, LookIn:=xlFormulas)
    If Not s_cell Is Nothing Then
        ' do stuff
    Else
        'No matching value found, move row to o_ws
        paste_row = o_ws.UsedRange.Rows.Count + 1
        n_ws.Rows(cell.Row).Cut o_ws.Cells(paste_row, 1).EntireRow
    End If
Next
Anyone know what I'm doing wrong here? I imagine I can fix it by changing the For Each ... Loop to a For i in Range loop but I'm more curious as to why this is happening.
Cutting the row still deletes it in practice. You can convince yourself of this by just manually doing so on the spreadsheet. You will see a shift in range, which means this is the same issue as deleting rows in a loop. A common way to avoid this issue would be to switch to a For i loop and loop backwards.
Since you are not actually deleting rows here you could also try the following....
Using your same code, you can just copy & paste the row, not cut, and then circle back around and clear the row's contents to leave a blank row in place.
    Else
        paste_row = o_ws.UsedRange.Rows.Count + 1
        n_ws.Rows(Cell.Row).Copy o_ws.Cells(paste_row, 1)
        n_ws.Rows(Cell.Row).ClearContents
    End If
Next cell
Note
The better way to do this would be to loop through your range and build a Union (a collection of cells) of the rows that meet your criteria. Once the loop is done you can copy, paste, & clear the Union all at once. This means you take one set of actions on the worksheet rather than many.
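A rough sketch of the Union idea follows. The procedure name and its two parameters are made up for this example, and the matching test uses CountIf instead of the Find call from the question, so treat it as an illustration rather than a drop-in replacement. It assumes the search range holds plain numbers or blanks.

```vba
' Moves (copy + clear) every row whose amount has no opposite amount in searchRng.
Sub MoveUnmatchedRows(searchRng As Range, destSheet As Worksheet)
    Dim cell As Range, toMove As Range
    Dim pasteRow As Long

    For Each cell In searchRng.Cells
        If Not IsEmpty(cell.Value) And IsNumeric(cell.Value) Then
            ' No cell holds the opposite amount: remember the row, but do not touch the sheet yet.
            If Application.WorksheetFunction.CountIf(searchRng, -cell.Value) = 0 Then
                If toMove Is Nothing Then
                    Set toMove = cell.EntireRow
                Else
                    Set toMove = Union(toMove, cell.EntireRow)
                End If
            End If
        End If
    Next cell

    If toMove Is Nothing Then Exit Sub      ' nothing to move

    ' Assumption: the same column is populated on the destination sheet.
    pasteRow = destSheet.Cells(destSheet.Rows.Count, searchRng.Column).End(xlUp).Row + 1
    toMove.Copy destSheet.Cells(pasteRow, 1)    ' non-adjacent rows are pasted as one contiguous block
    toMove.ClearContents                        ' leaves blank rows behind on the source sheet
End Sub
```

With the names from the question it would be called as MoveUnmatchedRows search_rng, o_ws. Nothing on the sheet changes while the loop is running, so the For Each cannot skip cells.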

Create list of unique values from a range in Excel

I have data (names) spread out in a sheet in Excel. It is in columns A through M.
Some rows have data in every column (A-M), some rows have nothing.
I want to extract a list of every unique value (or even a list of every value, and then I can remove duplicates) in the range.
For the life of me, I cannot figure out how to do this and haven't found an example online that pulls unique values from a RANGE, and not a single column.
Any ideas? Let me know if you need a screenshot of the data for more context.
Give this macro a try:
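Sub MAINevent()
    Dim it As Range, r As Range, x0
    With CreateObject("Scripting.Dictionary")
        ' Visit every constant (non-blank, non-formula) cell in columns A:M
        For Each it In Range("A:M").SpecialCells(xlCellTypeConstants)
            x0 = .Item(it.Value) ' reading a missing key adds it, so each value is stored once
        Next
        ' Write the distinct values down column N, starting at N1
        Set r = Cells(1, "N").Resize(.Count, 1)
        r.Value = Application.Transpose(.Keys)
    End With
End Sub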
You can use Go to Special (Ctrl+G -> Special) to select blanks and remove them (Ctrl+- with cells selected). Then use the formula described here to stack all your columns in one, on top of each other. Then replace formulas with values and finally, use Remove Duplicates.

Removing duplicates from large sheet

I want to remove rows based on duplicate cells in a column from a large sheet, without leaving one copy of each duplicate behind (as the "Remove Duplicates" Excel command does). So if I have:
1
2
2
3
I want, as a result:
1
3
This can be accomplished with conditional formatting, then filtering or sorting duplicates and deleting the filtered data, but the process is very slow for a large sheet.
Conditional formatting takes a second, but just clicking on the filter takes around 5 min to display the filter context menu and an additional 20-30 min to do the actual filtering based on color. I tried this process on different PCs with 4 cores and plenty of RAM, on a sheet with 100,000 rows.
I then thought to write VBA, iterate over the column cells and, if a cell is colored, delete the entire row (this is possible in Excel 2010, with Cells().DisplayFormat), but processing takes even more time.
Can someone suggest a faster way to remove duplicates on a large sheet?
EDIT: Note that I have used 2 procedures. Of these, test is a procedure that checks whether GetUniqueItems works (you will have to modify it for your scenario).
Also, I filled cell A1 to A100000 with test values. Please modify it per your needs.
Option Explicit
Function GetUniqueItems(ByVal src As Range) As Variant
    Dim returnValue
    Dim dictOfItemsWith1Value
    Dim dictOfItemsWithMoreThan1Value
    Dim countOfCells As Long
    Dim counter As Long
    Dim srcValues As Variant
    Dim currentValue
    Dim cell As Range
    srcValues = src.Value
    countOfCells = src.Cells.Count
    Set dictOfItemsWith1Value = CreateObject("Scripting.Dictionary")
    Set dictOfItemsWithMoreThan1Value = CreateObject("Scripting.Dictionary")
    For counter = 1 To countOfCells
        currentValue = srcValues(counter, 1)
        If dictOfItemsWithMoreThan1Value.exists(currentValue) Then
            dictOfItemsWithMoreThan1Value(currentValue) = dictOfItemsWithMoreThan1Value(currentValue) + 1
        Else
            If Not dictOfItemsWith1Value.exists(currentValue) Then
                dictOfItemsWith1Value.Add currentValue, 1
            Else
                dictOfItemsWith1Value.Remove currentValue
                dictOfItemsWithMoreThan1Value.Add currentValue, 1
            End If
        End If
    Next
    ReDim returnValue(1 To dictOfItemsWith1Value.Count, 1 To 1)
    Dim key
    counter = 1
    For Each key In dictOfItemsWith1Value.keys
        returnValue(counter, 1) = key
        counter = counter + 1
    Next
    GetUniqueItems = returnValue
End Function
Sub test()
    Debug.Print Now
    Dim uniqueValues
    uniqueValues = GetUniqueItems(Range("A1:A100000"))
    Range("A1:A100000").ClearContents
    Range("A1").Resize(UBound(uniqueValues, 1)) = uniqueValues
    Debug.Print Now
End Sub
My way to deal with large Excel files where I have to remove large chunks of data:
1. After the last column, use a COUNTIF() (much like KazJaw's and DanM's COUNTIF):
=COUNTIF($A$1:$A$100000,A1)
$A$1:$A$100000 contains your ids. Change accordingly.
2. Drag the formula to the bottom (Fill Down, or select the range $B$1:$B$100000, if that is where you put the helper column, then press Ctrl+D).
3. Copy the column and paste values in place to remove the formula. This will prevent any recalculations during/after any filtering.
4. Sort by the column with the counts, in ascending order. This makes deleting the large portion of rows much faster later on.
5. Look for where you start to get counts of 2 and delete all rows from there to the bottom.
6. Delete the helper column.
Now, if you want to restore the original order, add yet another column after the count column (for example, a running row number) once step 3 is done. Then, after step 5, sort by this new column in ascending order before deleting it along with the helper column in step 6.
If your data is located in column A, this formula should do what you need fairly efficiently:
=COUNTIF(A$1:A$100000,A1)
This formula counts how many times the value in A1 appears in the range A1:A100000. (The dollar signs keep that range from moving down as you drag your formula down.)
Place this in B1 and drag down* to B100000 (assuming you have 100,000 rows).
Then just do a filter on column B to show only 1. (More than 1 means you have duplicates and shouldn't show it.)
*A short cut for dragging down is to just select B1, then press Ctrl-End, then hold down shift and click B100000. Then do Ctrl-D (which is a shortcut for Fill Down).

Delete a row in Excel VBA

I have this piece of code which finds the excel row of an item from a list and deletes the items from a list. What I want... is to delete the Excel row as well.
The code is here
Private Sub imperecheaza_Click()
    Dim ws As Worksheet
    Dim Rand As Long
    Set ws = Worksheets("BD_IR")
    Rand = 3
    Do While ws.Cells(Rand, 4).Value <> "" And Rand < 65000
        If ws.Cells(Rand, 4).Value = gksluri.Value * 1 And ws.Cells(Rand, 5).Value = gksluri.List(gksluri.ListIndex, 1) * 1 Then
            ws.Range(Rand, 1).EntireRow.Delete '(here I want to delete the entire row that meets the criteria from the If statement)
            gksluri.RemoveItem gksluri.ListIndex
            Exit Do
        End If
        Rand = Rand + 1
    Loop
End Sub
Where I added ws.Range(Rand,1).EntireRow.Delete is where I want to delete the entire row but I don't know how to do it. What I want... if it finds the same value in a cell like in some selected item of my list to be able to remove both the entire row in excel and the item from the listbox. It works to remove the item from the listbox but I don't know how to remove the row as well
Chris Nielsen's solution is simple and will work well. A slightly shorter option would be...
ws.Rows(Rand).Delete
...note there is no need to specify a Shift when deleting a row as, by definition, it's not possible to shift left
Incidentally, my preferred method for deleting rows is to use...
ws.Rows(Rand) = ""
...in the initial loop. I then use a Sort function to push these rows to the bottom of the data. The main reason for this is because deleting single rows can be a very slow procedure (if you are deleting >100). It also ensures nothing gets missed as per Robert Ilbrink's comment
You can learn the code for sorting by recording a macro and reducing the code as demonstrated in this expert Excel video. I have a suspicion that the neatest method (Range("A1:Z10").Sort Key1:=Range("A1"), Order1:=xlAscending or xlDescending, Header:=xlYes or xlNo) can only be discovered on pre-2007 versions of Excel...but you can always reduce the 2007/2010 equivalent code
A couple more points...if your list is not already sorted by a column and you wish to retain the order, you can stick the row number 'Rand' in a spare column to the right of each row as you loop through. You would then sort by that column and eliminate it
If your data rows contain formatting, you may wish to find the end of the new data range and delete the rows that you cleared earlier. That's to keep the file size down. Note that a single large delete at the end of the procedure will not impair your code's performance in the same way that deleting single rows does
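To make the clear-then-sort idea concrete, here is a hedged sketch. The layout is an assumption of mine, not something from the question: a header in row 1, data in columns A:Z, and the value to match in column D. The procedure name and the targetValue parameter are made up as well.

```vba
' Clears every data row whose column D equals targetValue, then sorts the blanks to the bottom.
Sub ClearMatchesThenSort(ws As Worksheet, targetValue As Variant)
    Dim lastRow As Long, r As Long

    lastRow = ws.Cells(ws.Rows.Count, "D").End(xlUp).Row
    If lastRow < 2 Then Exit Sub            ' header only, nothing to do

    ' Clearing does not shift rows, so a plain top-down loop cannot skip anything.
    For r = 2 To lastRow
        If ws.Cells(r, "D").Value = targetValue Then ws.Rows(r).ClearContents
    Next r

    ' Blank rows end up below the remaining data after the sort.
    ws.Range("A1:Z" & lastRow).Sort Key1:=ws.Range("D1"), Order1:=xlAscending, Header:=xlYes
End Sub
```

Note that this reorders the remaining rows by column D; if the original order matters, the spare row-number column described above is still needed.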
Change your line
ws.Range(Rand, 1).EntireRow.Delete
to
ws.Cells(Rand, 1).EntireRow.Delete
Better yet, use Union to grab all the rows you want to delete, then delete them all at once. The rows need not be contiguous.
Dim rng As Range
Dim rDel As Range
For Each rng In {the range you're searching}
    If {Conditions to be met} = True Then
        If Not rDel Is Nothing Then
            Set rDel = Union(rng, rDel)
        Else
            Set rDel = rng
        End If
    End If
Next
If Not rDel Is Nothing Then rDel.EntireRow.Delete ' skip the delete when nothing matched
That way you don't have to worry about sorting or things being at the bottom.
Something like this will do it:
Rows("12:12").Select
Selection.Delete
So in your code it would look like something like this:
Rows(CStr(rand) & ":" & CStr(rand)).Select
Selection.Delete

Resources