fastest way to process 115 million cells? - excel

I have been given a work task where im to find and replace 8 digits numbers with a corresponding new values coming from a 2 column table....basically a vlookup then replace the old value with a new one...
The challenge im facing is.... the 2 column table is 882k rows, and the cells im trying to replace is about 120 million (41,000 rows x 3000 columns)...
I tried running a vba code i found somewhere...
Option Explicit
Sub Replace_Overwrite()
Dim LRow As Long, i As Long
Dim varSearch As Variant
With Sheets("Sheet2")
LRow = .Cells(.Rows.Count, 1).End(xlUp).Row
varSearch = .Range("A2:B" & LRow)
End With
With Sheets("Sheet1").UsedRange
For i = LBound(varSearch) To UBound(varSearch)
.Replace what:=varSearch(i, 1), replacement:=varSearch(i, 2), lookat:=xlWhole
Next
End With
End Sub
I tried using this and it ran it for 8 hours and my work laptop crashed....
I'm not sure anymore if this is still possible with MS Excel alone...
I wonder if anyone can help me with a code that can process it.. i can leave my system open over the weekend if its stable and does work.. it only has 8GB ram btw, running excel 2013...

To speed up things, do as much as possible in memory and minimize the interaction between VBA and Excel (as this makes things really slow).
The following attempt reads the lookup-list into a dictionary and then processes the data column by column.
I did a test, creating 880.000 lookup rows and 40.000 x 100 cells of data. Building the dictionary took less than a minute, processing the columns took 3-4 seconds per column. I added a logic that after every 10 columns, the whole workbook is saved, that increased the processing time but ensures that after a crash you can more or less continue where you left (the yellow color tells you where, just replace the 1 in for col=1 with the column where you want to restart).
I have added some DoEvents, that in theory slows down the process a little bit. Advantage is that you can see the output of the debug.print and the whole Excel process is not displayed as unresponsive in the task manager.
To build the dictionary, I read the complete data into an array at once (if you are not familiar with Dictionaries: You need to add a reference to the Microsoft Scripting Runtime).
Function createDict() As Dictionary
Dim d As New Dictionary
Dim rowCount As Long
Dim list()
Debug.Print Now, "Read data from Lookup sheet"
With ThisWorkbook.Sheets(1)
rowCount = .Cells(.Rows.Count, 1).End(xlUp).row
list = .Range("A1:B" & rowCount).Value
End With
Debug.Print Now, "Build dictionary."
Dim row As Long
For row = 1 To UBound(list)
If Not d.Exists(list(row, 1)) Then d.Add list(row, 1), list(row, 2)
If row Mod 1000 = 0 Then DoEvents
Next row
Set createDict = d
End Function
As said, replacing the data is done column by column. Again, I read the whole column at once into an array, do the replace on this array and then write it back to the sheet.
Sub replaceAll()
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
Dim d As Dictionary
Set d = createDict
Dim row As Long, col As Long
Dim rowCount As Long, colCount As Long
With ThisWorkbook.Sheets(2)
rowCount = .Cells(.Rows.Count, 1).End(xlUp).row
colCount = .Cells(1, .Columns.Count).End(xlToLeft).Column
For col = 1 To colCount
Debug.Print Now & "processing col " & col
DoEvents
Dim data
data = .Range(.Cells(1, col), .Cells(rowCount, col))
For row = 1 To rowCount
If d.Exists(data(row, 1)) Then data(row, 1) = d(data(row, 1))
Next row
.Range(.Cells(1, col), .Cells(rowCount, col)) = data
.Cells(1, col).Interior.Color = vbYellow
If col Mod 10 = 0 Then ThisWorkbook.Save
Next
End With
Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic
End Sub
One remark: You should consider to use a database for such amount of data.

Related

How to delete rows from table, based on parameter check by formula?

I want to delete entire rows of data, if the formula matches the set value.
I am running a check (example: sum of three columns = 0?) through a set of 17K records. The code takes around 20 minutes to complete.
Dim currentRow As Integer
Dim rowCheck As Long
Dim ws As Worksheet
Set ws = ActiveSheet
For currentRow = ws.UsedRange.Rows.Count To 2 Step -1
rowCheck = Application.WorksheetFunction.Sum(Cells(currentRow, 5), Cells(currentRow, 6), Cells(currentRow, 7))
Select Case rowCheck
Case 0
ws.Rows(currentRow).Delete ' it takes around 20 minutes to complete with 17K records to run through
Case Else
End Select
Next
Set ws = Nothing
The code is working, however, it seems, I am doing something wrong, as I believe the code should work so much faster with given set of data (only 17K records).
Is there a way to optimize the deletion line?
Having to go through it line for line isn't the fast way to do this. You would be better off with a temporary helper column which calculates the sum. You can then filter the range on this column and delete all rows that match your criteria at once. So something like this. (assuming Column H is empty)
Dim currentRow As Integer
Dim rowCheck As Long
Dim ws As Worksheet
Dim lastRow as integer
Set ws = ActiveSheet
lastRow = ws.UsedRange.Rows.Count
ws.range("h2").formula = "=sum(e2:g2)"
ws.range("h2").autofill destination:= ws.range("h2:h" & lastRow)
ws.range("a1:h1").autofilter field:=8, criteria1:="0"
ws.range("a2:h" & lastRow).SpecialCells(xlCellTypeVisible).entirerow.delete
ws.autofiltermode = false
ws.range("h1:h" & lastRow).clearcontents
set ws = Nothing
Edit: You could also filter columns E, F, and G on 0 but that only works if all values are 0 or positive. Doing it the way I suggested gives you more control, because you can easily adjust the formula you put in cell H2.
It will be much faster to find all the rows that you want to delete, select all the rows, and delete them in one go, instead of doing it row by row.
let's say you found you wanted to delete rows 35, 37, 39, and 40 then the code will be something like
for each row in row_to_evaluate
delete_row = evaluate(row)
if delete_row = True then Delete_Row_List = Delete_Row_List & "," & row
next
'Example: Delete_Row_List = "35,37,39,40"
Rows(Delete_Row_List).Delete Shift:=xlUp
also remember application.screenupdating = False before you run the code and application.screenupdating = True after you run it for a bit better performance.
hope it helps
EDIT:
Ah I see an answer before mine basically recommended the same
It just occurred to me to speed up your code substantially. I assumed that you have to use the delete rows capability, but it will actually be much faster to do the following (note this is pseudo code, panel beat to work for you):
with thisworkbook.worksheets("Sheet1")
redim New_Sheet(1 to nr_rows, 1 to nr_columns) as variant
Old_Sheet = .range(.cells(1,1),.cells(nr_rows,nr_columns)
'keep the headers
for col = 1 to nr_columns
New_Sheet(1,col) = Old_Sheet(1,col)
next col
k = 1
for row = 2 to nr_rows 'start at 2 to protect the headers
keep_row = evaluate_row(row,Old_sheet) 'this function must evaluate the row. return True if you want to keep the row, return false if you want to delete it
if keep_row then
k = k+1
for col = 1 to nr_col
New_sheet(k,col) = Old_Sheet(row,col)
next col
next row
.range(.cells(1,1),.cells(nr_rows,nr_columns) = New_Sheet
end with

Nested loops causing Excel crash

I am attempting to run a VBA macro that iterates down about 67,000 rows with 100 columns in each row. For each of the cells in these rows, the value is compared against a column with 87 entries in another sheet. There are no errors noted when the code is run but Excel crashes every time. The odd thing is that the code seems to work; I have it set to mark each row in which a match is found and it does so before crashing. I have attempted to run it many times and it has gotten through between 800 and 11,000 rows before crashing, depending on the attempt.
My first suspect was memory overflow due to the volume of calculations but my system shows CPU utilization at 100% and memory usage around 50% while running this code:
Sub Verify()
Dim codes As String
Dim field As Object
For i = 2 To Sheets("DSaudit").Rows.Count
For Each field In Sheets("Dsaudit").Range(Cells(i, 12), Cells(i, 111))
r = 1
While r <= 87
codes = ThisWorkbook.Sheets("287 Denominator CPT").Cells(r, 1).Value
If field = codes Then
Cells(i, 112).Value = "True"
r = 88
Else
r = r + 1
End If
Wend
Next field
i = i + 1
Next i
End Sub
It should also be noted that I am still very new to VBA so it's likely I've made some sort of egregious rookie mistake. Can I make some alterations to this code to avoid a crash or should I scrap it and take a more efficient approach?
When ever possible iterate variant arrays. This limits the number of times vba needs to access the worksheet.
Every time the veil between vba and Excel is pierced cost time. This only pierces that veil 3 times not 9,031,385,088
Sub Verify()
With Sheets("DSaudit")
'Get last row of Data
Dim lastrow As Long
lastrow = .Cells(.Rows.Count, 12).End(xlUp).Row 'if column 12 ends before the last row of data change to column that has them all.
'Load Array with input Values
Dim rng As Variant
rng = .Range(.Cells(2, 12), .Cells(lastrow, 111)).Value
'Create output array
Dim outpt As Variant
ReDim outpt(1 To UBound(rng, 1), 1 To 1)
'Create Match array
Dim mtch As Variant
mtch = Worksheets("287 Denominator CPT").Range("A1:A87").Value
'Loop through first dimension(Row)
Dim i As Long
For i = LBound(rng, 1) To UBound(rng, 1)
'Loop second dimension(Column)
Dim j As Long
For j = LBound(rng, 2) To UBound(rng, 2)
'Loop Match array
Dim k As Long
For k = LBound(mtch, 1) To UBound(mtch, 1)
'If eqaul set value in output and exit the inner loop
If mtch(k, 1) = rng(i, j) Then
outpt(i, 1) = "True"
Exit For
End If
Next k
'If filled true then exit this for
If outpt(i, 1) = "True" Then Exit For
Next j
Next i
'Assign the values to the cells.
.Cells(2, 112).Resize(UBound(outpt, 1), 1).Value = outpt
End With
End Sub

Deleting entire row on criteria cannot handle 400,000 rows

I have this macro to delete the entire rows for those that are not "chr9". I have a total of 401,094 rows. It seems to compile fine, but my Excel freezes and I have to Force Quit.
I think it may be an inefficient algorithm or maybe some error in the code?
Sub deleteNonChr9()
Dim lastrow As Long
Dim firstrow As Long
Dim i As Long
lastrow = 401094
firstrow = 0
' Increment bottom of sheet to upwards
For i = lastrow To firstrow Step -1
If (Range("C1").Offset(i, 0) <> "chr9") Then
Range("C1").Offset(i, 0).EntireRow.Delete
End If
Next i
End Sub
The fastest way to conditionally delete rows is to have them all at the bottom of the data block. Sorting them into that position and deleting is faster than individual looping or even compiling a discontiguous Union of rows to delete.
When any group or cells is contiguous (i.e. all together) Excel does not have to work as hard to get rid of them. If they are at the bottom of the Worksheet.UsedRange property, Excel doesn't have to calculate what to fill the empty space with.
Your original code did not allow for a column header text label in row 1 but I will account for that. Modify to suit if you do not have one.
These will turn off the three primary parasites of computing power. Two have already been addressed in the comments and answers, the third Application.EnableEvents property can also make a valid contribution to Sub procedure efficiency whether you have event driven routines or not. See the helper Sub procedure at the bottom for details.
Sample data²: 500K rows of random data in A:Z. ~33% Chr9 in column C:C. Approximately 333K randomly discontiguous rows to delete.
Union and delete
Option Explicit
Sub deleteByUnion()
Dim rw As Long, dels As Range
On Error GoTo bm_Safe_Exit
appTGGL bTGGL:=False 'disable parasitic environment
With Worksheets("Sheet1")
Set dels = .Cells(.Rows.Count, "C").End(xlUp).Offset(1)
For rw = .Cells(.Rows.Count, "C").End(xlUp).Row To 2 Step -1
If LCase$(.Cells(rw, "C").Value2) <> "chr9" Then
Set dels = Union(dels, .Cells(rw, "C"))
End If
Next rw
If Not dels Is Nothing Then
dels.EntireRow.Delete
End If
End With
bm_Safe_Exit:
appTGGL
End Sub
Elapsed time: <It has been 20 minutes... I'll update this when it finishes...>
Bulk load from worksheet to variant array, change, load back, sort and delete
Sub deleteByArrayAndSort()
Dim v As Long, vals As Variant
On Error GoTo bm_Safe_Exit
appTGGL bTGGL:=False 'disable parasitic environment
With Worksheets("Sheet1")
With .Cells(1, 1).CurrentRegion
.EntireRow.Hidden = False
With .Resize(.Rows.Count - 1, .Columns.Count).Offset(1, 0)
'bulk load column C values
vals = .Columns(3).Value2
'change non-Chr9 values into vbNullStrings
For v = LBound(vals, 1) To UBound(vals, 1)
If LCase$(vals(v, 1)) <> "chr9" Then _
vals(v, 1) = vbNullString
Next v
End With
'dump revised array back into column C
.Cells(2, "C").Resize(UBound(vals, 1), UBound(vals, 2)) = vals
'sort all of blank C's to the bottom
.Cells.Sort Key1:=.Columns(3), Order1:=xlAscending, _
Orientation:=xlTopToBottom, Header:=xlYes
'delete non-Chr9 contiguous rows at bottom of currentregion
.Range(.Cells(.Rows.Count, "C").End(xlUp), .Cells(.Rows.Count, "C")).EntireRow.Delete
End With
.UsedRange 'reset the last_cell property
End With
bm_Safe_Exit:
appTGGL
End Sub
Elapsed time: 11.61 seconds¹       (166,262 rows of data remaining²)
Original code
Elapsed time: <still waiting...>
Summary
There are obvious advantages to working within a variant array as well as deleting contiguous ranges. My sample data had ~66% of the rows to delete so it was a harsh task master. If there were 5 or 20 rows to delete, using an array to parse data for a sort may not be the best solution. You will have to make your own decisions based on your own data.
appTGGL helper Sub procedure
Public Sub appTGGL(Optional bTGGL As Boolean = True)
With Application
.ScreenUpdating = bTGGL
.EnableEvents = bTGGL
.Calculation = IIf(bTGGL, xlCalculationAutomatic, xlCalculationManual)
End With
Debug.Print Timer
End Sub
¹ Environment: old business class laptop with a mobile i5 and 8gbs of DRAM running WIN7 and Office 2013 (version 15.0.4805.1001 MSO 15.0.4815.1000 32-bit) - typical of the low end of the scale for performing this level of procedure.
² Sample data temporarily available at Deleting entire row cannot handle 400,000 rows.xlsb.
Toggling ScreenUpdating and Calculation will help. But as Jeeped stated, applying a custom sort order is the way to go.
Sub deleteNonChr9()
Dim lastrow As Long
Dim firstrow As Long
Dim i As Long
lastrow = 401094
firstrow = 1
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
' Increment bottom of sheet to upwards
For i = lastrow To firstrow Step -1
If (Cells(i, "C") <> "chr9") Then
Rows(i).EntireRow.Delete
End If
Next i
Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic
End Sub
Major Progress
The following code for dealing with deleting a very large number of rows is inspired by Ron de Bruin - Excel Automation.
Sub QuickDeleteRows()
Dim Sheet_Data As Worksheet, NewSheet_Data As Worksheet
Dim Sheet_Name As String, ZeroTime As Double, Data As Range
On Error GoTo Error_Handler
SpeedUp True
Set Sheet_Data = Sheets("Test")
Sheet_Name = Sheet_Data.Name
LastRow = Cells(Rows.Count, "A").End(xlUp).Row
LastColumn = Cells(1, Columns.Count).End(xlToLeft).Column
Set Data = Sheet_Data.Range("A1", Cells(LastRow, LastColumn))
Set NewSheet_Data = Sheets.Add(After:=Sheet_Data)
Data.AutoFilter Field:=3, Criteria1:="=Chr9"
Data.Copy
With NewSheet_Data.Cells
.PasteSpecial xlPasteColumnWidths
.PasteSpecial xlPasteAll
.Cells(1, 1).Select
.Cells(1, 1).Copy
End With
Sheet_Data.Delete
NewSheet_Data.Name = Sheet_Name
Safe_Exit:
SpeedUp False
Exit Sub
Error_Handler:
Resume Safe_Exit
End Sub
Sub SpeedUp(SpeedUpOn As Boolean)
With Application
If SpeedUpOn Then
.ScreenUpdating = False
.EnableEvents = False
.Calculation = xlCalculationManual
.DisplayStatusBar = False
.DisplayAlerts = False
Else
.ScreenUpdating = True
.EnableEvents = True
.Calculation = xlCalculationAutomatic
.DisplayStatusBar = True
.DisplayAlerts = True
End If
End With
End Sub
While my old version of code takes time rather long (about 130 seconds on average) for handling sample data provided by Jeeped, but the code above completes less than 4.6 seconds for handling 400,000 rows of sample data on my machine. It's a very significant gain in performance!
System Information of my PC (Very Minimum Computer Configurations for Students)
Operating System: Windows 7 Professional 32-bit (6.1, Build 7601)
Service Pack 1
System Manufacturer: Hewlett-Packard
System Model: HP Pro 3330 MT
Processor: Intel(R) Core(TM) i3-2120 CPU # 3.30GHz (4
CPUs), ~3.3GHz
Memory: 2048MB RAM
Original Answer
I'm aware that this answer is not really what the OP wants, but maybe this answer can be useful for other users and helpful to future users, if not the OP. Please see this answer as the alternative method.
Copy/paste, cut/insert, and delete entire row operations in Excel can take an excessively long time even when doing it in VBA Excel. For copy/paste and cut/insert operations the cause of the slowness is the formatting of the data itself. Memory over-allocation is another cause of those operations. So how do we resolve a situation like this? There are several things you can look for speeding up your code.
Use arrays instead of the range of cells. It's usually considered to be faster than working on the range of cells because it ignores the formatting of the data in cells.
Use .Value2 rather than the default property (.Value) because .Value2 will only treat all formatting numbers (currency, accounting, date, scientific, etc) as Doubles.
Suppose we have 10,000 rows of dummy data like the following dataset:
Instead of deleting entire rows of "non-chr9" data, I simply ignore those data and only consider the "chr9" data by copying all the "chr9" data into an array. How to code to implement such task? First of all, we must make a copy of our data to avoid data loss because we cannot undo all changes to recover the original data after running VBA Excel.
It seems you have done all the preparations needed. Now, we may start coding by first declaring every variable we need to the appropriate type of data.
Dim i As Long, j As Long, k As Long
Dim LastRow As Long, LastColumn As Long, LengthDataChr9 As Long
If you don't declare the variables, your code will run with those variables defaulting to the Variant type. While Variant can be enormously useful, but it can make your code slow. So, make sure each variable is declared with a sensible type. This is good programming practice and considerably faster.
Next, we determine all variables we will use to construct the size of arrays. We will need
LastRow = Cells(Rows.Count, "A").End(xlUp).Row
LastColumn = Cells(1, Columns.Count).End(xlToLeft).Column
LastRow and LastColumn are the row and column number of the last cell with data in one row or one column. Keep in mind, LastRow and LastColumn may not give you the desired row and column number if you are not setting them up properly or using a well-formatted data sheet. What I mean by a "well-formatted data sheet", is a worksheet with data that starts in cell A1 and the number of the rows in column A and columns in row 1 must be equal to the range of all data. In other words, the size of the range of all data must be equal to LastRowxLastColumn.
We also need the length of the array for storing all the "chr9" data. This can be done by counting all the "chr9" data using the following statement:
LengthDataChr9 = Application.CountIf(Columns("C"), "chr9")
We now know the size of the arrays and we can redimension it. Add the following code lines:
ReDim Data(1 To LastRow, 1 To LastColumn)
ReDim DataChr9(1 To LengthDataChr9, 1 To LastColumn)
We use ReDim instead of Dim because we use the dynamic arrays. VBA Excel has automatically declared the arrays defaulting to the Variant type, but they have no size yet. Next, we copy the data into the array Data by using statement
Data = Range("A1", Cells(LastRow, LastColumn)).Value2
We use .Value2 to improve the performance of the code (See speeding up tips point 2 above). Since the data has already copied to the array Data we may clear the worksheet data so we can use it to paste DataChr9.
Rows("1:" & Rows.Count).ClearContents
To clear everything (all contents, formats, etc.) on the worksheet, we may use Sheets("Sheet1").Cells.Clear or Sheet1.Cells.Clear. Next, we want the code to loop through the elements array Data in column 3 by using For ... Next statement because the desired data we're looking for are located there. If the element of array Data contains string "chr9" is found, the code then copying all the elements in the row where "chr9" is located into DataChr9. Again we use For ... Next statement. Here are the lines for implementing those procedures.
For i = 1 To UBound(Data)
If Data(i, 3) = "chr9" Then
j = j + 1
For k = 1 To LastColumn
DataChr9(j, k) = Data(i, k)
Next k
End If
Next i
where j = j + 1 is a counter for looping through the rows of DataChr9. The final step, we paste back all the elements of DataChr9 to the worksheet by adding this line to the code:
Range("A1", Cells(LengthDataChr9, LastColumn)) = DataChr9
and then you're done! Yay, finally!
OK, let's compile all the lines code above. We obtain
Sub DeleteNonChr9()
Dim i As Long, j As Long, k As Long
Dim LastRow As Long, LastColumn As Long, LengthDataChr9 As Long
LastRow = Cells(Rows.Count, "A").End(xlUp).Row
LastColumn = Cells(1, Columns.Count).End(xlToLeft).Column
LengthDataChr9 = Application.CountIf(Columns("C"), "chr9")
ReDim Data(1 To LastRow, 1 To LastColumn)
ReDim DataChr9(1 To LengthDataChr9, 1 To LastColumn)
Data = Range("A1", Cells(LastRow, LastColumn)).Value2
Rows("1:" & Rows.Count).ClearContents
For i = 1 To UBound(Data)
If Data(i, 3) = "chr9" Then
j = j + 1
For k = 1 To LastColumn
DataChr9(j, k) = Data(i, k)
Next k
End If
Next i
Range("A1", Cells(LengthDataChr9, LastColumn)) = DataChr9
End Sub
The performance of the code above is satisfying. It takes less than 0.5 seconds on average to complete the process of extracting all "chr9" data from 10,000 rows dummy data on my machine.

How to speed up Copy Paste Values when using Offset

I want make this basic function of "copy&paste-values-on-a-new-row-each-time" run as fast as possible since the macro repeats the calculations hundreds of thousands of times. I just can't find the exact answer after searching this forum for ages.
Currently, I'm copying output numbers from a fixed range and, elsewhere on the worksheet, pasting the values on a new row for each new set of results.
Here's the portion of the code doing this:
Row = Row +1
Range("g15:ax15").copy
Range("ea18").select
ActiveCell.Offset(Row,0).select
Selection.PasteSpecial Paste:=xlPasteValues
Now from what I have found on this forum, I can replace the Copy/Paste functions completely with Range(destination).value = Range(results).value to speed things up. However, I can't figure out how to do this if the destination rows need to be offset by 1 each time. Also, I've read that one could even do away with "select" to speed things up further! How?
There are a number of options:
//This uses the `Destination` key word
Sub CopyAndPaste()
Dim i as long
For i = 1 to 10
Range("g15:ax15").Copy Destination:=Range("ea18").Offset(i, 0)
next i
End Sub
//If you need `PasteSpecial` then you cannot use `Destination` hence this version
Sub CopyAndPaste()
Dim i as long
For i = 1 to 10
Range("g15:ax15").Copy
Range("ea18").Offset(i, 0).PasteSpecial Paste:=xlPasteValues
next i
End Sub
Sometimes reading values into an array first and then writing back to the spreadsheet is quicker. Here is an example:
Sub CopyAndPaste()
Dim i As Long, numbers As Variant, rw As Long
numbers = Range("g15:ax15")
rw = 18
For i = 1 To 10
rw = rw + 1
Range(Cells(rw, 131), Cells(rw, 131 + UBound(numbers, 2) - 1)) = numbers
Next i
End Sub
You can do it without copying as yo mention (using a variant array as you are copying values only, not formats)
X = Range("g15:ax15").Value2
[ea18].Offset(1, 0).Resize(UBound(X, 1), UBound(X, 2)).Value2 = X
or with your variable offset
Dim lngCnt As Long
lngCnt = lngCnt + 1
X = Range("g15:ax15").Value2
[ea18].Offset(lngCnt, 0).Resize(UBound(X, 1), UBound(X, 2)).Value2 = X
Row = Row +1
Range("g15:ax15").copy
Range("ea18").Offset(Row,0).PasteSpecial Paste:=xlPasteValues
Select is a more-or-less useless method inherited from recordings.

How to delete a row if it contains a string?

I have only one column of data. I need to write a macro that would go through all the values and delete all rows that contain the word "paper".
A B
1 678
2 paper
3 3
4 09
5 89
6 paper
The problem is that the number of rows is not fixed. Sheets may have different number of rows.
Here is another simple macro that will remove all rows with non-numeric values in column A (besides row 1).
Sub DeleteRowsWithStringsInColumnA()
Dim i As Long
With ActiveSheet '<~~ Or whatever sheet you may want to use the code for
For i = .Cells(.Cells(.Rows.Count, 1).End(xlUp).Row, 1).Row To 2 Step -1 '<~~ To row 2 keeps the header
If IsNumeric(.Cells(i, 1).Value) = False Then .Cells(i, 1).EntireRow.Delete
Next i
End With
End Sub
If you're confident that the rows in question would always contain "paper" specifically and never any other string, you should match based on the value paper rather than it being a string. This is because, particularly in Excel, sometimes you may have numbers stored as strings without realizing it--and you don't want to delete those rows.
Sub DeleteRowsWithPaper()
Dim a As Integer
a = 1
Do While Cells(a, 1) <> ""
If Cells(a, 1) = "paper" Then
Rows(a).Delete Shift:=xlUp
'Row counter should not be incremented if row was just deleted
Else
'Increment a for next row only if row not deleted
a = a + 1
End If
Loop
End Sub
The following is a flexible macro that allows you to input a string or number to find and delete its respective row. It is able to process 1.04 million rows of simple strings and numbers in 2.7 seconds.
Sub DeleteRows()
Dim Wsht As Worksheet
Dim LRow, Iter As Long
Dim Var As Variant
Var = InputBox("Please specify value to find and delete.")
Set Wsht = ThisWorkbook.ActiveSheet
LRow = Wsht.Cells(Rows.Count, 1).End(xlUp).Row
StartTime = Timer
Application.ScreenUpdating = False
With Wsht
For Iter = LRow To 1 Step -1
If InStr(.Cells(Iter, 1), Var) > 0 Then
.Cells(Iter, 1).EntireRow.Delete
End If
Next Iter
End With
Application.ScreenUpdating = True
Debug.Print Timer - StartTime
End Sub

Resources