Find & Replace for large list in Excel - excel

I would like to find and replace a large list of words in excel; If I have the set of words I'd like to find/search through in Sheet1, ColumnA and then if Sheet2, ColumnA reflects what is to be found and ColumnB contains the word(s) to replace found word with (all comma separated values), how do I go about doing this so that the replacements end up back in Sheet1 ColumnA?
I suspect this requires a macro, which I am not very familiar with.
Many thanks in advance for your time and assistance!

It's not super efficient but it will get the job done. You will have to account for the length of your two lists as well as if you change your sheet names.
Public Sub findsometext()
For i = 1 To 10
' change 10 to however many items are in your replacement list
' start at 2 if your data has headers
Worksheets("Sheet2").Activate
target = Cells(i, 1)
replacer = Cells(i, 2)
Worksheets("Sheet1").Activate
For j = 1 To 10
' change 10 to however many items are in your data list to be processed
' start at 2 if your data has headers
Cells(j, 1) = Replace(Cells(j, 1), target, replacer)
Next j
Next i
End Sub

Related

Optimizing Excel CountIfs - can I make it go faster?

I have some larger files I need to validate the data in. I have most of it automated to input the formulas I need automatically. This helps eliminate errors of copy and paste on large files. The problem is with this latest validation.
One of the latest validations involves counting the number of rows that match 3 columns. The 3 columns are in Sheet 2 and the rows to count are in Sheet 1. Then compare this count with an expected number based on Sheet 2. It is easy enough to do with CountIFs, but there are large files and it can take up to an hour on some of them. I am trying to find something faster.
I am using a smaller file and it is still taking about 1 minute. There are only about 1800 rows.
I have something like this:
In Check1 I am using: =COUNTIFS(Sheet1!A:A,A2,Sheet1!B:B,B2,Sheet1!C:C,C2)
My code puts that formula in the active cell. Is there a better way to do this?
Is there anyway - using VB or anything - to improve the performance.
When the rows start getting into the 10's of thousands it is time to start this and get lunch. And, then hope it is done when I get back to my desk!
Thanks.
You basically have to iterate over all rows for each column, this is expensive. You might be able to split this into two tasks:
Merge your Columns A-C into one value =CONCAT(A2,B2,C2)
Then do only a single countif on this column =COUNTIF(D:D,D2)
That way you get rid of two (time) expensive countifs at the cost of the new concat.
You should narrow the range CountIf acts on from entire columns to the actual used range
And your code could write the result of the formula instead of the formula itself
Like follows:
With Sheet1
Set sheet1Rng = Intersect(.UsedRange, .Range("A:C"))
End With
With Sheet2
For Each cell in Intersect(.UsedRange, .Range("A:A"))
cell.Offset(,3) = WorksheetFunction.CountIfs(sheet1Rng.Columns(1), cell.Value, sheet1Rng.Columns(2), cell.Offset(,1).Value, sheet1Rng.Columns(3),cell.Offset(2).Value)
Next cell
End With
I set up a mock sheet, using a layout similar to what you show, with 10,000 rows, and manually filled it with the COUNTIFS formula you show. Changing a single item in the data triggered a recalculation which took about ten seconds or so to execute.
I then tried the following macro, which executed in well under one second. All of the counting is done within the VBA macro. So this Dictionary method may be an answer to your speed problems.
Before running this, you may want to set the Calculation state to Manual (or do it in the code) if you have COUNTIFS on the worksheet.
Option Explicit
'set reference to Microsoft Scripting Runtime
Sub CountCol123()
Dim DCT As Dictionary
Dim V As Variant
Dim WS As Worksheet, R As Range
Dim I As Long
Dim sKey As String
Set WS = Worksheets("sheet2")
'read the info into an array
With WS
Set R = .Range(.Cells(1, 1), .Cells(.Rows.Count, 1).End(xlUp)).Resize(columnsize:=4)
V = R
End With
'Get count of the matches
Set DCT = New Dictionary
For I = 2 To UBound(V, 1)
sKey = V(I, 1) & "|" & V(I, 2) & "|" & V(I, 3)
If DCT.Exists(sKey) Then
DCT(sKey) = DCT(sKey) + 1
Else
DCT.Add Key:=sKey, Item:=1
End If
Next I
'Get the results and write them out
For I = 2 To UBound(V, 1)
sKey = V(I, 1) & "|" & V(I, 2) & "|" & V(I, 3)
V(I, 4) = DCT(sKey)
Next I
'If you have COUNTIFS on the worksheet when testing this,
' or any other formulas that will be triggered,
' then uncomment the next line
'Application.Calculation = xlCalculationManual
With R
.EntireColumn.Clear
.Value = V
End With
End Sub
The Excel alternative named Cell in Hancom Office 2020 is insanely fast at countifs. Not sure why. On my i7-5775C, Excel 2019 takes 90 seconds for a countifs with two criteria for populating 10,000 rows with the results. Using Cell, the exact same operation completes in less than 28 seconds. I have verified that the results match those generated by Excel 2019.

Excel VBA - Search through list and pop-up message

I would like to ask for some help regarding the following problem:
I have a workbook with two sheets. Sheet 2 contains 2 lists with phrases in them and Sheet 1 has some data. I want to create a macro that checks the data in Sheet 1 and if it finds a phrase that does not appear in the lists in Sheet 2 it gives a pop-up message. Furthermore, Sheet 1 will contain phrases separated by commas. Is it possible to also check that as well? (see screenshot)
Although really simple, I have worked out the following, however, I am sure the answer is more complicated.
Any help is appreciated!
Thank you in advance!
My code:
Sub check()
For i = 2 To 2
If Cells(i, "A") <> Worksheets("Sheet2").Cells(i, "B") Then
MsgBox "Phrase 1 does not match"
End If
For j = 2 To 2
If Cells(j, "B") <> Worksheets("Sheet2").Cells(j, "C") Then
MsgBox "Phrase 2 does not match"
End If
Next j
Next i
End Sub
The VBA code should look like this
For all cells in column 'Phase 1'
Use the SPLIT() function to split the comma separated values
and store them into a list of distinct values
For all distinct values in the list
Use the VLOOKUP function on the 'Phrase list 1' column in sheet2,
searching for the current value
If vlookup returns a valid result then
do nothing
else
pop up message
end if
end loop of distinct values
end loop of cells in column 'Phase 1'

Merging text in two columns

How to locate and merge text in two columns on my spreadsheet and delete the second one?
The first one has the title Postnr and the second one has the title Postort. I want to merge the text in these columns with two spaces between the original text.
Example:
| Postnr | Postort |
| 752 65 | Gothenburg |
Result after I run the code:
Postaddress
752 65 Gothenburg
My code to find and select Postnr
Dim rngPostnr As Range
Set rngPostnr = Range("A1:Z1").Find("Postnr")
Range(rngPostnr, rngPostnr.End(xlDown)).Select
I understand how to do it in the sheet, but I want a macro since I do this many many times a day.
I don't want to locate and mark these columns manually since my sheets have many columns.
I need a macro that locates the columns and concatenates them and removes them and make a new column with the concatenated values, preferably with the header Postaddress.
I understand you dont know where the column is located (but its in the first row).
First iteration on the first row will find you the columns, then iterate over the values per the found column numbers, build the new string and write it in column 10 (or any other column). You can add further logic per your needs, such as, when to stop iterating (when one lastRow is bigger than the other, etc...)
Here is the basic working code that you can expend later:
Sub findAndConcat()
'last column with data
lColumn = Cells(1, Columns.Count).End(xlToLeft).Column
'find Postnr and Postort columns
For i = 1 To lColumn
If Cells(1, i) = "Postnr" Then
PostnrColumn = i
'Convert from column number to column letter
PostnrColumnName = Split(Cells(, PostnrColumn).Address, "$")(1)
End If
If Cells(1, i) = "Postort" Then
PostortColumn = i
PostortColumnName = Split(Cells(, PostortColumn).Address, "$")(1)
End If
Next i
Dim PostnrValue As String
Dim PostortValue As String
Dim newString As String
'last row with data per column
PostnrLastRow = ActiveSheet.Cells(Rows.Count, PostnrColumnName).End(xlUp).Row
PostortLastRow = ActiveSheet.Cells(Rows.Count, PostortColumnName).End(xlUp).Row
'Iterating the columns rows and building the new concatinated string
For i = 2 To PostnrLastRow
PostnrValue = Cells(i, PostnrColumn).Value
PostortValue = Cells(i, PostortColumn).Value
newString = PostnrValue & " " & PostortValue
ActiveSheet.Cells(i, 10).Value = newString
Next i
End Sub
Here is the result:
If you want to do that inside the sheet, then on the formula barn you should use concatanate like the following: =CONCATENATE(A2;" ";B2)
If you want to achieve that in vba you have to use something like the following:
ActiveCell.Offset(0, 3).Value = ActiveCell.Value & " " & ActiveCell.Offset(0, 1).Value
Where activecell (Postnr) is the first cell, and the next cell (offset 0,1|Postort) is the second part you want to concatenate and the result is inserted further down into a new column.
Edit: Here is the results to understand the example given:
Edit 2: Because you question is not clear enough I will add another way. Maybe you want to "add" the second column to the first (merge the columns into one).
You can do that using vba only.
You have to do something like that:
ActiveCell.Value = ActiveCell.Value & " " & ActiveCell.Offset(0, 1).Value
I understand how to do it in the sheet, but I want to have macro that does it since I do this many many times a day.
I also understand your vba but for this to work I need to mark the cells or does it locate the postnr and postadress columns by itself. I dont want to have to locate and mark these columns myself since this is time consuming because my sheets have many many columns. Do you understand what I mean?
I need like a macro that locates the columns and concatenates them and removes them and make a new column with the concatenated values, preferably with the header Postaddress

How to concatenate based on number of duplicates - MS Excel

Is there a way to concatenate multiple columns if the a row is duplicate? I have a spreadsheet where column A has duplicate team but there area and LD (column b and c) are different value. I would like to have a formulate at column E where it will concatenate column B and C with dash and append next row values. See the attached picture highlighted row E. Any idea how to do this with excel formula or may be VBA. I tried this formula in column E =IF(A3=A4,D3&";"&D4) but it returns false for the last duplicate row.
This is not possible with formulas. It requires a VBA-based solution.
I wrote a custom routine for you. Please place this in a standard code module:
Public Sub ConcatTeamZones()
Const SOURCE = "A1"
Const OUTPUT = "E1"
Dim i&, j&, s$, v, w
v = Range(SOURCE).CurrentRegion
ReDim w(1 To UBound(v), 0)
For i = 2 To UBound(w)
If v(i, 1) <> v(i - 1, 1) Then
w(i - 1, 0) = s
s = s & v(i, 2) & "-" & v(i, 3)
s = ""
Else
s = s & ";"
End If
s = s & v(i, 2) & "-" & v(i, 3)
Next
w(i - 1, 0) = s
Range(OUTPUT).Resize(UBound(w)) = w
End Sub
And then from the worksheet with your team data, press ALT-F8 to bring up the Macro Dialog. Run the ConcatTeamZones macro.
Note 1: this assumes that column A is sorted.
Note 2: You can edit the first two lines to specify which columns contains the source (team data) and which column you wish the output.
It can be done using formulas, it’s just a matter of perspective:
Assuming data is sorted by Team
This formula gives the concatenated result with the maximum of combinations on top. Enter this formula in cell E2 and copy till last record.
=CONCATENATE($D2,IF(EXACT($A2,$A3),";"&$E3,""))
To assign the max possible combinations to each Team enter this formula in F2 and copy till last record.
=INDEX($E:$E,MATCH($A2,$A:$A,0),0)
Here's how I would do it...
Cell "A1": =COUNTIF(B$2:B2,B2)&B2 - This is to create a unique key. Copy down the length of your table
Then I would use an advanced query (with vba maybe) to create a list of unique values for team in the "F" column
Cell "G2": =VLOOKUP("1"&F2,A:D,3,0)&"-"&VLOOKUP("1"&F2,A:D,4,0)&IF(ISERROR(VLOOKUP("2"&F2,A:D,3,0)),"",", "&VLOOKUP("2"&F2,A:D,3,0)&"-"&VLOOKUP("2"&F2,A:D,4,0))&IF(ISERROR(VLOOKUP("3"&F2,A:D,3,0)),"",", "&VLOOKUP("3"&F2,A:D,3,0)&"-"&VLOOKUP("3"&F2,A:D,4,0))&IF(ISERROR(VLOOKUP("4"&F2,A:D,3,0)),"",", "&VLOOKUP("4"&F2,A:D,3,0)&"-"&VLOOKUP("4"&F2,A:D,4,0))
This function creates your combined references. It would be longer if you expected more than 4 occurrences of teams.
Just copy "IF(ISERROR(VLOOKUP("4"&F2,A:D,3,0)),"",", "&VLOOKUP("4"&F2,A:D,3,0)&"-"&VLOOKUP("4"&F2,A:D,4,0))" and change the "4" to "5" etc
You could hide column A (to tidy up).
Sorry, I tried to include an image but insufficient reputation :-)

Excel VBA count cells until a date is found

I've got an amount of data copied from a table in a .pdf that when pasted into excel puts it all into one column. There are actually multiple pages each with it's own table (the data is one continuous long table split over multiple pages more accurately) and at the top of each page is a series of lines that I'm not interested in (the same unwanted data is at the top of each page). What I am interested in is re-sorting the data under the headers as it is in the table on the original .pdf document, removing the headers in the process. The data as it has been pasted into one column essentially is a list of items in plain text for x rows, followed by a list of start dates for x rows, and then a list of end dates for x rows, repeated every page.
I've figured out how to count the number of lines I don't want by getting a macro to look for the first piece of data I'm interested in ("AAAA") starting at cell (B2);
Cells(2, 2).Select
For i = 1 To 50
If ActiveCell = "AAAA" Then
Exit For
End If
ActiveCell.Offset(1, 0).Select
Next i
Cells(2, 3) = i
If i = 51 Then
Range("B3") = "Cannot find data"
End If
Which starts a search at cell (B2) looking downwards until it finds "AAAA" it then prints how many rows it has moved downwards to find it in cell (C2).
I now wish to be able to start at the cell it has just found [(B34) in this case] and count downwards until it finds the first cell containing a date.
Ultimately I'll need to then count down the same number of cells to find the associated end date and print them all in one row, continuing for the entire column of data.
If anybody could help me with being able to start at the first cell "AAAA" and then count downwards until a date is found, that would be really helpful.
My biggest challeng is to understand what you want to be true. I tryed to make a list of the things what you want.
You have a PDF that when paste in Excel it transform all the
document in one column.
There is a header in each of the Excel pages that you want to delete.
After you find a header you want to find two dates, and they have the same distance from the header.
How I would do it:
For iCounter = 1 to Cells(1048576, 1).End(xlUp).Row
If Cells(iCounter,1) = "YOUR HEADER HERE" then
For kCounter = iCounter to Cells(1048576, 1).End(xlUp).Row
If IsDate(Cells(kCounter,1)) = true then
initialDate = Cells(kCounter,1)
endDate = Cells(2*kCounter-iCounter,1)
End if
Next kCounter
End if
Next iCounter
The following piece of code starts in cell A1 and searches downward until it finds a cell containing a date value. The code only searches until it reaches the last record in the first column (to avoid searching all the way down to the bottom of the sheet if no date is found).
Sub FindFirstDate()
Dim i As Long
For i = 1 To ActiveSheet.Cells(Rows.Count, 1).End(xlUp).Row
If IsDate(ActiveSheet.Cells(i, 1).Value) = True Then Exit For
Next i
MsgBox "The first cell with a date is " & ActiveSheet.Cells(i, 1).Address
End Sub
In this example the address of the cell with the first date in returned in a MsgBox.

Resources