Concat several columns using column names in Excel VBA

As shown in image_1, I have the raw data of a product in columns B to F. I want to add a column A that concatenates the "Model", "Year", and "Number" data into one string. I know I can achieve this simply with
[a2] = "=concat(B2,D2,F2)", followed by a fill-down. The problem is that the raw file I receive every day is inconsistent in the order of its columns, so I can't use a static line of code like the one above.
I could probably use a combination of a For loop and If/Else to test whether each column name equals "Model", "Year", or "Number" and, if so, grab its column number...
However, I'm wondering if there's a more direct and elegant way of achieving this. Any thoughts?

A simple approach: sort the columns by header, then concatenate by fixed column position. This assumes the set of columns is constant and only their order changes. If the number of columns varies, this code will not work.
Sub concat()
Dim rng As Range
With ThisWorkbook.Worksheets(1)
Set rng = .Range("A1").CurrentRegion
' columns sort
With .Sort
.SortFields.Clear
.SortFields.Add2 Key:=rng.Rows(1)
.SetRange rng
.Orientation = xlLeftToRight
.Apply
End With
Set rng = rng.Columns(1)
rng.Insert ' add cells at left for "Concat"
Set rng = rng.Offset(0, -1)
rng(1) = "Concat" ' add header
Intersect(rng, rng.Offset(1)).FormulaR1C1 = "=CONCAT(RC[2],RC[5],RC[3])"
End With
End Sub

If you want to add the "Concat" column and formula without reordering the columns, you can do that with VBA like this:
Sub Demo()
Dim ws As Worksheet
Dim colModel As Variant
Dim colYear As Variant
Dim colNum As Variant
Dim LastRow As Long
Set ws = ActiveSheet ' or any means you choose
LastRow = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row
If ws.Cells(1, 1) <> "Concat" Then
'Insert new column
ws.Columns(1).Insert
' New column header
ws.Cells(1, 1) = "Concat"
End If
' Get column positions
colModel = Application.Match("Model", ws.Rows(1), 0)
colYear = Application.Match("Year", ws.Rows(1), 0)
colNum = Application.Match("Number", ws.Rows(1), 0)
' Check if columns exist
If IsError(colModel) Then
MsgBox "Column ""Model"" not found", vbCritical + vbOKOnly, "Error"
Exit Sub
End If
If IsError(colYear) Then
MsgBox "Column ""Year"" not found", vbCritical + vbOKOnly, "Error"
Exit Sub
End If
If IsError(colNum) Then
MsgBox "Column ""Number"" not found", vbCritical + vbOKOnly, "Error"
Exit Sub
End If
' Insert Formula
ws.Range(ws.Cells(2, 1), ws.Cells(LastRow, 1)).FormulaR1C1 = "=RC[" & colModel - 1 & "]&RC[" & colYear - 1 & "]&RC[" & colNum - 1 & "]"
End Sub
Alternatively, you could use a formula in column A to find the column positions.
In Excel 365:
=XLOOKUP("Model",$1:$1,2:2,,0)&XLOOKUP("Year",$1:$1,2:2,,0)&XLOOKUP("Number",$1:$1,2:2,,0)
For versions before Excel 365:
=INDEX(2:2,MATCH("Model",$1:$1,0))&INDEX(2:2,MATCH("Year",$1:$1,0))&INDEX(2:2,MATCH("Number",$1:$1,0))

If the columns are always the same and only their order changes, sort by column headings first and then concatenate; that way the columns will always be in the same position.
If you have differing columns and the ones you are interested in are somewhere among them, then you could use the following formula:
=HLOOKUP("Heading_Name",Data_Range,Row_No,FALSE) to extract each of the columns you are interested in (Heading_Name, Data_Range and Row_No are placeholders). Concatenating the results gives you what you want and works for any arrangement of columns and any size of data, provided you declare the range properly.
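The header lookup that these answers rely on can be wrapped in a small helper. The following is a minimal sketch, assuming the headers sit in row 1 of the sheet; HeaderColumn and DemoHeaderColumn are hypothetical names used only for illustration.

```vba
Option Explicit

' Hypothetical helper: returns the 1-based column number of a header
' found in row 1 of ws, or 0 when the header is missing.
Function HeaderColumn(ws As Worksheet, headerText As String) As Long
    Dim pos As Variant
    ' Application.Match returns an error value (not a run-time error) on no match
    pos = Application.Match(headerText, ws.Rows(1), 0)
    If Not IsError(pos) Then HeaderColumn = CLng(pos)
End Function

Sub DemoHeaderColumn()
    Dim ws As Worksheet
    Dim c As Long
    Set ws = ActiveSheet ' assumption: the data is on the active sheet
    c = HeaderColumn(ws, "Model")
    If c = 0 Then
        MsgBox "Column ""Model"" not found"
    Else
        MsgBox "Model is in column " & c
    End If
End Sub
```

Returning 0 for a missing header keeps the caller's check to a single comparison instead of an IsError test per column.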

Related

VBA_Offset for Column with Header

I have data where I applied a filter to a column based on its header. Now I want to make some changes to the filtered data, i.e. the visible cells in the same column. In the code below, the filter is applied to the column with the header "Query Type", whose column letter is "E". Is it possible to base the offset on the column header instead of the column letter? The column changes every time. In the example below, how can E2 or E be replaced dynamically to follow the column with that header? I tried replacing "E" with FiltCol; however, it does not work.
Sub Filter()
Dim FiltCol As Variant
FiltCol = Rows("1:1").Find(What:="Query Type", LookAt:=xlWhole).Column
ActiveSheet.UsedRange.AutoFilter Field:=FiltCol, Criteria1:="Rejected"
ActiveSheet.Range("E2", Range("E" & Rows.Count).End(xlUp)).SpecialCells(xlCellTypeVisible).FormulaR1C1 = "Accepted"
End Sub
When you want to work with column numbers, you can use the .Cells property of the worksheet. Cells expects two parameters, row and column. The row is always a (Long) number; the column can be specified as a number or as the column letter(s).
The following expressions all refer to the same cell:
ActiveSheet.Range("D3")
ActiveSheet.Cells(3, 4)
ActiveSheet.Cells(3, "D")
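A quick way to check this equivalence is to print the address of each expression to the Immediate window. This is a small sketch, assuming the active sheet is a worksheet; DemoCellsAddressing is a hypothetical name.

```vba
Sub DemoCellsAddressing()
    With ActiveSheet
        Debug.Print .Range("D3").Address   ' $D$3
        Debug.Print .Cells(3, 4).Address   ' $D$3
        Debug.Print .Cells(3, "D").Address ' $D$3
    End With
End Sub
```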
Your code could look like
Sub Filter()
Dim FiltCol As Variant
With ActiveSheet
FiltCol = .Rows("1:1").Find(What:="Query Type", LookAt:=xlWhole).Column
Dim lastRow As Long
lastRow = .Cells(.Rows.Count, FiltCol).End(xlUp).row
.UsedRange.AutoFilter Field:=FiltCol, Criteria1:="Rejected"
Dim visibleCells As Range
On Error Resume Next ' Avoid runtime error if nothing is found
Set visibleCells = .Range(.Cells(2, FiltCol), .Cells(lastRow, FiltCol)).SpecialCells(xlCellTypeVisible)
On Error GoTo 0
If Not visibleCells Is Nothing Then
visibleCells.Value2 = "Accepted"
End If
End With
End Sub
cells(1,1).offset(2,3)
will get you from A1 to D3
lastRow = Cells(105000, FiltCol).End(xlUp).Row ' This is poor; see edit below
ActiveSheet.Range(Cells(2, FiltCol).Offset(0, 1), Cells(lastRow, FiltCol).Offset(0, 1)).SpecialCells(xlCellTypeVisible).FormulaR1C1 = "Accepted"
edit:
Better to dynamically describe the last row
lastRow = Cells(Cells.Rows.Count, FiltCol).End(xlUp).Row

How to add custom text to a cell based on a specific entry in a different cell?

My laboratory is capable of running 20+ different analyses, and we get contracts from about the same 15 companies to do a combination of these analyses. I created an Excel spreadsheet to keep track of the work as it comes in, where the columns are the 20 different analyses we can run and the rows are the companies. I type in either a checkmark or "NA", depending on whether that company requests that specific analysis. (Each company requests its own combination of analyses.)
I need some help with the following:
If I enter "Company 1" in cell A100, I want cell B100 to display "NA". If I enter "Company 2" instead, I want cell D100 to display "NA". And if I enter "Company 3", do nothing, for example. I am OK with adding the check marks manually, as there are other variables that need not be mentioned.
Now, I have been able to develop some toy solution in VBA to some extent (please see code below). However, I have two issues:
In order to run the code, I have to switch to the VBA editor and press F5 after every entry. Instead, I would like it to work like when using formulas for the cells. In other words, if I type in "Company 1" in any cell of column A and hit "Enter", I would like the "NA" to display automatically in the appropriate cells on the row. I guess I could record a macro for this, but the file is shared with many people and I would prefer to avoid that.
In the future I will need to add more companies and analyses, so I need a code I can quickly go in and update. Or maybe have a list of companies that I add to and link it somehow to my code.
Sub writeNA()
For i = 1 To 20 Step 1
x = Cells(i, 1).Value
If x = "Company 1" Then
Cells(i, 2).Value = "NA"
End If
If x = "Company 2" Then
Cells(i, 3).Value = "NA"
End If
If x = "Company 3" Then
Cells(i, 4).Value = "NA"
End If
Next
End Sub
Thank you!
You could add a Worksheet Change event handler, so that whenever the worksheet is changed, the function runs and adds "NA" where needed.
Here is the function that I used for proof of concept. It also adds "NA" when "Company 3" is entered (not sure if that is desired or not).
Private Sub Worksheet_Change(ByVal Target As Range)
Select Case Target.Text
Case "Company 1", "Company 2", "Company 3"
Target.Offset(0, 1).Cells.Value2 = "NA"
End Select
End Sub
Update Cells When Entering Values (Worksheet Change Event)
Usually the code has to be copied to different modules (if you want to use it in multiple worksheets). Optionally you can copy both codes into the sheet module.
Adjust the values in the constants section.
No need to run anything, it runs automatically.
If you already have values in the Criteria Column then do a copy/paste and the data will get updated.
Sheet module e.g. Sheet1
Option Explicit
Private Sub Worksheet_Change(ByVal Target As Range)
updateCompany Me, Target
End Sub
Standard Module e.g. Module1
Option Explicit
Sub updateCompany( _
ws As Worksheet, _
Target As Range)
Const ProcName As String = "updateCompany" ' used in the error message below
On Error GoTo clearError
Const CompanyList As String = "Company 1,Company 2,Company 3"
Const ColsList As String = "B,D,"
Const CriteriaList As String = "NA,NA,"
Const FirstRow As Long = 2
Const CritCol As String = "A"
Dim cel As Range
Dim rng As Range
' Define Processing Range (First Cell to Bottom-Most Cell (1048576)).
Set rng = ws.Columns(CritCol) _
.Resize(ws.Rows.Count - FirstRow + 1) _
.Offset(FirstRow - 1)
' Define Last Non-Empty Cell.
Set cel = rng.Find( _
What:="*", _
LookIn:=xlFormulas, _
SearchDirection:=xlPrevious)
' Validate Last Non-Empty Cell
' i.e. check if Processing Range contains a value.
If cel Is Nothing Then
GoTo ProcExit
End If
' Define Source Range (First Cell to Last Non-Empty Cell).
Set rng = rng.Resize(cel.Row - rng.Row + 1)
' Define Target Range.
Set rng = Intersect(Target, rng)
' Validate Target Range i.e. check if the change happened in Source Range.
If rng Is Nothing Then
GoTo ProcExit ' Change didn't happen in Source Range.
End If
' Write values from Company List to Company Array.
Dim Company() As String: Company = Split(CompanyList, ",")
' Write values from Columns List to Columns Array.
Dim Cols() As String: Cols = Split(ColsList, ",")
' Write values from Criteria List to Criteria Array.
Dim Criteria() As String: Criteria = Split(CriteriaList, ",")
Application.EnableEvents = False
' Write values to cells in rows of changed cells.
Dim CurrentMatch As Variant
' Loop through cells of Target Range (can be non-contiguous).
For Each cel In rng.Cells
' Check if current cell is not blank (Empty or "").
If Len(cel.Value) > 0 Then
' Try to find the value in current cell (Company) in Company Array.
CurrentMatch = Application.Match(cel.Value, Company, 0)
' If found...
If IsNumeric(CurrentMatch) Then
' Define the current index of the found value.
CurrentMatch = CurrentMatch - 1 ' -1 because 0-based.
' Check if the value in Columns Array is different than "".
If Cols(CurrentMatch) <> "" Then
' Write value from Criteria Array to cell in current row
' of the column found in Columns Array.
ws.Cells(cel.Row, Cols(CurrentMatch)) = Criteria(CurrentMatch)
Else
' The value in Columns Array is "".
End If
Else
' Couldn't find Company name in Company Array.
End If
Else
' Cell is blank or empty.
End If
Next cel
SafeExit:
Application.EnableEvents = True
ProcExit:
Exit Sub
clearError:
Debug.Print "'" & ProcName & "': Unexpected Error!" & vbLf _
& " " & "Run-time error '" & Err.Number & "':" & vbLf _
& " " & Err.Description
Resume SafeExit
End Sub

Deleting entire columns based on column headers

I am trying to delete all the columns in an Excel sheet except the columns with the headers of "Product code" "Size" and "Quantity".
I have written the following code.
Sub delcolumns()
Dim Rng As Range
Dim cell As Range
Set Rng = Range(("A1"), Range("A1").End(xlToRight))
For Each cell In Rng
If cell.Value <> "Product Code" Or "Size" Or "Quantity" Then cell.EntireColumn.Delete
Next cell
End Sub
After running the macro, the error says "Type mismatch".
You should work backwards when deleting rows or columns or you risk skipping over one or more.
Sub delcolumns()
Dim c As Long, cols As Variant
cols = Array("Product Code", "Size", "Quantity")
For c = Cells(1, Columns.Count).End(xlToLeft).Column To 1 Step -1
If IsError(Application.Match(Cells(1, c).Value, cols, 0)) Then
Columns(c).EntireColumn.Delete
End If
Next c
End Sub
'alternative
Sub delcolumns()
Dim c As Long
For c = Cells(1, Columns.Count).End(xlToLeft).Column To 1 Step -1
Select Case Cells(1, c).Value
Case "Product Code", "Size", "Quantity"
'do nothing
Case Else
Columns(c).EntireColumn.Delete
End Select
Next c
End Sub
As for your own code, there are a couple of problems.
If cell.Value <> "Product Code" Or "Size" Or "Quantity" Then cell.EntireColumn.Delete
The above line is improper syntax. Each criterion needs to be written out longhand.
If cell.Value <> "Product Code" Or cell.Value <> "Size" Or cell.Value <> "Quantity" Then cell.EntireColumn.Delete
See Is variable required instead of “or” for alternatives.
More importantly, your logic is flawed. If one column is Product Code, then it isn't Size or Quantity and it will get deleted. You actually want,
If cell.Value <> "Product Code" AND cell.Value <> "Size" AND cell.Value <> "Quantity" Then cell.EntireColumn.Delete
Using And instead of Or means that the column is deleted only if its header is none of the three.
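To see the And logic in isolation, the test can be pulled into a function and tried against sample headers. This is a minimal sketch; ShouldDelete and TestShouldDelete are hypothetical names, and the comparison is case-sensitive under VBA's default Option Compare Binary.

```vba
Option Explicit

' Hypothetical helper: True when the header is none of the three columns to keep.
Function ShouldDelete(headerText As String) As Boolean
    ShouldDelete = headerText <> "Product Code" _
        And headerText <> "Size" _
        And headerText <> "Quantity"
End Function

Sub TestShouldDelete()
    Debug.Print ShouldDelete("Size")   ' False: a kept column
    Debug.Print ShouldDelete("Price")  ' True: not one of the three
End Sub
```

With Or in place of And, every header would return True, because no header can equal all three names at once.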
You won't have to delete backwards using this code. This method tends to be more efficient since the deletion happens outside of the loop.
Say that you have 20 columns and intend to delete 17 of them (keeping the 3 columns that are needed). Deleting inside the loop means 17 separate deletions, each of which shifts the remaining columns.
Instead, keep track of the columns to delete using Union (a collection of cells) and then delete everything at once outside of the loop. No matter how many columns have to be deleted, you always do it in one operation rather than n operations. The larger the number of columns to be deleted, the greater the gains from using this method.
Option Explicit
Sub DeleteMe()
Dim ws As Worksheet: Set ws = ThisWorkbook.Sheets("Sheet1") '<-- Update sheet
Dim LC As Long, MyHeader As Range, DeleteMe As Range
LC = ws.Cells(1, ws.Columns.Count).End(xlToLeft).Column
For Each MyHeader In ws.Range(ws.Cells(1, 1), ws.Cells(1, LC))
Select Case MyHeader
Case "Product code", "Size", "Quantity"
Case Else
If Not DeleteMe Is Nothing Then
Set DeleteMe = Union(DeleteMe, MyHeader)
Else
Set DeleteMe = MyHeader
End If
End Select
Next MyHeader
If Not DeleteMe Is Nothing Then DeleteMe.EntireColumn.Delete
End Sub
Write back a resized array without backward loops
In addition to the valid solutions above, here is an alternative approach that uses the advanced features of the Application.Index function: all actions are executed within an array before it is written back to the sheet.
Method
The Application.Index function accepts not only single row and column numbers as arguments, but also row and column arrays, which allows some restructuring. The rows array contains the complete set of rows; the column array is built by a helper function, getColNums(), and contains the column numbers of the wanted titles "Product code", "Size" and "Quantity". You might find some interesting peculiarities of this function at Insert first column in datafield array without loops or API call.
Code example
This code example assumes a data range A1:F1000 which can be changed easily to your needs.
Sub RestructureColumns()
Dim rng As Range, titles(), v
titles = Array("Product code", "Size", "Quantity") ' << define wanted column titles
Set rng = ThisWorkbook.Worksheets("Sheet1").Range("A1:F1000") ' << change to wanted sheet and data range
' [1a] create 2-dim data field array (1-based)
v = rng.Value2
' [1b] filter out columns to be deleted, i.e. maintain the rest
v = Application.Index(v, Evaluate("row(1:" & rng.Rows.count & ")"), getColNums(v, titles))
' [2] write data field back to resized range
rng = "" ' clear lines
rng.Resize(UBound(v), UBound(v, 2)) = v ' write back only columns with predefined titles
End Sub
'Helper function getColNums()
Function getColNums(v, titles) As Variant()
' Purpose: return array of column numbers related to wanted titles, e.g. 1st, 3rd and 6th column
Dim tmpAr, title, foundCol, i& ' declare variables
ReDim tmpAr(0 To UBound(titles)) ' dimension array to titles length
For Each title In titles ' check the wanted titles only ...
foundCol = Application.Match(title, Application.Index(v, 1, 0), 0) ' ... against the complete title row
If Not IsError(foundCol) Then tmpAr(i) = foundCol: i = i + 1 ' if found add col no, increment counter
Next title
ReDim Preserve tmpAr(0 To i - 1) ' (redundant if all titles available)
getColNums = tmpAr ' return built array
End Function

How to require input to a column in Excel when another cell has a specific value?

I have a spreadsheet with Student Name, Race/Ethnicity, Gender, Degree, Major, Status, Year Started, and Career After Graduation as columns. Thank you for those who helped me with my codes for requiring input in columns Race/Ethnicity, Gender, and Degree if Student Name is provided in column A. Now I need to do something additional. If the value of Status in column F is "Graduated", I want Career After Graduation column (column H) to be filled out too. The closest codes I could come up with are listed below, and I now have a problem.
When Status in column F has a value of "Graduated", Excel not only requires a user to fill out Career After Graduation in column H, but also other columns. How should I modify the codes, so only column H will be required?
Thank you!
Option Explicit
Private Sub Workbook_BeforeClose(Cancel As Boolean)
Dim rngCell As Range, strBlanks As String
Application.ScreenUpdating = False
strBlanks = vbNullString
For Each rngCell In Worksheets("Sheet1").Range("F2:F20").Cells
If rngCell.Value = "Graduated" Then
If WorksheetFunction.CountA(rngCell.Offset(0, 2).Resize(1, 1)) < 1 Then
strBlanks = strBlanks & IIf(Len(strBlanks) > 0, ",", "") & _
Replace(rngCell.Offset(0, 2).Resize(1,1).SpecialCells(xlCellTypeBlanks).Address, "$", "")
End If
End If
Next
If Not strBlanks = vbNullString Then
MsgBox "Entries required in cells " & vbCrLf & vbCrLf & strBlanks
Cancel = True
Exit Sub
End If
End Sub
You could use AutoFilter to avoid looping through cells and do this as a one-shot operation:
Option Explicit
Private Sub Workbook_BeforeClose(Cancel As Boolean)
Dim toBeFilledAddress As String
With Worksheets("Sheet1") '<--| '<-- change "Sheet1" with your actual sheet name
With .Range("A1:H" & .Cells(.Rows.Count, 1).End(xlUp).Row) '<--| reference its range in columns A:H from row 1 to column "A" last non empty cell row
.AutoFilter field:=6, Criteria1:="Graduated" '<--| filter referenced range on its 6th column with "Graduated"
.AutoFilter field:=8, Criteria1:="" '<--|filter referenced range again on its 8th column with blanks
If Application.WorksheetFunction.Subtotal(103, .Resize(, 1)) > 1 Then toBeFilledAddress = .Offset(1, 7).Resize(.Rows.Count - 1, 1).SpecialCells(xlCellTypeVisible).Address(False, False) '<--| store all matching cells address
End With
.AutoFilterMode = False '<--| show all rows back
End With
If toBeFilledAddress <> "" Then '<--| if any cell other than header ones has been filtered...
MsgBox "Entries required in cells " & vbCrLf & vbCrLf & toBeFilledAddress
Cancel = True
Exit Sub '<--| this line could be avoided unless you're planning to add more lines after "End If"
End If
End Sub
Why do you use the following?
& vbCrLf & vbCrLf
It seems that your code works fine, but somehow those two variables are getting filled with other columns.
Also, which of the columns does it return besides H? And when is the user supposed to be prompted to enter the values?

Comparing two excelsheets for uncommon records based on common 'id' field(Column)

I am comparing two Excel sheets in the same workbook.
I want to check whether the records in Sheet1 are exactly the same as the records in Sheet2, based on a common Question_id (column A of both worksheets).
This Question_id (column A) has values such as
1
1a
1a.1
1a.1a
1a.1b
1a.1c
2
2a
2a.1
2a.1a
2a.1b
2a.1c etc....
I want to compare the records based on this Question_id (column A value).
If the Question_id is the same and the records (the rest of the row) are not the same, then I color those records with a red background (only the specific cells, not the whole row).
For this, I have the following code.
Sub RunCompare()
Call compareSheets("Sheet1", "Sheet2")
End Sub
Sub compareSheets(shtSheet1 As String, shtSheet2 As String)
Dim mycell As Range
Dim mydiffs As Integer
Application.ScreenUpdating = false
'Color Uncommon records in Red Background
For Each mycell In ActiveWorkbook.Worksheets(shtSheet2).UsedRange
If Not mycell.Value = ActiveWorkbook.Worksheets(shtSheet1).Cells(mycell.Row, mycell.Column).Value Then
mycell.Interior.Color = vbRed
mydiffs = mydiffs + 1
End If
Next
'Display no. of differences
MsgBox mydiffs & " differences found", vbInformation
ActiveWorkbook.Sheets(shtSheet2).Select
MsgBox "Data Scrubbed Successfully..."
Application.ScreenUpdating = True
End Sub
The above code runs fine when both sheets have the same sequence of Question_id (and therefore of records).
Now assume the two sheets have different sequences of Question_id (and therefore of records).
How can I achieve this then?
Something like a WHERE clause in my code: Where Sheet1.Question_id = Sheet2.Question_id
i.e. I'll pick up the question_id and the full row from sheet1 and compare it against the records in sheet2 based on the matching Question_id (value of column A) only.
Can someone tell me where to put the condition, and what type of condition, so that I can compare the records from sheet1 and sheet2 even if both sheets have random sequences of Question_id?
EDIT: on 23rd March 2015
I have changed the code to use the Find() method instead of loops, as below.
I still haven't arrived at a solution.
Here I am trying to list the Question_Ids of all non-matching rows from sheet2 in sheet3, column A.
Option Explicit
Sub test()
Dim rng As Range, c As Range, cfind As Range, mycell As Range, cfindRow As Range
On Error Resume Next
Worksheets("Sheet3").Cells.Clear
With Worksheets("Sheet2")
Set rng = .Range(.Range("A2"), .Range("a2").End(xlDown))
For Each c In rng
With Worksheets("Sheet1")
Set cfind = .Columns("A:A").Cells.Find _
(what:=c.Value, lookat:=xlWhole)
'Find method always returns Range; So the following line should be something If cfind is not Nothing OR cfind <> Nothing (Both the syntaxes are wrong. Suggest me the right syntax please.
If cfind = 1 Then
'Here please tell me how to reference a whole row based on Column A value
'Here using cfind and again using mycell is something wrong as mycell variable again compares rows in sheet2 with rows in sheet1 which include Question_Id too.
Set mycell = ActiveWorkbook.Worksheets("Sheet2").UsedRange.End(xlDown)
'My both the excelsheets have values from columns A to AD. Still I want to make the code for all used Ranges of columns instead of only A to AD.
Set cfindRow = Worksheets("Sheet1").Rows("A2:AD").Cells.Find _
(what:=mycell.Value, lookat:=xlWhole)
'Find method always returns Range; So the following line should be something If cfindRow is not Nothing OR cfindRow <> Nothing (Both the syntaxes are wrong. Suggest me the right syntax please.
If cfindRow = 1 Then
'MsgBox "Match Found" 'Right Now do Nothing
End If
Else
' mycell.Interior.Color = vbRed
' mydiffs = mydiffs + 1
'Copy the question numbers to sheet3 either if they are new in new sheet (Sheet2) or content against them (in the whole row-any column value) is changed.
cfind.Copy Worksheets("sheet3").Cells(Rows.Count, "A").End(xlUp).Offset(1, 0)
End If
End With
Next c
Application.CutCopyMode = False
End With
MsgBox "Data Scrubbed Successfully..."
End Sub
Can someone tell me how to refer to those ranges based on key column values?
My new approach towards a solution:
(It may be a hint for how to reference row values based on a key column.)
Get the row indices of the column A values (Question_Ids) in both sheets, i.e. c.Row and cfind.Row.
Then check If Sheet2.Cells(c.Row, Columns) = Sheet1.Cells(cfind.Row, Columns) (to compare columns against matching Question_Ids only).
So finally, this is all I am trying to achieve:
1) Compare two sheets based on a key column:
Pick up the Question_Id from Sheet2, column A, and compare it against column A in Sheet1. If the key columns from both sheets match and the contents against them (the complete row) also match, then do nothing.
If the key column value (Question_Id, column A) matches but the values (row) against it do not match, then color those specific cells (only the cells, not the whole row) with a red background.
The Question_Ids that are in Sheet2 but not in Sheet1 should be listed in the first column of Sheet3, starting from A2.
The Question_Ids that are in Sheet1 but not in Sheet2 should be listed in the second column of Sheet3, starting from B2.
I am basing my code off of your first approach, because I found it simpler and more readable than the second approach.
We'll just do the most naive algorithm, which is to iterate through every row in the used range of both worksheets. (The fastest algorithm would probably be to sort both ranges in memory and then compare, but simplicity of code over performance optimization for now.)
Sub compareSheets(shtSheet1 As String, shtSheet2 As String)
Dim range1 As Range, range2 As Range
Dim mydiffs As Long, row1 As Long, row2 As Long, col As Long 'Long avoids overflow past 32,767 rows
Application.ScreenUpdating = False
'First create the two ranges we will be using
Set range1 = ActiveWorkbook.Worksheets(shtSheet1).UsedRange
Set range2 = ActiveWorkbook.Worksheets(shtSheet2).UsedRange
'Iterate through the rows of both ranges
For row1 = 1 To range1.Rows.Count
For row2 = 1 To range2.Rows.Count
'Only process the ranges if they share a common key in column 1
If range1.Cells(row1, 1) = range2.Cells(row2, 1) Then
'If they share the same key, iterate through columns and compare
For col = 1 To WorksheetFunction.Max(range1.Columns.Count, range2.Columns.Count)
If Not range1.Cells(row1, col).Value = range2.Cells(row2, col).Value Then
range1.Cells(row1, col).Interior.Color = vbRed
range2.Cells(row2, col).Interior.Color = vbRed
mydiffs = mydiffs + 1
End If
Next
End If
Next
Next
'Display no. of differences
MsgBox mydiffs & " differences found", vbInformation
Application.ScreenUpdating = True
End Sub
There are some specifications I wasn't sure of. For example, what if a key is in one spreadsheet but not the other? Should it be colored red in the sheet where it exists?
Nevertheless, I think the above code should give you a good start to address your more conceptual questions, and I'm happy to help adjust as needed, so please comment if there are specific requirements I'm missing.
Update 1
Here's the update code after our discussion in chat (link in comments), which takes the unmatched keys from the full outer join and copies them to a third sheet.
Sub compareSheets(shtSheet1 As String, shtSheet2 As String, shtSheet3 As String)
Application.ScreenUpdating = False
Dim range1 As Range, range2 As Range
Dim myDiffs As Long, row1 As Long, row2 As Long, col As Long 'Long avoids overflow past 32,767 rows
Dim sheet3index1 As Long, sheet3index2 As Long, i As Long
Dim leftKeyMatched As Boolean 'Boolean to keep track of whether the key in sheet1 has a match as we are looping
Dim rightKeysMatched() As Boolean 'Array to keep track of which keys in sheet2 have matches
Set range1 = ActiveWorkbook.Worksheets(shtSheet1).UsedRange
Set range2 = ActiveWorkbook.Worksheets(shtSheet2).UsedRange
ReDim rightKeysMatched(range2.Rows.Count)
For row1 = 1 To range1.Rows.Count
leftKeyMatched = False
For row2 = 1 To range2.Rows.Count
If range1.Cells(row1, 1) = range2.Cells(row2, 1) Then
'We have a match, so mark both sides as matched
leftKeyMatched = True
rightKeysMatched(row2 - 1) = True 'This -1 is because the array indexing starts at 0 but the rows in the spreadsheet start at 1
For col = 1 To WorksheetFunction.Max(range1.Columns.Count, range2.Columns.Count)
If Not range1.Cells(row1, col).Value = range2.Cells(row2, col).Value Then
range1.Cells(row1, col).Interior.Color = vbRed
range2.Cells(row2, col).Interior.Color = vbRed
myDiffs = myDiffs + 1
End If
Next
End If
Next
'Print out the key from sheet1 if it didn't find a match in sheet2
If leftKeyMatched = False Then
sheet3index1 = sheet3index1 + 1
ActiveWorkbook.Worksheets(shtSheet3).Cells(sheet3index1, 1) = range1.Cells(row1, 1)
End If
Next
'Now print out any key that still hasn't been matched in sheet2
For i = 0 To range2.Rows.Count - 1 'one array entry per row, indexed from 0
If rightKeysMatched(i) = False Then
sheet3index2 = sheet3index2 + 1
ActiveWorkbook.Worksheets(shtSheet3).Cells(sheet3index2, 2) = range2.Cells(i + 1, 1) '+1 for same reason as above, index starts at 0 versus 1
End If
Next
'Display no. of differences
'MsgBox myDiffs & " differences found", vbInformation
Application.ScreenUpdating = True
End Sub
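The nested loops above compare every row of one sheet with every row of the other. As a different technique from the one in this answer, the keys of one sheet can be indexed once in a Scripting.Dictionary so that each key in the other sheet is looked up directly. The following is only a sketch under several assumptions: it needs Windows (the Microsoft Scripting Runtime, late-bound here), the sheets are named "Sheet1" and "Sheet2", row 1 holds headers, keys are in column A, and column A contains no error values. BuildKeyIndexSketch is a hypothetical name.

```vba
Option Explicit

Sub BuildKeyIndexSketch()
    Dim ws1 As Worksheet, ws2 As Worksheet
    Dim keyRows As Object ' late-bound Scripting.Dictionary: key -> row number in Sheet1
    Dim r As Long, lastRow As Long, key As String

    Set ws1 = ActiveWorkbook.Worksheets("Sheet1")
    Set ws2 = ActiveWorkbook.Worksheets("Sheet2")
    Set keyRows = CreateObject("Scripting.Dictionary")

    ' Index every key in Sheet1 column A (row 1 is assumed to be a header)
    lastRow = ws1.Cells(ws1.Rows.Count, 1).End(xlUp).Row
    For r = 2 To lastRow
        key = CStr(ws1.Cells(r, 1).Value)
        If Len(key) > 0 Then keyRows(key) = r
    Next r

    ' Look up every key of Sheet2 in the index
    lastRow = ws2.Cells(ws2.Rows.Count, 1).End(xlUp).Row
    For r = 2 To lastRow
        key = CStr(ws2.Cells(r, 1).Value)
        If keyRows.Exists(key) Then
            Debug.Print key & ": Sheet1 row " & keyRows(key) & ", Sheet2 row " & r
        Else
            Debug.Print key & ": only in Sheet2"
        End If
    Next r
End Sub
```

The pair of row numbers printed for each match is what the column-by-column comparison needs. Note that the dictionary compares keys case-sensitively by default, so "1a" and "1A" count as different keys.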
I'll take a crack at this
Sub compareSheets(shtSheet1 As String, shtSheet2 As String)
Dim mydiffs As Long
Dim ws1 As Worksheet
Dim ws2 As Worksheet
Dim rng As Range
Dim sourceRow As Long
Dim col As Long
Set ws1 = ActiveWorkbook.Worksheets(shtSheet1)
Set ws2 = ActiveWorkbook.Worksheets(shtSheet2)
mydiffs = 0
'Application.ScreenUpdating = False 'enable this later, once it's all working
'Color uncommon records in red background
'your key is in column A, so we'll only loop through that column
For sourceRow = 1 To ws2.UsedRange.Rows.Count
'look for the Sheet2 key in column A of Sheet1
Set rng = ws1.Columns(1).Find(What:=ws2.Cells(sourceRow, 1).Value, LookIn:=xlValues, _
LookAt:=xlWhole, MatchCase:=False)
'making an assumption on MatchCase, change as needed
If Not rng Is Nothing Then 'we found the key, now let's look at the rest of the row
col = 2
'loop through the rest of the columns for this row
While col <= ws2.UsedRange.Columns.Count
'if the cell in the row we just found on sheet1 <> the cell that we were looking for from sheet2
If rng.Cells(1, col).Value <> ws2.Cells(sourceRow, col).Value Then
rng.Cells(1, col).Interior.Color = vbRed
mydiffs = mydiffs + 1
End If
col = col + 1
Wend
Else
'we didn't find the key. pop up a msgbox. you may want something else
MsgBox "Sheet2 key: " & ws2.Cells(sourceRow, 1).Value & " not found on Sheet1"
End If
Next sourceRow
'Display no. of differences
MsgBox mydiffs & " differences found", vbInformation
ActiveWorkbook.Sheets(shtSheet2).Select
MsgBox "Data Scrubbed Successfully..."
Application.ScreenUpdating = True
End Sub
If you want to find a value in a range use the following:
.Find(What, After, LookIn, LookAt, SearchOrder, SearchDirection, MatchCase, MatchByte, SearchFormat)
Like this :
Application.ScreenUpdating = False
'On Error Resume Next 'Err.Numbers 9, 91 => Find: value not found
Dim findCell as range
ActiveWorkbook.Worksheets(shtSheet2).Select
ActiveWorkbook.Worksheets(shtSheet2).UsedRange.Select
'Color Uncommon records in Red Background
For Each mycell In ActiveWorkbook.Worksheets(shtSheet1).UsedRange
Set findCell = Selection.Find(What:=Trim(mycell.value & ""), LookIn:=xlValues)
If findCell Is Nothing Then
mycell.Interior.Color = vbRed
mydiffs = mydiffs + 1
End If
Next
Note :
Please change Application.ScreenUpdating = True to Application.ScreenUpdating = False
And for more information use this MSDN article
And to wrap this in a function like the one you want:
Public Function look_up_id (r as Range) As Variant
'
'Function body
'
End Function
'....
Call look_up_id(ActiveWorkbook.Worksheets(shtSheet2).Range("A:A", table))
'....

Resources