I first wrote a VBA script to compare two Excel files, then optimized it by reading the ranges into Variant arrays, as suggested in this question. Later I ported it to VBScript, where that method doesn't seem to work.
Are there any other better ways to speed up the process? Especially for large files.
My core code is as follows:
For Each cell In objxlWorksheet1.UsedRange
    If cell.Value <> objxlWorksheet2.Range(cell.Address).Value Then
        'Fill the cell with color if there is a mismatch and increment the counter
        objxlWorksheet2.Range(cell.Address).Interior.ColorIndex = 3
        counter = counter + 1
    End If
Next
It depends on what you are comparing. If you have two sheets with similar tables of data, it would be easier to use formulas instead of VBA code. Just create a new worksheet and enter a formula like this: =Sheet1!A1=Sheet2!A1. Then you can use Find (Ctrl+F) to search for FALSE.
Or if you can copy the data on one sheet side-by-side, you can use conditional formatting to highlight values that are different.
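The Variant optimization mentioned in the question amounts to reading each sheet into memory once and comparing the values there, instead of touching cells one at a time. Here is a minimal sketch of that comparison step, written in Python rather than VBScript so it can be run outside Excel. The names column_letters and mismatched_cells are made up for illustration, and the sketch assumes both grids have the same shape and start at cell A1.

```python
def column_letters(index):
    """Convert a 1-based column index to Excel letters (1 -> A, 27 -> AA)."""
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def mismatched_cells(grid1, grid2):
    """Compare two equally sized 2D lists and return the A1 addresses that differ."""
    mismatches = []
    for r, (row1, row2) in enumerate(zip(grid1, grid2), start=1):
        for c, (v1, v2) in enumerate(zip(row1, row2), start=1):
            if v1 != v2:
                mismatches.append(f"{column_letters(c)}{r}")
    return mismatches


# Hypothetical sheet contents, already read into memory in one go
sheet1 = [[1, 2], [3, 4]]
sheet2 = [[1, 9], [3, 4]]
print(mismatched_cells(sheet1, sheet2))  # ['B1']
```

Only the cells in the returned list then need to be colored, so the number of calls into Excel depends on the number of mismatches rather than the size of the sheet.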
I want to ask whether it is possible to have Excel return a complete row of raw data based on two variables. Let's say we have the following data:
What we wish to have is that based on the values "2018" and "A", excel should give out the complete row data automatically as done so in the yellow cells.
I know how to do it for one variable, where I have been using
Index(range,MATCH(value, range,0),column())
But I am having difficulty when there are two unique variables, based on which the row data must be extracted.
Currently, I do it in two steps: I first filter by year and then use the above formula to extract the row data for A or B. It is not a very good approach, and I would appreciate it if this could be done with a single formula.
Does anyone have any idea how it can be done without using a Pivot Table?
UPDATE
Regarding the suggestion of using VBA: it is a good option, since I can then just use the AutoFilter command. The problem is how to refer to cells in VBA, and how to have one piece of code cover two different columns.
My vba code which I have used for filtering the tables is the following:
Sub Autofilter_Filter12()
    Dim lo As ListObject
    Dim iCol As Long
    Set lo = Sheet3.ListObjects(1)
    iCol = lo.ListColumns("Year").Index
    With lo.Range
        ' The argument name is Criteria1 (ending in the digit 1, not the letter l)
        .AutoFilter Field:=iCol, Criteria1:="XXXX"
    End With
End Sub
Now the problem with the VBA code is:
it is only applied for one column and not both.
Instead of XXX, how can I define a cell into the VBA? I have tried but failed again and again.
Thank you for the help.
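If the range C:H always contains numbers, you can use SUMPRODUCT. Enter this in C7:
=SUMPRODUCT(($A$2:$A$5=$A$7)*($B$2:$B$5=$B$7)*C2:C5)
The first factor tests parameter 1, the second tests parameter 2, and the third is the value to return.
Then select C7:H7 and press Ctrl+R to fill the formula to the right.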
This results in this:
When no row matches, this returns 0.
That is not very nice, but it can be partially worked around by putting this in I7:
=IFERROR(IF(SUM(C7:H7)=0,"Filter failed",""),"")
In Excel 365, with dynamic arrays, you can match on multiple columns in a single MATCH by joining them with &:
=MATCH(A7&B7,A1:A6&B1:B6,0)
So you can use an INDEX/MATCH combination for your case (whether or not the values are numbers):
=INDEX(C$1:C$6,MATCH($A$7&$B$7,$A$1:$A$6&$B$1:$B$6,0))
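To make the logic of the two-key match explicit, here is a small sketch in Python (chosen only so it can be run outside Excel). The function name find_row and the sample rows are hypothetical and are not taken from the question's data. It returns the remaining columns of the first row where both keys match.

```python
def find_row(rows, year, label):
    """Return the remaining columns of the first row matching both keys, else None."""
    for row in rows:
        # Compare the two keys as a pair rather than as one joined string
        if (row[0], row[1]) == (year, label):
            return row[2:]
    return None


# Hypothetical sample data: year, label, then the values to return
rows = [
    [2017, "A", 10, 20],
    [2018, "A", 30, 40],
    [2018, "B", 50, 60],
]
print(find_row(rows, 2018, "A"))  # [30, 40]
```

One design note: comparing the keys as a pair avoids a pitfall of plain concatenation, where "1"&"23" and "12"&"3" both produce "123". In the formula, joining the columns with a separator character between them should avoid the same collision.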
I need to create a document in Excel that repeats the same column of data at multiple points in the sheet, but not at regular intervals. So I would like a function that both inserts new rows and pastes the data at the current active cell (the cell into which I enter the function, or similar). Based on some basic googling it felt like it should be fairly simple with some Range commands, but the farthest I've gotten is #VALUE! instead of compile errors. I'm guessing this is because the Range calls are not set up correctly in my code:
Sub insertcopy()
Range(Sheet2!A2, Sheet2!A82).Copy
Range.Insert
End Sub
I am automating a report build in Excel using VBA. As part of that process I use VLOOKUP to compare two lists. Tab 1 contains roughly 180,000 line items, each with a unique ID; the VLOOKUP takes that ID and looks it up against the "owners" in tab 2, which has roughly 250,000 line items. Run time for this operation is roughly 25-30 minutes, and I'm wondering if there is a faster way. Maybe I should perform this comparison using a script outside of Excel to reduce calculation time?
It's working fine so I haven't tried to troubleshoot. I have a few ideas around doing the work outside of excel, in the background but looking for ideas from the broader group.
Here is the line I'm using to perform the lookup now, it's repeated 5x in code.
Range("Table").Offset(1).Select
ActiveCell.FormulaR1C1 = "=IFNA(VLOOKUP([#ID],table,2,0),""Unassigned"")"
With each iteration of the above line the workbook recalculates, which is what takes the 30 minutes. I have tried setting calculation to xlManual and then back to xlAutomatic, with no luck. I was thinking I could just run a single worksheet calculation after the formulas were written.
Curious if anyone knows of a faster way to accomplish this. As I said the run time is 30mins for this section, and the total run time is 35-40mins.
If you can SORT your data, you can build a double VLOOKUP with the range_lookup parameter set to TRUE. This causes VLOOKUP to do a binary search which, on a large DB, may run 100x faster:
=IF(VLOOKUP(ID,Table,1,TRUE)=ID,VLOOKUP(ID,Table,2,TRUE),NA())
And if you are using the VLOOKUP method, you should be sure to turn off ScreenUpdating and also set Calculation to manual while you are populating the worksheet with the formulas.
Alternatively, it might be faster to just read the data into a VBA array or dictionary, and do all your lookup and matching within VBA. Again, if you can sort your list, you can use a binary search which will be much faster.
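As a rough illustration of both ideas, done outside Excel as the question suggests, here is a minimal sketch in Python. The names lookup_sorted and build_index and the sample data are assumptions made for the example. The first function is a binary search followed by an equality check, which mirrors the double VLOOKUP above. The second builds a dictionary once so that each lookup needs no sorting at all.

```python
from bisect import bisect_left


def lookup_sorted(sorted_ids, owners, wanted, default="Unassigned"):
    """Binary search in a sorted ID list, then confirm the hit is an exact match."""
    i = bisect_left(sorted_ids, wanted)
    if i < len(sorted_ids) and sorted_ids[i] == wanted:
        return owners[i]
    return default


def build_index(pairs):
    """Build an ID -> owner dictionary, keeping the first owner seen for each ID."""
    index = {}
    for key, owner in pairs:
        index.setdefault(key, owner)
    return index


# Hypothetical data standing in for the two tabs
ids = [101, 205, 309]
owners = ["Ann", "Bob", "Cy"]
print(lookup_sorted(ids, owners, 205))   # Bob
print(lookup_sorted(ids, owners, 206))   # Unassigned

index = build_index(zip(ids, owners))
print(index.get(309, "Unassigned"))      # Cy
```

The dictionary version keeps the first owner for a duplicated ID, which is meant to match what an exact-match VLOOKUP returns. In VBA the same role could be played by a Scripting.Dictionary filled from an array read of the owners table.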
Maybe try converting the result of your VLOOKUP formula to a value after each iteration, something like this:
Sub foo()
    Dim rngCell As Range
    For Each rngCell In Range("Table").Offset(1)
        rngCell.FormulaR1C1 = "=IFNA(VLOOKUP([#ID],table,2,0),""Unassigned"")"
        rngCell.Value = rngCell.Value
    Next rngCell
End Sub
This should prevent it from recalculating your VLOOKUP results.
Alternatively, use INDEX+MATCH combination, or - if your dataset is sorted - use VLOOKUP with match mode TRUE (approximate) instead of FALSE (exact).
I have a series of array formulas in Excel that key off of each other. These are automatically resized to fit a range of data that is generated via a proprietary Excel add-in.
However, whenever my code rewrites some of the array formulas to the correct new size, the cells all show as #N/A until either you edit another unrelated cell on the sheet, save the sheet, or press F9.
Using code such as Application.Calculate, ActiveSheet.Calculate, etc. does not have any effect.
However, using SendKeys "{F9}" does.
As an example, these are two formulas on the sheet:
={IF(LEN(INDEX(A:A, ROW()))>0,ROW(A:A)+2)}
and
={LARGE(OFFSET($J$1,0,0,ROW()),1)}
The first formula works fine after writing it programmatically to a range of cells. It merely tells me the row number of a cell that has text in it.
The second formula does not work after writing it programmatically to a range of cells. It gives me the largest row number that has been previously seen in a list of numbers (which is the output of the first formula). If I press F9, the second formula updates correctly. If I do Application.Calculate in VBA, nothing happens. I've also tried the various other recalculate methods available at the Worksheet level as well, but no luck.
Has anyone encountered something like this before?
edit:
The resize code essentially boils down to something like this (stripping out all of the support code that allows me to make more generalized calls to it):
First, I do:
formula = dataSheet.Cells(startRow, startColumn).formula
Then later:
Set DeleteRange = dataSheet.Range(dataSheet.Cells(startRow, startColumn), dataSheet.Cells(bottomBound, rightBound))
DeleteRange.ClearContents
Set DeleteRange = Nothing
Then later on:
Set resultRange = dataSheet.Range(dataSheet.Cells(startRow, startColumn), dataSheet.Cells(startRow + Height - 1, startColumn + Width - 1))
resultRange.FormulaArray = formula
Set resultRange = Nothing
In a nutshell, I make a copy of the formula, clear the range, then rewrite it.
If you can't beat 'em, join 'em.
SendKeys "{F9}"
I have fleshed out my comment above, given that you have implemented this approach.
Using this code:
Dim strFormula As String
strFormula = "=LARGE(OFFSET($J$1,0,0,ROW()),1)"
Range("a1:a5").FormulaArray = strFormula
xl03 gives numbers but needs a calc to update the cells properly
xl07 gives the "#N/A" (and raises a Calculate in the statusbar)
xl10 works fine
As you point out, none of the calculation options work, including a full dependency tree rebuild.
Using my RAND suggestion above does force the update in xl07:
Dim strFormula As String
strFormula = "=LARGE(OFFSET($J$1,0,0,ROW()),1)+ RAND()*0"
Range("a1:a5").FormulaArray = strFormula
OFFSET is a volatile function; see Volatile Excel Functions (which includes a file that tests volatility).
Perhaps Charles Williams can shed some light on this, I will ping him
Looks like a FormulaArray bug in 2003 and 2007.
A simpler bypass for your formula would be to use Range("a1:A5").Formula instead of FormulaArray, since =LARGE(OFFSET($J$1,0,0,ROW()),1) does not need to be an array formula.
My recollection is that there is also an Application.CalculateFull method - does that work?
My cells were not refreshing after ranges they depended on were modified either.
I also had to hand edit each one to get them to re-calculate.
SOLUTION:
Use fully qualified references within your formulas.
e.g. any formulas that look like this YourFunction(G200:H700) should be changed to look like this YourFunction('Your Sheet Name'!G200:H700).
Auto-Refresh now works perfectly for me.
We have a few very large Excel workbooks (dozens of tabs, over a MB each, very complex calculations) with many dozens, perhaps hundreds, of formulas that use the dreaded INDIRECT function. These formulas are spread throughout the workbook and target several tables of data to look up values.
Now I need to move the ranges of data that are targeted by these formulas to a different location in the same workbook.
(The reason is not particularly relevant, but interesting on its own. We need to run these things in Excel Calculation Services and the latency hit of loading each of the rather large tables one at a time proved to be unacceptably high. We are moving the tables in a contiguous range so we can load them all in one shot.)
Is there any way to locate all the INDIRECT formulas that currently refer to the tables we want to move?
I don't need to do this on-line. I'll happily take something that takes 4 hours to run as long as it is reliable.
Be aware that the .Precedent, .Dependent, etc methods only track direct formulas.
(Also, rewriting the spreadsheets in whatever is not an option for us).
Thanks!
You could iterate over the entire workbook using VBA (I've included the code from @PabloG and @euro-micelli):
Sub iterateOverWorkbook()
    Dim i As Worksheet, rRng As Range, j As Range
    For Each i In ThisWorkbook.Worksheets
        Set rRng = i.UsedRange
        For Each j In rRng
            If (Not IsEmpty(j)) Then
                If (j.HasFormula) Then
                    ' Test the loop variable j (the original referred to an undefined oCell)
                    If InStr(j.Formula, "INDIRECT") Then
                        j.Formula = Replace(j.Formula, "INDIRECT(D4)", "INDIRECT(C4)")
                    End If
                End If
            End If
        Next j
    Next i
End Sub
This example substitutes every occurrence of "INDIRECT(D4)" with "INDIRECT(C4)". You can easily swap the Replace call for something more sophisticated if you have more complicated INDIRECT calls. Performance is not that bad, even for bigger workbooks.
Q: "Is there any way to locate all the INDIRECT formulas that currently refer to the tables we want to move?"
As I read it, you want to look inside the arguments of INDIRECT for references to areas of interest.
OTTOMH I'd write VBA that uses a regular expression parser, or even a simple InStr, to find "INDIRECT(" and read forward to the matching ")". I would then EVALUATE() the string inside to convert it to the actual address, repeat as required for multiple INDIRECT(...) calls, and dump each formula and its translation to two columns in a sheet.
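The "read forward to the matching parenthesis" step can be sketched as follows. This is an illustration in Python, not the VBA the answer has in mind, and indirect_arguments is a made-up name. It tracks parenthesis depth and skips over double-quoted string literals. It does not evaluate the argument, which would still have to happen inside Excel, and it assumes the text "INDIRECT(" never appears inside a string literal.

```python
import re


def indirect_arguments(formula):
    """Return the raw argument text of every INDIRECT(...) call in a formula."""
    results = []
    for match in re.finditer(r"INDIRECT\(", formula, re.IGNORECASE):
        start = match.end()
        depth = 1
        in_string = False
        i = start
        while i < len(formula) and depth > 0:
            ch = formula[i]
            if ch == '"':
                # A doubled quote inside a string toggles twice, so it stays in the string
                in_string = not in_string
            elif not in_string:
                if ch == "(":
                    depth += 1
                elif ch == ")":
                    depth -= 1
            i += 1
        if depth == 0:
            results.append(formula[start:i - 1])
    return results


# A nested formula that calls INDIRECT twice with the same argument
formula = '''=IF(INDIRECT("'"&$B$5&"'!"&$O5&"1")="","",INDIRECT("'"&$B$5&"'!"&$O5&"1"))'''
for argument in indirect_arguments(formula):
    print(argument)
```

For the sample formula this prints the same argument text twice, once per INDIRECT call. Each extracted string is what would be passed to Evaluate to obtain the resolved address.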
You can use something like this in VBA:
Sub ListIndirectRef()
    Dim rRng As Range
    Dim oSh As Worksheet
    Dim oCell As Range
    For Each oSh In ThisWorkbook.Worksheets
        Set rRng = oSh.UsedRange
        For Each oCell In rRng
            If InStr(oCell.Formula, "INDIRECT") Then
                Debug.Print oCell.Address, oCell.Formula
            End If
        Next
    Next
End Sub
Instead of Debug.Print you can add code to suit your taste
Unfortunately, the arguments of INDIRECT are usually more complex than that. Here's an actual formula from one of the sheets, not the most complex formula we have:
=IF(INDIRECT("'"&$B$5&"'!"&$O5&"1")="","",INDIRECT("'"&$B$5&"'!"&$O5&"1"))
Hm, you could write a simple parser that ignores most characters and only looks for the relevant parts (in this example "A..Z", "0..9", "!:" etc.), but you will run into trouble if the arguments to INDIRECT are themselves functions.
Maybe the safer approach would be to print every occurrence of INDIRECT to a third sheet. You could then add the desired output and write a small search-and-replace program to write your changes back.
If you "get" every cell in a huge spreadsheet you might end up needing monstrous amounts of memory. I am still willing to try and take that risk.
PabloG's method of selecting the used range is the way to go (added it into my original code). The speed is pretty good, especially if you check whether the current cell contains a formula. Obviously, this all depends on the size of your workbook.
I'm not sure what the etiquette of SO is concerning mention of products with which the writer is connected, but OAK, the Operis Analysis Kit, an Excel add-in, can replace the INDIRECT functions by the cell references they resolve to. You can then use Excel's audit tools to determine what dependents each range has.
You would, of course, do this to a temporary copy of the workbook.
More at
http://www.operisanalysiskit.com/oakpruning.htm
http://www.operisanalysiskit.com/help/2007/index.html?oakconceptpruning.htm
Given the age of this question you may well have found an alternative solution or workaround.