Speeding up VLOOKUP [duplicate] - excel

This question already has answers here:
How to optimize vlookup for high search count ? (alternatives to VLOOKUP)
(4 answers)
Closed 7 years ago.
I have the following formula:
=VLOOKUP(VLOOKUP(A1,[Clients]Sales!$B$1:$C$6,2,0),[Ledger]Sheet1!$G$1:$H$6,2,0)
that is working but I have over 100k lines of data and it is taking a few minutes to extract all the results.
Can it be speeded up?
Is VBA the quickest option?

It is difficult to completely answer your questions without sample data, but I think this is what you are looking for: How to optimize vlookup for high search count ? (alternatives to VLOOKUP)
If you are not familiar with VBA, I would definitely look into using INDEX-MATCH.
Hope this helps!

I imagine you'll want to use index-match. It's a pair of functions that can replicate vlookup and more, but is much faster than vlookup. I almost never use vookup for this reason any more.
I think what you want looks like below. No promises without seeing your workbook though.
=index([Ledger]Sheet1!$H$1:$H$6,match(index([Clients]Sales!$C$1:$C$6,match(A1,[Clients]Sales!$B$1:$B$6,0)),[Ledger]Sheet1!$G$1:$G$6,0))
The explanation is index pulls the nth item from a specific column. Match finds what n that happens to be.

Related

Excel Dynamic Array Dynamically placed under another [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 6 months ago.
Improve this question
Use Case: Golf irrigation valve assembly ordering. My boss said:
Hey [piratecheese13], I don't want this part summary to include blanks between [dynamic arrays] and I want to add and remove things to the [precedent to the array]
I have suggested inserting and deleting rows as needed. This is not a solution. She wants to get my list, add and remove rows from the precedent to the array, and send it to a vendor for ordering. Any work she does beyond that, is work she'd rather I do before it gets to her.
I have a neat little Dynamic Array function =FILTER(PartsAndQuantities1,(Parts1<>"")*(IF(Quantities1=0,"",Quantities1)<>""))and it works flawlessly to summarize while removing 0 counts and blank rows. The next step is to insert the same exact formula for a different range without risking a spill if a row is added or a blank if a row is removed
Here's that function in green working as intended with a 2nd function in red manually typed below
Here's what happens if my boss decides to remove a fitting from F9. She sees a blank in the summary a half hour later and assumes there was an error. Again, easily solved by removing a row but that's not an option.
Here's what happens if my boss decides to add a fitting to F12. She sees #Spill! and knows there is an error Again, easily solved by inserting rows but that's not an option.
Things I Have Attempted:
Just stick an & Between 2 functions. I really thought this would work but
this results in F7&F18 and G7&G18 etc as & appears to do work on every line instead of starting after the first FILTER() completes. adding parentheticals around each filter changes nothing. Using + results in value error. Nesting the function within AND() only ever results in Booleans TRUE & FALSE
I've also tried this tutorial on CHOOSE() which handles multiple arrays but only one column and seems like it wouldn't fit
Note: my boss doesn't want to insert lines to solve spill errors, she REALY doesn't want to deal with macro enabled workbooks
Thanks to Scott Craner this was exactly what I needed!
=VSTACK(FILTER(F7:G15,(F7:F15<>"")*(IF(G7:G15=0,"",G7:G15)<>"")),FILTER(F17:G23,(F17:F23<>"")*(IF(G17:G23=0,"",G17:G23)<>"")))

Is it possible to have a "dynamic array" inside MATCH function? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 2 years ago.
Improve this question
I have a large amount of data that was converted to excel recently.
There are just two columns and more than 100000 lines. There is an Image as an example attached here.
The systems are always in the second columns as well as the data, but the loads and the rest of the information are in the first column, as you can see in the example, the "Max load" always repeat, but in the system 001 there is a difference of two lines between it and the system name, but in the second system the difference is three lines. I want to make a table for Max load with all the systems, those are the things I tried:
I used MATCH to find the line where the System name appears, knowing that the difference would be either 2 or 3 lines, I made a table with 2 and 3 in the header and I added it to the line I found and MATCH, it kind of work, but I wanted something better than this improvisation.
I also tried using (where X2= the line where the system name appears)
=MATCH(AA1;OFFSET(A1:X2:0:100:1);0)
Because I thought that OFFSET would return a range to MATCH, but it didn't work.
I want to know if there is a way to find the text "Max load" after the line I found using the Systems' names in MATCH. Basically, I want to know if there is a way of using kind of a "dynamic array" inside MATCH, so I would use it to find "Max load" right after the system that I am looking for.
I don't if this works, but if there is any other way of doing it, I am glad to know about it.
Thanks very much,
Matheus
You can use multiple match with index. Use below formula. It will not care how many rows are different from system to data.
=INDEX(D:D,MATCH(H4,D:D,0)+MATCH("Max Load",INDIRECT("C" & MATCH(H4,D:D,0) & ":C200000"),0)-1)
I think you are overthinking this way too much. If you just have either two or three rows, if offsetting the answer by 1 gives you 0 then use a simple if statement to offset it by 3.
=IF(INDEX(E:E,MATCH(H3,E:E,0)+2)>0,INDEX(E:E,MATCH(H3,E:E,0)+2),INDEX(E:E,MATCH(H3,E:E,0)+3))
Another option
In I4, formula copied right to J4 and all copied down :
=INDEX($D:$D,AGGREGATE(15,6,ROW($C$6:$C$1000)/($C$6:$C$1000=I$3),ROW($A1)))
Another solution would be
=INDEX(INDEX($D:$D,MATCH($F4,$D:$D,0)):$D$200000,MATCH(G$3,INDEX($C:$C,MATCH($F4,$D:$D,0)):$C$200000,0))

Month, Year and Countif

I am trying to develop a query which should pick out and count the data from a data pool corresponding to the year/month in the given (A106) cell.
While such (array) construct works:
COUNTIF(IF(MONTH(INDEX(Data.$A$1463:$A$1827))=MONTH(A106);Data.$B$1463:$B$1827);">0");
such one - does not:
COUNTIF(IF( MONTH(INDEX(Data.$A$1463:$A$1827))=MONTH(A106) and YEAR(INDEX(Data.$A$1463:$A$1827))=YEAR(A106));Data.$B$1463:$B$1827);">0")
It there anything to be done about it or it is impossible?
I believe that you might want to use a sumproduct for this (though there are certainly other ways to achieve the same result). This might do the trick:
SUMPRODUCT((MONTH(INDEX(Data.$A$1463:$A$1827))=MONTH($A$106))*(YEAR(INDEX(Data.$A$1463:$A$1827))=YEAR($A$106)))

How to optimize COUNTIFS with very large data

I would like to create a report that look like this picture below.
My data has around 500,000 cells (it will continue to grow larger)
Right now, I'm using countifs function from excel but it takes a very long time to calculate. (cannot turnoff automatic calculate)
The main value is collected as date and the range of date is about 3 years, so I have to put a lot of formula to cover all range of value.
result
The picture below is the datasource the top one cannot be changed. , while the bottom is the one I created by myself (can change). I use weeknum to change date to week number.
data
Are there any better formula or any ways to make this file faster? Every kinds of suggestions are welcome!
I was thinking about using Pivot Table, but I don't know how to make pivot table from this kind of datasource.
PS. VBA is the last option.
You can download example file here: https://www.mediafire.com/?t21s8ngn9mlme2d
I will post this answer with the disclaimer that it is entirely dependent on the size of the data set. That turning on and off the auto calculate is the best way, but your question doesn't let me do that, so keep reading.
Your question made me curious, so I gave it a try and timed it. I essentially set up two columns of over 100,000 rand numbers choosing from 1-1000 and then tried to do a countif on the two columns if they were equal. I made a macro that I can run that turns off the autocalculate, inserts the start time, calculates, and then inserts the finish time. I highlighted in yellow the time difference.
First I tried your way, two criteria, countifs:
Then I tried to combine (concatenate) the two columns to see if I could make it easier by only having one countif criteria and data set. It doesn't. see result below:
Finally, realizing what was going on. I decided to make the criteria only match the FIRST value in the number to look for. I was essentially reducing the number of characters to check per cell. This had a positive result. See below:
Therefore my suggestion is to limit the length of the words you are comparing in anyway possible. You are mostly looking at dates, so you might have to get creative, but this seems to be the best way possible without going to manual calculation.
I have worked with Excel sheets of a similar size. Especially if you are using the data on a regular basis, I would heartily recommend switching to a proper database SQL based, Access, or whatever fits your purpose. I does wonders for the speed and also you won't run into the size limits of Excel. :-)
You can import the data you have now fairly easy.
I am happy as a clam with my postgresql db.

Improve Vlookup on large file

I´ve a very large file that I reduced as much as possible to 3 columns and 80k rows.
I need to perform a vlookup in order to bring values from column 1 or 2 match some other spreadsheets values.
The thing is Excel doesn´t seem to support such large searches, and it stops responding - the computer has 4GB and a Quad core, and not much more running at the same time.
As far as I understand, as I´m not looking for exact matches, I should not use match-index.
The only thing I thouhgt could help but not sure about that, is dividing the file in 2-4, and asking Excel many parallel searches instead of a big one. Could this work?
What else should I try?
Thanks!!!
Sort your data and use True as the 4th VLOOKUP argument. This makes VLOOKUP use binary search rather than linear search and is lightning fast.
If you need to handle missing data you will need to use the double VLOOKUP trick, see
http://fastexcel.wordpress.com/2012/03/29/vlookup-tricks-why-2-vlookups-are-better-than-1-vlookup/

Resources