Trying to improve efficiency of array formula - excel

I have a SUM array formula that has multiple nested IF statements, making it very inefficient. My formula spans over 500 rows, but here is a simple version of it:
{=SUM(IF(IF(A1:A5>A7:A11,A1:A5,A7:A11)-A13:A17>0,
IF(A1:A5>A7:A11,A1:A5,A7:A11)-A13:A17,0))}
As you can see, the first half of the formula checks which elements of the array are greater than zero, and the second half sums those elements.
You will notice that the same IF statement is repeated twice, which seems inefficient to me, but it is the only way I could get the correct answer.
The example data I have is as follows:
Sample Data in spreadsheet http://clients.estatemaster.net/SecureClientSite/Download/TempFiles/example.jpg
The answer should be 350 in this instance using the formula I mentioned above.
I tried putting a MAX statement within the array, which removes the test for where the result is greater than zero, like this:
{=SUM(MAX(IF(B2:B6>B8:B12,B2:B6,B8:B12)-B14:B18,0))}
However, it seems to calculate only the first row of data in each range, and it gave me the wrong answer of 70.
Does anyone know a way that I can reduce the size of the formula or make it more efficient by not needing to repeat an IF statement in there?
UPDATE
Jimmy
The MAX formula you suggested didn't actually work for all scenarios.
If I change my sample data in rows 1 to 5 as below (so that some of the numbers are greater than their respective cells in rows 7 to 11, while others are lower):
Sample Data in spreadsheet http://clients.estatemaster.net/SecureClientSite/Download/TempFiles/example2.jpg
The correct answer I'm trying to achieve is 310; however, your suggested MAX formula gives an incorrect answer of 275.
I'm guessing the formula needs to be an array function to give the correct answer.
Any other suggestions?

=MAX( MAX( sum(A1:A5), sum(A7:A11) ) - sum(A13:A17), 0)

A more calculation-efficient (and especially re-calculation efficient) way is to use helper columns instead of array formulae:
C1: =MAX(A1,A7)-A13
D1: =IF(C1>0,C1,0)
copy both these down 5 rows
E1: =SUM(D1:D5)
Excel will then only recalculate the formulae that depend on a changed value, rather than having to calculate all the virtual columns implied by the array formula every time any single number changes. It also does fewer calculations even if you change all the numbers.
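To see why the per-row calculation matters, here is a minimal Python sketch of what the helper columns compute, next to the sum-then-MAX shortcut from the earlier answer. The three lists are made-up stand-ins for A1:A5, A7:A11 and A13:A17 (shortened to three rows); they are not the data from the question's screenshots, and the function names are only illustrative.

```python
# Hypothetical stand-ins for the three ranges; not the question's actual data.
first = [10, 50, 20]    # plays the role of A1:A5
second = [30, 20, 20]   # plays the role of A7:A11
offset = [25, 60, 5]    # plays the role of A13:A17


def helper_column_total(first, second, offset):
    """Mimic columns C, D and E: per-row MAX, subtract, clamp at 0, then SUM."""
    total = 0
    for a, b, c in zip(first, second, offset):
        diff = max(a, b) - c      # C1: =MAX(A1,A7)-A13
        total += max(diff, 0)     # D1: =IF(C1>0,C1,0), accumulated like E1
    return total


def aggregate_shortcut(first, second, offset):
    """Mimic =MAX(MAX(SUM(..),SUM(..))-SUM(..),0), which only sees the totals."""
    return max(max(sum(first), sum(second)) - sum(offset), 0)


print(helper_column_total(first, second, offset))  # 20
print(aggregate_shortcut(first, second, offset))   # 0
```

With these numbers the second row goes negative and is clamped to zero on its own, which the totals-only version cannot see. That is the reason the clamp has to happen per row before the SUM.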

You may want to look into the VB macro editor. In the Tools menu, go to Macros and select Visual Basic Editor. This gives you a whole programming environment where you can write your own functions.
VB is a simple programming language, and Google has all the guidebooks you need.
There, you can write a function like MySum() and have it do whatever math you really need it to, in a clear way written by yourself.
I pulled this off google, and it looks like a good guide to setting this all up.
http://office.microsoft.com/en-us/excel/HA011117011033.aspx

This seems to work:
{=SUM(IF(A1:A5>A7:A11,A1:A5-A13:A17,A7:A11-A13:A17))}
EDIT
The formula above doesn't handle cases where the subtraction ends up negative.
This works, but is it more efficient?
{=SUM(IF(IF(A1:A5>A7:A11,A1:A5,A7:A11)>A13:A17,IF(A1:A5>A7:A11,A1:A5,A7:A11)-A13:A17,0))}

What about this?
=MAX(SUM(IF(A1:A5>A7:A11, A1:A5, A7:A11))-SUM(A13:A17), 0)
Edit:
Whoops, I missed the part about throwing out negatives. What about this? Not sure if it's faster...
=SUM((IF(A1:A5>A7:A11,IF(A1:A5>A13:A17,A1:A5,A13:A17),IF(A7:A11>A13:A17,A7:A11,A13:A17))-A13:A17))
Edit 2:
How does this perform for you?
=SUM((((A1:A5>A13:A17)+(A7:A11>A13:A17))>0)*(IF(A1:A5>A7:A11,A1:A5,A7:A11)-A13:A17))
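As a sanity check on the two edits above, the following Python sketch compares their per-row terms with the per-row term of the original nested-IF formula. It only models the arithmetic for a single row (a, b, c standing for one cell from each of the three ranges); it says nothing about which version Excel evaluates fastest, and the function names are made up for illustration.

```python
from itertools import product


def nested_if(a, b, c):
    """Row term of the original: IF(IF(a>b,a,b)-c>0, IF(a>b,a,b)-c, 0)."""
    bigger = a if a > b else b
    return bigger - c if bigger - c > 0 else 0


def clamp_then_subtract(a, b, c):
    """Row term of the first edit: take the largest of a, b, c, then subtract c."""
    if a > b:
        largest = a if a > c else c
    else:
        largest = b if b > c else c
    return largest - c


def boolean_mask(a, b, c):
    """Row term of the second edit: (a>c or b>c) as 0/1 times (max(a,b) - c)."""
    keep = ((a > c) + (b > c)) > 0
    return keep * ((a if a > b else b) - c)


# Exhaustively compare the three forms on a small grid of integers.
mismatches = [
    (a, b, c)
    for a, b, c in product(range(-3, 4), repeat=3)
    if not (nested_if(a, b, c) == clamp_then_subtract(a, b, c) == boolean_mask(a, b, c))
]
print(mismatches)  # []
```

An empty list means the three row terms agree on every combination tried, so summing them over the rows gives the same total.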

Related

Filter Cells as argument to a function in a formula

I am a programmer, so please bear with me. I understand that Excel isn't necessarily what I am used to in other domains, but I'm cracking my head open on how to accomplish something that seems somewhat simple.
I have a column of numbers that are themselves the basis of a formula. I want to filter those cells based on some criteria and pass them to another function to perform a calculation.
I understand that this can be done with "filters" in the excel sense. This would mean I would have to click multiple times for each calculation, filter the results, copy the value and paste it where I need it to be. If the data ever changes, I will have to do it all again.
What I am looking for is the equivalent of filtering in a programming language, here's an example:
let range = [1,2,3,4,0,-1,-2,-3,-4];
let subrange = range.filter(function (cell) { return cell > 0; });
subtotal(1,subrange);
Here is what my Excel sheet looks like.
I have a column G that has 12,000+ results in it; each of these cells contains a formula like this:
=(En-Bn)/Bn
These are copied down, where n is the row number, from 5 to 12,000+.
Now I would like to create a cell, M2, such that it contains:
=SUBTOTAL(1,[ Gn in G5:G12000 where Gn > 0 ])
The goal is that I do not want to have to point and click, because actually, there are many more cells I need to create (about 20) with similar kinds of "filter" predicates.
It would be nice, as much as possible, if I also don't have to specify the n...n-1 range of the column, as ideally that can change. Could be 10, could be 20,000, shouldn't matter.
The best formula or solution would be like:
SUBTOTAL(1, [ Gn in G0:GLENGTH where Gn > SOMECELL ])
Any pointers, or suggestions where to read, or a solution would be awesome. I've been searching on google, and it seems that I lack the right understanding to find the answer in the material presented.
Also, please excuse me for using programmer speak, I know that Excel formulas are not necessarily a 1:1, I'm just looking for a way to save time. Answers in VBA or using Macros are welcome, the main thing is to find a way to do it...
Best,
Jason
Update
I should specify that it needs to be a bit backwards compatible, so I can't use the FILTER function that is only available in >= 365
I'm not at all sure that your attempt at saving time by talking in programming language instead of English really saves either time or space. As far as I can tell, you got us all confused. Please tell me why the simple formula below doesn't work.
=AVERAGEIF(G2:G15000,">"&A1,G2:G15000)
This formula requires A1 to hold a number, and the formula supplies the > sign. A variation would have A1 contain both the number and the comparison, like >1.2
=AVERAGEIF(G2:G15000,A1,G2:G15000)
The above formulas start the range at G2. Change it to G5 if that is what you need. G15000 is an arbitrary number intended to be larger than anything you will ever need. The function ignores blanks. However, if you are worried about ending up with a sheet of 16000 rows on the very day you have forgotten where to adjust the formula, I would recommend using a named range, which you can define to be dynamic.
Named ranges are neater to handle than range addresses, and names can be descriptive, such as HourlyReadings. The above formula would then look like this:
=AVERAGEIF(HourlyReadings, ">"&A1,HourlyReadings)
Theoretically, the formula by which HourlyReadings is defined could also be written into the worksheet formula but it would become unwieldy. As shown above, you would have to look into the Name Manager to know if the range is dynamic or not but, of course, once defined you can use the same name in many functions and formulas which saves a lot of maintenance time.
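In the programmer's terms used in the question, the AVERAGEIF formulas above are a filter followed by an average. A rough Python sketch of that behaviour is given below; the function name is made up, the sample list is the one from the question plus one blank, and treating a blank cell as None is an assumption of this sketch.

```python
def average_if_greater(values, threshold):
    """Average the numeric entries strictly greater than threshold.

    Roughly what =AVERAGEIF(range, ">"&A1, range) does; None stands in
    for a blank cell and is skipped.
    """
    selected = [v for v in values if isinstance(v, (int, float)) and v > threshold]
    if not selected:
        # Excel would show #DIV/0! here; this sketch raises instead.
        raise ValueError("no values above the threshold")
    return sum(selected) / len(selected)


cells = [1, 2, 3, 4, 0, -1, -2, -3, -4, None]
print(average_if_greater(cells, 0))  # 2.5
```

Because the filter is applied to whatever list is passed in, the length of the column does not need to be known in advance, which is the role the dynamic named range plays in the worksheet.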
This is for Excel 365, using worksheet formulas. With data in column G starting in G5, in another cell enter:
=SUM(FILTER(G:G,G:G>0))
How about an array?
=SUM(IF(G:G>0,G:G,""))
Put the cursor in the formula bar with the formula, then press CTRL+SHIFT+ENTER (in that order, holding them all down). Curly braces {} will appear around the formula.
Let me know if further assistance is needed.
Matt

Excel MAX/MIN but only if opposing cell is greater than 0

I've found similar examples through searching but I can't find anything that matches the issue that I have...
I have a table which shows parts received/rejected. I wish to see the maximum days early/late (I only need help with one, as I can then do the other myself), but there are dummy orders which I wish to ignore (they show a received/rejected quantity of 0).
Here is example data from the 'AnnualDump' sheet:
My current calculation is
=IF(ISBLANK(AnnualDump!$H$2),"BLANK",0-MIN(AnnualDump!$G:$G))
[Column H is Received/Rejected and G is VarianceDays]
This simply checks whether there is any data on the sheet before running the calculation, which is fantastic 95% of the time... but I want to ignore any values that have a received/rejected quantity of 0...
I want it to show 29, but it's showing 30 in this instance as it's not ignoring the zero-quantity lines.
I've tried adding another IF statement but it didn't work :/
Completely stuck now and not sure what the next step to try is...
I can do it if I cheat (call both columns to another sheet, turn text white, use an 'IF cell greater than x, then value' to compare the whole lot and then min/max that third column) but I'm trying to avoid that!
Any pointers or help will be greatly appreciated (complete VBA noob in excel so I'd like to avoid that if possible).
Thanks
Try this array formula. Confirm with Ctrl, Shift and Enter and curly brackets will appear round the formula.
I would strongly suggest you don't use full column references though as these formulae are rather resource-intensive.
=IF(ISBLANK(AnnualDump!$H$2),"BLANK",0-MIN(IF(AnnualDump!$H:$H>0,AnnualDump!$G:$G)))
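The inner part of that formula, MIN(IF(H>0,G)), is a conditional minimum: rows that fail the test are dropped before MIN sees them. A small Python sketch of the same idea follows. The two lists are invented to reproduce the 29-versus-30 situation described in the question; they are not the real AnnualDump data, and the function name is hypothetical.

```python
# Hypothetical rows: column G (VarianceDays) and column H (Received/Rejected).
variance_days = [-30, -29, -5, 12]
received = [0, 10, 4, 7]


def max_days_early(variance_days, received):
    """Mimic 0-MIN(IF(H>0,G)): ignore rows whose quantity is not above zero."""
    kept = [g for g, h in zip(variance_days, received) if h > 0]
    return 0 - min(kept)


print(0 - min(variance_days))                   # 30, dummy order included
print(max_days_early(variance_days, received))  # 29, dummy order ignored
```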

COUNTIF formula counts values that don't match

I'm counting invoice numbers (text) in a table column, but the Excel formula seems to be confusing some values.
I copied a small sample of these; please see below:
The formulas are as follows:
=COUNTIFS(A1:A19,A1)
=COUNTIF(A1:A19,A1)
As you can see, these invoice numbers differ, yet the results of these functions suggest that they are all the same.
I googled it for an hour but didn't find an issue like mine.
If anybody has any clue why it behaves this way, I'll be super grateful!
Rob
Each time you copy this formula down, every row reference in it shifts by one row. For example, the formula in the second row of data becomes =COUNTIFS(A2:A20,A2). To lock these cells in the formula, use $.
Your formula should be =COUNTIFS(A$1:A$19,A1)
I've solved this myself:
ROOT CAUSE
Excel tried to be helpful and read these invoice numbers as actual numbers (despite these already being defined as text in Power Query).
Then Excel fooled me: although it showed that it was working on a string (I was evaluating the formula), it actually worked on a number.
This means that it converted, for example, "00100001010000018525" to 1.00001E+17, which truncated it to "100001010000018000"; that's the moment Excel stopped fooling around and showed that value in the formula bar.
I think I don't need to explain why COUNTIF perceived all these values as equal.
SOLUTION
I simply appended one letter after each invoice number to get e.g. "00100001010000018525a", which forces Excel to quit its gimmicks and games.
Case closed.
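The effect described above can be imitated in a few lines of Python. This is only a model of the behaviour reported in this answer, under the assumption that the numeric conversion keeps the first 15 significant digits and zeroes the rest; it is not Excel's actual implementation. The first invoice number is the one quoted above, the other two are invented, and the function name is hypothetical.

```python
def as_excel_number(text, significant_digits=15):
    """Model a digit string being read as a number with 15 significant digits.

    Assumption of this sketch: leading zeros are dropped and every digit
    past the 15th is replaced by 0. Expects a string of digits only.
    """
    digits = text.lstrip("0") or "0"
    kept = digits[:significant_digits]
    padding = "0" * max(len(digits) - significant_digits, 0)
    return int(kept + padding)


# First value is from the answer above; the other two are made up.
invoices = ["00100001010000018525", "00100001010000018526", "00100001010000018999"]
target = invoices[0]

text_matches = sum(1 for inv in invoices if inv == target)
numeric_matches = sum(
    1 for inv in invoices if as_excel_number(inv) == as_excel_number(target)
)

print(as_excel_number(target))        # 100001010000018000
print(text_matches, numeric_matches)  # 1 3
```

Compared as text, only one entry matches; compared after the numeric conversion, all three collapse to the same value, which is the kind of overcount the question shows.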
I suspect this is a bug in COUNTIF, or maybe by design.
However, to work around this in the formula without having to change your data, try adding a wildcard character:
=COUNTIF(A1:A19,"*"&A1)

How to improve the formula writing and avoid repeating the entire formula depending on the condition

So, say I have at cell A1:
=IF(A2=1,A2,0)
That's OK; it's a tiny formula that's easy to understand.
If the formula starts to grow, I would have something like:
IF(...big formula here...=1,...repeat the big formula here...,0)
It's a dummy example, but the key point is that when I repeat the big formula in the TRUE branch, the formula doubles in size, which can hinder debugging, for example.
Is there a way to avoid repeating the whole formula in this situation?
I don't want to use any macro/VBA to do this or any other 'helper' cells.
Thanks
In this particular case you don't have to use an IF statement; you can just use
=--(A2=1)
Or for some other value, say 2,
=(A2=2)*2
These work if one of the results you want is zero.
It is a little more difficult if you have an IF statement like
=IF(A2>2,A2,2)
but you can often use MAX or MIN to avoid the IF statement
=MAX(A2,2)
If you had a chain of IF statements to divide the number in A2 into ranges like
=IF(A2>=2,20,IF(A2>=1,10,0))
You could replace it with a lookup
=IFERROR(VLOOKUP(A2,{1,10;2,20},2),0)
Sometimes you can replace a series of IF statements with CHOOSE, e.g. to return "Negative", "Positive" or "Zero"
=CHOOSE(SIGN(A2)+2,"Negative","Zero","Positive")
One tricky way I have seen is to use inverse functions one of which gives an error under certain conditions, so you could try
=IFERROR((SQRT(A2-2)^2)+2,2)
but I'm not sure I could recommend it as these methods can be vulnerable to rounding errors.
See this previous question
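The IF-free rewrites above can be checked against their IF originals with a short Python sketch. It models only the arithmetic, using a handful of arbitrary test values; the approximate-match VLOOKUP is imitated with bisect, and all function names are made up for illustration.

```python
from bisect import bisect_right


def sign(x):
    """Return -1, 0 or 1, like Excel's SIGN."""
    return (x > 0) - (x < 0)


def lookup(a):
    """Imitate =IFERROR(VLOOKUP(A2,{1,10;2,20},2),0) with approximate match."""
    keys, results = [1, 2], [10, 20]
    i = bisect_right(keys, a)          # number of keys <= a
    return results[i - 1] if i else 0  # below the first key: the IFERROR branch


def choose_sign(a):
    """Imitate =CHOOSE(SIGN(A2)+2,"Negative","Zero","Positive")."""
    return ["Negative", "Zero", "Positive"][sign(a) + 1]


samples = [-2, -0.5, 0, 0.5, 1, 1.5, 2, 3]
all_agree = all(
    (a if a == 1 else 0) == int(a == 1)                          # =--(A2=1)
    and (a if a > 2 else 2) == max(a, 2)                         # =MAX(A2,2)
    and (20 if a >= 2 else (10 if a >= 1 else 0)) == lookup(a)   # lookup table
    for a in samples
)
print(all_agree)                                         # True
print(choose_sign(-4), choose_sign(0), choose_sign(9))   # Negative Zero Positive
```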
Create a helper column -- say, col X -- that calculates your big formula. Hide the column if you don't want to confuse other spreadsheet viewers.
Then your long, difficult to debug formula becomes IF(X1=1,...X1...,0).

Excel query with regards to IF statements

I have the following formula in an excel sheet that generally works perfectly:
=IF(F5>=30.01,(39+(C5*0.08)),IF(AND(F5>=20.01,F5<=30),(39+(C5*0.07)),IF(AND(F5>=10.01,F5<=20),(39+(C5*0.06)),IF(AND(F5>=5.01,F5<=10),(39+(C5*0.05)),IF(AND(F5>=2.01,F5<=5),(39+(C5*0.04)),IF(AND(F5>=1.01,F5<=2),(39+(C5*0.03)),IF(AND(F5>=0.25,F5<=1),(39+(C5*0.02)),IF(AND(F5>=0,F5<=0.245),(0.03*C5*F5)))))))))
I was just wondering if anyone could tell me how to edit this so that if the result of the formula is less than '43', that the number inputted into the cell should be 43?
I have been trying to edit this accordingly for a while and I'm not sure what I need to do to make that happen.
The rest of the formula works exactly as I need it to, I just need the sheet not to produce a result that is less than 43.
Thank you so much for all your assistance!
You don't need the AND statements, because the nested IFs are tested from the highest threshold down, so each branch is only reached when the higher ones have already failed. Here is your current formula amended:
=IF(F5>=30.01,(39+(C5*0.08)),IF(F5>=20.01,(39+(C5*0.07)),IF(F5>=10.01,(39+(C5*0.06)),IF(F5>=5.01,(39+(C5*0.05)),IF(F5>=2.01,(39+(C5*0.04)),IF(F5>=1.01,(39+(C5*0.03)),IF(F5>=0.25,(39+(C5*0.02)),IF(F5>=0,(0.03*C5*F5)))))))))
You could then wrap it in a MAX formula to get either the result or 43, whichever is larger like so:
=MAX(IF(F5>=30.01,(39+(C5*0.08)),IF(F5>=20.01,(39+(C5*0.07)),IF(F5>=10.01,(39+(C5*0.06)),IF(F5>=5.01,(39+(C5*0.05)),IF(F5>=2.01,(39+(C5*0.04)),IF(F5>=1.01,(39+(C5*0.03)),IF(F5>=0.25,(39+(C5*0.02)),IF(F5>=0,(0.03*C5*F5))))))))),43)
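The same tiered logic can be written as a threshold table with a floor applied at the end, which may be easier to read than eight nested IFs. The Python sketch below follows the amended formula above under two assumptions of its own: F5 is never negative (the formula has no branch for that case), and the names tiered_fee, c5 and f5 are purely illustrative.

```python
from bisect import bisect_right

# Lower bounds of each tier and the matching rate, taken from the formula above.
THRESHOLDS = [0.25, 1.01, 2.01, 5.01, 10.01, 20.01, 30.01]
RATES = [0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08]
BASE = 39
FLOOR = 43


def tiered_fee(c5, f5):
    """Mimic =MAX(IF(F5>=30.01, 39+C5*0.08, ... IF(F5>=0, 0.03*C5*F5)), 43)."""
    if f5 < 0:
        # Assumption: negative F5 is treated as invalid input in this sketch.
        raise ValueError("f5 must not be negative")
    tier = bisect_right(THRESHOLDS, f5)  # how many lower bounds f5 has reached
    if tier == 0:
        result = 0.03 * c5 * f5          # the final IF(F5>=0, ...) branch
    else:
        result = BASE + c5 * RATES[tier - 1]
    return max(result, FLOOR)            # the outer MAX(..., 43)


print(tiered_fee(1000, 35))   # top tier, well above the floor
print(tiered_fee(100, 0.5))   # 39 + 2 is below 43, so the floor applies
```

Keeping the thresholds and rates in two lists means a new tier is one extra entry in each list, rather than another level of nesting.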
The formula can also be adjusted further to group the common terms, as follows:
=MAX(SUM(IF(F5<=0,0,39),
IF(F5>30,(C5*0.08),
IF(F5>20,(C5*0.07),
IF(F5>10,(C5*0.06),
IF(F5>5,(C5*0.05),
IF(F5>2,(C5*0.04),
IF(F5>1,(C5*0.03),
IF(F5>=0.25,(C5*0.02),
IF(F5>=0,(0.03*C5*F5),0))))))))),43)
Long formulas are always difficult to read. May I suggest using Alt+Enter to start a new line within the same cell, so that the formula is broken across several lines.
Certainly. The revised formula should look like this:
=IF(F5=0,0,
MAX(SUM(IF(F5<=0,0,39),
IF(F5>30,(C5*0.08),
IF(F5>20,(C5*0.07),
IF(F5>10,(C5*0.06),
IF(F5>5,(C5*0.05),
IF(F5>2,(C5*0.04),
IF(F5>1,(C5*0.03),
IF(F5>=0.25,(C5*0.02),
IF(F5>=0,(0.03*C5*F5),0))))))))),43))
