Excel calculation process / INDIRECT in an array formula - excel

I have a spreadsheet into which I will paste data daily, ranging from hundreds to thousands of rows. The spreadsheet is heavily loaded with functions. I would like to cut calculation time as much as possible, but I'm not quite sure how Excel is designed to process data. Is there a difference between two functions with ranges like these:
SUM(A2:A50) and SUM(A2:A99999)
The way I look at it: in the first example SUM will stop once it reaches cell A50, and in the second it will keep running until it reaches A99999. From this I would say the first is more efficient to have in a spreadsheet.
Please advise.
I also have this formula, which returns #N/A, and I believe this is because INDIRECT is included in an array formula (Ctrl + Shift + Enter).
Whole formula itself:
=SUM(IF((INDIRECT("'" & $K$6 & "'!" & $K$7)>CODES!$K$2)+(MMULT(--('Data paste'!AD2:AF5>CODES!$K$1),{1;1;1})>0)+(MMULT(--('Data paste'!AD2:AF5>CODES!$K$3),{1;1;1})>1)+--('Data paste'!AD2:AD5*'Data paste'!AE2:AE5*'Data paste'!AF2:AF5>CODES!$K$4), 1, 0)*--('Data paste'!Q2:Q5=CODES!C2))
I have uploaded the spreadsheet to Google Drive so it is easier to understand what I mean. The formula is located in cell E2 on the CODES sheet.
Thank you.

Obviously, referencing smaller ranges will always be faster or at least equally fast. However, Excel does have some optimization:
Many Excel built-in functions (SUM, SUMIF) calculate whole column references efficiently because they automatically recognize the last used row in the column. However, array calculation functions like SUMPRODUCT either cannot handle whole column references or calculate all the cells in the column.
So the oversized range shouldn't matter much for SUM(A2:A99999).
As for question number 2: Yes, you may use INDIRECT() within an array formula. It currently returns #N/A only because you're trying to add two arrays of different lengths:
{FALSE; TRUE} + {FALSE; FALSE; FALSE; FALSE} = {0; 1; #N/A; #N/A}
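The mismatch can be reproduced outside Excel. The Python sketch below only illustrates the behaviour described above and is not how Excel is implemented; `add_arrays` and `NA` are hypothetical names, and the two sample arrays are the ones from the line above.

```python
NA = "#N/A"  # stand-in for Excel's error value (assumption for illustration)

def add_arrays(left, right):
    """Add two column arrays element by element, Excel-style.

    Booleans are coerced to 0/1. A position that exists in only one
    of the arrays yields #N/A.
    """
    length = max(len(left), len(right))
    result = []
    for i in range(length):
        if i < len(left) and i < len(right):
            result.append(int(left[i]) + int(right[i]))
        else:
            result.append(NA)
    return result

short = [False, True]                  # a 2-row range
long = [False, False, False, False]    # a 4-row range
total = add_arrays(short, long)
print(total)  # [0, 1, '#N/A', '#N/A']
```

Once a single #N/A is in the array, the surrounding SUM returns #N/A as well, which is why every range in the formula has to cover the same rows.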
How to fix your issue:
change CODES!K7 to
=COUNTA('Data paste'!A:A)
and change CODES!E2 to
=SUM(IF(('Data paste'!AA2:INDEX('Data paste'!AA:AA,$K$7)>CODES!$K$2)
+(MMULT(--('Data paste'!AD2:INDEX('Data paste'!AF:AF,$K$7)>CODES!$K$1),{1;1;1})>0)
+(MMULT(--('Data paste'!AD2:INDEX('Data paste'!AF:AF,$K$7)>CODES!$K$3),{1;1;1})>1)
+--('Data paste'!AD2:INDEX('Data paste'!AD:AD,$K$7)*'Data paste'!AE2:INDEX('Data paste'!AE:AE,$K$7)*'Data paste'!AF2:INDEX('Data paste'!AF:AF,$K$7)>CODES!$K$4), 1, 0)
*--('Data paste'!Q2:INDEX('Data paste'!Q:Q,$K$7)=CODES!C2))
I have replaced the INDIRECT() calls with INDEX(), which should benefit performance because INDIRECT() is volatile and INDEX() is not.

Related

Excel - Replicating a SUMPRODUCT formula with already summed up values

I have a sheet that uses the below formula to arrive at a figure:
=SUMPRODUCT(E12:K12,E23:K23)/M10
To my understanding, it's getting the sum of E12:K12(593+622+636+620+595+583+589) and multiplying that together with the sum of E23:K23(5740+5160+5432+4640+4716+7372+6696); then dividing the result by M10(39,756).
I have 2 new cells that contain the results of the two summed ranges but when I try to replicate the formula with a regular sum, the result is different:
=SUM(IntradayPlan2!D346*IntradayPlan2!D347)/P12
The result should be 604 but it's coming out at 4383. P12 contains the same number as M10 and the 2 cells in the SUM are summing the same values as the original SUMPRODUCT formula is.
For clarity, I'm currently replicating the main sheet a second time to generate this result. It's slowing down the workbook and most of the other detail on the replicated sheet isn't needed, so I'm trying to get rid of it.
I'm sure I'm missing something when it comes to SUMPRODUCT but after an hour of Googling around I can't work it out. Is there a way of replicating the same maths procedure with my new totals instead?
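After the great explanation by Gowtham, I don't think it's possible to recreate the result of a SUMPRODUCT from the totals of the two arrays alone: SUMPRODUCT multiplies the ranges pair by pair and then adds up the products, so it needs the individual values.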
Those cells actually get their values from another sheet, which I need to keep anyway. So I've created a couple of new rows that put those figures in a line (they're otherwise spread out across the sheet) and have referenced those new ranges instead.
=SUMPRODUCT(IntradayPlan2!D355:J355,IntradayPlan2!D356:J356)/O12
And I'm getting the expected result :)
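The gap between the two calculations can be checked with the figures quoted in the question. This is just a Python sketch of the arithmetic; the variable names are made up for the example. Note that the quoted E23:K23 values add up to exactly M10, so the SUMPRODUCT version is a weighted average of E12:K12.

```python
# Figures quoted in the question (E12:K12, E23:K23 and M10).
row_12 = [593, 622, 636, 620, 595, 583, 589]
row_23 = [5740, 5160, 5432, 4640, 4716, 7372, 6696]
m10 = 39756

# SUMPRODUCT: multiply pair by pair, then add the products.
sum_product = sum(a * b for a, b in zip(row_12, row_23))

# What the SUM attempt computes: multiply the two totals.
product_of_sums = sum(row_12) * sum(row_23)

print(sum_product / m10)      # roughly 603.5
print(product_of_sums / m10)  # 4238.0
```

The outputs are close to, but not identical with, the 604 and 4383 reported above, so the live cells presumably hold slightly different values from the ones quoted. The point is only that the two totals cannot reproduce the pairwise result.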

Excel - Increment a number based on values from two other columns

I have 2 columns: one is a period and the other is a cycle. I need to create a 3rd column holding a cycle identifier, where the letter changes with the cycle but resets every period.
I seem to have it working with the following formula:
IF(A1<>A2,1,IF(B1<>B2,C1+1,C1))
This gives results of 1, 2 or 3. I then turn the numbers into letters with SWITCH(C1,1,"A",2,"B",3,"C") in an adjacent cell. However, I was curious whether there is a more efficient or better way to accomplish this, perhaps all in one formula.
Any suggestions would be greatly appreciated.
Period & Cycle
Copy this formula to cell C2 and copy it down.
=IF(B2<>B1,IF(A2<>A1,CHAR(65),CHAR(CODE(C1)+1)),C1)
In Excel 365, you could use a spill formula like this:
=CHAR(B2:B15-XLOOKUP(A2:A15,A2:A15,B2:B15)+65)
You could argue that this is less efficient because it uses a lookup so there could be a speed hit with large amounts of data. On the other hand, it could be considered more efficient because it is a single formula and doesn't need to be pulled down.
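To make the logic of the spill formula easier to follow, here is a rough Python equivalent. It assumes, as the formula does, that the cycle column is numeric and increases by 1 within a period; `cycle_letters` and the sample data are hypothetical.

```python
def cycle_letters(periods, cycles):
    """Letter per row: the cycle number minus the first cycle number
    seen for that period, mapped to A, B, C, ...
    """
    first_cycle = {}
    for period, cycle in zip(periods, cycles):
        # Keep only the first cycle per period, like XLOOKUP's first match.
        first_cycle.setdefault(period, cycle)
    return [chr(cycle - first_cycle[period] + 65)
            for period, cycle in zip(periods, cycles)]

# Hypothetical sample data: cycle numbers keep counting across periods.
periods = [1, 1, 1, 2, 2, 3]
cycles = [1, 1, 2, 3, 4, 5]
letters = cycle_letters(periods, cycles)
print(letters)  # ['A', 'A', 'B', 'A', 'B', 'A']
```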
If you were worried about the speed, you could set the binary search option in XLOOKUP:
=CHAR(B2:B15-XLOOKUP(A2:A15,A2:A15,B2:B15,,2)+65)
(Column A has to be sorted ascending for this to work. I'm fairly sure that where there are duplicates this will still give the first match. However, Microsoft is quoted in several articles as saying that binary search gives only a slight benefit.)
You could make the formula more dynamic:
=CHAR(B2:INDEX(B:B,COUNTA(B:B))-XLOOKUP(A2:INDEX(A:A,COUNTA(A:A)),A2:INDEX(A:A,COUNTA(A:A)),B2:INDEX(B:B,COUNTA(B:B)))+65)
Or, using LET:
=LET(Period,A2:INDEX(A:A,COUNTA(A:A)),
Cycle,B2:INDEX(B:B,COUNTA(B:B)),
CHAR(Cycle-XLOOKUP(Period,Period,Cycle)+65))

Automate concatenation process

I am stuck on an Excel issue. I want to concatenate columns F through I with the following logic: when the benchmark cell in column A (A3, for example) is blank, columns F through I should keep being concatenated until column A has a value again (at A4, for example). This should happen automatically for every block down the sheet. Currently I have to keep changing the concatenation range by hand to cover each block fully. I would appreciate any help.
The image below shows how I am doing it manually, which is very time-consuming.
You can use the MATCH function (with a wildcard) to find the next non-blank row; and use that in an INDEX function to detect the range to concatenate.
Assuming your data starts in A3 and the lowest possible row is row 1000 (change the 1000s in the formula below if it might be much different):
J2: =IF(A2="","",CONCAT(INDEX(F2:$I$1000,1,0):INDEX(F2:$I$1000,IFERROR(MATCH("*",A3:$A$1000,0),1000-ROW()),0)))
Note: It is possible to also develop solutions using INDIRECT and/or OFFSET. Unfortunately, these functions are volatile, which means they recalculate anytime anything changes on your worksheet. If there are a number of formulas using these functions, worksheet performance will be impaired. INDEX and MATCH are non-volatile (except in ancient versions of Excel - pre-2003 or so)
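The idea behind the INDEX/MATCH formula (start at a non-blank row, stop just before the next non-blank one) can be sketched in Python. This only illustrates the logic under made-up data; `concat_blocks`, `keys` and `rows` are hypothetical names standing in for column A and columns F:I.

```python
def concat_blocks(keys, rows):
    """For each row whose key is non-blank, join the cells of that row
    and of every following row up to (not including) the next
    non-blank key. Rows with a blank key get an empty string.
    """
    out = []
    for i, key in enumerate(keys):
        if key == "":
            out.append("")
            continue
        # Position of the next non-blank key, or the end of the data.
        end = next((j for j in range(i + 1, len(keys)) if keys[j] != ""),
                   len(keys))
        out.append("".join(cell for row in rows[i:end] for cell in row))
    return out

# Hypothetical data: keys stand for column A, rows for columns F:I.
keys = ["x", "", "", "y", ""]
rows = [["a", "b"], ["c", "d"], ["e", "f"], ["g", "h"], ["i", "j"]]
joined = concat_blocks(keys, rows)
print(joined)  # ['abcdef', '', '', 'ghij', '']
```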
The OFFSET function would come in handy here. The formula at the end of this answer is one solution; it works in my worksheet.
Cell Q6 just defines the number of rows downwards that the MATCH-function is checking for the next "HEADER1" value. If "HEADER1" is found, the MATCH-function returns how many rows down-1. If no "HEADER1"-value is found within that range, that value is then the number of rows used.
If the first column also has "HEADER2" and so on, you can add the MID function to both references inside MATCH to limit which part of the string is searched.
I tried to adjust the references properly to fit your sheet, but I may have missed something:
=IF(ISBLANK($B2),"",CONCAT(OFFSET($B2,0,0,IFNA(MATCH(MID($B2,1,6),MID(OFFSET($B2,1,0,$B$1),1,6),0),$B$1),4)))

Excel Sum Index Match Across Multiple columns

I am having significant issues trying to resolve my problem. Essentially I need an excel formula that replicates a SUMIFS function, as it appears that sumifs doesn't work in my scenario. Effectively I need to SUM across a horizontal axis, based on the date & header parameter. I have tried summing index-matches, sumifs, aggregates, summing sumif, summing vlookups & hlookups, and I either get errant values or I get the first value (for example, store A would return 0 for 7/8 & Store G would return -3,291)
=SUMIF($1:$1,B22,INDEX($C$2:$AQ$1977,1,MATCH($A982,$A$2:$A$9977,0)))
=SUMIFS(B2:N2,1:1,B22,A:A,A23)
=SUM(SUMIF($B$1:$N$1,$B23,INDEX($B$2:$N$12,1,MATCH($A23,$A$2:$A$12,0))),SUMIF($B$1:$N$1,$B23,INDEX($B$2:$N$12,2,MATCH($A23,$A$2:$A$12,0))),SUMIF($B$1:$N$1,$B23,INDEX($B$2:$N$23,3,MATCH($A23,$A$2:$A12,0)))).
I'm sure the sum range is what is killing me, but I would ideally like the formula to be dynamic enough to locate and sum the cells via references, in case the input data changes at some point. I am working with thousands of rows, so the sum range is B2:AQ10000.
The formula is on a different sheet from this one, but I entered it here as an example.
What am I missing? Is there a way to do this with Excel?
Use:
=SUMIFS(INDEX($B$2:$N$12,MATCH($A23,$A$2:$A$12,0),0),$B$1:$N$1,B$22)
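In other words: INDEX/MATCH picks the whole row for the date, and SUMIFS then adds only the cells in that row whose header matches. A small Python sketch of the same two steps, with hypothetical names and sample data:

```python
def sum_row_by_header(headers, row_keys, table, row_key, header):
    """Pick the table row whose key matches row_key, then add the cells
    whose column header equals header.
    """
    row = table[row_keys.index(row_key)]  # INDEX(..., MATCH(...), 0)
    return sum(v for h, v in zip(headers, row) if h == header)  # SUMIFS

# Hypothetical data: repeated store headers, one row per date.
headers = ["A", "B", "A", "G"]
dates = ["7/7", "7/8"]
table = [[10, 20, 30, 40],
         [1, 2, 3, -4]]
total = sum_row_by_header(headers, dates, table, "7/8", "A")
print(total)  # 4
```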

SUMPRODUCT with 5 criteria from another workbook takes too long to run

I'm trying to lookup 5 different criteria from another file. The formula I'm using is below:
=IF(SUMPRODUCT(('[WorkBook]Sheet'!$A:$A=$A9),
('[WorkBook]Sheet'!$H:$H=$P9),
('[WorkBook]Sheet'!$D:$D=S$5),
(('[WorkBook]Sheet'!$E:$E="String1")+('[WorkBook]Sheet'!$E:$E="String2")) )>=1,TRUE,FALSE)
I get the result in the first few cells. However, when I copy and paste (or drag) the formula to the bottom of the table, it takes forever to calculate using 4 processors, and eventually Excel crashes.
Is it possible that there are too many criteria, that they cross-reference between 2 files, and that, on top of that, the nesting in IF makes the formula too heavy to run in multiple cells (about 150k cells)? If so, can anyone suggest a better formula?
That SUMPRODUCT has nothing but booleans making it a COUNTIFS. The OR condition is handled with SUM(COUNTIFS(...)) and a hard-coded string array.
=AND(SUM(COUNTIFS('[WorkBook]Sheet'!$A:$A, $A9,
'[WorkBook]Sheet'!$H:$H, $P9,
'[WorkBook]Sheet'!$D:$D, S$5,
'[WorkBook]Sheet'!$E:$E, {"String1", "String2"})))
COUNTIFS can use full column references without calculation lag penalty while SUMPRODUCT is penalized greatly.
The wrapping AND does nothing more than convert a number to TRUE/FALSE.
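For anyone unsure how the array constant and the AND wrapper interact, here is the same logic as a Python sketch. It is an illustration with made-up rows, not Excel's implementation; `any_match` is a hypothetical name.

```python
def any_match(rows, a, h, d, e_options):
    """True if at least one row satisfies all criteria, where column E
    may equal any of e_options.
    """
    # One COUNTIFS-style count per allowed E value: the array constant
    # {"String1", "String2"} makes COUNTIFS return one count for each.
    counts = [sum(1 for r in rows
                  if r["A"] == a and r["H"] == h
                  and r["D"] == d and r["E"] == e)
              for e in e_options]
    # SUM adds the counts; AND turns any non-zero number into TRUE.
    return bool(sum(counts))

rows = [
    {"A": 1, "H": "x", "D": 5, "E": "String2"},
    {"A": 1, "H": "x", "D": 5, "E": "Other"},
]
found = any_match(rows, 1, "x", 5, ["String1", "String2"])
missing = any_match(rows, 2, "x", 5, ["String1", "String2"])
print(found, missing)  # True False
```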
Here is your original SUMPRODUCT with all ranges cut down to the row containing the last date in column H.
=IF(SUMPRODUCT(('[WorkBook]Sheet'!$A$2:INDEX('[WorkBook]Sheet'!$A:$A, MATCH(1E99, '[WorkBook]Sheet'!$H:$H))=$A9),
('[WorkBook]Sheet'!$H$2:INDEX('[WorkBook]Sheet'!$H:$H, MATCH(1E99, '[WorkBook]Sheet'!$H:$H))=$P9),
('[WorkBook]Sheet'!$D$2:INDEX('[WorkBook]Sheet'!$D:$D, MATCH(1E99, '[WorkBook]Sheet'!$H:$H))=S$5),
(('[WorkBook]Sheet'!$E$2:INDEX('[WorkBook]Sheet'!$E:$E, MATCH(1E99, '[WorkBook]Sheet'!$H:$H))="String1")+
('[WorkBook]Sheet'!$E$2:INDEX('[WorkBook]Sheet'!$E:$E, MATCH(1E99, '[WorkBook]Sheet'!$H:$H))="String2")))>=1, TRUE, FALSE)
Yes, that may look complicated but in fact it does much less work than the full column reference model.
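The MATCH(1E99, ...) part is what trims the ranges: with a number that large, the approximate match lands on the last numeric cell in column H (dates are numbers). A Python sketch of that trimming step, with a hypothetical helper and sample column:

```python
def last_numeric_row(column):
    """1-based position of the last numeric cell, or None if there is none."""
    for pos in range(len(column), 0, -1):
        value = column[pos - 1]
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            return pos
    return None

# Hypothetical column H: a header, three date serials, then blanks.
column_h = ["Date", 44197, 44198, 44199, "", ""]
last = last_numeric_row(column_h)
trimmed = column_h[1:last]  # rows 2..last, like $H$2:INDEX($H:$H, last)
print(last, trimmed)  # 4 [44197, 44198, 44199]
```

SUMPRODUCT then only has to evaluate the trimmed rows instead of every row in the column.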
Referencing across files is something I avoid like the plague. Is there a concrete reason why you can't simply use say a PivotTable in the same workbook as the data exists, and just filter the PivotTable to show what you need to show?
Much much simpler. Much much safer. Much much faster. To see this and other alternatives explained in detail, check out my answer at Optimizing Excel formulas - SUMPRODUCT vs SUMIFS/COUNTIFS
