Excel SUMIFS - define sum_range by dynamically changing column

I would like to write a SUMIFS formula whose sum_range parameter changes dynamically, based on this formula, which returns the column letter: SUBSTITUTE(ADDRESS(1;MATCH("aaa";1:1;0);4);1;"")
In other words, I want to replace B:B in the formula =SUMIFS(B:B;A:A;"abc") with the formula above, but I have not been able to combine the two.
I found one solution here: https://stackoverflow.com/a/25814571/10452645
but I'm not quite satisfied with it. Is there another way to solve this by combining SUMIFS and ADDRESS?

In the above example, you can find the sum of abc from column aaa using one of the three formulas:
=SUMPRODUCT((B2:D13)*(B1:D1="aaa")*(A2:A13="abc"))
or
=SUMIFS(INDEX(B2:D13,,MATCH("aaa",B1:D1,0)),A2:A13,"abc")
or
=SUMIFS(INDIRECT(ADDRESS(2,MATCH("aaa",A1:D1,0))&":"&ADDRESS(13,MATCH("aaa",A1:D1,0))),A2:A13,"abc")
You can replace aaa and abc with cell references so that you can change the SUMIFS criteria dynamically.
Please note, as mentioned by @ScottCraner:
The ADDRESS/INDIRECT approach is volatile (INDIRECT is a volatile function) and should be avoided. INDEX is the quickest, most solid method.
The reason is that a volatile function causes the formula in its cell to recalculate every time Excel recalculates. This happens regardless of whether the precedent data and formulas on which the formula depends have changed, and regardless of whether the formula also contains non-volatile functions. As a result, volatile functions can slow down the calculation of your workbook as it grows more complex over time.
That said, choose the function that suits your preference.
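For readers who find the logic easier to follow outside Excel, here is a rough Python sketch of what the INDEX/MATCH variant does: locate the column by its header, then sum only the rows whose label matches. The data and the name sum_for are made up for illustration and are not taken from the original sheet.

```python
# Hypothetical sample data laid out like the sheet in the answer:
# column A holds the criteria labels, row 1 holds the column headers.
headers = ["aaa", "bbb", "ccc"]          # B1:D1
labels = ["abc", "xyz", "abc", "abc"]    # A2:A5
values = [                               # B2:D5
    [1, 10, 100],
    [2, 20, 200],
    [3, 30, 300],
    [4, 40, 400],
]


def sum_for(header, label):
    """Sum the column named `header` over the rows whose label equals `label`."""
    # Like MATCH("aaa", B1:D1, 0): find the column position by header.
    col = headers.index(header)
    # Like INDEX(B2:D5, , col) passed to SUMIFS: keep rows whose label matches.
    return sum(row[col] for row, lab in zip(values, labels) if lab == label)


print(sum_for("aaa", "abc"))  # 1 + 3 + 4 = 8
print(sum_for("ccc", "abc"))  # 100 + 300 + 400 = 800
```

Changing the header argument plays the role of changing the cell that holds "aaa" in the worksheet.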

Related

Why use INDIRECT in place of direct reference in excel

Why would one use INDIRECT(cell) instead of a direct reference to cell?
Eg, I see a sheet where there are many references
A B C
1 SHEET1 B1 =INDIRECT("'"&A1&"'!"&B1)
2 SHEET1 B2 =INDIRECT("'"&A2&"'!"&B2)
3 SHEET1 B3 =INDIRECT("'"&A3&"'!"&B3)
Why not just
A B C
1 =SHEET1!$B1
2 =SHEET1!$B2
3 =SHEET1!$B3

INDIRECT vs. direct cell reference:
Updating: an INDIRECT reference generally does not update automatically, while a direct reference does (for example, when you add or remove columns).
Arithmetic: with INDIRECT you can use arithmetic to change the row or column, while with a direct reference what you type is what you get. For example, =INDIRECT("A"&3+5) returns the value of cell A8, whereas =A3+5 adds 5 to the value in A3, which is totally different.
If you want to organize your formula references and change them all on the fly, it is easier with INDIRECT (although using the naming feature is easier still). But the real reason you "need" INDIRECT is that there is no other way to change the reference in a formula without retyping it manually.
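The difference between the two examples can be modeled loosely in Python. This is only an analogy: the cells dictionary and the indirect helper are hypothetical stand-ins for a worksheet and for INDIRECT.

```python
# Hypothetical sheet: cell addresses mapped to values.
cells = {"A3": 10, "A8": 99}


def indirect(address):
    """Resolve a reference from text at evaluation time, like INDIRECT."""
    return cells[address]


# =INDIRECT("A"&3+5): the arithmetic runs first, so the text becomes "A8".
via_indirect = indirect("A" + str(3 + 5))

# =A3+5: a direct reference to A3, with 5 added to its value.
via_direct = cells["A3"] + 5

print(via_indirect)  # 99
print(via_direct)    # 15
```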

Use case: programmatically listing and looping over a range of worksheets.
Building sheet references from formulas and cell values is useful for addressing cells at scale.

Build references to many different cells, worksheets or workbooks that follow a logic
The most common use of the INDIRECT function is probably to reference many cells whose references follow a logical rule. For example, assume you have an Excel workbook with hundreds of sheets, one for each day, and the sheet names are the dates. Now you would like to summarize some of the values in those sheets on an overview sheet. In this case, you can type in your starting date and drag it down to your final date (Excel will increment the dates). Using the INDIRECT function, you can then build the references within seconds.
However, keep in mind that INDIRECT is a volatile function, which will slow down your workbook. Further, if you insert rows or columns, the INDIRECT function won't adapt. It gets even worse when you reference external workbooks: since INDIRECT updates its value with every change in the workbook, you will get #REF errors as soon as you close the referenced workbooks.
I personally avoid INDIRECT for such cases (either using VBA or by choosing a different design for my workbooks, so no INDIRECT function is necessary).
Lock a cell reference
If you have a cell reference such as =A10, Excel will always adapt the reference when you insert new rows or columns (for example, if you insert a row above row 10, the reference changes to =A11). You can use the INDIRECT function to always keep the absolute cell reference: =INDIRECT("A10").
With named ranges
INDIRECT can be handy with named references. Have a look at the example where you have three named ranges:
NorthAmerica: B2:B5
Europe: C2:C5
Asia: D2:D5
You can now combine the INDIRECT function with many other Excel functions like SUM, MIN, MAX and so on. In the example, the drop down selection in G1 is referenced using INDIRECT to perform the calculation for the selected range.
Dynamic dropdowns
Dynamic dropdowns are a similar case where the INDIRECT function is useful. In this example there are two named ranges:
Fruits: A2:A4
Vegetables: B2:B4
In cell D3, there is a dropdown where you can select “Fruits” or “Vegetables”. In E3, we have a dynamic drop down with the source =INDIRECT($D$3). If you choose “Fruits” in D3, you will have a list with the fruits in the drop down.
So, there are definitely some things where INDIRECT might be an easy solution. But as I said, it is a volatile function that locks the cell reference. In most cases you can find different, better solutions. The main reason people use it is probably the lack of knowledge of better alternatives. In addition, I assume that the average Excel user is not aware of possible problems you might run into when using INDIRECT.

INDIRECT is very useful with Tables. For example, I create a table tblFindings with 10 rows, then set the dropdown list source to =INDIRECT("tblFindings"). When I add 5 rows to the table, the dropdown list updates automatically.

Difference between SUMIF(Condition, Values), SUMPROD(Condition, Values) and SUM(Condition*Values)

Let's say I have an Excel table with 2 columns: dates in cells A1 to A10 and values in B1 to B10.
I want to sum all the values for June dates. I have 3 possibilities:
{=SUM((MONTH(A1:A10)=6)*(B1:B10))}
or
=SUMPRODUCT((MONTH(A1:A10)=6)+0;B1:B10)
or
=SUMIFS(B1:B10;A1:A10;">="&DATE(2016;6;1);A1:A10;"<="&DATE(2016;6;30))
What is the best formula to use? In which case? And why?
I have found answers regarding the last two formulas, but nothing regarding the first one.

The first formula would give you an error if B1:B10 contains any text values, the second one won't (it will just ignore text in B1:B10). You can change the first one to allow text in B1:B10 by switching to this syntax:
=SUM(IF(MONTH(A1:A10)=6;B1:B10))
Both of the first two formulas will also give you an error if A1:A10 contains text. SUMIFS won't, and it can also handle error values in those ranges (as long as they are not in the sum range on a row that satisfies the conditions).
For those reasons SUMIFS is better, and faster, as Scott says.
Disadvantages of SUMIFS:
It can't work with closed workbooks.
It is less flexible in that it can't accept arrays, so you can't apply functions to the ranges.
In your specific example, SUMIFS only sums amounts for June 2016. The first two formulas will sum for any June date in any year, so that flexibility may suit you better in some circumstances.

The first and second (SUM and SUMPRODUCT) are array-type formulas: they iterate through the whole range, which is slow, and too many of them will slow down calculation and may even crash Excel.
The third is not an array-type formula and has been optimized, so it can use full-column references without hurting speed.
Whenever SUMIFS can be used, it is recommended to use it.
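The behavioral difference between the month test and the date-range test is easy to see in a small Python sketch. The rows below are invented sample data, not values from the question.

```python
from datetime import date

# Hypothetical rows: (date in column A, value in column B).
rows = [
    (date(2015, 6, 10), 5),
    (date(2016, 5, 31), 7),
    (date(2016, 6, 1), 10),
    (date(2016, 6, 30), 20),
    (date(2016, 7, 1), 40),
]

# Like SUM((MONTH(A1:A10)=6)*(B1:B10)) or the SUMPRODUCT variant:
# test only the month, so June of every year counts.
any_june = sum(value for day, value in rows if day.month == 6)

# Like the SUMIFS with ">=" and "<=" date criteria:
# only dates from 2016-06-01 through 2016-06-30 count.
june_2016 = sum(
    value for day, value in rows
    if date(2016, 6, 1) <= day <= date(2016, 6, 30)
)

print(any_june)   # 5 + 10 + 20 = 35
print(june_2016)  # 10 + 20 = 30
```

The sketch says nothing about speed. It only illustrates why the formulas can return different totals on the same data.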

How do I use an array formula over a whole column or varying range?

I have a spreadsheet that I'm importing data into. I need to find the value within a column that is closest to zero. The column contains both positive and negative values, and the value closest to zero will be used in another formula. I've found an answer using an array formula, but it will only work for a fixed range (e.g. K2:K10), and the number of records imported into my sheet will vary each time I use it.
Here's what I have so far:
=INDEX(K:K,MATCH(MIN(ABS(K:K)),ABS(K:K),0))
Is there a way to apply an array formula over an entire column and just include non-zero cells other than the column title? Or possibly just cells with numerical values? Or is it possible to control the range that it applies to?

We can dynamically find the last cell in the range by using another INDEX/MATCH formula that is not an array:
=INDEX(K:K,MATCH(1E+99,K:K))
This will find the last cell that has a number in column K.
So we now use this as the last cell in the range:
=INDEX($K$2:INDEX(K:K,MATCH(1E+99,K:K)),MATCH(MIN(ABS($K$2:INDEX(K:K,MATCH(1E+99,K:K)))),ABS($K$2:INDEX(K:K,MATCH(1E+99,K:K))),0))
And now the formula is dynamic.
This formula is still an array formula and must be confirmed with Ctrl-Shift-Enter when exiting edit mode. If done correctly, Excel will put {} around the formula.
If, as you pointed out, there is a chance of row 2 being deleted, then all the K2 references will be deleted with it.
In place of K2 we can use INDEX(K:K,2). It will always look at the second row and will not error when row 2 is deleted. So use this instead:
=INDEX(INDEX(K:K,2):INDEX(K:K,MATCH(1E+99,K:K)),MATCH(MIN(ABS(INDEX(K:K,2):INDEX(K:K,MATCH(1E+99,K:K)))),ABS(INDEX(K:K,2):INDEX(K:K,MATCH(1E+99,K:K))),0))
There is nothing wrong with the OFFSET() function in small amounts, but it is a volatile function, which means that it will recalculate EVERY TIME Excel calculates, whether or not the data it depends on has changed.
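As a cross-check of what the array formula computes, the same lookup can be sketched in Python. The column contents below are hypothetical, and the sketch assumes there are no blank cells between the header and the last number.

```python
# Hypothetical contents of column K: a header, numbers, then blank cells.
column_k = ["Delta", 4.2, -0.7, 3.0, 0.9, -5.5, None, None]

# Like MATCH(1E+99, K:K): position of the last cell that holds a number.
last_numeric = max(
    i for i, cell in enumerate(column_k) if isinstance(cell, (int, float))
)

# Like INDEX(K:K,2):INDEX(K:K, last): skip the header, stop at the last number.
data = column_k[1:last_numeric + 1]

# Like INDEX(range, MATCH(MIN(ABS(range)), ABS(range), 0)):
# the first value with the smallest absolute value.
closest_to_zero = min(data, key=abs)

print(closest_to_zero)  # -0.7
```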

For the benefit of anyone reading this post, I ran into another issue and found a way around it. Scott Craner's answer above worked well until I ran a macro that I had for that sheet, which would delete certain rows. If row 2 got deleted, the formula would give a #REF error, because it was trying to call $K$2.
My solution was to replace $K$2 with
OFFSET(K1,1,0)
Therefore, the complete formula would be:
=INDEX(OFFSET(K1,1,0):INDEX(K:K,MATCH(1E+99,K:K)),MATCH(MIN(ABS(OFFSET(K1,1,0):INDEX(K:K,MATCH(1E+99,K:K)))),ABS(OFFSET(K1,1,0):INDEX(K:K,MATCH(1E+99,K:K))),0))
And as Scott mentioned, remember to hit Ctrl-Shift-Enter to execute the array formula.

Using a SUMIFS formula to select the same column in multiple tables in excel

I am trying to shorten my formula a little and having a hard time figuring out the proper method to do so. I am trying to select certain cells in multiple tables to produce a single total. My code is this:
=SUMIFS(TransactionsChase[INFLOW],TransactionsChase[DATE],">="&Dec,TransactionsChase[DATE],"<"&DecPayChk2,TransactionsChase[CATEGORY],"<>"&"From*")
+SUMIFS(TransactionsPatelcoChecking[INFLOW],TransactionsPatelcoChecking[DATE],">="&Dec,TransactionsPatelcoChecking[DATE],"<"&DecPayChk2,TransactionsPatelcoChecking[CATEGORY],"<>"&"From*")
+SUMIFS(TransactionsPatelcoMM[INFLOW],TransactionsPatelcoMM[DATE],">="&Dec,TransactionsPatelcoMM[DATE],"<"&DecPayChk2,TransactionsPatelcoMM[CATEGORY],"<>"&"From*")
+SUMIFS(TransactionsCash[INFLOW],TransactionsCash[DATE],">="&Dec,TransactionsCash[DATE],"<"&DecPayChk2,TransactionsCash[CATEGORY],"<>"&"From*")
I would love to simplify it if possible into one sumifs statement. Any ideas?

If you apply the four table names within a SUMIFS function using a volatile INDIRECT¹ function, then wrap the whole thing in a SUM function and finalize it as an array² formula, the formula can be shortened visually but not calculation-wise.
In the following image, your original formula is in J2. The revised formula is J3 as,
=SUM(SUMIFS(INDIRECT(N$2:N$5&"[INFLOW]"),
INDIRECT(N$2:N$5&"[DATE]"), ">="&Dec,
INDIRECT(N$2:N$5&"[DATE]"), "<"&DecPayChk2,
INDIRECT(N$2:N$5&"[CATEGORY]"), "<>From*"))
Results should be similar to the following. Note the minor improvement made to the "<>From*" criterion. The table names could also be written out longhand instead of using N$2:N$5, as
{"TransactionsChase", "TransactionsPatelcoChecking", "TransactionsPatelcoMM", "TransactionsCash"}
As you can see from the sample image above, this formula will survive tables of varying row length. The only question that remains would be 'Is it worth it?'
¹ Volatile functions recalculate whenever anything in the entire workbook changes, not just when something that affects their outcome changes. Examples of volatile functions are INDIRECT, OFFSET, TODAY, NOW, RAND and RANDBETWEEN. Some sub-functions of the CELL and INFO worksheet functions will make them volatile as well.
² Array formulas need to be finalized with Ctrl+Shift+Enter↵. Once entered into the first cell correctly, they can be filled or copied down or right just like any other formula. See Guidelines and examples of array formulas for more information.
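The idea behind the shortened formula (run the same conditional sum over each table, then add the results) can be sketched in Python. Everything here is illustrative: the table contents are invented, only two of the four tables are shown, and start and end are assumed values for the named cells Dec and DecPayChk2.

```python
from datetime import date

# Hypothetical stand-ins for two of the tables; each row mirrors the
# DATE, CATEGORY and INFLOW columns used in the formula.
tables = {
    "TransactionsChase": [
        {"DATE": date(2016, 12, 2), "CATEGORY": "Salary", "INFLOW": 100.0},
        {"DATE": date(2016, 12, 5), "CATEGORY": "From Savings", "INFLOW": 50.0},
    ],
    "TransactionsCash": [
        {"DATE": date(2016, 12, 9), "CATEGORY": "Gift", "INFLOW": 20.0},
        {"DATE": date(2016, 12, 20), "CATEGORY": "Refund", "INFLOW": 5.0},
    ],
}

# Assumed values for the named cells Dec and DecPayChk2.
start, end = date(2016, 12, 1), date(2016, 12, 15)


def inflow(rows):
    """One SUMIFS: date in [start, end) and category not matching "From*"."""
    return sum(
        row["INFLOW"] for row in rows
        if start <= row["DATE"] < end
        # SUMIFS text criteria ignore case, so compare in lower case.
        and not row["CATEGORY"].lower().startswith("from")
    )


# SUM over one SUMIFS per table, as the array formula does.
total = sum(inflow(rows) for rows in tables.values())
print(total)  # 100.0 + 20.0 = 120.0
```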

Reference a range of cells and keep the order when adding new rows

I am referencing a range of cells in a first sheet, to build a second sheet. Often I add rows in the middle of the first sheet. In the second sheet that is referencing the first, there is a skip in the cell number where I have added a row.
SHEET 1: Contains my main list, that is updated
A new row is added (A3) to SHEET 1:
SHEET 2: references Sheet 1 and pulls through the rows
However, you can see that row 3, which should contain the added row 'Rachael', has instead shifted down to Sheet1!A4 and missed A3 out altogether.
How can I fix this?

Try using this formula in sheet2:
(add it to Sheet2, A2, then copy it around.)
=offset(sheet1!$A$1,row(a2)-1,column(a2)-1,1,1)

Try to avoid formula volatility, which means that a formula recalculates on any change to the sheet even if its precedents have not changed.
Having numerous volatile formulas in a worksheet can cause performance issues.
Any formulas that utilize the OFFSET() function or the INDIRECT() function automatically become volatile. But of these two functions, INDIRECT is much worse than OFFSET. Both are volatile, but OFFSET is extremely fast, while INDIRECT is extremely slow.
DO NOT USE INDIRECT().
The best alternative is without question the INDEX() function. It is even faster than the OFFSET function and INDEX is not volatile.
So use the following formula in cell A2 of the 2nd sheet:
=INDEX(Sheet1!$1:$1048576,ROW(),COLUMN())
...and then copy as needed.

To directly answer your question - you can achieve this with the INDIRECT function. INDIRECT allows you to dynamically reference a cell through a formula, which doesn't necessarily follow Excel's "tracking" rules. Keep in mind that normally, Excel gives each cell a 'unique id', and when you initially reference any cell, the internal logic points to that specific 'unique id', and the visible reference points to the 'A1' style reference to that cell. This is done so you can insert rows and columns without unintentionally losing all of your references.
It is generally not a good idea to do what I'm about to show, because you lose the inherent benefit that direct references provide (in general: easier to maintain). However, to show you how it would work, see below [this assumes you want one header row, and that the column on your results sheet should match the column on your raw data sheet]:
=INDIRECT("Sheet1!R"&ROW()+1&"C"&COLUMN(),FALSE)
