How to sum data from multiple columns with specific index - Excel

I have a data set and an index list like the one below.
I'd like to sum columns 2, 5 and 7 (A1, A4 and A6 in the data set). Note that NA values are also included. How should I write a formula for this type of calculation? Thanks

Assuming there are no Excel version constraints (none are indicated by the tags on the question), you can use the following formula in cell G1 to sum by column. If you need to sum by row instead, replace BYCOL with BYROW:
=BYCOL(CHOOSECOLS(B2:D3,TOCOL(E2:E4,2)), LAMBDA(x, SUM(TOCOL(x,2))))
Here is a sample output:
The formula handles #N/A values both in the column selection and in the data to sum: the second argument of TOCOL (2) tells it to ignore errors.
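For readers who want to check the logic outside Excel, here is a minimal Python sketch of what the formula computes: pick columns by 1-based index, drop invalid indices, and sum each chosen column while ignoring error cells. The function name sum_selected_columns, the use of None to stand in for #N/A, and the sample data are assumptions made for illustration; they do not come from the original post.

```python
def sum_selected_columns(rows, indices):
    """Sum each selected column (1-based index), skipping None cells and None indices."""
    # Drop invalid indices, similar in spirit to TOCOL(E2:E4, 2).
    valid = [i for i in indices if i is not None]
    sums = []
    for i in valid:
        # Pick one column, similar in spirit to CHOOSECOLS + BYCOL.
        column = [row[i - 1] for row in rows]
        # Ignore error cells, similar in spirit to SUM(TOCOL(x, 2)).
        sums.append(sum(v for v in column if v is not None))
    return sums


# Hypothetical data: two rows, three columns; None stands in for #N/A.
data = [
    [1, 2, None],
    [4, None, 6],
]
print(sum_selected_columns(data, [1, None, 3]))  # [5, 6]
```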

Related

Using INDEX MATCH function to multiply and sum rows

I've the following table:
In cell A7 I define the index from column 1 that I want to use for my computation, in this case A. For every row with index A, I want to multiply that row's value in column 2 by its value in column 3. In this example there are two rows with index A, so I want to do the computation for both rows and then add the results: 5 * 3 + 9 * 7 = 78.
To achieve this, I first tried to write a formula that sums all values in column 2 that match a given index. The index is A, so the output should be 5 + 9 = 14. However, my formula only finds the first match (row 2) and returns that row's value from column 2, which is 5. This is my formula for cell B7:
=SUM(INDEX(B2:B5;MATCH(A7;A2:A5;0)))
Even if I solve this, I still won't have what I actually want, but I think it's a start. How do I get what I initially wanted, with the outcome equal to 78?
Type in this formula:
=SUM((A2:A5=A7)*(B2:B5)*(C2:C5))
Then press Ctrl+Shift+Enter.
This converts it into an array formula, which you can identify by the braces around it:
{=SUM((A2:A5=A7)*(B2:B5)*(C2:C5))}
Using BYROW()
• Formula used in cell B7
=SUM(BYROW(FILTER(Table12[[Column 2]:[Column 3]],A7=Table12[Column 1]),LAMBDA(m,PRODUCT(m))))
If not using Structured References then
=SUM(BYROW(FILTER(B2:C5,A7=A2:A5),LAMBDA(m,PRODUCT(m))))
Or, use the incredible & versatile SUMPRODUCT() Function
• Formula used in cell C7
=SUMPRODUCT((A7=A2:A5)*(B2:B5)*(C2:C5))
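As a cross-check of what the array formula and SUMPRODUCT compute, here is a small Python sketch: each row contributes column 2 times column 3 only when its index matches. The function name sum_product_where and the two "B" rows are hypothetical; only the two "A" rows (5, 3) and (9, 7) are taken from the question.

```python
def sum_product_where(keys, xs, ys, wanted):
    """Sum x * y over the rows whose key equals `wanted`."""
    return sum(x * y for k, x, y in zip(keys, xs, ys) if k == wanted)


# Only the "A" rows are from the question; the "B" rows are made up.
keys = ["A", "B", "A", "B"]
col2 = [5, 2, 9, 4]
col3 = [3, 6, 7, 1]
print(sum_product_where(keys, col2, col3, "A"))  # 78
```

The comparison plays the role of (A2:A5=A7): rows that do not match simply contribute nothing to the sum.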

Best way to handle copy down of formulas adjacent to filter arrays in excel

edited to add an example table
I use Excel's FILTER and UNIQUE functions to retrieve arrays from a source table. The first array is typically a set of dates, followed by data. Next to the retrieved arrays, I have columns with formulas.
Once the source table grows, the FILTER function is always up to date, adding new rows at the end, but the columns with formulas are not: you need to copy the formulas down. Also, you cannot turn a range into a table if its columns contain spill functions like FILTER or UNIQUE.
What would be the recommended way to handle this? Is there a better way than making a macro that copies down the formulas?
As an example, the source table has a growing number of dates and some categories with values:
date      category  value
1.1.2022  A         1.2
1.1.2022  A         0.5
1.1.2022  B         0.2
1.1.2022  B         2.2
2.1.2022  A         0.1
2.1.2022  A         0.3
2.1.2022  B         1.2
...
Now in the summary table, I use unique function to retrieve the dates in the first column. This spills down automatically - so far so good. In the second column (category A), I use sum(filter(..)) function to sum all values in the source table where category = A and date = the date on the same row in the first column:
unique date  cat A  cat B
1.1.2022     1.7    2.4
2.1.2022     0.4    1.2
This is problematic since the filter formula looks like this (assuming the above table starts at cell A1):
=sum(filter(source[value],(source[category]=B$1)*isnumber(match(source[date],$A2))))
The spill-range operator (#) did not seem to work in the last parameter ($A2); e.g., replacing $A2 with offset($A2#,0,0,1) worked only on the first row.
I often use OFFSET in combination with the spill-range syntax that @Rory cited:
For example, if your FILTER/UNIQUE formula in cell A2 spills to columns A and B, use OFFSET(A2#,0,1,,1) in a function that acts on the values in column B only. The result will be a spill range for each row of your original spill range.
Of course, you can offset rows that way too. For example, to calculate incremental changes between rows, subtract the prior row with =OFFSET(A2#,0,1,,1)-OFFSET(A2#,-1,1,,1), so long as the value in B1 does not cause an error.
Additionally, Office365 versions have BYROW and BYCOL that you can use to act on each row or column of a spill range. For example, to find the max value of each row, =BYROW(A2#,LAMBDA(r, MAX(r))).
Thanks @pdtcaskey
I had a similar challenge and your mention of BYROW and LAMBDA did the trick for me!
For the above scenario, let's assume the source table is an Excel Table with the elusive name Source and columns Date, Category, Value, situated at A1.
For the summary, assuming it's in a separate sheet and also starts at A1, you can do this:
Put the headers in row 1
In A2 put: =SORT(UNIQUE(Source[Date]))
In B2 put: =BYROW(A2#;LAMBDA(r;SUM(FILTER(Source[Value]; (Source[Category]="A")*(Source[Date]=r)))))
In C2 put: =BYROW(A2#;LAMBDA(r;SUM(FILTER(Source[Value];(Source[Category]="B")*(Source[Date]=r)))))
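To sanity-check the numbers those formulas should produce, here is a minimal Python sketch that builds the same summary from the example source table. The function name summarize and the choice to keep dates in order of first appearance (rather than sorting them) are assumptions for illustration.

```python
from collections import defaultdict

# The example source table from the question: (date, category, value).
source = [
    ("1.1.2022", "A", 1.2), ("1.1.2022", "A", 0.5),
    ("1.1.2022", "B", 0.2), ("1.1.2022", "B", 2.2),
    ("2.1.2022", "A", 0.1), ("2.1.2022", "A", 0.3),
    ("2.1.2022", "B", 1.2),
]


def summarize(rows, categories=("A", "B")):
    """Return one (date, total per category...) tuple per unique date."""
    totals = defaultdict(float)
    dates = []  # unique dates, in order of first appearance
    for date, category, value in rows:
        if date not in dates:
            dates.append(date)
        totals[(date, category)] += value
    # Round to hide binary floating-point noise in the totals.
    return [(d, *[round(totals[(d, c)], 10) for c in categories]) for d in dates]


for row in summarize(source):
    print(row)
```

Each output row corresponds to one spilled row of the summary: the unique date, then the category A and category B totals.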

Sum column based on conditions for subsums

So I have a table which basically looks as follows:
Criterion Value
1 -5
1 1
2 5
2 5
3 2
3 -1
I want to sum the values in column B based on the criteria in column A, but only if the sum for an individual criterion is not negative. So for example, if I ask for the sum of all values where the criterion is between 1 and 3, the result should be 11 (the values for criterion 1 are not included in the sum because they add up to a negative number).
My first idea was to add a third column with a sumif([criterion];[@criterion];[value]) and then use a sumifs function that checks whether that third column is negative. However, my table has 100k+ rows, and with that many sumif functions it becomes intolerably slow.
I know I could create a pivot table to the same effect, but that has two drawbacks: I would have to create a separate sheet, which would add complexity, and my table is frequently updated which means I would have to manually update that pivot table every time to allow for downstream calculations. NBD and I could do that as a last resort, but I wonder whether there isn't a more elegant way to solve this problem.
I would want to avoid VBA to avoid complexity (the sheet will be used by other persons).
Thank you
This can easily be done using UNIQUE() and the two forms of SUMIF() as follows:
1. First collect all the criteria with =UNIQUE(A2:A7) -- Assuming your data are in columns A and B starting from row 2, this goes in cell C2, with "Criteria" in C1
2. Compute the subtotals for all criteria using =SUMIF($A$2:$A$7, C2, $B$2:$B$7) -- This goes in cell D2 and extends down as far as the criteria do, with "Partials" in cell D1
3. Sum all the subtotals from step 2 that are positive with =SUMIF(D2:D7, ">0") in cell E2
If you have a lot of data, I suggest using whole-column references to avoid absolute references and the need to adjust the formulas as the amount of data changes:
The first formula becomes =UNIQUE(A:A) -- Don't worry about the heading being included (strings and empty cells are not summed)
For the second formula use =SUMIF(A:A, C2, B:B)
Use =SUMIF(D:D, ">0") for the last step
This should be reasonably fast, using just as many extra cells as the number of distinct criteria (multiplied by 2).
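The same two-stage idea (subtotal per criterion, then add up only the positive subtotals) can be sketched in Python to confirm the expected result of 11 for the question's table. The function name sum_positive_subtotals is made up for this illustration.

```python
from collections import defaultdict


def sum_positive_subtotals(pairs):
    """Subtotal values per criterion, then sum only the positive subtotals."""
    subtotals = defaultdict(int)
    for criterion, value in pairs:
        subtotals[criterion] += value  # one pass over the data
    return sum(s for s in subtotals.values() if s > 0)


# The table from the question: (criterion, value).
table = [(1, -5), (1, 1), (2, 5), (2, 5), (3, 2), (3, -1)]
print(sum_positive_subtotals(table))  # 11
```

Criterion 1 sums to -4 and is dropped; criteria 2 and 3 contribute 10 and 1.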

Excel sheet dynamic summing condition based

Say I have the following numbers in successive rows of column B: 1, 24, 23, 12, 15, 17. How do I get Excel to add cells only up to the point where the sum equals a predefined number (say 25), and return the row number at which this condition is satisfied?
In the example above, it should add B1 and B2 (1 and 24), whose sum equals the predefined number (25), and return row 2 as the result.
The challenge is that, unlike with SUMIF and similar functions, I cannot prescribe a fixed range such as B1:B6. Instead of B6 it should be some cell Bx, where x (2 in this case) is decided on the fly. Does that make sense?
Thanks in advance.
You cannot do it with a single formula, but you can do it by adding a column with running totals.
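The running-total idea can be illustrated with a short Python sketch: accumulate the values row by row and report the first row whose running total equals the target. This is only a model of the logic, not the worksheet layout from the answer; the function name first_row_reaching is hypothetical, and rows are numbered from 1 as in the question.

```python
from itertools import accumulate


def first_row_reaching(values, target):
    """Return the 1-based row where the running total first equals target, else None."""
    for row, total in enumerate(accumulate(values), start=1):
        if total == target:
            return row
    return None


# Values from the question (column B, rows 1 to 6) and the target 25.
print(first_row_reaching([1, 24, 23, 12, 15, 17], 25))  # 2
```

In a worksheet, the running totals would live in a helper column and a lookup on that column would return the row.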

Create a dynamic 'if' statement in Excel without VBA

* Updated *
I have a rather large excel data set that I'm trying to summarise using up to 3 dimensions: region, sector, industry.
Any combination of these dimensions can be set or left blank and I need to create a formula that accommodates this WITHOUT using VBA.
Within the data I've set up named ranges to refer to these dimensions.
I'm using an array formula but I'd like to dynamically create a string which is then used as the boolean argument in the array formula.
For instance if:
A1 = "Hong Kong" (region)
B1 = <blank> (sector)
C1 = <blank> (industry)
I create a dynamic string in D1 such that
D1 = (region="Hong Kong")
I then want to use the string in D1 to create an array formula
E1 = {counta(if(D1,employees))}
However, if the user includes a sector such that:
A2 = "Hong Kong" (region)
B2 = "finance" (sector)
C2 = <blank> (industry)
Then I want the string in D2 to update to:
D2 = (region="Hong Kong")*(sector="finance")
Which then automatically updates the value in E2 which still has the same formula.
E2 = {counta(if(D2,employees))}
Is this possible? Alternatively is there any other way of achieving the same outcome, keeping in mind that I need to be able to copy D1 and E1 down into different rows so that different combinations of the dimensions can be viewed simultaneously.
Thanks.
* Updated *
To be clear, the reason I need the values in column D to be dynamic is so that I can create different scenarios in Row 1, Row 2, Row 3 etc. and I need the values in column E of each row to match the criteria set in columns A:C of that row.
There had to be a fairly simple way!
Columns B:D contain the criteria, A is a criterion number and E is the result of applying the DSUM function to the criterion in that row. I've used DSUM as it seems more natural (to me at least) to sum employee numbers. However, DCOUNT can equally well be used. For brevity I've not shown the data I'm using but it is a very trivial data set with just a few rows of test data.
The first set of criteria in row 2 is: Sector takes value of "Man" (manufacturing) whilst Region and Industry are unspecified. The 3rd set of criteria (in row 4) is: the Region is "Fr" (for France) AND the Industry is "Cars". The results in the DSUM column are obtained by applying the set of criteria in the corresponding row. All, some or even none of the cells in a row may contain entries.
The approach used is based on columns G:J, where, with the exception of cells G1 and G2 (which contain the numbers 0 and 1, respectively), everything in these columns has been generated by a formula.
There are twice as many rows in columns G:J as there are sets of criteria listed in B:D and the rows should be taken in pairs. The first pair (rows 1 and 2) provide a criterion table for use in DSUM corresponding to the first set of criteria (the table is cells H1:J2), the second pair in rows 3 and 4 provides a criterion table for the second set of criteria (cells H3:J4), etc. (Ignore the 11th row - I copied too many rows downwards in the screenshot!)
Column G has a fairly obvious pattern and can be generated by applying a simple =IF() function in cell G3 which references the starting pair in G1 and G2 with the formula in G3 then copied downwards.
The cells in columns H:J reference the appropriate cells of the set of all criteria (B1:D6 in the screenshot) using the INDEX function (and making use of the value in column G along the way). It is not too difficult to create a single formula that can be copied from H1 to the range H1:J11 by judicious use of mixed relative and absolute addressing and an IF or two. Note that references to an empty cell in B2:D6 will generate a value of 0 in the corresponding cell in H:J, so the construct IF(x=0,"",x) must be used. This makes the formula used in the cells in columns H:J a bit clunky, but not excessively so.
Having generated the 5 criteria tables corresponding to the 5 sets of criteria in B:D, use is made of the OFFSET function to deliver the correct criterion table as the third argument of the DSUM functions in column E.
I chose to base my OFFSETs on cell $H$1, so the top-left cell of the criterion table for the first set of criteria is offset from my base cell by 0 rows and 0 columns. The second criterion table is offset by 2 rows and 0 columns, the third by 4 rows and 0 columns. It should be clear how the number of offset rows and columns to use can be calculated from the corresponding criterion number in column A. It should also be obvious that the final two arguments of the OFFSET function will always be 2 and 3. So my DSUM() functions in column E look something like
=DSUM(myData,"Employees",OFFSET($H$1,row_offset,0,2,3))
where myData is the named range containing the test dataset and row_offset is a very simple formula involving the corresponding value in column A.
It would have been nice to have been able to deliver the third argument of the function without having to adopt the approach of effectively reproducing the sets of criteria in B1:D6 in cells H1:J10. Whilst there are ways to generate the required criterion table arrays formulaically without putting them onto the worksheet, I found that DSUM generated an error when applying such an array as its third argument.
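To make the intended behaviour concrete (a blank criterion means "no restriction" on that dimension), here is a minimal Python sketch. The records, field names and the function sum_matching are all hypothetical, since the test data set is not shown in the post. The sketch also uses exact equality, which may differ from how DSUM matches text criteria.

```python
def sum_matching(records, field, criteria):
    """Sum `field` over the records that match every non-blank criterion."""
    # Blank (None or "") criteria are dropped, so they do not restrict anything.
    active = {key: value for key, value in criteria.items() if value not in (None, "")}
    return sum(
        record[field]
        for record in records
        if all(record[key] == value for key, value in active.items())
    )


# Hypothetical test data, standing in for the named range myData.
records = [
    {"Region": "Fr", "Sector": "Man", "Industry": "Cars", "Employees": 100},
    {"Region": "Fr", "Sector": "Fin", "Industry": "Banks", "Employees": 40},
    {"Region": "UK", "Sector": "Man", "Industry": "Cars", "Employees": 60},
]

# One criteria row per scenario, like rows 2, 3, ... of columns B:D.
print(sum_matching(records, "Employees", {"Region": "", "Sector": "Man", "Industry": ""}))   # 160
print(sum_matching(records, "Employees", {"Region": "Fr", "Sector": "", "Industry": "Cars"}))  # 100
```

Each call corresponds to one criteria row feeding one DSUM result; a row with all cells blank sums every record.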
