Sumif with staggered columns - excel

I would like to add the rows if the value in each cell is less than 100 and if the columns headings were "Tom", "Dick" and "Harry"
So in Row 1, only Dick is less than 100 so the sum is 7.
In Row 2, Tom and Harry are less than 100 individually so the sum is 79.
I have over 30, 250x250 matrices where I would like to get conditional sums of a seven staggered columns. All the combinations of SUMIFs I have tried seem to be giving errors.
I don't just want to add, I would like to be able to do other things, like just count how many times Tom, Dick, and Harry have individually less than 100 or calculate other statistics like mean, median etc.

I would limit the criteria range to each row and then evaluate. This approach should work for COUNTIFS too (take them individual i.e. not aggregate the counts as we do with the SUM).
The formula for SUM would look like this:
=SUMIFS(B2:F2,$B$1:$F$1,"Tom",B2:F2,"<100")+
SUMIFS(B2:F2,$B$1:$F$1,"Dick",B2:F2,"<100")+
SUMIFS(B2:F2,$B$1:$F$1,"Harry",B2:F2,"<100")
So the formula logic is (this logic should work for Countifs too):
SUMIFS(B2:F2,$B$1:$F$1,"Tom",B2:F2,"<100")
SUMIFS(Return value from row, Search for "Tom", given that the return value is <100), we do the same for the other people too.
EDIT:
For the countif same logic tested:
=COUNTIFS($B$1:$F$1,"Tom",B2:F2,"<100")

Related

Excel: SUMPRODUCT with percentages

I use the following formula to calculate the SUMPRODUCT of the values in a column with the condition that there is a match between B2:B6 and A1:A3 (shoutout to user:8162520).
=SUMPRODUCT(($B$1:$B$6=$A$9)*(G2:G6))
Now however I would like to subtract to that result the percentage in C1 and/or D1, and "give it" to one of the other blue numbers, again with the condition that there is a match between C2:C6 or D2:D6 and A7:A9.
So for example in G7 the result for "1" should be 50% of 10 + 75% of 5, because "2" takes 50% of 10(G2) and "3" takes 25% of 5(G5).
The point is to create an overview of how many hours employees (1,2,3) have to spend spend on project B2:B6 for each week (E,F,G), and what happens if other employees take a certain percentage of the workload (C,D). What is the most efficient way to do this?
The numbers in A, B,C, D are reference numbers, they should not be part of the calculation. Maybe it's clearer if I use letters. See also the right results for E7:G7. The outcome must be the sum of the values in each column (workload) minus the percentage in C and D IF there is someone sharing the workload:
So deducting the percentage in C1:D1, presumably where there is no workload the result should be zero?
=SUMPRODUCT(($B$2:$B$6=$A7)*($C$2:$D$6<>"")*(1-$C$1:$D$1)*G$2:G$6)

Summing the result of nested IF statements in one cell

I've got a pretty complex conditional formula that works for each row of a column (sorry, no excel 2016 IFS) and I would like to get the sum of all instances in a range in one formula without having to make all the rows as a middle step.
Done this quite a bit with other stuff, but for some reason I'm stuck on this one.
The formula per cell is:
=IF((IF(IF(AND(ISNUMBER(Test_Samples!B2),Test_Samples!B2>0.1),Reference_Dataset!$H:$H,"")="N",Test_Samples!B2,0)<1),0,IF(IF(IF(AND(ISNUMBER(Test_Samples!B2),Test_Samples!B2>0.1),Reference_Dataset!$H:$H,"")="N",Test_Samples!B2,0)<2,1,IF(IF(IF(AND(ISNUMBER(Test_Samples!B2),Test_Samples!B2>0.1),Reference_Dataset!$H:$H,"")="N",Test_Samples!B2,0)<5,2,IF(IF(IF(AND(ISNUMBER(Test_Samples!B2),Test_Samples!B2>0.1),Reference_Dataset!$H:$H,"")="N",Test_Samples!B2,0)<13,3,IF(IF(IF(AND(ISNUMBER(Test_Samples!B2),Test_Samples!B2>0.1),Reference_Dataset!$H:$H,"")="N",Test_Samples!B2,0)<34,4,IF(IF(IF(AND(ISNUMBER(Test_Samples!B2),Test_Samples!B2>0.1),Reference_Dataset!$H:$H,"")="N",Test_Samples!B2,0)<91,5,IF(IF(IF(AND(ISNUMBER(Test_Samples!B2),Test_Samples!B2>0.1),Reference_Dataset!$H:$H,"")="N",Test_Samples!B2,0)<245,6,IF(IF(IF(AND(ISNUMBER(Test_Samples!B2),Test_Samples!B2>0.1),Reference_Dataset!$H:$H,"")="N",Test_Samples!B2,0)<666,7,))))))))
I would like to transform it to a formula that sums everything from the range B:B (or B2:B499) in one go.
I tried some SUM and SUMIF(S) stuff and changing B2 to B:B. That doesn't seem to work.
Oh, if someone has a tip to reduce the nested IF formula to something more readable, that's welcome as well. The idea of the formula is to transform counts to classes.
The datasets that are referred to look like this:
Test_Samples:
Reference_Dataset:
The If statements make up a classification as follows:
0= 0
1= 1
2= 2-4
3= 5-12
4= 13-33
5= 34-90
6= 91-244
7= 245-665
8= 666+
Here you see a count of 2 in "Test_samples", and it is labelled "N" in "Reference_dataset", so the result classifies it as "2" (to avoid confusion: if the count was 5 it would be labelled "3" according to the class criteria).
Say if there are 5 instances with result "2" in the range B2:B499, the sum should be 10.
Create a lookup table like this for your "result value < X" to summarize the If statements into a single lookup table:
Test_Samples B Value
0
1
2
5
13
34
91
245
666
In this example, i've put that on the same worksheet that the formula is placed, in cells A1:A10 (header in A1, so data values in A2:A10). Then you can simplify your formula and make it reference the range of your data like this:
=SUM(MATCH(IF(ISNUMBER(Test_Samples!$B$2:$B$499)*(Reference_Dataset!$H$2:$H$499="N"),Test_Samples!$B$2:$B$499,0),$A$2:$A$10)-1)
Note that this is an array formula and as such must be confirmed with CtrlShiftEnter (instead of just Enter).

How to create a dynamic formula to find the average of a set of values for a given vector

I am trying to create a formula that gives me the average of the last 12 entries in a given dataset depending on the associated vector.
Let's make an example:
I have in column F2,G2,H2 and I2 dates, Company1, Company2 and Company3 respectively. Then from row3 to row 33 I have months dates starting from May 2016.
Date Company1 Company2 Company3
May-16 2,453,845
Jun-16 13,099,823
Jul-16 14,159,037
Aug-16 38,589,050 8,866,101
Sep-16 63,290,285 13,242,522
Oct-16 94,005,364 14,841,793
Nov-16 123,774,792 7,903,600 41,489,883
Dec-16 93,355,037 12,449,604 69,117,105
Jan-17 47,869,982 13,830,712 83,913,764
Feb-17 77,109,905 10,361,555 68,176,643
The goal is to create a formula that, when I drag it down, correctly calculates the average of the last 12 values for a given company.
So for example i would have, say in table "B2:C5":
Company1 76,856,345
Company2 11,120,859
Company3 65,674,349
And, if a new Company4 is added to the list, then I just have to drag it down the formula, to calculate the average of the last 12 months for Company4.
Until now, I have came up with this formula:
=AVERAGE(LOOKUP(LARGE(IF(ISNUMBER(G:G),ROW(G:G)),ROW(INDIRECT("1:"&MIN(12,COUNT(G:G))))),ROW(G:G),G:G ))
This formula correctly calculates the average of a given column, considering only the last 12 values. The last step would be to come up with a formula that includes all the columns and then calculates the average for the given company.
Thanks!
I recommend that you use a named range to define your data in columns G:I. When a company is added, just modify the named range's specs. I used the name Target. Of course, you can replace it with $G:$I if you feel so inclined but I would rather recommend reducing the number of rows in the range, which is easier to manage when it is named.
Use the formula below to extract the company names from the first row of Target into the first column of your averages table. This is to ensure that the names are spelled identically in both locations.
=INDEX(Target,1,ROW()-2)
The number 2 indicates the number of rows above the row containing the formula. it is copied here from cell M3. There, ROW()-2 creates the number 1, counting sequentially as the formula is copied down.
Now I have the formula below in my cell N3 and copied down.
=SUM(INDEX(Target,0,MATCH($M3,INDEX(Target,1,0),0)))
The formula simply sums up the columns G, H, and I in 3 consecutive rows.
In the final step I inserted the range definition established above, meaning excluding the SUM() function, into your existing formula.
=AVERAGE(LOOKUP(LARGE(IF(ISNUMBER(INDEX(Target,0,MATCH($M3,INDEX(Target,1,0),0))),ROW(INDEX(Target,0,MATCH($M3,INDEX(Target,1,0),0)))),ROW(INDIRECT("1:"&MIN(12,COUNT(INDEX(Target,0,MATCH($M3,INDEX(Target,1,0),0))))))),ROW(INDEX(Target,0,MATCH($M3,INDEX(Target,1,0),0))),INDEX(Target,0,MATCH($M3,INDEX(Target,1,0),0))))

Sum first five instances in Excel

I have an Excel table with three columns. Column A has a list of countries, Column B has a list of cities in each country and Column C has populations of those cities.
The way the table is structured makes it so that Column A will have repeated names of countries - as many times as the number of cities in each country, in column B.
I would like to sum the populations of the first five cities in each country.
I have tried using SUMIF and COUNTIF but haven't managed to do it. How can I sum the populations (in row C) of the first five cities appearing for each country?
Are you trying to sum the population of the first five cities in the list or the population of the top 5 most populous cities for each country (which if the list is sorted by population, these are the same)? If it's the latter you can do it with a one line array formula
=SUM(LARGE(IF(A:A="CountryName",C:C),{1,2,3,4,5}))
(Ctrl+Shift+Enter after setting up the formula)
Where you replace "CountryName" with a reference to the country you want the sum of the top 5 populations for. I think the only issue with this is it will fail if there are less than 5 cities in a country on the list.
Here is a version of the formula that works when there are less than 5 cities but still caps at out at the top 5. Getting an array of 1-n values is kind of an ugly hack in Excel but this seems to work.
=SUM(LARGE(IF(A:A="CountryName",C:C),ROW(OFFSET(A1,,,MIN(COUNTIF(A:A,"CountryName"),5)))))
(Ctrl+Shift+Enter after setting up the formula)
Add a column D. In D2 write the following formula D2=COUNTIF($A$1:$A2,$A2) and drag it down.
Now what this will do is ranking the instances of a particular country.
Now it's a very simple formula for column E, where you will get the sum
E2=SUMIFS($C$2:$C$1000,$A$2:$A$1000,$A2,$D$2:$D$1000,"<=5") and drag it down
Now for each country you will have the sum of population of first 5 cities

Median Selling Price Excel Table

I have a spreadsheet with different products, listing units and retail value sold like the example below
Product Units Value
A 10 100
B 15 80
C 30 560
I'd like to compare the Average Selling Price with the Median Selling price, so I am looking for a quick formula to accurately calculate the median.
The median function requires the entire series, so for Product A above I would need 10 instances of 10 etc. How can I calculate the Median quickly considering the condensed form of my data?
Without writing your own VBA function to do this there are a couple of approaches that can be taken.
The first expands the data from its compressed frequency count format to generate the full set of observations. This can be done manually or formulaically. On the assumption the latter is required, it can be achieved using a few columns.
All the blue cells are formulae.
Column Eis simply the cumulative of column B and F is an adjusted version of this. Column H is just the values 1 to 55, the total number of observations given by cell L2. Column I uses the MATCH() with its final argument as 1 to match each observation in H against the adjusted cumulative in F. Column J uses the INDEX() function to generate the value of the observation. (Observations 1-10 have value 100, 11-25 have value 80 and 26-55 have value 560 in this example). The MEDIAN() function is used in cell M2 with column J as its argument.
This approach can be refined to take account of varying numbers of products and data points through the use of the OFFSET function to control the range arguments of the MATCH(), INDEX() and MEDIAN functions. And, of course, adjacent cells in columns I and J could be combined using a single formula - I've shown them separately for ease of explanation.
The second approach involves sorting the data by value (so in this case the data rows would become Product B in row 2, product A in row 3 and product C left as-is in row 4). It is then a case of identifying the middle observation number (if the number of observations is odd) or the middle pair of observation numbers (if the number of observations is even) and then determining the value(s) corresponding to this/these middle observation(s). In this approach the adjusted cumulative in column F is still used but rather than explicitly calculating the values in column I and J for every observation it can now be restricted to just the middle observation(s).
I think there is no way around compromises. Either using big amounts of helper cells or having the table sorted by the values.
Helper cells:
Formula in F4:AS6:
=IF(COLUMN()<COLUMN($F$4)+$B4,$C4,"end")
Formula in D2:
=MEDIAN(F4:AS6)
Sorted:
Formula in F4 downwards:
=SUM($B$3:B3)+1
Formula in D2:
=SUM(LOOKUP(INT(SUM(B4:B6)/2+{0.5,1}),F4:F6,C4:C6))/2

Resources