Classifying numerical data in column into range - excel

I have a column of numbers which I would like to classify into the following ranges:
0-2%, 2%-3%, 3%-4%, and so on. It seems like a nested conditional would be too large, but I'm not sure how else to specify that a value should be assigned to a range when it falls between that range's two bounds.

The following formula should return what you are after
=LET(
lower,SEQUENCE(50,1,0,2) / 100,
upper,lower + 2/100,
INDEX(TEXT(lower,"0%")&"-"&TEXT(upper,"0%"),MATCH(1,(B2>=lower)*(B2<upper),0))
)
which looks as shown in this screenshot
Note: the sequence covers [0%,100%) in steps of 2%.
Another formula to achieve the same result is
=TEXT(ROUNDDOWN(B2/2,2)*2,"0%")&"-"&TEXT(ROUNDUP(B2/2,2)*2,"0%")
This one is not limited to a fixed range. However, for a value that falls exactly on a range boundary (a multiple of 2%), ROUNDDOWN and ROUNDUP return the same number, so both ends of the label are equal. For example, if the percentage is exactly 6%, the range it returns will be 6%-6%.
So it should be adjusted to
=TEXT(ROUNDDOWN(B2/2,2)*2,"0%")&"-"&TEXT(1/IFERROR(1/(ROUNDUP(B2/2,2)*2-B2),50)+B2,"0%")
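The binning logic behind these formulas can be checked outside Excel. The following is a minimal Python sketch, not part of the original answer; the function name bin_label and the step_pct parameter are assumptions made for illustration. It floors each value to the nearest lower multiple of 2% and labels the half-open range [lower, upper), so a value exactly on a boundary, such as 6%, lands in 6%-8%.

```python
import math


def bin_label(value, step_pct=2):
    """Return a label such as '4%-6%' for the half-open bin [lower, upper).

    value is a fraction (0.05 means 5%); step_pct is the bin width in percent.
    """
    # Convert to percent and round away float noise (e.g. 0.07 * 100).
    pct = round(value * 100, 9)
    lower = math.floor(pct / step_pct) * step_pct
    upper = lower + step_pct
    return f"{lower}%-{upper}%"


labels = [bin_label(v) for v in (0.0, 0.0199, 0.05, 0.06, 0.58)]
print(labels)
```

A boundary value is assigned to the bin that starts at it, which is the behaviour the adjusted formula above restores.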

Related

Excel Calculate a running average on filtered data with criteria

I am trying to calculate a running average on a filtered data table. All other posts online use Sumproduct(Subtotal) on the entire range but do not calculate a row by row running average
I am stuck on how to calculate columns C and D.
If column B (Score) > 0, I want to sum and average it under column C (Average Win)
If column B (Score) < 0, I want to sum and average it under column D (Average Loss)
The table is filterable by column A (Type) and the results should look as follows
Progress so far:
I have figured out how to calculate a Cumulative score based on filtered data. However this does not fully solve my problem. I appreciate any help!
=SUBTOTAL(3,B3)*SUBTOTAL(9,B$3:B3)
SUBTOTAL(3,B3) checks whether the current row is visible; SUBTOTAL(9,B$3:B3) sums the visible values up to the current row.
Final update needed
Jos - Thank you for your detailed explanation on how subtotal() works. I learned a ton through your explanation and will continue to study it. This is my first time being exposed to structured referencing so some of the syntax is a bit confusing to me still
The last formula I need is a running win % column where a Win is defined by score > 0. Please see the picture below
My assumption is that the same formula would work, except that we average a 1 or 0 in each row instead of the [Score] column.
Why can't we modify your prior solution to calculate a running win %?
[...] IF([Score]>0,IF(ROW([Score])<=ROW([@Score]),[Win])))),0)
Where [Win] is a helper column with the outputs 1 for win, 0 for loss.
This could be done by saying
IF([@Score]>0,1,0)
Instead of averaging the actual [@Score], this would average a column of 1's and 0's, with the desired output 0%, 50%, 66%, etc.
I am aware that the solution I provided does not work but I am trying to embrace the correct logic. I still struggle to understand how these structured column references are calculated on a row by row basis.
For example: AVERAGE(IF([Score]>0,[Score]))
How is this calculated on a row by row basis? When A3 does If([Score] > 0,), does this equal If({-10}>0)? When on A4, does If([Score]>0) equal If({-10,20} >0)? Thank you for your patience and help thus far.
I disagree with your result for Average Loss for the last row of your unfiltered table (surely -9.33...?), but try this for Average Win:
=IFERROR(AVERAGE(IF(SUBTOTAL(3,OFFSET(INDEX([Score],1),ROW([Score])-MIN(ROW([Score])),)),IF([Score]>0,IF(ROW([Score])<=ROW([@Score]),[Score])))),0)
Same formula for Average Loss, changing [Score]>0 to [Score]<0.
Explanation:
Using the data you provided and assuming:
The table's top-left cell is in A1
The table is filtered on the Type column for "A"
In order to determine which rows are filtered, we must pass an array of range references - i.e. for each cell within a chosen column of the table - to the SUBTOTAL function. It's a touch unfortunate that such an array of range references can only be generated via a volatile function (INDIRECT or OFFSET), but here, unless we resort to helper columns, we are left with no choice.
INDEX([Score],1)
simply returns a range reference to the first cell within the Score column. When using Excel tables, it's preferable not to write formulas which include a mixture of structured and non-structured referencing, even if that results in slightly longer expressions. So here, for example, we would not reference B2 within the formula.
ROW([Score])-MIN(ROW([Score]))
generates an array of integers from 0 up to one fewer than the number of rows in the table, i.e.
{0;1;2;3;4}
and so
=IFERROR(AVERAGE(IF(SUBTOTAL(3,OFFSET(INDEX([Score],1),ROW([Score])-MIN(ROW([Score])),)),IF([Score]>0,IF(ROW([Score])<=ROW([@Score]),[Score])))),0)
becomes
=IFERROR(AVERAGE(IF(SUBTOTAL(3,OFFSET(B2,{0;1;2;3;4},)),IF([Score]>0,IF(ROW([Score])<=ROW([@Score]),[Score])))),0)
OFFSET then generates an array of range references (though note that you will not be able to 'see' this step within the Evaluate Formula window - rather, an array of #VALUE! errors is displayed):
=IFERROR(AVERAGE(IF(SUBTOTAL(3,{B2;B3;B4;B5;B6}),IF([Score]>0,IF(ROW([Score])<=ROW([@Score]),[Score])))),0)
SUBTOTAL then determines which of these range references is visible after filtering (note that care must be given here to the choice of first parameter), returning 1 for a visible cell and 0 for a hidden one, so that:
SUBTOTAL(3,{B2;B3;B4;B5;B6})
resolves to:
{1;1;1;0;1}
And so we now have:
=IFERROR(AVERAGE(IF({1;1;1;0;1},IF([Score]>0,IF(ROW([Score])<=ROW([@Score]),[Score])))),0)
and the rest is straightforward.
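The remaining steps amount to three nested filters (row is visible, score has the right sign, row is at or above the current row) followed by an average. The following Python sketch models that logic outside Excel; it is not part of the original answer. The function name running_average and the score values are hypothetical, and only the visibility flags {1;1;1;0;1} are taken from the walkthrough above.

```python
def running_average(scores, visible, row, keep):
    """Average the scores in rows 0..row that are visible and satisfy keep().

    Returns 0 when no row qualifies, mirroring the IFERROR(...,0) wrapper.
    """
    picked = [
        s
        for i, (s, v) in enumerate(zip(scores, visible))
        if i <= row and v and keep(s)
    ]
    return sum(picked) / len(picked) if picked else 0


# Hypothetical scores, not the asker's data; flags as in {1;1;1;0;1}.
scores = [-10, 20, 30, -40, 10]
visible = [1, 1, 1, 0, 1]

rows = range(len(scores))
avg_win = [running_average(scores, visible, r, lambda s: s > 0) for r in rows]
avg_loss = [running_average(scores, visible, r, lambda s: s < 0) for r in rows]
print(avg_win)
print(avg_loss)
```

The hidden fourth row contributes nothing to either column, which is the effect of the SUBTOTAL(3,OFFSET(...)) array in the formula.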
So, I would use averageifs().
=averageifs(B:B,B:B,">=1",A:A,"A")
is one example. Note that I have added a condition on Type "A" in the example.

SumIF Using Table/Named Range Instead of Single Cell Criteria

I have 2 sheets in a workbook (Sheet1, Sheet2).
Sheet 2 contains a table (Named Table1) with 5 columns:
Takeaways
Household
Clothing
Fuel
Groceries
On sheet one, I have 2 columns:
Expense Name
Expense Total
Now, what I am trying to do is:
Set the range for the Expense Name (Range 1)
Set the range for the Expense Total (Range 2)
Compare Range 1 with the respective column in the table and only add up the values for matches
For example, in Range 1 (B6:B16):
BP
Caltex
McDonalds
KFC
In Range 2 (C6:C16):
300
400
200
150
Now, all I want to do is add up the values for the Takeaways (McDonalds, KFC) and exclude anything that DOES NOT match the criteria.
So my sum total will be all occurrences of Takeaways - provided they are listed in my table - 350 in this case.
But I cannot seem to get the formula to work.
I used these sources:
https://exceljet.net/excel-functions/excel-sumifs-function
Selecting a Specific Column of a Named Range for the SUMIF Function
and ended up with this formula:
=SUMIF($B$6:$B$16;Table1[Takeaways];C6:C16)
This source:
https://excelchamps.com/blog/sumif-sumifs-or-logic/
and ended up with this formula:
=SUM(SUMIFS(C6:C16;B6:B16;Table1[Takeaways]))
Both formulae return 0.
BUT, with BOTH of them, if I change Table1[Takeaways] to "McDonalds", then it correctly identifies every occurrence of the word "McDonalds" in Range 1.
EDIT:
I have updated the formulae above to match the images below.
This is the table that contains the references:
This table contains the data:
Formula:
Cell C4 (Next to Takeaways): =SUMIF($B$6:B$16;Table1[Takeaways];C6:C16)
Cell C5 (Next to Fuel): =SUM(SUMIFS(C6:C16;B6:B16;Table1[Fuel]))
It appears that ONLY BP is being detected in the formula.
This is an output table when I use the formulae with a single criterion value and not a table column or range:
Formula:
Cell F4 (Next to BP): =SUMIF($B$6:B$16;"BP";C6:C16)
Cell F5 (Next to Caltex): =SUM(SUMIFS(C6:C16;B6:B16;"Caltex"))
Cell F6 (Next to McDonalds): =SUMIF($B$6:B$16;"McDonalds";C6:C16)
Cell F7 (Next to KFC): =SUM(SUMIFS(C6:C16;B6:B16;"KFC"))
If I understand correctly what you're trying to achieve, I think your setup is not right conceptually.
It looks like you're trying to track expenses, and each expense (or payee) is allocated to a category ("Takeaways", "Household" etc.). From a relational-model point of view, your second table (which defines the category for each expense/payee) should only have two columns (or variables): Expense Name and Expense Category.
The table you set up ('Sheet 2') uses the categories (i.e., possible values) as different columns (i.e., variables). But there's only one variable, namely the "Expense Category", and the categories themselves are its possible values.
If you set it up like that, the problem changes: you can add a dependent column to your first table that shows the category for each payee (or "Expense Name"), using a VLOOKUP() from the second table.
You can then sum the expenses for all payees matching that category.
Note: I've created the illustration using LibreOffice Calc, so there might be some small differences, but the logic is the same.
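To make the two-column layout concrete, here is a minimal Python sketch of the same idea, not taken from the original answer: a lookup from payee to category (the role VLOOKUP() plays), followed by a sum per category. The payee names and amounts come from the question; assigning BP and Caltex to "Fuel" and the names category_of and totals_by_category are assumptions for illustration.

```python
from collections import defaultdict

# Lookup table with two columns: Expense Name -> Expense Category.
# The Fuel assignments are an assumption made for this sketch.
category_of = {
    "McDonalds": "Takeaways",
    "KFC": "Takeaways",
    "BP": "Fuel",
    "Caltex": "Fuel",
}

# Expense Name / Expense Total pairs from the question.
expenses = [("BP", 300), ("Caltex", 400), ("McDonalds", 200), ("KFC", 150)]


def totals_by_category(expenses, category_of):
    """Look up each payee's category, then add up the amounts per category."""
    totals = defaultdict(int)
    for name, amount in expenses:
        category = category_of.get(name, "Uncategorised")  # the lookup step
        totals[category] += amount
    return dict(totals)


totals = totals_by_category(expenses, category_of)
print(totals)
```

With this layout, adding a new payee means adding one row to the lookup table rather than editing a formula.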
Without seeing the data in L and K I can't give you a full answer - but likely it's to do with the way you're pulling your Array
Try something similar to this
=SUMPRODUCT(SUMIFS($L$11:$L$43,$K$11:$K$43,CHOOSE({1,2},Takeaways,"anything else you wanted to sum")))
Remember SUMIFS is for multiple criteria, so if you're only calculating one, you'll need =SUMPRODUCT(SUMIF(
The way the above works is with vertical vectors only, but changing your named ranges so the table of 2 columns is 2 named ranges instead should be okay - unless it's part of your requirements
Table 2 would become expense_Name and expense_Total etc
I was about to close this as a duplicate of my own question here but there is a bit of a difference in using a named range I think. However the logic behind this follows more or less the same approach.
Working further on my partial solution below I derived the following formula:
=SUMPRODUCT(COUNTIF(Table1[Takeaways];Range1)*Range2)
The COUNTIF() part counts the number of occurrences of each cell value from Range1 in your table column. Therefore make sure there are no duplicates in your table. If the value is present in the table the result of COUNTIF() will be 1, otherwise it will be 0. This way we create a matrix of 1's and 0's. Multiplying it by Range2 inside SUMPRODUCT() forces Excel to perform the calculation as an array operation and return the correct result.
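The effect of the COUNTIF() multiplication, including why duplicates in the table matter, can be seen in a small Python model. This is a sketch under assumptions, not the answer's own code: sum_matching is a hypothetical name, and Python's list.count compares case-sensitively whereas COUNTIF() does not.

```python
def sum_matching(names, amounts, allowed):
    """Multiply each amount by how often its name occurs in allowed, then sum.

    This mirrors SUMPRODUCT(COUNTIF(allowed; names) * amounts).
    """
    return sum(allowed.count(name) * amount for name, amount in zip(names, amounts))


# Expense names and totals from the question.
names = ["BP", "Caltex", "McDonalds", "KFC"]
amounts = [300, 400, 200, 150]

takeaways_total = sum_matching(names, amounts, ["McDonalds", "KFC"])

# A duplicate entry in the table counts that payee twice.
inflated_total = sum_matching(names, amounts, ["McDonalds", "KFC", "KFC"])
print(takeaways_total, inflated_total)
```

The second call shows the double counting that the warning about duplicates is meant to prevent.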
Partial solution
I used the following formula:
=SUMPRODUCT(ISNUMBER(MATCH(Range1;Table1[Takeaways]))*Range2)
The formula does the following:
The MATCH() checks if the value in Range1 is present in your table and returns the position of the matching value in your table.
The ISNUMBER() checks if a match is found by checking if the MATCH() function returned a number.
Multiplying this with Range2 forces matrix calculation, using the SUMPRODUCT() function.
EDIT:
This worked for a really limited sample. As soon as I added the fourth row to my data the formula stopped working as intended. See screenshot:
It took the first two values into the sum correctly, the fourth is not taken into account.

Is there a general way to process only visible cells in excel?

I am wondering, if there is a general way to express, that only visible rows of a formula should be taken into account.
If I have for example a formula sumif($E5:$E100; "ABC"; $F5:F100) it would be very helpful, if there would be a way to express, that the given ranges should only take visible cells into account. I could imagine that a kind of prefix can be specified to a range construct like % or that like. For example the formula then would look like sumif(%$E5:%$E100; "ABC"; %F5:%F100) to make clear, that in the given ranges only visible rows should be taken into account.
Same would then for example be for sum(%A1:%A100) which would mean, that in the range between A1 and A100 only visible cells should be taken to sum up the cells.
The point is, that this construct could be taken inside any kind of formula, no matter what it is.
Thanks in advance
Georg
Generically, to sum sumrange based on a match in criteriarange, but only for visible rows, you can use this formula:
=SUMPRODUCT((criteriarange=criteria)+0,SUBTOTAL(109,OFFSET(sumrange,ROW(sumrange)-MIN(ROW(sumrange)),0,1,1)))
The first part, (criteriarange=criteria)+0, just checks the criteria for each row and returns 1 for a match or 0 otherwise.
OFFSET returns an "array of ranges", with each range in this case being a single cell from the sum range. SUBTOTAL can process that and, with the sum function (109), gives the "sum" (i.e. the value) of each cell, only when visible.
SUMPRODUCT then multiplies the two arrays and sums the result, effectively giving you the sum of visible rows where the criteria matches.
Try This
=SUMPRODUCT(($E$5:$E$100="ABC")+0,SUBTOTAL(109,OFFSET($F$5:$F$100,ROW($F$5:$F$100)-MIN(ROW($F$5:$F$100)),0,1,1)))
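As a cross-check of what this formula computes, here is a minimal Python sketch of a "visible SUMIF". It is an illustration only: the function name visible_sumif, the sample data, and the explicit visible flags are all assumptions, since in Excel the visibility comes from SUBTOTAL(109,...) rather than from a stored column.

```python
def visible_sumif(criteria_values, sum_values, visible, criterion):
    """Sum the values whose row is visible and whose criteria cell matches."""
    return sum(
        value
        for crit, value, shown in zip(criteria_values, sum_values, visible)
        if shown and crit == criterion
    )


# Hypothetical contents of columns E and F, plus per-row visibility.
col_e = ["ABC", "XYZ", "ABC", "ABC"]
col_f = [10, 20, 30, 40]
visible = [True, True, False, True]

total = visible_sumif(col_e, col_f, visible, "ABC")
print(total)
```

The hidden third row matches "ABC" but is excluded, so only the first and fourth rows are added.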

What function can I pass to determine the point score in Excel?

I am trying to calculate the point score that matches the Findings column in the data. There are several sections because the data contains different categories. What kind of formula could I use to determine the score based on the category? I considered using VLOOKUP, but that only works for the first section of the point data.
The objective is to return either 0,1 or 2 based on the category of the data such as media and the finding values
Excel Data
You can use a combination of INDEX, MATCH and OFFSET, with a helper row:
The formula I'm using is as follows:
=INDEX(OFFSET(INDEX($A$2:$A$12,MATCH(C18,$A$2:$A$12,0)),3,1,1,3),MATCH(B18,OFFSET(INDEX($A$2:$A$12,MATCH(C18,$A$2:$A$12,0)),2,1,1,3),1))
OFFSET(INDEX($A$2:$A$12,MATCH(C18,$A$2:$A$12,0)),3,1,1,3) this part gives the range for points matching the category. I use something similar to get the range matching the points.
First, INDEX and MATCH gives the cell containing the category. I use OFFSET to move that reference 3 cells down, 1 cell right, keep the height and increase the width to 3. For instance, in D2, INDEX and MATCH gives me cell A7. Offsetting that using the values I mentioned earlier means that the result of offset will be the range B10:D10.
Using the same logic, I get the range B9:D9. From that range, I use MATCH to get the highest column in which the value in range B9:D9 is smaller than the listing value, in this case the value 100 is the largest value that is smaller than 165, so I get the result 3 from MATCH. This fed into INDEX gives the corresponding points.
But you can do without OFFSET if you can picture the different arrays in your head using only INDEX and MATCH:
=INDEX($B$5:$D$15,MATCH(C18,$A$2:$A$12,0),MATCH(B18,INDEX($B$4:$D$14,MATCH(C18,$A$2:$A$12,0),0),1))
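The approximate MATCH (match type 1) at the heart of both formulas finds the last threshold that does not exceed the lookup value. A minimal Python sketch of that step follows; it is not from the original answer. The function name points_for and the thresholds 0 and 50 are hypothetical, while the threshold 100, the lookup value 165, and the points 0, 1, 2 are taken from the text above.

```python
from bisect import bisect_right


def points_for(value, thresholds, points):
    """Return the points for the largest threshold that is <= value.

    thresholds must be sorted ascending, as approximate MATCH requires.
    """
    idx = bisect_right(thresholds, value) - 1
    if idx < 0:
        raise ValueError("value is below the first threshold")
    return points[idx]


# One category's rows: thresholds (0 and 50 are assumed) and their points.
thresholds = [0, 50, 100]
points = [0, 1, 2]

score = points_for(165, thresholds, points)
print(score)
```

Here 165 falls past the third threshold, so the third points value is returned, matching the result 3 from MATCH described above.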

Counting from a specific range of data

Is there a formula, without using VB, to sum up a total number from a specific range of data?
For example:
I need to sum up the number of times Mary took up the cooking lesson.
I understand that just by using the sum and manually select the range (B3:D3) I will be able to get it. But is there a formula to determine the range (B3:D3) instead?
Please advise. Thanks
The use of the merged cells in row 1 necessitates building a range with a pair of INDEX functions which is then re-examined with another INDEX to pick the row of data with a MATCH function. Once the range has been defined, a SUM function produces the result.
      
The formula in C10 is,
=SUM(INDEX(INDEX($B$3:$J$6, 0, MATCH($B10, B$1:J$1, 0)):INDEX($B$3:$J$6, 0, MATCH($B10, B$1:J$1, 0)+2), MATCH($A10, $A$3:$A$6, 0), 0))
Fill down as necessary.
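The range-building step can be pictured as: find the column where the lesson's merged header starts, take that column and the next two, then pick the person's row and sum it. The Python sketch below models this under assumptions; the function name lesson_total, the lesson names other than cooking, the person names other than Mary, and all counts are hypothetical, since the original screenshot is not available.

```python
def lesson_total(table, header_row, names, name, lesson, width=3):
    """Sum the width columns that start under lesson's header, in name's row."""
    start = header_row.index(lesson)   # like MATCH($B10, B$1:J$1, 0)
    row = table[names.index(name)]     # like MATCH($A10, $A$3:$A$6, 0)
    return sum(row[start:start + width])


# A merged header stores its label in the first cell only; the rest are empty.
header_row = ["Cooking", "", "", "Baking", "", "", "Sewing", "", ""]
names = ["Mary", "John"]
table = [
    [1, 0, 2, 0, 1, 1, 3, 0, 0],  # Mary (hypothetical counts)
    [0, 0, 1, 2, 2, 0, 0, 1, 0],  # John (hypothetical counts)
]

mary_cooking = lesson_total(table, header_row, names, "Mary", "Cooking")
print(mary_cooking)
```

The fixed width of 3 corresponds to the +2 in the second INDEX of the formula, which extends the range two columns to the right of the matched header.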
