How to quintile by date group using Percentile in excel - excel

I'm just wondering if it's possible to quintile my data by group in Excel, using the PERCENTILE function.
I can quintile my entire data set with =MATCH(C2,PERCENTILE(C$2:C$20,{5,4,3,2,1}/5),-1), but I want to group it by date.
Example data:
Date Team_Id Score
04/02/2019 1 50
04/02/2019 2 58
04/02/2019 3 75
04/02/2019 4 34
04/02/2019 5 52
04/02/2019 6 81
05/02/2019 1 87
05/02/2019 2 75
05/02/2019 3 24
05/02/2019 4 75
05/02/2019 5 11
05/02/2019 6 84
06/02/2019 1 45
06/02/2019 2 67
06/02/2019 3 56
06/02/2019 4 55
06/02/2019 5 61
06/02/2019 6 15
06/02/2019 7 88
So basically I want the data quintiled by Score within each date group; the resulting value for each row in Excel should be 1, 2, 3, 4, or 5. I've been messing around with IF but just don't know where to place it.

If you can tolerate typing Ctrl+Shift+Enter (or at least wait until Microsoft comes out with their big release), I think this will work:
=MATCH(C4,PERCENTILE(IF($A$4:$A$22=A4,$C$4:$C$22,""),{5,4,3,2,1}/5),-1)
This essentially builds a conditional array on each row, based on the date.
Again, when entering the formula you have to press Ctrl+Shift+Enter or it will not work.
I'm not exactly sure what we're doing here, so if this is wrong, sorry.
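As a cross-check on what the conditional-array formula computes, here is a minimal Python sketch of the same idea. The helper names percentile_inc and quintiles_by_date are made up for illustration. It assumes PERCENTILE's linear interpolation between sorted values, and that MATCH with -1 lands on the last cut-off that is still greater than or equal to the score; rows that sit exactly on duplicated cut-offs may not behave identically in Excel.

```python
from collections import defaultdict
from math import floor


def percentile_inc(values, fifths):
    """Percentile at fifths/5 using linear interpolation between sorted values."""
    ordered = sorted(values)
    pos = fifths * (len(ordered) - 1) / 5
    lo = floor(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def quintiles_by_date(rows):
    """rows: (date, score) pairs. Returns 1 (top fifth) to 5 (bottom fifth) per row."""
    scores_by_date = defaultdict(list)
    for day, score in rows:
        scores_by_date[day].append(score)
    # Cut-offs in descending order, like PERCENTILE(..., {5,4,3,2,1}/5).
    cuts = {
        day: [percentile_inc(scores, k) for k in (5, 4, 3, 2, 1)]
        for day, scores in scores_by_date.items()
    }
    # Counting the cut-offs >= score mirrors MATCH(score, cuts, -1)
    # when the cut-offs are distinct (assumption, see the note above).
    return [sum(1 for c in cuts[day] if c >= score) for day, score in rows]


rows = [("04/02/2019", s) for s in (50, 58, 75, 34, 52, 81)]
print(quintiles_by_date(rows))  # [5, 3, 2, 5, 4, 1]
```

Each date gets its own set of cut-offs, which is the part the IF inside PERCENTILE handles in the Excel version.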

Will this work?
=MATCH(C2,PERCENTILE(INDIRECT(ADDRESS(1+MATCH($A2,$A$2:$A$20,0),3)&":"&ADDRESS(ROW()+COUNTIF(A3:$A$20,$A2),3)),{5,4,3,2,1}/5),-1)
I've defined the range for the percentile calculation with INDIRECT, where the start and end of the range are found with MATCH and COUNTIF, respectively.

Related

Excel: How to Use AVERAGEIFS with Multiple Ranges (different columns)

Date       Group A  Group B  Group C  Group A
Sub Group  A1       B1       C1       A2
1/1/2022   35       12       54       10
1/2/2022   43       45       62       93
1/3/2022   76       65       39       48
1/4/2022   12       25       81       18
1/5/2022   89       76       20       26
1/6/2022   23       87       47       17
1/7/2022   56       59       21       53
1/8/2022   29       51       9        68
1/9/2022   76       8        52       35
1/10/2022  36       53       38       53
User Input
Start Dt - 1/1/2022
End Dt - 1/5/2022
Group - Group B
Question
What is the daily average of a Group given the above user input?
Formula
=AVERAGEIFS(INDEX($B$2:$E$11,,MATCH($I$3,$B$1:$E$1,0)), $A$2:$A$11, ">="&$G$3, $A$2:$A$11, "<="&$H$3)
Answer
44.6
User will select a start date, end date and Group.
I want to compute the daily average of that.
The issue arises when there are multiple columns with the same group, as AVERAGEIFS then only takes the first matching column.
Issue: how can I find the daily average of Group A for the given dates, given that Group A is in two columns? (They can't be combined, as there are multiple sub groups.)
If you are on Microsoft 365, you can try:
=AVERAGE(FILTER(FILTER(B3:E12,B1:E1=I2),(A3:A12>=H1)*(A3:A12<=H2)))
If you have an older version you could use:
=AVERAGE(
INDEX(A1:E12,
AGGREGATE(15,6,ROW(A3:A12)/(A3:A12>=G3)/(A3:A12<=H3),
ROW(A1:INDEX(A:A,SUMPRODUCT((A3:A12>=G3)*(A3:A12<=H3))))),
AGGREGATE(15,6,COLUMN(B1:E1)/(B1:E1=I3),
TRANSPOSE(ROW(A1:INDEX(A:A,SUMPRODUCT(N(B1:E1=I3))))))))
Entered with Ctrl+Shift+Enter (I think).
In the picture below, J3 shows the answer cell. K3 is used to demonstrate the array that is passed to the AVERAGE function (an Office 365 spill range).
Another old school approach. User input fields in the formula are referenced using named ranges.
=SUMPRODUCT(($A$3:$A$12>=StartDt)*($A$3:$A$12<=EndDt)
*($B$1:$E$1=Group)*($B$3:$E$12))/(COUNTIFS($A$3:$A$12,">="&StartDt,$A$3:$A$12,"<="&EndDt)
*(COUNTIF($B$1:$E$1,Group)))
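To sanity-check any of the formulas above, the same calculation can be sketched in Python: keep the rows inside the date window, keep every column whose header matches the chosen group, and average what is left. The function name daily_average is made up for illustration, and the sketch assumes the dates are month/day/year (1 to 10 January 2022).

```python
from datetime import date

headers = ["Group A", "Group B", "Group C", "Group A"]
scores = [
    [35, 12, 54, 10],
    [43, 45, 62, 93],
    [76, 65, 39, 48],
    [12, 25, 81, 18],
    [89, 76, 20, 26],
    [23, 87, 47, 17],
    [56, 59, 21, 53],
    [29, 51, 9, 68],
    [76, 8, 52, 35],
    [36, 53, 38, 53],
]
# Assumption: the dates run from 1 to 10 January 2022.
rows = [(date(2022, 1, day), row) for day, row in enumerate(scores, start=1)]


def daily_average(group, start, end):
    """Average of all cells in the date window whose column header equals group."""
    values = [
        value
        for day, row in rows
        if start <= day <= end
        for header, value in zip(headers, row)
        if header == group
    ]
    return sum(values) / len(values)


print(daily_average("Group B", date(2022, 1, 1), date(2022, 1, 5)))  # 44.6
print(daily_average("Group A", date(2022, 1, 1), date(2022, 1, 5)))  # 45.0
```

Group B reproduces the 44.6 from the question; Group A averages the ten cells from both of its columns.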

Excel MERGE two tables

I have SET 1
CLASS STUDENT TEST SCORE
A 1 1 46
A 1 2 50
A 1 3 45
A 2 1 45
A 2 2 47
A 2 3 31
A 3 1 34
A 3 2 45
B 1 1 36
B 2 1 31
B 2 2 41
B 3 1 50
C 1 1 42
C 3 1 31
and SET 2
CLASS SIZE YEARS
A 39 7
B 20 12
C 31 6
and wish to COMBINE to make SET 3
CLASS STUDENT TEST SCORE SIZE YEARS
A 1 1 46 39 7
A 1 2 50 39 7
A 1 3 45 39 7
A 2 1 45 39 7
A 2 2 47 39 7
A 2 3 31 39 7
A 3 1 34 39 7
A 3 2 45 39 7
B 1 1 36 20 12
B 2 1 31 20 12
B 2 2 41 20 12
B 3 1 50 20 12
C 1 1 42 31 6
C 3 1 31 31 6
So basically I want to add the SIZE and YEARS columns from SET 2 onto SET 1, merging on CLASS. How can I do this in Excel? I need to match on CLASS.
Define both sets as tables and “left join” in PowerQuery. There you can choose the columns of the resulting table.
https://learn.microsoft.com/en-us/power-query/merge-queries-left-outer
If you have Set 1 on the top left of a worksheet "Set1" and Set 2 on the top left of a worksheet "Set2", then you can use the formula
=VLOOKUP(A2;'Set2'!$A$2:$C$4;2;FALSE), where $A$2:$C$4 is the range of Set2 and A2 is the class value from Set1, which is what is used to do the lookup in Set2. The next argument, 2, means to take the second column from Set2 (SIZE), and the FALSE at the end means that you only want exact matches on the CLASS. You can auto-fill this formula down, and repeat the steps with 3 as the column argument for the years. If you look up the help for VLOOKUP within Excel, that should help you to understand how it works.
Your first set of data is essentially your primary set of data that you just want to add attribute columns to. I built this example on Google Sheets, which should help explain. Using spill formulas, only a few cells need their own formulas. You can see them as they are highlighted in yellow. When you use it in Excel, make sure you change the column references, but this would get you the answer.
Note that your version of Excel has to support spill ranges for this to work. To test, see whether the UNIQUE function is available.
This solution may work for you if both sets start in the same column. As an example, in my image both of them start at column A. You can get all the data with a single VLOOKUP formula.
Formula in cell E2 is:
=VLOOKUP($A2;$A$22:$R$25;COLUMN(B$22);FALSE)
Notice the mixed references in the first and third arguments and the absolute references in the second one. The third argument is critical because it is the relative column position between both sets; that's the reason it's easier if both sets start in the same column. If not, you'll need to adjust this argument by subtracting or adding, depending on the case.
Anyway, with a single formula you can get any number of columns. The only disadvantage of this formula is that you need to manually drag to the right until you have all the columns (10, 30 or whatever). You'll notice you are done because the formula will return an error (#REF!).
This error means you are trying to reference a column outside of your lookup range.
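All of the approaches above are a left join on CLASS: every SET 1 row is kept and picks up the matching SIZE and YEARS from SET 2. A minimal Python sketch of that lookup, using the data from the question (the function name merge_on_class is made up for illustration):

```python
set1 = [
    ("A", 1, 1, 46), ("A", 1, 2, 50), ("A", 1, 3, 45),
    ("A", 2, 1, 45), ("A", 2, 2, 47), ("A", 2, 3, 31),
    ("A", 3, 1, 34), ("A", 3, 2, 45),
    ("B", 1, 1, 36), ("B", 2, 1, 31), ("B", 2, 2, 41), ("B", 3, 1, 50),
    ("C", 1, 1, 42), ("C", 3, 1, 31),
]
# SET 2 keyed by CLASS, holding (SIZE, YEARS).
set2 = {"A": (39, 7), "B": (20, 12), "C": (31, 6)}


def merge_on_class(left, lookup):
    """Left join: keep every left row; unknown classes get (None, None)."""
    return [row + lookup.get(row[0], (None, None)) for row in left]


set3 = merge_on_class(set1, set2)
print(set3[0])   # ('A', 1, 1, 46, 39, 7)
print(set3[-1])  # ('C', 3, 1, 31, 31, 6)
```

The dictionary lookup plays the role of VLOOKUP with FALSE as the last argument: exact matches only.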

Excel Rank Multiple Columns

I'm facing an issue with ranking in Excel, particularly with regard to tie breaking. I tried several options but I guess they don't fit my issue. It's quite simple really, I'll explain:
The Data:
1 2 3 4 5 6 7 8 9 10
87 83 74 95 69 90 73 0 74 85
121 121 96 121 121 121 121 83 121 121
As you can see, it's easy for me to rank the first line (I'm working in columns instead of rows for the data). When I apply the RANK function, it gives the following result:
3 5 6 1 9 2 8 10 6 4
Which is correct.
The problem arises in the second line. There are ties because all of them reach the maximum of 121:
1 1 9 1 1 1 1 10 1 1
What I would like to do is use the first row as a tie breaker. So even if there is a tie, the first line (which was originally text but is now a sequence from 1 to 10) could serve as a secondary criterion to order the rank, giving the following ranking line:
1 2 9 3 4 5 6 10 7 8
Could one achieve this result?
Thank You very much in advance.
You need a helper row to break the tie. You can add a fraction of the first row to the second row to create a new row, and use the new row to rank:
A4 = A3+(A2/(MAX($A$2:$J$2)+1))
Using MAX ensures the fraction is less than 1, which is adequate to break ties in this case.
A6 = RANK(A4,$A$4:$J$4)
You can hide the helper row if you don't want to show it.
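For comparison, the ordering the question describes (higher score first, and among equal scores the lower sequence number first) can be sketched in Python by sorting on a pair instead of building a helper row. This is a different technique from the fraction trick above, and the function name rank_with_tiebreak is made up for illustration.

```python
sequence = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
scores = [121, 121, 96, 121, 121, 121, 121, 83, 121, 121]


def rank_with_tiebreak(primary, tiebreak):
    """Rank by primary descending; equal values are ordered by tiebreak ascending."""
    order = sorted(range(len(primary)), key=lambda i: (-primary[i], tiebreak[i]))
    ranks = [0] * len(primary)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks


print(rank_with_tiebreak(scores, sequence))  # [1, 2, 9, 3, 4, 5, 6, 10, 7, 8]
```

The helper row in Excel does the same job as the sort key here: it folds the secondary criterion into a single comparable value.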

Excel Median for multiple conditions

Basically, in Excel I want a table like the one given below on the right (the scale of my data is a lot bigger than the example given) that has the median for each subject, for each condition (e.g. TADA, TADP, TPDA, TPDP). Ideally I would use a pivot table; however, Excel does not do 'median' in a pivot table. I was wondering if there was a formula I could use to save me having to go through manually and work out the median. I've tried a few (along the lines of "median(if etc..."), but my coding knowledge in Excel is very poor. Is there a short way to do this?
Data Table
Subject RT condition Subject TADA TADP TPDA TPDP
1 23 TADA 1
1 54 TPDA 2
1 65 TADA 3
1 67 TPDP
1 76 TADA
2 72 TPDA
2 87 TADA
2 12 TPDP
2 45 TADP
2 32 TPDP
2 87 TADA
3 98 TPDA
3 12 TADA
3 53 TPDA
3 78 TADP
3 98 TPDP
Assuming data in A2:C100, and your results table with headers in F1 across and row labels in E2 down, you can use an array formula like this in F2:
=MEDIAN(IF($A$2:$A$100=$E2,IF($C$2:$C$100=F$1,$B$2:$B$100)))
Confirm with Ctrl+Shift+Enter and copy across and down.
Extend the data ranges as required.
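The nested IF inside MEDIAN keeps only the RT values whose subject and condition both match the current cell. A minimal Python sketch of the same grouping, using the sample data from the question (the function name median_table is made up for illustration):

```python
from collections import defaultdict
from statistics import median

data = [
    (1, 23, "TADA"), (1, 54, "TPDA"), (1, 65, "TADA"), (1, 67, "TPDP"),
    (1, 76, "TADA"), (2, 72, "TPDA"), (2, 87, "TADA"), (2, 12, "TPDP"),
    (2, 45, "TADP"), (2, 32, "TPDP"), (2, 87, "TADA"), (3, 98, "TPDA"),
    (3, 12, "TADA"), (3, 53, "TPDA"), (3, 78, "TADP"), (3, 98, "TPDP"),
]


def median_table(rows):
    """Median RT for every (subject, condition) pair that occurs in rows."""
    groups = defaultdict(list)
    for subject, rt, condition in rows:
        groups[(subject, condition)].append(rt)
    return {key: median(values) for key, values in groups.items()}


table = median_table(data)
print(table[(1, "TADA")])  # 65
print(table[(3, "TPDA")])  # 75.5
```

Pairs that never occur (subject 1 has no TADP rows, for example) are simply absent from the result.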

How to format a number to appear as percentage in Excel

So let's say I have a few numbers in a sheet:
a b c d
1 33 53 23 11
2 42 4 83 64
3 75 3 48 38
4 44 0 22 45
5 2 34 76 6
6
7 Total 85
I would like to display those numbers so that the cell value still holds the original figure (A1 = 33)
but the cell displays both the number and a percentage from the total (B7) eg
a b c d
1 33 (39%) 53 (62%) 23 (27%) 11 (13%)
2 42 (49%) 4 (5%) 83 (98%) 64 (75%)
3 75 (88%) 3 (4%) 48 (56%) 38 (45%)
4 44 (52%) 0 (0%) 22 (26%) 45 (53%)
5 2 (2%) 34 (40%) 76 (89%) 6 (7%)
6
7 Total 85
I know how to format a cell as a percentage, but I can't figure out how to display both original values, the calculated percentage value (value/total*100), but not change the cell value so I could still sum the cells in the end (eg. A6 =SUM(A1:A5) = 196)
Does anyone have an idea? I was hoping there could be a way to duplicate and calculate the figure using text formatting, but I can't get anything to work.
I'm guessing this is a trivial answer and maybe not what you're looking for, but why not just add a column for each of the columns you have now?
a a' b b' c c' d d'
1 33 (39%) 53 (62%) 23 (27%) 11 (13%)
2 42 (49%) 4 (5%) 83 (98%) 64 (75%)
3 75 (88%) 3 (4%) 48 (56%) 38 (45%)
4 44 (52%) 0 (0%) 22 (26%) 45 (53%)
5 2 (2%) 34 (40%) 76 (89%) 6 (7%)
6
7 Total 85
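The helper-column idea keeps the original number in one cell and puts a text label built from it in the next, so the numbers stay summable. A rough Python sketch of how such a label is derived, using column a from the question (the function name with_share is made up for illustration):

```python
def with_share(value, total):
    """Label such as '33 (39%)': the value plus its share of total, rounded to whole percent."""
    return f"{value} ({value / total:.0%})"


column_a = [33, 42, 75, 44, 2]
total = 85

labels = [with_share(v, total) for v in column_a]
print(labels)         # ['33 (39%)', '42 (49%)', '75 (88%)', '44 (52%)', '2 (2%)']
print(sum(column_a))  # 196
```

The labels are display text only; the sum is still taken over the untouched numbers.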
@Ari's answer seems to meet the requirements in your question, does not repeat information more than the example you gave for the output requirement, and is viable for up to around 8000 or so columns to start with (unless you have a very old version of Excel). Jerry's comment is also correct: what you want to achieve, the way you want to achieve it, is not possible.
However there are other approaches that might be acceptable substitutes. One is to copy your data and Paste Special with Operation Divide, either elsewhere or over the top of your data. If over the top this either shows the values or the percentages otherwise duplicates your data. Over the top would also require something like Operation Multiply to revert back to values, and reformatting each time if to appear as in your example.
Another is to use a PivotTable with some calculated fields and both are shown below:
I appreciate neither is exactly what you are asking for.
