Percentile function across multiple arrays - excel-formula

So I have this data in excel right now
A B C
2015-1 Test 1 23
2015-2 Test 1 12
2015-3 Test 1 43
2015-4 Test 1 32
2015-5 Test 1 3
2015-6 Test 1 90
2015-1 Test 2 200
2015-2 Test 2 123
2015-3 Test 2 21
2015-4 Test 2 40
2015-5 Test 2 17
2015-6 Test 2 138
2015-1 Test 3 160
2015-2 Test 3 55
2015-3 Test 3 30
2015-4 Test 3 74
2015-5 Test 3 67
2015-6 Test 3 89
Right now, I have it set up so that the user can look at a specific time period rather than all of the dates (for example, from 2015-1 to 2015-4). When the user selects the dates they want, I want to take the percentile of the data (column C) for those dates across all of the different test scenarios in column B. Right now there are only 3, but there will be up to 100 different test cases.
I know it's possible to do =Percentile((test1_data,test2_data,test3_data),1),
but I'm going to have to do the percentile across over 100 different test cases, and the way I have it set up now seems highly inefficient. Is there a way to do this without having to enter all of the 100 different arrays by hand?

Based on your table, something along the lines of the following formula should work. (It is an array formula, so press CTRL+SHIFT+ENTER instead of ENTER when you enter it into the cell. Excel adds the curly braces itself; do not type them.)
{=PERCENTILE(
IF(NUMBERVALUE(LEFT($A$1:$A$18,4))<=EndYear,
IF(NUMBERVALUE(LEFT($A$1:$A$18,4))>=BegYear,
IF(NUMBERVALUE(RIGHT($A$1:$A$18,1))<=EndMonth,
IF(NUMBERVALUE(RIGHT($A$1:$A$18,1))>=BegMonth,
$C$1:$C$18)))),1)}
EndYear is a reference to the cell that has the LAST year you want included
BegYear is a reference to the cell that has the FIRST year you want included
EndMonth is a reference to the cell that has the LAST month (or whatever the second unit is) you want included
BegMonth is a reference to the cell that has the FIRST month (or whatever the second unit is) you want included
Just expand the references $A$1:$A$18 and $C$1:$C$18 to include however many test cases you want.
FORMULA EXPLANATION
The first two if statements focus on the year. They take the LEFT() four digits as a string. NUMBERVALUE() then turns strings into values. You can then use the if statement to logically evaluate whether the test dates fall into the desired range of dates.
The second two if statements do precisely the same thing on the last single-digit (month?)
The nested if statements return an array containing the associated value from column C where all four conditions are true, and FALSE where any one of them is not.
PERCENTILE() will take the array, ignore the items that returned as FALSE, and provide you with the k-th percentile of the range of values in which all four if statements are true.
*As a note, I don't know the significance of your second digit. If it ever goes above 9, you might need to adjust for your data. In that case you could either replace all the 2015-9 entries with 2015-09 and change the second argument of the RIGHT() function to 2, or you could do something like MID($A$1:$A$18,6,2) or the last digit could just be replaced by however many characters you have after the year argument.
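To cross-check the array formula outside Excel, here is a minimal Python sketch of the same idea: keep the rows whose year and month pass the four range tests, then take an interpolated percentile of what remains. The names percentile_inc and percentile_for_period are hypothetical, the rows are the sample data from the question, and the percentile uses the linear-interpolation definition of PERCENTILE (PERCENTILE.INC). Unlike LEFT()/RIGHT(), the sketch splits on the hyphen, so two-digit months are handled as well.

```python
import math

# Sample rows from the question: (period, test name, value).
rows = [
    ("2015-1", "Test 1", 23), ("2015-2", "Test 1", 12), ("2015-3", "Test 1", 43),
    ("2015-4", "Test 1", 32), ("2015-5", "Test 1", 3), ("2015-6", "Test 1", 90),
    ("2015-1", "Test 2", 200), ("2015-2", "Test 2", 123), ("2015-3", "Test 2", 21),
    ("2015-4", "Test 2", 40), ("2015-5", "Test 2", 17), ("2015-6", "Test 2", 138),
    ("2015-1", "Test 3", 160), ("2015-2", "Test 3", 55), ("2015-3", "Test 3", 30),
    ("2015-4", "Test 3", 74), ("2015-5", "Test 3", 67), ("2015-6", "Test 3", 89),
]


def percentile_inc(values, k):
    """Linear-interpolation percentile for 0 <= k <= 1."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values in the selected period")
    rank = k * (len(ordered) - 1)          # zero-based fractional position
    lower, upper = math.floor(rank), math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def percentile_for_period(rows, beg_year, beg_month, end_year, end_month, k):
    """Mirror the four nested IFs: year in range AND month in range."""
    selected = []
    for period, _test, value in rows:
        year_text, month_text = period.split("-")
        year, month = int(year_text), int(month_text)
        if beg_year <= year <= end_year and beg_month <= month <= end_month:
            selected.append(value)
    return percentile_inc(selected, k)


print(percentile_for_period(rows, 2015, 1, 2015, 4, 1))    # k = 1 is the maximum
print(percentile_for_period(rows, 2015, 1, 2015, 4, 0.5))  # median of the selection
```

No test case is named anywhere in the filter, which is why the approach scales to 100 test cases without listing each array.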

Related

Excel - Find row with conditional statement in XLOOKUP

I'm trying to use XLOOKUP to find a value based on user inputs.
The table looks like this:
Type Start End 33 36 42 48
---------------------------------------
4002 1 7 1.17 1.34 1.5 1.84
4002 8 12 1.84 1.67 2.1 3.45
User selects type, number (can be between start and end), and 33-48
I can nest an XLOOKUP to specify the 3 criteria
=XLOOKUP(*type* & *number* , *typeRange* & *numberRange* ,XLOOKUP(*33-48* , *33-48Range* , *ResultRange* ))
And I can find if a value is between the columns
=IF(AND(*number*>=*Start*,*number*<=*End*),TRUE,FALSE)
Can I combine the two? The data is redundant for numbers 1-7, and I would like to keep the table small.
You sort-of can combine them. I have added a couple of extra rows to the table to see what would happen if you had different Type values as well as number values. The problem then is that if you used approximate match and put in a number like thirteen which is out of range, you might end up getting the next row of the table which would be incorrect. One way round it would be to use the options in Xlookup to search for next-smaller-item in the Start column and next-larger-item in the End column and see if the results match:
=IF(XLOOKUP(I2&TEXT(J2,"00"),A2:A7&TEXT(B2:B7,"00"),XLOOKUP(K2,D1:G1,D2:G7),,-1)=XLOOKUP(I2&TEXT(J2,"00"),A2:A7&TEXT(C2:C7,"00"),
XLOOKUP(K2,D1:G1,D2:G7),,1),XLOOKUP(I2&TEXT(J2,"00"),A2:A7&TEXT(C2:C7,"00"),XLOOKUP(K2,D1:G1,D2:G7),,1),"Error")
If you have some checks in place which make it impossible for number to be out of range, then you can simplify the formula:
=XLOOKUP(I2&TEXT(J2,"00"),A2:A7&TEXT(B2:B7,"00"),XLOOKUP(K2,D1:G1,D2:G7),,-1)
or
=XLOOKUP(I2&TEXT(J2,"00"),A2:A7&TEXT(C2:C7,"00"),XLOOKUP(K2,D1:G1,D2:G7),,1)
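As a way to check what the lookup should return, the following Python sketch expresses the intended result directly: find the row whose Type matches and whose Start..End range contains the number, then read the requested column. It does not reproduce XLOOKUP's next-smaller/next-larger match modes; it is a plain range test over the two rows from the question, and the names HEADERS, TABLE and range_lookup are hypothetical.

```python
# Column headers and rows from the question's table.
HEADERS = [33, 36, 42, 48]
TABLE = [
    # (type, start, end, values under 33 / 36 / 42 / 48)
    (4002, 1, 7, [1.17, 1.34, 1.5, 1.84]),
    (4002, 8, 12, [1.84, 1.67, 2.1, 3.45]),
]


def range_lookup(item_type, number, column):
    """Return the value for the matching type, range and column, or "Error"."""
    col = HEADERS.index(column)
    for row_type, start, end, values in TABLE:
        if row_type == item_type and start <= number <= end:
            return values[col]
    return "Error"  # number falls outside every Start..End range


print(range_lookup(4002, 5, 36))
print(range_lookup(4002, 13, 33))
```

A number such as 13, which lies outside every range, gives "Error" here, matching the out-of-range case the longer formula guards against.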

Excel pass/fail which only triggers the fail after 2 values don't meet the requirements

I'm trying to write a pass/fail check that returns a fail only after 2 values in the range fail the check. I've written the start of the check, but it already returns "Fail" straight after the first failing value.
For example: Pass/Fail check if all values are above 20.
20
20
20
---
good
20
19
20
---
still good
28
10
19
---
fail (since 2 values fail to meet the required value)
In my sheet, 5 values need to be checked against a range defined in another location (XX1 and XX2 in the formula). The formula I have used so far is:
=IFS(AND(E37:E41>=MIN(XX1);E37:E41<=MAX(XX2));"Pass";TRUE;"Fail")
There are multiple options:
Using COUNTIFS and COUNTA:
=IF(COUNTIFS(E37:E41,">="&XX1,E37:E41,"<="&XX2)>COUNTA(E37:E41)-2,"Pass","Fail")
If you need to also check that the average falls between XX1 and XX2, then use AND and AVERAGE along with the formula above.
=IF(AND(COUNTIFS(E37:E41,">="&XX1,E37:E41,"<="&XX2)>COUNTA(E37:E41)-2,AVERAGE(E37:E41)>=XX1,AVERAGE(E37:E41)<=XX2),"Pass","Fail")
3 Conditions IF Statement
AVERAGE(E37:E41)>=XX1
AVERAGE(E37:E41)<=XX2
COUNTIF(E37:E41,"<20")<2, i.e. no more than 1 value is less than 20 (so at least 4 of the 5 values are 20 or more).
=IF(AND(AVERAGE(E37:E41)>=XX1,AVERAGE(E37:E41)<=XX2,COUNTIF(E37:E41,"<20")<2),"Pass","Fail")
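The counting rule behind the COUNTIFS version can be sketched in Python as follows. The function name pass_fail and the upper limit of 100 used in the examples are assumptions for illustration; the question only gives the lower limit of 20.

```python
def pass_fail(values, low, high, allowed_failures=1):
    """Return "Pass" unless more than `allowed_failures` values fall outside low..high."""
    in_range = sum(1 for v in values if low <= v <= high)
    failures = len(values) - in_range
    # Same test as COUNTIFS(...) > COUNTA(...) - 2, i.e. fewer than 2 failures.
    return "Pass" if failures <= allowed_failures else "Fail"


# The three examples from the question (upper limit 100 is hypothetical).
print(pass_fail([20, 20, 20], 20, 100))
print(pass_fail([20, 19, 20], 20, 100))
print(pass_fail([28, 10, 19], 20, 100))
```

Counting the failures, rather than testing all values with AND, is what lets a single out-of-range value through.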

Is there a non-VBA way to calculate the average of the sum of two sets of columns?

I'm creating an excel spreadsheet to track when an item is received as well as when a response to the item having been received has been made (ie: my mail was delivered at 1:00pm (item received) but I didn't check the mail until 5:00pm (response to item having been received)).
I need to track both the date and time of the item being received and want to separate these in two separate columns. At the moment this translates to:
Column A: Date item received
Column B: Time item received
Column L: Date item was responded to having been received
Column M: Time item was responded to having been received
In essence I'm looking to run calculations on the response time between when the item is received and when it has been responded to (ie: average response time, number of responses in less than an hour, and even things like the number of responses that took between 2 and 3 hours where Bob was the person who responded).
The per-line pseudo code would look something like:
(Lr + Mr) - (Ar + Br) ' where L,M,A,B are the columns and 'r' is the row number.
An example, with the following data:
1. A B L M
2. 1/5/19 10:00 1/5/19 12:00
3. 1/5/19 21:00 1/6/19 1:00
4. 1/5/19 22:00 1/5/19 23:00
5. 1/6/19 3:00 1/6/19 4:00
The outcome for the average response time would be 2 hours (average(rows 2-5) = average(2, 4, 1, 1) = 2)
The number of items in each response-time range would be as follows:
(<=1 hour) = 2
(>1 & <=2) = 1
(>2 & <=3) = 0
(>3) = 1
I don't know of (and can't find) a function that will perform this and then let me use it within something like a countifs() or averageifs() function.
While I could do this (fairly easily) in VBA, the practical implementation of this spreadsheet limits me to standard Excel. I suspect that sumproduct() will be fundamental to make this work, but I feel that I need something like a sumsum() function (which doesn't exist) and I'm not familiar with sumproduct() to better understand what to even look for to set something like this up.
If you are not so familiar with SUMPRODUCT() or the like, I would suggest one helper column. With the response date and time placed in columns C and D (rather than L and M), the formula used is:
=((C2+D2)-(A2+B2))
You can probably do all types of calculations on this helper column. Note that the column is formatted hh:mm. However, if you want to look into SUMPRODUCT(), you could think about these:
Formula in H2:
=SUMPRODUCT(--(ROUND((((A2:A5+B2:B5)-(C2:C5+D2:D5))*-24),2)<=1))
Formula in H3:
=SUMPRODUCT((ROUND((((A2:A5+B2:B5)-(C2:C5+D2:D5))*-24),2)>1)*(ROUND((((A2:A5+B2:B5)-(C2:C5+D2:D5))*-24),2)<=2))
Formula in H4:
=SUMPRODUCT((ROUND((((A2:A5+B2:B5)-(C2:C5+D2:D5))*-24),2)>2)*(ROUND((((A2:A5+B2:B5)-(C2:C5+D2:D5))*-24),2)<=3))
Formula in H5:
=SUMPRODUCT(--(ROUND((((A2:A5+B2:B5)-(C2:C5+D2:D5))*-24),2)>3))
The helper column is the easiest approach. It gives you the time differences that you can then easily analyse however you want. Analysis without the helper column is possible, but the approach differs depending on what type of analysis you want to do.
For the example you provided, which is counting the number of time differences grouped into ranges, you would use the FREQUENCY function:
=FREQUENCY(C2:C5+D2:D5-A2:A5-B2:B5,F2:F4)
In F2:F4 (called the "bins"), enter the upper limit of each range you want to count. The Frequency function counts up to and including the first value, then counts from there up to and including the second value, and so on. Enter the bins as times, e.g. 1:00 for 1 hour.
Note that Frequency is an array-entered and array-returning function. This means you need to first select the range that will contain all output values (G2:G5 in this example), then enter the function, then press CTRL+SHIFT+ENTER.
Also note that Frequency returns an array that is one element larger than the number of bins specified. The extra element is the count of all values greater than the largest bin specified.
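The helper-column arithmetic and the FREQUENCY binning described above can be checked with a short Python sketch. It assumes the sample dates are month/day/year in 2019, and the names response_hours and frequency are hypothetical. The bins must be sorted in ascending order, and each bin counts values up to and including its upper limit, with one extra slot for everything above the last bin.

```python
from bisect import bisect_left
from datetime import datetime

# (received, responded) pairs from the question's sample rows 2-5.
records = [
    (datetime(2019, 1, 5, 10, 0), datetime(2019, 1, 5, 12, 0)),
    (datetime(2019, 1, 5, 21, 0), datetime(2019, 1, 6, 1, 0)),
    (datetime(2019, 1, 5, 22, 0), datetime(2019, 1, 5, 23, 0)),
    (datetime(2019, 1, 6, 3, 0), datetime(2019, 1, 6, 4, 0)),
]


def response_hours(records):
    """Equivalent of the helper column (C+D)-(A+B), expressed in hours."""
    return [(responded - received).total_seconds() / 3600
            for received, responded in records]


def frequency(values, bins):
    """Count values per bin; bins are inclusive upper limits, sorted ascending."""
    counts = [0] * (len(bins) + 1)   # one extra slot for values above the last bin
    for value in values:
        counts[bisect_left(bins, value)] += 1
    return counts


hours = response_hours(records)
print(sum(hours) / len(hours))       # average response time in hours
print(frequency(hours, [1, 2, 3]))   # counts for <=1, <=2, <=3, >3
```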

Compare cell in excel

I have some data like below in one column.
Value
-----
A#show
20
20
B#show
20
25
30
C#show
10
10
10
10
D#show
10
E#show
10
20
I want to compare the values listed under each cell whose text ends in "show". If there is only one value under it, no comparison is needed.
Value Comparison
----------------------
A#show Same
20
20
B#show different
20
25
30
C#show same
10
10
10
10
D#show only one
10
E#show different
10
20
I think it may be possible using a VBA script.
It's a bit unclear what you're trying to compare between values. However, there is a way to do this without VBA.
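1) In the second column, create a "Header" column that records which header each value belongs to. The first entry (B1) would just be A#show, and from B2 down it would be the following. (The fallback refers to A2, since a formula in B2 that refers to B2 itself would be a circular reference.)
=IFERROR(IF(A2*1>0,B1),A2)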
2) In the third column, you can utilize countif to see if the header has more than 2 entries (indicating it has a comparison). Here is where you can apply whatever comparative metric you'd like. If it's something unformulaic, just use a pivot table with the 3 columns.
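Since the question leaves the comparison open, the following Python sketch assumes one reading of the expected output: "same" when every value under a header is equal, "different" otherwise, and "only one" when there are fewer than two values. The function name compare_groups is hypothetical.

```python
def compare_groups(cells):
    """Group values under each '...show' header and label each group."""
    groups = []  # list of (header, values) in order of appearance
    for cell in cells:
        if isinstance(cell, str) and cell.endswith("show"):
            groups.append((cell, []))       # start a new group at each header
        elif groups:
            groups[-1][1].append(cell)      # value belongs to the latest header

    labels = {}
    for header, values in groups:
        if len(values) < 2:
            labels[header] = "only one"     # nothing to compare against
        elif len(set(values)) == 1:
            labels[header] = "same"
        else:
            labels[header] = "different"
    return labels


column = ["A#show", 20, 20, "B#show", 20, 25, 30, "C#show", 10, 10, 10, 10,
          "D#show", 10, "E#show", 10, 20]
print(compare_groups(column))
```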

Excel function to choose a value greater than or less than a particular value in cell

I have a data set something like this
Units Price
1 15
100 10
150 9
200 8
50000 7
I need the output as Price with respect to quantity.
Example- If Input value is 90 it should give price as 15
If input is 210 it should give value as 8.
However, sadly, I cannot use an IF statement.
Thanks in advance.
You can use a combination of INDEX and MATCH
=INDEX(B1:B5,MATCH(lookup_value,A1:A5,1))
This assumes Units are in column A and Price is in column B
Make sure you understand both functions:
INDEX
MATCH - particularly the reason for the ,1) at the end
You can also use VLOOKUP. This is probably a bit easier, although INDEX/MATCH is more versatile:
=VLOOKUP(Lookup_value,$A$2:$B$6,2,TRUE)
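Both formulas rely on an approximate match: with the Units column sorted in ascending order, they pick the last row whose Units value is less than or equal to the lookup value. A minimal Python sketch of that rule, using the table from the question and a hypothetical function name price_for:

```python
from bisect import bisect_right

# Table from the question; UNITS must be sorted ascending, as the
# approximate-match lookup requires.
UNITS = [1, 100, 150, 200, 50000]
PRICES = [15, 10, 9, 8, 7]


def price_for(quantity):
    """Price of the last tier whose unit threshold is <= quantity."""
    position = bisect_right(UNITS, quantity) - 1
    if position < 0:
        return None  # below the first tier; Excel would return #N/A here
    return PRICES[position]


print(price_for(90))
print(price_for(210))
```

A quantity exactly on a threshold, such as 100, takes that tier's price, because the match is "less than or equal to".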
