How to find quartiles from data in two columns in Excel

Hi, I have a basic question that I couldn't answer by myself: how do I pass multiple ranges to the QUARTILE function in Excel (actually, Google Sheets)?
I know the basic form of the function is =quartile(data,1), for example =quartile(A2:A10,1).
Situation: I have the trial numbers in column A and the scores in column B. After 50 trials I didn't want the table to get too long, so trials 51~100 are in column D and their scores are in column E.
Main point: I want to find the average, the median, and quartiles 1, 2 and 3 for all scores of my trials.
For the average I know =average(B2:B51,E2:E51) works, and for the median =median(B2:B51,E2:E51). However, when I try =quartile(B2:B51,E2:E51,1) or =quartile((B2:B51,E2:E51),1), I get an error saying that I passed 3 arguments and it should be 2. How can I include the data from both columns (column B and column E)? Please let me know. Thank you!

Range unions in Excel are achieved by the following syntax:
(Range1,Range2)
For example:
=QUARTILE((B2:B51,E2:E51),1)
Whereas in Google Sheets it's:
{Range1;Range2}
For example:
=QUARTILE({B2:B51;E2:E51},1)
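To cross-check the spreadsheet result, the same idea can be sketched in Python: stack the two score columns into one list, then compute the statistics on the combined data. The sample values and the names `col_b` and `col_e` are made up for illustration, and `method="inclusive"` is chosen on the assumption that it uses the same interpolation as QUARTILE.

```python
from statistics import mean, median, quantiles

# Hypothetical scores standing in for B2:B51 and E2:E51.
col_b = [1, 2, 3, 4]
col_e = [5, 6, 7, 8, 9]

# Stack the two columns end to end, like {B2:B51;E2:E51} in Google Sheets.
scores = col_b + col_e

# n=4 returns the three cut points Q1, Q2 and Q3.
q1, q2, q3 = quantiles(scores, n=4, method="inclusive")
average = mean(scores)
middle = median(scores)
```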

Related

Excel CUBEVALUE & CUBESET count records greater than a number

I am writing a series of queries to my workbook's data model to retrieve the number of documents by Category_Name which are greater than a certain numbers of days old (e.g. >=650).
Currently this formula (entered in cell C3) returns the correct number for a single Days Old value (=3).
=CUBEVALUE("ThisWorkbookDataModel",
"[Measures].[Count of Docs]",
"[EDD_Report].[Category_Name].&["&$B2&"]",
"[EDD_Report_10-01-18].[Days Old].[34]")
How do I return the number of documents for Days Old values >=650?
The worksheet looks like:
   A          B    C
1  Date       PL   Count of Docs
2  10/1/2018  ALD  3
3  ...
UPDATE: I tried the expression suggested in step B of @ama's answer below, but it did not work.
However, I created a subset of the Days Old values using
=CUBESET("ThisWorkbookDataModel",
"{[EDD_Report_10-01-18].[Days Old].[all].[650]:[EDD_Report_10-01-18].[Days Old].[All].[3647]}")
The cell containing this cubeset is referenced as the third Member_expression of the original CUBEVALUE formula. The limitation is now that the values for the beginning and end must be members of the Days Old set.
This is limiting: I was hoping for a more general test for >=650, and there is no way to guarantee that specific values of Days Old will be in the query.
This is the first time I've heard of the CUBE functions, so you got me curious and I did some digging. I'm definitely not an expert, but here is what I found:
MDX language should allow you to provide value ranges in the form of {[Table].[Field].[All].[LowerBound]:[Table].[Field].[All].[UpperBound]}.
A. Get the total number of entries:
D3 =CUBEVALUE("ThisWorkbookDataModel",
"[Measures].[Count of Docs]",
"[EDD_Report].[Category_Name].&["&$B2&"]",
"[EDD_Report_10-01-18].[Days Old].[All]")
B. Get the number of entries less than 650:
E3 =CUBEVALUE("ThisWorkbookDataModel",
"[Measures].[Count of Docs]",
"[EDD_Report].[Category_Name].&["&$B2&"]",
"{[EDD_Report_10-01-18].[Days Old].[All].[0]:[EDD_Report_10-01-18].[Days Old].[All].[649]}")
Note I found something about using .[All].[650].lag(1)} but I think for it to work properly your data might need to be sorted?
C. Subtract
C3 =D3-E3
Alternatively, go for the quick and dirty:
=CUBEVALUE("ThisWorkbookDataModel",
"[Measures].[Count of Docs]",
"[EDD_Report].[Category_Name].&["&$B2&"]",
"{[EDD_Report_10-01-18].[Days Old].[All].[650]:[EDD_Report_10-01-18].[Days Old].[All].[99999]}")
Hope this helps and do let me know, I am still curious!

Find values occurring in multiple columns in excel

I have sets of gene probes that are upregulated when put under different chemical stresses. Each column contains all of the upregulated gene probes. I have 12 columns, how do I get a list of gene probes that appear in all 12 columns?
I've been able to find similarities between two columns using the formula
=IF(ISERROR(MATCH(A2,$C$2:$C$21473,0)),"",A2)
but I can't work out how to adapt it to include 12 columns.
G.Ac G.As G.At G.Ac.At G.As.Ac G.As.At G.Cd G.Cu G.Ni G.Cd.Cu G.Cd.Ni G.Ni.Cu
GENE:JGI_V11_3346220103 GENE:JGI_V11_2653050203 GENE:JGI_V11_3299790103 GENE:JGI_V11_359040103 GENE:JGI_V11_2228010103 GENE:JGI_V11_2662750203 GENE:JGI_V11_1926920303 GENE:JGI_V11_3134270303 GENE:JGI_V11_3119540303 GENE:JGI_V11_3134270203 GENE:JGI_V11_1926920303 GENE:JGI_V11_3134270303
GENE:JGI_V11_3164760203 GENE:JGI_V11_565470303 GENE:JGI_V11_2296170203 GENE:JGI_V11_2045300203 GENE:JGI_V11_2421620203 GENE:JGI_V11_2228010303 GENE:JGI_V11_2196580303 GENE:JGI_V11_3134270203 GENE:JGI_V11_3119540203 GENE:JGI_V11_1926920103 GENE:JGI_V11_1926920103 GENE:JGI_V11_1014720202
GENE:JGI_V11_478830203 GENE:JGI_V11_3168730303 GENE:JGI_V11_3311070202 GENE:JGI_V11_3216620102 GENE:JGI_V11_2653050303 GENE:JGI_V11_3300140202 GENE:JGI_V11_2653050303 GENE:JGI_V11_1159220202 GENE:JGI_V11_2024180303 GENE:JGI_V11_1926920303 GENE:JGI_V11_2196580303 GENE:JGI_V11_1159220202
GENE:JGI_V11_3164760303 GENE:JGI_V11_2228010203 GENE:JGI_V11_2341670203 GENE:JGI_V11_1938910303 GENE:JGI_V11_3026230203 GENE:JGI_V11_2449230203 GENE:JGI_V11_3134270303 GENE:JGI_V11_2235750203 GENE:JGI_V11_1981410203 GENE:JGI_V11_3251310202 GENE:JGI_V11_977750103 GENE:JGI_V11_954070203
GENE:JGI_V11_2267320203 GENE:JGI_V11_2268000303 GENE:JGI_V11_2226270101 GENE:JGI_V11_3003640303 GENE:JGI_V11_223520203 GENE:JGI_V11_2662750103 GENE:JGI_V11_2228010103 GENE:JGI_V11_3251310202 GENE:JGI_V11_3198630203 GENE:JGI_V11_3134270303 GENE:JGI_V11_1926920203 GENE:JGI_V11_287750103
GENE:JGI_V11_465160203 GENE:JGI_V11_2268000203 GENE:JGI_V11_2473230303 GENE:JGI_V11_3192220102 GENE:JGI_V11_3026230303 GENE:JGI_V11_3039310303 GENE:JGI_V11_1926920103 GENE:JGI_V11_1159220102 GENE:JGI_V11_3052790202 GENE:JGI_V11_3075830303 GENE:JGI_V11_2196580203 GENE:JGI_V11_3134280203
GENE:JGI_V11_3142970303 GENE:JGI_V11_503720303 GENE:JGI_V11_2236410103 GENE:JGI_V11_3042230103 GENE:JGI_V11_2228010203 GENE:JGI_V11_3028210101 GENE:JGI_V11_2105710303 GENE:JGI_V11_1926920303 GENE:JGI_V11_2131620103 GENE:JGI_V11_1002840203 GENE:JGI_V11_2088480203 GENE:JGI_V11_3196120102
Here are the first 8 rows of the 12 columns. There are 21473 rows in total.
Thanks
You could use an array formula like this to count how many columns a particular gene probe occurs in; a result of 12 means the probe appears in all 12 columns:
=SUM(--(MMULT(TRANSPOSE(ROW(A$2:L$10000)^0),N(A$2:L$10000=A2))>0))
This is a standard way of getting column totals for a 2D array - in this case an array of true/false values corresponding to instances of an array element being equal/unequal to A2.
It is rather a brute force approach - it needs ~120K multiplications for each row. If you copy the formula down for ~10K rows, there is a delay of ~100 seconds on my computer while Excel works out the results.
It must be entered as an array formula using Ctrl+Shift+Enter.
In this dummy data C is the only value that occurs in all 12 columns.
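If the spreadsheet formula is too slow for ~21K rows, the same question can be answered outside Excel with a set intersection, which is a different technique from the MMULT count above. A minimal Python sketch, assuming each column has been exported to a list of strings; the function name `probes_in_all_columns` and the three-column sample data are hypothetical.

```python
from functools import reduce

def probes_in_all_columns(columns):
    """Return the probes that appear in every column, sorted."""
    # Drop empty cells, then intersect the per-column sets.
    sets = [set(p for p in col if p) for col in columns]
    return sorted(reduce(set.intersection, sets))

# Hypothetical three-column example; the real sheet has 12 columns.
columns = [
    ["A", "B", "C"],
    ["C", "D", ""],
    ["E", "C", "A"],
]
common = probes_in_all_columns(columns)
```

Because each column is reduced to a set first, a probe repeated within one column is only counted once for that column.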

Syntax for average if with OR and AND statements (example in body of Q)

I've come across an issue with the AVERAGEIFS formula: it doesn't seem able to handle OR and AND conditions simultaneously.
I'm working with data in the format below. I have one criterion that must always hold, in this example "DF" in column B. I have alternative criteria, either of which is acceptable: "Dog*" or "Cat*" in column A. Column C holds the Cuteness Level.
Therefore, I am trying to work out the average cuteness level of Dogs and Cats with Vet Code DF with an Average If formula.
I've tried the following, which doesn't work:
=AVERAGE(IF(OR(AND(A2:A17="Cat*",B2:B17="DF"),AND(A2:A17="Dog*",B2:B17="DF")),C2:C17, FALSE))
Can anyone please explain where I am going wrong?
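I couldn't figure out AVERAGEIFS with multiple wildcard conditions, but you can try an array formula instead. Two things go wrong in your attempt: OR and AND collapse a whole array into a single TRUE/FALSE instead of testing row by row, and = does not treat * as a wildcard. Use + for OR and * for AND on the row-by-row comparisons, and LEFT for the prefix match (in Excel versions without dynamic arrays, enter it with Ctrl+Shift+Enter):
=AVERAGE(IF(((LEFT(A2:A17,3)="Cat")+(LEFT(A2:A17,3)="Dog"))*(B2:B17="DF"), C2:C17))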
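The condition being asked for, (Dog* OR Cat*) AND vet code DF, has to be evaluated row by row. A minimal Python sketch of that logic follows; the sample rows are made up, since the original data table is not shown.

```python
from statistics import mean

# Hypothetical rows: (animal, vet code, cuteness level).
rows = [
    ("Dog - Lab", "DF", 8),
    ("Cat - Tabby", "DF", 6),
    ("Cat - Siamese", "XY", 9),
    ("Fish", "DF", 2),
]

# (Dog* OR Cat*) AND vet code "DF", tested one row at a time.
selected = [
    cuteness
    for animal, vet, cuteness in rows
    if animal.startswith(("Dog", "Cat")) and vet == "DF"
]
average_cuteness = mean(selected)
```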

Excel: purchase cost function with multiple variables

So I'm not good with excel (computers in general) and can do some things but this one is out of my league.
This is the problem:
The cost of a used car is highly correlated with the following variables:
t= age of car 1 ≤ t ≤ 5 (years)
V= volume of engine 1000 ≤ V ≤ 2500 (cubic centimeters)
D= number of doors D= 2,3,4,5
A= accessories and style A= 1,2,3,4,5,6 (qualitative)
Using regression analysis, the following relationship between the cost and four independent variables was found:
purchase cost= (1+1/t)*V*(D/2+A)
Plot the purchase price of the car as a function of the four variables.
I know how to input the function into excel and only use one number from each variable:
Function:
=(1+(1/B2))*C2*((D2/2)+E2
Where: B2=1, C2=1000, D2=2, E2=1
Which: A1=4000 (for the purchase cost)
What I don't know is how to make the function use multiple number combinations within those variables (i.e. how to change one variable and not the others). I've looked up "youtube" videos and numerous websites to figure this out and none of them showed me what I needed to know. Any help would be greatly appreciated.
I think you have fundamentally solved your problem, but there is a typo in your formula that prevents it from calculating: add a closing parenthesis at the end of your formula. Then, assuming the formula is in cell A1 and your example values are in the cells indicated in your post, simply type a new numeric value into B2, C2, D2 and/or E2 manually and the formula result will update.
If you wish you can create a dropdown list in your cells to hold the list of values for each of your variables. This link should get you going.
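To see how the cost changes when one variable moves and the others stay fixed, it can help to tabulate the formula over a grid of values. Below is a minimal Python sketch of that idea. The function name `purchase_cost` and the grid spacing for V are assumptions; only the formula and the variable ranges come from the question.

```python
from itertools import product

def purchase_cost(t, v, d, a):
    """Cost model from the question: (1 + 1/t) * V * (D/2 + A)."""
    return (1 + 1 / t) * v * (d / 2 + a)

# Sample values inside the stated ranges (the spacing for V is an assumption).
ages = [1, 2, 3, 4, 5]
volumes = [1000, 1500, 2000, 2500]
doors = [2, 3, 4, 5]
accessories = [1, 2, 3, 4, 5, 6]

# One row per combination: (t, V, D, A, cost).
table = [
    (t, v, d, a, purchase_cost(t, v, d, a))
    for t, v, d, a in product(ages, volumes, doors, accessories)
]

# Vary one variable while holding the others fixed, e.g. cost versus age.
cost_by_age = [purchase_cost(t, 1000, 2, 1) for t in ages]
```

Each list like `cost_by_age` gives one curve to plot. In the spreadsheet the equivalent is a column of values for one variable, with the formula filled down and the other three cells referenced absolutely.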

Using OR Logical functions with SUMIFS in Excel

I have two workbooks in Excel, and I copy columns from one to the other.
I would like to copy the numbers from one column, say A, if another column, say B, is equal to "Test Tool" or "Hard Tool". I've written this code and can't get it to work; it just gives me a sum of zero, which is wrong. The last argument doesn't matter, so ignore it.
"=SUMIFS('Tooling forecast template'!R6C17:R500C17,'Tooling forecast template'!R6C7:R500C7,""OR(=Test Tool, =Hard Tool)"" ,'Tooling forecast template'!R6C6:R500C6,""<>Actual tool/equipment change"")"
Here is a method that saves you typing out a large number of SUMIF statements, although it doesn't stop Excel having to calculate the multiple SUMIFs...
=SUM( SUMIFS('Tooling forecast template'!R6C17:R500C17,'Tooling forecast template'!R6C7:R500C7, {"Test Tool", "Hard Tool"} ,'Tooling forecast template'!R6C6:R500C6,"<>Actual tool/equipment change") )
Basically, you calculate the SUMIF with an array of values as your criteria, then wrap that SUMIF in a SUM so that the multiple answers are added together.
This example is quite hard to read due to the long variable names. Here's a simpler example, where you want to add up some numbers where the corresponding letter is either A or B...
The long way:
=SUMIFS(B1:B5, A1:A5, "A") + SUMIFS(B1:B5, A1:A5, "B")
The short way:
=SUM( SUMIFS(B1:B5, A1:A5, {"A","B"}) )
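What the short form computes can be sketched in Python: one conditional sum per criterion, then a sum over those partial results. The sample data and the helper name `sumif` are hypothetical.

```python
# Hypothetical data standing in for A1:A5 (letters) and B1:B5 (numbers).
letters = ["A", "B", "C", "A", "D"]
numbers = [1, 2, 3, 4, 5]

def sumif(criterion):
    """Sum the numbers whose letter equals the criterion."""
    return sum(n for letter, n in zip(letters, numbers) if letter == criterion)

# One conditional sum per criterion, then add the partial sums together.
partial_sums = [sumif(c) for c in ["A", "B"]]
total = sum(partial_sums)
```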
=IF(OR(CellToCheck="Test Tool", CellToCheck="Hard Tool"), CellToCopy, 0)
Just add the two SUMIFS together; it's the same thing!
=SUMIFS('Tooling forecast template'!R6C17:R500C17,'Tooling forecast template'!R6C7:R500C7,"=Test Tool" ,'Tooling forecast template'!R6C6:R500C6,"<>Actual tool/equipment change") + SUMIFS('Tooling forecast template'!R6C17:R500C17,'Tooling forecast template'!R6C7:R500C7,"=Hard Tool" ,'Tooling forecast template'!R6C6:R500C6,"<>Actual tool/equipment change")
Use the fact that A OR B is the same thing as NOT ((NOT A) AND (NOT B)). For example, sum the entries in A if B=1 or C=1 using SUMIFS, by subtracting the rows where neither B nor C equals 1 from the total:
=SUM(A1:A10) - SUMIFS(A1:A10,B1:B10,"<>1",C1:C10,"<>1")
You can achieve the same result with SUMPRODUCT:
=SUM(A1:A10) - SUMPRODUCT(A1:A10,--(B1:B10<>1),--(C1:C10<>1))
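The identity A OR B = NOT ((NOT A) AND (NOT B)) can be checked on a small example. This is a minimal Python sketch with made-up values for columns A, B and C.

```python
# Hypothetical data: values in column A, flags in columns B and C.
a = [10, 20, 30, 40]
b = [1, 0, 1, 0]
c = [0, 0, 1, 1]

# Direct form: sum A where B == 1 or C == 1.
direct = sum(x for x, p, q in zip(a, b, c) if p == 1 or q == 1)

# De Morgan form: total minus the rows where neither flag is 1.
neither = sum(x for x, p, q in zip(a, b, c) if p != 1 and q != 1)
via_complement = sum(a) - neither
```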
Wouldn't this work as well?
Note:
Assuming Col A houses values to be summed.
Assuming Col B houses the tool types.
=SUM(SUMIFS(A:A,B:B,"hard tool"),SUMIFS(A:A,B:B,"test tool"))
