I'm trying to calculate the Median since a pivot table won't work.
I have a number of conditons that i need to fulfill so i need a
={median(if(and(A:A=A2,B:B=B2,C:C=C2,D:D=D2),T:T,"")}
type formula.
Columns A, B, C and D have the criteria and T has the value that I need the Median of.
I have been able to produce a median with just 1 variable, but i'm only getting #n/a when i try more.
I have seen that an AND function doesn't work with an Array, so is there another way that I can calculate the mean based upon 4 different conditions?
Any Help would be greatly appreciated!
Ed
Array formula do not like AND or OR so use * and + respectively to turn the TRUE and FALSE of each of the Boolean test to 1 and 0 respectively.
So with * if any are FALSE it will be 0 and turn the whole to 0, where as with + if any are TRUE then it will be greater than 0 and the IF will return the TRUE result:
=median(if((A:A=A2)*(B:B=B2)*(C:C=C2)*(D:D=D2),T:T))
If you are using Google Sheet (If not, you should :) )
Above, can be achieved using combination of MEDIAN and FILTER functions.
FILTER(range, condition1, [condition2, ...])
=MEDIAN(FILTER(T:T, A:A=A2, B:B=B2, C:C=C2, D:D=D2)
It filters T:T based on the conditions provided next, then Median of the result is returned.
Related
I have a DataFrame as shown in the attached image. My columns of interest are fgr and fgr1. As you can see, they both contain values corresponding to years.
I want to iterate in the the two columns and for any value present, I want 1 if the value is present or else 0.
For example, in fgr the first value is 2028. So, the first row in column 2028 will have a value 1 and all other columns have value 0.
I tried using lookup but I did not succeed. So, any pointers will be really helpful.
Example dataframe
Data:
Data file in Excel
This fill do you job. You can use for loops aswell but I think this approach will be faster.
df["Matched"] = df["fgr"].isin(df["fgr1"])*1
Basically you check if values from one are in anoter column and if they are, you get True or False. You then multiply by 1 to get 1 and 0 instead of True or False.
From this answer
Not the most efficient, but should work for your case(time consuming if large dataset)
s = df.reset_index().melt(['index','fgr','fgr1'])
s['value'] = s.variable.eq(s.fgr.str[:4]).astype(int)
s['value2'] = s.variable.eq(s.fgr1.str[:4]).astype(int)
s['final'] = np.where(s['value']+s['value2'] > 0,1,0)
yourdf = s.pivot_table(index=['index','fgr','fgr1'],columns = 'variable',values='final',aggfunc='first').reset_index(level=[1,2])
yourdf
Following from the example here I'm trying to add additional conditions to a sum formula. I've represented an example below:
The output that I'm looking for for example for Jan 2017 is
2017
1
UP A 1
UP B 6
UP C 6
DOWN A 1
DOWN B 8
DOWN C 7
I tried with the following formula:
=MMULT(--($B$17:$C$17="X"),MATCH(1,($A23=$C$2:$C$14)*(C$21=$A$2:$A$14)*(C$22=$B$2:$B$14)*($E$2:$E$14=$D$2:$D$14),0))
but I get a N/A value.
Does anyone know it if is possible to do it?
In your first example the number of rows in array1 and number of columns in array2 were equal, five. Here you have two columns and 13 rows. That they are unequal here is part (all) of the reason why you are having an issue.
Also your match function is returning a Boolean not an array
I have a way to do this using matrix condition and multiple criteria but had to change problem up a bit, see photo for example:
{=MMULT(--(D18:P18="x"),E$2:E$14*(--(A$2:A$14=$C$21)*--(B$2:B$14=$C$22)*--(C$2:C$14=A24)))"
https://i.stack.imgur.com/FEvgR.png
You can create a formula to fill the second matrix with X's see below
=IF(OR(INDIRECT("D"&VALUE(D20))=$A$18,INDIRECT("D"&VALUE(D20))=$B$18),"X","")
https://i.stack.imgur.com/4rS4L.png
That being said I don't think this is particularly efficient as you are treating the one of the matrixes as a all 1's so you basically just adding an extra criteria / Boolean with added complexity....that being said u asked for this specifically and I believe that I have delivered that LOL
Just add two SUMIFS together.
=SUMIFS($E$2:$E$14, $A$2:$A$14, C$21, $B$2:$B$14, C$22, $C$2:$C$14, $A23, $D$2:$D$14, IF(INDEX($B$17:$C$19, MATCH($B23, $A$17:$A$19, 0), 1)="x", $B$16))+
SUMIFS($E$2:$E$14, $A$2:$A$14, C$21, $B$2:$B$14, C$22, $C$2:$C$14, $A23, $D$2:$D$14, IF(INDEX($B$17:$C$19, MATCH($B23, $A$17:$A$19, 0), 2)="x", $C$16))
I have a scale, based on which I decide the value of the coefficient for the multiplication. The scale looks as following:
Which means that:
for Category1: when value>=1.000.000 then coef is 1, when value>=500.000 then coef is 0.8 and etc.
Same logic applies for Category2;
Then I have input data in the following format:
Company !MainCat|Sales Amount|
Company1|T1 | 6.500.000|
Company2|T2 | 70.000|
I need to find corresponding coefficient, ratio of the coeffitient and the value (=ratio*MaxCoef). Currently, I am finding coef the following way:
- for company1:
=IF(C8>=$D2;$D$1;IF(C8>=$E2;$E$1;IF(C8>=$F2;$F$1;IF(C8>=$G2;$G$1;IF(C8>=$H2;$H$1;IF(C8>=$I2;$I$1))))))
That is literally hardcoded and doesn't look good. Maybe there is a better way of doing ? Any suggestions?
Formula view:
You can COUNTIF(range, [criteria] < value) * 0.2 as your add 0.2 per coef stage.
To you data do: =COUNTIF(D2:H2, "<"&C8) * 0.2, count how many stages the value passes * the value per stage.
Your count if range needs to be until H2 as I2 is 0, so inferior to value and gets counted.
To combine the COUNTIF() with a dynamic search for the right category based on MainCat you can MATCH() the MainCat with Code which will give the row where the Code is located and utilize INDIRECT() to apply it as range.
=COUNTIF(INDIRECT("D"&MATCH(B8,B:B,0)&":H"&MATCH(B8,B:B,0)),"<"&C8)*0.2
MATCH(B8,B:B,0) - will match the value on B8 (lets say T1) and return the row 2.
INDIRECT("D"&MATCH(B8,B:B,0)&":H"&MATCH(B8,B:B,0) = INDIRECT("D"&2&":H"&2) - will turn the text into an actual range to be use by the COUNTIF().
Create a table ‚Mapping’ that contains two columns, ‚Category’ and ‚Coefficient‘, then use INDEX-MATCH on it as described in https://www.deskbright.com/excel/using-index-match/.
=INDEX(Mapping[Category]; MATCH([Coefficient]; Mapping[Coefficient]; -1))
This example assumes that you put this formula into a table that has a column named ‚Coefficient‘ with the input value to your multiple IFs.
The trick is that as a match_type argument, provide either -1 or 1, according to your needs.
You can do this in VBA. Write your own function which ends in something like that
=MyOwnScale(C8; B8; A2:I3)
The first parameter of your VBA-function is the value, the second the category and the third is the range with the thresholds. So you can move your cascading IF-loops in VBA-Code and you (and your users) see only a clean function call in the cell.
In MS Excel there is a handy formula =AVERAGEIF(values, criteria).
Is there a similar way to average values within one columns that conform to certain condition?
I have a column of values in my data frame from -5000 to +5000.
I need to average values between -5000 <= x < 0
And separately average values between 0 < x <= 5000.
NOTE: I'd like to avoid applying Boolean mask and therefore creating new dataframe, because I have lots of columns.
Any help, suggestions, or edits to this post are welcome.
Using Boolean mask actually does what I need.
df[df>0].mean(axis=0,skipna=True,numeric_only=True)
It returns as many single values as I have columns. Perfect!
I have been trying to get this array function to output (non-zero) minimum values in the 'FINAL DATA' AE column. Can you see a structural error in this formula?
=IF($C$4="All EMEA",
MIN(IF('FINAL DATA'!$2:$AE$250000<>0,
('FINAL DATA'!$J$2:$J$250000=$C$4)*('FINAL DATA'!$E$2:$E$250000=$E$4)*( 'FINAL DATA'!$AE$2:$AE$250000))),
MIN(IF('FINAL DATA'!$AE$2:$AE$250000<>0,
('FINAL DATA'!$K$2:$K$250000=$C$4)*('FINAL DATA'!$E$2:$E$250000=$E$4)*( 'FINAL DATA'!$AE$2:$AE$250000)))
)
By using <>0 that will eliminate zeroes and blanks, so that isn't the problem.....[although if you only want to eliminate blanks and have zero as a valid return value you should use <>""]
You can't multiply the conditions with the number range because by multiplying you get zeroes for any rows where the conditions are not satisfied, use multiple IFs instead, like this:
=MIN(IF('FINAL DATA'!$AE$2:$AE$250000<>0,IF('FINAL DATA'!$J$2:$J$250000=$C$4,IF('FINAL DATA'!$E$2:$E$250000=$E$4,'FINAL DATA'!$AE$2:$AE$250000))))
Second line, you have !$2, no column specified.
MIN(IF('FINAL DATA'!$2:$AE$250000<>0,
Also, it looks like you are trying to run a single If comparison against a range, which I don't think will work the way you are trying to use it.
Barry has identified the core problem (tests returnimg 0 to the MIN function).
Here's a refactor of your formula (still an array formula) that solves this, and is quite a bit shorter
=MIN(IF(($S:$S<>0)*($E:$E=$E$4)*(IF($C$4="All EMEA",$J:$J,$K:$K)=$C$4),
($S:$S)))
Note that this (as would your original formaul, when fixed) will return 0 if there are no qualifying values >0 in the ranges
You can eliminate the zeros by using an IF() function in an array formula. Consider the following:
A
Row -----
1 0
2 7
3 5
4 6
5
6 3
The array formula =MIN(IF($A$1:$A$6>0,$A$1:$A$6)) will return 3 because the 0 and blank cell are eliminated with the >0 portion of the if statement.