sum max value across multiple non-adjacent columns - excel

In Excel, is there a neat way (e.g. using array formulas) to do this? I have three non-adjacent columns, all containing numbers, and I want the sum of the maximum value in each row. The best I could come up with is this:
=SUM(MAX(E23, H23, K23),MAX(E24, H24, K24), MAX(E25, H25, K25),
MAX(E26, H26, K26), MAX(E27, H27, K27), MAX(E28, H28, K28))
Any help is greatly appreciated.
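To pin down the computation being asked for (this is not an Excel answer), here is a minimal Python sketch. The three lists are made-up stand-ins for E23:E28, H23:H28 and K23:K28, and `sum_of_row_maxima` is a hypothetical helper name.

```python
# Hypothetical sample values standing in for E23:E28, H23:H28 and K23:K28.
col_e = [3, 8, 1, 4, 9, 2]
col_h = [5, 2, 7, 4, 1, 6]
col_k = [4, 9, 0, 3, 5, 8]


def sum_of_row_maxima(*columns):
    """Return the sum of the largest value found in each row."""
    # zip(*columns) walks the columns row by row, like E23/H23/K23, E24/H24/K24, ...
    return sum(max(row) for row in zip(*columns))


total = sum_of_row_maxima(col_e, col_h, col_k)
print(total)  # 5 + 9 + 7 + 4 + 9 + 8 = 42
```

The columns do not need to be adjacent here either; only the row alignment matters.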

Related

Count Values for Each Number in a cell in a Column

I have an Excel sheet like the following, and would like to go down each row and add 1 to the count of each of the numbers listed under the L3 column. Eventually, I would like to output something like this:
L3s     Count   Attr Ids
4770    10      [370, 380, ...]
6420    8       [481, 490...]
21253   20      [580....290]
...     ...     ...
The count is derived by going through all of the rows and adding 1 to each L3 number whenever it is encountered. Attr Ids are the ids that contributed to the count. Is there a simple way to accomplish this in Excel without resorting to VBA or Python?
Thanks in advance!
If you have Excel O365 for Windows, you can use the following formulas.
(Note that I made the original data into a Table.)
Sorted unique list of the L3s:
=SORT(UNIQUE(FILTERXML("<t><s>" &SUBSTITUTE(SUBSTITUTE(TEXTJOIN("</s><s>",TRUE,Table1[L3s])," ",""),",","</s><s>")&"</s></t>","//s")))
Count of the L3s:
=COUNT(FILTERXML("<t><s>" &SUBSTITUTE(SUBSTITUTE(TEXTJOIN("</s><s>",TRUE,Table1[L3s])," ",""),",","</s><s>")&"</s></t>","//s[.=" & F8 &"]"))
Associated Attr IDs:
="[" &TEXTJOIN(",",TRUE,FILTER(Table1[attr],ISNUMBER(FIND(","&F8&",",SUBSTITUTE(","&Table1[L3s]& ","," ","")))))&"]"
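For comparison only (the question asked for a formula-only solution), the same split-count-collect logic can be sketched in Python. The rows below are invented sample data standing in for Table1[attr] and Table1[L3s], and `count_l3s` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical rows: (attr id, comma-separated L3 list),
# standing in for Table1[attr] and Table1[L3s].
rows = [
    (370, "4770, 6420"),
    (380, "4770"),
    (481, "6420, 21253"),
]


def count_l3s(rows):
    """Map each L3 number to (count, attr ids that contributed to the count)."""
    attrs_by_l3 = defaultdict(list)
    for attr_id, l3_text in rows:
        # Strip spaces and split on commas, like the nested SUBSTITUTE calls.
        for token in l3_text.replace(" ", "").split(","):
            if token:
                attrs_by_l3[int(token)].append(attr_id)
    # Sort by L3 number, like SORT(UNIQUE(...)).
    return {l3: (len(attrs), attrs) for l3, attrs in sorted(attrs_by_l3.items())}


summary = count_l3s(rows)
for l3, (count, attrs) in summary.items():
    print(l3, count, attrs)
```

Each occurrence of an L3 number adds 1 to its count and records the attr id of the row it came from.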

Excel AVERAGEIFS looking up ONE of the criteria columns

I have built a large data set and I need to see the average results given many different criteria. I've done this with the AVERAGEIFS function and it works just fine, but the more criteria I add, the more time-consuming it gets.
I'm wondering if there is a way to nest a VLOOKUP or INDEX/MATCH (or anything similar) inside the AVERAGEIFS so that it reads the criteria column heading and the criterion from a cell (or two cells, if they need to be separated) and adds them to the AVERAGEIFS.
Here is an example of my spreadsheet:
I want the first three sets of criteria to stay locked.
I want it to work out what the 4th criteria column and criterion should be by referencing cell I11. The highlighted portion in the formula bar is the part that I want to reference I11, so that it reads the cell and knows that the 4th criteria column is the 'code' column and the criterion is '>7'. I can split this into two separate cells if need be.
I've tried a few combinations of VLOOKUP and INDEX/MATCH but cannot get it to work.
Data as Text:
Price,Type,sub cat,Time,code,amount,Result,,
,,,,,,,,
9.95,t2,d,ac,2.18," 22,780,893 ",0.73,,T2 and D and AC
118.94,u2,d,bo,2.78," 172,110,893 ",4.07,,
57.63,t1,u,ac,7.09," 128,419,877 ",-2.16,,code
8.88,t2,d,ac,1.50," 62,634,868 ",12.72,,amount < 100 000 000
11.61,u1,u,ac,2.14," 146,982,736 ",1.07,,price >10
13.46,u3,u,ac,0.93," 17,513,672 ",-13.93,,
31.53,t1,u,ac,0.89," 47,170,877 ",1.39,,
16.34,t3,d,bo,1.07," 1,914,767,076 ",-1.42,,
111.59,u1,d,bo,0.62," 2,283,546,000 ",0.67,,
72.4,u3,d,bo,10.37," 951,541,514 ",1.13,,
34.55,u3,d,bo,0.77," 951,541,514 ",-2.52,,
42.25,t1,d,bo,1.05," 63,748,352 ",8.88,,
17.18,u3,u,ac,2.64," 140,217,257 ",4.35,,
97.66,t1,d,bo,3.45," 1,070,383,954 ",1.33,,
58.49,t2,u,bo,8.64," 151,876,559 ",-0.92,,
64.48,t2,d,ac,2.35," 291,967,334 ",3.03,,
38.4,t1,u,ac,17.05," 83,478,472 ",-4.31,,
20.87,u3,d,ac,28.92," 214,080,937 ",-2.16,,
36.53,t1,d,ac,1.43," 73,438,589 ",-2.07,,
89.16,t3,u,ac,1.41," 26,786,958 ",-1.75,,
15.84,t1,u,bo,2.90," 133,560,818 ",1.76,,
3.2,u3,u,bo,2.95," 215,677,667 ",-1.06,,
25.46,t1,d,bo,3.92," 57,148,431 ",1.89,,
40,t2,d,ac,8.00," 65,274,903 ",0.61,,
27.72,t1,u,ac,2.50," 381,400,886 ",6.46,,
29.07,u3,u,ac,2.32," 52,632,107 ",-0.78,,
173.31,t1,d,ac,3.58," 31,547,380 ",-4.92,,
18.22,u3,d,ac,0.58," 292,669,493 ",4.06,,
9.59,t1,d,bo,3.60," 266,883,020 ",3.16,,
115.22,t2,d,bo,4.51," 132,376,476 ",0.78,,
64.48,u3,d,ac,3.03," 338,360,104 ",-0.95,,
41.74,t1,u,bo,25.65," 245,766,436 ",-3.42,,
5.99,t3,u,bo,2.15," 175,054,713 ",-4.37,,
Use INDEX/MATCH to return the correct column. This requires that you separate the column name and the criterion (here the column name is in I11 and the criterion in J11):
=AVERAGEIFS(G:G,B:B,"T2",C:C,"D",D:D,"AC",INDEX(A:F,0,MATCH(I11,$A$7:$G$7,0)),J11)
An idea:
I10 - a note for the user: "Write down the limitation. (You have to use <, >, = or <> AND the value, e.g. <5)"
I11 - the user enters the relation and the value here.
In J11, you can then reference I11 ;) It works for me.
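The idea behind both answers (pick the criteria column by name, then apply a criterion string such as ">7") can be sketched in Python as follows. The rows are a small subset of the question's data; `parse_criterion` and `average_if` are hypothetical helper names, and the sketch assumes purely numeric criteria.

```python
import operator

# Longer symbols come first so ">=" is not mistaken for ">".
OPS = {">=": operator.ge, "<=": operator.le, "<>": operator.ne,
       ">": operator.gt, "<": operator.lt, "=": operator.eq}


def parse_criterion(text):
    """Split a criterion such as '>7' into (comparison function, number)."""
    text = text.strip()
    for symbol, func in OPS.items():
        if text.startswith(symbol):
            return func, float(text[len(symbol):])
    return operator.eq, float(text)  # a bare number means "equals"


def average_if(rows, fixed, column, criterion, target="Result"):
    """Average `target` over rows matching the fixed criteria and one dynamic criterion."""
    compare, threshold = parse_criterion(criterion)
    picked = [r[target] for r in rows
              if all(r[k] == v for k, v in fixed.items())
              and compare(r[column], threshold)]
    return sum(picked) / len(picked) if picked else None


# A few rows copied from the question's data (amount column omitted).
rows = [
    {"Price": 9.95, "Type": "t2", "sub cat": "d", "Time": "ac", "code": 2.18, "Result": 0.73},
    {"Price": 118.94, "Type": "u2", "sub cat": "d", "Time": "bo", "code": 2.78, "Result": 4.07},
    {"Price": 8.88, "Type": "t2", "sub cat": "d", "Time": "ac", "code": 1.50, "Result": 12.72},
    {"Price": 64.48, "Type": "t2", "sub cat": "d", "Time": "ac", "code": 2.35, "Result": 3.03},
    {"Price": 40, "Type": "t2", "sub cat": "d", "Time": "ac", "code": 8.00, "Result": 0.61},
]

locked = {"Type": "t2", "sub cat": "d", "Time": "ac"}
result = average_if(rows, locked, "code", ">7")
print(result)
```

Changing the last two arguments (for example `"Price"` and `">10"`) switches the fourth criterion without touching the locked ones, which is what the INDEX/MATCH formula does with I11 and J11.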

Excel sum based on matrix condition and multiple criteria

Following from the example here I'm trying to add additional conditions to a sum formula. I've represented an example below:
The output that I'm looking for, for example for Jan 2017, is:
2017
1
UP A 1
UP B 6
UP C 6
DOWN A 1
DOWN B 8
DOWN C 7
I tried with the following formula:
=MMULT(--($B$17:$C$17="X"),MATCH(1,($A23=$C$2:$C$14)*(C$21=$A$2:$A$14)*(C$22=$B$2:$B$14)*($E$2:$E$14=$D$2:$D$14),0))
but I get an #N/A value.
Does anyone know if it is possible to do this?
In your first example the number of columns in array1 and the number of rows in array2 were equal (five). Here you have two columns and 13 rows. That they are unequal here is part (if not all) of the reason why you are having an issue.
Also, your MATCH function returns a single value, not an array.
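The shape rule can be illustrated outside Excel. The sketch below is a plain-Python stand-in for MMULT (the `mmult` helper and the sample values are made up for illustration): the product is only defined when the first array has as many columns as the second has rows.

```python
def mmult(a, b):
    """Multiply two matrices given as lists of rows, enforcing the MMULT shape rule."""
    if len(a[0]) != len(b):
        raise ValueError(
            f"columns of first array ({len(a[0])}) must equal rows of second ({len(b)})"
        )
    # zip(*b) yields the columns of b; each cell is a row-by-column dot product.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


flags = [[1, 0, 1]]            # 1 x 3 row of 0/1 flags, like --(range="X")
amounts = [[10], [20], [30]]   # 3 x 1 column of values

print(mmult(flags, amounts))   # [[40]]: only the flagged values are summed

try:
    mmult([[1, 0]], amounts)   # 1 x 2 against 3 x 1: the shapes do not line up
except ValueError as err:
    print(err)
```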
I have a way to do this using a matrix condition and multiple criteria, but I had to change the problem up a bit; see the photo for an example:
{=MMULT(--(D18:P18="x"),E$2:E$14*(--(A$2:A$14=$C$21)*--(B$2:B$14=$C$22)*--(C$2:C$14=A24)))}
https://i.stack.imgur.com/FEvgR.png
You can create a formula to fill the second matrix with X's; see below:
=IF(OR(INDIRECT("D"&VALUE(D20))=$A$18,INDIRECT("D"&VALUE(D20))=$B$18),"X","")
https://i.stack.imgur.com/4rS4L.png
That being said, I don't think this is particularly efficient: you are treating one of the matrices as all 1's, so you are basically just adding an extra criterion/Boolean with added complexity. Still, you asked for this specifically, and I believe I have delivered that LOL
Just add two SUMIFS together.
=SUMIFS($E$2:$E$14, $A$2:$A$14, C$21, $B$2:$B$14, C$22, $C$2:$C$14, $A23, $D$2:$D$14, IF(INDEX($B$17:$C$19, MATCH($B23, $A$17:$A$19, 0), 1)="x", $B$16))+
SUMIFS($E$2:$E$14, $A$2:$A$14, C$21, $B$2:$B$14, C$22, $C$2:$C$14, $A23, $D$2:$D$14, IF(INDEX($B$17:$C$19, MATCH($B23, $A$17:$A$19, 0), 2)="x", $C$16))

Find values occurring in multiple columns in excel

I have sets of gene probes that are upregulated when put under different chemical stresses. Each column contains all of the upregulated gene probes. I have 12 columns, how do I get a list of gene probes that appear in all 12 columns?
I've been able to find similarities between two columns using the formula
=IF(ISERROR(MATCH(A2,$C$2:$C$21473,0)),"",A2)
but can't work out how to adapt it to include 12 columns.
G.Ac G.As G.At G.Ac.At G.As.Ac G.As.At G.Cd G.Cu G.Ni
G.Cd.Cu G.Cd.Ni G.Ni.Cu
GENE:JGI_V11_3346220103 GENE:JGI_V11_2653050203 GENE:JGI_V11_3299790103
GENE:JGI_V11_359040103 GENE:JGI_V11_2228010103 GENE:JGI_V11_2662750203
GENE:JGI_V11_1926920303 GENE:JGI_V11_3134270303 GENE:JGI_V11_3119540303
GENE:JGI_V11_3134270203 GENE:JGI_V11_1926920303 GENE:JGI_V11_3134270303
GENE:JGI_V11_3164760203 GENE:JGI_V11_565470303 GENE:JGI_V11_2296170203
GENE:JGI_V11_2045300203 GENE:JGI_V11_2421620203 GENE:JGI_V11_2228010303
GENE:JGI_V11_2196580303 GENE:JGI_V11_3134270203 GENE:JGI_V11_3119540203
GENE:JGI_V11_1926920103 GENE:JGI_V11_1926920103 GENE:JGI_V11_1014720202
GENE:JGI_V11_478830203 GENE:JGI_V11_3168730303 GENE:JGI_V11_3311070202
GENE:JGI_V11_3216620102 GENE:JGI_V11_2653050303 GENE:JGI_V11_3300140202
GENE:JGI_V11_2653050303 GENE:JGI_V11_1159220202 GENE:JGI_V11_2024180303
GENE:JGI_V11_1926920303 GENE:JGI_V11_2196580303 GENE:JGI_V11_1159220202
GENE:JGI_V11_3164760303 GENE:JGI_V11_2228010203 GENE:JGI_V11_2341670203
GENE:JGI_V11_1938910303 GENE:JGI_V11_3026230203 GENE:JGI_V11_2449230203
GENE:JGI_V11_3134270303 GENE:JGI_V11_2235750203 GENE:JGI_V11_1981410203
GENE:JGI_V11_3251310202 GENE:JGI_V11_977750103 GENE:JGI_V11_954070203
GENE:JGI_V11_2267320203 GENE:JGI_V11_2268000303 GENE:JGI_V11_2226270101
GENE:JGI_V11_3003640303 GENE:JGI_V11_223520203 GENE:JGI_V11_2662750103
GENE:JGI_V11_2228010103 GENE:JGI_V11_3251310202 GENE:JGI_V11_3198630203
GENE:JGI_V11_3134270303 GENE:JGI_V11_1926920203 GENE:JGI_V11_287750103
GENE:JGI_V11_465160203 GENE:JGI_V11_2268000203 GENE:JGI_V11_2473230303
GENE:JGI_V11_3192220102 GENE:JGI_V11_3026230303 GENE:JGI_V11_3039310303
GENE:JGI_V11_1926920103 GENE:JGI_V11_1159220102 GENE:JGI_V11_3052790202
GENE:JGI_V11_3075830303 GENE:JGI_V11_2196580203 GENE:JGI_V11_3134280203
GENE:JGI_V11_3142970303 GENE:JGI_V11_503720303 GENE:JGI_V11_2236410103
GENE:JGI_V11_3042230103 GENE:JGI_V11_2228010203 GENE:JGI_V11_3028210101
GENE:JGI_V11_2105710303 GENE:JGI_V11_1926920303 GENE:JGI_V11_2131620103
GENE:JGI_V11_1002840203 GENE:JGI_V11_2088480203 GENE:JGI_V11_3196120102
Here are the first 8 rows of the 12 columns. There are 21473 rows in total.
Thanks
You could use an array formula like this to count how many columns a particular gene probe occurs in:
=SUM(--(MMULT(TRANSPOSE(ROW(A$2:L$10000)^0),N(A$2:L$10000=A2))>0))
This is a standard way of getting column totals for a 2D array - in this case an array of true/false values corresponding to instances of an array element being equal/unequal to A2.
It is rather a brute-force approach: it needs ~120K multiplications for each row. If you copy the formula down for ~10K rows, there is a delay of ~100 seconds on my computer while Excel works out the results.
It must be entered as an array formula using Ctrl+Shift+Enter.
In this dummy data C is the only value that occurs in all 12 columns.
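If stepping outside Excel is an option, the same question ("which probes occur in every column?") is a set intersection. Below is a minimal Python sketch with invented dummy data in the spirit of the example above; the column contents and the helper names `common_to_all` and `column_count` are assumptions for illustration.

```python
# Hypothetical dummy data: column name -> list of probe ids (columns may differ in length).
columns = {
    "G.Ac": ["A", "C", "D"],
    "G.As": ["C", "B"],
    "G.At": ["E", "C", "A"],
}


def common_to_all(columns):
    """Return the sorted probe ids that occur in every column."""
    sets = [set(values) - {""} for values in columns.values()]  # ignore blank cells
    return sorted(set.intersection(*sets)) if sets else []


def column_count(columns, probe):
    """Count how many columns contain `probe` (what the array formula computes per row)."""
    return sum(probe in values for values in columns.values())


print(common_to_all(columns))
print(column_count(columns, "A"))
```

Each column is scanned once to build its set, rather than rescanning the whole range for every row as the array formula does.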

Python Pandas: Average column if

In MS Excel there is a handy formula =AVERAGEIF(values, criteria).
Is there a similar way to average the values within one column that meet a certain condition?
I have a column of values in my data frame from -5000 to +5000.
I need to average values between -5000 <= x < 0
And separately average values between 0 < x <= 5000.
NOTE: I'd like to avoid applying a Boolean mask and thereby creating a new DataFrame, because I have lots of columns.
Any help, suggestions, or edits to this post are welcome.
Using a Boolean mask actually does what I need.
df[df>0].mean(axis=0,skipna=True,numeric_only=True)
It returns one value per column. Perfect!
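A slightly fuller sketch covering both ranges from the question, assuming a small made-up frame with columns `a` and `b`. Masking with a Boolean DataFrame replaces non-matching cells with NaN, and `mean()` skips NaN by default, so no per-column bookkeeping is needed.

```python
import pandas as pd

# Hypothetical sample data; the real frame has many more columns.
df = pd.DataFrame({
    "a": [-4000, -1000, 0, 2000, 5000],
    "b": [-500, 300, -1500, 0, 700],
})

# -5000 <= x < 0: cells outside the range become NaN and are ignored by mean().
neg_mean = df[(df >= -5000) & (df < 0)].mean()

# 0 < x <= 5000
pos_mean = df[(df > 0) & (df <= 5000)].mean()

print(neg_mean)
print(pos_mean)
```

The masked frames are temporaries that are discarded after `mean()` runs, so nothing extra has to be stored or named.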

Resources