Excel AVERAGEIFS looking up ONE of the criteria columns - excel

I have built a large data set and I need to see the average results given many different criteria. I've done this with the AVERAGEIFS function and it works just fine, however the more and more I add its getting really time intensive.
I'm wondering if there is a way to nest a vlookup or index match or anything like that in the AVERAGEIFS that read the criteria column heading and criteria in a cell (or 2 if they need to be separated) to be added to the AVERAGEIFS.
Here is an example of my spreadsheet:
The first 3 sets of criteria I want to stay locked.
I want it to read what the 4th criteria column and criteria should be by referencing the I11 cell. The highlighted portion in the formula bar is the part that I want to reference I11 so it reads it and knows that the 4th criteria is the 'code' column and the criteria is '>7'. I can separate this into 2 separate cells if need be.
I've tried a few combinations of VLOOKUP and INDEX MATCH but cannot get it to work.
Data as Text:
Price,Type,sub cat,Time,code,amount,Result,,
,,,,,,,,
9.95,t2,d,ac,2.18," 22,780,893 ",0.73,,T2 and D and AC
118.94,u2,d,bo,2.78," 172,110,893 ",4.07,,
57.63,t1,u,ac,7.09," 128,419,877 ",-2.16,,code
8.88,t2,d,ac,1.50," 62,634,868 ",12.72,,amount < 100 000 000
11.61,u1,u,ac,2.14," 146,982,736 ",1.07,,price >10
13.46,u3,u,ac,0.93," 17,513,672 ",-13.93,,
31.53,t1,u,ac,0.89," 47,170,877 ",1.39,,
16.34,t3,d,bo,1.07," 1,914,767,076 ",-1.42,,
111.59,u1,d,bo,0.62," 2,283,546,000 ",0.67,,
72.4,u3,d,bo,10.37," 951,541,514 ",1.13,,
34.55,u3,d,bo,0.77," 951,541,514 ",-2.52,,
42.25,t1,d,bo,1.05," 63,748,352 ",8.88,,
17.18,u3,u,ac,2.64," 140,217,257 ",4.35,,
97.66,t1,d,bo,3.45," 1,070,383,954 ",1.33,,
58.49,t2,u,bo,8.64," 151,876,559 ",-0.92,,
64.48,t2,d,ac,2.35," 291,967,334 ",3.03,,
38.4,t1,u,ac,17.05," 83,478,472 ",-4.31,,
20.87,u3,d,ac,28.92," 214,080,937 ",-2.16,,
36.53,t1,d,ac,1.43," 73,438,589 ",-2.07,,
89.16,t3,u,ac,1.41," 26,786,958 ",-1.75,,
15.84,t1,u,bo,2.90," 133,560,818 ",1.76,,
3.2,u3,u,bo,2.95," 215,677,667 ",-1.06,,
25.46,t1,d,bo,3.92," 57,148,431 ",1.89,,
40,t2,d,ac,8.00," 65,274,903 ",0.61,,
27.72,t1,u,ac,2.50," 381,400,886 ",6.46,,
29.07,u3,u,ac,2.32," 52,632,107 ",-0.78,,
173.31,t1,d,ac,3.58," 31,547,380 ",-4.92,,
18.22,u3,d,ac,0.58," 292,669,493 ",4.06,,
9.59,t1,d,bo,3.60," 266,883,020 ",3.16,,
115.22,t2,d,bo,4.51," 132,376,476 ",0.78,,
64.48,u3,d,ac,3.03," 338,360,104 ",-0.95,,
41.74,t1,u,bo,25.65," 245,766,436 ",-3.42,,
5.99,t3,u,bo,2.15," 175,054,713 ",-4.37,,

Use INDEX/MATCH to return the correct column. This will require that you separate the column name and the criteria:
=AVERAGEIFS(G:G,B:B,"T2",C:C,"D",D:D,"AC",INDEX(A:F,0,MATCH(I11,$A$7:$G$7,0)),J11)

An idea:
I10 - "Write down the limitation. (You have to use <,>,=,<> AND the value, for e.g.: <5)"
I11 - The user can use relations and values.
In J11, you can reference to I11 ;) It works for me.

Related

Excel CUBEVALUE & CUBESET count records greater than a number

I am writing a series of queries to my workbook's data model to retrieve the number of documents by Category_Name which are greater than a certain numbers of days old (e.g. >=650).
Currently this formula (entered in celll C3) returns the correct number for a single Days Old value (=3).
=CUBEVALUE("ThisWorkbookDataModel",
"[Measures].[Count of Docs]",
"[EDD_Report].[Category_Name].&["&$B2&"]",
"[EDD_Report_10-01-18].[Days Old].[34]")
How do I return the number of documents for Days Old values >=650?
The worksheet looks like:
A B C
1 Date PL Count of Docs
2 10/1/2018 ALD 3
3 ...
UPDATE: As suggested in #ama 's answer below, the expression in step B did not work.
However, I created a subset of the Days Old values using
=CUBESET("ThisWorkbookDataModel",
"{[EDD_Report_10-01-18].[Days Old].[all].[650]:[EDD_Report_10-01-18].[Days Old].[All].[3647]}")
The cell containing this cubeset is referenced as the third Member_expression of the original CUBEVALUE formula. The limitation is now that the values for the beginning and end must be members of the Days Old set.
This is limiting, in that, I was hoping for a more general test for >=650 and there is no way to guarantee that specific values of Days Old will be in the query.
First time I hear about CUBE, so you got me curious and I did some digging. Definitely not an expert, but here is what I found:
MDX language should allow you to provide value ranges in the form of {[Table].[Field].[All].[LowerBound]:[Table].[Field].[All].[UpperBound]}.
A. Get the total number of entries:
D3 =CUBEVALUE("ThisWorkbookDataModel",
"[Measures].[Count of Docs]",
"[EDD_Report].[Category_Name].&["&$B2&"]"),
"{[EDD_Report_10-01-18].[Days Old].[All]")
B. Get the number of entries less than 650:
E3 =CUBEVALUE("ThisWorkbookDataModel",
"[Measures].[Count of Docs]",
"[EDD_Report].[Category_Name].&["&$B2&"]"),
"{[EDD_Report_10-01-18].[Days Old].[All].[0]:[EDD_Report_10-01-18].[Days Old].[All].[649]}")
Note I found something about using .[All].[650].lag(1)} but I think for it to work properly your data might need to be sorted?
C. Substract
C3 =D3-E3
Alternatively, go for the quick and dirty:
=CUBEVALUE("ThisWorkbookDataModel",
"[Measures].[Count of Docs]",
"[EDD_Report].[Category_Name].&["&$B2&"]"),
"{[EDD_Report_10-01-18].[Days Old].[All].[650]:[EDD_Report_10-01-18].[Days Old].[All].[99999]}")
Hope this helps and do let me know, I am still curious!

Find values occurring in multiple columns in excel

I have sets of gene probes that are upregulated when put under different chemical stresses. Each column contains all of the upregulated gene probes. I have 12 columns, how do I get a list of gene probes that appear in all 12 columns?
I've been able to find similarities between two columns using the formula
=IF(ISERROR(MATCH(A2,$C$2:$C$21473,0)),"",A2)
but cant work out how to adapt it to include 12 columns
G.Ac G.As G.At G.Ac.At G.As.Ac G.As.At G.Cd G.Cu G.Ni
G.Cd.Cu G.Cd.Ni G.Ni.Cu
GENE:JGI_V11_3346220103 GENE:JGI_V11_2653050203 GENE:JGI_V11_3299790103
GENE:JGI_V11_359040103 GENE:JGI_V11_2228010103 GENE:JGI_V11_2662750203
GENE:JGI_V11_1926920303 GENE:JGI_V11_3134270303 GENE:JGI_V11_3119540303
GENE:JGI_V11_3134270203 GENE:JGI_V11_1926920303 GENE:JGI_V11_3134270303
GENE:JGI_V11_3164760203 GENE:JGI_V11_565470303 GENE:JGI_V11_2296170203
GENE:JGI_V11_2045300203 GENE:JGI_V11_2421620203 GENE:JGI_V11_2228010303
GENE:JGI_V11_2196580303 GENE:JGI_V11_3134270203 GENE:JGI_V11_3119540203
GENE:JGI_V11_1926920103 GENE:JGI_V11_1926920103 GENE:JGI_V11_1014720202
GENE:JGI_V11_478830203 GENE:JGI_V11_3168730303 GENE:JGI_V11_3311070202
GENE:JGI_V11_3216620102 GENE:JGI_V11_2653050303 GENE:JGI_V11_3300140202
GENE:JGI_V11_2653050303 GENE:JGI_V11_1159220202 GENE:JGI_V11_2024180303
GENE:JGI_V11_1926920303 GENE:JGI_V11_2196580303 GENE:JGI_V11_1159220202
GENE:JGI_V11_3164760303 GENE:JGI_V11_2228010203 GENE:JGI_V11_2341670203
GENE:JGI_V11_1938910303 GENE:JGI_V11_3026230203 GENE:JGI_V11_2449230203
GENE:JGI_V11_3134270303 GENE:JGI_V11_2235750203 GENE:JGI_V11_1981410203
GENE:JGI_V11_3251310202 GENE:JGI_V11_977750103 GENE:JGI_V11_954070203
GENE:JGI_V11_2267320203 GENE:JGI_V11_2268000303 GENE:JGI_V11_2226270101
GENE:JGI_V11_3003640303 GENE:JGI_V11_223520203 GENE:JGI_V11_2662750103
GENE:JGI_V11_2228010103 GENE:JGI_V11_3251310202 GENE:JGI_V11_3198630203
GENE:JGI_V11_3134270303 GENE:JGI_V11_1926920203 GENE:JGI_V11_287750103
GENE:JGI_V11_465160203 GENE:JGI_V11_2268000203 GENE:JGI_V11_2473230303
GENE:JGI_V11_3192220102 GENE:JGI_V11_3026230303 GENE:JGI_V11_3039310303
GENE:JGI_V11_1926920103 GENE:JGI_V11_1159220102 GENE:JGI_V11_3052790202
GENE:JGI_V11_3075830303 GENE:JGI_V11_2196580203 GENE:JGI_V11_3134280203
GENE:JGI_V11_3142970303 GENE:JGI_V11_503720303 GENE:JGI_V11_2236410103
GENE:JGI_V11_3042230103 GENE:JGI_V11_2228010203 GENE:JGI_V11_3028210101
GENE:JGI_V11_2105710303 GENE:JGI_V11_1926920303 GENE:JGI_V11_2131620103
GENE:JGI_V11_1002840203 GENE:JGI_V11_2088480203 GENE:JGI_V11_3196120102
Heres the first 8 rows of the 12 columns. There are 21473 rows in total.
Thanks
You could use an array formula like this to count how many columns a particular gene probe occurs in
=SUM(--(MMULT(TRANSPOSE(ROW(A$2:L$10000)^0),N(A$2:L$10000=A2))>0))
This is a standard way of getting column totals for a 2D array - in this case an array of true/false values corresponding to instances of an array element being equal/unequal to A2.
It is rather a brute force approach - it needs ~120K multiplications for each row. If you copy the formula down for ~10K rows, there is a delay of ~100 seconds on my computer while Excel works out the results.
Must be entered as an array formula using CtrlShiftEnter
In this dummy data C is the only value that occurs in all 12 columns.

Using tbl.Lookup to match just part of a column value

This question relates to the Schematiq add-in for Microsoft Excel.
Using =tbl.Lookup(table, columnsToSearch, valuesToFind, resultColumn, [defaultValue]) the values in the valuesToFind column have a consistent 3 characters to the left and then varying characters after (e.g. 908-123456 or 908-321654 - i.e. 908 is always consistent)
How can I tell the function to lookup the value based on the first 3 characters only? The expected answer should be the sum of the results of the above, i.e. 500 + 300 = 800
tbl.Lookup() works by looking for an exact match - this helps ensure it's fast but in this case it means you need an extra step to calculate a column of lookup values, something like this:
A2: =tbl.CalculateColumn(A1, "code", "x => LEFT(x, 3)", "startOfCode")
This will give you a new column that you can use for the columnsToSearch argument, however tbl.Lookup() also looks for just one match - it doesn't know how to combine values together if there is more than one matching row in the table, so I think you also need one more step to group your table by the first 3 chars of the code, like this:
A3: =tbl.Group(A2, "startOfCode", "amount")
Because tbl.Group() adds values together by default, this will give you a table with a row for each distinct value of startOfCode and the subtotal of amount for each of those values. Finally, you can do the lookup exactly as you requested, which for your input table will return 800:
A4: =tbl.Lookup(A3, "startOfCode", "908", "amount")

Excel - 2 tables - If 2 cells in a single row match, return another cell of same row

Working with 2 separate data sets (with duplicates)
Dataset is unique identified by an ID.
There may not be an entry for the timestamp I require.
Datasets are quite large, and due to duplicates, can't use vlookup.
Samples:
Table 1:
Device Name|Time Bracket| On/Off?
ID1 |06:20:00 |
ID2 |06:20:00 |
ID3 |06:30:00 |
Table 2:
Device Name |Timestamp |On/Off?
ID1 |06:20:00 |On
ID2 |06:50:00 |Off
ID3 |07:20:00 |Off
What I want to achieve:
I want an if statement to check if:
1) device ID matches AND
2) timestamp matches
If so, return the value of On/Off from Table 2.
If not, then I want it to return the value of the cell above it IF it's the same device, otherwise just put "absent" into the cell.
I thought I could do this with some IF statements like so:
=if(HOUR([#[Time Bracket]]) = HOUR(Table13[#[Timestamp Rounded (GMT)]]) and
minute([#[Time Bracket]]) = minute(Table13[#[Timestamp Rounded (GMT)]]) and
[#[Device Name]]=Table13[#[Device Name]], Table13[#[On/Off?]],
IF([#[Device Name]]=Table13[#[Device Name]], INDIRECT("B" and Rows()-1), "absent"))
(I put some newlines in there for readability)
However, this doesn't seem to resolve at all... what am I doing wrong?
Is this even the correct way of achieving this?
I've also tried something similar with a VLookUp, but that failed horribly.
Thanks all!
To not deal with array formulas or merging strings which, (not in your case) can still be wrong at the end, I suggest the use of COUNTIFS due to the fact, you have a very small amount of outcomes (just on or off)...
for the first table (starting at A1, so the formula is at C2):
=IFERROR(CHOOSE(
OR(COUNTIFS(Table13[Device Name],[#[Device Name]],Table13[Timestamp],[#[Time Bracket]],Table13[On/Off?],"On"))+
OR(COUNTIFS(Table13[Device Name],[#[Device Name]],Table13[Timestamp],[#[Time Bracket]],Table13[On/Off?],"Off"))*2
,"On","Off","Error"),IF(A1=[#[Device Name]],C1,"Absent"))
this will also show "Error" of a match for "On" and "Off" is shown... to skip that and increase the speed, you also could use:
=IF(COUNTIFS(Table13[Device Name],[#[Device Name]],Table13[Timestamp],[#[Time Bracket]],Table13[On/Off?],"On"),"On",
IF(COUNTIFS(Table13[Device Name],[#[Device Name]],Table13[Timestamp],[#[Time Bracket]],Table13[On/Off?],"Off"),"Off",
IF(A1=[#[Device Name]],C1,"Absent")))
For both the "Device Name" is at column A, "Time Bracket" at column B and "On/Off?" at column C while the table starts at row 1... If that is not the case for you, then change A1 and C1 so they match
(Also inserted line-breaks for better reading)
Picture to show the layout:
I picked the second formula to show how it works... also, this formula should not be able to return 0's... I'm confused
Couple of good suggestions, however using the helper column as suggested in the topic by Scott Craner above worked.
Created a helper column of concat'd device ID and timestamp for both tables, then did a simple VlookUp.
Another lesson learned: Think outside of the box, and go with simple solutions, rather than try + be too clever like I was doing... :)

Return a row number that matches multiple criteria in vbs excel

I need to be able to search my whole table for a row that matches multiple criteria. We use a program that outputs data in the form of a .csv file. It has rows that separate sets of data, each of these headers don't have any columns that are unique in of them self but if i searched the table for multiple values i should be able to pinpoint each header row. I know i can use Application.WorksheetFunction.Match to return a row on a single criteria but i need to search on two three or four criteria.
In pseudo-code it would be something like this:
Return row number were column A = bill & column B = Woods & column C = some other data
We need to work with arrays:
There are 2 kinds of arrays:
numeric {1,0,1,1,1,0,0,1}
boolean {TRUE,FALSE,TRUE,TRUE,TRUE,FALSE,FALSE,TRUE}
to convert between them we can use:
MATCH function
MATCH(1,{1,0,1,1,1,0,0,1},0) -> will result {TRUE,FALSE,TRUE,TRUE,TRUE,FALSE,FALSE,TRUE}
simple multiplication
{TRUE,FALSE,TRUE,TRUE,TRUE,FALSE,FALSE,TRUE}*{TRUE,FALSE,TRUE,TRUE,TRUE,FALSE,FALSE,TRUE} -> will result {1,0,1,1,1,0,0,1}
you can can check an array in the match function, entering it like in the picture below, be warned that MATCH function WILL TREAT AN ARRAY AS AN "OR" FUNCTION (one match will result in true
ie:
MATCH(1,{1,0,1,1,1,0,0,1},0)=TRUE
, YOU MUST CTR+SHIFT+ENTER !!! FOR IT TO GIVE AN ARRAY BACK!!!
in the example below i show that i want to sum the hours of all the employees except the admin per case
we have 2 options, the long simple way, the complicated fast way:
long simple way
D2=SUMPRODUCT(C2:C9,(A2=A2:A9)*("admin"<>B2:B9)) <<- SUMPRODUCT makes a multiplication
basically A1={2,3,11,3,2,4,5,6}*{0,1,1,0,0,0,0,0} (IT MUST BE A NUMERIC ARRAY TO THE RIGHT IN SUMPRODUCT!!!)
ie: A1=2*0+3*1+11*1+3*0+2*0+4*0+5*0+6*0
this causes a problem because if you drag the cell to autocomplete the rest of the cells, it will edit the lower and higher values of
ie: D9=SUMPRODUCT(C9:C16,(A9=A9:A16)*("admin"<>B9:B16)), which is out of bounds
same as the above if you have a table and want to view the results in a diferent order
the fast complicated way
D3=SUMPRODUCT(INDIRECT("c2:c9"),(A3=INDIRECT("a2:a9"))*("admin"<>INDIRECT("b2:b9")))
it's the same, except that INDIRECT was used on the cells that we want not be modified when autocompleting or table reorderings
be warned that INDIRECT sometimes give VOLATILE ERROR,i recommend not using it on a single cell or using it only once in an array
f* c* i cant post pictures :(
table is:
case emplyee hours totalHoursPerCaseWithoutAdmin
1 admin 2 14
1 him 3 14
1 her 11 14
2 him 3 5
2 her 2 5
3 you 4 10
3 admin 5 10
3 her 6 10
and for the functions to check the arrays, open the insert function button (it looks like and fx) then doubleclick MATCH and then if you enter inside the Lookup_array a value like
A2=A2:A9 for our example it will give {TRUE,TRUE,TRUE,FALSE,FALSE,FALSE,FALSE,FALSE} that is because only the first 3 lines are from case=1
Something like this?
Assuming that you data in in A1:C20
I am looking for "Bill" in A, "Woods" in B and "some other data" in C
Change as applicable
=IF(INDEX(A1:A20,MATCH("Bill",A1:A20,0),1)="Bill",IF(INDEX(B1:B20,MATCH("Woods",B1:B20,0),1)="Woods",IF(INDEX(C1:C20,MATCH("some other data",C1:C20,0),1)="some other data",MATCH("Bill",A1:A20,0),"Not Found")))
SNAPSHOT
I would use this array* formula (for three criteria):
=MATCH(1,((Range1=Criterion1)*(Range2=Criterion2)*(Range3=Criterion3)),0)
*commit with Ctrl+Shift+Enter

Resources