I'm creating a set of formulas to analyze different sets of json data. I would like to show the uniqueness for each field in the dataset and the top 3 values per field. The json data is pasted on one of the sheets, and the results of my analyses are shown on a different sheet.
An example of some arbitrary raw data:
For this dataset I can create the following formulas (all similar coloured cells are matrix formulas):
Cell A1 contains a formula that dynamically returns all headers (yellow). If the pasted data contains more fields, this list expands automatically. The pink area also grows or shrinks based on the number of records and fields in the raw data.
What I would like to know is how to set up the following formulas:
Row 2: Return whether the values are all unique, or how many variations there are within each column. I already have the formula for a single column, but I would like a matrix formula so that it automatically grows or shrinks as well.
Row 3 to 5: Return the top 3 of values within each column.
An example of the header formula (yellow):
=LET(SUB,INDIRECT("A8:"&ADDRESS(8,number_of_fields)),SUBSTITUTE(MID(SUB,1,FIND(":",SUB)-1),"""",""))
(formula translated from Dutch syntax)
I know how to manually copy the formulas over, but I'm sure it's possible to convert this into a matrix formula. For example, is there a function like Repeat, but for formulas repeating for x amount of cells?
Edit after answer: Getting close! The top 3 is almost working as intended. The answer below creates the following result on a more complex dataset:
It sometimes leaves a cell empty in the top 3 for that column. Preferably the top 3 values bubble up to the top, where it populates row 2 and 3 if the column only contains 2 variations.
Maybe a little too literal, but the following formula will spill the top 3 and the split data as shown in the picture:
=LET(data,TRIM(Sheet1!A1:A9),
f,FILTER(data,LEFT(data,1)=""""),
split,DROP(REDUCE(0,f,LAMBDA(a,b,VSTACK(a,TEXTSPLIT(b,",")))),1),
header,SUBSTITUTE(TEXTSPLIT(TAKE(split,1),":"),"""",""),
s,SEQUENCE(1,COLUMNS(split)),
count,DROP(REDUCE(0,s,LAMBDA(a,b,HSTACK(a,MMULT(--(TRANSPOSE(INDEX(split,,b))=INDEX(split,,b)),SEQUENCE(ROWS(f),,1,0))))),,1),
comb,split&" ("&count&")",
allunique,DROP(IFERROR(REDUCE(0,s,LAMBDA(a,b,HSTACK(a,UNIQUE(INDEX(comb,,b))))),""),,1),
fq,DROP(REDUCE(0,s,LAMBDA(a,b,HSTACK(a,ROWS(f)-FREQUENCY(XMATCH(INDEX(split,,b),INDEX(split,,b)),XMATCH(INDEX(split,,b),INDEX(split,,b)))))),-1,1),
_top3,TAKE(REDUCE(0,s,LAMBDA(a,b,HSTACK(a,SORTBY(INDEX(allunique,,b),INDEX(fq,,b),1)))),3,-COLUMNS(split)),
IFERROR(VSTACK(header,_top3,"","",split),""))
split is all data (below),
_top3 is the top 3 of the frequency of the text per column.
You may only need the _top3 data, though.
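To cross-check the spilled top 3 outside Excel, the same counting idea can be sketched in Python. This is only an illustration, not a translation of the formula: it assumes the JSON has already been parsed into a list of records, and the names top_values, top_per_field and sample are hypothetical.

```python
from collections import Counter


def top_values(column, n=3):
    """Return up to n 'value (count)' labels, most frequent first."""
    counts = Counter(column)
    # most_common sorts by count, highest first
    return [f"{value} ({count})" for value, count in counts.most_common(n)]


def top_per_field(records, n=3):
    """records: list of dicts that share the same keys (assumed shape)."""
    fields = list(records[0].keys())
    return {field: top_values([r[field] for r in records], n) for field in fields}


# Hypothetical sample data, not taken from the question
sample = [
    {"name": "a", "kind": "x"},
    {"name": "b", "kind": "x"},
    {"name": "a", "kind": "y"},
    {"name": "c", "kind": "x"},
]
result = top_per_field(sample)
```

A field with fewer than three distinct values simply yields a shorter list, so there are no gaps between entries.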
If I'm not mistaken, this would be the Dutch variant:
=LET(data;SPATIES.WISSEN(A1:A9);
f;FILTER(data;LINKS(data;1)="""");
split;WEGLATEN(REDUCE(0;f;LAMBDA(a;b;VERT.STAPELEN(a;TEKST.SPLITSEN(b;","))));1);
header;SUBSTITUEREN(TEKST.SPLITSEN(NEMEN(WEGLATEN(REDUCE(0;f;LAMBDA(a;b;VERT.STAPELEN(a;TEKST.SPLITSEN(b;","))));1);1);":");"""";"");
s;REEKS(1;KOLOMMEN(WEGLATEN(REDUCE(0;f;LAMBDA(a;b;VERT.STAPELEN(a;TEKST.SPLITSEN(b;","))));1)));
count;WEGLATEN(REDUCE(0;s;LAMBDA(a;b;HOR.STAPELEN(a;PRODUCTMAT(--(TRANSPONEREN(INDEX(WEGLATEN(REDUCE(0;f;LAMBDA(a;b;VERT.STAPELEN(a;TEKST.SPLITSEN(b;","))));1);;b))=INDEX(WEGLATEN(REDUCE(0;f;LAMBDA(a;b;VERT.STAPELEN(a;TEKST.SPLITSEN(b;","))));1);;b));REEKS(RIJEN(f);;1;0)))));;1);
comb;split&" ("&count&")";
allunique;WEGLATEN(ALS.FOUT(REDUCE(0;s;LAMBDA(a;b;HOR.STAPELEN(a;UNIEK(INDEX(comb;;b)))));"");;1);
fq;WEGLATEN(REDUCE(0;s;LAMBDA(a;b;HOR.STAPELEN(a;RIJEN(f)-INTERVAL(X.VERGELIJKEN(INDEX(split;;b);INDEX(split;;b));X.VERGELIJKEN(INDEX(split;;b);INDEX(split;;b))))));-1;1);
_top3;NEMEN(REDUCE(0;s;LAMBDA(a;b;HOR.STAPELEN(a;SORTEREN.OP(INDEX(allunique;;b);INDEX(fq;;b);1))));3;-KOLOMMEN(split));
ALS.FOUT(VERT.STAPELEN(header;_top3;"";"";split);""))
(I'm Dutch, but I'm not familiar with the Dutch equivalents of the newer functions, since I work with the English version and the support pages sometimes contradict each other:
NEMEN might be TAKE, since it's listed as NEMEN here https://support.microsoft.com/nl-nl/office/excel-functies-alfabetisch-b3944572-255d-4efb-bb96-c6d90033e188#bm14, but if you click through, it shows the explanation for TAKE in Dutch (https://support.microsoft.com/nl-nl/office/take-functie-25382ff1-5da1-4f78-ab43-f33bd2e4e003) ).
Edit:
To "drop" the trailing boolean column you can add another argument to DROP (WEGLATEN):
WEGLATEN([data];1;-1) drops the first row of the data (argument 1) and its last column (argument -1):
=LET(data;SPATIES.WISSEN(A1:A9);
f;FILTER(data;LINKS(data;1)="""");
split;WEGLATEN(REDUCE(0;f;LAMBDA(a;b;VERT.STAPELEN(a;TEKST.SPLITSEN(b;","))));1;-1);
header;SUBSTITUEREN(TEKST.SPLITSEN(NEMEN(WEGLATEN(REDUCE(0;f;LAMBDA(a;b;VERT.STAPELEN(a;TEKST.SPLITSEN(b;","))));1);1);":");"""";"");
s;REEKS(1;KOLOMMEN(WEGLATEN(REDUCE(0;f;LAMBDA(a;b;VERT.STAPELEN(a;TEKST.SPLITSEN(b;","))));1)));
count;WEGLATEN(REDUCE(0;s;LAMBDA(a;b;HOR.STAPELEN(a;PRODUCTMAT(--(TRANSPONEREN(INDEX(WEGLATEN(REDUCE(0;f;LAMBDA(a;b;VERT.STAPELEN(a;TEKST.SPLITSEN(b;","))));1);;b))=INDEX(WEGLATEN(REDUCE(0;f;LAMBDA(a;b;VERT.STAPELEN(a;TEKST.SPLITSEN(b;","))));1);;b));REEKS(RIJEN(f);;1;0)))));;1);
comb;split&" ("&count&")";
allunique;WEGLATEN(ALS.FOUT(REDUCE(0;s;LAMBDA(a;b;HOR.STAPELEN(a;UNIEK(INDEX(comb;;b)))));"");;1);
fq;WEGLATEN(REDUCE(0;s;LAMBDA(a;b;HOR.STAPELEN(a;RIJEN(f)-INTERVAL(X.VERGELIJKEN(INDEX(split;;b);INDEX(split;;b));X.VERGELIJKEN(INDEX(split;;b);INDEX(split;;b))))));-1;1);
_top3;NEMEN(REDUCE(0;s;LAMBDA(a;b;HOR.STAPELEN(a;SORTEREN.OP(INDEX(allunique;;b);INDEX(fq;;b);1))));3;-KOLOMMEN(split));
ALS.FOUT(VERT.STAPELEN(header;_top3;"";"";split);""))
And to cope with columns where there are fewer than 3 top-ranked values:
=LET(data,TRIM(Sheet1!A1:A9),
f,FILTER(data,LEFT(data,1)=""""),
split,DROP(REDUCE(0,f,LAMBDA(a,b,VSTACK(a,TEXTSPLIT(b,",")))),1),
header,SUBSTITUTE(TEXTSPLIT(TAKE(split,1),":"),"""",""),
s,SEQUENCE(1,COLUMNS(split)),
count,DROP(REDUCE(0,s,LAMBDA(a,b,HSTACK(a,MMULT(--(TRANSPOSE(INDEX(split,,b))=INDEX(split,,b)),SEQUENCE(ROWS(f),,1,0))))),,1),
comb,split&" ("&count&")",
allunique,DROP(IFERROR(REDUCE(0,s,LAMBDA(a,b,HSTACK(a,UNIQUE(INDEX(comb,,b))))),""),,1),
fq,DROP(REDUCE(0,s,LAMBDA(a,b,HSTACK(a,ROWS(f)-FREQUENCY(XMATCH(INDEX(split,,b),INDEX(split,,b)),XMATCH(INDEX(split,,b),INDEX(split,,b)))))),-1,1),
_top3,TAKE(REDUCE(0,s,LAMBDA(a,b,HSTACK(a,SORTBY(INDEX(allunique,,b),INDEX(fq,,b),1)))),3,-COLUMNS(split)),
_top3minus,DROP(IFERROR(REDUCE(0,s,LAMBDA(a,b,HSTACK(a,FILTER(INDEX(_top3,,b),INDEX(_top3,,b)<>"")))),""),,1),
IFERROR(VSTACK(header,_top3minus,"","",split),""))
I'm trying to translate a machine-learning model into excel, so that data analysts could play with it interactively.
I'd like to transform a categorical variable into dummy representation:
WeekDay
Monday
Thursday
to
WeekDay
{1,0,0,0,0,0,0}
{0,0,0,1,0,0,0}
Using excel arrays.
I tried this:
={INT(A1="Monday"),INT(A1="Tuesday"),INT(A1="Wednesday"), ...}
However, for some reason, Excel doesn't accept formulas in array expressions.
This approach does work, but it is problematic, since it does not allow combining multiple arrays into one:
=IF(A1="Monday", {1,0,0,0,0,0,0}, IF(A1="Tuesday", {0,1,0,0,0,0,0}, ....))
Also, it's super ugly.
Any ideas?
To get your array you can use INDEX like this:
INDEX(IF(TEXT(ROW($2:$8),"dddd")=A1,1,0),0)
This returns a vertical array.
To return a horizontal array, use:
INDEX(IF(TEXT(COLUMN($B:$H),"dddd")=A2,1,0),0)
I have spilled the results of the array in the photo below:
If one has the Dynamic Array Formula SEQUENCE the ROW and COLUMN can be replaced with:
SEQUENCE(7,,2)
and
SEQUENCE(,7,2)
respectively.
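For readers who want to check the expected output outside Excel, here is a minimal Python sketch of the same comparison: each weekday name in a fixed Monday-to-Sunday list is compared with the input, giving a 0/1 vector. The names WEEKDAYS, one_hot_weekday and encode_column are hypothetical.

```python
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def one_hot_weekday(name):
    """Return a 7-element 0/1 list with a 1 at the position of the weekday name."""
    return [int(day == name) for day in WEEKDAYS]


def encode_column(names):
    """Encode a whole column of weekday names, one vector per row."""
    return [one_hot_weekday(n) for n in names]


encoded = encode_column(["Monday", "Thursday"])
```

An unknown name produces an all-zero vector, which matches what the IF(...=A1,1,0) comparison would give.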
I am not sure what the end goal is, but instead of an array I would suggest the following (again, assuming you are not using VBA; even if you are, this is very possible with Case functions):
Use the formula =WEEKDAY(cell, return_type), where return_type sets which day of the week counts as 1.
If you use the actual date (e.g. 5/1/2020), you can display only the day of the week through custom formatting (data for analysis will typically already have the full date):
Cell A20 = "5/1/2020"; in Long Date format it displays "Friday, May 1, 2020"
A cell with the formula '=WEEKDAY(A20,1)' returns 6
A cell with the formula '=WEEKDAY(A20,2)' returns 5 (is this similar enough to {0,0,0,0,1,0,0} for you to use?)
If using VBA, you could define a range with logic to turn this into {0,0,0,0,1,0,0}
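As an illustration of turning a weekday number into the 0/1 vector, here is a small Python sketch (not VBA). It relies on date.isoweekday(), which numbers Monday as 1 through Sunday as 7, the same convention as WEEKDAY(cell, 2). The function name one_hot_from_date is hypothetical.

```python
from datetime import date


def one_hot_from_date(d):
    """Return a 7-element 0/1 list, Monday first, for the weekday of date d."""
    position = d.isoweekday()  # Monday=1 .. Sunday=7, like WEEKDAY(cell, 2)
    return [int(i == position) for i in range(1, 8)]


# 5/1/2020 (May 1, 2020), the date used in the example above
vector = one_hot_from_date(date(2020, 5, 1))
```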
Hope this helps.
This was taken and slightly improved from a question that has since been deleted.
For those who can see deleted posts, it was taken from here: https://stackoverflow.com/questions/39793322/three-dimensional-lookup-no-concatenate-or-named-ranges-excel
I'm trying to do a three dimensional lookup without named ranges or concatenates. Simplified, my data is on the form:
Column1 Column2 Column3
Scott
P 1 2 3
M 4 5 6
N 7 8 9
George
P 10 11 12
M 13 14 15
N 16 17 18
I now want to search for a specific name and then for a specific letter within that name's table. I then want to match this row number with a specific column.
I tried a simple INDEX/MATCH:
=INDEX(A:D,MATCH("M",A:A,0),MATCH("Column1",1:1,0))
And that works for the first name but not for any others, as it finds the first instance of M.
How do I modify it to look for a different name?
I have answered below, but want to see if someone has a better solution.
I used an IF() statement array formula to find what the P row number was after the George row... I also needed to use the MIN() function to get the first P row number after the name.
Beyond that, it's a simple INDEX() function.... that racked my brain for over an hour :).
=INDEX($A$1:$D$9,MIN(IF((ROW(A1:A9)>MATCH($F$4,A1:A9,0))*(A1:A9=$F$5),ROW(A1:A9),"")),MATCH($F$6,$A$1:$D$1,0))
Don't Forget!
Use Ctrl+Shift+Enter when finishing the formula, so it gets evaluated as an array formula.
You can use two other INDEX/MATCH's inside the first MATCH to set the lookup range. Then you simply need to add the MATCH() to find the absolute position of the name.
=INDEX(A:D,MATCH($H$4,INDEX(A:A,MATCH($H$3,A:A,0)):INDEX(A:A,MATCH($H$3,A:A,0)+4),0)+MATCH($H$3,A:A,0)-1,MATCH($H$5,$1:$1,0))
This one works better and does not have a size constraint:
=INDEX(A:D,MATCH(F4,INDEX(A:A,MATCH(F3,A:A,0)):A1040000,0)+MATCH(F3,A:A,0)-1,MATCH(F5,A1:D1,0))
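The idea behind this formula (find the name row, then search for the letter only from that row downward) can be sketched in Python as follows. This is a sketch under assumptions: the sheet is modeled as a list of rows whose first cell holds the name or letter, and lookup, header and rows are hypothetical names.

```python
def lookup(rows, header, name, letter, column):
    """Find `letter` at or below the row holding `name`, then pick `column`."""
    keys = [row[0] for row in rows]
    name_pos = keys.index(name)                    # like MATCH(F3, A:A, 0)
    # Search only from the name row downward, like the INDEX(...):A1040000 range
    letter_offset = keys[name_pos:].index(letter)
    col = header.index(column)                     # like MATCH(F5, A1:D1, 0)
    return rows[name_pos + letter_offset][col]


header = ["Key", "Column1", "Column2", "Column3"]
rows = [
    ["Scott"],
    ["P", 1, 2, 3],
    ["M", 4, 5, 6],
    ["N", 7, 8, 9],
    ["George"],
    ["P", 10, 11, 12],
    ["M", 13, 14, 15],
    ["N", 16, 17, 18],
]
george_m = lookup(rows, header, "George", "M", "Column1")
scott_n = lookup(rows, header, "Scott", "N", "Column3")
```

Because the search range has no lower bound, a letter that is missing under the chosen name would be picked up from a later name's block in this sketch.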
You can do this just by adding the results of two matches together. One match for the names plus one match for the letter equals the total row.
=INDEX(A:D,MATCH(G5,A3:A5,0)+MATCH(G3,A:A,0),MATCH(G4,1:1,0))
In other words: Index(All of the Data, Match(Name, In name column, exact) + Match(Letter, In letter column, exact), Match(Column name, in Column row, exact))
Screen capture of working sheet
My answer attempts the general case with only one caveat:
That a letter is single-character text, and a name is more than 1 character. Otherwise I feel there is no logical difference between letters and names, and it is then impossible to really do...
RE-EDIT for better function construction:
{=INDEX($A$1:$D$17, MATCH($H$3,$A1:$A17, 0)+MATCH($H$4, INDEX($A1:$A17, MATCH($H$3,$A1:$A17, 0)):INDEX($A:$A, SMALL(IFERROR(MATCH($H$3,$A1:$A17, 0)+POWER(SQRT(IF(LEN($A$1:$A$17)>1, ROW($A$1:$A$17), 0)-MATCH($H$3,$A$1:$A$17, 0)), 2)-1, ROWS($A$1:$A$17)), 2)), 0)-1, MATCH($H$5, $A$1:$D$1, 0))}
This uses an array formula along column A, and checks if the length is > 1 and throws the row nums into an array, with letters given a 0.
Then match row of unique name(e.g. George) is subtracted from each.
We then use a min(of all other name rows, with the last data row as the final default - SMALL function with 2 parameter) to find the next name row(or last data row if there is no following name).
Rest is standard index/match etc.
It will correctly return #N/A if there is no such letter under the chosen name...
My dataset is A1:A17, and the formula could use A:A instead each time, but the array calc inside the IF needs the A1:A17 for speed.
EDIT for better function construction:
If we wanted to avoid editing the formula when the data length changes, then we could let full column references of A:A go through the entire construction(and lose speed/efficiency) with the last data row in colA calculated via ROWS(A:A):
Re-edit:
{=INDEX($A:$D, MATCH($H$3,$A:$A, 0)+MATCH($H$4, INDEX($A:$A, MATCH($H$3,$A:$A, 0)):INDEX($A:$A, SMALL(IFERROR(MATCH($H$3,$A:$A, 0)+POWER(SQRT(IF(LEN($A:$A)>1, ROW($A:$A), 0)-MATCH($H$3,$A:$A, 0)), 2)-1, ROWS($A:$A)), 2)), 0)-1, MATCH($H$5,1:1, 0))}
It really depends on the setup...
Edit again for version which takes blanks as separators for names
If you want to use blanks as the separator for names, where no blanks are in the data results, but blanks appear in columns B to D where there is a name, then a tiny change in the above formulae will result in this:
=INDEX($A$1:$D$17, MATCH($H$3,$A$1:$A$17, 0)+MATCH($H$4, INDEX($A:$A, MATCH($H$3,$A:$A, 0)):INDEX($A:$A, SMALL(IFERROR(MATCH($H$3,$A:$A, 0)+POWER(SQRT(IF($B$1:$B$17="", ROW($A$1:$A$17), 0)-MATCH($H$3,$A$1:$A$17, 0)), 2)-1, ROWS($A$1:$A$17)), 2)), 0)-1, MATCH($H$5, $A$1:$D$1, 0))
This means that the names and letters do not have to be any specified length, but just one proviso is that blanks appear in the row with the name.
A small amendment to the condition to find the end range to search for the letter by replacing this: SQRT(IF(LEN($A$1:$A$17)>1, with this:
SQRT(IF($B$1:$B$17="",
I would use the area (4th parameter) of Index(). Below is a screenshot of test data. This example assumes the same columns and keys are sorted and consistent.
This works by using (Range1,Range2) as the first parameter of index. For the 4th parameter of index, use N for which area in the () you want Index to return.
I think this may be slightly tidier, and a little easier to modify maybe.
=INDEX(OFFSET(INDIRECT("A"&MATCH($H$3,$A:$A,0),TRUE),0,0,4,4),MATCH($H$4,$A:$A,0),MATCH(H5,$1:$1,0))
Using offset to create the range first, we're able to use the name from H3 to set that up, and then beyond that we are just indexing within that new range.
Now this is still dependent on the names staying in Column A.
Assuming the format of the data is always Name then P, M and N this formula does the work:
=INDEX($A:$D,
MATCH($H$3,$A:$A,0)
+LOOKUP($H$4,{"P",1;"M",2;"N",3}),
MATCH($H$5,$1:$1,0))
This solution works under almost all conditions. One restriction I found is when one of the subjects (Names) does not have data for any of the details (letters), but as of now the same occurs with all the other answers.
The formula assumes the data is located at B6:F30 (in order to ensure it can be applied regardless of the source range location).
The formula uses the Index\Match functions:
First, a MATCH to retrieve the position of the Name:
MATCH($H8,$B$6:$B$30,0)
With that info it uses INDEX to build a range that is used to obtain the position of the Detail (letter) using a second MATCH Function:
+ MATCH($I8,INDEX($B$6:$B$30, 1 + MATCH($H8,$B$6:$B$30,0))
:INDEX($B$6:$B$30,ROWS($B$6:$B$30)),0),
Adding the results of the first and second MATCH functions obtains the position of the Name\Detail combination and uses it in an INDEX on the entire data. The position of the required data column is obtained with a MATCH:
INDEX($B$6:$F$30, 1st.MATCH + 2nd.MATCH,
MATCH(J$6,$B$6:$F$6,0))
With the results located at G6:L30, enter this formula in J8, then copy it to J8:L30:
=IFERROR( INDEX( $B$6:$F$30,
MATCH( $H8, $B$6:$B$30, 0)
+MATCH( $I8, INDEX( $B$6:$B$30 , 1 + MATCH( $H8, $B$6:$B$30 ,0))
: INDEX( $B$6:$B$30, ROWS($B$6:$B$30) ),0),
MATCH( J$6, $B$6:$F$6, 0)),"")
This solution works in all conditions discussed so far (let me know of any condition that it does not work and I’ll try to cover it).
I’m posting this as a separated answer as the formulas applied in prior answer rightly apply to the conditions stated in them, as such they will be useful to users with those specific scenarios, so they don’t need to apply these long formulas.
This formula assumes the data is located at B6:E30 (in order to ensure it can be applied regardless of the source range location).
This formula uses the Index\Match functions and it’s a Formula Array.
Formula Arrays are entered by pressing [Ctrl] + [Shift] + [Enter] simultaneously; you should see { and } around the formula if it was entered correctly.
Syntax:
=IFERROR(INDEX(DataRng,
MATCH(Value1,NamesRng,0)
+IFERROR(MATCH(Value2,INDEX(NamesRng,
1+MATCH(Value1,NamesRng,0))
:INDEX(NamesRng, IFERROR(MATCH(Value1,NamesRng,0)
+MATCH("#",IF((INDEX(Col1Rng,1+MATCH(Value1,NamesRng,0))
:INDEX(Col1Rng,ROWS(NamesRng)))="","#","!"),0),
ROWS(NamesRng))),0),NA()),MATCH(ValCol,DataHdr,0)),"")
Arguments:
Assuming the data is located at B6:E30.
Value1= Name to be found in Data, i.e. George, Scott, etc.
Value2= Detail to be found in Data, i.e. Detail1, Detail2, etc.
ValCol = Column to be found in Data i.e. Column1, Column2, etc.
DataRng= $B$6:$E$30
DataHdr= $B$6:$E$6
NamesRng= $B$6:$B$30
Col1Rng= $C$6:$C$30
1st MATCH: Retrieves the position of the Name:
MATCH(Value1,NamesRng,0)
2nd MATCH: Retrieves the end position of the Name’s corresponding Details, which is determined by a blank value in column C or the end of the data range:
MATCH("#",IF((INDEX(Col1Rng, 1 + 1stMATCH)
:INDEX(Col1Rng,ROWS(NamesRng)))="","#","!"),0),
Builds a Range (vRange): With the Names's Details using the 1st and 2nd match functions. If 2nd Match returns an error then it uses the last row of the Data range:
INDEX(NamesRng, 1 + 1stMATCH )
:INDEX(NamesRng, IFERROR( 1stMATCH + 2ndMATCH, ROWS(NamesRng)))
3rd MATCH: Retrieves the position of the Detail within the vRange. It returns #NA if the combination is not present.
IFERROR(MATCH(Value2, vRange,0), NA())
Adding the results of the 1st and 3rd MATCH functions obtains the row index of the Name\Detail combination, or #NA if it is not found.
The column index is obtained with a MATCH against the header of the data.
Applying the INDEX function to the data range then returns the value of the Name\Detail\Column combination.
If the Name\Detail combination is not found, it returns blank.
=IFERROR( INDEX( DataRng, 1stMATCH + 3rdMATCH, MATCH(Column,DataHdr,0)),"")
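The steps above (name position, end of the name's block, detail position inside that block, final index) can be sketched in Python to make the logic easier to follow. This is a minimal sketch under assumptions, not the formula itself: the sheet is a list of equal-width rows, name rows have an empty string in the first data column, and bounded_lookup, header and rows are hypothetical names.

```python
def bounded_lookup(rows, header, name, detail, column):
    """Return the value for name/detail/column, or "" when the pair is absent."""
    keys = [row[0] for row in rows]
    if name not in keys:
        return ""                        # outer IFERROR(..., "")
    name_pos = keys.index(name)          # 1st MATCH
    # 2nd MATCH: the block ends at the first later row whose Column1 cell is
    # blank; if there is none, it ends at the last row of the data
    end = len(rows)
    for i in range(name_pos + 1, len(rows)):
        if rows[i][1] == "":
            end = i
            break
    block = keys[name_pos + 1:end]       # vRange
    if detail not in block:
        return ""                        # 3rd MATCH gives #N/A, shown as blank
    row = name_pos + 1 + block.index(detail)
    return rows[row][header.index(column)]


header = ["Name", "Column1", "Column2", "Column3"]
rows = [
    ["Scott", "", "", ""],
    ["P", 1, 2, 3],
    ["M", 4, 5, 6],
    ["George", "", "", ""],
    ["P", 10, 11, 12],
    ["M", 13, 14, 15],
    ["N", 16, 17, 18],
]
missing = bounded_lookup(rows, header, "Scott", "N", "Column1")
found = bounded_lookup(rows, header, "George", "N", "Column2")
```

In this sample Scott has no N row, so the lookup returns blank instead of picking up George's N.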
With the results located at H6:L37 enter this Formula Array in J8 then copy to K8:L37 and to J9:L37:
=IFERROR( INDEX($B$6:$E$30,
MATCH($H8,$B$6:$B$30,0)
+IFERROR( MATCH($I8, INDEX($B$6:$B$30,
1+MATCH($H8,$B$6:$B$30,0))
:INDEX($B$6:$B$30, IFERROR(MATCH($H8,$B$6:$B$30,0)
+MATCH("#", IF((INDEX($C$6:$C$30,1+MATCH($H8,$B$6:$B$30,0))
:INDEX($C$6:$C$30,ROWS($B$6:$B$30)))="","#","!"),0),
ROWS($B$6:$B$30))),0),NA()),
MATCH(J$6,$B$6:$E$6,0)), "")
Wow... So many solutions already.
I think a simpler solution could be using offset to get a more generic answer.
=INDEX($A$1:$D$9, MATCH($G$3,OFFSET($A$1,MATCH($G$2,$A$1:$A$9,0),0,3,1),0)+MATCH($G$2,$A$1:$A$9,0), MATCH($G$4,$B$1:$D$1,0)+1)
The only variable to look for is 3 which is the number of M/N/P options present because that will affect the number of rows. Otherwise, the solution works fine in all possible scenarios and different orders.
When I have more than two inputs for a data search, I prefer to have the data organized as shown in the figure, so that I can use a pivot table and have it arrange the data in rows and columns as I like.
Then I use GETPIVOTDATA to search for a value.
Cell G9 contains this formula:
=GETPIVOTDATA("Value";$F$3;"Name";G15;"Letter";G16;"Column";G17)
I want to calculate the average over a range (B1:B12 or C1:C12 in the figure), excluding:
Cells not being numeric, including Empty strings, Blank cells with no contents, #NA, text, etc. (B1+B8:B12 or C1+C8:C12 here).
Cells for which corresponding cells in a range (A1:A12 here) have values outside an interval ([7,35] here). This would further exclude B2:B3 or C2:C3.
At this point, cells in column A may contain numbers or have no contents.
I think it is not possible to use any built-in AVERAGE-like function. Then, I tried calculating the sum, the count, and divide. I can calculate the count (F2 and F7), but not the sum (F3), when I have #N/A in the range, e.g.
How can I do this?
Notes:
Column G shows the formulas in column F.
I cannot filter and use SUBTOTAL.
B8:C8 contain Blank cells with no contents, B9:C9 contain Empty strings.
I am looking for (non-user defined) formulas, i.e., non-VBA.
From https://stackoverflow.com/a/30242599/2103990:
Providing you are using Excel 2010 and above, the AGGREGATE function can be optioned to ignore all errors.
=AGGREGATE(1, 6, A1:A5)
You can accomplish this by using array formulas based on nested IFs to provide at least part of the criteria. When an IF resolves to FALSE, it no longer processes the TRUE portion of the statement.
The array formulas in F2:F3 are,
=SUM(IF(NOT(ISNA(B2:B13)), (A2:A13>=7)*(A2:A13<=35)*(B2:B13<>"")))
=SUM(IF(NOT(ISNA(B2:B13)), IF(B2:B13<>"", (A2:A13>=7)*(A2:A13<=35)*B2:B13)))
The array formulas in F7:F8 are,
=SUM(IF(NOT(ISNA(C2:C13)), (A2:A13>=7)*(A2:A13<=35)*(C2:C13<>"")))
=SUM(IF(NOT(ISNA(C2:C13)), IF(C2:C13<>"", (A2:A13>=7)*(A2:A13<=35)*C2:C13)))
Array formulas need to be finalized with Ctrl+Shift+Enter↵. Once entered correctly, they can be filled down like any other formula if necessary.
The calculation load of array formulas grows at least in proportion to the size of the range(s) they refer to. Try to keep excess blank rows to a minimum and avoid full column references.
You can get the average of your "NA" column values in one fairly simple formula like this:
=AVERAGE(IF(
(
($A$2:$A$13>=$F$2)*
($A$2:$A$13<=$F$3)*
ISNUMBER(B2:B13)
)>0,
B2:B13))
entered as an array formula using Ctrl+Shift+Enter↵.
I find this to be a very clear way of writing it, because all your conditions are lined up next to each other. They're "and'ed" using the mathematical operator *; this of course converts TRUE and FALSE values to 1's and 0's, respectively, so when the and'ing is done, I convert them back to TRUE/FALSE using >0. Note that instead of hard-coding your thresholds 7 and 35 (hard-coding literals is usually considered bad practice), I put them in cells.
Same logic for your sum and your count; just replace AVERAGE with SUM and COUNT, respectively:
=SUM(IF((($A$2:$A$13>=$F$2)*($A$2:$A$13<=$F$3)*ISNUMBER(B2:B13))>0,B2:B13))
=COUNT(IF((($A$2:$A$13>=$F$2)*($A$2:$A$13<=$F$3)*ISNUMBER(B2:B13))>0,B2:B13))
though a more succinct formula can also be used for the count:
=SUM(($A$2:$A$13>=$F$2)*($A$2:$A$13<=$F$3)*ISNUMBER(B2:B13))
The same formulas can be used to average/sum/count your "blank" column. Here I just drag-copied them one column to the right (column G), which means that all instances of B2:B13 became C2:C13.
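To check the numbers these formulas should produce, the same three conditions can be sketched in Python. This is an illustration only: blank cells are modeled as None, empty strings as "", and #N/A as the text "#N/A". The names is_number, filtered_stats and the sample lists are hypothetical.

```python
def is_number(value):
    """Rough stand-in for ISNUMBER: ints and floats count, booleans do not."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def filtered_stats(keys, values, low, high):
    """Count, sum and average of numeric values whose key lies in [low, high]."""
    kept = [v for k, v in zip(keys, values)
            if is_number(k) and low <= k <= high and is_number(v)]
    count = len(kept)
    total = sum(kept)
    average = total / count if count else None
    return count, total, average


# Hypothetical sample: column A (keys) and column B (values)
keys = [3, 7, 20, 35, 36, 10, 15, None]
values = [1, 2, "#N/A", 4, 5, "", None, 8]
count, total, average = filtered_stats(keys, values, 7, 35)
```

All conditions sit in one filter expression, mirroring the multiplied conditions in the array formula.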
This should be simple enough, but Numbers (on OS X) keeps throwing an error about the ranges being different sizes.
I have a list of numbers, each with an associated date, and I want a sum of all numbers within a particular month (to give, on a separate sheet, a monthly total).
Here is what I've tried:
SUMIFS(
Sheet1::Table 1::D2:D84,
MONTH(Sheet1::Table 1::A2:A84), "=04",
YEAR(Sheet1::Table 1::A2:A84), "=2014"
)
Sorry if this is a stupid question, but I've tried fiddling with it and it just won't accept it.
Thanks in advance.
You cannot put a function inside the range argument, so this does not work:
=SUMIFS(C1:C25;Month(A1:A25);"=3")
Instead, you need to add two (hidden?) columns with the MONTH and YEAR functions.
Then build the SUMIFS on the new columns:
=SUMIFS(C1:C25;D1:D25;"=4";E1:E25;"=2014")
This sums all values from C1 to C25 where the month column (D) equals 4 and the year column (E) equals 2014.
I would suggest considering using a SUMIFS function with an upper\lower limit, and then either referencing a cell with the dates, or using their numerical value in the formula (the former is my preference, to reduce hard coded values = easily updated\reused). So something like:
=SUMIFS(D2:D84, A2:A84, ">="&E1, A2:A84, "<="&E2)
In this example:
column 'D' has the values you want to sum
column 'A' has the date values
columns 'B' and 'C' are treated as irrelevant (for the sake of this formula)
column 'E' has 2 values, in row 1, the lower limit (for this specific question, the first of April) and in row 2 the upper limit (for this specific question, the final day of April)
Then have your lower limit for dates (the first day from when you would like column 'D' to be counted) in cell E1, and your date upper limit in E2.
Easily updated for future months, so might save you some work down the line.
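The bounds-based approach can be sketched in Python to show what the two limits are and what the conditional sum does. This is a minimal sketch, assuming the dates are real date values; month_bounds, sum_between and the sample data are hypothetical.

```python
import calendar
from datetime import date


def month_bounds(year, month):
    """Return the first and last day of the month (the values for E1 and E2)."""
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, 1), date(year, month, last_day)


def sum_between(dates, amounts, lower, upper):
    """Sum the amounts whose date lies within [lower, upper], both inclusive."""
    return sum(a for d, a in zip(dates, amounts) if lower <= d <= upper)


# Hypothetical sample: column A (dates) and column D (amounts)
dates = [date(2014, 3, 31), date(2014, 4, 1), date(2014, 4, 15),
         date(2014, 4, 30), date(2014, 5, 1)]
amounts = [10, 20, 30, 40, 50]
lower, upper = month_bounds(2014, 4)
april_total = sum_between(dates, amounts, lower, upper)
```

Only the two bounds need to change to total a different month.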
The next easier option would be to format the data as a table, which makes the formula slightly more legible, and to add some named ranges (in this case, E1 = 'lowerLimit' and E2 = 'upperLimit') to make the formula easier to read again, in which case you'd end up with something like:
=SUMIFS(table[FigureToBeAccrued], table[dates], ">="&lowerLimit, table[dates], "<="&upperLimit)
Might've overcooked this answer, it's my first, so wanted to make sure I didn't skimp. Let me know if you've got any follow up questions.