Issue with RANK.AVG with Dynamic Array in Excel 365 [duplicate] - excel-formula

How to RANK an array directly? I would like to avoid creating more intermediate data in cells just to reference them.
Excel RANK.AVG formula states it accepts both array and reference:
Syntax
RANK.AVG(number,ref,[order])
The RANK.AVG function syntax has the following arguments:
Number Required. The number whose rank you want to find.
Ref Required. **An array of, or a reference to**, a list of numbers. Nonnumeric values in Ref are ignored.
Order Optional. A number specifying how to rank number.
But Excel keeps rejecting the below formula.
=RANK.AVG(5, {3,1,7,10,5})
If the numbers are put in cells, say B1:B5, Excel accepts
=RANK.AVG(5, B1:B5}
Ultimately, I would like to rank a dynamic array
=RANK.AVG(value, TOCOL(VSTACK(array1, array2))
e.g. =RANK.AVG(5, TOCOL(VSTACK(B1:B5,C1:C10))

It seems that the official documentation on the various RANK functions is simply wrong with respect to the fact that they permit arrays for the ref argument (see here, for example).
You will have to come up with creative alternatives which mimic the RANK.AVG function, for example:
=LET(ζ,SORT(MyArray,,-1),AVERAGE(FILTER(SEQUENCE(COUNT(ζ)),ζ=MyValue)))

Related

Excel formula to sum an array of items from a lookup list

I'm trying to make my monthly transaction spreadsheet less work-intensive but I'm running up against problems outputting my category lookups as an array. Right now I have a table with all my monthly transactions and I want to create another table with monthly running totals. What I've been doing is manually summing each entry from each category, but I'd love to automate the process. Here's what I have:
=SUM(INDEX(Transactions[Out], N(IF(1,MATCH(I12,Transactions[Category],FALSE)))))
I've also tried using AGGREGATE in place of SUM but it still only returns the first value in the category. The N(IF()) was supposed to force INDEX to return all the matches as an array, but it's not working. I found that trick online, with no explanation of why it works, so I really don't know how to fix it. Any ideas?
Just in case anyone ever looks at this thread in the future, I was able to find a simpler solution to my problem once I implemented the Transactions[Category]=I12 method. SUM, itself will take an array as an argument, so all I had to do was form an array of the values I wanted to keep from Transactions[Out] range. I did this by adjusting the method Ron described above, but instead of using 1/(Transactions[Category]=I12 I used 1/IF(Transactions[Category]=I12, 1,1000) and surrounded that by a FLOOR(*resulting array*, .01) which rounded all the thousandth's down to zero and didn't yield any #DIV/0! errors.
Then! I realized that the simplest way to get the actual numbers I wanted, rather than messing with INDEX or AGGREGATE, was to multiply the range Transactions[Out] by the binary array from the IF test. Since the range is a table, I know they will always be the same size. And SUM automatically multiplies element by element and then adds for operations like this.
(The result is a "CSE" formula, which I guess isn't everyone's favorite. I'm still not 100% clear on what it means: just that it outputs data in a single cell, rather than over multiple cells. But in this context, SUM should only output a single number, so I'm not sure why I need CSE... A problem for another day!)
In your IF, the value_if_true clause needs to return an array of the desired row numbers from the array.
MATCH does not return an array of values; it only returns a single value which, with the FALSE parameter, will be the first value. That's why INDEX is only returning the first value.
One way to return an array of values:
Transactions[Category]=I12
will return an array of {TRUE,FALSE,FALSE,TRUE,...} depending on if it matches.
You can then multiply that by the Row number to get the relevant row on the worksheet.
Since you are using a table, to obtain the row number in the data body array, you have to subtract the row number of the Header row.
But now we are going to have an array which includes 0's for the non-matching entries, which is not good for us as a row number argument for the INDEX function.
So we get rid of that by using the AGGREGATE function with the ignore errors argument set after we do change the equality test to 1/(Transactions[Category]=I12) which will create DIV/0 errors for the non-matchers.
Putting it all together
=SUM(INDEX(Transactions[Out],AGGREGATE(15,6,1/(Transactions[Category]=I12)*ROW(Transactions)-ROW(Transactions[#Headers]),ROW(INDIRECT("1:"&COUNTIF(Transactions[Category],$I$12))))))
You may need to enter this with CSE depending on your version of Excel.
Also, if you have a lot of these formulas, you may want to change the k argument for AGGREGATE to use the INDEX function (non-volatile) instead of the volatile INDIRECT function.
=SUM(INDEX(Transactions[Out],AGGREGATE(15,6,1/(Transactions[Category]=I12)*ROW(Transactions)-ROW(Transactions[#Headers]),ROW(INDEX($A:$A,1,1):INDEX($A:$A,COUNTIF(Transactions[Category],$I$12),1)))))
Edit
If you have Excel/O365 with dynamic arrays and the FILTER function, you can greatly simplify the above to the normally entered:
=SUM(FILTER(Transactions[Out],Transactions[Category]=I12))

Multiple arrays within multiple arrays

I have multiple lookup values that I am trying to match across multiple arrays. I would like to match one of those lookup values across several arrays within the same match but I keep getting "#VALUE" or "#N/A".
Current formula I try to use is below simplified for ease of reading.
=INDEX($I$2:$I$10,MATCH(A2&B2&C2,$D$2:$D$10&OR($E$2:$E$10,$F$2:$F$10)&$G$2:$G$10,0))
In this case, I am trying to match B2 either in $E$2:$E$10 or $F$2:$F$10. What am I doing wrong?
Thanks in advance!
At first: You misinterpret the OR function. OR needs boolean values as parameters. And it will not return arrays of values, even not boolean values. It will return either TRUE or FALSE.
At second: Even if OR would work as you seem to think, MATCH needs an one dimensional lookup_array, a row vector or a column vector. It can't work with two dimensional matrices like {$D$2&$E$2&$G$2 , $D$2&$F$2&$G$2 ; $D$3&$E$3&$G$3 , $D$3&$F$3&$G$3 ; ...}
So simplest solution with your example would be to have one INDEX MATCH combination each possible lookup_array:
{=IFERROR(INDEX($I$2:$I$10,MATCH(A2&B2&C2,$D$2:$D$10&$E$2:$E$10&$G$2:$G$10,0)),INDEX($I$2:$I$10,MATCH(A2&B2&C2,$D$2:$D$10&$F$2:$F$10&$G$2:$G$10,0)))}
Or, if you really need this the way you seem to think your formula should work, then you can't use MATCH and need calculate the row number on other way. For example like so:
{=INDEX($I$2:$I$10,MIN(IF(A2&B2&C2=$D$2:$D$10&T(OFFSET($E$2,ROW($2:$10)-2,{0,1}))&$G$2:$G$10,ROW($2:$10)-1,ROWS($D$2:$D$10)+1)))}
The T is used assuming the values in A2:G10 are text values. If they are numeric, N must be used instead.

SumProduct Using Multiple Criteria Returning Too Much Data

Although this question has been asked and answered, (Stack Overflow is where I learned how to implement SP), an issue has come up which I can't figure out.
I'm using SP to sum shipments within a pivot table using a product number (with wild-cards), and a specific date. For instance, part numbers can be "AX10235-HP", "AX11135-HP", "AX10235-HP2", "AX10235-HPSPARE" or TP10101-IBM. (There are a large variety of numbers.)
So in this case I want to sum the qty shipments of "AX???35-HP". I wish to sum just the first 2 parts in my short list. However, the command used causes all the parts to sum except the *-IBM number; as if there was a wild-card at the end of the number. In other words "AX???35-HP" is the same as "AX???35-HP*". I've tried wrapping the value in quotes but it takes uses the quotes literally so fails.
This is the function
SUMPRODUCT((S_PART_DATA)*(ISNUMBER(SEARCH($A6,S_PART_RANGE))*(S_PART_DATES=T$4)))
S_PART_DATA array of Shipments,
S_PART_RANGE array of list of part numbers,
S_PART_DATES array of Dates shipments were made
It works the way you describe because SEARCH function finds $A6 within other text, hence it may not be an exact match - better to use SUMIFS function like this:
=SUMIFS(S_PART_DATA,S_PART_RANGE,$A6,S_PART_DATES,T$4)
Assuming all named ranges are the same size and A6 contains the value AX???35-HP
If that doesn't work try this version
=SUMPRODUCT(S_PART_DATA*ISNUMBER(SEARCH("^"&$A6&"^","^"&S_PART_RANGE&"^"))*(S_PART_DATES=T$4))
concatenating the ^ values means you will [probably] only get exact matches

Array evaluation of find_text argument in SEARCH() function

Say I have the following:
Entering the following formulas in cell C1 and then clicking Evaluate Formula->Evaluate produces very different results:
Formula 1: B$1:B$5 evaluates as non-array
{=SEARCH(B$1:B$5,A1)}
Formula 2: B$1:B$5 evaluates as an array
{=IF(SEARCH(B$1:B$5,A1),"")}
Why, exactly, is this? What is the cause of this behavior? If possible, please provide other examples using other Excel functions to illustrate what is happening here.
Parenthetically:
My question came about while experimenting with the accepted answer to this question.
In general, an array of values will only be returned by a worksheet function given that the following two conditions are satisfied:
1) The formula in question is either in itself capable of returning an array of values, or else is contained within a larger set-up of several functions, one or more of those which precede the function in question (and therefore act upon it) having that property. Whether that capability is something which requires coercion (i.e. via array-entry (CSE)) or is an in-built feature of the function is not important in terms of the answer you are seeking.
2) The array generated must be passed to a further function for processing. Excel is more teleological than you think: it has no great belief in returning an array of values as an end in itself.
As for your example, it's not that SEARCH, when array-entered, isn't capable of processing arrays (it is). It's more that there is no further function incited which is to act upon that array. In the IF version, there is precisely that, though again, if you process that one more time you'll find that your current array is reduced to just the first element in that array. Wrap a further function around the IF, e.g. SUM, and you'll be able to go one step further, and so on and so on.
And here is a major difference between evaluating formulas via the Evaluate Formula tool, and repeated "evaluation" via selecting various parts of the function in the formula bar and pressing F9.
The latter will always return an array of values, whether the above two conditions are satisfied or not. However - and not many people realise this - the "evaluation" so obtained can, ultimately, lead to incorrect results, and so should only be used providing one is aware of its limitations.
Take the following example, for instance:
With A1:A10 empty, the formula:
=SUMPRODUCT(0+(A1:A10=""))
correctly returns 10.
Now select just the part A1:A10 in the formula bar and press F9. Excel, being forced to "evaluate" the range, returns:
=SUMPRODUCT(0+({0;0;0;0;0;0;0;0;0;0}=""))
which, on further processing, results (correctly, it would seem) in the quite different result of 0.
Regards

nested excel functions with conditional logic

Just getting started in Excel and I was working with a database extract where I need to count values only if items in another column are unique.
So- below is my starting point:
=SUMPRODUCT(COUNTIF(C3:C94735,{"Sharable Content Object Reference Model 1.2","Authored SCORM/AICC content","Authored External Web Content"}))
what i'd like to figure out is the syntax to do something like this-
=sumproduct (Countif range1 criteria..., where range2 criteria="is unique value")
Am I getting this right? The syntax is a bit confusing, and I'm not sure I've chosen the right functions for the task.
I just had to solve this same problem a week ago.
This method works even when you can't always sort on the grouping column (J in your case). If you can keep the data sorted, #MikeD 's solution will scale better.
Firstly, do you know the FREQUENCY trick for counting unique numbers? FREQUENCY is designed to create histograms. It takes two arrays, 'data' and 'bins'. It sorts 'bins', then creates an output array that's one longer than 'bins'. Then it takes each value in 'data' and determines which bin it belongs in, incrementing the output array accordingly. It returns the array. Here's the important part: If a value appears in 'bins' more than once, any 'data' value meant for that bin goes in the first occurrence. The trick is to use the same array for both 'data' and 'bins'. Think it through, and you'll see that there's one non-zero value in the output for each unique number in the input. Note that it only counts numbers.
In short, I use this:
=SUM(SIGN(FREQUENCY(<array>,<array>)))
to count unique numeric values in <array>
From this, we just need to construct arrays containing numbers where appropriate and text elsewhere.
In the example below, I'm counting unique days when the color is red and the fruit is citrus:
This is my conditional array, returning 1 or true for the rows I'm interested in:
($A$2:$A$10="red")*ISNUMBER(MATCH($B$2:$B$10,{"orange","grapefruit","lemon","lime"},0))
Note that this requires ctrl-shift-enter to be used as an array formula.
Since the value I'm grouping by for uniqueness is text (as is yours), I need to convert it to numeric. I use:
MATCH($C$2:$C$10,$C$2:$C$10,0)
Note that this also requires ctrl-shift-enter
So, this is the array of numeric values within which I'm looking for uniqueness:
IF(($A$2:$A$10="red")*ISNUMBER(MATCH($B$2:$B$10,{"orange","grapefruit","lemon","lime"},0)),MATCH($C$2:$C$10,$C$2:$C$10,0),"")
Now I plug that into my uniqueness counter:
=SUM(SIGN(FREQUENCY(<array>,<array>)))
to get:
=SUM(SIGN(FREQUENCY(
IF(($A$2:$A$10="red")*ISNUMBER(MATCH($B$2:$B$10,{"orange","grapefruit","lemon","lime"},0)),MATCH($C$2:$C$10,$C$2:$C$10,0),""),
IF(($A$2:$A$10="red")*ISNUMBER(MATCH($B$2:$B$10,{"orange","grapefruit","lemon","lime"},0)),MATCH($C$2:$C$10,$C$2:$C$10,0),"")
)))
Again, this must be entered as an array formula using ctrl-shift-enter. Replacing SUM with SUMPRODUCT will not cut it.
In your example, you'd use something like:
=SUM(SIGN(FREQUENCY(
IF(ISNUMBER(MATCH($C$3:$C$94735,{"Sharable Content Object Reference Model 1.2","Authored SCORM/AICC content","Authored External Web Content"},0)),MATCH($J$3:$J$94735,$J$3:$J$94735,0),""),
IF(ISNUMBER(MATCH($C$3:$C$94735,{"Sharable Content Object Reference Model 1.2","Authored SCORM/AICC content","Authored External Web Content"},0)),MATCH($J$3:$J$94735,$J$3:$J$94735,0),"")
)))
I'll note, though, that scaling might be a problem on data sets as large as yours. I tested it on larger data sets, and it was fairly fast on the order of 10k rows, but really slow on the order of 100k rows, such as yours. The internal arrays are plenty fast, but the FREQUENCY function slows down. I'm not sure, but I'd guess it's between O(n log n) and O(n^2) depending on how the sort is implemented.
Maybe this doesn't matter - none of this is volatile, so it'll just need to calculate once upon refreshing the data. If the column data is changing, though, this could be painful.
Asuming the source data is sorted by the key value [A], start with determining the occurence of the key column
B2: =IF(A2=A1;B1+1;1)
Next determine a group sum
C2: =SUMIF($A$2:$A$9;A2;$B$2:$B$9)
A key is unique if its group sum is exactly 1
D2: =(C2=1)
To count records which match a certain criterium AND are unique, include column D in a =IF(AND(D2, [yourcondition];1;0) and sum this column
Another option is to asume a key unique within a sorted list if it is unequal to both its predecessor and successor, so you could find the unique records like
E2: =AND(A2<>A1;A2<>A3)
G2: =IF(AND(E2;F2="this");1;0)
E and G can of course be combined into one single formula (not sure though if that helps ...)
G2(2): =IF(AND(AND(A2<>A1;A2<>A3);F2="this");1;0)
resolving unnecessarily nested AND's:
G2(3): =IF(AND(A2<>A1;A2<>A3;F2="this");1;0)
all formulas in row 2 should be copied down to the end of the list

Resources