How to make nested array computations with INDEX()? - excel

Imagine I have several (i.e. > 100) column vectors of numbers. Vectors are large with equal length (e.g. 20k items). The vectors are not adjacent, so they don't make a matrix.
What I want is a row-wise computation across the vectors, for instance:
For each row, what is the first non-zero value among all vectors?
or
For each row, what is the maximum value among all vectors?
See this simplified example, which should return the maximum across all vectors, i.e. 3 for every row (in reality the displayed value is 1):
It would be easy if I could copy the vectors into a matrix and, for a given row, take the row range that spans all vectors instead of the column ranges. But that is not an option due to the size of the data. I think it is related to another SO question: Is it possible to have array as an argument to INDIRECT(), so INDIRECT() returns array?.

You can use CHOOSE to combine equal sized columns into a single range, e.g. for your 3 range example:
=CHOOSE({1,2,3},$B$1:$B$4,$B$5:$B$8,$A$3:$A$6)
Then use that directly in a formula, e.g. in G2 copied down to get the MAX in each row for your example
=MAX(INDEX(CHOOSE({1,2,3},$B$1:$B$4,$B$5:$B$8,$A$3:$A$6),F2,0))
or you can define the CHOOSE part as a named range [especially useful if you have 100 ranges], e.g. name that Matrix and use
=MAX(INDEX(Matrix,F2,0))
You need to adjust the {1,2,3} part to the number of ranges. As a shortcut when you have 100 ranges, you can use
=CHOOSE(TRANSPOSE(ROW(INDIRECT("1:100"))),Range1, Range2.....Range100)
This version needs to be confirmed with CTRL+SHIFT+ENTER
To get the first non-zero value you can use this version
=INDEX(INDEX(Matrix,F2,0),MATCH(TRUE,INDEX(Matrix,F2,0)<>0,0))
also confirmed with CTRL+SHIFT+ENTER
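To make the logic of the two formulas above concrete, here is a minimal Python sketch of what INDEX(Matrix, row, 0) combined with MAX or MATCH computes. The data and the function names are illustrative assumptions, not part of the original workbook.

```python
# Hypothetical data: three equal-length column vectors that are not adjacent
# in the sheet (stand-ins for the ranges passed to CHOOSE).
vectors = [
    [1, 0, 0, 5],
    [2, 0, 7, 5],
    [3, 0, 0, 9],
]


def row_values(vectors, row):
    """Mimic INDEX(Matrix, row, 0): the 1-based row taken across all vectors."""
    return [vector[row - 1] for vector in vectors]


def row_max(vectors, row):
    """Mimic MAX(INDEX(Matrix, row, 0))."""
    return max(row_values(vectors, row))


def first_nonzero(vectors, row):
    """Mimic INDEX(row, MATCH(TRUE, row <> 0, 0)); None stands in for #N/A."""
    for value in row_values(vectors, row):
        if value != 0:
            return value
    return None


print([row_max(vectors, r) for r in range(1, 5)])
print([first_nonzero(vectors, r) for r in range(1, 5)])
```

The vectors stay separate lists throughout, which mirrors the point of the CHOOSE approach: the data is never copied into one contiguous matrix.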

I've found that you actually "can" return an array from INDIRECT().
However, it must be in "R1C1" syntax AND you cannot build the R1C1 string with a formula (not with something like "R" & ROW() & "C" & COLUMN()).
You have to enter the row and column numbers as literals, and then it works.
Apparently Excel puts {} around the numbers when they are returned by the ROW() or COLUMN() function, and I guess that's why it doesn't work (try debugging, you'll see).

Related

SUMIF (or something) over a wide range of columns

I've got a massive parts spreadsheet that I'm trying to simplify. Various parts could be included in number of locations, which I would like to add up to a single list. The attached file is just an example using reindeer.
This is doable with using a bunch of SUMIF statements added together, but not practical due to the range of columns I need to include. There's gotta be a better way!?
=SUMPRODUCT(--($D$4:$J$11=$A4),$E$4:$K$11)
SUMPRODUCT can do that. Make sure the second range is shifted one column to the right, but has the same number of columns (and rows).
($D$4:$J$11=$A4) results in an array of TRUEs and FALSEs, indicating whether each value in range $D$4:$J$11 equals the value in $A4 (leaving out the $ before its row number lets the referenced row increase when the formula is dragged down).
Adding -- in front of the array converts the TRUEs and FALSEs to 1s and 0s respectively.
Multiplying that with the range to the right of it gives 1 * the value in $E$4:$K$11 for every TRUE, which is the value itself, and 0 * the value in $E$4:$K$11 for every FALSE, which is 0.
Summing the array of values gives the sum of all values whose left-hand neighbour meets the condition.
SUMPRODUCT combines the multiplication of the arrays and the summing of the results into one total.
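The shifted-range idea can be sketched in Python as follows. The grid contents and the function name are made up for illustration; the treatment of text cells as 0 reflects how SUMPRODUCT handles non-numeric entries in comma-separated arrays.

```python
# Hypothetical sheet fragment: name columns alternate with quantity columns.
grid = [
    ["Dasher", 2, "Comet", 1],
    ["Comet", 4, "Dasher", 6],
]


def sumproduct_shifted(grid, key):
    """Model SUMPRODUCT(--(left_range = key), right_range), where right_range
    is left_range shifted one column to the right."""
    total = 0
    for row in grid:
        # Pair every cell with its right-hand neighbour.
        for left, right in zip(row[:-1], row[1:]):
            flag = 1 if left == key else 0  # --(cell = key); Excel ignores case here
            value = right if isinstance(right, (int, float)) else 0  # text counts as 0
            total += flag * value
    return total


print(sumproduct_shifted(grid, "Dasher"))
print(sumproduct_shifted(grid, "Comet"))
```

Because the flag is 0 wherever the left cell is not the key, the quantity columns compared against the key and the name columns used as values contribute nothing, which is why one wide range pair can replace many SUMIFs.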
You can also simply use SUM:
=SUM(($D$4:$D$11=A4)*$E$4:$E$11,($F$4:$F$11=A4)*$G$4:$G$11, etc.)
where in place of etc. you can add any further range pairs you want. Unless you use Excel 2021 or Microsoft 365, you must confirm the formula with CTRL+SHIFT+ENTER.

How to regexmatch a range?

I am trying to take data from a table and work out how many points each class gets. I used VLOOKUP to do this, but the problem is that I have to tell Sheets which class gets how many points.
The data:
Your data seems to be set up in a way that unnecessarily complicates things.
The kelas column doesn't show the class alone, but name and class. For easy use in calculations, this would be better split into two columns: name | class
The poins column seems to hold numbers formatted as text (judging by the leading +). If it held the number only and the class column held the actual class, a simple SUMIF would solve your problem.
As it stands, it's still doable using SUMPRODUCT:
=SUMPRODUCT(--(A17=RIGHT($B$2:$B$11,2)),--($D$2:$D$11))
The first part checks whether the search value A17 equals the last 2 characters of each cell in range B2:B11 (the $ signs in the formula lock the range when dragging the formula down or across).
This results in an array of TRUEs and FALSEs, which the leading -- converts to 1s and 0s.
The second part converts the text values to numbers with the same -- trick used on the TRUEs and FALSEs.
SUMPRODUCT multiplies the first array with the second array and adds it all up.
If a condition is true it multiplies the value of the points column by 1 (equals the points), if false it multiplies by 0 (equals 0).
In the end it sums all values meeting given condition.
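As a rough illustration of the steps just described, the following Python sketch applies the same logic to a few invented rows. The names, classes and point values are assumptions, since the original data is not shown.

```python
# Hypothetical rows: (name-and-class text, points stored as text with a sign).
rows = [
    ("Budi 7A", "+5"),
    ("Sari 7B", "+3"),
    ("Agus 7A", "-2"),
]


def class_points(rows, klass):
    """Model SUMPRODUCT(--(klass = RIGHT(names, 2)), --(points))."""
    total = 0
    for name_and_class, points_text in rows:
        flag = 1 if name_and_class[-2:] == klass else 0  # RIGHT(cell, 2) = A17
        points = int(points_text)                         # --"+5" becomes 5
        total += flag * points
    return total


print(class_points(rows, "7A"))
print(class_points(rows, "7B"))
```

Like the formula, this only works while the class is exactly the last two characters of the cell; splitting name and class into separate columns removes that dependency.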

Tolerant average (ignore #NA, etc.)

I want to calculate the average over a range (B1:B12 or C1:C12 in the figure), excluding:
Non-numeric cells, including empty strings, blank cells with no contents, #N/A, text, etc. (B1+B8:B12 or C1+C8:C12 here).
Cells for which corresponding cells in a range (A1:A12 here) have values outside an interval ([7,35] here). This would further exclude B2:B3 or C2:C3.
At this point, cells in column A may contain numbers or have no contents.
I think it is not possible to use any built-in AVERAGE-like function. Then, I tried calculating the sum, the count, and divide. I can calculate the count (F2 and F7), but not the sum (F3), when I have #N/A in the range, e.g.
How can I do this?
Notes:
Column G shows the formulas in column F.
I cannot filter and use SUBTOTAL.
B8:C8 contain Blank cells with no contents, B9:C9 contain Empty strings.
I am looking for (non-user defined) formulas, i.e., non-VBA.
From https://stackoverflow.com/a/30242599/2103990:
Providing you are using Excel 2010 and above the AGGREGATE function can be optioned to ignore all errors.
=AGGREGATE(1, 6, A1:A5)
        
You can accomplish this by using array formulas based upon nested IFs to provide at least part of the criteria. When an IF resolves to FALSE, it no longer processes the TRUE portion of the statement.
   
The array formulas in F2:F3 are,
=SUM(IF(NOT(ISNA(B2:B13)), (A2:A13>=7)*(A2:A13<=35)*(B2:B13<>"")))
=SUM(IF(NOT(ISNA(B2:B13)), IF(B2:B13<>"", (A2:A13>=7)*(A2:A13<=35)*B2:B13)))
The array formulas in F7:F8 are,
=SUM(IF(NOT(ISNA(C2:C13)), (A2:A13>=7)*(A2:A13<=35)*(C2:C13<>"")))
=SUM(IF(NOT(ISNA(C2:C13)), IF(C2:C13<>"", (A2:A13>=7)*(A2:A13<=35)*C2:C13)))
Array formulas need to be finalized with Ctrl+Shift+Enter. Once entered correctly, they can be filled down like any other formula if necessary.
The calculation load of array formulas grows quickly as the range(s) they refer to expand. Try to keep excess blank rows to a minimum and avoid full column references.
You can get the average of your "NA" column values in one fairly simple formula like this:
=AVERAGE(IF(
(
($A$2:$A$13>=$F$2)*
($A$2:$A$13<=$F$3)*
ISNUMBER(B2:B13)
)>0,
B2:B13))
entered as an array formula using Ctrl+Shift+Enter.
I find this to be a very clear way of writing it, because all your conditions are lined up next to each other. They're "and'ed" using the mathematical operator *; this of course converts TRUE and FALSE values to 1's and 0's, respectively, so when the and'ing is done, I convert them back to TRUE/FALSE using >0. Note that instead of hard-coding your thresholds 7 and 35 (hard-coding literals is usually considered bad practice), I put them in cells.
Same logic for your sum and your count; just replace AVERAGE with SUM and COUNT, respectively:
=SUM(IF((($A$2:$A$13>=$F$2)*($A$2:$A$13<=$F$3)*ISNUMBER(B2:B13))>0,B2:B13))
=COUNT(IF((($A$2:$A$13>=$F$2)*($A$2:$A$13<=$F$3)*ISNUMBER(B2:B13))>0,B2:B13))
though a more succinct formula can also be used for the count:
=SUM(($A$2:$A$13>=$F$2)*($A$2:$A$13<=$F$3)*ISNUMBER(B2:B13))
The same formulas can be used to average/sum/count your "blank" column. Here I just drag-copied them one column to the right (column G), which means that all instances of B2:B13 became C2:C13.
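The filter-then-aggregate logic shared by these formulas can be sketched in Python. The sample data, the use of None for a blank cell, and the string "#N/A" as a stand-in for the error value are all assumptions made for illustration.

```python
def is_number(cell):
    """Rough stand-in for ISNUMBER: real numbers only, no text, blanks or errors."""
    return isinstance(cell, (int, float)) and not isinstance(cell, bool)


def tolerant_stats(keys, values, low, high):
    """Return (count, total, average) over values that are numeric and whose
    key lies in [low, high]; average is None when nothing qualifies."""
    kept = [
        value
        for key, value in zip(keys, values)
        if is_number(key) and low <= key <= high and is_number(value)
    ]
    count = len(kept)
    total = sum(kept)
    average = total / count if count else None
    return count, total, average


# Hypothetical columns A and B; None models a blank cell, "" an empty string.
keys = [5, 6, 10, 20, 30, None, 40, 12]
values = [1.0, 2.0, 3.0, "#N/A", 5.0, 6.0, 7.0, ""]

print(tolerant_stats(keys, values, 7, 35))
```

As in the formulas, the thresholds are passed in rather than hard-coded, and the three conditions are combined with a logical AND before any value is read.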

MAX IF in Excel with the same range of data

I have a spreadsheet with a lot of voltage numbers and I want to get the maximum and minimum deviations from a value (the value is 0.95).
The ideal formula would be:
=MAX(IF([range of many values]<0.95,[range of many values],""))
The range is a matrix of values, if that matters.
But this doesn't work since IF doesn't like ranges.
Is there a way to do this without creating another sheet just for the IF values results?
Thanks in advance
Use the formula
=MAX([range of many values]*([range of many values]<0.95))
as an array formula, i.e. hold Ctrl+Shift when pressing Enter after typing the formula.
By entering this as an array formula, the intermediate computations can return arrays. So ([range of many values]<0.95) will return an array that has 1 for TRUE and 0 for FALSE. This is then multiplied by the original values in the array, entry by entry, and the resulting array feeds into the MAX function.
BTW, your original formula will also work, if it is entered as an array formula.
There are also ways you could do this with non array formulas, e.g.
=SMALL(Range,COUNTIF(Range,"<0.95"))
That works because if there are 100 values in your range and 30 are < 0.95 then the value you want is the 30th smallest value in the range
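The equivalence claimed for the SMALL/COUNTIF version can be checked with a short Python sketch. The voltage readings and function names are invented for the example.

```python
def max_below_direct(values, limit):
    """Largest value strictly below limit, i.e. MAX(IF(range < limit, range))."""
    below = [v for v in values if v < limit]
    return max(below) if below else None


def max_below_via_small(values, limit):
    """Same result via SMALL(range, COUNTIF(range, "<limit"))."""
    k = sum(1 for v in values if v < limit)  # COUNTIF(range, "<limit")
    if k == 0:
        return None                          # SMALL(range, 0) would be an error
    return sorted(values)[k - 1]             # k-th smallest value


# Hypothetical voltage readings.
readings = [0.97, 0.91, 0.94, 1.02, 0.949]

print(max_below_direct(readings, 0.95))
print(max_below_via_small(readings, 0.95))
```

If k values lie below the limit, they are exactly the k smallest values in the range, so the k-th smallest is the largest of them.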

Excel - Return the first negative number of a column

I'm using Excel 2010 and I'm looking for a way to return the first negative number of a column. For instance, I have the following numbers distributed in a column:
1
4
6
-3
4
-1
-10
8
Which function could I use to return -3?
Thanks!
This could be interpreted two ways... If all the numbers are in a single cell (one column) as a string, the MID function can be used. If the numbers are in A1, a formula that could work is this:
=VALUE(MID(A1,SEARCH("-",A1),SEARCH(" ",A1,SEARCH("-",A1))-SEARCH("-",A1)))
If the numbers are each in their own columns (in my example, A3:H3), a different technique must be used:
{=INDEX(A3:H3,1,MATCH(TRUE,A3:H3<0,0))}
Don't type the { } - enter the equation using CTRL+SHIFT+ENTER.
In each case, the formula will return the number -3, which is the first negative number in the series.
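Both interpretations can be mirrored in a small Python sketch. It assumes, as the MID/SEARCH formula does, that the numbers in the single-cell case are separated by spaces; the function names are illustrative.

```python
def first_negative(values):
    """Model INDEX(range, MATCH(TRUE, range < 0, 0)); None stands in for #N/A."""
    for value in values:
        if value < 0:
            return value
    return None


def first_negative_in_text(text):
    """Model VALUE(MID(text, SEARCH("-"), ...)) for space-separated numbers."""
    start = text.find("-")         # position of the first minus sign
    if start == -1:
        return None                # the formula would return an error here
    end = text.find(" ", start)    # first space after the minus sign
    if end == -1:
        end = len(text)            # unlike the formula, tolerate a trailing negative
    return int(text[start:end])


numbers = [1, 4, 6, -3, 4, -1, -10, 8]

print(first_negative(numbers))
print(first_negative_in_text("1 4 6 -3 4 -1 -10 8"))
```

The text version differs from the worksheet formula in one respect: the formula fails when the first negative number is the last item in the cell, because there is no following space to search for.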
Another possibility, avoiding array formulas (which are a big source of performance issues):
=LOOKUP(1;1/(M2:M15<0);M2:M15)
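(I assume your numbers are in the M2:M15 range. The ; argument separator is locale-dependent; use , if your Excel expects commas.)
This is intended to return the first number matching the "<0" condition. Note, however, that LOOKUP performs an approximate match, and this 1/(condition) pattern is commonly documented as returning the last matching entry, so verify the result against your data. You may use any other condition, including text comparisons.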
You may also extract the value of another array corresponding to the matching cell:
=LOOKUP(1;1/(M2:M15<>"OK");T2:T15)
In this example, the M2:M15 array is searched for the first cell containing a string other than "OK", and the corresponding value in array T2:T15 is returned.
Please note that the usage of the LOOKUP function should be avoided whenever possible (but in this case, it's very handy!)
(I got the original inspiration for this answer from this post)
