Using SUMXMY2() by referencing named ranges - excel

I would like to use SUMXMY2() with only named ranges, but have been running into problems.
Basically, I am trying to sum the squared differences between each value of a subset and a single value.
I first started with basic data to understand the formula and make sure it was doing what I wanted; that went fine. I then tried to use only named ranges. The first challenge was to create an array out of a single value, which I did using the INDEX() technique: INDEX((5*ROW(1:8))/ROW(1:8),). But that already breaks down when used with named ranges.
Here is the mess:
SUMXMY2(INDEX(NamedRange,MATCH($C$4,COB_Date,0)):INDEX(NamedRange,MATCH($C$5,COB_Date,0)),
index((AVERAGE(INDEX(NamedRange,
MATCH($C$4,COB_Date,0)):INDEX(NamedRange,MATCH($C$5,COB_Date,0)))*
row(1:count(INDEX(NamedRange,MATCH($C$4,COB_Date,0)):INDEX(NamedRange,MATCH($C$5,COB_Date,0)))))/
row(1:count(INDEX(NamedRange,MATCH($C$4,COB_Date,0)):INDEX(NamedRange,MATCH($C$5,COB_Date,0)))),))
As said earlier, I am trying to sum the squared differences between each value of a subset and a single value. This just gives me #N/A. I'm trying to figure out a way to do this without the formula, but am completely stuck.

The solution for those interested
=SUMXMY2(INDEX(NamedRange,MATCH($C$4,COB_Date,0)):INDEX(NamedRange,MATCH($C$5,COB_Date,0)),INDEX((AVERAGE(INDEX(NamedRange,MATCH($C$6,COB_Date,0)):INDEX(NamedRange,MATCH($C$5,COB_Date,0)))*ROW($A$1:INDEX($A:$A,COUNT(INDEX(NamedRange,MATCH($C$4,COB_Date,0)):INDEX(NamedRange,MATCH($C$5,COB_Date,0))))))/ROW($A$1:INDEX($A:$A,COUNT(INDEX(NamedRange,MATCH($C$4,COB_Date,0)):INDEX(NamedRange,MATCH($C$5,COB_Date,0))))),))+
SUMXMY2(INDEX(NamedRange,MATCH($C$6,COB_Date,0)):INDEX(NamedRange,MATCH($C$4,COB_Date,0)),INDEX((AVERAGE(INDEX(NamedRange,MATCH($C$6,COB_Date,0)):INDEX(NamedRange,MATCH($C$5,COB_Date,0)))*ROW($A$1:INDEX($A:$A,COUNT(INDEX(NamedRange,MATCH($C$6,COB_Date,0)):INDEX(NamedRange,MATCH($C$4,COB_Date,0))))))/ROW($A$1:INDEX($A:$A,COUNT(INDEX(NamedRange,MATCH($C$6,COB_Date,0)):INDEX(NamedRange,MATCH($C$4,COB_Date,0))))),))
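A possible simplification, offered as an untested sketch rather than a replacement for the formula above: subtracting a single value from a range already applies to every element, so the constant array built with ROW() may not be needed at all. The first formula below subtracts the average inside SUMPRODUCT; the second uses DEVSQ, which returns the sum of squared deviations from the mean of its own arguments and so only applies when the average is taken over the same subset as the values. Both reuse the names NamedRange and COB_Date and the cells $C$4 and $C$5 from the question, and both assume the subset contains only numbers.

```excel
=SUMPRODUCT((INDEX(NamedRange,MATCH($C$4,COB_Date,0)):INDEX(NamedRange,MATCH($C$5,COB_Date,0))-AVERAGE(INDEX(NamedRange,MATCH($C$4,COB_Date,0)):INDEX(NamedRange,MATCH($C$5,COB_Date,0))))^2)

=DEVSQ(INDEX(NamedRange,MATCH($C$4,COB_Date,0)):INDEX(NamedRange,MATCH($C$5,COB_Date,0)))
```

If the average has to come from a different subset, as in the solution above where it runs from the $C$6 date to the $C$5 date, only the range inside AVERAGE() in the SUMPRODUCT version would need to change.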

Related

Returning a column reference from MATCH to avoid using INDIRECT with a Named Range

TL;DR: I'm basically trying to obtain a column range such as 'Sheet 1'!$A:$A where the A is obtained by matching the contents of a given cell to a 1:1 range within a sheet referenced by another given cell, for use in a dynamic range.
In the highly probable case where that made zero sense, here's an illustration:
PARAMETERS: A2 = "LIST" | C2 = "FirstName" | Desired result: 'LIST'!$A:$A
And I've obtained that, BUT, I can't use that output ('LIST'!$A:$A) within formulas (namely to create a dynamic range). For instance, here 'LIST'!$A:$A contains 101 cells with values in them:
V3 = NamedFormula = 'LIST'!$A:$A
COUNTA(INDIRECT(V3)) = 101
COUNTA(INDIRECT(NamedFormula)) = 1 because it evaluates to #VALUE! and that is a single result.
Before delving into the topic of using INDIRECT with a Named Range (which I've read about and am still getting over my confused grief), I'm realizing my Names are getting a bit out of hand. I tend to use Excel like a mad scientist. So, in case there's a much simpler solution to what I'm trying to do, here's my actual mission:
0. I'm building a tool to simplify a process where email addresses are built from different data, which needs to run without any scripts, only formulas.
1. A tab with no imposed name would contain a user database with minimally (firstname and lastname OR IDs) AND (potentially other data columns) in no specific order. Tool users would import that tab from wherever the data got to them depending on the client, and would only need to copy-paste relevant headers to the main tab without changing anything else here for data integrity.
2. The main tab would have specific input fields where tool users would paste in the name of the imported tab as well as the labels of the columns they need (for instance, the labels in the first row of the columns containing the first name and the last name), and an input field for the domain name to use to build those email addresses.
3. A Data tab is referenced for cleaning and preparing strings for email address formats.
4. The Export tab would spew out a list of clean email addresses that can be exported to CSV.
The Data tab is just 2 columns to use with SUBSTITUTE so that for instance apostrophes are removed but accented letters are normalized (é -> e). I've used LAMBDA within Names to get there. The problem is to tie everything in - to get those Named ranges into the final formula.
The Names I'm using so far (I'd like to use fewer but testing specific parts extended beyond simple usage I fear):
ALPH ={"A";"B";"C";"D";"E";"F";"G";"H";"I";"J";"K";"L";"M";"N";"O";"P";"Q";"R";"S";"T";"U";"V";"W";"X";"Y";"Z"}
LABELS =LAMBDA(labelname,ADDRESS(2,MATCH(labelname,INDIRECT("'"&PARAMETERS!$A$2&"'!$1:$1"),0),1,1,PARAMETERS!$A$2))
RANGECOL =LAMBDA(labelname,COLUMN(INDIRECT(LABELS(labelname))))
RNCOL =LAMBDA(label,"'"&PARAMETERS!$A$2&"'!$"&INDEX(ALPH,RANGECOL(label))&":$"&INDEX(ALPH,RANGECOL(label)))
I haven't tied everything in the Data tab yet - I'm still trying to automate my main tab before pushing further and using the Data tab substitutions on top of everything. That will be the next step, not my current focus. But, for the curious and interested, on the Data tab I'm using something I found on ablebits which works wonders =]
So, now if I use the offset range with a static LIST!A:A it works:
=IF($C$2<>"",LOWER(INDEX(OFFSET(INDIRECT(ADDRESS(2,MATCH($C$2,INDIRECT("'"&$A$2&"'!$1:$1"),0),1,1,$A$2)),0,0,COUNTA(LIST!A:A)-1,1),ROW())),"") &IF($C$3<>"","."&LOWER(INDEX(OFFSET(INDIRECT(ADDRESS(2,MATCH($C$3,INDIRECT("'"&$A$2&"'!$1:$1"),0),1,1,$A$2)),0,0,COUNTA(LIST!A:A)-1,1),ROW())),"") &"#"&$C$4
But when I try to use the dynamic RNCOL($C$3) it does not:
=IF($C$2<>"",LOWER(INDEX(OFFSET(INDIRECT(LABELS($C$2)),0,0,COUNTA(INDIRECT(RNCOL($C$2)))-1,1),ROW())),"") &IF($C$3<>"","."&LOWER(INDEX(OFFSET(INDIRECT(LABELS($C$3)),0,0,COUNTA(INDIRECT(RNCOL($C$3)))-1,1),ROW())),"") &"#"&$C$4
This just gives #REF!, and evaluating the formula shows the problem starting at INDIRECT(RNCOL($C$3)), which evaluates to #VALUE!.
I'm starting to see double here but my undying and completely normal love for Excel prevents me from going home from work as I'm way too far down the rabbit hole to let my obsession die here.
Any pointers as to how this can work?
Note - all of the names in the supplied sheet were generated by an online fake name generator, nothing in here is actual user data #GDPR
Thanks in advance! <3
Test sheet is available via Google Drive.
Your current set-up is not good for many reasons, and in my opinion would require a complete overhaul, the scope of which lies beyond a response on this website.
As for a 'quick fix' to your current issue: your formula in E1 returns an error because, as you can see by stepping through it with the Evaluate Formula tool, the part
COUNTA(INDIRECT(RNCOL($C$2)))-1
is resolving to
COUNTA(INDIRECT({"'LIST'!$A:$A"}))-1
and this is not the same as
COUNTA(INDIRECT("'LIST'!$A:$A"))-1
in that the value being passed to INDIRECT is an array in the former though not in the latter. Although INDIRECT can accept arrays, it is only within certain constructions in conjunction with other suitable functions; here it will simply error.
It returns an array because RNCOL($C$2) returns an array, and that in turn is because the function is defined as
=LAMBDA(label,"'"&PARAMETERS!$A$2&"'!$"&INDEX(ALPH,RANGECOL(label))&":$"&INDEX(ALPH,RANGECOL(label)))
and, since RANGECOL($C$2) resolves to 1 here, the above is equivalent to
"'"&PARAMETERS!$A$2&"'!$"&INDEX(ALPH,1)&":$"&INDEX(ALPH,1)
Here, because you are omitting the column_num parameter from INDEX, the part
INDEX(ALPH,1)
is resolving to
{"A"}
which is an array (albeit one comprising a single value) and technically different from
"A"
In most circumstances, this is not an issue. As such, it is almost always unnecessary to pass both a row_num and column_num parameter to INDEX when indexing a one-dimensional array. Here, however, it matters.
You can resolve this by explicitly including a column_num parameter, i.e. redefine RNCOL as
=LAMBDA(label,"'"&PARAMETERS!$A$2&"'!$"&INDEX(ALPH,RANGECOL(label),1)&":$"&INDEX(ALPH,RANGECOL(label),1))

Calculate the minimum value of each column in a matrix in EXCEL

Alright this should be a simple one.
I apologize in case it has been already solved, but I can only find posts related to solving this issue with programming languages and not specifically to EXCEL.
Furthermore, I could find posts that address a sub-problem of my question (e.g. regarding limitations of certain EXCEL functions) and that might solve or invalidate my request, but maybe, just maybe, there is a workaround.
Problem statement:
I want to calculate the minimum value for each column in an EXCEL matrix. Simply enough, I want to input a 2D array (mxn matrix) in a function and output an array with dimension 1xn where each item is the minimum value MIN(nj) of each nj column.
However, I want to solve this with specific constraints:
Avoid using VBA and other non-function scripting: that I could devise myself;
All in one function: what I want to achieve here is to have one and one function only, not split the problem into multiple passages (such as for example copypasting a MIN() function below each column, that wouldn't do it);
The result should be a transposable array (which is already ok, I assume);
Where I am stranded with my solution so far:
The main issue here is that any function I am trying to use takes the entire matrix as a single array input and would calculate the MIN() of the entire matrix, not each column. My current (not working) function for an example 4x4 matrix in range A1:D4 is below (the INDEX()/SEQUENCE() part is where it is clearly not working):
=MIN(INDEX(A1:D4,SEQUENCE(4,4,1,1)))
which ofc does not work, probably because INDEX() does not "understand" SEQUENCE() as an array of items to take into account. Another way of solving this, also not working, would be to input a series of ranges (A1:A4;B1:B4;C1:C4;D1:D4) so that INDEX() "understands" the ranges as single columns, but it does not, and I honestly do not know how to formulate that. I could use INDIRECT() in some way to reference the array of ranges, but I do not know how and could not find a way by searching online.
Fundamental question is: can a function, which works with single arrays, also work with multiple arrays? Basically, I do not know how to communicate an EXCEL array formula, that each batch of data I am inputting is a single array and must be evaluated separately (this is very easily solved with for() cycles, I know).
Many thanks for any suggestion and any workaround; any function and solution works as long as it fits in the constraints defined above (maybe a LAMBDA() function? don't know).
This is ofc a simplification of a way more complex problem (I am trying to calculate the annual mean temperature evolution for a specific location by finding the value - for each year from 1950 to 2021 - that is associated to the lat/lon coordinates that are the nearest to the ones of the location inputted, given a netCDF-imported grid of time-arrayed data; the MIN() function is used to select the nearest location, which is then used, via INDEX(), to find temp data). I need to do this in one hit (meaning just pasting the function, which evaluates a matrix of data that is referenced by a fixed range), so that I can just use it modularly for other data sets. I already have a working solution, which is "elegant"* enough, but not as "elegant"* as the one I could develop by solving this issue.
*where "elegant"= it saves me one click every time for 1000+ datasets when applying the function.
If I understand your problem correctly, then this should solve it:
=BYCOL(A1:D4,LAMBDA(d,MIN(d)))
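To unpack that one-liner a little: BYCOL passes each column of the range to the LAMBDA in turn and spills one result per column, which is the "evaluate each batch separately" behaviour the question asks for. BYCOL and LAMBDA exist only in recent Excel versions (Microsoft 365 at the time of writing), so the variants below are sketches under that assumption. The range A1:D4 comes from the question. The parameter name col is arbitrary, and the XMATCH variant is an added illustration of the "position of the nearest point" use case, not something from the original posts.

```excel
=BYCOL(A1:D4,LAMBDA(col,MIN(col)))

=TRANSPOSE(BYCOL(A1:D4,LAMBDA(col,MIN(col))))

=BYCOL(A1:D4,LAMBDA(col,XMATCH(MIN(col),col)))
```

The first formula spills a 1x4 row of column minima, the second turns that row into a column, and the third returns, for each column, the position of its minimum within that column, which could then be passed to INDEX().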

Can I use MINIFS or INDEX/MATCH on two non-contiguous ranges...?

The problem is straightforward, but the solution is escaping me. Hopefully some master here can provide insight.
I have a big data grid with prices. Those prices are ordered by location (rows) and business name (cols). I need to match the location/row by looking at two criteria (location name and a second column). Once the matching row is found (there will always be a match), I need to get the minimum/lowest price from two ranges within the grid.
The last point is the real challenge. Unlike a normal INDEX or MINIFS scenario, the columns I need to MIN aren't contiguous... for example, I need to know what the MIN value is between I4:J1331 and Q4:U1331. It's not an intersection, it's a combined set of values across two separate ranges.
You're probably saying "hey, why don't you just reorder your table to make them contiguous"... not an option. I have a lot of data, and this spreadsheet is used for a bunch of other stuff. So, I have to work with the format I have, and that means figuring out how to do a lookup/min across multiple non-contiguous ranges. My latest attempt:
=MINIFS(AND($I$4:$J$1331,$K$4:$P$1331),$B$4:$B$1331,$A2,$E$4:$E$1331,$B2)
Didn't work, but it should make it more clear what I'm trying to do. There has GOT to be an easy way to just tell excel "use these two ranges instead of one".
Thanks,
Rick
Figured it out. For anyone else who's interested, there doesn't seem to be any easy way to just "AND" arrays together for a search (Hello MS, backlog please). So, what I did instead was to just create multiple INDEX/MATCH arrays inside of a MIN function and take the result. Like this:
MIN((INDEX/MATCH ARRAY 1),(INDEX/MATCH ARRAY 2))
They both have identical criteria, the only difference is the set of arrays being indexed in each function. That basically gives me this:
MIN((match array),(match array))
And Min can then pull the lowest value from either.
Not as elegant as I'd like... lots of redundant code, but at least it works.
-rt
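The answer above only gives the shape of the formula, so here is one way it might look when written out; treat it as a sketch, not the poster's actual formula. It assumes the two price blocks are I4:J1331 and Q4:U1331 (the ranges named in the question's prose; the MINIFS attempt uses different columns), that the criteria sit in columns B and E as in that attempt, and that a matching row always exists. MATCH(1,(...)*(...),0) finds the first row where both criteria hold, and INDEX(range,row,0) returns that whole row of the block. In Excel versions without dynamic arrays the first formula would need to be entered with Ctrl+Shift+Enter. The second form uses LET, which is only available in newer versions, to avoid repeating the MATCH; rowNum is a hypothetical variable name.

```excel
=MIN(INDEX($I$4:$J$1331,MATCH(1,($B$4:$B$1331=$A2)*($E$4:$E$1331=$B2),0),0),INDEX($Q$4:$U$1331,MATCH(1,($B$4:$B$1331=$A2)*($E$4:$E$1331=$B2),0),0))

=LET(rowNum,MATCH(1,($B$4:$B$1331=$A2)*($E$4:$E$1331=$B2),0),MIN(INDEX($I$4:$J$1331,rowNum,0),INDEX($Q$4:$U$1331,rowNum,0)))
```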

Named Range "name" vs name

I have a ridiculous problem in Excel 2003 where I want to reference a Range I have defined myself, with names such as Div1, Div2, Div3 etc.
I have a macro that determines whether I need to use Div1, Div2, Div3 etc. and I then need to use VLOOKUP and MATCH with these different ranges.
However:
MATCH("ValueSearched", Div1, 0) works fine, but
MATCH("ValueSearched", "Div1", 0) fails
Since Div1 is determined programmatically, it is only stored as a string and I cannot use it.
I understand that in normal programming, you never really reference values like this and would use a hash table or similar, but I thought Excel would have a better workaround as everything is done at runtime.
Any suggestions on how I can dynamically reference these ranges?
pnuts solved it.
Have you tried =MATCH("ValueSearched",INDIRECT("Div1"),0)
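Since the range name is produced by a macro and held as a string, a slightly fuller sketch is to have the macro write that string into a cell and point INDIRECT at the cell. The cell $A$1 and the VLOOKUP column index 2 below are hypothetical placeholders, not details from the question.

```excel
=MATCH("ValueSearched",INDIRECT($A$1),0)

=VLOOKUP("ValueSearched",INDIRECT($A$1),2,FALSE)
```

With the text Div1 in $A$1, INDIRECT should resolve the name to the range it refers to. As far as I know this only works for names that refer directly to a range; names defined by a formula such as OFFSET generally do not resolve through INDIRECT.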

INDIRECT() returns #VALUE! unexpectedly

Background: I'm using Excel functions to parse a lot of data out, essentially creating a flexible pivot table. It sorts a lot of race timing data by car, etc. In this portion of the sheet, I'm searching for the minimum segment times for each car. The rest of the sheet avoids macros and VBA so I'd like to avoid that here.
Issue: My formula works when there are no zeros, but sometimes there are zeros that I need to exclude. My array formula is pretty complicated, but the change I made that broke it is this:
OLD (working):
{=min(if(car_number = indirect("number_vector"), indirect("data_vector")))}
NEW (non-working):
{=min(if(and(car_number = indirect("number_vector"),not(0=indirect("data_vector"))), indirect("data_vector")))}
I am using INDIRECT() with this exact argument several times in the formula. However, in this particular instance (inside the NOT()), it returns #VALUE! instead of {data1;...;datan}. Please see the screencaps below.
[Screenshot: formula before evaluation]
[Screenshot: formula after evaluation]
I suspect that your AND function is the problem: AND returns only a single result, not an array of results as required here. Try using multiple IFs like this:
=min(if(car_number = indirect("number_vector"),IF(indirect(data_vector)<>0, indirect(data_vector))))
Note that I also used <> rather than NOT.
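Two other ways of combining the conditions element by element, given as sketches that assume number_vector and data_vector are named ranges of the same size and that car_number is a single value. The first multiplies the two comparisons so that the result stays an array (1 where both hold, 0 otherwise) and, like the formulas above, is entered as an array formula with Ctrl+Shift+Enter. The second uses MINIFS, which only exists in newer Excel versions (2019 and Microsoft 365) and returns 0 when no row matches.

```excel
=MIN(IF((car_number=INDIRECT("number_vector"))*(INDIRECT("data_vector")<>0),INDIRECT("data_vector")))

=MINIFS(INDIRECT("data_vector"),INDIRECT("number_vector"),car_number,INDIRECT("data_vector"),"<>0")
```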
Are data vector and number vector the same size and shape? (both vertical?)
Why are there quotes around one but not the other?

Resources