Excel Match Function with two Criteria but differing match types

Excel Match Function with two Criteria but differing match types - excel

Basically I am trying to improve a spreadsheet that current uses fixed IF functions within IF functions to determine where to find data, then originally used the VLOOKUP function to return the appropriate cable cleat size. Where "Cleat Diameter">"Cable Diameter".
I've been using this for a while, however excel quickly runs out of resources with all the remaining calculations being performed. As a result, I've opted to put all data a single table, and try to use the match function to retrieve the necessary row. Then Simply use the =INDIRECT function to retrieve data from the appropriate column of the associated row.
Unfortunately I believe the issue relates to the fact that I first need to perform at MATCH Type 0 (exact match), followed by a type -1 for the size to identify the next size up that can accommodate a specific cable size.
I've managed a simple lookup on another dataset using (for exact matches):
=MATCH($B3,'Current Raw Data'!A:A,0)+ROW('Current Raw Data'!A:A)-1
However when I attempt the same thing with two types of matches I get errors. The closest I get it using the following array formula, but it does not work unless the data set is arranged so that the contents of Cell C3 is the first occurring item in the dataset in column A:A:
{=MATCH(C3,($B3='Lookup - Cleats'!A:A)*('Lookup - Cleats'!B:B),-1)}
Main sheet:
Dataset Example:

With this array formula (click Ctrl + Shift + Enter together inside formula bar), you should be able to get your results:
=IFERROR(INDEX('Lookup - Cleats'!C$3:C$26,MATCH($B3&$C3,'Lookup - Cleats'!$A$3:$A$26&'Lookup - Cleats'!$B$3:$B$26,0)),"")
I tried my best to use your data setup but maybe miss one or two things that you will need to adjust accordingly. Let me know if this is not working.

Related

Returning a column reference from MATCH to avoid using INDIRECT with a Named Range

TL;DR: I'm basically trying to obtain a column range such as 'Sheet 1'!$A:$A where the A is obtained by matching the contents of a given cell to a 1:1 range within a sheet referenced by another given cell, for use in a dynamic range.
In the highly probable case where that made zero sense, here's an illustration:
PARAMETERS: A2 = "LIST" | C2 = "FirstName" | Desired result: 'LIST'!$A:$A
And I've obtained that, BUT, I can't use that output ('LIST'!$A:$A) within formulas (namely to create a dynamic range). For instance, here 'LIST'!$A:$A contains 101 cells with values in them:
V3 = NamedFormula = 'LIST'!$A:$A
COUNTA(INDIRECT(V3)) = 101
COUNTA(INDIRECT(NamedFormula)) = 1 because it evaluates to #VALUE and that is a singular result.
Before delving into the topic of using INDIRECT with a Named Range (which I've read about and am still getting over my confused grief), I'm realizing my Names are getting a bit out of hand. I tend to use Excel like a mad scientist. So, in case there's a much simpler solution to what I'm trying to do, here's my actual mission:
0. I'm building a tool to simplify a process where email addresses are built from different data, which needs to run without any scripts, only formulas.
1. A tab with no imposed name would contain a user database with minimally (firstname and lastname OR IDs) AND (potentially other data columns) in no specific order. Tool users would import that tab from wherever the data got to them depending on the client, and would only need to copy-paste relevant headers to the main tab without changing anything else here for data integrity.
2. The main tab would have specific input fields where tool users would paste in the name of the imported tab as well as the labels of the columns they need (for instance, the labels in the first row of the columns containing the first name and the last name), and an input field for the domain name to use to build those email addresses.
3. A Data tab is referenced for cleaning and preparing strings for email address formats.
4. The Export tab would spew out a list of clean email addresses that can be exported to CSV.
The Data tab is just 2 columns to use with SUBSTITUTE so that for instance apostrophes are removed but accented letters are normalized (é -> e). I've used LAMBDA within Names to get there. The problem is to tie everything in - to get those Named ranges into the final formula.
The Names I'm using so far (I'd like to use fewer but testing specific parts extended beyond simple usage I fear):
ALPH ={"A";"B";"C";"D";"E";"F";"G";"H";"I";"J";"K";"L";"M";"N";"O";"P";"Q";"R";"S";"T";"U";"V";"W";"X";"Y";"Z"}
LABELS =LAMBDA(labelname,ADDRESS(2,MATCH(labelname,INDIRECT("'"&PARAMETERS!$A$2&"'!$1:$1"),0),1,1,PARAMETERS!$A$2))
RANGECOL =LAMBDA(labelname,COLUMN(INDIRECT(LABELS(labelname))))
RNCOL =LAMBDA(label,"'"&PARAMETERS!$A$2&"'!$"&INDEX(ALPH,RANGECOL(label))&":$"&INDEX(ALPH,RANGECOL(label)))
I haven't tied everything in the Data tab yet - I'm still trying to automate my main tab before pushing further and using the Data tab substitutions on top of everything. That will be the next step, not my current focus. But, for the curious and interested, on the Data tab I'm using something something I found on ablebits which works wonders =]
So, now if I use the offset range with a static LIST!A:A it works:
=IF($C$2<>"",LOWER(INDEX(OFFSET(INDIRECT(ADDRESS(2,MATCH($C$2,INDIRECT("'"&$A$2&"'!$1:$1"),0),1,1,$A$2)),0,0,COUNTA(LIST!A:A)-1,1),ROW())),"") &IF($C$3<>"","."&LOWER(INDEX(OFFSET(INDIRECT(ADDRESS(2,MATCH($C$3,INDIRECT("'"&$A$2&"'!$1:$1"),0),1,1,$A$2)),0,0,COUNTA(LIST!A:A)-1,1),ROW())),"") &"#"&$C$4
But when I try to use the dynamic RNCOL($C$3) it does not:
=IF($C$2<>"",LOWER(INDEX(OFFSET(INDIRECT(LABELS($C$2)),0,0,COUNTA(INDIRECT(RNCOL($C$2)))-1,1),ROW())),"") &IF($C$3<>"","."&LOWER(INDEX(OFFSET(INDIRECT(LABELS($C$3)),0,0,COUNTA(INDIRECT(RNCOL($C$3)))-1,1),ROW())),"") &"#"&$C$4
This just gives #REF, and evaluating shows the digression starting at INDIRECT(RNCOL($C$3)) equating to #VALUE.
I'm starting to see double here but my undying and completely normal love for Excel prevents me from going home from work as I'm way too far down the rabbit hole to let my obsession die here.
Any pointers as to how this can work?
Note - all of the names in the supplied sheet were generated by an online fake name generator, nothing in here is actual user data #GDPR
Thanks in advance! <3
Test sheet is available via Google Drive.

Your current set-up is not good for many reasons, and in my opinion would require a complete overhaul, the scope of which lies beyond a response on this website.
As to a 'quick fix' to your current issue, the reason your formula in E1 is currently returning an error is due to the fact that, as you can see via stepping through with the Evaluate Formula tool, the part
COUNTA(INDIRECT(RNCOL($C$2)))-1
is resolving to
COUNTA(INDIRECT({"'LIST'!$A:$A"}))-1
and this is not the same as
COUNTA(INDIRECT("'LIST'!$A:$A"))-1
in that the value being passed to INDIRECT is an array in the former though not in the latter. Although INDIRECT can accept arrays, it is only within certain constructions in conjunction with other suitable functions; here it will simply error.
And the reason that it is returning an array is due to the fact that RNCOL($C$2) is returning an array, and that is because that function is defined as
=LAMBDA(label,"'"&PARAMETERS!$A$2&"'!$"&INDEX(ALPH,RANGECOL(label))&":$"&INDEX(ALPH,RANGECOL(label)))
and, since RANGECOL($C$2) resolves to 1 here, the above is equivalent to
"'PARAMETERS!$A$2'!$"&INDEX(ALPH,1)&":$"&INDEX(ALPH,1)
Here, because you are omitting the column_num parameter from INDEX, the part
INDEX(ALPH,1)
is resolving to
{"A"}
which is an array (albeit one comprising a single value) and technically different from
"A"
In most circumstances, this is not an issue. As such, it is almost always unnecessary to pass both a row_num and column_num parameter to INDEX when indexing a one-dimensional array. Here, however, it matters.
You can resolve this by explicitly including a column_num parameter, i.e. redefine RNCOL as
=LAMBDA(label,"'"&PARAMETERS!$A$2&"'!$"&INDEX(ALPH,RANGECOL(label),1)&":$"&INDEX(ALPH,RANGECOL(label),1))

Error in Excel with Time function within Match function

I am trying to use the Match function to return the row of an indicated table that a certain time value is on. The time is in mm:ss format on the table, so I want users to input the desired time to match as text for their simplicity (with data validation to ensure its correct format), and then use the Time function within the Match function to convert the input to match the format of the table for comparison. However, when using the Time function, the Match function returns the incorrect row, one row number short of what it should be to be precise. I attempted to do some debugging (shown below) and looked into the documentation of both the Time and Match functions, but can't figure out why this would happen. Is there something about the Time function I'm missing?
Here is a breakdown of what I'm using and what I've done to debug and figure out it's the Time function that's causing me issues. Column R has the functions I've been using and their results, and Column S has direct links to the table to show what the output should be. Column T shows that the time values are exactly the same but that using them yields different results in the Match function. Column U is the user input time in text format, and columns V through X are just used to ensure we get to the correct column in the lookup table.
(https://i.stack.imgur.com/ageCW.png)
Here is a snip of the table being referenced in the Match function.
(https://i.stack.imgur.com/FgfGG.png)

Well, this is curious. This is NOT a proper answer, but I needed to enter this as an answer rather than a comment because I needed the space and the markup of a table. I created my own table and ran my own experiment.
I entered the time value of 00:01:23 three different ways:
I typed "00:01:23" into a cell manually.
I entered =TIME(0,1,23) in a cell
I typed "00:01:15" and "00:01:16" into two consecutive cells, and then dragged it down and let Excel autofill.
Here's the results I got:
How Entered
Value
Typed "00:01:23" in Excel
0.0009606481481481480000
=TIME(0,1,23)
0.0009606481481481480000
Fill
0.0009606481481481490000
I emphasized the digit that turned out unexpectedly different.
I then did a MATCH(x,x,1) down this column for each value and it resulted in exactly the behavior you observed. The first two matched 1:22, as they should, because they were ever so slightly less than the table value. The self-referencing MATCH() of the 1:23 cell correctly matched on 1:23.
What is puzzling to me is that my test revealed to me that the value in the lookup table was a tiny bit off, by (0.0000000000000000010000), where your test presented the exact same number, concealing the difference. So in my test, the MATCH() behaved correctly for the data given, even if the data was wrong.
Excel is limited to 15 significant digits, and I have no way of knowing what rounding shenanigans Excel goes through to drop the remaining digits.
My thought goes to wondering how the time values in your lookup table were first created to begin with. Like, were they initially entered in a google sheet and then opened in excel? Is the 15 significant digit rounding handled identically among excel versions and OSes?

Reading an Excel sheet using ADO/ODBC in Delphi 7

I'm trying to read an Excel sheet from an XLS or XLSX file in memory using Delphi 7. When possible I use automation to read the cells one by one, but when Excel is not installed, I revert to using the ADO/ODBC Jet driver.
I connect using either
Provider=Microsoft.Jet.OLEDB.4.0; Data Source=file.xls;Extended Properties="Excel 8.0;Persist Security Info=False;IMEX=1;HDR=No";
Provider=Microsoft.ACE.OLEDB.12.0; Data Source=file.xlsx;Extended Properties="Excel 12.0;Persist Security Info=False;IMEX=1;HDR=No";
My problem then is that when I use the following query:
SELECT * FROM [SheetName$]
the returned results do not contain the empty rows or empty columns, so if the sheet contains such rows or columns, the following cells are shifted and do not end up in their correct position. I need the sheet to be loaded "as is", ie know exactly from what cell position each value comes from.
I tried to read the cells one by one by issuing one query of the form
SELECT F1 FROM `SheetName$A1:A1`
but now the driver returns an error saying "There is data outside the selected region". btw I had to use backticks to enclose the name because using brackets like this [SheetName$A1:A1] gave out a syntax error message.
Is there a way to tell the driver to select the sheet as-is, whithout skipping blanks? Or maybe a way to know from which cell position each value is returned?
For internal policy reasons (I know they are bad but I do not decide these), it is not possible to use a third party library, I really need this to work from standard Delphi 7 components.

I assume that if your data is say in the range B2:D10 for example, you want to include the column A as an empty column? Maybe? Is that correct? If that's the case, then your data set, when you read the sheet (SELECT * FROM [SheetName$]) would also return 1 million rows by 16K columns!
Can you not execute a query like: SELECT * FROM [SheetName$B2:D10] and use the ADO GetRows function to get an array - which will give you the size of the data. Then you can index into the array to get what data you want?

OK, the correct answer is
Use a third party library no matter what your boss says. Do not even
try ODBC/ADO to load arbitrary Excel files, you will hit a wall sooner or later.
It may work for excel files that contain a single data table, but not when you want to cherry pick data in a sheet primarily made for human consumption (ie where a single column contains some cells with introductory text, some with numerical data, some with comments, etc...)
Using IMEX=1 ignores empty lines and empty columns
Using IMEX=0 sometimes no longer ignores empty lines, but now some of the first non empty cells are considered field names instead of data, although HDR=No. Would not work anyway since valules in a column are of mixed types.
Explicitly looping across cells and making a SELECT * FROM [SheetName$A1:A1] works until you reach an empty cell, then you get access violations (see below)
Access violation at address 1B30B3E3 in module 'msexcl40.dll'. Read of address 00000000
I'm too old to want to try and guess the appropriate value to use so it works until someone comes with yet another mix of data in a column. Sorry for having wasted everybody's time.

Combine two data ranges into one range (Google Drive Excel)

Hi there I am looking to combine two data ranges/arrays into one in order to feed them into excel FREQUENCY function.
Example:
First data range - B5:F50
Second data range - J5:N50
Bins data range - I5:I16
Function definition - FREQUENCY(data_array; bins_array)
Basically I am lazy and I don't want to reshuffle my excel script to spit out both datasets side by side so that I can reference them using something like B5:K50 range. Is there any way I can combine both datasets into data_array using some kind of formula? Maybe to end up with something along the line of =FREQUENCY((B5:F50,J5:N50); I5:I16) ?
BTW: Either of
=FREQUENCY(B5:F50; I5:I16)
=FREQUENCY(J5:N50; I5:I16)
work just file on their own for me.
Update
Actual formula definition FREQUENCY(data, classes)
2013 MS Excel (unrelated)

In MS Excel FREQUENCY function accepts a "union" as the first argument, i.e. a list of references separated by commas and enclosed in parentheses e.g.
=FREQUENCY((B5:F50,J5:N50),I5:I16)
Note: the "bins array" can also be a union if required
In "Google sheets" I don't think the same thing is possible - there may be a clever workaround, but I'm not aware of it

The Using Arrays page has some details that worked for me:
https://support.google.com/docs/answer/6208276?hl=en
It says:
"You can join multiple ranges into one continuous range using this same punctuation, which works the same way as VMERGE. For example, to combine values from A1-A10 with the values from D1-D10, you can use the following formula to create a range in a continuous column: ={A1:A10; D1:D10}"
I have two named ranges, so I was able to use {namedRange1;namedRange2} and it gave me one continuous range.

nested excel functions with conditional logic

Just getting started in Excel and I was working with a database extract where I need to count values only if items in another column are unique.
So- below is my starting point:
=SUMPRODUCT(COUNTIF(C3:C94735,{"Sharable Content Object Reference Model 1.2","Authored SCORM/AICC content","Authored External Web Content"}))
what i'd like to figure out is the syntax to do something like this-
=sumproduct (Countif range1 criteria..., where range2 criteria="is unique value")
Am I getting this right? The syntax is a bit confusing, and I'm not sure I've chosen the right functions for the task.

I just had to solve this same problem a week ago.
This method works even when you can't always sort on the grouping column (J in your case). If you can keep the data sorted, #MikeD 's solution will scale better.
Firstly, do you know the FREQUENCY trick for counting unique numbers? FREQUENCY is designed to create histograms. It takes two arrays, 'data' and 'bins'. It sorts 'bins', then creates an output array that's one longer than 'bins'. Then it takes each value in 'data' and determines which bin it belongs in, incrementing the output array accordingly. It returns the array. Here's the important part: If a value appears in 'bins' more than once, any 'data' value meant for that bin goes in the first occurrence. The trick is to use the same array for both 'data' and 'bins'. Think it through, and you'll see that there's one non-zero value in the output for each unique number in the input. Note that it only counts numbers.
In short, I use this:
=SUM(SIGN(FREQUENCY(<array>,<array>)))
to count unique numeric values in <array>
From this, we just need to construct arrays containing numbers where appropriate and text elsewhere.
In the example below, I'm counting unique days when the color is red and the fruit is citrus:
This is my conditional array, returning 1 or true for the rows I'm interested in:
($A$2:$A$10="red")*ISNUMBER(MATCH($B$2:$B$10,{"orange","grapefruit","lemon","lime"},0))
Note that this requires ctrl-shift-enter to be used as an array formula.
Since the value I'm grouping by for uniqueness is text (as is yours), I need to convert it to numeric. I use:
MATCH($C$2:$C$10,$C$2:$C$10,0)
Note that this also requires ctrl-shift-enter
So, this is the array of numeric values within which I'm looking for uniqueness:
IF(($A$2:$A$10="red")*ISNUMBER(MATCH($B$2:$B$10,{"orange","grapefruit","lemon","lime"},0)),MATCH($C$2:$C$10,$C$2:$C$10,0),"")
Now I plug that into my uniqueness counter:
=SUM(SIGN(FREQUENCY(<array>,<array>)))
to get:
=SUM(SIGN(FREQUENCY(
IF(($A$2:$A$10="red")*ISNUMBER(MATCH($B$2:$B$10,{"orange","grapefruit","lemon","lime"},0)),MATCH($C$2:$C$10,$C$2:$C$10,0),""),
IF(($A$2:$A$10="red")*ISNUMBER(MATCH($B$2:$B$10,{"orange","grapefruit","lemon","lime"},0)),MATCH($C$2:$C$10,$C$2:$C$10,0),"")
)))
Again, this must be entered as an array formula using ctrl-shift-enter. Replacing SUM with SUMPRODUCT will not cut it.
In your example, you'd use something like:
=SUM(SIGN(FREQUENCY(
IF(ISNUMBER(MATCH($C$3:$C$94735,{"Sharable Content Object Reference Model 1.2","Authored SCORM/AICC content","Authored External Web Content"},0)),MATCH($J$3:$J$94735,$J$3:$J$94735,0),""),
IF(ISNUMBER(MATCH($C$3:$C$94735,{"Sharable Content Object Reference Model 1.2","Authored SCORM/AICC content","Authored External Web Content"},0)),MATCH($J$3:$J$94735,$J$3:$J$94735,0),"")
)))
I'll note, though, that scaling might be a problem on data sets as large as yours. I tested it on larger data sets, and it was fairly fast on the order of 10k rows, but really slow on the order of 100k rows, such as yours. The internal arrays are plenty fast, but the FREQUENCY function slows down. I'm not sure, but I'd guess it's between O(n log n) and O(n^2) depending on how the sort is implemented.
Maybe this doesn't matter - none of this is volatile, so it'll just need to calculate once upon refreshing the data. If the column data is changing, though, this could be painful.

Asuming the source data is sorted by the key value [A], start with determining the occurence of the key column
B2: =IF(A2=A1;B1+1;1)
Next determine a group sum
C2: =SUMIF($A$2:$A$9;A2;$B$2:$B$9)
A key is unique if its group sum is exactly 1
D2: =(C2=1)
To count records which match a certain criterium AND are unique, include column D in a =IF(AND(D2, [yourcondition];1;0) and sum this column
Another option is to asume a key unique within a sorted list if it is unequal to both its predecessor and successor, so you could find the unique records like
E2: =AND(A2<>A1;A2<>A3)
G2: =IF(AND(E2;F2="this");1;0)
E and G can of course be combined into one single formula (not sure though if that helps ...)
G2(2): =IF(AND(AND(A2<>A1;A2<>A3);F2="this");1;0)
resolving unnecessarily nested AND's:
G2(3): =IF(AND(A2<>A1;A2<>A3;F2="this");1;0)
all formulas in row 2 should be copied down to the end of the list

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string