Find text in a table array excel - excel

Below is an extract of a large table (2017 IDs by 35 cases).
I want a formula which will look for a Case reference e.g. P0093 and return the first ID it finds (column A).
So for example, given 'P0094', the formula will result in '1'.
I expect the answer has something to do with array formulas, which are a bit of a blind spot for me.
Thanks in advance.
ID Case 1 Case 2 Case 3 Case 4 Case 5
1 P0001 P0092 P0093 P0094
2 P0016 P0150 P0419 P0420
3 P0018 P0189 P0421 P0422
4 P0004 P0095 P0096 P0097
5 P0005 P0104 P0105
6 P0021 P0068 P0069
7 P0007 P0098 P0099 P0100
8 P0008 P0101 P0102 P0103
9 P0009 P0062 P0233 P0234

Try this: (line break added for readability)
= IFERROR(INDEX(A:A,MATCH(1,(MMULT((A1:E10="P0094")+0,
TRANSPOSE((COLUMN(A1:E10)>0)+0))>0)+0,0)),"no match")
Just change both instances of E10 in the formula above to match the size of your actual data table. (Assuming 2017 IDs and 35 cases, plus a header row and the ID column, I would probably change E10 to AJ2018, but I don't know your layout for sure.)
Also note this is an array formula, so you must press Ctrl+Shift+Enter on the keyboard after typing this formula rather than just pressing Enter.
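For readers who find the MMULT step hard to picture, here is a rough Python sketch of the same three steps (flag each cell, sum the flags per row, take the first row with a hit). It is an illustration of the logic only, not something Excel runs. The table is the sample from the question, and padding short rows with "" on the right is an assumption about where the blank cells sit.

```python
# Sample sheet A1:E10 from the question; "" marks a blank cell (position assumed).
SHEET = [
    ["ID", "Case 1", "Case 2", "Case 3", "Case 4"],
    [1, "P0001", "P0092", "P0093", "P0094"],
    [2, "P0016", "P0150", "P0419", "P0420"],
    [3, "P0018", "P0189", "P0421", "P0422"],
    [4, "P0004", "P0095", "P0096", "P0097"],
    [5, "P0005", "P0104", "P0105", ""],
    [6, "P0021", "P0068", "P0069", ""],
    [7, "P0007", "P0098", "P0099", "P0100"],
    [8, "P0008", "P0101", "P0102", "P0103"],
    [9, "P0009", "P0062", "P0233", "P0234"],
]


def first_id(sheet, case):
    # (A1:E10="P0094")+0 : a 1/0 flag for every cell
    flags = [[1 if cell == case else 0 for cell in row] for row in sheet]
    # TRANSPOSE((COLUMN(A1:E10)>0)+0) : a column of ones, one per sheet column
    ones = [1] * len(sheet[0])
    # MMULT(flags, ones) : number of hits in each row
    row_hits = [sum(f * o for f, o in zip(row, ones)) for row in flags]
    # (row_hits>0)+0 : 1 for rows that contain the case, else 0
    matched = [1 if hits > 0 else 0 for hits in row_hits]
    try:
        position = matched.index(1)  # MATCH(1, matched, 0), zero-based here
    except ValueError:
        return "no match"  # the IFERROR fallback
    return sheet[position][0]  # INDEX(A:A, position)


print(first_id(SHEET, "P0094"))
```

The multiplication by a column of ones is only a way of summing each row inside a single formula; in Python a plain sum(row) would do the same job.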

Non-CSE alternative
See the picture for cell reference layout. Use the following formula in I13:
=IFERROR(INDEX($A$1:$A$10,AGGREGATE(15,6,ROW($B$2:$E$10)/($B$2:$E$10=$I$12),1)),"not found")
AGGREGATE performs array-like operations without actually being an array formula.
The concept is to compute the row number for every cell that matches the value you are looking for and turn every other cell into an error. The denominator, $B$2:$E$10=$I$12, returns TRUE or FALSE for each cell. Because that result is sent through a math operation, Excel converts TRUE to 1 and FALSE to 0. Anything divided by 0 becomes an error, so only the row numbers of matching cells are kept. The 6 in AGGREGATE tells it to ignore all errors, the 15 tells it to rank the remaining values from smallest to largest, and the final ,1) tells it to return the first value in that list. Once that is known, INDEX takes over and returns the entry in A1:A10 at the position passed from AGGREGATE. This works directly only because the range starts in row 1. If the range had been A2:A10, I would have to subtract ROW(A2)-1 from the row number to get the position within the list.
An important thing to note: even though this is not an array formula, AGGREGATE performs array-like calculations. As such, full column references should be avoided within the AGGREGATE function to avoid wasted calculations on blank cells.
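The divide-and-ignore-errors trick can be mimicked in Python, where dividing by False raises ZeroDivisionError much as Excel produces #DIV/0!. This is only a sketch of the idea, using the sample data from the question. The names COLUMN_A, CASES, and find_id are made up for the illustration, and the position of the blank cells is assumed.

```python
# Column A (A1:A10): header in row 1, IDs in rows 2-10.
COLUMN_A = ["ID", 1, 2, 3, 4, 5, 6, 7, 8, 9]

# B2:E10, one list per sheet row starting at row 2; "" marks a blank (assumed).
CASES = [
    ["P0001", "P0092", "P0093", "P0094"],
    ["P0016", "P0150", "P0419", "P0420"],
    ["P0018", "P0189", "P0421", "P0422"],
    ["P0004", "P0095", "P0096", "P0097"],
    ["P0005", "P0104", "P0105", ""],
    ["P0021", "P0068", "P0069", ""],
    ["P0007", "P0098", "P0099", "P0100"],
    ["P0008", "P0101", "P0102", "P0103"],
    ["P0009", "P0062", "P0233", "P0234"],
]
FIRST_DATA_ROW = 2


def find_id(target):
    surviving_rows = []
    for offset, row_cells in enumerate(CASES):
        sheet_row = FIRST_DATA_ROW + offset
        for cell in row_cells:
            try:
                # ROW(...)/(cell=target): True acts as 1, False as 0
                surviving_rows.append(sheet_row // (cell == target))
            except ZeroDivisionError:
                pass  # option 6: errors are ignored
    if not surviving_rows:
        return "not found"  # the IFERROR fallback
    smallest = sorted(surviving_rows)[0]  # function 15 with k = 1
    return COLUMN_A[smallest - 1]  # INDEX($A$1:$A$10, smallest)


print(find_id("P0094"))
```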

Related

In Excel, How do I return a list of values that match a description?

I want to return a list of values from column B in sheet 2 (in this case NBA players) whose entry in column A of sheet 2 contains the value "PG" from cell A3 in sheet 1. Not only do I want it to match "PG", but I also want the player to have a salary (column C) that is between $7100 (cell B2 in sheet 1) and $8000 (cell C2 in sheet 1). Any help would be appreciated.
You are either going to need to use an array formula or a function that performs array-like calculations. I will suggest using the AGGREGATE function. Avoid using full column/row references within an array formula or a function performing array-like calculations, or you may wind up bogging down your system with excessive calculations.
The AGGREGATE function is made up of several individual functions, and some of them perform array operations. I am going to suggest function 14 (LARGE). What the following example will do is generate a list of results that ignores error values, ranked from largest to smallest, then return the first value from that list. The thing we will list is the row number of each row that matches ALL your criteria. So the basics of AGGREGATE look like this:
AGGREGATE(Formula #, Error/hidden handling #, Formula, parameter)
The hardest part of this is coming up with the right formula. In the numerator you put the thing you are looking for. In the denominator you place your TRUE/FALSE condition checks. Separate each condition check with *, which acts as an AND function. The thing that makes this work is that TRUE/FALSE convert to 1/0 when they are sent through a math operation. So anything you do not want is FALSE, and anything divided by FALSE becomes a divide by 0, which in turn generates an error. Since AGGREGATE is set to ignore errors, only things that meet your conditions will exist in the list, and since they are being divided by TRUE, which is 1, your thing remains unchanged. So the AGGREGATE function is going to start to look like:
AGGREGATE(14,6,ROW(some range)/((Condition 1)*(Condition 2)*...*(Condition N)),1)
So, as alluded to before, 14 sets AGGREGATE to rank the list from largest to smallest, 6 tells AGGREGATE to ignore errors, and the 1 tells AGGREGATE to return the first item in its sorted list. If it was 2 instead of 1 it would return the 2nd position. If you ask for a position that is greater than the number of items in the list, AGGREGATE produces an error, which does not get ignored. If you would rather have the list in ascending order, use 15 (SMALL) instead of 14.
So now that there is some understanding of what AGGREGATE does, let's see how we can apply this to your data. For starters, let's assume your data is in rows 2:100 and row 1 is a header row. You will have to adjust the references to suit your data.
CONDITION 1
LEFT($A$2:$A$100,2)="PG"
Checks to see if the first two characters are PG. Based on the data in your screenshot, PG was either to the left of the / or was the only entry, and there was only ever one / in the cells of column A. If you also need to check whether PG comes after the /, and assuming it can only be on one side and not both at the same time, you could use this alternative for your condition check:
(LEFT($A$2:$A$100,2)="PG")+(RIGHT($A$2:$A$100,2)="PG")
In this case the + is performing the task of an OR function. The caveat mentioned earlier is important because if both sides are TRUE then you wind up with TRUE+TRUE, which becomes 1+1, which is 2, and we only want to divide by 1 or 0. To counter that, test whether the sum is greater than 0 (MIN would not help here, because it collapses the whole array to a single value):
((LEFT($A$2:$A$100,2)="PG")+(RIGHT($A$2:$A$100,2)="PG")>0)
CONDITION 2
Check that the salary in C is less than or equal to 80000.
($C$2:$C$100<=80000)
CONDITION 3
Check that the salary in C is greater than or equal to 71000.
($C$2:$C$100>=71000)
Now let's put this all together to get a list of row numbers that meet your conditions. Note that the whole denominator is wrapped in one set of brackets so that every condition is part of the division:
AGGREGATE(14,6,ROW($A$2:$A$100)/(((LEFT($A$2:$A$100,2)="PG")+(RIGHT($A$2:$A$100,2)="PG")>0)*($C$2:$C$100<=80000)*($C$2:$C$100>=71000)),ROW(A1))
Now provided I did not screw up the bracketing in that formula, you can place that formula in a cell and copy it down until it produces errors. As you copy it down, the only thing that will change is the A1 in ROW(A1). It acts like a counter. 1,2,3 etc. so you will get a list of row numbers that meet your criteria. Now we need to convert those row numbers to names.
To find the names, the INDEX function is your friend here. Because it is not part of an array formula or inside a function performing array like calculations, full column reference can be used. So we take our formula that is generating row numbers and place it inside the INDEX function to give:
INDEX(B:B,Row Number)
INDEX(B:B,AGGREGATE(14,6,ROW($A$2:$A$100)/(((LEFT($A$2:$A$100,2)="PG")+(RIGHT($A$2:$A$100,2)="PG")>0)*($C$2:$C$100<=80000)*($C$2:$C$100>=71000)),ROW(A1)))
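To see what the copied-down formula produces, here is a small Python sketch of the same logic. It is an emulation, not Excel, and the player rows are entirely hypothetical; only the "PG" check and the 71000/80000 limits come from the answer. The OR is capped with a > 0 test so that a cell containing just "PG" (which matches on both the left and the right) still divides by 1.

```python
# Hypothetical rows: (sheet_row, position in column A, name in column B, salary in column C)
ROWS = [
    (2, "PG", "Player A", 75000),
    (3, "SG/PG", "Player B", 79000),
    (4, "C", "Player C", 76000),
    (5, "PG/SF", "Player D", 90000),
    (6, "PG", "Player E", 71000),
]


def matching_rows(rows, low, high):
    kept = []
    for sheet_row, position, _name, salary in rows:
        # + acts as OR; "> 0" keeps the result at TRUE/FALSE even when both sides match
        is_pg = (position[:2] == "PG") + (position[-2:] == "PG") > 0
        # * acts as AND; the product is 1 only when every condition is TRUE
        denominator = is_pg * (salary <= high) * (salary >= low)
        try:
            kept.append(sheet_row // denominator)
        except ZeroDivisionError:
            pass  # option 6: rows that fail any condition are ignored
    return sorted(kept, reverse=True)  # function 14 (LARGE): largest row first


def names_in_order(rows, low, high):
    by_row = {sheet_row: name for sheet_row, _pos, name, _sal in rows}
    ranked = matching_rows(rows, low, high)
    # k = 1, 2, 3, ... plays the role of ROW(A1), ROW(A2), ROW(A3), ...
    return [by_row[ranked[k - 1]] for k in range(1, len(ranked) + 1)]


print(names_in_order(ROWS, 71000, 80000))
```

Because function 14 ranks from largest to smallest, the names come out bottom-up. Swapping in ascending order (what function 15 would do) lists them top-down instead.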
Now if you hate seeing error codes when you have copied down further than there are results, you can place the whole thing inside an IFERROR function to give:
IFERROR(formula,What to display in case of an error)
So for blank entries:
IFERROR(INDEX(B:B,AGGREGATE(14,6,ROW($A$2:$A$100)/(((LEFT($A$2:$A$100,2)="PG")+(RIGHT($A$2:$A$100,2)="PG")>0)*($C$2:$C$100<=80000)*($C$2:$C$100>=71000)),ROW(A1))),"")
and custom message:
IFERROR(INDEX(B:B,AGGREGATE(14,6,ROW($A$2:$A$100)/(((LEFT($A$2:$A$100,2)="PG")+(RIGHT($A$2:$A$100,2)="PG")>0)*($C$2:$C$100<=80000)*($C$2:$C$100>=71000)),ROW(A1))),"NOT FOUND")
So now you just need to adjust the references to suit your data. If your data is located on another sheet remember to include the sheet name. A reference to B3:C4 would become:
Sheet1!B3:C4
and if the sheet name has a space in it:
'Space Name'!B3:C4

MAX + Left Function Excel

I am trying to get the max value of a column based on the Left function
What I am doing is the following :
These are the results I get when I write this into column C:
=MAX(LEFT(A:A, 2))
But what I truly want is to get in column C the max value of all column A not for each cell.
So the result should be in this case 90 for all rows.
What should be the formula here?
Just another option that gets entered normally:
=AGGREGATE(14,6,--LEFT($A$1:INDEX(A:A,MATCH("ZZZ",A:A)),2),1)
Array formulas calculate the entire referenced array, so care should be taken to limit the range to only the data set.
The $A$1:INDEX(A:A,MATCH("ZZZ",A:A)) part of the formula does that. It finds the last cell in column A with data in it and sets that as the upper bound. So in this instance the reference range is A1:A3. But, it will grow dynamically as data in Column A is added, so no need to change the formula each time data is added.
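As a rough Python analogue of what this formula does (bound the range at the last filled cell, convert the first two characters to a number, skip anything that fails to convert, take the largest), consider the sketch below. The sample values are hypothetical, since the question's data is not shown in the text; only the expected maximum of 90 comes from the question.

```python
def max_of_prefixes(column):
    # Find the last non-empty cell, like $A$1:INDEX(A:A, MATCH("ZZZ", A:A))
    last = max(i for i, value in enumerate(column) if value != "")
    numbers = []
    for cell in column[: last + 1]:
        try:
            numbers.append(int(cell[:2]))  # --LEFT(cell, 2)
        except ValueError:
            pass  # option 6: conversion errors are ignored
    return max(numbers)  # function 14 with k = 1 is the largest value


# Hypothetical column A: three filled cells followed by blanks
COLUMN_A = ["90-12", "45-01", "12-33", "", ""]
print(max_of_prefixes(COLUMN_A))
```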
Update 2
Here is another solution which I think is better than my original (below)
=INT(SUMPRODUCT(MAX(SUBSTITUTE(A:A,"-",".")*1)))
it can be entered as normal (just Enter)
Original Answer
You need numbers and arrays
=MAX(IFERROR(LEFT(A:A,2)*1,0))
Let's break this down. Multiplying by 1 turns your strings into numbers, since LEFT only returns a string.
LEFT(A:A,2)*1
Unfortunately this method returns #VALUE! if you multiply an empty string by 1. You will definitely have some empty strings in the range A:A, so we wrap the whole thing in an IFERROR function.
IFERROR(LEFT(A:A,2)*1,0)
Now we still need excel to treat this as an array (i.e. a whole column of numbers, rather than just one number). So we put the MAX formula in and enter it with Ctrl+Shift+Enter rather than just Enter. The result is that the formula looks like this in the formula bar
{=MAX(IFERROR(LEFT(A:A,2)*1,0))}
which would return 90 in your example, as required
Update 1
If you are using Excel 2013 or later, you can also use the NUMBERVALUE function
=MAX(NUMBERVALUE(LEFT(A:A,2)))
again, enter it with Ctrl+Shift+Enter

What is this programmer doing with his Lookup function?

I found on a forum a formula to find the last populated cell in a column:
LOOKUP(2,1/(G:G<>""),ROW(G:G))
But what's going on with this bit?
1/(G:G<>"")
One divided by ??? (something that's not equal to ""?) I don't understand the logic, here.
If you want to observe the calculation step by step with Evaluate Formula (under Formula Auditing), I strongly recommend limiting the range. Say, apply:
=LOOKUP(2,1/(G1:G10<>""),ROW(G1:G10))
And, for illustration purposes, populate nothing lower down the sheet than, say, G6 (though leaving a gap somewhere in the range G1:G5 may help to show what is happening).
For this answer I am only going to consider five cells: G1, G3 and G4 populated, G2 and G5 (onwards) not.
1/(G1:G5<>"")
Is indeed at the heart of this formula. G1:G5<>"" does, as you have recognised, test whether not equal to "". "" is the convention for 'empty' for an Excel cell. If populated (ie "not empty") this returns TRUE and FALSE otherwise. Hence for the five cells as chosen for this example an array is returned, regarding G1:G5 in order, of:
TRUE;FALSE;TRUE;TRUE;FALSE.
In arithmetic calculations Excel treats TRUE as 1 and FALSE as 0. Hence using the above truth table as the denominator and 1 as the numerator gives an array (again in order) of:
1/1;1/0;1/1;1/1;1/0
which resolves to:
1;#DIV/0!;1;1;#DIV/0!.
In the LOOKUP function above, 2 was chosen as the lookup_value. (Any other number greater than 1 would serve equally well.) So we are looking for 2 in an array that is composed exclusively of either 1s or errors. Therefore there is no chance of finding an exact match, so the fallback kicks in: LOOKUP settles on the last value that is less than or equal to the lookup_value (in order, not counting errors). The last 1 in the array is the fourth element, and the fourth element in ROW(G1:G5) is …4.
G4 is the last populated cell in column G (in my example).
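The walk-through above can be replayed in Python as a sanity check. This is a simplified emulation: Excel's LOOKUP uses its own search rather than a plain left-to-right scan, but for a vector made only of 1s and errors the outcome is the same. None stands in for #DIV/0!, and the cell contents are hypothetical.

```python
def last_populated_row(cells, first_row=1):
    # 1/(G1:G5<>"") : 1.0 for populated cells, an error for empty ones
    lookup_vector = []
    for cell in cells:
        try:
            lookup_vector.append(1 / (cell != ""))
        except ZeroDivisionError:
            lookup_vector.append(None)  # stands in for #DIV/0!
    # ROW(G1:G5) : the matching result vector
    result_vector = range(first_row, first_row + len(cells))
    answer = None
    for value, row in zip(lookup_vector, result_vector):
        # Looking for 2: keep the last non-error value that does not exceed it
        if value is not None and value <= 2:
            answer = row
    return answer  # None here, where Excel would show #N/A


# G1, G3 and G4 populated; G2 and G5 empty
print(last_populated_row(["a", "", "b", "c", ""]))
```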

4 variables index function, with greater than and less than for 2 variables

I am trying to use index match functions to determine the appropriate rate for the below table.
So for example a consumer loan that is for a person that owns property, the car is 2 years or less in age and the total loan to value ratio is less than 140% should return a value of 5.15%
I believe this is what you wanted...
I would use a series of nested if functions to evaluate which column of LTV I would want the value to come from.
"That is what is done in the AND( ) part. If the value is greater than the 110% and smaller than 140% let's do the Index Match on the 110% Column, Otherwise do it on the 140% Column."
You could extend this for more columns with more IFs in the false condition.
Then it is a simple INDEX match with concatenation. It searches for the three parameters all concatenated in a single range of concatenations.
Hope it helped.
Proof of Concept
In order to achieve the above I had to make a minor edit to your header to be able to distinguish between the two 140% columns.
The functions used in this answer are:
AGGREGATE function
MATCH function
INDEX function
ROW function
IFERROR function
I placed the main part of the formula inside the IFERROR function as a way of dealing with things that may be out of range or cases where not all the inputs have been provided. I then assumed that what you were basing your search on would be provided in a series of cells. In my example I assumed the questions would be asked in the range H3 to K3, and I placed the results in L3.
The main concept is centered around the INDEX function. I specified the index range as being the height of your table and the width of the percentage rates. Or for this example D2:F9.
=IFERROR(INDEX($D$2:$F$9,row number, column number),"Not Found")
That is the easy part. The more challenging part is determining the row and column number to look in. Let's start with the column number, as it is the slightly easier of the two. I assumed the ratio to look for, or rather the header of the column to look in, would be supplied. I basically used this equation to determine the column number:
=MATCH(K3,$D$1:$F$1,0)
which in layman's terms is which column between D and F, counting column D as 1, has the value equal to the contents of K3. So now that there is a formula to determine the column, we can drop that into our original formula and wind up with:
=IFERROR(INDEX($D$2:$F$9,row number,MATCH(K3,$D$1:$F$1,0)),"Not Found")
Now we just need to determine the row number. This is the most complex operation. We are basically going to make a bunch of logical checks and take the first row that matches all of them. The premise here is that a logical check is either TRUE or FALSE. In Excel, 0 is FALSE and every other number is TRUE, and in arithmetic TRUE becomes 1 and FALSE becomes 0. So if we multiply a series of logical checks together, only a row where every check is TRUE will be equal to 1. The first logical check is the loan type. It will be followed by the living status and then the vehicle age.
=(H3=$A$2:$A$9)*(I3=$B$2:$B$9)*(J3=$C$2:$C$9)
Now if you put that into an array formula you will get a series of 1/0 values. We are going to use it inside an AGGREGATE function, which performs array-like calculations for some of its functions. We are going to use function 15, which is one of them. We are also going to tell AGGREGATE to ignore all errors, which is what the 6 does. So what we wind up doing is dividing each row number by the logical check. If the logical check is FALSE or 0, it generates a #DIV/0! error, which AGGREGATE ignores. In the end we wind up with a list of rows that match our logical check. We then tell AGGREGATE that we want the first result with the ,1. So we wind up with a formula that looks like:
=AGGREGATE(15,6,ROW($A$2:$A$9)/((H3=$A$2:$A$9)*(I3=$B$2:$B$9)*(J3=$C$2:$C$9)),1)
While this does provide us with the row number we want, we need to adjust it to make it an index number. In order to do this you need to subtract the number of header rows. In this case 1. So the index row number is given by this formula:
=AGGREGATE(15,6,ROW($A$2:$A$9)/((H3=$A$2:$A$9)*(I3=$B$2:$B$9)*(J3=$C$2:$C$9)),1)-1
And when we substitute that back into the earlier equation for the row number, we wind up with the final equation of:
=IFERROR(INDEX($D$2:$F$9,AGGREGATE(15,6,ROW($A$2:$A$9)/((H3=$A$2:$A$9)*(I3=$B$2:$B$9)*(J3=$C$2:$C$9)),1)-1,MATCH(K3,$D$1:$F$1,0)),"Not Found")
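For anyone who wants to check the row and column logic outside Excel, here is a Python sketch of the same lookup. The table contents and header labels are hypothetical, since the original table is only available as an image; the single value taken from the question is the 5.15 expected for a consumer loan, property owner, car of 2 years or less, LTV under 140%.

```python
# Hypothetical stand-in for D1:F1 (column headers) and A2:F5 (data rows)
HEADERS = ["<110%", "<140%", ">140%"]
TABLE = [
    # (sheet_row, loan type, living status, vehicle age, rates for D:F)
    (2, "Consumer", "Owner", "0-2", [4.95, 5.15, 5.65]),
    (3, "Consumer", "Owner", "3-5", [5.45, 5.65, 6.15]),
    (4, "Consumer", "Renter", "0-2", [5.95, 6.15, 6.65]),
    (5, "Commercial", "Owner", "0-2", [5.25, 5.45, 5.95]),
]
HEADER_ROWS = 1


def find_rate(loan, living, age, ltv_header):
    try:
        column = HEADERS.index(ltv_header)  # MATCH(K3, $D$1:$F$1, 0), zero-based here
        candidates = []
        for sheet_row, row_loan, row_living, row_age, _rates in TABLE:
            # Product of the three checks: 1 only if all are TRUE
            check = (loan == row_loan) * (living == row_living) * (age == row_age)
            try:
                candidates.append(sheet_row // check)
            except ZeroDivisionError:
                pass  # option 6: non-matching rows are ignored
        first_row = min(candidates)  # function 15 with k = 1
        index_row = first_row - HEADER_ROWS  # sheet row to 1-based index row
        return TABLE[index_row - 1][4][column]  # INDEX(range, row, column)
    except ValueError:
        return "Not Found"  # the IFERROR fallback


print(find_rate("Consumer", "Owner", "0-2", "<140%"))
```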

Using INDEX(MATCH()) to return the n'th value of a cell

I have a simple worksheet with 2 columns
I want to get all the results (in column "H") that contain the value from cell G1. I can only get the first occurrence, and I want to know if I can get the others. Is that possible without a macro? Any way of doing it would be appreciated. Any ideas?
You can use ROW() and SMALL() to get those instead of MATCH() since this always gets the first match.
=IFERROR(INDEX($C$4:$C$7,SMALL(IF($D$4:$D$7=$G$1,ROW($D$4:$D$7)-ROW($D$4)+1),ROWS($D$4:D4))),"Null")
So, wherever the array $D$4:$D$7=$G$1 returns TRUE (i.e. the value equals that in G1), you get the row number of that value; in this case, you will get 4 and 6. All the others return FALSE.
After subtracting ROW($D$4) and adding 1, the 4 and 6 become 1 and 3. Those two values are what will be fed to INDEX. (Anchoring the offset to $D$4 keeps it constant when the formula is dragged down.)
SMALL() then picks the smallest, starting with the 1st (you get 1 from ROWS($D$4:D4)) and when you drag the formula down, the ROWS become ROWS($D$4:D5) which gives 2, and SMALL ends up taking the 2nd smallest value, which is 3.
EDIT: Forgot to mention. You have to array enter the above formula. To do this, type the combination keys of Ctrl+Shift+Enter after typing the formula (edit the formula again if necessary) instead of Enter alone.
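The same "n-th match" idea can be written out in Python to show what each dragged-down copy of the formula returns. This is an emulation only, with hypothetical contents for C4:C7 and D4:D7 chosen so that the matches fall on rows 4 and 6 as in the explanation above.

```python
# Hypothetical data: C4:C7 holds the values to return, D4:D7 the keys to search
COLUMN_C = ["alpha", "beta", "gamma", "delta"]
COLUMN_D = ["x", "y", "x", "z"]
FIRST_ROW = 4


def nth_match(key, n):
    # IF(D4:D7=G1, ROW(D4:D7) - first row + 1): positions of matches within the range
    positions = [
        (FIRST_ROW + offset) - FIRST_ROW + 1
        for offset, value in enumerate(COLUMN_D)
        if value == key
    ]
    positions.sort()
    if n > len(positions):
        return "Null"  # the IFERROR fallback once the matches run out
    # SMALL(positions, n), then INDEX(C4:C7, position)
    return COLUMN_C[positions[n - 1] - 1]


# n plays the role of ROWS($D$4:D4), ROWS($D$4:D5), ... as the formula is dragged down
print([nth_match("x", n) for n in (1, 2, 3)])
```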

Resources