Excel formula - Get first occurrence of partial string in rows above cell

I have a table of fruits in Excel 2013.
I'd like to fill the "Category" column by searching from the current row to the top until the first occurrence of "::", which is the keyword for a category in the table.
If there was some way to reverse a range, I could do something like "=Match("::*"; $A6:$A$2)" to find the row. However, this is not possible.
Does anyone know how this might be accomplished using formulas?

Using your provided sample data, and assuming your data is already organized as shown in your sample, you can take advantage of that organization and use this formula in cell C2 and copy down:
=IF(LEFT(A2,2)="::","",IF(LEFT(A1,2)="::",MID(A1,4,LEN(A1)),C1))
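If you want to sanity-check the carry-down logic outside Excel, here is a rough Python sketch of what that formula does. The sample rows and the name fill_categories are made up for illustration, and it assumes headers look like ":: Name" (two colons, a space, then the name), which is what MID(A1,4,LEN(A1)) implies.

```python
def fill_categories(column_a):
    """Return a Category value for every entry of column A."""
    categories = []
    current = ""  # most recent header seen while walking down
    for value in column_a:
        if value.startswith("::"):
            current = value[3:]      # text after ":: ", like MID(A1,4,LEN(A1))
            categories.append("")    # header rows themselves stay blank
        else:
            categories.append(current)
    return categories


# Hypothetical sample data, not the asker's actual table
rows = [":: Citrus", "Orange", "Lemon", ":: Berries", "Strawberry"]
print(fill_categories(rows))
```

Each non-header row just inherits whatever header was seen last, which is the same thing the formula gets by pointing at C1.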

Assuming your table is in A1, put this in C3:
=INDEX(A:A,AGGREGATE(14,6,ROW($A$1:A2)/(LEFT($A$1:A2,2)="::"),1))
And copy down.

Here's a kinda different approach. I'm just basically responding to this part of your post to prove this is possible:
If there was some way to reverse a range, I could do something like "=Match("::*"; $A6:$A$2)" to find the row. However, this is not possible.
Reversing a range is possible, it's just tricky.
As you pointed out: $A6:$A$2 won't work since this is equivalent to $A$2:$A6.
However, without getting into the nitty-gritty details, this array formula will reverse this range:
= INDEX($A$2:$A6,N(IF({1},MAX(ROW($A$2:$A6))-ROW($A$2:$A6)+1)))
Note this is an array formula, so you must press Ctrl+Shift+Enter instead of just Enter after typing this formula into a cell.
You could use this in combination with your MATCH formula to get the desired result (which tells you how many rows up the :: row is):
= MATCH("::*",INDEX($A$2:$A6,N(IF({1},MAX(ROW($A$2:$A6))-ROW($A$2:$A6)+1))),0)
(Also haha this is kinda cool: Usually you see MATCH used within INDEX to effectively get a VLOOKUP type of functionality. This is the first time I have ever seen it the opposite way of having INDEX within MATCH.)
Note that I'm not saying this is necessarily the best approach for this specific problem, just proving a point that arrays can be reversed.
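For anyone who thinks better in code, the reverse-then-match idea can be sketched in Python like this. It is only an illustration: the cell values and the function name rows_up_to_header are hypothetical, and fnmatchcase stands in for Excel's wildcard matching (which, unlike this, ignores case).

```python
from fnmatch import fnmatchcase


def rows_up_to_header(cells, pattern="::*"):
    """1-based position of the first match when scanning bottom to top."""
    for offset, value in enumerate(reversed(cells), start=1):
        if fnmatchcase(value, pattern):
            return offset
    return None  # Excel would show #N/A here


# Hypothetical stand-in for the range A2:A6
cells = [":: Citrus", "Orange", ":: Berries", "Strawberry", "Blueberry"]
print(rows_up_to_header(cells))
```

The result counts from the bottom of the range, so 1 would mean the current row itself is the "::" row.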

Related

Excel - VLOOKUP vs. INDEX/MATCH - Which is better?

I understand how to use each method: VLOOKUP (or HLOOKUP) vs. INDEX/MATCH.
I'm looking for differences between them not in terms of personal preference, but primarily in the following areas:
Is there something that one method can do that the other cannot?
Which one is more efficient in general (or does it depend on the situation)?
Any other advantages/disadvantages to using one method vs. the other
NOTE: I am answering my own question here but looking to see if anyone else has other insights I hadn't thought of.
I prefer to use INDEX/MATCH in practically every situation because it is far more flexible and has the potential to be much more efficient depending on how large the lookup table is.
The only time I can really justify using VLOOKUP is for very straightforward tables where the column index number is dynamic, although even in this case, INDEX/MATCH is equally viable.
I'll give a few specific examples below to demonstrate the detailed differences between the two methods.
INDEX/MATCH can lookup to the left (or anywhere else you want)
This is probably one of the most obvious advantages of INDEX/MATCH, as well as one of the biggest downfalls of VLOOKUP. VLOOKUP can only look up to the right, whereas INDEX/MATCH can look up from any range, including ranges on different sheets if necessary.
The example below cannot be accomplished with VLOOKUP.
INDEX/MATCH has the potential to use smaller cell ranges (thus increasing efficiency)
Consider the example below. It can be accomplished with either method.
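Both of these formulas work fine. However, since the VLOOKUP formula references a larger range than the INDEX/MATCH formula, it recalculates more often than necessary. (It is not "volatile" in Excel's strict sense, but it depends on cells it never actually uses.)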
If any cell in the range B1:G4 changes, the VLOOKUP formula must recalculate (because B1:G4 is within the range A1:H4) even though changing any cell in B1:G4 will not affect the outcome of the formula. This is not an issue for INDEX/MATCH because its formula does not contain the range B1:G4.
Using VLOOKUP with a fixed col_index_num is dangerous
The main issue I see with having a fixed column index number is that it will not update as it should if full columns are inserted. Consider the following example:
This formula works fine unless a column is inserted within the lookup table. In that case, the formula will lookup the value to the left of where it should. See below, result after a column has been inserted.
This can actually be alleviated by using the following VLOOKUP formula instead:
= VLOOKUP("s",A1:H4,COLUMN(H1)-COLUMN(A1)+1,FALSE)
Now H1 will automatically update to I1 if a column is inserted, thus preserving the reference to the same column. However, this is entirely unnecessary because INDEX/MATCH can accomplish this without this problem with the formula below.
= INDEX(H1:H4,MATCH("s",A1:A4,0))
I realize this is an unlikely scenario, but it always bothered me that VLOOKUP by default looks up based on a fixed column index that does not automatically update if columns are inserted. To me, it just seems to make the VLOOKUP function more fragile.
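To make the difference concrete, here is a small Python model of the two lookup styles. It is a sketch under assumptions, not how Excel works internally: the columns are plain lists with made-up values, vlookup addresses the result by position, and index_match holds a direct reference to the result column (the way H1:H4 follows its column when something is inserted).

```python
# Hypothetical three-column table, stored column by column
keys = ["q", "r", "s"]
filler = [1, 2, 3]
values = [10, 20, 30]
columns = [keys, filler, values]


def vlookup(key, columns, col_index):
    """Find key in the first column, return the cell col_index columns in (1-based)."""
    row = columns[0].index(key)
    return columns[col_index - 1][row]


def index_match(key, key_col, result_col):
    """Find key in key_col, return the same row of result_col."""
    return result_col[key_col.index(key)]


before = vlookup("s", columns, 3)         # reads the values column
columns.insert(1, ["x", "y", "z"])        # someone inserts a column
after = vlookup("s", columns, 3)          # position 3 is now the filler column
stable = index_match("s", keys, values)   # still tied to the values column
print(before, after, stable)
```

The positional lookup silently returns data from the wrong column after the insert, while the reference-based one is unaffected.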
INDEX/MATCH can handle variable column indexes just as well, but with a longer formula
If the column index number itself is dynamic, this is really the only case when I think VLOOKUP simplifies things a bit, but again the INDEX/MATCH alternative is just as good, just slightly more confusing. See below examples.
INDEX/MATCH is more efficient for multiple lookups
(thanks to @jeffreyweir)
If multiple lookup values are needed for a single match value, it is much more efficient to have a helper cell with the match value. This way, the match only has to be computed once, instead of once for each lookup formula. See example below.
This match value can then be used to return the appropriate lookup values. See example below, (formula has been dragged to the right).
This manual "splitting" of the match value and index values is not an option with VLOOKUP since the match value is an "internal" variable in VLOOKUP and cannot be accessed.
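The helper-cell idea is easy to see in a Python sketch. Everything here is hypothetical (column names, values, and the search counter); the point is only that the search happens once and the three reads afterwards are plain position lookups.

```python
search_count = 0  # counts how many times the key column is searched


def match(key, key_col):
    """Return the 0-based row of key, recording that a search happened."""
    global search_count
    search_count += 1
    return key_col.index(key)


# Hypothetical lookup table
keys = ["q", "r", "s"]
cols = {
    "price": [1.5, 2.5, 3.5],
    "stock": [7, 8, 9],
    "color": ["red", "green", "blue"],
}

row = match("s", keys)                          # the "helper cell": one search
results = [col[row] for col in cols.values()]   # three INDEX-style reads, no searching
print(results, search_count)
```

Doing the same thing VLOOKUP-style would repeat the search once per result column.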
INDEX/MATCH can look up a range, allowing another operation
Let's say for example you want to find a max value in a column based on the column name.
You can first use MATCH to find the appropriate column, then INDEX to return the range of that entire column, then use MAX to find the max of that range.
See example below, the formula in H4 looks up the max value of the column name specified in cell G4. This cannot be accomplished using VLOOKUP alone.
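In Python terms, the MATCH / INDEX / MAX chain looks roughly like the sketch below. The headers and numbers are invented for illustration and do not come from the original example.

```python
# Hypothetical table: one header row plus three data rows
headers = ["north", "south", "east"]
data = [
    [3, 9, 4],
    [8, 2, 6],
    [5, 7, 1],
]


def max_of_column(name):
    """Find the column by header (MATCH), pull that column (INDEX), take its MAX."""
    col = headers.index(name)
    column_values = [row[col] for row in data]
    return max(column_values)


print(max_of_column("south"))
```

The middle step hands back a whole column rather than one cell, which is the part a single VLOOKUP cannot do.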
MATCH doesn't have to match an exact value
Usually MATCH is used with the third argument as 0, meaning "find an exact match". But depending on the situation, using -1 or 1 as the third argument of MATCH can be very useful.
For example, the following formula returns the row number of the last row in column A that contains a number:
= MATCH(-1E+300,A:A,-1)
This is because this formula starts from the bottom of the A column and works its way toward the top, and returns the first row number in the A column where the value is greater than or equal to -1E+300 (which is basically any number).
Then INDEX can be used in combination with this to return the value in that cell. See example below.
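As a plain-code description of the result you are after (not of how Excel searches internally), a Python sketch could look like this. The sample column and the name last_numeric_row are hypothetical.

```python
def last_numeric_row(column):
    """1-based row of the last entry that is a number, or None if there is none."""
    for row in range(len(column), 0, -1):
        value = column[row - 1]
        # bool is a subclass of int in Python, so exclude it explicitly
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            return row
    return None


# Hypothetical column A: numbers mixed with text and blanks
column_a = [5, "text", 12.5, "", None]
row = last_numeric_row(column_a)
print(row, column_a[row - 1])  # the row number, then the INDEX-style value
```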
In Summary
VLOOKUP is, at best, as good as INDEX/MATCH and admittedly slightly less confusing in some situations. At worst, VLOOKUP is much more fragile than INDEX/MATCH and triggers more unnecessary recalculation.
Also worth noting that if you want to look up a range instead of a single value, INDEX/MATCH must be used. VLOOKUP cannot be used to look up a range.
For these reasons, I generally prefer INDEX/MATCH in practically all situations.

COUNTIFS with unique values Excel

I am trying to produce a count of the number of times different strings come up in an Excel table. An example table, currently in SHEET1, would be this:
I have another table in another spreadsheet where I want to indicate, for each letter on the left in Table 1, how many entries for "za", "zc" or "zd" come up on the right. However, I would like to consider only one entry of each.
The end result, in column B of SHEET2, would have to be something like this:
At the moment I am using a combination of SUM and COUNTIFS to do the job.
More specifically, applied to the example, I am using the following formula:
=SUM(COUNTIFS(Sheet1!A1:A18,Sheet2!$A1,Sheet1!B1:B18,{"za","zc","zd"}))
The formula is doing some of what is intended. However, it is not counting each entry just one time. Instead, it is counting, for each letter on the left, every entry of "za", "zc" or "zd". The table that the formula is returning is as follows:
How can I change the formula so that it does what I intend?
Thank you.
My initial thought would be:
=SUM(MIN(1,COUNTIFS(Sheet1!A1:A18,Sheet2!$A1,Sheet1!B1:B18,{"za","zc","zd"})))
but I’m not where I can test if the MIN will apply properly to the COUNTIFS array of results. ;-)
EDITED: The MIN function is taking minimum of 1 or all of the items in the COUNTIFS array, rather than minimum of 1 and each item in the COUNTIFS array, which is what I was afraid of. Using
=MIN(COUNTIFS(Sheet1!A$1:A$18,Sheet2!$A1,Sheet1!B$1:B$18,"za"),1)+MIN(COUNTIFS(Sheet1!A$1:A$18,Sheet2!$A1,Sheet1!B$1:B$18,"zc"),1)+MIN(COUNTIFS(Sheet1!A$1:A$18,Sheet2!$A1,Sheet1!B$1:B$18,"zd"),1)
will gain the desired results. It is a little clunky, but simpler than an array formula. If you want an array formula, you can use:
=SUM(FREQUENCY(IFERROR(MATCH({"za","zc","zd"},(IF(Sheet1!$A$1:$A$18=$A5,Sheet1!$B$1:$B$18)),0),""),IFERROR(MATCH({"za","zc","zd"},(IF(Sheet1!$A$1:$A$18=$A5,Sheet1!$B$1:$B$18)),0),"")))
This uses the FREQUENCY function to take a set of values and see how many items in another set of values fall within each of the data ranges. Since you need text instead of numbers, we use the MATCH function to find out the first time the value occurs in your list, returning "" with the IFERROR function if it doesn't. (We only need the first occurrence since you don't want to know how many occurrences there are). Since it is text, we use the same input for both arguments for FREQUENCY.
Therefore, if you need to change the values you are looking for or the ranges in which you are searching, make sure to change both! Alternately, you could list the values out somewhere, say in F1:F3, and make a named range for this, another one for A1:A18, and another for B1:B18. Your formula would then look something like this:
=SUM(FREQUENCY(IFERROR(MATCH(SearchValues,(IF(colA=$A2,colB)),0),""),IFERROR(MATCH(SearchValues,(IF(colA=$A2,colB)),0),"")))
Then you need only change your named range definitions and your formulas would update. :-)
NOTE: Since this is an array formula, you must close out of the cell by pressing CTRL+SHIFT+ENTER rather than only ENTER. When you look at the formula bar, you should see
{=SUM(FREQUENCY(IFERROR(MATCH(SearchValues,(IF(colA=$A2,colB)),0),""),IFERROR(MATCH(SearchValues,(IF(colA=$A2,colB)),0),"")))}
It does NOT work to enter the curly braces yourself. ;-)
You can use this formula at B1 and fill down:
B1:
=SUMPRODUCT(((Sheet1!$A$1:$A$18=A1)*(Sheet1!$B$1:$B$18= {"za","zc","zd"}))/
COUNTIFS(Sheet1!$A$1:$A$18,Sheet1!$A$1:$A$18,Sheet1!$B$1:$B$18,Sheet1!$B$1:$B$18))
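If it helps to see the target logic stripped of Excel syntax, here is a rough Python sketch of "count each wanted value at most once per letter". The pairs below are made-up sample data, not the table from the question, and count_distinct is a hypothetical name.

```python
def count_distinct(pairs, key, wanted):
    """How many of the wanted values appear at least once next to key."""
    found = {value for k, value in pairs if k == key and value in wanted}
    return len(found)


def count_all(pairs, key, wanted):
    """What the original SUM(COUNTIFS(...)) does: every occurrence counts."""
    return sum(1 for k, value in pairs if k == key and value in wanted)


# Hypothetical (letter, string) rows standing in for Sheet1!A:B
pairs = [("a", "za"), ("a", "za"), ("a", "zc"), ("a", "zb"), ("b", "zd"), ("b", "zd")]
wanted = {"za", "zc", "zd"}

print(count_distinct(pairs, "a", wanted), count_all(pairs, "a", wanted))
```

The set throws away repeats, which is the same job the MIN(...,1) terms, the FREQUENCY trick, and the division by COUNTIFS are each doing in their own way.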

How to Use Cell Text From Cell Being Checked by COUNTIF in Excel

What I'm wanting to do is have a formula in one cell that counts the values in a range that conform to a lookup of that range cell's value compared to another cell.
OMG, now that I look at it, that is totally confusing. Let me try to clarify a lot here.
Say we have Cell1, which will hold the counting formula. I have a list of values in a two-column table, Table1. The range, Range1, that Cell1 will be counting from is a range of cells that have List Validation in them. Table1 holds references to all values that can result from those Lists, in column 1. I have another cell, Cell2, which holds a number value. Column 2 of Table1 holds values that reference Cell2. I need to count the number of values from Range1 whose matching rows in Table1 match the value in Cell2. Is there a way I can do this with COUNTIF without referencing each cell individually? Is there some shorthand (like Range.currentValue) that I can use to get the value of the cell currently being checked? The range is 11 rows long, and I need to do a second range that has 12 rows counted.
Man, I really don't know how to clarify that any more... I'll post this for now, in case anyone can understand what I'm saying and knows the answer, while I work on a sample spreadsheet I can upload.
I did my best to visually represent what I'm trying to accomplish:
http://gyazo.com/b83295baf3b156683a5c39b40c806504
Extended explanation: http://gyazo.com/4048802050e3dcfca7aee238acc2f7dd
Use a helper column, say, between the brown and the first blue or at the right of the setup. Use a vlookup like
=vlookup(brownvalue,BluetableRange,2,false)
Then do a countif on the helper column
=countif(HelperColumn,"<="&GreenCellAddress)
You can hide the column with the helper if it upsets your spreadsheet design.
You can (and probably should) use a helper column as Teylyn suggests. But, for when that may be inconvenient, you can also use an array formula:
=SUM(COUNTIFS(listlookupcolumn,rangeoflists,numbervaluecolumn,"<="&numbertomatch))
To enter it as an array formula, press Ctrl+Shift+Enter after editing the formula, rather than just Enter.
Rough explanation: since rangeoflists is in a place where a single value is expected, the countifs is calculated once for each value, and the array of results is passed to sum. Use the "evaluate formula" feature to see the intermediate result array.
Afterthought: It occurs to me now that this does rely on listlookupcolumn containing unique values. (Almost certainly true in this example.) You can modify the formula a bit to get around this:
=SUM(SIGN(COUNTIFS(listlookupcolumn,rangeoflists,numbervaluecolumn,"<="&numbertomatch)))
The SIGN function will keep you from double counting.
Again, you must use "ctrl-shift-enter" for this to work. (Yes, as I'm sure others are ready to point out, you can also use the sumproduct hack in this instance.)
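Here is roughly what the SUM(SIGN(COUNTIFS(...))) version computes, written as a Python sketch. The table contents, the selected values, and the limit are all hypothetical; the names only mirror the placeholders used in the formula.

```python
def count_matching(range_of_lists, lookup_table, number_to_match):
    """For each selected value, add 1 if any lookup row has that name and number <= limit."""
    total = 0
    for choice in range_of_lists:
        hits = sum(
            1
            for name, number in lookup_table
            if name == choice and number <= number_to_match
        )
        total += 1 if hits > 0 else 0   # the SIGN step: duplicates count once
    return total


# Hypothetical lookup table with a duplicated name, plus the dropdown selections
lookup_table = [("sword", 3), ("bow", 5), ("staff", 8), ("bow", 5)]
range_of_lists = ["bow", "staff", "sword", "bow"]

print(count_matching(range_of_lists, lookup_table, 5))
```

Without the SIGN step, the duplicated "bow" row would be counted twice for every cell that selects it.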

Can I store the range result of =Offset() in another cell?

It's recommended to use =MATCH() in its own cell and then use INDEX to refer to that cell. This makes sense: why redo the MATCH() formula over and over when it's the same result?
I want to do the same thing with the OFFSET() formula. I'm working with large tables and I understand that keeping your ranges small is the key to optimization. So, using OFFSET to figure out how big of a range I want to use has been extremely beneficial. However, sometimes I might have an IF statement that checks several COUNTIFS that require the same range. In these cells I am forced to use the OFFSET to determine the exact same range, over and over... wouldn't it be better to simply do the same thing as INDEX/MATCH?
Unfortunately I don't think Excel can output the range itself... I notice in the formula auditor that it will reveal the resulting range. I need that literal range in a cell, so A1 might say "$B$2:$B$342".
Probably not possible, but thought I'd ask!
Thanks
You can try to use the 'CELL()' formula. This formula can return the 'address' of a referenced cell. See formula below:
=CELL("address",B1)&":"&CELL("address",B10)
Results should be: $B$1:$B$10
Put the above formula in cell 'A1' and see if this helps you at all. You will probably need to tweak it a bit to get the exact results you're looking for (for example, you may need to 'nest' your offset() formula within the cell() formula).
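For reference, the text that this kind of CELL("address") concatenation produces can be described with a short Python sketch. This only illustrates the string format; the function names are hypothetical and nothing here talks to Excel.

```python
def column_letters(number):
    """Convert a 1-based column number to Excel letters (1 -> A, 27 -> AA)."""
    letters = ""
    while number > 0:
        number, remainder = divmod(number - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def absolute_range(column, first_row, last_row):
    """Build an absolute single-column address such as $B$1:$B$10."""
    col = column_letters(column)
    return f"${col}${first_row}:${col}${last_row}"


print(absolute_range(2, 1, 10))
print(absolute_range(2, 2, 342))
```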
Best of luck!

Simple Excel vlookup doesn't work

I am a programmer that rarely uses Excel. I'm now trying to do a simple vlookup and it just won't work. I have read several online tutorials and troubleshooting guides, no dice. Here's what I've got:
As you can see, the formula in B8 is =VLOOKUP(A8,$A$1:$B$5,1,FALSE)
I am baffled why this isn't working. I have absolutely verified that each cell in the lookup table (A1-B5) doesn't contain any leading/trailing spaces, no special chars, etc. In fact I typed these in manually, they're not pasted. Same goes for the little column of colors (A8-A11). This is the simplest case possible. For example, I want the formula in B8 to look at "Red" in A8, find Red in the lookup table, and return Red's number, which is "3". And I want an exact match.
In case you're wondering why I'm trying this on a simple and useless case, it's because I began on a more complex sheet, as part of prepping for a data import from Excel, got the #N/A everywhere, so I started a new worksheet and made this simple example, and got the same wrong result.
What am I doing wrong?
You would be better served by using index() and match() because in a vlookup(), the value that you're trying to look for has to be in the left-most column.
match() will return the number or index (in your case, the row number) in which it finds the value you're looking for, and that can be given to index() to use to return some other value associated with that index (in this case, the color number in that row). It would end up looking like this:
=index($a$1:$a$5, match(a8, $b$1:$b$5, 0))
I found that the lookup value (color) must be in the left column and the ID must be in the right column.
VLOOKUP doesn't work to its left; this function only looks rightwards. This is why you need to swap numbers and colors.
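Since the asker is a programmer, here is the same point as a Python sketch. The table is hypothetical apart from Red being 3 (as stated in the question), with numbers in the first column and colors in the second, and both functions are simplified models rather than Excel's actual implementation.

```python
# Hypothetical lookup table: number in column 1, color in column 2
table = [(1, "Blue"), (2, "Green"), (3, "Red"), (4, "White"), (5, "Black")]


def vlookup(key, table, col_index):
    """Search only the FIRST column for key; return the col_index-th cell of that row."""
    for row in table:
        if row[0] == key:
            return row[col_index - 1]
    return None  # Excel would show #N/A


def index_match(key, lookup_col, result_col):
    """Search any column for key; return the same row of any other column."""
    return result_col[lookup_col.index(key)]


numbers = [row[0] for row in table]
colors = [row[1] for row in table]

print(vlookup("Red", table, 1))             # "Red" is never in the first column
print(index_match("Red", colors, numbers))  # searches colors, returns the number
```

Swapping the two columns so that colors come first makes the vlookup version work too.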

Resources