Using LOOKUP Functions with <= and >= - excel

I am attempting to use the LOOKUP functions in Excel in a nested(?) fashion and with ranges of data. In the attached picture, the left-hand table is my data, which extends for another 360 rows or so. Each row has a unique ID (I've taken this data from a larger set so I wanted to retain it), a State postal abbreviation, and the income level for that data point (each row is data from a different zipcode).
The table on the right is the metadata - quintile levels for income in each state. For each row on the left, I want to look up the state abbreviation from the metadata, then use the adjacent income level to determine and print out the appropriate quintile based on that row in the metadata. I anticipate that the solution would use some form of the lookup functions and inequalities, but I'll take any solution.

For this approach you need Office 365, with the new XMatch function, which can do an approximate match for next bigger number without requiring the data to be sorted.
The formula is
=INDEX($J$1:$N$1,XMATCH(C2,INDEX(J:J,MATCH(B2,H:H,0),0):INDEX(N:N,MATCH(B2,H:H,0)),1))
If you don't have XMatch, you would need to re-arrange the lookup columns from Highest to Lowest. Then you can use
=INDEX($J$1:$N$1,MATCH(C2,INDEX(J:J,MATCH(B2,H:H,0),0):INDEX(N:N,MATCH(B2,H:H,0)),-1))
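The two-step logic behind these formulas (pick the state's row, then find the first quintile bound that is at least the income) can be sketched in Python. This is only an illustration: the state codes, labels and bounds below are made-up values, not the asker's data, and `quintile_for` is a hypothetical helper name.

```python
# Hypothetical quintile upper bounds per state (illustrative values only).
QUINTILE_LABELS = ["Q1", "Q2", "Q3", "Q4", "Q5"]
QUINTILE_BOUNDS = {
    "CA": [30000, 55000, 85000, 130000, 500000],
    "OH": [22000, 40000, 62000, 95000, 400000],
}

def quintile_for(state, income):
    """Return the label of the smallest bound that is >= income, or None."""
    bounds = QUINTILE_BOUNDS[state]  # plays the role of MATCH(B2, H:H, 0)
    # Like XMATCH with match mode 1: exact match or next larger value,
    # regardless of the order in which the bounds are stored.
    candidates = [(upper, label)
                  for upper, label in zip(bounds, QUINTILE_LABELS)
                  if upper >= income]
    if not candidates:
        return None  # income above every bound, comparable to #N/A
    return min(candidates)[1]

print(quintile_for("CA", 60000))  # Q3
print(quintile_for("OH", 22000))  # Q1
```

Taking the minimum over all qualifying bounds is what removes the need for sorted data; the MATCH variant with -1 instead relies on the columns being in descending order.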

If you paste this formula on cell D4, would the result be your expected output?
(Highest Quintile for California)
=INDEX(N:N,MATCH(B4,B:B,0))
Paste this to D5
(Lowest Quintile for Ohio)
=INDEX(J:J,MATCH(B5,B:B,0))
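The last 0 (zero) in the formula is the match type. It can be replaced with:
1 - less than (finds the largest value that is less than or equal to the lookup value; the data must be sorted in ascending order)
0 - exact match
-1 - greater than (finds the smallest value that is greater than or equal to the lookup value; the data must be sorted in descending order)
depending on your need. Did I get your point?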
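As a rough model of the three match types, here is a small Python sketch. `excel_match` is a hypothetical helper, not an Excel API; it scans the list linearly and assumes numeric data that is already sorted the way each mode requires.

```python
def excel_match(lookup, values, match_type=1):
    """Rough model of MATCH for numbers; returns a 1-based position or None."""
    if match_type == 0:
        # Exact match: first position whose value equals the lookup value.
        for pos, value in enumerate(values, start=1):
            if value == lookup:
                return pos
        return None
    if match_type == 1:
        # Values assumed ascending: last position whose value is <= lookup.
        hits = [pos for pos, value in enumerate(values, start=1) if value <= lookup]
        return hits[-1] if hits else None
    if match_type == -1:
        # Values assumed descending: last position whose value is >= lookup.
        hits = [pos for pos, value in enumerate(values, start=1) if value >= lookup]
        return hits[-1] if hits else None
    raise ValueError("match_type must be 1, 0 or -1")

print(excel_match(35, [10, 20, 30, 40, 50], 1))    # 3 (value 30)
print(excel_match(35, [50, 40, 30, 20, 10], -1))   # 2 (value 40)
```

A result of None stands in for the #N/A error that Excel returns when nothing qualifies.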

Your information is a bit sparse, so I'll tell you what I did and you can take it from there.
First I created a named range to comprise 2 columns of your median income table, state and income. Then I created this formula to extract the income by state.
=VLOOKUP($B2,Income,2,FALSE)
Observe that the state name is in column B and the income in the 2nd column of the Income range. Your list may be structured differently. The key to it is that the Income range must have the State in its first column and the 2 in the formula just counts columns from State to Income.
If you place this formula on the same sheet as the Income range it will just produce a copy of the Income column. But that isn't what I did. I placed it in a blank column on the Quintile tab. That happened to be column J, since A:G is taken up with your data, notably, C:G with the columns for quintile numbers. Observe in this transfer that column B displays the state abbreviations. By coincidence it's column B in both sheets. The relevant column is the one on the Quintile sheet. So, the formula still shows the median income for each state but the sequence is determined by the sequence of state names on the tab where the formula resides, and that is the requirement here.
Next, I created this formula and placed it in column K of the Quintile sheet.
=MATCH(J2,C2:G2,1)
This formula determines the column in C:G where the value in J2 is matched. J2, of course, contains the Median income drawn by the VLOOKUP. If that number is nothing it will be interpreted as zero and the lowest quintile returned. Read up on the precise method of the MATCH function.
Now J2 can be integrated into the formula. I did that in a copy of the MATCH formula in column L.
[L2] =MATCH(VLOOKUP($B2,Income,2,FALSE),C2:G2,1)
Observe that the formula in K and L have the same result. I copied them down a few rows to make sure. I got a lot of #N/A errors in this exercise resulting from state abbreviations in the Quintile sheet not being found in the Income range. I think that information is useful. Therefore I didn't suppress it.
The result so far is the quintile, numbered from 1 to 5. I wanted to make sure that the numbers are correct. Therefore I "translated" them to the number in the Quintile table.
For this purpose I created another named range, called "Quintiles", comprising columns C:G. It's important that this range start in row 1. The columns could be any other columns (not C:G) but they must be the same as specified in the formula, with the lowest quintile being the first. And this became my formula in column M.
[M2] =INDEX(Quintiles,ROW(),L2)
If you actually need this number you can replace the reference to L2 in the formula with the formula in L2.

Related

Aggregating data using INDEX MATCH MATCH or SUMIFS

I'm trying to create an Excel formula that is able to sum multiple rows in a table, where the rows and column to be summed are determined by the contents of other cells.
Ordinarily I would use Index Match Match to achieve this, but the multiple rows summation has left me stumped.
I've seen a couple of examples on here of Index Match with a SUMIFS formula, but nothing that pairs this with Index Match Match.
I have two tables on different Excel sheets. The first one looks a little like this (the actual table is 105 columns x 200 rows):
That is from a sheet called "Firm Cost Summary". Row 4 contains a list of unique employee numbers. Column A is the expense category per our accounting system and Column B is a broader category that should be used in Excel to group similar items. Column E onwards then contains the numerical information to be aggregated.
What I would then like to do is summarise that table in a more presentable format that can then be manipulated in other ways. The table looks like this:
That is on a sheet called "Staff Cost Summary". I would like to fill out the info in the yellow cells, i.e. total the salary, bonus, benefits, etc, of each staff member. Ideally this would be a formula I input in cell E6 that I can then drag right and downwards to fill the table.
To give an example, to fill out cell I6 in the second table, the formula should look in cell A6 to find the employee number (1 in this case) and look this up in row 1 of the first table to find the appropriate column of the first table (column E in this case).
The formula should then look in cell I5 of the second table to see that we are looking to aggregate benefits, then look down column B of the first table to find each row that should be summed (rows 7-10 in this case).
With that in mind, here's what I've got:
=INDEX('Firm Cost Summary'!$A$4:$G$10,MATCH('Staff Cost Summary'!$A6,'Firm Cost Summary'!$A$4:$G$10,0),MATCH('Staff Cost Summary'!E$5,'Firm Cost Summary'!$B$4:$B$10,0))
Total benefits for Joe Bloggs are the sum of cells E7:E10 of table 1, i.e. 5 + 10 + 50 + 100 = 165.
Clearly there are multiple matches in column B of that table, so the above formula gives an answer of 0. Any ideas how I can tweak that to make it work?
Put this in E6 and copy over and down
=SUMIFS(INDEX('Firm Cost Summary'!$D:$DD,0,MATCH($A6,'Firm Cost Summary'!$D$4:$DD$4,0)),'Firm Cost Summary'!$B:$B,E$5)
The index/match returns the correct column to be added.
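To make the mechanics concrete, here is a Python sketch of what the SUMIFS with INDEX/MATCH does: pick one column by employee number, then sum the rows whose group label matches. The lists are a hypothetical slice of the "Firm Cost Summary" sheet; only the benefits figures for employee 1 (5, 10, 50, 100) come from the question.

```python
# Hypothetical slice of the "Firm Cost Summary" sheet.
employee_numbers = [1, 2]  # header row holding the employee numbers
groups = ["Salary", "Bonus", "Benefits", "Benefits", "Benefits", "Benefits"]  # column B
costs = [
    [1000, 1200],
    [200, 0],
    [5, 7],
    [10, 0],
    [50, 20],
    [100, 80],
]

def sum_for(employee, group):
    """Sum one employee's column over the rows whose group matches."""
    col = employee_numbers.index(employee)  # like MATCH(..., 0); ValueError if absent
    return sum(row[col] for row, label in zip(costs, groups) if label == group)

print(sum_for(1, "Benefits"))  # 165
```

The column lookup happens once per cell, and the row filter handles the repeated category labels that made the plain INDEX/MATCH/MATCH attempt fail.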

Offset formula logic clarity

I am trying to get the year-to-date total, up to a desired month, of personal expenditure sub-categories. After searching Stack Overflow, I found a formula that seemed appropriate for my requirements. During formula evaluation, however, I found that it shifted the desired area one row down. I modified the formula by trial and error, and it now gives the correct results. To me the initially chosen formula appeared quite appropriate. I have shown below the sample data sheet and the evaluation steps of the original and modified formulas. Could someone explain, particularly for the OFFSET portion, why the initially chosen formula went wrong and how the modification solved the problem? I have not been able to get conceptual clarity on this issue.
Sample Data files
Personal_Accounts evaluated with formula A
Personal_Accounts evaluated with modified formula
Offset works by specifying:
A cell from which you will offset (A1 in this example), then how many rows and columns to move from that position, and then how tall and wide to make the range.
The number of rows to move down: In this case the number of rows down is determined by Match(). Match() here will return the number of rows down in the range A1:A9 that the value SS can be found. The answer is 5. Offset now is looking at Range A1 + 5 rows: A6
The number of columns to move across: Here we move 1 column. No funny business. New range is B6
The number of rows to include in the range from that start point: Here COUNTIFS() is used to return the number of times SS is found in the range A2:A9. The answer is 3. So the range will start at B6 and include three rows down in the range. Essentially B6:B8.
Finally, the number of columns to include in the range: Here it's 7 since that's what you have in cell A13, so your range is now B6:H8
OFFSET() returns that range and SUM() sums it up.
You subtracted one from the results of MATCH() and correctly moved that formula to produce B5:H7. You could have also changed the search range in MATCH() to A2:A9, which would probably make more sense from a readability standpoint.
Lastly, your COUNTIFS() could just be COUNTIF() since you are not evaluating multiple conditions.
So if I had to write this from scratch, I would use:
=SUM(OFFSET(A1, MATCH(A12, A2:A9, 0), 1, COUNTIF(A2:A9, A12), A13))
Which will get you the same correct answer, without any math on Match() results.
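The off-by-one can be checked with a small Python model of the row arithmetic. The column A labels below are hypothetical except that "SS" is placed in rows 5 to 7, which is what the walkthrough above implies; `match_exact` and `offset_rows` are illustrative helpers, not Excel functions.

```python
# Hypothetical labels for A1:A9 (A1 is a header); "SS" occupies rows 5-7.
col_a = ["Category", "AA", "AA", "BB", "SS", "SS", "SS", "TT", "TT"]

def match_exact(value, cells):
    """1-based position of value within cells, like MATCH(value, cells, 0)."""
    return cells.index(value) + 1

def offset_rows(ref_row, rows_down, height):
    """First and last sheet row covered by OFFSET(ref, rows_down, ..., height, ...)."""
    first = ref_row + rows_down
    return first, first + height - 1

height = col_a[1:].count("SS")  # COUNTIF(A2:A9, "SS") gives 3

original = offset_rows(1, match_exact("SS", col_a), height)       # MATCH over A1:A9
minus_one = offset_rows(1, match_exact("SS", col_a) - 1, height)  # MATCH(...) - 1
shifted = offset_rows(1, match_exact("SS", col_a[1:]), height)    # MATCH over A2:A9

print(original)   # (6, 8): one row too low
print(minus_one)  # (5, 7)
print(shifted)    # (5, 7)
```

Both fixes land on rows 5 to 7 because MATCH returns a position counted from 1, while OFFSET counts rows moved from the reference cell, starting at 0.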
Offset has two main functions: either to move to a target cell a specified number of rows and columns away from the starting point, or to select a range of a specified number of rows and columns starting at the target cell. Your original formula has an issue in this part:
MATCH(A12;A1:A9;0)
The matched cell is the fifth, so the offset moves 5 rows down and ends in A6, because it starts in A1 and adds 5 rows. Then it moves 1 column to B6 and creates a range of 3 rows and 7 columns in total = B6:H8. So you need to deduct 1 from the result of the MATCH function to end up in the right row.
For better understanding, imagine the SS value were in the first row of the range A1:A9 (in A1): the offset would then move from A1 one row down to A2, although you wouldn't want it to move at all.
Look at your basic OFFSET formula definition.
OFFSET (REFERENCE CELL, HOW MANY ROWS TO MOVE FROM REFERENCE, HOW MANY COLUMNS TO MOVE FROM REFERENCE, HOW MANY ROWS TO RETURN, HOW MANY COLUMNS TO RETURN)
so if you set your reference cell to A1 and you want to return the result in A2, you need to move down 1 row from your reference cell.
OFFSET ($A$1,1,0,1,1)
Now if we look at the MATCH portion of your equation, MATCH returns the position the information is in. So if we want to find the match position of the information in A2 in a range going from A1:A100, MATCH is going to tell you that the information in A2 is in the 2nd position of the column. Or more precisely, it returns a value of 2.
So now we need to tell OFFSET how far down to go to reach the 2nd position. We don't actually want it to move down 2 rows to get to the second position, since our reference point is A1, which is the first row. As a result we really want to go down 1 row to get to the second row. So you want 1 less than your MATCH result, which you correctly did by using Match(...)-1.

Need formula operating against a dynamic range copied across a series of cells

I'm creating a grid of correlation values, like a distance grid. I have a series of cells that each contain a formula whose ranges are easy to describe if you know the offset from the first cell, and I'm having trouble figuring out how to specify it.
In the upper left hand cell (R10), the formula is CORREL(C2:C21,C2:C21) -- it's 1, of course.
In the next column over (S10), the formula is CORREL(D2:D21,C2:C21).
In the next row down (R11), the formula is CORREL(C2:C21,D2:D21).
Of course, S11 would contain CORREL(D2:D21,D2:D21), which is also 1. And so on, for a roughly 15x15 grid.
Here's a graphical representation of the ranges involved:
C2:C21,C2:C21 C2:C21,D2:D21 C2:C21,E2:E21
D2:D21,C2:C21 D2:D21,D2:D21 D2:D21,E2:E21
E2:E21,C2:C21 E2:E21,D2:D21 E2:E21,E2:E21
Whenever I add a new data row, I have to manually update several formulas. So, I'd like the last non-blank row number (21, in this case) to be dynamically determined, such as with COUNTA(C:C). Ideally, I'd like the formula to calculate the row offsets, too, so that I can drag one formula across my entire range.
What's the best way to accomplish this? I think OFFSET might be a component in the solution, but I haven't had success getting it all to work together.
Using this simple setup per element of the corr matrix also helps:
=CORREL(INDIRECT("'Risk factors'!"&"T"&G6&":T"&H6);INDIRECT("'Risk factors'!"&"U"&G6&":U"&H6))
With this function I refer to data in another sheet, Risk factors, to correlate columns T and U with each other. I want the ranges of the data to be dynamic, so I use G6 and H6 in my current sheet to hold the first and last row numbers of the columns, which I of course specify in these G6 and H6 cells.
Hope this helps!
I found this formula, while wordy, achieved the desired results. In this example, the data lives in C2:O19. The table I wanted to construct computed the correlation values of all permutations of pairs of columns. Since there are 11 columns, the correlation pairs table is 11x11 and starts at R10. Each cell has the following formula:
=CORREL(INDIRECT(ADDRESS(2,2+(ROWS($R$10:R10)),4)&":"&ADDRESS(COUNTA($C:$C),
2+(ROWS($R$10:R10)),4)),INDIRECT(ADDRESS(2,2+(COLUMNS($R$10:R10)),4)&":"&
ADDRESS(COUNTA($C:$C),2+(COLUMNS($R$10:R10)),4)))
As I found out, INDIRECT() takes a text string such as "E2:E19" and resolves it to the reference it names, so the values in that range can be used.
Let's take a cell, say U12, and look at the range formula in detail. The first INDIRECT is the column given by applying the row offset from R10.
Since Row 12 is 2 rows down from Row 10, ADDRESS(2,2+(ROWS($R$10:U12)),4)&":"&ADDRESS(COUNTA($C:$C),2+(ROWS($R$10:U12)),4) should yield the column that's 2 columns right of Column C, which is E. The formula evaluates to E2:E19.
The second INDIRECT is the column given by applying the column offset from R10. Similarly, since Column U is 3 columns right of Column R, ADDRESS(2,2+(COLUMNS($R$10:U12)),4)&":"&ADDRESS(COUNTA($C:$C),2+(COLUMNS($R$10:U12)),4) should yield the column that's 3 columns right of Column C, which is F. The second formula evaluates to F2:F19.
Substituting these range reference values in, the cell formula reduces to =CORREL(INDIRECT("E2:E19"),INDIRECT("F2:F19")) and further to =CORREL(E2:E19,F2:F19), which is what I'd been using up till now.
Just like a distance table, this table is symmetrical along the diagonal, because =CORREL(E2:E19,F2:F19) equals =CORREL(F2:F19,E2:E19). Each value on the diagonal is 1, because CORREL of the same range is 100% correlation by definition.
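The address arithmetic can be reproduced in a few lines of Python as a sanity check. This is a sketch: `column_letter` and `ranges_for` are hypothetical helpers, and `last_row` stands in for the result of COUNTA($C:$C), assumed here to be 19.

```python
def column_letter(n):
    """Convert a 1-based column number to letters (1 -> A, 27 -> AA)."""
    letters = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters

def ranges_for(grid_rows, grid_cols, last_row, first_row=2, base_col=2):
    """Build both range strings for one grid cell.

    grid_rows models ROWS($R$10:cell) and grid_cols models COLUMNS($R$10:cell);
    base_col=2 mirrors the "2+" in the ADDRESS calls.
    """
    first_col = column_letter(base_col + grid_rows)
    second_col = column_letter(base_col + grid_cols)
    return (f"{first_col}{first_row}:{first_col}{last_row}",
            f"{second_col}{first_row}:{second_col}{last_row}")

# Cell U12: ROWS($R$10:U12) = 3 and COLUMNS($R$10:U12) = 4.
print(ranges_for(3, 4, 19))  # ('E2:E19', 'F2:F19')
```

The top-left cell R10 corresponds to `ranges_for(1, 1, 19)`, which gives C2:C19 for both ranges and therefore a correlation of 1.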

Compare two data sheets

The issue I'm faced with is I have two sheets of data in Excel. They are a stocksheet list, listing items that have a variance from a stocktake. The items are randomly placed between both documents, so it is almost impossible to do a side-by-side view even if I were to order the columns (which I already have). For example it would be like this:
Sheet 1:
A1 (Apple) (1)
A2 (Carrot) (-3)
A3 (Banana) (4)
A4 (Chocolate) (-7)
Whereas Sheet 2 may be:
A1 (Orange) (-2)
A2 (Apple) (3)
A3 (Muffin) (-8)
A4 (Carrot) (3)
So as you can see, the same data may appear, and if it does I want to compare those two sets, to know the variance, i.e. Sheet 1 said -3 whereas Sheet 2 said +3... I would preferably like to do this in a batch if possible, as there are over 800 cells to go through.
Just so that you can see what I'm dealing with, here's links to pastebins of both sheets;
Sheet 1: http://pastebin.com/6i7QKJ6N
Sheet 2: http://pastebin.com/zjtC2U7q
Is there anything anyone can think of that would be able to assist me, other than me going through this one by one which I am considering doing?
Excuse me for avoiding the real situation and sticking with your example. Assuming the values are in ColumnB in the corresponding rows, then:
in Sheet1: =VLOOKUP(A1,Sheet2!A:B,2,FALSE)
in Sheet2: =VLOOKUP(A1,Sheet1!A:B,2,FALSE)
say in ColumnC, should 'align' the entries (where both exist, otherwise #N/A). =B1=C1 in D1, copied down, should then help to identify the mismatches, and say =B1-C1 in E1, copied down, should quantify the discrepancies between the sheets, by 'vegetable'.
There should be no need for a batch mode for this.
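For readers who want to see the alignment step outside Excel, here is a Python sketch using the example data from the question. The dictionaries model columns A and B of each sheet; `compare` is a hypothetical helper in which None stands in for #N/A.

```python
# Example data from the question (item name -> variance).
sheet1 = {"Apple": 1, "Carrot": -3, "Banana": 4, "Chocolate": -7}
sheet2 = {"Orange": -2, "Apple": 3, "Muffin": -8, "Carrot": 3}

def compare(left, right):
    """For each item in left, look it up in right and compute the difference."""
    rows = []
    for item, qty in left.items():
        other = right.get(item)  # exact-match lookup, like VLOOKUP(..., FALSE)
        diff = None if other is None else qty - other  # like =B1-C1
        rows.append((item, qty, other, diff))
    return rows

for row in compare(sheet1, sheet2):
    print(row)
```

Items that appear in only one sheet (Banana, Chocolate) come back with None, which is the place where the spreadsheet version shows #N/A.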
I'm assuming that the unique identifier for the stock items is the column labelled CYSKU, right?
If that's so, then there are only 192 common items between the two sheets. I ran a vlookup in both sheets a bit similar to the one pnuts used and used a filter.
There are more variances between CYCOST than with CYRETL as far as I can see (I haven't compared the other columns).
To perform the comparison, you can do the following:
Insert a column between columns C and F (just after CYSKU) and put a vlookup formula in row 2 of this column and fill it down:
=VLOOKUP(C2, Sheet2!C:C, 1, 0)
Insert a filter and filter out #N/A from this column to get only those that are common between the two sheets.
In column M (after CYDVAR), insert another vlookup and fill it down:
=VLOOKUP(C2, Sheet2!C:F, 4, 0)
This will give you the corresponding CYRETL from Sheet2. You can then compare the two CYRETL.
How VLOOKUP works:
The first parameter is what VLOOKUP will be looking for.
The second parameter is the table range in which to look for the first parameter.
The third parameter is the nth column from which a match will be returned, limited to the table (if the table is in column A:A, only 1 column is available, if the table is A:B, 2 columns are available, etc).
The last parameter is for either exact or approximate match. Exact is 0 (or FALSE) and approximate is 1 (or TRUE).
You can just change the table range and the column number to change the value you're looking for from Sheet2.

How to create result table with vlookup from merged cells

I have this data table and I want another result table. When I write the name of a state, the result table should show all the companies with data1, data2 and data3. I tried using VLOOKUP, but because there are merged cells the formula only shows the first row.
How can I fix this problem?
If I'm understanding correctly, you want to set up a lookup range so that when you enter a particular state, you can see the data for all the companies that have data in that state. Here is one way to do that.
The first thing you would need to do is set up three columns to the left of the original table:
The first column holds the name of the state associated with each row of data
The second is an index that counts off the number of data rows in each state
The third combines the first two columns to produce a unique key value for each row in the table.
All the values in these three columns can be assigned by formula. The picture below shows the formulas for the first row of cells A9:C9, which are then copied down through row 27.
The next step is to lay out the new table, which is in cells Q8:U27 in my example.
There are several things to note about the setup. First, the state that will be displayed is entered in cell Q9, which I've highlighted in yellow. To the left of the table, in column P, I've entered item numbers from 1 to 19, which will be needed to construct the key values for the lookups. The lookup formulas themselves are in cells R9:U27; in the picture, the formulas for the first row (R9:U9) are shown (they are then copied down through row 27).
It's worth taking a moment to look more closely at one of the lookups. Here is the formula for the first company name in cell R9.
=IFERROR(VLOOKUP($Q$9&$P9,$C$9:$N$27,4,0),"")
Looking at each of the arguments of the VLOOKUP in turn, $Q$9&$P9 concatenates the state name in cell Q9 with the item number (1 in this case), yielding the lookup value 'California1'. The lookup table is defined as the range $C$9:$N$27 - column C of that range is what the lookup value is matched against. The third argument is the column from which to return a value if the lookup is a match. The number 4 here corresponds with the company name column of the original table. Finally, the last argument is 0 (or equivalently, FALSE) indicating an exact match is required.
Finally, the VLOOKUP function is wrapped inside IFERROR. This catches the #N/A that would otherwise be returned when no match is found, replacing it with an empty string ("").
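The helper-column idea (fill the state down, number the rows within each state, join the two into a key) can be sketched in Python. Everything here is an assumption for illustration: the company names are invented, merged cells are modelled as None on the continuation rows, and the function names are hypothetical. Only the key format 'California1' comes from the answer above.

```python
# Hypothetical rows; None models the blank cells left under a merged state cell.
rows = [
    ("California", "Acme"),
    (None, "Globex"),
    ("Texas", "Initech"),
]

def build_keyed_table(rows):
    """Map keys such as 'California1' to company names."""
    table = {}
    counts = {}
    current_state = None
    for state, company in rows:
        if state is not None:
            current_state = state  # helper column 1: state filled down
        counts[current_state] = counts.get(current_state, 0) + 1  # helper column 2
        table[f"{current_state}{counts[current_state]}"] = company  # helper column 3
    return table

def companies_for(state, table, max_items=19):
    """Look up state+1, state+2, ...; missing keys give "" like IFERROR(..., "")."""
    return [table.get(f"{state}{i}", "") for i in range(1, max_items + 1)]

table = build_keyed_table(rows)
print(companies_for("California", table, max_items=3))  # ['Acme', 'Globex', '']
```

Because each key is unique, an exact-match lookup can reach every row of a state instead of stopping at the first one.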

Resources