Offset formula logic clarity - excel

I am trying to get the total of personal expenditure subcategories from the start of the year up to a desired month. After searching Stack Overflow, I found a formula that seemed appropriate for my requirements. However, during formula evaluation it shifted the desired area one row down. I modified the formula by trial and error, and the modified version gives the correct results. To me, the original formula still looks quite appropriate. Below are the sample data sheet and the evaluation steps of the original and modified formulas. Could someone explain, particularly for the OFFSET portion, why the original formula went wrong and how the modification solved the problem? I have not been able to get conceptual clarity on this.
Sample Data files
Personal_Accounts evaluated with formula A
Personal_Accounts evaluated with modified formula

Offset works by specifying:
A cell from which you will offset (A1 in this example), then how many rows and columns to move from that position, and then how tall and wide to make the range.
The number of rows to move down: In this case it is determined by MATCH(), which returns the position within A1:A9 at which the value SS is found. The answer is 5. OFFSET is now looking at A1 + 5 rows: A6.
The number of columns to move across: Here we move 1 column. No funny business. New range is B6
The number of rows to include in the range from that start point: Here COUNTIFS() is used to return the number of times SS is found in the range A2:A9. The answer is 3. So the range will start at B6 and include three rows down in the range. Essentially B6:B8.
Finally, the number of columns to include in the range: Here it's 7 since that's what you have in cell A13, so your range is now B6:H8
OFFSET() returns that range and SUM() sums it up.
You subtracted one from the result of MATCH(), which correctly moved the range up one row to produce B5:H7. You could also have changed the search range in MATCH() to A2:A9, which would probably make more sense from a readability standpoint.
Lastly, your COUNTIFS() could just be COUNTIF() since you are not evaluating multiple conditions.
So if I had to write this from scratch, I would use:
=SUM(OFFSET(A1, MATCH(A12, A2:A9, 0), 1, COUNTIF(A2:A9, A12), A13))
Which will get you the same correct answer, without any math on Match() results.
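The off-by-one can also be reproduced outside Excel. The Python sketch below models MATCH and OFFSET under simplifying assumptions: the helper names `match_exact` and `offset_range` are hypothetical, the layout (labels in A1:A9, SS first found in A5, three SS rows, width 7) follows the walkthrough above, and every label other than SS is invented for illustration.

```python
def col_letter(index):
    """Convert a 1-based column index (1..26) to a letter; enough for this sketch."""
    return chr(ord("A") + index - 1)


def match_exact(value, cells):
    """Like MATCH(value, range, 0): 1-based position of the first exact match."""
    return cells.index(value) + 1


def offset_range(start_row, start_col, rows, cols, height, width):
    """Like OFFSET(ref, rows, cols, height, width), returned as an A1-style address."""
    top = start_row + rows
    left = start_col + cols
    return f"{col_letter(left)}{top}:{col_letter(left + width - 1)}{top + height - 1}"


# Column A, rows 1..9. Only the position of "SS" matters; the other labels are invented.
col_a = ["Category", "FD", "FD", "FD", "SS", "SS", "SS", "TR", "TR"]

pos_from_a1 = match_exact("SS", col_a)      # MATCH searching A1:A9
pos_from_a2 = match_exact("SS", col_a[1:])  # MATCH searching A2:A9
height = col_a[1:].count("SS")              # COUNTIF(A2:A9, "SS")

original = offset_range(1, 1, pos_from_a1, 1, height, 7)
minus_one = offset_range(1, 1, pos_from_a1 - 1, 1, height, 7)
shifted_search = offset_range(1, 1, pos_from_a2, 1, height, 7)

print(original)        # B6:H8
print(minus_one)       # B5:H7
print(shifted_search)  # B5:H7
```

MATCH counts positions starting at 1, while OFFSET counts rows moved starting at 0, so a lookup range that begins on the reference cell itself always lands one row too low.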

OFFSET has two main functions: either to move to a target cell a specified number of rows and columns from the starting point, or to select a range of a specified number of rows and columns starting at the target cell. Your original formula has an issue in this part:
MATCH(A12;A1:A9;0)
The matched cell is the fifth in the range, so OFFSET moves 5 rows down from A1 and ends in A6. It then moves 1 column to B6 and creates a range of 3 rows and 7 columns, B6:H8. So you need to subtract 1 from the result of the MATCH function to end up in the right row.
For a better understanding, imagine the SS value were in the first row of the range A1:A9 (in A1). MATCH would return 1, and OFFSET would move one row down from A1 to A2, although you would not want it to move at all.

Look at the basic OFFSET formula definition.
OFFSET(REFERENCE CELL, HOW MANY ROWS TO MOVE FROM REFERENCE, HOW MANY COLUMNS TO MOVE FROM REFERENCE, HOW MANY ROWS TO RETURN, HOW MANY COLUMNS TO RETURN)
So if you set your reference cell to A1 and you want to return the result in A2, you need to move down 1 row from your reference cell.
OFFSET ($A$1,1,0,1,1)
Now look at the MATCH portion of your equation. MATCH returns the position of the information within the range. So if we look up the value that sits in A2 in a range going from A1:A100, MATCH will tell you that it is in the 2nd position of the column. More precisely, it returns a value of 2.
So now we need to tell OFFSET how far down to go to reach the 2nd position. We don't actually want it to move down 2 rows, since our reference point is A1, which is already the first row. We really want to go down 1 row to get to the second row. So you want 1 less than your MATCH result, which you correctly did by using Match(...)-1.

Related

In Excel, how do I get the header corresponding to the max value from a subset of a range?

I'm pushing beyond my Excel knowledge here. I'm trying to do a poll like thing in Excel. My problem lies on showing the selected result. Here's what I have so far:
I need to select the header corresponding to the cell with the highest value in the range B2:G2 (type 1). However, if there's a tie, I need to select the header corresponding to the highest value in the range B3:G3 amongst the cells with highest values in the range B2:G2.
In my sample, column "bb" and "cc" both share highest value on type 1 (5). So, in order to determine the winner, I need to compare the highest value for type 2 between them. Since "bb" is 0 and "cc" is 1, I expect "cc" as final result.
Components for formula are below:
J2: Displays the count of cells on line 2 with the highest value in the range. So, 2. I did that with COUNTIF comparing with MAX.
K2: Displays the first header it finds with the highest value on line 2. I managed with the following formula:
=INDEX($B$1:$G$1;0;MATCH(MAX($B$2:$G$2);$B$2:$G$2;0))
To be honest, I don't fully understand that formula. Did it with help of tutorials from the internet.
I2: Displays "TIE" when there's a tie on range B2:G2. Otherwise display the winning header (K2).
J3: Displays the number of cells with the maximum value on range B3:G3 but only considering winning cells from line 2. I did that with COUNTIFS.
=COUNTIFS(B3:G3;LARGE(B3:G3;1);B2:G2;MAX(B2:G2))
Edit: Just found out by entering number "4" on B3 that this formula above is also not working...
I3: Should follow the same pattern as the cell above. Displays "TIE" when there's still a TIE. Otherwise would display winning header (to be presented on K3).
K3: I don't know what to put here. Probably because I don't quite understand that formula with INDEX, MATCH and so on, I can't figure out a way to check the highest value between the two "winning" columns from the line above and get the header.
Could somebody help me with this?
First, let's establish if there is a tie. As you have discovered, you can do this by counting how many times the highest number appears in the range.
=COUNTIF($B2:$G2;MAX($B2:$G2))
If that count is more than 1, then there is a tie.
=IF(COUNTIF($B2:$G2;MAX($B2:$G2))>1;"TIE";"no tie")
In case of a tie you want to use the values in row 3 as a tie breaker. We can add them to the values in row 2 using this array formula. You must confirm an array formula with Ctrl+Shift+Enter, not just Enter, otherwise it won't work.
=INDEX($B$1:$G$1;MATCH(MAX(((IF(B2:G2=MAX(B2:G2);MAX(B2:G2);0))+B3:G3));INDEX((B2:G2+B3:G3);0)))
You only want to factor in row 3 if there is a tie, though, so you can reuse the IF statement from above and replace "TIE" with the array formula. Remember to press Ctrl+Shift+Enter!
=IF(COUNTIF($B$2:$G$2;MAX($B$2:$G$2))>1;INDEX($B$1:$G$1;MATCH(MAX(((IF(B2:G2=MAX(B2:G2);MAX(B2:G2);0))+B3:G3));INDEX((B2:G2+B3:G3);0)));"no tie")
You already have the formula to look up the value if there is no tie.
My system uses the comma as the list separator. I have manually replaced these with semicolons in the formulas I posted, but please bear with me if I may have missed one.
Now you can copy these formulas down to row 3. If there is a tie in the data in row 3, you will need data in row 4 to break the tie.
To understand the Index/Match combo, start with your first formula and read it from the inside out. The Max() finds the largest number. The Match() returns the position, i.e. column number, of the largest number in the range B2 to G2, i.e. 2 (the second column in the range). Index looks at B1 to G1 and returns the column value from the position that the Match returned, i.e. the 2nd column, which is the text bb.
Using row 3 as the tie breaker, the formula works pretty much the same, only that rows 2 and 3 are added together when the value in row 2 is the Max value and then that number is used to find the Max and the Match.
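To make the intended selection rule concrete, here is a minimal Python sketch of "first maximum, then tie-break on the next row". It is not a translation of the array formula above; the function names are hypothetical, and apart from bb and cc (5 each in type 1, then 0 and 1 in type 2, as stated in the question) the sample values are invented.

```python
def first_max_header(headers, values):
    """Like INDEX(headers, MATCH(MAX(values), values, 0)): header of the first maximum."""
    return headers[values.index(max(values))]


def winner_with_tiebreak(headers, primary, secondary):
    """Return the winning header, or "TIE" if the secondary row cannot separate the leaders."""
    top = max(primary)
    leaders = [i for i, v in enumerate(primary) if v == top]
    if len(leaders) == 1:
        return headers[leaders[0]]
    # Only the columns that lead the primary row take part in the tie-break.
    best = max(secondary[i] for i in leaders)
    finalists = [i for i in leaders if secondary[i] == best]
    return headers[finalists[0]] if len(finalists) == 1 else "TIE"


headers = ["aa", "bb", "cc", "dd", "ee", "ff"]
type1 = [1, 5, 5, 2, 0, 3]  # bb and cc tie on 5, as in the question
type2 = [4, 0, 1, 2, 0, 0]  # aa has the largest value here but is not a leader

print(first_max_header(headers, type1))             # bb
print(winner_with_tiebreak(headers, type1, type2))  # cc
```

The 4 under aa mirrors the case from the question's edit: a large value in row 3 that belongs to a column which did not win row 2 must be ignored.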
Here is an approach with SUMPRODUCT. I don't really understand what results you want in I3, J3, and K3; I will try to work that out.
I2:
=IF(SUMPRODUCT(--(B2:G2=MAX(B2:G2)))>1,"TIE","")
J2:
=SUMPRODUCT(--(B2:G2=MAX(B2:G2)))
K2:
=IF(J2>1,OFFSET(B1,0,SUMPRODUCT(--(B3:G3=MAX(B3:G3))*--(B2:G2=MAX(B2:G2))*{0,1,2,3,4,5})),OFFSET(B1,0,SUMPRODUCT(--(B2:G2=MAX(B2:G2))*{0,1,2,3,4,5})))
The {0,1,2,3,4,5} array holds one column offset per header; if there are more headers, this array needs to be extended.

Adding all the values below the current cell in Excel

I am trying to display the total sum of all the numbers for a particular column. I want the sum to be displayed above the column as follows:
21 30
A B
6 5
6 10
6 10
3 5
I know I can sum the values and display the result at the bottom of the column using =SUM(A3:INDIRECT("D"&ROW()-2)), but I cannot find a way to display it at the top of the column.
Please guide.
Based on the comments and the previous answers I suggest following formula, entered in cell A1:
=SUM(OFFSET(A$2,0,0,ROWS(B:B)-1))
You can then copy/paste to the right till second last column.
You could also modify your formula in A1 like this to achieve the same:
=SUM(INDIRECT("A2:A"&ROWS(A:A)-2))
But then you cannot copy/paste to the right...
A more general approach with your idea would be:
=SUM(INDIRECT(ADDRESS(ROW()+1,COLUMN())&":"&ADDRESS(ROWS(A:A),COLUMN())))
You can then copy/paste to the right till last column.
Some explanations:
Both formulas sum every value in the range from A2 to the bottom of column A (i.e. for Excel 2010 this would be A2:A1048576)
It doesn't matter if there are blanks or cells without value; the formula sums up only the numbers
I put A$2 and B:B in the OFFSET formula to avoid circular references, since I'm writing in cell A1 and I cannot write A$1 nor A:A
With the INDIRECT formula you don't have to worry about circular references
Further comments (sorry, I don't have the credits to comment in the right place under the question):
Phylogenesis's formula =SUM(A3:A65535) would also do the job, wouldn't it?
I didn't understand your question at first, because you talk of "sum of all the numbers for a particular row" but then you sum columns, don't you?
When I'm doing something like this, I prefer to not include any empty cells beneath the range I'm summing, because I've had errors in the past as the result of including them (usually because there's a cell way down in the column somewhere that I'm not expecting to have a value). I'm assuming that A & B are your column headers. Assuming that, here is how I would do it. This is your formula for cell A1:
=SUM(OFFSET(A$1,2,0,COUNTA(A$3:A$65535)))
Explanation
I'm updating this with a brief explanation, per the OP's request.
According to ExcelFunctions.net:
The Excel Offset function returns range of cells that is a specified number of rows and columns from an initial supplied range.
The function reference for OFFSET is:
=OFFSET(reference, rows, cols, [height], [width])
What this formula does is create a dynamic range based on the number of cells in the selection, relative to cell A$1. This is an offset of two rows and no columns, which starts the range at A$3. The height of the range is the total number of filled cells in the range A$3:A$65535. The assumption here is that there are no blank cells in the range, which there were not in the sample data.
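The effect of the "no blank cells" assumption is easy to see in a small model. The Python sketch below imitates the OFFSET/COUNTA combination on a plain list, where `None` stands for a blank cell; `sum_below_header` is a hypothetical helper, and the column contents follow the sample data from the question.

```python
def sum_below_header(column, first_data_row=3):
    """Model SUM(OFFSET(A$1, 2, 0, COUNTA(A$3:A$65535))) on a list of cells.

    `column` holds the cells from row 1 downward; None stands for a blank cell.
    """
    data = column[first_data_row - 1:]
    height = sum(1 for cell in data if cell is not None)  # COUNTA: non-blank cells
    window = data[:height]                                # OFFSET: `height` rows from row 3
    return sum(cell for cell in window if isinstance(cell, (int, float)))


# Row 1 holds the formula itself, row 2 the header, rows 3+ the data.
col_a = [None, "A", 6, 6, 6, 3]
col_b = [None, "B", 5, 10, 10, 5]
with_gap = [None, "A", 6, None, 6, 6, 3]  # same numbers as col_a, one blank in between

print(sum_below_header(col_a))     # 21
print(sum_below_header(col_b))     # 30
print(sum_below_header(with_gap))  # 18, the trailing 3 falls outside the range
```

A blank inside the data makes COUNTA undercount the height, so the last value is silently dropped; that is the trade-off for not summing the entire column.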

Need formula operating against a dynamic range copied across a series of cells

I'm creating a grid of correlation values, like a distance grid. I have a series of cells that each contain a formula whose ranges are easy to describe if you know the offset from the first cell, and I'm having trouble figuring out how to specify it.
In the upper left hand cell (R10), the formula is CORREL(C2:C21,C2:C21) -- it's 1, of course.
In the next column over (S10), the formula is CORREL(D2:D21,C2:C21).
In the next row down (R11), the formula is CORREL(C2:C21,D2:D21).
Of course, S11 would contain CORREL(D2:D21,D2:D21), which is also 1. And so on, for a roughly 15x15 grid.
Here's a graphical representation of the ranges involved:
C2:C21,C2:C21 C2:C21,D2:D21 C2:C21,E2:E21
D2:D21,C2:C21 D2:D21,D2:D21 D2:D21,E2:E21
E2:E21,C2:C21 E2:E21,D2:D21 E2:E21,E2:E21
Whenever I add a new data row, I have to manually update several formulas. So, I'd like the last non-blank row number (21, in this case) to be dynamically determined, such as with COUNTA(C:C). Ideally, I'd like the formula to calculate the row offsets, too, so that I can drag one formula across my entire range.
What's the best way to accomplish this? I think OFFSET might be a component in the solution, but I haven't had success getting it all to work together.
Using this simple setup per element of the corr matrix also helps:
=CORREL(INDIRECT("'Risk factors'!"&"T"&G6&":T"&H6);INDIRECT("'Risk factors'!"&"U"&G6&":U"&H6))
With this function I refer to data in another sheet, Risk factors, to correlate columns T and U with each other. I want the data ranges to be dynamic, so cells G6 and H6 in my current sheet hold the first and last row numbers of the columns, which I of course specify in those cells.
Hope this helps!
I found this formula, while wordy, achieved the desired results. In this example, the data lives in C2:O19. The table I wanted to construct computed the correlation values of all permutations of pairs of columns. Since there are 11 columns, the correlation pairs table is 11x11 and starts at R10. Each cell has the following formula:
=CORREL(INDIRECT(ADDRESS(2,2+(ROWS($R$10:R10)),4)&":"&ADDRESS(COUNTA($C:$C),
2+(ROWS($R$10:R10)),4)),INDIRECT(ADDRESS(2,2+(COLUMNS($R$10:R10)),4)&":"&
ADDRESS(COUNTA($C:$C),2+(COLUMNS($R$10:R10)),4)))
As I found out, INDIRECT() takes a text string, resolves it to a cell reference, and returns the contents of that reference.
Let's take a cell, say U12, and look at the range formula in detail. The first INDIRECT is the column given by applying the row offset from R10.
Since Row 12 is 2 rows down from Row 10, ADDRESS(2,2+(ROWS($R$10:U12)),4)&":"&ADDRESS(COUNTA($C:$C),2+(ROWS($R$10:U12)),4) should yield the column that's 2 columns right of Column C, which is E. The formula evaluates to E2:E19.
The second INDIRECT is the column given by applying the column offset from R10. Similarly, since Column U is 3 columns right of Column R, ADDRESS(2,2+(COLUMNS($R$10:U12)),4)&":"&ADDRESS(COUNTA($C:$C),2+(COLUMNS($R$10:U12)),4) should yield the column that's 3 columns right of Column C, which is F. The second formula evaluates to F2:F19.
Substituting these range reference values in, the cell formula reduces to =CORREL(INDIRECT("E2:E19"),INDIRECT("F2:F19")) and further to =CORREL(E2:E19,F2:F19), which is what I'd been using up till now.
Just like a distance table, this table is symmetrical along the diagonal, because =CORREL(E2:E19,F2:F19) equals =CORREL(F2:F19,E2:E19). Each value on the diagonal is 1, because CORREL of the same range is 100% correlation by definition.
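The address arithmetic in that formula can be checked with a short Python sketch. It only reproduces the mapping from a grid cell to the two range strings; the helper names are hypothetical, and `last_row` stands in for the value of COUNTA($C:$C), which is assumed to be 19 here as in the walkthrough.

```python
def column_letter(index):
    """1-based column index to letters, e.g. 3 -> "C", 27 -> "AA"."""
    letters = ""
    while index > 0:
        index, rem = divmod(index - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def column_index(letters):
    """Column letters to a 1-based index, e.g. "R" -> 18."""
    index = 0
    for ch in letters:
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def correl_ranges(cell_col, cell_row, last_row, anchor_col="R", anchor_row=10):
    """Return the two ranges that the grid cell's formula resolves to."""
    rows_span = cell_row - anchor_row + 1                              # ROWS($R$10:cell)
    cols_span = column_index(cell_col) - column_index(anchor_col) + 1  # COLUMNS($R$10:cell)
    first = column_letter(2 + rows_span)   # ADDRESS(..., 2+ROWS(...), 4)
    second = column_letter(2 + cols_span)  # ADDRESS(..., 2+COLUMNS(...), 4)
    return f"{first}2:{first}{last_row}", f"{second}2:{second}{last_row}"


print(correl_ranges("U", 12, 19))  # ('E2:E19', 'F2:F19')
print(correl_ranges("R", 10, 19))  # ('C2:C19', 'C2:C19')
```

The constant 2 is what turns a span of 1 into column 3 (C), so it would need to change if the data started in a different column.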

Absolute Row Reference

I'm attempting to create a formula that will always reference data in cells on row 5.
Current Formula =(OFFSET(INDIRECT(ADDRESS(ROW(),COLUMN())),SUM(ROW()-COUNT(ROW())+5),MOD(COLUMN()+1,-4)))
This is giving me the relative active cell address in the worksheet
=(OFFSET(INDIRECT(ADDRESS(ROW(),COLUMN()))
This is giving me the column offset. My data repeats in sets of 4 so the formula takes the current column adds one then divides by four. The remainder is the column offset I will use.
MOD(COLUMN()+1,-4)))
The section I'm having a problem with is the row reference. The data I need is always in row 5, so I was attempting to use the code below to find the current row, subtract the row count, and add five to land in row five. However, the code evaluates to 1, so either I'm overlooking something or this will not work.
Any help would be great.
SUM(ROW()-COUNT(ROW())+5)
Example.
If the active cell is C17, then I want to subtract 17 and add 5 to end up on C5. I am using this method because the active cell could be any cell in the worksheet.
As suggested by @Jerry, but in a version that can be dragged up or down and still refers to row 5:
=INDEX($5:$5, 0, COLUMN())

Excel: Find intersection of a row and a column

My question is how can I find an intersecting cell of a specific column and row number?
My situation is this: with some calculations I find two cells, lets say B6 and E1. I know that I need a row of the first one and a column of the second one. So I could just use ROW and COLUMN functions to get the numbers. After that, I need to find an intersecting cell. Which would be E6 in this example.
I would just use INDEX(A1:Z100;ROW;COLUMN) but I don't know the exact area that I'm going to need - it depends on other stuff. I could use something like A1:XFG65000 but that is way too lame. I could also use a combination of INDIRECT(ADDRESS()) but I'm pulling data from a closed workbook so INDIRECT will not work.
If this would help to know what is this all for - here's a concrete example:
I need to find limits of a section of a sheet that I would work with. I know that it starts from the column B and goes all the way down to the last non-empty cell in this column. This range ends with a last column that has any value in first row. So to define it - I need to find the intersection of this last column and the last row with values in B column.
I use this array formula to find the last column:
INDEX(1:1;MAX((1:1<>"")*(COLUMN(1:1))))
And this array formula to find the last row:
INDEX(B:B;MAX((B:B<>"")*(ROW(B:B))))
Last column results in E1 and last row results in B6. Now I need to define my range as B1:E6, so how can I get E6 out of all this to put into the resulting formula? I've been thinking for a while now and, not being an Excel expert, I couldn't come up with anything. So any help would really be appreciated. Thanks!
You can use an Index/Match combination and use the Match to find the relevant cell. Use one Match() for the row and one Match() for the column.
The Index/Match formula below finds the last cell in a sheet under these premises:
column B is the leftmost table column
row 1 is the topmost table row
data in column B and in row 1 can be a mix of text and numbers
there can be empty cells in column B and row 1
the last populated cell in column B marks the last row of the table
the last populated cell in row 1 marks the last column of the table
With these premises, the following will return correct results, used in a Sum() with A1 as the starting cell and Index to return the lower right cell of the range:
=SUM(A1:INDEX(1:1048576,MAX(IFERROR(MATCH(99^99,B:B,1),0),IFERROR(MATCH("zzzz",B:B,1),0)),MAX(IFERROR(MATCH(99^99,1:1,1),0),IFERROR(MATCH("zzzz",1:1,1),0))))
Since you seem to be on a system with the semicolon as the list delimiter, here is the formula with semicolons:
=SUM(A1:INDEX(1:1048576;MAX(IFERROR(MATCH(99^99;B:B;1);0);IFERROR(MATCH("zzzz";B:B;1);0));MAX(IFERROR(MATCH(99^99;1:1;1);0);IFERROR(MATCH("zzzz";1:1;1);0))))
Offset would seem to be the way to go
=OFFSET($A$1,ROW(CELL1)-1,COLUMN(CELL2)-1)
(The -1 is needed because we already have 1 column and 1 row in A1)
in your example, =OFFSET($A$1,ROW(B6)-1,COLUMN(E1)-1) would give the value in E6
There is also ADDRESS if you want the location: =ADDRESS(ROW(B6),COLUMN(E1)) gives the answer $E$6
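The idea behind both formulas is just "row of the first cell, column of the second". A minimal Python sketch of that, assuming plain A1-style addresses with optional $ signs (the helper names `split_address` and `intersection` are hypothetical):

```python
import re


def split_address(address):
    """Split an A1-style address such as "$E$6" or "B6" into (column letters, row)."""
    match = re.fullmatch(r"\$?([A-Z]+)\$?(\d+)", address)
    if match is None:
        raise ValueError(f"not an A1-style address: {address!r}")
    return match.group(1), int(match.group(2))


def intersection(row_cell, column_cell):
    """Like ADDRESS(ROW(row_cell), COLUMN(column_cell)): absolute address of the crossing cell."""
    _, row = split_address(row_cell)
    col, _ = split_address(column_cell)
    return f"${col}${row}"


print(intersection("B6", "E1"))  # $E$6
```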
The following webpage has a much easier solution, and it seems to work.
https://trumpexcel.com/intersect-operator-in-excel/
For example, in a cell, type simply: =C:C 6:6. Be sure to include one space between the column designation and the row designation. The result in your cell will be the value of cell C6. Of course, you can use more limited ranges, such as =C2:C13 B5:D5 (as shown on the webpage).
As I was searching for the answer to the same basic question, it astounded me that there is no INTERSECT worksheet function in Excel. There is an INTERSECT feature in VBA (I think), but not a worksheet function.
Anyway, the simple spacing method shown above seems to work, at least in straightforward cases.
