How do you calculate the Quintile for groups of rows in Excel?

I found a partial answer to this question under How do you calculate the Quintile for every row in Excel?. I would like to derive the same quintile data for each row but I need the quintiles to be based on groups that are determined by a value in another column.

Use this formula:
=MAX(1,ROUNDUP(10*PERCENTRANK($C:$C,$C2,4),0))
Change the value 10 to the number of groups you need; as written it produces deciles, so use 5 for quintiles. Change $C:$C to the column that holds the target values, change $C2 to the first cell in that column with a target value, and autofill down.
The output is a group number for each row: 1 is the group with the lowest values, and larger group numbers correspond to higher values.
If you mean just row numbers as your ranking criteria, you could insert a new column in A and autofill down numbers that correspond to your row references.
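Note that the formula above ranks each value against the whole column, so on its own it does not restrict the ranking to rows that share a group. As a way to sanity-check the per-group logic outside Excel, here is a minimal Python sketch. It assumes rows arrive as (group, value) pairs and that PERCENTRANK behaves like "count of values strictly below, divided by n-1" within each group; the function name `grouped_buckets` and the sample data are hypothetical, and the rounding to 4 digits in the formula is skipped.

```python
from collections import defaultdict

def grouped_buckets(rows, k=5):
    """Return a bucket number (1..k) per row, ranked within the row's own group.

    Assumption: percent rank = (values strictly below) / (n - 1), which mirrors
    PERCENTRANK for values present in the data; tie handling may differ in Excel.
    """
    by_group = defaultdict(list)
    for group, value in rows:
        by_group[group].append(value)

    result = []
    for group, value in rows:
        values = by_group[group]
        n = len(values)
        if n < 2:
            # Choice made for this sketch: a lone row goes into bucket 1.
            result.append(1)
            continue
        below = sum(1 for v in values if v < value)
        # Integer ceiling of k * below / (n - 1), the ROUNDUP(...,0) step.
        bucket = -(-(k * below) // (n - 1))
        result.append(max(1, bucket))  # the MAX(1, ...) step
    return result

# Made-up sample: group "A" has five rows, group "B" has two.
rows = [("A", 10), ("A", 20), ("A", 30), ("A", 40), ("A", 50), ("B", 5), ("B", 15)]
print(grouped_buckets(rows, k=5))
```

Each group is ranked independently, so the smallest value in every group lands in bucket 1 and the largest in bucket k.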

Related

Return value in a specific column based on array formula result

I am using an array formula to find the largest value within a given month. Photo below ('Range of values') shows the values for each month, and I am selecting September as the criteria ("K5" in the formula below). Cell K5 will change based on the specific month I am looking for:
Formula:
{=MAX(IF(E8:P8=K5,E12:P21,""))}
Result:
63,490
Requirement:
I need to have a cell with the result, and a cell next to it with the corresponding 'Location', which in this case would be 6.
What would be the best formula to use in the case where you are not confined to a single column for the value lookup?
As I understand it, typical VLOOKUP and INDEX/MATCH are limited because they require a single column to look up a value, and this array falls outside that scope. I may be overthinking it, but I don't know.
If you have Excel 365 (current channel), you can use this formula to return both values:
=LET(data,A1:D5,
selectMonth,B7,
dataMonth,CHOOSECOLS(data,1,MATCH(selectMonth,CHOOSEROWS(data,1),0)),
TAKE(SORT(dataMonth,2,-1),2))
The basic idea is to first reduce the data range to the first column and the month that should be evaluated.
Then sort that range in descending order by the month's values (column 2 of the new range) and take the first 2 rows (including the header), since the first data row then holds the max value.
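The same idea (keep the label column and the selected month, then pick the row with the largest value) can be sketched in Python. This is only an illustration: the function name `max_for_month` and the table layout (header row first, location in the first column) are assumptions, and all numbers except the 63,490 at location 6 from the question are made up.

```python
def max_for_month(data, month):
    """Return (location, value) for the largest value in the given month column."""
    header, *body = data
    col = header.index(month)  # plays the role of MATCH(selectMonth, header row, 0)
    best = max(body, key=lambda row: row[col])  # top row after a descending sort
    return best[0], best[col]

# Hypothetical table: first column is the location, remaining columns are months.
data = [
    ["Location", "Aug", "Sep", "Oct"],
    [1, 100, 200, 50],
    [6, 300, 63490, 10],
    [3, 5, 7, 9],
]
print(max_for_month(data, "Sep"))
```

Picking the maximum directly avoids a full sort, but the result matches taking the first data row of the descending sort.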

Is there a way to spill fill a column based on count of nonblank cells in adjacent column in Excel?

I am looking to find a way to fill a whole column with the same output, "Yes", based on the number of cells in the adjacent column.
For example, if there's data in A2:A10, I would like B2:B10 to be filled with "Yes". If more data is added to column A, I'd like column B to automatically update / spill the "Yes" into additional rows within B based on the number of entries added to column A.
I'm aware that I can do an =IF(ISBLANK()) statement for each row, but I am trying to reduce the number of formulas. I'd like to try and do this with a single formula within the top row of column B that spills down.
The value in column A can change, I'm only trying to check the number of non-blank values.
I'm using Excel / Office 365.
This is a generic solution for column A containing mixed datatypes:
=REPT("Yes",A2:INDEX(A:A,MAX(IFNA(MATCH(IF({0;1},"Ω",77^77),A:A),0)))<>"")
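To make the logic of that formula easier to follow, here is a rough Python equivalent. It assumes the column is passed as a list of cell values starting at A2, with None or an empty string standing in for blank cells; the function name `spill_yes` is hypothetical.

```python
def spill_yes(column):
    """Return "Yes" for each non-blank cell, "" for blanks, up to the last used cell."""
    def is_blank(cell):
        return cell is None or cell == ""

    # Index of the last non-blank cell, like the MAX(MATCH(...)) part of the formula.
    last = max((i for i, cell in enumerate(column) if not is_blank(cell)), default=-1)
    # REPT("Yes", TRUE) gives "Yes" and REPT("Yes", FALSE) gives "".
    return ["" if is_blank(cell) else "Yes" for cell in column[: last + 1]]

print(spill_yes(["a", 3, None, "x", None, None]))
```

Blank cells inside the used range produce an empty string, and nothing is emitted past the last used cell.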

min & max value in range in loop

I've run into a big problem with my macro. I have data that contains columns with quantities and values of stock like this:
What I'm trying to achieve is:
to go through every row until the very last, locate quantities (columns with the letter Q) and values (columns with the letter V) below 0, then add these below-zero quantities to the maximum quantity within the row, and add these below-zero values as well.
to find values within age category for every row that have no corresponding quantity (see cell B4 as example) and add these values to the maximum value within the row.
Why VBA for something you can achieve with a formula?
Let me show you how I calculate the maximum of a list of cells, referring to a column, whose name starts with a "V":
=MAXIFS(A2:F2,A1:F1,"V*")
Explanation:
Take the maximum of the values on the second row (A2:F2)
The criteria you need to take into account refer to the first row (A1:F1)
The criterion is that the header should start with a "V".
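For readers who want to check the result outside Excel, the same selection can be sketched in Python. The function name and the sample headers and numbers are invented for illustration; the prefix test is case-insensitive to mimic Excel wildcard matching, and returning None when nothing matches is a choice of this sketch.

```python
def max_where_header_starts_with(headers, values, prefix="V"):
    """Maximum of the values whose header starts with the prefix (case-insensitive)."""
    picked = [
        value
        for header, value in zip(headers, values)
        if str(header).upper().startswith(prefix.upper())
    ]
    return max(picked) if picked else None

headers = ["Q1", "V1", "Q2", "V2", "Q3", "V3"]  # stands in for row A1:F1
values = [5, 120, -2, 340, 7, -15]               # stands in for row A2:F2
print(max_where_header_starts_with(headers, values))
```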

Using LOOKUP Functions with <= and >=

I am attempting to use the LOOKUP functions in Excel in a nested(?) fashion and with ranges of data. In the attached picture, the left-hand table is my data that extends for another 360 rows or so. Each row has a unique ID (I've taken this data from a larger set so I wanted to retain it), a State postal abbreviation, and the income level for that data point (each row is data from a different zipcode).
The table on the right is the metadata - quintile levels for income in each state. For each row on the left, I want to look up the state abbreviation from the metadata, then use the adjacent income level to determine and print out the appropriate quintile based on that row in the metadata. I anticipate that the solution would use some form of the lookup functions and inequalities, but I'll take any solution.
For this approach you need Office 365 with the new XMATCH function, which can do an approximate match for the next larger number without requiring the data to be sorted.
The formula is
=INDEX($J$1:$N$1,XMATCH(C2,INDEX(J:J,MATCH(B2,H:H,0),0):INDEX(N:N,MATCH(B2,H:H,0)),1))
If you don't have XMATCH, you would need to rearrange the lookup columns from highest to lowest. Then you can use
=INDEX($J$1:$N$1,MATCH(C2,INDEX(J:J,MATCH(B2,H:H,0),0):INDEX(N:N,MATCH(B2,H:H,0)),-1))
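The lookup these formulas perform can be sketched in Python as follows. This is an illustration under assumptions: each state's row of the metadata is taken to hold five income thresholds, the labels stand in for the header row $J$1:$N$1, and the state codes and numbers are made up. The function name `quintile_for` is hypothetical.

```python
def quintile_for(state, income, table, labels):
    """Return the label of the smallest threshold >= income for the given state.

    Mirrors XMATCH(..., 1): exact match or next larger item, no sorting required.
    Returns None where Excel would show #N/A.
    """
    thresholds = table.get(state)
    if thresholds is None:
        return None  # state not found in the metadata
    candidates = [(t, i) for i, t in enumerate(thresholds) if t >= income]
    if not candidates:
        return None  # income is above every threshold
    _, index = min(candidates)
    return labels[index]

labels = ["Q1", "Q2", "Q3", "Q4", "Q5"]
table = {
    "CA": [30000, 55000, 85000, 130000, 250000],
    "OH": [20000, 40000, 60000, 90000, 180000],
}
print(quintile_for("CA", 60000, table, labels))
```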
If you paste this formula on cell D4, would the result be your expected output?
(Highest Quintile for California)
=INDEX(N:N,MATCH(B4,B:B,0))
Paste this to D5
(Lowest Quintile for Ohio)
=INDEX(J:J,MATCH(B5,B:B,0))
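The last 0 (zero) in the formula is the match type. It can be replaced with:
1 - largest value less than or equal to the lookup value (lookup range sorted ascending)
0 - exact match
-1 - smallest value greater than or equal to the lookup value (lookup range sorted descending)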
depending on your need. Did I get your point?
Your information is a bit sparse, so I'll tell you what I did and you can take it from there.
First I created a named range to comprise 2 columns of your median income table, state and income. Then I created this formula to extract the income by state.
=VLOOKUP($B2,Income,2,FALSE)
Observe that the state name is in column B and the income in the 2nd column of the Income range. Your list may be structured differently. The key to it is that the Income range must have the State in its first column and the 2 in the formula just counts columns from State to Income.
If you place this formula on the same sheet as the Income range it will just produce a copy of the Income column. But that isn't what I did. I placed it in a blank column on the Quintile tab. That happened to be column J, since A:G is taken up with your data, notably, C:G with the columns for quintile numbers. Observe in this transfer that column B displays the state abbreviations. By coincidence it's column B in both sheets. The relevant column is the one on the Quintile sheet. So, the formula still shows the median income for each state but the sequence is determined by the sequence of state names on the tab where the formula resides, and that is the requirement here.
Next, I created this formula and placed it in column K of the Quintile sheet.
=MATCH(J2,C2:G2,1)
This formula determines the column in C:G where the value in J2 is matched. J2, of course, contains the Median income drawn by the VLOOKUP. If that number is nothing it will be interpreted as zero and the lowest quintile returned. Read up on the precise method of the MATCH function.
Now J2 can be integrated into the formula. I did that in a copy of the MATCH formula in column L.
[L2] =MATCH(VLOOKUP($B2,Income,2,FALSE),C2:G2,1)
Observe that the formula in K and L have the same result. I copied them down a few rows to make sure. I got a lot of #N/A errors in this exercise resulting from state abbreviations in the Quintile sheet not being found in the Income range. I think that information is useful. Therefore I didn't suppress it.
The result so far is the quintile, numbered from 1 to 5. I wanted to make sure that the numbers are correct. Therefore I "translated" them to the number in the Quintile table.
For this purpose I created another named range, called "Quintiles", comprising columns C:G. It's important that this range should start in row 1. The columns could be any other columns (not C:G) but they must be the same as specified in the formula, with the lowest quintile being the first. And this became my formula in column M.
[M2] =INDEX(Quintiles,ROW(),L2)
If you actually need this number you can replace the reference to L2 in the formula with the formula in L2.

Need formula operating against a dynamic range copied across a series of cells

I'm creating a grid of correlation values, like a distance grid. I have a series of cells that each contain a formula whose ranges are easy to describe if you know the offset from the first cell, and I'm having trouble figuring out how to specify it.
In the upper left hand cell (R10), the formula is CORREL(C2:C21,C2:C21) -- it's 1, of course.
In the next column over (S10), the formula is CORREL(D2:D21,C2:C21).
In the next row down (R11), the formula is CORREL(C2:C21,D2:D21).
Of course, S11 would contain CORREL(D2:D21,D2:D21), which is also 1. And so on, for a roughly 15x15 grid.
Here's a graphical representation of the ranges involved:
C2:C21,C2:C21 C2:C21,D2:D21 C2:C21,E2:E21
D2:D21,C2:C21 D2:D21,D2:D21 D2:D21,E2:E21
E2:E21,C2:C21 E2:E21,D2:D21 E2:E21,E2:E21
Whenever I add a new data row, I have to manually update several formulas. So, I'd like the last non-blank row number (21, in this case) to be dynamically determined, such as with COUNTA(C:C). Ideally, I'd like the formula to calculate the offsets, too, so that I can drag one formula across my entire range.
What's the best way to accomplish this? I think OFFSET might be a component in the solution, but I haven't had success getting it all to work together.
Using this simple setup per element of the corr matrix also helps:
=CORREL(INDIRECT("'Risk factors'!"&"T"&G6&":T"&H6);INDIRECT("'Risk factors'!"&"U"&G6&":U"&H6))
With this function I refer to data in another sheet, Risk factors, to correlate columns T and U with each other. I want the data ranges to be dynamic, so cells G6 and H6 in my current sheet hold the first and last row numbers of the range, which I of course specify in those cells.
Hope this helps!
I found this formula, while wordy, achieved the desired results. In this example, the data lives in C2:O19. The table I wanted to construct computed the correlation values of all permutations of pairs of columns. Since there are 11 columns, the correlation pairs table is 11x11 and starts at R10. Each cell has the following formula:
=CORREL(INDIRECT(ADDRESS(2,2+(ROWS($R$10:R10)),4)&":"&ADDRESS(COUNTA($C:$C),
2+(ROWS($R$10:R10)),4)),INDIRECT(ADDRESS(2,2+(COLUMNS($R$10:R10)),4)&":"&
ADDRESS(COUNTA($C:$C),2+(COLUMNS($R$10:R10)),4)))
As I found out, INDIRECT() takes a text string that describes a reference and returns the reference itself, so the text built by ADDRESS() can be used as a range.
Let's take a cell, say U12, and look at the range formula in detail. The first INDIRECT is the column given by applying the row offset from R10.
Since Row 12 is 2 rows down from Row 10, ROWS($R$10:U12) is 3, so ADDRESS(2,2+(ROWS($R$10:U12)),4)&":"&ADDRESS(COUNTA($C:$C),2+(ROWS($R$10:U12)),4) should yield column number 5, the column that's 2 columns to the right of Column C, which is E. The formula evaluates to E2:E19.
The second INDIRECT is the column given by applying the column offset from R10. Similarly, since Column U is 3 columns right of Column R, COLUMNS($R$10:U12) is 4, so ADDRESS(2,2+(COLUMNS($R$10:U12)),4)&":"&ADDRESS(COUNTA($C:$C),2+(COLUMNS($R$10:U12)),4) should yield column number 6, the column that's 3 columns to the right of Column C, which is F. The second formula evaluates to F2:F19.
Substituting these range reference values in, the cell formula reduces to =CORREL(INDIRECT("E2:E19"),INDIRECT("F2:F19")) and further to =CORREL(E2:E19,F2:F19), which is what I'd been using up till now.
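The address arithmetic in this walkthrough can be double-checked with a short Python sketch. The helper names `col_letter` and `range_text` are hypothetical; the constants (first data row 2, base column number 2, last row 19) are taken from the example above.

```python
def col_letter(number):
    """Convert a 1-based column number to its letter(s): 1 -> A, 27 -> AA."""
    letters = ""
    while number > 0:
        number, remainder = divmod(number - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters

def range_text(count, last_row, first_row=2, base_col=2):
    """Build the text that ADDRESS(...)&":"&ADDRESS(...) produces.

    count is ROWS($R$10:cell) or COLUMNS($R$10:cell); last_row plays COUNTA($C:$C).
    """
    letter = col_letter(base_col + count)
    return f"{letter}{first_row}:{letter}{last_row}"

# Cell U12 relative to the anchor R10: ROWS = 3, COLUMNS = 4.
print(range_text(3, 19), range_text(4, 19))
```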
Just like a distance table, this table is symmetrical along the diagonal, because =CORREL(E2:E19,F2:F19) equals =CORREL(F2:F19,E2:E19). Each value on the diagonal is 1, because CORREL of the same range is 100% correlation by definition.
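The symmetry and the diagonal of ones are easy to confirm with a small Python sketch of the same grid. It assumes the data is available as a list of equal-length columns; the function names and the sample numbers are made up, and the Pearson coefficient is computed by hand rather than through any spreadsheet function.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

def corr_matrix(columns):
    """Grid where cell [i][j] correlates column i with column j."""
    return [[pearson(a, b) for b in columns] for a in columns]

columns = [[1, 2, 3, 4], [2, 4, 6, 9], [4, 3, 2, 1]]  # invented sample data
for row in corr_matrix(columns):
    print([round(value, 3) for value in row])
```

Because the grid is symmetric, only the cells on or above the diagonal need to be computed if recalculation time becomes a concern.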
