I have a list of names and numbers
NAME | Number
Joe | 1
Jane | 0
Jack | 1
Jill | 0
John | 1
I'm trying to look up the numbers and find out the corresponding name
The formula I have is
{=index($A$2:$B$6, SMALL(IF($B$2:$B$6 = 1, ROW ($B$2:$B$6)), Row(1:1)), 1)}
As I understand the formula:
First Excel runs the index function. It runs the index function on the array A2 through B6.
For the row number in the index function, it uses the function SMALL(IF($B$2:$B$6 = 1, ROW ($B$2:$B$6)), Row(1:1)). This examines an array, B2:B6, and if the element under consideration in B2:B6 is a 1, it returns the row number of that element. In this case, the first value it would return is a 2.
At this point I'm kind of stuck. I'm guessing that the second ROW function returns the first case of the 1 derived from the SMALL function.
Lastly, the index function finds the name located in column 1 for the index found.
Your understanding of this formula is pretty good. I assume that you are going to copy it down enough rows to get all the values reported? If so, here is what is happening:
INDEX needs to know what row to go retrieve. In order to do this, we are going to give it a row number.
In order to get a row number we need to know which items meet the condition. We use the IF conditional to report a row number if the condition is met (otherwise we get FALSE).
Since that will give us an array of row numbers, we then use the SMALL function to give us a single value. That satisfies the INDEX function which needs a single row to retrieve.
So which value do we choose from SMALL? Well, we just give it a sequence of 1-2-3-... by using ROW(1:1). When this is copied down, it will become ROW(2:2), ROW(3:3), etc. Each of these will return 1, 2, 3, respectively so we get the next entry. Note that SMALL skips FALSE so it works for the output of the IF call.
So the first call to ROW (inside the IF) is used to determine the row of the values in the array that match the condition.
The second call to ROW(1:1) is just used to get an incrementing sequence once the formula is copied down.
The final thing to note is that your formula will be off by one row on the answers because ROW($B$2:$B$6) will return the absolute row number of those rows and not one that is relative to the starting corner of the array of interest. In this case, you will need to subtract 1 to get it to work (since it starts in row 2). In the general case, use a formula like this which accounts for the offset of the array:
=INDEX($A$2:$A$6,SMALL(IF($B$2:$B$6=1,ROW($B$2:$B$6)-ROW($B$2)+1),ROW(1:1)))
That is an array formula like you have (enter with CTRL+SHIFT+ENTER). The corresponding ranges look like:
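For readers who think more easily in code, here is a rough Python sketch of what the corrected array formula computes. The lists and the helper name kth_match are made up for illustration and simply stand in for A2:A6 and B2:B6; only the logic is meant to mirror the formula.

```python
# Hypothetical stand-ins for A2:A6 and B2:B6 from the question.
names = ["Joe", "Jane", "Jack", "Jill", "John"]
flags = [1, 0, 1, 0, 1]
first_row = 2  # the ranges start in worksheet row 2


def kth_match(k):
    """Mimic INDEX(names, SMALL(IF(flags=1, ROW(flags)-ROW(first)+1), k))."""
    sheet_rows = range(first_row, first_row + len(flags))
    # IF(...): relative 1-based positions where the condition holds.
    # Non-matching rows are left out, which is how SMALL treats FALSE.
    positions = [row - first_row + 1
                 for row, flag in zip(sheet_rows, flags)
                 if flag == 1]
    if k > len(positions):
        return None  # Excel would show #NUM! here
    # SMALL(..., k) picks the k-th smallest position; INDEX fetches that entry.
    return names[sorted(positions)[k - 1] - 1]


# Copying the formula down corresponds to k = 1, 2, 3, ...
print([kth_match(k) for k in range(1, 5)])
```

Dropping the "- first_row + 1" adjustment reproduces the off-by-one problem described above, because the positions would then be worksheet rows (2, 4, 6) rather than positions inside the range.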
A | B | C | D
4 | 1 | 6 | 5649
3 | 8 | 10 | 9853
5 | 2 | 7 | 1354
I have two worksheets, for example column A in sheet 1 and columns B-D in sheet 2.
What I want to do is to take one value in column A, scan both columns B and C, and if it is between those two values, display the corresponding value from column D in a new worksheet.
There could be multiple matches for each of the cells in column A, and if there is no match, I want to skip it and not have anything displayed. Is there a way to code this and somehow create a loop to do all of column A? I tried using this formula, but I think it only compares values within the same row, which is not what I want.
=IF(AND([PQ.xlsx]Sheet1!$A2>=[PQ.xlsx]Sheet2!$B2,[PQ.xlsx]Sheet1!$A2<[PQ.xlsx]Sheet2!$C2),[PQ.xlsx]Sheet2!$D$2,"")
How do I do this?
Thank you.
I'm not positive if I understood exactly what you intended. In this sheet, I have taken each value in A:A and checked to see if it was between any pair of values in B:C, and then returned each value from D:D where that is true. I did keep this all on a single tab for ease of demonstration, but you can easily change the references to match your own layout. I tested in Excel and then transferred to this Google Sheet, but the functions should work the same.
https://docs.google.com/spreadsheets/d/1-RR1UZC8-AVnRoj1h8JLbnXewmzyDQKuKU49Ef-1F1Y/edit#gid=0
=IFERROR(TRANSPOSE(FILTER($D$2:$D$15, ($A2>=$B$2:$B$15)*($A2<=$C$2:$C$15))), "")
So what I have done is FILTERed column D on the two conditions that Ax is >= B:B and <= C:C, then TRANSPOSEd the result so that it lays out horizontally instead of vertically, and finally wrapped it in an error trap to avoid #CALC! where there are no results returned.
I added some random data to test with. Let me know if this is what you were looking at, or if I misunderstood your intent.
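If it helps to see the FILTER logic outside of a spreadsheet, here is a minimal Python sketch of the same idea. The three rows are the sample values from the small table in the question; the function name matches and the extra test value 11 are my own additions for illustration.

```python
# Rows standing in for columns B, C and D (low bound, high bound, result).
ranges = [(1, 6, 5649), (8, 10, 9853), (2, 7, 1354)]


def matches(a):
    """Return every D value whose [B, C] interval contains a (inclusive)."""
    # Equivalent in spirit to FILTER(D, (a >= B) * (a <= C)).
    return [d for low, high, d in ranges if low <= a <= high]


for a in (4, 3, 5, 11):
    # An empty list plays the role of the "" that IFERROR produces.
    print(a, matches(a))
```

Each value from column A is checked against every row of B:C, not just the row it sits on, which is the difference from the IF(AND(...)) attempt in the question.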
SUPPORT FOR EXCEL VERSIONS WITHOUT DYNAMIC ARRAY FUNCTIONS
You can duplicate this effect with array functions in pre-dynamic array versions of Excel. This is an array formula, so it has to be finished with CTRL+SHIFT+ENTER. Put it in F2, press CTRL+SHIFT+ENTER, and then drag it to fill F2:O15:
=IFERROR(INDEX($D$2:$D$15, SMALL(IF(($A2>=$B$2:$B$15)*($A2<=$C$2:$C$15), ROW($A$2:$A$15)-MIN(ROW($A$2:$A$15))+1), COLUMNS($F$2:F2))),"")
Reformatted for easier explanation:
=IFERROR(
    INDEX(
        $D$2:$D$15,
        SMALL(
            IF(
                ($A2>=$B$2:$B$15)*($A2<=$C$2:$C$15),
                ROW($A$2:$A$15) - MIN(ROW($A$2:$A$15))+1
            ),
            COLUMNS($F$2:F2)
        )
    ),
"")
From the inside out: ROW($A$2:$A$15) creates an array from 2 to 15, and subtracting MIN(ROW($A$2:$A$15)) and then adding 1 rebases it so that no matter which row the range starts in, the numbers start from 1. So ROW($A$2:$A$15) - MIN(ROW($A$2:$A$15))+1 returns an array from 1 to 14.
We use this as the second argument in the IF clause, what to return if TRUE. For the first argument, the logical conditions, we take the same two conditions from the original formula: ($A2>=$B$2:$B$15)*($A2<=$C$2:$C$15). As before, this returns an array of true/false values. So the output of the entire IF clause is an array that consists of the row numbers where the conditions are true or FALSE where the conditions aren't met.
Take that array and pass it to SMALL. SMALL takes an array and returns the kth smallest value from the array. You'll use COLUMNS($F$2:F2) to determine k. COLUMNS returns the number of columns in the range, and since the first cell in the range reference is fixed and the second cell is relative, the range will expand when you drag the formula to the right. What this will do is give you the 1st, 2nd, ... kth row numbers that contain matches, since SMALL ignores the FALSE values. Once k exceeds the number of matches, SMALL returns an error, which is why the whole formula is wrapped in IFERROR.
Finally, we pass the range with the numbers we want to return (D2:D15 in this case) to INDEX along with the row number we got from SMALL, and INDEX will return the value from that row.
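The walk-through above can be sketched in Python, one helper per spreadsheet function. All names here (if_array, small, cell) are hypothetical and the data is the three-row sample from the question; this is only meant to show how the pieces fit together, not how Excel implements them.

```python
def if_array(a, lows, highs):
    """IF(cond, relative_row): the row number where true, False otherwise."""
    return [i if low <= a <= high else False
            for i, (low, high) in enumerate(zip(lows, highs), start=1)]


def small(values, k):
    """SMALL skips the False entries; asking past the end is an error."""
    numbers = sorted(v for v in values if v is not False)
    if not 1 <= k <= len(numbers):
        raise ValueError("#NUM!")
    return numbers[k - 1]


def cell(a, lows, highs, results, k):
    """One cell of the dragged formula; k plays the role of COLUMNS($F$2:F2)."""
    try:
        # INDEX(results, row) with a 1-based row number.
        return results[small(if_array(a, lows, highs), k) - 1]
    except ValueError:
        return ""  # IFERROR(..., "")


lows, highs, results = [1, 8, 2], [6, 10, 7], [5649, 9853, 1354]
print(if_array(4, lows, highs))
print([cell(4, lows, highs, results, k) for k in (1, 2, 3)])
```

Dragging the formula one column to the right corresponds to increasing k by one, so each cell picks up the next match until the matches run out and the error trap leaves the cell blank.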
So FILTER is a lot simpler to look at, but you can get it done in an older version. This will also work in Google Sheets, and I added a second tab there with this formula, but array formulas work a little differently there. Instead of using CTRL+SHIFT+ENTER to indicate an array formula, Sheets just wraps the formula in ARRAYFORMULA(). Other than that, the two formulas are the same.
Since FALSE values aren't considered, it will skip those.
I am trying to run a formula that does the following:
I have three columns, an account number, recorded amount, and the actual amount. What I'm trying to do is this, if the actual amount is not equal to the recorded amount, I want to pull that line, including the account number, recorded amount, and actual amount, and put it into a separate sheet. I'm trying to get this to happen over the span of about 100 rows. So it would look like this:
Account | Recorded Amount | Actual Amount
-----------------------------------------
Company | $356 | $356
Company | $569 | $569
Company | $700 | $705 ** Doesn't match
Company | $300 | $320 ** Doesn't match
##Now since the third and fourth rows don't match their respective columns
##The data is then extracted into a separate sheet.
**Separate Spreadsheet**
Account | Recorded Amount | Actual Amount
-----------------------------------------
Company | $700 | $705
Company | $300 | $320
I've tried using Vlookup and Match functions, but can't seem to figure this one out. Any help would be appreciated!
Attempts:
Attempting to use IF statement, the problem I encounter is not being able to return the whole row. I can return a specific cell but not the entire row.
=IF(E5=D5,A5:E5,"") * give a #VALUE error
=IF(E5=D5,E5) * returns selected cell
=VLOOKUP(E15=D15,D15:E299,2,FALSE) * #N/A
Tried using it across a sequence, but it'll only return the first cell that is selected, in this case, it would just return 'Company'. I could run this for each row but that's a lot of effort and code to run that piece of code across multiple columns and rows. It's also not scalable.
The main problem I'm having is capturing the entire row. I can extract the value of a specific cell if it matches, but not the entire row of data. I would also accept that Excel is not capable of this. I was able to generate the required results in a couple of lines of code in Python but in Excel, I'm not as fluent and I'm unsure of what path to take.
Suggested solution
If I understand you correctly, the solution could be the following.
Sheet1 contains the source data:
Sheet2 contains a table with calculated data: only those rows that differ in the values Recorded and Actual:
Cells A2:C9 of Sheet2 contain formulas. This range has the same size as the source data on Sheet1. Sheet2 A2 contains this formula:
{=IFERROR(INDEX('Sheet1'!$A$2:$C$9,LARGE(N('Sheet1'!$B$2:$B$9<>'Sheet1'!$C$2:$C$9)*(ROW('Sheet1'!$B$2:$B$9)-ROW('Sheet1'!$B$1)),SUM(N('Sheet1'!$B$2:$B$9<>'Sheet1'!$C$2:$C$9))-ROWS(A$1:A1)+1),COLUMNS($A1:A1)),"")}
The formula is copied to the other cells up to C9. You may adjust cell references to your needs.
Note that this is an array formula. Omit the curly braces and enter the formula by pressing Ctrl + Shift + Enter.
Explanation
I will try my best...
Let's start with a slightly more readable version of the formula.
{=IFERROR(
    INDEX(
        'Sheet1'!$A$2:$C$9,
        LARGE(
            N('Sheet1'!$B$2:$B$9<>'Sheet1'!$C$2:$C$9) * (ROW('Sheet1'!$B$2:$B$9)-ROW('Sheet1'!$B$1)),
            SUM(N('Sheet1'!$B$2:$B$9<>'Sheet1'!$C$2:$C$9)) - ROWS(A$1:A1) + 1
        ),
        COLUMNS($A1:A1)
    ),
    ""
)}
Main definitions
INDEX
Returns the value of an element in a table or an array, selected by the row and column number indexes.
Usage is INDEX(array, rowNumber, columnNumber).
Example: if D6 contains Hello World! then INDEX(C3:E20, 4, 2) returns Hello World! (2nd cell in 4th row in the given range)
LARGE
Returns the k-th largest value in a data set. You can use this function to select a value based on its relative standing. For example, you can use LARGE to return the highest, runner-up, or third-place score.
Usage is LARGE(array, k).
Example: LARGE({1,5,5,9,2,7,0,1}, 2) = 7 (7 is the second largest value)
Breakdown of the formula
1) Find row numbers
It all starts with a comparison of the columns B and C.
'Sheet1'!$B$2:$B$9<>'Sheet1'!$C$2:$C$9
Remember that this is an array formula. Thus the result of this comparison is an array containing boolean values.
{FALSE,TRUE,FALSE,FALSE,TRUE,TRUE,FALSE,TRUE}
In the next step the boolean-array is multiplied with the relative row numbers. ROW('Sheet1'!$B$2:$B$9) returns the absolute row numbers: {2,3,4,5,6,7,8,9}. The position of the heading ROW('Sheet1'!$B$1) is subtracted. We get the relative row numbers {1,2,3,4,5,6,7,8}.
Both arrays are multiplied.
N('Sheet1'!$B$2:$B$9<>'Sheet1'!$C$2:$C$9) * (ROW('Sheet1'!$B$2:$B$9) - ROW('Sheet1'!$B$1))
Replaced with values:
N({FALSE,TRUE,FALSE,FALSE,TRUE,TRUE,FALSE,TRUE}) * ({2,3,4,5,6,7,8,9} - 1)
Resolved:
{0,1,0,0,1,1,0,1} * {1,2,3,4,5,6,7,8}
The resulting array contains the relative row numbers of those rows that differ in B and C.
{0,2,0,0,5,6,0,8}
2) Arrange row numbers in desired order
The result of the LARGE function is passed as row number parameter to the INDEX function. We want the INDEX function to return errors (discussed later) for rows with equal values in columns B and C. Thus we have to implement some weird logic to calculate the k parameter for the LARGE function.
LARGE(
    N('Sheet1'!$B$2:$B$9<>'Sheet1'!$C$2:$C$9) * (ROW('Sheet1'!$B$2:$B$9)-ROW('Sheet1'!$B$1)),
    SUM(N('Sheet1'!$B$2:$B$9<>'Sheet1'!$C$2:$C$9)) - ROWS(A$1:A1) + 1
)
The SUM counts the rows having differences in columns B and C (4 in this example), then the currently viewed row ROWS(A$1:A1) is subtracted and 1 is added. We get the following values for the k parameter of LARGE: 4, 3, 2, 1, 0, -1, -2, -3.
LARGE({0,2,0,0,5,6,0,8}, k)
The resulting values are:
2, 5, 6, 8, #NUM!, #NUM!, #NUM!, #NUM!
3) Pick the values
The INDEX function references the source data 'Sheet1'!$A$2:$C$9. Row numbers are the values we just calculated with LARGE, and column number is the currently viewed column COLUMNS($A1:A1).
For the first target row INDEX returns the values of the second source row, for the second target row the values of the 5th source row, and so on. From the 5th target row onwards we don't want to display anything. If we used 2, 5, 6, 8, 0, 0, 0, 0 for the row numbers, INDEX would write unwanted values in the 5th to 8th lines. This is why we wanted LARGE to return #NUM! once the differing rows are used up. If INDEX is passed #NUM! then it also returns #NUM!. Finally, we can handle these cases with IFERROR(..., "") and get empty cells.
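To double-check the numbers in this breakdown, here is a small Python sketch that replays the three steps with the boolean array from the example. The helper large is a hypothetical stand-in for Excel's LARGE, including its error when k is out of range.

```python
# B<>C for the eight source rows, as in the worked example.
differs = [False, True, False, False, True, True, False, True]

# Step 1: N(differs) * relative row number.
marked = [int(d) * i for i, d in enumerate(differs, start=1)]


def large(values, k):
    """k-th largest value; an out-of-range k mimics #NUM!."""
    if not 1 <= k <= len(values):
        raise ValueError("#NUM!")
    return sorted(values, reverse=True)[k - 1]


# Step 2: k = SUM(N(differs)) - ROWS(A$1:A1) + 1 for each target row.
count = sum(differs)
row_numbers = []
for target_row in range(1, len(differs) + 1):
    k = count - target_row + 1
    try:
        row_numbers.append(large(marked, k))
    except ValueError:
        row_numbers.append("")  # what IFERROR(..., "") leaves behind

print(marked)
print(row_numbers)
```

Because k counts down from the number of differing rows, LARGE hands back the matching row numbers in ascending order and never reaches the zeros.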
That's it.
My question is that I want to return a list of values in column B in sheet 2 (in this case NBA players) where column A in sheet 2 contains the value "PG" from cell A3 in sheet 1. Not only do I want it to match "PG", but I also want the value to have a salary (column C) that is between $7100 (cell B2 in Sheet 1) and $8000 (cell C2 in Sheet 1). Any help would be appreciated.
You are either going to need to use an array formula or a function that performs array-like calculations. I will suggest using the AGGREGATE function. Avoid using full column/row references within an array formula or a function performing array-like calculations, or you may wind up bogging down your system with excessive calculations.
The AGGREGATE function is made up of several individual functions, and depending on which one you choose, it will perform array operations. I am going to suggest function 15 (SMALL). What the following example will do is generate a list of results sorted from smallest to largest that ignores error values, then return the first value from the list. (Function 14 is LARGE, which works through the same list from largest to smallest instead.) The thing we will list is the row number for a row that matches ALL your criteria. So the basics of AGGREGATE look like this:
AGGREGATE(Function #, Error/hidden handling #, Formula, parameter)
The hardest part of this is coming up with the right formula. In the numerator you put the thing you are looking for. In the denominator you place your TRUE/FALSE condition checks. Separate each condition check with *, which will act as an AND function. The thing that makes this work is that TRUE/FALSE convert to 1/0 when they are sent through a math operation. So anything you do not want is FALSE, and anything divided by FALSE becomes a divide by 0, which in turn generates an error. Since AGGREGATE is set to ignore errors, only things that meet your conditions will exist in the list, and since they are being divided by TRUE, which is 1, your thing remains unchanged. So the AGGREGATE function is going to start to look like:
AGGREGATE(15,6,ROW(some range)/((Condition 1)*(Condition 2)*...*(Condition N)),1)
So as alluded to before, 15 sets AGGREGATE to work through the list in ascending order (14 would give descending order). 6 tells AGGREGATE to ignore errors, and the 1 tells AGGREGATE to return the first item in its sorted list. If it was 2 instead of 1 it would return the 2nd position. If you ask for a position that is greater than the number of items in the list, there will be an error produced by AGGREGATE which does not get ignored.
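The divide-by-condition trick may be easier to follow in code. Below is a loose Python analogue, assuming function 15 (SMALL) with option 6 (ignore errors). The function name, the sample positions and the sample salaries are all invented for illustration, and the real AGGREGATE has many more options than this sketch covers.

```python
def aggregate_small_ignore_errors(numerators, denominators, k):
    """Rough analogue of AGGREGATE(15, 6, numerators/denominators, k)."""
    survivors = []
    for n, d in zip(numerators, denominators):
        try:
            survivors.append(n / d)  # dividing by TRUE (1) keeps the value
        except ZeroDivisionError:
            continue  # dividing by FALSE (0) is an error, so it is ignored
    survivors.sort()
    if not 1 <= k <= len(survivors):
        raise ValueError("#NUM!")  # asking past the end is not ignored
    return survivors[k - 1]


# Hypothetical data for worksheet rows 2 to 6.
rows = [2, 3, 4, 5, 6]
positions = ["PG", "SG", "PG/SG", "C", "PG"]
salaries = [75000, 72000, 90000, 78000, 71000]

# (Condition 1) * (Condition 2) * (Condition 3): booleans multiply to 1 or 0.
conditions = [(p[:2] == "PG") * (s <= 80000) * (s >= 71000)
              for p, s in zip(positions, salaries)]

print(conditions)
print(aggregate_small_ignore_errors(rows, conditions, 1))
print(aggregate_small_ignore_errors(rows, conditions, 2))
```

Only rows whose combined condition is 1 survive the division, so asking for k = 1, 2, ... walks through the matching row numbers in order.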
So now that there is some understanding of what AGGREGATE does, let's see how we can apply this to your data. For starters, let's assume your data is in rows 2:100 and row 1 is a header row. You will have to adjust the references to suit your data.
CONDITION 1
LEFT($A$2:$A$100,2)="PG"
Checks to see if the first two characters are PG. Based on the data in your screenshot, PG was either to the left of the / or was the only entry. There was also an observation that there was only one / in the cells of column A. If you also need to check whether it comes after the /, and with the assumption that it can only be on one side and not both at the same time, you could use this alternative for your condition check:
(LEFT($A$2:$A$100,2)="PG")+(RIGHT($A$2:$A$100,2)="PG")
In this case the + is performing the task of an OR function. The caveat mentioned earlier is important because if both sides are TRUE then you wind up with TRUE+TRUE which becomes 1+1 which is 2, and we only want to divide by 1 or 0. Though to counter that you could go with the following, where comparing the sum with 0 turns both 1 and 2 back into TRUE for each row:
((LEFT($A$2:$A$100,2)="PG")+(RIGHT($A$2:$A$100,2)="PG")>0)
CONDITION 2
Check that the salary in C is less than or equal to 80000.
($C$2:$C$100<=80000)
CONDITION 3
Check that the salary in C is greater than or equal to 71000.
($C$2:$C$100>=71000)
Now let's put this all together to get a list of row numbers that meet your conditions. Note that the whole product of conditions sits inside one pair of brackets so that it forms the denominator:
AGGREGATE(15,6,ROW($A$2:$A$100)/(((LEFT($A$2:$A$100,2)="PG")+(RIGHT($A$2:$A$100,2)="PG")>0)*($C$2:$C$100<=80000)*($C$2:$C$100>=71000)),ROW(A1))
Now provided I did not screw up the bracketing in that formula, you can place that formula in a cell and copy it down until it produces errors. As you copy it down, the only thing that will change is the A1 in ROW(A1). It acts like a counter: 1, 2, 3, etc., so you will get a list of row numbers that meet your criteria. Now we need to convert those row numbers to names.
To find the names, the INDEX function is your friend here. Because it is not part of an array formula or inside a function performing array-like calculations, a full column reference can be used. So we take our formula that is generating row numbers and place it inside the INDEX function to give:
INDEX(B:B,Row Number)
INDEX(B:B,AGGREGATE(15,6,ROW($A$2:$A$100)/(((LEFT($A$2:$A$100,2)="PG")+(RIGHT($A$2:$A$100,2)="PG")>0)*($C$2:$C$100<=80000)*($C$2:$C$100>=71000)),ROW(A1)))
Now if you hate seeing error codes when you have copied down further than there are results, you can place the whole thing inside an IFERROR function to give:
IFERROR(formula,What to display in case of an error)
So for blank entries:
IFERROR(INDEX(B:B,AGGREGATE(15,6,ROW($A$2:$A$100)/(((LEFT($A$2:$A$100,2)="PG")+(RIGHT($A$2:$A$100,2)="PG")>0)*($C$2:$C$100<=80000)*($C$2:$C$100>=71000)),ROW(A1))),"")
and for a custom message:
IFERROR(INDEX(B:B,AGGREGATE(15,6,ROW($A$2:$A$100)/(((LEFT($A$2:$A$100,2)="PG")+(RIGHT($A$2:$A$100,2)="PG")>0)*($C$2:$C$100<=80000)*($C$2:$C$100>=71000)),ROW(A1))),"NOT FOUND")
So now you just need to adjust the references to suit your data. If your data is located on another sheet remember to include the sheet name. A reference to B3:C4 would become:
Sheet1!B3:C4
and if the sheet name has a space in it:
'Space Name'!B3:C4
I have a dataset in Excel where I would like a formula to find the most frequent observation (from column B to column F) for each row. However, if there are any ties, there are two tie-breakers, ranked in the following order. The first tie-breaker is that if the number 4 is tied as the most frequent observation in any row, the result in that row should be 4. The second tie-breaker is that if there is a tie (where 4 is not tied for the most frequent observation), it should show the value in column G.
In the picture below I have made a rough sketch of (to the left) the data I have now and (to the right) the outcome I want.
Picture of dataset:
What formula would I need to write, in order to get the result i would like?
Thanks in advance,
Anders
See if this works for you:
=IF(ISNA(MODE.MULT(MyData)),IF(ISNA(MATCH(4,MyData,0)),Fruit,4),IF(ISERR(INDEX(MODE.MULT(MyData),2)),MODE.MULT(MyData),IF(ISNA(MATCH(4,MODE.MULT(MyData),0)),Fruit,4)))
entered as an array formula CTRL-SHIFT-ENTER.
Here MyData is a placeholder for a row of data. In your example, MyData will be a single row from columns B-F; for case A, MyData={1,1,1,1,2}. Fruit is a placeholder for the corresponding value from column G. You can replace MyData with B2:F2 and Fruit with G2, then copy and paste to other locations.
Here's how it works. The formula uses Excel's MODE.MULT function, which returns as many mode values as there are in the data.
MODE.MULT returns #N/A when there are no repeated elements in MyData. This is the situation for your cases D and E. This means there is an N-way tie, so we need to apply the tie-breaking rules. This is done by using the MATCH function to see if 4 is found in MyData; if it is, return 4, otherwise return Fruit.
If MyData has repeated elements, MODE.MULT returns an array containing the mode or modes found. If there is no tie, MODE.MULT returns a single element; otherwise the array will have at least two elements. To test for ties, we attempt to access the 2nd element of the array with INDEX(MODE.MULT(MyData),2). This will throw an error if there is no tie.
If there is no tie, we detect the resulting error with ISERR and return the result of MODE.MULT.
If there is a tie, no error occurs. In that case, we use MATCH to look for 4 in the results of MODE.MULT. If 4 is found, we return 4; if not, we return Fruit.
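As a cross-check of the branching, here is a short Python sketch of the same decision rules. The function name pick and the sample rows are hypothetical. Unlike MODE.MULT, the sketch treats a row with no repeated values as an N-way tie directly instead of going through an #N/A branch, which leads to the same result.

```python
from collections import Counter


def pick(row, fallback, preferred=4):
    """Most frequent value, with the two tie-breakers from the question."""
    counts = Counter(row)
    top = max(counts.values())
    # All values that share the highest frequency (the "modes").
    modes = [value for value, n in counts.items() if n == top]
    if len(modes) == 1:
        return modes[0]  # a single mode, no tie
    # Tie: prefer 4 if it is among the tied values, else use column G.
    return preferred if preferred in modes else fallback


print(pick([1, 1, 1, 1, 2], "G"))
print(pick([1, 1, 4, 4, 2], "G"))
print(pick([1, 1, 2, 2, 3], "G"))
print(pick([1, 2, 3, 4, 5], "G"))
```

Here "G" stands in for whatever value column G holds on that row.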
Hope that helps.
@xidgel: This is a great answer, but you'll also have to account for the case where all observations are the same.
=IF(ISNA(MODE.MULT(MyData)),
    IF(ISNA(MATCH(4,MyData,0)),Fruit,4),
    IF(ROWS(MODE.MULT(MyData))<2,
        IF(AND(COUNTIF(MyData,"<>"&MODE.MULT(MyData))=0,MODE.MULT(MyData)<>4),Fruit,MODE.MULT(MyData)),
        IF(ISNA(MATCH(4,MODE.MULT(MyData),0)),Fruit,4)))
entered as an array formula CTRL-SHIFT-ENTER.
I have an Excel file like the following:
and I would like to replace the values of votes and avgscore in those rows with Print = 1 with the rows I have in another file, which looks like the following:
The index number in the second file is off by exactly 1, and there are also \N's with values from rows with Print = 0, so I cannot use replace and then VLOOKUP.
Would appreciate any help on this.
It looks like the variables that I'll call Design and Author uniquely identify your observations, so it would probably make more sense to look up Votes and AvgScore based on those values rather than trying to use the row-1 value in col A of the second file.
In your main file, make a column Votes_new with the formula (in this example, for row 2707):
{=INDEX(SecondFile!$A$1:$H$11, MATCH(1, (SecondFile!$B$1:$B$11=$A2707)*(SecondFile!$C$1:$C$11=$B2707)*(SecondFile!$G$1:$G$11=1), 0), 4)}
Use the same formula with 5 instead of 4 in the last argument in a column for AvgScore_new.
This formula matches on three criteria: Design (equal to the value in the current row), Author (equal to the value in the current row), and Print (equal to 1). It gets the value from the 4th column of the data in the second file, which is Votes (or the 5th column, which is AvgScore).
Note that you have to enter as an array formula using Ctrl+Shift+Enter.
Also note that this will return a #N/A error if there is no match (either because the Design-Author combination is not present or because Print is not equal to 1) so you may want to enclose this in an IFERROR() formula.
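For anyone more comfortable with code, the MATCH(1, (condition)*(condition)*(condition), 0) idiom amounts to "find the first row where every condition is true". Here is a minimal Python sketch; the records and the function name lookup are invented for illustration, with keys named after the columns described above.

```python
# Hypothetical rows standing in for the second file.
second_file = [
    {"design": "D1", "author": "ann", "votes": 10, "avgscore": 3.5, "print": 0},
    {"design": "D1", "author": "bob", "votes": 42, "avgscore": 4.1, "print": 1},
    {"design": "D2", "author": "ann", "votes": 7, "avgscore": 2.9, "print": 1},
]


def lookup(design, author, field):
    """First record matching design, author and print = 1, else None."""
    for rec in second_file:
        # Multiplying booleans gives 1 only when all three are true,
        # which is the value MATCH(1, ..., 0) searches for.
        hit = (rec["design"] == design) * (rec["author"] == author) * (rec["print"] == 1)
        if hit == 1:
            return rec[field]  # INDEX(..., row, column)
    return None  # corresponds to #N/A


print(lookup("D1", "bob", "votes"))
print(lookup("D1", "ann", "votes"))
```

The second call finds no match because that record has print = 0, which is the case where the Excel formula would return #N/A.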
Alternatively, if you really want to look up by row reference, you can use =VLOOKUP(ROW()-1, ...) to look up the current row (e.g., 2707) minus 1 (2706) from the second file, and then find the value of Print: =VLOOKUP(ROW()-1, SecondFile!$A$1:$H$11, 7, FALSE). You can use this in an IF() function to look up a different column if the value is one: e.g., =IF(VLOOKUP(ROW()-1, SecondFile!$A$1:$H$11, 7, FALSE)=1, VLOOKUP(ROW()-1, SecondFile!$A$1:$H$11, 4, FALSE), "Value not found")