The first table below shows how much each person owes and who pays it (it's part of a larger model so I simplified it for our purposes here).
Our goal in the second table below is to give a sum when both the column and row value match.
For example: A (column C) paid $244.17 (D36:H48) in expenses for B (row 35).
Where am I wrong here? I have tried different methods suggested here.
This is another alternative, that only requires to extend the formula down, but not to the left, because on each row it returns an array with all column values. In cell I3 enter the following formula:
=MMULT(N(TRANSPOSE($A$3:$A$15=H3)),IF($B$3:$F$15="", 0, $B$3:$F$15))
or using LET to avoid repetition of the same range:
=LET(set, $B$3:$F$15, MMULT(N(TRANSPOSE($A$3:$A$15=H3)),IF(set="", 0, set))
Notes:
MMULT only works with numeric values, so empty cells need to be converted.
You can replace TRANSPOSE with TOROW if you want.
$-notation is not required in H3, because we extend the formula only down
Here is the output:
Note: This solution assumes header values to compare are the same, i.e. same values for Paid For (I2:M2) and Paid By (H3:H7). Which is the most common situation. That is why in the formula only Paid By column is used. If that is not the case, then the solution provided by #JB-007 is more flexible, because the values can be different, but then you need to extend the formula in both directions.
screenshot/s here refer:
=SUM($C$4:$E$6*($C$3:$E$3=C$8)*($B$4:$B$6=$B9))
(sumifs will really struglly to work across different dimensions)
PS - as you'll see most will advise sumproduct - I think it's overdue deprecation because there's very little (if anything) you can do with sumproduct that you cannot with sum. You can even do counts with sum SUM(1*($C$3:$E$3=C$8)*($B$4:$B$6=$B9))) returns the count of where these values are equivalent...
Save yourself the extra seven letters over and over! ☺
Related
Looking for something like a typical index match formula that can look to the right and return value to the left, but look at all columns in a range. Take below valid formula for example.
(Excel 2021.)
Finds A1's value in column D, and it returns value from column C.
=INDEX($C$1:$C$10,MATCH(A1,$D$1:$D$10,0))
In my ideal world I can Keep $D$1 and change $D$10 to $F$10 so it searches all columns D/E/F, and still returns C like below. However that does not work in our real world, any other ideas please? Thanks!
=INDEX($C$1:$C$10,MATCH(A1,$D$1:$F$10,0))
Update*
To clarify there are mix of letters and numbers. Also this table will be about 50k rows so hoping as simple as possible.
Also Column C will all be unique for sure, and D-F should be unique values but there is a chance a mistake and a few duplicates might be in.
You need MMUL() with INDEX(). Try below formula if you have Excel-365.
=FILTER(C1:C10,MMULT(--(D1:F10=A1),SEQUENCE(COLUMNS(D1:F1))))
For older version try
=INDEX($C$1:$C$10,LARGE(MMULT(--($D$1:$F$10=A1),TRANSPOSE({1,1,1}))*ROW($C$1:$C$10),1))
Since your INDEX/MATCH take from the same rows, you can first simplify your original search with
=XLOOKUP(A1,$D$1:$D$10,$C$1:$C$10)
XLOOKUP combines HLOOKUP and VLOOKUP with exact match being the default.
This will work for searching three rows
IFERROR(IFERROR(XLOOKUP(A1,$D$1:$D$10,$C$1:$C$10), XLOOKUP(A1,$E$1:$E$10,$C$1:$C$10)), XLOOKUP(A1,$F$1:$F$10,$C$1:$C$10))
We can name the columns colC, colD, colE, and colF and it becomes
IFERROR(IFERROR(XLOOKUP(A1,colD,colC), XLOOKUP(A1,colE,colC)), XLOOKUP(A1,colF,colC))
As with other lookups, this returns the first value or #N/A error.
This could be made more scalable for higher number of rows if we are allowed to add a column somewhere.
I am not familiar enough with excel to build this formula:
I am trying to write an excel formula to search for matching data in columns A and F, then report a value from column B that corresponds to the same row as the "found" A column into column G.
So basically, I'm putting the formula into G607 and I am searching, in column A for a value that matches F607.
If I find a matching value in A104, I want the value that reports in G607 to be B104.
My spreadsheet does not have any duplicate values in column A.
VLOOKUP is cool, but I would warn you away from it before you get comfortable with it. VLOOKUP is silly because:
It requires you to define an array where the unique IDs are ALWAYS the left column.
It requires you to COUNT COLUMNS LIKE A HEATHEN to tell it what the output is.
There are memory and other structural problems not worth going into.
You have to sort Column A if you care about exact matches. Amateur hour here.
Guys, I'm joking, I don't care that much, but seriously, Index/Match is cooler.
In G607, put this:
=INDEX(Sheet1!B:B,MATCH(Sheet1!F607,Sheet1!A:A,0))
Break it down:
INDEX() helps us by saying, "What is the answer you want? Cool, now tell me the row and column?" Obviously, if we knew the row/column, we wouldn't care.
Que the MATCH() Equation - this is where we say, "Hey, see F607? Yeah, find where it matches in Column A."
IF THERE WERE DUPLICATES IN COLUMN A, it would stop at the first entry and report that. Not a concern here since you don't have duplicates!
The 3rd argument (has a 0) in the MATCH equation just says "Hey, make it an exact match".
Index/Match like this makes sure that we can:
Pick any match column we want. VLOOKUP wouldn't work if Column B had the Unique IDs, and Column A was the answer instead.
Pick any output column we want without counting. Seriously, who counts in 2018?
Arrays are minimized b/c 1 and 2 above, plus other reasons.
No sorting required. Winner. Winner.
Assuming all cells are on the same sheet,
G607 = VLOOKUP(F607, A:B, 2, 0)
I'm working on data from a population of people with allergies. Each person has a unique ExceptionID, and each allergen has a unique AllergenID (451 in total).
I have a data table with 2 columns (ExceptionID and AllergenID), where each person's allergies are listed row by row. This means that the ExceptionID column has repeated values for people with multiple allergies, and the AllergenID column has repeated values for the different people who have that allergy.
I am trying to count how many times each pair of allergies is present in this population (e.g. Allergen#107 & Allergen#108, Allergen#107 & Allergen#109,etc). To keep it simple I've created a matrix of 451 rows X 451 columns, representing every pair (twice actually because A/B and B/A are equivalent).
I somehow need to use the row name (allergenID) to lookup the ExceptionID in my data table, and count the cases where that matches the ExceptionIDs from the column name (also AllergenID). I have no problem using Vlookup or Index/Match, but I'm struggling with the correct combination of a lookup and Sumproduct or Countif formula.
Any help is greatly appreciated!
Mike
PS I'm using Excel 2016 if that changes anything.
-=UPDATE=-
So the methods suggested by Dirk and MacroMarc both worked, though I couldn't apply the latter to my full data set (17,000+ rows) because it was taking a long time.
I've since decided to turn this into a VBA macro because we now want to see the counts of triplets instead of pairs.
With the 2 columns you start with, it is as good as impossible... You would need to check every ExceptionID to have 2 different specific AllergenID. Better use a helper-table with ExceptionID as rows and AllergenID as columns (or the opposite... whatever you like). The helper table needs a formula like:
=COUNTIFS($A:$A,$D2,$B:$B,E$1)
Which then can be auto-filled. (The ranges are from my example, you need to change them to your needs).
With this helper-matrix you can easily go for your bigger matrix like this:
=COUNTIFS(E:E,1,INDEX($E:$G,,MATCH($I2,$E$1:$G$1,0)),1)
Again, you can auto-fill with this formula, but you need to change it, so it fits your needs.
Because the columns have the same ID2 (would be your AllergenID), there is no need to lookup them because E:E changes automatically with the auto-fill.
Most important part of the formulas are the $ which should not be messed up, or you can not auto-fill it.
Picture of my self-made example (formulas are from the upper left cell in each table):
If you still have any questions, just ask :)
It can be done straight from your original set-up with array formulas:
Please note that array formulas MUST be entered with Ctrl-Shift-Enter, before copying across and down:
In the example pic, I have NAMED the data ranges $A$2:$A$21 as 'People' and $B$2:$B$21 as 'Allergens' to make it a nicer set-up. You can see in the formula bar how that looks as a formula. However you could use the standard references like this in your first matrix cell:
EDIT: silly me, N function is not needed to turn the booleans into 1's and 0's, since multiplying booleans will do the trick. Below formula works...
SUM(IF(MATCH($A$2:$A$21,$A$2:$A$21,0)=ROW($A$2:$A$21)-1, NOT(ISERROR(MATCH($A$2:$A$21&$E2,$A$2:$A$21&$B$2:$B$21,0)))*NOT(ISERROR(MATCH($A$2:$A$21&F$1, $A$2:$A$21&$B$2:$B$21,0))), 0))
Then copy from F2 across and down. It can be perhaps improved in technique with sumproduct or whatever, but it's just a rough example of the technique....
I have two excel sheets where I need to match three values to return a fourth. The similar columns are month, agent, and subdomain. The fourth column is called difference.
Concatenate would work, as per #MakeCents suggestion, but if you don't want a helper column, SUMPRODUCT would work.
example:
=SUMPRODUCT(--(A2:A12="d"),--(B2:B12="S"),--(C2:C12="Apr"),D2:D12)
would search range A2:A12 for "d", B2:B12 for "S" and C2:C12 for "Apr", and return the value fom D2:D12 that corresponds to where all 3 are true. If multiple lines match, it will add the value in D2:D12 for all matching rows.
The -- is used to change the True/False results into 0 and 1 for use in multiplication
Limitations of SUMPRODUCT
Recommended to specify the range explicitly; it will be slower with just
column references
(A1:A4000 is ok, A:A is not)
It will return an error if any of the values are errors
It will return numeric results only - text is evaluated as Zero
Although I believe #MakeCents comment / suggestion on how to do this is the way I would go since it is the simplest, you could accomplish this a different way (MUCH more processor-intensive, though) using the Index() and Match() functions and Array formulas.
For example, suppose your 3 columns of data you're looking to match against are columns A-C and you're looking to return the matching value from column D in Sheet1
Now, the 3 values you're looking to have matched are in cells A1, B1 & C1 of Sheet2, you could use the following formula:
=INDEX(Sheet1!D:D,MATCH(1,(Sheet1!A:A=A1)*(Sheet1!B:B=B1)*(Sheet1!C:C=C1),0))
And ENTER IT AS AN ARRAY FORMULA by pressing Ctrl + Shift + Enter
Hope this helps!
You are looking for a Lookup with multiple criteria.
One of the most robust options is
=INDEX(D:D,SUMPRODUCT(--(A:A="d"),--(B:B="S"),--(C:C="Apr"),ROW(D:D)),0)
It does not need to be entered as an array formula.
Taken from [1] (blogs.office.com).
See also this very complete answer, which summarizes this and other options for performing a lookup with multiple criteria.
PS1: Note that I used references to full columns, as per this.
PS2: This can be considered an enhancement to the solution by Sean for the case when the output column does not contain numbers.
References
[1] This post is written by JP Pinto, the winner of the Great White Shark Award given for the best article written about VLOOKUP during VLOOKUP Week.
Try this
=IF(A4=Data!$A$4:$A$741,IF(B4=Data!$B$4:$B$741,"Hai"))
TOP Table is Input, and bottom table is preview for required output.
For Each ID I need to find earliest datetime. I also need other information from other columns (please see image below).
My current solution is:
In Cell E2 =A2
Cell E3 drag down =IF(E2<>A3,IF(E1=A3,"",A3),"")
In Cell F2 drag down =IF(E2<>"",MIN(IF($A$2:$A$14=E2,$C$2:$C$14)),"") Ctrl+Shift+Enter
One more option without any intermediate calculations:
Select the whole range starting E2 and to the last row where IDs are located - for the sample given it's row 14, so select range E2:E14: =IFERROR(INDEX($A$2:$A$14,SMALL(IF(MATCH($A$2:$A$14,$A$2:$A$14,0)=ROW(INDIRECT("1:"&ROWS($A$2:$A$14))),MATCH($A$2:$A$14,$A$2:$A$14,0),""),ROW(INDIRECT("1:"&ROWS($A$2:$A$14))))),"") and press CTRL+SHIFT+ENTER instead of usual ENTER - this will define a Multicell ARRAY formula and will result in curly {} brackets around it (but do NOT type them manually!).
F2 (ID2): =IF(E2="","",SUMPRODUCT(--(E2=$A$2:$A$14),--(G2=$C$2:$C$14),$B$2:$B$14)) - normal formula.
G2 (Min Date): =IF(E2="","",MIN(IF(E2=$A$2:$A$14,$C$2:$C$14,2^100))) and press CTRL+SHIFT+ENTER instead of usual ENTER - this will define an ARRAY formula and will result in curly {} brackets around it (but do NOT type them manually!).
H2 (InCh): =IF(E2="","",INDEX($D$2:$D$14,SUMPRODUCT(--(E2=$A$2:$A$14),--(F2=$B$2:$B$14),--(G2=$C$2:$C$14),ROW(INDIRECT("1:"&ROWS($D$2:$D$14)))))) - normal formula.
Remarks:
To make the solution more compact and easy to read, define named range for ID column, and then reference other data columns using OFFSET.
ID2 values may not be unique - as they are on the sample for IDs 1...3.
Resulting set for Min Date should be formatted the same way as source Date row.
The key formula of the solution - is multicell monster which returns unique IDs without empty rows - as OP requested)
Sample file: https://www.dropbox.com/s/d2098updfh8djnf/MinDateIDs.xlsx
This is quite a challenge... I think I have found an approach that works. For the sake of clarity, I used a few helper columns. Also, I did not use any named ranges but stuck with the column-row indications. You might want to change that.
It looks like this:
and zooming in to the relevant columns:
Column F contains an array formula to filter out duplicates. An approach is explained here. The formula I used in F2 is
=INDEX($A$2:$A$14, MATCH(MIN(IF(COUNTIF($F$1:F1,$A$2:$A$14)=0, 1, MAX((COUNTIF($A$2:$A$14, "<"&$A$2:$A$14)+1)*2))*(COUNTIF($A$2:$A$14, "<"&$A$2:$A$14)+1)), COUNTIF($A$2:$A$14, "<"&$A$2:$A$14)+1, 0))
Use Ctrl-Shift-Enter to confirm as array formula. Drag this down or copy into column F. Then columns G and H contain the starting and ending indices of the duplicate ID values. This answer helped, please upvote it :-). The two formulas used are:
=MATCH(2,1/FREQUENCY($F2,$A$2:$A$14))
in G2, and
=FREQUENCY($A$2:$A$14,$F2)
in H2. Again, drag them down to get the full column filled. Next, column I is for clarification only -- and for sanity checking. It contains the desired minimum date from each sub-array. Column J substitutes that formula into a MATCH to find the actual index of the desired date.
=MIN(OFFSET($C$2:$C$14,$G2-1,0,1+$H2-$G2,1))
in I2 and
=$G2-1+MATCH(2,1/FREQUENCY(MIN(OFFSET($C$2:$C$14,$G2-1,0,1+$H2-$G2,1)), OFFSET($C$2:$C$14,$G2-1,0,1+$H2-$G2,1))
in J2. Finally, columns L, M and N index into the original set of data via
=INDEX(B$2:B$14,$J2)
in L2, which you can drag horizontally and then vertically.
When you are done, you can hide the helper columns, or fold everything into big formulas. Good luck with that... There might be an easier way to achieve this, but I did not find it.
If you want the value from column D in G then assuming that column C values are unique you could just use a VLOOKUP, i.e. in G2 copied down
=VLOOKUP(F2,C$2:D$14,2,0)
Per your picture, they're all in the same sheet. Just sort by ID, then Date (ascending). As you work your way down the ID column, each time the ID changes, you know you've found the row with the minimum Date for that specific ID. Create an extra column to signify where ID changes occur, and filter for those rows (hide the column if you so desire).
And... voila.
Know this link is old, but there is a much shorter and easier way!
How about using a pivot table using the Minimum as field setting and then do a =GETPIVOTDATA() to get the information back!
Seems a lot simpler as these formulas!
Actually, I just realized I've been overthinking this...Excel keeps the top item and removes all that follow when removing duplicates.
So if you are going to create an extra working table anyway, why not just copy the range/columns you want to keep, then use the basic sort.
Sort first by ID, then by the column you want as the second filter. Be sure the sorts are in the order you want (e.g. newest to oldest, oldest to newest, A to Z, Largest to smallest, etc).
Once the data is sorted, remove duplicates based on ID. You are left with all of your columns of data, filtered by newest/oldest/largest/smallest per individual.
This worked for my table with 30,000+ records, filtered down to 1500 unique individuals with most recent (plus associated amount), and with a second filter, the largest (plus associated date) for each person.