Cross table comparisons, sumproduct - excel

I am trying to compare two different Excel (2010/xlsx) tables with related data to find matches. They would be on different sheets but in the same workbook (not that it should affect the problem).
I think the best route is some combination of sumproduct, match, and index... but I haven't been able to get them to work so far. I see the main question (cell G17) being solved by creating a subset of rows from Table 2 to compare against their corresponding data in Table 1 (index/match), then using arrays to do a multiple criteria selection to count how many match the criteria I chose (sumproduct).
I have played around with vlookup, countif(s), and sumif(s) but haven't seen a good way to apply them to this problem.

You can use SUMIF as a "quasi-lookup" like this
=SUMPRODUCT((file="doc")*(modified < SUMIF(user,creator,create)))

I'm not sure how to do it in a single cell as you've asked, but I would create an extra column in the second table which uses vlookup to find the created date, and another column containing whether or not the created date is greater than the modified date. Finally, you could use countif to combine them.
To be more concrete, in your example, I would put =vlookup(F3,A$3:D$5,2,FALSE) in cell I3, and =I3>H3 into cell J3, and expand both of these down. Then cell G17 could be given by =countif(J3:J5,TRUE).

Related

VLOOKUP text with two criteria

I'm looking for a way to insert a column based on two criteria, as illustrated below. I have a main table with one row per company, and I want to add a column to this with the city names. However, the lookup table has two rows for some companies - one for "small" and one for "large". I'm only interested in retrieving the cities for companies that have size value "small".
I know that I can achieve this with =SUMIFS if the content of the column was a number instead of text. However, with the cities column consisting of text, I don't know how to proceed. I'd ideally like a solution where I don't have to use a helper column.
Edit: this is just an example of my data. I have hundreds of rows,the duplicate answer suggested uses INDEX/MATCH which requires me to give the exact cell location of each condition. This is not the case in my data.
There are a few solutions that I usually use for these tasks. They're not elegant i.e. not a 2-criteria look-up per se, but they get the job done.
Going by your data structure, you have these choices:
Sort your lookup table by size-company, with size in descending order. Thereafter, it's a straightforward vlookup since your big companies are seggregated from small ones.
Build a new key consisting of company-size i.e. CONCAT(company,size) and do the vlookup based on this key.
It's not possible with VLOOKUP. Look my solution in the picture using a array formula.
Solution using array formulas
Formula in F2: =INDEX($C$1:$C$6;SUM(IF(E2=$A$2:$A$6;1)*IF($B$2:$B$6="small";1)*ROW($C$2:$C$6));1)
Ps: don't forget to confirm the formula with Ctrl+Shift+Enter.
Multi-column lookups are certianly possible but not using VLOOKUP. You'll need to use INDEX and MATCH. This becomes pretty complex as it combines array formulas with boolean logic. Here's a nice explanation.
https://exceljet.net/formula/index-and-match-with-multiple-criteria
For your example, assuming Desired Result Company is in column I.
=INDEX($F$4:$F$5,MATCH(1,(D4:D5=I4)*(E4:E5="small"),0))

Excel VLOOKUP with multiple possible options in table array

I have two lists, the first is a set of users. The second list contains different encounter dates for these users.
I need to identify the date that is within 10 days of the "Renew Date" [Column C], but not before. With Member 1 this would be row 3 1/8/2017. With Member 2 this would be row 6, 1/21/2017.
Now using a VLOOKUP which the user before me who managed this spreadsheet obviously isn't viable as it's simply going to pickup the first date that has a matching Member ID. Is there a way to do this in Excel at all?
I have included a link to a sample file and a screenshoit of the sample data.
https://drive.google.com/file/d/0B5kjsJZFrUgFcUFTNFBzQzN4cm8/view?usp=sharing
To avoid the slowness and complexities of array formulas, you can try with SUMIFS but the problem is that if you have more than one match, it will add them, not return the first match. The sum will look like an aberration. Will work however if you are guaranateed that you have only one match in the data.
An alternative is also to use AVERAGEIFS, which, in case of multiple matches, will give you their average and it will look like a valid date and a good result. Enter this formula in D2 and fill down the column:
D2:
=AVERAGEIFS(G:G,F:F,A2,G:G,">="&C2,G:G,"<="&C2+10)
and don't forget to format column D as Date.
Try this
=SUMPRODUCT($G$2:$G$7,--($F$2:$F$7=A2),--($G$2:$G$7<=C2+10),--($G$2:$G$7>C2))
Format the result as date. See screenshot (my system uses DMY order)
Don't use whole column references with this formula. It will slow down the workbook.

Sumproduct or Countif on a 2D matrix

I'm working on data from a population of people with allergies. Each person has a unique ExceptionID, and each allergen has a unique AllergenID (451 in total).
I have a data table with 2 columns (ExceptionID and AllergenID), where each person's allergies are listed row by row. This means that the ExceptionID column has repeated values for people with multiple allergies, and the AllergenID column has repeated values for the different people who have that allergy.
I am trying to count how many times each pair of allergies is present in this population (e.g. Allergen#107 & Allergen#108, Allergen#107 & Allergen#109,etc). To keep it simple I've created a matrix of 451 rows X 451 columns, representing every pair (twice actually because A/B and B/A are equivalent).
I somehow need to use the row name (allergenID) to lookup the ExceptionID in my data table, and count the cases where that matches the ExceptionIDs from the column name (also AllergenID). I have no problem using Vlookup or Index/Match, but I'm struggling with the correct combination of a lookup and Sumproduct or Countif formula.
Any help is greatly appreciated!
Mike
PS I'm using Excel 2016 if that changes anything.
-=UPDATE=-
So the methods suggested by Dirk and MacroMarc both worked, though I couldn't apply the latter to my full data set (17,000+ rows) because it was taking a long time.
I've since decided to turn this into a VBA macro because we now want to see the counts of triplets instead of pairs.
With the 2 columns you start with, it is as good as impossible... You would need to check every ExceptionID to have 2 different specific AllergenID. Better use a helper-table with ExceptionID as rows and AllergenID as columns (or the opposite... whatever you like). The helper table needs a formula like:
=COUNTIFS($A:$A,$D2,$B:$B,E$1)
Which then can be auto-filled. (The ranges are from my example, you need to change them to your needs).
With this helper-matrix you can easily go for your bigger matrix like this:
=COUNTIFS(E:E,1,INDEX($E:$G,,MATCH($I2,$E$1:$G$1,0)),1)
Again, you can auto-fill with this formula, but you need to change it, so it fits your needs.
Because the columns have the same ID2 (would be your AllergenID), there is no need to lookup them because E:E changes automatically with the auto-fill.
Most important part of the formulas are the $ which should not be messed up, or you can not auto-fill it.
Picture of my self-made example (formulas are from the upper left cell in each table):
If you still have any questions, just ask :)
It can be done straight from your original set-up with array formulas:
Please note that array formulas MUST be entered with Ctrl-Shift-Enter, before copying across and down:
In the example pic, I have NAMED the data ranges $A$2:$A$21 as 'People' and $B$2:$B$21 as 'Allergens' to make it a nicer set-up. You can see in the formula bar how that looks as a formula. However you could use the standard references like this in your first matrix cell:
EDIT: silly me, N function is not needed to turn the booleans into 1's and 0's, since multiplying booleans will do the trick. Below formula works...
SUM(IF(MATCH($A$2:$A$21,$A$2:$A$21,0)=ROW($A$2:$A$21)-1, NOT(ISERROR(MATCH($A$2:$A$21&$E2,$A$2:$A$21&$B$2:$B$21,0)))*NOT(ISERROR(MATCH($A$2:$A$21&F$1, $A$2:$A$21&$B$2:$B$21,0))), 0))
Then copy from F2 across and down. It can be perhaps improved in technique with sumproduct or whatever, but it's just a rough example of the technique....

Get current row in from sumif statement

I'm using Excel 2010 and have a datasheet with multiple tabs, this is the current SUMIF function that I am using
=IF(SUMIF('Master Data'!$J$2:$J$200,'Resource View (2)'!B22,'Master Data'!$W$2:$W$200)>0,Current_row_different column, "")
Basically, I find some rows in the other sheet that have a value of 1 I then want to use these rows that have a value of 1 but use a different column within that row to populate the true condition.
So say for instance row 4 contains a 1 at column A I then want to stay on row 4 but get the value of column B for the true condition.
Is this possible?
EDIT:
I have now managed to get the function working however its a bit of an Excel hack, because I have had to move the columns around in the master data and its a bit messy now anyway here's what I've got
=IF(SUMIF('Master Data'!$C$2:$C$200,'Resource View (2)'!B22,'Master Data'!$W$2:$W$200)>0,VLOOKUP(B22,'Master Data'!$C$2:$F$90,3,FALSE),"")
Now I know this is because VLOOKUP searches through the first column specified and it doesn't seem to work at all if i try to put the range in backwards i.e. $F$90:$C$2 instead of $C$2:$F$90. Is there any way to manipulate VLOOKUP to work like this?
Yes, actually there is a way - a brother of VLOOKUP - it's INDEX(MATCH()). It is a very useful tool, as you can look at any column and return any other, not only looking at the first one to return the ones to the right. Also, INDEX(MATCH()) can be used in forming array formulas, if you need that, while VLOOKUP cannot.
Basically, this part of your formula:
VLOOKUP(B22,'Master Data'!$C$2:$F$90,3,FALSE)
would be changed with this:
INDEX('Master Data'!$E$2:$E$90,MATCH(B22,'Master Data'!$C$2:$C$90,FALSE))
So, after all, the equivalent for
=IF(SUMIF('Master Data'!$C$2:$C$200,'Resource View (2)'!B22,'Master Data'!$W$2:$W$200)>0,VLOOKUP(B22,'Master Data'!$C$2:$F$90,3,FALSE),"")
would be
=IF(SUMIF('Master Data'!$C$2:$C$200,'Resource View (2)'!B22,'Master Data'!$W$2:$W$200)>0,INDEX('Master Data'!$E$2:$E$90,MATCH(B22,'Master Data'!$C$2:$C$90,FALSE)),"")

Using SUMIFS with multiple AND OR conditions

I would like to create a succinct Excel formula that SUMS a column based on a set of AND conditions, plus a set of OR conditions.
My Excel table contains the following data and I used defined names for the columns.
Quote_Value (Worksheet!$A:$A) holds an accounting value.
Days_To_Close (Worksheet!$B:$B) contains a formula that results in a number.
Salesman (Worksheet!$C:$C) contains text and is a name.
Quote_Month (Worksheet!$D:$D) contains a formula (=TEXT(Worksheet!$E:$E,"mmm-yy"))to convert a date/time number from another column into a text based month reference.
I want to SUM Quote_Value if Salesman equals JBloggs and Days_To_Close is equal to or less than 90 and Quote_Month is equal to one of the following (Oct-13, Nov-13, or Dec-13).
At the moment, I've got this to work but it includes a lot of repetition, which I don't think I need.
=SUM(SUMIFS(Quote_Value,Salesman,"=JBloggs",Days_To_Close,"<=90",Quote_Month,"=Oct-13")+SUMIFS(Quote_Value,Salesman,"=JBloggs",Days_To_Close,"<=90",Quote_Month,"=Nov-13")+SUMIFS(Quote_Value,Salesman,"=JBloggs",Days_To_Close,"<=90",Quote_Month,"=Dec-13"))
What I'd like to do is something more like the following but I can't work out the correct syntax:
=SUMIFS(Quote_Value,Salesman,"=JBloggs",Days_To_Close,"<=90",Quote_Month,OR(Quote_Month="Oct-13",Quote_Month="Nov-13",Quote_Month="Dec-13"))
That formula doesn't error, it just returns a 0 value. Yet if I manually examine the data, that's not correct. I even tried using TRIM(Quote_Month) to make sure that spaces hadn't crept into the data but the fact that my extended SUM formula works indicates that the data is OK and that it's a syntax issue. Can anybody steer me in the right direction?
You can use SUMIFS like this
=SUM(SUMIFS(Quote_Value,Salesman,"JBloggs",Days_To_Close,"<=90",Quote_Month,{"Oct-13","Nov-13","Dec-13"}))
The SUMIFS function will return an "array" of 3 values (one total each for "Oct-13", "Nov-13" and "Dec-13"), so you need SUM to sum that array and give you the final result.
Be careful with this syntax, you can only have at most two criteria within the formula with "OR" conditions...and if there are two then in one you must separate the criteria with commas, in the other with semi-colons.
If you need more you might use SUMPRODUCT with MATCH, e.g. in your case
=SUMPRODUCT(Quote_Value,(Salesman="JBloggs")*(Days_To_Close<=90)*ISNUMBER(MATCH(Quote_Month,{"Oct-13","Nov-13","Dec-13"},0)))
In that version you can add any number of "OR" criteria using ISNUMBER/MATCH
You can use DSUM, which will be more flexible. Like if you want to change the name of Salesman or the Quote Month, you need not change the formula, but only some criteria cells. Please see the link below for details...Even the criteria can be formula to copied from other sheets
http://office.microsoft.com/en-us/excel-help/dsum-function-HP010342460.aspx?CTT=1
You might consider referencing the actual date/time in the source column for Quote_Month, then you could transform your OR into a couple of ANDs, something like (assuing the date's in something I've chosen to call Quote_Date)
=SUMIFS(Quote_Value,"<=90",Quote_Date,">="&DATE(2013,11,1),Quote_Date,"<="&DATE(2013,12,31),Salesman,"=JBloggs",Days_To_Close)
(I moved the interesting conditions to the front).
This approach works here because that "OR" condition is actually specifying a date range - it might not work in other cases.
Quote_Month (Worksheet!$D:$D) contains a formula (=TEXT(Worksheet!$E:$E,"mmm-yy"))to convert a date/time number from another column into a text based month reference.
You can use OR by adding + in Sumproduct. See this
=SUMPRODUCT((Quote_Value)*(Salesman="JBloggs")*(Days_To_Close<=90)*((Quote_Month="Cond1")+(Quote_Month="Cond2")+(Quote_Month="Cond3")))
ScreenShot
Speed
SUMPRODUCT is faster than SUM arrays, i.e. having {} arrays in the SUM function. SUMIFS is 30% faster than SUMPRODUCT.
{SUM(SUMIFS({}))} vs SUMPRODUCT(SUMIFS({})) both works fine, but SUMPRODUCT feels a bit easier to write without the CTRL-SHIFT-ENTER to create the {}.
Preference
I personally prefer writing SUMPRODUCT(--(ISNUMBER(MATCH(...)))) over SUMPRODUCT(SUMIFS({})) for multiple criteria.
However, if you have a drop-down menu where you want to select specific characteristics or all, SUMPRODUCT(SUMIFS()), is the only way to go. (as for selecting "all", the value should enter in "<>" + "Whatever word you want as long as it's not part of the specific characteristics".
In order to get the formula to work place the cursor inside the formula and press ctr+shift+enter and then it will work!
With the following, it is easy to link the Cell address...
=SUM(SUMIFS(FAGLL03!$I$4:$I$1048576,FAGLL03!$A$4:$A$1048576,">="&INDIRECT("A"&ROW()),FAGLL03!$A$4:$A$1048576,"<="&INDIRECT("B"&ROW()),FAGLL03!$Q$4:$Q$1048576,E$2))
Can use address / substitute / Column functions as required to use Cell addresses in full DYNAMIC.

Resources