Getting the sum of entries that match the conditions from 2 columns - excel

I have this setup in my worksheet, where column A contains the conditions I need to look out for, column B contains the total number of records that fit the conditions, and columns D and E contain the records I need to count.
What I want is a formula for column B that counts the records fitting both of the conditions I need.
For example, in the case of B3 there are 2 instances of Dog Red in column E where column D contains the word small, and for B4 there is only 1 instance of Dog Red where column D contains the word large.
The same conditions apply to the rest of column B.
Is there a way to compute the total number of records that fit the 2 conditions in a single cell?

The formula below will work for B3:B4. For B6:B7 the reference to A$2 would have to be changed manually.
=COUNTIFS($E$3:$E$9,A$2,$D$3:$D$9,"*"&A3&"*")
It should be possible to extend the formula to be able to find the right color dog in column A so that you can copy the formula down all the way without change. I haven't done this here because I feel your example isn't representative of your final worksheet. For the moment, please just take note that your arrangement of count criteria makes referencing them difficult. Perhaps a better way of displaying them can be found.
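To check the logic of the COUNTIFS formula outside Excel, here is a minimal Python sketch. The rows are hypothetical (the original worksheet is not shown) and were chosen only to be consistent with the counts described in the question; the helper name countifs is also just an illustration.

```python
# Hypothetical (column D, column E) rows; the real worksheet is not shown.
rows = [
    ("small box", "Dog Red"),
    ("small crate", "Dog Red"),
    ("large box", "Dog Red"),
    ("small box", "Dog Blue"),
    ("large crate", "Cat Red"),
]

def countifs(rows, label, keyword):
    """Count rows where column E equals `label` and column D contains `keyword`.

    Mirrors COUNTIFS(E, label, D, "*" & keyword & "*"); like Excel's criteria
    matching, the comparison ignores case.
    """
    return sum(
        1
        for d, e in rows
        if e.lower() == label.lower() and keyword.lower() in d.lower()
    )

small_count = countifs(rows, "Dog Red", "small")
large_count = countifs(rows, "Dog Red", "large")
print(small_count, large_count)
```

The two conditions are combined with AND, which is what COUNTIFS does with each extra range/criterion pair.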

Related

Sum column based on conditions for subsums

So I have a table which basically looks as follows:
Criterion Value
1 -5
1 1
2 5
2 5
3 2
3 -1
I want to sum the values in column B based on the criteria in column A, but only if the sum for an individual criterion is not negative. So for example, if I ask for the sum of all values where the criterion is between 1 and 3, the result should be 11 (the values for criterion 1 are not included in the sum because they add up to a negative number).
My first idea was to add a third column with a sumif([criterion];[#criterion];[value]) and then use a sumifs function which checks whether that third column is negative. However, my table has +100k lines, and with that many sumif functions it becomes intolerably slow.
I know I could create a pivot table to the same effect, but that has two drawbacks: I would have to create a separate sheet, which would add complexity, and my table is frequently updated which means I would have to manually update that pivot table every time to allow for downstream calculations. NBD and I could do that as a last resort, but I wonder whether there isn't a more elegant way to solve this problem.
I would want to avoid VBA to avoid complexity (the sheet will be used by other persons).
Thank you
This can be easily done using UNIQUE() and the two versions of SUMIF() in this way:
1. First collect all the criteria with =UNIQUE(A2:A7) -- Assuming your data are in columns A and B starting from row 2, this goes in cell C2, with "Criteria" in C1
2. Compute the subtotals for all criteria using =SUMIF($A$2:$A$7, C2, $B$2:$B$7) -- This goes in cell D2 and is filled down as far as the criteria extend, with "Partials" in cell D1
3. Sum all the subtotals from step 2 that are positive with =SUMIF(D2:D7, ">0") in cell E2
If you have a lot of data, I suggest using whole-column references instead of fixed ranges, so that the formulas do not need adjusting as the number of rows changes:
The first formula becomes =UNIQUE(A:A) -- Don't care about the heading being taken (strings and empty cells are not summed)
For the second formula use =SUMIF(A:A, C2, B:B)
Use =SUMIF(D:D, ">0") for the last step
This should be reasonably fast, using just as many extra cells as the number of distinct criteria (multiplied by 2).
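As a cross-check of the three formulas, here is a minimal Python sketch that follows the same idea (one subtotal per distinct criterion, then a sum of the positive subtotals) on the sample table from the question. It is an illustration of the logic, not a replacement for the worksheet formulas.

```python
from collections import defaultdict

# (criterion, value) pairs from the question's sample table.
data = [(1, -5), (1, 1), (2, 5), (2, 5), (3, 2), (3, -1)]

# UNIQUE + SUMIF: one subtotal per distinct criterion.
subtotals = defaultdict(int)
for criterion, value in data:
    subtotals[criterion] += value

# SUMIF with ">0": add up only the positive subtotals.
total = sum(s for s in subtotals.values() if s > 0)
print(dict(subtotals), total)
```

Criterion 1 sums to -4 and is left out, so the total is 10 + 1 = 11, matching the result the question expects.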

Distinct count of values in a column if two other cells match

I'm struggling a bit with array formulas to count distinct values in one column, in the rows where two other cells match. Sorry if I can't explain it any better. It is best to show you the formula I created and provide some sample data:
Sheet 1
Column A Column B
Period 101 Code X
Period 309 Code Y
Period 101 Code Y
Period 101 Code Z
Period 404 Code Y
Period 101 Code X
Period 101 Code X
Period 404 Code X
Period 404 Code Z
Sheet 2
Column A Column B (where the formula should be)
Code X 2
Code Y 3
Code Z 2
Basically I want to count the distinct values in Sheet 1 column A, only where the value in Sheet 1 column B matches the value in sheet 2 column A. I have provided the expected outcome for the three code values.
I have tried with the following formula, but I am unable to count distinct values in another column where the two cells match:
{=SUM(--(FREQUENCY(IF(C5:C11=G5,MATCH(B5:B11,B5:B11,0)),ROW(B5:B11)-ROW(B5)+1)>0))}
Please ignore the rows and columns used in the formula, also the values in Column A and B on sheet 1 both occur multiple times, but the values in column one on sheet 2 only occur once.
I am curious how someone would solve this one. Thank you in advance.
Caveat: This answer is unlikely to be useful to the OP, as these techniques are as yet only available to Excel Insiders.
But once these new features are available to the mainstream they will be a game changer.
This uses the new Dynamic Array feature coming to Excel soon.
To create the list of unique values from Column B, place this formula in a single cell. Excel will "Spill" into as many rows as needed to return the unique list of values from Column B. For example, I have used cell E2
=UNIQUE(FILTER($B:$B,$B:$B<>""))
Now, place this formula in the next adjacent cell (I've used F2) and copy it down alongside the spilled list in column E
=COUNTA(UNIQUE(FILTER($A:$A,$B:$B=E2)))
Unlike the UNIQUE formula above, this one does not spill: COUNTA reduces its input to a single number, so each row needs its own copy, with the relative reference E2 pointing at the code on that row.
Your formula doesn't match your sample data but let's assume the below:
Formula in H5:
=SUM(--(FREQUENCY(IF(C$5:C$13=G5,MATCH(B$5:B$13,B$5:B$13,0)),ROW(B$5:B$13)-ROW(B$5)+1)>0))
Enter it as an array formula with Ctrl+Shift+Enter and drag it down.
Notice the mixed (row-absolute) cell references (you used relative ones), and that my ranges are larger than yours (you only looked at C5:C11).
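The expected results in the question can be verified with a short Python sketch that groups the Sheet 1 rows by code and counts the distinct periods in each group. This only checks the logic the formulas are meant to implement; the variable names are illustrative.

```python
from collections import defaultdict

# (period, code) rows from Sheet 1 in the question.
sheet1 = [
    ("Period 101", "Code X"), ("Period 309", "Code Y"), ("Period 101", "Code Y"),
    ("Period 101", "Code Z"), ("Period 404", "Code Y"), ("Period 101", "Code X"),
    ("Period 101", "Code X"), ("Period 404", "Code X"), ("Period 404", "Code Z"),
]

# Collect the set of periods seen for each code; a set keeps each period once.
periods_by_code = defaultdict(set)
for period, code in sheet1:
    periods_by_code[code].add(period)

distinct_counts = {code: len(periods) for code, periods in periods_by_code.items()}
print(distinct_counts)
```

The counts come out as 2, 3 and 2 for Code X, Code Y and Code Z, the same as the expected column on Sheet 2.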

Aggregating data using INDEX MATCH MATCH or SUMIFS

I'm trying to create an Excel formula that is able to sum multiple rows in a table, where the rows and column to be summed are determined by the contents of other cells.
Ordinarily I would use Index Match Match to achieve this, but the multiple rows summation has left me stumped.
I've seen a couple of examples on here of Index Match with a SUMIFS formula, but nothing that pairs this with Index Match Match.
I have two tables on different Excel sheets. The first one looks a little like this (the actual table is 105 columns x 200 rows):
That is from a sheet called "Firm Cost Summary". Row 4 contains a list of unique employee numbers. Column A is the expense category per our accounting system and Column B is a broader category that should be used in Excel to group similar items. Column E onwards then contains the numerical information to be aggregated.
What I would then like to do is summarise that table in a more presentable format that can then be manipulated in other ways. The table looks like this:
That is on a sheet called "Staff Cost Summary". I would like to fill out the info in the yellow cells, i.e. total the salary, bonus, benefits, etc, of each staff member. Ideally this would be a formula I input in cell E6 that I can then drag right and downwards to fill the table.
To give an example, to fill out cell I6 in the second table, the formula should look in cell A6 to find the employee number (1 in this case) and look this up in row 1 of the first table to find the appropriate column of the first table (column E in this case).
The formula should then look in cell I5 of the second table to see that we are looking to aggregate benefits, then look down column B of the first table to find each row that should be summed (rows 7-10 in this case).
With that in mind, here's what I've got:
=INDEX('Firm Cost Summary'!$A$4:$G$10,MATCH('Staff Cost Summary'!$A6,'Firm Cost Summary'!$A$4:$G$10,0),MATCH('Staff Cost Summary'!E$5,'Firm Cost Summary'!$B$4:$B$10,0))
Total benefits for Joe Bloggs are the sum of cell E7:E10 of table 1, i.e. 5 + 10 + 50 + 100 = 165.
Clearly there are multiple matches in column B of that table, so the above formula gives an answer of 0. Any ideas how I can tweak that to make it work?
Put this in E6 and copy over and down
=SUMIFS(INDEX('Firm Cost Summary'!$D:$DD,0,MATCH($A6,'Firm Cost Summary'!$D$4:$DD$4,0)),'Firm Cost Summary'!$B:$B,E$5)
The index/match returns the correct column to be added.
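The way the two parts of the formula cooperate can be sketched in Python: the MATCH step picks the employee's column and the SUMIFS step adds every row whose broad category matches. The table below is a hypothetical miniature; only the four benefits figures for employee 1 (5, 10, 50, 100) come from the question, and sum_for is an illustrative name.

```python
# Hypothetical miniature of "Firm Cost Summary"; only employee 1's four
# benefits figures (5, 10, 50, 100) are taken from the question.
employee_numbers = [1, 2]  # header row holding the employee numbers
rows = [
    # (broad category from column B, one value per employee)
    ("Salary",   [1000, 1200]),
    ("Bonus",    [200, 0]),
    ("Benefits", [5, 7]),
    ("Benefits", [10, 3]),
    ("Benefits", [50, 20]),
    ("Benefits", [100, 40]),
]

def sum_for(employee, category):
    """Pick the employee's column, then add the rows in `category`."""
    col = employee_numbers.index(employee)  # like MATCH(..., 0): exact match
    return sum(values[col] for cat, values in rows if cat == category)

benefits_emp1 = sum_for(1, "Benefits")
print(benefits_emp1)
```

For employee 1 this gives 5 + 10 + 50 + 100 = 165, the figure quoted in the question.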

Find the difference between data in 2 columns in Excel

I have 2 sets of data. I put them in Excel, e.g. in column A and column B. Now I want to know which data from B is part of column A. I run this formula: =IF(COUNTIF($A$1:$A$327238,B1)>0,"Exist", "Nope")
Then I filter it and look only at 'Exist'. Based on that, I know that all data in B labelled 'Exist' is part of column A.
Now I want to know the opposite, i.e. which data from A is part of B. For that I use the same formula but swap the data in the columns, i.e. the data from B is now in A and vice versa.
Then I randomly verify the results.
In case 1 it looks like it works fine, but in the second case it looks inaccurate.
My assumption: it should work in case 2 as well (maybe I just was not very accurate in some way), so should I expect it to work?
Thanks
In cell C1 (assuming your data starts from the 1st row) type =IF(A1=B1,"equal","no"), and then fill the same formula down to the last row that still has data, so that for row N the formula in column C is =IF(AN=BN,"equal","no"). After that you just need to count the cells with the value "no" to know the differences. Sorry if I didn't get the question correctly.
OK, assuming that the two sets of data are in columns A and B (they might be of different sizes), and the last rows of data are L and M respectively, click on D1 and type the following: =IFNA(INDEX(B$1:B$5,MATCH(A1,B$1:B$5,0),1),"Unique") (here column B is taken to end at row 5; in general the range is B$1:B$M). Drag down to apply this formula to D1 through DL. That's it, you have the duplicate elements. Since the duplicate elements are the same in both columns A and B, you don't need to repeat this for column B. Note that for all the unique elements the corresponding rows of column D contain the word "Unique", so if you want the unique elements, you can get them from A using those row numbers:
Select the first-row cell of any free column and type the following formula: =IF(D1="Unique",INDEX(A$1:A$L,ROW(D1)),"Duplicate").
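The two directions asked about in the question are separate membership tests, which a small Python sketch makes explicit. The sample values are hypothetical (the question's data is not shown), and the comparison here is exact, whereas COUNTIF ignores case.

```python
# Hypothetical sample values; the question's data is not shown.
col_a = ["apple", "pear", "plum", "fig"]
col_b = ["pear", "kiwi", "fig", "fig"]

set_a, set_b = set(col_a), set(col_b)

# Equivalent of =IF(COUNTIF(A:A, B1)>0, "Exist", "Nope") filled down column B.
b_in_a = ["Exist" if value in set_a else "Nope" for value in col_b]

# The reverse direction needs its own check against column B.
a_in_b = ["Exist" if value in set_b else "Nope" for value in col_a]

print(b_in_a)
print(a_in_b)
```

The set of values marked 'Exist' is the same in both directions, but the number of marked rows can differ when a value is repeated in one column, as "fig" is here.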

Compare two data sheets

The issue I'm faced with is I have two sheets of data in Excel. They are a stocksheet list, listing items that have a variance from a stocktake. The items are randomly placed between both documents, so it is almost impossible to do a side-by-side view even if I were to order the columns (which I already have). For example it would be like this:
Sheet 1:
A1 (Apple) (1)
A2 (Carrot) (-3)
A3 (Banana) (4)
A4 (Chocolate) (-7)
Whereas Sheet 2 may be:
A1 (Orange) (-2)
A2 (Apple) (3)
A3 (Muffin) (-8)
A4 (Carrot) (3)
So as you can see, the same item may appear in both, and if it does I want to compare the two entries to know the variance, i.e. Sheet 1 said -3 whereas Sheet 2 said +3... I would preferably like to do this in a batch if possible, as there are over 800 cells to go through.
Just so that you can see what I'm dealing with, here are links to pastebins of both sheets:
Sheet 1: http://pastebin.com/6i7QKJ6N
Sheet 2: http://pastebin.com/zjtC2U7q
Is there anything anyone can think of that would be able to assist me, other than me going through this one by one which I am considering doing?
Excuse me for avoiding the real situation and sticking with your example. Assuming the values are in column B in the corresponding rows, then put the following in, say, column C of each sheet:
in Sheet1: =VLOOKUP(A1,Sheet2!A:B,2,FALSE)
in Sheet2: =VLOOKUP(A1,Sheet1!A:B,2,FALSE)
This should 'align' the entries (where both exist; otherwise the result is #N/A). =B1=C1 in D1, copied down, should then help to identify the mismatches, and, say, =B1-C1 in E1, copied down, quantifies the discrepancies between the sheets, by 'vegetable'.
There should be no need for a batch mode for this.
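The alignment step can be sketched in Python with the example items from the question: each Sheet 1 item is looked up on Sheet 2 and, where both exist, the difference is computed. Here None stands in for #N/A; the variable names are illustrative.

```python
# Sample data from the question (item -> variance).
sheet1 = {"Apple": 1, "Carrot": -3, "Banana": 4, "Chocolate": -7}
sheet2 = {"Orange": -2, "Apple": 3, "Muffin": -8, "Carrot": 3}

# For every item on Sheet 1, look up the Sheet 2 value (None plays the role
# of #N/A) and, where both exist, compute the difference as in =B1-C1.
comparison = {}
for item, value1 in sheet1.items():
    value2 = sheet2.get(item)
    diff = None if value2 is None else value1 - value2
    comparison[item] = (value1, value2, diff)

for item, row in comparison.items():
    print(item, row)
```

Only Apple and Carrot appear on both sheets in this example, so only those two rows get a numeric difference.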
I'm assuming that the unique identifier for the stock items is the column labelled CYSKU, right?
If that's so, then there are only 192 common items between the two sheets. I ran a vlookup in both sheets a bit similar to the one pnuts used and used a filter.
There are more variances between CYCOST than with CYRETL as far as I can see (I haven't compared the other columns).
To perform the comparison, you can do the following:
Insert a column between columns C and F (just after CYSKU) and put a vlookup formula in row 2 of this column and fill it down:
=VLOOKUP(C2, Sheet2!C:C, 1, 0)
Insert a filter and filter out #N/A from this column to get only those that are common between the two sheets.
In column M (after CYDVAR), insert another vlookup and fill it down:
=VLOOKUP(C2, Sheet2!C:F, 4, 0)
This will give you the corresponding CYRETL from Sheet2. You can then compare the two CYRETL.
How VLOOKUP works:
The first parameter is what VLOOKUP will be looking for.
The second parameter is the table range in which to look for the first parameter; the search is done in the leftmost column of that range.
The third parameter is the nth column from which a match will be returned, limited to the table (if the table is in column A:A, only 1 column is available, if the table is A:B, 2 columns are available, etc).
The last parameter is for either exact or approximate match. Exact is 0 (or FALSE) and approximate is 1 (or TRUE).
You can just change the table range and the column number to change the value you're looking for from Sheet2.
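The four parameters described above can be mirrored in a small Python function for the exact-match case. This is a sketch of the behaviour, not Excel's implementation; the sample rows are hypothetical, and only the names CYSKU and CYRETL come from the discussion.

```python
def vlookup(lookup_value, table, col_index):
    """Exact-match VLOOKUP sketch (last argument 0/FALSE).

    `table` is a list of rows; the first element of each row is searched.
    `col_index` is 1-based, as in Excel. Returns None where Excel gives #N/A.
    """
    if not 1 <= col_index <= len(table[0]):
        # Excel also reports an error when the column is outside the table.
        raise ValueError("col_index is outside the table")
    for row in table:
        if row[0] == lookup_value:
            return row[col_index - 1]  # first match wins
    return None

# Hypothetical rows: (CYSKU, description, cost, CYRETL); values are made up.
sheet2 = [
    ("SKU1", "Widget", 2.50, 4.99),
    ("SKU2", "Gadget", 7.00, 12.50),
]
print(vlookup("SKU2", sheet2, 4))
print(vlookup("SKU9", sheet2, 1))
```

With a one-column table only col_index 1 is valid, which is why the first formula above uses Sheet2!C:C with 1, while the second widens the range to C:F to reach the fourth column.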
