How to sort a column based on exact matches with another column - excel

I have an inventory table that looks like this (subset):
part number | price | quantity
10115 | 14.95 | 10
1050 | 5.95 | 12
1074 | 7.49 | 8
110-1353 | 13.99 | 22
I also have another table in Sheet 2 that looks like this (subset):
part number | quantity
10023 | 1
110-1353 | 3
10115 | 2
20112 | 1
I basically want to subtract the quantities in the second table from the matching ones in the first table. What is the best way of doing this? I have looked into VLOOKUP and INDEX MATCH, but they are not quite right for this. Would this perhaps be better done in, say, an Access DB?

I added two more columns next to the last column of Sheet 1. Let us assume that the second table occupies the range A1:B5 on Sheet 2.
Formulas:
Column D:
=IFNA(VLOOKUP(A2,Sheet2!$A$2:$B$5,2,FALSE),0)
Column E:
=C2-D2
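For readers who want to check the logic outside Excel, here is a minimal Python sketch of the same lookup-then-subtract idea. The dictionaries simply restate the sample rows from the question; the names `inventory`, `used` and `remaining_quantities` are made up for illustration. `dict.get(part, 0)` plays the role of IFNA(VLOOKUP(...), 0).

```python
# Sheet 1: part number -> (price, quantity), copied from the question's sample.
inventory = {
    "10115": (14.95, 10),
    "1050": (5.95, 12),
    "1074": (7.49, 8),
    "110-1353": (13.99, 22),
}

# Sheet 2: part number -> quantity to subtract.
used = {"10023": 1, "110-1353": 3, "10115": 2, "20112": 1}


def remaining_quantities(inventory, used):
    """Subtract the used quantity from each inventory row; no match means 0."""
    return {
        part: quantity - used.get(part, 0)
        for part, (_price, quantity) in inventory.items()
    }


remaining = remaining_quantities(inventory, used)
print(remaining)
```

Parts that appear only in Sheet 2 (10023 and 20112 here) are ignored, just as they are by the VLOOKUP approach.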

If you wanted to tackle this using MS Access, the SQL code might look like this:
select
t1.[part number],
t1.price,
t1.quantity - nz(t2.quantity, 0) as qty
from
inventory t1 left join table2 t2 on t1.[part number] = t2.[part number]
Here, I assume that you have a table called inventory and a table called table2 (change these to suit your database).
A left join is used to ensure that all records from inventory are returned, regardless of whether a match is found in table2, and the Nz function is used to return 0 for records for which there is no part number match in table2.
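If you do not have Access to hand, the same left join can be tried with Python's built-in sqlite3 module. This is only a sketch under a few assumptions: the table names follow the answer above, the data is the sample from the question, and SQLite's COALESCE stands in for the Access-specific Nz function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE inventory ("part number" TEXT, price REAL, quantity INTEGER);
    CREATE TABLE table2 ("part number" TEXT, quantity INTEGER);
    INSERT INTO inventory VALUES
        ('10115', 14.95, 10), ('1050', 5.95, 12),
        ('1074', 7.49, 8), ('110-1353', 13.99, 22);
    INSERT INTO table2 VALUES
        ('10023', 1), ('110-1353', 3), ('10115', 2), ('20112', 1);
    """
)

# LEFT JOIN keeps every inventory row; COALESCE turns a missing match into 0.
result = conn.execute(
    """
    SELECT t1."part number", t1.price,
           t1.quantity - COALESCE(t2.quantity, 0) AS qty
    FROM inventory t1
    LEFT JOIN table2 t2 ON t1."part number" = t2."part number"
    ORDER BY t1."part number"
    """
).fetchall()

for row in result:
    print(row)
```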

Related

SUM columns on the same sheet based on conditions or SUMIFS from another sheet

Here is a small sample table
+--------+-------+--------+
| COL 1 | COL 2 | COL 3 |
+--------+-------+--------+
| abc123 | Total | |
+--------+-------+--------+
| abc123 | cat1 | 100.00 |
+--------+-------+--------+
| abc123 | cat2 | 200.00 |
+--------+-------+--------+
| def123 | Total | |
+--------+-------+--------+
| def123 | cat1 | 100.00 |
+--------+-------+--------+
| def123 | cat2 | 200.00 |
+--------+-------+--------+
In COL 3, if COL 2 is "Total", I need to SUM everything in COL 3 for every row that has the same value in COL 1 (e.g. the COL 3 Total row should be 300.00 for abc123 and 300.00 for def123). Otherwise, if COL 2 is NOT "Total", I need to do SUMIFS('Sheet3'!N:N,'Sheet3'!A:A,Sheet2!A473,'Sheet3'!Q:Q,Sheet2!Q473)*Sheet4!$U$2
How can I accomplish the first part of the SUM?
Edit:
I think my example is too rigid and appears like it is set.
Let me see if I can explain in more fluid terms. I will have to describe this somewhat in database terms. All of the columns are on one sheet for the purposes of the "Total" portion.
COL 1 is my partition. Each of the IDs in COL 1 consists of 57 rows. One of those 57 rows has "Total" in another column; in the example that is COL 2.
So I have a large table in which COL 1 contains, say, 5 different IDs with 57 rows each, resulting in 285 rows.
I had a sorting function that would likely make this whole thing easier, but that function is crashing Excel and not performing both required sorts ( https://techcommunity.microsoft.com/t5/excel/sort-function-causes-a-crash-and-does-not-perform-secondary-sort/m-p/1477123#M66205 )
I suppose if I can get the sorting function to stop crashing Excel this becomes slightly easier, as then "Total" is consistently placed in row 2, 58, 116, etc. and I can add up everything below it. Right now, because that sort doesn't work, I have to add up everything from COL 3 that is NOT assigned to "Total" in COL 2 and has the same ID in COL 1.
So in the table above abc123 is 3 rows and I need to add up the two rows that are not total for abc123 and have the formula spit out 300 into COL 3 for total.
Then def123 needs the same treatment.
Here is the tough part: the row order is inconsistent because the data comes from a Redshift query, so it is random within each ID. The IDs themselves are in random order too. I think I can get the sort on COL 1 to work without crashing Excel, but the secondary sort with the custom order is crashing it.
One way to avoid the Circular Reference error when trying to Total a column is to use two Sums, one above and one below.
So, assuming that your Columns 1, 2 and 3 are A, B and C, and that data starts in Row 2 (Row 1 being a header), you need the Sum of cells above the current row:
SUMIFS(C$1:C1, A$1:A1, A2)
Plus the Sum of the cells below the current row:
SUMIFS(C3:INDEX(C:C, 1+COUNTA(A:A)), A3:INDEX(A:A, 1+COUNTA(A:A)), A2)
(Note that this will actually terminate one row above and below the dataset)
Put this together with an IF statement:
=IF(B2="Total", SUMIFS(C$1:C1, A$1:A1, A2) + SUMIFS(C3:INDEX(C:C, 1+COUNTA(A:A)), A3:INDEX(A:A, 1+COUNTA(A:A)), A2), EXISTING_FORMULA_HERE)
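To make the "sum above plus sum below" idea concrete, here is a small Python sketch, assuming the sample table from the question in shuffled order. The function name `total_for_row` is invented for illustration. Skipping the current row is what keeps the Excel version free of a circular reference.

```python
# Rows as (COL 1, COL 2, COL 3), deliberately shuffled.
# None marks the empty COL 3 cell of a "Total" row.
rows = [
    ("def123", "cat2", 200.0),
    ("abc123", "Total", None),
    ("abc123", "cat1", 100.0),
    ("def123", "Total", None),
    ("abc123", "cat2", 200.0),
    ("def123", "cat1", 100.0),
]


def total_for_row(rows, i):
    """Sum COL 3 over rows with the same ID, looking above and below row i."""
    key = rows[i][0]
    above = sum(v for k, _, v in rows[:i] if k == key and v is not None)
    below = sum(v for k, _, v in rows[i + 1:] if k == key and v is not None)
    return above + below


totals = {
    rows[i][0]: total_for_row(rows, i)
    for i in range(len(rows))
    if rows[i][1] == "Total"
}
print(totals)
```

Because each sum filters on the ID, the rows do not need to be sorted, which matters here since the secondary sort is the part that crashes.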
Alternatively, you could try writing an Array Formula to calculate the SUM directly, a bit like when using multiple conditions in a MATCH, something like this: (not enough information in the question to do this exactly)
=SUMPRODUCT('Sheet3'!N:N*(COUNTIFS(A:A,'Sheet3'!$A:A)>0)*(COUNTIFS(B:B,'Sheet3'!$Q:Q)>0))
(Sum of Sheet3!N:N when a row exists in the current sheet that matches columns Sheet3!A:A in Column A and Sheet3!Q:Q in Column B)
Note that working on Entire Columns with Array Formulae is quite slow, so you may want to limit those just to the Used Range

Excel Sumproduct/Sumif based on multiple criteria across two tables not in order

I've been struggling with an issue for a while now and haven't been able to find a solution, so any help would be greatly appreciated!
I'm trying to build a formula that sums up values from a column based on criteria spread across two tables (I've simplified below):
Table 1
+-------------+---------+---------------------+------------+----------+------+
| Customer ID | Twin ID | Customer Entry Date | Exit Date | Spending | Days |
+-------------+---------+---------------------+------------+----------+------+
| 111 | 333 | 24.12.2015 | 28.05.2018 | 5000 | 200 |
| 222 | 444 | 19.06.2014 | | 4000 | 300 |
+-------------+---------+---------------------+------------+----------+------+
Table 2
+-------------+---------+---------------------+-----------+----------+------+
| Customer ID | Twin ID | Customer Entry Date | Exit Date | Spending | Days |
+-------------+---------+---------------------+-----------+----------+------+
| 444 | | | | | 200 |
| 333 | | | | | 0 |
+-------------+---------+---------------------+-----------+----------+------+
I now need to find a formula, that will allow me to sum up the column "Spending" from table 1 based on the following criteria:
"Twin ID" in Table 1 is not empty and the value matches the value "Customer ID" in Table 2 --> this has been the main complication for me, as the Customer IDs in Table 2 are in a different order than the Twin IDs in Table 1
"Entry Date" in Table 1 is < a specific date
"Exit Date" in Table 1 is >= a specific date or empty
"Days" in Table 2 is >0 (for the respective Customer ID that matches the Twin ID from Table 1)
Or in other words: "If customers 111,222 etc. have a twin, and this twin has days >0, and the entry and exit dates of the customer are < > a specific date or empty, then sum up the spending of those customers"
I've tried various iterations of the SUMPRODUCT formula, and this one currently works as long as the two tables are in the same order (i.e. Twin ID "333" is in row 2 in Table 1 and in row 2 in Table 2):
=SUMPRODUCT(--(Table1!Customer Entry Date<DATE1);--(Table1!Exit Date>=Date2);--(Table1!TwinID<>"");--(Table2!Days>0);Table1!Spending)
Is there any way to make this formula work regardless of the order of the row items (i.e. Twin ID "333" is in row 2 in Table 1 and in row 3 in Table 2)?
Any help would be greatly appreciated!
Try this
=SUMPRODUCT((Table1!Customer_Entry_Date<Date1)*(Table1!Exit_Date>=Date2)*(Table1!Twin_ID<>"")*(COUNTIFS(Table2!Customer_ID,Table1!Twin_ID,Table2!Days,">0")>0)*Table1!Spending)
It's similar to your formula, but uses COUNTIFS to check whether a matching Customer ID for Table 1's Twin ID exists anywhere in Table 2 with Days > 0.
Note that your named ranges (if that's what you're using) should not include the column headers, or else you'll get a #VALUE! error when it tries to do the multiplication.
You could avoid that by wrapping the last factor in IF(ISNUMBER()), but then it would have to be entered as an array formula:
=SUM((Table1!Customer_Entry_Date<Date1)*(Table1!Exit_Date>=Date2)*(Table1!Twin_ID<>"")*(COUNTIFS(Table2!Customer_ID,Table1!Twin_ID,Table2!Days,">0")>0)*IF(ISNUMBER(Table1!Spending),Table1!Spending))
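As a cross-check of the order-independent matching, here is a rough Python sketch that applies the four criteria from the question to its two sample tables. The cut-off dates are invented for illustration, as are the field and function names. The set lookup plays the part of the COUNTIFS test, so the row order of Table 2 is irrelevant. Unlike the formula, the sketch also accepts an empty exit date, as the question's criteria ask.

```python
from datetime import date

# Sample rows from the question; None stands for an empty cell.
table1 = [
    {"customer": 111, "twin": 333, "entry": date(2015, 12, 24),
     "exit": date(2018, 5, 28), "spending": 5000},
    {"customer": 222, "twin": 444, "entry": date(2014, 6, 19),
     "exit": None, "spending": 4000},
]
table2 = [
    {"customer": 444, "days": 200},
    {"customer": 333, "days": 0},
]


def twin_spending(table1, table2, date1, date2):
    """Sum spending of customers whose twin has days > 0 and whose dates fit."""
    twins_with_days = {row["customer"] for row in table2 if row["days"] > 0}
    total = 0
    for row in table1:
        has_active_twin = row["twin"] is not None and row["twin"] in twins_with_days
        entered_before = row["entry"] < date1
        still_present = row["exit"] is None or row["exit"] >= date2
        if has_active_twin and entered_before and still_present:
            total += row["spending"]
    return total


# Hypothetical cut-off dates, chosen only to exercise the sample data.
total = twin_spending(table1, table2, date(2017, 1, 1), date(2017, 1, 1))
print(total)
```

With these dates only customer 222 qualifies: its twin 444 has 200 days, whereas customer 111's twin 333 has 0 days.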
I managed to solve the issue.
Anyone facing a similar problem, please see the example file for solution: https://wetransfer.com/downloads/90aedc5943f52274e36102a79e23c18e20180628212338/2fd1c1
=+SUMPRODUCT(SUMIF(Table1TwinID;Table2CustomerID;Table1Spending)*(Table2Days>0)*((COUNTIFS(Table1TwinID;Table2CustomerID;Table1EntryDate;"<"&DATE1)*COUNTIFS(Table1TwinID;Table2CustomerID;Table1ExitDate;">="&DATE2))+(COUNTIFS(Table1TwinID;Table2CustomerID;Table1EntryDate;"<"&DATE1)*COUNTIFS(Table1TwinID;Table2CustomerID;Table1ExitDate;""))))

Row number and partition in Excel

I have data in excel such as:
ID | Fee
123456789 | 100
987654321 | 100
987654321 | 75
987654321 | 50
I need to calculate a fee reduction for the items that are not the max price. The spreadsheet is already sorted by ID, then by Fee, in the order needed. What I do not know is how to get the equivalent of row_number() over (partition by ...) in Excel, which I would normally use in SQL.
Desired output would be
ID | Fee | rn
123456789 | 100 | 1
987654321 | 100 | 1
987654321 | 75 | 2
987654321 | 50 | 3
This formula will do the job:
=COUNTIF($A$2:INDIRECT("A"&ROW(A2)),A2)
There is no need for sorting the data and you won't fall out of the range.
ROW() is used to make the range dynamic, so when the formula is dragged down, ROW() always supplies the end point of the range.
There's probably a more complex formula one could just throw at the data without having to monkey with the data, but I think this may be an easier solution:
Sort the data by ID (smallest to largest) and Fee (Smallest to largest)
Use the formula =COUNTIF(A2:A5, A2) to count how many times the same ID appears in the current cell and every cell below it. Copy this down to fill out the missing column.
You can use =COUNTIF($A$2:A2,A2); note that only the first $A$2 will not move.
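The expanding-range COUNTIF is effectively a running count per ID. A minimal Python sketch of that idea, using the sample IDs from the question (the function name `row_numbers` is made up):

```python
ids = [123456789, 987654321, 987654321, 987654321]


def row_numbers(ids):
    """Number each row within its ID group, in the order the rows appear."""
    seen = {}
    numbers = []
    for current in ids:
        # Same as counting how often `current` occurs from the first row to here.
        seen[current] = seen.get(current, 0) + 1
        numbers.append(seen[current])
    return numbers


rn = row_numbers(ids)
print(rn)
```

As with the formula, the numbering follows the existing row order, so the data should already be sorted by Fee within each ID if rn = 1 is meant to mark the highest fee.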
Arrange everything in column A (in any order).
In B1 type this: =IF(A1=A2, (B2+1),1), then extend it over the entire column B.

Percentage of Sum of two Pivot cells

I'm trying to work out a small problem with my Excel Pivot table. I have data from an Excel sheet, from which I have made a Pivot table. The data is structured as below
Name | Count Cell1 | Sum of Cell 2 |
Eric | 25 | 5 |
Sam | 5 | 1 |
Joe | 10 | 5 |
What I want is a formula that takes the Sum of Cell 2, divides it by the Count of Cell 1, and displays the result as a percentage, like the example below.
Name | Count Cell1 | Sum of Cell 2 | Difference|
------------------------------------------------
Eric | 25 | 5 | 20% |
Sam | 5 | 1 | 20% |
Joe | 10 | 5 | 50% |
All the formulas I have tried only use the original table cells and not their sums.
So is there a smart way to have a formula lookup inside of a pivot table and display it in %?
In your Pivot Table, you can enter a calculated field to do what you want.
Select somewhere in your pivot table (e.g. one of the Sum of Cell2 fields)
In the PivotTable Tools > Options ribbon, in the Calculations section, click Fields, Items & Sets and from there pick Calculated Field
Change the name to Difference and the Formula to =Cell2/Cell1
In the Field Settings for that field, change the Custom Name to Difference and Number Format to Percentage
EDIT - question updated for Count & Sum
So, as far as I can see, trying to combine Sum and Count in a calculated field really upsets it. The only workaround I could find was adding a helper column to the data source containing just the number 1. The sum of that column gives you the count, so the Calculated Field can be Cell2/HelperColumn. Horrible!
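To see why the helper column works, here is a small Python sketch. The source rows are hypothetical, since the question only shows the pivot output; they were chosen so that the per-name counts and sums match it. Summing a column of 1s reproduces the count, so the ratio only needs sums.

```python
from collections import defaultdict

# Hypothetical source rows as (Name, Cell2); each row implicitly has helper = 1.
source_rows = (
    [("Eric", 1)] * 5 + [("Eric", 0)] * 20
    + [("Sam", 1)] * 1 + [("Sam", 0)] * 4
    + [("Joe", 1)] * 5 + [("Joe", 0)] * 5
)


def ratio_by_name(rows):
    """Return sum(Cell2) / sum(helper) per name, where helper is always 1."""
    sum_cell2 = defaultdict(int)
    sum_helper = defaultdict(int)
    for name, cell2 in rows:
        sum_cell2[name] += cell2
        sum_helper[name] += 1  # the helper column
    return {name: sum_cell2[name] / sum_helper[name] for name in sum_cell2}


ratios = ratio_by_name(source_rows)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.0%}")
```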

Counting the number of older siblings in an Excel spreadsheet

I have a longitudinal spreadsheet of adolescent growth.
ID | CollectionDate | DOB | MOTHER ID | Sex
1 | 1Aug03 | 3Apr90 | 12 | 1
1 | 4Sept04 | 3Apr90 | 12 | 1
1 | 1Sept05 | 3Apr90 | 12 | 1
2 | 1Aug03 | 21Dec91 | 12 | 0
2 | 4Sept04 | 21Dec91 | 12 | 0
2 | 1Sept05 | 21Dec91 | 12 | 0
3 | 1Aug03 | 30Jan89 | 23 | 0
3 | 4Sept04 | 30Jan89 | 23 | 0
This is a sample of how my data is formatted and some of the variables that I have. As you can see, since it is longitudinal, each individual has multiple measurements. In the actual database there are over 10 measurements per individual and over 250 individuals.
What I am wanting to do is input a value signifying the number of older brothers and older sisters each individual has. That is why I have included the Mother ID (because it represents genetic relatedness) and sex. These new variable columns would just say how many older siblings of each sex each individual has. Is there a formula that I could use to do this quickly?
=COUNTIFS($B:$B,"<>"&$B2,$H:$H,$H2,$AI:$AI,$AI2,$J:$J,"<"&$J2)
Create a column named Distinct with this formula
=1/COUNTIF([ID],[@ID])
Then you can count all the older siblings with Sex = 0 like this
=SUMPRODUCT(([DOB]<[@DOB])*([MOTHERID]=[@MOTHERID])*([Sex]=0)*([Distinct]))
Note that I made the data a Table and used structured references. If you're not familiar with them, [COLUMNNAME] refers to the whole column and [@COLUMNNAME] refers to the value in that column on the current row. It's similar to saying $A:$A and A2 if you're dealing with column A.
The first formula gives each row a weight so that the rows of any one ID always add up to 1. ID=1 has three rows, so Distinct is 0.33333 on each of them, and the three rows sum to 1. This is similar to SELECT DISTINCT in SQL parlance.
The SUMPRODUCT formula sums [Distinct] for every row where the DOB is earlier than the current DOB (an older sibling), the Mother is the same as the current Mother, and the Sex is zero.
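The same counting logic can be sketched in Python to sanity-check results on a small sample. The rows below restate the question's sample data, with the collection date left out because it is not needed. Counts are reported per Sex code because the question does not say which code means brother. The function name `older_siblings` is invented for illustration.

```python
from datetime import date

# One tuple per measurement row: (ID, DOB, MOTHER ID, Sex).
rows = (
    [(1, date(1990, 4, 3), 12, 1)] * 3
    + [(2, date(1991, 12, 21), 12, 0)] * 3
    + [(3, date(1989, 1, 30), 23, 0)] * 2
)


def older_siblings(rows):
    """Count older siblings per individual, split by the sibling's Sex code."""
    # Collapse repeated measurements to one record per individual.
    people = {pid: (dob, mother, sex) for pid, dob, mother, sex in rows}
    result = {}
    for pid, (dob, mother, _sex) in people.items():
        counts = {0: 0, 1: 0}
        for other, (other_dob, other_mother, other_sex) in people.items():
            if other != pid and other_mother == mother and other_dob < dob:
                counts[other_sex] += 1
        result[pid] = counts
    return result


siblings = older_siblings(rows)
print(siblings)
```

Collapsing to one record per ID up front does the job of the Distinct column: repeated measurements of the same sibling are counted once.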
I have a possible solution. It involves adding two columns -- One for "# older siblings" and one for "unique?". So here are all the headings I have currently:
A -- ID
B -- CollectionDate
C -- DOB
D -- MOTHER ID
E -- Sex
F -- # older siblings
G -- unique?
In G2, I added the following formula:
=IF(A2=A1,0,1)
And dragged down. As long as the data is sorted by ID, this will only display "1" once for each unique person.
In F2, I added the following formula:
=COUNTIFS(G:G,"=1",D:D,"="&D2,C:C,"<"&C2)
And dragged down. It seemed to work correctly for the sample data you provided.
The stipulations are:
You would need the two columns.
The data would need to be sorted by ID
I hope this helps.
You need a formula like this (for example, for row 2):
=COUNTIFS($A:$A,"<>"&$A2,$E:$E,$E2,$D:$D,$D2,$C:$C,"<"&$C2)
Assuming E:E is column for sex, D:D is column for mother ID and C:C is column for DOB.
Write this formula in H2 cell for example and drag it down.

Resources