Aggregate values based on lookup IDs while meeting multiple criteria - excel

I am given sales data as per "sheet 1" below. Each row contains the quarter, the product, the region ID, and the sales figure. Each region can have multiple IDs, and I have a lookup table (sheet 2) which denotes which region each ID belongs to.
My goal is to get to sheet 3. Essentially I am trying to write a formula that will reference the product name in column A, the quarter in cell A1 (which the user will input), and aggregate the relevant sales figures under each region in column B.
I have tried nesting an INDEX & MATCH function within a SUMIFS within a SUMPRODUCT as per below, but I am getting a #VALUE! error:
=SUMPRODUCT(SUMIFS(INDEX(sheet1!$D:$D$,MATCH(1,($A4=sheet1!$B:$B)*($A$1=Sheet1!$A:$A),0),0),sheet1$C:$C,sheet2!$A$2:$A$8)*(sheet2!$B$2:$B$8=$B4))
Does anyone know what is wrong with my formula, or if there is a better approach to this problem?
Sheet 1 (Raw data)
    A        B        C   D
1   Quarter  Product  ID  2021 Sales
2   Q1       A        1   39
3   Q1       A        3   41
4   Q1       A        7   20
5   Q1       A        14  7
6   Q1       A        25  2
7   Q1       A        27  2
8   Q1       A        44  45
9   Q1       B        1   28
10  Q1       B        3   34
11  Q1       B        7   29
12  Q1       B        14  48
13  Q1       B        25  5
14  Q1       B        27  15
15  Q1       B        44  32
16  Q2       A        1   19
17  Q2       A        3   28
and so forth…
Sheet 2 (region ID lookup table)
   A   B
1  ID  Region
2  1   East
3  3   East
4  7   Central
5  14  Central
6  25  Central
7  27  West
8  44  West
Sheet 3 (Report)
   A        B        C
1  Q1
2
3  Product  Region   Sales
4  A        East     29
5  A        Central  42
6  A        West     31

A little rework:
=SUMPRODUCT(SUMIFS(Sheet1!D:D,Sheet1!C:C,IF(Sheet2!$B$2:$B$8=$B4,Sheet2!$A$2:$A$8),Sheet1!A:A,$A$1,Sheet1!B:B,A4))
Depending on your version of Excel, you may need to confirm the formula with Ctrl+Shift+Enter instead of Enter when exiting edit mode.
With the dynamic array formula FILTER we can replace the IF() part with FILTER():
=SUMPRODUCT(SUMIFS(Sheet1!D:D,Sheet1!C:C,FILTER(Sheet2!$A$2:$A$8,Sheet2!$B$2:$B$8=$B4),Sheet1!A:A,$A$1,Sheet1!B:B,A4))
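This saves a few iterations, because FILTER passes only the matching IDs to SUMIFS, whereas IF also passes a FALSE for every non-matching ID.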
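For readers who want to check the logic outside Excel, here is a minimal Python sketch of the same idea: filter by quarter and product, map each ID to its region, and sum. The rows are copied from the Sheet 1 listing above; the helper name `sales_by_region` is hypothetical, and the totals are computed from those listed rows rather than meant to reproduce the illustrative figures in Sheet 3.

```python
from collections import defaultdict

# Sheet 1 rows (quarter, product, region ID, sales), copied from the listing above.
sales_rows = [
    ("Q1", "A", 1, 39), ("Q1", "A", 3, 41), ("Q1", "A", 7, 20),
    ("Q1", "A", 14, 7), ("Q1", "A", 25, 2), ("Q1", "A", 27, 2),
    ("Q1", "A", 44, 45), ("Q1", "B", 1, 28), ("Q1", "B", 3, 34),
    ("Q2", "A", 1, 19), ("Q2", "A", 3, 28),
]

# Sheet 2: region ID -> region name.
region_of = {1: "East", 3: "East", 7: "Central", 14: "Central",
             25: "Central", 27: "West", 44: "West"}


def sales_by_region(rows, lookup, quarter, product):
    """Sum sales per region for one quarter and product (hypothetical helper)."""
    totals = defaultdict(int)
    for row_quarter, row_product, region_id, amount in rows:
        if row_quarter == quarter and row_product == product:
            totals[lookup[region_id]] += amount
    return dict(totals)


report = sales_by_region(sales_rows, region_of, "Q1", "A")
print(report)
```

Each report row in Excel corresponds to one key of `report`; the SUMIFS call plays the role of the filter, and the IF or FILTER over Sheet 2 plays the role of the `region_of` lookup.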

Related

Excel MERGE two tables

I have SET 1
CLASS  Student  TEST  SCORE
A      1        1     46
A      1        2     50
A      1        3     45
A      2        1     45
A      2        2     47
A      2        3     31
A      3        1     34
A      3        2     45
B      1        1     36
B      2        1     31
B      2        2     41
B      3        1     50
C      1        1     42
C      3        1     31
and SET 2
CLASS  SIZE  YEARS
A      39    7
B      20    12
C      31    6
and wish to COMBINE to make SET 3
CLASS  STUDENT  TEST  SCORE  SIZE  YEARS
A      1        1     46     39    7
A      1        2     50     39    7
A      1        3     45     39    7
A      2        1     45     39    7
A      2        2     47     39    7
A      2        3     31     39    7
A      3        1     34     39    7
A      3        2     45     39    7
B      1        1     36     20    12
B      2        1     31     20    12
B      2        2     41     20    12
B      3        1     50     20    12
C      1        1     42     31    6
C      3        1     31     31    6
So basically I want to add the SIZE and YEARS columns from SET 2 to SET 1, matching on CLASS. How can I do this in Excel?
Define both sets as tables and “left join” in PowerQuery. There you can choose the columns of the resulting table.
https://learn.microsoft.com/en-us/power-query/merge-queries-left-outer
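To make the "left join" idea concrete, here is a small Python sketch, not Power Query itself: every row of SET 1 is kept, and the SIZE and YEARS values for its CLASS are appended from SET 2. The function name `left_join` and the use of `None` for a missing class are illustrative assumptions; only a few rows of SET 1 are included.

```python
# A few rows of SET 1: (CLASS, STUDENT, TEST, SCORE).
set1 = [("A", 1, 1, 46), ("A", 1, 2, 50), ("B", 1, 1, 36), ("C", 3, 1, 31)]

# SET 2 keyed by CLASS: CLASS -> (SIZE, YEARS).
set2 = {"A": (39, 7), "B": (20, 12), "C": (31, 6)}


def left_join(rows, attributes):
    """Keep every left-hand row and append the attributes matched on CLASS."""
    merged = []
    for row in rows:
        # A class missing from SET 2 still yields a row, padded with None.
        extra = attributes.get(row[0], (None, None))
        merged.append(row + extra)
    return merged


set3 = left_join(set1, set2)
for row in set3:
    print(row)
```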
If you have Set 1 on the top left of a worksheet "Set1" and Set 2 on the top left of a worksheet "Set2", then you can use the formula
=VLOOKUP(A2;'Set2'!$A$2:$C$4;2;FALSE), where $A$2:$C$4 is the range of Set2 and A2 is the CLASS value from Set1 that is looked up in Set2. The next argument, 2, means to return the value from the second column of that range (SIZE), and the FALSE at the end means that only exact matches on CLASS are accepted. You can auto-fill this formula down the column, and repeat the steps with column index 3 for YEARS. Depending on your regional settings, the argument separator may be a comma instead of a semicolon. The Excel help for VLOOKUP explains the arguments in more detail.
Your first set of data is essentially your primary set of data that you just want to add attribute columns to. I built this example on Google Sheets which should help explain. Using spill formulas, only a few cells are needed with their own formulas. You can see them as they are highlighted in yellow. When you use in Excel, obviously make sure you change the column references, but this would get you the answer.
Note that your version of Excel must support dynamic array (spill) formulas for this to work. To test, check whether the UNIQUE function is available.
This solution may work for you if both sets start in the same column. In my example, both of them start at column A. You can get all the data with a single VLOOKUP formula.
The formula in cell E2 is:
=VLOOKUP($A2;$A$22:$R$25;COLUMN(B$22);FALSE)
Notice the mixed references in the first and third arguments and the absolute references in the second one. The column in the third argument must stay relative so that it advances as you fill the formula to the right. The third argument is critical because it is the relative position between both sets, which is why it is easier if both sets start at the same column. If they do not, you will need to adjust this argument by subtracting or adding an offset, depending on the case.
With this single formula you can get any number of columns. The only disadvantage is that you need to drag it to the right manually until you have all the columns (10, 30 or however many). You will know you are done when the formula returns a #REF! error, which means the column index points outside the lookup range.

Nesting INDEX function within a SUMPRODUCT to aggregate values within the correct column

This is an adaptation of my previous question regarding how to aggregate values based on lookup IDs and multiple criteria.
This time I would like to index for the correct year. My goal is to get to sheet 3, where a formula in cells C4:D6 references the product name in column A, the quarter in cell A1 (which the user will input) and the year in cells C3 and D3, and aggregates the relevant sales figures under each region in column B.
In my previous question, I was given a solution that nests a SUMIFS within a SUMPRODUCT. I am trying to build on it by adding an INDEX & MATCH inside the formula to pick the correct year's column in sheet 1. I have tried the following in the report but am getting a #N/A error.
=SUMPRODUCT(SUMIFS(INDEX(Sheet1!$D:$F,0,MATCH(C$3,Sheet1!$D$1:$F$1,0)),Sheet1!C:C,IF(Sheet2!$B$2:$B$8=$B4,Sheet2!$A$2:$A$8),Sheet1!A:A,$A$1,Sheet1!B:B,A4))
UPDATE: It turned out that the above formula does work. The issue was that Sheet 1 was a pivot table, so the column headers for each year were in text format, which differed from the format of the lookup cell in the report, and MATCH therefore found no matching header.
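The type mismatch described in the update can be illustrated with a rough Python analogy. This is a sketch that assumes the report cell held the number 2021 while the pivot headers were the text "2021"; `match_exact` is a hypothetical stand-in for MATCH with match type 0, not Excel's implementation.

```python
# Text headers, as a pivot table might produce them (assumption for this sketch).
headers = ["2021", "2020", "2019"]


def match_exact(value, candidates):
    """Return the 1-based position of value, or None when nothing matches."""
    for position, candidate in enumerate(candidates, start=1):
        if candidate == value:
            return position
    return None


# The number 2021 is not equal to the text "2021", so nothing is found.
numeric_result = match_exact(2021, headers)

# Converting the lookup value to text restores the match.
text_result = match_exact(str(2021), headers)

print(numeric_result, text_result)
```

In Excel the corresponding fix is to make both sides the same type, either by converting the headers to numbers or by converting the lookup value to text.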
Sheet 1 (Raw data)
    A        B        C   D     E     F
1   Quarter  Product  ID  2021  2020  2019
2   Q1       A        1   $12   $12   $9
3   Q1       A        3   $4    $30   $50
4   Q1       A        7   $48   $15   $39
5   Q1       A        14  $42   $7    $26
6   Q1       A        25  $36   $50   $20
7   Q1       A        27  $45   $8    $9
8   Q1       A        44  $12   $10   $2
9   Q1       B        1   $40   $32   $23
10  Q1       B        3   $15   $14   $30
11  Q1       B        7   $21   $4    $42
12  Q1       B        14  $38   $26   $13
13  Q1       B        25  $31   $45   $9
14  Q1       B        27  $32   $46   $30
15  Q1       B        44  $21   $40   $30
16  Q2       A        1   $6    $1    $43
17  Q2       A        3   $12   $16   $44
and so forth…
Sheet 2 (lookup table)
   A   B
1  ID  Region
2  1   East
3  3   East
4  7   Central
5  14  Central
6  25  Central
7  27  West
8  44  West
Sheet 3 (Report)
   A        B        C     D
1  Q1
2
3  Product  Region   2021  2020
4  A        East     $16   $42
5  A        Central  $126  $45
6  A        West     $57   $22

How to select a set of values in pandas data frame (multiple columns with multiple row conditions)

I have a very large CSV file, like the one below, which I opened as a DataFrame using pandas. I want to extract data from multiple columns over different date ranges.
I want to select the values of the last 3 columns from one particular date and hour to another. The slicing options I tried and found online were for a single column.
date heure PM10 NO2 O3
0 01/01/2016 1 27 22 36
1 01/01/2016 2 25 29 27
2 01/01/2016 3 26 47 10
3 01/01/2016 4 16 40 13
4 01/01/2016 5 15 34 13
5 02/01/2016 1 15 34 13
6 02/01/2016 2 15 34 13
Target output, taking data from one particular date and hour to another:
3 01/01/2016 4 16
4 01/01/2016 5 15
Thank you. The real data set is obviously far larger than this sample.
You can do this:
df_selected = df[(df['date'] >= "01/01/2016") &
                 (df['heure'] >= 4) &
                 (df['date'] < "02/01/2016") &
                 (df['heure'] < 6)
                ].iloc[:, :3]  # first three columns: date, heure, PM10
Alternatively, for the column selection you can use .loc[:,['name', 'of', 'columns']], or .iloc[:,-n:] for the last n columns.
Be careful with the date comparison: the column holds strings in day/month/year order, so the comparison above is lexicographic rather than chronological. Converting first with df['date'] = pd.to_datetime(df.date, dayfirst=True) and comparing against timestamps avoids that problem.
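A minimal sketch of that conversion, assuming the dates are day/month/year and that `heure` is the hour of the day. The `timestamp` column and the `start`/`end` bounds are names introduced here for illustration; the data is the sample from the question.

```python
import pandas as pd

# Sample rows from the question.
df = pd.DataFrame({
    "date": ["01/01/2016"] * 5 + ["02/01/2016"] * 2,
    "heure": [1, 2, 3, 4, 5, 1, 2],
    "PM10": [27, 25, 26, 16, 15, 15, 15],
    "NO2": [22, 29, 47, 40, 34, 34, 34],
    "O3": [36, 27, 10, 13, 13, 13, 13],
})

# Combine date and hour into one timestamp so a single range test is enough.
df["timestamp"] = (
    pd.to_datetime(df["date"], format="%d/%m/%Y")
    + pd.to_timedelta(df["heure"], unit="h")
)

start = pd.Timestamp("2016-01-01 04:00")
end = pd.Timestamp("2016-01-01 05:00")

# between() includes both bounds by default.
mask = df["timestamp"].between(start, end)
selected = df.loc[mask, ["date", "heure", "PM10"]]
print(selected)
```

Because the range is expressed as two timestamps, it can span several days without the hour condition cutting out rows in between.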

Display rows as blank if another cell is empty

I have the following 'hospitals' sheet in excel:
A B C D E
1 Regions Region 1 Region 2 Region 3 Region 4
2 Region 1 Hospital 1 Hospital 6 Hospital 11 Hospital 15
3 Region 2 Hospital 2 Hospital 7 Hospital 12 Hospital 16
4 Region 3 Hospital 3 Hospital 8 Hospital 13 Hospital 17
5 Region 4 Hospital 4 Hospital 9 Hospital 14 Hospital 18
6 Region 5 Hospital 5 Hospital 10
7 Region 6
8 Region 7
9 Region 8
On my 'report' sheet, I have the following table set up with column headers 'Region' in A6 and 'Hospital' in B6:
A B C D E
6 Region Hospital Dept Admissions Discharges
7 Region 1 Hospital 1 A&E 24 12
8
9
10
11 Hospital 2 Opth 45 76
12
13
14
15 Hospital 3
16
17
18
19 Hospital 4
20
21
22
23 Hospital 5
24
A7 in the table above is a drop-down menu with the values from A2-A9 from my 'hospitals' sheet. When this is entered, I'd like to return a list of hospitals from that particular region in cells B7, B11, B15, B19, B23 etc of my 'report' sheet.
However, when it gets to the last hospital in the respective column on the 'hospitals' sheet, I would then like the formatting of columns A:E on the report sheet to appear as blank, rather than have zeros or #N/A values in columns C-E of my report sheet. Is this something that can be done in VBA?
To summarise, basically, I need some code for the workbook that will show the following range of cells report!A27:E30 to be empty/blank of all formatting if there is no value in 'hospitals!B7'. i.e. when the formulas in column B of the 'report' sheet stop pulling values from column b of the 'hospitals' sheet, everything below this will appear empty.
I'm not sure this is possible.
VBA is not required for the task.
For Sheet 'report' the formula in cells:
B7: =IFERROR("Hospital "&(RIGHT($A$7,1)/1-1)*5+1,"")
B11: =IFERROR("Hospital "&(RIGHT($A$7,1)/1-1)*5+2,"")
B15: =IFERROR("Hospital "&(RIGHT($A$7,1)/1-1)*5+3,"")
B19: =IFERROR("Hospital "&(RIGHT($A$7,1)/1-1)*5+4,"")
B23: =IFERROR(IF(RIGHT($A$7,1)/1<3,"Hospital "&(RIGHT($A$7,1)/1-1)*5+5,""),"")
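As an alternative way to think about the task, the sketch below looks the hospitals up from the table instead of computing their numbers, and pads the unused slots with empty strings so nothing is shown for them. It is Python rather than a worksheet formula; `hospitals_by_region` mirrors columns B:E of the 'hospitals' sheet, and `report_slots` is a hypothetical helper.

```python
# Columns B:E of the 'hospitals' sheet, keyed by the region in row 1.
hospitals_by_region = {
    "Region 1": ["Hospital 1", "Hospital 2", "Hospital 3", "Hospital 4", "Hospital 5"],
    "Region 2": ["Hospital 6", "Hospital 7", "Hospital 8", "Hospital 9", "Hospital 10"],
    "Region 3": ["Hospital 11", "Hospital 12", "Hospital 13", "Hospital 14"],
    "Region 4": ["Hospital 15", "Hospital 16", "Hospital 17", "Hospital 18"],
}


def report_slots(region, slots=5):
    """Return exactly `slots` entries: hospital names first, then blanks."""
    names = hospitals_by_region.get(region, [])
    return (names + [""] * slots)[:slots]


print(report_slots("Region 4"))
print(report_slots("Region 7"))
```

The five returned entries correspond to cells B7, B11, B15, B19 and B23 of the report; a region without a column in the table yields only blanks.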

Sum number according to date and name in excel

To sum the third column (number of companies) I've used this
=SUM(1/COUNTIF(Names;Names))
Names is the name of the array in column C, the formula is entered with CTRL+SHIFT+ENTER, and it works perfectly.
Now I'd like to sum the earnings, but counting each company only once and using its latest data. For example, the result should be like this
=C4+C6+C7+C8+C9+C10
(93)
Thanks
A B C D
1 # company earnings date
2 1 ISB 12 10/11/2011
3 2 DTN 15 11/11/2011
4 3 ABC 13 12/11/2011
5 4 ISB 17 13/11/2011
6 5 RTV 18 14/11/2011
7 6 DTN 22 15/11/2011
8 7 PVS 11 16/11/2011
9 8 ISB 19 17/11/2011
10 9 ANH 10 18/11/2011
Sum 6 93
Assuming ascending dates, you could try with CTRL+SHIFT+ENTER in C11:
=SUM((MAX(A2:A10)-MATCH(B2:B10,LOOKUP(MAX(A2:A10)-A2:A10,A2:A10-1,B2:B10),0)=A2:A10-1)*C2:C10)
I'd suggest using a helper column as the easiest approach. In E2 use this formula
=IF(COUNTIF(B2:B$1000,B2)=1,C2,"")
and copy it down the column. Now sum column E for the required answer.
Note that the above formula assumes 1000 rows of data maximum, increase if required.
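The "last occurrence per company" logic behind both answers can be checked with a short Python sketch. It uses the rows from the question and assumes, as the first answer does, that they are sorted by ascending date; the variable names are illustrative.

```python
# (row number, company, earnings), in ascending date order as in the question.
rows = [
    (1, "ISB", 12), (2, "DTN", 15), (3, "ABC", 13),
    (4, "ISB", 17), (5, "RTV", 18), (6, "DTN", 22),
    (7, "PVS", 11), (8, "ISB", 19), (9, "ANH", 10),
]

# Later rows overwrite earlier ones, so each company keeps its latest earnings.
latest = {}
for _, company, earnings in rows:
    latest[company] = earnings

company_count = len(latest)
total = sum(latest.values())
print(company_count, total)
```

The helper-column formula does the same thing from the other direction: it keeps a row's earnings only when no later row mentions the same company.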
