Excel conditional SUMPRODUCT / SUMIFS / Array Formula for optional dimension - excel

I have a sheet of data with multiple dimensions like this:
     A        B           C      D         E
1    COUNTRY  FLAVOUR     SIZE   DATE      SALES ($)
2    Japan    Strawberry  100ml  10/12/14  100
3    Japan    Banana      100ml  10/03/15  100
4    China    Orange      200ml  14/04/15  30
5    France   Strawberry  200ml  11/04/15  400
6    UK                   200ml  23/04/15  250
7    ....
I want to aggregate this data over a date range, where the summary sheet has each dimension (country & flavour), and if I do not specify a dimension it sums all rows for that dimension.
     A        B           C
1    COUNTRY  FLAVOUR     SALES TOTAL
2    Japan    Strawberry  100
3    Japan                200
4             Strawberry  500
I can do this if all the dimensions are present (i.e. row 2 above) using a SUMPRODUCT or SUMIFS:
=SUMPRODUCT((data!A$2:A$100=A2)*(data!B$2:B$100=B2)*(data!D$2:D$100>[start_date])*(data!D$2:D$100<[end_date])*(data!E$2:E$100))
However I have not been able to figure out how to include all rows for a dimension if that input cell is empty (e.g. row 3 above). I tried:
Adding an IF statement or OR statement within the criteria (e.g. OR(data!A$2:A$100=A1,isblank(A1))).
Using a + in a SUMPRODUCT as an OR statement, (per this answer https://stackoverflow.com/a/27536131/1450420)
One solution is to have different branches of the formula depending on which summary dimensions are present, but that would quickly get out of control if I extend this same behaviour to further dimensions like Size.
Any help appreciated!
(I'm running Excel Mac 2011).
EDIT
Per @BrakNicku's comment, one of the formulas I tried was =SUMPRODUCT(((data!A$2:A$100=A2)+ISBLANK(A2))*((data!B$2:B$100=B2)+ISBLANK(B2))*(data!E$2:E$100))
The reason this doesn't work is that sometimes my data has blank attributes (edited above). For some reason this formula double-counts rows where the attribute present matches (e.g. data!A6) but the other attribute is missing (e.g. data!B6).
EDIT 2
I can see why this double-counting happens: the + adds the match from data!A$2:A$100=A2 (both cells are blank, so they match) to the match from ISBLANK(A2) (which is indeed blank), giving 2 instead of 1. The question remains how to achieve this without double-counting. If needed, a workaround could be to fill all blank cells in my data with some placeholder value.

The reason for double-counting values is here:
((data!A$2:A$100=A2)+ISBLANK(A2))
If a cell in column A is blank and A2 is blank as well, both terms of this sum equal 1, so the factor becomes 2 and the row is counted twice. To get rid of this problem you can change it to:
(((data!A$2:A$100=A2)+ISBLANK(A2))>0)
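The effect of the >0 clamp can be checked outside Excel. The following is only a rough Python model of the two formulas, assuming an empty string stands in for a blank cell and using the sample rows from the question; the function names naive_total and fixed_total are made up for illustration.

```python
# Toy rows mirroring the sample data: (country, flavour, sales).
# An empty string stands in for a blank cell (an assumption of this sketch).
rows = [
    ("Japan", "Strawberry", 100),
    ("Japan", "Banana", 100),
    ("China", "Orange", 30),
    ("France", "Strawberry", 400),
    ("UK", "", 250),
]


def naive_total(country, flavour):
    # Models ((range=criterion)+ISBLANK(criterion)): each factor can reach 2
    # when the data cell and the criterion cell are both blank.
    return sum(
        ((c == country) + (country == "")) * ((f == flavour) + (flavour == "")) * s
        for c, f, s in rows
    )


def fixed_total(country, flavour):
    # Models (((range=criterion)+ISBLANK(criterion))>0): each factor is 0 or 1.
    return sum(
        (((c == country) + (country == "")) > 0)
        * (((f == flavour) + (flavour == "")) > 0)
        * s
        for c, f, s in rows
    )


print(naive_total("UK", ""))   # 500: the UK row is counted twice
print(fixed_total("UK", ""))   # 250
```

With the clamp in place, fixed_total("Japan", "") gives 200 and fixed_total("", "Strawberry") gives 500, matching the summary table in the question.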

Try this (I only included the first two, I left the dates out):
=SUMPRODUCT((((Data!$A$2:$A$5=A2)+(A2=""))>0)*(((Data!$B$2:$B$5=B2)+(B2=""))>0)*(Data!$E$2:$E$5))

Related

Sumproduct - counting equal pairs of numbers (and filtering them)

In columns D&E I have a list of scores for a game, where D is points for and E is points against, like so
D E
1 3
4 2
3 3
3 1
I'm trying to create a formula that displays a win / draw / loss record based on whether column D is larger, equal to or smaller than column E. In this example it would display 2 / 1 / 1.
So far I have this
=(SUMPRODUCT(--(D12:D200>E12:E200)))&" / "&SUMPRODUCT(--(D12:D200=E12:E200))&" / "&(SUMPRODUCT(--(D12:D200<E12:E200)))
But there are two issues. One is that all the blank rows are being counted as equals, so the result is coming out as 2 / 186 / 1.
The second is that in another column I have a list of days of the week, and I would like to be able to filter out rows by day and have the results reflect this. I have different formulas using SUBTOTAL instead of SUM to count overall number of points, which works fine. But I don't know what the equivalent change I need to make would be for my formula. Any help would be appreciated.
As for your first issue, your formula indeed takes blanks into account and treats them as equal. You can adjust your middle SUMPRODUCT formula to omit the blanks, like this:
=SUMPRODUCT(ISNUMBER(D12:D200)*(--(D12:D200=E12:E200)))
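As a rough illustration of why the ISNUMBER guard matters, here is a small Python model of the draw count, assuming None stands in for an empty cell. The helper names naive_draws and guarded_draws are hypothetical and only mirror the two versions of the middle SUMPRODUCT.

```python
# Score rows as (points_for, points_against); None models an empty cell.
rows = [(1, 3), (4, 2), (3, 3), (3, 1), (None, None), (None, None)]


def naive_draws(rows):
    # Models SUMPRODUCT(--(D=E)): two empty cells compare as equal.
    return sum(d == e for d, e in rows)


def guarded_draws(rows):
    # Models SUMPRODUCT(ISNUMBER(D)*(--(D=E))): only rows with a number in D count.
    return sum(isinstance(d, (int, float)) and d == e for d, e in rows)


print(naive_draws(rows))    # 3: one real draw plus two blank rows
print(guarded_draws(rows))  # 1
```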
The second question is regarding filtering out rows by the day of the week. Here's the view before "Day" filter is applied - as you can see we have 5 wins (blue), 4 draws (orange) and 3 losses (green).
You need to use the following formula to make SUMPRODUCT dynamic (i.e. it will react to filtering out rows):
=SUMPRODUCT(SUBTOTAL(3,OFFSET(F12:F200,ROW(F12:F200)-ROW(F12),,1)),--(D12:D200>E12:E200))&" / "&SUMPRODUCT(SUBTOTAL(3,OFFSET(F12:F200,ROW(F12:F200)-ROW(F12),,1)),ISNUMBER(D12:D200)*(--(D12:D200=E12:E200)))&" / "&SUMPRODUCT(SUBTOTAL(3,OFFSET(F12:F200,ROW(F12:F200)-ROW(F12),,1)),--(D12:D200<E12:E200))
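The idea behind the formula is that SUBTOTAL(3,OFFSET(...)) yields one value per row, 1 if the row is visible and 0 if it is filtered out, and that vector is multiplied into each comparison. A minimal Python sketch of that masking, assuming a hypothetical day column and a set of days left visible by the filter:

```python
# Rows as (day, points_for, points_against); the day values are made up.
rows = [("Mon", 1, 3), ("Tue", 4, 2), ("Mon", 3, 3), ("Wed", 3, 1)]


def record(rows, visible_days):
    # One 0/1 flag per row, playing the role of SUBTOTAL(3, OFFSET(...)).
    visible = [1 if day in visible_days else 0 for day, _, _ in rows]
    wins = sum(v * (d > e) for v, (_, d, e) in zip(visible, rows))
    draws = sum(v * (d == e) for v, (_, d, e) in zip(visible, rows))
    losses = sum(v * (d < e) for v, (_, d, e) in zip(visible, rows))
    return f"{wins} / {draws} / {losses}"


print(record(rows, {"Mon", "Tue", "Wed"}))  # 2 / 1 / 1
print(record(rows, {"Mon"}))                # 0 / 1 / 1
```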
Here's the result just for Monday:

How to create a dynamic formula to find the average of a set of values for a given vector

I am trying to create a formula that gives me the average of the last 12 entries in a given dataset depending on the associated vector.
Let's make an example:
In cells F2, G2, H2 and I2 I have the headers Date, Company1, Company2 and Company3 respectively. Then, from row 3 to row 33, I have monthly dates starting from May 2016.
Date Company1 Company2 Company3
May-16 2,453,845
Jun-16 13,099,823
Jul-16 14,159,037
Aug-16 38,589,050 8,866,101
Sep-16 63,290,285 13,242,522
Oct-16 94,005,364 14,841,793
Nov-16 123,774,792 7,903,600 41,489,883
Dec-16 93,355,037 12,449,604 69,117,105
Jan-17 47,869,982 13,830,712 83,913,764
Feb-17 77,109,905 10,361,555 68,176,643
The goal is to create a formula that, when I drag it down, correctly calculates the average of the last 12 values for a given company.
So, for example, I would have, say in range "B2:C5":
Company1 76,856,345
Company2 11,120,859
Company3 65,674,349
And, if a new Company4 is added to the list, I just have to drag the formula down to calculate the average of the last 12 months for Company4.
So far, I have come up with this formula:
=AVERAGE(LOOKUP(LARGE(IF(ISNUMBER(G:G),ROW(G:G)),ROW(INDIRECT("1:"&MIN(12,COUNT(G:G))))),ROW(G:G),G:G ))
This formula correctly calculates the average of a given column, considering only the last 12 values. The last step would be to come up with a formula that includes all the columns and then calculates the average for the given company.
Thanks!
I recommend that you use a named range to define your data in columns G:I. When a company is added, just modify the named range's specs. I used the name Target. Of course, you can replace it with $G:$I if you feel so inclined but I would rather recommend reducing the number of rows in the range, which is easier to manage when it is named.
Use the formula below to extract the company names from the first row of Target into the first column of your averages table. This is to ensure that the names are spelled identically in both locations.
=INDEX(Target,1,ROW()-2)
The number 2 is the number of rows above the row containing the formula. It is copied here from cell M3, where ROW()-2 evaluates to 1; the result then counts up sequentially as the formula is copied down.
Now I have the formula below in my cell N3 and copied down.
=SUM(INDEX(Target,0,MATCH($M3,INDEX(Target,1,0),0)))
The formula simply sums up the columns G, H, and I in 3 consecutive rows.
In the final step I inserted the range definition established above, meaning excluding the SUM() function, into your existing formula.
=AVERAGE(LOOKUP(LARGE(IF(ISNUMBER(INDEX(Target,0,MATCH($M3,INDEX(Target,1,0),0))),ROW(INDEX(Target,0,MATCH($M3,INDEX(Target,1,0),0)))),ROW(INDIRECT("1:"&MIN(12,COUNT(INDEX(Target,0,MATCH($M3,INDEX(Target,1,0),0))))))),ROW(INDEX(Target,0,MATCH($M3,INDEX(Target,1,0),0))),INDEX(Target,0,MATCH($M3,INDEX(Target,1,0),0))))
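The nested formula is hard to read, so here is a small Python sketch of what it is meant to compute: pick the column whose header matches the company name, keep only the numeric cells, and average the last 12 of them (or fewer if fewer exist). The table layout, the function name last_n_average and the sample numbers are assumptions for illustration, not the data from the question.

```python
def last_n_average(table, company, n=12):
    """Average the last n numeric values in the column headed by `company`."""
    header, *body = table
    col = header.index(company)  # plays the role of MATCH(name, header row, 0)
    # Keep numeric cells only, like the ISNUMBER test; None models an empty cell.
    values = [row[col] for row in body if isinstance(row[col], (int, float))]
    tail = values[-n:]           # the last n entries, or all of them if fewer
    return sum(tail) / len(tail) if tail else None


# Hypothetical mini-table: the first row holds the company names.
table = [
    ["Company1", "Company2"],
    [10, None],
    [20, 5],
    [30, None],
    [40, 15],
]

print(last_n_average(table, "Company1", n=2))  # 35.0
print(last_n_average(table, "Company2"))       # 10.0
```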

Getting the next higher value with VLOOKUP or INDEX/MATCH

I have the following Excel spreadsheet:
A B C D
1 0 Product 1 7.500 Product 4
2 1.000 Product 2
3 5.000 Product 3
4 10.000 Product 4
5
In cell C1 I type an arbitrary number (in this case 7.500). I want cell D1 to show the product corresponding to the value in C1. Since 7.500 does not exist in column A, the next higher value should be used, in this case 10.000, which belongs to Product 4.
I tried the following formula in cell D1, but instead of Product 4 I get #NV as a result.
=INDEX(A1:B4;MATCH(C1;A1:A4;-1);2)
The only solution I found so far was changing the values in Column A from ascending to descending. However, I would prefer to have a solution which does not require a change of the order in the list.
Do you have any idea how to solve this issue without changing the order in the list?
For unsorted data you can use the formula below:
=INDEX(B1:B4,MATCH(SMALL($A$1:$A$4,COUNTIF($A$1:$A$4,"<"&C1)+1),A1:A4,0))
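The formula works in three steps: COUNTIF counts how many thresholds are below the input, SMALL picks the next one in sorted order, and MATCH with INDEX maps it back to the product. A rough Python equivalent, assuming the table is a list of (threshold, product) pairs in arbitrary order; returning None when the input exceeds every threshold is a choice made for this sketch only.

```python
def next_higher(table, x):
    """Return the product whose threshold is the smallest one >= x."""
    keys = [k for k, _ in table]
    rank = sum(k < x for k in keys) + 1   # COUNTIF(range, "<"&x) + 1
    if rank > len(keys):
        return None                       # x is above every threshold
    target = sorted(keys)[rank - 1]       # SMALL(range, rank)
    position = keys.index(target)         # MATCH(target, range, 0)
    return table[position][1]            # INDEX(products, position)


# Deliberately unsorted thresholds.
table = [(5000, "Product 3"), (0, "Product 1"), (10000, "Product 4"), (1000, "Product 2")]

print(next_higher(table, 7500))  # Product 4
print(next_higher(table, 1000))  # Product 2
```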

Highlight duplicates, ignoring same row

I have a worksheet containing names in 2 dimensions. Each row represents a general location, every other column represents a specific slot in that location (each location has the same number of available slots), alternating with a parameter belonging to that name. There is a name in each cell. Here's a simplified version to show what my data looks like:
Location 0 ( ) 1 ( ) 2 ( ) 3 ( )
Garden Tim 3 Pete 1 Oscar 1 Lucy 2
Room1 Lucy 1 Tim 1 Lucy 5 Anna 1
Kitchen Frank 1 Frank 2 Frank 1 Lucy 1
What I want to achieve is to highlight (using conditional formatting, I'm open to alternative methods though) each entry that also appears in another row. So basically it should highlight duplicates, but ignore duplicates in the same row. The first row and column are to be excluded from the operation (no big deal, I just don't select them), as are the parameter columns (this is a big deal, as this pretty much breaks everything I've tried including the first answers given). I have access to the entire meaningful data area (all cells containing names) by the name "entries" and all meaningful entries in a given row by the name "row".
In my example above, all Tim and Lucy entries should be highlighted because they have duplicates in other rows. Pete, Oscar and Anna are unique, so they're not highlighted. Frank, while having duplicates, only has them in the same row, no other row contains Frank, so he should not be highlighted. Excel's own highlight duplicates would highlight Frank, while handling all the others correctly.
How can I modify the conditional formatting's behaviour to ignore duplicates in the same row?
The following formula (thanks to @Dave) resulted in a #VALUE! error:
=(COUNTIF(entries;B2)-COUNTIF(row;B2))>0
Or you could just use the following (there is no need for an IF() in the conditional formatting formula box):
=COUNTIF($B$2:$I$4;B2)>COUNTIF($B2:$I2;B2)
This single formula should prevent the parameters from being highlighted. Select B2:I2 and put this (exactly) in the conditional formatting box:
=AND(NOT(ISNUMBER(B2));COUNTIF($B$2:$I$4;B2)>COUNTIF($B2:$I2;B2))
Something like this:
=(COUNTIF($B$2:$E$4,B2)-COUNTIF($B2:$E2,B2))>0
The first COUNTIF counts all instances in the whole range; the second counts the instances in the current row and is subtracted from the first. If there are more instances in the entire range than in the row, the formula returns TRUE.
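The same comparison (count in the whole range versus count in the current row) can be sketched in Python to check it against the example from the question. The dictionary layout and the function name highlighted are assumptions; the parameter columns are left out here, which corresponds to the ISNUMBER guard in the formula further up.

```python
from collections import Counter

# Names per location, with the numeric parameter columns omitted.
grid = {
    "Garden": ["Tim", "Pete", "Oscar", "Lucy"],
    "Room1": ["Lucy", "Tim", "Lucy", "Anna"],
    "Kitchen": ["Frank", "Frank", "Frank", "Lucy"],
}


def highlighted(grid):
    # Count over the whole range, like COUNTIF($B$2:$I$4, B2).
    overall = Counter(name for row in grid.values() for name in row)
    # A name is highlighted if it occurs more often overall than in its own row.
    return {
        name
        for row in grid.values()
        for name in row
        if overall[name] > row.count(name)
    }


print(sorted(highlighted(grid)))  # ['Lucy', 'Tim']
```

Frank appears three times, but only in the Kitchen row, so the overall count equals the row count and he is not highlighted.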

Excel count unique occurrences of a text entry based on a status contained in a separate column

Alright, this is driving me insane...
I have a section of data in a spreadsheet that looks like this:
Column A Column B Column C
lksdf-46-we-32 Fire 1
lksdf-46-we-32 Fire 2
lksdf-46-we-32 Fire 3
lksdf-46-we-32 Fire 4
wgw3f-18-bw-11 Ice 1
wgw3f-18-bw-11 Ice 2
wgw3f-18-bw-11 Ice 3
wgw3f-18-bw-11 Ice 4
possf-12-he-91 Fire 1
possf-12-he-91 Fire 2
possf-12-he-91 Fire 3
possf-12-he-91 Fire 4
oiwen-20-lw-93 Water 1
oiwen-20-lw-93 Water 2
oiwen-20-lw-93 Water 3
oiwen-20-lw-93 Water 4
In another spreadsheet, named 'Variables', I have a lookup category that looks something like this:
Column A
Fire
Water
I need to find the number of distinct entries in column A of the raw data sheet where column B matches any entry in column A of the Variables sheet. What I'm looking for is an excel formula, but everything I've tried either returns duplicates (as a starting point) or returns 0. Also, could you please explain in detail how the query works in excel? I'm a fairly experienced programmer, but I'm having a heck of a time wrapping my head around these functions in excel that I've been tasked to finish by the end of the day.
Try this "array formula" somewhere in the raw data sheet
=SUM(IF(FREQUENCY(IF(ISNUMBER(MATCH(B2:B100,Variables!A:A,0)),IF(A2:A100<>"",MATCH(A2:A100,A2:A100,0))),ROW(A2:A100)-ROW(A2)+1),1))
confirmed with CTRL+SHIFT+ENTER
The formula uses the FREQUENCY function, with the "bins" being the row numbers, and counts the bins that have one or more entries. Entries are only made when the column B item matches an entry in Variables column A, and the second MATCH function ensures that the same row number (that of the first match) is entered for each repeated item in column A, which guarantees that duplicates are not counted.
This formula covers rows 2 to 100 of the raw data sheet; increase the ranges as required, but note that the formula is very "expensive" and may prove impractical with very large datasets.
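Since the asker mentions a programming background, the binning idea may be easier to follow as code. This is only a sketch of the logic, not of how Excel evaluates the formula internally; the function name distinct_matching and the list-based layout are assumptions.

```python
def distinct_matching(ids, statuses, allowed):
    """Count distinct ids whose status appears in `allowed`."""
    # Position of the first occurrence of each id, like MATCH(A2:A100, A2:A100, 0).
    first_pos = {}
    for i, value in enumerate(ids):
        first_pos.setdefault(value, i)

    # One bin per row, like FREQUENCY(..., ROW(A2:A100)-ROW(A2)+1).
    bins = [0] * len(ids)
    for value, status in zip(ids, statuses):
        # Only rows with a non-blank id and an allowed status make an entry.
        if value != "" and status in allowed:
            bins[first_pos[value]] += 1

    # SUM(IF(bin > 0, 1)): every non-empty bin is one distinct id.
    return sum(1 for count in bins if count > 0)


ids = (["lksdf-46-we-32"] * 4 + ["wgw3f-18-bw-11"] * 4
       + ["possf-12-he-91"] * 4 + ["oiwen-20-lw-93"] * 4)
statuses = ["Fire"] * 4 + ["Ice"] * 4 + ["Fire"] * 4 + ["Water"] * 4

print(distinct_matching(ids, statuses, {"Fire", "Water"}))  # 3
```

Every repeat of an id lands in the bin of its first occurrence, so each id contributes at most one non-empty bin no matter how many rows it spans.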

Resources