How to count distinct values in Excel for a matrix form - excel

I checked whether this has been asked before, but could not find an exact match.
I've been trying to count distinct values.
I tried SUMPRODUCT, SUM(1/COUNTIF), etc., but so far I get nothing but a #DIV/0! error or 0.
Basically, I have two columns: Campaign_no and customer_id.
What I need is to count the unique customers for each campaign, and also the unique customers that appear in each pair of campaigns, as a sort of matrix.
The table is as follows:
Campaign_no  Cust_id
A            1
A            2
A            2
B            1
B            4
B            5
B            9
C            4
C            5
C            6
C            7
What I need is below:
Campaigns  A  B  C
A          2  1  0
B          1  4  2
C          0  2  4
As you can see, Campaign A has 2 unique customers, so the A-A cell is 2.
Campaigns A and B have one customer in common, so the A-B cell is 1.
Campaigns A and C have no customers in common, so that cell is 0.
Campaigns B and C each have 4 unique customers of their own,
but they have two customers in common, so the B-C cell is 2 (if those two were the same customer, it would have been 1).
Is there a way of calculating this without VBA or PT? I'm using Excel 2017.
Much appreciated.

Here is a solution using helper cells.
C2 is =A2&B2. Copy it to C3:C12.
D2 is =IF(ISNA(MATCH(B2,D1:$D$1,0)),B2,""). Copy it to D3:D12.
E2 is =IF($D2="","",1-ISNA(MATCH(E$1&$D2,$C$2:$C$12,0))). Copy it to E2:G12.
E15 is =SUMIFS($E$2:$E$12,E2:E12,1). Copy it to F15:G15.
E16 is =SUMIFS($F$2:$F$12,E2:E12,1). Copy it to F16:G16.
E17 is =SUMIFS($G$2:$G$12,E2:E12,1). Copy it to F17:G17.
You may be able to get away without the helper column C in Office 2017. I only have Office 365, so I couldn't test it correctly.

Here's one that you could try, but it assumes that the data is sorted into contiguous blocks in alphabetical order of campaign exactly as shown in the sample data:
=SUMPRODUCT((COUNTIFS($A$2:$A$12,F$1,$B$2:$B$12,INDEX($B$2:$B$12,MATCH($E2,$A$2:$A$12,0)):INDEX($B$2:$B$12,MATCH($E2,$A$2:$A$12,1)))>0)
/COUNTIFS($A$2:$A$12,$E2,$B$2:$B$12,INDEX($B$2:$B$12,MATCH($E2,$A$2:$A$12,0)):INDEX($B$2:$B$12,MATCH($E2,$A$2:$A$12,1))))
The idea is that you use countifs to check through each customer ID in campaign A (for example) to see if it's present in campaign B. But it's possible that a customer ID appears more than once in campaign A, so you still have to divide by the count of each customer number in campaign A to get the unique count.
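As a cross-check on the expected matrix, the same counting can be sketched outside Excel. This is a minimal Python sketch, not part of either answer: it assumes the sample rows from the question, and the names `rows` and `overlap_matrix` are hypothetical. Each campaign's customers are collected into a set, so duplicates within a campaign drop out, and each cell is the size of the intersection of two sets.

```python
from collections import defaultdict

# Sample rows from the question: (Campaign_no, Cust_id)
rows = [
    ("A", 1), ("A", 2), ("A", 2),
    ("B", 1), ("B", 4), ("B", 5), ("B", 9),
    ("C", 4), ("C", 5), ("C", 6), ("C", 7),
]

def overlap_matrix(rows):
    """Return {row_campaign: {col_campaign: shared unique customer count}}."""
    customers = defaultdict(set)
    for campaign, cust_id in rows:
        customers[campaign].add(cust_id)  # a set ignores repeated IDs
    names = sorted(customers)
    return {
        r: {c: len(customers[r] & customers[c]) for c in names}
        for r in names
    }

matrix = overlap_matrix(rows)
for campaign, cells in matrix.items():
    print(campaign, list(cells.values()))
```

The diagonal cells are each campaign intersected with itself, which is simply its unique customer count.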

Related

How do I retrieve value in Column B Given 2 Criteria on the Same Row in an excel Formula

I have a table similar to the one below and am trying to get the column B data based on criteria in columns A and C. A = UserID, B = Problem_Description, C = Cost_Priority, D = Cost. The table lists each UserID with a common problem description, the cost ranking of that problem, and its cost. This is a supplied table that I am working from.
UserID  Problem_Description  Cost_Priority  Cost
111     Problem A            1              395.00
111     Problem B            2              200.00
111     Problem C            0              150.00
111     Problem D            0              145.00
112     Problem G            1              800.73
112     Problem S            2              200.46
112     Problem T            0              100.51
The resulting table should look like the one below, where the UserID is given along with columns that define the required cost priority. The problem I am having is getting the problem description based on the static values in the UserID column together with static values of 1 for the highest-cost problem and 2 for the 2nd-highest-cost problem.
UserID  Highest Cost Problem  2nd Highest Cost Problem
111     Problem A             Problem B
112     Problem G             Problem S
I have tried using VLOOKUP to find the UserID and comparing Cost_Priority to 1 or 2 in an IF statement, but it returned the Problem_Description column in order, including rows where Cost_Priority was 0. Does anyone have other ideas for populating the 2nd and 3rd columns of the second table?
If you are on Microsoft 365, try the formula below. Drag it down and across as needed.
=INDEX(SORT(FILTER($B$2:$C$8,($A$2:$A$8=$A12)*($C$2:$C$8>0)),2,1),COLUMN(A$1),1)
For older versions of Excel, try the array formula below:
=INDEX($B$2:$B$8,MATCH(1,($A$2:$A$8=$A12)*($C$2:$C$8=COLUMN(A$1)),0))
Press CTRL+SHIFT+ENTER to evaluate the formula as it is an array formula.
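The logic of the array formula (find the first row where both the UserID and the Cost_Priority match, then return its description) can be sketched in Python for comparison. This is an illustration only, using the sample table from the question; `records` and `problem_for` are hypothetical names.

```python
# Sample table from the question:
# (UserID, Problem_Description, Cost_Priority, Cost)
records = [
    (111, "Problem A", 1, 395.00),
    (111, "Problem B", 2, 200.00),
    (111, "Problem C", 0, 150.00),
    (111, "Problem D", 0, 145.00),
    (112, "Problem G", 1, 800.73),
    (112, "Problem S", 2, 200.46),
    (112, "Problem T", 0, 100.51),
]

def problem_for(records, user_id, priority):
    """Return the description of the first row matching both criteria."""
    for uid, description, cost_priority, _cost in records:
        if uid == user_id and cost_priority == priority:
            return description
    return None  # no row matches both criteria

for user_id in (111, 112):
    print(user_id, problem_for(records, user_id, 1), problem_for(records, user_id, 2))
```

Rows with Cost_Priority 0 are never returned here because only priorities 1 and 2 are requested.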

List result of lookup A in B, B in C without helper column

I have 2 tables:
Table1 containing Customer & Part#
Table2 containing Part# & Type
(The actual data lists are larger)
Table1 (Customer & Part#) & Table3 (Helper):
Customer  Part#  Helper
A         1      X
B         2      Y
C         3      X
A         4      Y
A         5      X
A         5      X
A         2      Y
Table2:
Part#  Type
1      X
2      Y
3      X
4      Y
5      X
Desired result for combination of customer A and Type X:
Part#
1
5
5
These being the 3 results of part numbers in Table1 that are Customer A and the lookup of the Part# results in Type X (see also Helper column).
I'm able to retrieve the results by creating the helper column as shown in the example data, however I want to skip this column and solve it in one go. But I don't know if that's even possible.
I was thinking of something in this direction: =INDEX (Table1[Part'#],IF(Table1[Customer]="A",ROW(Table1[Customer]))
but that is where I get stuck. I think I could pick up from there with IF, ISNUMBER, and SEARCH, but I can't work out how.
Does anybody know a way to skip the helper column for this?
PS I have office365, but FILTER is not yet released by company rules (unfortunately).
PS I prefer a formula solution, but VBA is allowed when necessary
Here is a formula solution for Excel versions 2010 to 2019.
In I3, copied down:
=IFERROR(INDEX(B:B,AGGREGATE(15,6,ROW(A$3:A$9)/(VLOOKUP(N(IF({1},B$3:B$9)),D$3:E$7,2,0)=H$3)/(A$3:A$9=G$3),ROW(A1))),"")
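What the formula does, row by row, is look up each part's type in Table2 and keep the part only when both the customer and the looked-up type match. A minimal Python sketch of that same filter follows, assuming the sample data from the question; `table1`, `table2`, and `parts_for` are hypothetical names, and this is not a substitute for the in-sheet formula.

```python
# Table1 from the question: (Customer, Part#)
table1 = [("A", 1), ("B", 2), ("C", 3), ("A", 4), ("A", 5), ("A", 5), ("A", 2)]

# Table2 from the question: Part# -> Type
table2 = {1: "X", 2: "Y", 3: "X", 4: "Y", 5: "X"}

def parts_for(table1, table2, customer, part_type):
    """List the parts for one customer whose looked-up type matches.

    The lookup table2.get(part) replaces the helper column: the type is
    resolved on the fly instead of being stored next to each row.
    """
    return [
        part
        for cust, part in table1
        if cust == customer and table2.get(part) == part_type
    ]

print(parts_for(table1, table2, "A", "X"))
```

Duplicates are kept on purpose, since the desired result in the question lists part 5 twice.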

Count Unique Dates Associated with Location

I am trying to count the total number of unique dates for each location.
Context: I am trying to create a formula that counts the number of unique dates based on location. My spreadsheet looks like this:
   A              B              C
1  Participant    Location       Date
2  Participant-A  High School X  11/7
3  Participant-B  High School X  11/7
4  Participant-C  High School X  11/8
5  Participant-E  High School Y  11/7
6  Participant-F  High School Z  11/7
7  Participant-G  High School Z  11/8
So, for example, High School X had 2 different dates. What would the formula be to count the unique dates based on the location?
This is being done in Google Sheets.
Thank you!
Another way (with no helper columns) would be to use query() and unique().
=query(unique(B:C), "Select Col1, count(Col2) where Col1 <>'' group by Col1 label count(Col2)'# of unique dates'", 1)
With a simple helper column:
=1/COUNTIFS($A$2:$A$7,A7,$B$2:$B$7,B7)
And to get your results:
=SUMIF($A$2:$A$7,E2,$C$2:$C$7)
This is not a one-formula solution, but I think it works. First, create a helper column concatenating the columns that you want to compare. In this case, in cell D2 write:
=CONCATENATE(B2,C2)
This is for the first row of your example. Then, replicate that to the following rows.
Finally, create a formula that counts unique values:
=SUM(IF(FREQUENCY(IF(LEN(D2:D7)>0,MATCH(D2:D7,D2:D7,0),""), IF(LEN(D2:D7)>0,MATCH(D2:D7,D2:D7,0),""))>0,1))
Assuming your new column of concatenated values is at D2:D7.
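All three answers reduce to the same idea: deduplicate the (location, date) pairs, then count per location. A short Python sketch of that idea, using the sample rows from the question, may help when checking a formula's output; `rows` and `unique_dates_per_location` are hypothetical names.

```python
from collections import defaultdict

# Sample rows from the question: (Participant, Location, Date)
rows = [
    ("Participant-A", "High School X", "11/7"),
    ("Participant-B", "High School X", "11/7"),
    ("Participant-C", "High School X", "11/8"),
    ("Participant-E", "High School Y", "11/7"),
    ("Participant-F", "High School Z", "11/7"),
    ("Participant-G", "High School Z", "11/8"),
]

def unique_dates_per_location(rows):
    """Map each location to the number of distinct dates seen for it."""
    dates = defaultdict(set)
    for _participant, location, date in rows:
        dates[location].add(date)  # repeated dates collapse in the set
    return {location: len(seen) for location, seen in dates.items()}

counts = unique_dates_per_location(rows)
for location, n in counts.items():
    print(location, n)
```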

Count occurrences of strings just once per row in Google Sheets

I have strings of spreadsheet data that need counting by 'type' but not instance.
A B C D
1 Lin 1 2 1
2 Tom 1 4 2
3 Sue 3 1 4
The correct sum of students assigned to teacher 1 is 3, not 4. That teacher 1 meets Lin in lessons B and D is irrelevant to the count.
I borrowed a formula which works in Excel but not in Google Sheets where I and others need to keep and manipulate the data.
F5=SUMPRODUCT(SIGN(COUNTIF(OFFSET(B$2:D$2, ROW($2:$4)-1, 0), E5)))
A B C D E
2 Lin 1 2 1
3 Tom 1 4 2
4 Sue 3 1 4
5 1 [exact string being searched for, ie a teacher name]
I don't know what is not being understood by Google Sheets in that formula. Does anyone know the correct expression to use, or a more efficient way to get the accurate count I need, without duplicates within rows inflating the count?
This is the MMULT approach, which works by finding the row totals of students assigned to teacher 1 etc., then counting how many of the totals are greater than 0.
=ArrayFormula(sum(--(mmult(n(B2:D4=E5),transpose(column(B2:D4)))>0)))
or
=ArrayFormula(sum(sign(mmult(n(B2:D4=E5),transpose(column(B2:D4))))))
This also works in Excel if entered as an array formula without the ArrayFormula wrapper.
A formula specific to Google Sheets can be quite short:
=ArrayFormula(COUNTUNIQUE((B2:D4=E5)*row(B2:D4)))-1
counting the unique rows containing a match.
Note - I am subtracting 1 in the last formula above because I am assuming there is at least one zero (non-match) which should be ignored. This would fail in the extreme case where all students in all classes are assigned to the same teacher so you have a matrix (e.g.) of all 1's. This would be more theoretically correct:
=ArrayFormula(COUNTUNIQUE(if(B2:D4=E5,row(B2:D4),"")))
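The counting rule behind these formulas (a row contributes 1 if the teacher appears anywhere in it, however many times) can be sketched in Python as a sanity check. This is a minimal sketch over the sample grid from the question; `grid` and `count_rows_with` are hypothetical names.

```python
# Sample grid from the question: student name, then the teacher for each lesson
grid = [
    ["Lin", 1, 2, 1],
    ["Tom", 1, 4, 2],
    ["Sue", 3, 1, 4],
]

def count_rows_with(grid, teacher):
    """Count rows where the teacher appears at least once among the lessons."""
    return sum(1 for _name, *lessons in grid if teacher in lessons)

print(count_rows_with(grid, 1))
```

Because each row is tested for membership rather than summed, Lin's two lessons with teacher 1 count only once.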

How to get the latest date with same ID in Excel

I want to get the record with the most recent date for each ID, as the same ID can have several different dates. I need to pick the BOLD values. Below is sample data; the original data consists of 10,000 records.
ID Date
5 25/02/2014
5 7/02/2014
5 6/12/2013
5 25/11/2013
5 4/11/2013
3 5/05/2013
3 19/02/2013
3 12/11/2012
1 7/03/2013
2 24/09/2012
2 7/09/2012
4 6/12/2013
4 19/04/2013
4 31/03/2013
4 26/08/2012
What I would do is in column B use this formula and fill down
=LEFT(A1,1)
in column C
=DATEVALUE(MID(A1,2,99))
then filter column B to a specific value of interest and sort by column C to order these values by date.
Edit: Even easier, do a two-level sort by B, then by C from newest to oldest. The first row for each B value is then the newest.
Do you need a programmatic / formula only solution or can you use a workflow? If a workflow will work, then how about this:
Construct a pivot table of your data
Make the Rows Labels the ID
Make the Values Max of Date
The resulting table is your answer.
Row Labels Max of Date
1 07/03/13
2 24/09/12
3 05/05/13
4 06/12/13
5 25/02/14
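The pivot-table result above can be cross-checked with a short script. The following is a sketch only, assuming the dates are day/month/year text as in the sample; `records` and `latest_per_id` are hypothetical names.

```python
from datetime import datetime

# Sample data from the question: (ID, date as dd/mm/yyyy text)
records = [
    (5, "25/02/2014"), (5, "7/02/2014"), (5, "6/12/2013"),
    (5, "25/11/2013"), (5, "4/11/2013"),
    (3, "5/05/2013"), (3, "19/02/2013"), (3, "12/11/2012"),
    (1, "7/03/2013"),
    (2, "24/09/2012"), (2, "7/09/2012"),
    (4, "6/12/2013"), (4, "19/04/2013"), (4, "31/03/2013"), (4, "26/08/2012"),
]

def latest_per_id(records):
    """Return {ID: most recent date}, parsing the text as day/month/year."""
    latest = {}
    for record_id, text in records:
        parsed = datetime.strptime(text, "%d/%m/%Y").date()
        if record_id not in latest or parsed > latest[record_id]:
            latest[record_id] = parsed
    return latest

latest = latest_per_id(records)
for record_id in sorted(latest):
    print(record_id, latest[record_id].strftime("%d/%m/%y"))
```

Parsing into real dates matters here: comparing the raw text would sort "7/02/2014" after "25/02/2014".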

Resources