Count number of unique combinations of two columns - excel

I have a spreadsheet of statistics from sports games over a season, for different leagues - each row holds a single event that happened in a game, such as a penalty. There are many rows of events for each individual game. One of the columns is the league, another is the home team and another is the away team. How can I count the total number of games in a given league? In other words, I would need to count the number of unique pairs of strings from Home and Away, where League = "Ligue 1".
EDIT
I have tried:
=SUMPRODUCT(1/(COUNTIFS(E2:E81078,"Ligue 1",F2:F81078,F2:F81078,G2:G81078,G2:G81078)))
which returns a #DIV/0! error (it does work if I don't include the column E = "Ligue 1" criterion).

This is similar to your formula but deals with the division by zero:
=SUM(IFERROR((1/COUNTIFS(E2:E81078,"Ligue 1",F2:F81078,F2:F81078,G2:G81078,G2:G81078)),0))
Enter it with Ctrl+Shift+Enter rather than just Enter. If done correctly, you will see {} around the formula in the formula bar.
Try not to use ranges that are bigger than your data, because that slows these kinds of formulas down significantly.
Update
This might also work if your data is ordered the way you show in your question, with all the events of a game in consecutive rows. It counts the number of times the home team changes in the Ligue 1 data:
=SUMPRODUCT((F3:F81079<>F2:F81078)*(E2:E81078="Ligue 1"))
Note that the ranges in column F are offset by one row
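If you want to sanity-check the result outside Excel, here is a minimal Python sketch of the same idea. The sample rows and function names are made up for illustration. The second function mirrors the 1/COUNTIFS trick: every row contributes 1/n, where n is the number of rows sharing its (home, away) pair, so each distinct pair adds up to exactly 1.

```python
from collections import Counter
from fractions import Fraction

def count_games(rows, league):
    """Count distinct (home, away) pairs among rows in the given league."""
    return len({(home, away) for lg, home, away in rows if lg == league})

def count_games_reciprocal(rows, league):
    """Same count via the 1/COUNTIFS idea: sum 1/n over the league's rows."""
    counts = Counter((home, away) for lg, home, away in rows if lg == league)
    total = sum(Fraction(1, counts[(home, away)])
                for lg, home, away in rows if lg == league)
    return int(total)

# Hypothetical event rows: (league, home team, away team)
rows = [
    ("Ligue 1", "PSG", "Lyon"),
    ("Ligue 1", "PSG", "Lyon"),
    ("Ligue 1", "PSG", "Lyon"),
    ("Ligue 1", "Lyon", "PSG"),
    ("Ligue 1", "Nice", "Lille"),
    ("Ligue 1", "Nice", "Lille"),
    ("La Liga", "Real Madrid", "Barcelona"),
    ("La Liga", "Real Madrid", "Barcelona"),
]

print(count_games(rows, "Ligue 1"))
print(count_games_reciprocal(rows, "Ligue 1"))
```

Rows from other leagues are skipped entirely here, which is what the IFERROR wrapper achieves in the formula by turning their division by zero into 0.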

You can do this with a Pivot Table.
Add a "helper" column where you concatenate the two teams, preferably with a delimiter in between, e.g.:
=CONCATENATE(B2, "|", C2)
Use, for example, Teams for the column header.
Then choose Insert ► Pivot Table and be sure to select "Add this data to the Data Model".
This adds the Distinct Count option to the Value Field Settings.
Then drag "league" to the Rows area and "Teams" to the Values area, and select Distinct Count in the Value Field Settings.
You might get a table similar to below, which you can format in many different ways:

Try this:
=SUMPRODUCT(1/COUNTIFS($B$1:$B$7,B1:B7,$C$1:$C$7,C1:C7))

Related

VLOOKUP to bring data from another sheet

I have a table containing some football data, such as Country, League and Teams, plus standings information such as total matches played, wins, draws, losses, goals scored and conceded, and so on.
Here's a file download link
It contains two sheets.
First sheet is STANDINGS_EXTENDED:
I need to fill these 3 tables with the data contained in another STANDINGS worksheet.
Here's a screenshot of the STANDINGS sheet:
My aim is that once I fill in the LeagueId and Group Id (which is optional) fields, all three tables are populated with data as in this sample.
I wonder if it is possible to achieve this without VBA, but I have no clue where to start. I tried it several different ways, but I only get the first result from the STANDINGS worksheet for any league I enter.
Looking forward to your help.
Thank you!
UPDATE:
So far I could get the count of rows related to Overall, Home and Away using these formulas:
=COUNTIFS(STANDINGS!E:E;STANDINGS_EXTENDED!E1;STANDINGS!F:F;"StandingsOverall")
=COUNTIFS(STANDINGS!E:E;STANDINGS_EXTENDED!$E$1;STANDINGS!F:F;"StandingsHome")
=COUNTIFS(STANDINGS!E:E;STANDINGS_EXTENDED!$E$1;STANDINGS!F:F;"StandingsAway")
Also, what I can get is the first row of these results using this formula:
=VLOOKUP($E$1;STANDINGS!$E:$V;4;FALSE)
What I need to figure out is how to modify the above formulas so that I can fill the tables with the remaining rows.
In order to do this you need a formula in every single field of your 3 tables that links it to data on the Standings tab. That would be 13 x 3 x 20 formulas. You would therefore try to write formulas that can be copied: in the best case fewer than 13 original ones, but every field still ends up with its own formula.
Each formula would look for a unique identifier in the Standings list. I can't see any unique identifiers there but you might create them by concatenation, such as "League" + "Country" + "Position". The more detail you need the larger the formula. The key is: without a unique identifier for each row you can't retrieve data. But once a row has been identified you can get the value from any of its columns.
If your tables sometimes have 12 rows, sometimes 20, and sometimes 25 you must provide space for the possible maximum and then design your formulas to return a blank if there is nothing to display.
In conclusion, the core of your system is in the Standings table. It must be set up so that data can be retrieved from it. Ideally, your selection on the Standings Extended sheet would generate a concatenated unique identifier for a list to which you can add the fixed number in the Pos column to identify individual rows in the Standing table. As long as you can't identify rows no data can be retrieved.
Using VBA gives you more flexibility but doesn't relieve you of the task to create uniquely identifiable rows.
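To make the unique-identifier idea concrete, here is a minimal Python sketch. The field names (league_id, table_type, position, team, points) and the sample rows are hypothetical, chosen only for illustration; the actual Standings columns may differ. The composite key plays the role of the concatenated identifier, and a missing row returns a blank, as suggested above for tables of varying length.

```python
def build_index(standings):
    """Map the composite key (league_id, table_type, position) to its row."""
    index = {}
    for row in standings:
        key = (row["league_id"], row["table_type"], row["position"])
        index[key] = row
    return index

def lookup(index, league_id, table_type, position, field):
    """Return one field of the identified row, or "" if no such row exists."""
    row = index.get((league_id, table_type, position))
    return "" if row is None else row[field]

# Hypothetical standings rows
standings = [
    {"league_id": 10, "table_type": "StandingsOverall", "position": 1,
     "team": "Alpha FC", "points": 30},
    {"league_id": 10, "table_type": "StandingsOverall", "position": 2,
     "team": "Beta United", "points": 27},
    {"league_id": 10, "table_type": "StandingsHome", "position": 1,
     "team": "Beta United", "points": 18},
]

index = build_index(standings)
print(lookup(index, 10, "StandingsOverall", 2, "team"))
print(repr(lookup(index, 10, "StandingsAway", 1, "team")))
```

In the workbook itself, the equivalent would be a helper column holding the concatenated key, which each table cell then looks up.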

How do you count the occurrence of same data set in two columns-set in excel?

How can we count the occurrences of each set of data? For example, I want to check how many times the customer country in column A appears alongside the country in column B (i.e. how many times does Australia-Australia occur in columns A and B?). The unique combinations are placed on the right-hand side of the sheet. I have found the unique sets and want to count how many times each one occurs.
You asked for a formula, but a pivot table can do the same thing faster, and without requiring you to create the table of unique countries first (the option is under Insert, usually the first button in the ribbon):
This is what it looks like after pulling the fields into the right 'boxes', with the 'Tabular' report layout selected and the subtotals turned off.
You can make 'Australia' repeat itself too under report layout if you so wish.
Again, SUMPRODUCT is your friend:
=SUMPRODUCT(--(($A$2:$A$11&$B$2:$B$11)=(D2&E2)))
You can use COUNTIFS function as below.
=COUNTIFS(A:A,D2,B:B,E2)
Adjust the ranges to suit your data and copy down.
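For comparison, a minimal Python sketch of the same pair counting, using made-up country data. It counts (A, B) tuples directly, which corresponds to the COUNTIFS approach. Note that joining the two columns without a delimiter, as the SUMPRODUCT version does with &, can in principle merge different pairs into one key, as the last lines illustrate.

```python
from collections import Counter

def pair_counts(col_a, col_b):
    """Count how often each (A, B) combination occurs, row by row."""
    return Counter(zip(col_a, col_b))

# Hypothetical sample data for columns A and B
col_a = ["Australia", "Australia", "Australia", "India", "India"]
col_b = ["Australia", "India", "Australia", "India", "Australia"]

counts = pair_counts(col_a, col_b)
print(counts[("Australia", "Australia")])
print(counts[("India", "Australia")])

# Without a delimiter, two different pairs can produce the same joined text
print("ab" + "c" == "a" + "bc")
```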

Count values for each row with a unique ID

I have a bunch of rows in a table. Each row reflects an event in a patient. However, one patient can have experienced multiple events, so it's possible for there to be multiple rows with the same patient number. Now I'd like to count the number of male patients in my database, counting each patient only once even if they had multiple events. Each patient is identified by a unique patient ID that could be used for this.
This shouldn't be all that complicated if not for the fact that I'm using a table that also has several filters, so I need to use SUBTOTAL for any counting functions.
I literally have no idea where to start, so I can't really provide any code...
Any function that could point me in the right direction would be greatly appreciated.
Thanks for the help.
~Laurens
Use a Pivot Table to filter and count the patients in your database. Select your data and choose Insert -> Tables -> PivotTable. Put your filters in the Filters section of the table and the Patient ID in the Rows section. Then you can use COUNT to get the number of patients.
For more information about Pivot Tables, you can check this: https://support.office.com/en-us/article/Create-a-PivotTable-to-analyze-worksheet-data-a9a84538-bfe9-40a9-a8e9-f99134456576
To get the number of unique IDs in the same column, if the IDs are numeric, you can use SUM with FREQUENCY:
=SUM(IF(FREQUENCY($A$1:$A$1000,$A$1:$A$1000)>0,1))
If they're text and numbers mixed, you can get unique IDs with this one:
=SUM(IF(FREQUENCY(MATCH($A$1:$A$1000,$A$1:$A$1000,0),MATCH($A$1:$A$1000,$A$1:$A$1000,0))>0,1))
(From here)
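The logic behind the MATCH/FREQUENCY combination can be sketched in a few lines of Python. This is only an illustration with made-up IDs, and it assumes there are no blank cells: MATCH returns the position of the first occurrence of each value, and FREQUENCY then tells you how many different first positions exist.

```python
def count_distinct_via_first_position(ids):
    """Count distinct values by collecting each value's first position."""
    first_pos = {}
    for pos, value in enumerate(ids):
        # Like MATCH(value, ids, 0): remember only the first occurrence
        first_pos.setdefault(value, pos)
    positions = [first_pos[value] for value in ids]
    # Like FREQUENCY(...) > 0: how many different positions were hit
    return len(set(positions))

# Hypothetical mixed text/number patient IDs
ids = ["P1", "P1", 17, "P2", 17, "P3", "P1"]
print(count_distinct_via_first_position(ids))
```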
Here you go
You haven't mentioned whether an event is optional.
You might want to add an extra column H with a formula like H2: =IF(C2="",0,1), which gives 1 or 0, and multiply it into G as well.
Basically, if column G contains a 1, you include that row.
Here's what the results of the formula look like:
Revision
The table is sorted by patient ID.
Whenever the patient ID changes, column H contains a 1; otherwise it is 0.
So H2 is hard-coded to 1, and H3, H4 and H6 will evaluate to 1.
So now G2 = H2*E2, etc. You can filter by column H.
The beauty of mapping things into binary zeros and ones is that you can use multiplication to achieve a logical AND, while at the same time breaking a complex task into a series of steps. You can then apply a filter to the data to get the rows where column G is not zero, and see the total count. Normally I'd insert a row between the header and the data on row 2 and then have G2=SUM(G3:G9).
Sum column H for number of patients.
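Here is a minimal Python sketch of the flag-on-change approach. It assumes the rows are sorted by patient ID and uses a hypothetical 1/0 "is male" indicator as the second factor; the sample values are made up.

```python
def change_flags(sorted_ids):
    """Return 1 on the first row of each patient ID, 0 on repeat rows."""
    flags = []
    previous = object()  # sentinel that never equals a real ID
    for patient_id in sorted_ids:
        flags.append(1 if patient_id != previous else 0)
        previous = patient_id
    return flags

# Hypothetical data, sorted by patient ID
patient_ids = [101, 101, 102, 103, 103, 103, 104]
is_male = [1, 1, 0, 1, 1, 1, 1]

flags = change_flags(patient_ids)
# Multiplying the 1/0 columns acts as a logical AND
male_patients = sum(f * m for f, m in zip(flags, is_male))

print(flags)
print(sum(flags))
print(male_patients)
```

The sum of the flags is the number of patients, and the sum of the products is the number of male patients, each counted once.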

Transpose multiple rows into columns

I've come across this task and I'm stuck big time. I've tried a PivotTable but it didn't give me the desired result. The only thing that works is a manual transpose, but there are 5k-odd records.
What I'm trying to achieve here is to transpose the data from rows for the company into columns so at a later stage to be able to count the number of votes and average per company.
PivotTable can do the job. All you need is a helper column using COUNTIFS. Notice the formula in cell D2.
And the PivotTable would look like this (set to Tabular Layout)
One thing to note is that COUNTIFS can get really slow when the number of records grows to around 10k or more (or it's just my slow PC :/). When this happens, the workaround is: first sort your data, then use COUNTIFS over a limited number of cells only. For example, in cell D2 the formula would be =COUNTIFS(A2:A102,A2,B2:B102,B2), so each formula looks at only about 100 rows rather than the whole column as you fill it down.
If what you want is the number of votes and average per company, that can be done in a variety of ways.
Using a Pivot Table, drag companies to the rows area; drag rating to the values area twice. Then change the Value Field setting on one of the Ratings to Count; and on the other to Average.
Adding some formatting and various options gives you:
Or if you have a list of the Organizations (Company Names) in, let us say, G3:Gn, and your data table in columns A:C, you can use formulas:
Count: H3: =COUNTIF($B$1:$B$1000,G3)
Average: I3: =AVERAGEIF($B$1:$B$1000,G3,$C$1:$C$1000)
And fill down as far as needed.
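As a cross-check outside Excel, the same count and average per company can be sketched in Python. The company names and ratings below are made up for illustration.

```python
from collections import defaultdict

def votes_and_average(records):
    """Return {company: (number of votes, average rating)}."""
    ratings = defaultdict(list)
    for company, rating in records:
        ratings[company].append(rating)
    return {company: (len(values), sum(values) / len(values))
            for company, values in ratings.items()}

# Hypothetical (company, rating) rows
records = [("Acme", 4), ("Acme", 2), ("Globex", 5)]

summary = votes_and_average(records)
print(summary["Acme"])
print(summary["Globex"])
```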
Since you mentioned a PivotTable did not suit, and assuming RATING is in F2, please try this in G3, copied down to suit:
=IF(AND(COLUMN()-7<COUNTIF($E:$E,$E3),$E2<>$E3),OFFSET($F3,COLUMN()-7,0),"")
then drag all the formulae to the right until an entire column appears blank. Note this requires the TARGET ATTENDEE ORGANIZATION column be sorted.

How to count unique values in Excel with two conditions

I know there is already a question that has been answered about counting uniques with a condition (Count Unique values with a condition), but I want to know how to count uniques in a column with TWO conditions.
I have a dataset with dates of locations created as well as city. Each location has an owner and sometimes an owner can have multiple locations so I want to count unique owners by city and month (both already exist as columns).
How can I do this?
The formula I suggested in the link is this
=SUM(IF(FREQUENCY(IF(B2:B100=1,IF(A2:A100<>"",MATCH(A2:A100,A2:A100,0))),ROW(A2:A100)-ROW(A2)+1),1))
that counts different values in A2:A100 if B2:B100 =1
You can just add more IFs with more conditions, making sure you get the requisite number of parentheses in the correct locations, e.g. for the number of different owners by city and month try this version for March in Chicago
=SUM(IF(FREQUENCY(IF(City="Chicago",IF(Month="March",IF(Owner<>"",MATCH(Owner,Owner,0)))),ROW(Owner)-MIN(ROW(Owner))+1),1))
confirmed with CTRL+SHIFT+ENTER
To add to Barry's answer: you don't have to nest the IFs, as that gets messy quickly. You can simply multiply the conditions together like this:
IF((City="Chicago")*(Month="March"),...)
It's much easier to add conditions that way and to keep track of the parentheses.
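The multiplied conditions behave like a logical AND. A minimal Python sketch of the same count, with made-up rows and a hypothetical function name, looks like this; blank owners are skipped, as in the formula.

```python
def count_unique_owners(rows, city, month):
    """Count distinct non-blank owners among rows matching city AND month."""
    return len({owner for row_city, row_month, owner in rows
                if row_city == city and row_month == month and owner != ""})

# Hypothetical (city, month, owner) rows
rows = [
    ("Chicago", "March", "Ann"),
    ("Chicago", "March", "Ann"),
    ("Chicago", "March", "Bob"),
    ("Chicago", "March", ""),
    ("Chicago", "April", "Cid"),
    ("Boston", "March", "Dee"),
]

print(count_unique_owners(rows, "Chicago", "March"))
```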
If you are using Excel 2013, there is a very simple approach without any formulas:
Use Consolidate (an Excel feature on the Data tab). Arrange the data so that the city names are in the left column and the owner names are in the next column to the right, with the labels for city and owner in the top row. Then select the data, tick both label options (top row and left column), and choose Count as the function at the top of the dialog.
You should have the report you are looking for.
Note: you might also need to remove duplicates first; Remove Duplicates can work on the combination of two columns.
