How can I count Uniques in a group? - excel

I've got a list of data which has a hierarchy of sorts. There's the primary group which then falls into secondary group which then has a tertiary group of unique data. I'm trying to figure out how to represent the number of unique secondary groups in a primary group.
E.g. Group A has a list of subgroups A-1,A-1,A-2,A-2,A-2,A-3 and Group B has a list of subgroups B-1,B-2,B-2. In here I want to show in a chart how many unique subgroups there are in a group and fraction of each, i.e. Group A has 3 subgroups; 2 A-1, 3 A-2, 1 A-3, and Group B has 2 subgroups; 1 B-1 and 2 B-2.
The increased hierarchical orders throw me for a loop. Any ideas?
Edit: I've included an example of how the data looks roughly (just several magnitudes more data)

Use this one array formula:
=SUM(IF($A$2:$A$23=E2,1/COUNTIFS($A$2:$A$23,E2,$B$2:$B$23,$B$2:$B$23)))
Being an array formula, it must be confirmed with Ctrl-Shift-Enter instead of Enter when exiting edit mode. If done correctly, Excel will put {} around the formula.
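The formula works because every row in the chosen group contributes 1/n, where n is the number of rows sharing that row's group and subgroup, so each distinct subgroup adds up to exactly 1. Below is a minimal Python sketch of that arithmetic using the sample data from the question. The names `ROWS`, `count_unique_subgroups` and `subgroup_counts` are made up for illustration and are not part of the answer.

```python
from collections import Counter
from fractions import Fraction

# Sample data from the question as (group, subgroup) pairs.
ROWS = [
    ("A", "A-1"), ("A", "A-1"), ("A", "A-2"),
    ("A", "A-2"), ("A", "A-2"), ("A", "A-3"),
    ("B", "B-1"), ("B", "B-2"), ("B", "B-2"),
]


def count_unique_subgroups(rows, group):
    """Mirror SUM(IF(A=group, 1/COUNTIFS(A, group, B, B)))."""
    # COUNTIFS part: how many rows share each (group, subgroup) pair.
    pair_counts = Counter(rows)
    # Each matching row adds 1/n, so n duplicate rows sum to exactly 1.
    # Fractions avoid floating-point rounding in the sum.
    total = sum(Fraction(1, pair_counts[row]) for row in rows if row[0] == group)
    return int(total)


def subgroup_counts(rows, group):
    """Rows per subgroup inside one group (the 'fraction of each' part)."""
    return dict(Counter(sub for grp, sub in rows if grp == group))
```

With the sample data, group A yields 3 unique subgroups and group B yields 2, matching the counts asked for in the question.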

You can do this with a couple of helper columns.
First in Column D put this formula in D2 and drag it down:
=A2&B2
Then in Column E enter this formula as an array (paste and then press CTRL+SHIFT+ENTER) and drag it down until you get a '0':
=INDEX($D$2:$D$100, MATCH(0, COUNTIF($E$1:E1, $D$2:$D$100), 0))
Then in Column F you can start your table by entering the Alpha variables (A,B,C).
In Column G enter this formula in G2 and drag down:
=COUNTIF($E:$E,$F2&"*")
You should end up with something like this:

Related

Get Spill-result to reference correct row #

As a result of this closed topic I started wondering myself the following:
Let's say we have data like this:
    A
1   a
2   b
3   c
4   a
5   b
6   c
7   g
8   h
9   i
I want to divide the list into 3 (I used TEXTJOIN) and check which of the 3 of these is unique.
I used a combination of MATCH and COUNTIF (and SEQUENCE) for this.
Question:
I managed to get that correct, but wanted to get the 3 textjoin-results all at once as a spill range.
In the picture below you can see the results of my attempts, but I couldn't get the TEXTJOIN to reference the correct column in its spill-result. What is missing, or what am I doing wrong?
Not sure what you want for output.
If all you want is to determine which is/are unique entries, just use the UNIQUE function. Note the exactly_once argument.
For example:
C1: =OR(B1=UNIQUE($B$1:$B$3,,TRUE))
and fill down, but the formula will fill down automatically.
Edit:
I don't know how to get the TEXTJOIN function to spill down in groups of three like you want. However, to be able to enter the formula just once, and have the results appear in column B, and have that adjust as you add/remove entries from column A, you can use a Table structure.
In that case, row 1 would be the column headers, and the formula in B2 would be:
=IFERROR(TEXTJOIN(",",TRUE,INDEX([Column1],SEQUENCE(3)+(ROW()-ROW(Table2[#Headers]))*3-3)),"")
No need to fill down.
Edit2:
To create a list of unique (listed only once) values from Column B in Column C, you can use the UNIQUE function. However, due to limitations on SPILL functions within a Table, this column cannot be part of the table.
In the example below, where Column C is NOT part of the table:
B2: =IFERROR(TEXTJOIN(",",TRUE,INDEX([Column1],SEQUENCE(3)+(ROW()-ROW(Table2[#Headers]))*3-3)),"")
C2: =UNIQUE(Table2[Column2],,TRUE)
Edit3:
If you were OK with omitting column B, and reporting the unique triplets in separate cells instead of concatenated, you could do that with the UNIQUE function:
or, without a table,
=UNIQUE(INDEX($A:$A,SEQUENCE(COUNTA($A:$A)/3,3,2)),,TRUE)
Note that the start argument in the SEQUENCE function represents the first row of data in column A (2 in this example).
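As a rough illustration of what the SEQUENCE / INDEX / UNIQUE(...,,TRUE) combination computes, here is a small Python sketch: it splits the list into consecutive triplets and keeps only the triplets that occur exactly once. The function name `exactly_once_triplets` is hypothetical, and dropping an incomplete trailing chunk is an assumption of the sketch.

```python
from collections import Counter


def exactly_once_triplets(values, size=3):
    """Chunk values into rows of `size` and keep rows that appear exactly once."""
    # Consecutive chunks, similar to indexing with SEQUENCE(rows, 3, start).
    # An incomplete trailing chunk is dropped (an assumption of this sketch).
    chunks = [
        tuple(values[i:i + size])
        for i in range(0, len(values) - size + 1, size)
    ]
    counts = Counter(chunks)
    # UNIQUE(..., , TRUE): keep only rows whose count is exactly one.
    return [chunk for chunk in chunks if counts[chunk] == 1]
```

For the question's column (a, b, c, a, b, c, g, h, i) only the last triplet survives, because a, b, c appears twice.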

Excel MATCH to sum of two cell values

I have a table of data that include a name column and two numeric columns. Example:
A B C
Fred 4 2
Sam 3 6
George 1 7
I'm wanting to retrieve the name in column A for the largest sum of columns B and C. In the example above, I would want "Sam" because 3+6 is greater than any of the other sums.
I know I could create a hidden column (D) that's filled with
=SUM(B2,C2)
and do something like this:
=INDEX(A:A,MATCH(MAX(D:D),D:D,0))
but I'd rather avoid the hidden column, if possible. Is there a way to perform an index-match based on the sum of two cells?
Use the array formula:
=INDEX(A:A,MATCH(MAX((B:B)+(C:C)),(B:B)+(C:C),0))
Array formulas must be entered with Ctrl + Shift + Enter rather than just the Enter key.
(Note the appearance of braces in the formula bar)
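The logic of that array formula can be sketched in Python: build the row sums on the fly, find the largest, and return the name at the first matching position. The function name `name_with_largest_sum` and the tuple layout are assumptions for illustration only.

```python
def name_with_largest_sum(rows):
    """rows: (name, b, c) tuples; return the name whose b + c is largest."""
    # (B:B)+(C:C): element-wise sums, computed without a helper column.
    totals = [b + c for _name, b, c in rows]
    # MATCH(MAX(...), ..., 0): position of the first exact match of the maximum.
    best = totals.index(max(totals))
    # INDEX(A:A, position): the name on that row.
    return rows[best][0]
```

With the question's data (Fred 4 2, Sam 3 6, George 1 7) this returns "Sam". As with MATCH, a tie resolves to the first row that reaches the maximum.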

Excel extract unique objects from list

I have a large list of companies who collaborate with my company. My company is broken down into different research groups. A company might collaborate with a single research group multiple times on the list or with multiple research groups.
I would like to find out which companies only collaborate with a single research group. Example of data is below, you can see that company A only collaborates with group 1 but multiple times, but company B collabs with many groups. How can I count this?
Examples data:
Group Company
1 A
1 B
1 C
1 A
1 C
2 D
2 D
2 E
2 E
2 B
2 D
3 D
3 F
3 B
3 F
4 G
4 B
4 B
It would be a binary result, 1=company is unique to group, 0=company is not unique to group.
An extension to this (although not included in my question) would be, how many groups do companies collab with on average
There's a workaround for what you want to achieve. The following solution uses a couple of helper columns and, at the end, shows whether each company is unique to a group and how many groups each company collaborates with.
Assuming your data Group and Company are in Column A and Column B respectively follow the following steps:
Step 1: Get unique combination of Group and Company
In Cell D2 enter the following formula and drag/copy down as required.
=IFERROR(INDEX($A$2:$A$19 & "," & $B$2:$B$19,MATCH(0,INDEX(COUNTIF($D$1:D1,$A$2:$A$19 & "," & $B$2:$B$19),0,0),0)),"")
Step 2: Get count of each combination in data
In Cell E2 enter the following formula and drag/copy down till the row where Column D display values.
=COUNTIFS($A$2:$A$19,LEFT(D2,(FIND(",",D2,1)-1)),$B$2:$B$19,MID(D2,FIND(",",D2)+1,256))
This formula gives the number of occurrences of each combination from Column D in your data. For example, Group 1 and Company A occurs two times in your data, Group 2 and Company D occurs three times, and so on.
Step 3: Get list of unique companies from Column B
In Cell F2 enter the following formula and drag/copy down as required.
=IFERROR(INDEX($B$2:$B$19,MATCH(0,INDEX(COUNTIF($F$1:F1,$B$2:$B$19),0,0),0)),"")
Step 4: Get count of groups each company collaborates with
In Cell G2 enter the following formula and drag/copy down till the row where Column F displays values.
=COUNT(IF(MID($D$2:$D$12,FIND(",",$D$2:$D$12)+1,256)=F2,$E$2:$E$12))
This is an array formula so commit it by pressing Ctrl+Shift+Enter
Step 5: Check whether company is unique to Group or not
In Cell H2 enter the following formula and drag/copy down till the row where Column G display values.
=IF(COUNT(IF(MID($D$2:$D$12,FIND(",",$D$2:$D$12)+1,256)=F2,$E$2:$E$12))=1,1,0)
Again, this is an array formula so commit it by pressing Ctrl+Shift+Enter
or instead use this formula =IF(G2=1,1,0)
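Steps 1 to 5 boil down to counting the distinct groups per company. A minimal Python sketch of the same result, using the question's sample data, follows. The names `DATA`, `groups_per_company`, `unique_flags` and `average_groups` are invented for this illustration, and the average covers the extension mentioned in the question rather than anything in the formulas above.

```python
from collections import defaultdict
from fractions import Fraction

# Sample data from the question as (group, company) pairs.
DATA = [
    (1, "A"), (1, "B"), (1, "C"), (1, "A"), (1, "C"),
    (2, "D"), (2, "D"), (2, "E"), (2, "E"), (2, "B"), (2, "D"),
    (3, "D"), (3, "F"), (3, "B"), (3, "F"),
    (4, "G"), (4, "B"), (4, "B"),
]


def groups_per_company(rows):
    """Number of distinct groups each company appears in (Column G)."""
    groups = defaultdict(set)
    for group, company in rows:
        groups[company].add(group)  # a set ignores repeated combinations
    return {company: len(found) for company, found in groups.items()}


def unique_flags(rows):
    """1 if the company collaborates with exactly one group, else 0 (Column H)."""
    return {c: 1 if n == 1 else 0 for c, n in groups_per_company(rows).items()}


def average_groups(rows):
    """Average number of groups per company, kept exact as a fraction."""
    counts = groups_per_company(rows)
    return Fraction(sum(counts.values()), len(counts))
```

On the sample data, companies A, C, E, F and G get a 1, while B (four groups) and D (two groups) get a 0. The average is 11/7 groups per company.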
EDIT : As per requirement mentioned in comment
In Cell J2 enter:
=IFERROR(INDEX($B$2:$B$19 & "," & $A$2:$A$19,MATCH(0,INDEX(COUNTIF($D$1:J1,$A$2:$A$19 & "," & $B$2:$B$19),0,0),0)),"")
In Cell K2 enter:
=COUNTIFS($A$2:$A$19,MID(J2,FIND(",",J2)+1,256),$B$2:$B$19,LEFT(J2,(FIND(",",J2,1)-1)))
In Cell L2 enter:
=IFERROR(INDEX($A$2:$A$19,MATCH(0,INDEX(COUNTIF($L$1:L1,$A$2:$A$19),0,0),0)),"")
In Cell M2 enter:
=COUNT(IF(VALUE(MID($J$2:$J$12,FIND(",",$J$2:$J$12)+1,256))=L2,$E$2:$E$12))
This is an array formula.
In Cell N2 enter:
=IF(M2=1,1,0)
or
=IF(COUNT(IF(VALUE(MID($J$2:$J$12,FIND(",",$J$2:$J$12)+1,256))=L2,$E$2:$E$12))=1,1,0)
This formula is also an array formula.
See image for reference:

To filter multiple columns with a condition on the results

I am trying to find a way of highlighting a result with multiple conditions. I have no knowledge of pivot tables. I would rather use a formula or macros. The table is organised by Dealer.
Acc NAME Add Dealer Total
68687 Sara 11 Wood 111A 0
68687 Sara 11 Wood 111A 0
32187 Sara 11 Wood 111A 0
12345 Tom 10 Main 7878C 2
12345 Tom 10 Main 7878C 2
54321 Tom 10 Main 7878C 2
My table is similar to the one above. For rows where the Total is greater than 0, I want to list each unique Account number per Dealer, with the lowest Account number highlighted somehow.
So the results I want for the table above would be: Dealer 7878C, Accounts 12345, 54321.
12345 being the lower of the two, it is highlighted.
I don't mind copying the results onto another sheet, as I don't want to remove any data from the sheet. I started by just filtering the Totals for >0 and I was thinking of trying to filter for unique values in Account, but it's the next step that I am stuck on. A COUNTIFS formula?
The sheet is quite large and I'm just not sure which is the best way to try and do it.
Thanks for any help.
There's a nice but complicated way to do it.
With your original data:
With changed data:
As you can see I've placed your data in A1:E7.
I use two array formulas, one for the Dealer in G2:G5 and one for the Accounts H2:N5. The Dealer formula is vertical, and the Accounts formula is horizontal.
For the dealers put this array formula in G2 (press Ctrl+Shift+Enter to enter it):
=IFERROR(INDEX($D$2:$D$7,SMALL(IF(($E$2:$E$7>0)*(COUNTIF($G$1:$G1,$D$2:$D$7)=0),ROW($D$2:$D$7)-1),ROW($G$1:$G1))),"")
Now copy G2 down to G3:G5 to get the rest of the relevant dealers.
For the accounts put this array formula in H2:
=IFERROR(SMALL(IF(($D$2:$D$7=$G2)*(COUNTIF($G2:G2,$A$2:$A$7)=0),$A$2:$A$7),1),"")
Now copy H2 to the right, I2:N2, and down to H3:N5.
To make the first accounts bold I simply make the H column formatted as Bold.
You can copy these formulas farther as needed. Note that the locations are important. If you want to place the formulas elsewhere you'll need to change the references accordingly.
Formulas explained
What these formulas do is check for your conditions, and then get the smallest value that hasn't been retrieved yet, in the upper / left most cells.
The two formulas are mostly the same, apart from the fact that in the account numbers we can use the actual numbers, and with the dealer we use the row number instead.
The dealer formula from the inside out:
The conditions are set in the IF part of the formula, with the multiplication operator * acting as a logical AND (TRUE*TRUE=1, which counts as TRUE; FALSE*TRUE=0, which counts as FALSE).
The first condition in IF(($E$2:$E$7>0)*(COUNTIF($G$1:$G1,$D$2:$D$7)=0),... checks that the row's Total value is greater than zero; the second condition checks that the dealer is not already present in the G column. The second condition is irrelevant in the first cell, but in the second cell, G3, it becomes COUNTIF($G$1:$G2,... which returns more than 0 if the dealer already exists, so the condition evaluates to FALSE.
If the conditions are met, the IF returns the dealer's index by using its row minus 1, ROW($D$2:$D$7)-1, which returns 1 for the first data row, 2 for the second, and so on, because the data starts in row 2. Otherwise it returns FALSE, which is ignored.
The SMALL function returns the k-th smallest item. It ignores the FALSE items, and in our case returns the k-th smallest index that meets the conditions (Total>0 and not already present in the results). SMALL(...,ROW($G$1:$G1)) in the first cell returns the first item. ROW($G$1:$G2) in the second cell, G3, evaluates to 2 and returns the second smallest item, and so forth.
The INDEX function simply returns the dealer from the data according to the index.
And finally, the IFERROR is there only to hide the errors when the end of the results is reached.
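To make the two-formula layout easier to follow, here is a hedged Python sketch of the same result: first collect the dealers that have a Total above zero, in order of first appearance, then list each dealer's distinct account numbers in ascending order so that the lowest comes first. The names `TABLE` and `accounts_by_dealer` are invented for this sketch.

```python
# Sample table from the question: (Acc, NAME, Add, Dealer, Total).
TABLE = [
    (68687, "Sara", "11 Wood", "111A", 0),
    (68687, "Sara", "11 Wood", "111A", 0),
    (32187, "Sara", "11 Wood", "111A", 0),
    (12345, "Tom", "10 Main", "7878C", 2),
    (12345, "Tom", "10 Main", "7878C", 2),
    (54321, "Tom", "10 Main", "7878C", 2),
]


def accounts_by_dealer(rows):
    """Map each dealer with a Total > 0 to its sorted, distinct account numbers."""
    # Dealer formula: Total > 0 and dealer not already listed.
    dealers = []
    for _acc, _name, _addr, dealer, total in rows:
        if total > 0 and dealer not in dealers:
            dealers.append(dealer)
    # Account formula: distinct accounts of that dealer, smallest first.
    return {
        d: sorted({acc for acc, _n, _a, dealer, _t in rows if dealer == d})
        for d in dealers
    }
```

For the sample table this gives dealer 7878C with accounts 12345 and 54321; the first entry in each list is the lowest account, which is the one the H column shows in bold.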
Based on your sample data, and assuming a header row in row 1 with the leftmost column being column A:
=COUNTIF($A$2:A2,A2)
Place that in F2 and copy it down. Then filter the helper column for values equal to 1.

Excel function for ranking duplicate values

I have an excel sheet containing two columns of data that I'd like to rank.
Suppose we have the following:
A B
Franz 58
Daniel 92
Markus 37
Jörg 58
I would like a formula to rank the above data based on column B, and where there are duplicate values (Franz and Jörg) to put the alphabetical name first. What I have at the moment is simply duplicating Franz twice:
=INDEX(Name,MATCH(A2,Points,0))
Can someone advise me of formula / code that will rank the data and arrange duplicate values alphabetically?
Thanks
I would add a helper column next to your data to help out with ties.
So in column C use
=B1+1/COUNTIF($A$1:$A$4,"<="&A1)/10
This adds a decimal tie-breaker based on the name. It assumes that the numbers in column B have no decimal places; if they do, increase the 10 at the end of the formula to account for it, e.g. use 1000 for 2 decimal places, 10000 for 3, and so on.
Use this formula to get the first name
=INDEX(name,MATCH(LARGE(points,1),points,0))
adjust the 1 to 2 for the second name etc
EDIT had the sign around the wrong way
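The helper-column idea can be sketched in Python as follows: each score gets a small fraction added that shrinks as the name moves later in the alphabet, so among tied scores the alphabetically earlier name ends up larger. This is only an illustration under the same assumptions as the formula (integer scores, distinct names). The function name `rank_names` is made up, and Python compares strings by code point, which may differ from Excel's text ordering.

```python
def rank_names(rows):
    """rows: (name, points) pairs; return names from highest to lowest score."""
    names = [name for name, _points in rows]

    def helper_value(row):
        name, points = row
        # COUNTIF(names, "<=" & name): alphabetical position of the name.
        alpha_rank = sum(1 for other in names if other <= name)
        # points + 1/rank/10: earlier names get the larger tie-breaker.
        return points + 1 / alpha_rank / 10

    # LARGE(..., 1), LARGE(..., 2), ...: largest helper value first.
    return [name for name, _points in sorted(rows, key=helper_value, reverse=True)]
```

With the question's data, Franz and Jörg both score 58, and Franz is listed ahead of Jörg because his tie-breaker (0.05) is larger than Jörg's (about 0.033).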
This will rank your data and will not repeat duplicates too:
In C2:
=SUM(1*(b2>$b$2:$b$5))+1+IF(ROW(b2)-ROW($b$2)=0,0,SUM(1*(b2=OFFSET($b$2,0,0,INDEX(ROW(b2)-ROW($b$2)+1,1)-1,1))))
CTRL+SHIFT+ENTER to turn it into an array
Drag these down to C5 and it will not duplicate rank where the name is the same, it will rank them alphabetically if they are the same.
Then if you wanted to order them automatically in order of top performer/score you then do this:
Putting this in E2:
=INDEX($A$2:$A$5,MATCH(LARGE($C$2:$C$5,ROW()-1),$C$2:$C$5,0))
...and drag down
Then use a vlookup on your data to return the score putting this in F2:
=VLOOKUP(E2,$A$2:$C$5,2,FALSE)
...and drag down
This should give you a table of highest scoring people in score order.
Assuming A2 is the first of the ranked points scores try this version
=INDEX(Name,SMALL(IF(A2=Points,ROW(Points)-MIN(ROW(Points))+1),COUNTIF(A$2:A2,A2)))
confirmed with CTRL+SHIFT+ENTER and copied down
This requires the Name list to be sorted, because names with duplicate scores will be listed in the order shown.
