Extract unique combinations from Excel sheets - excel

I am looking for a way to extract all unique combinations from a number of Excel sheets with multiple columns. E.g.:
#no. | fruit | city | year | something else
1 | apple | London | 2015 | some text
2 | banana | London | 1999 | no text
3 | apple | Oxford | 1895 | some text
4 | banana | London | 1999 | no text
How can I get a list of all unique rows (except for column 1 of course) with any function in Excel or VBA? Preferably it is a script-like way, because the sheets contain over 6000 rows of varying information.
Any thoughts?

If you would like to do it with just formulas, here are some simple steps:
1) Add a column at the end of the table. In this column, concatenate the columns of each row like this:
#no. | fruit | city | year | something else|
1 | apple | London | 2015 | some text |=B2&C2&D2&E2
2 | banana | London | 1999 | no text |
3 | apple | Oxford | 1895 | some text |
4 | banana | London | 1999 | no text |
2) Then add another column to count and number the occurrences of duplicates. Enter a COUNTIF() formula and fill down:
#no. | fruit | city | year | something else| Column F | Column G
1 | apple | London | 2015 | some text |=B2&C2&D2&E2|=COUNTIF($F$2:F2,F2)
2 | banana | London | 1999 | no text |
3 | apple | Oxford | 1895 | some text |
4 | banana | London | 1999 | no text |
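If you filter the last column on the value 1, you get all unique rows. Note that concatenating without a separator can merge different rows into the same string ("ab"&"c" equals "a"&"bc"); a delimiter, e.g. =B2&"|"&C2&"|"&D2&"|"&E2, avoids that.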
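The same "keep the first occurrence" idea can be sketched outside Excel. This is only an illustration in Python, not part of the formula answer; the function name unique_rows and the in-memory rows list are assumptions made for the example, and reading the sheets themselves is left out.

```python
def unique_rows(rows):
    """Keep the first occurrence of each row, ignoring column 1 (the row number)."""
    seen = set()
    result = []
    for row in rows:
        # A tuple key plays the role of the helper column, without the
        # ambiguity that plain string concatenation can introduce.
        key = tuple(row[1:])
        if key not in seen:  # equivalent to the running COUNTIF being 1
            seen.add(key)
            result.append(row)
    return result


# Sample data copied from the question.
rows = [
    (1, "apple", "London", 2015, "some text"),
    (2, "banana", "London", 1999, "no text"),
    (3, "apple", "Oxford", 1895, "some text"),
    (4, "banana", "London", 1999, "no text"),
]

for row in unique_rows(rows):
    print(row)
```

Row 4 is dropped because its fruit, city, year and text repeat row 2.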

Related

Excel Sum product values and stock then multiplie when multiple criteria

So I have this information:
+---------------+---------+-------+------------+
| Chocolate | Brand | Stock | Sale value |
+---------------+---------+-------+------------+
| Chokito | Nestlé | 1520 | $3,50 |
| Snickers | Mars | 3300 | $5,20 |
| Snickers 2 | Mars | 500 | $2,50 |
| Kit Kat | Nestlé | 2000 | $9,10 |
| Double Decker | Cadbury | 1000 | $2,50 |
| Idaho | Mars | 0 | $6,10 |
| Caramello | Cadbury | 350 | $7,50 |
| Cadbury Daily | Cadbury | 1000 | $3,10 |
| Almond Joy | Hershey | 500 | $1,50 |
| Twix | Nestlé | 999 | $4,50 |
| Zero Bar | Hershey | 488 | $5,50 |
+---------------+---------+-------+------------+
What I want is the total stock value for each brand. I currently get these values by inserting a column of stock * value and then building a Pivot Table:
Cadbury $8.225,00
Hershey $3.434,00
Mars $18.410,00
Nestlé $28.015,50
But what I want is a formula in Excel that returns these same values.
I first tried using SUMIF, but obviously it didn't work.
I can't think of any other formula.
Thanks for your help.
Try,
=SUMPRODUCT((C$2:C$12), (D$2:D$12), --(B$2:B$12=G4))
For a dynamic length of data,
=SUMPRODUCT((C$2:INDEX(C:C, MATCH(1E+99, C:C))), (D$2:INDEX(D:D, MATCH(1E+99, C:C))), --(B$2:INDEX(B:B, MATCH(1E+99, C:C))=G4))
An alternative approach using SUMIF:
Place the following in E2 and copy down:
=C2*D2
This gives the stock value of each individual chocolate.
In column G, generate a list of brands.
In H2, use the following formula and copy down as needed:
=SUMIF(B:B,G2,E:E)
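As a cross-check of what both formulas compute (stock times sale value, summed per brand), here is a minimal Python sketch. The names stock and stock_value_by_brand are hypothetical, and prices are written with a decimal point instead of the decimal comma used in the question.

```python
from collections import defaultdict
from decimal import Decimal

# (chocolate, brand, stock, sale value) copied from the question.
stock = [
    ("Chokito", "Nestlé", 1520, "3.50"),
    ("Snickers", "Mars", 3300, "5.20"),
    ("Snickers 2", "Mars", 500, "2.50"),
    ("Kit Kat", "Nestlé", 2000, "9.10"),
    ("Double Decker", "Cadbury", 1000, "2.50"),
    ("Idaho", "Mars", 0, "6.10"),
    ("Caramello", "Cadbury", 350, "7.50"),
    ("Cadbury Daily", "Cadbury", 1000, "3.10"),
    ("Almond Joy", "Hershey", 500, "1.50"),
    ("Twix", "Nestlé", 999, "4.50"),
    ("Zero Bar", "Hershey", 488, "5.50"),
]


def stock_value_by_brand(items):
    """Sum stock * sale value per brand, like SUMPRODUCT with a brand filter."""
    totals = defaultdict(Decimal)  # Decimal() starts each brand at 0
    for _name, brand, units, price in items:
        totals[brand] += units * Decimal(price)
    return dict(totals)


for brand, value in sorted(stock_value_by_brand(stock).items()):
    print(brand, value)
```

Decimal is used instead of float so that the currency totals stay exact.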

Excel to return list of items with specific repetition

I am trying to create a list of names that repeat a specific number of times, based on another variable. Basically, if I have the following:
Column A Column B
Amy 5
John 2
Carl 3
the result would be:
Amy
Amy
Amy
Amy
Amy
Carl
Carl
Carl
John
John
I have built the initial list using the INDEX-SMALL-COUNTIF method to get an alphabetical and distinct list, and then another formula to determine how many times each item repeats. I know I need to use some sort of INDEX/OFFSET with reference to rows, but I just can't quite get it to work.
The list is dynamic and changes daily, so manually retyping the list each day would result in too much human error and time (list is about 50 distinct items, with total number of rows at the end being around 400). Ultimately, the list will be used for a number of sumproduct/vlookups.
I can do this fairly quickly in VBA, but the users of this document don't always trust VBA and trying to get them to Enable Macros each time is not something that is going to work.
Thank you very much for any help you can offer!
Based on your table:
+---+------+---+
| | A | B |
+---+------+---+
| 1 | Amy | 5 |
| 2 | John | 2 |
| 3 | Carl | 3 |
+---+------+---+
In column C, put a "0" in C4, then enter the formula =B1+C2 in C1 and copy it down to just before the 0:
+---+------+---+----+
| | A | B | C |
+---+------+---+----+
| 1 | Amy | 5 | 10 |
| 2 | John | 2 | 5 |
| 3 | Carl | 3 | 3 |
| 4 | | | 0 |
+---+------+---+----+
Now we have an upper bound of the row that each value should be placed on, which we can use in a MATCH() formula that feeds an INDEX() formula.
In a new column (I'm using E), in E1: =INDEX($A$1:$A$3,MATCH(ROW(),$C$1:$C$3,-1),1) and copy down:
+----+------+---+----+--+------+
| | A | B | C | D | E |
+----+------+---+----+--+------+
| 1 | Amy | 5 | 10 | | Carl |
| 2 | John | 2 | 5 | | Carl |
| 3 | Carl | 3 | 3 | | Carl |
| 4 | | | 0 | | John |
| 5 | | | | | John |
| 6 | | | | | Amy |
| 7 | | | | | Amy |
| 8 | | | | | Amy |
| 9 | | | | | Amy |
| 10 | | | | | Amy |
+----+------+---+----+--+------+
The list is backwards because of that oddball counting-up-from-0 thing we did in column C. It puts column C in the descending order that MATCH() requires when its last parameter is -1, which finds the smallest value greater than or equal to the lookup value.
I would imagine with some tweaking this could be done a little cleaner, but this should get you in the ballpark.
Although I would still be a big proponent of finding users who are capable of enabling macros. Ugh.
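The cumulative-bound lookup can also be sketched in Python to show why it works. This is an illustration under assumptions, not the Excel formula itself: it uses ascending running totals with a binary search, so the output comes out in forward order, and the function name expand is made up for the example.

```python
from bisect import bisect_left
from itertools import accumulate


def expand(names, counts):
    """Repeat each name counts[i] times via cumulative upper bounds."""
    # Running totals give the last output row of each name, e.g. [5, 8, 10].
    bounds = list(accumulate(counts))
    total = bounds[-1] if bounds else 0
    # For output row r (1-based), pick the first bound that is >= r.
    return [names[bisect_left(bounds, r)] for r in range(1, total + 1)]


# Names already in alphabetical order, as in the asker's distinct list.
result = expand(["Amy", "Carl", "John"], [5, 3, 2])
print(result)
```

Because the bounds ascend here, the search is the mirror image of MATCH(..., -1) on a descending column, and a name with a count of 0 is skipped automatically.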

Excel - Return multiple matching values from a column matching two variables, horizontally in one row

I have this table:
| | A | B | C |
|---|---------|--------|---|
| 1 | | | |
| 2 | Oranges | Red | 1 |
| 3 | Apples | Yellow | 2 |
| 4 | Grapes | Orange | 3 |
| 5 | Oranges | Orange | 4 |
| 6 | Apples | Red | 5 |
| 7 | Grapes | Green | 6 |
| 8 | Apples | Green | 7 |
I want to check for matching values in column A, like Apples,Yellow or Apples,Green, and return all the corresponding values from column B in one row:
I tried to nest AND into IF, but it didn't work out, as it wasn't returning any values at all.
| | A | B | C | D | E |
|----|---------|-------------|---|---|---|
| 11 | Apples | Green | 1 | | |
| 12 | Oranges | YellowGreen | 2 | | |
My code:
=INDEX($B$2:$B$8, SMALL(IF($A$11=$A$2:$A$8, ROW($A$2:$A$8)-ROW($A$2)+1), COLUMN(A1)))
How do I get this formula to look at two variables to match?
Thank you.
You seem to be using an array formula. Wouldn't concatenating work?
{=INDEX($C$2:$C$8, SMALL(IF($A11&" "&$B11=$A$2:$A$8&" "&$B$2:$B$8, ROW($A$2:$A$8)-ROW($A$2)+1), COLUMN(A1)))}
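The two-criteria match behind that formula can be illustrated with a short Python sketch. It assumes, as the array formula does, that the values to return sit in column C; the name lookup_all and the rows list are hypothetical.

```python
# (column A, column B, column C) copied from the question's table.
rows = [
    ("Oranges", "Red", 1),
    ("Apples", "Yellow", 2),
    ("Grapes", "Orange", 3),
    ("Oranges", "Orange", 4),
    ("Apples", "Red", 5),
    ("Grapes", "Green", 6),
    ("Apples", "Green", 7),
]


def lookup_all(table, fruit, colour):
    """Return every column C value whose A and B cells match both criteria."""
    # Comparing the pair as a tuple avoids any separator clash that
    # joining the two cells into one string could cause.
    return [value for a, b, value in table if (a, b) == (fruit, colour)]


print(lookup_all(rows, "Apples", "Green"))
print(lookup_all(rows, "Oranges", "Orange"))
```

The returned list corresponds to the cells the formula fills from left to right as COLUMN(A1) increases.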

Create Distinct ID by Group in Excel without sorting

I have an Excel column that cannot be sorted and for which I need to create a unique id by group, similar to what is below:
+--------+------+
| Name | ID |
+--------+------+
| Jim | 1 |
| Sarah | 1 |
| Tim | 1 |
| Jim | 2 |
| Rachel | 1 |
| Sarah | 2 |
| Jim | 3 |
| Sarah | 3 |
| Rachel | 2 |
| Tim | 2 |
+--------+------+
You can do this with a simple COUNTIF() and getting a little creative with your cell references:
=COUNTIF($A$1:$A1, A2) + 1
Put that in B2 (assuming your list with headers starts in A1) and then copy down.
COUNTIF() here counts the number of times the name in the adjacent cell has appeared in all of the cells above it. As you copy it down, the range grows to include all cells from A1 down to the row above the current one.
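The running count can be sketched in Python as well, purely to illustrate the logic. The function name running_ids is an assumption for the example; the names are those from the question, in their original unsorted order.

```python
from collections import Counter


def running_ids(names):
    """Number each name by how many times it has appeared so far."""
    seen = Counter()
    ids = []
    for name in names:
        seen[name] += 1  # occurrences above this row, plus this one
        ids.append(seen[name])
    return ids


names = ["Jim", "Sarah", "Tim", "Jim", "Rachel",
         "Sarah", "Jim", "Sarah", "Rachel", "Tim"]
for name, group_id in zip(names, running_ids(names)):
    print(name, group_id)
```

No sorting is involved, so the original row order is preserved.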

Count distinct occurrences and averages in a column based on identifiers in another column

In MS Excel, I want to count the number of distinct categories (ignoring a specific item) based on a different column. Also, I want to find the average and the max for the same selection. This is the data:
+--------+-----------+-------+
| Person | idea | score |
+--------+-----------+-------+
| George | vacuum | 9 |
| George | box | 6 |
| George | x | 1 |
| Joe | scoop | 4 |
| Joe | x | 1 |
| Joe | x | 1 |
| Joe | scoop | 4 |
| Joe | gear | 7 |
| Mike | harvester | 10 |
| Mike | gear | 7 |
| Mike | box | 6 |
+--------+-----------+-------+
The result should be the following:
+--------+----------------+------------+-----------+
| Person | distinct ideas | Avg. score | Max score |
+--------+----------------+------------+-----------+
| George | 2 | 5.3 | 9 |
| Joe | 2 | 3.4 | 7 |
| Mike | 3 | 7.7 | 10 |
+--------+----------------+------------+-----------+
For example, Joe has two distinct ideas ("scoop", which appears twice, and "gear"), because I want to ignore the "x" items.
I reluctantly gave up and did it manually for each person, e.g., this is for the first person:
SUM(IF(FREQUENCY(MATCH(B2:B4,B2:B4,0),MATCH(B2:B4,B2:B4,0))>0,1))-IF(COUNTIF(B2:B4,"x")>0,1,0)
Doesn't Excel have functions to return a range instead of a value? If I could select the range based on the name of the person in the first columns, I could count distinct occurrences or find the average in another column.
Add a 4th column and label it Distinct Ideas
If your table starts in A1, then:
EDIT: Formula changed to exclude "x". Screen shot also changed
D2: =IF(TRIM($B2)="x",0,IF(SUMPRODUCT(($A$2:$A2=A2)*($B$2:$B2=B2))>1,0,1))
and fill down.
Then construct a Pivot table
Person to Row Labels
Distinct Ideas to Values area
score to Values area, summarized by Average
score to Values area, summarized by Max
Format as desired.
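To check the expected numbers outside Excel, here is a minimal Python sketch of the same grouping. It assumes, as the expected table in the question implies, that "x" rows are excluded from the distinct count but still included in the average and max; the names records and summarise are hypothetical.

```python
# (person, idea, score) copied from the question.
records = [
    ("George", "vacuum", 9), ("George", "box", 6), ("George", "x", 1),
    ("Joe", "scoop", 4), ("Joe", "x", 1), ("Joe", "x", 1),
    ("Joe", "scoop", 4), ("Joe", "gear", 7),
    ("Mike", "harvester", 10), ("Mike", "gear", 7), ("Mike", "box", 6),
]


def summarise(rows, ignore="x"):
    """Return {person: (distinct ideas, average score, max score)}."""
    by_person = {}
    for person, idea, score in rows:
        entry = by_person.setdefault(person, {"ideas": set(), "scores": []})
        if idea.strip() != ignore:  # mirrors the TRIM($B2)="x" test
            entry["ideas"].add(idea)
        entry["scores"].append(score)  # assumption: "x" rows still count here
    return {
        person: (len(e["ideas"]),
                 sum(e["scores"]) / len(e["scores"]),
                 max(e["scores"]))
        for person, e in by_person.items()
    }


for person, (distinct, average, highest) in summarise(records).items():
    print(person, distinct, round(average, 1), highest)
```

The set does the job of the helper column: only the first occurrence of each person and idea pair adds to the count.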
