Finding combinations and counting them in Excel - excel

I don't know much about Excel and I'm trying to do the following:
So, if I had column A and column B:
A B
red green
red green
red green
blue pink
blue pink
blue pink
blue pink
black white
black white
Let's say I have hundreds of rows of combinations. What I need to do is on a second sheet, show all the different combinations and the number of times each occurs. So for the above, the result would be:
Combination: Number of times:
red green 3
blue pink 4
black white 2
So, I need something that gives me each combination and the number of times it occurs.
Any idea how I could do this?

Add a header row to your spreadsheet: A1 = color1, B1 = color2, C1 = combination
1- In C2, type
=A2&"-"&B2
Drag the formula down column C to the last row that has data in columns A and B.
2- Go to "Insert" --> "PivotTable"
Drag "combination" into "Row Labels", then drag "combination" again into "Values".
The PivotTable "Values" field needs an aggregation, and "Count" is set automatically when you drag a text field into it (so it should read "Count of combination").

One way you could do this is the following:
Select all of the data, copy it and paste it where you want to calculate the number of occurrences. Select that range and, in the Data tab, select Remove Duplicates. This leaves you with each unique combination exactly once.
Now, with the following formula you can get the count for each of those combinations. Note that this is an array formula, so when you first enter it you have to press Ctrl+Shift+Enter in the formula box for it to calculate properly. Here's the formula; just change the cell references to match your data:
=SUM(IF($A$1:$A$4&$B$1:$B$4=A1&B1,1,0))
Here, $A$1:$A$4&$B$1:$B$4 concatenates the two columns to create "keys". Each key is compared with the current combination (A1&B1), which yields 1 for a match and 0 otherwise, and SUM adds these up to get the count.
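For readers who want to see the key-and-count idea outside of Excel, here is a minimal Python sketch of the same logic, using the sample rows from the question. It is an illustration only, not part of either answer; the names `rows` and `count_combinations` are made up for this example.

```python
from collections import Counter

# Sample rows from the question: (column A, column B).
rows = [
    ("red", "green"), ("red", "green"), ("red", "green"),
    ("blue", "pink"), ("blue", "pink"), ("blue", "pink"), ("blue", "pink"),
    ("black", "white"), ("black", "white"),
]

def count_combinations(pairs):
    """Count how often each (A, B) pair occurs, keeping first-seen order."""
    # The tuple itself is the key, which plays the role of A1&B1 in the
    # array formula (or A2&"-"&B2 in the PivotTable helper column).
    return Counter(pairs)

counts = count_combinations(rows)
for (a, b), n in counts.items():
    print(a, b, n)
```

One design point worth noting: joining two cells without a separator can make different pairs collide ("ab" & "c" gives the same key as "a" & "bc"), which is one reason a separator such as "-" is a safer choice when building keys by concatenation.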

Add a third column - "Count" - adding the value "1" in each row of this column.
Include this column in your Pivot Table data and then allocate your fields in the Pivot Table as follows:
Columns: A | Rows: B | Values: Count

Related

Finding combinations of two columns in excel, with a condition in one column

I am trying to count the combinations of columns A and B for a fixed value in column B, ignoring duplicates.
In the example below, I would like to count all unique combinations of columns A & B where B is equal to "green". The result should be 4.
A B
one green
one green
two green
four pink
three green
four pink
blue green
black white
black white
If you happen to have a version of Excel with the newer UNIQUE and FILTER functions (Excel 2021 or Microsoft 365), you can use:
=ROWS(UNIQUE(FILTER(myRng,INDEX(myRng,0,2)="green")))
I have had to do this before; you can accomplish it with an array formula built on the FREQUENCY function.
=SUM(--(FREQUENCY(IF(B2:B10="green",MATCH(A2:A10,A2:A10,0)),ROW(A2:A10)-ROW(A2)+1)>0))
Note: This formula must be entered with Ctrl+Shift+Enter.
For a complete explanation of how this works please see this article:
Count unique text values
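To make the "filter, then count distinct" logic concrete, here is a small Python sketch that does the same thing on the sample data from the question. It is only an illustration of the idea behind the formulas, and `count_unique_where` is a hypothetical helper name.

```python
# Sample rows from the question: (column A, column B).
rows = [
    ("one", "green"), ("one", "green"), ("two", "green"),
    ("four", "pink"), ("three", "green"), ("four", "pink"),
    ("blue", "green"), ("black", "white"), ("black", "white"),
]

def count_unique_where(pairs, wanted_b):
    """Number of distinct (A, B) pairs whose B value equals wanted_b."""
    # The set comprehension filters on column B and drops duplicates at once.
    return len({(a, b) for a, b in pairs if b == wanted_b})

result = count_unique_where(rows, "green")
print(result)  # 4
```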

How to create a list by searching if a column contains a value?

Column A holds colors (red, white) and column B holds names (Marie, Jane, David, Jack, etc.). There are several hundred rows; each name appears once and has one color (red or white) assigned to it. For example:
column A column B
red Marie
red Jane
white David
red Jack
white Ashley
etc.
I want to find all names with the color white and list them in column C.
I know an IF statement is the simplest solution BUT I don't want blank cells in between. I want a complete list of names with no empty cells. So =IF(A1="white"; B1; "") would not work because I don't want the "" part. Instead, is it possible to move on to the next row and check whether it contains the word white? And if so, return the value next to the cell "white".
I have also tried INDEX-MATCH, but it only returns the first match when I use autofill, so the same name would just be copied a hundred times.
VLookup hasn't helped me either.
I used a Table structure so that the formulas would autofill if the table changed in size.
C2: =IFERROR(INDEX(Table1[[#All],[Name]],AGGREGATE(15,6,1/([Color]=Table1[[#Headers],[White]])*ROW(Table1)-ROW(Table1[#Headers])+1,ROWS($1:1))),"")
This part is meant to return an array of the Table row numbers that match the condition. The comparison itself returns FALSE (0) for the non-matching rows:
([Color]=Table1[[#Headers],[White]])*ROW(Table1)-ROW(Table1[#Headers])+1
1/(…) turns the matches into row numbers and the non-matches into #DIV/0! errors.
AGGREGATE(15,6,…) then works like SMALL while ignoring errors, so ROWS($1:1) picks the first, second, third, ... matching row number as the formula is filled down.
The IFERROR function returns a blank once there are no more matching rows.
If you change the column header White to Red, the names that appear will change.
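The "k-th smallest matching row" idea can be sketched in Python as follows. This is an illustration of the logic under my own naming (`kth_match`, `column_c`), not a translation of the Table formula, and it uses the five sample rows from the question.

```python
# Sample data from the question.
colors = ["red", "red", "white", "red", "white"]
names = ["Marie", "Jane", "David", "Jack", "Ashley"]

def kth_match(colors, names, wanted, k):
    """Return the k-th name (1-based) whose color equals wanted, else ""."""
    # Positions of matching rows in ascending order: this is what
    # AGGREGATE(15,6,...) selects from once non-matches have become errors.
    positions = [i for i, c in enumerate(colors) if c == wanted]
    if k > len(positions):
        return ""  # mirrors IFERROR(..., "")
    return names[positions[k - 1]]

# Filling the formula down corresponds to k = 1, 2, 3, ...
column_c = [kth_match(colors, names, "white", k) for k in range(1, len(names) + 1)]
print(column_c)  # ['David', 'Ashley', '', '', '']
```

The matches are packed at the top and the blanks only appear after the last match, which is the behavior the question asks for.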

Counting the total amount of rows which bear negative values with corresponding row

I need a formula that counts over the row-wise sums of two columns.
e.g.
I have data in column A and column B and would like to count over the sums A1+B1, A2+B2, etc. for around 1800 rows. If one of the two cells in a row is empty, that row should not be included in the count.
The goal is to find out how many negative results I get when adding A1+B1, A2+B2 and so on.
I try to explain it in the link to the picture.
http://i.stack.imgur.com/HlZPa.png
Basically I could do it with an extra column but since I have 50+ customers I would have to add a column for each to make the diff, so I was hoping to express the Yellow column in a formula so I only have to adapt that instead of insert a calculation column for each column.
Thanks for any tips!
You could filter the rows to show only those containing negative values, if that is what you want to achieve.
Follow these steps:
Select the column that contains negative values.
Click on the "Data" tab.
Click "Filter" or press Ctrl+Shift+L.
Click the black down arrow at the top of the column you selected.
Select "Number Filters".
Click "Less Than".
Type 0 (zero) and hit Enter.
You should now see only the cells containing negative values.
If you need the number of cells containing positive values in a column, use this formula
=MIN(COUNTIF(A:A,">0")) where the column is "A"
To get the number of negative cells in a column, you can use
=MIN(COUNTIF(A:A,"<0")) where the column is "A"
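The COUNTIF formulas above count within a single column, whereas the question asks about the row-wise sums A+B without a helper column. The logic being asked for can be sketched in Python as below. The data is hypothetical (the original picture is not available), `None` stands for an empty cell, and `count_negative_sums` is a made-up name. In Excel, something like =SUMPRODUCT((A1:A1800<>"")*(B1:B1800<>"")*(A1:A1800+B1:B1800<0)) might express the same idea, assuming the ranges contain only numbers or empty cells; treat that as an untested suggestion rather than part of the answer.

```python
def count_negative_sums(col_a, col_b):
    """Count rows where both cells are filled and A + B is below zero."""
    total = 0
    for a, b in zip(col_a, col_b):
        if a is None or b is None:
            continue  # skip rows where either cell is empty
        if a + b < 0:
            total += 1
    return total

# Hypothetical data; None stands for an empty cell.
col_a = [5, -3, None, 2, -1]
col_b = [-7, 1, -4, 3, None]
print(count_negative_sums(col_a, col_b))  # 2
```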

How to exclude records in excel that have specific attributes

I have a long list of email addresses in an excel sheet where the emails are in column A and I have colors in column B. For instance, see table:
EMAIL COLOR
1. test#example.com red
2. test#example.com blue
3. testing123#example.com blue
4. testing123#example.com blue
5. testtest#example.com red
6. testtest#example.com blue
I can't figure out how to filter out or remove any email address that is associated with the color red without doing this manually (I have thousands of rows of data, so this isn't happening).
So in this case, the only email address I want to eventually import into my email program is testing123#example.com. Imagine there are thousands of rows like this - is there a conditional formula for column C that can lookup this relationship and provide a "true/false" flag for each email record?
Filter by color with the criterion red.
Copy all emails associated with the color red to another column that is not adjacent to the first two columns (leave a gap, e.g. use column F).
Add a third column and do a VLOOKUP by email: =VLOOKUP(A2,F:F,1,0)
Filter all three columns on column C for everything except #N/A and delete those rows.
Use an AutoFilter on your color column to display only those rows containing Red. Then delete the visible rows.
See Contextures
EDIT#1:
Based on your comment, we will use a "helper" column. The "helper" column will mark those rows where either the color is red or the email address is duplicated elsewhere with a red color. In the following example, the data is in columns A and B. In C2 we enter:
=IF(OR(B2="red",SUMPRODUCT(--(A$2:A$100=A2)*(B$2:B$100="red"))>0),"D","")
and copy down. (The formula assumes 100 rows of data.)
With the sample data, row 2 is marked D because its color is red, and row 3 is also marked D because the same email address is red in row 2.
Now set the AutoFilter to display only the D rows and delete the visible rows.
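As a cross-check of the helper-column logic, here is a short Python sketch that flags every row whose email appears with the color red anywhere in the data. It uses the sample table from the question (with the email addresses written as they appear there); `flag_rows` and `kept` are names invented for this illustration.

```python
# Sample rows from the question: (email, color).
rows = [
    ("test#example.com", "red"),
    ("test#example.com", "blue"),
    ("testing123#example.com", "blue"),
    ("testing123#example.com", "blue"),
    ("testtest#example.com", "red"),
    ("testtest#example.com", "blue"),
]

def flag_rows(pairs, bad_color="red"):
    """Return "D" for rows whose email appears with bad_color anywhere, else ""."""
    # First collect every email that is ever paired with the unwanted color.
    tainted = {email for email, color in pairs if color == bad_color}
    return ["D" if email in tainted else "" for email, _ in pairs]

flags = flag_rows(rows)
kept = sorted({email for (email, _), f in zip(rows, flags) if f == ""})
print(flags)  # ['D', 'D', '', '', 'D', 'D']
print(kept)   # ['testing123#example.com']
```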

Reduce an excel 2d table to a single row 'list' (and remove the duplicates)

I have a table.
1 2 3 4
A red purple green red
B blue yellow white brown
C pink green purple red
D pink pink orange white
E green red hazel black
F orange orange blue orange
I want to return (into a range) a list of every colour that appears, with only one entry per colour, so no duplicates. I have found many answers for the single-column version, but I really would like to extend this to 2D. I would prefer an array formula solution to a VBA solution (though I'll give VBA a go).
see this for example.
Ignore Duplicates and Create New List of Unique Values in Excel
The table may occupy any position on a sheet!
To extract uniques from a two-dimensional table, see:
Coderre Formula
EDIT#1:
In this example the table (4 columns by 6 rows) is in C4 thru F9
The helper column is H4 thru H27
The Uniques are in column I starting in I4
In H4 enter:
=OFFSET($C$4,ROUNDUP(ROWS($1:1)/4,0)-1,MOD(ROWS($1:1)-1,4))
and copy down
In I4 enter:
=H4
In I5 enter the array formula:
=IFERROR(INDEX($H$5:$H$27, MATCH(0, COUNTIF($I$4:I4, $H$5:$H$27), 0)),"")
and copy down
Array formulas must be entered with Ctrl + Shift + Enter rather than just the Enter key.
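The two steps above (flatten the table into a helper column, then pull out the first occurrence of each value) can be sketched in Python like this. The index arithmetic in `flatten_row_major` deliberately mirrors the ROUNDUP/MOD terms of the OFFSET formula; the function names are made up for this illustration, and the table is the one from the question.

```python
import math

# The 6-row by 4-column table from the question (rows A to F).
table = [
    ["red", "purple", "green", "red"],
    ["blue", "yellow", "white", "brown"],
    ["pink", "green", "purple", "red"],
    ["pink", "pink", "orange", "white"],
    ["green", "red", "hazel", "black"],
    ["orange", "orange", "blue", "orange"],
]

def flatten_row_major(grid):
    """Walk the grid cell by cell, as the OFFSET helper column does."""
    n_cols = len(grid[0])
    cells = []
    for k in range(1, len(grid) * n_cols + 1):  # k plays the role of ROWS($1:1)
        row = math.ceil(k / n_cols) - 1         # ROUNDUP(k/4,0)-1
        col = (k - 1) % n_cols                  # MOD(k-1,4)
        cells.append(grid[row][col])
    return cells

def unique_in_order(values):
    """Keep the first occurrence of each value, preserving order."""
    return list(dict.fromkeys(values))

helper = flatten_row_major(table)
uniques = unique_in_order(helper)
print(uniques)
```

For this table the helper list has 24 entries and the unique list has 11 colours, starting with red, purple, green.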
Might be achieved by entering something ("z" would do) at the intercept of 1 and A, then creating a PivotTable with multiple consolidation ranges (see for example) and in the new sheet apply Advanced Filter to the Value column, Copy to another location, Copy to: where desired and select Unique records only. The PT and drill-down details can then be deleted.

Resources