I have a situation similar to "Excel - Counting unique records in a group", but with a final twist that's giving me a headache.
This is basically how my data looks:
A B C
--- --- ---
1 5 2
1 6 2
1 5 2
2 7 1
2 7 1
2 7 1
3 8 1
3 8 1
I'm trying to generate the value in column C. I need to count the number of unique values in column B for each different value in column A. Column A is sorted so that all of the values are together.
I've tried this:
=COUNTIFS($A$2:$A$9,$A2, $B$2:$B$9,"<>"&"")
That gives me a count of the number of values in the group, but not the number of unique values in the group (so, in my example, it would be 3, 3, and 2). I've also tried a pretty cool trick I found on this page which counts all of the unique values in the entire column (so in my example, it would be 4 all the way down). I can't figure out how to split the difference.
I've also tried to figure out if it can be done using the IF function, but I'm coming up dry on that too. Any help here?
Use this array formula in cell D2:
=SUM(--(FREQUENCY( IF( $A$2:$A$9=A2, MATCH($B$2:$B$9,$B$2:$B$9,0)), ROW( $B$2:$B$9)-ROW( $B$2)+1)>0))
Put this in the formula bar and press CTRL + SHIFT + ENTER (instead of just ENTER) to save it as an array formula. Excel will place braces { } around your formula.
Then copy it down. It works well, just be aware that array formulas can get very slow if you use thousands of them in a workbook.
I found this by following a link from the "cool trick" page you linked to.
If it works let me know, and don't forget to mark my answer as accepted. Good luck!
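As a cross-check on the expected column C values, the same per-group distinct count can be sketched outside Excel. This Python is only an illustration, assuming the sample rows from the question; the function name `distinct_per_group` is hypothetical and not part of any Excel feature.

```python
from collections import defaultdict


def distinct_per_group(rows):
    """For each (a, b) row, return how many distinct b values share that a."""
    seen = defaultdict(set)
    for a, b in rows:
        seen[a].add(b)
    # One result per input row, mirroring a formula copied down column C.
    return [len(seen[a]) for a, _ in rows]


# Sample data from the question (columns A and B).
rows = [(1, 5), (1, 6), (1, 5), (2, 7), (2, 7), (2, 7), (3, 8), (3, 8)]
print(distinct_per_group(rows))  # [2, 2, 2, 1, 1, 1, 1, 1]
```

The printed list matches the desired column C in the question, which is a quick way to sanity-check the array formula on a small sample before copying it down thousands of rows.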
Related
I have a program that auto-generates reports into an Excel file, and I need to count the unique orders that have a specific inventory type, starting with "R", in another column. Normally that would be simple, but the same order number can repeat in multiple rows, so I need a formula that counts orders without duplicates.
Order Number Location
1 R-11
1 R-12
1 R-13
2 R-12
3 N-11
4 N-12
Unique orders with "R*" location: 2
Result of count based on above set of data should be: 2 - since there are two different order numbers that have location starting with "R".
I've tried the following formula:
=SUMPRODUCT((LEFT(B2:B7;1)="R")/COUNTIFS(B2:B7;B2:B7&"";A2:A7;A2:A7&""))
But it also sums unique values in "Location" column, and I get 4 instead of 2. How can I fix that?
Perhaps something like the following (array formula, enter with Ctrl + Shift + Enter.)
=SUM(--(FREQUENCY(IF(LEFT(B2:B7,1)="R",A2:A7),A2:A7)>0))
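To make the intended result concrete, here is a minimal Python sketch of the same count, assuming the six sample rows from the question. The helper name `count_unique_orders` is made up for illustration.

```python
def count_unique_orders(rows, prefix="R"):
    """Count distinct order numbers that have at least one location starting with prefix."""
    return len({order for order, location in rows if location.startswith(prefix)})


# Sample data from the question (Order Number, Location).
rows = [
    (1, "R-11"),
    (1, "R-12"),
    (1, "R-13"),
    (2, "R-12"),
    (3, "N-11"),
    (4, "N-12"),
]
print(count_unique_orders(rows))  # 2
```

Collecting the order numbers into a set is what removes the repeats: order 1 appears three times with an "R" location but is counted once.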
I have two workbooks, and I need to pull data from workbook1 to workbook2. The identifier that links them is empID. Now, for each empID I need to show which location(s) they worked. Sample data looks like this:
Workbook1
empID  Name  Address  City  State  Zip
1
2
3
4
5
Workbook2
empID  locationworked
1 12
2 33
1 11
4 22
3 9
1 55
5 19
2 76
1 99
I have used this formula to return the data to a different cell for each empID
=IFERROR(INDEX($B$2:$B$8, SMALL(IF($A$11=$A$2:$A$8, ROW($A$2:$A$8)-ROW($A$2)+1), ROW(1:1))),"" )
But I want to create a Comma Separated list and put everything in one cell, like so
1 11,12,55,99
2 33,76
etc
Is there a way to modify the syntax so that a comma separated list is created like in my desired output?
In workbook 2, I added this formula to column C:
=IFERROR(VLOOKUP(A1,A2:$C$50,3,0)&","&B1,B1)
This assumes that your data goes as far down as row 50. Replace $C$50 with whatever row is last in your spreadsheet.
If this is a variable list, use
=INDIRECT("A2:C"&MATCH(TRUE,D:D="",0),1)
in place of A2:$C$50. However, don't forget to use Ctrl + Shift + Enter to enter the formula as an array formula.
Next, copy this formula down all rows. The VLOOKUP will work up the sheet. Then you can reference this list from your report sheet (I believe in this case it's Sheet 1) with a VLOOKUP. It will automatically pick the first instance of each employee ID, which contains the CSV list.
I'd like to point out that whilst bad_neighbor's solution is quite accurate and reusable for future data changes, it is often preferable to avoid lookups where possible and to store calculated results as values. Lookups aren't perfectly efficient and tend to slow down the sheet something awful given a larger quantity of data, for example when filtering / unfiltering. It's worse in older versions.
So, if this list formatting were part of a manual operation, and assuming the requirement is for each list item to be in ascending order (per the question's output), I'd do the following instead:
If workbook2's order is important, add an index of the rows (D1 := 1; D2 := D1 + 1; paste values).
Sort workbook2 by [A ascending, B descending], including index if present.
Apply this formula to column C - a fillup version of the lookup.
C1 := IF(A1=A2,C2&", "&B1,B1)
Copy-paste special values column C.
Lookup from workbook1 + copy-paste special values.
Optionally sort back according to original index (D) in workbook2.
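The grouping that both answers build up row by row can be sketched in Python as a check on the desired output. This is an illustration only, assuming the workbook2 sample rows from the question; `locations_by_employee` is a hypothetical name, and the comma separator follows the question's desired output rather than the ", " used in the fill-up formula.

```python
from collections import defaultdict


def locations_by_employee(rows):
    """Group locations by empID and join each group as an ascending comma-separated list."""
    groups = defaultdict(list)
    for emp_id, location in rows:
        groups[emp_id].append(location)
    return {
        emp_id: ",".join(str(loc) for loc in sorted(locs))
        for emp_id, locs in sorted(groups.items())
    }


# Sample data from workbook2 (empID, locationworked).
rows = [(1, 12), (2, 33), (1, 11), (4, 22), (3, 9), (1, 55), (5, 19), (2, 76), (1, 99)]
for emp_id, joined in locations_by_employee(rows).items():
    print(emp_id, joined)
```

For empID 1 this yields 11,12,55,99 and for empID 2 it yields 33,76, matching the output the question asks for.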
In Excel suppose I have a table with the following two columns and following data:
ID Value
1 6
1 2
1 1
2 4
3 5
In Excel, what I would like to do is write the word duplicate in a third column (say result) when the ID is a duplicate and is not the highest value.
In this example duplicate would be written next to Value(2),ID(1) and Value(1),ID(1). Value(6), ID(1) would not have duplicate written next to it because it has the highest value out of all the ID(1)'s.
Is there an excel formula I can use to do this? If not what VBA would I need? In reality this is a large database and there will be more than 3 duplicates.
The result should look like this:
ID Value Result
1 6
1 2 Duplicate
1 1 Duplicate
2 4
3 5
Not sure if this is correct. But please correct me if I am wrong.
=IF(MIN($A$2:$A$6 = MIN($B$2:$B$6)), "duplicate", "")
This array formula should work (confirm it with Ctrl+Shift+Enter), though it could be rather slow if you have lots of data.
=IF(B2=MAX(IF($A$2:$A$6=A2,$B$2:$B$6)),"","Duplicate")
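The rule in that formula (flag a row unless its value equals the maximum for its ID) can be sketched in Python to confirm the expected result column. This assumes the five sample rows from the question; `flag_duplicates` is a hypothetical helper name.

```python
def flag_duplicates(rows):
    """Return "Duplicate" for every row whose value is below the maximum for its ID."""
    best = {}
    for id_, value in rows:
        best[id_] = max(value, best.get(id_, value))
    # Rows that tie for the maximum are all left unflagged, as in the formula.
    return ["" if value == best[id_] else "Duplicate" for id_, value in rows]


# Sample data from the question (ID, Value).
rows = [(1, 6), (1, 2), (1, 1), (2, 4), (3, 5)]
print(flag_duplicates(rows))  # ['', 'Duplicate', 'Duplicate', '', '']
```

Because the per-ID maximum is computed once, this does not depend on the rows being sorted, which also holds for the array formula.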
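If the duplicates are in column A, cell B3 could read as follows (this flags every occurrence after the first, so it assumes the rows are sorted with the highest value first for each ID):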
=if(COUNTIF($A$1:$A2,A3)>0,"Duplicate #" & COUNTIF($A$1:$A3,A3),"")
Does this help?
I have googled for hours, not being able to find a solution to what I need/want. I have an Excel sheet where I want to sum the values in one column based on the criteria that either one of two columns should have a specific value in it. For instance
Row  A    B    C
1    4    20   7
2    5    100  3
3    100  21   4
4    15   21   4
5    21   24   8
I want to sum the values in C given that at least one of A and B contains a value of less than or equal to 20. Let us assume that A1:A5 is named A, B1:B5 is named B, and C1:C5 is named C (for simplicity). I have tried:
{=SUMPRODUCT(C,((A<=20)+(B<=20)))}
which gives me the rows where both columns match summed twice, and
{=SUMPRODUCT(C,((A<=20)*(B<=20)))}
which gives me only the rows where both columns match.
So far, I have settled for the solution of adding a column D with the lowest value of A and B, but it bugs me so much that I can't do it with formulas.
Any help would be highly appreciated, so thanks in advance. All I have found when googling is the "multiple criteria for same column" problem.
Thanks. That works. Found another one that works, after I figured out that Excel does not treat 1 + 1 = 1 as I learnt in discrete mathematics, but, as you say, counts both of the TRUEs. Tried instead with:
{=SUM(IF((A<=20)+(B<=20);C;0))}
But I like yours better.
The "summing twice" problem in this formula
{=SUMPRODUCT(C,((A<=20)+(B<=20)))}
is due to addition turning the first TRUE plus the second TRUE into 2. It is not actually summing every row twice: for any row where only one condition is met, that row is counted only once.
The solution is to transform either the 1 or the 2 into a 1, using an IF:
{=SUMPRODUCT(C,IF((A<=20)+(B<=20)>0,1,0))}
That way, each value in column C is counted at most once.
Following this site, you could build up your SUMPRODUCT() formula like this:
=SUMPRODUCT(C,SIGN((A<=20)+(B<=20)))
So, instead of a nested IF(), you control your OR condition with the SIGN() function.
hth
If you plan to use a large set of data then it is best to use the array formula:
{=SUM(IF((A1:A5<=20)+(B1:B5<=20),C1:C5,0))}
Obviously adjust the range to suit the data set, however if the whole of each column is to form part of the formula then you can simply adjust to:
{=SUM(IF((A:A<=20)+(B:B<=20),C:C,0))}
This will perform the calculation on all rows of data within the A, B and C columns. With either example remember to press Ctrl + Shift + Enter in order to trigger the array formula (as opposed to typing the { and }).
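The difference between adding the two conditions, multiplying them, and a true OR can be checked with a short Python sketch. It assumes the five sample rows from the question and a threshold of 20; the function and variable names are illustrative only.

```python
def sum_if_either(a, b, c, limit=20):
    """Sum c where at least one of a or b is <= limit (each row counted once)."""
    return sum(ci for ai, bi, ci in zip(a, b, c) if ai <= limit or bi <= limit)


# Sample data from the question (columns A, B, C).
a = [4, 5, 100, 15, 21]
b = [20, 100, 21, 21, 24]
c = [7, 3, 4, 4, 8]

# Adding the conditions weights a row by 2 when both are true.
added = sum(ci * ((ai <= 20) + (bi <= 20)) for ai, bi, ci in zip(a, b, c))
# Multiplying the conditions keeps only rows where both are true.
multiplied = sum(ci * ((ai <= 20) * (bi <= 20)) for ai, bi, ci in zip(a, b, c))

print(added)                   # 21
print(multiplied)              # 7
print(sum_if_either(a, b, c))  # 14
```

Only row 1 meets both conditions, so its value of 7 is the whole gap between the added total (21) and the OR total (14).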
I'm working on taking information from a table like so:
A 1
2
B 3
1
4
C 2
5
Essentially, a series of sets (A,B,C) with their elements arranged vertically beside them.
What I'm trying to do is retrieve the list of column 1 values that have a certain value in column 2. For instance, if the lookup value for column 2 was 1, I would want A and B to match, but not C. Best case scenario, I could generate a new column containing the matches. Is there a way to do this without resorting to VBA?
EDIT:
The data I am working with is not so clean, here's a doctored version of it
1 2 3 4
83 Fun Edit ZZZZZZ*AAAAAA 210
365,400 176
210
85 Fun Edit 600,500 205
MEDICARE[705] 176
200
The extracted data does not like to preserve relationships between data beyond the column 1 identifier. In this case, the information in column 3 "###, ###" comes from item 176 in column 4. So filling down and taking the row will result in issues downstream.
In the long run, the data in column 4 is just a key for matching the information in this extract with another one.
I appreciate everyone's help thus far, and apologize for my insufficient original example.
Here's a short workflow that will do it:
Select the entire range
Press Ctrl+G (Goto)
Click Special
Tick Blanks and OK
Type = and arrow up. You should have a formula that looks like =A1
Press Ctrl+Enter. At this point all the missing alpha values should be filled in.
Apply Autofilter and filter the numbers to show only 1
If you want to use the filtered alpha list elsewhere, copy the values showing, and paste elsewhere.
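The fill-down-then-filter workflow above can be sketched in Python to show what the final filtered list should contain. This assumes the clean example from the original question, with blank labels represented as None; `sets_containing` is a hypothetical helper and does not address the messier data in the edit.

```python
def sets_containing(rows, target):
    """Fill blank labels down from the row above, then list labels whose values include target."""
    current = None
    matches = []
    for label, value in rows:
        if label is not None:
            current = label  # equivalent to filling the blank cells with =A1
        if value == target and current not in matches:
            matches.append(current)
    return matches


# Original example: label column is blank (None) except on the first row of each set.
rows = [("A", 1), (None, 2), ("B", 3), (None, 1), (None, 4), ("C", 2), (None, 5)]
print(sets_containing(rows, 1))  # ['A', 'B']
```

With a lookup value of 1 the result is A and B but not C, as the question expects.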