Counting unique row values in Excel

I have a product list like this. I only have the list of product names.
Apple 1
Apple 1
Apple 1
Orange 2
Orange 2
Orange 2
Mango 3
Mango 3
Pineapple 4
Pineapple 4
Pineapple 4
Pineapple 4
Pineapple 4
Pineapple 4
Avocado 5
I want to count the data this way, with the number shown next to each name. Please help.

In B1 insert 1 and in B2 insert:
=IF(A2=A1,B1,B1+1)
And from there just fill.

Put 1 in B1 then put this formula in B2,
=if(a2<>a1, max(b$1:b1)+1, b1)
... and fill down.
Alternatively, just put this formula into B1,
=SUMPRODUCT(1/(COUNTIF(A$1:A1,A$1:A1)))
... and fill down.
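For readers who prefer to see the logic outside a spreadsheet, here is a minimal Python sketch of what the two formulas compute. The function names and the `products` list are made up for illustration; the first function assumes equal names sit next to each other, as in the question.

```python
def running_group_ids(names):
    """Start at 1 and add 1 each time the name differs from the previous row.

    Mirrors =IF(A2=A1,B1,B1+1) filled down; assumes equal names are adjacent.
    """
    ids = []
    for i, name in enumerate(names):
        if i == 0:
            ids.append(1)
        elif name == names[i - 1]:
            ids.append(ids[-1])
        else:
            ids.append(ids[-1] + 1)
    return ids


def running_distinct_counts(names):
    """Count the distinct names seen so far, like the SUMPRODUCT/COUNTIF formula."""
    seen = set()
    counts = []
    for name in names:
        seen.add(name)
        counts.append(len(seen))
    return counts


# Hypothetical column A, matching the question's data.
products = (["Apple"] * 3 + ["Orange"] * 3 + ["Mango"] * 2
            + ["Pineapple"] * 6 + ["Avocado"])

for name, group_id in zip(products, running_group_ids(products)):
    print(name, group_id)
```

On sorted data both functions give the same numbers. They differ when a name reappears later in the list: the first starts a new number, the second does not.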

Related

Filter rows based on the count of unique values

I need to count how often each value occurs in column A and keep only the rows whose value occurs at least, say, 2 times.
A C
Apple 4
Orange 5
Apple 3
Mango 5
Orange 1
I have calculated the unique values but am not able to figure out how to filter on them. I tried df.value_count()
I want to keep the rows whose value in column A occurs at least 2 times. Expected DataFrame:
A C
Apple 4
Orange 5
Apple 3
Orange 1
value_counts should be called on a Series (single column) rather than a DataFrame:
counts = df['A'].value_counts()
Giving:
A
Apple 2
Mango 1
Orange 2
dtype: int64
You can then filter this to only keep those >= 2 and use isin to filter your DataFrame:
filtered = counts[counts >= 2]
df[df['A'].isin(filtered.index)]
Giving:
A C
0 Apple 4
1 Orange 5
2 Apple 3
4 Orange 1
Use duplicated with parameter keep=False:
df[df.duplicated(['A'], keep=False)]
Output:
A C
0 Apple 4
1 Orange 5
2 Apple 3
4 Orange 1
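Both answers rely on the same idea: count how often each key occurs, then keep the rows whose key meets the threshold. A minimal sketch of that logic without pandas, using only the standard library, might look like this. The `rows` list and `min_count` name are illustrative assumptions, not part of the original answers.

```python
from collections import Counter

# Hypothetical data matching the question: (A, C) pairs in row order.
rows = [("Apple", 4), ("Orange", 5), ("Apple", 3), ("Mango", 5), ("Orange", 1)]

# Step 1: count occurrences of each value in column A.
counts = Counter(name for name, _ in rows)

# Step 2: keep rows whose value occurs at least min_count times,
# remembering the original row position like a DataFrame index would.
min_count = 2
kept = [
    (index, name, value)
    for index, (name, value) in enumerate(rows)
    if counts[name] >= min_count
]

for row in kept:
    print(row)
```

With `min_count = 2` this is also what `duplicated(..., keep=False)` selects: every row whose key appears more than once.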

If formula comparing two columns

I need a formula that says YES if the combination of values in Columns 3 & 4 of a row matches the combination of Columns 1 & 2 in some row, and NO otherwise. Column 5 shows what I would like to see. There is no Orange in Column 1 that has 22222222 in Column 2, so that row would show NO, whereas there is a Banana in Column 1 that has 22222222 in Column 2, so that row would show YES.
FIRST SET OF VALUES       SECOND SET OF VALUES
COLUMN 1    COLUMN 2      COLUMN 3    COLUMN 4    COLUMN 5
ORANGE      11111111      ORANGE      22222222    NO
BANANA      22222222      BANANA      22222222    YES
PEAR        33333333      PEAR        55555555    NO
PEAR        44444444      WATERMELON  55555555    YES
WATERMELON  11111111      GRAPES      66666666    YES
WATERMELON  55555555      PEACH       33333333    YES
GRAPES      66666666      PLUM        44444444    NO
GRAPES      77777777      PINEAPPLE   34343434    YES
GRAPES      22222222
PEACH       33333333
PLUM        88888888
PLUM        77777777
PINEAPPLE   99999999
PINEAPPLE   12121212
PINEAPPLE   34343434
PINEAPPLE   56565656
Use COUNTIFS():
=IF(COUNTIFS(I:I,M3,J:J,N3),"Yes","No")
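COUNTIFS here counts the rows where both conditions hold at once, so any non-zero count means the pair exists somewhere in the first set. The sketch below shows the same pair-membership test in Python, assuming the formula's columns I:J hold the first set and M:N the second; the variable names are illustrative.

```python
# First set of values (Columns 1 and 2), kept as (name, code) pairs.
first_set = [
    ("ORANGE", "11111111"), ("BANANA", "22222222"), ("PEAR", "33333333"),
    ("PEAR", "44444444"), ("WATERMELON", "11111111"), ("WATERMELON", "55555555"),
    ("GRAPES", "66666666"), ("GRAPES", "77777777"), ("GRAPES", "22222222"),
    ("PEACH", "33333333"), ("PLUM", "88888888"), ("PLUM", "77777777"),
    ("PINEAPPLE", "99999999"), ("PINEAPPLE", "12121212"),
    ("PINEAPPLE", "34343434"), ("PINEAPPLE", "56565656"),
]

# Second set of values (Columns 3 and 4) to look up.
second_set = [
    ("ORANGE", "22222222"), ("BANANA", "22222222"), ("PEAR", "55555555"),
    ("WATERMELON", "55555555"), ("GRAPES", "66666666"), ("PEACH", "33333333"),
    ("PLUM", "44444444"), ("PINEAPPLE", "34343434"),
]

# A set gives a fast "does this exact pair exist?" check.
known_pairs = set(first_set)

# Column 5: YES when the whole pair is found, NO otherwise.
answers = ["YES" if pair in known_pairs else "NO" for pair in second_set]

for (name, code), answer in zip(second_set, answers):
    print(name, code, answer)
```

Matching on the pair rather than on each column separately is what makes ORANGE 22222222 come out as NO even though both values appear individually.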

Add rows according to other rows

My DataFrame object is similar to this one:
Product StoreFrom StoreTo Date
1 out melon StoreQ StoreP 20170602
2 out cherry StoreW StoreO 20170614
3 out Apple StoreE StoreU 20170802
4 in Apple StoreE StoreU 20170812
I want to avoid duplicates: the 3rd and 4th rows describe the same movement. I am trying to reach this:
Product StoreFrom StoreTo Date Days
1 out melon StoreQ StoreP 20170602
2 out cherry StoreW StoreO 20170614
5 in Apple StoreE StoreU 20170812 10
I have more than 10k entries. I could not find similar work to this. Any help will be very useful.
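import pandas as pd
cols = ['Product', 'StoreFrom', 'StoreTo']  # assumed grouping keys; not defined in the original snippet
d1 = df.assign(Date=pd.to_datetime(df.Date.astype(str), format='%Y%m%d'))
# days elapsed since the first row of each group
d2 = d1.assign(Days=d1.Date - d1.groupby(cols).Date.transform('first'))
d2.drop_duplicates(cols, keep='last')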
io Product StoreFrom StoreTo Date Days
1 out melon StoreQ StoreP 2017-06-02 0 days
2 out cherry StoreW StoreO 2017-06-14 0 days
4 in Apple StoreE StoreU 2017-08-12 10 days
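The approach has two parts: measure each row's date against the first date seen for its group, then keep only the last row of each group. The following standard-library sketch walks through that logic step by step. It assumes the grouping key is (Product, StoreFrom, StoreTo), and the `records` list is hypothetical data copied from the question.

```python
from datetime import datetime

# Hypothetical rows: (io, Product, StoreFrom, StoreTo, Date as YYYYMMDD text).
records = [
    ("out", "melon", "StoreQ", "StoreP", "20170602"),
    ("out", "cherry", "StoreW", "StoreO", "20170614"),
    ("out", "Apple", "StoreE", "StoreU", "20170802"),
    ("in", "Apple", "StoreE", "StoreU", "20170812"),
]

first_seen = {}  # group key -> first date seen for that group
latest = {}      # group key -> most recent row, with elapsed days appended

for io, product, store_from, store_to, date_text in records:
    key = (product, store_from, store_to)
    date = datetime.strptime(date_text, "%Y%m%d").date()
    # setdefault stores the date only the first time a key appears.
    first_date = first_seen.setdefault(key, date)
    days = (date - first_date).days
    # Overwriting keeps only the last row per group.
    latest[key] = (io, product, store_from, store_to, date, days)

result = list(latest.values())
for row in result:
    print(row)
```

Rows are processed in their given order here, so "first" and "last" mean first and last by position, just as in the pandas version.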

Get Top Performer by Subgroup Using Index and Match

I am trying to rank names in Column C from largest to smallest score.
Category Score Name Total Rank Apple Rank Orange Rank
Apple 10 Joe Rachel Rachel 0
Orange 15 Don Natalie 0 Natalie
Apple 20 James Tom Tom 0
Apple 1 Rob Nothing Nothing 0
Orange 3 Mary Gina 0 Gina
Orange 100 Rachel James 0 James
Orange 99 Natalie Don 0 Don
Orange 87 Tom Joe 0 Joe
Apple 27 Gina Mary Mary 0
Orange 30 Nothing Rob 0 Rob
This works in Column E for Apples AND Oranges, with formula in E2 that is
=INDEX($C$2:$C$25,MATCH(1,INDEX(($B$2:$B$25=LARGE($B$2:$B$25,ROWS(E$1:E1)))*(COUNTIF(E$1:E1,$C$2:$C$25)=0),),0))
However, the goal is to compare Apples to Apples and Oranges to Oranges.
The problem is that the formulas in Columns F and G show "0" for the rows that aren't in the matching Apple/Orange category.
For F2:
=IF($A:$A="Apple",INDEX($C:$C,MATCH(1,INDEX(($B:$B=LARGE($B:$B,ROWS(F$1:F1)))*(COUNTIF(F$1:F1,$C:$C)=0),),0)),0)
For G2:
=IF($A:$A="Orange",INDEX($C:$C,MATCH(1,INDEX(($B:$B=LARGE($B:$B,ROWS(G$1:G1)))*(COUNTIF(G$1:G1,$C:$C)=0),),0)),0)
How do I modify the formulas so that the 0 values won't show up?
Something like this would be great: (screenshot made by just copy pasting values...)
Apple Rank Orange Rank
Rachel Natalie
Tom Gina
Nothing James
Mary Don
Joe
Rob
Note: unless whole-column ranges are really required, restrict the ranges; with whole columns the steps below may take an uncomfortably long time.
Assuming your data is in columns A:G with a corresponding layout, columns I:J may be achieved quite simply:
1. Copy columns F:G and Paste Special..., Values into I1.
2. Select columns I:J, then HOME > Editing > Find & Select, Replace..., Find what: 0, Replace with: (leave empty), Replace All.
3. Find & Select, Go To Special..., select Blanks (only), OK.
4. Right-click one of the selected cells and choose Delete..., Shift cells up, OK.
To remove the 0s from columns F:G only, replacing the final 0 in each formula with "" is sufficient.
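As a cross-check on what a per-category ranking should look like, here is a small Python sketch that groups the rows by category and sorts each group by score, largest first. It reflects one reading of the goal ("compare Apples to Apples"), not the asker's formulas, and the function name is made up for illustration.

```python
from collections import defaultdict

# Hypothetical rows copied from columns A:C of the question.
rows = [
    ("Apple", 10, "Joe"), ("Orange", 15, "Don"), ("Apple", 20, "James"),
    ("Apple", 1, "Rob"), ("Orange", 3, "Mary"), ("Orange", 100, "Rachel"),
    ("Orange", 99, "Natalie"), ("Orange", 87, "Tom"), ("Apple", 27, "Gina"),
    ("Orange", 30, "Nothing"),
]


def rank_by_category(rows):
    """Return {category: [names ordered from highest to lowest score]}."""
    groups = defaultdict(list)
    for category, score, name in rows:
        groups[category].append((score, name))
    return {
        category: [name for _, name in
                   sorted(items, key=lambda pair: pair[0], reverse=True)]
        for category, items in groups.items()
    }


ranking = rank_by_category(rows)
print("Apple Rank: ", ranking["Apple"])
print("Orange Rank:", ranking["Orange"])
```

Each list contains only names from its own category, so there are no filler entries to blank out afterwards.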

excel/vba - find first and last occurrence of a particular value in a column

So if I have a column such as:
A1
1 Apple
2 Apple
3 Apple
4 Oj
5 Oj
6 Oj
7 Oj
8 Pear
9 Pear
How could I return the values 1 & 3 for Apple, 4 & 7 for OJ, etc?
Formula-wise you can use MATCH functions, e.g. for first Apple position
=MATCH("Apple",A1:A9,0)
for last
=MATCH(2,INDEX(1/(A1:A9="Apple"),0))
or, if the fruit are sorted as in your example (or merely grouped), you can get the last position by adding the number of apples to the first position and subtracting 1.
So with the first MATCH function in C1 that would be
=COUNTIF(A1:A9,"Apple")+C1-1
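The same lookup can be sketched in Python to show what the formulas return. The helper below is illustrative only; it reports 1-based positions to match Excel row numbers and, like the two MATCH formulas, does not require the values to be grouped.

```python
def first_and_last(values, target):
    """Return (first, last) 1-based positions of target, or None if absent."""
    positions = [i for i, value in enumerate(values, start=1) if value == target]
    if not positions:
        return None
    return positions[0], positions[-1]


# Hypothetical column A1:A9 from the question.
column = ["Apple"] * 3 + ["Oj"] * 4 + ["Pear"] * 2

for fruit in ("Apple", "Oj", "Pear"):
    print(fruit, first_and_last(column, fruit))
```

When the values are grouped, `last == first + count - 1`, which is the shortcut the COUNTIF formula uses.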

Resources