How to find row with specific column value only - python-3.x

I am trying to find the names that have only one specific column value and nothing else.
I have tried filtering the rows by the column value, but that isn't what I want: I want the names of people who only went to eat pizza.
So my code should return only john and not peter, because john had nothing but pizza.

Your description is not clear. At first it looks like a simple .loc would be enough, but after viewing your picture of the sample data I realized it is not that simple. You need to identify the names, duplicated or not, that have only one distinct Restaurant value, and pick those. To do this, group by Name, use nunique on Restaurant, check the result with eq(1), and assign it to a mask m. Finally, slice df with m to get your desired output:
Your sample data:
In [512]: df
Out[512]:
Name Restaurant
0 john pizza
1 peter kfc
2 john pizza
3 peter pizza
4 peter kfc
5 peter pizza
6 john pizza
m = df.groupby('Name').Restaurant.transform('nunique').eq(1)
df[m]
Out[513]:
Name Restaurant
0 john pizza
2 john pizza
6 john pizza
If you want to show only one row, just chain an additional .drop_duplicates():
df[m].drop_duplicates()
Out[515]:
Name Restaurant
0 john pizza
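As a self-contained sketch of the approach above, the snippet below rebuilds the sample frame and wraps the mask in a small function. The function name names_with_only is made up for illustration. It also adds one check that the answer does not include, as an assumption about the intent: the single value must actually equal the value asked for, so a name that only ever had kfc would not be returned for pizza.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["john", "peter", "john", "peter", "peter", "peter", "john"],
    "Restaurant": ["pizza", "kfc", "pizza", "pizza", "kfc", "pizza", "pizza"],
})

def names_with_only(frame, value):
    """Hypothetical helper: names whose rows all share the single given value."""
    # True on every row whose Name has exactly one distinct Restaurant.
    only_one = frame.groupby("Name")["Restaurant"].transform("nunique").eq(1)
    # Extra check (an assumption about the intent): that one value must be `value`.
    matches = frame["Restaurant"].eq(value)
    return frame.loc[only_one & matches, "Name"].drop_duplicates().tolist()

only_pizza = names_with_only(df, "pizza")
print(only_pizza)
```

With the sample data this prints a list containing only john, since peter also has kfc rows.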

Related

Google sheets formula to get the Top 5 List With Duplicates

I'm trying to compile a best 5 and worst 5 list. I have two columns: column B with the numeric score and column C with the name. I only want the list to include the name.
In my previous attempts the formula would get the top/bottom 5 but as soon as a duplicate score appeared the first known name with that value would just repeat.
Here is my data
26 Cal
55 John
55 Mike
100 Steve
26 Thomas
100 Jaden
100 Jack
95 Josh
87 Cole
75 Brett
I've managed to get the bottom 5 list formula correct. This formula works perfectly and includes all names of duplicate scores.
Example of what I get:
Cal
Thomas
John
Mike
Brett
=INDEX($C$56:$E$70,SMALL(IF($B$56:$B$70=SMALL($B$56:$B$70,ROWS(E$2:E2)),ROW($B$56:$B$70)-ROW($B$56)+1),SUM(IF($B$56:$B$70=SMALL($B$56:$B$70,
ROWS(E$2:E2)),1,0))-SUM(IF($B$56:$B$70<=SMALL($B$56:$B$70,ROWS(E$2:E2)),1,0))+ROWS(E$2:E2)))
Here is the formula I've tried to get the top 5 - however I keep getting an error.
=INDEX($C$56:$E$70,LARGE(IF($B$56:$B$70=LARGE($B$56:$B$70,ROWS(E$2:E2)),ROW($B$56:$B$70)-ROW($B$56)+1),SUM(IF($B$56:$B$70=LARGE($B$56:$B$70,
ROWS(E$2:E2)),1,0))-SUM(IF($B$56:$B$70<=LARGE($B$56:$B$70,ROWS(E$2:E2)),1,0))+ROWS(E$2:E2)))
Example of what I'm looking for
Steve
Jaden
Jack
Josh
Cole
You can set two queries like this for both cases:
=QUERY(B56:C70,"Select C order by B desc limit 5")
=QUERY(B56:C70,"Select C order by B limit 5")
Use the SORTN() function. The last argument is the sort direction: 1 sorts ascending (bottom 5), 0 sorts descending (top 5):
=SORTN(A1:B10,5,,1,1)
To keep only one column, wrap SORTN() in INDEX() and specify the column number:
=INDEX(SORTN(A1:B10,5,,1,1),,2)
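To sanity-check the expected lists outside the spreadsheet, here is a minimal Python sketch of the same idea, using the sample data from the question. It relies on Python's stable sort so that names with tied scores keep their sheet order. The helper name top_names is made up for illustration and is not part of any Sheets function.

```python
# (score, name) pairs from the question, in sheet order.
rows = [
    (26, "Cal"), (55, "John"), (55, "Mike"), (100, "Steve"), (26, "Thomas"),
    (100, "Jaden"), (100, "Jack"), (95, "Josh"), (87, "Cole"), (75, "Brett"),
]

def top_names(rows, n, descending=True):
    """Hypothetical helper: names for the n highest (or lowest) scores."""
    # sorted() is stable, even with reverse=True, so ties keep sheet order.
    ordered = sorted(rows, key=lambda row: row[0], reverse=descending)
    return [name for _score, name in ordered[:n]]

best_five = top_names(rows, 5)
worst_five = top_names(rows, 5, descending=False)
print(best_five)
print(worst_five)
```

Like the QUERY version with limit 5, this cuts the list at exactly five names, so a tie that straddles the cutoff is broken by sheet order.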

Flag duplicate relationship in excel

I have data available in excel as given below:
Below is the criterion for the expected column:
Duplicate relationships should be color coded/flagged.
I have added the expected result in column G, "Is duplicated?".
To achieve that I tried using Excel's MATCH function, but it doesn't meet my requirement.
Please suggest the correct approach.
You can create a sorted csv of each of the words on each row:
=TEXTJOIN(",",FALSE,SORT($A3:$D3,1,1,TRUE))
name relation name relation sorted_csv
Milly Wife Jack Husband Husband,Jack,Milly,Wife
Jack Husband Milly Wife Husband,Jack,Milly,Wife
Reacher Son Jack Father Father,Jack,Reacher,Son
Reacher Son Jack Mother Jack,Mother,Reacher,Son
Then you can count the rows by sorted_csv:
=COUNTIF($E$3:$E$6,$E3)
name relation name relation sorted_csv count by sorted_csv
Milly Wife Jack Husband Husband,Jack,Milly,Wife 2
Jack Husband Milly Wife Husband,Jack,Milly,Wife 2
Reacher Son Jack Father Father,Jack,Reacher,Son 1
Reacher Son Jack Mother Jack,Mother,Reacher,Son 1
Any row that has a count greater than 1 is a duplicate of another row.
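The same two steps (build an order-independent key per row, then count the keys) can be sketched in Python for anyone checking the logic outside Excel. This is only an illustration on the four sample rows. The variable names are made up, and Python's string sort is case-sensitive, which does not matter for this sample.

```python
from collections import Counter

# The four sample rows: name, relation, name, relation.
rows = [
    ("Milly", "Wife", "Jack", "Husband"),
    ("Jack", "Husband", "Milly", "Wife"),
    ("Reacher", "Son", "Jack", "Father"),
    ("Reacher", "Son", "Jack", "Mother"),
]

def sorted_csv(row):
    """Join the row's cells in sorted order, like TEXTJOIN over SORT."""
    return ",".join(sorted(row))

keys = [sorted_csv(row) for row in rows]
counts = Counter(keys)

# A row is a duplicate when its key occurs more than once.
is_duplicated = [counts[key] > 1 for key in keys]

for key, flag in zip(keys, is_duplicated):
    print(key, counts[key], flag)
```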

EXCEL Get top 3 largest numbers in repetitive array

I have an array of people with their scores in another column. I need to find the top 3 people with the highest scores and print their names.
Example:
Maria 1
Thomas 4
John 3
Jack 2
Ray 2
Laura 4
Kate 3
Result should be:
Thomas
Laura
John
What I get:
Thomas
Thomas
John
What I get:
Thomas
John
#NUM!
I have tried using LARGE, MATCH, MIN and MAX, but nothing works.
My first failed attempt:
=INDEX($A$2:$A$8; MATCH(LARGE(($B$2:$B$8);{1;2;3}); $B$2:$B$8;0))
My second failed attempt:
{=INDEX($A$2:$A$14;SMALL(IF($B$2:$B$14=MAX($B$2:$B$14);ROW($B$2:$B$14)-1);ROW(B4)-1))}
Put this in the second row of the output column (column D in this formula, so cell D2):
=INDEX(A:A,AGGREGATE(15,7,ROW($B$1:$B$7)/((COUNTIF($D$1:D1,$A$1:$A$7)=0)*($B$1:$B$7=LARGE(B:B,ROW(1:1)))),1))
Then drag it down three rows.
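The logic of that formula can be read as: for the k-th output row, take the k-th largest score, then pick the first person with that score who has not been listed yet. A rough Python sketch of that reading, on the sample data from the question, is shown below. The function name is hypothetical, and it assumes names are unique, as the COUNTIF exclusion in the formula does.

```python
# (name, score) pairs from the question, in sheet order.
people = [
    ("Maria", 1), ("Thomas", 4), ("John", 3), ("Jack", 2),
    ("Ray", 2), ("Laura", 4), ("Kate", 3),
]

def top_n_names(people, n):
    """Hypothetical helper mirroring LARGE plus a not-yet-listed check."""
    ranked_scores = sorted((score for _name, score in people), reverse=True)
    picked = []
    for k in range(min(n, len(people))):
        target = ranked_scores[k]  # k-th largest score, like LARGE(B:B, k)
        for name, score in people:
            # First matching row whose name has not been output yet.
            if score == target and name not in picked:
                picked.append(name)
                break
    return picked

top_three = top_n_names(people, 3)
print(top_three)
```

On the sample data this yields Thomas, Laura, John, which is the result the question asks for.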

Extract list in excel based on criteria

I have a list of names and a list of categories in a table.
Example:
Name Category 1 Category 2 Category 3
Jane Doe X X
Bill Smith X X
Eric Hamilton X
From that list, I want to list the people for each category.
Example:
Category 1 Category 2 Category 3
Jane Doe Jane Doe Bill Smith
Bill Smith Eric Hamilton
Is there a way I can do this in excel?
I found this video which seems to accomplish what I want. The formula is a bit more complicated than what I was hoping for, but it worked. I just removed some of the absolute cell references and copied the formula for the number of categories I currently have and it grouped the users properly.
https://www.youtube.com/watch?v=QkHfZtvC7UQ
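The video's formula is not reproduced here, but the grouping itself can be sketched in a few lines of Python. Note the assumption: the column positions of the X marks are not recoverable from the flattened table in the question, so the marks below are one placement consistent with the expected output, chosen purely for illustration.

```python
CATEGORIES = ["Category 1", "Category 2", "Category 3"]

# ASSUMPTION: the exact X placement is a guess that fits the expected output.
marks = {
    "Jane Doe": {"Category 1", "Category 2"},
    "Bill Smith": {"Category 1", "Category 3"},
    "Eric Hamilton": {"Category 2"},
}

# For each category, list the people marked with an X, in table order.
by_category = {
    category: [name for name, cats in marks.items() if category in cats]
    for category in CATEGORIES
}

for category, names in by_category.items():
    print(category, names)
```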

Excel - Transposing unsorted items

Let's say that I have a table like this:
TID Person Type Name
1 Andy F Orange
2 Andy M Beef
3 Andy V Carrot
4 Andy V Spinach
5 Bobby M Ham
6 Bobby F Apple
7 Bobby V Carrot
I want to transpose it so that it is arranged by Type, like so:
Person F M V
Andy Orange Beef Carrot
Bobby Apple Ham Carrot
How can I manage to do this? I'll also point out a few things in case you missed them:
1. The Types have no particular order. For Andy the order is F M V V, but for Bobby it is M F V.
2. Multiple instances of a Type may occur, as in Andy's case with the double V. Even so, I want only the first V to count. That's why V is Carrot in the transposed table: Carrot occurred first, and Spinach is ignored.
I don't know if I'm asking too much, but even just the gist of a solution would be very helpful. The main point of my question is how to transpose such unsorted items while respecting the 1st point. The 2nd point is important too, but I can wait or ask later if you don't feel like answering it.
Thanks for reading, and please share your knowledge.
The easiest/quickest way is to create a new column before the TID column with this formula in it:
=[Person]&"_"&[Type]
For instance, if your data starts in column B (TID), the first formula would be:
=C2&"_"&D2, which results in Andy_F. Copy this down for all the names you have.
You should have something like this:
NEW TID Person Type Name
Andy_F 1 Andy F Orange
Andy_M 2 Andy M Beef
Andy_V 3 Andy V Carrot
Andy_V 4 Andy V Spinach
Bobby_M 5 Bobby M Ham
Bobby_F 6 Bobby F Apple
Bobby_V 7 Bobby V Carrot
Next, set up a table like this (using copy unique items, if necessary), with unique names on the vertical and Types along the horizontal:
F M V
Andy [form]
Bobby
Where [form] is a VLOOKUP formula. Once it is copied to all cells in the new table, it produces the table you want.
VLOOKUP grabs the first item that matches its search criteria, so later matches are ignored.
The formula for Andy F in the table is VLOOKUP($G2&"_"&H$1,$A$2:$E$8,5,0), with the data in A2:E8, the names down column G and the Types across row 1 starting at column H.
A better way might be to use VBA, but this should do the trick.
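For comparison, the same "first match wins" pivot can be sketched in Python rather than VBA. This is only an illustration of the idea on the sample table. The function name pivot_first is made up, and it keeps the first Name seen for each Person/Type pair, just as VLOOKUP with an exact match returns the first hit.

```python
# (TID, Person, Type, Name) rows from the question.
records = [
    (1, "Andy", "F", "Orange"),
    (2, "Andy", "M", "Beef"),
    (3, "Andy", "V", "Carrot"),
    (4, "Andy", "V", "Spinach"),
    (5, "Bobby", "M", "Ham"),
    (6, "Bobby", "F", "Apple"),
    (7, "Bobby", "V", "Carrot"),
]

def pivot_first(records):
    """Hypothetical helper: person -> {type: first name seen for that type}."""
    table = {}
    for _tid, person, type_, name in records:
        # setdefault only stores a value if the key is not present yet,
        # so later duplicates (Andy's Spinach) are ignored.
        table.setdefault(person, {}).setdefault(type_, name)
    return table

pivot = pivot_first(records)

# Print in a fixed Type order, regardless of the order in the source rows.
types = ["F", "M", "V"]
print("Person", *types)
for person, by_type in pivot.items():
    print(person, *(by_type.get(t, "") for t in types))
```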
