Using Pivot Tables to track user chat log - python-3.x

I have a platform tracking multiple customer and rep chats. I want to create data frames containing a single customer's entire, ordered chat history.
Here's a link to the example data
And for quick reference:
Convo Room  Date       Message Order  User ID  Role      Chat contents
A1          3-Oct-17   1              JOHN     CUSTOMER  Hi, can you help?
A1          4-Oct-17   2              ALICE    REP       Sure, what's up?
A1          5-Oct-17   3              JOHN     CUSTOMER  I have warts.
A1          6-Oct-17   4              JOHN     CUSTOMER  Please don't hang up, it's just warts.
B1          7-Oct-17   1              JOHN     CUSTOMER  Hi, can YOU help?
B1          8-Oct-17   2              MARY     REP       Sure, I heard about Alice.
B1          9-Oct-17   3              MARY     REP       I also have warts.
B1          10-Oct-17  4              JOHN     CUSTOMER  Oh, nevermind then, gotta go.
C1          7-Oct-17   1              JIM      CUSTOMER  Hi, can you help?
C1          8-Oct-17   2              ALICE    REP       Maybe, what's up?
C1          9-Oct-17   3              JIM      CUSTOMER  Not warts.
C1          10-Oct-17  4              ALICE    REP       Good, that's the only thing I cannot handle.
D1          15-Oct-17  1              JOHN     CUSTOMER  Hi, pls help. Warts.
D1          16-Oct-17  2              JUDE     REP       Perfect, I cure them!
D1          17-Oct-17  3              JUDE     REP       …with fire.
D1          18-Oct-17  4              JUDE     REP       Are you still there? Dang, lost another one.
In my mind, the first step is to get the data ordered using pivot tables. Next, I can focus on separating chats into data frames for sentiment analysis or other metrics.
I believe I am close, but I keep getting one part of the sorting wrong.
What I have so far:
df = test.pivot_table(index=['Role', 'User ID', 'Date', 'Convo Room', 'Message Order'], columns=['Role'], aggfunc='first')
df.head()
Which returns the following:
Using Excel, I believe this is generally what I want though I'm certain there are many ways to visualize this:

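You are on the right track; you just need to pass the right columns to pivot_table. Leave 'Role' out of the index (it is already the columns key) and set values to the chat text. Call it on the raw frame (test in your code), not on the already pivoted df:
test.pivot_table(index=['User ID', 'Convo Room', 'Message Order'], columns=['Role'], values='Chat contents', aggfunc='first')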
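For reference, here is a minimal, self-contained sketch of that call on the first two messages of rooms A1, B1 and C1 from the question. It assumes the raw log is loaded into a frame named `test` with the column names shown in the question; the `pivoted` name is only illustrative.

```python
import pandas as pd

# A few rows copied from the sample data in the question.
rows = [
    ("A1", "3-Oct-17", 1, "JOHN", "CUSTOMER", "Hi, can you help?"),
    ("A1", "4-Oct-17", 2, "ALICE", "REP", "Sure, what's up?"),
    ("B1", "7-Oct-17", 1, "JOHN", "CUSTOMER", "Hi, can YOU help?"),
    ("B1", "8-Oct-17", 2, "MARY", "REP", "Sure, I heard about Alice."),
    ("C1", "7-Oct-17", 1, "JIM", "CUSTOMER", "Hi, can you help?"),
    ("C1", "8-Oct-17", 2, "ALICE", "REP", "Maybe, what's up?"),
]
columns = ["Convo Room", "Date", "Message Order", "User ID", "Role", "Chat contents"]
test = pd.DataFrame(rows, columns=columns)

# One row per (user, room, message); one column per role.
# aggfunc="first" is needed because the values are text, not numbers.
pivoted = test.pivot_table(
    index=["User ID", "Convo Room", "Message Order"],
    columns=["Role"],
    values="Chat contents",
    aggfunc="first",
)
print(pivoted)

# Everything a single user wrote, ordered by room and message number.
print(pivoted.loc["JOHN"])
```

Each user's messages land in the column for their own role, and the other role's column is NaN for that row.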

Related

GSuite App Maker - how to pivot datasources

I have a simple app that takes attendance for a list of students. There are two datasources, a students table with first name, last name, id, and site, and an attendance table with first name, last name, date and present.
I want to be able to log attendance by getting a list of the students, entering a date, and checking a box if they're in attendance or not (boolean column).
What I would like to do is pivot the attendance data for a new view so that instead of having one column of distinct dates, I'll have a column for each date showing the value of the checkbox.
Ex:
Attendance 1:
First Last Date Present
Bob Smith 10/1 0
Bob Smith 10/2 1
Bob Smith 10/3 1
Kevin Brown 10/1 1
Kevin Brown 10/2 1
Kevin Brown 10/3 1
New Pivoted View:
First Last 10/1 10/2 10/3
Bob Smith 0 1 1
Kevin Brown 1 1 1
Is there a simple way to get this result in App Maker?
EDIT: For clarification, the primary purpose of the app is to capture attendance data in a classroom setting. So there is a flow where a teacher pre-populates a list of students and then checks the boxes down the line to log whether each student was present or absent.
What I would like to be able to do is provide a page that presents the attendance data in a wide format so teachers can also look across the columns to see who was there on a given day.

How to find row with specific column value only

I am trying to find the names that have only one specific column value and nothing else.
I have tried filtering the rows by the column value, but that isn't what I want: I want the names of people who only went to eat pizza.
So my code should return john only, and not peter, because john only had pizza.
Your description is not clear. At first it looks like a simple .loc would be enough, but your sample data shows it is not that simple. You need to find the names that have exactly one distinct Restaurant value, whether or not their rows are duplicated. To do this, use nunique within each Name group, check that it equals 1 with eq(1), and store the result as a mask m. Finally, slice with m to get your desired output:
Your sample data:
In [512]: df
Out[512]:
Name Restaurant
0 john pizza
1 peter kfc
2 john pizza
3 peter pizza
4 peter kfc
5 peter pizza
6 john pizza
m = df.groupby('Name').Restaurant.transform('nunique').eq(1)
df[m]
Out[513]:
Name Restaurant
0 john pizza
2 john pizza
6 john pizza
If you want to show only one row, just chain additional .drop_duplicates
df[m].drop_duplicates()
Out[515]:
Name Restaurant
0 john pizza
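As a self-contained variant of the above, the sketch below rebuilds the sample frame and also checks that the single value is actually pizza. That extra check is an addition, not part of the answer's code: with the mask alone, a person who only ever went to kfc would be returned too. The names `only_one`, `is_pizza` and `pizza_only_names` are illustrative.

```python
import pandas as pd

# Sample data from the answer above.
df = pd.DataFrame({
    "Name": ["john", "peter", "john", "peter", "peter", "peter", "john"],
    "Restaurant": ["pizza", "kfc", "pizza", "pizza", "kfc", "pizza", "pizza"],
})

# True for every row whose Name has exactly one distinct Restaurant.
only_one = df.groupby("Name")["Restaurant"].transform("nunique").eq(1)

# True for every row where the restaurant is pizza.
is_pizza = df["Restaurant"].eq("pizza")

# Names that went to one restaurant only, and that restaurant was pizza.
pizza_only_names = df.loc[only_one & is_pizza, "Name"].unique().tolist()
print(pizza_only_names)  # ['john']
```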

Excel VBA to find non unique values with multiple conditions

I am looking for some help trying to create an Excel macro. I have a very large sheet that looks a bit like this:
Account NAME Address Dealer
68687 Sara 11 Wood 1111
68687 Sara 11 Wood 1111
68687 Sara 11 Wood 1111
12345 Tom 10 Main 7878
12345 Tom 10 Main 7878
54321 Tom 10 Main 7878
10101 John 25 Lake 3232
10101 25 Lake 3232
11111 John 25 Lake 3232
What I need to do is to highlight all the rows on the sheet where each Dealer has more than one unique value in the Account column, but it must also have some value in the name column.
So in the above example I would only want to highlight all the rows for dealer 7878.
I am not certain if I should look at loops or arrays, they might take a long time as the sheet is quite large.
Looking for some help.
Thanks.
James - Dirk gave you a good answer in his comment. It looks like this ...
The format formula is also put into Column F, so you can see the results of the calculation.
If you feel you should still have a VBA solution, this gives you a good starting point for how to layout your code ...
1. Ignore rows with empty name
2. Count rows where the dealer is the same as the dealer in the current row, and the account is NOT the same as the account in the current row
3. If the count found in Step 2 is greater than 0, highlight the current row.
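The same three steps can be sketched outside Excel to check the logic. The Python/pandas version below is only an illustration under assumptions: it is not VBA, the frame `sheet` and the `Highlight` column are made-up names, and it assumes rows with an empty name still count toward step 2, which the steps above do not specify.

```python
import pandas as pd

# Sample rows from the question; the empty string is the missing name.
sheet = pd.DataFrame({
    "Account": [68687, 68687, 68687, 12345, 12345, 54321, 10101, 10101, 11111],
    "Name": ["Sara", "Sara", "Sara", "Tom", "Tom", "Tom", "John", "", "John"],
    "Address": ["11 Wood"] * 3 + ["10 Main"] * 3 + ["25 Lake"] * 3,
    "Dealer": [1111, 1111, 1111, 7878, 7878, 7878, 3232, 3232, 3232],
})

# Step 1: rows with an empty name are never highlighted.
has_name = sheet["Name"].str.strip().ne("")

# Step 2: "some row has the same dealer but a different account" is the same
# as "this dealer has more than one distinct account".
accounts_per_dealer = sheet.groupby("Dealer")["Account"].transform("nunique")

# Step 3: highlight when both conditions hold.
sheet["Highlight"] = has_name & accounts_per_dealer.gt(1)
print(sheet)
```

Applied literally to this sample, the steps flag all three rows of dealer 7878 and also the two named rows of dealer 3232, because accounts 10101 and 11111 differ. If only dealer 7878 should match, an extra rule is needed, for example skipping any dealer that has a row without a name.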

List the next couple of occurring birthdays in Excel

Given the following data
A B
Steven 01/05/1958
Mike 05/12/1923
Bob 05/11/2001
Richard 10/22/1985
Maverick 12/25/1991
Ed 01/07/1954
I'd like to get a list in, let's say, column D containing the next couple of birthdays that will occur.
So if today was 05/05/2016, I'd like to see
D E
Bob 05/11/2001
Mike 05/12/1923
My current approach (yet not working properly) is to create another column and have the days until the birthday calculated there, using this formula:
=DATE(YEAR(B2)+DATEDIF(B2+1;TODAY();"y")+1;MONTH(B2);DAY(B2))-TODAY()
Then I list the birthdays that come up in the next 5 days using:
=IF(ISERROR(INDEX($A$2:$C$5,SMALL(IF($A$2:$C$5<5,ROW($A$2:$A$5)),ROW(1:1)),2)),"",INDEX($A$2:$C$5,SMALL(IF($A$2:$A$5<5,ROW($A$2:$A$5)),ROW(1:1)),2))
I'd rather have the next 5 upcoming birthdays, no matter how far away from today they are.
Any ideas how to achieve this without using macros?
Help is much appreciated!
To get the birthday difference from today in days :
=(DATEDIF($D$1,DATE(IF((DATE(YEAR($D$1),MONTH(B2),DAY(B2))>$D$1),YEAR($D$1),YEAR($D$1)+1),MONTH(B2),DAY(B2)),"D"))+0
The first BD from current date :
=VLOOKUP(SMALL(A2:A8,1)+0,A2:B8,2,FALSE)
Please see the img for more details :
Another approach would be to use the Advanced Filter. And you could automate it using VBA.
For the Criteria:
A2: =DATE(YEAR(TODAY()),MONTH(B6),DAY(B6))>=TODAY()
B2: =(TODAY()+$C$2)>=DATE(YEAR(TODAY()),MONTH(B6),DAY(B6))
Range is the number of days after today to show birthdays.
OK slightly different approach
Instead of counting days in a helper column, change the date in a helper column. Then sort that helper column for only the first 5 entries. This will show upcoming birthDAYS instead of birthDATES.
So assuming Names in column A, Dates in Column B, Column C is created with:
=DATE(YEAR(TODAY())+IF(TODAY()>DATE(YEAR(TODAY()),MONTH(B2),DAY(B2)),1,0),MONTH(B2),DAY(B2))
I was assuming no header row, with A1 as the first entry, so to display the next 5 entries in column D I used:
=IF(ROW()<=5,SMALL($C$1:$C$6,ROW()),"")
This will not pull names, just the upcoming birthDAYS, not birthDATES. The only difference between the two is the year.
If you want to pull the names as well you can use the following:
=IF(D2<>"",INDEX($A$1:$A$6,MATCH(D2,$C$1:$C$6,0)),"")
Right now it will not return names of multiple people with the same date but there are ways around that. If you need that too let us know.
(A) (B) (C) (D) (E)
Steven 58/01/05 17/01/05 16/05/11 Bob
Mike 23/05/12 16/05/12 16/05/12 Mike
Bob 01/05/11 16/05/11 16/10/22 Richard
Richard 85/10/22 16/10/22 16/12/25 Maverick
Maverick 91/12/25 16/12/25 17/01/05 Steven
Ed 54/01/07 17/01/07
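The helper-column idea (move each birth date to its next occurrence, then take the smallest few) can be checked outside Excel with a short Python sketch. This is only an illustration of the logic, not an Excel solution; the function names are made up, and it assumes no one was born on 29 February (that date would need special handling in non-leap years).

```python
from datetime import date

# Names and birth dates from the question.
birthdays = {
    "Steven": date(1958, 1, 5),
    "Mike": date(1923, 5, 12),
    "Bob": date(2001, 5, 11),
    "Richard": date(1985, 10, 22),
    "Maverick": date(1991, 12, 25),
    "Ed": date(1954, 1, 7),
}


def next_birthday(born, today):
    """Return the next occurrence of the birthday on or after today."""
    candidate = date(today.year, born.month, born.day)
    if candidate < today:
        # Already passed this year, so use next year (the IF(...,1,0) part).
        candidate = date(today.year + 1, born.month, born.day)
    return candidate


def upcoming(people, today, count=5):
    """Return the first `count` (name, next birthday) pairs, soonest first."""
    dated = [(name, next_birthday(born, today)) for name, born in people.items()]
    dated.sort(key=lambda pair: pair[1])
    return dated[:count]


for name, when in upcoming(birthdays, date(2016, 5, 5)):
    print(name, when)
```

With today set to 05/05/2016 this lists Bob, Mike, Richard, Maverick and Steven, the same order as columns D and E in the table above.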
UPDATE
In order to deal with duplicate birthdays... try the following:
=IF(E5<>"",INDEX($A$1:$A$6,MATCH(E5,$C$1:$C$6,0)+COUNTIF($E$1:E5,E5)-1),"")
That was the entry for row 5. I tested it with the same birthdate, but I forgot to check for same birthday (different years).
UPDATE 2
NEW TABLE
The table below matches the formula from the last update
(A) (B) (C) (D) (E) (F) (G)
Steven 58/01/05 17/01/05 59 16/05/11 Bob 15
Mike 23/05/12 16/05/12 93 16/05/12 Mike 93
Bob 01/05/11 16/05/11 15 16/10/22 Richard 31
Richard 85/10/22 16/10/22 31 16/12/25 Maverick 25
Maverick 91/12/25 16/12/25 25 16/12/25 Ed 21
Ed 95/12/25 16/12/25 21
I inserted a column in D which shifted things right. The following was placed in D1.
=YEAR(C1)-YEAR(B1)
In column G I had it look up the age
=IF(E1<>"",INDEX($D$1:$D$6,MATCH(E1,$C$1:$C$6,0)+COUNTIF($E$1:E1,E1)-1),"")

Search and return multiple rows in excel

This is my problem. I have a spreadsheet containing a lot of data. What I want to do is create a way to search for a recurring name and return all the information (rows) associated with it. Here is a mini example below:
A B C D E F
ID Name Date Client ID Balance Owed
100 Tom 1/11/11 256 300 200
100 Tom 1/12/11 565 500 150
100 Tom
200 Jay
200 Jay
300 Frank
100 Tom
100 Tom
400 Ted
You get the idea (I hope). So what I want to do on another sheet is search for "Tom" and get it to return ALL instances of Tom in the Name column and return the data in the rows associated to Tom. So I would get back 5 results of Tom with all the necessary information. Thanks in advance!
B
Have you tried the Pivot Table option in Excel? It might help you without any coding, if all you need is to find some duplicates.
Could just apply a filter to the data and have the user select their name from column B. No need to copy data this way, so the update issue goes away. (probably best to delete the blank row below the heading row first)
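If the data ever leaves Excel, the same "filter on the name" idea is a one-liner in Python with pandas. The sketch below is an assumption-laden illustration, not an Excel answer: it only uses the ID and Name columns (the other values are not fully shown in the question), and `rows_for` is a hypothetical helper name.

```python
import pandas as pd

# ID and Name columns from the example in the question.
data = pd.DataFrame({
    "ID": [100, 100, 100, 200, 200, 300, 100, 100, 400],
    "Name": ["Tom", "Tom", "Tom", "Jay", "Jay", "Frank", "Tom", "Tom", "Ted"],
})


def rows_for(frame, name):
    """Return every row whose Name matches, ignoring upper/lower case."""
    return frame[frame["Name"].str.casefold() == name.casefold()]


tom_rows = rows_for(data, "tom")
print(tom_rows)
```

As with the filter in Excel, nothing is copied by hand, so the result always reflects the current data.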
