I have two columns in my Excel sheet, Session_Start_time and Time_Taken. Session_Start_time holds a date and time, and Time_Taken holds the time taken to complete the session, as shown below.
For example:
Session_Start_time | Time_Taken
01-AUG-2016 00:03:57 | 10
01-AUG-2016 00:07:19 | 15
01-AUG-2016 00:10:28 | 10
02-AUG-2016 00:13:26 | 20
02-AUG-2016 00:20:26 | 30
02-AUG-2016 00:25:26 | 20
03-AUG-2016 03:20:26 | 30
03-AUG-2016 04:13:26 | 40
03-AUG-2016 07:13:26 | 40
I need to group Session_Start_time by date and get the average Time_Taken for each day:
Session_Start_time | Time_Taken
01-AUG-2016 | 11.67
02-AUG-2016 | 23.33
03-AUG-2016 | 36.67
You could add a third column that pulls out just the date of Session_Start_Time with the formula below starting in C2 and drag it down to fill:
=MONTH(A2)&"/"&DAY(A2)&"/"&YEAR(A2)
From there, you could create a pivot table with your new column as the row labels and Time_Taken as the values, summarized by Average rather than the default Sum.
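For anyone who wants to check the expected averages outside Excel, here is a minimal Python sketch of the same grouping. It is not part of the answer above; the function name `average_by_day` is made up for illustration, and it assumes the timestamp is text in the form shown in the question.

```python
from collections import defaultdict

# Sample rows copied from the question: (Session_Start_time, Time_Taken)
rows = [
    ("01-AUG-2016 00:03:57", 10),
    ("01-AUG-2016 00:07:19", 15),
    ("01-AUG-2016 00:10:28", 10),
    ("02-AUG-2016 00:13:26", 20),
    ("02-AUG-2016 00:20:26", 30),
    ("02-AUG-2016 00:25:26", 20),
    ("03-AUG-2016 03:20:26", 30),
    ("03-AUG-2016 04:13:26", 40),
    ("03-AUG-2016 07:13:26", 40),
]

def average_by_day(rows):
    """Group rows by the date part of the timestamp and average Time_Taken."""
    totals = defaultdict(lambda: [0, 0])  # date text -> [sum, count]
    for stamp, taken in rows:
        day = stamp.split()[0]  # keep "01-AUG-2016", drop the time of day
        totals[day][0] += taken
        totals[day][1] += 1
    return {day: round(total / count, 2) for day, (total, count) in totals.items()}

daily_avg = average_by_day(rows)
for day, avg in daily_avg.items():
    print(day, avg)
```

This plays the role of the helper column plus pivot table: the date text is the row label and the average is the value.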
I am having a problem, hope you can help.
I need to get the difference in hours between duplicates. Example:
Date Time | SESSION_ID | Column I need
24/01/2020 10:00 | 100 | NaN
24/01/2020 11:00 | 100 | 1
14/03/2020 12:00 | 290 | NaN
16/03/2020 13:00 | 254 | NaN
16/03/2020 14:00 | 100 | 1251
In the SESSION_ID column, the value 100 appears three times.
I need to know the difference in hours between those sessions, which would be 1 hour between the first and the second, and 1,251 hours between the second and the third.
Does anyone have any idea how this could be done?
If one has the Dynamic Array formula XLOOKUP, put this in C2 and copy down:
=IF(COUNTIF($B$1:B1,B2),A2-XLOOKUP(B2,$B$1:B1,$A$1:A1,,0,-1),"NaN")
Then format the column with the custom number format [h] so that the result is shown as total hours.
If not then use INDEX/AGGREGATE in its place:
=IF(COUNTIF($B$1:B1,B2),A2-INDEX(A:A,AGGREGATE(14,7,ROW($B$1:B1)/($B$1:B1=B2),1)),"NaN")
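The same "look back to the previous row with this ID" logic can be sketched in Python to cross-check the expected column. This is only an illustration, not the answer's method: `hours_since_previous` is a hypothetical name, and the rows are assumed to be in chronological order as in the question.

```python
from datetime import datetime

# Sample rows copied from the question: (Date Time, SESSION_ID)
rows = [
    ("24/01/2020 10:00", 100),
    ("24/01/2020 11:00", 100),
    ("14/03/2020 12:00", 290),
    ("16/03/2020 13:00", 254),
    ("16/03/2020 14:00", 100),
]

def hours_since_previous(rows):
    """For each row, hours since the previous row with the same ID (None if first)."""
    last_seen = {}  # session ID -> timestamp of its most recent earlier row
    result = []
    for stamp, session_id in rows:
        when = datetime.strptime(stamp, "%d/%m/%Y %H:%M")
        previous = last_seen.get(session_id)
        if previous is None:
            result.append(None)  # plays the role of "NaN" in the formulas
        else:
            result.append((when - previous).total_seconds() / 3600)
        last_seen[session_id] = when
    return result

gaps = hours_since_previous(rows)
print(gaps)
```

The dictionary lookup corresponds to the reverse search that XLOOKUP (or INDEX/AGGREGATE) performs over the rows above.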
I'm looking to remove duplicates from a 250,000 row excel sheet based on a 3 month rolling time condition.
We have a lot of user IDs and the dates on which they visited. Many of these visits are very far apart (sometimes over a year), and many fall within the same day or a couple of days.
The best way to explain what I want to do is with an example. So if they first visited on 1st Jan, 1st Jan, 3rd Jan, 8th Feb, 4th June, 5th June, 1st Dec, 1st Dec, 2nd Dec, I would want to grab that first date of 1st Jan, 4th June and 1st Dec.
If they visited 1st Jan, 1st Jan, 3rd Jan, 8th Feb, 9th Apr then 1st August, 1st Sept, I would want 1st Jan and 1st August.
So we want to grab the first date, then see how often they visit within 3 months of each visit and if they leave for more than a 3 month period, grab the first date that they return. Sometimes they come back 4 or 5 times after 3 months and the data can span several years.
Is there a way for me to achieve this? It would be great to get some help as this is driving me mad.
Cheers
If the UserID is in column A and the VisitDate is in column B, with the headings in row 1, a blank row 2, and the data starting in row 3, then try this (explanation below):
Array Formula version:
sort the rows ascending by VisitDate
in B2 put 1/1/1900 so it won't match anything (but it has to be a date)
in C3 put this array formula (press control-shift-enter instead of just enter):
=SUM((B$2:B2<DATE(YEAR(B3),MONTH(B3)-3,DAY(B3)))*(A$2:A2=A3))=SUM((A$2:A2=A3)*1)
Copy the formula in C3 down to every row of data
Filter on Unique = TRUE
if you want to re-sort afterwards, first copy column C and paste it back as values
New non-array formula version:
sort the rows ascending by VisitDate
in B2 put 1/1/1900 so it won't match anything (but it has to be a date)
in C3 put this normal formula (just press enter):
=COUNTIFS(B$2:B2,"<"&DATE(YEAR(B3),MONTH(B3)-3,DAY(B3)),A$2:A2,A3)=COUNTIF(A$2:A2,A3)
Copy the formula in C3 down to every row of data
Filter on Unique = TRUE
if you want to re-sort afterwards, first copy column C and paste it back as values
This produces the following with my sample data (array formulas may take a very long time to calculate for lots of rows):
| A | B | C
---+--------+------------+--------
1 | UserID | VisitDate | Unique
2 | | 1/01/1900 |
3 | a | 1/01/2017 | TRUE
4 | a | 1/01/2017 | FALSE
5 | b | 2/01/2017 | TRUE
6 | b | 2/01/2017 | FALSE
7 | a | 3/01/2017 | FALSE
8 | c | 3/01/2017 | TRUE
9 | c | 3/01/2017 | FALSE
10 | b | 4/01/2017 | FALSE
11 | c | 5/01/2017 | FALSE
12 | a | 8/02/2017 | FALSE
13 | b | 9/02/2017 | FALSE
14 | c | 10/02/2017 | FALSE
15 | a | 4/06/2017 | TRUE
16 | a | 5/06/2017 | FALSE
17 | b | 5/06/2017 | TRUE
18 | b | 6/06/2017 | FALSE
19 | c | 6/06/2017 | TRUE
20 | c | 7/06/2017 | FALSE
21 | a | 1/12/2017 | TRUE
22 | a | 1/12/2017 | FALSE
23 | a | 2/12/2017 | FALSE
24 | b | 2/12/2017 | TRUE
25 | b | 2/12/2017 | FALSE
26 | b | 3/12/2017 | FALSE
27 | c | 3/12/2017 | TRUE
28 | c | 3/12/2017 | FALSE
29 | c | 4/12/2017 | FALSE
Because the formula compares the current row with all the rows above it, looking for dates in the past, the data needs to be sorted with the oldest dates first.
How the array formula works:
=SUM((B$2:B2<DATE(YEAR(B3),MONTH(B3)-3,DAY(B3)))*(A$2:A2=A3))=SUM((A$2:A2=A3)*1)
DATE(YEAR(B3),MONTH(B3)-3,DAY(B3)) is 3 months ago (even if it is 92 days)
(B$2:B2<DATE(YEAR(B3),MONTH(B3)-3,DAY(B3))) is an array of TRUE/FALSE values which has a TRUE for every row above that is older than 3 months ago
(A$2:A2=A3) is an array of TRUE/FALSE values which has a TRUE for every row above that matches the user ID
(B$2:B2<DATE(YEAR(B3),MONTH(B3)-3,DAY(B3)))*(A$2:A2=A3) does an AND of the arrays so 1 is returned (TRUE*TRUE=1) for each row above that has the same name and a date that is older than 3 months ago
SUM((B$2:B2<DATE(YEAR(B3),MONTH(B3)-3,DAY(B3)))*(A$2:A2=A3)) adds all the TRUE rows above that have the same name and a date that is older than 3 months ago
SUM((A$2:A2=A3)*1) adds the number of rows above that have the same name (TRUE*1=1)
=SUM((B$2:B2<DATE(YEAR(B3),MONTH(B3)-3,DAY(B3)))*(A$2:A2=A3))=SUM((A$2:A2=A3)*1) compares the two sums and returns TRUE if all the rows above that have the same name are all older than 3 months ago
Methodology:
I originally just played with a column of dates, with no userID. I wanted to find a way to know if the date on a particular row was more than 3 months after all the dates before it (I implicitly assumed that the dates were sorted). I reasoned that if a count of the dates before the current row matched a count of the dates before the current row that were older than 3 months in the past, then I would have the answer I wanted. So I originally put this formula in C3 and copied it down:
=COUNTIF(B$2:B2,"<"&(B3-90))=COUNTA(B$2:B2)
Then change it to 3 months instead of 90 days:
=COUNTIF(B$2:B2,"<"&DATE(YEAR(B3),MONTH(B3)-3,DAY(B3)))=COUNTA(B$2:B2)
And then to add the userID we need a way to compare multiple criteria - this is where COUNTIFS comes in (if you have Excel 2007 or later):
=COUNTIFS(B$2:B2,"<"&DATE(YEAR(B3),MONTH(B3)-3,DAY(B3)),A$2:A2,A3)=COUNTIF(A$2:A2,A3)
And then I converted it to this array formula:
=SUM((B$2:B2<DATE(YEAR(B3),MONTH(B3)-3,DAY(B3)))*(A$2:A2=A3))=SUM((A$2:A2=A3)*1)
In retrospect I don't know if giving the array formula was a good idea or not: I don't know whether the array formula would be better/faster than COUNTIFS or not. So use whichever you prefer.
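To sanity-check the TRUE/FALSE column on a small sample, the rule can also be sketched in Python. This is an illustration under stated assumptions, not the answer's method: the helper names are made up, the visits must already be sorted ascending by date, and because of that sort "all earlier visits are older than 3 months" reduces to checking only the most recent earlier visit for the same user. `three_months_before` tries to mimic how DATE(YEAR,MONTH-3,DAY) rolls over when the day does not exist in the target month.

```python
from datetime import date, timedelta

def three_months_before(d):
    """Mimic DATE(YEAR(d), MONTH(d)-3, DAY(d)), including Excel's day rollover."""
    month_index = d.year * 12 + (d.month - 1) - 3
    year, month0 = divmod(month_index, 12)
    # Start at the 1st of the target month and add the remaining days,
    # so e.g. "31 Feb" rolls forward into March as it does in Excel.
    return date(year, month0 + 1, 1) + timedelta(days=d.day - 1)

def flag_first_visits(visits):
    """visits: (user_id, date) tuples sorted ascending by date.

    Returns one bool per row: True when the user has no earlier visit,
    or when the latest earlier visit is older than three months.
    """
    last_visit = {}  # user_id -> date of that user's most recent earlier row
    flags = []
    for user, day in visits:
        previous = last_visit.get(user)
        flags.append(previous is None or previous < three_months_before(day))
        last_visit[user] = day
    return flags

# User "a" from the sample table above
visits_a = [
    ("a", date(2017, 1, 1)), ("a", date(2017, 1, 1)), ("a", date(2017, 1, 3)),
    ("a", date(2017, 2, 8)), ("a", date(2017, 6, 4)), ("a", date(2017, 6, 5)),
    ("a", date(2017, 12, 1)), ("a", date(2017, 12, 1)), ("a", date(2017, 12, 2)),
]
flags_a = flag_first_visits(visits_a)
print(flags_a)
```

A single pass with a dictionary avoids rescanning all the rows above each row, which is the part that makes the spreadsheet formulas slow on 250,000 rows.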
In my theoretical data set, I have a list which shows the date-time of a sale, and the employee who completed the transaction.
I know how to do grouping in order to show how many sales each employee has per day, but I'm wondering if there's a way to count how many grouped days have more than 0 sales.
For example, here's the original data set:
Employee | Order Time
A | 8/12 8:00
B | 8/12 9:00
A | 8/12 10:00
A | 8/12 14:00
B | 8/13 10:00
B | 8/13 11:00
A | 8/13 15:00
A | 8/14 12:00
Here's the pivot table that I have created:
Employee | 8/12 | 8/13 | 8/14
A | 3 | 1 | 1
B | 1 | 2 | 0
And here's what I want to know:
Employee | Working Days
A | 3
B | 2
Split your Order Time column (assumed to be B) into two, say with Text to Columns and Space as the delimiter (might need a little adjustment). Then pivot (using the Data Model) as shown:
and sum the results (outside the PT) such as with:
=SUM(F3:H3)
copied down to suit.
Columns F:G may then be hidden.
I fully support @Andrea's comment (a correction) on the above:
I think this could have been made simpler. If you remove "Time" from the values of the pivot table, then move "Order" from columns to values and use Distinct Count as in the example, it should count the dates per Employee, making the sum unnecessary. If you scale this up, say to 50 dates, the =SUM() would otherwise need to be adjusted each time.
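The distinct-count idea can be illustrated outside the pivot table with a short Python sketch. The name `working_days` is hypothetical, and the order times are assumed to be text in the "date time" form shown in the question.

```python
from collections import defaultdict

# Sample rows copied from the question: (Employee, Order Time)
orders = [
    ("A", "8/12 8:00"),
    ("B", "8/12 9:00"),
    ("A", "8/12 10:00"),
    ("A", "8/12 14:00"),
    ("B", "8/13 10:00"),
    ("B", "8/13 11:00"),
    ("A", "8/13 15:00"),
    ("A", "8/14 12:00"),
]

def working_days(orders):
    """Count the distinct dates on which each employee has at least one order."""
    days = defaultdict(set)
    for employee, order_time in orders:
        days[employee].add(order_time.split()[0])  # date part only
    return {employee: len(dates) for employee, dates in days.items()}

result = working_days(orders)
print(result)
```

Using a set per employee is the equivalent of Distinct Count: repeated orders on the same day collapse into one entry.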
After years of quietly learning from this site I've finally hit a question whose answer I cannot seem to find on StackOverflow...
I have a pivot table that needs to calculate Net Promoter Score from several groups within a population.
Net promoter score is calculated like so:
[% of Population that give 9 or 10/10] - [% of Population that give 1 to 6/10]
Each individual record in my source data can only have a single Score of between 1 and 10:
RAW DATA:
Date (dd/mm) | Country | Type | Score (1-10) | NPS Category
01/05 | US | Order enq. | 9 | Promoter
13/05 | US | Check-out | 5 | Detractor
28/05 | US | Order enq. | 7 | Passive
So, with help from the answers below I've added a column to categorise each individual into the Promoter (9 or 10), Passive (7 or 8) and Detractor (1 to 6) groups based on that score: screenshot of raw data (with sensitive items hidden).
All that remains now is:
How can I create a calculated 'NPS' column, like the one shown in my (rudimentary) representation of a pivot table below, that subtracts the Detractor value from the Promoter value?
D = Detractor group
Pa = Passive group
Pr = Promoter group
| Order enquiry | Check-out |
| D Pa Pr NPS | D Pa Pr NPS |
-------------------------------------------------- |
GB | | |
May | 0 0 100 100 | 30 20 50 20 |
Jun | 10 30 60 50 | 35 35 60 25 |
Jul | 30 20 50 20 | 40 10 40 0 |
US | | |
May | 45 15 40 - 5 | 50 10 40 -10 |
Jun | 40 30 30 -10 | 40 30 30 -10 |
Jul | 5 35 60 55 | 20 40 40 20 |
My attempt at a calculated column can be seen in this screenshot. This results in an error and of course I haven't managed to convert the NPS counts into percentages yet.
I would suggest creating a new column in the source data that calculates D, Pa and Pr with a formula.
You can now create the % for these values in the pivot. The NPS column can either be calculated after pivoting the output field, or by creating a pivot-column formula in Excel.
It's not clear from your question how your data is laid out, or what exactly you're asking. From what I can see, you need to add a column in your raw data table, which says something like
=COUNTIFS(UniqueID,MyUniqueID,Score,">=9")-COUNTIFS(UniqueID,MyUniqueID,Score,"<=6")
Then another column that says
=IF(NetPromoter>=9,"Pr",IF(NetPromoter>=7,"Pa","D"))
And then in your pivot table you add the Classification as a new subcolumn, and add the NPS as the Average of your NPS column, or something like that.
Please show your data if you want the formulas changed to meet your actual range/variable terms.
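As a reference for what the pivot should end up showing, here is a minimal Python sketch of the NPS arithmetic from the question (% Promoters minus % Detractors). The function names are made up for illustration, and the thresholds are the ones stated in the question: 9 to 10 Promoter, 7 to 8 Passive, 1 to 6 Detractor.

```python
from collections import Counter

def category(score):
    """Classify a single 1-10 score using the thresholds from the question."""
    if score >= 9:
        return "Promoter"
    if score >= 7:
        return "Passive"
    return "Detractor"

def nps(scores):
    """Net Promoter Score for one group: % Promoters minus % Detractors."""
    if not scores:
        raise ValueError("need at least one score")
    counts = Counter(category(s) for s in scores)
    return 100 * (counts["Promoter"] - counts["Detractor"]) / len(scores)

# The three US rows from the raw data in the question: scores 9, 5 and 7
us_may = nps([9, 5, 7])
print(us_may)
```

Note that the percentages are taken over the whole group, Passives included, so the score has to be computed per group from counts rather than by averaging per-row values.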
I have a table in Excel set up as follows:
DATE | TIME | PERSON IDENTIFIER | ARRIVAL OR LEAVING
01/01/15 | 13:00 | AB1234 | A
01/01/15 | 13:01 | AC1234 | A
01/01/15 | 13:03 | AD1234 | A
01/01/15 | 13:05 | AE1234 | A
01/01/15 | 13:09 | AF1234 | A
01/01/15 | 13:10 | AB1234 | L
01/01/15 | 13:15 | AG1234 | A
01/01/15 | 13:13 | AC1234 | L
The table shows when people arrive at and leave a medical ward. The ward holds 36 patients and I want to get an idea of how close it is to capacity (it's normally full). The ward is open 24/7 and has patients arriving 24/7, but I'd like to show how long it spends at each occupancy level.
For example if we inputted 24 hours of data
36 patients (0 empty beds) - 22hr 15min
35 patients (1 empty bed) - 01hr 30min
34 patients (2 empty beds) - 00hr 15min
I'm thinking we just need a count for every time someone arrives and a negative count when they leave, but I can't figure out how to extract the time from that.
This is going to be pretty ugly (NB using your columns from above):
order the entries sequentially
keep a running tally of patients currently on the ward in column E: put the starting value in E1 (36, or whatever you have) and =IF(D2="A",E1+1,E1-1) in E2, copied down
get the time each tally lasted, i.e. until the next entry, with =B3-B2 in F2, copied down (use =(A3+B3)-(A2+B2) instead if the data spans midnight)
total the time spent at less than a full house with =SUMIF(E:E,"<36",F:F)
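The running-tally approach can be sketched in Python to show how the durations attach to each occupancy level. This is an illustration only: `time_at_each_occupancy` is a hypothetical name, the starting count has to be supplied by you, the sample events are invented, and time after the last event is not counted because there is no later entry to measure against.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def time_at_each_occupancy(events, starting_count):
    """events: (timestamp, 'A' or 'L') tuples; returns occupancy -> total timedelta.

    starting_count is the number of patients on the ward just before the
    first event (an assumption the caller must provide).
    """
    durations = defaultdict(timedelta)
    count = starting_count
    ordered = sorted(events)  # "order the entries sequentially"
    for (when, kind), (next_when, _) in zip(ordered, ordered[1:]):
        count += 1 if kind == "A" else -1
        # The new count holds from this event until the next one.
        durations[count] += next_when - when
    return dict(durations)

# Invented sample: 35 patients on the ward before the first event
events = [
    (datetime(2015, 1, 1, 13, 0), "A"),
    (datetime(2015, 1, 1, 13, 10), "L"),
    (datetime(2015, 1, 1, 13, 25), "A"),
    (datetime(2015, 1, 1, 13, 30), "L"),
]
result = time_at_each_occupancy(events, starting_count=35)
for occupancy, spent in sorted(result.items(), reverse=True):
    print(occupancy, spent)
```

Grouping the durations by tally gives the full breakdown asked for in the question (time at 36, at 35, at 34, and so on), rather than only the total below capacity.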