Time varies between Postgres server and Excel

I am running a query that groups the data by month.
test_db=# select date_trunc('month', install_ts) AS month, count(id) AS count from api_booking group by month order by month asc;
month | count
------------------------+-------
2016-08-01 00:00:00+00 | 297
2016-09-01 00:00:00+00 | 2409
2016-10-01 00:00:00+00 | 2429
2016-11-01 00:00:00+00 | 3512
(4 rows)
This is the output in my Postgres shell.
However, when I run the same query from Excel, this is the output:
month | count
------------------------+-------
2016-07-31 17:00:00+00 | 297
2016-08-31 17:00:00+00 | 2409
2016-09-30 17:00:00+00 | 2429
2016-10-31 17:00:00+00 | 3512
(4 rows)
I think the problem is that Excel is interpreting the timestamps in a different time zone.
How can I tell Excel to read them correctly?
Or is there any other solution to this problem?

Try...
select date(date_trunc('month', install_ts)) AS month, count(id) AS count from api_booking group by month order by month asc;
date() strips the time portion from a timestamp, so only the date is returned.
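The 17:00+00 values in the Excel output are consistent with that connection running in a session time zone seven hours ahead of UTC, since date_trunc on a timestamp with time zone cuts at midnight in the session time zone. The Python sketch below only illustrates that arithmetic. The UTC+7 offset is an assumption inferred from the output, not something stated in the question.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the Excel connection uses a session time zone of UTC+7.
session_tz = timezone(timedelta(hours=7))

# Midnight on 1 August 2016 in the assumed session time zone.
local_month_start = datetime(2016, 8, 1, tzinfo=session_tz)

# The same instant expressed in UTC.
utc_view = local_month_start.astimezone(timezone.utc)

print(utc_view.isoformat())      # 2016-07-31T17:00:00+00:00
print(local_month_start.date())  # 2016-08-01
```

If that assumption holds, taking the date in the session time zone gives back the first of the month, which is why casting to a date hides the shift.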

Related

subtract second datetime row from first datetime row of a column if another column shows duplicate values

I have a dataframe with two columns, Order date and Customer. The dataframe is sorted by Customer, and a duplicated customer appears exactly twice. For each duplicated customer, I want to subtract the Order date of the second occurrence from the Order date of the first occurrence. Order date is in datetime format.
Context: I'm trying to calculate the time it takes for a customer to make a second order.
Here is a sample of the table:
Order date Customer
4260 2022-11-11 16:29:00 (App admin)
8096 2022-10-22 12:54:00 (App admin)
996 2021-09-22 20:30:00 10013
946 2021-09-14 15:16:00 10013
3499 2022-04-20 12:17:00 100151
... ... ...
2856 2022-03-21 13:49:00 99491
2788 2022-03-18 12:15:00 99523
2558 2022-03-08 12:07:00 99523
2580 2022-03-04 16:03:00 99762
2544 2022-03-02 15:40:00 99762
I have tried deleting by index, but it returns just the first two values.
The expected output is another dataframe with just the Customer name and the difference, in minutes, between the second and first Order dates of each duplicated customer.
Expected output:
| Customer | difference in minutes |
| -------- | -------- |
| 1232 | 445.0 |
| (App Admin) | 3432.0 |
| 1145 | 2455.0 |
| 6653 | 32.0 |
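You can use groupby:
import pandas as pd

df['Order date'] = pd.to_datetime(df['Order date'])
# Per customer: first row's date minus last row's date, in minutes.
# Customers with a single order give 0 and are dropped by the query.
out = (df.groupby('Customer', as_index=False)['Order date']
         .agg(lambda x: (x.iloc[0] - x.iloc[-1]).total_seconds() / 60)
         .query('`Order date` != 0'))
print(out)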
# Output:
Customer Order date
0 (App admin) 29015.0
1 10013 11834.0
4 99523 14408.0
5 99762 2903.0

How to group by dates in Excel

I have two columns in my Excel sheet, Session_Start_time and Time_Taken. Session_Start_time holds a date and time, and Time_Taken holds the time taken to complete the session.
For example:
Session_Start_time | Time_Taken
01-AUG-2016 00:03:57 | 10
01-AUG-2016 00:07:19 | 15
01-AUG-2016 00:10:28 | 10
02-AUG-2016 00:13:26 | 20
02-AUG-2016 00:20:26 | 30
02-AUG-2016 00:25:26 | 20
03-AUG-2016 03:20:26 | 30
03-AUG-2016 04:13:26 | 40
03-AUG-2016 07:13:26 | 40
I need to group Session_Start_time by date and get the average Time_Taken for each day.
Session_Start_time | Time_Taken
01-AUG-2016 | 11.67
02-AUG-2016 | 23.33
03-AUG-2016 | 36.67
You could add a third column that pulls out just the date of Session_Start_time, with the formula below entered in C2 and dragged down to fill:
=MONTH(A2)&"/"&DAY(A2)&"/"&YEAR(A2)
From there, you could create a pivot table with your new column as your row labels and Time_Taken as your values, summarized by Average rather than Sum.
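For comparison, the same group-by-date average can be sketched outside Excel in plain Python. This is only an illustration using the sample rows from the question. The function name `average_by_day` is made up for the example, and parsing the month abbreviation with `%b` assumes an English (or C) locale.

```python
from collections import defaultdict
from datetime import datetime

# Sample rows from the question: (Session_Start_time, Time_Taken).
rows = [
    ("01-AUG-2016 00:03:57", 10),
    ("01-AUG-2016 00:07:19", 15),
    ("01-AUG-2016 00:10:28", 10),
    ("02-AUG-2016 00:13:26", 20),
    ("02-AUG-2016 00:20:26", 30),
    ("02-AUG-2016 00:25:26", 20),
    ("03-AUG-2016 03:20:26", 30),
    ("03-AUG-2016 04:13:26", 40),
    ("03-AUG-2016 07:13:26", 40),
]


def average_by_day(rows):
    """Return {date: mean Time_Taken} for (timestamp_text, value) rows."""
    buckets = defaultdict(list)
    for stamp, taken in rows:
        # Drop the time of day so that all sessions on one date share a key.
        day = datetime.strptime(stamp, "%d-%b-%Y %H:%M:%S").date()
        buckets[day].append(taken)
    return {day: sum(vals) / len(vals) for day, vals in sorted(buckets.items())}


averages = average_by_day(rows)
for day, avg in averages.items():
    print(day.isoformat(), round(avg, 2))
```

The helper column in the Excel answer plays the same role as the `.date()` call here: it removes the time so that the grouping key is the day alone.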

Excel Pivot Table: How do I count the number of working days for employees based on date-time values?

In my theoretical data set, I have a list which shows the date-time of a sale, and the employee who completed the transaction.
I know how to do grouping in order to show how many sales each employee has per day, but I'm wondering if there's a way to count how many grouped days have more than 0 sales.
For example, here's the original data set:
Employee | Order Time
A | 8/12 8:00
B | 8/12 9:00
A | 8/12 10:00
A | 8/12 14:00
B | 8/13 10:00
B | 8/13 11:00
A | 8/13 15:00
A | 8/14 12:00
Here's the pivot table that I have created:
Employee | 8/12 | 8/13 | 8/14
A | 3 | 1 | 1
B | 1 | 2 | 0
And here's what I want to know:
Employee | Working Days
A | 3
B | 2
Split your Order Time column (assumed to be B) into two, say with Text to Columns and Space as the delimiter (might need a little adjustment). Then pivot (using the Data Model) as shown:
and sum the results (outside the PT) such as with:
=SUM(F3:H3)
copied down to suit.
Columns F:G may then be hidden.
I fully support @Andrea's comment (a correction) on the above:
I think this could have been made simpler. Remove "Time" from the values of the pivot table, then move "Order" from columns to values and use Distinct Count, as in the example. That should give the count per employee directly, making the sum unnecessary. If you scale this up, say to 50 dates, the =SUM() would need to be moved each time.
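The distinct-count idea can be checked against the sample data with a short Python sketch. It is an illustration only, not part of either answer. The function name `working_days` is hypothetical, and it assumes the date and time in Order Time are separated by a single space, as in the question.

```python
from collections import defaultdict

# Sample rows from the question: (Employee, Order Time).
orders = [
    ("A", "8/12 8:00"),
    ("B", "8/12 9:00"),
    ("A", "8/12 10:00"),
    ("A", "8/12 14:00"),
    ("B", "8/13 10:00"),
    ("B", "8/13 11:00"),
    ("A", "8/13 15:00"),
    ("A", "8/14 12:00"),
]


def working_days(orders):
    """Count the distinct dates on which each employee has at least one sale."""
    days = defaultdict(set)
    for employee, order_time in orders:
        # Keep only the date part, mirroring the Text to Columns split.
        date_part = order_time.split(" ")[0]
        days[employee].add(date_part)
    return {employee: len(dates) for employee, dates in days.items()}


counts = working_days(orders)
print(counts)  # {'A': 3, 'B': 2}
```

Using a set per employee is the equivalent of Distinct Count on the date column: repeated sales on one day collapse into a single entry.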

DAX Ranking events year over year

I have a table of data that has a format similar to the following:
EventID | Event Date
--------------------
1 | 1/1/2014
2 | 2/8/2014
3 | 10/1/2014
4 | 2/5/2014
5 | 4/1/2014
6 | 9/1/2014
What I am trying to do is create a DAX formula to rank each event in the order that it happened for the year. So I want to end up with something like this. This way I can compare the events year over year as the events don't happen on any regular time schedule.
Event Date | Year | Rank
------------------------
1/1/2014 | 2014 | 1
2/8/2014 | 2014 | 2
10/1/2014 | 2014 | 3
2/5/2015 | 2015 | 1
4/1/2015 | 2015 | 2
9/1/2015 | 2015 | 3
I have tried to do this by creating a formula that will give me the day number of the year:
Day of Year =(YEARFRAC(CONCATENATE("Jan 1 ", YEAR([Event Date])),[Event Date])*360)+1
I then use RANKX on this table, but I can't seem to get the proper result. Perhaps I am misunderstanding how RANKX works, or not going about this the right way.
=RANKX(FILTER(Event,EARLIER(Event[Event Year])=Event[Event Year]),Event[Day of Year])
or
=RANKX(All(Event[Event Year]),[Day of Year],,1,Dense)
Any ideas would be much appreciated!
Thanks for any help in advance!
Create the following measures:
[Year]:=YEAR(LASTDATE(Event[Event Date]))
and
[Rank]:=RANKX(FILTER(ALL(Event),[Year]=YEAR(MAX(Event[Event Date]))),FIRSTDATE(Event[Event Date]),,1,DENSE)
and this is the result that you get:
Note: My dates are in UK format and I suspect yours were in US format, so the rankings do not appear to tally with your example, but it does work!
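To make the intended ranking concrete, here is a small Python sketch of a dense rank within each year. It is not DAX and not the answerer's method, just an illustration of the expected result. It uses the dates from the question's expected-output table and assumes they are in US month/day/year format. The function name `rank_within_year` is made up for the example.

```python
from datetime import datetime

# Dates from the expected-output table in the question (assumed US format).
event_dates = ["1/1/2014", "2/8/2014", "10/1/2014", "2/5/2015", "4/1/2015", "9/1/2015"]


def rank_within_year(date_strings, fmt="%m/%d/%Y"):
    """Return (ISO date, year, rank) rows, ranking dates ascending per year."""
    dates = [datetime.strptime(s, fmt).date() for s in date_strings]

    # Collect the distinct dates of each year so equal dates share a rank.
    by_year = {}
    for d in dates:
        by_year.setdefault(d.year, set()).add(d)

    # Dense rank: position of each distinct date in its year's sorted list.
    rank_of = {}
    for year_dates in by_year.values():
        for position, d in enumerate(sorted(year_dates), start=1):
            rank_of[d] = position

    return [(d.isoformat(), d.year, rank_of[d]) for d in dates]


ranked = rank_within_year(event_dates)
for row in ranked:
    print(row)
```

Switching `fmt` to `"%d/%m/%Y"` reads the same strings as UK dates, which changes the order within a year and is the kind of mismatch the note above describes.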

Excel to calculate capacity levels

I have a table in Excel set up as follows:
DATE | TIME | PERSON IDENTIFIER | ARRIVAL OR LEAVING
01/01/15 | 13:00 | AB1234 | A
01/01/15 | 13:01 | AC1234 | A
01/01/15 | 13:03 | AD1234 | A
01/01/15 | 13:05 | AE1234 | A
01/01/15 | 13:09 | AF1234 | A
01/01/15 | 13:10 | AB1234 | L
01/01/15 | 13:15 | AG1234 | A
01/01/15 | 13:13 | AC1234 | L
The table shows when people arrive at and leave a medical ward. The ward holds 36 patients, and I want to get an idea of how close it is to capacity (it's normally full). The ward is open 24/7 and has patients arriving 24/7, but I'd like to show how long it spends at each capacity level.
For example, if we entered 24 hours of data:
36 patients (0 empty beds) - 22hr 15min
35 patients (1 empty bed) - 01hr 30min
34 patients (2 empty beds) - 00hr 15min
I'm thinking we just need to add one to a count every time someone arrives and subtract one when they leave, but I can't figure out how to extract the time from that.
This is going to be pretty ugly (NB: using your columns from above):
Order the entries chronologically.
Keep a running tally of the patients currently on the ward in column E, with E1 = 36 (or whatever starting value you have) and =IF(D2="A",E1+1,E1-1) in E2, filled down.
Get the time each tally lasted, i.e. the time until the next entry, with =(B3-B2) in F2, filled down, so that each duration sits on the same row as its tally.
Sum the time during which you had less than a full house with =SUMIF(E:E,"<36",F:F), which tests the tally in column E and adds up the matching durations in column F.
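The steps above can be sketched in Python to show how the running tally and the gaps between entries combine into time spent at each occupancy level. This is a minimal illustration using the sample rows. The starting count of 30 is a made-up value (the question does not give one), the function name `time_at_each_level` is hypothetical, and the date format is assumed to be day/month/year.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Sample rows from the question: (DATE + TIME, ARRIVAL OR LEAVING).
events = [
    ("01/01/15 13:00", "A"),
    ("01/01/15 13:01", "A"),
    ("01/01/15 13:03", "A"),
    ("01/01/15 13:05", "A"),
    ("01/01/15 13:09", "A"),
    ("01/01/15 13:10", "L"),
    ("01/01/15 13:15", "A"),
    ("01/01/15 13:13", "L"),
]


def time_at_each_level(events, starting_count, fmt="%d/%m/%y %H:%M"):
    """Return {patient count: total time spent at that count}."""
    # Step 1: order the entries chronologically.
    parsed = sorted((datetime.strptime(stamp, fmt), kind) for stamp, kind in events)

    durations = defaultdict(timedelta)
    count = starting_count
    # Pair each entry with the next one; the gap between them is how long
    # the count produced by the current entry lasted.
    for (when, kind), (next_when, _) in zip(parsed, parsed[1:]):
        count += 1 if kind == "A" else -1
        durations[count] += next_when - when
    return dict(durations)


# Assumption: 30 patients were on the ward before the first entry.
levels = time_at_each_level(events, starting_count=30)
for patients, spent in sorted(levels.items()):
    print(patients, spent)
```

The last entry contributes no duration because the sample has no later row to measure against. With a full day of data, the final gap would run to the end of the reporting window.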

Resources