Python x days y hrs z mins to date - python-3.x

I have a pandas dataframe with the following values:
tx['Age']
0 7 mins ago
1 8 mins ago
2 12 mins ago
3 14 mins ago
4 15 mins ago
5 9 hrs 21 mins ago
6 11 hrs 13 mins ago
7 11 hrs 13 mins ago
8 11 hrs 14 mins ago
9 11 hrs 15 mins ago
10 1 day 3 hrs ago
11 1 day 3 hrs ago
12 1 day 3 hrs ago
13 1 day 3 hrs ago
14 1 day 3 hrs ago
15 1 day 3 hrs ago
16 1 day 3 hrs ago
17 1 day 3 hrs ago
18 1 day 3 hrs ago
19 1 day 3 hrs ago
20 1 day 4 hrs ago
21 2 days 12 hrs ago
22 2 days 14 hrs ago
23 2 days 14 hrs ago
24 2 days 22 hrs ago
Name: Age, dtype: object
And I would like to transform this column into dates.
Any idea folks?
Thanks!

There are two parts in your problem. First, extract the days, hours, and minutes and replace the missing parts with zeros:
pattern = r"(?:(\d+) days?)? ?(?:(\d+) hrs?)? ?(?:(\d+) mins?)?"
parts = tx['Age'].str.extract(pattern).fillna(0).astype(int)
Second, convert the days, hours, and minutes to minutes and feed the minutes to the TimedeltaIndex constructor:
minutes = ((parts[0] * 24 + parts[1]) * 60 + parts[2]).astype(str)
-pd.TimedeltaIndex("00:" + minutes + ":00")
Note the "minus" sign: it means "ago." The result is a TimedeltaIndex of negative offsets, not a date. To make it a date, you must add it to some reference date, such as the time at which the ages were recorded.
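As a minimal sketch of that last step, assuming the ages are relative to a known reference time: the `reference` timestamp and the four sample rows below are made-up values. This variant builds the offsets with pd.to_timedelta instead of the string-based TimedeltaIndex constructor, and converts the extracted text with pd.to_numeric before filling the gaps; both are substitutions, not the code from the answer above.

```python
import pandas as pd

# Small sample in the same format as the question's column.
tx = pd.DataFrame({"Age": ["7 mins ago", "9 hrs 21 mins ago",
                           "1 day 3 hrs ago", "2 days 12 hrs ago"]})

pattern = r"(?:(\d+) days?)? ?(?:(\d+) hrs?)? ?(?:(\d+) mins?)?"
parts = tx["Age"].str.extract(pattern)
# Missing components become NaN; turn everything into integers, with 0 for gaps.
parts = parts.apply(pd.to_numeric).fillna(0).astype(int)
parts.columns = ["days", "hours", "minutes"]

# Total age in minutes, then as a timedelta.
total_minutes = (parts["days"] * 24 + parts["hours"]) * 60 + parts["minutes"]
ago = pd.to_timedelta(total_minutes, unit="min")

# Hypothetical reference time: the moment the ages were recorded.
reference = pd.Timestamp("2020-01-03 12:00")
tx["Date"] = reference - ago
print(tx)
```

Subtracting the offset from the reference is equivalent to adding the negated TimedeltaIndex shown above.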

Related

resampling a pandas dataframe and filling new rows with zero

I have a time series as a dataframe. The first column is the week number, the second are values for that week. The first week (22) and the last week (48), are the lower and upper bounds of the time series. Some weeks are missing, for example, there is no week 27 and 28. I would like to resample this series such that there are no missing weeks. Where a week was inserted, I would like the corresponding value to be zero. This is my data:
week value
0 22 1
1 23 2
2 24 2
3 25 3
4 26 2
5 29 3
6 30 3
7 31 3
8 32 7
9 33 4
10 34 5
11 35 4
12 36 2
13 37 3
14 38 10
15 39 5
16 40 7
17 41 10
18 42 11
19 43 15
20 44 9
21 45 13
22 46 5
23 47 6
24 48 2
I am wondering if this can be achieved in Pandas without creating a loop from scratch. I have looked into pd.resample, but can't achieve the results I am looking for.
I would set week as index, reindex with fill_value option:
start, end = df['week'].agg(['min','max'])
df.set_index('week').reindex(np.arange(start, end+1), fill_value=0).reset_index()
Output (head):
week value
0 22 1
1 23 2
2 24 2
3 25 3
4 26 2
5 27 0
6 28 0
7 29 3
8 30 3
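For reference, a self-contained sketch of the same reindex approach on a shortened, made-up version of the data. The only change is that the new index is given the name "week" explicitly, so the column name after reset_index does not depend on pandas carrying it over.

```python
import numpy as np
import pandas as pd

# Shortened sample: weeks 24, 25, 27 and 28 are missing.
df = pd.DataFrame({"week": [22, 23, 26, 29], "value": [1, 2, 2, 3]})

start, end = df["week"].agg(["min", "max"])
all_weeks = pd.Index(np.arange(start, end + 1), name="week")

# Rows for weeks that were absent get value 0.
full = df.set_index("week").reindex(all_weeks, fill_value=0).reset_index()
print(full)
```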

How do I round up months and days to the nearest month or half month

I have a series of time frames in Excel that I wish to round up or down to either the nearest half month or the nearest month.
I have tried to use this formula to find the difference between 2 dates in months and days.
=DATEDIF(G3, H3, "ym") &" months, " &DATEDIF(G3, H3, "md") &" days"
Days
2 months, 30 days
2 months, 29 days
1 months, 0 days
3 months, 28 days
1 months, 0 days
3 months, 0 days
3 months, 0 days
6 months, 5 days
4 months, 17 days
5 months, 24 days
6 months, 5 days
5 months, 24 days
5 months, 24 days
5 months, 24 days
2 months, 29 days
5 months, 24 days
3 months, 0 days
4 months, 17 days
4 months, 17 days
3 months, 0 days
4 months, 1 days
0 months, 29 days
1 months, 0 days
0 months, 28 days
0 months, 29 days
3 months, 0 days
0 months, 1 days
For 1 month 14 days round up to 1.5 month
Nearest month or half month
Days
3 months
2 months
1 months
4 months
1 months
3 months
3 months
6 months
4.5 months
6 months
If your text were in A1 then
=TRIM(LEFT(A1,2))+IF(VALUE(TRIM(MID(A1,11,2)))=14,0.5,IF(VALUE(TRIM(MID(A1,11,2)))>14,1,0))&" Months"
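The rounding rule can also be written out in Python to check it against the expected output in the question. This is a minimal sketch under a stated assumption: a month is treated as 30 days, so each half month is 15 days and the day count is rounded to the nearest multiple of 15. The function name round_to_half_month is hypothetical, and this rule is not the same as the formula above, which yields a half only for exactly 14 days.

```python
import math
import re

def round_to_half_month(text):
    """Round a string such as '4 months, 17 days' to the nearest half month.

    Assumes a 30-day month, i.e. 15 days per half month.
    """
    match = re.match(r"\s*(\d+) months?, (\d+) days?", text)
    months, days = int(match.group(1)), int(match.group(2))
    # Number of half months (0, 1 or 2) closest to the day count.
    halves = math.floor(days / 15 + 0.5)
    return months + halves / 2

for sample in ["2 months, 30 days", "4 months, 17 days", "6 months, 5 days"]:
    print(sample, "->", round_to_half_month(sample), "months")
```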

Replace values from one column in dataframe

import pandas as pd
import numpy as np
import ast
pd.options.display.max_columns = 20
I have dataframe column season that looks like this (first 20 entries):
season
0 2006-07
1 2007-08
2 2008-09
3 2009-10
4 2010-11
5 2011-12
6 2012-13
7 2013-14
8 2014-15
9 2015-16
10 2016-17
11 2017-18
12 2018-19
13 Career
14 season
15 2018-19
16 Career
17 season
18 2017-18
19 2018-19
It starts with season and ends with Career. I want to replace years with numbers starting with 1 and ending when there's career. I want to be like this:
season
0 1
1 2
2 3
3 4
4 5
5 6
6 7
7 8
8 9
9 10
10 11
11 12
12 13
13 Career
14 season
15 1
16 Career
17 season
18 1
19 2
So counting should reset every time there's season in column and end every time there's career.
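Build a boolean mask with Series.isin that marks the 'Career' and 'season' rows, compare the mask with its shifted values to label each run of consecutive rows as one group, and use GroupBy.cumcount as the counter within each group: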
s = df['season'].isin(['Career', 'season'])
df['new'] = np.where(s, df['season'], df.groupby(s.ne(s.shift()).cumsum()).cumcount() + 1)
print (df)
season new
0 2006-07 1
1 2007-08 2
2 2008-09 3
3 2009-10 4
4 2010-11 5
5 2011-12 6
6 2012-13 7
7 2013-14 8
8 2014-15 9
9 2015-16 10
10 2016-17 11
11 2017-18 12
12 2018-19 13
13 Career Career
14 season season
15 2018-19 1
16 Career Career
17 season season
18 2017-18 1
19 2018-19 2
To replace the values in the season column itself:
s = df['season'].isin(['Career', 'season'])
df.loc[~s, 'season'] = df.groupby(s.ne(s.shift()).cumsum()).cumcount() + 1
print (df)
season
0 1
1 2
2 3
3 4
4 5
5 6
6 7
7 8
8 9
9 10
10 11
11 12
12 13
13 Career
14 season
15 1
16 Career
17 season
18 1
19 2
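To see why the grouping key works, it can help to look at the intermediate Series on a shortened, made-up version of the column. The names is_marker, group_id and counter are illustrative only; the logic is the same as in the answer above.

```python
import pandas as pd

df = pd.DataFrame({"season": ["2017-18", "2018-19", "Career",
                              "season", "2018-19", "Career"]})

# True for the marker rows that delimit each block of years.
is_marker = df["season"].isin(["Career", "season"])

# The id increases every time the mask flips, so each run of
# consecutive year rows (or marker rows) shares one id.
group_id = is_marker.ne(is_marker.shift()).cumsum()

# Position within each run, starting at 1.
counter = df.groupby(group_id).cumcount() + 1

# Keep the markers, replace the years by their counter.
result = df["season"].astype(object).where(is_marker, counter)
print(pd.DataFrame({"season": df["season"], "group_id": group_id, "new": result}))
```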

How to schedule a Cron job to run 4th week of the year

I work on an application that uses the native Unix crontab for scheduling jobs. The parameters are described as follows:
Minute, Hour, Day_of_Week(1-7, 1=Sun), Day_of_Month(1-31), Day_of_Year(1-365), Week (1-52), Month (1-12)
I want to run a job on the Monday of the 1st week of the year at 8 pm, but I don't know how to determine when the week starts. Is 31 Dec 2017 to 6 Jan 2018 the first week, or is 7 Jan to 13 Jan 2018 the first week?
Having cron jobs run on particular week numbers is not easy,
as everything depends on the definition of week numbers you are
used to.
European (ISO 8601)
The ISO 8601 standard is widely used around the world: in the EU and most other
European countries, most of Asia, and Oceania.
The ISO 8601 standard states the following:
There are 7 days in a week
The first day of the week is a Monday
The first week is the first week of the year which contains a
Thursday. This means it is the first week with 4 days or more
in January.
With this definition, the first days of January can belong to week 53 of the previous year. This occurs when the first of January falls on a
Friday (e.g. 2016-01-01, 2010-01-01) or, if the year before was a
leap year, also on a Saturday (e.g. 2005-01-01).
   December 2015                January 2016
Mo Tu We Th Fr Sa Su CW     Mo Tu We Th Fr Sa Su CW
    1  2  3  4  5  6 49                  1  2  3 53
 7  8  9 10 11 12 13 50      4  5  6  7  8  9 10 01
14 15 16 17 18 19 20 51     11 12 13 14 15 16 17 02
21 22 23 24 25 26 27 52     18 19 20 21 22 23 24 03
28 29 30 31          53     25 26 27 28 29 30 31 04
American or Islamic (Not ISO 8601)
Not all countries use the ISO 8601 system. They use a more absolute approach.
The American system is used in Canada, United States, New Zealand, India, Japan,...
The Islamic system is generally used in the middle east.
Both systems are very similar.
American:
There are 7 days in a week
The first day of the week is a Sunday
The first week starts on the 1st of January
Islamic:
There are 7 days in a week
The first day of the week is a Saturday
The first week starts on the 1st of January
With these definitions, it is possible to have partial weeks at the
beginning and the end of a year. Hence the first and last week of the
year might not contain all weekdays.
American:
   December 2015                January 2016
Su Mo Tu We Th Fr Sa CW     Su Mo Tu We Th Fr Sa CW
       1  2  3  4  5 49                     1  2 01
 6  7  8  9 10 11 12 50      3  4  5  6  7  8  9 02
13 14 15 16 17 18 19 51     10 11 12 13 14 15 16 03
20 21 22 23 24 25 26 52     17 18 19 20 21 22 23 04
27 28 29 30 31       53     24 25 26 27 28 29 30 05
                            31                   06
Islamic:
   December 2015                January 2016
Sa Su Mo Tu We Th Fr CW     Sa Su Mo Tu We Th Fr CW
          1  2  3  4 49                        1 01
 5  6  7  8  9 10 11 50      2  3  4  5  6  7  8 02
12 13 14 15 16 17 18 51      9 10 11 12 13 14 15 03
19 20 21 22 23 24 25 52     16 17 18 19 20 21 22 04
26 27 28 29 30 31    53     23 24 25 26 27 28 29 05
                            30 31                06
Note: this could be particularly cumbersome for the task you try to
perform. Especially if it has to occur on the Monday of the first
week. This Monday might not exist.
Importing this in the cron
Adding these systems to the cron cannot be done in a direct way. The
week testing should be done by means of a conditional test of the form
weektestcmd weeknr && cmd
For a cronjob to be run only on the Monday of the 4th week of the year at 20:00 system time (as the OP requested), the crontab would look then as:
# Example of job definition:
# .---------------- minute (0 - 59)
# |  .------------- hour (0 - 23)
# |  |  .---------- day of month (1 - 31)
# |  |  |  .------- month (1 - 12) OR jan,feb,mar,apr ...
# |  |  |  |  .---- day of week (0 - 6) (Sunday=0 or 7)
# |  |  |  |  |
# *  *  *  *  *   command to be executed
  0 20  *  *  1   weektestcmd 4 && cmd
With weektestcmd defined as
ISO 8601 week numbers:
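#!/usr/bin/env bash
# Succeed only if the current ISO 8601 week number equals $1.
# 10# forces base 10: a zero-padded week such as "08" would otherwise be read as octal.
[[ $(( 10#$(date '+%V') )) -eq $1 ]]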
American calendar week numbers:
#!/usr/bin/env bash
# obtain the day of year (10# forces base 10; "008" would otherwise be read as octal)
doy=$(( 10#$(date "+%j") ))
# compute the week offset of the first of January
## compute the day of the week with Mo=1 .. Su=7
offset=$(date -d "$(date "+%Y")-01-01" "+%u")
## Take the modulo for the offset as Su=0
offset=$(( offset%7 ))
# Compute the current week number
cw=$(( (doy + offset + 6)/7 ))
[[ $cw -eq $1 ]]
Islamic calendar week numbers:
#!/usr/bin/env bash
# obtain the day of year (10# forces base 10; "008" would otherwise be read as octal)
doy=$(( 10#$(date "+%j") ))
# compute the week offset of the first of January
## compute the day of the week with Mo=1 .. Su=7
offset=$(date -d "$(date "+%Y")-01-01" "+%u")
## Shift and take the modulo so that Sa=0
offset=$(( (offset + 1)%7 ))
# Compute the current week number
cw=$(( (doy + offset + 6)/7 ))
[[ $cw -eq $1 ]]
Note: Be aware that in the American and Islamic system it might be possible not to have a Monday in week 1.
Note: There are other methods of defining a week number. Nonetheless, the approach stays the same. Define a script which checks the week number and use it in the cron.
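The three week-number definitions can be cross-checked against the calendars above with a short Python sketch. The function names iso_week and absolute_week are hypothetical, and absolute_week simply mirrors the arithmetic of the shell scripts: the day of the year plus the weekday offset of 1 January, divided into blocks of seven.

```python
from datetime import date

def iso_week(d):
    """ISO 8601 week number (weeks start on Monday)."""
    return d.isocalendar()[1]

def absolute_week(d, first_weekday):
    """Week number when week 1 starts on 1 January.

    first_weekday uses ISO numbering (Mo=1 .. Su=7):
    7 for the American system, 6 for the Islamic system.
    """
    day_of_year = d.timetuple().tm_yday
    jan1 = date(d.year, 1, 1)
    # How many days into its week 1 January falls.
    offset = (jan1.isoweekday() - first_weekday) % 7
    return (day_of_year + offset + 6) // 7

d = date(2016, 1, 1)
print(iso_week(d), absolute_week(d, 7), absolute_week(d, 6))
```

A job that must run on the Monday of week 4 would then test something like d.isoweekday() == 1 and iso_week(d) == 4, with the week function swapped for the definition in use.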
You have to put a condition in your crontab to do that. Your cron entry will look
something like this:
0 20 1-7 1 * root [ "$(date +\%a)" = "Mon" ] && /run/some/script
The schedule 0 20 1-7 1 * runs at 8 pm every day from the 1st to the 7th of January.
The following test checks that the day is a Monday before executing your script. Inside a crontab the % character must be escaped as \%, because cron otherwise treats it as a newline, and the abbreviation "Mon" assumes an English locale:
[ "$(date +\%a)" = "Mon" ]
With this, the script will run on 7 January 2019, which is within the first seven days of the year.
$ cal 01 2019
    January 2019
Su Mo Tu We Th Fr Sa
       1  2  3  4  5
 6  7  8  9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30 31

Excel Pivot table - get maximum for a period of 24 hours

I have an excel with:
Days of the week and 24 hours for each day.
Each hour I get some points.
I would like to calculate the maximum cumulative points I can get within 24 hours.
[TEST.XLSX]
2 Columns:
Monday Points
0 34
1 32
2 4
3 54
4 12
5 55
6 4
7 4
8 555
9 787
10 8
11 76
12 78
13 8
14 656
15 7
16 4
17 45
18 54
19 543
20 56
21 65
22 4
23 3
Tuesday
0 56
1 7
2 333
3 9
4 876
5 3333
6 3333
7 76
8 3333
9 465
10 7
11 6
12 5
13 6
14 7
15 6
16 7
17 65
18 555555555
19 6
20 5
21 4
22 6
23 6
Wednesday
0 6
1 7
...
Thanks for your help!
Use real date time values in your hours column. Delete the rows with the day text. Instead, use a formula that increments from a starting date/time. For example: cell A2 contains the date and midnight time for Nov 17. Cell A3 and copied down contains the formula
=A2+TIME(1,0,0)
which increments by one hour.
Now you can build a pivot table. Group the date/time value by day and hour. Show the subtotal for the day and set its value field settings to Max.
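The same idea, real timestamps first and then a per-day maximum, can be sketched in pandas in case the data is processed outside Excel. The point values are made up, and the year 2020 for the Nov 17 start is an arbitrary assumption.

```python
import numpy as np
import pandas as pd

# One timestamp per hour, starting at midnight, like =A2+TIME(1,0,0) copied down.
start = pd.Timestamp("2020-11-17 00:00")
stamps = start + pd.to_timedelta(np.arange(48), unit="h")

# Made-up points for two days of 24 hours each.
points = list(range(24)) + list(range(100, 124))
s = pd.Series(points, index=stamps)

# Group by calendar day and take the maximum, as the pivot table's Max setting does.
daily_max = s.groupby(s.index.normalize()).max()
print(daily_max)
```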
