Convert string formatted date to epoch - python-3.x

How do I convert a date given as a string to epoch time? For example, the date is given in the format 2018, June 12th. How can I convert it to epoch in Python?

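You can use Python's datetime module for date and time conversions. strptime has no directive for ordinal suffixes, so the suffix is appended to the format string as literal text.
import datetime

fstring = "%Y, %B %d"
# One format string per ordinal suffix; the suffix is matched literally.
end = {suffix: fstring + suffix for suffix in ("st", "nd", "rd", "th")}
days = ["2018, June 12th", "2018, June 3rd"]
for day in days:
    fstr = end[day[-2:]]
    # timestamp() interprets a naive datetime in the local time zone.
    print(datetime.datetime.strptime(day, fstr).timestamp())
Output (the values depend on the local time zone; these were produced at UTC+05:30):
1528741800.0
1527964200.0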
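As an alternative to keeping one format string per suffix, the sketch below strips the ordinal suffix with a regular expression before parsing, and attaches UTC explicitly so the result does not depend on the machine's time zone. It assumes the input always looks like `2018, June 12th` and that midnight UTC is the intended instant; the helper name `to_epoch` is hypothetical.

```python
import re
from datetime import datetime, timezone


def to_epoch(text):
    """Parse strings such as '2018, June 12th' and return epoch seconds."""
    # Drop an ordinal suffix (st, nd, rd, th) that directly follows a digit.
    cleaned = re.sub(r"(?<=\d)(st|nd|rd|th)\b", "", text)
    parsed = datetime.strptime(cleaned, "%Y, %B %d")
    # Assumption: the date means midnight UTC rather than local midnight.
    return parsed.replace(tzinfo=timezone.utc).timestamp()


for day in ["2018, June 12th", "2018, June 3rd", "2018, June 1st"]:
    print(day, "->", to_epoch(day))
```

Because `%B` matches month names in the current locale, this sketch also assumes an English (or default C) locale.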

Related

python Datetime - extract year and month

How to extract month and year from a string in DateTime stamp format?
sales_close_7 = 2018-12-07
import datetime

sales_close_7 = '2018-12-07'
date_object = datetime.datetime.strptime(sales_close_7, '%Y-%m-%d').date()
print(date_object.year)   # Output: 2018
print(date_object.month)  # Output: 12
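Since the string here is already in ISO format (YYYY-MM-DD), a shorter variant is possible. This is only a sketch and assumes Python 3.7 or later, where `date.fromisoformat` is available.

```python
from datetime import date

sales_close_7 = "2018-12-07"
# fromisoformat accepts YYYY-MM-DD strings (Python 3.7+).
parsed = date.fromisoformat(sales_close_7)
year, month = parsed.year, parsed.month
print(year, month)
```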

Group by the dates to weeks

I have time in ms in epoch format, I need to translate this into a date and group it by a week number.
I tried the following procedure:
df.loc[0, 'seconds'] = df['seconds'].iloc[0]
for _, grp in df.groupby(pd.TimeGrouper(key='seconds', freq='7D')):
    print(grp)
df["week"].to_period(freq='w')
For example, if my 'seconds' column is presented like 1557499095332, then I want the 'dates' column to be 10-05-2019 20:08:15 and the 'Week' column to present W19 or 19.
How do I go about this?
Try using strftime method:
from datetime import datetime as dt
x = 1557499095332
dt.fromtimestamp(x/1000).strftime("%A, %B %d, %Y %I:%M:%S")
dt.fromtimestamp(x/1000).strftime("%W")
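The third line returns 'Friday, May 10, 2019 03:38:15'. The time of day depends on your local time zone, and %I uses a 12-hour clock (use %H for 24-hour time).
The fourth line returns '18', because %W counts weeks starting on Monday and treats the days before the first Monday of the year as week 0. For the ISO week number, which is 19 for this date, use isocalendar() instead.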
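To get the week label the question asks for (19 rather than 18) and to group values by week, one option is the ISO calendar. The sketch below uses only the standard library rather than pandas, and converts in UTC so the result is reproducible; the asker's expected 20:08:15 corresponds to a UTC+05:30 local time. The function name `group_by_iso_week` and the extra sample values are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime, timezone


def group_by_iso_week(ms_values):
    """Group epoch-millisecond values by (ISO year, ISO week)."""
    groups = defaultdict(list)
    for ms in ms_values:
        # Assumption: timestamps are interpreted in UTC.
        moment = datetime.fromtimestamp(ms / 1000, tz=timezone.utc)
        iso_year, iso_week, _ = moment.isocalendar()
        groups[(iso_year, iso_week)].append(moment.strftime("%d-%m-%Y %H:%M:%S"))
    return dict(groups)


# The first value is from the question; the others are one and seven days later.
samples = [1557499095332, 1557585495332, 1558103895332]
for (iso_year, iso_week), dates in sorted(group_by_iso_week(samples).items()):
    print("{}-W{:02d}".format(iso_year, iso_week), dates)
```

Keying on the ISO year as well as the week keeps weeks from different years apart, which matters around the turn of the year.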

How to extract or validate date format from a text using python?

I'm trying to execute this code:
import datefinder
string_with_dates = 'The stock has a 04/30/2009 great record of positive Sept 1st, 2005 earnings surprises, having beaten the trade Consensus EPS estimate in each of the last four quarters. In its last earnings report on May 8, 2018, Triple-S Management reported EPS of $0.6 vs.the trade Consensus of $0.24 while it beat the consensus revenue estimate by 4.93%.'
matches = datefinder.find_dates(string_with_dates)
for match in matches:
    print(match)
The output is:
2009-04-30 00:00:00
2005-09-01 00:00:00
2018-05-08 00:00:00
2019-02-04 00:00:00
The last date comes from the percentage value 4.93%. How can I avoid this?
I cannot fix the issue in the datefinder module itself, but since you need a solution, I put this together for you. It is a work in progress, so you can adjust it as needed. Some of the regular expressions could have been consolidated, but I kept them separate for readability. Hopefully this helps until you find a solution that works better for your needs.
import re

string_with_dates = 'The stock has a 04/30/2009 great record of positive Sept 1st, 2005 earnings surprises having beaten the trade Consensus EPS estimate in each of the last ' \
                    'four quarters In its last earnings report on March 8, 2018, Triple-S Management reported EPS of $0.6 vs.the trade Consensus of $0.24 while it beat the ' \
                    'consensus revenue estimate by 4.93%. The next trading day will occur at 2019-02-15T12:00:00-06:30'

def find_dates(text):
    '''
    Print the first date string that each known format matches in the provided text.

    Symbol references:
    YYYY = four-digit year
    MM = two-digit month (01=January, etc.)
    DD = two-digit day of month (01 through 31)
    hh = two digits of hour (00 through 23) (am/pm NOT allowed)
    mm = two digits of minute (00 through 59)
    ss = two digits of second (00 through 59)
    s = one or more digits representing a decimal fraction of a second
    TZD = time zone designator (Z or +hh:mm or -hh:mm)

    :param text: text to search
    :return: None; each match is printed
    '''
    # Raw strings keep the regex escapes (\d, \s, \W) intact.
    date_formats = [
        # Matches date format MM/DD/YYYY
        r'(\d{2}\/\d{2}\/\d{4})',
        # Matches date format MM-DD-YYYY
        r'(\d{2}-\d{2}-\d{4})',
        # Matches date format YYYY/MM/DD
        r'(\d{4}\/\d{1,2}\/\d{1,2})',
        # Matches ISO 8601 format (YYYY-MM-DD)
        r'(\d{4}-\d{1,2}-\d{1,2})',
        # Matches ISO 8601 format YYYYMMDD
        r'(\d{4}\d{2}\d{2})',
        # Matches full_month_name dd, YYYY or full_month_name dd[suffixes], YYYY
        r'(January|February|March|April|May|June|July|August|September|October|November|December)(\s\d{1,2}\W\s\d{4}|\s\d{1,2}(st|nd|rd|th)\W\s\d{4})',
        # Matches abbreviated_month_name dd, YYYY or abbreviated_month_name dd[suffixes], YYYY
        r'(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sept|Oct|Nov|Dec)(\s\d{1,2}\W\s\d{4}|\s\d{1,2}(st|nd|rd|th)\W\s\d{4})',
        # Matches ISO 8601 format with time and time zone
        # yyyy-mm-ddThh:mm:ss+|-hh:mm
        r'\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\+|-)\d{2}:\d{2}',
        # Matches ISO 8601 format Datetime with timezone
        # yyyymmddThhmmssZ
        r'\d{8}T\d{6}Z',
        # Matches ISO 8601 format Datetime with timezone
        # yyyymmddThhmmss+|-hhmm
        r'\d{8}T\d{6}(\+|-)\d{4}'
    ]
    for item in date_formats:
        # Word boundaries stop a pattern from matching inside a longer token.
        date_format = re.compile(r'\b{}\b'.format(item), re.IGNORECASE | re.MULTILINE)
        find_date = date_format.search(text)
        if find_date:
            print(find_date.group(0))

find_dates(string_with_dates)
# outputs
04/30/2009
March 8, 2018
Sept 1st, 2005
2019-02-15T12:00:00-06:30
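The function above extracts candidate strings but does not check that they are real dates. A possible follow-up step, sketched here for the MM/DD/YYYY format only, is to pass each candidate through `datetime.strptime` and discard anything it rejects. The names `CANDIDATE` and `valid_dates` and the sample sentence are made up for illustration.

```python
import re
from datetime import datetime

# Candidates that merely look like MM/DD/YYYY; validation happens afterwards.
CANDIDATE = re.compile(r"\b\d{2}/\d{2}/\d{4}\b")


def valid_dates(text, fmt="%m/%d/%Y"):
    """Return every MM/DD/YYYY candidate in text that is a real calendar date."""
    found = []
    for match in CANDIDATE.finditer(text):
        try:
            found.append(datetime.strptime(match.group(0), fmt).date())
        except ValueError:
            # For example month 13 or day 45: looks like a date but is not one.
            continue
    return found


sample = "Filed 04/30/2009, amended 13/45/2009, beat estimates by 4.93%."
print(valid_dates(sample))
```

Unlike `re.search`, `finditer` returns every occurrence, so repeated dates in the same format are not lost.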

Invalid Dates Python

I am new to Python. How can I write code that treats any date after a certain date as invalid input? For example, if the user enters anything after 12/02/2013, it should produce an error. Everything before that date should work normally.
As glibdud suggested, use datetime objects.
date = datetime.date(YYYY, MM, DD)
where (YYYY, MM, DD) are integers representing years, months, and days. The condition can then be checked in your script with
inputDate > maxDate
for example:
import datetime

maxDate = datetime.date(2013, 12, 2)
y = int(input('Enter year:'))
m = int(input('Enter numerical month (1-12):'))
d = int(input('Enter numerical day (1-31):'))
inputDate = datetime.date(y, m, d)
if inputDate > maxDate:
    print('Error - date after 02 December 2013')
else:
    print('Success!')
Gives:
Enter year:2018
Enter numerical month (1-12):1
Enter numerical day (1-31):1
Error - date after 02 December 2013
and
Enter year:2000
Enter numerical month (1-12):1
Enter numerical day (1-31):1
Success!
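If the date arrives as a single string rather than three separate numbers, the same comparison can be combined with `strptime`, which also rejects impossible dates such as 30 February. This is a sketch that assumes the MM/DD/YYYY reading of 12/02/2013 used in the answer above; the name `check_date` is hypothetical.

```python
from datetime import date, datetime

MAX_DATE = date(2013, 12, 2)


def check_date(text, max_date=MAX_DATE):
    """Return the parsed date; raise ValueError if malformed or after max_date."""
    # strptime raises ValueError itself for malformed or impossible dates.
    parsed = datetime.strptime(text, "%m/%d/%Y").date()
    if parsed > max_date:
        raise ValueError("date after {}".format(max_date.isoformat()))
    return parsed


for text in ["01/01/2000", "01/01/2018", "02/30/2010"]:
    try:
        print(text, "->", check_date(text))
    except ValueError as err:
        print(text, "-> error:", err)
```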

Getting Millisecond in Hive timestamp with offset Timezone

I want to convert timestamps in several formats to milliseconds in Hive.
I can currently convert a string to the correct timestamp using the code below, but I want to produce the timestamp data type from a string in the format YYYYMMDD-HH:MM:SS[.sss][Z | [ + | - hh[:mm]]], where:
YYYY = 0000 to 9999
MM = 01-12
DD = 01-31
HH = 00-23 hours
MM = 00-59 minutes
SS = 00-59 seconds
sss = milliseconds
hh = 01-12 offset hours
mm = 00-59 offset minutes
Example: 20060901-02:39-05 is five hours behind UTC, thus Eastern Time on 1st of September 2006, and the timestamp will be in the yyyy-MM-dd HH:mm:ss.SSS format.
What I have for UTC timestamp of YYYYMMDD-HH:MM:SS.sss is as follows:
cast(concat(concat_ws('-',substr(tag[52],1,4), substr(tag[52],5,2), substr(tag[52],7,2)),
space(1),
concat_ws(':',substr(tag[52],10,2), substr(tag[52],13,2), substr(tag[52],16,2)),
'.', substr(tag[52],19,3)) AS TIMESTAMP)
This takes a tag and does string manipulation of values of the tag to put into Timestamp datatype resulting in yyyy-MM-dd HH:MM:SS.sss...
I would like something similar to this that puts into Timestamp with offset in Hive.
Is this even possible?
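The offset arithmetic the question describes can be illustrated outside Hive. The following Python sketch is not a Hive solution and makes no claim about Hive's built-in functions; it only shows one way to turn the format above into epoch milliseconds. It assumes that seconds are optional (as in the example 20060901-02:39-05) and that a missing designator means UTC. The names `PATTERN` and `to_epoch_millis` are hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

PATTERN = re.compile(
    r"^(\d{4})(\d{2})(\d{2})-(\d{2}):(\d{2})"  # YYYYMMDD-HH:MM
    r"(?::(\d{2})(?:\.(\d{1,3}))?)?"           # optional :SS and .sss
    r"(Z|[+-]\d{2}(?::?\d{2})?)?$"             # optional Z or +/-hh[:mm]
)


def to_epoch_millis(text):
    """Convert YYYYMMDD-HH:MM[:SS[.sss]][Z|+hh[:mm]|-hh[:mm]] to epoch milliseconds."""
    match = PATTERN.match(text)
    if match is None:
        raise ValueError("unrecognized timestamp: {!r}".format(text))
    year, month, day, hour, minute, second, frac, tzd = match.groups()
    # ".5" means 500 ms, so pad the fraction on the right.
    millis = int(frac.ljust(3, "0")) if frac else 0
    offset = timedelta(0)  # assumption: no designator means UTC
    if tzd and tzd != "Z":
        sign = -1 if tzd[0] == "-" else 1
        digits = tzd[1:].replace(":", "")
        offset = sign * timedelta(hours=int(digits[:2]), minutes=int(digits[2:] or 0))
    moment = datetime(int(year), int(month), int(day), int(hour), int(minute),
                      int(second or 0), millis * 1000, tzinfo=timezone(offset))
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating-point rounding.
    return (moment - epoch) // timedelta(milliseconds=1)


print(to_epoch_millis("20060901-02:39-05"))
```

The same steps (split the designator off, parse the local part, then subtract the offset) are what an equivalent Hive expression would need to reproduce.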
