How to differentiate between a dateutil parsed date and a datetime with 0 for all time values - python-3.x

As far as I can tell, there is no way to distinguish between these two date strings ('2020-10-07', '2020-10-07T00:00:00') once they have been parsed by dateutil. I would really like to be able to tell a standalone date apart from a date with a time of zero.
import dateutil.parser
import datetime
date_str = '2020-10-07'
time_str = '2020-10-07T00:00:00'
s = dateutil.parser.parse(date_str)
e = dateutil.parser.parse(time_str)
The ultimate goal is to set the time to the beginning of the day (for the start) or the end of the day (for the end) when the input is a standalone date, but to leave the value alone when a time is included. I get close with something like this, but it still can't tell the two cases apart. If you know of a good solution, that would be really helpful.
if s == e and s.time() == datetime.time.min:
    e = datetime.datetime.combine(e, datetime.time.max)
This post is somewhat useful, but it's outdated and I'm not sure it would work for my use case: Finding if a python datetime has no time information

Here's a function which uses a simple try/except to test if the input can be parsed to a date (i.e. has no time information) or a datetime object (i.e. has time information). If the input format is different from ISO format, you could also implement specific strptime directives.
from datetime import date, time, datetime
def hasTime(s):
    """
    Parameters
    ----------
    s : string
        ISO 8601 formatted date / datetime string.
    Returns
    -------
    tuple, (bool, datetime.datetime).
        boolean will be True if input specifies a time, otherwise False.
    """
    try:
        # date.fromisoformat() only accepts a date without a time part
        return False, datetime.combine(date.fromisoformat(s), time.min)
    except ValueError:
        return True, datetime.fromisoformat(s)
        # do nothing else here; will raise an error if input can't be parsed
for t in ('2020-10-07', '2020-10-07T00:00:00', 'not-a-date'):
    print(t, hasTime(t))
# output:
# >>> 2020-10-07 (False, datetime.datetime(2020, 10, 7, 0, 0))
# >>> 2020-10-07T00:00:00 (True, datetime.datetime(2020, 10, 7, 0, 0))
# >>> ValueError: Invalid isoformat string: 'not-a-date'
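Building on the try/except idea above, the asker's end goal (push a date-only value to the end of the day, leave anything with a time untouched) could be sketched roughly as follows. The name parse_range_end is made up for illustration, and the sketch assumes ISO 8601 input as in the question.

```python
from datetime import date, datetime, time


def parse_range_end(s):
    """Parse an ISO 8601 string; date-only input is moved to the end of that day."""
    try:
        # Succeeds only when the string carries no time part.
        d = date.fromisoformat(s)
    except ValueError:
        # A time was given (or the string is invalid, which raises again here).
        return datetime.fromisoformat(s)
    return datetime.combine(d, time.max)


print(parse_range_end('2020-10-07'))           # 2020-10-07 23:59:59.999999
print(parse_range_end('2020-10-07T00:00:00'))  # 2020-10-07 00:00:00
```

An explicit midnight is preserved because the decision is made on the string, before any time information is lost in the parsed datetime.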

Related

Python 3 Convert ISO 8601 to milliseconds

I'm receiving an ISO 8601 format from an API GET request ("2020-02-25T00:02:43.000Z"). I'm trying to convert it to milliseconds, because that format is required in the payload of the API POST call. I've been successful running the code from a Linux system, but I get ValueError: Invalid format string from Windows.
From Linux:
import dateutil.parser
time = "2020-02-25T00:02:43.000Z"
parsed_time = dateutil.parser.parse(time)
t_in_millisec = parsed_time.strftime('%s%f')
t_in_millisec[:-3]
returns
'1582588963000'
From Windows:
import dateutil.parser
1 time = "2020-02-25T00:02:43.000Z"
2 parsed_time = dateutil.parser.parse(time)
----> 3 t_in_millisec = parsed_time.strftime('%s%f')
ValueError: Invalid format string
Is there a way around this?
Here is the list of format codes that work on Windows; %s is indeed not among them.
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/strftime-wcsftime-strftime-l-wcsftime-l?redirectedfrom=MSDN&view=vs-2019
I always use datetime. If you are able to use it, here is an example:
datetime.datetime(2020,2,25,0,2,43).timestamp()
or
import datetime
time = "2020-02-25T00:02:43.000Z"
date = datetime.datetime.strptime(time, '%Y-%m-%dT%H:%M:%S.%fZ')
timestamp = str((date - datetime.datetime(1970, 1, 1)).total_seconds()*1000)
print(timestamp[:-2])
The reason this doesn't work on Windows is that strftime passes the format string to the platform's C library, and %s (seconds since midnight on Jan 1, 1970) is a non-standard extension that the Windows C runtime does not implement.
If you want the number of seconds since Jan 1, 1970, you can subtract the epoch from the parsed date and take the total seconds from the resulting timedelta. Python makes this easier by providing a timestamp() method that does the computation for you (and includes subseconds as a decimal component).
import dateutil.parser
time = "2020-02-25T00:02:43.000Z"
parsed_time = dateutil.parser.parse(time)
timestamp = parsed_time.timestamp() * 1000
print(str(int(timestamp)))
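For readers who would rather not depend on dateutil, a standard-library sketch of the same conversion might look like this. The helper name iso_to_epoch_ms is hypothetical, and treating an input without an offset as UTC is an assumption of the sketch, not something the question specifies.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def iso_to_epoch_ms(text):
    """Convert an ISO 8601 string to integer milliseconds since the Unix epoch."""
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so rewrite it as an explicit UTC offset first.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    dt = datetime.fromisoformat(text)
    if dt.tzinfo is None:
        # Assumption: a string without an offset is meant as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids float rounding.
    return (dt - EPOCH) // timedelta(milliseconds=1)


print(iso_to_epoch_ms("2020-02-25T00:02:43.000Z"))  # 1582588963000
```

Because no strftime format code is involved, this behaves the same on Linux and Windows.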

Check Date Format from a Python Tkinter entry widget

Using a Tkinter input box, I ask a user for a date in the format YYYYMMDD.
I would like to check whether the date has been entered in the correct format, and otherwise show an error box. The following function checks for an integer, but I need some help with the next step, i.e. the date format.
def retrieve_inputBoxes():
    startdate = self.e1.get() # gets the startdate value from input box
    enddate = self.e2.get() # gets the enddate value from input box
    if startdate.isdigit() and enddate.isdigit():
        pass
    else:
        tkinter.messagebox.showerror('Error Message', 'Integer Please!')
        return
The easiest way would probably be to employ regex. However, YYYYMMDD is apparently an uncommon format and the regex I found was complicated. Here's an example of a regex for matching the format YYYY-MM-DD:
import re
text = input('Input a date (YYYY-MM-DD): ')
pattern = r'(19|20)\d\d[- /.](0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])'
match = re.search(pattern, text)
if match:
    print(match.group())
else:
    print('Wrong format')
This regex works for years 1900 to 2099 (the twentieth and twenty-first centuries) and does not check how many days each month has, only that the day is at most 31.
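For the YYYYMMDD format the question actually asks about, the same pattern can be adapted by dropping the separator classes. This is a rough sketch; the names YYYYMMDD and looks_like_yyyymmdd are made up for illustration, and re.fullmatch is used so that extra characters before or after the date are rejected.

```python
import re

# Same year/month/day alternatives as above, without separators.
YYYYMMDD = re.compile(r'(19|20)\d\d(0[1-9]|1[012])(0[1-9]|[12][0-9]|3[01])')


def looks_like_yyyymmdd(text):
    """Return True if text has the shape YYYYMMDD (calendar validity is not checked)."""
    return YYYYMMDD.fullmatch(text) is not None


print(looks_like_yyyymmdd('20190430'))    # True
print(looks_like_yyyymmdd('20190431'))    # True: the pattern does not know April has 30 days
print(looks_like_yyyymmdd('2019-04-30'))  # False
```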
You have probably solved this already, but for anyone facing the same issue: you can also convert the data retrieved from the entry widgets to a datetime using the strptime method, with a try statement to catch exceptions, like:
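from datetime import datetime
def retrieve_inputBoxes():
    try:
        startdate = datetime.strptime(self.e1.get(), '%Y-%m-%d')
        enddate = datetime.strptime(self.e2.get(), '%Y-%m-%d')
    except ValueError:  # raised by strptime when the text does not match the format
        print('Wrong datetime format, must be YYYY-MM-DD')
    else:
        # .format() must be called on the string, not on the result of print()
        print('startdate: {}, enddate: {}'.format(startdate, enddate))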
Note that the resulting output string will look something like YYYY-MM-DD HH:MM:SS, which you can truncate as follows to get only the date:
startdate = str(startdate)[0:10] #This truncates the string to the first 10 characters
enddate = str(enddate)[0:10]
In my opinion, this method is better than the regex method because it also rejects impossible dates such as 2019-04-31 and handles leap years correctly (e.g. 2019-02-29 is invalid, 2020-02-29 is valid).
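Since the question asks for YYYYMMDD rather than YYYY-MM-DD, the strptime approach can be adapted with the '%Y%m%d' format. The sketch below uses a hypothetical helper name, parse_yyyymmdd, and adds a length and digit check first, because strptime on its own also accepts one-digit months and days (for example '2019111').

```python
from datetime import datetime


def parse_yyyymmdd(text):
    """Return a datetime.date for a strict YYYYMMDD string, or None if it is invalid."""
    # Require exactly eight ASCII digits before handing over to strptime.
    if len(text) != 8 or not (text.isascii() and text.isdigit()):
        return None
    try:
        return datetime.strptime(text, '%Y%m%d').date()
    except ValueError:
        # Right shape, but not a real calendar date (e.g. 20190229).
        return None


print(parse_yyyymmdd('20200229'))  # 2020-02-29
print(parse_yyyymmdd('20190229'))  # None
print(parse_yyyymmdd('2019111'))   # None
```

In the Tkinter handler, a None result would be the point at which to call tkinter.messagebox.showerror.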

Sort by datetime in python3

Looking for help on how to sort a Python 3 dictionary by a datetime value (as shown below, a value in the dictionary), using the timestamp format below.
datetime: "2018-05-08T14:06:54-04:00"
Any help would be appreciated. I have spent a bit of time on this and know that to create the object I can do:
format = "%Y-%m-%dT%H:%M:%S"
# Make strptime obj from string minus the crap at the end
strpTime = datetime.datetime.strptime(ts[:-6], format)
# Create string of the pieces I want from obj
# %M is minutes (%m would insert the month number instead)
convertedTime = strpTime.strftime("%B %d %Y, %-I:%M %p")
But I'm unsure how to compare that to the other values in a way that accounts for both day and time correctly and cleanly.
Again, any nudges in the right direction would be greatly appreciated!
Thanks ahead of time.
Datetime instances support the usual ordering operators (< etc), so you should order in the datetime domain directly, not with strings.
Use a callable to convert your strings to timezone-aware datetime instances:
from datetime import datetime
def key(s):
    fmt = "%Y-%m-%dT%H:%M:%S%z"
    s = ''.join(s.rsplit(':', 1)) # remove colon from offset
    return datetime.strptime(s, fmt)
This key func can be used to correctly sort values:
>>> data = {'s1': "2018-05-08T14:06:54-04:00", 's2': "2018-05-08T14:05:54-04:00"}
>>> sorted(data.values(), key=key)
['2018-05-08T14:05:54-04:00', '2018-05-08T14:06:54-04:00']
>>> sorted(data.items(), key=lambda item: key(item[1]))
[('s2', '2018-05-08T14:05:54-04:00'), ('s1', '2018-05-08T14:06:54-04:00')]
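On Python 3.7 or later, datetime.fromisoformat parses this timestamp format directly, including the colon in the offset, so it can serve as the key without any string surgery. A small sketch with made-up data (the records dictionary and its mixed offsets are for illustration only) shows that aware datetimes are compared by instant, not by their text:

```python
from datetime import datetime

# Hypothetical data with different UTC offsets.
records = {
    "a": "2018-05-08T14:06:54-04:00",  # 18:06:54 UTC
    "b": "2018-05-08T19:00:00+02:00",  # 17:00:00 UTC
    "c": "2018-05-08T14:05:54-04:00",  # 18:05:54 UTC
}

# Sort the items by the instant each timestamp represents.
ordered = sorted(records.items(), key=lambda item: datetime.fromisoformat(item[1]))

print([name for name, _ in ordered])  # ['b', 'c', 'a']
```

A plain string sort would put "b" last here, which is why sorting in the datetime domain matters once offsets differ.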

How to create a datetime object given seconds since unix epoch?

There seems to be a lot of confusion online on doing a very basic thing: create a datetime object with UTC timezone given seconds since unix epoch in the UTC timezone. Basically, I always want to work in absolute time/UTC.
I'm using python 3.5 (the latest right now) and want to simply get a datetime object in the context of UTC (+0/Zulu offset) from a floating point value of elapsed seconds since 1970 Jan 01.
This is wrong since the first time is created in my local timezone, and then I attempt to switch to UTC.
import datetime
import pytz
dt = datetime.datetime.fromtimestamp(my_seconds).replace(tzinfo=pytz.UTC)
Python provides the method utcfromtimestamp for just that case. Note that it returns a naive datetime (tzinfo is None):
import datetime
seconds = 0
utcdate_from_timestamp = datetime.datetime.utcfromtimestamp(seconds)
If my_seconds is a POSIX timestamp then to convert it to datetime in Python 3:
#!/usr/bin/env python3
from datetime import datetime, timedelta, timezone
utc_dt = datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(seconds=my_seconds)
utc_dt = datetime.fromtimestamp(my_seconds, timezone.utc)
naive_utc_dt = datetime.utcfromtimestamp(my_seconds)
If your local timezone is "right" (non-POSIX) then only the first formula is correct (the others interpret my_seconds as TAI timestamp with datetime(1970, 1, 1, 0, 0, 10) TAI epoch in this case).
The first formula is more portable and may support a wider input range than the others.
The results of the 1st and 2nd expressions may differ due to rounding errors on some Python versions.
The 2nd and 3rd calls should differ only in the tzinfo attribute (the latter returns a naive datetime object, i.e. .tzinfo is None). You should prefer timezone-aware datetime objects to avoid ambiguity.
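A short runnable sketch of the two timezone-aware forms, using a hypothetical sample value for my_seconds, may help to see that they agree and that the result round-trips back to the same number of seconds:

```python
from datetime import datetime, timedelta, timezone

my_seconds = 1582588963.0  # hypothetical sample value (seconds since the epoch)

# Aware datetime built by adding the elapsed time to the epoch.
via_epoch = datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(seconds=my_seconds)

# Aware datetime built directly; passing tz avoids any detour through local time.
via_fromtimestamp = datetime.fromtimestamp(my_seconds, tz=timezone.utc)

print(via_fromtimestamp.isoformat())          # 2020-02-25T00:02:43+00:00
print(via_epoch == via_fromtimestamp)         # True
print(via_fromtimestamp.timestamp() == my_seconds)  # True
```

Neither form depends on the machine's local timezone, unlike fromtimestamp(my_seconds) followed by replace(tzinfo=...).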

Conversion of datetime to numpy datetime without timezone info

Suppose I have a datetime variable:
dt = datetime.datetime(2001,1,1,0,0)
and I convert it to numpy with numpy.datetime64(dt). I get
numpy.datetime64('2000-12-31T19:00:00.000000-0500')
with dtype('<M8[us]')
But this automatically takes into account my time zone (i.e. EST in this case) and gives me back a date of 2000-12-31 and a time of 19:00 hours.
How can I convert it to datetime64[D] in numpy that ignores the timezone information and simply gives me
numpy.datetime64('2001-01-01')
with dtype('<M8[D]')
The numpy datetime64 doc page gives no information on how to ignore the time zone or how to make UTC the default time zone.
I was just playing around with this the other day. I think there are 2 issues: how the datetime.datetime object is converted to np.datetime64, and how the latter is displayed.
The numpy doc talks about creating a datetime64 object from a date string. It appears that when given a datetime.datetime object, it first produces a string.
np.datetime64(dt) == np.datetime64(dt.isoformat())
I found that I could add timezone info to that string
np.datetime64(dt.isoformat()+'Z') # default assumption
np.datetime64(dt.isoformat()+'-0500')
Numpy 1.7.0 reads ISO 8601 strings w/o TZ as local (ISO specifies this)
Datetimes are always stored based on POSIX time with an epoch of 1970-01-01T00:00Z
As for display, the test_datetime.py file offers some clues as to the undocumented behavior.
https://github.com/numpy/numpy/blob/280f6050d2291e50aeb0716a66d1258ab3276553/numpy/core/tests/test_datetime.py
e.g.:
def test_datetime_array_str(self):
    a = np.array(['2011-03-16', '1920-01-01', '2013-05-19'], dtype='M')
    assert_equal(str(a), "['2011-03-16' '1920-01-01' '2013-05-19']")
    a = np.array(['2011-03-16T13:55Z', '1920-01-01T03:12Z'], dtype='M')
    assert_equal(np.array2string(a, separator=', ',
                formatter={'datetime': lambda x :
                        "'%s'" % np.datetime_as_string(x, timezone='UTC')}),
                "['2011-03-16T13:55Z', '1920-01-01T03:12Z']")
So you can customize the print behavior of an array with np.array2string, and np.datetime_as_string. np.set_printoptions also takes a formatter parameter.
The pytz module is used to add further timezone handling:
#dec.skipif(not _has_pytz, "The pytz module is not available.")
def test_datetime_as_string_timezone(self):
    # timezone='local' vs 'UTC'
    a = np.datetime64('2010-03-15T06:30Z', 'm')
    assert_equal(np.datetime_as_string(a, timezone='UTC'),
                 '2010-03-15T06:30Z')
    assert_(np.datetime_as_string(a, timezone='local') !=
            '2010-03-15T06:30Z')
    ....
Examples:
In [48]: np.datetime_as_string(np.datetime64(dt),timezone='local')
Out[48]: '2000-12-31T16:00:00.000000-0800'
In [49]: np.datetime64(dt)
Out[49]: numpy.datetime64('2000-12-31T16:00:00.000000-0800')
In [50]: np.datetime_as_string(np.datetime64(dt))
Out[50]: '2001-01-01T00:00:00.000000Z'
In [51]: np.datetime_as_string(np.datetime64(dt),timezone='UTC')
Out[51]: '2001-01-01T00:00:00.000000Z'
In [52]: np.datetime_as_string(np.datetime64(dt),timezone='local')
Out[52]: '2000-12-31T16:00:00.000000-0800'
In [81]: np.datetime_as_string(np.datetime64(dt),timezone=pytz.timezone('US/Eastern'))
Out[81]: '2000-12-31T19:00:00.000000-0500'
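The session above reflects an older NumPy. From NumPy 1.11 on, datetime64 is timezone-naive, so the local-time shift shown in the question should no longer appear. A minimal sketch of getting the day-resolution value the question asks for, assuming a reasonably recent NumPy:

```python
import datetime

import numpy as np

dt = datetime.datetime(2001, 1, 1, 0, 0)

# Option 1: convert at microsecond resolution, then truncate to whole days.
day_a = np.datetime64(dt).astype('datetime64[D]')

# Option 2: pass a datetime.date, which maps directly to day resolution.
day_b = np.datetime64(dt.date())

print(day_a, day_a.dtype)  # 2001-01-01 datetime64[D]
print(day_a == day_b)      # True
```

Option 2 drops the time of day before NumPy ever sees it, which makes the intent explicit when only the calendar date matters.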
