Python strptime cannot understand timezone offset - python-3.x

I have a very simple timestamp I need to parse:
10/2/2020 3:19:42 PM (UTC-7)
But using python 3.6, when I try to parse this, I get the following:
>>> datetime.strptime('10/2/2020 3:19:42 PM (UTC-7)', '%m/%d/%Y %I:%M:%S %p (%Z%z)')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "_strptime.py", line 565, in _strptime_datetime
tt, fraction = _strptime(data_string, format)
File "_strptime.py", line 362, in _strptime
(data_string, format))
ValueError: time data '10/2/2020 3:19:42 PM (UTC-7)' does not match format '%m/%d/%Y %I:%M:%S %p (%Z%z)'
I have tried dateutil.parser, as well as several variations of the format string. The piece that's tripping up strptime is the (UTC-7) portion.
Is the string format wrong? How can I parse this string and receive the timezone information as well? Any help is appreciated.
Edit: If the string is (UTC-0700) then the parsing works. But I cannot control how the timestamps are formatted. Is there a way to parse them in their current format (UTC-7)?

Ah, it turned out to be quite silly:
>>> from dateutil import parser
>>> dt = '10/2/2020 3:19:42 PM (UTC-7)'
>>> parser.parse(dt, fuzzy=True)
datetime.datetime(2020, 10, 2, 15, 19, 42, tzinfo=tzoffset(None, 25200))
Should have used fuzzy logic before. :-)
EDIT: The above does NOT work (thanks to @wim for pointing it out): the fuzzy flag ignores the sign of the offset string, so the result carries tzoffset(None, 25200), i.e. UTC+7 rather than UTC-7.
Here is code that works:
>>> from datetime import datetime
>>> import re
>>> dt = '10/2/2020 3:19:42 PM (UTC-7)'
>>> sign, offset = re.search(r'\(UTC([+-])(\d+)\)', dt).groups()
>>> offset = f"0{offset}00" if len(offset) == 1 else f"{offset}00"
>>> dt = re.sub(r'\(UTC[+-]\d+\)', f'(UTC{sign}{offset})', dt)
>>> datetime.strptime(dt, '%m/%d/%Y %I:%M:%S %p (%Z%z)')
datetime.datetime(2020, 10, 2, 15, 19, 42, tzinfo=datetime.timezone(datetime.timedelta(-1, 61200), 'UTC'))
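The same idea can be wrapped in a small helper that does not rewrite the string at all. This is only a sketch: the function name parse_with_short_offset is made up for illustration, and it assumes the suffix is always a whole number of hours in the form (UTC+H) or (UTC-H).

```python
import re
from datetime import datetime, timedelta, timezone

# Matches a trailing "(UTC-7)" or "(UTC+10)". Whole hours only (assumption).
_OFFSET_RE = re.compile(r"\s*\(UTC([+-])(\d{1,2})\)\s*$")


def parse_with_short_offset(text):
    """Parse 'M/D/YYYY H:MM:SS AM/PM (UTC+-H)' into an aware datetime."""
    match = _OFFSET_RE.search(text)
    if match is None:
        raise ValueError(f"no (UTC+H) or (UTC-H) suffix in {text!r}")
    sign = -1 if match.group(1) == "-" else 1
    offset = timedelta(hours=sign * int(match.group(2)))
    # Parse everything before the suffix as a naive timestamp.
    naive = datetime.strptime(text[:match.start()], "%m/%d/%Y %I:%M:%S %p")
    # Attach the fixed offset taken from the suffix.
    return naive.replace(tzinfo=timezone(offset))


parsed = parse_with_short_offset("10/2/2020 3:19:42 PM (UTC-7)")
print(parsed.isoformat())  # 2020-10-02T15:19:42-07:00
```

Building the timezone object directly avoids depending on how %Z and %z behave in a particular Python version.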


How to select a row in a pandas DataFrame datetime index using a datetime variable?

I am not a professional programmer and am slowly accumulating experience in Python.
This is the issue I encountered.
On my dev machine I had Python 3.7 installed with pandas version 0.24.4, and
the following sequence worked perfectly fine.
>>> import pandas as pd
>>> df = pd.Series(range(3), index=pd.date_range("2000", freq="D", periods=3))
>>> df
2000-01-01 0
2000-01-02 1
2000-01-03 2
Freq: D, dtype: int64
>>> import datetime
>>> D = datetime.date(2000,1,1)
>>> df[D]
0
In the production environment the pandas version is 1.1.4, and the same sequence no longer works.
>>> df[D]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/ec2-user/.local/lib/python3.7/site-packages/pandas/core/series.py", line 882, in __getitem__
return self._get_value(key)
File "/home/ec2-user/.local/lib/python3.7/site-packages/pandas/core/series.py", line 989, in _get_value
loc = self.index.get_loc(label)
File "/home/ec2-user/.local/lib/python3.7/site-packages/pandas/core/indexes/datetimes.py", line 622, in get_loc
raise KeyError(key)
KeyError: datetime.date(2000, 1, 1)
Then, unexpectedly, converting D to a string made the lookup work:
>>> df[str(D)]
0
Any idea why this behaviour changed between versions?
Is this behaviour a bug, or is it permanent?
Should I convert every datetime-based selection in my code to strings, or is there a more robust way to do this that will keep working over time?
It depends on the pandas version. For a more robust solution, use datetime.datetime objects, which match the DatetimeIndex:
import datetime
D = datetime.datetime(2000,1,1)
print(df[D])
0
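If the code already holds datetime.date values, one option is to convert them at the lookup site. A minimal sketch, assuming pd.Timestamp is an acceptable key type for your index (the variable names are illustrative):

```python
import datetime

import pandas as pd

s = pd.Series(range(3), index=pd.date_range("2000", freq="D", periods=3))

day = datetime.date(2000, 1, 2)
# pd.Timestamp turns the date into midnight of that day, which is the
# same type the DatetimeIndex stores, so the lookup is an exact match.
value = s.loc[pd.Timestamp(day)]
print(value)
```

Converting explicitly keeps the lookup independent of how a given pandas version treats datetime.date keys.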

python3 cant convert string to datetime object

I have the following code:
from datetime import datetime
date_time_str = '2020-07-17 21:59:49.55'
date_time_obj = datetime.strptime(date_time_str, '%y-%m-%d %H:%M:%S.%f')
print "The type of the date is now", type(date_time_obj)
print "The date is", date_time_obj
Which results in the error:
Traceback (most recent call last):
File "main.py", line 5, in <module>
date_time_obj = datetime.strptime(date_time_str, '%y-%m-%d %H:%M:%S.%f')
File "/usr/lib/python2.7/_strptime.py", line 332, in _strptime
(data_string, format))
ValueError: time data '2020-07-17 21:59:49.553' does not match format '%y-%m-%d %H:%M:%S.%f'
Why can't I convert this date? The following example works:
date_time_str = '18/09/19 01:55:19'
date_time_obj = datetime.strptime(date_time_str, '%d/%m/%y %H:%M:%S')
First of all, this is not valid Python 3 code. You've used the Python 2 print statement in your code, and trying to run this on Python 3 causes a SyntaxError.
As the error indicates, your date string does not match the format you specified. Take a look at the format codes; the first issue I notice is that you give a 4-digit year (2020) but try to line it up with %y, which is for two-digit year. There may be other issues as well, which should be easy to find looking through that table.
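For reference, a Python 3 version of the snippet might look like the following. It is a sketch that assumes the only mismatch is the year directive, with %Y (four-digit year) swapped in for %y, and it uses the print() function.

```python
from datetime import datetime

date_time_str = "2020-07-17 21:59:49.55"
# %Y matches a four-digit year; %f accepts one to six fractional digits.
date_time_obj = datetime.strptime(date_time_str, "%Y-%m-%d %H:%M:%S.%f")

print("The type of the date is now", type(date_time_obj))
print("The date is", date_time_obj)
```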

Using dateutil.parser to account for timezones, but parse won't recognize tzinfos?

I'm trying to extract time stamps from a list that contains different timezones.
I am using dateutil.parser. I believe I want to use the parse function for this, including timezone information, but it appears it doesn't want to accept them. Can someone tell me where I'm going wrong?
from dateutil.parser import parse
timezone_info = {
    "PDT": "UTC -7",
    "PST": "UTC -8",
}
date_list = ['Oct 21, 2019 19:30 PDT',
             'Nov 4, 2019 18:30 PST']
for dates in date_list:
    print(parse(dates))
# This gives:
# 2019-10-21 19:30:00
# 2019-11-04 18:30:00
for date in date_list:
    print(parse(dates, tzinfos = timezone_info))
This is the output:
2019-10-21 19:30:00
2019-11-04 18:30:00
C:\Users\mbsta\Anaconda3\envs\untitled2\lib\site-packages\dateutil\parser\_parser.py:1218: UnknownTimezoneWarning: tzname PDT identified but not understood. Pass `tzinfos` argument in order to correctly return a timezone-aware datetime. In a future version, this will raise an exception.
category=UnknownTimezoneWarning)
C:\Users\mbsta\Anaconda3\envs\untitled2\lib\site-packages\dateutil\parser\_parser.py:1218: UnknownTimezoneWarning: tzname PST identified but not understood. Pass `tzinfos` argument in order to correctly return a timezone-aware datetime. In a future version, this will raise an exception.
category=UnknownTimezoneWarning)
Traceback (most recent call last):
File "C:/Users/mbsta/PycharmProjects/untitled2/tester.py", line 16, in <module>
print(parse(dates, tzinfos = timezone_info))
File "C:\Users\mbsta\Anaconda3\envs\untitled2\lib\site-packages\dateutil\parser\_parser.py", line 1374, in parse
return DEFAULTPARSER.parse(timestr, **kwargs)
File "C:\Users\mbsta\Anaconda3\envs\untitled2\lib\site-packages\dateutil\parser\_parser.py", line 660, in parse
ret = self._build_tzaware(ret, res, tzinfos)
File "C:\Users\mbsta\Anaconda3\envs\untitled2\lib\site-packages\dateutil\parser\_parser.py", line 1185, in _build_tzaware
tzinfo = self._build_tzinfo(tzinfos, res.tzname, res.tzoffset)
File "C:\Users\mbsta\Anaconda3\envs\untitled2\lib\site-packages\dateutil\parser\_parser.py", line 1175, in _build_tzinfo
tzinfo = tz.tzstr(tzdata)
File "C:\Users\mbsta\Anaconda3\envs\untitled2\lib\site-packages\dateutil\tz\_factories.py", line 69, in __call__
cls.instance(s, posix_offset))
File "C:\Users\mbsta\Anaconda3\envs\untitled2\lib\site-packages\dateutil\tz\_factories.py", line 22, in instance
return type.__call__(cls, *args, **kwargs)
File "C:\Users\mbsta\Anaconda3\envs\untitled2\lib\site-packages\dateutil\tz\tz.py", line 1087, in __init__
raise ValueError("unknown string format")
ValueError: unknown string format
Process finished with exit code 1
I believe the issue here is that the offsets you are specifying are not a valid format for tzstr, which expects something that looks like a TZ variable. If you change the strings to "PST+8" and "PDT+7", respectively, it will work as intended.
That said, I think you'd be much better off using a tzfile, which is one of the main things that tzinfos is for:
from dateutil import parser
from dateutil import tz
PACIFIC = tz.gettz("America/Los_Angeles")
timezone_info = {"PST": PACIFIC, "PDT": PACIFIC}
date_list = ["Oct 21, 2019 19:30 PDT",
             "Nov 4, 2019 18:30 PST"]
for dtstr in date_list:
    print(parser.parse(dtstr, tzinfos=timezone_info))
This prints:
2019-10-21 19:30:00-07:00
2019-11-04 18:30:00-08:00
And because this attaches a full time zone rather than a fixed offset, you can do arithmetic on the results without worrying.
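If fixed offsets really are what you want, tzinfos also accepts integer offsets from UTC in seconds. A small sketch (the FIXED_OFFSETS name is illustrative); note that it pins each abbreviation to one offset and carries no daylight-saving rules:

```python
from datetime import timedelta

from dateutil import parser

# Offsets from UTC in seconds, keyed by the abbreviation in the string.
FIXED_OFFSETS = {"PDT": -7 * 3600, "PST": -8 * 3600}

date_list = ["Oct 21, 2019 19:30 PDT",
             "Nov 4, 2019 18:30 PST"]

parsed = [parser.parse(dtstr, tzinfos=FIXED_OFFSETS) for dtstr in date_list]
for dt in parsed:
    print(dt.isoformat())
```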

Datetime Module and Timedelta

I need to add one hour to the current time and then subtract the minutes.
For example: current time = 7:31, added hour = 7:31 + 1 hour = 8:31, required time = 8:31 - 31 minutes = 8:00.
Any help or a workaround will be greatly appreciated.
from datetime import datetime, timedelta
import time
addedtime = (datetime.now() + timedelta(hours=1)).strftime('%H:%M')
requiredtime = addedtime - timedelta(now.minutes).strftime('%H:%M')
You're setting addedtime to a string rather than a datetime, then getting into trouble because you're trying to subtract a timedelta from that string:
>>> addedtime = (datetime.now() + timedelta(hours=1)).strftime('%H:%M')
>>> addedtime
'23:30'
>>> addedtime - timedelta(minutes=4)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for -: 'str' and 'datetime.timedelta'
Instead, keep them as timepoints for as long as you need to manipulate them as timepoints, converting them to a string when you need the final result:
>>> time1 = datetime.now()
>>> time1
datetime.datetime(2019, 10, 17, 22, 23, 55, 860195)
>>> time2 = time1 + timedelta(hours=1)
>>> time2
datetime.datetime(2019, 10, 17, 23, 23, 55, 860195)
>>> time3 = time2 - timedelta(minutes=time2.minute)
>>> time3
datetime.datetime(2019, 10, 17, 23, 0, 55, 860195)
>>> time3.strftime("%H:%M")
'23:00'
Of course, you can also do it as a single operation since you can both add one hour and subtract some minutes with a single timedelta:
>>> final = (time1 + timedelta(hours=1, minutes=-time1.minute)).strftime("%H:%M")
>>> final
'23:00'
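Note that the timedelta arithmetic above leaves the seconds and microseconds in place (time3 still ends in 55, 860195). If the goal is exactly the top of the next hour, one alternative is datetime.replace. A sketch, with top_of_next_hour as an illustrative name:

```python
from datetime import datetime, timedelta


def top_of_next_hour(moment):
    """Return the start of the hour that follows `moment`'s current hour."""
    # Adding the hour first lets datetime handle rollover past midnight.
    shifted = moment + timedelta(hours=1)
    # Zero out everything below the hour.
    return shifted.replace(minute=0, second=0, microsecond=0)


example = datetime(2019, 10, 17, 22, 23, 55, 860195)
print(top_of_next_hour(example).strftime("%H:%M"))  # 23:00
```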
Why not explore one of Python's many amazing datetime libraries ...
pip install parsedatetime
import parsedatetime as pdt
from datetime import datetime
if __name__ == '__main__':
    cal = pdt.Calendar()
    dt, result = cal.parse("10 minutes before an hour from now")
    print(datetime(*dt[:6]))

Cannot convert from an iterable of Python `datetime` objects to an array of Numpy `datetime64` objects using `fromiter()`. Bug?

I'm using Python 3.6.2.
I've learnt from this question how to convert between the standard datetime type to np.datetime64 type, as follows.
from datetime import datetime
import numpy as np
dt = datetime.now()
print(dt)
print(np.datetime64(dt))
Output:
2017-12-19 17:20:12.743969
2017-12-19T17:20:12.743969
But I can't convert an iterable of standard datetime objects into a Numpy array. The following code ...
np.fromiter([dt], dtype=np.datetime64)
... gives the following error.
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-14-46e4618bda89> in <module>()
----> 1 np.fromiter([dt], dtype=np.datetime64)
TypeError: Cannot cast datetime.datetime object from metadata [us] to according to the rule 'same_kind'
However, using np.asarray() works.
np.asarray([dt])
Output:
array([datetime.datetime(2017, 12, 19, 17, 20, 12, 743969)], dtype=object)
Might this be a bug with either np.fromiter() or np.datetime64?
It may just be a matter of setting the datetime units:
In [368]: dt = datetime.now()
In [369]: dt
Out[369]: datetime.datetime(2017, 12, 19, 12, 48, 45, 143287)
Default action for np.array (don't really need fromiter with a list) is to create an object dtype array:
In [370]: np.array([dt,dt])
Out[370]:
array([datetime.datetime(2017, 12, 19, 12, 48, 45, 143287),
datetime.datetime(2017, 12, 19, 12, 48, 45, 143287)], dtype=object)
Looks like plain 'datetime64' produces days:
In [371]: np.array([dt,dt], dtype='datetime64')
Out[371]: array(['2017-12-19', '2017-12-19'], dtype='datetime64[D]')
and specifying the units:
In [373]: np.array([dt,dt], dtype='datetime64[m]')
Out[373]: array(['2017-12-19T12:48', '2017-12-19T12:48'], dtype='datetime64[m]')
This also works with fromiter.
In [374]: np.fromiter([dt,dt], dtype='datetime64[m]')
Out[374]: array(['2017-12-19T12:48', '2017-12-19T12:48'], dtype='datetime64[m]')
In [384]: x= np.fromiter([dt,dt], dtype='M8[us]')
In [385]: x
Out[385]: array(['2017-12-19T12:48:45.143287', '2017-12-19T12:48:45.143287'], dtype='datetime64[us]')
I've learned to use the string name of the datetime64, which allows me to specify the units, rather than the most generic np.datetime64.
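Put together as a standalone script, the unit-qualified dtype looks like this. It is a sketch with fixed example timestamps; np.array is used here, and per the session above np.fromiter accepts the same dtype string.

```python
from datetime import datetime

import numpy as np

stamps = [
    datetime(2017, 12, 19, 17, 20, 12, 743969),
    datetime(2017, 12, 19, 17, 21, 0),
]

# Naming the unit ("us" = microseconds) tells NumPy how to store the values;
# without it the list would become an object array of datetime instances.
arr = np.array(stamps, dtype="datetime64[us]")

print(arr.dtype)
print(arr[1] - arr[0])  # a timedelta64 difference in microseconds
```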
