Using zoneinfo with pandas.date_range - python-3.x

I am trying to use zoneinfo instead of pytz, but I am running into a problem when using zoneinfo to initialize dates and passing them on to pd.date_range.
Below is an example that does the exact same thing with pytz and with zoneinfo, yet passing the result to pd.date_range raises an error only with the latter.
pytz example:
from datetime import datetime, timedelta
import pytz
import pandas as pd

start_date = datetime(2021, 1, 1, 0, 0, 0)
end_date = datetime(2024, 1, 1, 0, 0, 0)  # exclusive end of range
pt = pytz.timezone('Canada/Pacific')
start_date = pt.localize(start_date)
end_date = pt.localize(end_date)
pd.date_range(start_date, end_date - timedelta(days=1), freq='d')
zoneinfo example:
from zoneinfo import ZoneInfo
from dateutil.relativedelta import relativedelta

start_date1 = '2021-01-01 00:00:00'
start_date1 = datetime.strptime(start_date1, '%Y-%m-%d %H:%M:%S').replace(microsecond=0, second=0, minute=0, tzinfo=ZoneInfo("America/Vancouver"))
end_date1 = start_date1 + relativedelta(years=3)
pd.date_range(start_date1, end_date1 - timedelta(days=1), freq='d')
Yet, when using zoneinfo I get the following error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
~/Documents/GitHub/virtual/lib/python3.9/site-packages/pandas/_libs/tslibs/timezones.pyx in pandas._libs.tslibs.timezones.get_dst_info()
AttributeError: 'NoneType' object has no attribute 'total_seconds'
Exception ignored in: 'pandas._libs.tslibs.tzconversion.tz_convert_from_utc_single'
Traceback (most recent call last):
File "pandas/_libs/tslibs/timezones.pyx", line 266, in pandas._libs.tslibs.timezones.get_dst_info
AttributeError: 'NoneType' object has no attribute 'total_seconds'
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
~/Documents/GitHub/virtual/lib/python3.9/site-packages/pandas/_libs/tslibs/timezones.pyx in pandas._libs.tslibs.timezones.get_dst_info()
AttributeError: 'NoneType' object has no attribute 'total_seconds'
Exception ignored in: 'pandas._libs.tslibs.tzconversion.tz_convert_from_utc_single'
Traceback (most recent call last):
File "pandas/_libs/tslibs/timezones.pyx", line 266, in pandas._libs.tslibs.timezones.get_dst_info
AttributeError: 'NoneType' object has no attribute 'total_seconds'
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
/var/folders/vp/7ptlp5l934vdh1lvmpgk4qyc0000gn/T/ipykernel_67190/3566591779.py in <module>
5 end_date1 = start_date1 + relativedelta(years=3)
6
----> 7 pd.date_range(start_date1, end_date1-timedelta(days=1), freq='d')
8
9 # Because certain distributions will be a result of combined distributions,
~/Documents/GitHub/virtual/lib/python3.9/site-packages/pandas/core/indexes/datetimes.py in date_range(start, end, periods, freq, tz, normalize, name, closed, **kwargs)
1095 freq = "D"
1096
-> 1097 dtarr = DatetimeArray._generate_range(
1098 start=start,
1099 end=end,
~/Documents/GitHub/virtual/lib/python3.9/site-packages/pandas/core/arrays/datetimes.py in _generate_range(cls, start, end, periods, freq, tz, normalize, ambiguous, nonexistent, closed)
450
451 if tz is not None and index.tz is None:
--> 452 arr = tzconversion.tz_localize_to_utc(
453 index.asi8, tz, ambiguous=ambiguous, nonexistent=nonexistent
454 )
~/Documents/GitHub/virtual/lib/python3.9/site-packages/pandas/_libs/tslibs/tzconversion.pyx in pandas._libs.tslibs.tzconversion.tz_localize_to_utc()
~/Documents/GitHub/virtual/lib/python3.9/site-packages/pandas/_libs/tslibs/timezones.pyx in pandas._libs.tslibs.timezones.get_dst_info()
AttributeError: 'NoneType' object has no attribute 'total_seconds'
Testing the parameters:
start_date==start_date1
and
end_date==end_date1
Both tests result in True.

If I understand correctly, you want to create a date range (1D freq) using ZoneInfo. If so, I see a few things going on with your code.
#1 When dealing with datetimes, be sure the object is in the correct dtype. I believe the datetime64 format will work better.
#2 From the provided code, I don't think 'strptime' or 'replace' is needed. To use "America/Vancouver" with ZoneInfo, you can make it work by parsing start_date1 into years, months, days, hours and minutes.
#3 Once start_date1 is parsed, you can add 3 (or another number) to the year to create the end date.
The above will create a DatetimeIndex over the specified range.
Datetimes are always tricky. As always, you can reach the same destination by different paths; this is just one of them.
import datetime as dt
from zoneinfo import ZoneInfo
import pandas as pd

start_date_str = '2021-01-01 00:00:00'
start_date_datetime64 = pd.to_datetime(start_date_str)  # change dtype to datetime64
year = start_date_datetime64.year
month = start_date_datetime64.month
day = start_date_datetime64.day
hour = start_date_datetime64.hour
minute = start_date_datetime64.minute
start_date_formatted = dt.datetime(year, month, day, hour, minute, tzinfo=ZoneInfo("America/Vancouver"))
end_date_formatted = dt.datetime(year + 3, month, day, hour, minute, tzinfo=ZoneInfo("America/Vancouver"))
result = pd.date_range(start_date_formatted, end_date_formatted - pd.Timedelta(days=1), freq='d')
OUTPUT - DatetimeIndex(dtype='datetime64[ns, America/Vancouver]', length=1095, freq='D')

This error was the result of an incompatibility between the pandas version and the nbformat version. Once I updated both to the newest version, the code ran with no error.
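For what it's worth, on a sufficiently recent pandas the ZoneInfo-based call works as-is; here is a minimal sketch, assuming Python 3.9+ and a pandas version new enough to fully support zoneinfo tzinfo objects (the traceback above points at an older one):

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

import pandas as pd

start = datetime(2021, 1, 1, tzinfo=ZoneInfo("America/Vancouver"))
end = datetime(2024, 1, 1, tzinfo=ZoneInfo("America/Vancouver"))  # exclusive end

idx = pd.date_range(start, end - timedelta(days=1), freq="d")
print(len(idx))  # 1095 daily timestamps, 2021-01-01 through 2023-12-31
```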

Related

How to find hours passed since a given time in milliseconds in Python?

One of the tools gives the start time in milliseconds as below:
'StartMilliseconds': 1645250400857
How do I find the hours passed since this timestamp? I tried the following:
>>> start=datetime.datetime.fromtimestamp(1645250400857/1000.0)
>>> now=datetime.datetime.now
>>> now-start
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for -: 'builtin_function_or_method' and
'datetime.datetime'
You can find the current timestamp in seconds using datetime.now().timestamp() and then apply the conversions to compute hours since start:
from datetime import datetime
start_ms = 1645250400857
hours_since_start = (datetime.now().timestamp() - start_ms / 1000) / 3600
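Alternatively, note that the TypeError in the question comes from assigning the function datetime.datetime.now itself instead of calling it. With the call parentheses added, plain datetime subtraction also works; a sketch using UTC to avoid local-time ambiguity:

```python
from datetime import datetime, timezone

start_ms = 1645250400857

# Interpret the millisecond timestamp in UTC to avoid local-time ambiguity.
start = datetime.fromtimestamp(start_ms / 1000.0, tz=timezone.utc)
now = datetime.now(timezone.utc)  # note the (): datetime.now is *called* here

hours_since_start = (now - start).total_seconds() / 3600
print(round(hours_since_start, 2))
```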

AttributeError: 'str' object has no attribute 'date'

I have written code as below:
import datetime as dt
d = dt.date(2000, 1, 15)
But I am getting an error:
AttributeError Traceback (most recent call last)
<ipython-input-23-669287944c85> in <module>
1 # for loop - to convert float->str date to month-year format
2 rnum = 0
----> 3 d = dt.date(2000, 1, 15)
4 data_2["Date_F"] = d
5 for dt in data_2["strDate"]:
AttributeError: 'str' object has no attribute 'date'
I am using Jupyter. Please tell me how to resolve this error.
I don't know why you are getting this; I get the expected output, datetime.date(2000, 1, 15).
You may want to check your indentation or something similar.
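The traceback holds a likely clue beyond indentation: the failing cell contains `for dt in data_2["strDate"]:`, which rebinds the alias dt from the datetime module to a string once the loop has run, so the next run of the cell fails on dt.date. A minimal sketch of that shadowing (the loop's list is a made-up stand-in for data_2["strDate"]):

```python
import datetime as dt

d = dt.date(2000, 1, 15)  # first run: dt is still the datetime module
print(d)  # 2000-01-15

for dt in ["2000-01-15"]:  # rebinds the name dt to a str
    pass

try:
    d = dt.date(2000, 1, 15)  # second run: dt is now a str
except AttributeError as exc:
    print(exc)  # 'str' object has no attribute 'date'
```

Renaming the loop variable (e.g. `for s in data_2["strDate"]:`) avoids the clash.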

Python Numba - Convert DataFrame series object to numpy array

I have a pandas dataframe with strings, and I am trying to use a set operation with Python Numba to get the unique characters in the column that contains the strings. Since Numba does not recognize pandas dataframes, I need to convert the string column to a numpy array. However, once converted, the column shows the dtype as object. Is there a way to convert the pandas dataframe column of strings to a normal array (not an object array)?
Please find the code for your understanding.
z = train.head(2).sentence.values #Train is a pandas DataFrame
z
Output:
array(["Explanation\nWhy the edits made under my username Hardcore Metallica Fan were reverted? They weren't vandalisms, just closure on some GAs after I voted at New York Dolls FAC. And please don't remove the template from the talk page since I'm retired now.89.205.38.27",
"D'aww! He matches this background colour I'm seemingly stuck with. Thanks. (talk) 21:51, January 11, 2016 (UTC)"],
dtype=object)
Python Numba code:
from numba import njit

@njit
def set_(z):
    x = set(z.sum())
    return x

set_(z)
Output:
---------------------------------------------------------------------------
TypingError Traceback (most recent call last)
<ipython-input-51-9d5bc17d106b> in <module>()
----> 1 set_(z)
~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/numba/dispatcher.py in _compile_for_args(self, *args, **kws)
342 raise e
343 else:
--> 344 reraise(type(e), e, None)
345 except errors.UnsupportedError as e:
346 # Something unsupported is present in the user code, add help info
~/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/numba/six.py in reraise(tp, value, tb)
656 value = tp()
657 if value.__traceback__ is not tb:
--> 658 raise value.with_traceback(tb)
659 raise value
660
TypingError: Failed at nopython (nopython frontend)
Internal error at <numba.typeinfer.ArgConstraint object at 0x7fbe66c01a58>:
--%<----------------------------------------------------------------------------
Traceback (most recent call last):
File "/home/ec2-user/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/numba/errors.py", line 491, in new_error_context
yield
File "/home/ec2-user/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/numba/typeinfer.py", line 194, in __call__
assert ty.is_precise()
AssertionError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ec2-user/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/numba/typeinfer.py", line 138, in propagate
constraint(typeinfer)
File "/home/ec2-user/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/numba/typeinfer.py", line 195, in __call__
typeinfer.add_type(self.dst, ty, loc=self.loc)
File "/home/ec2-user/anaconda3/envs/tensorflow_p36/lib/python3.6/contextlib.py", line 99, in __exit__
self.gen.throw(type, value, traceback)
File "/home/ec2-user/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/numba/errors.py", line 499, in new_error_context
six.reraise(type(newerr), newerr, tb)
File "/home/ec2-user/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/numba/six.py", line 659, in reraise
raise value
numba.errors.InternalError:
[1] During: typing of argument at <ipython-input-50-566e4e12481d> (3)
--%<----------------------------------------------------------------------------
File "<ipython-input-50-566e4e12481d>", line 3:
def set_(z):
x = set(z.sum())
^
This error may have been caused by the following argument(s):
- argument 0: Unsupported array dtype: object
This is not usually a problem with Numba itself but instead often caused by
the use of unsupported features or an issue in resolving types.
To see Python/NumPy features supported by the latest release of Numba visit:
http://numba.pydata.org/numba-doc/dev/reference/pysupported.html
and
http://numba.pydata.org/numba-doc/dev/reference/numpysupported.html
For more information about typing errors and how to debug them visit:
http://numba.pydata.org/numba-doc/latest/user/troubleshoot.html#my-code-doesn-t-compile
If you think your code should work with Numba, please report the error message
and traceback, along with a minimal reproducer at:
https://github.com/numba/numba/issues/new
Would anyone be able to help me in this regard?
Thanks & best regards,
Michael
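As a side note on the dtype conversion itself: astype(str) re-packs an object array of Python strings as a fixed-width NumPy unicode array, which is what "a normal array (not an object array)" would look like here. Whether a given Numba release then accepts such an array is a separate, version-dependent question. A sketch with made-up stand-in strings:

```python
import numpy as np

# An object array of Python strings, like train.sentence.values:
z = np.array(["Explanation why ...", "D'aww! He matches ..."], dtype=object)
print(z.dtype)  # object

# astype(str) converts to a fixed-width unicode dtype (kind 'U'):
z_fixed = z.astype(str)
print(z_fixed.dtype.kind)  # U
```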

TypeError: 'numpy.float64' object cannot be interpreted as an integer

I am trying to run the detect_ts function from the pyculiarity package, but I am getting this error when passing it a two-column dataframe in Python.
>>> import pandas as pd
>>> from pyculiarity import detect_ts
>>> data=pd.read_csv('C:\\Users\\nikhil.chauhan\\Desktop\\Bosch_Frame\\dataset1.csv',usecols=['time','value'])
>>> data.head()
time value
0 0 32.0
1 250 40.5
2 500 40.5
3 750 34.5
4 1000 34.5
>>> results = detect_ts(data,max_anoms=0.05,alpha=0.001,direction = 'both')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Windows\System32\pyculiar-0.0.5\pyculiarity\detect_ts.py", line 177, in detect_ts
verbose=verbose)
File "C:\Windows\System32\pyculiar-0.0.5\pyculiarity\detect_anoms.py", line 69, in detect_anoms
decomp = stl(data.value, np=num_obs_per_period)
File "C:\Windows\System32\pyculiar-0.0.5\pyculiarity\stl.py", line 35, in stl
res = sm.tsa.seasonal_decompose(data.values, model='additive', freq=np)
File "C:\Anaconda3\lib\site-packages\statsmodels\tsa\seasonal.py", line 88, in seasonal_decompose
trend = convolution_filter(x, filt)
File "C:\Anaconda3\lib\site-packages\statsmodels\tsa\filters\filtertools.py", line 303, in convolution_filter
result = _pad_nans(result, trim_head, trim_tail)
File "C:\Anaconda3\lib\site-packages\statsmodels\tsa\filters\filtertools.py", line 28, in _pad_nans
return np.r_[[np.nan] * head, x, [np.nan] * tail]
TypeError: 'numpy.float64' object cannot be interpreted as an integer
The problem is that head and tail reach this line as numpy.float64 values, while list repetition ([np.nan] * head) only accepts an integer count. Hence something needs to be converted to an integer first.
But note that casting the NaN itself is not the fix:
return np.r_[[int(np.nan)] * head, x, [int(np.nan)] * tail]
This cannot work, as NaN cannot be cast to an integer type:
ValueError: cannot convert float NaN to integer
It is the repetition counts, head and tail, that need the int() cast. Still, patching library internals would only mask whatever upstream value is arriving as a float, so no proper solution can be suggested unless we know what you are trying to do here. Try providing a bit more detail about your code and you are sure to get help from us.
:-)
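To make the diagnosis concrete, the failing operation can be reproduced in isolation: list repetition requires an integer count, so a numpy.float64 count raises a TypeError, while casting the count works. A minimal sketch (the value 2.0 is a made-up stand-in for the head/tail statsmodels computes):

```python
import numpy as np

head = np.float64(2.0)  # a float count, like the one statsmodels receives

try:
    padding = [np.nan] * head  # list repetition needs an integer count
except TypeError as exc:
    print(exc)

padding = [np.nan] * int(head)  # casting the count (not the NaN) works
print(len(padding))  # 2
```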

TypeError: len() of unsized object when comparing and I cannot make sense of it

I am trying to select sensors by placing a box around their geographic coordinates:
In [1]: lat_min, lat_max = lats(data)
lon_min, lon_max = lons(data)
print(np.around(np.array([lat_min, lat_max, lon_min, lon_max]), 5))
Out[1]: [ 32.87248 33.10181 -94.37297 -94.21224]
In [2]: select_sens = sens[(lat_min<=sens['LATITUDE']) & (sens['LATITUDE']<=lat_max) &
(lon_min<=sens['LONGITUDE']) & (sens['LONGITUDE']<=lon_max)].copy()
Out[2]: ---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-12-7881f6717415> in <module>()
4 lon_min, lon_max = lons(data)
5 select_sens = sens[(lat_min<=sens['LATITUDE']) & (sens['LATITUDE']<=lat_max) &
----> 6 (lon_min<=sens['LONGITUDE']) & (sens['LONGITUDE']<=lon_max)].copy()
7 sens_data = data[data['ID'].isin(select_sens['ID'])].copy()
8 sens_data.describe()
/home/kartik/miniconda3/lib/python3.5/site-packages/pandas/core/ops.py in wrapper(self, other, axis)
703 return NotImplemented
704 elif isinstance(other, (np.ndarray, pd.Index)):
--> 705 if len(self) != len(other):
706 raise ValueError('Lengths must match to compare')
707 return self._constructor(na_op(self.values, np.asarray(other)),
TypeError: len() of unsized object
Of course, sens is a pandas DataFrame. Even when I use .where() it raises the same error. I am completely stumped, because it is a simple comparison that shouldn't raise any errors. Even the data types match:
In [3]: sens.dtypes
Out[3]: ID object
COUNTRY object
STATE object
COUNTY object
LENGTH float64
NUMBER object
NAME object
LATITUDE float64
LONGITUDE float64
dtype: object
So what is going on?!?
-----EDIT------
As per Ethan Furman's answer, I made the following changes:
In [2]: select_sens = sens[([lat_min]<=sens['LATITUDE']) & (sens['LATITUDE']<=[lat_max]) &
([lon_min]<=sens['LONGITUDE']) & (sens['LONGITUDE']<=[lon_max])].copy()
And (drumroll) it worked... But why?
I'm not familiar with NumPy or Pandas, but the error is saying that one of the objects in the comparison if len(self) != len(other) does not have a __len__ method and therefore has no length.
Try doing print(sens_data) to see if you get a similar error.
I found a similar issue and think the problem may be related to the Python version you are using.
I wrote my code in Spyder
Python 3.6.1 |Anaconda 4.4.0 (64-bit)
but then passed it to someone using Spyder but
Python 3.5.2 |Anaconda 4.2.0 (64-bit)
I had a numpy.float64 object (as far as I understand, similar to lat_min, lat_max, lon_min and lon_max in your code), MinWD.MinWD[i]:
In [92]: type(MinWD.MinWD[i])
Out[92]: numpy.float64
and a Pandas data frame WatDemandCur with one column called Percentages
In [96]: type(WatDemandCur)
Out[96]: pandas.core.frame.DataFrame
In [98]: type(WatDemandCur['Percentages'])
Out[98]: pandas.core.series.Series
and I wanted to do the following comparison:
In [99]: MinWD.MinWD[i]==WatDemandCur.Percentages
There was no problem with this line when running the code on my machine (Python 3.6.1),
but my friend got something similar to your error on his (Python 3.5.2):
MinWD.MinWD[i]==WatDemandCur.Percentages
Traceback (most recent call last):
File "<ipython-input-99-3e762b849176>", line 1, in <module>
MinWD.MinWD[i]==WatDemandCur.Percentages
File "C:\Program Files\Anaconda3\lib\site-packages\pandas\core\ops.py", line 741, in wrapper
if len(self) != len(other):
TypeError: len() of unsized object
My solution to his problem was to change the code to
[MinWD.MinWD[i]==x for x in WatDemandCur.Percentages]
and it worked in both versions!
With this and your evidence, I would assume that it is not possible to compare numpy.float64 (and perhaps numpy.integer) objects with Pandas Series, and that this may be partly related to the fact that the former have no len function.
Just out of curiosity, I did some tests with plain float and int objects (note the contrast with the numpy.float64 object):
In [122]: Temp=1
In [123]: Temp2=1.0
In [124]: type(Temp)
Out[124]: int
In [125]: type(Temp2)
Out[125]: float
In [126]: len(Temp)
Traceback (most recent call last):
File "<ipython-input-126-dc80ab11ca9c>", line 1, in <module>
len(Temp)
TypeError: object of type 'int' has no len()
In [127]: len(Temp2)
Traceback (most recent call last):
File "<ipython-input-127-a1b836f351d2>", line 1, in <module>
len(Temp2)
TypeError: object of type 'float' has no len()
Temp==WatDemandCur.Percentages
Temp2==WatDemandCur.Percentages
Both worked!
Conclusions
In another Python version your code should work!
The problem with the comparison is specific to numpy.float64 (and perhaps numpy.integer) objects.
When you include [] (or build the list as in my solution), the object changes from a numpy.float64 to a list, and then it works fine.
Although the problem seems related to numpy.float64 objects having no len function, plain floats and ints, which have no len function either, do work.
Hope some of this works for you or someone else facing a similar issue.
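If the list-wrapping trick feels odd, converting the NumPy scalar to a built-in float should sidestep the same code path, since the tests above show plain floats compare cleanly against a Series. A sketch, where lat_min and the toy frame are made-up stand-ins for the question's lats() result and sens:

```python
import numpy as np
import pandas as pd

lat_min = np.float64(32.87248)  # stands in for the scalar lats() returns
sens = pd.DataFrame({"LATITUDE": [32.5, 33.0, 35.0]})  # toy stand-in

# float() turns the NumPy scalar into a plain Python float before comparing:
mask = float(lat_min) <= sens["LATITUDE"]
print(mask.tolist())  # [False, True, True]
```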