Suppose I have a datetime variable:
dt = datetime.datetime(2001,1,1,0,0)
If I convert it to numpy with numpy.datetime64(dt), I get
numpy.datetime64('2000-12-31T19:00:00.000000-0500')
with dtype('<M8[us]')
But this automatically takes my time zone into account (EST in this case) and gives me back a date of 2000-12-31 and a time of 19:00 hours.
How can I convert it to datetime64[D] in numpy that ignores the timezone information and simply gives me
numpy.datetime64('2001-01-01')
with dtype('<M8[D]')
The numpy datetime64 doc page gives no information on how to ignore the time-zone or give the default time-zone as UTC
I was just playing around with this the other day. I think there are 2 issues: how the datetime.datetime object is converted to np.datetime64, and how the latter is displayed.
The numpy doc talks about creating a datetime64 object from a date string. It appears that when given a datetime.datetime object, it first produces a string.
np.datetime64(dt) == np.datetime64(dt.isoformat())
I found that I could add timezone info to that string
np.datetime64(dt.isoformat()+'Z') # default assumption
np.datetime64(dt.isoformat()+'-0500')
Numpy 1.7.0 reads ISO 8601 strings w/o TZ as local (ISO specifies this)
Datetimes are always stored based on POSIX time with an epoch of 1970-01-01T00:00Z
As for display, the test_datetime.py file offers some clues as to the undocumented behavior.
https://github.com/numpy/numpy/blob/280f6050d2291e50aeb0716a66d1258ab3276553/numpy/core/tests/test_datetime.py
e.g.:
def test_datetime_array_str(self):
    a = np.array(['2011-03-16', '1920-01-01', '2013-05-19'], dtype='M')
    assert_equal(str(a), "['2011-03-16' '1920-01-01' '2013-05-19']")

    a = np.array(['2011-03-16T13:55Z', '1920-01-01T03:12Z'], dtype='M')
    assert_equal(np.array2string(a, separator=', ',
                formatter={'datetime': lambda x:
                        "'%s'" % np.datetime_as_string(x, timezone='UTC')}),
                "['2011-03-16T13:55Z', '1920-01-01T03:12Z']")
So you can customize the print behavior of an array with np.array2string, and np.datetime_as_string. np.set_printoptions also takes a formatter parameter.
The pytz module is used to add further timezone handling:
@dec.skipif(not _has_pytz, "The pytz module is not available.")
def test_datetime_as_string_timezone(self):
    # timezone='local' vs 'UTC'
    a = np.datetime64('2010-03-15T06:30Z', 'm')
    assert_equal(np.datetime_as_string(a, timezone='UTC'),
                 '2010-03-15T06:30Z')
    assert_(np.datetime_as_string(a, timezone='local') !=
            '2010-03-15T06:30Z')
....
Examples:
In [48]: np.datetime_as_string(np.datetime64(dt),timezone='local')
Out[48]: '2000-12-31T16:00:00.000000-0800'
In [49]: np.datetime64(dt)
Out[49]: numpy.datetime64('2000-12-31T16:00:00.000000-0800')
In [50]: np.datetime_as_string(np.datetime64(dt))
Out[50]: '2001-01-01T00:00:00.000000Z'
In [51]: np.datetime_as_string(np.datetime64(dt),timezone='UTC')
Out[51]: '2001-01-01T00:00:00.000000Z'
In [52]: np.datetime_as_string(np.datetime64(dt),timezone='local')
Out[52]: '2000-12-31T16:00:00.000000-0800'
In [81]: np.datetime_as_string(np.datetime64(dt),timezone=pytz.timezone('US/Eastern'))
Out[81]: '2000-12-31T19:00:00.000000-0500'
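For the original question (getting a plain datetime64[D] without any offset), here is a minimal sketch. It assumes NumPy 1.11 or later, where datetime64 is timezone-naive and no local offset is applied on conversion; the variable names are illustrative only.

```python
import datetime

import numpy as np

dt = datetime.datetime(2001, 1, 1, 0, 0)

# Convert at microsecond resolution first, then truncate to whole days.
day_from_cast = np.datetime64(dt).astype('datetime64[D]')

# Alternatively, drop the time part before handing the value to NumPy;
# a datetime.date converts directly to day resolution.
day_from_date = np.datetime64(dt.date())

print(repr(day_from_cast), day_from_cast.dtype)
```

On the older NumPy versions discussed above, the second form may be the safer choice, since a datetime.date carries no time of day that could be shifted.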
Even though there are several posts concerning NetCDF files and timestamp conversion, I draw a blank today.
I read in a NetCDF data set (version 3), and then I print the time variable's information:
# Load required Python packages
import netCDF4 as nc
import matplotlib.pyplot as plt
from datetime import datetime, timedelta
import pandas as pd
#read in a NetCDF data set
ds = nc.Dataset(fn)
# call time variable information
print(ds['time'])
As answer I get:
<class 'netCDF4._netCDF4.Variable'>
float64 time(time)
units: seconds since 1904-01-01 00:00:00.000 00:00
long_name: time UTC
axis: T
unlimited dimensions: time
current shape = (5760,)
filling on, default _FillValue of 9.969209968386869e+36 used
Now I would like to transform the seconds since 1.1.1904 time stamp into a DD.MM.YYYY HH:MM:SS.sss format. (by the way: why is there a second 00:00 information included after the time stamp?)
(1) I tried:
t = ds['time'][:]
dtime = []
dtime = (pd.to_datetime(t, format='%d.%m.%Y %H:%M:%S.micros') - datetime(1904, 1, 1)).total_seconds()
And I get the error:
pandas_libs\tslibs\strptime.pyx in pandas._libs.tslibs.strptime.array_strptime()
time data '3730320000' does not match format '%d.%m.%Y %H:%M:%S' (match)
(2) I tried:
d = datetime.strptime("01-01-1904", "%m-%d-%Y")
dt = d + timedelta(seconds=(t))
I get the
TypeError: unsupported type for timedelta seconds component: MaskedArray
(3) I tried
d = datetime.strptime("%m-%d-%Y", "01-01-1904")
dt = d + timedelta(seconds=(ds['time']))
And I get the answer:
unsupported type for timedelta seconds component: netCDF4._netCDF4.Variable
Has somebody a clearer view on the solution than I have at the moment?
Thanks,
Swawa
The netCDF4 Python library has a function for this: num2date().
https://unidata.github.io/netcdf4-python/#num2date. There is no need for the datetime module.
netCDF4 variables contain metadata attributes which describe the variable, as seen in the output of your print:
print(ds['time'])  # note in particular the units attribute of the time variable
# t contains just the numeric values of the time in `seconds since 1904-01-01 00:00:00.000 00:00`
t = ds['time'][:]
# t_var is the netCDF4 variable, which carries the `units` attribute
t_var = ds['time']
# dtime = (pd.to_datetime(t, format='%d.%m.%Y %H:%M:%S.micros') - datetime(1904, 1, 1)).total_seconds()
# netCDF4 was imported as nc in the question
dtime = nc.num2date(t, t_var.units)
The above should give you all the times in the dtime list as datetime objects.
print(dtime[0].isoformat())
print(dtime[-1].isoformat())
A simpler way would be:
dtime = nc.num2date(ds['time'][:], ds['time'].units)
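If netCDF4's num2date is not an option, the same conversion can be sketched with the standard library alone. Attempts (2) and (3) in the question failed because timedelta expects a single number, not a whole array, so the sketch below converts one value at a time. The helper name and the sample values are hypothetical; the first sample is the value from the error message in the question.

```python
from datetime import datetime, timedelta, timezone

# Epoch taken from the units attribute: "seconds since 1904-01-01 00:00:00.000"
EPOCH_1904 = datetime(1904, 1, 1, tzinfo=timezone.utc)


def seconds_since_1904_to_str(seconds):
    """Format seconds since 1904-01-01 UTC as DD.MM.YYYY HH:MM:SS.sss."""
    moment = EPOCH_1904 + timedelta(seconds=float(seconds))
    # strftime has no millisecond directive, so append the milliseconds manually.
    return moment.strftime('%d.%m.%Y %H:%M:%S.') + f'{moment.microsecond // 1000:03d}'


# Hypothetical stand-in for the values read with ds['time'][:]
sample = [3730320000.0, 3730320000.5]
formatted = [seconds_since_1904_to_str(s) for s in sample]
print(formatted)
# ['17.03.2022 00:00:00.000', '17.03.2022 00:00:00.500']
```

With the real data, the list comprehension would iterate over the array returned by ds['time'][:] instead of the sample list; masked (fill) values would need to be skipped first.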
I have a datetime as a string that needs to be converted to a datetime object. Below is my code, but it returns an error. What am I missing here?
from datetime import datetime
LocalStartTime='2020-09-17T10:55:06.4000000+1000'
datetime_object = datetime.strptime(LocalStartTime, '%Y-%m-%dT%H:%M:%S.%f%z')
The required output should be the date converted to the current time zone in a format like '2020-09-17 20:55:06' (whatever the actual value will be).
It returns the error below:
ValueError: time data '2020-09-17T10:55:06.4000000+1000' does not match format '%Y-%m-%dT%H:%M:%S.%f%z'
From the datetime documentation:
When used with the strptime() method, the %f directive accepts from one to six digits and zero pads on the right. %f is an extension to the set of format characters in the C standard (but implemented separately in datetime objects, and therefore always available).
You have one zero too many in the fractional part after the seconds; the limit is 6 digits.
from datetime import datetime
LocalStartTime='2020-09-17T10:55:06.400000+1000'
datetime_object = datetime.strptime(LocalStartTime, '%Y-%m-%dT%H:%M:%S.%f%z')
This should work.
Edit:
After the OP edited the question and asked about converting to a different timestamp:
It seems that what you're looking for is timestamp() and fromtimestamp().
You can get the timestamp, which is a POSIX timestamp represented as a float, and convert it back to a datetime object with fromtimestamp() (which returns local time when no tz is given). If you want to remove the fractional part after the seconds, you can convert the timestamp to int.
datetime.fromtimestamp(int(datetime_object.timestamp()))
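When the input string cannot be edited by hand, the extra fractional digits can be trimmed before parsing. The following is only a sketch: the helper name parse_long_fraction is made up for illustration, and it assumes the fraction is always followed by a UTC offset as in the question.

```python
import re
from datetime import datetime, timezone


def parse_long_fraction(text):
    """Parse an ISO-like timestamp whose fraction may exceed six digits."""
    # Keep the first six fractional digits and drop any further ones,
    # because %f accepts at most six.
    trimmed = re.sub(r'(\.\d{6})\d+', r'\1', text)
    return datetime.strptime(trimmed, '%Y-%m-%dT%H:%M:%S.%f%z')


parsed = parse_long_fraction('2020-09-17T10:55:06.4000000+1000')

# Convert the aware datetime to another zone; UTC is used here so the
# result does not depend on the machine running the code.
as_utc = parsed.astimezone(timezone.utc)
print(as_utc.strftime('%Y-%m-%d %H:%M:%S'))
# 2020-09-17 00:55:06
```

Calling parsed.astimezone() without an argument converts to the local time zone of the machine instead, which is probably what "current timezone" means in the question.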
let's say I got a timestamp since epoch in microseconds 1611590898133828 how could I convert this easily into a datetime object considering the unit microseconds.
from datetime import datetime
timestamp_micro = 1611590898133828
dt = datetime.fromtimestamp(timestamp_micro / 1e6)
I would like to be able to do easy conversions since sometimes I have microseconds, sometimes seconds, sometimes nanoseconds to convert.
timestamp_micro = 1611590898133828
dt = datetime.fromtimestamp(timestamp_micro, unit="us")  # wished-for API; fromtimestamp has no unit parameter
Is this somehow possible? For me using python's datetime package is just one pain. Maybe you can also recommend another package in which timestamp handling is easier?
pandas.to_datetime provides the option to set the unit as a keyword:
import pandas as pd
t, UNIT = 1611590898133828, 'us'
dt = pd.to_datetime(t, unit=UNIT)
print(dt, repr(dt))
# 2021-01-25 16:08:18.133828 Timestamp('2021-01-25 16:08:18.133828')
You can now work with pandas' timestamps or convert to a regular Python datetime object like
dt.to_pydatetime()
# datetime.datetime(2021, 1, 25, 16, 8, 18, 133828)
Please also note that if you use fromtimestamp without setting a time zone, you'll get naive datetime, which will be treated as local time by Python (UTC offset might not be 0). See e.g. here.
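If pulling in pandas is not desired, a small helper on top of the standard library can do the unit handling. This is a sketch only: from_epoch and the unit table are hypothetical names, it expects integer timestamps, and it always returns an aware datetime in UTC. Integer arithmetic is used so that large nanosecond values do not lose precision through float division.

```python
from datetime import datetime, timedelta, timezone

# Number of timestamp units per second for each supported unit.
_UNITS_PER_SECOND = {'s': 1, 'ms': 10**3, 'us': 10**6, 'ns': 10**9}
_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_epoch(value, unit='s'):
    """Convert an integer epoch timestamp in the given unit to an aware UTC datetime."""
    per_second = _UNITS_PER_SECOND[unit]
    seconds, remainder = divmod(int(value), per_second)
    # datetime resolves microseconds at best, so finer units are truncated.
    microseconds = remainder * 10**6 // per_second
    return _EPOCH + timedelta(seconds=seconds, microseconds=microseconds)


print(from_epoch(1611590898133828, 'us'))
# 2021-01-25 16:08:18.133828+00:00
```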
You can create new JavaScript Date objects by simply calling const dt = new Date(timestamp). The timestamp value here is an epoch value with up to millisecond precision. JavaScript does not have native support for higher precision.
If you constantly need to work with dates, I would recommend you to use a package such as momentJS, since native JS is quite a pain to handle dates/times.
As far as I can tell there is no way to distinguish the difference between these two date strings ('2020-10-07', '2020-10-07T00:00:00') once they are parsed by dateutil. I really would like to be able to tell the difference between a standalone date and a date with a timestamp of zero.
import dateutil.parser
import datetime
date_str = '2020-10-07'
time_str = '2020-10-07T00:00:00'
s = dateutil.parser.parse(date_str)
e = dateutil.parser.parse(time_str)
The ultimate goal is to set the time to the beginning of the day or the end of the day when it is a standalone date, but to leave the value alone when a time is included. I get close with something like this, but it still can't distinguish this one case. If you know of any good solution to this, that would be really helpful.
if s == e and s.time() == datetime.time.min:
e = datetime.datetime.combine(e, datetime.time.max)
This post is somewhat useful, but it's outdated and I'm not even sure that it would work for my use case: "Finding if a python datetime has no time information".
Here's a function which uses a simple try/except to test if the input can be parsed to a date (i.e. has no time information) or a datetime object (i.e. has time information). If the input format is different from ISO format, you could also implement specific strptime directives.
from datetime import date, time, datetime

def hasTime(s):
    """
    Parameters
    ----------
    s : string
        ISO 8601 formatted date / datetime string.

    Returns
    -------
    tuple, (bool, datetime.datetime).
        boolean will be True if input specifies a time, otherwise False.
    """
    try:
        return False, datetime.combine(date.fromisoformat(s), time.min)
    except ValueError:
        return True, datetime.fromisoformat(s)
        # do nothing else here; will raise an error if input can't be parsed

for t in ('2020-10-07', '2020-10-07T00:00:00', 'not-a-date'):
    print(t, hasTime(t))

# output:
# >>> 2020-10-07 (False, datetime.datetime(2020, 10, 7, 0, 0))
# >>> 2020-10-07T00:00:00 (True, datetime.datetime(2020, 10, 7, 0, 0))
# >>> ValueError: Invalid isoformat string: 'not-a-date'
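Applied to the goal stated in the question (start of day for a bare start date, end of day for a bare end date, explicit times left untouched), the same try/except idea could be sketched as follows. The function name expand_range is hypothetical, and the sketch assumes ISO 8601 input and Python 3.7 or later for fromisoformat.

```python
from datetime import date, datetime, time


def expand_range(start_text, end_text):
    """Return (start, end) datetimes, filling in times only for bare dates."""

    def parse(text, fill_time):
        try:
            # Succeeds only for a standalone date such as '2020-10-07'.
            return datetime.combine(date.fromisoformat(text), fill_time)
        except ValueError:
            # The string carries a time (or is invalid, which raises here).
            return datetime.fromisoformat(text)

    return parse(start_text, time.min), parse(end_text, time.max)


print(expand_range('2020-10-07', '2020-10-07'))
print(expand_range('2020-10-07T00:00:00', '2020-10-07T00:00:00'))
```

The first call should yield a range covering the whole day, while the second keeps both values at midnight because the time was given explicitly.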
There seems to be a lot of confusion online on doing a very basic thing: create a datetime object with UTC timezone given seconds since unix epoch in the UTC timezone. Basically, I always want to work in absolute time/UTC.
I'm using python 3.5 (the latest right now) and want to simply get a datetime object in the context of UTC (+0/Zulu offset) from a floating point value of elapsed seconds since 1970 Jan 01.
This is wrong since the first time is created in my local timezone, and then I attempt to switch to UTC.
import datetime
import pytz
dt = datetime.datetime.fromtimestamp(my_seconds).replace(tzinfo=pytz.UTC)
Python provides the utcfromtimestamp method for just that case:
import datetime
seconds = 0
utcdate_from_timestamp = datetime.datetime.utcfromtimestamp(seconds)
If my_seconds is a POSIX timestamp then to convert it to datetime in Python 3:
#!/usr/bin/env python3
from datetime import datetime, timedelta, timezone
utc_dt = datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(seconds=my_seconds)
utc_dt = datetime.fromtimestamp(my_seconds, timezone.utc)
naive_utc_dt = datetime.utcfromtimestamp(my_seconds)
If your local timezone is "right" (non-POSIX) then only the first formula is correct (the others interpret my_seconds as TAI timestamp with datetime(1970, 1, 1, 0, 0, 10) TAI epoch in this case).
The first formula is more portable and may support a wider input range than the others.
The results of the 1st and 2nd expressions may differ due to rounding errors on some Python versions.
The 2nd and 3rd calls should differ only by the tzinfo attribute: the latter returns a naive datetime object (.tzinfo is None). You should prefer timezone-aware datetime objects to avoid ambiguity.
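A short self-contained check of the two timezone-aware formulas, using an arbitrary sample value for my_seconds (an assumption made for illustration). utcfromtimestamp is left out of the sketch because it returns a naive object and has been deprecated since Python 3.12.

```python
from datetime import datetime, timedelta, timezone

my_seconds = 1611590898.5  # sample POSIX timestamp, chosen for illustration

# Formula 1: explicit arithmetic from the epoch.
by_arithmetic = datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(seconds=my_seconds)

# Formula 2: let fromtimestamp build the aware UTC datetime directly.
by_fromtimestamp = datetime.fromtimestamp(my_seconds, timezone.utc)

print(by_arithmetic.isoformat())
# 2021-01-25T16:08:18.500000+00:00
print(by_arithmetic == by_fromtimestamp)
```

Both results carry tzinfo=timezone.utc, so they can be compared with other aware datetimes or converted back with .timestamp() without depending on the local time zone.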