Rounding datetime to the nearest hour - python-3.x

I have a question very similar to this one and this one but I'm stuck on some rounding issue.
I have a time series from a netCDF file and I'm trying to convert it to a datetime format. The time values are given in 'days since 1990-01-01 00:00:00'. Eventually I want output in the format .strftime('%Y%m%d.%H%M'). For example, I read my netCDF file as follows:
import netCDF4
import numpy as np
nc = netCDF4.Dataset(file_name)
time = np.array(nc['time'][:])
I then have
In [180]: time[0]
Out[180]: 365
In [181]: time[1]
Out[181]: 365.04166666651145
I then did
In [182]: start = datetime.datetime(1990,1,1)
In [183]: delta = datetime.timedelta(time[1])
In [184]: new_time = start + delta
In [185]: print(new_time.strftime('%Y%m%d.%H%M'))
19910101.0059
Is there a way to "round" to the nearest hour so I get 19910101.0100?

You can round down with datetime.replace(), and round up by adding an hour to the rounded down value using datetime.timedelta(hours=1).
import datetime
def round_to_hour(dt):
    dt_start_of_hour = dt.replace(minute=0, second=0, microsecond=0)
    dt_half_hour = dt.replace(minute=30, second=0, microsecond=0)
    if dt >= dt_half_hour:
        # round up
        dt = dt_start_of_hour + datetime.timedelta(hours=1)
    else:
        # round down
        dt = dt_start_of_hour
    return dt
Note that because we're using replace(), the fields we're not replacing (such as the timezone, tzinfo) are preserved.

I don't think datetime provides a way to round times; you'll have to write the code to do that yourself. Something like this should work:
import datetime
def round_to_hour(dt):
    # Shift forward by half an hour (in seconds), then truncate to the start of the hour
    round_delta = 60 * 30
    round_timestamp = dt.timestamp() + round_delta
    round_dt = datetime.datetime.fromtimestamp(round_timestamp)
    return round_dt.replace(microsecond=0, second=0, minute=0)
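As a minimal sketch of the same "add half an hour, then truncate" idea applied to the value from the question, the version below stays in datetime arithmetic instead of going through a POSIX timestamp. It assumes that times exactly on the half hour should round up.

```python
import datetime

def round_to_hour(dt):
    # Adding 30 minutes moves anything at or past the half hour into the next hour,
    # so truncating afterwards rounds to the nearest hour (ties round up).
    shifted = dt + datetime.timedelta(minutes=30)
    return shifted.replace(minute=0, second=0, microsecond=0)

# Value taken from the question: days since 1990-01-01 00:00:00
start = datetime.datetime(1990, 1, 1)
new_time = start + datetime.timedelta(days=365.04166666651145)

print(round_to_hour(new_time).strftime('%Y%m%d.%H%M'))  # 19910101.0100
```

Because only timedelta addition and replace() are used, a tzinfo attached to dt is carried through unchanged.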

Related

Transform timestamp (seconds since 1.1.1904) from NetCDF file

Even though there are several posts concerning NetCDF files and timestamp conversion, I'm drawing a blank today.
I read in a NetCDF data set (version 3) and then print the time variable's information:
# Load required Python packages
import netCDF4 as nc
import matplotlib.pyplot as plt
from datetime import datetime, timedelta
import pandas as pd
#read in a NetCDF data set
ds = nc.Dataset(fn)
# call time variable information
print(ds['time'])
As output I get:
<class 'netCDF4._netCDF4.Variable'>
float64 time(time)
units: seconds since 1904-01-01 00:00:00.000 00:00
long_name: time UTC
axis: T
unlimited dimensions: time
current shape = (5760,)
filling on, default _FillValue of 9.969209968386869e+36 used
Now I would like to transform the seconds since 1.1.1904 time stamp into a DD.MM.YYYY HH:MM:SS.sss format. (by the way: why is there a second 00:00 information included after the time stamp?)
(1) I tried:
t = ds['time'][:]
dtime = []
dtime = (pd.to_datetime(t, format='%d.%m.%Y %H:%M:%S.micros') - datetime(1904, 1, 1)).total_seconds()
And I get the error:
pandas_libs\tslibs\strptime.pyx in pandas._libs.tslibs.strptime.array_strptime()
time data '3730320000' does not match format '%d.%m.%Y %H:%M:%S' (match)
(2) I tried:
d = datetime.strptime("01-01-1904", "%m-%d-%Y")
dt = d + timedelta(seconds=(t))
I get the
TypeError: unsupported type for timedelta seconds component: MaskedArray
(3) I tried
d = datetime.strptime("%m-%d-%Y", "01-01-1904")
dt = d + timedelta(seconds=(ds['time']))
And I get the answer:
unsupported type for timedelta seconds component: netCDF4._netCDF4.Variable
Does somebody have a clearer view of the solution than I have at the moment?
Thanks,
Swawa
The NetCDF4 python library has a method for this: num2date().
https://unidata.github.io/netcdf4-python/#num2date. No need for datetime module.
NetCDF4 variables contain metadata attributes which describe the variable as seen in the output to your print:
print(ds['time']) #In particular the time variable units attribute.
# t contains just the numeric values of the time in `seconds since 1904-01-01 00:00:00.000 00:00`
t = ds['time'][:]
dtime = []
# t_var is the NetCDF4 variable which has the `units` attribute.
t_var = ds['time']
#dtime = (pd.to_datetime(t, format='%d.%m.%Y %H:%M:%S.micros') - datetime(1904, 1, 1)).total_seconds()
dtime = nc.num2date(t, t_var.units)
The above should give you all the times in dtime as an array of datetime-like objects.
print(dtime[0].isoformat())
print(dtime[-1].isoformat())
A simpler way would be:
dtime = nc.num2date(ds['time'][:], ds['time'].units)
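For comparison, attempt (2) in the question fails only because timedelta accepts a single number, not a whole array. A minimal standard-library sketch, assuming a plain UTC epoch of 1904-01-01 and using made-up sample values (the first one is the number from the error message), converts the values one at a time and prints them in the requested DD.MM.YYYY HH:MM:SS.sss layout:

```python
from datetime import datetime, timedelta

# Assumption: the epoch is 1904-01-01 00:00:00 with no timezone offset
EPOCH = datetime(1904, 1, 1)

def seconds_to_datetimes(seconds_values):
    # timedelta() takes one scalar at a time, so convert element by element
    return [EPOCH + timedelta(seconds=float(s)) for s in seconds_values]

def format_ms(dt):
    # %f yields microseconds (6 digits); dropping the last 3 leaves milliseconds
    return dt.strftime('%d.%m.%Y %H:%M:%S.%f')[:-3]

sample = [3730320000, 3730320000.5]  # hypothetical stand-in for ds['time'][:]
for dt in seconds_to_datetimes(sample):
    print(format_ms(dt))
```

With real data, any masked (missing) entries would need to be filtered out first, which this sketch does not handle.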

Comparing two times in Hours using python

I have two timestamps with a time zone designator, formatted as '%Y-%m-%dT%H:%M:%SZ':
Given_Time = '2020-02-12T02:12:12Z'
current_time = '2020-02-11T06:22:42Z'
I want to compare these two times in hours, i.e. find how many hours the given time is ahead of or behind the current time.
import datetime
def time_diff(x, y):
    x = datetime.datetime.strptime(x,'%Y-%m-%dT%H:%M:%SZ')
    y = datetime.datetime.strptime(y,'%Y-%m-%dT%H:%M:%SZ')
    return (x-y).total_seconds()/3600
time_diff(Given_Time, current_time)
19.825
time_diff(current_time, Given_Time)
-19.825
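The trailing Z marks both strings as UTC, but strptime with a literal 'Z' in the format returns naive datetimes. A small sketch that attaches UTC explicitly is shown below; the helper names parse_utc and hours_between are illustrative, not part of the answer above.

```python
from datetime import datetime, timezone

FMT = '%Y-%m-%dT%H:%M:%SZ'

def parse_utc(text):
    # The literal 'Z' is matched by the format; mark the result as UTC explicitly
    return datetime.strptime(text, FMT).replace(tzinfo=timezone.utc)

def hours_between(later, earlier):
    # Positive when `later` is after `earlier`, negative otherwise
    return (parse_utc(later) - parse_utc(earlier)).total_seconds() / 3600

print(hours_between('2020-02-12T02:12:12Z', '2020-02-11T06:22:42Z'))
```

Keeping the datetimes timezone-aware means they can later be compared safely with timestamps from other zones.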

Duration Calculator in Python

I have been teaching myself Python for about a month.
I want to make a duration calculator that shows the total of several durations.
For instance, there are two different flights I have to take, and I want to get the total time I would spend on the airplanes. It goes like this:
a = input('Enter the duration: ') #11h40m
b = input('Enter the duration: ') #13h54m
# it should show the total duration
01d01h34m
Try this:
Edit: I tried to use strftime to format the 'duration' but had some issues with the day part,
so I did it manually (you can format it any way you wish).
import datetime
import time
# Convert str to strptime
a_time = datetime.datetime.strptime("11h40m", "%Hh%Mm")
b_time = datetime.datetime.strptime("13h54m", "%Hh%Mm")
# Convert to timedelta
a_delta = datetime.timedelta(hours = a_time.hour,minutes=a_time.minute)
b_delta = datetime.timedelta(hours = b_time.hour,minutes=b_time.minute)
duration = (a_delta + b_delta)
print(str(duration.days) + time.strftime('d%Hh%Mm', time.gmtime(duration.seconds)))
'1d01h34m'
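One limitation of parsing with "%Hh%Mm" is that %H only accepts hours from 0 to 23, so an input such as 30h05m would be rejected. A hedged alternative sketch parses the pieces with a regular expression instead; the helper names parse_duration and format_duration are made up for this example.

```python
import re
from datetime import timedelta

def parse_duration(text):
    # Accepts optional day, hour and minute parts, e.g. '11h40m', '2d3h', '45m'
    match = re.fullmatch(r'(?:(\d+)d)?(?:(\d+)h)?(?:(\d+)m)?', text.strip())
    if not match or not any(match.groups()):
        raise ValueError('unrecognised duration: %r' % text)
    days, hours, minutes = (int(g) if g else 0 for g in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes)

def format_duration(delta):
    # Break the total number of minutes into days, hours and minutes
    total_minutes = int(delta.total_seconds()) // 60
    days, remainder = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(remainder, 60)
    return '%02dd%02dh%02dm' % (days, hours, minutes)

total = parse_duration('11h40m') + parse_duration('13h54m')
print(format_duration(total))  # 01d01h34m
```

Since timedelta objects add directly, summing any number of flights is just sum(durations, timedelta()).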

How to calculate total precipitation per day using hourly data for whole year?

I have hourly data from ERA5 for each day in a specific year. I want to convert that data from hourly to daily. I know the long and hard way to do it, but I need something that does it easily.
Copernicus has code for this here https://confluence.ecmwf.int/display/CKB/ERA5%3A+How+to+calculate+daily+total+precipitation, which works fine if the data set is converted for only one day, but I am having problems when converting a whole year.
The ERA5 dataset can be downloaded from https://cds.climate.copernicus.eu/cdsapp#!/home
The steps for using the Copernicus server are described here:
https://confluence.ecmwf.int/display/CKB/How+to+download+ERA5
This script downloads the hourly data for only 2 days (1st and 2nd of January 2017):
#!/usr/bin/env python
"""
Save as get-tp.py, then run "python get-tp.py".
Input file : None
Output file: tp_20170101-20170102.nc
"""
import cdsapi
c = cdsapi.Client()
r = c.retrieve(
    'reanalysis-era5-single-levels', {
        'variable'    : 'total_precipitation',
        'product_type': 'reanalysis',
        'year'        : '2017',
        'month'       : '01',
        'day'         : ['01', '02'],
        'time'        : [
            '00:00','01:00','02:00',
            '03:00','04:00','05:00',
            '06:00','07:00','08:00',
            '09:00','10:00','11:00',
            '12:00','13:00','14:00',
            '15:00','16:00','17:00',
            '18:00','19:00','20:00',
            '21:00','22:00','23:00'
        ],
        'format'      : 'netcdf'
    })
r.download('tp_20170101-20170102.nc')
## Add multiple days and multiple months to download more data
The script below creates a netCDF file for only one day:
#!/usr/bin/env python
"""
Save as file calculate-daily-tp.py and run "python calculate-daily-tp.py".
Input file : tp_20170101-20170102.nc
Output file: daily-tp_20170101.nc
"""
import time, sys
from datetime import datetime, timedelta
from netCDF4 import Dataset, date2num, num2date
import numpy as np
day = 20170101
d = datetime.strptime(str(day), '%Y%m%d')
f_in = 'tp_%d-%s.nc' % (day, (d + timedelta(days = 1)).strftime('%Y%m%d'))
f_out = 'daily-tp_%d.nc' % day
time_needed = []
for i in range(1, 25):
    time_needed.append(d + timedelta(hours = i))
with Dataset(f_in) as ds_src:
    var_time = ds_src.variables['time']
    time_avail = num2date(var_time[:], var_time.units,
            calendar = var_time.calendar)
    indices = []
    for tm in time_needed:
        a = np.where(time_avail == tm)[0]
        if len(a) == 0:
            sys.stderr.write('Error: precipitation data is missing/incomplete - %s!\n'
                    % tm.strftime('%Y%m%d %H:%M:%S'))
            sys.exit(200)
        else:
            print('Found %s' % tm.strftime('%Y%m%d %H:%M:%S'))
            indices.append(a[0])
    var_tp = ds_src.variables['tp']
    tp_values_set = False
    for idx in indices:
        if not tp_values_set:
            data = var_tp[idx, :, :]
            tp_values_set = True
        else:
            data += var_tp[idx, :, :]
    with Dataset(f_out, mode = 'w', format = 'NETCDF3_64BIT_OFFSET') as ds_dest:
        # Dimensions
        for name in ['latitude', 'longitude']:
            dim_src = ds_src.dimensions[name]
            ds_dest.createDimension(name, dim_src.size)
            var_src = ds_src.variables[name]
            var_dest = ds_dest.createVariable(name, var_src.datatype, (name,))
            var_dest[:] = var_src[:]
            var_dest.setncattr('units', var_src.units)
            var_dest.setncattr('long_name', var_src.long_name)
        ds_dest.createDimension('time', None)
        var = ds_dest.createVariable('time', np.int32, ('time',))
        time_units = 'hours since 1900-01-01 00:00:00'
        time_cal = 'gregorian'
        var[:] = date2num([d], units = time_units, calendar = time_cal)
        var.setncattr('units', time_units)
        var.setncattr('long_name', 'time')
        var.setncattr('calendar', time_cal)
        # Variables
        var = ds_dest.createVariable(var_tp.name, np.double, var_tp.dimensions)
        var[0, :, :] = data
        var.setncattr('units', var_tp.units)
        var.setncattr('long_name', var_tp.long_name)
        # Attributes
        ds_dest.setncattr('Conventions', 'CF-1.6')
        ds_dest.setncattr('history', '%s %s'
                % (datetime.now().strftime('%Y-%m-%d %H:%M:%S'),
                ' '.join(time.tzname)))
        print('Done! Daily total precipitation saved in %s' % f_out)
What I want is code that follows the same steps as the script above, but takes an input file with one year of hourly data and converts it to one year of daily data.
The result should be daily values of the calculated variable (such as precipitation) for the whole year.
Example: say I have precipitation data of 1 mm/hr for every hour of the year; that gives 8760 hourly values for a non-leap year.
What I want is 24 mm/day for the whole year, with only 365 values for a non-leap year.
Example input dataset: a subset of the data (for 1st and 2nd January 2017) can be downloaded from https://www.dropbox.com/sh/0vdfn20p355st3i/AABKYO4do_raGHC34VnsXGPqa?dl=0. Just use the 2nd script above to check the code. (The data for the whole year is >10 GB and thus can't be uploaded.)
Thanks in advance.
xarray's resample is just the tool for you. It converts netCDF data from one temporal resolution (e.g. hourly) to another (e.g. daily) in one line. Using your sample data file, we can create daily means with the following code:
import xarray as xr
ds = xr.open_dataset('./tp_20170101-20170102.nc')
tp = ds['tp'] # dimensions [time: 48, latitude: 721, longitude: 1440]
tp_daily = tp.resample(time='D').mean(dim='time') # dimensions (time: 2, latitude: 721, longitude: 1440)
You'll see that the resample command takes in a temporal code, in this case 'D' which means daily and then we specify that we want to compute the mean for each day using the hourly data of that day with .mean(dim='time').
If instead, for example, you wanted to compute the daily max rather than the daily mean, you'd replace .mean(dim='time') with .max(dim='time'). You can also go from hourly to monthly (MS or month-start), annual (AS or annual-start), and many more. The temporal frequency codes can be found in the Pandas docs.
An alternative quick method from the command line using CDO would be:
cdo daysum -shifttime,-1hour era5_hourly.nc era5_daily.nc
Note that, as per the answer and discussion in "Calculating ERA5 Daily Total Precipitation using CDO", the ERA5 hourly data has its timestamp at the end of the hourly window, so you need to shift the timestamps back by one hour before computing the sum. I'm not sure the xarray solution handles that. Also, to get mm/day, I think one needs to sum rather than take the mean.
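The shift-then-sum logic can be illustrated without any netCDF tooling. The sketch below uses only the standard library and synthetic values (1.0 per hour, timestamps at the end of each accumulation window, as described for ERA5); daily_totals is a hypothetical helper, not part of CDO or xarray.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def daily_totals(hourly):
    # hourly: iterable of (timestamp, value) pairs, each timestamp marking
    # the END of the one-hour accumulation window
    totals = defaultdict(float)
    for stamp, value in hourly:
        # Shift back one hour so the 00:00 value counts toward the previous day
        day = (stamp - timedelta(hours=1)).date()
        totals[day] += value
    return dict(sorted(totals.items()))

# Synthetic data: 48 hourly values of 1.0, stamped 2017-01-01 01:00 to 2017-01-03 00:00
start = datetime(2017, 1, 1)
hourly = [(start + timedelta(hours=h), 1.0) for h in range(1, 49)]

for day, total in daily_totals(hourly).items():
    print(day, total)
```

Without the one-hour shift, the value stamped 2017-01-03 00:00 would be assigned to a third day and the 2017-01-01 total would be one hour short.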

How to measure interval time with python

I have data in the following format:
start = '10:00:00'
end = '12:05:00'
I want to show the interval in a simple form such as "2 hours 5 minutes 0 seconds".
For this I've used datetime and timedelta.
Code:
from datetime import datetime, timedelta
start = '10:00:00'
end = '12:05:00'
FMT = '%H:%M:%S'
interval_time = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
print(interval_time)
if interval_time.days < 0:
    tdiff = timedelta(days=0,
                      seconds=interval_time.seconds, microseconds=interval_time.microseconds)
    print(tdiff)
This condition is false because the interval has no negative day component, and in any case I want to report hours rather than days.
Any help would be appreciated.
Thanks
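One possible approach, sketched here under the assumption that an end time earlier than the start time means the interval runs past midnight, is to split the total number of seconds with divmod. The function name interval_parts is invented for this example.

```python
from datetime import datetime, timedelta

FMT = '%H:%M:%S'

def interval_parts(start, end):
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    if delta < timedelta(0):
        # Assumption: a negative difference means `end` falls on the next day
        delta += timedelta(days=1)
    total_seconds = int(delta.total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return hours, minutes, seconds

hours, minutes, seconds = interval_parts('10:00:00', '12:05:00')
print('%d hours %d minutes %d seconds' % (hours, minutes, seconds))
# 2 hours 5 minutes 0 seconds
```

Working from total_seconds() rather than the days and seconds attributes avoids the special case for negative timedeltas.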
