How to determine the appropriate timezone to apply for historical dates in a given region in python3 - python-3.x

I'm using python3 on Ubuntu 20.04.
I have a trove of files with naive datetime strings in them, dating back more than 20 years. I know that all of these datetimes are in the Pacific Timezone. I would like to convert them all to UTC datetimes.
However, whether they are relative to PDT or PST is a bigger question. Since the dates on which PDT/PST switches occur have changed over the last 20 years, a simple day/month threshold is not enough to decide whether to apply the PDT or PST offset. Is there an elegant way to make this determination and apply it?

Note upfront, for Python 3.9+: use zoneinfo from the standard library, no need anymore for a third party library. Example.
Here's what you can do to set the timezone and convert to UTC. dateutil will take DST changes from the IANA database.
from datetime import datetime
import dateutil.tz  # import the submodule explicitly; a bare "import dateutil" may not expose dateutil.tz
datestrings = ['1991-04-06T00:00:00', # PST
'1991-04-07T04:00:00', # PDT
'1999-10-30T00:00:00', # PDT
'1999-10-31T02:01:00', # PST
'2012-03-11T00:00:00', # PST
'2012-03-11T02:00:00'] # PDT
# to naive datetime objects
dateobj = [datetime.fromisoformat(s) for s in datestrings]
# set timezone:
tz_pacific = dateutil.tz.gettz('US/Pacific')
dtaware = [d.replace(tzinfo=tz_pacific) for d in dateobj]
# with pytz use localize() instead of replace
# check if has DST:
# for d in dtaware: print(d.dst())
# 0:00:00
# 1:00:00
# 1:00:00
# 0:00:00
# 0:00:00
# 1:00:00
# convert to UTC:
dtutc = [d.astimezone(dateutil.tz.UTC) for d in dtaware]
# check output
# for d in dtutc: print(d.isoformat())
# 1991-04-06T08:00:00+00:00
# 1991-04-07T11:00:00+00:00
# 1999-10-30T07:00:00+00:00
# 1999-10-31T10:01:00+00:00
# 2012-03-11T08:00:00+00:00
# 2012-03-11T09:00:00+00:00
Now if you'd like to be absolutely sure that DST (PDT vs. PST) is set correctly, you'd have to set up test cases and verify them against the IANA database, I guess...
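For Python 3.9+, the same conversion can be sketched with zoneinfo from the standard library. This is only a minimal sketch under some assumptions: the helper name pacific_to_utc is made up for illustration, "America/Los_Angeles" is assumed to be the IANA key you want in place of "US/Pacific", and time zone data must be available (from the operating system or the tzdata package). Comparing tzname() against dates whose PST/PDT status you already know is one simple way to build test cases for the DST handling.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library since Python 3.9

# Assumption: America/Los_Angeles is the IANA zone that covers the data
PACIFIC = ZoneInfo("America/Los_Angeles")


def pacific_to_utc(datestring):
    """Hypothetical helper: read a naive ISO string as Pacific wall time.

    Returns the zone abbreviation that applied on that date and the
    equivalent aware datetime in UTC.
    """
    aware = datetime.fromisoformat(datestring).replace(tzinfo=PACIFIC)
    return aware.tzname(), aware.astimezone(timezone.utc)


for s in ["1991-04-06T00:00:00", "1999-10-30T00:00:00", "2012-03-11T00:00:00"]:
    name, utc = pacific_to_utc(s)
    print(s, name, utc.isoformat())
```

Unlike pytz, zoneinfo works with replace(tzinfo=...) directly, so no localize() step is needed.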

Related

Change locale for Google Colab

I want to change the locale setting (to change the date format) in Google Colab.
The following works for me in Jupyter Notebook but not in Google Colab:
locale.setlocale(locale.LC_TIME, 'de_DE.UTF-8')
It always returns the error: unsupported locale setting
I have already looked at many other solutions and tried everything.
One solution to change only the time zone I have seen is this one:
!rm /etc/localtime
!ln -s /usr/share/zoneinfo/Asia/Bangkok /etc/localtime
!date
I figured this one out after a long time:
In Colab, you will have to install the desired locales. You do this with:
!sudo dpkg-reconfigure locales
This will prompt for a numeric input, e.g. 268 and 269 for Hungarian.
So you enter 268 269.
After installation, it will also prompt for the default locale. Here you need to select your desired custom locale. This time it is a numeric selection out of 3-5 options, depending on how many locales you selected in the previous step. In my case, I selected 3, and the default locale became hu_HU.
You need to restart the Colab runtime: Ctrl + M, then the . key.
You need to activate the locale:
import locale
locale.setlocale(locale.LC_ALL, 'hu_HU')  # make sure you do it for the LC_ALL context
The custom locale is now ready to use with pandas:
pd.to_datetime('2021-01-01').day_name() returns Friday, but
pd.to_datetime('2021-01-01').day_name('hu_HU') returns Péntek
I wasn't successful using German locale on Google Colab, but desired formatting could be obtained as a combination of overriding locale for decimal separator and date formatting.
German formatting rules can be found here.
For custom string formatting nice cheatsheet is here.
from datetime import datetime, timedelta
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import locale
german_format_str_full = '%Y-%m-%d, %H.%M Uhr'
german_format_str_date = '%Y-%m-%d'
# generating plot data; xs are dates with a non-obvious step
xs = np.arange(datetime(year=2021, month=11, day=28, hour=23, minute=59, second=59),
datetime(year=2021, month=12, day=6, hour=23, minute=59, second=59),
timedelta(hours=5,minutes=47,seconds=27))
ys = np.sin(np.arange(0,len(xs),1)) # whatever
# use overwritten locale for comma as decimal point -- German formatting
plt.rcParams['axes.formatter.use_locale'] = True
locale._override_localeconv["decimal_point"]= ','
# plot
fig, ax = plt.subplots(figsize=(9,4))
ax.plot(xs,ys, 'o-')
# set formatting string using mdates from matplotlib
ax.xaxis.set_major_formatter(mdates.DateFormatter(german_format_str_date))
# rotate formatted ticks or use autoformat 'fig.autofmt_xdate()'
plt.xticks(rotation=70)
plt.title('Google Colab plot with German locale style')
plt.show()
It gives me this plot:
If you need to check what the formatting settings look like on your machine, you can use locale.nl_langinfo(locale.D_T_FMT). For example:
import locale
from datetime import datetime
now = datetime.now()
# find local date time formatting on Google Colab
local_format_str = locale.nl_langinfo(locale.D_T_FMT)
print('local_format_str on Google Colab: ', local_format_str)
print('now in Google Colab default format:', now.strftime(local_format_str))
german_format_str_full = '%Y-%m-%d, %H.%M Uhr'
german_format_str_date = '%Y-%m-%d'
print('now in German format, full:',now.strftime(german_format_str_full))
print('now in German format, only date:',now.strftime(german_format_str_date))
ridiculous_format = '%Y->%m-->%d'
print('now ridiculous_format:',now.strftime(ridiculous_format))
Based on this answer I was able to load German locales. However, it needs to be done in two steps: first install the new German locale, then restart the kernel and load the German locale.
In short:
import os
# Install de_DE
!/usr/share/locales/install-language-pack de_DE
!dpkg-reconfigure locales
# Restart Python process to pick up the new locales
os.kill(os.getpid(), 9)
More detailed version:
It turned out that the list of available locales is pretty short which can be checked like this:
import locale
from datetime import datetime
now = datetime.now()
# find local date time formatting on Google Colab
local_format_str = locale.nl_langinfo(locale.D_T_FMT)
print('local_format_str on Google Colab: ', local_format_str)
print('now in Google Colab default format:', now.strftime(local_format_str))
print('Loading avaliable locales via real names...')
for real_name in set(locale.locale_alias.values()):
    try:
        locale.setlocale(locale.LC_ALL, real_name)
        print('success: real_name = ', real_name)
    except locale.Error:
        # locale is not installed on this machine, skip it
        pass
print('Loading avaliable locales via aliases...')
for alias, real_name in locale.locale_alias.items():
    try:
        locale.setlocale(locale.LC_ALL, alias)
        print('success: alias = ', alias, ' , real_name = ', real_name)
    except locale.Error:
        # alias does not resolve to an installed locale, skip it
        pass
With output:
local_format_str on Google Colab: %a %b %e %H:%M:%S %Y
now in Google Colab default format: Wed Dec 1 12:10:52 2021
Loading avaliable locales via real names...
success: real_name = en_US.UTF-8
success: real_name = C
Loading avaliable locales via aliases...
As we can see, there is no German locale, so it needs to be installed with this code:
import os
# Install de_DE
!/usr/share/locales/install-language-pack de_DE
!dpkg-reconfigure locales
# Restart Python process to pick up the new locales
os.kill(os.getpid(), 9)
giving an output:
Generating locales (this might take a while)...
de_DE.ISO-8859-1... done
Generation complete.
dpkg-trigger: error: must be called from a maintainer script (or with a --by-package option)
Type dpkg-trigger --help for help about this utility.
Generating locales (this might take a while)...
de_DE.ISO-8859-1... done
en_US.UTF-8... done
Generation complete.
Then we load the German locale with locale.setlocale(locale.LC_ALL, 'german'), and the same code as at the beginning (remember to import the packages again) gives us:
Loading avaliable locales via real names...
success: real_name = C
success: real_name = en_US.UTF-8
success: real_name = de_DE.ISO8859-1
Loading avaliable locales via aliases...
success: alias = deutsch , real_name = de_DE.ISO8859-1
success: alias = german , real_name = de_DE.ISO8859-1
and the default formatting is more German:
local_format_str on Google Colab: %a %d %b %Y %T %Z
now in Google Colab default format: Mi 01 Dez 2021 12:12:03
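Because locale.setlocale raises locale.Error for any locale that is not installed, it can help to try several candidate names and fall back gracefully instead of failing with "unsupported locale setting". The following is a minimal sketch; the helper name try_set_locale and the candidate list are assumptions for illustration, and which names succeed depends entirely on the locales generated on the machine.

```python
import locale


def try_set_locale(candidates, category=locale.LC_TIME):
    """Hypothetical helper: activate the first installed locale from candidates.

    Returns the locale string reported by setlocale, or None if none of the
    candidates is available on this machine.
    """
    for name in candidates:
        try:
            return locale.setlocale(category, name)
        except locale.Error:
            # not installed (or not a valid name); try the next candidate
            continue
    return None


# Assumed candidate names; on a fresh Colab runtime the German ones may all fail
active = try_set_locale(["de_DE.UTF-8", "de_DE", "german", "C"])
print("active LC_TIME locale:", active)
```

If the function returns "C" here, the German locale still has to be installed and the runtime restarted as described above.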

converting time from UTC to CST

I am trying to convert UTC time to CST. But I am not getting the output as expected.
Below is my code:
import datetime
import pytz
fmt = '%Y-%m-%d %H:%M:%S %Z%z'
e = pytz.timezone('US/Central')
time_from_utc = datetime.datetime.utcfromtimestamp(int(1607020200))
time_from = time_from_utc.astimezone(e)
time_from.strftime(fmt)
time_to_utc = datetime.datetime.utcfromtimestamp(int(1609785000))
time_to = time_to_utc.astimezone(tz=pytz.timezone('US/Central'))
print(time_from_utc)
print(time_from)
print(time_to_utc)
print(time_to)
Here is the output:
(base) ranjeet#casper:~/Desktop$ python3 ext.py
2020-12-03 18:30:00
2020-12-03 07:00:00-06:00
2021-01-04 18:30:00
2021-01-04 07:00:00-06:00
I was expecting that after conversion, I should get time corresponding to the time of UTC i.e.
2020-12-03 18:30:00
2020-12-03 12:30:00-06:00
since CST is -6 Hours from UTC.
Any help is appreciated.
The problem is that
time_from_utc = datetime.datetime.utcfromtimestamp(int(1607020200))
gives you a naive datetime object - which Python treats as local time by default. Then, in
time_from = time_from_utc.astimezone(e)
things go wrong since time_from_utc is treated as local time. Instead, set UTC explicitly when calling fromtimestamp:
from datetime import datetime, timezone
import pytz
fmt = '%Y-%m-%d %H:%M:%S %Z%z'
e = pytz.timezone('US/Central')
time_from_utc = datetime.fromtimestamp(1607020200, tz=timezone.utc)
time_from = time_from_utc.astimezone(e)
time_from.strftime(fmt)
time_to_utc = datetime.fromtimestamp(1609785000, tz=timezone.utc)
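time_to = time_to_utc.astimezone(tz=pytz.timezone('US/Central'))
# print the results, as in the question
print(time_from_utc)
print(time_from)
print(time_to_utc)
print(time_to)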
which will give you
2020-12-03 18:30:00+00:00
2020-12-03 12:30:00-06:00
2021-01-04 18:30:00+00:00
2021-01-04 12:30:00-06:00
Final Remarks: with Python 3.9, you have zoneinfo, so you don't need a third party library for handling of time zones. Example usage.
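As a rough illustration of the zoneinfo route, here is a sketch of the same conversion without pytz. It assumes Python 3.9+, available time zone data, and "America/Chicago" as the IANA key standing in for "US/Central"; the helper name utc_timestamp_to_central is made up for this example.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library since Python 3.9

# Assumption: America/Chicago is used as the IANA key for US Central time
CENTRAL = ZoneInfo("America/Chicago")
FMT = '%Y-%m-%d %H:%M:%S %Z%z'


def utc_timestamp_to_central(ts):
    """Hypothetical helper: Unix timestamp -> aware datetime in US Central."""
    # tz=timezone.utc makes the result aware, so astimezone() converts from UTC
    return datetime.fromtimestamp(ts, tz=timezone.utc).astimezone(CENTRAL)


for ts in (1607020200, 1609785000):
    print(utc_timestamp_to_central(ts).strftime(FMT))
```

The key point is the same as in the pytz version: the datetime is made aware of UTC before it is converted, so it is never interpreted as local time.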

NodeJs How to use moment-timezone to convert Epoch to Specific Timezone and format

I have a use-case in which I want to convert an Epoch integer value into a specific Timezone time and a specific format,
Also, I want to convert a human-readable date time into epoch.
I am trying to use moment-tz for the timezone conversion.
I am using a specific Epoch timestamp, 1555296000, which is
Monday, April 15, 2019 10:40:00 AM in Kuala Lumpur, Malaysia.
I am able to convert 2019-04-15 10:40:00 in the Asia/Kuala_Lumpur timezone into the correct Unix timestamp.
But I am unable to convert 1555296000 into another timezone's local time,
i.e. I wish to convert 1555296000 into the equivalent YYYY-MM-DD hh:mm:ss in the Asia/Calcutta timezone.
Following is the code I'm trying to work with -
var moment = require('moment-timezone');
console.log("Convert from Asia/Kuala_Lumpur to Unix -> ", moment.tz("2019-04-15 10:40:00","Asia/Kuala_Lumpur").unix());
// Outputs - 1555296000
console.log("Epoch to Specific TimeZone and Format -> ",moment(1555296000).tz("Asia/Calcutta").format('YYYY-MM-DD hh:mm:ss'));
// Outputs - 1970-01-19 05:31:36
// I want - 2019-04-15 08:10:00
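moment(1555296000) interprets the number as milliseconds since the epoch, which is why you end up in January 1970. Use moment.unix(), which takes seconds. Try this out: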
const moment = require("moment-timezone");
console.log(
moment
.unix(1555296000)
.tz("Asia/Calcutta")
.format("YYYY-MM-DD HH:mm:ss")
);
Output: 2019-04-15 08:10:00 in "Asia/Calcutta", which corresponds to 2019-04-15 10:40:00 in "Asia/Kuala_Lumpur".
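The seconds-versus-milliseconds mix-up behind the 1970 result is not specific to moment, and it can be reproduced in Python as a cross-check. This is only an illustrative sketch: it assumes Python 3.9+ with time zone data available and uses "Asia/Kolkata", the current IANA name for "Asia/Calcutta".

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # standard library since Python 3.9

KOLKATA = ZoneInfo("Asia/Kolkata")  # current IANA name for "Asia/Calcutta"
FMT = "%Y-%m-%d %H:%M:%S"
epoch_seconds = 1555296000

# Correct: the value is a count of seconds since the epoch
as_seconds = datetime.fromtimestamp(epoch_seconds, tz=KOLKATA).strftime(FMT)

# Wrong on purpose: treating the same number as milliseconds
as_millis = datetime.fromtimestamp(epoch_seconds / 1000, tz=KOLKATA).strftime(FMT)

print("as seconds:     ", as_seconds)
print("as milliseconds:", as_millis)
```

The second value matches the unexpected 1970 date from the question, which confirms that the original call was reading the timestamp in the wrong unit.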

Write date and variable to file

I am trying to write a variable and the date and time on the same line to a file, which will simulate a log file.
Example: July 25 2018 6:00 pm - Variable contents here
So far I am able to write the variable to the file but I am unsure how to use the datetime library or other similar libraries. Some guidance would be appreciated.
Below is the current script.
import subprocess
import datetime
var = "test"
with open('auditlog.txt', 'a') as logfile:
    logfile.write(var + "\n")
The fastest way I found is doing something like this:
import time
var = time.asctime()
print(var)
Result: Thu Jul 26 00:46:04 2018
If you want to change the placements of y/m/d etc. you can alternatively use this:
import time
var = time.strftime("%B %d %Y %I:%M %p", time.localtime())  # %I is the 12-hour clock, %p is AM/PM
print(var)
Result: July 26 2018 12:50 AM
Have a look here.
By the way, is the subprocess import intended in your code? You don't need it to open or write to files. Also, you don't need to call logfile.close() yourself: the with statement closes the file automatically when the block ends.
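Putting the pieces together, a log line in the format from the question could be built roughly like this. It is a sketch rather than a definitive implementation: the helper name format_log_line is made up, the lowercase am/pm suffix is assembled by hand to match the example in the question, and the month name from %B depends on the active locale.

```python
from datetime import datetime


def format_log_line(message, when=None):
    """Hypothetical helper: build 'July 25 2018 6:00 pm - message'."""
    when = when or datetime.now()
    hour = when.hour % 12 or 12               # 12-hour clock without zero padding
    suffix = "am" if when.hour < 12 else "pm"
    stamp = f"{when.strftime('%B')} {when.day} {when.year} {hour}:{when.minute:02d} {suffix}"
    return f"{stamp} - {message}"


var = "test"
# 'a' appends, so earlier log entries are kept; the file is closed automatically
with open('auditlog.txt', 'a') as logfile:
    logfile.write(format_log_line(var) + "\n")
```

For anything beyond a simple simulation, the standard logging module can add timestamps to each record for you.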

ubuntu linux removing date from timestamp from linux R [duplicate]

How would I extract the time from a series of POSIXct objects discarding the date part?
For instance, I have:
times <- structure(c(1331086009.50098, 1331091427.42461, 1331252565.99979,
1331252675.81601, 1331262597.72474, 1331262641.11786, 1331269557.4059,
1331278779.26727, 1331448476.96126, 1331452596.13806), class = c("POSIXct",
"POSIXt"))
which corresponds to these dates:
"2012-03-07 03:06:49 CET" "2012-03-07 04:37:07 CET"
"2012-03-09 01:22:45 CET" "2012-03-09 01:24:35 CET"
"2012-03-09 04:09:57 CET" "2012-03-09 04:10:41 CET"
"2012-03-09 06:05:57 CET" "2012-03-09 08:39:39 CET"
"2012-03-11 07:47:56 CET" "2012-03-11 08:56:36 CET"
Now, I have some values for a parameter measured at those times:
val <- c(1.25343125e-05, 0.00022890575,
3.9269125e-05, 0.0002285681875,
4.26353125e-05, 5.982625e-05,
2.09575e-05, 0.0001516951251,
2.653125e-05, 0.0001021391875)
I would like to plot val vs time of the day, irrespectively of the specific day when val was measured.
Is there a specific function that would allow me to do that?
You can use strftime to convert datetimes to any character format:
> t <- strftime(times, format="%H:%M:%S")
> t
[1] "02:06:49" "03:37:07" "00:22:45" "00:24:35" "03:09:57" "03:10:41"
[7] "05:05:57" "07:39:39" "06:47:56" "07:56:36"
But that doesn't help very much, since you want to plot your data. One workaround is to strip the date element from your times, and then to add an identical date to all of your times:
> xx <- as.POSIXct(t, format="%H:%M:%S")
> xx
[1] "2012-03-23 02:06:49 GMT" "2012-03-23 03:37:07 GMT"
[3] "2012-03-23 00:22:45 GMT" "2012-03-23 00:24:35 GMT"
[5] "2012-03-23 03:09:57 GMT" "2012-03-23 03:10:41 GMT"
[7] "2012-03-23 05:05:57 GMT" "2012-03-23 07:39:39 GMT"
[9] "2012-03-23 06:47:56 GMT" "2012-03-23 07:56:36 GMT"
Now you can use these datetime objects in your plot:
plot(xx, rnorm(length(xx)), xlab="Time", ylab="Random value")
For more help, see ?DateTimeClasses
The data.table package has a function 'as.ITime', which can do this efficiently. Use it as shown below:
library(data.table)
x <- "2012-03-07 03:06:49 CET"
as.IDate(x) # Output is "2012-03-07"
as.ITime(x) # Output is "03:06:49"
There have been previous answers that showed the trick. In essence:
you must retain POSIXct types to take advantage of all the existing plotting functions
if you want to 'overlay' several days' worth on a single plot, highlighting the intra-daily variation, the best trick is to ...
impose the same day (and month and even year if need be, which is not the case here)
which you can do by overriding the day-of-month and month components when in POSIXlt representation, or just by offsetting the 'delta' relative to 0:00:00 between the different days.
So with times and val as helpfully provided by you:
## impose month and day based on first obs
ntimes <- as.POSIXlt(times) # convert to 'POSIX list type'
ntimes$mday <- ntimes[1]$mday # and $mon if it differs too
ntimes <- as.POSIXct(ntimes) # convert back
par(mfrow=c(2,1))
plot(times,val) # old times
plot(ntimes,val) # new times
yields this contrasting the original and modified time scales:
Here's an update for those looking for a tidyverse method to extract hh:mm:ss.sssss from a POSIXct object. Note that the time zone is not included in the output.
library(hms)
as_hms(times)
Many solutions have been provided, but I have not seen this one, which uses package chron:
library(chron)
hours = times(strftime(times, format="%T"))
plot(val~hours)
(sorry, I am not entitled to post an image, you'll have to plot it yourself)
I can't find anything that deals with clock times exactly, so I'd just use some functions from package:lubridate and work with seconds-since-midnight:
require(lubridate)
clockS = function(t){hour(t)*3600+minute(t)*60+second(t)}
plot(clockS(times),val)
You might then want to look at some of the axis code to figure out how to label axes nicely.
The time_t value for midnight GMT is always divisible by 86400 (24 * 3600). The value for seconds-since-midnight GMT is thus time %% 86400.
The hour in GMT is (time %% 86400) / 3600 and this can be used as the x-axis of the plot:
plot((as.numeric(times) %% 86400)/3600, val)
To adjust for a time zone, adjust the time before taking the modulus, by adding the number of seconds that your time zone is ahead of GMT. For example, US central daylight saving time (CDT) is 5 hours behind GMT. To plot against the time in CDT, the following expression is used:
plot(((as.numeric(times) - 5*3600) %% 86400)/3600, val)
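The modulus arithmetic is language-independent. As a cross-check, here is the same calculation sketched in Python; the function name hour_of_day and its utc_offset_hours parameter are made up for illustration, and a fixed offset like this ignores daylight saving transitions.

```python
def hour_of_day(epoch_seconds, utc_offset_hours=0.0):
    """Hypothetical helper: fractional hour of the day for a Unix timestamp.

    utc_offset_hours is how far the target zone is ahead of GMT
    (negative if it is behind, e.g. -5 for CDT).
    """
    # Shift into the target zone first, then keep the seconds since midnight
    seconds_since_midnight = (epoch_seconds + utc_offset_hours * 3600) % 86400
    return seconds_since_midnight / 3600


first = 1331086009.50098  # first value of `times` in the question
print(hour_of_day(first))        # hour of day in GMT
print(hour_of_day(first, 1))     # hour of day in CET (GMT+1)
print(hour_of_day(first, -5))    # hour of day in CDT (GMT-5)
```

Python's % operator returns a non-negative result for a positive divisor, so negative offsets wrap around to the previous day correctly, just as %% does in R.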

Resources