In Python 3, I'd do something like this:
"{0:Y-M-d H:m:?.???}".format(datetime.datetime.now())
However, having searched a bit, it would be nice to have a canonical answer somewhere.
Late to the game, and I like your answer of just using total seconds, but here's how I got Athena (using awswrangler) to work with datetime and strftime:
from datetime import datetime

query_date = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
query_statement = f"SELECT * FROM table where datetime_col > timestamp '{query_date}'"
My datetime_col stores millisecond precision (three decimal places), but that was not necessary in my query.
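For completeness, here's a hedged sketch of running that query through awswrangler; the database name is an assumption:

import awswrangler as wr

# Execute the statement against Athena and get the result as a pandas DataFrame.
df = wr.athena.read_sql_query(query_statement, database="my_database")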
Ultimately, I chose not to use a timestamp and instead treated the column as an integer storing seconds since the epoch. This achieves the same outcome with much less drama. Athena has all the functions one needs to convert these integers into dates for date math, so it is just easier.
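A minimal sketch of that integer-seconds approach, assuming a hypothetical table my_table with an integer column epoch_col:

import time

epoch_seconds = int(time.time())  # value written to the integer column

# On the Athena side, from_unixtime() turns the stored integer back into a timestamp:
query = f"""
    SELECT from_unixtime(epoch_col) AS event_ts
    FROM my_table
    WHERE epoch_col > {epoch_seconds - 24 * 60 * 60}
"""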
When converting a timestamp between timezones in databricks/spark sql, the timezone itself seems lost in the end result, and I can't seem to either keep it or add it back.
I have a bunch of UTC times and am using from_utc_timestamp() to convert them to a different timezone based on another field. The result is calculated correctly, but if I output it with a timezone it shows as UTC. It seems the conversion is done correctly but the end result has no timezone stored with it (affirmed by this answer), so it uses the server timezone in all cases.
Example: Using the following SQL:
createTimestampUTC,
v.timezone,
date_format(from_utc_timestamp(createTimestampUTC, v.timezone),"yyyy-MM-dd'T'HH:mm:s Z") createTimestampLocal,
I get the following:
You can see that the third column has done the conversions correctly for the timezones, but the output itself still shows as being in UTC timezone.
Repeating this with a lowercase z in the date_format function shows the same; namely, the conversions occur but the end result is still treated as UTC.
createTimestampUTC,
v.timezone,
date_format(from_utc_timestamp(createTimestampUTC, v.timezone),"yyyy-MM-dd'T'HH:mm:s z") createTimestampLocal,
I can also use an O in the format output instead of a Z or z, but this just gives me GMT instead of UTC; same output basically.
All the Databricks documentation or Stack Overflow questions I can find seem to treat printing timezones as a matter of setting the Spark server time and outputting that way, or doing the conversion without keeping the resulting timezone. I'm trying to convert to multiple different timezones, though, and to keep the timezone in the output. I need to generate the end result in this format:
Is there a way to do this? How do I either keep the timezone after the conversion or add it back in the format I need, based on the timezone column I have? Given that the conversion works, and that I can output the end result with a +0000 on it, all the functionality to do this seems to be there; how do I put it together?
Spark does not support TIMESTAMP WITH TIMEZONE datatype as defined by ANSI SQL. Even though there are some functions that convert the timestamp across timezones, this information is never stored. Databricks documentation on timestamps explains:
Spark SQL defines the timestamp type as TIMESTAMP WITH SESSION TIME ZONE, which is a combination of the fields (YEAR, MONTH, DAY, HOUR, MINUTE, SECOND, SESSION TZ) where the YEAR through SECOND field identify a time instant in the UTC time zone, and where SESSION TZ is taken from the SQL config spark.sql.session.timeZone.
In your case spark.sql.session.timeZone is UTC, and the Z symbol in the datetime pattern will always return UTC. Therefore you will never get the correct behavior with date_format if you deal with multiple timezones in a single query.
The only thing you can do is to explicitly store timezone information in a column and manually append it for display.
concat(
date_format(from_utc_timestamp(createTimestampUTC, v.timezone), "yyyy-MM-dd'T'HH:mm:s "),
v.timezone
) createTimestampLocal
This will display 2022-03-01T16:47:22.000 America/New_York. If you need an offset (-05:00) you will need to write a UDF to do the conversion and use Python or Scala native libraries that handle datetime conversions.
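As a rough sketch of that UDF route (not the original answer's code), something like the following could work in PySpark, assuming the session timezone is UTC as described above and Python 3.9+ for zoneinfo; the column names come from the question, while the DataFrame name df is an assumption:

from datetime import timezone
from zoneinfo import ZoneInfo

from pyspark.sql import functions as F
from pyspark.sql.types import StringType

@F.udf(returnType=StringType())
def format_with_offset(ts_utc, tz_name):
    # ts_utc arrives as a naive datetime in the session timezone (assumed UTC here).
    if ts_utc is None or tz_name is None:
        return None
    local = ts_utc.replace(tzinfo=timezone.utc).astimezone(ZoneInfo(tz_name))
    return local.isoformat(timespec="milliseconds")  # e.g. 2022-03-01T16:47:22.000-05:00

df = df.withColumn(
    "createTimestampLocal",
    format_with_offset(F.col("createTimestampUTC"), F.col("timezone")),
)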
I want to get a day-accurate (not hour, minutes, seconds) Epoch timestamp that remains the same throughout the day.
This is accurate by the millisecond (and therefore too accurate):
from datetime import date, datetime
timestamp = datetime.today().strftime("%s")
Is there any simple way to make it less precise?
A UNIX timestamp is by necessity accurate to the (milli)second, because it's a number counting seconds. The only thing you can do is choose a specific time which "stays constant" throughout the day, for which midnight probably makes the most sense:
from datetime import datetime, timezone
timestamp = datetime.now(timezone.utc).replace(hour=0, minute=0, second=0, microsecond=0).timestamp()
It depends on what you want.
If you just want a quick way, use either time.time_ns() or time.time(). Epoch time is what the system uses (on many OSes), so there is no conversion. The _ns() version avoids floating-point maths, so it is faster.
If you want to store it in a more efficient way, you can just do:
int(time.time()) - int(time.time()) % (24*60*60), so you get the epoch at the start of the day. Epoch time, contrary to most other time scales (and GPS time), makes every day exactly 24*60*60 seconds long (so it discards leap seconds).
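A minimal sketch of that day-start computation (variable names are illustrative):

import time

DAY = 24 * 60 * 60
now = int(time.time())          # seconds since the Unix epoch
day_start = now - now % DAY     # epoch timestamp at 00:00 UTC of the current day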
This is an issue I am facing with Spark 3.0; it worked before without even specifying a format.
Now, I tried explicitly specifying the format, but it still doesn't work.
Here's the input format:
Here's the code I wrote:
Clearly, the format "MM/dd/yyyy hh:mm" should have worked, but it doesn't.
So I must be ignorant about a couple things here.
What you are doing is not correct; since Spark 3.0 there have been major changes regarding datetime formatting.
Here is a working example:
val df = Seq("12/21/2018 15:17").toDF("a")
df.select(to_timestamp($"a", "M/d/yyyy H:mm")).show()
Notice the capital H? That stands for hours 0-23; the lowercase letter h stands for 1-12.
reference: https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html
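For reference, a rough PySpark equivalent of the Scala example above (assuming an existing SparkSession named spark, as in a notebook or the pyspark shell):

from pyspark.sql import functions as F

df = spark.createDataFrame([("12/21/2018 15:17",)], ["a"])
df.select(F.to_timestamp(F.col("a"), "M/d/yyyy H:mm")).show()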
It's been a while, but I'm writing anyway. I was experiencing the same problem with Spark 2.4.6. Then I used Spark SQL and it worked very well.
I found the solution in this link:
SparkSQL - Difference between two time stamps in minutes
Example:
sqlContext.sql("select (bigint(to_timestamp(end_timestamp,'yyyy-MM-dd HH:mm:ss'))-bigint(to_timestamp(start_timestamp,'yyyy-MM-dd HH:mm:ss')))/(60) as duration from table limit 2")
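With a modern SparkSession the same statement can be issued through spark.sql; this is just a sketch, with the table and column names kept as placeholders from the query above:

duration_df = spark.sql("""
    SELECT (bigint(to_timestamp(end_timestamp, 'yyyy-MM-dd HH:mm:ss'))
          - bigint(to_timestamp(start_timestamp, 'yyyy-MM-dd HH:mm:ss'))) / 60 AS duration
    FROM table
    LIMIT 2
""")
duration_df.show()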
A Cassandra table has a timeuuid column, so how do I see the value of type timeuuid in nanoseconds?
timeuuid:
49cbda60-961b-11e8-9854-134d5b3f9cf8
49cbda60-961b-11e8-9854-134d5b3f9cf9
How do I convert this timeuuid to nanoseconds?
I need a select statement like:
select Dateof(timeuuid) from the table a;
There is a utility method in the driver, UUIDs.unixTimestamp(UUID id), that returns a normal epoch timestamp which can be converted into a Date object.
Worth noting that ns precision from the time UUID will not necessarily be meaningful. A type 1 UUID includes a timestamp which is the number of 100-nanosecond intervals since the Gregorian calendar was first adopted, at midnight, October 15, 1582 UTC. But the driver takes a 1 ms timestamp (the precision really depends on the OS; it can even be 10 or 40 ms) and keeps a monotonic counter to fill the remaining 10,000 units of unused precision, but it can end up counting into the future if there are over 10k values within a single millisecond (note: performance limitations will ultimately prevent this). This is much more performant and guarantees no duplicates, especially as sub-millisecond time accuracy in computers is pretty meaningless in a distributed system.
So if you're looking at it from a purely CQL perspective, there's no way to do it without a UDF; not that there is much value in going beyond millisecond precision anyway, so dateOf should be sufficient. If you REALLY want it though:
CREATE OR REPLACE FUNCTION uuidToNS (id timeuuid)
CALLED ON NULL INPUT RETURNS bigint
LANGUAGE java AS '
return id.timestamp();
';
This will give you the number of 100 ns intervals since October 15, 1582. To translate that to nanoseconds since the epoch, multiply it by 100 to convert to nanos and add the difference from epoch time (-12219292800L * 1_000_000_000 in nanos). This might overflow longs, so you might need to use something different.
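For comparison, a minimal client-side sketch of the same conversion using Python's uuid module; Python integers do not overflow, which sidesteps the concern about longs:

import uuid

GREGORIAN_TO_UNIX_EPOCH_NS = 12_219_292_800 * 1_000_000_000  # 1582-10-15 to 1970-01-01

u = uuid.UUID("49cbda60-961b-11e8-9854-134d5b3f9cf8")
ns_since_epoch = u.time * 100 - GREGORIAN_TO_UNIX_EPOCH_NS  # UUID.time counts 100 ns intervals
print(ns_since_epoch)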
I know there are many questions related to this, but I can't find exactly what I am looking for.
I am creating an iOS Rideshare app and am utilizing the Google Distance Matrix. What I am looking to do is take the current date and set it to midnight. For example: currentDate = 12/6/2018 12:00:00.
I want to take this currentDate value, convert it to Epoch, and set it up as a baseEpoch value. This way, I can take the user's time input and add the difference to get the date/time they entered in Epoch form.
I've tried solutions such as:
function convertToEpoch(time)
{
var sep = time.split(':');
var seconds = (+sep[0]) * 60 * 60 + (+sep[1]) * 60 + (+sep[2]);
return seconds;
}
function currentDateAsEpoch(time) {
var time = new Date();
time.setHours(0,0,0,0);
convertToEpoch(time);
}
const baseEpoch = currentDateAsEpoch();
But I am getting the error: TypeError: time.split is not a function
I want the baseEpoch to be set as the current date so Google Distance Matrix doesn't return the departure_time error saying time can only be equal to or in the future.
Thank you in advance for your help!
You can use the momentjs package for Node.js. This package helps you deal with dates and times effortlessly. You may need to dig more into the docs for a better understanding of the moment module (the documentation is simple to understand).
Here is some snippet from momentjs docs.
moment().unix();
//moment#unix outputs a Unix timestamp (the number of seconds since the Unix Epoch).
moment(1318874398806).unix(); // 1318874398
Also, using a well-tested library like moment is better than writing your own functions to handle dates and times, for the following reasons:
The code is very well tested due to the large number of people using it on a day-to-day basis.
No need to waste your time reinventing the wheel.
Additional functionality for future development work (like formatting and other date operations).
Easy to understand and implement even for new team members (due to very well written documentation and good support from the large number of developers using the library).