Presto convert timestamp to epoch

In Redshift, I can do something like this:
date_part(epoch, '2019-03-07 10:17:03.000000')
and it would return 1551953823. How do I do this in Presto?

You can either use to_unixtime:
presto> select to_unixtime(timestamp '2019-03-07 10:17:03');
_col0
---------------
1.551953823E9
(1 row)
or use date_diff to count the seconds from the epoch to the timestamp:
presto> select date_diff('second', timestamp '1970-01-01', timestamp '2019-03-07 10:17:03');
_col0
------------
1551953823
(1 row)
Note that to_unixtime returns a double: the number of seconds, with the milliseconds as a fractional part. If you want just the seconds, you can cast the result to BIGINT:
presto> select cast(to_unixtime(timestamp '2019-03-07 10:17:03') as bigint);
_col0
------------
1551953823
(1 row)
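For the reverse direction, from_unixtime turns epoch seconds back into a timestamp; a quick round-trip check (note that both functions are evaluated in the session time zone):
presto> select from_unixtime(1551953823);
_col0
-------------------------
2019-03-07 10:17:03.000
(1 row)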

Related

How can we filter rows based on a timestamp column?

I have a Cassandra column which is of type date and has values in timestamp format, like below. How can we filter rows where this column's date is greater than today's date?
Example:
Type: date
Timestamp: 2021-06-29 11:53:52 +00:00
TTL: null
Value: 2021-03-16T00:00:00.000+0000
I was able to filter rows using columname <= '2021-09-25', which gives ten rows, some of them having dates on Sep 23 and 24. When I filter using columname < '2021-09-24', I get an error like the one below:
An error occurred on line 1 (use Ctrl-L to toggle line numbers):
Cassandra failure during read query at consistency ONE (1 responses were required but only 0 replica responded, 1 failed)
The CQL timestamp data type is encoded as the number of milliseconds since Unix epoch (Jan 1, 1970 00:00 GMT) so you need to be precise when you're working with timestamps.
Depending on where you're running the query, the filter could be interpreted in the local timezone. Let me illustrate with this example table:
CREATE TABLE community.tstamptbl (
id int,
tstamp timestamp,
PRIMARY KEY (id, tstamp)
)
These 2 statements may appear similar but translate to 2 different entries:
INSERT INTO tstamptbl (id, tstamp) VALUES (5, '2021-08-09');
INSERT INTO tstamptbl (id, tstamp) VALUES (5, '2021-08-09 +0000');
The first statement creates an entry with a timestamp in my local timezone (Melbourne, Australia) while the second statement creates an entry with a timestamp in UTC (+0000):
cqlsh:community> SELECT * FROM tstamptbl WHERE id = 5;
id | tstamp
----+---------------------------------
5 | 2021-08-08 14:00:00.000000+0000
5 | 2021-08-09 00:00:00.000000+0000
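Since the stored value is just a millisecond count, you can also write the raw number directly; a minimal sketch (1628467200000 is 2021-08-09 00:00:00 UTC expressed in milliseconds):
INSERT INTO tstamptbl (id, tstamp) VALUES (5, 1628467200000);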
Similarly, you need to be precise when reading the data. You need to specify the timezone to remove ambiguity. Here are some examples:
SELECT * FROM tstamptbl WHERE id = 5 AND tstamp < '2021-08-09 +0000';
SELECT * FROM tstamptbl WHERE id = 1 AND tstamp < '2021-08-10 12:00+0000';
SELECT * FROM tstamptbl WHERE id = 1 AND tstamp < '2021-08-10 12:34:56+0000';
In the second part of your question, the error isn't directly related to your filter. The problem is that the replica(s) failed to respond for whatever reason (e.g. unresponsive/overloaded, down, etc). You need to investigate that issue separately. Cheers!

Getting INVALID_CAST_ARGUMENT error in Athena (Presto)

I am quite new at SQL. I am trying a simple query:
select
*,
max(cast(version_date as date)) over (partition by id) mx_dt,
min(cast(version_date as date)) over (partition by id) min_dt
from "raw_data"."raw_brands";
but I am getting this error:
An error has been thrown from the AWS Athena client. INVALID_CAST_ARGUMENT: Value cannot be cast to date: 2020-01-16 19:09:25.086223
There are a couple of approaches.
Use the date_parse function:
presto> select date_parse('2020-01-16 19:09:25.086223', '%Y-%m-%d %H:%i:%s.%f');
_col0
-------------------------
2020-01-16 19:09:25.086
and then cast the result to date:
presto> select date(date_parse('2020-01-16 19:09:25.086223', '%Y-%m-%d %H:%i:%s.%f'));
_col0
------------
2020-01-16
presto> select cast(date_parse('2020-01-16 19:09:25.086223', '%Y-%m-%d %H:%i:%s.%f') as date);
_col0
------------
2020-01-16
Use the substr function:
presto> select cast(substr('2020-01-16 19:09:25.086223', 1, 10) as date);
_col0
------------
2020-01-16
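Applied to the original query, this might look like the following; a sketch, assuming version_date is a varchar that always matches that format:
select
*,
max(date(date_parse(version_date, '%Y-%m-%d %H:%i:%s.%f'))) over (partition by id) mx_dt,
min(date(date_parse(version_date, '%Y-%m-%d %H:%i:%s.%f'))) over (partition by id) min_dt
from "raw_data"."raw_brands";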

Cassandra cqlsh - how to show microseconds/milliseconds for timestamp columns?

I'm inserting into a Cassandra table with timestamp columns. The data I have comes with microsecond precision, so the time data string looks like this:
2015-02-16T18:00:03.234+00:00
However, when I run a SELECT query in cqlsh, the microsecond data is not shown; I can only see time down to second precision. The 234 microseconds are not displayed.
I guess I have two questions:
1) Does Cassandra capture microseconds with timestamp data type? My guess is yes?
2) How can I see that with cqlsh to verify?
Table definition:
create table data (
datetime timestamp,
id text,
type text,
data text,
primary key (id, type, datetime)
)
with compaction = {'class' : 'DateTieredCompactionStrategy'};
Insert query run with a Java PreparedStatement:
insert into data (datetime, id, type, data) values(?, ?, ?, ?);
Select query was simply:
select * from data;
In an effort to answer your questions, I did a little digging on this one.
Does Cassandra capture microseconds with timestamp data type?
Microseconds no, milliseconds yes. If I create your table, insert a row, and try to query it by the truncated time, it doesn't work:
aploetz@cqlsh:stackoverflow> INSERT INTO data (datetime, id, type, data)
VALUES ('2015-02-16T18:00:03.234+00:00','B26354','Blade Runner','Deckard- Filed and monitored.');
aploetz@cqlsh:stackoverflow> SELECT * FROM data
WHERE id='B26354' AND type='Blade Runner' AND datetime='2015-02-16 12:00:03-0600';
id | type | datetime | data
----+------+----------+------
(0 rows)
But when I query for the same id and type values while specifying milliseconds:
aploetz@cqlsh:stackoverflow> SELECT * FROM data
WHERE id='B26354' AND type='Blade Runner' AND datetime='2015-02-16 12:00:03.234-0600';
id | type | datetime | data
--------+--------------+--------------------------+-------------------------------
B26354 | Blade Runner | 2015-02-16 12:00:03-0600 | Deckard- Filed and monitored.
(1 rows)
So the milliseconds are definitely there. There was a JIRA ticket created for this issue (CASSANDRA-5870), but it was resolved as "Won't Fix."
How can I see that with cqlsh to verify?
One possible way to actually verify that the milliseconds are indeed there is to nest the timestampAsBlob() function inside of blobAsBigint(), like this:
aploetz@cqlsh:stackoverflow> SELECT id, type, blobAsBigint(timestampAsBlob(datetime)),
data FROM data;
id | type | blobAsBigint(timestampAsBlob(datetime)) | data
--------+--------------+-----------------------------------------+-------------------------------
B26354 | Blade Runner | 1424109603234 | Deckard- Filed and monitored.
(1 rows)
While not optimal, here you can clearly see the millisecond value of "234" on the very end. This becomes even more apparent if I add a row for the same timestamp, but without milliseconds:
aploetz@cqlsh:stackoverflow> INSERT INTO data (id, type, datetime, data)
VALUES ('B25881','Blade Runner','2015-02-16T18:00:03+00:00','Holden- Fine as long as nobody unplugs him.');
aploetz@cqlsh:stackoverflow> SELECT id, type, blobAsBigint(timestampAsBlob(datetime)),
... data FROM data;
id | type | blobAsBigint(timestampAsBlob(datetime)) | data
--------+--------------+-----------------------------------------+---------------------------------------------
B25881 | Blade Runner | 1424109603000 | Holden- Fine as long as nobody unplugs him.
B26354 | Blade Runner | 1424109603234 | Deckard- Filed and monitored.
(2 rows)
You can configure the output format of datetime objects in the .cassandra/cqlshrc file, using Python's strftime syntax.
Unfortunately, the %f directive for microseconds (there does not seem to be a directive for milliseconds) does not work with older Python versions, which means you have to fall back to the blobAsBigint(timestampAsBlob(date)) solution.
I think by "microseconds" (e.g. 03.234567) you mean "milliseconds" (e.g. 03.234).
The issue here was a cqlsh bug that failed to support fractional seconds when dealing with timestamps.
So, while your millisecond value was preserved in the actual persistence layer (Cassandra), the shell (cqlsh) failed to display it.
This was true even if you were to change time_format in .cqlshrc to display fractional seconds with an %f directive (e.g. %Y-%m-%d %H:%M:%S.%f%z). In this configuration cqlsh would render 3.000000 for our 3.234 value, since the issue was in how cqlsh loaded the datetime objects without loading the partial seconds.
That all being said, this issue was fixed in CASSANDRA-10428, and released in Cassandra 3.4.
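For reference, on 3.4 or later the cqlshrc entry would look something like this (key names follow the stock cqlshrc sample; adjust the format to taste):
[ui]
time_format = %Y-%m-%d %H:%M:%S.%f%z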
It is impossible to show microseconds (1 millionth of a second) using the Cassandra datatype 'timestamp' because the greatest precision available for that datatype is milliseconds (1 thousandth of a second).
http://docs.datastax.com/en/cql/3.1/cql/cql_reference/timestamp_type_r.html
Values for the timestamp type are encoded as 64-bit signed integers
representing a number of milliseconds since the standard base time
known as the epoch
Some related code:
cqlsh> CREATE KEYSPACE udf
WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3};
cqlsh> USE udf;
cqlsh:udf> CREATE OR REPLACE FUNCTION udf.timeuuid_as_us ( t timeuuid )
RETURNS NULL ON NULL INPUT
RETURNS bigint LANGUAGE JAVA AS '
// reassemble the 60-bit version-1 UUID timestamp (100 ns units since 1582-10-15)
long msb = t.getMostSignificantBits();
return
( ((msb >> 32) & 0x00000000FFFFFFFFL)  // time_low
| ((msb & 0x00000000FFFF0000L) << 16)  // time_mid
| ((msb & 0x0000000000000FFFL) << 48)  // time_hi
) / 10                                 // 100 ns units -> microseconds
- 12219292800000000L;                  // offset between 1582-10-15 and the Unix epoch, in microseconds
';
cqlsh:udf> SELECT
toUnixTimestamp(now()) AS now_ms
, udf.timeuuid_as_us(now()) AS now_us
FROM system.local;
now_ms | now_us
---------------+------------------
1525995892841 | 1525995892841000

PostgreSQL : cast string to date DD/MM/YYYY

I'm trying to cast a CHARACTER VARYING column to a DATE, but I need a date format like this: DD/MM/YYYY. I use the following SQL query:
ALTER TABLE test
ALTER COLUMN date TYPE DATE using to_date(date, 'DD/MM/YYYY');
The result is a date like this: YYYY-MM-DD.
How can I get the DD/MM/YYYY format?
Thanks a lot in advance !
Thomas
A DATE column does not have a format. You cannot specify a format for it.
You can use DateStyle to control how PostgreSQL emits dates, but it's global and a bit limited.
Instead, you should use to_char to format the date when you query it, or format it in the client application. Like:
SELECT to_char("date", 'DD/MM/YYYY') FROM mytable;
e.g.
regress=> SELECT to_char(DATE '2014-04-01', 'DD/MM/YYYY');
to_char
------------
01/04/2014
(1 row)
https://www.postgresql.org/docs/8.4/functions-formatting.html
SELECT to_char(date_field, 'DD/MM/YYYY')
FROM table
The documentation says
The output format of the date/time types can be set to one of the four
styles ISO 8601, SQL (Ingres), traditional POSTGRES (Unix date
format), or German. The default is the ISO format.
So this particular format can be controlled with the Postgres date/time output style, e.g.:
t=# select now();
now
-------------------------------
2017-11-29 09:15:25.348342+00
(1 row)
t=# set datestyle to DMY, SQL;
SET
t=# select now();
now
-------------------------------
29/11/2017 09:15:31.28477 UTC
(1 row)
t=# select now()::date;
now
------------
29/11/2017
(1 row)
Mind that, as @Craig mentioned in his answer, changing datestyle will also (and first of all) change the way Postgres parses dates.
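To see the parsing side of that, the same literal is read day-first once DMY is set (continuing the psql session style above; the literal is a made-up example):
t=# set datestyle to DMY, ISO;
SET
t=# select '07/08/2021'::date;
date
------------
2021-08-07
(1 row)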
In case you need to convert the returned date of a select statement to a specific format, you may use the following:
select to_char((*date_you_want_to_select*)::date, 'DD/MM/YYYY') as "Formatted Date"
It depends on which type you require as output, but here are 2 quick examples based on intervals:
SELECT (now() - interval '15 DAY')::date AS order_date -> 2021-07-29
SELECT to_char(now() - interval '15 DAY', 'YYYY-MM-DD') -> 2021-07-29
Let's say your date string column is order_date (holding DD/MM/YYYY text):
SELECT (
RIGHT(order_date, 4)
|| '-'
|| SUBSTRING(order_date, 4, 2)
|| '-'
|| LEFT(order_date, 2)
)::DATE
FROM test
OR
SELECT CAST(
RIGHT(order_date, 4)
|| '-'
|| SUBSTRING(order_date, 4, 2)
|| '-'
|| LEFT(order_date, 2)
AS DATE )
FROM test
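Equivalently, to_date (as already used in the question's ALTER TABLE) does the same conversion in one call:
SELECT to_date(order_date, 'DD/MM/YYYY') FROM test;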

Cassandra timestamp

I want to ask about the timestamp format in the INSERT command.
In the following sample, when I insert any number like "12" or "15" into "message_sent_at",
I find that all the values of the timestamp field are the same: 1970-01-01 02:00 Egypt Standard Time.
sample:
CREATE TABLE chat (
id1 int,
id2 int,
message_sent_at timestamp,
message text,
primary key ((id1, id2), message_sent_at)
)
The units of the timestamp type are milliseconds since the epoch (1/1/1970 00:00:00 UTC). Entering 12 means 12 ms after that midnight, so when displayed at minute precision in your (UTC+2) timezone it shows as 1970-01-01 02:00.
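For example, to store 2015-06-01 12:00:00 UTC you could insert either the millisecond count or an ISO 8601 string (the ids and message are made up for illustration):
INSERT INTO chat (id1, id2, message_sent_at, message) VALUES (1, 2, 1433160000000, 'hi');
INSERT INTO chat (id1, id2, message_sent_at, message) VALUES (1, 2, '2015-06-01 12:00:00+0000', 'hi');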
You can create timestamps from dates here: http://www.epochconverter.com/.
