Cassandra parsing date - cassandra

I have a table created using the following script:
CREATE TABLE "TestTable2" (
id uuid,
timestamp timestamp,
msg text,
priority int,
source text,
PRIMARY KEY (id, timestamp)
);
Now I'm inserting one row:
INSERT INTO "TestTable2" (id, timestamp, msg, source) values (uuid(), '2002-03-31 02:36:10', 'asdas dasdasd', 'system1');
and I get an error:
Unable to execute CQL script on 'UdcCluster':Unable to coerce '2002-03-31 02:36:10' to a formatted date (long)
If I change the day of the month to the 30th or the hour to 22, the statement executes successfully.
Can you please explain to me what is wrong with the date?
PS.
The same error occurs for '1998-03-29 02:12:13', '1987-03-29 02:55:21' and '1984-03-25 02:45:25'. In all cases it's 2 am at the end of March...

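You're trying to get from a specific local time to a DateTime instance, and you want that to be robust against daylight saving time. All of your failing values fall in the 2 am hour on the last Sunday of March, which is the hour skipped when clocks spring forward in Central European time zones. If the coordinator runs in such a zone, that local time does not exist there and cannot be converted.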
Specify the timezone in the pattern: yyyy-mm-dd HH:mm:ssZ
where Z is the RFC-822 4-digit time zone, expressing the time zone's
difference from UTC. For example, for the date and time of Jan 2,
2003, at 04:05:00 AM, GMT:
If no time zone is specified, the time zone of the Cassandra
coordinator node handling the write request is used. For accuracy,
DataStax recommends specifying the time zone rather than relying on
the time zone configured on the Cassandra nodes.
https://docs.datastax.com/en/cql/3.1/cql/cql_reference/timestamp_type_r.html
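The failing values can be checked against the daylight saving rule directly. The sketch below is only an illustration: it assumes the coordinator follows the Central European rule (clocks jump from 02:00 to 03:00 on the last Sunday of March), which the question does not state, and the helper names are made up for this example. It also shows one way to build a literal with an explicit +0000 offset.

```python
from datetime import date, datetime, timedelta, timezone


def last_sunday_of_march(year: int) -> date:
    """Return the date of the last Sunday of March in the given year."""
    d = date(year, 3, 31)
    # weekday(): Monday=0 ... Sunday=6, so step back to the nearest Sunday.
    return d - timedelta(days=(d.weekday() + 1) % 7)


def in_spring_forward_gap(text: str) -> bool:
    """True if a naive 'YYYY-MM-DD HH:MM:SS' value falls in the skipped hour.

    Assumption: clocks jump from 02:00 to 03:00 local time on the last
    Sunday of March (the Central European rule).
    """
    local = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    return local.date() == last_sunday_of_march(local.year) and local.hour == 2


def with_utc_offset(utc_value: datetime) -> str:
    """Format a naive datetime, taken to be UTC, with an explicit +0000 offset."""
    return utc_value.replace(tzinfo=timezone.utc).strftime("%Y-%m-%d %H:%M:%S%z")


if __name__ == "__main__":
    for value in ("2002-03-31 02:36:10", "2002-03-30 02:36:10", "2002-03-31 22:36:10"):
        print(value, in_spring_forward_gap(value))
    print(with_utc_offset(datetime(2003, 1, 2, 4, 5, 0)))
```

Under that assumption, all four values from the question land in the skipped hour, while the 30th and the 22:00 variants do not.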

Related

How can I fetch timestamp data in my timezone?

I am using Cassandra 3.11.13 and I have a table with a timestamp column. My data is stored with a +0 timezone, i.e. 2022-10-14 07:51:00.000000+0000, but I am hosting in Kazakhstan (GMT+6).
I want to export certain rows for a certain period of time. When I export to CSV, I get a file with timezone +0.
I tried a query like select * from table_name where primary_key = 'smth' and timestamp > '2022-10-14T06:30:00+0600' and timestamp < '2022-10-14T23:59:59+0600', but it changed nothing.
The question is: how can I fetch timestamps in a specific (correct) timezone?
The CQL timestamp data type is encoded as the number of milliseconds since Unix epoch (Jan 1, 1970 00:00 GMT) so its value is encoded in UTC timezone. Clients also display timestamps with a UTC timezone by default.
If you want the data to be displayed in your timezone, you need to configure your app or client to use a specific timezone. For example, you can configure cqlsh to use a different timezone by specifying it in the [ui] section of the cqlshrc file:
;; Display timezone
timezone = Australia/Melbourne
You can find a sample copy of cqlshrc here. Note that you will need to install the pytz Python library to use different timezones with cqlsh.
For details, see Cassandra CQL shell. Cheers!
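If the exported CSV is post-processed in your own code, the conversion can also be done there. The following is a minimal sketch, assuming a fixed UTC+6 offset for the local zone and the timestamp layouts shown in the question; the function names are hypothetical. It also shows why the offsets in the WHERE clause change only which rows match and not how the results are displayed: the literal is simply converted to a UTC instant.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC+6 offset stands in for the asker's local zone.
LOCAL = timezone(timedelta(hours=6))


def to_local(utc_text: str) -> str:
    """Convert a 'YYYY-MM-DD HH:MM:SS' UTC value to an ISO string at UTC+6."""
    utc_value = datetime.strptime(utc_text, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    return utc_value.astimezone(LOCAL).isoformat()


def to_utc(literal: str) -> str:
    """Show which UTC instant a literal with an explicit offset refers to."""
    parsed = datetime.strptime(literal, "%Y-%m-%dT%H:%M:%S%z")
    return parsed.astimezone(timezone.utc).isoformat()


if __name__ == "__main__":
    print(to_local("2022-10-14 07:51:00"))
    print(to_utc("2022-10-14T06:30:00+0600"))
```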

re-format timestamp to keep time and time zone

I want to create a table with the date and the time (with time zone) in different columns.
For example:
Date 20170311 Time 10:32:24+1300
The format has to be the same as above.
When I created the table, Date was set as type date and Time as type timestamp.
When I insert the date, I have to follow a certain format like 2017-03-11. How can I make it look the same as in the table shown above?
When inserting the time and time zone, I have to insert the date along with it, like '2017-03-22T10:37:50+1300'. Is there any way that I can reformat it?
After inserting with the format '2017-03-22T10:37:50+1300', the time and time zone changed in the table. How can I keep them the same as the input?
CREATE TABLE example (id int, work_date date, sequence timestamp);
INSERT INTO example (id, work_date, sequence) VALUES (1, '2017-03-22', '2017-03-22T10:37:50+1300');
expected result:
1 20170322 10:37:50+1300
actual result:
1 2017-03-22 2017-03-21 21:37:50.000000+0000
Cassandra has several data types related to date & time - date, time, and timestamp, and only the last one has a notion of time zone. Even then, the zone is only used to interpret the input; it is not stored with the value.
The formatting of timestamps is your responsibility. Internally, the data is stored as a long (8 bytes) representing the number of milliseconds since the epoch, and it is converted into a textual representation by the corresponding driver. In the case of cqlsh, the formatting is controlled by the datetimeformat parameter. The same applies to the date & time data types: they are kept as numbers inside the database, not as strings.
If you're accessing the data from your own program, then you can format time as you want.
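As an illustration of formatting in your own program, here is a minimal sketch that turns a stored epoch-millisecond value into the two strings from the question. It assumes the application knows the desired offset (+13 hours here), since the timestamp itself does not keep it; format_parts is a hypothetical helper, not part of any driver.

```python
from datetime import datetime, timedelta, timezone


def format_parts(epoch_ms: int, offset_hours: int) -> tuple:
    """Render epoch milliseconds as ('YYYYMMDD', 'HH:MM:SS+hhmm') at a fixed offset."""
    zone = timezone(timedelta(hours=offset_hours))
    local = datetime.fromtimestamp(epoch_ms / 1000, tz=zone)
    return local.strftime("%Y%m%d"), local.strftime("%H:%M:%S%z")


if __name__ == "__main__":
    # The value Cassandra displayed in the question: 2017-03-21 21:37:50 UTC.
    stored = datetime(2017, 3, 21, 21, 37, 50, tzinfo=timezone.utc)
    epoch_ms = int(stored.timestamp() * 1000)
    print(format_parts(epoch_ms, 13))
```

If the original offset matters, it would have to be stored in a separate column, because only the instant survives the round trip.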

Cassandra inserts timestamp in UTC time

I have JSON logs with timestamps (UTC time) in them. I map the keys and values to Cassandra table columns and insert the record. However, Cassandra converts the timestamps, which are already in UTC, to UTC again by subtracting 5 hours from them. The timezone here is GMT+5.
cqlsh> INSERT INTO myTable (id,time) VALUES ('abc123', '2018-01-12T12:32:31');
The time is already UTC, yet Cassandra still inserts a timestamp from 5 hours earlier.
How can I resolve this?
If you're using cqlsh to insert data, then you can specify the default timezone in the cqlshrc file using the timezone parameter (see the default cqlshrc as an example).
If you insert dates programmatically, then you need to convert your time into the type that corresponds to Cassandra's timestamp type (java.util.Date for Java, for example). In your case the change could be simple: just append Z to the timestamp string, as pointed out by Ralf.
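A small sketch of the effect, assuming the log timestamps look like the one in the question; the function names are hypothetical. The same naive string maps to two different instants depending on the zone it is read in, and appending Z pins it to UTC.

```python
from datetime import datetime, timedelta, timezone


def as_utc_literal(naive_text: str) -> str:
    """Append 'Z' so the value is read as UTC, not as local time."""
    # Validate the shape first; raises ValueError if it does not match.
    datetime.strptime(naive_text, "%Y-%m-%dT%H:%M:%S")
    return naive_text + "Z"


def epoch_millis(naive_text: str, zone: timezone) -> int:
    """Epoch milliseconds of a naive value interpreted in the given zone."""
    value = datetime.strptime(naive_text, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=zone)
    return int(value.timestamp() * 1000)


if __name__ == "__main__":
    text = "2018-01-12T12:32:31"
    read_as_utc = epoch_millis(text, timezone.utc)
    read_as_gmt5 = epoch_millis(text, timezone(timedelta(hours=5)))
    # Difference in hours between the two interpretations.
    print((read_as_utc - read_as_gmt5) / 3_600_000)
    print(as_utc_literal(text))
```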

Cassandra Timestamp : Incorrect time value

I am new to Cassandra. I have a Cassandra (v3.11) table (data). It has a column timeStampCol of type timestamp, and I am inserting a value into it.
insert into data (timeStampCol) values('2017-05-02 17:33:03');
While accessing the data from table
select * from data;
I got result like -
# Row 1
----------+------------------------------------
timeStampCol | 2017-05-02 08:33:03.000000+0000
The inserted and retrieved values differ in their time.
The reason might be the timezone. How can I get it correct?
Your selected timestamp value is correct; it's just shown in a different timezone.
If you insert data into a timestamp column without providing a timezone, like this:
insert into data (timeStampCol) values('2017-05-02 17:33:03');
Cassandra will use the coordinator's timezone:
If no time zone is specified, the time zone of the Cassandra coordinator node handling the write request is used. For accuracy, DataStax recommends specifying the time zone rather than relying on the time zone configured on the Cassandra nodes.
You need to convert the string date into a java.util.Date, with the formatter set to the coordinator node's timezone. In my case it was GMT+6:
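import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;
SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd");
dateFormat.setTimeZone(TimeZone.getTimeZone("GMT+6")); // set before parsing; change this to your coordinator's time zone
Date date = dateFormat.parse("2012-01-21"); // parsed as midnight in GMT+6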
Source : https://docs.datastax.com/en/cql/3.0/cql/cql_reference/timestamp_type_r.html
Cassandra assumes that incoming data without a time zone is in the time zone the node is configured with. For example, if Cassandra is set up in IST and the incoming data is actually UTC, Cassandra will treat the data as IST and convert it to UTC again.
You might have to set the Cassandra coordinator's time zone in code, or calculate the time difference between the incoming data's time zone and Cassandra's time zone and add/subtract that from the incoming data before it is written to Cassandra. This way you will have the exact timestamps written to Cassandra.
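The time difference mentioned above can be read off the values in the question. A rough sketch, assuming the inserted and displayed values are exactly the ones quoted there; both helpers are hypothetical names for this example.

```python
from datetime import datetime, timedelta

NAIVE = "%Y-%m-%d %H:%M:%S"
WITH_OFFSET = "%Y-%m-%d %H:%M:%S%z"


def implied_offset(inserted_local: str, displayed_utc: str) -> timedelta:
    """Offset the coordinator must have applied to the naive inserted value."""
    return datetime.strptime(inserted_local, NAIVE) - datetime.strptime(displayed_utc, NAIVE)


def same_instant(a: str, b: str) -> bool:
    """Compare two literals with explicit offsets as points in time."""
    return datetime.strptime(a, WITH_OFFSET) == datetime.strptime(b, WITH_OFFSET)


if __name__ == "__main__":
    print(implied_offset("2017-05-02 17:33:03", "2017-05-02 08:33:03"))
    print(same_instant("2017-05-02 17:33:03+0900", "2017-05-02 08:33:03+0000"))
```

The 9-hour difference suggests the coordinator in the question was running at UTC+9, so the stored and inserted values denote the same instant.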

Duplicate timestamps in timeseries - Cassandra

I am going to use cassandra to store activity logs. I have something like this
CREATE TABLE general_actionlog (
date text,
time text,
date_added timestamp,
action text,
PRIMARY KEY ((date,time),date_added)
);
I want to store all the activity in an hour in a single row (= a time series; "time" is only the hour of the day in the format H:00:00, ignoring minutes and seconds, so I have a row for each Y-m-d H:00:00).
The problem appears when two actions happen at the same timestamp (e.g. two page views in the same second), so the second one overwrites the first one.
How can I solve this in a way that still lets me query using slices?
Thanks
marc
You want to use timeuuid instead of timestamp for the date_added column. A timeuuid is a v1 UUID. It has a timestamp component (and is sorted by the timestamp), so it effectively provides a conflict-free timestamp.
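To see why a v1 UUID avoids the collision, here is a small sketch using Python's standard uuid module. It only illustrates the idea and is not Cassandra's own generator (in CQL, the now() function produces a timeuuid on the server side); timeuuid_to_datetime is a hypothetical helper name.

```python
import uuid
from datetime import datetime, timezone

# Number of 100-ns intervals between 1582-10-15 (the UUID epoch) and 1970-01-01.
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def timeuuid_to_datetime(value: uuid.UUID) -> datetime:
    """Extract the embedded timestamp of a version 1 UUID as a UTC datetime."""
    seconds = (value.time - UUID_EPOCH_OFFSET) / 1e7
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


if __name__ == "__main__":
    first, second = uuid.uuid1(), uuid.uuid1()
    # Generated back to back, yet the two values are distinct keys.
    print(first != second)
    print(timeuuid_to_datetime(first), timeuuid_to_datetime(second))
```

Two events in the same second therefore get different clustering keys, while the embedded time still supports range queries.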
