Cassandra timestamp: incorrect time value

I am new to Cassandra. I have a table (data) in Cassandra 3.11 with a column timeStampCol of type timestamp, and I am inserting a value into it:
insert into data (timeStampCol) values('2017-05-02 17:33:03');
When I read the data back from the table with
select * from data;
I get a result like this:
# Row 1
----------+------------------------------------
timeStampCol | 2017-05-02 08:33:03.000000+0000
The inserted and retrieved time values are different.
The reason might be the time zone. How can I get the correct value?

Your selected timestamp value is correct; it is just being displayed in a different time zone.
If you insert data into a timestamp column without providing a time zone, like this:
insert into data (timeStampCol) values('2017-05-02 17:33:03');
Cassandra uses the coordinator's time zone. From the documentation:
If no time zone is specified, the time zone of the Cassandra coordinator node handling the write request is used. For accuracy, DataStax recommends specifying the time zone rather than relying on the time zone configured on the Cassandra nodes.
You need to convert the string date into a java.util.Date, with the formatter set to the time zone of the coordinator node before parsing. In my case it was GMT+6:
import java.text.*;
import java.util.*;
DateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd");
dateFormat.setTimeZone(TimeZone.getTimeZone("GMT+6")); // change this time zone; it must be set before parse() to take effect
Date date = dateFormat.parse("2012-01-21");
Source : https://docs.datastax.com/en/cql/3.0/cql/cql_reference/timestamp_type_r.html
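The recommendation to specify the time zone explicitly can be sketched in Java as follows. This is only an illustration, not the answerer's code: the class name TimestampLiteral and the helper withOffset are hypothetical, and the UTC+9 offset is an assumption inferred from the 9-hour difference shown in the question.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

class TimestampLiteral {
    // Zone-less input, as written in the question.
    private static final DateTimeFormatter INPUT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
    // The pattern letter "Z" prints the offset without a colon, e.g. +0900.
    private static final DateTimeFormatter WITH_OFFSET =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ssZ");

    /** Attaches an explicit UTC offset to a zone-less timestamp string. */
    static String withOffset(String local, ZoneOffset offset) {
        return LocalDateTime.parse(local, INPUT).atOffset(offset).format(WITH_OFFSET);
    }

    public static void main(String[] args) {
        // Assumption: the writer's wall clock is at UTC+9.
        String literal = withOffset("2017-05-02 17:33:03", ZoneOffset.ofHours(9));
        System.out.println("insert into data (timeStampCol) values('" + literal + "');");
    }
}
```

With the offset in the literal, the stored instant no longer depends on which coordinator handles the write.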

Cassandra assumes that incoming data without an explicit time zone is in the time zone Cassandra itself is set up with. For example, if Cassandra is set up in IST and the incoming data is actually in UTC, Cassandra will still treat the values as IST and convert them to UTC, which shifts them.
You might have to set the Cassandra coordinator's time zone in code, or calculate the time difference between the incoming data's time zone and Cassandra's time zone and add or subtract that from the incoming data before it is written to Cassandra. This way the exact timestamps are written to Cassandra.
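One way to avoid computing the difference by hand is to interpret each incoming value in the zone it really belongs to and work with the resulting instant. The following is a minimal sketch under that assumption; the class IncomingTimestamp and the method toInstant are hypothetical names, and the zones used are only examples.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

class IncomingTimestamp {
    private static final DateTimeFormatter INPUT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** Interprets a zone-less timestamp in the zone the data actually comes from. */
    static Instant toInstant(String local, ZoneId sourceZone) {
        return LocalDateTime.parse(local, INPUT).atZone(sourceZone).toInstant();
    }

    public static void main(String[] args) {
        // The same wall-clock text denotes different instants in different zones.
        Instant asUtc = toInstant("2017-05-02 17:33:03", ZoneOffset.UTC);
        Instant asIst = toInstant("2017-05-02 17:33:03", ZoneId.of("Asia/Kolkata"));
        System.out.println(asUtc + " / epoch millis " + asUtc.toEpochMilli());
        System.out.println(asIst + " / epoch millis " + asIst.toEpochMilli());
    }
}
```

An instant (or its epoch milliseconds) carries no time zone, so writing it does not rely on the coordinator's setting.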

Related

How can I fetch timestamp data in my timezone?

I am using Cassandra 3.11.13 and I have a table with a timestamp column. My data is stored with a +0000 offset, e.g. 2022-10-14 07:51:00.000000+0000, but I am hosting in Kazakhstan (GMT+6).
I want to export certain rows for a certain period of time. When I export to CSV, I get a file with timestamps at offset +0000.
I tried a query like select * from table_name where primary_key = 'smth' and timestamp > '2022-10-14T06:30:00+0600' and timestamp < '2022-10-14T23:59:59+0600', but it changed nothing.
The question is: how can I fetch timestamps in my own time zone?
The CQL timestamp data type is encoded as the number of milliseconds since Unix epoch (Jan 1, 1970 00:00 GMT) so its value is encoded in UTC timezone. Clients also display timestamps with a UTC timezone by default.
If you want the data to be displayed in your time zone, you need to configure your app or client to use a specific time zone. For example, you can configure cqlsh to use a different time zone by specifying it in the [ui] section of the cqlshrc file:
;; Display timezone
timezone = Australia/Melbourne
A sample copy of cqlshrc (cqlshrc.sample) is included with Cassandra. Note that you will need to install the pytz Python library to use different time zones with cqlsh.
For details, see Cassandra CQL shell. Cheers!
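For an application rather than cqlsh, the same idea (stored value in UTC, display in a chosen zone) can be sketched in Java. This is an illustration only: the class LocalDisplay and the method render are hypothetical, and the fixed +06:00 offset is taken from the question.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

class LocalDisplay {
    // "Z" prints the offset without a colon, e.g. +0600.
    private static final DateTimeFormatter OUT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ssZ");

    /** Renders a stored (UTC-based) instant at the given display offset. */
    static String render(Instant stored, ZoneOffset displayOffset) {
        return stored.atOffset(displayOffset).format(OUT);
    }

    public static void main(String[] args) {
        // The value from the question, 2022-10-14 07:51:00+0000, shown at GMT+6.
        Instant stored = Instant.parse("2022-10-14T07:51:00Z");
        System.out.println(render(stored, ZoneOffset.ofHours(6)));
    }
}
```

The stored value is not changed by this; only its textual representation is.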

Cassandra Timestamp behavior with Select query

I have a column "postingdate" with data type timestamp in Cassandra. I am using Spring Data Cassandra to save the current date/time in this column when a posting happens (Instant.now()). This inserts the date/time in UTC.
I have to select records that were posted on "2018-11-06". The table has one record posted on this date, and the postingdate column shows it as "2018-11-07 04:25:24+0000" in UTC.
I am running following query -
select * from mytable where id='5' and postingdate >=
'2018-11-06 00:00:00' and postingdate <= '2018-11-06 23:59:59';
Running this query in the DevCenter console (or cqlsh) gives me the same results irrespective of time zone. I tried it in PST as well as IST and got the same result. Is Cassandra doing a PST -> UTC or IST -> UTC conversion before executing the query? If yes, then how?
Per documentation:
When timezone is excluded, it's set to the client or coordinator timezone.
You can configure default timezone for CQLSH either by setting the TZ environment variable, or by specifying the timezone parameter in the cqlshrc configuration file.
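Because zone-less literals are resolved against some default zone, the same calendar day can map to different UTC ranges. The Java sketch below illustrates this; the class DayBounds and its methods are hypothetical names, and the zone IDs America/Los_Angeles and Asia/Kolkata are assumed stand-ins for the PST and IST mentioned in the question.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;

class DayBounds {
    /** First instant of the given calendar day in the given zone. */
    static Instant startOfDay(LocalDate day, ZoneId zone) {
        return day.atStartOfDay(zone).toInstant();
    }

    /** True if the instant falls on the given calendar day in the given zone. */
    static boolean contains(LocalDate day, ZoneId zone, Instant t) {
        Instant start = startOfDay(day, zone);
        Instant end = startOfDay(day.plusDays(1), zone); // exclusive upper bound
        return !t.isBefore(start) && t.isBefore(end);
    }

    public static void main(String[] args) {
        LocalDate day = LocalDate.parse("2018-11-06");
        Instant posted = Instant.parse("2018-11-07T04:25:24Z"); // value from the question
        for (String id : new String[] {"America/Los_Angeles", "Asia/Kolkata", "UTC"}) {
            ZoneId zone = ZoneId.of(id);
            System.out.println(id + ": starts " + startOfDay(day, zone)
                    + ", contains record: " + contains(day, zone, posted));
        }
    }
}
```

Passing explicit instants (or literals with an offset) as query bounds removes the dependence on the client or coordinator default.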

Unable to coerce to a formatted date - Cassandra timestamp type

I have values stored in a timestamp column of a Cassandra table in the format
2018-10-27 11:36:37.950000+0000 (GMT date).
I get "Unable to coerce '2018-10-27 11:36:37.950000+0000' to a formatted date (long)" when I run the query below to get the data.
select create_date from test_table where create_date='2018-10-27 11:36:37.950000+0000' allow filtering;
How can I get the query working when the data is already stored in the table (in the format 2018-10-27 11:36:37.950000+0000), and also perform range (>= or <=) operations on the create_date column?
I also tried create_date='2018-10-27 11:36:37.95Z' and
create_date='2018-10-27 11:36:37.95'.
Is it possible to filter on this kind of timestamp data?
P.S. Using cqlsh to run query on cassandra table.
In the first case, the problem is that you specify the timestamp with microseconds, while Cassandra operates with milliseconds. Try removing the last three digits: .950 instead of .950000 (see the documentation on the timestamp type for details). Timestamps are stored inside Cassandra as 64-bit numbers and are formatted when results are printed, using the format specified by the datetimeformat option of cqlshrc. Dates without an explicit time zone require that a default time zone is specified in cqlshrc.
Regarding your question about filtering the data: this query will work only for small amounts of data, and on bigger data sizes it will most probably time out, as it needs to scan all data in the cluster. Also, the data won't be sorted correctly, because sorting happens only inside a single partition.
If you want to perform such queries, the Spark Cassandra Connector may be a better choice, as it can select the required data effectively, and you can then perform sorting and so on, although this will require many more resources.
I recommend taking the DS220 course from DataStax Academy to understand how to model data for Cassandra.
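If the displayed values have to be turned back into literals programmatically, the trimming from microseconds to milliseconds can be sketched in Java as below. The class MillisLiteral and the method toMillisLiteral are hypothetical names, and the input format is assumed to be exactly the one shown in the question.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

class MillisLiteral {
    // Format as displayed in the question: six fractional digits plus offset.
    private static final DateTimeFormatter MICROS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSSSSSZ");
    // Same layout with three fractional digits (milliseconds).
    private static final DateTimeFormatter MILLIS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSSZ");

    /** Re-formats a displayed timestamp with millisecond precision. */
    static String toMillisLiteral(String displayed) {
        return OffsetDateTime.parse(displayed, MICROS).format(MILLIS);
    }

    public static void main(String[] args) {
        System.out.println(toMillisLiteral("2018-10-27 11:36:37.950000+0000"));
    }
}
```

Digits beyond the third fractional place are dropped, not rounded.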
This works for me:
var datetime = DateTime.UtcNow.ToString("yyyy-MM-dd HH:mm:ss"); // "mm" is minutes; "MM" would print the month
var query = $"SET updatedat = '{datetime}' WHERE ...

Cassandra time not saved in UTC

I need to split my timestamp into date and time and insert them into database columns of the CQL types 'date' and 'time'.
I tried to insert a time value as a string into a Cassandra table. The time was converted to UTC (05:27:00). But when I checked the table using DataStax DevCenter, the column was populated with the value '09:37:54.935541808'. When I retrieved the value in Spring through a repository, it returned '3473746674935541808'.
How can I get the correct time value from the table?
It looks like a limitation of Spring Data. In Cassandra, a time value is encoded as a 64-bit signed integer representing the number of nanoseconds since midnight. I don't see the time type listed as supported in the spring-data-cassandra documentation, so you may need to write a custom converter for it, as described in that documentation.
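The core of such a converter could look like the Java sketch below, which maps between nanoseconds since midnight and java.time.LocalTime. It is not the Spring Data converter API itself; the class CqlTimeValue and its methods are hypothetical, and the sample number is an illustrative value chosen to correspond to 09:37:54.935541808, not the raw value quoted in the question.

```java
import java.time.LocalTime;

class CqlTimeValue {
    /** Converts nanoseconds since midnight into a LocalTime. */
    static LocalTime fromNanos(long nanosSinceMidnight) {
        return LocalTime.ofNanoOfDay(nanosSinceMidnight);
    }

    /** Converts a LocalTime into nanoseconds since midnight. */
    static long toNanos(LocalTime time) {
        return time.toNanoOfDay();
    }

    public static void main(String[] args) {
        // Illustrative value: 34,674 seconds and 935,541,808 nanoseconds after midnight.
        System.out.println(fromNanos(34_674_935_541_808L));
        System.out.println(toNanos(LocalTime.of(5, 27, 0)));
    }
}
```

LocalTime.ofNanoOfDay rejects values outside a single day, so an out-of-range raw number fails with a DateTimeException instead of producing a silently wrong time.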

Cassandra: Ignore timezone for timestamp value

I've written a program that reads a file containing a date (in yyyy/MM/dd format) and uses the DataStax Java Driver to add it to a Cassandra table.
So, for instance, if my record contains a date value of '2010/06/01', this value gets converted into a Date object (using the SimpleDateFormat class).
However, when I view the data in the database, the date (which is a timestamp type in the Cassandra table) shows the following:
2010-06-01 00:00:00+0100
The issue is that I don't want the timestamp to have "+0100" (indicating British Summer Time); I want to store the date as "2010-06-01 00:00:00+0000".
I've changed my program as follows to try to 'ignore' the time zone:
SimpleTimeZone tz = new SimpleTimeZone(0, "Out Timezone");
TimeZone.setDefault(tz);
String dateStringFromFile = "2010/06/01";
SimpleDateFormat sdf = new SimpleDateFormat("yyyy/MM/dd");
Date theDate = sdf.parse(dateStringFromFile);
Now, when I add debug statements to my program, the date shows as "2010-06-01 00:00:00+0000" in my log file (this is what I want). However, when I look at the date stored in Cassandra, it still shows as
"2010-06-01 00:00:00+0100" and not "2010-06-01 00:00:00+0000".
Is there anything on the Cassandra side that I would have to change to ignore the time zone (i.e. to put +0000 rather than +0100 on the date), so that the timestamp shows as "2010-06-01 00:00:00+0000"?
Please note that I am running Cassandra 3.0.5 on a Docker VM (Centos linux), Java 8.
Any advice is appreciated.
Thanks.
This issue has been sorted out. It had nothing to do with cqlshrc. The fix was to force the time zone offset to 0 (as in the code in the original post), get the time in milliseconds, and write that value to the database (the column is of type timestamp).
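A minimal Java sketch of that approach is shown below. It sets UTC on the formatter instead of changing the JVM default as the original post does, which is a deliberate variation; the class UtcMidnight and the method toEpochMillisUtc are hypothetical names.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

class UtcMidnight {
    /** Parses a yyyy/MM/dd date as midnight UTC and returns epoch milliseconds. */
    static long toEpochMillisUtc(String date) {
        SimpleDateFormat sdf = new SimpleDateFormat("yyyy/MM/dd");
        // Force offset 0 on this formatter only; the JVM default zone is untouched.
        sdf.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            return sdf.parse(date).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a yyyy/MM/dd date: " + date, e);
        }
    }

    public static void main(String[] args) {
        // Milliseconds since the Unix epoch for 2010-06-01 00:00:00 UTC.
        System.out.println(toEpochMillisUtc("2010/06/01"));
    }
}
```

The returned long is the value that would be written to the timestamp column.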
