ScyllaDB - [Invalid query] message="marshaling error: Milliseconds length exceeds expected (6)" - cassandra

I have a table with a column of type timestamp, and when I insert a value, Scylla saves it with three extra zeros after the decimal point.
According to Scylla Documentation (https://docs.scylladb.com/getting-started/types/#timestamps):
Timestamps can be input in CQL either using their value as an integer, or using a string that represents an ISO 8601 date. For instance, all of the values below are valid timestamp values for Mar 2, 2011, at 04:05:00 AM, GMT:
1299038700000
'2011-02-03 04:05+0000'
'2011-02-03 04:05:00+0000'
'2011-02-03 04:05:00.000+0000'
'2011-02-03T04:05+0000'
'2011-02-03T04:05:00+0000'
'2011-02-03T04:05:00.000+0000'
So when I create a table for example:
CREATE TABLE callers (phone text, timestamp timestamp, callID text, PRIMARY KEY(phone, timestamp));
And insert values into it:
INSERT INTO callers (phone, timestamp, callID) VALUES ('6978311845', 1299038700000, '123');
INSERT INTO callers (phone, timestamp, callID) VALUES ('6978311846', '2011-02-03 04:05+0000', '456');
INSERT INTO callers (phone, timestamp, callID) VALUES ('6978311847', '2011-02-03 04:05:00.000+0000', '789');
Then SELECT * FROM callers; shows all timestamps with three extra zeros after the dot:
phone      | timestamp                       | callid
-----------+---------------------------------+--------
6978311847 | 2011-02-03 04:05:00.000000+0000 | 789
6978311845 | 2011-03-02 04:05:00.000000+0000 | 123
6978311846 | 2011-02-03 04:05:00.000000+0000 | 456
As a result, when I try to delete a row, for example:
DELETE FROM callers WHERE phone = '6978311845' AND timestamp = '2011-03-02 04:05:00.000000+0000';
An error occurs:
InvalidRequest: Error from server: code=2200 [Invalid query] message="marshaling error: unable to parse date '2011-03-02 04:05:00.000000+0000': marshaling error: Milliseconds length exceeds expected (6)"
How can I store timestamps without getting this error?

The format is hr:min:sec.millisec, and milliseconds can only be three digits (up to 999), so that is essentially what the error is saying: the six-digit fractional part exceeds the expected millisecond precision.
The three INSERT statements you ran are correct in terms of syntax.
The DELETE statement should be in the same format as the INSERT:
AND timestamp = '2011-03-02 04:05:00.00+0000'
AND timestamp = '2011-03-02 04:05:00.000+0000'
AND timestamp = '2011-03-02 04:05:00+0000'
AND timestamp = '2011-03-02 04:05:00'
The additional 000 digits that appear in the table output are just a display artifact of cqlsh; the stored values still have millisecond precision.
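For example, the failing DELETE from the question succeeds once the fractional part is trimmed to three digits (or dropped):
DELETE FROM callers WHERE phone = '6978311845' AND timestamp = '2011-03-02 04:05:00.000+0000';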

Related

How can we filter rows based on timestamp column?

I have a Cassandra column of type date whose values look like timestamps, as below. How can we filter rows where this column holds a date greater than today's date?
Example:
Type: date
Timestamp: 2021-06-29 11:53:52 +00:00
TTL: null
Value: 2021-03-16T00:00:00.000+0000
I was able to filter rows using columname <= '2021-09-25', which gives ten rows, some of them having dates on Sep 23 and 24. When I filter using columname < '2021-09-24', I get an error like the one below:
An error occurred on line 1 (use Ctrl-L to toggle line numbers):
Cassandra failure during read query at consistency ONE (1 responses were required but only 0 replica responded, 1 failed)
The CQL timestamp data type is encoded as the number of milliseconds since Unix epoch (Jan 1, 1970 00:00 GMT) so you need to be precise when you're working with timestamps.
Depending on where you're running the query, the filter could be translated in the local timezone. Let me illustrate with this example table:
CREATE TABLE community.tstamptbl (
    id int,
    tstamp timestamp,
    PRIMARY KEY (id, tstamp)
);
These 2 statements may appear similar but translate to 2 different entries:
INSERT INTO tstamptbl (id, tstamp) VALUES (5, '2021-08-09');
INSERT INTO tstamptbl (id, tstamp) VALUES (5, '2021-08-09 +0000');
The first statement creates an entry with a timestamp at midnight in my local timezone (Melbourne, Australia, which is UTC+10 in August), while the second statement creates an entry at midnight UTC (+0000):
cqlsh:community> SELECT * FROM tstamptbl WHERE id = 5;
id | tstamp
----+---------------------------------
5 | 2021-08-08 14:00:00.000000+0000
5 | 2021-08-09 00:00:00.000000+0000
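You can verify the underlying millisecond encoding with the blob cast; since Melbourne is UTC+10 in August, the two values are exactly 36,000,000 ms (10 hours) apart. The output below is what those two inserts should produce:
SELECT tstamp, blobAsBigint(timestampAsBlob(tstamp)) FROM tstamptbl WHERE id = 5;
 tstamp                          | system.blobasbigint(system.timestampasblob(tstamp))
---------------------------------+-----------------------------------------------------
 2021-08-08 14:00:00.000000+0000 |                                        1628431200000
 2021-08-09 00:00:00.000000+0000 |                                        1628467200000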
Similarly, you need to be precise when reading the data. You need to specify the timezone to remove ambiguity. Here are some examples:
SELECT * FROM tstamptbl WHERE id = 5 AND tstamp < '2021-08-09 +0000';
SELECT * FROM tstamptbl WHERE id = 1 AND tstamp < '2021-08-10 12:00+0000';
SELECT * FROM tstamptbl WHERE id = 1 AND tstamp < '2021-08-10 12:34:56+0000';
In the second part of your question, the error isn't directly related to your filter. The problem is that the replica(s) failed to respond for whatever reason (e.g. unresponsive/overloaded, down, etc). You need to investigate that issue separately. Cheers!
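If it helps, a quick first check on replica health uses the standard Cassandra tooling (run on a node; output omitted here):
nodetool status
nodetool tablestats <keyspace>.<table>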

Date selection query is not working as expected in Cassandra

I have a domain class and below is the declaration:
class Emp {
    static mapWith = "cassandra"
    String name
    Date doj
}
data:
id | name | doj
---+------+-----------
 1 | X    | 01-01-2010
 2 | Y    | 01-20-2012
Cassandra query:
select * from emp_schema.emp where doj='01-01-2010';
Error:
code=2200 [Invalid query] message="Unable to coerce '01-01-2010' to a formatted date (long)"
The format to query dates in Cassandra is yyyy-mm-dd:
select * from emp_schema.emp where doj='2010-01-01';
Carlos is correct in that Cassandra requires dates to be formatted as yyyy-mm-dd.
But this query will only work if doj is your partition key. If your PRIMARY KEY is not set up with doj as the partition key, the query is not possible.
I would specifically design your table to suit your query. This definition partitions on doj and clusters on id for uniqueness, since multiple emp[loyee]s can probably share the same doj:
create table emp_by_doj (
    doj date,
    id int,
    name text,
    primary key (doj, id)
);
Then you can query by a specific date, and have multiple rows returned for it:
> SELECT * FROM emp_by_doj WHERE doj='2017-06-01';
doj | id | name
------------+------+-------
2017-06-01 | 7721 | Sarah
2017-06-01 | 8122 | Sam
(2 rows)
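One caveat with this design, as a general Cassandra rule: only equality and IN relations are allowed on a partition key, so a date-range query over doj would be rejected and needs either a different model or ALLOW FILTERING (with its usual cost):
SELECT * FROM emp_by_doj WHERE doj > '2017-06-01';  -- rejected: range relation on a partition key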

Problem inserting values in microseconds

I am inserting times using the timestamp datatype, but it cannot give me values in microseconds. Is it possible to insert values with microseconds, or does it accept values only up to milliseconds?
cqlsh:cassandradb> select * from time_stamp;
id | name | t
----+---------+---------------------------------
1 | deepank | 2015-02-16 06:30:24.000000+0000
2 | arun | 2016-02-16 06:35:24.483000+0000
(2 rows)
This is what I inserted, but it isn't the value I expected. I simply inserted values with millisecond precision; when I try to insert microseconds, I get the errors shown further below.
cqlsh:cassandradb> select id, name, blobAsBigint(timestampAsBlob(t)) from time_stamp;
id | name | system.blobasbigint(system.timestampasblob(t))
----+---------+------------------------------------------------
1 | deepank | 1424068224000
2 | arun | 1455604524483
(2 rows)
Here I used the blob cast, but I don't think it got me any further.
cqlsh:cassandradb> insert into time_stamp(id,name, t)values(3,'tarun','2015-02-16 06:30:2.84325');
InvalidRequest: Error from server: code=2200 [Invalid query] message="Unable to coerce '2015-02-16 06:30:2.84325' to a formatted date (long)"
cqlsh:cassandradb> insert into time_stamp(id,name, t)values(3,'tarun','2015-02-16 06:30:2.8432');
InvalidRequest: Error from server: code=2200 [Invalid query] message="Unable to coerce '2015-02-16 06:30:2.8432' to a formatted date (long)"
cqlsh:cassandradb> insert into time_stamp(id,name, t)values(3,'tarun','2015-02-16 06:30:2.842');
The last value is accepted. I expected it to insert the value with microseconds, so that the output would show six meaningful digits.
Timestamps in Cassandra have millisecond resolution, and you should use at most 3 digits after the . in a timestamp string. Microsecond resolution is used only for the write time of the record.
For some reason unknown to me, cqlsh outputs data with microsecond resolution, and IMHO that's a big UX problem with it...
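For completeness, the one place microseconds do survive is the cell write time, which you can inspect with WRITETIME (assuming id is the partition key of time_stamp; the value shown is illustrative):
SELECT id, name, WRITETIME(name) FROM time_stamp WHERE id = 2;
-- writetime(name) is microseconds since epoch, e.g. 1455604524483000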

Cassandra - EQ relation doesn't work on timestamp primary key

I have the following table.
CREATE TABLE experiment(
    id uuid,
    country text,
    data text,
    insert_timestamp timestamp,
    PRIMARY KEY(insert_timestamp)
);
I insert data via
INSERT INTO experiment(id, country, data, insert_timestamp) VALUES (uuid(), 'my', 'the data', dateof(now()));
When I
SELECT * from experiment;
I get
insert_timestamp | country | data | id
--------------------------+---------+----------+--------------------------------------
2016-03-03 03:04:36+0000 | my | the data | e08cddd2-b93d-4e39-b0f3-82b813f83a87
But, if I SELECT via insert_timestamp
SELECT * from experiment WHERE insert_timestamp = '2016-03-03 03:04:36+0000';
I get an empty result:
insert_timestamp | country | data | id
------------------+---------+------+----
(0 rows)
Any idea why it is so?
Quoting the CQL documentation on the timestamp type:
A timestamp. String constants are allowed for inputting timestamps as dates; see Working with dates for more information. Datestamps with format YYYY-MM-DD HH:MM:SS.SSS are returned.
So when you query the data using '2016-03-03 03:04:36+0000', it is interpreted as 2016-03-03 03:04:36.000+0000, which may not match the milliseconds actually stored when you inserted the data. Hence it returns 0 rows.
Note: The date format shown in the CQL shell is configured in the [ui] section of the cqlshrc file.
Also, the dateOf function is deprecated in favour of toTimestamp. And given your data model, if multiple threads write data at the same time with identical timestamps, they will overwrite each other's data.
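To recover the exact stored value so an equality query can match, the blob cast used elsewhere on this page works here too (the trailing millisecond digits below are hypothetical):
SELECT insert_timestamp, blobAsBigint(timestampAsBlob(insert_timestamp)) FROM experiment;
-- e.g. 1456974276123: the hidden milliseconds become visible
SELECT * FROM experiment WHERE insert_timestamp = 1456974276123;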

How to delete a record in Cassandra?

I have a table like this:
CREATE TABLE mytable (
    user_id int,
    device_id ascii,
    record_time timestamp,
    timestamp timeuuid,
    info_1 text,
    info_2 int,
    PRIMARY KEY (user_id, device_id, record_time, timestamp)
);
When I ask Cassandra to delete a record (an entry in the columnfamily) like this:
DELETE from mytable where user_id = X and device_id = Y and record_time = Z and timestamp = XX;
it returns without an error, but when I query again the record is still there. Now if I try to delete a whole partition like this:
DELETE from mytable where user_id = X
it works and removes every entry for that user_id, and querying again immediately returns no more data from it.
What am I doing wrong? How can you remove a single record in Cassandra?
Thanks
Ok, here is my theory as to what is going on. You have to be careful with timestamps, because they store data down to the millisecond but display it only to the second. Take this sample table for example:
aploetz#cqlsh:stackoverflow> SELECT id, datetime FROM data;
id | datetime
--------+--------------------------
B25881 | 2015-02-16 12:00:03-0600
B26354 | 2015-02-16 12:00:03-0600
(2 rows)
The datetimes (of type timestamp) are equal, right? Nope:
aploetz#cqlsh:stackoverflow> SELECT id, blobAsBigint(timestampAsBlob(datetime)),
datetime FROM data;
id | blobAsBigint(timestampAsBlob(datetime)) | datetime
--------+-----------------------------------------+--------------------------
B25881 | 1424109603000 | 2015-02-16 12:00:03-0600
B26354 | 1424109603234 | 2015-02-16 12:00:03-0600
(2 rows)
As you are finding out, this becomes problematic when you use timestamps as part of your PRIMARY KEY. It is possible that your timestamp is storing more precision than it is showing you, and thus you will need to provide that hidden precision to succeed in deleting that single row.
Anyway, you have a couple of options here. One, find a way to ensure that you are not entering more precision than necessary into your record_time. Or, you could define record_time as a timeuuid.
Again, it's a theory. I could be totally wrong, but I have seen people do this a few times. Usually it happens when they insert timestamp data using dateof(now()) like this:
INSERT INTO table (key, time, data) VALUES (1,dateof(now()),'blah blah');
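One way to apply the first option (a sketch; the literal values are illustrative): write a timestamp you control, truncated to whole seconds, instead of dateof(now()). A DELETE with the same literal then matches exactly, and a clustering-prefix delete does not need the trailing timeuuid column:
INSERT INTO mytable (user_id, device_id, record_time, timestamp)
VALUES (1, 'dev-1', '2015-02-16 12:00:03+0000', now());
-- record_time now has no hidden sub-second digits, so this matches:
DELETE FROM mytable WHERE user_id = 1 AND device_id = 'dev-1' AND record_time = '2015-02-16 12:00:03+0000';
The next answer walks the same effect end to end with a worker_login_table example.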
CREATE TABLE worker_login_table (
    worker_id text,
    logged_in_time timestamp,
    PRIMARY KEY (worker_id, logged_in_time)
);
INSERT INTO worker_login_table (worker_id, logged_in_time)
VALUES ('worker_1', toTimestamp(now()));
After one hour, the same insert statement was executed again:
select * from worker_login_table;
 worker_id | logged_in_time
-----------+--------------------------
 worker_1  | 2019-10-23 12:00:03+0000
 worker_1  | 2019-10-23 13:00:03+0000
(2 rows)
Query the table to get the absolute timestamps:
select worker_id, blobAsBigint(timestampAsBlob(logged_in_time)), logged_in_time from worker_login_table;
 worker_id | blobAsBigint(timestampAsBlob(logged_in_time)) | logged_in_time
-----------+------------------------------------------------+--------------------------
 worker_1  |                                  1571832003147 | 2019-10-23 12:00:03+0000
 worker_1  |                                  1571835603234 | 2019-10-23 13:00:03+0000
(2 rows)
The command below will not delete the entry, because the string matches only to the second while the stored timestamp carries extra milliseconds (here .147):
DELETE from worker_login_table where worker_id='worker_1' and logged_in_time ='2019-10-23 12:00:03+0000';
Using the exact millisecond value recovered via the blob cast, the entry can be deleted:
DELETE from worker_login_table where worker_id='worker_1' and logged_in_time = 1571832003147;
