How to get current timestamp with CQL while using Command Line? - cassandra

I am trying to insert into my CQL table from the command line, and I am able to insert everything. But if I have a timestamp column, how can I insert into it from the command line? Basically, I want to insert the current timestamp whenever I insert into my CQL table.
Currently, I am hardcoding the timestamp whenever I insert into the CQL table below:
CREATE TABLE TEST (ID TEXT, NAME TEXT, VALUE TEXT, LAST_MODIFIED_DATE TIMESTAMP, PRIMARY KEY (ID));
INSERT INTO TEST (ID, NAME, VALUE, LAST_MODIFIED_DATE) VALUES ('1', 'elephant', 'SOME_VALUE', 1382655211694);
Is there any way to get the current timestamp using some predefined functions in CQL so that while inserting into above table, I can use that method to get the current timestamp and then insert into above table?

You can use the timeuuid functions now() and dateof() (or in later versions of Cassandra, toTimestamp()), e.g.,
INSERT INTO TEST (ID, NAME, VALUE, LAST_MODIFIED_DATE)
VALUES ('2', 'elephant', 'SOME_VALUE', dateof(now()));
The now function takes no arguments and generates a new unique timeuuid (at the time where the statement using it is executed). The dateOf function takes a timeuuid argument and extracts the embedded timestamp. (Taken from the CQL documentation on timeuuid functions).
Cassandra >= 2.2.0-rc2
dateof() was deprecated in Cassandra 2.2.0-rc2. For later versions you should replace its use with toTimestamp(), as follows:
INSERT INTO TEST (ID, NAME, VALUE, LAST_MODIFIED_DATE)
VALUES ('2', 'elephant', 'SOME_VALUE', toTimestamp(now()));

In newer versions of Cassandra you can use toTimestamp(now()); note that the dateof function is deprecated.
E.g.:
insert into dummy(id, name, size, create_date) values (1, 'Eric', 12, toTimestamp(now()));

There are actually two different ways to insert the current timestamp, for different purposes. From the docs:
Inserting the current timestamp
Use functions to insert the current date into date or timestamp fields as follows:
Current date and time into timestamp field: toTimestamp(now()) sets the timestamp to the current time of the coordinator.
Current date (midnight) into timestamp field: toTimestamp(toDate(now())) sets the timestamp to the current date beginning of day (midnight).

Related

Cassandra: Execution order of Insert statements in a Batch request

I have a table with the following schema:
CREATE TABLE IF NOT EXISTS data (
key TEXT,
created_at TIMEUUID,
value TEXT,
PRIMARY KEY (key, created_at)
) WITH CLUSTERING ORDER BY (created_at DESC);
Data is appended only and to get the value of a key, we retrieve the record with latest created_at using the query:
SELECT * FROM data WHERE key = ? ORDER BY created_at DESC LIMIT 1
In our application, we can insert multiple records for the same key using a batch statement as follows:
BEGIN BATCH
INSERT INTO data (key, created_at, value) VALUES ('MyKey', now(), 'value1');
INSERT INTO data (key, created_at, value) VALUES ('MyKey', now(), 'value2');
....
INSERT INTO data (key, created_at, value) VALUES ('MyKey', now(), 'value10');
APPLY BATCH;
What will be the latest record of this MyKey? Is the result deterministic?
I tested using values with random data and found that the value returned by the SELECT query is always the value of the last statement in the batch.
My assumption is that when the batch query is sent to the coordinator node, the statements are parsed by QueryProcessor in the order they appear in the batch query, and for each statement the native function now() is evaluated and a unique timeuuid is generated in increasing order.
now()
In the coordinator node, generates a new unique timeuuid in milliseconds when the statement is executed. The timestamp portion of the timeuuid conforms to the UTC (Universal Time) standard. This method is useful for inserting values. The value returned by now() is guaranteed to be unique.
Summary of findings regarding order of statements in a batch can be found in the comments.

Retrieve rows from last 24 hours

I have a table with the following (with other fields removed)
CREATE TABLE if NOT EXISTS request_audit (
user_id text,
request_body text,
lookup_timestamp TIMESTAMP,
PRIMARY KEY ((user_id), lookup_timestamp)
) WITH CLUSTERING ORDER BY ( lookup_timestamp DESC);
I create a record with the following
INSERT INTO request_audit (user_id, lookup_timestamp, request_body) VALUES (?, toTimestamp(now()), ?)
I am trying to retrieve all rows within the last 24 hours, but I am having trouble with the timestamp.
I have tried
SELECT * from request_audit WHERE user_id = '1234' AND lookup_timestamp > toTimestamp(now() - "1 day" )
and various other ways of trying to take a day away from the query.
Cassandra has very limited support for date operations. What you need is a custom function to do the date math.
This is inspired by the answer to "How to get Last 6 Month data comparing with timestamp column using cassandra query?"
You can write a UDF (user-defined function) for the date operation.
CREATE FUNCTION dateAdd(date timestamp, day int)
CALLED ON NULL INPUT
RETURNS timestamp
LANGUAGE java
AS
$$java.util.Calendar c = java.util.Calendar.getInstance();
c.setTime(date);
c.add(java.util.Calendar.DAY_OF_MONTH, day);
return c.getTime();$$ ;
Remember that you have to enable UDFs in the configuration file, cassandra.yaml, if that is possible in your environment:
enable_user_defined_functions: true
Once that is done, this query works:
SELECT * from request_audit WHERE user_id = '1234' AND lookup_timestamp > dateAdd(dateof(now()), -1)
You can't do it directly in CQL, as it doesn't support this kind of expression. If you're running this query from cqlsh, you can substitute the desired date using something like this:
date --date='-1 day' '+%F %T%z'
and use its output in the query.
If you're invoking this from your program, use the corresponding date/time library to get the date for -1 day; the details depend on the language that you're using.
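As a minimal sketch of the client-side approach, assuming a Python program: the helper name cutoff_literal is hypothetical, and the query text is simply built as a string here for illustration. With a real driver you would normally bind the cutoff as a parameter instead of formatting it into the statement.

```python
from datetime import datetime, timedelta, timezone


def cutoff_literal(now: datetime, hours: int = 24) -> str:
    """Return a timestamp string for `hours` before `now`, expressed in UTC."""
    cutoff = now.astimezone(timezone.utc) - timedelta(hours=hours)
    # Produces e.g. 2013-10-24 12:00:00+0000, a form CQL accepts for timestamps.
    return cutoff.strftime("%Y-%m-%d %H:%M:%S%z")


# Build the query for "the last 24 hours" relative to the current time.
literal = cutoff_literal(datetime.now(timezone.utc))
query = (
    "SELECT * FROM request_audit "
    f"WHERE user_id = '1234' AND lookup_timestamp > '{literal}'"
)
print(query)
```

The cutoff is computed on the client, so it follows the client's clock rather than the coordinator's.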

Inserting date using yesterday/tomorrow into Cassandra

I'm trying to insert a date into Cassandra based on the current date.
create table mobileTimeSeries (
deviceid text,
date date,
PRIMARY KEY(deviceid, date));
insert into mobileTimeSeries (deviceid, date) values ('test', toDate(now()));
That works, but I'm wondering if it's possible to do something like
insert into mobileTimeSeries (deviceid, date) values ('test', toDate(now()-1));
insert into mobileTimeSeries (deviceid, date) values ('test', toDate(now()+1));
I just get this error mismatched input '+' expecting ')' (... 'tablet',toDate(now()) [+]...)
Not sure if this is possible at all. Thanks
You can calculate the date in your app and insert it as a date literal instead of using now().
Since CASSANDRA-11936 (Cassandra 4.0+), you can also write expressions such as now() - 1d.
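For versions without that arithmetic, computing the date in the app could look roughly like this. This is a sketch assuming a Python client; insert_statement is a hypothetical helper that only builds the statement text, and a driver with bound parameters would be the safer choice in real code.

```python
from datetime import date, timedelta


def insert_statement(device_id: str, day: date) -> str:
    """Build the INSERT text with an explicit date instead of toDate(now())."""
    # isoformat() yields yyyy-mm-dd, the string form CQL uses for the date type.
    return (
        "insert into mobileTimeSeries (deviceid, date) "
        f"values ('{device_id}', '{day.isoformat()}');"
    )


today = date.today()
print(insert_statement("test", today - timedelta(days=1)))  # yesterday
print(insert_statement("test", today + timedelta(days=1)))  # tomorrow
```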

Is it possible to insert ddmmyyhh to text column based on now() value of timeuuid column

I'm referring to one of the presentation slides from eBay - http://www.slideshare.net/jaykumarpatel/cassandra-data-modeling-best-practices
I want to try out the same thing, so I created the following table.
CREATE TABLE ebay_event (
date text,
eventtype text,
time timeuuid,
payload text,
PRIMARY KEY((date, eventtype), time));
Then, in my PHP script, I perform the insert using the following statement.
insert into ebay_event(date, eventtype, time, payload) values('03031611', 'view', now(), 'additional data');
Instead of hard-coding the value '03031611', is there a way to tell Cassandra to generate ddmmyyhh based on the now() value of the timeuuid column?
No. There are no such functions available in Cassandra. You will have to create the value in the language you are using.
Values for the timestamp type are encoded as 64-bit signed integers
representing a number of milliseconds since the standard base time
known as the epoch: January 1 1970 at 00:00:00 GMT.
There are some functions available that can create a date in YYYY-mm-dd format; see "Date from timeuuid".
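Creating the ddmmyyhh value in the application could be sketched as follows. The question uses PHP; Python is used here only for illustration, hour_bucket is a hypothetical helper, and using UTC for the bucket is an assumption.

```python
from datetime import datetime, timezone


def hour_bucket(moment: datetime) -> str:
    """Format a moment as ddmmyyhh (day, month, two-digit year, hour) in UTC."""
    return moment.astimezone(timezone.utc).strftime("%d%m%y%H")


bucket = hour_bucket(datetime.now(timezone.utc))
statement = (
    "insert into ebay_event(date, eventtype, time, payload) "
    f"values('{bucket}', 'view', now(), 'additional data');"
)
print(statement)
```

Note that the bucket comes from the client clock while now() is evaluated on the coordinator, so the two can disagree for inserts made right at an hour boundary.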

Selecting timeuuid columns corresponding to a specific date

Short version: Is it possible to query for all timeuuid columns corresponding to a particular date?
More details:
I have a table defined as follows:
CREATE TABLE timetest(
key uuid,
activation_time timeuuid,
value text,
PRIMARY KEY(key,activation_time)
);
I have populated this with a single row, as follows (f0532ef0-2a15-11e3-b292-51843b245f21 is a timeuuid corresponding to the date 2013-09-30 22:19:06+0100):
insert into timetest (key, activation_time, value) VALUES (7daecb80-29b0-11e3-92ec-e291eb9d325e, f0532ef0-2a15-11e3-b292-51843b245f21, 'some value');
And I can query for that row as follows:
select activation_time,dateof(activation_time) from timetest where key=7daecb80-29b0-11e3-92ec-e291eb9d325e
which results in the following (using cqlsh)
 activation_time                      | dateof(activation_time)
--------------------------------------+--------------------------
 f0532ef0-2a15-11e3-b292-51843b245f21 | 2013-09-30 22:19:06+0100
Now let's assume there's a lot of data in my table and I want to retrieve all rows where activation_time corresponds to a particular date, say 2013-09-30 22:19:06+0100.
I would have expected to be able to query for the range of all timeuuids between minTimeuuid('2013-09-30 22:19:06+0100') and maxTimeuuid('2013-09-30 22:19:06+0100') but this doesn't seem possible (the following query returns zero rows):
select * from timetest where key=7daecb80-29b0-11e3-92ec-e291eb9d325e and activation_time>minTimeuuid('2013-09-30 22:19:06+0100') and activation_time<=maxTimeuuid('2013-09-30 22:19:06+0100');
It seems I need to use a hack whereby I increment the second date in my query (by a second) to catch the row(s), i.e.,
select * from timetest where key=7daecb80-29b0-11e3-92ec-e291eb9d325e and activation_time>minTimeuuid('2013-09-30 22:19:06+0100') and activation_time<=maxTimeuuid('2013-09-30 22:19:07+0100');
This feels wrong. Am I missing something? Is there a cleaner way to do this?
The CQL documentation discusses timeuuid functions but it's pretty short on gte/lte expressions with timeuuids, beyond:
The min/maxTimeuuid example selects all rows where the timeuuid column, t, is strictly later than 2013-01-01 00:05+0000 but strictly earlier than 2013-02-02 10:00+0000. The t >= maxTimeuuid('2013-01-01 00:05+0000') does not select a timeuuid generated exactly at 2013-01-01 00:05+0000 and is essentially equivalent to t > maxTimeuuid('2013-01-01 00:05+0000').
p.s. the following query also returns zero rows:
select * from timetest where key=7daecb80-29b0-11e3-92ec-e291eb9d325e and activation_time<=maxTimeuuid('2013-09-30 22:19:06+0100');
and the following query returns the row(s):
select * from timetest where key=7daecb80-29b0-11e3-92ec-e291eb9d325e and activation_time>minTimeuuid('2013-09-30 22:19:06+0100');
I'm sure the problem is that cqlsh does not display milliseconds for your timestamps.
So the real timestamp is something like '2013-09-30 22:19:06.123+0100'.
When you call maxTimeuuid('2013-09-30 22:19:06+0100'), the milliseconds are missing, so zero is assumed; it is the same as calling maxTimeuuid('2013-09-30 22:19:06.000+0100').
And since 22:19:06.123 > 22:19:06.000, the record is filtered out.
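One way to check this outside cqlsh is to decode the timestamp embedded in the timeuuid yourself. The sketch below uses Python's standard uuid module; timeuuid_to_datetime is a hypothetical helper name, and the offset constant is the usual count of 100-nanosecond intervals between the UUID epoch (1582-10-15) and the Unix epoch.

```python
import uuid
from datetime import datetime, timedelta, timezone

# 100-ns intervals between 1582-10-15 (UUID epoch) and 1970-01-01 (Unix epoch).
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def timeuuid_to_datetime(value: str) -> datetime:
    """Decode the timestamp of a version 1 UUID to millisecond precision (UTC)."""
    ticks = uuid.UUID(value).time  # 60-bit timestamp in 100-ns units
    unix_ms = (ticks - UUID_EPOCH_OFFSET) // 10_000
    # Integer milliseconds plus timedelta avoids floating-point rounding.
    return datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(milliseconds=unix_ms)


stamp = timeuuid_to_datetime("f0532ef0-2a15-11e3-b292-51843b245f21")
print(stamp.isoformat(timespec="milliseconds"))
```

If the printed value has a non-zero millisecond part, the row is later than the whole-second bound passed to maxTimeuuid, which is consistent with the explanation above.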
Not directly related to the question, but as an addition to @dimas's answer:
cqlsh (version 5.0.1) seems to show the milliseconds now
system.dateof(id)
---------------------------------
2016-06-03 02:42:09.990000+0000
2016-05-28 17:07:30.244000+0000

Resources