re-format timestamp to keep time and time zone - cassandra

I want to create a table with date and timezone in different columns.
For example:
Date 20170311 Time 10:32:24+1300
The format has to be the same as above.
When I created the table, Date was set as type date and Time as type timestamp.
When I insert the date, I have to follow a certain format like 2017-03-11. How can I make it look the same as in the example above?
When inserting the time and time zone, I have to insert the date along with it, like '2017-03-22T10:37:50+1300'. Is there any way that I can reformat it?
After inserting with the format '2017-03-22T10:37:50+1300', the time and time zone changed in the table. How can I keep them the same as the input?
CREATE TABLE example (id int, work_date date, sequence timestamp);
INSERT INTO example (id, work_date, sequence) VALUES (1, '2017-03-22', '2017-03-22T10:37:50+1300');
expected result:
1 20170322 10:37:50+1300
actual result:
1 2017-03-22 2017-03-21 21:37:50.000000+0000

Cassandra has several data types related to date and time: date, time, and timestamp. Only the last one has any notion of a time zone.
Formatting timestamps is your responsibility. Internally a timestamp is stored as a long (8 bytes) holding the number of milliseconds since the epoch, so the offset you supplied on insert is not kept, and the value is converted into a textual representation by the corresponding driver. In the case of cqlsh, the formatting is controlled by the datetimeformat parameter. Similarly, values of the date and time data types are kept as numbers inside the database, not as strings.
If you're accessing the data from your own program, then you can format time as you want.
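As a rough illustration of formatting on the client side, here is a minimal Python sketch using only the standard library. It assumes the application keeps the original UTC offset itself (for example in a separate column), since the timestamp only holds milliseconds since the epoch. The function name format_parts and the offset_minutes parameter are made up for this example and are not part of any driver.

```python
from datetime import datetime, timedelta, timezone

def format_parts(epoch_millis, offset_minutes):
    """Render a stored timestamp as (date, time+offset) strings."""
    zone = timezone(timedelta(minutes=offset_minutes))
    local = datetime.fromtimestamp(epoch_millis / 1000, tz=zone)
    return local.strftime("%Y%m%d"), local.strftime("%H:%M:%S%z")

# The value cqlsh displayed in the question: 2017-03-21 21:37:50 UTC.
stored = datetime(2017, 3, 21, 21, 37, 50, tzinfo=timezone.utc)
epoch_millis = int(stored.timestamp() * 1000)

# +13:00 is assumed to be remembered by the application, not by Cassandra.
date_part, time_part = format_parts(epoch_millis, 13 * 60)
print(date_part, time_part)  # 20170322 10:37:50+1300
```

The same instant is stored either way; only the rendering changes with the offset you pass in.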

Related

Cassandra parsing date

I have a table created using following script:
CREATE TABLE "TestTable2" (
id uuid,
timestamp timestamp,
msg text,
priority int,
source text,
PRIMARY KEY (id, timestamp)
);
Now I'm inserting one row:
INSERT INTO "TestTable2" (id, timestamp, msg, source) values (uuid(), '2002-03-31 02:36:10', 'asdas dasdasd', 'system1');
and I get an error:
Unable to execute CQL script on 'UdcCluster':Unable to coerce '2002-03-31 02:36:10' to a formatted date (long)
If I change the day of month to 30th or hour to 22 the statement is successfully executed.
Can you please explain to me what is wrong with the date?
PS.
The same error repeats for '1998-03-29 02:12:13', '1987-03-29 02:55:21' and '1984-03-25 02:45:25'. In all cases it's 2 am at the end of March...
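Each of those timestamps falls in the hour that is skipped when clocks go forward for daylight saving time (in Central European time, for example, 02:00 on the last Sunday of March jumps straight to 03:00), so it does not exist as a local time in such a zone. You're trying to get from a specific local time to a DateTime instance, and you want that to be robust against daylight saving.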
Specify the time zone in the pattern: yyyy-mm-dd HH:mm:ssZ
where Z is the RFC-822 4-digit time zone, expressing the time zone's
difference from UTC. For example, for the date and time of Jan 2,
2003, at 04:05:00 AM, GMT:
2003-01-02 04:05:00+0000
If no time zone is specified, the time zone of the Cassandra
coordinator node handling the write request is used. For accuracy,
DataStax recommends specifying the time zone rather than relying on
the time zone configured on the Cassandra nodes.
https://docs.datastax.com/en/cql/3.1/cql/cql_reference/timestamp_type_r.html
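A small Python sketch of the advice above: build the timestamp literal with an explicit UTC offset so that the coordinator's local time zone never comes into play. The helper name cql_timestamp_literal is hypothetical, and the offset of 0 hours is only an assumption for the example; use whatever offset your data was actually recorded in.

```python
from datetime import datetime, timedelta, timezone

def cql_timestamp_literal(wall_clock, utc_offset_hours):
    """Format a naive wall-clock time with an explicit UTC offset."""
    zone = timezone(timedelta(hours=utc_offset_hours))
    return wall_clock.replace(tzinfo=zone).strftime("%Y-%m-%d %H:%M:%S%z")

# The value from the question, here assumed to be a UTC reading.
literal = cql_timestamp_literal(datetime(2002, 3, 31, 2, 36, 10), 0)
print(literal)  # 2002-03-31 02:36:10+0000
```

The resulting string can be used in place of '2002-03-31 02:36:10' in the INSERT statement.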

Oracle: How to convert a string into date format

I did a lot of searching before asking this question, like How to convert a string into date format and How to convert a string into date, but I still can't figure it out.
So, here's the question: how do I convert these string dates into dates in Oracle?
"2016-08-15 10:45:30" (String type) -> 20160815 (Date type)
"20160815104530" (String type) -> 20160815 (Date type)
Any idea will be appreciated.
You are mixing two things here:
The first is the conversion of a string data type (in Oracle, VARCHAR2) into a DATE data type.
The DATE data type has a precision of seconds, and you can't change that. A DATE will always carry both the date and the time component, i.e. year, month, day, hours, minutes and seconds: Oracle SQL Data Type Documentation
However, the second part of what you are asking is about how to format the date when it is retrieved. This is helpful when running reports, or for other kinds of visual display of dates. For example, in the US you would most likely want your date columns to appear in the format MM/DD/YYYY, while in many other countries DD/MM/YYYY is more common. Oracle lets you do that by telling it which NLS_DATE_FORMAT to use. You can set that parameter for an individual session as well as at the database level; it is up to you (and your DBA) to decide where and when to set it. In your case you can apply it via the ALTER SESSION command:
SQL> ALTER SESSION SET nls_date_format='YYYY-MM-DD';
Session altered.
SQL> SELECT TO_DATE('2016-08-15 10:45:30', 'YYYY-MM-DD HH24:MI:SS') FROM DUAL;
TO_DATE(
----------
2016-08-15
SQL> SELECT TO_DATE('20160815104530', 'YYYYMMDDHH24MISS') FROM DUAL;
TO_DATE(
----------
2016-08-15
You use to_date():
SELECT TO_DATE(SUBSTR('2016-08-15 10:45:30', 1, 10), 'YYYY-MM-DD') FROM DUAL;
SELECT TO_DATE(SUBSTR('20160815104530', 1, 8), 'YYYYMMDD') FROM DUAL;
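The split between parsing a string into a date value and choosing how to display it is not specific to Oracle. As a hedged illustration, here is the same two-step idea in Python with the standard library; the function name parse_to_date and the list of accepted formats are assumptions made for this example.

```python
from datetime import datetime

# Input layouts taken from the two strings in the question.
INPUT_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y%m%d%H%M%S")

def parse_to_date(text):
    """Step 1: turn a string into a real date value (time part dropped)."""
    for fmt in INPUT_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date string: {text!r}")

first = parse_to_date("2016-08-15 10:45:30")
second = parse_to_date("20160815104530")

# Step 2: formatting is a separate decision made at display time.
print(first.strftime("%Y%m%d"), second.strftime("%Y%m%d"))  # 20160815 20160815
```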

Timeseries data modelling in cassandra

I am trying to store & retrieve data in cassandra in the following way:
Storing Data:
I created the table in the following way:
CREATE TABLE mydata (
myKey TEXT,
datetime TIMESTAMP,
value TEXT,
PRIMARY KEY (myKey,datetime)
);
where I would store a value for every minute for the last 5 years. That is 1440 * 365 * 5 = 2628000 records/columns per row (with myKey as the row key).
INSERT INTO mydata(myKey, datetime, value) VALUES ('1234ABCD','2013-04-03 07:01:00','72F');
INSERT INTO mydata(myKey, datetime, value) VALUES ('1234ABCD','2013-04-03 07:02:00','72F');
INSERT INTO mydata(myKey, datetime, value) VALUES ('1234ABCD','2013-04-03 07:03:00','72F');
.................
I am able to store the data and everything works. However, I would like to know whether storing data horizontally like this (2628000 values per key, for 1 million such keys altogether) is efficient.
Retrieving Data:
After storing the data in the above format, I am able to select data for a period using a simple select query.
Ex:
SELECT *
FROM mydata
WHERE myKey='1234ABCD' AND datetime > '2013-04-03 07:01:00' AND datetime < '2013-04-03 07:04:00';
The query works fine and I get the result as expected.
However, my question is:
How can I select only the values at certain intervals? For example, if I query data for a day, I get 1440 values (1 for every minute). I would like to get the value at every 10th minute, limiting the number of values to 144.
Is there a way to query the table like this if we use the above storage strategy?
If not, what are the possible options to meet my requirement of querying data at a specific interval like 1-min, 10-min, 1-hour, 1-day etc.?
Appreciate any other suggestions.
No, this is not a good approach. In future you will run into problems, because a single row key can hold at most 2 billion records (columns). After that it will not give an error, but it will still store the data.
For your problem, split the timestamp column into year, month, day and time,
like 2016, 04, 04 and 15:03:00. Also put year, month and day into the partition key.
You definitely need to bound your partition with a modular version of the timestamp. But the granularity really depends on your reads.
If you are mainly going to read per day then use something like this PK((myKey, yyyymmdd), time)
If mainly by weeks PK((mykey, yyyyww), time), or month...
The catch is reading values for a whole year: for that you are better off with a partition per week or month. Even a partition per year would do, I think, if you don't do any deletes. Either way, your partition size needs to stay smaller than 100MB.
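To make the bucketing idea concrete, here is a minimal Python sketch of how an application might derive the yyyymmdd or yyyyww part of the partition key, plus one possible way to thin a day's rows to every 10th minute on the client. All function names are hypothetical, the week bucket assumes ISO week numbering, and the client-side filter is just one option, not something the answers above prescribe.

```python
from datetime import datetime, timedelta

def day_bucket(ts):
    """Bucket value for a PK((myKey, yyyymmdd), time) layout."""
    return ts.strftime("%Y%m%d")

def week_bucket(ts):
    """Bucket value for a PK((myKey, yyyyww), time) layout (ISO weeks)."""
    iso_year, iso_week, _ = ts.isocalendar()
    return f"{iso_year}{iso_week:02d}"

def every_nth_minute(rows, n):
    """Keep rows whose minute-of-hour is a multiple of n (n should divide 60)."""
    return [(ts, value) for ts, value in rows if ts.minute % n == 0]

# One day of per-minute readings, standing in for a query result.
start = datetime(2013, 4, 3)
day_rows = [(start + timedelta(minutes=i), "72F") for i in range(1440)]

thinned = every_nth_minute(day_rows, 10)
print(day_bucket(start), week_bucket(start), len(thinned))  # 20130403 201314 144
```

With a day bucket, each partition holds at most 1440 rows per key, so reading a day is a single-partition query and the thinning happens after the read.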

How do I query for a specific day such as yesterday in Core Data?

In plain SQL (in my case: sqlite), I would be able to query for a specific date in a DATE column as follows:
SELECT * FROM table WHERE date(dateColumn) = '2015-01-01'
This works because the date() function cuts off the time part of the DATE value.
Can I do something similar with Predicates in Core Data? Or do I have to use something where I determine the start and end of that day and then look for dates between the two date values?
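The second approach mentioned in the question, computing the start and end of the day and matching everything in between, can be sketched independently of Core Data. The Python below is only an illustration of the half-open range idea using naive datetimes; the function names are made up, and time zones and daylight saving are deliberately ignored.

```python
from datetime import date, datetime, time, timedelta

def day_range(day):
    """Return [start, end) covering one calendar day."""
    start = datetime.combine(day, time.min)
    return start, start + timedelta(days=1)

def yesterday_range(today):
    """Same range for the day before the given date."""
    return day_range(today - timedelta(days=1))

start, end = day_range(date(2015, 1, 1))
print(start, end)  # 2015-01-01 00:00:00 2015-01-02 00:00:00
```

A stored value then matches the day when it is greater than or equal to start and strictly less than end, which avoids having to pick a "last moment of the day".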

Duplicate timestamps in timeseries - Cassandra

I am going to use cassandra to store activity logs. I have something like this
CREATE TABLE general_actionlog (
date text,
time text,
date_added timestamp,
action text,
PRIMARY KEY ((date,time),date_added)
);
I want to store all the activity in an hour in a single row (= a time series; "time" is only the hour of the day in the format H:00:00, ignoring minutes and seconds, so I have a row for each Y-m-d H:00:00).
The problem appears when two actions happen in the same timestamp (ex. two page views in the same second), so the second one overwrites the first one.
How can I solve this in a way that I still can query using slices?
Thanks
marc
You want to use timeuuid instead of timestamp for the date_added column. A timeuuid is a v1 UUID. It has a timestamp component (and is sorted by the timestamp), so it effectively provides a conflict-free timestamp.
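To see why a v1 UUID avoids the overwrite, here is a small Python sketch using the standard uuid module. It generates two v1 UUIDs back to back and recovers the embedded timestamp from one of them. The helper uuid1_datetime is written for this example; in practice a driver or Cassandra itself would normally generate the timeuuid for you.

```python
import uuid
from datetime import datetime, timedelta, timezone

# v1 UUID timestamps count 100-nanosecond intervals since this date.
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)

def uuid1_datetime(value):
    """Recover the timestamp embedded in a version 1 UUID."""
    return GREGORIAN_EPOCH + timedelta(microseconds=value.time // 10)

# Two events generated one right after the other.
first = uuid.uuid1()
second = uuid.uuid1()

print(first.version, first != second)
print(uuid1_datetime(first).isoformat())
```

Both values carry (almost) the same time, yet they are distinct keys, so the second event no longer replaces the first.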
