Add Column in Apache Cassandra

Add Column in Apache Cassandra - node.js

How can I check from node.js whether a column exists in Apache Cassandra?
I need to add a column only if it does not already exist.
I have read that I must run a SELECT first, but selecting a column that does not exist returns an error.

Note that if you're on Cassandra 3.x and up, you'll want to query from the columns table on the system_schema keyspace:
aploetz@cqlsh:system_schema> SELECT * FROM system_schema.columns
WHERE keyspace_name='stackoverflow'
AND table_name='vehicle_information'
AND column_name='name';
 keyspace_name | table_name          | column_name | clustering_order | column_name_bytes | kind    | position | type
---------------+---------------------+-------------+------------------+-------------------+---------+----------+------
 stackoverflow | vehicle_information | name        | none             | 0x6e616d65        | regular | -1       | text
(1 rows)

On Cassandra 2.x, you can check whether a column exists with a SELECT query on the system.schema_columns table.
Suppose you have the table test_table in keyspace test, and you want to check whether the column test_column exists.
Use the query below:
SELECT * FROM system.schema_columns WHERE keyspace_name = 'test' AND columnfamily_name = 'test_table' AND column_name = 'test_column';
If the query returns a row, the column exists; otherwise it does not.
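From node.js, the check above can be wrapped in a small helper. The following is only a sketch under stated assumptions: it assumes the DataStax cassandra-driver package, whose Client.execute(query, params, options) returns a promise for a result set with a rowLength property, and it queries system_schema.columns, so it targets Cassandra 3.x and up (on 2.x, system.schema_columns and columnfamily_name would be used instead). The helper names columnExists and addColumnIfMissing are hypothetical.

```javascript
// Sketch: `client` is assumed to be a connected cassandra-driver Client,
// or any object exposing a compatible execute(query, params, options).
const COLUMN_QUERY =
  'SELECT column_name FROM system_schema.columns ' +
  'WHERE keyspace_name = ? AND table_name = ? AND column_name = ?';

// Resolves to true when the schema table lists the column.
// Unquoted identifiers are stored in lower case, so pass lower-case names.
async function columnExists(client, keyspace, table, column) {
  const result = await client.execute(
    COLUMN_QUERY, [keyspace, table, column], { prepare: true });
  return result.rowLength > 0;
}

// Identifiers cannot be bound as parameters, so validate them before
// concatenating them into the ALTER TABLE statement.
const IDENTIFIER = /^[A-Za-z][A-Za-z0-9_]*$/;

// Adds the column only when it is missing; resolves to true if it was added.
// `cqlType` (for example 'text') is trusted input in this sketch.
async function addColumnIfMissing(client, keyspace, table, column, cqlType) {
  for (const name of [keyspace, table, column]) {
    if (!IDENTIFIER.test(name)) throw new Error(`Invalid identifier: ${name}`);
  }
  if (await columnExists(client, keyspace, table, column)) return false;
  await client.execute(
    `ALTER TABLE ${keyspace}.${table} ADD ${column} ${cqlType}`);
  return true;
}
```

A call such as addColumnIfMissing(client, 'test', 'test_table', 'test_column', 'text') would then issue the ALTER TABLE only when the lookup finds no row. The check and the ALTER are two separate requests, so two clients running this at the same time could still race.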

Related

SparkSQL Column Query not showing column contents?

I have created a persistent table via df.saveAsTable.
When I run the following query I receive these results:
spark.sql("""SELECT * FROM mytable """).show()
I get a view of the DataFrame with all of its columns and all of the data.
However, when I run
spark.sql("""SELECT 'NameDisplay' FROM mytable """).show()
I receive results that look like this:
| NameDisplay|
|--|
| NameDisplay |
| NameDisplay |
| NameDisplay |
| NameDisplay |
| NameDisplay |
| NameDisplay |
NameDisplay is definitely one of the columns in the table, as it is shown when I run select *. Why are its contents not shown in the second query?

The issue was the single quotes around the column name. The name needs to be escaped with backticks instead: `NameDisplay`

Selecting 'NameDisplay' in SQL selects the literal text "NameDisplay", so the results you got are in fact valid.
To select the values of the NameDisplay column, you must issue:
"SELECT NameDisplay FROM mytable "
Or, if you need to quote it (for example because the column name contains spaces or is case-sensitive):
"""SELECT `NameDisplay` FROM mytable"""
Single quotes denote a string literal in SQL generally; this is not specific to Spark.

Delete Rows using timestamp column cassandra

I want to delete the data between two timestamps from my table.
CREATE TABLE propatterns_test.test (
clientId text,
meterId text,
meterreading text,
date timestamp,
PRIMARY KEY (meterId, date) );
My delete query is:
DELETE FROM test WHERE meterid = 'M5' AND date > '2016-12-27 10:00:00+0000';
Which returned this error :
InvalidRequest: Error from server: code=2200 [Invalid query]
message="Invalid operator < for PRIMARY KEY part date"
After that I tried to delete a specific row :
DELETE FROM test WHERE meterid = 'M5' AND date = '2016-12-27 09:42:30+0000';
The table contains a row with that timestamp, but it was not deleted.
This is what my data looks like:
 meterid | date                     | clientid | meterreading
---------+--------------------------+----------+--------------
 M5      | 2016-12-27 09:42:30+0000 | RDS      | 35417.8
 M5      | 2016-12-27 09:42:44+0000 | RDS      | 35417.8
 M5      | 2016-12-27 09:47:20+0000 | RDS      | 35417.8
 M5      | 2016-12-27 09:47:33+0000 | RDS      | 35417.8
Nothing is deleted from the table. How can I delete rows within a timestamp range when the timestamp is part of the primary key?

I see a couple of things happening here. First of all, like iconnj mentioned, range deletes are not possible in versions prior to Cassandra 3.0.
Secondly, your single-row delete attempt is failing (I believe) because you are not accounting for the milliseconds present on the timestamp. You can see this if you nest your date column inside the timestampAsBlob and blobAsBigint functions:
aploetz@cqlsh:stackoverflow> SELECT meterid,date,blobAsBigint(timestampAsBlob(date))
FROM propatterns WHERE meterid='M5';
 meterid | date                     | system.blobasbigint(system.timestampasblob(date))
---------+--------------------------+---------------------------------------------------
 M5      | 2016-12-27 09:42:30+0000 | 1482831750000
 M5      | 2016-12-30 17:31:53+0000 | 1483119113231
 M5      | 2016-12-30 17:32:08+0000 | 1483119128812
(3 rows)
Note the zeros at the end of the 2016-12-27 09:42:30+0000 row, which I explicitly INSERTed from your example. The two rows I INSERTed using the nested dateof(now()) functions have the milliseconds as the last three digits of their timestamps.
Watch what happens when I take those three digits and add them as milliseconds when I delete one of the rows:
aploetz@cqlsh:stackoverflow> DELETE FROM propatterns WHERE meterid='M5'
AND date='2016-12-30 17:32:08.812+0000';
aploetz@cqlsh:stackoverflow> SELECT meterid,date,blobAsBigint(timestampAsBlob(date))
FROM propatterns WHERE meterid='M5';
 meterid | date                     | system.blobasbigint(system.timestampasblob(date))
---------+--------------------------+---------------------------------------------------
 M5      | 2016-12-27 09:42:30+0000 | 1482831750000
 M5      | 2016-12-30 17:31:53+0000 | 1483119113231
(2 rows)
In summary:
You cannot perform range deletes prior to Cassandra 3.0.
You cannot delete individual rows keyed by timestamps without specifying milliseconds, if milliseconds are indeed present.
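To go from the bigint shown by blobAsBigint(timestampAsBlob(date)) to a literal that keeps the milliseconds, the value can be formatted in the client. A minimal node.js sketch, assuming the value is epoch milliseconds as in the output above; the function name toCqlTimestampLiteral is hypothetical:

```javascript
// Formats epoch milliseconds (as shown by blobAsBigint(timestampAsBlob(col)))
// as a UTC CQL timestamp literal that keeps the millisecond part.
function toCqlTimestampLiteral(epochMillis) {
  const iso = new Date(epochMillis).toISOString(); // 2016-12-30T17:32:08.812Z
  return iso.replace('T', ' ').replace('Z', '+0000');
}

console.log(toCqlTimestampLiteral(1483119128812)); // 2016-12-30 17:32:08.812+0000
console.log(toCqlTimestampLiteral(1482831750000)); // 2016-12-27 09:42:30.000+0000
```

The first result is the literal used in the DELETE above; the second shows that the explicitly inserted row has no fractional part, which is why it can be matched without specifying milliseconds.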

Deleting with a range clause is possible from Cassandra (C*) 3.0 onwards. Judging by the error you got, I think you are on a pre-3.0 version, in which case you won't be able to do this via CQL.

In Cassandra 3 you can use the "...from Y using timestamp XXX where ..." command:
create table mytime (
location_id text,
tour_id text,
mytime timestamp,
PRIMARY KEY (location_id, tour_id));
INSERT INTO mytime (location_id, tour_id, mytime) values ('location1', '1', toTimeStamp(now()));
INSERT INTO mytime (location_id, tour_id, mytime) values ('location1', '2', toTimeStamp(now()));
Be aware: the value you need to use for the timestamp is in microseconds (as returned by WRITETIME), not milliseconds:
select location_id, mytime, blobAsBigint(mytime), WRITETIME(mytime) from mytime;
location_id |mytime |system.blobasbigint(mytime) |writetime(mytime) |
------------|------------------------|----------------------------|------------------|
location1 |2018-11-28-09.53.52.110 |1543395232110 |1543395232109517 |
location1 |2018-11-28-09.53.52.742 |1543395232742 |1543395232740055 |
So now you can do
delete from mytime using timestamp 1543395232109517 where location_id = 'location1';
This deletes the data written at or before write time 1543395232109517, leaving only the later entry:
select location_id, mytime, blobAsBigint(mytime), WRITETIME(mytime) from mytime;
location_id |mytime |system.blobasbigint(mytime) |writetime(mytime) |
------------|------------------------|----------------------------|------------------|
location1 |2018-11-28-09.53.52.742 |1543395232742 |1543395232740055 |

How to do negation for 'CONTAINS'

I have a Cassandra table with one column defined as a set.
How can I achieve something like this:
SELECT * FROM <table> WHERE <set_column_name> NOT CONTAINS <value>
A proper secondary index has already been created.

From the documentation:
SELECT select_expression FROM keyspace_name.table_name WHERE
relation AND relation ... ORDER BY ( clustering_column ( ASC | DESC
)...) LIMIT n ALLOW FILTERING
then later:
relation is:
column_name op term
and finally:
op is = | < | > | <= | >= | CONTAINS | CONTAINS KEY
So there is no native way to perform such a query. You have to work around it by designing a new table that specifically satisfies this query.
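Where redesigning the table is not practical and the result set is small, another workaround (not part of the answer above, and not a server-side feature) is to fetch the rows and apply the negation in the client. A node.js sketch, assuming each row exposes the set column as an array or a Set (the cassandra-driver returns CQL sets as arrays unless configured otherwise); the function name and the tags column are hypothetical:

```javascript
// Keeps only the rows whose set column does NOT contain `value`.
// A null/undefined column is treated as an empty set, since Cassandra
// returns null for an empty collection.
function rowsNotContaining(rows, setColumn, value) {
  return rows.filter((row) => {
    const members = row[setColumn];
    if (members == null) return true;
    return members instanceof Set
      ? !members.has(value)
      : !members.includes(value);
  });
}

const rows = [
  { id: 1, tags: ['a', 'b'] },
  { id: 2, tags: new Set(['c']) },
  { id: 3, tags: null },
];
console.log(rowsNotContaining(rows, 'tags', 'a').map((r) => r.id)); // [ 2, 3 ]
```

This reads every candidate row over the network, so it only suits small tables or queries already restricted to a single partition.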

Get cassandra tables creation date

How can I get the creation date and time of a cassandra table?
I tried to use cqlsh DESC TABLE but there is no information about the creation time stamp...

Depending on your version of Cassandra, you can check the schema tables. Each table gets a unique ID when it is created, and that ID gets written to the schema tables. If you query the WRITETIME of that ID, it should give you a UNIX timestamp (in microseconds) of when it was created.
Cassandra 2.2.x and down:
> SELECT keyspace_name, columnfamily_name, writetime(cf_id)
FROM system.schema_columnfamilies
WHERE keyspace_name='stackoverflow' AND columnfamily_name='book';
 keyspace_name | columnfamily_name | writetime(cf_id)
---------------+-------------------+------------------
 stackoverflow | book              | 1446047871412000
(1 rows)
Cassandra 3.0 and up:
> SELECT keyspace_name, table_name, writetime(id)
FROM system_schema.tables
WHERE keyspace_name='stackoverflow' AND table_name='book';
 keyspace_name | table_name | writetime(id)
---------------+------------+------------------
 stackoverflow | book       | 1442339779097000
(1 rows)
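Since WRITETIME returns microseconds, the value has to be scaled down before it can be read as a date. A small node.js sketch of that conversion (the function name writetimeToDate is hypothetical):

```javascript
// WRITETIME values are microseconds since the Unix epoch; JavaScript Dates
// use milliseconds, so divide by 1000. BigInt keeps the division exact
// whether the value arrives as a number, a string, or a driver Long.
function writetimeToDate(writetimeMicros) {
  const millis = BigInt(writetimeMicros.toString()) / 1000n;
  return new Date(Number(millis));
}

console.log(writetimeToDate('1442339779097000').toISOString());
// 2015-09-15T17:56:19.097Z
```

Applied to the Cassandra 3.0 output above, this gives a creation time of 2015-09-15 17:56:19 UTC for the book table.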

Cassandra - Delete Time Series Rows in cqlsh

I'm running Cassandra 2.0.11 and having difficulty deleting a row of time series data in cqlsh. Since I'm unable to use > or < in the WHERE clause of a DELETE statement, I'm assuming I need the exact time.
Schema:
CREATE TABLE account_data_by_user (
user_id int,
time timestamp,
account_id int,
account_desc text,
...
PRIMARY KEY ((user_id), time, account_id)
);
Row in question:
user_id | time                     | account_id      | account_desc     |
--------+--------------------------+-----------------+------------------+-
1       | 2015-02-20 08:51:55-0600 | 1               | null             |
Attempting:
DELETE
FROM account_data_by_user
WHERE user_id = 1 and time = '2015-02-20 08:51:55-0600' and account_id = 1
The above executes successfully, but the row is still there. I'm assuming the cqlsh output [time] is the problem.
I should note that I can delete a row like this through cqlengine.Model.delete, but I'm not sure what it's executing to accomplish the delete.

After much googling, I discovered the blob conversion functions from this JIRA issue: https://issues.apache.org/jira/browse/CASSANDRA-5870
Query:
SELECT user_id, account_id, blobasbigint(timestampasblob(time))
FROM account_data_by_user where user_id
Returns:
 user_id | account_id      | blobasbigint(timestampasblob(time))
---------+-----------------+-------------------------------------
 1       | 1               | 1424458973126
 1       | 184531          | 1423738054142
Using that exact millisecond value in the DELETE:
DELETE
FROM account_data_by_user
WHERE user_id = 1 and time = 1424458973126 and account_id = 1;
This successfully removed the desired row.
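From a driver, the same idea can be applied without copying the bigint by hand: read the row back and bind the timestamp value the driver returned. The sketch below assumes the node.js cassandra-driver, which maps timestamp columns to JavaScript Date objects with millisecond precision and accepts Date values as bound parameters of prepared statements; the function name deleteWithinSecond is hypothetical and the table is the one from the question.

```javascript
// Sketch: finds the rows of one user whose `time` falls within the given
// second and deletes each matching row using the exact value read back.
// `client` is assumed to be a cassandra-driver Client (or compatible object).
async function deleteWithinSecond(client, userId, accountId, secondStart) {
  const from = new Date(secondStart);
  const to = new Date(from.getTime() + 1000);
  const selected = await client.execute(
    'SELECT time, account_id FROM account_data_by_user ' +
    'WHERE user_id = ? AND time >= ? AND time < ?',
    [userId, from, to], { prepare: true });
  let deleted = 0;
  for (const row of selected.rows) {
    // account_id is filtered here because it follows `time` in the key.
    if (row.account_id !== accountId) continue;
    await client.execute(
      'DELETE FROM account_data_by_user ' +
      'WHERE user_id = ? AND time = ? AND account_id = ?',
      [userId, row.time, accountId], { prepare: true });
    deleted += 1;
  }
  return deleted;
}
```

The range restriction is only used in the SELECT, which is allowed on a clustering column; each DELETE still names the full primary key with an exact timestamp, so it does not depend on range deletes.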
