For example: I want to create 40 tables in one keyspace and shard only 3 of them. Is it possible to shard specific tables without creating a new keyspace?
I have seen "How to shard only specific tables using vitess", but that approach requires creating a new keyspace, which I don't want to do. I want sharded and unsharded tables in one keyspace. Is that possible?
This is currently not possible. A keyspace is categorized as sharded or unsharded. So, you have to migrate the tables you want to shard into a sharded keyspace and then reshard the keyspace.
Some people worked around this by assigning a "null primary vindex" to the unsharded tables, essentially forcing all rows to live in the first shard. But I don't know if this was experimental or was actually used in production.
Related
On Cassandra, we only need 100 days of data for specific tables. However, we only recently set the TTL value, so data older than that still sits in the system as stale data. We were thinking of different approaches to delete the old data. One suggestion was to create a Spark job that identifies the data older than a specific timeframe and deletes it all.
Another thought was to create a new table with just 100 days of data and delete the old table. But I have doubts about:
how to rename the table while live data is being written to it,
how Cassandra will deal with such a table. If I recreate a new table with less data and rename it on one node (say node 1), will the other nodes in the cluster automatically delete the older data in their tables, or will they sync with the table on node 1 and push all the older data onto it?
I am really new to Cassandra and would appreciate expert advice on this.
Please suggest if there are better ways to handle this.
Cassandra does not have a way to rename a table, so you will need to:
create the new table with a different name
ensure this table has the TTL clause
load into it only the subset of records that you are interested in; this could be tricky, as the query will depend on the schema of the table (is the column with the timestamp part of the clustering key?)
update your application to point to the new table
drop the old table
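One detail worth thinking through when loading the subset: rows copied into the new table get a fresh write time, so a table-level TTL would keep them for another full retention period. A minimal sketch of one way around that, assuming each row carries its original write timestamp, is to compute the remaining lifetime per row and pass it with USING TTL. The function names, the table name ks.events_v2, and the column names are hypothetical; only the 100-day window comes from the question.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Sequence

# Retention window taken from the question; adjust as needed.
RETENTION = timedelta(days=100)


def remaining_ttl_seconds(
    written_at: datetime,
    now: datetime,
    retention: timedelta = RETENTION,
) -> Optional[int]:
    """Return the TTL in seconds to apply when copying a row.

    Returns None when the row is already older than the retention
    window, meaning it should simply be skipped.
    """
    expires_at = written_at + retention
    remaining = int((expires_at - now).total_seconds())
    return remaining if remaining > 0 else None


def copy_statement(table: str, columns: Sequence[str], ttl: int) -> str:
    """Build a parameterised INSERT for the new table (hypothetical helper)."""
    cols = ", ".join(columns)
    marks = ", ".join("?" for _ in columns)
    return f"INSERT INTO {table} ({cols}) VALUES ({marks}) USING TTL {ttl}"


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    ttl = remaining_ttl_seconds(now - timedelta(days=40), now)
    if ttl is not None:
        print(copy_statement("ks.events_v2", ["id", "ts", "payload"], ttl))
```

The statement string would then be prepared and executed with whatever driver or Spark connector performs the copy; that part is left out because it depends on the setup.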
I need to delete all rows in a Cassandra table, but with Amazon Keyspaces it isn't possible to execute TRUNCATE tbl_name because the TRUNCATE API isn't supported yet.
The few ideas that come to mind are a little tricky:
Solution A
select all the rows
loop over the rows and delete them (one by one or in a batch)
Solution B
DROP TABLE
CREATE TABLE with the structure of the old table
Do you have any idea how to keep the process as simple as possible?
Thanks in advance
If the data is not required, go with Option B: drop the table and recreate it. You can pass in the capacity in the CREATE TABLE statement using custom table properties.
CREATE TABLE my_keyspace.my_table (
    id text,
    division text,
    project text,
    role text,
    manager_id text,
    PRIMARY KEY (id, division))
WITH CUSTOM_PROPERTIES = {
    'capacity_mode': {
        'throughput_mode': 'PROVISIONED',
        'read_capacity_units': 10,
        'write_capacity_units': 20},
    'point_in_time_recovery': {'status': 'enabled'}}
AND TAGS = {
    'pii': 'true',
    'prod': 'true'};
Option C: if you still require the data, you can switch to on-demand capacity mode, which is a pay-per-request mode. With no requests, you only pay for storage. You can change modes once a day.
ALTER TABLE my_keyspace.my_table
WITH CUSTOM_PROPERTIES =
    {'capacity_mode': {'throughput_mode': 'PAY_PER_REQUEST'}};
Solution B should be fine in the absence of TRUNCATE. In older versions of Cassandra (prior to 2.1), recreating a table with the same name was a problem; see the Datastax FAQ Blog article. That issue has since been resolved via CASSANDRA-5202.
If the data in the table is not required anymore, it is better to drop the table and recreate it. Deleting the rows one by one would be a very tedious task if the table contains a large amount of data.
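For comparison, if the table has to stay in place, Solution A from the question could look roughly like the sketch below. It only builds the parameterised DELETE statement and groups primary keys into chunks; executing them is left to whichever driver is in use. The helper names and the chunk size are assumptions made for illustration, and the service's own batch limits should be checked before choosing a size. The table and key columns are taken from the CREATE TABLE example above.

```python
from typing import Iterable, Iterator, List, Sequence, Tuple


def delete_statement(table: str, key_columns: Sequence[str]) -> str:
    """Build a parameterised DELETE that targets one full primary key."""
    where = " AND ".join(f"{col} = ?" for col in key_columns)
    return f"DELETE FROM {table} WHERE {where}"


def chunked(keys: Iterable[Tuple], size: int) -> Iterator[List[Tuple]]:
    """Yield lists of at most `size` primary keys, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[Tuple] = []
    for key in keys:
        batch.append(key)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly shorter, chunk
        yield batch


if __name__ == "__main__":
    stmt = delete_statement("my_keyspace.my_table", ["id", "division"])
    # In practice the keys would come from a paged SELECT id, division query.
    keys = [("1", "eng"), ("2", "ops"), ("3", "eng")]
    for group in chunked(keys, size=2):
        for key in group:
            print(stmt, key)
```

Even with chunking, every row costs a write, which is why dropping and recreating the table is usually the simpler route when none of the data is needed.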
In Cassandra, is there a way to generate CREATE TABLE statements for all the existing tables inside a particular keyspace?
DESC KEYSPACE KEYSPACE_NAME
Output CQL commands for the given keyspace. These CQL commands can be used to recreate the keyspace and tables.
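To save that output to a file rather than reading it in an interactive session, cqlsh can run a single statement with its -e option. A minimal sketch that wraps this from Python follows; the function names, the default host and port, and the output path are assumptions, and it presumes cqlsh is on the PATH and can connect without extra credentials.

```python
import subprocess
from typing import List


def describe_keyspace_command(
    keyspace: str, host: str = "127.0.0.1", port: int = 9042
) -> List[str]:
    """Build the cqlsh invocation that prints the keyspace schema as CQL."""
    return ["cqlsh", "-e", f"DESCRIBE KEYSPACE {keyspace}", host, str(port)]


def dump_schema(keyspace: str, out_path: str, **connection) -> None:
    """Run cqlsh and write the generated CREATE statements to out_path."""
    result = subprocess.run(
        describe_keyspace_command(keyspace, **connection),
        check=True,           # raise if cqlsh exits with an error
        capture_output=True,
        text=True,
    )
    with open(out_path, "w", encoding="utf-8") as fh:
        fh.write(result.stdout)


if __name__ == "__main__":
    # Hypothetical keyspace name and output file.
    dump_schema("my_keyspace", "my_keyspace_schema.cql")
```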
I want to migrate data from an RDBMS to NoSQL. I created the first graph and found the end tables. I found the end nodes and want to add them to the tables they belong to, but I couldn't. I need to find the primary keys and foreign keys of the tables, and I only need to join the tables. How can I do this in Node.js?
I want to migrate 500 tables from MySQL to Cassandra, but I do not want to create the schemas in Cassandra by hand before the migration.
I know about the CQL-IMPORT option in Sqoop, but it only allows copying data into tables that already exist in Cassandra.
Is there any way to have all the table structures copied from MySQL into Cassandra schema format? Creating 500 tables in Cassandra with more than 100 columns per table will be time consuming.
Please help.