Cassandra: Adding new column to the table

Hi, I just added a new column business_sys to my table my_table:
ALTER TABLE my_table ADD business_sys set<text>;
But then I dropped this column because I wanted to change its type:
ALTER TABLE my_table DROP business_sys;
When I then tried to add a column with the same name but a different type, I got this error message:
"Cannot add a collection with the name business_sys because a collection with the same name and a different type has already been used in the past"
This is the command I ran to add the column with the new type:
ALTER TABLE my_table ADD business_sys list<text>;
What did I do wrong? I am pretty new to Cassandra. Any suggestions?

You're running into CASSANDRA-6276. The problem is that when you drop a column in Cassandra, the data in that column doesn't just disappear, and Cassandra may attempt to read that old data with the new column's comparator type.
From the linked JIRA ticket:
Unfortunately, we can't allow dropping a component from the comparator, including dropping individual collection columns from ColumnToCollectionType.
If we do allow that, and have pre-existing data of that type, C* simply wouldn't know how to compare those...
...even if we did, and allowed [users] to create a different collection with the same name, we'd hit a different issue: the new collection's comparator would be used to compare potentially incompatible types.
The JIRA suggests that this may not be an issue in Cassandra 3.x, but I just tried it in 3.0.3 and it fails with the same error.
Unfortunately, the only way around this one is to use a different name for your new list.

EDIT: I've tried this out in Cassandra and ended up with inconsistent, missing data. The best way to proceed is to change the column name, as suggested in CASSANDRA-6276. And always follow the documentation guidelines :)
WARNING:
According to this comment from CASSANDRA-6276, running the following workaround is unsafe.
Elaborating on @masum's comment: it's possible to work around the limitation by first recreating the column with a non-collection type such as an int. Afterwards, you can drop it and recreate it again using the new collection type.
From your example, assuming we have a business_sys set:
ALTER TABLE my_table ADD business_sys set<text>;
ALTER TABLE my_table DROP business_sys;
Now re-add the column as int and drop it again:
ALTER TABLE my_table ADD business_sys int;
ALTER TABLE my_table DROP business_sys;
Finally, you can re-create the column with the same name but different collection type:
ALTER TABLE my_table ADD business_sys list<text>;

Cassandra doesn't allow you to recreate a collection column with the same name and a different collection type, but there is a workaround.
Once you have dropped the column with the set type, you can recreate it only with a non-collection type such as varchar or int.
After recreating it with one of those types, you can drop the column once again and finally recreate it with the proper type.
This is illustrated below:
ALTER TABLE my_table DROP business_sys; -- the drop you've done
ALTER TABLE my_table ADD business_sys varchar; -- recreating with another type
ALTER TABLE my_table DROP business_sys; -- dropping again
ALTER TABLE my_table ADD business_sys list<text>; -- recreating with proper type

Related

Cassandra Altering the table

I have a table in Cassandra, say employee(id, email, role, name, password), with only id as my primary key.
I want to ...
1. Add another column (manager_id) with a default value in it.
I know that I can add a column to the table, but there is no way I can provide a default value for that column through CQL. I also cannot update the value of manager_id later, since I need to know the id to update a row (it is the partition key, and its values are randomly generated unique values that I don't know). Is there any way I can achieve this?
2. Rename this table to all_employee.
I also know that renaming a table is not allowed in Cassandra. So I am trying to copy the data of the table (employee) to CSV, copy from the CSV into a new table (all_employee), and delete the old table (employee). I am doing this through an automated script with CQL queries in it. The script works fine, but it will fail if it gets executed again (which I cannot prevent), since the employee table will no longer exist once it has been deleted. Essentially I am looking for an "IF EXISTS" clause in the COPY query, which CQL does not support. Is there any other way I can achieve the outcome?
Please note that the amount of data in the table is very small, so performance is not an issue.
For #1
I don't think Cassandra supports default column values. You need to do that from your application: write a default value every time you insert a row.
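Existing rows can still be backfilled without knowing the ids in advance: read every id first, then update each row. The following is a minimal sketch, assuming the DataStax Python driver, where session.execute(query, parameters) accepts %s placeholders and returns rows with attribute access. The function name and the default_manager_id parameter are hypothetical, and a full-table SELECT is only reasonable here because the table is small.

```python
def backfill_manager_id(session, default_manager_id):
    """Give every employee row without a manager_id the default value.

    Returns the number of rows that were updated.
    """
    # Assumes manager_id has already been added with ALTER TABLE ... ADD.
    rows = session.execute("SELECT id, manager_id FROM employee")
    updated = 0
    for row in rows:
        if row.manager_id is not None:
            continue  # keep values that were set explicitly
        session.execute(
            "UPDATE employee SET manager_id = %s WHERE id = %s",
            (default_manager_id, row.id),
        )
        updated += 1
    return updated
```

New rows still need the default written by the application at insert time, as described above.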
For #2
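You can check whether the table exists before trying to copy from it. The query returns one row if the table exists and no rows otherwise:
SELECT table_name FROM system_schema.tables WHERE keyspace_name='your_keyspace_name' AND table_name='your_table_name';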
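In a script, the existence check can be wrapped in a small helper so the copy step is skipped on a second run. This is a sketch only, assuming Cassandra 3.0 or later (where the system_schema keyspace exists) and a session object from the DataStax Python driver. The helper name table_exists is hypothetical.

```python
def table_exists(session, keyspace, table):
    """Return True if keyspace.table is listed in system_schema.tables."""
    rows = session.execute(
        "SELECT table_name FROM system_schema.tables "
        "WHERE keyspace_name = %s AND table_name = %s",
        (keyspace, table),
    )
    # The query matches at most one row; any row means the table exists.
    return any(True for _ in rows)
```

The script would then run the COPY from employee only when table_exists(session, "your_keyspace_name", "employee") is true. The remaining steps can be made repeatable with CREATE TABLE IF NOT EXISTS and DROP TABLE IF EXISTS, which CQL does support.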

Converting a list column to a set column in cassandra/DSE

While iterating on a new feature my team created a list column in our DSE database. We now want it to be a set column. I dropped the column and created it again as a set column and got this error:
ALTER TABLE sometable DROP somecolumn;
ALTER TABLE sometable ADD somecolumn set<text>;
InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot add a collection with the name integrations because a collection with the same name and a different type (list) has already been used in the past"
There isn't even any data in the column. Is there not some sort of hard delete override? We can change the name but I really don't like the idea of a name that will not work if anyone tries to use it. Do I have to remake the whole table?
The best option is to add an alternate column with a different name or create a new table.
Technically it is possible to drop and recreate columns, but if you already have data in these columns on disk and in backups, it may create problems bringing nodes back online if they crash. (You can't load old data of one format into new columns with a different format.)
If you really must do this, you can do the following:
ALTER TABLE sometable DROP somecolumn;
ALTER TABLE sometable ADD somecolumn int;
ALTER TABLE sometable DROP somecolumn;
ALTER TABLE sometable ADD somecolumn set<text>;
(Based on a comment in Cassandra: Adding new column to the table)

Change the type of a column in Cassandra

I have created a table my_table with a column phone, which was declared as type varint. After entering some data, I realized that it would have been better if I had declared this column as list<int>.
I tried to:
ALTER TABLE my_table ALTER phone TYPE list<int>;
but unfortunately I am not allowed to do so. Hopefully, there is a way to make this change.
UPDATE: Assume that I make a new column phonelist of type list<int>. Is there any efficient way to move the data in the phone column into the phonelist column?
You cannot change the type of an existing column to a map or collection.
The table shows the allowed alterations for data types
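As for the UPDATE: CQL has no UPDATE form that reads one column into another, so the copy into phonelist has to be done from a client. Below is a rough sketch, assuming the DataStax Python driver (session.execute with %s placeholders) and assuming the primary key column is called id, which the question does not state. Note that CQL int is a 32-bit signed integer, so a varint phone number may not fit; the sketch skips and reports such rows.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1  # range of a CQL int (32-bit signed)

def copy_phone_to_phonelist(session):
    """Copy each row's phone into phonelist as a one-element list.

    Returns the ids of rows whose phone does not fit in a CQL int.
    """
    skipped = []
    # The key column name "id" is an assumption made for illustration.
    for row in session.execute("SELECT id, phone FROM my_table"):
        if row.phone is None:
            continue  # nothing to copy for this row
        if not INT_MIN <= row.phone <= INT_MAX:
            skipped.append(row.id)
            continue
        session.execute(
            "UPDATE my_table SET phonelist = %s WHERE id = %s",
            ([row.phone], row.id),
        )
    return skipped
```

If many ids come back as skipped, declaring the new column as list<varint> or list<bigint> instead of list<int> may be the better choice.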

How to change PARTITION KEY column in Cassandra?

Suppose we have such table:
create table users (
id text,
roles set<text>,
PRIMARY KEY ((id))
);
I want all the rows of this table to be stored on the same Cassandra node (not literally a single node, but the same 3 replicas, with all the data mirrored), so I want to change the table to look like this:
create table users_v2 (
partition int,
id text,
roles set<text>,
PRIMARY KEY ((partition), id)
);
How can I do that without losing the data from the first table?
It seems to be impossible to ALTER TABLE in order to add such a column. I'm OK with that.
What I am trying to do is copy the data from the first table and insert it into the second table.
When I do it as is, the partition column is missing, which is expected.
I can ALTER the first table and add a 'partition' column at the end, and then COPY in the correct order, but I can't update all the rows in the first table to set them all to the same partition value, and there seems to be no "default" value when a column is added.
You simply cannot alter the primary key of a Cassandra table. You need to create another table with your new schema and perform a data migration. I would suggest that you use Spark for that since it is really easy to do a migration between two tables with only a few lines of code.
This also answers the question of altering the primary key.
If you don't have a lot of data in the table, there is another way.
In the DataStax DevCenter utility, select the table and use the command "Export All result to file as INSERT". It will save all the data from the table to a file as CQL INSERT statements.
Then drop the table, create a new one with the new PARTITION KEY, and finally fill it by running the statements from the file via CQL.
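For a small table, the migration into users_v2 can also be done with a plain client-side loop instead of Spark or DevCenter. This is a sketch under assumptions, not the method from the answers above: it assumes the DataStax Python driver (session.execute with %s placeholders, which can encode a set value), and the function name and the constant partition value 0 are hypothetical.

```python
def copy_users_to_v2(session, partition=0):
    """Copy every row of users into users_v2 under one partition value.

    Returns the number of rows copied.
    """
    copied = 0
    for row in session.execute("SELECT id, roles FROM users"):
        # Every row gets the same partition value, so all rows land in
        # a single partition of users_v2.
        session.execute(
            "INSERT INTO users_v2 (partition, id, roles) VALUES (%s, %s, %s)",
            (partition, row.id, row.roles),
        )
        copied += 1
    return copied
```

The old users table is left untouched, so it can be dropped only after the copy has been checked.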

How to alter cassandra table columns

I need to add additional columns to a table in Cassandra. But the existing table is not empty. Is there any way to update it in a simple way? Otherwise, what is the best approach to add additional columns to a non-empty table? Thanks in advance.
There's a good example of adding table columns to an existing table in the CQL documentation on ALTER. It works the same way on a non-empty table: existing rows simply have no value (null) for the new column. The following statement will add the column gravesite (with type varchar) to the table addamsFamily:
ALTER TABLE addamsFamily ADD gravesite varchar;
