I have a Cassandra table and one column is defined as Set<text>. I want to delete rows that contain specific elements in that set.
For example, if the column name contained values like ["Alice","Bob","Eve"],
I want a command to delete all the rows whose set contains the element Eve.
If name were of type text, the command would be something like:
delete from keyspace.table where name='Eve';
However, that does not work since name is not text but Set<text>. What would be an equivalent command here?
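The operator that matches an element of a collection is CONTAINS:
select * from keyspace.table where name CONTAINS 'Eve';
However, you need to have a secondary index on the name column. Also note that a DELETE must restrict the primary key, so delete from keyspace.table where name CONTAINS 'Eve'; is rejected: select the keys of the matching rows first, then delete each row by its key.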
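Since a Cassandra DELETE has to restrict the primary key, one way to get the effect asked for is to select the keys of the matching rows and then delete them one by one. Below is a minimal Python sketch of that idea. It assumes a session object that behaves like the DataStax cassandra-driver Session (execute takes a query string and a parameter sequence), a hypothetical table ks.people, and a hypothetical single-column primary key called id; none of these names come from the question.

```python
def delete_rows_containing(session, element):
    """Delete every row whose `name` set contains `element`.

    `session` is expected to behave like cassandra.cluster.Session.
    The table ks.people and the key column id are hypothetical.
    Returns the number of DELETE statements issued.
    """
    # The CONTAINS lookup relies on a secondary index on `name`.
    matches = session.execute(
        "SELECT id FROM ks.people WHERE name CONTAINS %s", (element,)
    )
    deleted = 0
    for row in matches:
        # Each delete names the primary key, which is what DELETE expects.
        session.execute("DELETE FROM ks.people WHERE id = %s", (row.id,))
        deleted += 1
    return deleted
```

With a real cluster this would be called as delete_rows_containing(session, "Eve"). The two steps are not atomic, so rows written between the SELECT and the DELETEs are not affected.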
I have an XML file that I need to process. My output looks like this:
In the nested table there is only one value, but I can't figure out how to ungroup them.
Add a column and insert the following M (replacing channel.item.ht with whatever your column name actually is).
if [channel.item.ht] is table then Record.ToList([channel.item.ht]{0}){0} else [channel.item.ht]
I have a table in Cassandra say employee(id, email, role, name, password) with only id as my primary key.
I want to ...
1. Add another column (manager_id) with a default value in it.
I know that I can add a column to the table, but there is no way to provide a default value for that column through CQL. I also cannot update the value of manager_id later, since I need to know the id to update a row (it is the partition key, and its values are randomly generated unique values which I don't know). Is there any way I can achieve this?
2. Rename this table to all_employee.
I also know that renaming a table is not allowed in Cassandra. So I am trying to copy the data of the table (employee) to CSV, copy from the CSV to a new table (all_employee), and delete the old table (employee). I am doing this through an automated script with CQL queries in it. The script works fine but will fail if it gets executed again (which I cannot prevent), since the employee table will no longer be there once it is deleted. Essentially I am looking for an "IF EXISTS" clause in the COPY query, which is not supported in CQL. Is there any other way I can achieve the outcome?
Please note that the amount of data in the table is very small, so performance is not an issue.
For #1
I don't think Cassandra supports default column values. You need to do that from your application: write a default value every time you insert a row.
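As a rough illustration of handling the default in the application, the sketch below fills in manager_id on insert and backfills existing rows by reading their ids first, which is reasonable only because the table is small. It assumes a session object that behaves like the DataStax cassandra-driver Session, that manager_id is a text column, and a placeholder default value; DEFAULT_MANAGER_ID and both function names are hypothetical.

```python
DEFAULT_MANAGER_ID = "none"  # hypothetical placeholder default


def insert_employee(session, employee):
    """Insert a row, supplying manager_id when the caller left it out."""
    manager_id = employee.get("manager_id")
    if manager_id is None:
        manager_id = DEFAULT_MANAGER_ID
    session.execute(
        "INSERT INTO employee (id, email, role, name, password, manager_id) "
        "VALUES (%s, %s, %s, %s, %s, %s)",
        (employee["id"], employee["email"], employee["role"],
         employee["name"], employee["password"], manager_id),
    )


def backfill_manager_id(session):
    """Set the default on existing rows that have no manager_id yet.

    Reads every id with a full scan, so this suits small tables only.
    Returns the number of rows updated.
    """
    updated = 0
    for row in session.execute("SELECT id, manager_id FROM employee"):
        if row.manager_id is None:
            session.execute(
                "UPDATE employee SET manager_id = %s WHERE id = %s",
                (DEFAULT_MANAGER_ID, row.id),
            )
            updated += 1
    return updated
```

The backfill is safe to run more than once, because rows that already carry a manager_id are skipped.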
For #2
You can check if the table exists before trying to copy from it:
SELECT table_name FROM system_schema.tables WHERE keyspace_name='your_keyspace_name' AND table_name='your_table_name';
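The existence check can be wrapped in a small helper so the script skips the copy steps on a second run. This is only a sketch: it assumes a session object that behaves like the DataStax cassandra-driver Session and a Cassandra version that exposes system_schema.tables (3.0 or later); the function name table_exists is hypothetical.

```python
def table_exists(session, keyspace, table):
    """Return True if `table` is listed in system_schema.tables for `keyspace`."""
    rows = session.execute(
        "SELECT table_name FROM system_schema.tables "
        "WHERE keyspace_name = %s AND table_name = %s",
        (keyspace, table),
    )
    # Any returned row means the table is present.
    return any(True for _ in rows)
```

A script would then run the export, import and drop steps only when table_exists(session, "your_keyspace_name", "employee") is true. COPY is a cqlsh command rather than a CQL statement, so those steps would still be issued through cqlsh.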
I have a question about Cassandra. I want to rename a column, but I get a syntax error because the column name contains a space. How can I change the column name?
Ex: sample column into samplecolumn?
You can use ALTER TABLE to rename a column, but there are a lot of restrictions on it. Because SSTables are immutable, changing the state of things on disk means everything must be rewritten.
The main purpose of RENAME is to change the names of CQL-generated primary key and column names that are missing from a legacy table. The following restrictions apply to the RENAME operation:
You can only rename clustering columns, which are part of the primary key.
You cannot rename the partition key.
You can index a renamed column.
You cannot rename a column if an index has been created on it.
You cannot rename a static column (since you cannot use a static column in the table's primary key).
https://docs.datastax.com/en/cql/3.1/cql/cql_reference/alter_table_r.html
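The syntax error itself comes from the space: in CQL an identifier containing spaces has to be wrapped in double quotes, with any embedded double quote doubled. The sketch below builds such a RENAME statement as a string. The helper names and the table name my_table are hypothetical, and the restrictions listed above still apply, so the statement is only accepted for a column that RENAME permits.

```python
def quote_identifier(name):
    """Wrap a CQL identifier in double quotes, doubling embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def rename_column_cql(table, old_name, new_name):
    """Build an ALTER TABLE ... RENAME statement with quoted column names."""
    return "ALTER TABLE {} RENAME {} TO {};".format(
        table, quote_identifier(old_name), quote_identifier(new_name)
    )


statement = rename_column_cql("my_table", "sample column", "samplecolumn")
# statement == 'ALTER TABLE my_table RENAME "sample column" TO "samplecolumn";'
```

Quoting the new name as well is harmless here, because a quoted all-lowercase identifier is the same as the unquoted one.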
I have a large table in Cassandra with a column of type int but no values are outside the range 0-10. I want to reduce the table size by changing the type of the column to tinyint.
This is the error I get
[Query invalid because of configuration issue] message="Cannot change COLUMN_NAME from type int to type tinyint: types are not order-compatible.">
Is there a nice way to handle this with a cast or other such query trickery?
If not ... and without taking the database down, is there a better way to solve this than doing the following?
make a new column of type tinyint
update my code to duplicate data to this column during write operations
copy old data to the new column [will take a while probably]
swap the names of the columns
revert my code change (only update one column)
delete the old int column
I would say deleting old columns and copying data to new columns is not ideal.
If your cassandra column family is accessed by a single entry point (service), my suggestion would be,
Add a new column.
Retain the old column. (You can rename it like COLUMNNAME_OBSOLETE).
After updating your code, only populate the data against new column in your code.
While reading data into domain object, if your new column is null then fill it with old column.
In one of our projects, we followed the above steps against prod data and it worked fine. After a few months, when we no longer needed COLUMNNAME_OBSOLETE, we dropped that column.
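The read-side fallback in the steps above could look roughly like this. It is a sketch under assumptions: the row is taken to be a plain dict, and the keys columnname and columnname_obsolete are hypothetical stand-ins for the new tinyint column and the retained int column.

```python
def read_column(row, new_key="columnname", old_key="columnname_obsolete"):
    """Prefer the new column; fall back to the old one when the new is null."""
    new_value = row.get(new_key)
    # Compare against None explicitly so a stored 0 is not treated as missing.
    if new_value is not None:
        return new_value
    return row.get(old_key)
```

The explicit None check matters for this table, since 0 is a legitimate value in the 0-10 range.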
The CQL3 specification description of the UPDATE statement begins with the following paragraph:
The UPDATE statement writes one or more columns for a given row in a
table. The (where-clause) is used to select the row to update and must
include all columns composing the PRIMARY KEY (the IN relation is only
supported for the last column of the partition key). Other columns
values are specified through (assignment) after the SET keyword.
The description in the specification of the DELETE statement begins with a similar paragraph:
The DELETE statement deletes columns and rows. If column names are provided
directly after the DELETE keyword, only those columns are deleted from the row
indicated by the (where-clause) (the id[value] syntax in (selection) is for
collection, please refer to the collection section for more details).
Otherwise whole rows are removed. The (where-clause) allows to specify the
key for the row(s) to delete (the IN relation is only supported for the last
column of the partition key).
The bolded portions of each of these descriptions state, in layman's terms, that these statements can be used to modify data in a solely row-based manner.
However, given the nature of the relationship (or lack thereof) between the rows and the static columns (which exist independent of any particular row) of a table, it seems as though there should be a way to modify such columns given only the keys of the partitions they're respectively contained in. According to the specification however, that does not seem to be possible, and I'm not sure if that is a product of the difficulty to allow such in the CQL3 syntax, or something else.
If a static column cannot be updated or deleted independently of any row in its table, then such operations become coupled with their non-static-column-based counterparts, making the set of columns targeted by such operations difficult to determine. For example, given a populated table with the following definition:
CREATE TABLE IF NOT EXISTS example_table
(
partitionKeyColumn int,
clusteringColumn int,
nonPrimaryKeyColumn int,
staticColumn varchar static,
PRIMARY KEY (partitionKeyColumn, clusteringColumn)
);
... it is not immediately obvious if the following DELETE statements are equivalent:
//#1 (Explicitly specifies all of the columns in and "in" the target row)
DELETE partitionKeyColumn, clusteringColumn, nonPrimaryKeyColumn, staticColumn FROM example_table WHERE partitionKeyColumn = 1 AND clusteringColumn = 2
//#2 (Implicitly specifies all of the columns in (but not "in"?) the target row)
DELETE FROM example_table WHERE partitionKeyColumn = 1 AND clusteringColumn = 2
So, phrasing my observations in question form:
Are the above DELETE statements equivalent?
Does the primary key of at least one row in a CQL3 table have to be supplied in order to update or delete a static column in said table? If so, why?
I do not know about the specification, but in the real Cassandra world your two DELETE statements are not equivalent.
The first statement deletes staticColumn whereas the second one does not. The reason is that static columns are shared by all rows of a partition, so you have to name the static column explicitly to actually delete it.
Furthermore, I do not think it's a good idea to DELETE static columns and non-static columns at the same time. By the way, this statement won't work:
DELETE staticColumn FROM example_table WHERE partitionKeyColumn = 1 AND clusteringColumn = 2
The error output is :
Bad Request: Invalid restriction on clustering column priceable_name since the DELETE statement modifies only static columns
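The error message points at the answer to the second question: when a statement touches only static columns, the WHERE clause should restrict the partition key alone, without any clustering column. A minimal Python sketch of both operations follows. It assumes a session object that behaves like the DataStax cassandra-driver Session and the example_table definition from the question; the function names are hypothetical.

```python
def set_static(session, partition_key, value):
    """Update the static column using only the partition key."""
    session.execute(
        "UPDATE example_table SET staticColumn = %s "
        "WHERE partitionKeyColumn = %s",
        (value, partition_key),
    )


def delete_static(session, partition_key):
    """Delete the static column using only the partition key."""
    session.execute(
        "DELETE staticColumn FROM example_table "
        "WHERE partitionKeyColumn = %s",
        (partition_key,),
    )
```

Neither call mentions clusteringColumn, so the static value is changed for the whole partition and no individual row needs to be identified.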