I am using Cassandra.
I have two column families, A and B. Both column families hold the same data, but with different primary keys. I use a batch statement to update the rows in these two tables.
Table schema is as follows:
Primary key of table A: ((id1, id2), id3) — id1 and id2 form the partition key, id3 is the clustering key.
Primary key of table B: ((id1, id2), state, id3) — id1 and id2 form the partition key, state and id3 are clustering keys.
I want to update the state in both tables. In table B, state is a clustering key; in table A it is a regular column.
What I do is fetch the state from A and treat it as the old state.
Then, in a single batch, I first delete the row from table A, then delete the row from table B, then insert the new row into table A, and finally insert the new row into table B.
Note: using the old state fetched from A, I build the primary key of B, delete that row from B, and then insert the new row into B.
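For concreteness, the batch looks roughly like this (a sketch; column names follow the key layout above):

BEGIN BATCH
  DELETE FROM A WHERE id1 = ? AND id2 = ? AND id3 = ?;
  DELETE FROM B WHERE id1 = ? AND id2 = ? AND state = ? AND id3 = ?;  -- old state read from A
  INSERT INTO A (id1, id2, id3, state) VALUES (?, ?, ?, ?);           -- new state
  INSERT INTO B (id1, id2, state, id3) VALUES (?, ?, ?, ?);           -- new state
APPLY BATCH;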
This works fine, but not for parallel requests. If two requests arrive for the same primary key from two different instances, table B ends up with two entries, one with the old state and one with the new state.
So how can I solve this in Cassandra?
Cassandra 2.0 and above supports lightweight transactions, which give you an IF NOT EXISTS condition on insert.
In your case you can't enforce anything while fetching the state from table A, but you can add the condition on the insert, which will prevent the duplicates you are seeing. E.g.
INSERT INTO A (id1, id2, state, id3)
VALUES ('val1', 'val2', 'val3', 'val4')
IF NOT EXISTS;
So the first request to execute will pass, and the second one will fail (its insert is not applied). Handle retries / failures from your client based on your business requirements.
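Note that when the condition fails, the server does not raise an error; it returns a result row whose [applied] column is false (alongside the existing row's values), and that is what the client should check before retrying. Roughly, in cqlsh:

INSERT INTO A (id1, id2, state, id3)
VALUES ('val1', 'val2', 'val3', 'val4')
IF NOT EXISTS;

--  [applied]
-- -----------
--      False     <- a concurrent request won the race; re-read and retry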
Check these docs for more information: https://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0
Let's say I have a table like the one below, with a composite partition key.
CREATE TABLE heartrate (
pet_chip_id uuid,
date text,
time timestamp,
heart_rate int,
PRIMARY KEY ((pet_chip_id, date), time)
);
Let's say there is a batch job to prune all data older than X. I can't run the query below, since it is missing the other partition key column:
DELETE FROM heartrate WHERE date < '2020-01-01';
How do you model your data so that this can be achieved in Scylla? I understand that internally Scylla creates partitions based on the partition key, but in this case it's impossible to enumerate every pet_chip_id and issue N delete queries.
I just wanted to know how people do this outside the RDBMS world.
The recommended way to delete old data automatically in Scylla is the time-to-live (TTL) feature:
When you write a row, you add "USING TTL 864000" if you want that data to be deleted automatically in 10 days. You can also specify a default TTL for a given table, so that every piece of data written to the table expires after (say) 10 days.
Scylla's TTL feature is separate from the data itself, so it doesn't matter which columns you used as partition keys or clustering keys - in particular the "date" column no longer needs to be a clustering key (or exist at all, for that matter) - unless you also need it for something else.
As @nadav-harel said in his answer, if you can define a TTL that's always the best solution. If you can't, a possible alternative is to create a materialized view (MV) so you can list the primary keys of the main table based on the field you need in the delete query. In the prune job you first select from the MV, then delete from the main table using the values you got from the MV.
Example:
CREATE TABLE my_table (
  a uuid,
  b text,
  c text,
  d int,
  e timestamp,
  PRIMARY KEY ((a, b), c)
);
CREATE MATERIALIZED VIEW my_mv AS
  SELECT a, b, c
  FROM my_table
  WHERE a IS NOT NULL
    AND b IS NOT NULL
    AND c IS NOT NULL
  PRIMARY KEY (b, a, c);
Then in your prune job you select from my_mv based on b, and delete from my_table using the values returned by the select.
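A sketch of that prune job, assuming b holds the date you prune on (the DELETE removes the whole base-table partition, so it only needs the partition key columns a and b; the uuid literal is just for illustration):

-- 1) List the base-table partition keys for an expired date via the view:
SELECT a, b FROM my_mv WHERE b = '2020-01-01';

-- 2) For each (a, b) pair returned, delete the partition from the base table:
DELETE FROM my_table WHERE a = 123e4567-e89b-12d3-a456-426614174000 AND b = '2020-01-01';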
Note that this solution might not be efficient, depending on your model and the amount of data you have. Keep in mind that deleting data is also a way of querying it, so your model should be defined based on your query needs. That is, before defining your model, think about every way you will query it (including how you will prune your data).
[Question posted by a user on YugabyteDB Community Slack]
I have a table with TTL and a secondary index, using YugabyteDB 2.9.0 and I’m getting the following error when I try to insert a row:
SyntaxException: Feature Not Supported
Below is my schema:
CREATE TABLE lists.list_table (
item_value text,
list_id uuid,
created_at timestamp,
updated_at timestamp,
is_deleted boolean,
valid_from timestamp,
valid_till timestamp,
metadata jsonb,
PRIMARY KEY ((item_value, list_id))
) WITH default_time_to_live = 0
AND transactions = {'enabled': 'true'};
CREATE INDEX list_created_at_idx ON lists.list_table (list_id, created_at)
WITH transactions = {'enabled': 'true'};
We have two types of queries (80% & 20% distribution):
select * from list_table where list_id= <id> and item_value = <value>
select * from list_table where list_id= <id> and created_at>= <created_at>
We expect per list_id there would be around 1000-10000 entries.
The TTL would be around 1 month.
This is a known restriction: transactionally expiring rows via TTL in a table that is indexed (i.e. atomic expiry of TTL'd entries in both the table and the index) is currently not supported. There are several workarounds:
a) In YCQL, we also support an index with weaker consistency. This is not well documented today, but you can see the details here: https://github.com/YugaByte/yugabyte-db/issues/1696
The main issue to call out when using this variant of index is error handling on INSERT failure: it is the application's responsibility to retry the INSERT so that the index stays consistent. As noted in the above issue: << If an insert/update or batch of such operations fails, it is the app's responsibility to retry the operation so that the index is consistent. Much like in a 2-table case, it would have been the app's responsibility to retry (in case of a failure between the updates to the two tables) to make sure both tables are in sync again. >>
This type of index supports a TTL at both the table and index level (it is recommended to keep the two the same): https://github.com/yugabyte/yugabyte-db/issues/2481#issuecomment-537177471
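Per the issues linked above, such an index is created by disabling transactions on it and marking its consistency as user-enforced; the exact property names may differ by version, so treat this as a sketch and check the issues for the current syntax:

CREATE INDEX list_created_at_idx ON lists.list_table (list_id, created_at)
  WITH transactions = {'enabled': 'false', 'consistency_level': 'user_enforced'};

The table itself must also have transactions disabled for this variant to apply.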
b) Another workaround is to use a background cleanup job that periodically deletes stale records (instead of using TTL).
c) Avoid the index and store the data in two tables: one organized by the original primary key, and one organized by the columns you wanted indexed (as its primary key). Both tables can have a TTL, but it is the application's responsibility to INSERT into both tables whenever data is added.
The first table's PK would be ((item_value, list_id)), identical to the current main table. Instead of an index you'll have a second table whose PK would be ((list_id), created_at); both tables would have a TTL, and the application must insert the data into both. For the second table you have a choice (see the sketch after the options):
(option 1) Duplicate all the columns from the main table, including your JSON columns etc. This makes the Q2 lookup fast, since the row has everything it needs, but increases your storage requirements.
(option 2) In addition to the PK, store just the item_value column in the second table. For Q2 you first look up the second table to get the item_value, and then use list_id and item_value to retrieve the data from the main table (much like an index would do under the covers).
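A sketch of option (c) in YCQL, using option 2's minimal column set; the second table's name is hypothetical, non-key columns are abbreviated, and item_value is added as a final clustering column since created_at alone could collide within a list_id:

-- Main table, serving Q1 (lookup by item_value + list_id), now without an index:
CREATE TABLE lists.list_table (
  item_value text,
  list_id uuid,
  created_at timestamp,
  metadata jsonb,
  PRIMARY KEY ((item_value, list_id))
) WITH default_time_to_live = 2592000;  -- ~1 month

-- Second table replacing the index, serving Q2 (range scan on created_at):
CREATE TABLE lists.list_by_created_at (
  list_id uuid,
  created_at timestamp,
  item_value text,
  PRIMARY KEY ((list_id), created_at, item_value)
) WITH default_time_to_live = 2592000;

-- Q2 becomes: fetch item_value here, then read the main table by (item_value, list_id):
SELECT item_value FROM lists.list_by_created_at
  WHERE list_id = 123e4567-e89b-12d3-a456-426614174000 AND created_at >= '2021-08-01';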
d) Another workaround is to avoid the index and pick the PK to be ((list_id, item_value), created_at).
This would not affect the performance of Q1, because with both list_id and item_value provided, the PK can be used to find the rows. But it would be slower for Q2, where only list_id and created_at are provided: it can still use list_id, but it must filter the rows on the created_at value without the help of an index. So if Q2 is really 20% of your queries, you probably do not want to scan 1,000-10,000 items to find the matching rows.
Say I insert three rows into Cassandra, one by one, in the order below:
ID: firstname, lastname, websitename
1: fname1, lname1, site1
2: fname2, lname2, site2
3: fname3, lname3, site3
The column store stores columns together, like this:
1:fname1,2:fname2,3:fname3
1:lname1,2:lname2,3:lname3
1:site1,2:site2,3:site3
Does that mean that when I insert the first row, i.e. 1: fname1, lname1, site1, Cassandra writes each of the three columns to a separate disk block, so that when the firstname column has to be read by some query, all related column data sits in a single block?
Won't that make writes slow, since Cassandra has to store the data in 3 blocks instead of one to ensure column data is stored together?
Cassandra is not a column-oriented database; it is a partitioned row store. This means that the data in your example will be stored like this:
"YourTable" : {
row1 : { "ID":1, "firstname":"fname1", "lastname":"lname1", "websitename":"site1", "timestamp":1582988571},
row2 : { "ID":2, "firstname":"fname2", "lastname":"lname2", "websitename":"site2", "timestamp":1582989563}
row3 : { "ID":3, "firstname":"fname3", "lastname":"lname3", "websitename":"site3", "timestamp":1582989572}
...
}
The data is grouped and searched based on the primary key (the partition key, optionally followed by one or several clustering keys).
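For instance, a table matching the example above (hypothetical keyspace and table names) would be keyed by ID, and every read is addressed by that partition key:

CREATE TABLE yourkeyspace.yourtable (
  id int PRIMARY KEY,
  firstname text,
  lastname text,
  websitename text
);

SELECT firstname, lastname, websitename FROM yourkeyspace.yourtable WHERE id = 1;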
Some things to consider:
Cassandra is an append-only store. This means that when you update or delete a record, internally it creates a new record with the new value and a newer timestamp; for a delete it writes a piece of metadata called a "tombstone" that marks the records to be removed.
Adding or removing nodes in the cluster triggers a redistribution of the token ranges, which means the instance or server where a given record is located or maintained may change.
Cassandra isn't a classical column store. It stores all inserted/updated data together, organized first by partition key and then, inside the partition, by clustering columns/primary keys. The data can live in different SSTables when you update it at different points in time, but the compaction process will eventually merge them together.
If you're interested, you can use sstabledump against data files and see how data is stored. There is also a very good blog post from The Last Pickle about storage engine in the Cassandra 3.0 (it's different from previous versions).
Cassandra is basically a column-family database, i.e. a row-partitioned database that keeps column information alongside the rows, not a column-based/columnar/column-oriented database. On insert/fetch we need to supply the partition (aka row key, aka primary key) column information. We can add any column at any point in time.
Column-family stores like Cassandra are great if you have high-throughput writes and want to be able to scale horizontally and linearly.
The term "column-family" comes from the original storage engine that was a key/value store, where the value was a "family" of column/value tuples. There was no hard limit on the number of columns that each key could have.
I want to filter on a table that has a partition key and a clustering key, plus another criterion on a regular column. I got the following warning:
InvalidQueryException: Cannot execute this query as it might involve
data filtering and thus may have unpredictable performance. If you
want to execute this query despite the performance unpredictability,
use ALLOW FILTERING
I understand the problem if the partition and clustering keys are not used. In my case, is it a relevant error or can I ignore it?
Here is an example of the table and query:
CREATE TABLE mytable (
  name text,
  id uuid,
  deleted boolean,
  PRIMARY KEY ((name), id)
);
SELECT id FROM mytable WHERE name='myname' AND id='myid' AND deleted=false;
In Cassandra you can't filter data on a non-primary-key column unless you create an index on it.
From Cassandra 3.0 onward, filtering on non-primary-key columns is allowed, but with unpredictable performance.
In Cassandra 3.0 or later, if you provide the full primary key (as in your query), then you can run the query with ALLOW FILTERING and ignore the warning.
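For example (the uuid literal is just for illustration):

SELECT id FROM mytable
  WHERE name = 'myname'
    AND id = 123e4567-e89b-12d3-a456-426614174000
    AND deleted = false
  ALLOW FILTERING;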
Otherwise, filter on the client side, or drop the deleted field and create another table:
Instead of updating the field to deleted = true, move the row to another table, say mytable_deleted.
CREATE TABLE mytable_deleted (
  name text,
  id uuid,
  PRIMARY KEY (name, id)
);
That way mytable holds only the non-deleted data and mytable_deleted holds the deleted data.
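A sketch of the move, using a logged batch so the delete and the insert are applied together (uuid literal for illustration):

BEGIN BATCH
  DELETE FROM mytable WHERE name = 'myname' AND id = 123e4567-e89b-12d3-a456-426614174000;
  INSERT INTO mytable_deleted (name, id) VALUES ('myname', 123e4567-e89b-12d3-a456-426614174000);
APPLY BATCH;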
Or, create an index on it.
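For example (hypothetical index name):

CREATE INDEX deleted_idx ON mytable (deleted);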
But note that the column deleted is a low-cardinality column, so remember:
A query on an indexed column in a large cluster typically requires collating responses from multiple data partitions. The query response slows down as more machines are added to the cluster. You can avoid a performance hit when looking for a row in a large partition by narrowing the search.
Read more: When not to use an index
I'm trying to understand the difference between these two and the scenarios in which you would prefer to use one over the other.
My specific use case is using cassandra as an event ingestion system backed by an analytics engine that interprets the event.
My model includes
event id (the partition key)
event time (a clustering column)
event type (I'm not sure whether to use a clustering column or a secondary index)
I figure the most common read scenario will be to get the events over a time range hence event time is the clustering column. A less frequent read scenario might involve further filtering the event query by event type.
A secondary index is pretty similar to what we know from regular relational databases. If you have a query with a WHERE clause that uses column values that are not part of the primary key, lookup would be slow because a full scan has to be performed. Secondary indexes make it possible to serve such queries efficiently. They are stored as extra tables, holding just enough extra data to find your way in the main table.
So that's a good ol' index, which we already know about. So far, there's nothing new to cassandra and its distributed nature.
Partitioning and clustering are all about deciding how rows from the main table are spread among the nodes. This is unique to Cassandra, since it determines the distribution of data. The primary key consists of at least one column; the first column in the primary key is used as the partition key, which decides which node stores a row. If the primary key has additional columns, those columns are used to cluster the data on a given node: the data is stored on the node in lexicographic order by the clustering columns.
This question has more specifics on clustering columns: Clustering Keys in Cassandra
So an index on a given column X makes the lookup X --> primary key efficient. The partition key (first column in the primary key) determines which node a row is stored on. Clustering columns (additional columns in the primary key) determine which order rows are stored in on their assigned node.
So your intuition sounds about right - the event ID is presumably guaranteed unique, so is great for building a primary key. Event time is a great way to order rows on disk on a given node.
If you never needed to look up data by event type, e.g. never had a query like SELECT * FROM Events WHERE Type = Warning, then you would have no need for the additional index, though your partitioning demands wouldn't change. Indexes make it easy to serve queries with different predicates. Since you mentioned that you do plan on performing queries like that, you likely want an index on your event type column.
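A sketch of that model with hypothetical column names, as described in the question: the time-range query is served by the clustering order, and the index covers the event-type predicate.

CREATE TABLE events (
  event_id uuid,
  event_time timestamp,
  event_type text,
  payload text,
  PRIMARY KEY ((event_id), event_time)
);

CREATE INDEX events_by_type ON events (event_type);

-- Common case: events over a time range within a partition.
SELECT * FROM events
  WHERE event_id = 123e4567-e89b-12d3-a456-426614174000
    AND event_time >= '2014-01-01' AND event_time < '2014-02-01';

-- Less frequent: filter by type via the secondary index.
SELECT * FROM events WHERE event_type = 'Warning';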
Check out the cassandra documentation: http://www.datastax.com/documentation/cql/3.0/cql/ddl/ddl_compound_keys_c.html
Cassandra uses the first column name in the primary key definition as the partition key.
...
In the case of the playlists table, the song_order is the clustering column. The data for each partition is clustered by the remaining column or columns of the primary key definition. On a physical node, rows for a partition key are stored in order based on the clustering columns.