Overwrite row in Cassandra with INSERT, will it cause a tombstone?

Writing data to Cassandra without causing it to create tombstones is vital in our case, due to the amount of data and the speed required. So far we have written each row only once and never needed to update it again, only to fetch the data.
Now there is a case where we need to write data first and then complete it with more data that becomes available after a while.
This can be done by either:
overwriting all of the data in the row again using INSERT (all data is available), or
performing an UPDATE only on the new data.
What is the best way to do it, bearing in mind that speed and not creating tombstones are important?

Tombstones are only created when deleting data, when using TTL values, or when explicitly writing null values.
Cassandra aligns very well with your described use case. Incrementally adding data works with both INSERT and UPDATE statements. When data is added over time for the same partition key, Cassandra stores it in different locations. Periodic compactions then merge the data for a single key again to optimize access and free disk space. This happens based on the timestamps of the written values and does not create any new tombstones.
You can learn more about how Cassandra stores data e.g. here.

It would be more efficient to do an update to add new or changed data. There is no need to rewrite the old data that isn't changing and it would be inefficient to make Cassandra rewrite it.
When you do an insert or update, Cassandra keeps a timestamp for the modify time for each column. When you do a read, Cassandra collects all the writes for that key from in memory, from on disk, and from other replicas depending on the consistency setting. It will then merge the column data so that the newest value is used for each column.
When data is compacted on disk, if there are separate updates for different columns of a row, those will be combined into a single row in the compacted data.
You don't need to worry about creating tombstones by doing an update unless you are using an update to set a TTL (Time To Live) value. In your application it sounds like you never delete data, so you will never have any tombstones.
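To make the merge described above more concrete, here is a minimal Python sketch of last-write-wins per column. It is only an illustration, not Cassandra's actual code: the `merge_fragments` function and the column names are made up for this example, and tie-breaking on equal timestamps is ignored.

```python
from typing import Dict, List, Tuple

# A cell is (write timestamp, value); a fragment maps column name -> cell.
Cell = Tuple[int, str]
Fragment = Dict[str, Cell]


def merge_fragments(fragments: List[Fragment]) -> Fragment:
    """Merge row fragments, keeping the newest cell for each column."""
    merged: Fragment = {}
    for fragment in fragments:
        for column, cell in fragment.items():
            # Keep the cell with the highest write timestamp.
            if column not in merged or cell[0] > merged[column][0]:
                merged[column] = cell
    return merged


# The first write stores two columns; a later UPDATE stores only what changed.
initial_write = {"name": (100, "sensor-1"), "status": (100, "pending")}
later_update = {"status": (200, "complete"), "result": (200, "42")}

row = merge_fragments([initial_write, later_update])
print(row)
```

Note that the later update never had to rewrite `name`, and nothing in the merge marks the old `status` value as deleted. It simply loses to the newer timestamp.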

Related

Does Cassandra store only the affected columns when updating a record or does it store all columns every time it is updated?

If the answer is yes,
Does that mean that, unlike Mongo or an RDBMS, whether we retrieve every column or only some columns will have a big performance impact in Cassandra? (I am not talking about transfer time over the network, as that will affect all of the above.)
Does that mean that during compaction, it cannot just stop when it finds the latest row for a primary key, but has to go through the full set of SSTables? (I understand there will be optimisations, as a previously compacted SSTable will have at most one occurrence of a row.)
Please ask only one question per question.
That is entirely up to you. If you write one column value, it'll persist just that one. If you write them all, they will all persist, even if they are the same as the current value.
whether we retrieve every column or some column will have big performance impact
This is definitely the case. Queries for column values that are small, or that haven't been overwritten or deleted, will be much faster than queries for large or frequently modified ones.
during compaction, it cannot just stop when it finds the latest row for a primary key, it has to go through the full set in SSTables?
Yes. And not just during compaction, but read queries will also check multiple SSTable files.

Deleting column in cassandra for large dataset

We have a redundant column that we'd like to delete from our Cassandra database (version 2.1.15). This is a text column that represents the majority of the data on disk (15 nodes x 1.8 TB per node).
The easiest option seems to be an ALTER TABLE to remove that column, and then let Cassandra compaction take care of things (we are also running Cassandra Reaper to manage repairs). However, given the size of the dataset, I'm concerned I will knock over the cluster with a massive delete.
Another option I've considered is a process that runs through the keyspace setting the value to null. I think this will have the same effect as removing the column, but it is more under our control (and also requires writing something to do it).
Would anyone have any advice on how to approach this?
Thanks!
Dropping a column does mark the deleted values as tombstones. The column value becomes unavailable immediately and the column data is removed in the next compaction cycle.
If you want to expedite the removal of the column before compaction occurs, you can run nodetool upgradesstables to remove the data after you use the ALTER TABLE command to change the metadata for the column.
See Documentation: https://docs.datastax.com/en/cql/3.1/cql/cql_reference/alter_table_r.html
If I remember correctly, dropping a column doesn't really mark the deleted values with tombstones. Instead it inserts a corresponding entry into the system.dropped_columns table, and then code such as SerializationHelper and BTreeRow filters the values on the fly. The data will be deleted when compaction happens.
Explicitly setting the value to null won't make the situation better, because you'll be adding data to the table.
I would recommend testing the deletion on a small cluster and checking how it behaves.
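A rough Python sketch of that on-the-fly filtering idea is below. It is not Cassandra's real code: `visible_cells`, the `dropped` map, and the rule of comparing the cell timestamp against the drop time are assumptions made for illustration only.

```python
from typing import Dict, Tuple

Cell = Tuple[int, str]  # (write timestamp, value)


def visible_cells(row: Dict[str, Cell], dropped: Dict[str, int]) -> Dict[str, Cell]:
    """Hide cells of dropped columns that were written at or before the drop time."""
    result: Dict[str, Cell] = {}
    for column, (timestamp, value) in row.items():
        drop_time = dropped.get(column)
        if drop_time is not None and timestamp <= drop_time:
            # Filtered at read time; the bytes stay on disk until compaction.
            continue
        result[column] = (timestamp, value)
    return result


# Hypothetical row with a large text column that was dropped at time 500.
stored_row = {"id": (100, "abc"), "big_text": (100, "lots of data")}
dropped_columns = {"big_text": 500}

print(visible_cells(stored_row, dropped_columns))
```

The point of the sketch is that the drop itself writes one small metadata record rather than one tombstone per row, so the cost is deferred to normal compaction.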

Why doesn't an upsert create tombstones in Cassandra?

As per the question regarding tombstones, why don't upserts create tombstones?
As per the DataStax documentation (How is data updated?), Cassandra treats every upsert as a delete followed by an insert, because the new timestamp of the insert overwrites the old timestamp. The data with the old timestamp has to be marked as deleted, which suggests a tombstone.
Why do we have contradicting statements? Or am I missing something here?
Use case:
Data is inserted with a unique key (uuid) in Cassandra, and some of the columns in this data are updated frequently. Which approach do you recommend?
Inserting the same data with new column values in the INSERT query.
Updating the existing record, based on the given uuid, with new column values in the UPDATE query.
Which approach does or doesn't create tombstones, and how does Cassandra handle both queries?
As Russ pointed out, you may want to read other similar questions on this topic. However,
An upsert/overwrite is just-another-cell, with a name, a timestamp and a value.
A tombstone is just like an overwrite, except it gets one extra field indicating that it's been deleted, so that it isn't returned as valid output. The reason tombstones are often harmful is that they can accumulate in bad data models, even when people think the data is gone - and skipping them to get to live data actually requires memory.
When you update/upsert as you describe, the cell you create SHADOWS (obsoletes) the previous cell, which will be removed upon compaction. That previous cell is NOT a tombstone, even though it's no longer live/active - it will be compacted away and completely replaced by the new, live, highest-timestamp value as soon as compaction allows.
The biggest thing to keep in mind is this: tombstones aren't necessarily removed by compaction - they're kept around (persisted/rewritten) for at least gc_grace_seconds, and potentially even longer if they need to shadow/cover other cells in sstables not-yet-compacted. Because of this, tombstones stay around for a long time, but shadowed/overwritten cells are gc'd as soon as the sstable they're in is compacted.
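The difference between a shadowed cell and a tombstone can be sketched in a few lines of Python. This is a simplified model, not the real compaction code: the `Cell` class and `compact` function are hypothetical, timestamps are in seconds, and it assumes every sstable holding a version of the cell takes part in the compaction.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cell:
    timestamp: int        # write time, in seconds for this sketch
    value: Optional[str]  # None marks a tombstone


def compact(cells: List[Cell], now: int, gc_grace_seconds: int) -> Optional[Cell]:
    """Compact all versions of one cell into at most one surviving cell."""
    newest = max(cells, key=lambda c: c.timestamp)
    if newest.value is not None:
        # Live value wins: the older, shadowed versions are simply dropped.
        return newest
    if now - newest.timestamp < gc_grace_seconds:
        # Tombstone is still inside gc_grace_seconds: it must be rewritten.
        return newest
    # Tombstone is old enough to be purged along with what it covered.
    return None


# An overwrite leaves only the newest live value after compaction.
print(compact([Cell(10, "a"), Cell(20, "b")], now=30, gc_grace_seconds=864000))
# A delete leaves a tombstone behind until gc_grace_seconds has passed.
print(compact([Cell(10, "a"), Cell(20, None)], now=30, gc_grace_seconds=864000))
```

In the overwrite case nothing extra survives compaction, while in the delete case the tombstone itself is carried forward, which is why deletes are the expensive operation and upserts are not.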

How does Cassandra manage insertion, update and deletion of columns and column data internally?

I am getting confused by some concepts regarding Cassandra.
What do we actually mean by updating a Cassandra row? Does it mean adding more columns, updating the value of a column, or both?
When we add more columns to a row, is the previous row in the SSTable invalidated and a new row entry inserted in the SSTable with the newly added columns?
Since an SSTable is immutable, will each update of column data, addition of a column, or deletion of column data result in invalidating the previous row and inserting a new row with all the previous columns plus the new column?
Please help.
What do we actually mean by updating a Cassandra row? Does it mean adding more columns, updating the value of a column, or both?
In Cassandra, updating a row and inserting a row are the same operation. Both lead to adding data to a memtable (an in-memory sstable), which is later flushed to disk and becomes an sstable (a log line is also written to the commit log if persistent writes are enabled). If you insert a column which already exists (by the way, in Cassandra terms a column is the same as a cell, and a row is known as a partition; you might find this useful if you do any further reading), e.g.:
INSERT INTO db.tbl (id, value) VALUES ('text_id1', 'some text as a value');
INSERT INTO db.tbl (id, value) VALUES ('text_id1', 'some text as a value');
You'll end up with 1 partition, since the first one is overwritten by the second insert. This means that inserting partitions with duplicate keys leads to the previous one being overwritten (and the overwrite is based on the timestamp at the time of insert, last write wins).
When we add more columns (cells) to a row (partition), is the previous row in the SSTable invalidated and a new row entry inserted in the SSTable with the newly added columns?
For CQL, the previous columns will just contain a null value. No invalidation will happen; you can alter schemas as you please. If you delete a column, its data will be removed during the next compaction, with the aim of reclaiming disk space.
Since an SSTable is immutable, will each update of column data, addition of a column, or deletion of column data result in invalidating the previous row and inserting a new row with all the previous columns plus the new column?
Kind of. SSTables are merged into larger SSTables when necessary; how this is done depends on the compaction strategy being used. There are two flavours, size-tiered and levelled compaction. Covering how they work is a whole separate question that has been answered by people who are smarter than me, so have a read here.
Updating is covered here:
http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_write_update_c.html
As you note, SSTables are immutable, so you're probably wondering what happens when a later write supersedes data already in an SSTable. The storage engine reads from all SSTables that might have data for a requested row (as determined by the bloom filter for each SSTable). Understanding the read path might clarify this for you:
http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_about_reads_c.html
Specifically:
http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_about_read_path_c.html
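As a rough illustration of how bloom filters narrow down which SSTables a read has to touch, here is a small self-contained Python sketch. It is not how Cassandra implements its filters: the `BloomFilter` and `SSTable` classes, the filter size, and the hashing scheme are all assumptions chosen to keep the example short.

```python
import hashlib
from typing import Dict, List


class BloomFilter:
    """Tiny bloom filter: may report false positives, never false negatives."""

    def __init__(self, size: int = 1024, hashes: int = 3) -> None:
        self.size = size
        self.hashes = hashes
        self.bits = [False] * size

    def _positions(self, key: str) -> List[int]:
        positions = []
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            positions.append(int.from_bytes(digest[:8], "big") % self.size)
        return positions

    def add(self, key: str) -> None:
        for position in self._positions(key):
            self.bits[position] = True

    def might_contain(self, key: str) -> bool:
        return all(self.bits[position] for position in self._positions(key))


class SSTable:
    """Immutable set of rows plus a bloom filter over their keys."""

    def __init__(self, rows: Dict[str, Dict[str, str]]) -> None:
        self.rows = rows
        self.bloom = BloomFilter()
        for key in rows:
            self.bloom.add(key)


def candidate_sstables(key: str, sstables: List[SSTable]) -> List[SSTable]:
    """Return the SSTables that might hold data for the key."""
    return [table for table in sstables if table.bloom.might_contain(key)]


first = SSTable({"k1": {"a": "1"}})
second = SSTable({"k2": {"a": "2"}})
third = SSTable({"k1": {"b": "3"}})

# Both SSTables holding "k1" must be read and merged to build the row.
print(len(candidate_sstables("k1", [first, second, third])))
```

A filter can only say "definitely not here" or "maybe here", so a read for a key that was written in several flushes still has to visit every SSTable that reports "maybe".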

Deleting a row of supercolumns and immediately replacing it with new data

Say I have a row of super-columns in Cassandra. I delete the entire row (it is now marked with a tombstone). I then immediately (before any compaction / nodetool repair) add different data with the exact same row key. My question is, does Cassandra properly handle this and delete the data, or is there a risk of sstables being orphaned that should have been deleted?
It all depends on the timestamps. The later timestamp wins, so if the delete's timestamp is before the modification's timestamp, the modification wins and puts the data in there.
Dean
PlayOrm for Cassandra Developer
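The timestamp rule from the answer above can be sketched in Python as follows. This is only a toy model: `live_columns` and the sample column names are hypothetical, and it assumes a delete beats a write that carries the same timestamp.

```python
from typing import Dict, Tuple


def live_columns(row_tombstone_ts: int, cells: Dict[str, Tuple[int, str]]) -> Dict[str, str]:
    """Return the columns written strictly after the row deletion."""
    return {
        column: value
        for column, (timestamp, value) in cells.items()
        if timestamp > row_tombstone_ts
    }


# The row was deleted at time 200, then new data arrived at time 300.
cells = {"old_col": (100, "x"), "new_col": (300, "y")}
print(live_columns(200, cells))
```

The old column stays hidden by the row tombstone and the new column is visible, without either write needing to know about the other.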
