Does the same partition key in different cassandra tables add up to cell theoretical limit? - cassandra

It is known that a Cassandra partition has a theoretical limit of 2 billion cells. But how does that work in a situation like this one below:
create table table1 (
some_id int PRIMARY KEY,
some_name text
);
create table table2 (
other_id int PRIMARY KEY,
other_name text
);
Assume we have 1 billion cells in partition (some_id = 1) on table1.
If we had another 1 billion cells in partition (other_id = 1) on table2, would those add up to the 2 billion theoretical limit?
In other words, are equal partition keys in different tables stored together?

Different tables have different partitions. This makes the structure of any particular partition homogeneous (it will always follow the prescribed schema of a single table), which allows for optimizations.
If you look at the storage engine under the hood you'll see that every table even has its own directory structure, making it clear that a partition from one table will never interact with the partition of another (see /var/lib/cassandra/).
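For illustration only, a data directory might look roughly like this (the keyspace name and table IDs below are made up, and the exact file layout varies by Cassandra version):
/var/lib/cassandra/data/my_keyspace/table1-<table_id>/   (sstables for table1 only)
/var/lib/cassandra/data/my_keyspace/table2-<table_id>/   (sstables for table2 only)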

Related

How does Cassandra Partitioning actually work?

I understand that two tables with the same partition columns and values have the same token generated. Does that mean that all the cells of this partition in both tables are actually in the same partition? How does Cassandra store data internally?
E.g.:
Create table table1 (emp_id int PRIMARY KEY, name text, role text);
Create table table2 (emp_id int PRIMARY KEY, name text, role text);

INSERT INTO table1(emp_id, name, role) VALUES (1, 'sahil', 'MTS');
INSERT INTO table2(emp_id, name, role) VALUES (1, 'sahil', 'MTS');
SELECT token(emp_id) from table1 where token(emp_id) = token(1);
system.token(emp_id)
----------------------
7447223576279188802
SELECT token(emp_id) from table2 where token(emp_id) = token(1);
system.token(emp_id)
----------------------
7447223576279188802
For your example, because both tables have the same partition key, identical values will be mapped to the same token. On insert, the hash function is applied to the partition key value to determine which replicas will get the data. If you use the Murmur3 partitioner (the default), you get a consistent token value, i.e. the same partition key and value always produce the same result. You can reference this page for understanding:
https://docs.datastax.com/en/cassandra-oss/3.x/cassandra/architecture/archDataDistributeHashing.html
Rows (items of data) that have the same table and the same partition key are said to be in the same partition. The most important consequence of being in the same partition is that its data is guaranteed to be co-located: handled by the same replica nodes and, in ScyllaDB, even by the same CPU. This allows a partition to be scanned efficiently: all of the partition's data can be read from a single node, and Cassandra doesn't need to go back and forth between replicas to read the various pieces of the partition and combine them. This is also what allows a node that holds the partition's full data to keep it sorted by the clustering key: a process called compaction merges different pieces of a sorted partition (the sstables, or sorted string tables) into a bigger sorted partition.
When you have two different tables in the same keyspace and use the same partition key in both, they are not stored physically on disk together, because each table has its own set of sstables (files on disk), so in that sense they are not "in the same partition". However, the co-location property mentioned earlier still holds (if the two tables are in the same keyspace): two identically-keyed partitions in the two tables will be stored on exactly the same nodes. Why is this important or useful? Usually it isn't. One place where this knowledge can become useful is that, in some situations, it can be used to achieve an atomic batch write to both tables at once, utilizing the fact that all replicas will see both writes together, whereas usually two writes to two tables go to different nodes at different times.
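As a rough sketch of that batching point, reusing the table1/table2 example from the question above (whether a logged batch fits your workload is a separate decision):
BEGIN BATCH
    INSERT INTO table1 (emp_id, name, role) VALUES (1, 'sahil', 'MTS');
    INSERT INTO table2 (emp_id, name, role) VALUES (1, 'sahil', 'MTS');
APPLY BATCH;
-- Both inserts hash to the same token, so the same replica nodes receive them,
-- which keeps this batch cheaper than one spanning unrelated partitions.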

Primary Key in Cassandra

I have the following scenario:
My data has an id field, and this field is constantly increasing.
When an event is created, id = 1 is assigned automatically.
Then 2, 3, 4, and so on.
Once data with id = 1 has been generated, that id will never be generated again.
I want to store this data in Cassandra. I can set the primary key to the id field, but I don't know how Cassandra will create partitions for each record.
Will it create one partition for each record?
Or will it create range partitions by primary key? For example, ids from 1 to 100 are the first partition, 100 to 200 the second partition, etc.
In Cassandra, the partition key uniquely identifies a single partition (record) in the table. For clarification, the primary key:
must have 1 partition key
zero or more clustering columns
So the primary key doesn't equate to a range of partitions.
Compared to a traditional RDBMS, which has two-dimensional tables, Cassandra tables can be two-dimensional but can also be three-dimensional or more. The power of Cassandra is that tables can be multi-dimensional, meaning each partition can have one or more rows (it can have thousands).
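To make that concrete, here is a minimal sketch; the table and column names are invented for illustration:
CREATE TABLE events_by_user (
    user_id  int,         -- partition key: identifies the partition
    event_ts timestamp,   -- clustering column: orders rows inside the partition
    payload  text,
    PRIMARY KEY (user_id, event_ts)
);
-- All rows with user_id = 1 live in one partition, sorted by event_ts.
-- A single-column primary key (as in the question) gives one row per partition.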
If you're interested, I've explained this in a bit more detail with examples in this post -- https://community.datastax.com/questions/6171/. Cheers!

Trying to visualize how wide and skinny rows are laid out

Can someone explain and show me how the data is laid out when you design your tables for wide vs. skinny rows?
I'm not sure I fully grasp how the data is spread out with a "wide" row.
Is there a difference in how you can fetch the data, or will it be the same, i.e. if it is ordered, does it matter whether the data is organized vertically (skinny) or horizontally (wide)?
Update
Is a table considered wide if the primary key consists of more than one column?
Or will a table have wide rows only if the partition key is a composite partition key?
Wide... Skinny... Terms that make your head explode... I prefer to oversimplify the thing as such:
All the tables have wide rows
You simply need to take care of how wide the rows gets
This allows me to think of it as follows (mangling the C* terminology a bit):
Number of RECORDS in a partition
1 <-------------------------------------------> 2 billion
^                                               ^
skinny rows                                     wide rows
The fewer records in a partition, the skinnier the partition, and vice versa.
When designing for C* I always keep in mind a couple of things:
I want to use "skinny partitions" when my data can be fetched with one query and it is fully contained in one record of one partition. Typical example is something along SELECT * FROM table WHERE username = 'xmas79'; where the table has a primary key in the form of PRIMARY KEY (username)that let me get all the data belonging to a particular username.
I want to use "wide rows" when my data can be fetched with one query and it is fully contained on multiple records of one partition. Typical examples are range queries like SELECT * FROM table WHERE sensor = 'pressure' AND time >= '2016-09-22';, where the table has a primary key in the form of PRIMARY KEY (sensor, time).
So, the first approach is for one-shot queries and the second for range queries. Beware that this second approach has the (major) drawback that you can keep adding data to the partition, and it will get wider and wider, hurting performance.
In order to control how wide your partitions are, you need to add something to the partition key. In the sensor example above, provided you don't violate your requirements of course, you can "group" some measurements by date, e.g. splitting the measurements into day-by-day groups, making the primary key PRIMARY KEY ((sensor, day), time), where the partition key has been transformed to (sensor, day). With this approach, you have full (well, let's say at least good) control over the wideness of your partitions.
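A sketch of that day-bucketed version (the day column and the types are assumptions):
CREATE TABLE measurements_by_day (
    sensor text,
    day    date,
    time   timestamp,
    value  double,
    PRIMARY KEY ((sensor, day), time)   -- composite partition key caps partition width
);
-- Queries now always name the bucket:
SELECT * FROM measurements_by_day
WHERE sensor = 'pressure' AND day = '2016-09-22' AND time >= '2016-09-22 00:00:00';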
You only need to find a good compromise between your query capabilities and the desired performance.
I suggest these three readings for further investigation on the details:
Wide Rows in Cassandra CQL
Does CQL support dynamic columns / wide rows?
CQL3 for Cassandra experts
Beware that in the first reading there's a mistake in the second-to-last picture: the primary key should be
PRIMARY KEY ((user_id, tweet_id))
with double parentheses around the columns instead of one.

What are the pros and cons of grouping multiple table together in Cassandra?

The problem is Cassandra cannot handle a lot of tables per cluster (> 1000). I was looking for ways to reduce the number of tables, and one of them was grouping multiple tables that share the same structure together.
Let's say we have two tables, A and B:
create table A (
key text,
value text,
primary key(key)
)
and
create table B (
key text,
value text,
primary key(key)
)
We can group them together by adding one more partition key
create table Shared (
original_table_name text, // either 'A' or 'B'
key text,
value text,
primary key(original_table_name, key)
)
My question is, is it a good pattern and what are the consequences of modelling data this way?
Please elaborate on what you mean by a lot of tables, because our production cluster is running with 50+ tables, and I don't see any issue with it.
Anyway, if your application is using a lot of tables, the most probable cause is normalized tables. In Cassandra you should always create denormalized tables, because there is no join facility. Cassandra is built for very fast writes, so you can count on that and not worry about duplicating data.
Now regarding the new design, I don't see any problem with it; the only thing is that your partition key should be the combination (table_name, key) and not just table_name, so that the data will be evenly distributed across nodes.
And of course, to query, you will have to specify both table_name and key each time.
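A sketch of the adjusted schema with the composite partition key (same columns as in the question):
CREATE TABLE Shared (
    original_table_name text,   -- either 'A' or 'B'
    key   text,
    value text,
    PRIMARY KEY ((original_table_name, key))   -- both columns form the partition key
);
-- Every read must then supply both parts:
SELECT value FROM Shared WHERE original_table_name = 'A' AND key = 'some-key';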

Are sorted columns in Cassandra using just one set of nodes? (one set = replication factor)

Using older versions of Cassandra, we were expected to create our own sorted rows using a special row of columns, because columns are saved sorted in Cassandra.
Is Cassandra 3.0 with CQL using the same concept when you create a PRIMARY KEY?
Say, for example, that I create a table like so:
CREATE TABLE my_table (
created_on timestamp,
...,
PRIMARY KEY (created_on)
);
Then I add various entries like so:
INSERT INTO my_table (created_on, ...) VALUES (1, ...);
...
INSERT INTO my_table (created_on, ...) VALUES (9, ...);
How does Cassandra manage the sort on the PRIMARY KEY? Will that happen on all nodes, or only on one set? (What I call a set is the number of replicas, so if you have a cluster of 100 nodes with a replication factor of 4, would the primary key appear on 100 nodes, 25, or just 4? With older versions, it would only be on 4 nodes.)
In your case the primary key is the partition key, which used to be the row key. This means the data you are inserting will be present on 4 out of 100 nodes if the replication factor is set to 4.
In CQL you can add more columns to the primary key; these are called clustering keys. When querying C* with CQL, the result set might contain more than one row for a partition key. Those rows are logical and are stored in the partition whose partition key they share (but they vary in their clustering key values). The data in those logical rows is replicated just as the partition is.
Have a look at the example for possible primary keys in the official documentation of the CREATE TABLE statement.
EDIT (row sorting):
C* keeps the partitions of a table in the order of their partition key values' hash codes. The ordering is therefore not straightforward, and the results of range queries by partition key values are not what you would expect them to be. But as partitions are in fact ordered, you can still do server-side pagination with the help of the token function.
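As a rough illustration of that token-based paging on the question's table (the timestamp literal is just a placeholder):
SELECT created_on FROM my_table
WHERE token(created_on) > token('2016-09-22 00:00:00') LIMIT 100;
-- Returns the next 100 partitions after the given one, in token (hash) order,
-- not in chronological order.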
That said, you could employ the ByteOrderedPartitioner to achieve lexical ordering of your partitions. But it is very easy to create hotspots with that partitioner and it is generally discouraged to use it.
The rows of a given partition are ordered by the actual values of their clustering keys. Range queries on those behave as you'd expect them to.
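For contrast, a sketch of a table with a clustering key, where in-partition range queries behave naturally (the day bucketing and column names are assumptions, not part of the question's schema):
CREATE TABLE my_table_by_day (
    day        date,
    created_on timestamp,
    payload    text,
    PRIMARY KEY (day, created_on)   -- rows within each day partition are sorted by created_on
);

SELECT * FROM my_table_by_day
WHERE day = '2016-09-22' AND created_on >= '2016-09-22 08:00:00';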
