ConfigurationException: Invalid type int for column above_size: Cannot mix counter and non counter columns in the same table - cassandra

Why do I get this error message when I create this table?
CREATE TABLE items_by_name (
item_id uuid,
user_id uuid,
name TEXT,
image VARCHAR,
"desc" TEXT,
price DECIMAL,
category TEXT,
trouser_size INT,
shoe_size INT,
above_size INT,
color TEXT,
liked_user_id INT,
like_count counter,
PRIMARY KEY (name, item_id)
);

Tables with counters are handled specially in Cassandra. As a result, non-counter types may appear only as part of the primary key, not as regular columns. From the documentation:
A table that contains a counter can only contain counters. In other words, either all the columns of a table outside the PRIMARY KEY have the counter type, or none of them have it.
There are other limitations as well; see the documentation.
In your case, you will need two tables: one for the counters and one for the non-counter columns. Just use the same primary key for both tables.
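The two-table split suggested above might look like the following sketch. The table name item_like_counts and the literal values are assumptions made for illustration, and the column list of the main table is abbreviated.

```cql
-- Regular (non-counter) columns stay in the original table.
CREATE TABLE items_by_name (
    name text,
    item_id uuid,
    user_id uuid,
    price decimal,
    category text,
    color text,
    PRIMARY KEY (name, item_id)
);

-- Hypothetical companion table: same primary key, counter columns only.
CREATE TABLE item_like_counts (
    name text,
    item_id uuid,
    like_count counter,
    PRIMARY KEY (name, item_id)
);

-- Counters are modified with UPDATE; the first increment creates the row.
UPDATE item_like_counts
SET like_count = like_count + 1
WHERE name = 'blue jeans'
  AND item_id = 123e4567-e89b-12d3-a456-426655440000;
```

Reading an item together with its like count then takes two queries, one per table, using the same key values.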

Related

Cassandra non counter family

I attempted to create a table with counter as one of the column types in Cassandra, but I am getting the following error:
ConfigurationException: ErrorMessage code=2300 [Query invalid because
of configuration issue] message="Cannot add a counter column
(transaction_count) in a non counter column family"
My table schema is as follows:
CREATE TABLE MARKET_DATA_TRANSACTION_COUNT (
TRADE_DATE TIMESTAMP,
SECURITY_EXCHANGE TEXT,
PRODUCT_CODE TEXT,
SYMBOL TEXT,
SPREAD_TYPE TEXT,
USER_DEFINED TEXT,
PRODUCT_GUID TEXT,
CHANNEL_ID INT,
SECURITY_TYPE TEXT,
INSTRUMENT_GUID TEXT,
SECURITY_ID INT,
TRANSACTION_COUNT COUNTER,
PRIMARY KEY (TRADE_DATE));
That's a limitation of the current counter implementation. You can't mix counters and regular columns in the same table. So you need a separate table for counters.
They are thinking of removing this limitation in Cassandra 3.x. See this Jira ticket.
This does not exactly answer the question, but it might help people who hit a similar error.
If you can make all the other columns part of the PRIMARY KEY, then it is possible.
E.g.: CREATE TABLE rate_data (ts varchar, type varchar, rate counter, PRIMARY KEY (ts, type));

Cassandra: Is there a limit to amount of data that a collection column can hold?

In the table below, what is the maximum size the phone_numbers column can accommodate?
1. Like normal columns, is it 2GB?
2. Is it 64K*64K as mentioned here?
CREATE TABLE d2.employee (
id int PRIMARY KEY,
doj timestamp,
name text,
phone_numbers map<text, text>
)
Collection types in Cassandra are represented as a set of distinct cells in the internal data model: you will have a cell for each key of your phone_numbers column. Therefore they are not normal columns, but a set of columns. You can verify this by executing the following command in cassandra-cli (1001 stands for a valid employee id):
use d2;
get employee[1001];
The correct answer is your point 2 (64K*64K).
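Because each map key is stored as its own cell, individual entries can be written and deleted without rewriting the whole map. A small sketch against the table above; the keys and phone numbers are made-up values:

```cql
-- Add or overwrite a single entry; the other keys are untouched.
UPDATE d2.employee SET phone_numbers['home'] = '555-0100' WHERE id = 1001;

-- Merge several entries into the existing map.
UPDATE d2.employee
SET phone_numbers = phone_numbers + {'work': '555-0101', 'mobile': '555-0102'}
WHERE id = 1001;

-- Remove one entry by key.
DELETE phone_numbers['home'] FROM d2.employee WHERE id = 1001;
```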

Create a super column using CQL3

I am upgrading my thrift api to cql3. My data contains SuperColumns as follows:
- User //column family
  - Division/name //my row key
    - DivHead //SuperColumn
      - name //Columns
      - address //Columns
I understand that all the column families become tables, the primary key becomes the row key, and the rest are the columns.
But my data has SuperColumns. How do I create SuperColumns using CQL3?
CREATE TABLE user (
rowkey varchar,
division text,
head_name text,
head_address text,
PRIMARY KEY (rowkey, division)
)
OR
CREATE TABLE user (
rowkey varchar,
division text,
head_name text,
head_address text,
PRIMARY KEY ((rowkey, division))
)
Under the covers, the first example stores all rows that share a rowkey in the same partition. Each rowkey will have a set of logical rows, one for each division. Those rows will contain two columns: head_name and head_address. You can query by rowkey alone and get all of its divisions (sorted!). Or you can query a rowkey with a range of divisions, or a single division, and get that subset of divisions with their division head and address.
The second example will have one partition for each rowkey and division combination. Each such partition will be one logical row as well. The single row for each composite key will have two columns: head_name and head_address. To make a query, you must provide BOTH the rowkey and the division.
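The difference shows up in the WHERE clauses each design accepts. The queries below are a sketch; the key values 'acme' and 'sales' are invented for illustration.

```cql
-- First design, PRIMARY KEY (rowkey, division):
-- all divisions of one rowkey, returned sorted by division.
SELECT division, head_name FROM user WHERE rowkey = 'acme';

-- A range of divisions within one rowkey.
SELECT division, head_name FROM user
WHERE rowkey = 'acme' AND division >= 'a' AND division < 'n';

-- Second design, PRIMARY KEY ((rowkey, division)):
-- both parts of the partition key must be given.
SELECT head_name FROM user WHERE rowkey = 'acme' AND division = 'sales';
```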

How does a CQL3 composite index with 3 fields map in the thrift column family world?

After reading this blog at planetcassandra, I'm wondering how a CQL3 composite index with 3 fields maps to the Thrift column family world. For example:
CREATE TABLE comments (
article_id uuid,
posted_at timestamp,
author text,
karma int,
content text,
PRIMARY KEY (article_id, posted_at)
)
Here the column article_id will be mapped to the internal row key and posted_at will be mapped to (the first part of) the cell name.
What if the table design is:
CREATE TABLE comments (
author_id varchar,
posted_at timestamp,
article_id uuid,
author text,
karma int,
content text,
PRIMARY KEY (author_id, posted_at, article_id)
)
Will the internal row key be mapped to the first two fields of the composite key, with article_id mapped to the cell name, so that a row can hold up to 2 billion article entries and any query on an author_id and posted_at combination is a single disk seek?
Is the behavior the same for any number of fields in a composite key?
Your answers are much appreciated.
The above observation is incorrect and the correct one is here
I've personally verified:
In the first case:
article_id = partition key, posted_at = clustering key
In the second case:
author_id = partition key, posted_at:article_id = clustering key
First part of composite key (author_id) is called "Partition Key",
rest (posted_at,article_id) are remaining keys.
Cassandra stores columns differently when composite keys are used. Partition key
becomes row key. Remaining keys are concatenated with each column
name (":" as separator) to form column names. Column values remain
unchanged.
Remaining keys (other than partition keys) are ordered,
and it's not allowed to search on any random column, you have to
start with the first one and then you can move to the second one and
so on. This is evident from "Bad Request" error.
There's an excellent explanation by Aaron Morton at his site, thelastpickle.
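To make the mapping concrete, here is a sketch for the second table from the question. The inserted values are invented, and the layout in the comments is a simplified picture of the storage described above, not the exact output of any tool.

```cql
INSERT INTO comments (author_id, posted_at, article_id, author, karma, content)
VALUES ('jdoe', '2013-01-15 10:00:00', 123e4567-e89b-12d3-a456-426655440000,
        'John Doe', 5, 'Nice post');

-- Simplified internal view:
--   row key: jdoe
--   cells:   <posted_at>:<article_id>:author  -> 'John Doe'
--            <posted_at>:<article_id>:karma   -> 5
--            <posted_at>:<article_id>:content -> 'Nice post'

-- Allowed: clustering columns are restricted in declaration order.
SELECT * FROM comments
WHERE author_id = 'jdoe' AND posted_at = '2013-01-15 10:00:00';

-- Not allowed as written: article_id is restricted while posted_at is skipped.
-- SELECT * FROM comments
-- WHERE author_id = 'jdoe' AND article_id = 123e4567-e89b-12d3-a456-426655440000;
```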
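In the first case:
article_id = partition key, posted_at = clustering key
In the second case, if the key is declared with an extra pair of parentheses as PRIMARY KEY ((author_id, posted_at), article_id):
author_id + posted_at = partition key, article_id = clustering key
As written in the question, PRIMARY KEY (author_id, posted_at, article_id) makes only author_id the partition key.
So be mindful of disk seeks if you adopt the composite partition key: check that the row would otherwise get too wide and that the change gives a real benefit over the first case.
If you are well within the 2 billion limit, don't overdo it by adopting the second method, because records are then dispersed by the combined key.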
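For reference, a partition key made of author_id plus posted_at is what CQL3 calls a composite partition key, and it is written with an extra pair of parentheses. A sketch; the table name comments_by_author_and_time and the literal values are hypothetical:

```cql
CREATE TABLE comments_by_author_and_time (
    author_id varchar,
    posted_at timestamp,
    article_id uuid,
    author text,
    karma int,
    content text,
    -- (author_id, posted_at) together form the partition key;
    -- article_id is the clustering column within each partition.
    PRIMARY KEY ((author_id, posted_at), article_id)
);

-- Both partition key columns must be supplied to read a partition.
SELECT * FROM comments_by_author_and_time
WHERE author_id = 'jdoe' AND posted_at = '2013-01-15 10:00:00';
```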

Cassandra Composite Columns - How are CompositeTypes chosen?

I'm trying to understand the type used when I create composite columns.
I'm using CQL3 (via cqlsh) to create the CF and then the CLI to issue a describe command.
The Types in the Columns sorted by: ...CompositeType(Type1,Type2,...) are not the ones I'm expecting.
I'm using Cassandra 1.1.6.
CREATE TABLE CompKeyTest1 (
KeyA int,
KeyB int,
KeyC int,
MyData varchar,
PRIMARY KEY (KeyA, KeyB, KeyC)
);
The returned CompositeType is
CompositeType(Int32,Int32,UTF8)
Shouldn't it be (Int32,Int32,Int32)?
CREATE TABLE CompKeyTest2 (
KeyA int,
KeyB varchar,
KeyC int,
MyData varchar,
PRIMARY KEY (KeyA, KeyB, KeyC)
);
The returned CompositeType is
CompositeType(UTF8,Int32,UTF8)
Why isn't it the same as the types used when I define the table? I'm probably missing something basic in the type assignment...
Thanks!
The composite column name is composed of the values of primary keys 2...n and the name of the non-primary key column being saved.
(So if you have 5 non-key fields then you'll have five such columns and their column names will differ only in the last composed value which would be the non-key field name.)
So in both examples the composite column is made up of the values of KeyB, KeyC and the name of the column being stored ("MyData", in both cases). That's why you're seeing those CompositeTypes being returned.
(btw, the first key in the primary key is the partitioning key and its value is only used as the row key (if you're familiar with Cassandra under the covers). It is not used as part of any of the composite column names.)
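As an illustration, here is how one row of CompKeyTest1 would break down under this scheme. The values are invented, and the layout in the comments is a simplified sketch rather than actual CLI output.

```cql
INSERT INTO CompKeyTest1 (KeyA, KeyB, KeyC, MyData) VALUES (1, 20, 300, 'hello');

-- Simplified internal view:
--   row key:     1                                  (value of KeyA)
--   column name: 20 : 300 : <name of MyData column> (KeyB value, KeyC value, column name)
--   value:       'hello'
-- The three name components are an int, an int and a string,
-- which matches CompositeType(Int32,Int32,UTF8).
-- For CompKeyTest2, KeyB is a varchar, so the components are
-- a string, an int and a string: CompositeType(UTF8,Int32,UTF8).
```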
