CQL select showing encoded values - cassandra

I am new to Cassandra and trying my hand at basic commands. The following is how I am inserting using cassandra-cli:
set contactManagementSystem['rowkey3']['firstName'] = 'xyz';
set contactManagementSystem['rowkey3']['lastName'] = 'abc';
But when I try to view those values in cqlsh, this is what it shows:
cqlsh:test> select * from "contactManagementSystem";
 key              | column1              | value
------------------+----------------------+--------------------------
 0x726f776b657933 | 0x66697273744e616d65 | 0x41616b61
 0x726f776b657933 | 0x6c6173744e616d65   | 0x4d
 0x726f776b657933 | 0x70686f6e65         | 0x3631372d3132332d373839
I just want to understand why this is happening and what I am doing wrong. (Apologies for the odd-looking output; I do not have enough reputation to post images.)

When you create a table/column family using cassandra-cli, the default data type of the columns is BytesType. So when you describe/select the data, it is shown in bytes (hex) format. But you can declare the data type for each column while creating the column family; the DataStax documentation covers this.
You can also create the column family using cqlsh, and declare a data type for each column there.
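For example, a minimal sketch of declaring the types at creation time in cassandra-cli (the column family name is from the question; the UTF8 types are an assumption):
create column family contactManagementSystem
    with key_validation_class = UTF8Type
    and comparator = UTF8Type
    and default_validation_class = UTF8Type;
A cqlsh equivalent of the same wide-row layout would be:
CREATE TABLE "contactManagementSystem" (
    key text,
    column1 text,
    value text,
    PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE;
With the types declared as UTF8/text, cqlsh renders the values as readable strings instead of hex.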
And doing CRUD through cqlsh is very simple.
Hope it will help you.

Related

How to add a new column in jOOQ?

I want to translate this SQL query into jOOQ.
I want the new field to come from a calculated value, but here I just add it as a constant value.
select
*,
100 as "newField"
from author
I tried this
Result<Record> r = create.select(DSL.asterisk(), DSL.val(100).as("newField"))
                         .from(DSL.table("author"))
                         .fetch();
That generates this SQL (which looks fine)
select
*,
? as "newField"
from author
But I get this as the result (expected: the whole table with one extra column whose value is 100):
+--------+
|newField|
+--------+
|1 |
|2 |
+--------+
In pgAdmin, the above SQL query does what I need (adds a new field to the table with all values 100).
I don't know what I am doing wrong; the result looks like newField gets the row number as its value.
Also, when generating the SQL, how can I print it with the 100 value instead of the ? placeholder?
This is a known limitation; the usage of asterisk() with a plain SQL query is limited: https://github.com/jOOQ/jOOQ/issues/7841
Workarounds include:
Using the code generator (which is always a good idea)
Wrapping the query in another derived table, and selecting asterisk() alone in the outermost query (see the sketch below)
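A rough sketch of that derived-table workaround, reusing the plain SQL table from the question (the alias "t" is arbitrary):
// Inner query carries the constant column; the outer query selects
// asterisk() alone, sidestepping the plain-SQL asterisk() limitation.
Select<Record> inner = create
    .select(DSL.asterisk(), DSL.val(100).as("newField"))
    .from(DSL.table("author"));

Result<Record> r = create
    .select(DSL.asterisk())
    .from(inner.asTable("t"))
    .fetch();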
Also in SQL generation how i can print the SQL with the 100 value instead of ?
That's a different question, not strictly related. There are various ways to generate inline values in jOOQ. In your case, Query.getSQL(ParamType.INLINED) could probably help you "print" the SQL. Though you might have had something else in mind, in which case I recommend asking a new question with more details about this.
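For instance, a hypothetical usage on the query above:
// Render the query with bind values inlined, so 100 appears instead of ?
String sql = create.select(DSL.asterisk(), DSL.val(100).as("newField"))
                   .from(DSL.table("author"))
                   .getSQL(ParamType.INLINED);
System.out.println(sql);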

Cassandra returns Unordered result set for numeric values

I am new to NoSQL and just started learning Cassandra. I have the following question: I have created a simple table with one column to understand Cassandra partitioning and clustering, and I am trying to query all the values after insertion.
My table structure
create table if not exists music_library (custno int, primary key (custno));
I inserted the following values in sequential order:
insert into music_library (custno) values (11);
insert into music_library (custno) values (12);
insert into music_library (custno) values (13);
insert into music_library (custno) values (14);
Then I queried the table:
select * from music_library;
It returns values in the following order:
13
11
14
12
but I was expecting:
11
12
13
14
Why is it behaving like that?
I ran your exact statements and produced the same result. But I also adjusted your query to run the token function, and this is what it produced:
aaron@cqlsh:stackoverflow> select custno, token(custno) from music_library;
custno | system.token(custno)
--------+----------------------
13 | -5034495173465742853
11 | -4156302194539278891
14 | 4279681877540623768
12 | 8582886034424406875
(4 rows)
Why is it behaving like that?
Simply put, because Cassandra cannot order results by the values of the partition keys.
As your table has a single primary key of custno, your rows are partitioned by the hashed token value of custno, and written to the nodes responsible for those token ranges. When you run an unbound query in Cassandra (query without a WHERE clause), the results are returned ordered by the hashed token values of their partition keys.
Using ORDER BY won't work here, either. ORDER BY can only sort data within a partition, and even then only on clustering keys. To get the custno values to order properly, you will need to find a new partition key, and then specify custno as a clustering key in an ascending direction.
Edit 20190916 - follow-up clarifications
Will this tokenization happen for all the columns?
No. The partition keys are hashed into a token to determine their placement in the cluster (which node(s) they are written to). Individual column values are written within a partition.
How do I return the inserted numbers in order?
You cannot alter the order of this table without changing the model. Simply put, you'll have to find a way to organize the values you expect to return (with your query) together (find another partition key). Exactly how that looks depends on your business/query requirements.
For example, let's say that I wanted to track which customers purchased specific music albums. I might create a table that looks like this:
CREATE TABLE customers_by_album (
    album TEXT,
    band TEXT,
    custno INT,
    PRIMARY KEY (album, custno))
WITH CLUSTERING ORDER BY (custno ASC);
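For reference, inserts like these (reconstructed from the query results below) would populate that partition:
INSERT INTO customers_by_album (album, band, custno) VALUES ('Moving Pictures', 'Rush', 11);
INSERT INTO customers_by_album (album, band, custno) VALUES ('Moving Pictures', 'Rush', 12);
INSERT INTO customers_by_album (album, band, custno) VALUES ('Moving Pictures', 'Rush', 13);
INSERT INTO customers_by_album (album, band, custno) VALUES ('Moving Pictures', 'Rush', 14);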
After inserting some data, the following query returns results ordered by custno:
aaron@cqlsh:stackoverflow> SELECT album, token(album), band, custno FROM customers_by_album WHERE album='Moving Pictures';
album | system.token(album) | band | custno
-----------------+---------------------+------+--------
Moving Pictures | 7819329704333693835 | Rush | 11
Moving Pictures | 7819329704333693835 | Rush | 12
Moving Pictures | 7819329704333693835 | Rush | 13
Moving Pictures | 7819329704333693835 | Rush | 14
(4 rows)
This works, because I am querying data by a partition (album), and then I am "clustering" on custno which leverages the on-disk sort order. This is also the order the data was written to disk in, so Cassandra just reads it from the partition sequentially.
I wrote an article on this topic for DataStax a few years ago, and it's still quite relevant. Give it a read if you get a chance: https://www.datastax.com/dev/blog/we-shall-have-order

How to convert int column to float/double column in Cassandra database table

I am using a Cassandra database in production. I have one column in a Cassandra table, e.g. coin_deducted, which is of int data type. I need to convert coin_deducted to float/double. I tried to change the data type with an ALTER TABLE command, but Cassandra throws an incompatibility error when converting int to float. Is there any way to do this?
E.g. currently it shows:
user_id | start_time | coin_deducted (int)
122 | 26-01-01 | 12
I want it to be:
user_id | start_time | coin_deducted (float)
122 | 26-01-01 | 12.0
Is it possible to copy an entire column into a newly added column in the same table?
Changing the type of a column is possible only if the old and new types are compatible. From the documentation:
To change the storage type for a column, the type you are changing to and from must be compatible.
Further proof that this cannot be done: when you write the statement
ALTER TABLE table_name ALTER int_column TYPE float;
it will tell you that the types are incompatible. This is also logical, since float is a broader type than int (it has a decimal part), and the database would not know what to put in the decimal places. The documentation includes a list of compatible types which can be altered from one to another without problems.
Solution 1
You can do it at the application level: create one more column in that table which is float, and create a background job which loops through all records and copies the int value to the new float column.
We created a Cassandra migration tool for DATA and SCHEMA migrations for cases like this: you add it as a dependency, write a SCHEMA migration which adds the new column, and add a DATA migration which fires in the background and copies values from the old column to the new column. There is a Java example application that demonstrates its usage.
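A sketch of the CQL side of that approach (the table name user_coins and the user_id/start_time primary key are assumptions; the application supplies the loop and the bind values):
ALTER TABLE user_coins ADD coin_deducted_f float;
-- executed by the background job once per existing row:
UPDATE user_coins SET coin_deducted_f = ? WHERE user_id = ? AND start_time = ?;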
Solution 2
If you do not have an application level and want to do this purely in CQL, you can use the COPY command to export the data to CSV, create a new table with the float column, adjust the int values in the CSV if needed, and load the data into the new table.
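For example, a rough cqlsh sketch (the table names and the user_id/start_time primary key are assumptions based on the question):
COPY coins_old (user_id, start_time, coin_deducted) TO 'coins.csv';
CREATE TABLE coins_new (
    user_id int,
    start_time text,
    coin_deducted float,
    PRIMARY KEY (user_id, start_time)
);
COPY coins_new (user_id, start_time, coin_deducted) FROM 'coins.csv';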

Does CQL3 require a schema for Cassandra now?

I've just had a crash course in Cassandra over the last week and went from the Thrift API, to CQL, to grokking SuperColumns, to learning I shouldn't use them and should use Composite Keys instead.
I'm now trying out CQL3, and it would appear that I can no longer insert into columns that are not defined in the schema, or see those columns in a select *.
Am I missing some option to enable this in CQL3, or does it expect me to define every column in the schema (defeating the purpose of wide, flexible rows, imho)?
Yes, CQL3 does require columns to be declared before use.
But you can do as many ALTERs as you want; no locking or performance hit is entailed.
That said, most of the places where you'd use "dynamic columns" in earlier C* versions are better served by a Map in C* 1.2, as sketched below.
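A minimal sketch of the Map approach (the table and column names are illustrative):
CREATE TABLE users (
    id text PRIMARY KEY,
    attributes map<text, text>
);
-- add "dynamic" key/value pairs without altering the schema:
UPDATE users SET attributes['nickname'] = 'Dino' WHERE id = 'user1';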
I suggest exploring composite columns with "WITH COMPACT STORAGE".
A "COMPACT STORAGE" column family lets you define, in practice, only the key columns:
Example:
CREATE TABLE entities_cargo (
    entity_id ascii,
    item_id ascii,
    qt ascii,
    PRIMARY KEY (entity_id, item_id)
) WITH COMPACT STORAGE;
Actually, when you insert different values of item_id, you don't add a row with entity_id, item_id and qt; you add a column whose name is the item_id content and whose value is the qt content.
So:
insert into entities_cargo (entity_id, item_id, qt) values ('100', 'oggetto 1', '3');
insert into entities_cargo (entity_id, item_id, qt) values ('100', 'oggetto 2', '3');
Now, here is how you see these rows in CQL3:
cqlsh:goh_master> select * from entities_cargo where entity_id = '100';
entity_id | item_id | qt
-----------+-----------+----
100 | oggetto 1 | 3
100 | oggetto 2 | 3
And here is how they look if you check them from the cli:
[default@goh_master] get entities_cargo['100'];
=> (column=oggetto 1, value=3, timestamp=1349853780838000)
=> (column=oggetto 2, value=3, timestamp=1349853784172000)
Returned 2 results.
You can access a single column with
select * from entities_cargo where entity_id = '100' and item_id = 'oggetto 1';
Hope it helps
Cassandra still allows wide rows. This answer references a DataStax blog entry, written after the question was asked, which details the links between CQL and the underlying architecture.
Legacy support
Consider a dynamic column family defined through Thrift with the following command (notice there is no column-specific metadata):
create column family clicks
    with key_validation_class = UTF8Type
    and comparator = DateType
    and default_validation_class = UTF8Type;
Here is the exact equivalent in CQL:
CREATE TABLE clicks (
    key text,
    column1 timestamp,
    value text,
    PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE;
Both of these commands create a wide-row column family that stores records ordered by date.
CQL Extras
In addition, CQL provides the ability to assign labels to the row id, column, and value elements to indicate what is being stored. The following alternative way of defining the same structure in CQL highlights this feature on DataStax's example: a column family used for storing users' clicks on a website, ordered by time:
CREATE TABLE clicks (
    user_id text,
    time timestamp,
    url text,
    PRIMARY KEY (user_id, time)
) WITH COMPACT STORAGE;
Notes
a Table in CQL is always mapped to a Column Family in Thrift
the CQL driver uses the first element of the primary key definition as the row key
Composite Columns are used to implement the extra columns that one can define in CQL
using WITH COMPACT STORAGE is not recommended for new designs because it fixes the number of possible columns. In other words, ALTER TABLE ... ADD is not possible on such a table. Just leave it out unless it's absolutely necessary.
Interesting, something I didn't know about CQL3. In PlayOrm, the idea is that you define a "partial" schema, and in the WHERE clause of the select you can only use what is defined in that partial schema, BUT it returns ALL the data in the rows, EVEN the data it does not know about. I would have expected CQL to do the same :( I need to look into this now.
thanks,
Dean

Cassandra (Pycassa/CQL) Return Partial Match

I'm trying to do a partial search through a column family in Cassandra, similar to an SQL query like SELECT * FROM columnfamily WHERE col = 'val*', where val* means any value matching at least the first three characters 'val'.
I've read DataStax's documentation on the SELECT function, but can't seem to find any support for partial WHERE criteria. Any ideas?
There is no wildcard support like this in Cassandra, but you can model your data in such a way that you could get the same end result.
You would take the column that you want to perform this query on and denormalize it into a second column family. This CF would have a single wide row in which each column name is a value of the col you want to run the wildcard query on. The column value in this CF could either be the row key of the original CF or some other representation of the original row.
Then you would use slicing to get out the values you care about. For example if this was the wide row to slice on:
+---------+----------+--------+----------+---------+--------+----------+
| RowKey  | aardvark | abacus | abacuses | abandon | accent | accident |
|         |          |        |          |         |        |          |
+---------+----------+--------+----------+---------+--------+----------+
Using CQL you could select out everything starting with 'aba*' using this query*:
SELECT 'aba'..'abb' from some_cf where RowKey = some_row_key;
This would give you the columns for 'abacus', 'abacuses', and 'abandon'.
There are some things to be aware of with this strategy:
In the above example, if you have things with the same column_name you need to have some way to differentiate between them (otherwise inserting into the wide column family will clobber other valid values). One way that you could do this is by using a composite column of word:some_unique_value.
The above model only allows wild cards at the end of the string. Wild cards at the beginning of the string could also easily be handled with a few modifications. Wild cards in the middle of a string would be much more challenging.
Remember that Cassandra doesn't give you an easy way to do ad-hoc queries. Instead you need to figure out how you will be using the data and model your CFs accordingly. Take a look at this blog post from Ed Anuff on indexing data in Cassandra for more info on modeling data like this.
*Note that the CQL syntax for slicing columns is changing in an upcoming release of Cassandra.
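For comparison, a hedged sketch of how the same slice reads in later CQL3, assuming the words become a clustering column (the table and column names are hypothetical):
CREATE TABLE words_by_bucket (
    bucket text,
    word text,
    PRIMARY KEY (bucket, word)
);
-- range slice over the clustering column, equivalent to 'aba*':
SELECT word FROM words_by_bucket
WHERE bucket = 'some_row_key' AND word >= 'aba' AND word < 'abb';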
