How do you nest a UDT inside another UDT in Cassandra? - cassandra

I have created the following user-defined types (UDTs) in Cassandra:
CREATE TYPE keyspace.location (
latitude text,
longitude text,
accuracy text,
address text
);
CREATE TYPE keyspace.person (
name text,
contact_no text,
alternate_contact text
);
and I want to use these to create another UDT
CREATE TYPE keyspace.pinpoint(
user <person>,
location <location>
);

You can nest UDTs by simply specifying your UDT as the type within another UDT in this manner:
CREATE TYPE keyspace.pinpoint (
user person,
location location
);
You don't enclose them in <> brackets because those are used for collections.
As a side note, I personally wouldn't nest UDTs unless you have no other option. UDTs are not as flexible as native columns. Inserting or updating data in a nested UDT can get very complicated and hard to maintain.
Whenever possible, try to use generic table definitions. For example, instead of defining the type pinpoint, try to use a table with native column types and clustered rows. Cheers!

You need to declare these nested UDTs as frozen<UDTName>, like:
CREATE TYPE keyspace.pinpoint(
user frozen<person>,
location frozen<location>
);
But this means that you won't be able to update their individual fields. You'll only be able to update the whole field with a complete UDT instance, for example a complete user or location.
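A minimal sketch of what that whole-value update could look like. The keyspace my_ks, the table pins, and all sample values are hypothetical, and the sketch assumes the person, location and pinpoint types already exist in that keyspace:

```cql
-- Hypothetical table holding the nested UDT as one frozen value.
CREATE TABLE my_ks.pins (
    id uuid PRIMARY KEY,
    pin frozen<pinpoint>
);

-- A frozen value is replaced as a whole, so every nested field is supplied again.
UPDATE my_ks.pins
SET pin = {
    user: {name: 'Ann', contact_no: '555-0100', alternate_contact: '555-0101'},
    location: {latitude: '12.97', longitude: '77.59', accuracy: '10', address: 'MG Road'}
}
WHERE id = 123e4567-e89b-12d3-a456-426614174000;

-- A field-level assignment such as "SET pin.user = {...}" is rejected for a frozen column.
```

Any nested field left out of the literal is stored as null, so the application usually has to read the current value first if it only wants to change one field.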

Related

Cassandra, custom types - Is it possible to extend custom type later?

I have created a custom type in cassandra:
CREATE TYPE IF NOT EXISTS my_type (
id ascii,
name ascii
);
I use this type in my new table:
CREATE TABLE IF NOT exists person
(
id my_type,
name ascii
);
Is it possible to extend a custom type later (add new fields, etc.), after the database schema has been created? For example, if my structure changes and I need to add some fields to this type, would Cassandra complain about it, or could I simply change the custom type?
There is limited support for schema evolution of user-defined types:
You can add new fields to UDT
You can rename the existing field
But you can't do:
Drop existing field from UDT
Change the type of the existing field
See documentation on ALTER TYPE command.
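As a rough illustration of the two supported operations, applied to the my_type from the question (the field names email and full_name are made up for this example):

```cql
-- Add a new field; values written before the change read back null for it.
ALTER TYPE my_type ADD email ascii;

-- Rename an existing field; the stored data itself is not rewritten.
ALTER TYPE my_type RENAME name TO full_name;
```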

How to create a materialized view in Cassandra to filter based on part of a user defined type

I have a table with columns (id as primary key, myudt), where myudt is a user-defined type. Now I want to query based on part of myudt. Based on the following discussion, it seems one way is to use a materialized view, but how? Can someone give an example?
how to filter cassandra query by a field in user defined type
When I try something like below it fails:
CREATE MATERIALIZED VIEW my_view
AS SELECT
myud.fname
FROM
source_table
WHERE id IS NOT NULL AND myudt IS NOT NULL AND myudt.fname IS NOT NULL
PRIMARY KEY (myudt.fname, id);
The error I get in cqlsh is:
ErrorMessage code=2000 [Syntax error in CQL query] message="line 7:28 mismatched input '.' expecting ')' (...NOT NULL PRIMARY KEY (myudt[.]fname...)"
The materialized view feature has been retroactively classified as experimental and is not recommended for new production use; see https://www.mail-archive.com/user@cassandra.apache.org/msg54073.html. So it's better to stay away from materialized views.
Searching on a portion of the UDT defeats the purpose of combining the fields in the first place. Design data models based on the queries they have to serve, not vice versa. It's better to duplicate data by creating another table to serve the queries based on the UDT fields that you care about.
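A sketch of what such a duplicate table might look like. The table name source_by_fname, the UDT name my_udt_type, and the column types are assumptions, since the question does not show the original schema:

```cql
-- Hypothetical query table: the UDT field to search on is copied into a
-- native column and used as the partition key.
CREATE TABLE source_by_fname (
    fname text,
    id int,
    myudt frozen<my_udt_type>,
    PRIMARY KEY (fname, id)
);

-- The lookup then becomes a plain partition-key query.
SELECT id, myudt FROM source_by_fname WHERE fname = 'Alice';
```

The application has to write each row to both tables to keep them in sync.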

Can we add primary key to collection datatypes?

When I tried to retrieve table using contains keyword it prompts "Cannot use CONTAINS relation on non collection column col1" but when I tried to create table using
CREATE TABLE test (id int,address map<text, int>,mail list<text>,phone set<int>,primary key (id,address,mail,phone));
it prompts "Invalid collection type for PRIMARY KEY component phone"
One of the basics in Cassandra is that you can't modify primary keys. Always keep that in mind.
You can't use a collection as primary key unless it is frozen, meaning you can't modify it.
This will work
CREATE TABLE test (id int,address frozen<map<text, int>>,mail frozen<list<text>>,phone frozen<set<int>>,primary key (id,address,mail,phone));
However, I think you should take a look at this document: http://www.datastax.com/dev/blog/cql-in-2-1
Since Cassandra 2.1 you can put secondary indexes on collections. You may want to use that functionality.
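For instance, a sketch along those lines, assuming the collections stay ordinary non-frozen columns and only id is the primary key (the sample values are made up):

```cql
-- Collections kept out of the primary key, so they remain modifiable.
CREATE TABLE test (
    id int PRIMARY KEY,
    address map<text, int>,
    mail list<text>,
    phone set<int>
);

-- Index the set's elements and the map's keys.
CREATE INDEX ON test (phone);
CREATE INDEX ON test (KEYS(address));

-- CONTAINS and CONTAINS KEY can now be used on the indexed collections.
SELECT * FROM test WHERE phone CONTAINS 5550100;
SELECT * FROM test WHERE address CONTAINS KEY 'home';
```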

Finding data type of columns in created table

I'm trying to provide a service for user validation of table structures, one component of which is column data type, like uuid, text, and bigint in the `CREATE TABLE' statement below.
USE my_keyspace;
CREATE TABLE users (
id uuid,
name text,
age bigint);
If I do
USE system;
SELECT validator FROM schema_columns
WHERE keyspace_name='my_keyspace' AND columnfamily_name='users';
I get
org.apache.cassandra.db.marshal.UUIDType
org.apache.cassandra.db.marshal.UTF8Type
org.apache.cassandra.db.marshal.LongType
Which seems informative, but on closer inspection, multiple distinct datatypes can map to the same validator value. Is there a way I can pull the data type info as entered in the `CREATE TABLE' statement, or at least find some distinction between the types?
Also, I'm curious as to why the validator data has the 'org.apache.cassandra...' prepended to it, and couldn't find an explanation, so if anybody knows why that is, I'd be very interested to know.
Which seems informative, but on closer inspection, multiple distinct datatypes can map to the same validator value.
If this is the case, as for example with varchar and text, I believe that the data types map on one another and are interchangeable. Anyone else correct me if I am wrong.
Is there a way I can pull the data type info as entered in the `CREATE TABLE' statement, or at least find some distinction between the types?
The only way I know would be:
DESC TABLE users;
Also, I'm curious as to why the validator data has the 'org.apache.cassandra...' prepended to it, and couldn't find an explanation, so if anybody knows why that is, I'd be very interested to know.
Cassandra is implemented in Java and this is the full path to the Class that implements the data type.
More info:
http://docs.datastax.com/en/cql/3.0/cql/cql_reference/cql_data_types_c.html
https://github.com/apache/cassandra/tree/trunk/src/java/org/apache/cassandra/db/marshal
Use the following query (the system_schema keyspace is available in Cassandra 3.0 and later):
select column_name,type from system_schema.columns where keyspace_name ='my_keyspace' AND table_name='users';

How can I obtain the database schema from an existing ActiveRecord.cs file?

I have been given the source code for an existing project that uses SubSonic ORM. My (limited!) understanding is that SubSonic generates code by reverse-engineering the existing database. Unfortunately I don't have the database that was used for this project.
I do have the ActiveRecord.cs file from the last time it was compiled. How could I work out the database schema so I can reproduce the database?
This sounds like SubSonic 3. Here are a couple places to get you started based on me looking through my ActiveRecord.cs file. You might want to create a small database yourself, run SubSonic on it, and see what gets generated in ActiveRecord.cs.
Inside your ActiveRecord.cs file, you'll find one partial class per table. The partial class will inherit from IActiveRecord and will likely be the name of the table.
Inside the class, you'll find a function called "KeyName()" which will return your primary key column name for the table. SubSonic requires a primary key for tables it processes and generates code for.
Look for a region named " Foreign Keys ". If this table has foreign keys, you'll find a property corresponding to each foreign key, something like "public IQueryable OtherTableNames". So this table should have a column named something like "OtherTableNameID"; check the generated partial class for the foreign key table to be sure.
Immediately below the foreign key region, you'll find properties for the non-foreign key columns of this table. You can somewhat guess at the data types of the columns from the property data types (e.g. string might be a char(x) or a varchar(x)).

Resources