RDBMS Schema To NoSQL Cassandra

Let's go straight to the question: I have this RDBMS schema, which is part of a larger schema that I don't know.
The assignment (it's a university project) is to come up with 5 queries, then build this schema in Cassandra and run those queries via CQL.
I understand that Cassandra has no JOIN operation, so I need to duplicate some data, but there is still something important I can't figure out.
Let's say my first query is "Print all SalesOrderHeader rows (ID and status) with OrderDate > 01/01/2017". For this I will need a column family for SalesOrderHeader, so I would write something like:
CREATE TABLE SalesOrderHeader (
    SalesOrderID uuid PRIMARY KEY,
    RevisionNumber TEXT,
    OrderDate TIMESTAMP,
    DueDate TIMESTAMP,
    ShipDate TIMESTAMP,
    Status TEXT,
    OnlineOrderFlag BOOLEAN,
    SalesOrderNumber TEXT,
    PurchaseOrderNumber TEXT,
    AccountNumber TEXT,
    SubTotal FLOAT,
    TaxAmt FLOAT,
    Freight TEXT,
    TotalDue FLOAT,
    Comment TEXT,
    CreditCardApprovalCode TEXT
);
What I don't understand is this: should I add the CreditCard information to this table?
For example, suppose a second query asks "Print all SalesOrderHeader rows (ID and status) paid with a CreditCard of Type = VISA". CreditCard and SalesOrderHeader have a 1-to-1 relation, so I could put all the CreditCard information into the SalesOrderHeader table. But should I then create two different SalesOrderHeader tables, one with the CreditCard information and one without?
Unfortunately, in class we only saw what Cassandra is, and then they said "OK, do this project and read the Cassandra documentation". Too bad I'm having lots of problems >_>
I searched on the internet, but I only found simpler examples with two tables (like Song and Playlist), and they didn't really help me.
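
A rough way to reason about this is to design one table per query and to duplicate into each table whatever that query needs. The Python sketch below only models the idea with in-memory dictionaries. The sample rows, the year-based partitioning and the names orders_by_year and orders_by_card_type are assumptions made for illustration; they are not part of the schema above.

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import date

# Normalized source rows, as they might come out of the RDBMS (made-up data).
orders = [
    {"id": 1, "status": "shipped", "order_date": date(2016, 12, 30), "card_type": "VISA"},
    {"id": 2, "status": "pending", "order_date": date(2017, 3, 5), "card_type": "MASTERCARD"},
    {"id": 3, "status": "shipped", "order_date": date(2017, 6, 1), "card_type": "VISA"},
]

# "Table" 1: partitioned by order year (assumed), clustered by (order_date, id).
orders_by_year = defaultdict(list)
# "Table" 2: partitioned by card type; the card type is duplicated into it.
orders_by_card_type = defaultdict(list)

for o in orders:
    orders_by_year[o["order_date"].year].append((o["order_date"], o["id"], o["status"]))
    orders_by_card_type[o["card_type"]].append((o["id"], o["status"]))

for rows in orders_by_year.values():
    rows.sort()  # mimics clustering order inside a partition


def orders_after(day):
    """Query 1: (id, status) of orders with order_date > day, one partition per year."""
    result = []
    for year in sorted(y for y in orders_by_year if y >= day.year):
        rows = orders_by_year[year]
        # First row whose order_date is strictly greater than `day`.
        start = bisect_right(rows, (day, float("inf"), ""))
        result.extend((oid, status) for _, oid, status in rows[start:])
    return result


def orders_paid_with(card_type):
    """Query 2: (id, status) of orders paid with the given card type."""
    return list(orders_by_card_type.get(card_type, []))


print(orders_after(date(2017, 1, 1)))
print(orders_paid_with("VISA"))
```

Each query reads from its own structure and never needs a join; the price is that the same order is written twice.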

Related

How to fetch Primary Key/Clustering column names for a particular table using CQL statements?

I am trying to fetch the primary key/clustering key names for a particular table/entity and run the corresponding query from my JPA interface (which extends CassandraRepository).
I am not sure whether something like:
@Query("DESCRIBE TABLE <table_name>")
public Object describeTbl();
would work here, since DESCRIBE isn't a valid CQL statement. And if it did work, what would be the type of the returned Object?
Any suggestions?

One thing you could try is to query the system_schema.columns table. It is keyed by keyspace_name and table_name, and might be what you're looking for here:
> SELECT column_name, kind FROM system_schema.columns
  WHERE keyspace_name='spaceflight_data'
  AND table_name='astronauts_by_group';

 column_name       | kind
-------------------+---------------
           flights |       regular
             group | partition_key
              name |    clustering
 spaceflight_hours |    clustering

(4 rows)
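
To turn such rows into an ordered primary key definition, the position column of system_schema.columns can be selected as well and used for sorting. The following is a minimal Python sketch that works on plain dictionaries rather than on a live session. The helper name primary_key_columns and the position values in the sample rows are assumptions, since the output above does not show positions.

```python
def primary_key_columns(rows):
    """Split system_schema.columns rows into partition-key and clustering columns.

    Each row is a mapping with 'column_name', 'kind' and 'position' keys,
    mirroring the columns of the same name in system_schema.columns.
    """
    def names(kind):
        matching = [r for r in rows if r["kind"] == kind]
        # 'position' gives the order of the column within its key part.
        return [r["column_name"] for r in sorted(matching, key=lambda r: r["position"])]

    return names("partition_key"), names("clustering")


# Rows shaped like the query result above; the positions are assumed.
sample_rows = [
    {"column_name": "flights", "kind": "regular", "position": -1},
    {"column_name": "group", "kind": "partition_key", "position": 0},
    {"column_name": "name", "kind": "clustering", "position": 0},
    {"column_name": "spaceflight_hours", "kind": "clustering", "position": 1},
]

partition_key, clustering = primary_key_columns(sample_rows)
print(partition_key, clustering)
```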

DESCRIBE TABLE is supported only in Cassandra 4, which includes the fix for CASSANDRA-14825. But it may not help you much, because it just returns a text string representing the CREATE TABLE statement, and you would need to parse that text to extract the primary key definition. That is doable but could be tricky, depending on the structure of the primary key.
Alternatively, you can obtain the underlying Session object and, via its getMetadata function, get access to the actual metadata object, which exposes information about keyspaces and tables, including the schema.

Apache Cassandra "no viable alternative at input 'OR' "

My table looks like this:
CREATE TABLE prod_cust (
pid bigint,
cid bigint,
effective_date date,
expiry_date date,
PRIMARY KEY ((pid, cid))
);
The query below gives a "no viable alternative at input 'OR'" error:
SELECT * FROM prod_cust
WHERE
pid=101 AND cid=201
OR
pid=102 AND cid=202;
Does Cassandra not support the OR operator? If not, is there an alternative way to get this result?

CQL does not support the OR operator. Sometimes you can get around that by using IN. But even IN won't let you do what you're attempting.
I see two options:
1. Submit each side of your OR as an individual query.
2. Restructure the table to better suit what you're trying to do. Doing a "port-over" from an RDBMS to Cassandra almost never works as intended.
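
The first option can be sketched in Python as follows. This is only an illustration: execute is a hypothetical callable standing in for whatever driver session is in use, and fake_table and fake_execute are made-up stand-ins so the sketch runs without a cluster.

```python
def fetch_pairs(execute, pairs):
    """Run one partition-key lookup per (pid, cid) pair and merge the rows.

    `execute` is a hypothetical callable (statement, params) -> list of rows.
    """
    statement = "SELECT * FROM prod_cust WHERE pid = ? AND cid = ?"
    rows = []
    for pid, cid in pairs:
        rows.extend(execute(statement, (pid, cid)))
    return rows


# In-memory stand-in for the table, keyed by the partition key (made-up data).
fake_table = {
    (101, 201): {"pid": 101, "cid": 201},
    (102, 202): {"pid": 102, "cid": 202},
    (103, 203): {"pid": 103, "cid": 203},
}


def fake_execute(statement, params):
    """Pretend to run the statement: look the partition key up in fake_table."""
    row = fake_table.get(params)
    return [row] if row is not None else []


result = fetch_pairs(fake_execute, [(101, 201), (102, 202)])
print(result)
```

Because (pid, cid) is the full partition key, each lookup targets exactly one partition, which is the access pattern Cassandra handles best.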

Cassandra, Delete if a set contains value

I'm a beginner in Cassandra and I have a table like this:
CREATE TABLE Books (
    Title text PRIMARY KEY,
    Authors set<text>,
    Family set<text>,
    Publisher text,
    Price decimal
);
(The other options are missing because it's only an example.)
Now I would like to execute this query:
DELETE Price FROM Books WHERE Authors CONTAINS 'J.K. Rowling' IF EXISTS;
But it doesn't work. I searched on Google but found nothing.
I hope somebody can help me, and sorry if my English is not very good.

"But it doesn't work."
That doesn't really give us enough information to help you. Usually, you'll want to provide an error message. I built your table locally, inserted data, and tried your approach. This is the error that I see:
InvalidRequest: Error from server: code=2200 [Invalid query]
message="Some partition key parts are missing: title"
DELETE requires that the appropriate PRIMARY KEY components be specified in the WHERE clause. In your case, Authors is not part of the PRIMARY KEY definition. Given the error message returned (and the table definition), specifying title is the only way to delete rows from this table.
aploetz@cqlsh:stackoverflow> DELETE FROM Books
    WHERE title = 'Harry Potter and the Chamber of Secrets'
    IF EXISTS;

 [applied]
-----------
      True
"Can I do a query like this? UPDATE Books SET Family = Family + {'Fantasy'} WHERE Authors CONTAINS 'J.K. Rowling';"
No. This fails for the same reason. Writes in Cassandra (INSERTs, UPDATEs, DELETEs are all writes) require the primary key (specifically, the partition key) in the WHERE clause. Without that, Cassandra can't figure out which node holds the data, and it needs that to perform the write.
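
One possible workaround, sketched here in Python under stated assumptions, is to split the job in two: first read the titles of the matching books, then issue one write per title. The helper price_delete_statements and the sample rows are hypothetical. In a real cluster the read step on Authors would itself need a secondary index on that set or ALLOW FILTERING, which this in-memory sketch does not model.

```python
def price_delete_statements(books, author):
    """Build one keyed DELETE per book whose author set contains `author`.

    `books` is an iterable of rows with 'title' and 'authors' keys.
    Returns a list of (statement, params) tuples, each addressed by primary key.
    """
    statement = "DELETE price FROM books WHERE title = ?"
    return [
        (statement, (book["title"],))
        for book in books
        if author in book["authors"]
    ]


# Made-up rows standing in for the result of the read step.
books = [
    {"title": "Harry Potter and the Chamber of Secrets", "authors": {"J.K. Rowling"}},
    {"title": "Good Omens", "authors": {"Terry Pratchett", "Neil Gaiman"}},
]

statements = price_delete_statements(books, "J.K. Rowling")
print(statements)
```

Every generated statement names the partition key, so Cassandra can route it to the node that owns the row.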

Non frozen collections and user defined types on Cassandra 2.1.8

I'm trying to run the following example from here
CREATE TYPE address (
street text,
city text,
zip int
);
CREATE TABLE user_profiles (
login text PRIMARY KEY,
first_name text,
last_name text,
email text,
addresses map<text, address>
);
However, when I try to create the user_profiles table, I get the following error:
InvalidRequest: code=2200 [Invalid query] message="Non-frozen collections are not allowed inside collections: map<text, address>"
Any thoughts on why this could be happening?

I am running 2.1.8 and I get the same error message. To fix this, you need the frozen keyword:
CREATE TABLE user_profiles (
login text PRIMARY KEY,
first_name text,
last_name text,
email text,
addresses map<text, frozen <address>>
);
Frozen is necessary for UDTs (for now) as it serializes them into a single value. A similar, better example for you to follow might be the one in the User Defined Type documentation. Give that a try.

I was getting this message when I mistakenly used "string" instead of "text" in a Cassandra map, like:
mymap map<bigint, string>
I found this Stack Overflow thread through Google and thought this information could save someone a few minutes.

Non-frozen UDTs are not yet supported. The reason for asking the user to explicitly specify this keyword for each UDT is to be able to introduce mutable UDTs in 3.x without breaking existing code.

Composite key in Cassandra with Pig

We have a CQL table that looks something like this:
CREATE table data (
occurday text,
seqnumber int,
occurtimems bigint,
unique bigint,
fields map<text, text>,
primary key ((occurday, seqnumber), occurtimems, unique)
)
I can query this table from cqlsh like this:
select * from data where seqnumber = 10 AND occurday = '2013-10-01';
This query works and returns the expected data.
If I execute this query as part of a LOAD from within Pig, however, things don't work.
-- Need to URL encode the query
data = LOAD 'cql://ks/data?where_clause=seqnumber%3D10%20AND%20occurday%3D%272013-10-01%27' USING CqlStorage();
gives
InvalidRequestException(why:seqnumber cannot be restricted by more than one relation if it includes an Equal)
at org.apache.cassandra.thrift.Cassandra$prepare_cql3_query_result.read(Cassandra.java:39567)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
at org.apache.cassandra.thrift.Cassandra$Client.recv_prepare_cql3_query(Cassandra.java:1625)
at org.apache.cassandra.thrift.Cassandra$Client.prepare_cql3_query(Cassandra.java:1611)
at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader$RowIterator.prepareQuery(CqlPagingRecordReader.java:591)
at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader$RowIterator.executeQuery(CqlPagingRecordReader.java:621)
Shouldn't these behave the same? Why is the version through Pig failing where the straight cqlsh command works?

Hadoop is using CqlPagingRecordReader to try to load your data. This is leading to queries that are not identical to what you have entered. The paging record reader is trying to obtain small slices of Cassandra data at a time to avoid timeouts.
This means that your query is executed as
SELECT * FROM "data" WHERE token("occurday","seqnumber") > ? AND
token("occurday","seqnumber") <= ? AND occurday='A Great Day'
AND seqnumber=1 LIMIT 1000 ALLOW FILTERING
And this is why you are seeing your repeated key error. I'll submit a bug to the Cassandra Project.
Jira:
https://issues.apache.org/jira/browse/CASSANDRA-6151
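
The effect described above can be reproduced with a small Python sketch. It is not the actual CqlPagingRecordReader code; paged_query and doubly_restricted are hypothetical helpers that only imitate the shape of the generated statement and point out which partition-key columns end up restricted twice.

```python
import re
from urllib.parse import quote, unquote


def paged_query(table, partition_key, where_clause, limit=1000):
    """Wrap a user where_clause in a token-range scan, like the query shown above."""
    token = "token(" + ",".join(f'"{c}"' for c in partition_key) + ")"
    return (
        f'SELECT * FROM "{table}" WHERE {token} > ? AND {token} <= ? '
        f"AND {where_clause} LIMIT {limit} ALLOW FILTERING"
    )


def doubly_restricted(partition_key, where_clause):
    """Partition-key columns that where_clause restricts with '=' on top of token()."""
    return [
        c for c in partition_key
        if re.search(rf"\b{re.escape(c)}\s*=", where_clause)
    ]


where_clause = "seqnumber=10 AND occurday='2013-10-01'"
encoded = quote(where_clause, safe="")  # the form used in the LOAD URL
query = paged_query("data", ["occurday", "seqnumber"], unquote(encoded))

print(encoded)
print(query)
print(doubly_restricted(["occurday", "seqnumber"], where_clause))
```

Both partition-key columns appear once inside token(...) and once in an equality, which is the combination the error message complains about.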
