Cassandra many-to-many relationship modeling options

In this article the author illustrates several options for modeling many-to-many relationships in Cassandra. I would like some clarification on two points:
Why would option 4 take more space? It seems like you are just "appending" Item_by_user to the User column space.
Also, in option 4, how can you define composite columns as the author suggests? It seems like you have two groups of columns: 1) Name and Email, and 2) Likes, where the latter is wide(?). What would be the CQL code that defines Name, Email, and the wide columns for Likes in the User table?
Thanks.
The following images are taken from the article mentioned above:

As for the first question, it looks to me like it takes about the same amount of space, minus one row per user and one row per item, because you keep everything in a single row.
As for the second question, take a look at static columns (here is cql documentation). Basically, a static column is a way to define a column whose value is shared by all rows in one partition (user details in the user table and item details in the items table); you update that value using only the partition key.
A second solution is to model the items a user liked as a map type (here is map documentation), and likewise for items (create a map of the users who liked that item).
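For reference, the two suggestions above might look roughly like this in CQL. This is a minimal sketch, not the article's exact schema: the table and column names (users_with_likes, users_with_like_map, user_id, item_id, item_title, likes) are made up for illustration. The statements are kept in Python strings so they could be handed to a driver's prepare/execute calls against a real cluster; nothing here connects to Cassandra.

```python
# CQL sketches for "option 4": user details and likes in the same partition.
# All table and column names below are hypothetical.

# Variant 1: static columns. name and email are stored once per partition
# (one partition per user_id); every liked item is one clustering row.
USERS_STATIC_CQL = """
CREATE TABLE users_with_likes (
    user_id    uuid,
    name       text STATIC,
    email      text STATIC,
    item_id    uuid,
    item_title text,
    PRIMARY KEY (user_id, item_id)
);
"""

# A static column is written using only the partition key.
# The ? characters are bind markers for a prepared statement.
UPDATE_EMAIL_CQL = "UPDATE users_with_likes SET email = ? WHERE user_id = ?;"

# Adding a like writes one new clustering row in the user's partition.
ADD_LIKE_CQL = (
    "INSERT INTO users_with_likes (user_id, item_id, item_title) "
    "VALUES (?, ?, ?);"
)

# Variant 2: a map collection. One row per user; likes maps item id to title.
USERS_MAP_CQL = """
CREATE TABLE users_with_like_map (
    user_id uuid PRIMARY KEY,
    name    text,
    email   text,
    likes   map<uuid, text>
);
"""

# Adding a like sets one map entry without rewriting the rest of the map.
ADD_LIKE_TO_MAP_CQL = (
    "UPDATE users_with_like_map SET likes[?] = ? WHERE user_id = ?;"
)

ALL_STATEMENTS = [
    USERS_STATIC_CQL,
    UPDATE_EMAIL_CQL,
    ADD_LIKE_CQL,
    USERS_MAP_CQL,
    ADD_LIKE_TO_MAP_CQL,
]
```

The same pattern would apply in the other direction, with an items table keyed by item_id that holds static item details and one clustering row (or map entry) per user.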

I suggest reading more about data modeling in Cassandra. I found A Big Data Modeling Methodology for Apache Cassandra and Basic Rules of Cassandra Data Modeling to be useful articles here. They will help you understand how to model tables based on your queries (the query-driven methodology), as well as data duplication and its advantages and disadvantages.

Related

IBM Cognos: matching multiple columns (two foreign keys)

I am learning how to use IBM Cognos and my first task is to create relationships between the tables I have uploaded into Cognos.
Basically, I am trying to tell Cognos to link the id column in the Person Table with the person_id and related_person_id columns in the Relationship Table, as shown here:
However, this does not seem possible since the Match Selected Columns button becomes disabled when I try to also link the related_person_id column.
The reason I need to do this is because person_id and related_person_id are foreign keys - they point to people in the Person Table and explain how they are related.
How can this be accomplished in Cognos?
Thank you.
You can have any number of matches. You need to match a single query item from each side for each match. IIRC, a query item can be used in multiple matches, although that would only be really helpful once relational operators are implemented.
It isn't clear what you want in your case: to use person_id and related_person_id as a composite key; to have a 1.n relationship between ID and person_id plus some other relationship (n.1?) between ID and related_person_id; or whether a 1.n relationship between ID and person_id alone would be sufficient for whatever you are trying to accomplish.
Editorial comment:
It would be really really nice if Cognos introduced relational operators Real Soon Now.

How can we do CRUD operations on complex data models in Cassandra?

I have a project using NOSQL.
I have a column family for my customers.
The column family has just "id" at first.
Then it will be updated by adding new columns.
The count and type of columns could be different for each customer.
Also, each column can include sub-columns with ids of their own, and these can be altered too, so they should be indexed. Documents are not useful for this issue.
I've read about NoSQL, and I've decided to use Cassandra. I would be thankful if you could answer these questions:
Is the above possible?
How can we create this column family and perform CRUD operations on it?
If the answer to the previous question is yes, what is the type of a query result?
Will it return several rows for each primary key (id)?
How can we manage that so we can access the table without redundancy? I don't know whether this summarizing should be handled on the database side or in the code.
Thank you for your help.

Wide rows vs Collections in Cassandra

I am trying to model many-to-many relationships in Cassandra, something like an Item-User relationship. A user can like many items and an item can be bought by many users. Let us also assume that the order in which the "like" events occur is not a concern and that the most used query simply returns the "likes" based on the item as well as the user.
There are a couple of posts discussing data modeling:
http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/
An alternative would be to store a collection of ItemID in the User table to denote the items liked by that user and do something similar in the Items table in CQL3.
Questions
Are there any hits in performance using the collection? I think they translate to composite columns? So the read pattern, caching and other factors should be similar?
Are collections less performant for write heavy applications? Is updating the collection frequently less performant?
There are a couple of advantages of using wide rows over collections that I can think of:
The number of elements allowed in a collection is 65535 (an unsigned short). If it's possible to have more than that many records in your collection, using wide rows is probably better as that limitation is much higher (2 billion cells (rows * columns) per partition).
When reading a collection column, the entire collection is read every time. Compare this to a wide row, where you can limit the number of rows being read in your query, or restrict the query based on the clustering key (e.g. date > 2015-07-01).
For your particular use case I think modeling an 'items_by_user' table would be more ideal than a list<item> column on a 'users' table.
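The difference in read patterns can be sketched with a toy in-memory model. This is not Cassandra code: the data, the names (items_by_user, liked_items, likes_after), and the like dates are invented for illustration, and a sorted Python list merely stands in for a partition ordered by its clustering key.

```python
from bisect import bisect_right

# Toy model only; all names and data are hypothetical.

# Wide row: one partition per user, rows kept sorted by the clustering key
# (here the date of the like), so a range of rows can be read on its own.
items_by_user = {
    "alice": [
        ("2015-06-20", "item-1"),
        ("2015-07-03", "item-2"),
        ("2015-07-15", "item-3"),
    ],
}

# Collection: the whole list is a single column value in the user's row.
users = {
    "alice": {"name": "Alice", "liked_items": ["item-1", "item-2", "item-3"]},
}


def likes_after(user, date):
    """Read only the slice of the partition whose like date is > date.

    Roughly what a query such as
    SELECT item_id FROM items_by_user WHERE user_id = ? AND liked_on > ?
    would do (table and column names are assumptions).
    """
    rows = items_by_user[user]
    dates = [liked_on for liked_on, _ in rows]
    start = bisect_right(dates, date)  # first row strictly after `date`
    return [item for _, item in rows[start:]]


def liked_items(user):
    """Reading the collection column always returns the entire collection."""
    return list(users[user]["liked_items"])


recent = likes_after("alice", "2015-07-01")
everything = liked_items("alice")
```

Here `recent` holds only the two likes after 2015-07-01, while `everything` always contains the full list; any filtering of the collection has to happen in the client after the whole value has been read.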

How to establish a relationship between two tables?

I'm new to Core Data.
I have a data model in which there are two tables and a 1-N relationship between them.
The application downloads all data from a service and saves the data in each table.
In addition, the tables are related, and therefore I want to do this:
a) Obtain all elements of table2 which satisfy certain conditions.
b) For each element in table2, look up the identifier of table1 / save the table1's id.
c) Get the item in table1 which matches that ID.
d) Relate the table2 element to the table1 item.
I'm not able to do this. :(
I do not know whether building the relationship between the tables in this way is a good approach or not.
This is sort of difficult to answer. If you think about Core Data as an SQL table you'll just get yourself into difficulty.
Core Data isn't about joining and searching tables, it's about an object graph. An object has relationships to another object which has an inverse relationship to the other object. Essentially, what you should be looking to do is:
a) This is a fetch request of the entity which you are storing in table 2 subject to a predicate which defines your conditions.
b) You don't actually deal with ids directly in Core Data. You hardly ever deal with the keys directly.
c) Step a) returned a collection of objects, and you can run a further predicate on this to filter it.
d) That is what the inverse relationship is for.
I know this doesn't answer your actual question. I'm trying to get you to think of your Core Data store as a collection of objects related to each other rather than as a bunch of linked tables.

Converting an Excel spreadsheet to a MySQL database

I have a horse racing database that has the results for all handicap races for the 2010 flat season. The spreadsheet has now got too big and I want to convert it to a MySQL database. I have looked at many sites about normalizing data and database structures, but I just can't work out what goes where, or what the PRIMARY KEYS, FOREIGN KEYS, etc. should be. I have over 30000 lines in the spreadsheet. The column headings are:
RACE_NO,DATE,COURSE,R_TIME,AGE,FURS,CLASS,PRIZE,RAN,Go,BHB,WA,AA,POS,DRW,BTN,HORSE,WGT,SP,TBTN,PPL,LGTHS,BHB,BHBADJ,BEYER
Most of the columns are obvious; the following explains the less obvious ones: BHB is the class of race, WA and AA are weight allowances for age and weight, TBTN is total distance beaten, PPL is pounds per length, and the last 4 are ratings.
I managed to export into MySQL as a flat file by saving the spreadsheet as a comma-delimited file, but I need to structure the data into a normalized state with the proper KEYS.
I would appreciate any advice.
Many thanks
Davey H
To do this in the past, I've done it in several steps:
1. Import your Excel spreadsheet into Microsoft Access.
2. Import your Microsoft Access database into MySQL using the MySQL Workbench (previously MySQL GUI Tools + MySQL Migration Toolkit).
It's a bit disjointed, but it usually works pretty well and saves me time in the long run.
It's kind of an involved question, and it would be difficult to give you a precise answer without knowing a little bit more about your system, but I can try to give you a high-level overview of how Relational Database Management Systems (RDBMSs) are structured.
A primary key is an identifier for a particular record; it must be unique to that record. In this case, your RACE_NO column might be a suitable primary key for a table of races. That way, you can identify every race by its unique number.
Foreign keys are values that describe the relationships between objects/tables in your database: a foreign key column in one table holds primary key values from another table. For example, you may want to create a table that lists all the different classes of races. Each record in that table would have a primary key, unique to that class. If you wanted to indicate in your "races" table which class each race was, you might have a column for each record called class_id. The value of that column would be populated with primary keys from the "classes" table. You can then use join operations to bring all the information together into one view.
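The "classes" and "races" example above can be sketched as follows. This is only an illustration under a few assumptions: SQLite (from the Python standard library) stands in for MySQL so the snippet runs without a server, the column choices are simplified, and the sample rows are made up rather than taken from the spreadsheet.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption made so the
# example is self-contained). SQLite enforces foreign keys only when asked to.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE classes (
    class_id INTEGER PRIMARY KEY,          -- primary key of a race class
    name     TEXT NOT NULL UNIQUE
);
CREATE TABLE races (
    race_no  INTEGER PRIMARY KEY,          -- primary key of a race
    course   TEXT NOT NULL,
    class_id INTEGER NOT NULL
             REFERENCES classes(class_id)  -- foreign key into classes
);
""")

# Made-up sample data.
conn.execute("INSERT INTO classes (class_id, name) VALUES (1, 'Class 2')")
conn.execute(
    "INSERT INTO races (race_no, course, class_id) VALUES (101, 'Ascot', 1)"
)

# A join brings the race and its class name together in one result.
rows = conn.execute("""
    SELECT r.race_no, r.course, c.name
    FROM races AS r
    JOIN classes AS c ON c.class_id = r.class_id
""").fetchall()

# A race that points at a class which does not exist is rejected.
try:
    conn.execute(
        "INSERT INTO races (race_no, course, class_id) VALUES (102, 'York', 99)"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

In MySQL the table definitions would look much the same, with the foreign key usually declared as a separate FOREIGN KEY (class_id) REFERENCES classes(class_id) clause on an InnoDB table.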
For more on data structures and MySQL, I suggest the W3Schools tutorials on SQL: http://www.w3schools.com/sql/sql_intro.asp
Before anything else, you need to define your data: you have to fit every column into a value space known to MySQL.
Numeric value
http://dev.mysql.com/doc/refman/5.0/en/numeric-types.html
Textual value
http://dev.mysql.com/doc/refman/5.0/en/string-type-overview.html
Date/Time value
http://dev.mysql.com/doc/refman/5.0/en/date-and-time-type-overview.html

Resources