I know this question has been asked a million times, but I can't seem to find one that really gives me a good understanding of how relationships work in Kohana's ORM Module.
I have a database with 5 tables:
approved_submissions
-submission_id
-contents
favorites
-user_id
-submission_id
ratings
-user_id
-submission_id
-rating
users
-user_id
votes
-user_id
-submission_id
-vote
Right now, favorites, ratings, and votes each have a primary key that consists of every column in the table, so as to prevent a user favoriting the same submission_id multiple times, a user voting on the same submission_id multiple times, etc. I also believe these fields are set up using foreign keys that reference approved_submissions and users, so as to prevent invalid data existing in the respective fields.
Using the DB module, I can access and update these tables no problem. I really feel as though ORM may offer a more powerful and accessible way to accomplish the same things using less code.
Can you demonstrate how I might update a user voting on a submission_id? A user removing a favorite submission_id? A user changing their rating on a particular submission_id?
Also, do I need to make changes to my database structure or is it okay the way it is?
You're probably looking for has_many_through relationships.
So to add a new submission, you'd do something like
$user->add('submissions', $submission);
and to remove
$user->remove('submissions', $submission);
You may want to consider restructuring your database table and key names so you don't end up doing a lot of configuration.
I'm working on a MEAN stack project. My aggregation touches many collections, so it uses a lot of lookups, which hurts performance and makes the aggregation very slow. I was wondering if you have any suggestions. I found that I can reduce the lookups by embedding, for each collection I need, an array of objects in one global collection; however, I'm looking for an optimal and secure solution.
For information, I have defined indexes on all collections in Mongo.
Thanks for sharing your ideas!
This is a very involved question. Even if you gave all your schemas and queries, it would take too long to answer, and the answer would be very specific to your case (i.e. not useful to anyone else coming along later).
Instead, as a general answer, I'd advise you to read up on denormalization and consider some database redesign if this query is core to your project.
Here is a good article to get you started.
Denormalization allows you to avoid some application-level joins, at the expense of having more complex and expensive updates. Denormalizing one or more fields makes sense if those fields are read much more often than they are updated.
A simple example to outline it:
Say you have a blog with a comment collection and a user collection.
You want to display each comment with the name of the user who wrote it, so you have to load the user for every comment.
Instead you could save the username on the comment collection as well as the user collection.
Then you will have a fast query to show comments, as you don't need to load the users too. But if the user changes their name, then you will have to update all of the comments with the new name. This is the main tradeoff.
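A minimal sketch of that tradeoff, using plain in-memory arrays in place of real collections. The field names (authorId, authorName) and the renameUser helper are assumptions made up for illustration; with MongoDB the rename step would typically be an updateMany on the comment collection.

```javascript
// In-memory stand-ins for the two collections (hypothetical shapes).
const users = [{ _id: 1, name: "alice" }];
const comments = [
  { _id: 10, authorId: 1, authorName: "alice", text: "First!" },
  { _id: 11, authorId: 1, authorName: "alice", text: "Nice post" },
];

// Read path: no user lookup needed, the name is already on each comment.
function renderComments(list) {
  return list.map((c) => `${c.authorName}: ${c.text}`);
}

// Write path: a rename must touch the user AND every comment they wrote.
function renameUser(userId, newName) {
  const user = users.find((u) => u._id === userId);
  if (!user) return 0;
  user.name = newName;
  let updated = 0;
  for (const c of comments) {
    if (c.authorId === userId) {
      c.authorName = newName;
      updated += 1;
    }
  }
  return updated; // number of comments that had to be rewritten
}
```

The return value of renameUser makes the cost visible: reads stay cheap, while one rename rewrites as many comments as the user has written.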
If a DB redesign is too difficult, I suggest splitting the query into multiple aggregations and combining the results in memory (i.e. in your Node server-side code).
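As a rough sketch of combining results in memory: suppose two smaller aggregations have already run and returned plain arrays. The joinInMemory helper and the orders/customers sample data are hypothetical; they only illustrate the join step, not how the arrays were fetched.

```javascript
// Join two separately fetched result sets in application code.
function joinInMemory(left, right, leftKey, rightKey, as) {
  // Index the right-hand side once so each lookup is a Map access.
  const index = new Map(right.map((doc) => [doc[rightKey], doc]));
  return left.map((doc) => ({
    ...doc,
    [as]: index.get(doc[leftKey]) ?? null, // null when there is no match
  }));
}

// Sample data standing in for the outputs of two aggregations.
const orders = [
  { _id: 1, customerId: "c1" },
  { _id: 2, customerId: "c9" },
];
const customers = [{ _id: "c1", name: "Ada" }];

const joined = joinInMemory(orders, customers, "customerId", "_id", "customer");
```

Building the Map first keeps the join linear in the size of both arrays, which matters once the result sets grow.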
I'm currently trying to learn Node.js and MongoDB by building the server side of a web application that manages insurance documents for an insurance agent.
So let's say I'm the user: I sign in, then I start to add my customers and their insurances.
So I have 2 related collections, Customers and Insurances.
I have one more collection to store the users login data, let's call it Users.
I don't want the new users to see and modify the customers and the insurances of other users.
How can I "divide" every user related record, so that each user can work only with his data?
I figured out that I could add to every record the _id of the user who created it.
For example, I log in as myself and get my id "001"; I could then add a field with this value to every customer and insurance.
In that way I could filter every query with this code.
Would it be a good idea? In my opinion this filtering is a waste of processing power for MongoDB.
If someone has any idea of a solution, or even a link to an article about it, it would be helpful.
Thank you.
This is more a general permissions problem than just a MongoDB question. Also, without knowing more about your schemas it's hard to give specific advice.
However, here are some approaches:
1) Embed sub-documents
Since MongoDB is a document store that lets you store arbitrary JSON-like objects, you could simply store the customers and insurances wholly inside each user object. That way, querying for a user would return their customers and insurances as well.
2) Denormalise
Common practice for NoSQL databases is to denormalise related data (i.e. duplicate the data). This might include embedding a sub-document that is a partial representation of your customers/insurances/whatever inside your user document. This has a similar benefit to the above solution in that it eliminates additional queries for sub-documents. It also has the same drawback of requiring more care to preserve data integrity.
3) Reference with foreign key
This is a more traditionally relational approach, and is basically what you're suggesting in your question. Depending on whether you want the reference to be bi-directional (both documents reference each other) or uni-directional (one document references the other), you can either store the user's ID as a foreign user_id field, or store an array of customer_ids and insurance_ids in the user document. In relational parlance this is sometimes described as "has many" or "belongs to" (the user has many customers, the customer belongs to a user).
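A small sketch of the uni-directional variant from approach 3, where every record carries the id of its owner and every query is scoped to the signed-in user. The scopedFilter and find helpers and the sample records are assumptions for illustration; find is only an in-memory stand-in for a real database query. With a real collection, an index on the owner field typically keeps this kind of filter cheap.

```javascript
// Wrap every filter so it is always restricted to the signed-in user.
function scopedFilter(userId, filter = {}) {
  // userId comes last so a caller cannot override it through `filter`.
  return { ...filter, userId };
}

// Minimal matcher for flat equality filters (stand-in for a database query).
function find(collection, filter) {
  return collection.filter((doc) =>
    Object.entries(filter).every(([key, value]) => doc[key] === value)
  );
}

// Hypothetical customer records, each tagged with its owner's id.
const customers = [
  { _id: "a", userId: "001", name: "Rossi" },
  { _id: "b", userId: "002", name: "Bianchi" },
];

// User "001" only ever sees their own customers.
const mine = find(customers, scopedFilter("001"));
```

Routing every query through one helper like this keeps the ownership rule in a single place instead of relying on each call site to remember it.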
I have two sets of data in the same collection in Cosmos DB: 'posts' and 'users'. They are linked by the posts that users create.
Currently my structure is as follows:
// user document
{
  id: 123,
  postIds: ['id1', 'id2']
}
// post document
{
  id: 'id1',
  ownerId: 123
}
{
  id: 'id2',
  ownerId: 123
}
My main issue with this setup is how fragile it is: code has to enforce the link, and if there's a bug, data can very easily be lost with no clear way to recover it.
I'm also concerned about performance: if a user has 10,000 posts, that's 10,000 lookups I'll have to do to resolve all the posts.
Is this the correct method for modelling entity relationships?
As David said, it's a long discussion, but it is a very common one, so, since I have an hour or so of "free" time, I'm more than glad to try to answer it once and for all, hopefully.
WHY NORMALIZE?
The first thing I notice in your post: you are looking for some level of referential integrity (https://en.wikipedia.org/wiki/Referential_integrity), which is something you need when you decompose a bigger object into its constituent pieces, a process also called normalization.
While this is normally done in a relational database, it is now also becoming popular in non-relational databases, since it helps a lot to avoid data duplication, which usually creates more problems than it solves.
https://docs.mongodb.com/manual/core/data-model-design/#normalized-data-models
But do you really need it? Since you have chosen to use a JSON document database, you should leverage the fact that it can store an entire document, and just store the post ALONG WITH all the owner data: name, surname, and any other data you have about the user who created it. Yes, I’m saying that you may want to consider not having posts and users, but just posts, with user info inside them. This may actually be very correct, as you will be sure to get the EXACT data for the user as it existed at the moment of post creation. Say, for example, I create a post and I have biography "X". I then update my biography to "Y" and create a new post. The two posts will have different author biographies, and this is just right, as they have exactly captured reality.
Of course, you may also want to display a biography on an author page. In this case you'll have a problem: which one will you use? Probably the latest one.
If all authors, in order to exist in your system, MUST have a blog post published, that may well be enough. But maybe you want to let an author write a biography and be listed in your system even before writing a blog post.
In such a case you need to NORMALIZE the model and create a new document type just for authors. If this is your case, you also need to figure out how to handle the situation described before. When an author updates their biography, will you just update the author document, or create a new one? If you create a new one, so that you can keep track of all changes, will you also update all the previous posts so that they reference the new document, or not?
As you can see the answer is complex, and REALLY depends on what kind of information you want to capture from the real world.
So, first of all, figure out if you really need to keep posts and users separated.
CONSISTENCY
Let’s assume that you really want to have posts and users kept in separate documents, and thus you normalize your model. In this case, keep in mind that Cosmos DB (like NoSQL databases in general) DOES NOT OFFER any kind of native support to enforce referential integrity, so you are pretty much on your own. Indexes can help, of course, so you may want to index the ownerId property so that, before deleting an author, for example, you can efficiently check whether any of their blog posts would otherwise be left orphaned.
Another option is to manually create and keep updated ANOTHER document that, for each author, keeps track of the blog posts he/she has written. With this approach you can just look at this document to understand which blog posts belong to an author. You can try to keep this document automatically updated using triggers, or do it in your application. Just keep in mind that when you normalize in a NoSQL database, keeping data consistent is YOUR responsibility. This is exactly the opposite of a relational database, where your responsibility is to keep data consistent when you de-normalize it.
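To make the "you are on your own" point concrete, here is a minimal sketch of an application-level check that refuses to delete a user who still owns posts. It uses in-memory data shaped like the question's documents; the deleteUser helper is hypothetical, and the check is not atomic, so a real implementation would still have to deal with posts created between the check and the delete.

```javascript
// In-memory stand-ins shaped like the question's documents.
const posts = [
  { id: "id1", ownerId: 123 },
  { id: "id2", ownerId: 123 },
];
const users = new Map([
  [123, { id: 123 }],
  [456, { id: 456 }],
]);

// Refuse to delete a user whose posts would be left orphaned.
function deleteUser(userId) {
  // With a real database this would be a query on an indexed ownerId.
  const owned = posts.filter((p) => p.ownerId === userId).map((p) => p.id);
  if (owned.length > 0) {
    return { deleted: false, blockedBy: owned };
  }
  // Map.delete returns false when the user did not exist.
  return { deleted: users.delete(userId), blockedBy: [] };
}
```

Note that the list of a user's posts is derived from ownerId here rather than read from a stored postIds array, so there is only one copy of the link that can go wrong.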
PERFORMANCE
Performance COULD be an issue, but you don't usually model for performance in the first place. You model to make sure your model can represent and store the information you need from the real world, and then you optimize it to get decent performance with the database you have chosen to use. As different databases have different constraints, the model is then adapted to deal with those constraints. This is nothing more and nothing less than the good old “logical” vs “physical” modeling discussion.
In Cosmos DB case, you should not have queries that go cross-partition as they are more expensive.
Unfortunately, partitioning is something you choose once and for all, so you really need to have clear in your mind which use cases are the most common and which you want to support best. If the majority of your queries are done on a per-author basis, I would partition per author.
Now, while this may seem a clever choice, it is one only if you have A LOT of authors. If you have only one, for example, all data and queries will go into just one partition, limiting your performance A LOT. Remember, in fact, that Cosmos DB RUs are split among all the available partitions: with 10,000 RU, for example, you usually get 5 partitions, which means that all your values will be spread across 5 partitions. Each partition will have a top limit of 2,000 RU. If all your queries use just one partition, your real maximum performance is that 2,000 and not 10,000 RUs.
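The arithmetic above can be written down as a tiny helper. This is only a sketch of the even-split reasoning in this answer: perPartitionRU is a made-up name, and the partition count of 5 is the answer's example figure, not something the application gets to choose.

```javascript
// Even split of provisioned throughput across partitions.
function perPartitionRU(totalRU, partitionCount) {
  if (partitionCount <= 0) {
    throw new RangeError("partitionCount must be positive");
  }
  return totalRU / partitionCount;
}

// The example from the text: 10,000 RU over 5 partitions.
const ceiling = perPartitionRU(10000, 5); // 2000
```

If every query lands on a single partition, that per-partition figure, not the provisioned total, is the effective limit.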
I really hope this helps you start to figure out the answer. And I really hope this helps to foster and grow a discussion (how to model for a document database) that I think is really due and mature now.
This is a general best practice question:
I am building a MEAN (mongo, express, angular, node) website. I have a user object that can have a gender [Mr or Miss] and a city [Paris, New York, Anything]
So this is quite a common problem: where should I store those lists that rarely change and never exceed, let's say, 50 rows.
1/ Is it better to have them stored in the database (Mongo), with a foreign key in the user table? Then I have a gender table and a city table, but every time I access these lists I need to read the database.
2/ Is it better to have them stored in a file or in a controller? But this is a bit dangerous, I think.
3/ Maybe there is another way that I don't know about.
I am not sure what is the best solution.
Are you concerned about an extra database call to get a list out?
If it were me, I'd pick option 1 and store them in the database. If you store value descriptions only on the front end, you run the risk of discrepancies if you end up updating your database's foreign keys but forget to update your controller or file, which seems rather untrustworthy. It also makes it more difficult to provide internationalization, because you'll have to start storing names and genders in files or controllers in multiple languages. Storing things is what a database is for, and an additional call to get a list out really doesn't have that big an impact on your performance.
Angular's $http object, which you are probably using to call your API, has a caching option, which means you'll only need to retrieve the list once per app instantiation.
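The "retrieve once per app instantiation" idea can be sketched without any framework. The cachedLoader helper below is a hypothetical illustration, not Angular's own mechanism, and the async function stands in for an HTTP request to an assumed endpoint such as /api/cities.

```javascript
// Cache the promise so the loader runs only once per app instance.
function cachedLoader(load) {
  let pending = null;
  return () => {
    if (pending === null) {
      pending = load().catch((err) => {
        pending = null; // forget a failed attempt so a later call can retry
        throw err;
      });
    }
    return pending;
  };
}

let calls = 0;
const getCities = cachedLoader(async () => {
  calls += 1; // stands in for one HTTP round trip
  return ["Paris", "New York"];
});
```

Every caller of getCities() receives the same promise, so the list is fetched once however many components ask for it.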
You could alternatively have a look at this post by Josh who found a way to pre populate a directive with JSON from the server before loading it.
Suppose that we have a web site where each person has a profile and other people write comments on the person's profile (like the wall on Facebook). What is the best way to store the comments made for a person? I was thinking of a relational database, with a field that holds all the comments for a person as one long string separated by some kind of delimiter, but I am not sure if this is the best way. Any ideas?
You'll have two separate tables, one for Users and one for Comments, with every entry having its own unique ID. The schema would go like:
Users (ID, name, mail, etc)
Comments (ID, for, from, time, content, etc)
Where for and from fields are User IDs.
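To show why one row per comment works better than a delimited string, here is a small in-memory sketch of that schema. The sample rows and the wallFor helper are made up for illustration; in a real database this would be a query on the Comments table filtered by the for column.

```javascript
// In-memory rows following the Users / Comments schema above.
const users = [
  { id: 1, name: "Ann" },
  { id: 2, name: "Bob" },
];
const comments = [
  { id: 1, for: 1, from: 2, time: 1700000000, content: "Hello Ann" },
  { id: 2, for: 2, from: 1, time: 1700000100, content: "Hi Bob" },
  { id: 3, for: 1, from: 2, time: 1700000200, content: "Nice profile" },
];

// All comments on one user's wall, newest first, with the author's name.
function wallFor(userId) {
  return comments
    .filter((c) => c.for === userId)
    .sort((a, b) => b.time - a.time)
    .map((c) => ({
      from: users.find((u) => u.id === c.from)?.name ?? "unknown",
      content: c.content,
    }));
}
```

Because each comment is its own row, it can be sorted, filtered, or deleted individually, which a single delimited string would not allow without parsing.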
You can use PostgreSQL, MySQL, SQLite, or even LevelDB if you want a simple key-value store. There are a lot of tutorials out there to get started with any of them.
The problem with relational databases is that they do not scale well to super massive social networking sites. When your table starts to get huge, queries will start to take more and more time. If your site is going to be pretty small, then a relational database is fine. I think that you may want to investigate "NoSQL" databases.
Start here:
http://nosql-database.org/