How to set a dynamic model association without a foreign key in CakePHP 3 - model-associations

I am using CakePHP 3.4.
I need to retrieve data from two tables that have no association between them. I need a solution that lets me bind the two models dynamically for that particular query.
There is the option of binding and unbinding models, but that is an old method.
For example, given two tables, Articles and Authors, with no relation or association between them, I should be able to find the Author record while querying Articles.

Related

How to generate test data for MongoDB in bulk that references different collections?

I'm working on a Node.js project with MongoDB as the database. I want to test how the API performs when a collection has thousands of documents in it.
So I'm trying to fill the collection with test data, but the problem is that a single collection has multiple associations, i.e. multiple reference fields.
Is there any API that lets me generate data that supports reference fields too?
There is an old library for creating dummy data in Mongoose: mongoose-dummy. But it doesn't look like it handles referenced documents.
You can also use faker to create e.g. addresses, emails, names, numbers, etc.
This is one of those things you really need to DIY (though you can use the above libraries to make it easier).
You created your application and models. Only you will be able to make realistic test data. For example, you will need to decide how many documents to create for a comment model on a blog post model, based on historical data or, if none exists, on your expectations.
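As a rough illustration of the DIY approach, the sketch below generates documents in dependency order so that every reference field points at a document that already exists. It is written in plain Python only to keep it self-contained; the collection names (users, posts, comments), the field names, and the `make_test_data` helper are all hypothetical, and `uuid` values stand in for ObjectIds. In a real project the generated lists would be bulk-inserted with your driver or ODM.

```python
import random
import uuid


def new_id() -> str:
    # Stand-in for a MongoDB ObjectId; any unique string is enough for a sketch.
    return uuid.uuid4().hex


def make_test_data(n_users, n_posts, max_comments, seed=0):
    rng = random.Random(seed)  # seeded so the document counts are repeatable

    # 1. Documents that nothing else depends on come first.
    users = [{"_id": new_id(), "name": f"user{i}"} for i in range(n_users)]

    # 2. Each post references an already generated user by its _id.
    posts = [
        {"_id": new_id(), "title": f"post{i}", "author": rng.choice(users)["_id"]}
        for i in range(n_posts)
    ]

    # 3. Each comment references both a post and a user.
    comments = []
    for post in posts:
        for _ in range(rng.randint(0, max_comments)):
            comments.append({
                "_id": new_id(),
                "post": post["_id"],
                "author": rng.choice(users)["_id"],
                "text": "lorem ipsum",
            })
    return users, posts, comments
```

The ratios (posts per user, comments per post) are the part only you can choose realistically, which is why they are parameters here rather than fixed values.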

MongoDB - When to add SubDocuments and when to Ref

I'm using MongoDB to store information for a Node.js application, and a doubt came to mind after finding that it is possible to use an ObjectID to reference another document. As is well known, MongoDB is a NoSQL database, so there is no need for consistency whatsoever and information can be repeated.
So, let's say I have a collection of users, and one of their fields is 'friends', an array of that user's friends (other users). What is the best practice: saving all of each friend's info there (thus repeating the same thing over and over throughout the DB), or saving only the ObjectID of the friend (which makes far more sense to me, but sounds like a SQL mindset)? I don't really understand when I should use each option, so a professional opinion would be much appreciated.
To model relationships between connected data, you can reference a document or embed it in another document as a subdocument.
Referencing a document does not create a “real” relationship between the two documents, as it does in a relational database.
Referencing documents is also known as normalization. It is good for data consistency but creates more queries in your system.
Embedding documents is also known as denormalization.
The benefit of the embedding approach is that you get all the data you need about a document and its sub-document(s) with a single query. Therefore, this approach is very fast. The drawback is that data may not stay as consistent in the database.
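The trade-off can be seen in a small sketch using plain Python dictionaries in place of a real database. The user ids, names, and helper functions below are made up for illustration; the point is only that the referenced form needs an extra lookup per friend, while the embedded form answers from one document but can go stale.

```python
# Referenced (normalized): "friends" holds ids that must be resolved separately.
users_by_id = {
    "u1": {"_id": "u1", "name": "Ana", "friends": ["u2", "u3"]},
    "u2": {"_id": "u2", "name": "Ben", "friends": []},
    "u3": {"_id": "u3", "name": "Cy", "friends": []},
}

# Embedded (denormalized): a copy of each friend's data lives inside the user.
embedded_user = {
    "_id": "u1",
    "name": "Ana",
    "friends": [{"_id": "u2", "name": "Ben"}, {"_id": "u3", "name": "Cy"}],
}


def friend_names_referenced(user_id):
    # One lookup for the user, then one more per friend id.
    user = users_by_id[user_id]
    return [users_by_id[fid]["name"] for fid in user["friends"]]


def friend_names_embedded(user):
    # Everything needed is already in the single document.
    return [friend["name"] for friend in user["friends"]]


def rename_user(user_id, new_name):
    # With references, a rename touches exactly one document.
    users_by_id[user_id]["name"] = new_name
```

After `rename_user("u2", "Benjamin")`, the referenced lookup reflects the new name immediately, whereas `embedded_user` still holds the old copy until every document embedding that friend is updated too.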
Important
Create a referenced doc if one document is to be used by many documents:
i. It will save space.
ii. If any change is required, we only have to update the referenced doc instead of updating many docs.
Create a sub doc (embedded):
i. If no other document depends on the subdocument.
Source: https://vegibit.com/mongoose-relationships-tutorial/
Recommended reading:
MongoDB Applied Design Patterns by Rick Copeland
To Embed or Reference

Can CouchDB do this?

I am evaluating CouchDB and I'm wondering whether it's possible to achieve the following functionality.
I'm planning to develop a web application, and the app should allow a 'parent' table and derivatives of this table. The parent table will contain all the fields (master table), and the user will selectively choose fields, which should be saved as separate tables.
My questions are as follows:
Is it possible to save different versions of the same table using CouchDB?
Is there an alternative to creating child tables (and cluttering the database)?
I'm new to NoSQL databases and am evaluating CouchDB because it supports JSON out of the box and this format seems to fit the application very well.
If there is a way to avoid saving the derivatives as separate tables, the application will be better for it. Any ideas how I could achieve this?
Thanks in advance.
CouchDB is a document-oriented database, which means you cannot talk in terms of tables. There are only documents. The _rev (or revision ID) describes a version of a document.
In CouchDB, there are 2 ways to model relationships:
1. Use separate documents
2. Use an embedded array
If you prefer not to clutter your database, you can choose option (2) and use an embedded array.
This also gives you cascade delete functionality for free.
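One possible shape for option (2), sketched with a plain Python dictionary rather than a CouchDB client: the parent document keeps all the fields, and each derivative is just an entry in an embedded array naming the fields the user selected. The document id, the field names, and the `derive` helper are assumptions made up for this example, not part of CouchDB.

```python
# A single "parent" document; each derivative is an entry in an embedded array.
parent_doc = {
    "_id": "master-table",
    "fields": {"name": "Widget", "price": 9.5, "stock": 12, "supplier": "Acme"},
    "derivatives": [
        {"name": "pricing", "fields": ["name", "price"]},
        {"name": "inventory", "fields": ["name", "stock"]},
    ],
}


def derive(doc, derivative_name):
    # Build the derived view on demand from the user's selected field names.
    for derivative in doc["derivatives"]:
        if derivative["name"] == derivative_name:
            return {key: doc["fields"][key] for key in derivative["fields"]}
    raise KeyError(derivative_name)
```

Because the derivatives live inside the parent, deleting the parent document removes them as well, which is the cascade behaviour mentioned above.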

Is it possible to have documents with a subset of fields of the collection's schema under one Solr collection?

We have 4 different data sets and want to perform faceted search on them.
We are currently using SolrCloud and flattened these data sets before indexing them to Solr. Even though we have relational data, our primary goal is faceted search and Solr seemed like the right option.
Rough structure of our data:
Dataset1(col1, col2, col3,col4)
Dataset2(col1,col6,col7,col8)
Dataset3(col6,col9,col10)
Flattened dataset: dataset(col1,col2,col3,col4,col6,col7,col8,col9,col10).
In the end, we flattened them to have one common structure and have nulls where values do not exist. So far Solr works great.
Problem: Now we have additional data sets coming in, and each of them has about 50-60 columns. Technically, I can still flatten these too, but I don't think it is a good idea. I know that I can have different collections with different schemas for each data set. But we perform group-bys on these documents, so we need one schema.
Is there any way to maintain documents with a subset of fields of the schema under one collection without flattening them? If not, is there a better solution for this problem?
For instance:
DocA(field1, field2) DocB(field3,field4).
Schema(field1, field2, field3, field4).
Can we have DocA and DocB under one collection with the above schema?
Our backend is on top of Cloudera Hadoop (CDH4.6 and 5.2) distribution and we can choose any tool that belongs to the Hadoop ecosystem for a possible solution.
Of course you can; each document only needs a different uniqueKey. If you have defined a fixed Solr schema, dynamic fields (dynamicField) may help you.
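To make the idea concrete, here is a small sketch in plain Python (no Solr client involved) of sparse documents living side by side under one schema. The `id` field is assumed to be the unique key, and the sample values and helper names are invented for illustration; the faceting function only imitates the behaviour of counting values over the documents that actually carry the field.

```python
from collections import Counter

# One schema for the whole collection; "id" is assumed to be the unique key.
SCHEMA_FIELDS = {"id", "field1", "field2", "field3", "field4"}

# DocA-style and DocB-style documents each fill only a subset of the schema.
docs = [
    {"id": "A-1", "field1": "red", "field2": 10},
    {"id": "A-2", "field1": "blue", "field2": 20},
    {"id": "B-1", "field3": "x", "field4": 1.5},
]


def unknown_fields(doc):
    # Fields a fixed schema would not accept (empty set means the doc fits).
    return set(doc) - SCHEMA_FIELDS


def facet_counts(documents, field):
    # Documents that lack the field simply do not contribute to the facet.
    return Counter(doc[field] for doc in documents if field in doc)
```

Missing fields are left out of the document entirely rather than being filled with nulls, so there is nothing to flatten.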

How to perform intersection operation on two datasets in Key-Value store?

Let's say I have 2 datasets, one for rules, and the other for values.
I need to filter the values based on rules.
I am using a key-value store (Couchbase, Cassandra, etc.). I can use multi-get to retrieve all the values from one table and all the rules from the other, and perform validation in a loop.
However, I find this very inefficient. I move a massive volume of data (values) over the network, and the client is kept busy filtering.
What is the common pattern for finding the intersection between two tables with Key-Value store?
The idea behind the NoSQL data model is to write data in a denormalized way so that a table can answer a precise query. As an example, imagine you have reviews made by customers on shops. You need to know the reviews made by a user on shops and also the reviews received by a shop. This would be modeled using two tables:
ShopReviews
UserReviews
In the first table you query by shop id, in the second by user id; the data is written twice, but each read is a direct access by a single key.
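A minimal sketch of that write-twice pattern, with two Python dictionaries standing in for the ShopReviews and UserReviews tables. The `add_review` helper and the review fields are hypothetical; a real store would replace the dictionary writes with its own put operations.

```python
from collections import defaultdict

# Two "tables", each keyed for exactly one query.
shop_reviews = defaultdict(list)  # ShopReviews: shop id -> reviews received
user_reviews = defaultdict(list)  # UserReviews: user id -> reviews written


def add_review(user_id, shop_id, text):
    review = {"user_id": user_id, "shop_id": shop_id, "text": text}
    # The same data is written twice, once under each access key.
    shop_reviews[shop_id].append(dict(review))
    user_reviews[user_id].append(dict(review))


def reviews_for_shop(shop_id):
    # A single key lookup, no scan and no client-side filtering.
    return shop_reviews[shop_id]


def reviews_by_user(user_id):
    return user_reviews[user_id]
```

The cost is moved to write time: every insert does two writes so that each read stays a single key access.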
In the same way, you should organize values by rules (I can't be more precise without knowing the relation between them), and so on. One more consideration: newer versions of NoSQL databases support collections, which might help to model one-to-many relations.
HTH, Carlo

Resources