MongoDB find documents based on array of values - node.js

I have one collection, called "games" whose documents store the ids of the owners of games.
{
"owner" : "88878b6c25c167d"
}
{
"owner" : "88878b6c25c167d"
}
{
"owner" : "af565f77f73469b"
}
Then I have another collection called "users".
{
"id" : "af565f77f73469b"
}
{
"id" : "a881335e1d4cf17"
}
{
"id" : "aa3ce3f7767c46b"
}
{
"id" : "d19e52c0bd78bcb"
}
{
"id" : "88878b6c25c167d"
}
So the first query I do retrieves the owners of all the games and stores those values in an array:
['88878b6c25c167d', '88878b6c25c167d', 'af565f77f73469b']
The second query I want to perform should retrieve the documents of the users with the corresponding IDs. Previously I had used this query:
db.users.find({'id':
{'$in': [
'88878b6c25c167d',
'88878b6c25c167d',
'af565f77f73469b'
]}})
Here's the problem with this: It does not return duplicate documents if a user is listed twice, as above. Instead I just get one. This breaks my application. How can I make sure that each owner returns a document?

MongoDB is working as intended: it finds all users whose ids are contained in the array.
I don't know the broader context of your needs (maybe tell us what you want to achieve, not what is wrong?), but if you want an association between games and users, something like this may be suitable:
1. After retrieving the collection of games, create an auxiliary hash map (a normal JS object) that, for a given owner id, returns the array of that owner's games.
2. Retrieve info about the users who are owners.
3. If you want to know which games belong to a particular user, just pass her id to the data structure from step 1 and get the array of games.
Is that what you were looking for? Do you need help with step 1?
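The steps above can be sketched roughly as follows. This is a minimal sketch in plain JavaScript, assuming the games and users have already been fetched into arrays; the helper names groupGamesByOwner and ownerForEachGame are made up for illustration.

```javascript
// Step 1: build a lookup from owner id to the list of that owner's games.
function groupGamesByOwner(games) {
  const gamesByOwner = {};
  for (const game of games) {
    if (!gamesByOwner[game.owner]) gamesByOwner[game.owner] = [];
    gamesByOwner[game.owner].push(game);
  }
  return gamesByOwner;
}

// If one user document per game is really needed (duplicates included),
// index the users returned by the $in query and map over the games.
function ownerForEachGame(games, users) {
  const usersById = {};
  for (const user of users) usersById[user.id] = user;
  return games.map((game) => usersById[game.owner]);
}

// Sample data copied from the question.
const games = [
  { owner: '88878b6c25c167d' },
  { owner: '88878b6c25c167d' },
  { owner: 'af565f77f73469b' },
];
// What the $in query returns: one document per matching user, no duplicates.
const users = [{ id: 'af565f77f73469b' }, { id: '88878b6c25c167d' }];

const gamesByOwner = groupGamesByOwner(games);
const owners = ownerForEachGame(games, users);
console.log(Object.keys(gamesByOwner).length); // 2 distinct owners
console.log(owners.length); // 3, one owner document per game
```

The duplication then happens in application code, so the database query can stay a plain $in lookup.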

Related

emit doc twice with different key in couchdb

Say I have a doc to save with couchDB and the doc looks like this:
{
"email": "lorem#gmail.com",
"name": "lorem",
"id": "lorem",
"password": "sha1$bc5c595c$1$d0e9fa434048a5ae1dfd23ea470ef2bb83628ed6"
}
and I want to be able to query the doc by either 'id' or 'email'. So when I save this as a view, I write:
db.save('_design/users', {
byId: {
map: function(doc) {
if (doc.id && doc.email) {
emit(doc.id, doc);
emit(doc.email, doc);
}
}
}
});
And then I could query like this:
db.view('users/byId', {
key: key
}, function(err, data) {
if (err || data.length === 0) return def.reject(new Error('not found'));
data = data[0] || {};
data = data.value || {};
self.attrs = _.clone(data);
delete self.attrs._rev;
delete self.attrs._id;
def.resolve(data);
});
And it works just fine. I could load the data either by id or email. But I'm not sure if I should do so.
I have another solution: saving the same doc with two different views, like byId and byEmail. But this way I save the same doc twice, which will obviously cost database space.
Not sure which solution is better.
The canonical solution would be to have two views, one by email and one by id. To avoid wasting space on the document, you can just emit null as the value and then use the include_docs=true query parameter when you query the view.
Also, you might want to use _id instead of id. That way, CouchDB ensures that the ID will be unique and you don't have to use a view to look up documents.
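A minimal sketch of the two-view layout, reusing the view-definition shape from the question. The name userViews is made up, and whether the client library passes include_docs through unchanged is an assumption worth checking against its documentation.

```javascript
// One view per lookup key. Emitting null keeps the index small because the
// document body is not copied into it. emit is supplied by CouchDB when the
// map function runs inside the view server.
const userViews = {
  byId: {
    map: function (doc) {
      if (doc.id) emit(doc.id, null);
    },
  },
  byEmail: {
    map: function (doc) {
      if (doc.email) emit(doc.email, null);
    },
  },
};

// Saved and queried the same way as in the question, for example:
//   db.save('_design/users', userViews);
//   db.view('users/byEmail', { key: key, include_docs: true }, callback);
// With include_docs=true each row carries the full document in row.doc.
```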
I'd change to the two separate views. That's explicit and clear. When you emit the same doc twice in a single view, by an id and by an e-mail, you're effectively combining the 2 views into one. You may think of it as a search tree with 2 root branches. I don't see any reason for doing that, and would suggest leaving the data access and storage optimization job to the database.
The views combination may also yield tricky bugs, when for some reason you confuse an id and an e-mail.
There is absolutely nothing wrong with emitting the same document multiple times with a different key. It's about what makes most sense for your application.
If id and email are always valid and interchangeable ways to identify a user then a single view is perfect. For example, when id is some sort of unique account reference and users are allowed to use that or their (more memorable) email address to login.
However, if you need to differentiate between the two values, e.g. id is only meant for application administrators, then separate views are probably better. (You could probably use a complex key instead ... but that's another answer.)

How can I reduce the number of calls to a MongoDB instance when using a role-based application?

I'm specifically talking about NodeJS with MongoDB (I know MongoDB is schema-less, but let's be realistic about the importance of structuring data for a moment).
Is there some magic solution to minimising the number of queries to a database in regards to authenticating users? For example, if the business logic of my application needs to ensure that a user has the necessary privileges to update/retrieve data from a certain Document or Collection, is there any way of doing this without two calls to the database? One to check the user has the rights, and the other to retrieve the necessary data?
EDIT:
Another question closed by the trigger-happy SO moderators. I agree the question is abstract, but I don't see how it is "not a real question". To put it simply:
What is the best way to reduce the number of calls to a database in role-based applications, specifically in the context of NodeJS + MongoDB? Can it be done? Or is role-based access control for NodeJS + MongoDB inefficient and clumsy?
Obviously, you know which document holds which rights. I would guess that it is a field in the document, like:
{ 'foo':'bar',
'canRead':'sales' }
At the start of the session you could query the roles a user has. Say
{ 'user':'shennan',
'roles':[ 'users','west coast','sales'] }
You could store that list of roles in the user's session. With that in hand, all that's left to do is add the roles with an $in operator, like this :
db.test.find({'canRead':{'$in':['users','west coast','sales']}})
Where the value for the $in operator is taken from the user's session. Here is code to try it out on your own, in the mongo console :
db.test.insert( { 'foo':'bar', 'canRead':'sales' })
db.test.insert( { 'foo2':'bar2', 'canRead':['hr','sales'] })
db.test.insert( { 'foo3':'bar3', 'canRead':'hr' })
> db.test.find({}, {_id:0})
{ "foo" : "bar", "canRead" : "sales" }
{ "foo2" : "bar2", "canRead" : [ "hr", "sales" ] }
{ "foo3" : "bar3", "canRead" : "hr" }
Document with 'foo3' can't be read by someone in sales :
> db.test.find({'canRead':{'$in':['users','west coast','sales']}}, {_id:0})
{ "foo" : "bar", "canRead" : "sales" }
{ "foo2" : "bar2", "canRead" : [ "hr", "sales" ] }
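In a Node.js application the same filter can be built from the roles kept in the session. A minimal sketch, assuming a session object shaped like the user document above; buildReadFilter is a made-up helper name.

```javascript
// Build the MongoDB filter from the roles stored in the user's session.
function buildReadFilter(roles) {
  return { canRead: { $in: roles } };
}

// Hypothetical session object; in a real app it comes from the session store.
const session = { user: 'shennan', roles: ['users', 'west coast', 'sales'] };
const filter = buildReadFilter(session.roles);
console.log(JSON.stringify(filter));

// With the official Node.js driver the filter would be used roughly as:
//   const docs = await db.collection('test').find(filter).toArray();
```

Permission check and data retrieval then happen in a single query, because documents the user may not read simply do not match.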
Definitely do-able, but w/o more context it's hard to determine what's best.
One simple solution that comes to mind is to cache users and their permissions in memory so no DB lookup is required. At this point you can just issue the query for documents where permission match and...
Let me know if you need a few more ideas.
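The in-memory caching idea could look roughly like this. It is only a sketch: createRoleCache, the TTL value, and loadRolesFromDb in the usage comment are made up for illustration, and a real deployment with several Node processes would need a shared cache or short expiry times.

```javascript
// Tiny in-memory cache for user roles with a time-to-live.
// The clock is injectable so expiry can be tested without waiting.
function createRoleCache(ttlMs, now = Date.now) {
  const entries = new Map(); // userId -> { roles, expiresAt }
  return {
    get(userId) {
      const entry = entries.get(userId);
      if (!entry) return undefined;
      if (entry.expiresAt <= now()) {
        entries.delete(userId); // stale: force a fresh database lookup
        return undefined;
      }
      return entry.roles;
    },
    set(userId, roles) {
      entries.set(userId, { roles, expiresAt: now() + ttlMs });
    },
  };
}

const roleCache = createRoleCache(60 * 1000);
roleCache.set('shennan', ['users', 'west coast', 'sales']);
console.log(roleCache.get('shennan')); // cached roles, no database call
console.log(roleCache.get('someone-else')); // undefined, needs a lookup

// Typical use in a request handler (loadRolesFromDb is hypothetical):
//   let roles = roleCache.get(userId);
//   if (!roles) { roles = await loadRolesFromDb(userId); roleCache.set(userId, roles); }
```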

CouchDb view - key in a list

I want to query CouchDB and I have a specific need: my query should return the name field of the documents matching this condition: the id is equal to, or contained in, a document field (a list).
For example, the field output is the following :
"output": [
"doc_s100",
"doc_s101",
"doc_s102",
"doc_s103",
],
I want to get all the documents having in their output field "doc_s102" for example.
I wrote a view in a design document :
"backward_by_docid": {
"map": "function(doc) {if(doc.output) emit(doc.output, doc.name)}"
}
but this view works only when I have a unique value in the output field.
How can I resolve this query ?
Thanks !
you have to iterate over the array:
if(doc.output) {
for (var curOutput in doc.output) {
emit (doc.output[curOutput],doc.name);
}
}
Make sure that output is always an array (at least []),
and, of course, query with key="xx" instead of key=["xxx"].
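Put together as a complete map function, this could look like the sketch below. The Array.isArray guard is an addition that skips documents whose output is missing or not a list, and the function name backwardByDocid is made up.

```javascript
// Map function for the backward_by_docid view: one row per entry in output.
// emit is provided by CouchDB when the function runs inside the view server.
function backwardByDocid(doc) {
  if (Array.isArray(doc.output)) {
    for (var i = 0; i < doc.output.length; i++) {
      emit(doc.output[i], doc.name);
    }
  }
}

// In the design document the function is stored as a string:
//   "backward_by_docid": { "map": backwardByDocid.toString() }
// and queried with a plain string key, e.g. ?key="doc_s102"
```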

Object mapping solution

I have this SQL database structure: a users table, an objects table, and a mapping table users_objects_map to assign objects to a user account.
In SQL it works fine. With this structure it is easy to fetch the objects of a user or the users assigned to an object. I can also assign an object to multiple users.
users
id
firstname
lastname
...
objects
id
...
users_objects_map
user_id
object_id
What is the best way to build this with MongoDB?
My first idea was to add an array to each user in which the IDs of all assigned objects are stored.
{"firstname":"John", "lastname": "Doe", "object_ids":["id","id2",...,"id-n"]}
But what if a user is assigned to thousands of objects? I don't think that's a good solution. And how would I fetch all users assigned to an object, or all objects assigned to a user?
Is there any clever MongoDB solution for my problem?
Using object IDs within BsonArrays as references to the objects is a great way to go. Also consider using BsonDocuments within the "object_ids" of the user itself; then you will be able to scale it more easily. Use the "_id" (ObjectID) so that MongoDB indexes those IDs, which will improve performance.
Eventually you will be having 2 collections, one is with users and the other is with objects:
user:
{
"_id" : "user_id",
"firstname" : "John",
"lastname" : "Doe",
"object_ids" : [
{ "_id" : "26548" , "futurefield" : "futurevalue" },
{ "_id" : "26564" , "futurefield" : "futurevalue" }
]
}
At this moment I really don't know what kind of objects they are going to be, but I can give you an example:
workshop object:
{
"_id" : "user_id",
"name" : "C# for Advanced Users",
"level" : "300",
"location" : "Amsterdam, The Netherlands",
"date" : "2013-05-08T15:00:00"
}
Now comes the fun part and that is querying.
I am developing in C# and using the driver from mongodb.org.
Example:
Give me everyone that has object id == "26564".
var query = from user in userCollection.Find(Query.EQ("object_ids._id","26564"))
select user;
This query will return the documents, in this case the users that have matched the ID.
If you have a range of values please use: Query.All("name" , "BsonArray Values");
The second query retrieves the object IDs stored in the BsonDocuments of a document's "object_ids" array.
var secondQuery =
from workshops in objectsCollection.Find(Query.EQ("_id", "userid"))
select workshops["object_ids"].AsBsonArray.ToArray();
I hope I have helped you this way.
Good luck with it!
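For readers not using C#, the two lookups can also be expressed as plain MongoDB filter documents. A rough sketch in JavaScript, assuming the user structure shown above and collections named users and objects (both names are assumptions); the helper names are made up.

```javascript
// Users that reference a given object: dot notation reaches into the
// embedded documents of the object_ids array.
function usersOfObjectFilter(objectId) {
  return { 'object_ids._id': objectId };
}

// Objects assigned to a user: collect the ids from the user document,
// then fetch them all with a single $in query.
function objectsOfUserFilter(user) {
  const ids = user.object_ids.map((ref) => ref._id);
  return { _id: { $in: ids } };
}

// Sample user copied from the answer above.
const john = {
  _id: 'user_id',
  firstname: 'John',
  lastname: 'Doe',
  object_ids: [{ _id: '26548' }, { _id: '26564' }],
};

console.log(JSON.stringify(usersOfObjectFilter('26564')));
console.log(JSON.stringify(objectsOfUserFilter(john)));

// In the mongo shell these would be used roughly as:
//   db.users.find(usersOfObjectFilter('26564'))
//   db.objects.find(objectsOfUserFilter(john))
```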

Whats the best way of saving a document with revisions in a key value store?

I'm new to Key-Value Stores and I need your recommendation. We're working on a system that manages documents and their revisions. A bit like a wiki does. We're thinking about saving this data in a key value store.
Please don't give me a recommendation that is the database you prefer because we want to hack it so we can use many different key value databases. We're using node.js so we can easily work with json.
My question is: what should the structure of the database look like? We have metadata for each document (timestamp, lasttext, id, latestrevision) and we have data for each revision (the change, the author, timestamp, etc.). So, which key/value structure do you recommend?
thx
Cribbed from the MongoDB groups. It is somewhat specific to MongoDB, however, it is pretty generic.
Most of these history implementations break down to two common strategies.
Strategy 1: embed history
In theory, you can embed the history of a document inside of the document itself. This can even be done atomically.
> db.docs.save( { _id : 1, text : "Original Text" } )
> var doc = db.docs.findOne()
> db.docs.update( {_id: doc._id}, { $set : { text : 'New Text' }, $push : { hist : doc.text } } )
> db.docs.find()
{ "_id" : 1, "hist" : [ "Original Text" ], "text" : "New Text" }
Strategy 2: write history to separate collection
> db.docs.save( { _id : 1, text : "Original Text" } )
> var doc = db.docs.findOne()
> db.docs_hist.insert ( { orig_id : doc._id, ts : Math.round((new Date()).getTime() / 1000), data : doc } )
> db.docs.update( {_id:doc._id}, { $set : { text : 'New Text' } } )
Here you'll see that I do two writes. One to the master collection and
one to the history collection.
To get fast history lookup, just grab the original ID:
> db.docs_hist.ensureIndex( { orig_id : 1, ts : 1 })
> db.docs_hist.find( { orig_id : 1 } ).sort( { ts : -1 } )
Both strategies can be enhanced by only displaying diffs
You could hybridize by adding a link from history collection to original collection
Whats the best way of saving a document with revisions in a key value store?
It's hard to say there is a "best way". There are obviously some trade-offs being made here.
Embedding:
atomic changes on a single doc
can result in large documents, may break the reasonable size limits
probably have to enhance code to avoid returning full hist when not necessary
Separate collection:
easier to write queries
not atomic, needs two operations (do you have transactions?)
more storage space (extra indexes on original docs)
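Since the question asks about key-value stores in general, here is one way the separate-history strategy might map onto plain keys. This is only a sketch: a Map stands in for the store, and the key scheme doc:<id> and doc:<id>:rev:<n> is an assumption, not something taken from the answers.

```javascript
// A Map stands in for the key-value store; values are JSON strings.
const store = new Map();

function saveRevision(docId, text, author) {
  const metaKey = 'doc:' + docId;
  const meta = store.has(metaKey)
    ? JSON.parse(store.get(metaKey))
    : { id: docId, latestrevision: 0 };
  const rev = meta.latestrevision + 1;
  const timestamp = Date.now();
  // Write the revision first, then the metadata that points at it.
  // These are two separate writes, so the pair is not atomic.
  store.set(metaKey + ':rev:' + rev, JSON.stringify({ text, author, timestamp }));
  store.set(
    metaKey,
    JSON.stringify({ id: docId, latestrevision: rev, lasttext: text, timestamp })
  );
  return rev;
}

function loadRevision(docId, rev) {
  const raw = store.get('doc:' + docId + ':rev:' + rev);
  return raw ? JSON.parse(raw) : null;
}

saveRevision(1, 'Original Text', 'alice');
saveRevision(1, 'New Text', 'bob');
const latestMeta = JSON.parse(store.get('doc:1'));
console.log(latestMeta.latestrevision, latestMeta.lasttext);
console.log(loadRevision(1, 1).text);
```

Reading the latest text needs only the metadata key, and any older revision is a single get by its composed key.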
I'd keep a hierarchy of the real data under each document with the revision data attached, for instance:
{
[
{
"timestamp" : "2011040711350621",
"data" : { ... the real data here .... }
},
{
"timestamp" : "2011040711350716",
"data" : { ... the real data here .... }
}
]
}
Then use the push operation to add new versions and periodically remove the old versions. You can use the last (or first) filter to only get the latest copy at any given time.
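The push-and-prune idea can be sketched on a plain array. The cap of stored versions and the function names are made up for illustration.

```javascript
// Append a new version and drop the oldest ones beyond the cap.
function pushVersion(versions, data, maxVersions, timestamp) {
  versions.push({ timestamp: timestamp, data: data });
  while (versions.length > maxVersions) versions.shift();
  return versions;
}

// The newest version is always the last element.
function latestVersion(versions) {
  return versions.length > 0 ? versions[versions.length - 1] : null;
}

const versions = [];
pushVersion(versions, { text: 'v1' }, 2, '2011040711350621');
pushVersion(versions, { text: 'v2' }, 2, '2011040711350716');
pushVersion(versions, { text: 'v3' }, 2, '2011040711350801');
console.log(versions.length); // 2
console.log(latestVersion(versions).data.text); // v3
```

In MongoDB specifically, $push with the $each and $slice modifiers can do the append and the trim in one update.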
I think there are multiple approaches and this question is old but I'll give my two cents as I was working on this earlier this year. I have been using MongoDB.
In my case, I had a User account that then had Profiles on different social networks. We wanted to track changes to social network profiles and wanted revisions of them so we created two structures to test out. Both methods had a User object that pointed to foreign objects. We did not want to embed objects from the get-go.
A User looked something like:
User {
"tags" : [Tags]
"notes" : "Notes"
"facebook_profile" : <combo_foreign_key>
"linkedin_profile" : <same as above>
}
and then, for the combo_foreign_key we used this pattern (Using Ruby interpolation syntax for simplicity)
combo_foreign_key = "#{User.key}__#{new_profile.last_updated_at}"
facebook_profiles {
combo_foreign_key: facebook_profile
... and you keep adding your foreign objects in this pattern
}
This gave us O(1) lookup of the latest FacebookProfile of a User but required us to keep the latest FK stored in the User object. If we wanted all of the FacebookProfiles we would then ask for all keys in the facebook_profiles collection with the prefix of "#{User.key}__" and this was O(N)...
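A small JavaScript sketch of this first pattern, with a Map standing in for the facebook_profiles collection. The function names are hypothetical, and the linear scan in allProfiles mirrors the O(N) prefix lookup described above.

```javascript
const facebookProfiles = new Map(); // combo key -> profile

function comboKey(userKey, lastUpdatedAt) {
  return userKey + '__' + lastUpdatedAt;
}

// Store a new profile version and keep the pointer to the latest on the user.
function addProfile(user, profile) {
  const key = comboKey(user.key, profile.last_updated_at);
  facebookProfiles.set(key, profile);
  user.facebook_profile = key;
  return key;
}

// O(1): follow the stored pointer.
function latestProfile(user) {
  return facebookProfiles.get(user.facebook_profile);
}

// O(N): scan all keys for the user's prefix.
function allProfiles(user) {
  const prefix = user.key + '__';
  const result = [];
  for (const [key, profile] of facebookProfiles) {
    if (key.startsWith(prefix)) result.push(profile);
  }
  return result;
}

const exampleUser = { key: 'u1', notes: 'Notes' };
addProfile(exampleUser, { name: 'old', last_updated_at: 1000 });
addProfile(exampleUser, { name: 'new', last_updated_at: 2000 });
console.log(latestProfile(exampleUser).name); // new
console.log(allProfiles(exampleUser).length); // 2
```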
The second strategy we tried was storing an array of those FacebookProfile keys on the User object so the structure of the User object changed from
"facebook_profile" : <combo_foreign_key>
to
"facebook_profile" : [<combo_foreign_key>]
Here we'd just append the new combo_key when we added a new profile variation. Then we'd do a quick sort of the "facebook_profile" attribute and index on the largest one to get our latest profile copy. This method had to sort M strings and then index the FacebookProfile based on the largest item in that sorted list. It was a little slower for grabbing the latest copy, but it gave us the advantage of knowing every version of a User's FacebookProfile in one swoop, and we did not have to worry about ensuring that foreign_key was really the latest profile object.
At first our revision counts were pretty small and they both worked pretty well. I think I prefer the first one over the second now.
Would love input from others on ways they went about solving this issue. The GIT idea suggested in another answer actually sounds really neat to me and for our use case would work quite well... Cool.
