Possible to add data into commands on certain MongoDB collections? - node.js

Is it possible to add data into commands on certain MongoDB collections?
The use case here is for simple management of multitenancy. We have data that doesn't contain the tenant's id and we then want to insert the tenant's id in on every command (find, update, updateOne, insert, insertMany, etc.) to particular collections (some collections are generic tenant wide collections). We are using the MongoDB driver (rather not use Mongoose).
Currently, we have to remember to add the tenant id whenever we use a command, but this is a bit dangerous as it is possible to miss adding the tenant's id...
Thanks!

Out of the box Mongo does not come with this feature set.
Mongoose actually shines for these kind of things with its middleware infrastructure.
If mongoose is not an option you can look into these smaller hooks implementations:
https://www.npmjs.com/package/mongohooks
https://www.npmjs.com/package/mongodb-hooks
They give you something to work with I believe.

In the end we created a helper function which we passed the data to. The helper then adds the tenant's id to any commands nessasay and send the command to MongoDB.
This allows us to control when to restrict by tenant per collection per command type.

Related

couchdb _find security, filtering _find returned results

I know that we can define a validation function for updates in replication. But what about find/select queries?
I am wondering if is it possible to put mandatory filter to all _find queries eg:
filter password field from person documents? or put automatic filter to show only that user's documents.I don't want a user be able to see other user's documents in the same collection.
Is it possible to put a java api in this regard?
I work at Couchbase, so I'll answer the question from the Couchbase angle.
Couchbase 7.0 (currently available in Beta at time of writing) introduces scopes and collections. One of the use cases for scopes is to support multi-tenancy.
You create multiple "scopes" within a bucket, then give each database user access to only certain scopes.

Multiple remote databases, single local database (fancy replication)

I have a PouchDB app that manages users.
Users have a local PouchDB instance that replicates with a single CouchDB database. Pretty simple.
This is where things get a bit complicated. I am introducing the concept of "groups" to my design. Groups will be different CouchDB databases but locally, they should be a part of the user database.
I was reading a bit about "fancy replication" in the pouchDB site and this seems to be the solution I am after.
Now, my question is, how do I do it? More specifically, How do I replicate from multiple remote databases into a single local one? Some code examples will be super.
From my diagram below, you will notice that I need to essentially add databases dynamically based on the groups the user is in. A critique of my design will also be appreciated.
Should the flow be something like this:
Retrieve all user docs from his/her DB into localUserDB
var groupDB = new PouchDB('remote-group-url');
groupDB.replicate.to(localUserDB);
(any performance issues with multiple pouchdb instances 0_0?)
Locally, when the user makes a change related to a specific group, we determine the corresponding database and replicate by doing something like:
localUserDB.replicate.to(groupDB) (Do I need filtered replication?)
Replicate from many remote databases to your local one:
remoteDB1.replicate.to(localDB);
remoteDB2.replicate.to(localDB);
remoteDB3.replicate.to(localDB);
// etc.
Then do a filtered replication from your local database to the remote database that is supposed to receive changes:
localDB.replicate.to(remoteDB1, {
filter: function (doc) {
return doc.shouldBeReplicated;
}
});
Why filtered replication? Because your local database contains documents from many sources, and you don't want to replicate everything back to the one remote database.
Why a filter function? Since you are replicating from the local database, there's no performance gain from using design docs, views, etc. Just pass in a filter function; it's simpler. :)
Hope that helps!
Edit: okay, it sounds like the names of the groups that the user belongs to are actually included in the first database, which is what you mean by "iterate over." No, you probably shouldn't do this. :) You are trying to circumvent CouchDB's built-in authentication/privilege system.
Instead you should use CouchDB's built-in roles, apply those roles to the user, and then use a "database per role" scheme to ensure users only have access to their proper group DBs. Users can always query the _users API to see what roles they belong to. Simple!
For more details, read the pouchdb-authentication README.

Securing the data accessed by Neo4j queries

I wish to implement security on the data contained in a Neo4j database down to the level of individual nodes and/or relationships.
Most data will be available to all users but some data will be restricted by user role. I can add either properties or labels to the data that I wish to restrict.
I want to allow users to run custom cypher queries against the data but hide any data that the user isn't authorised to see.
If I have to do something from the outside then not only do I have to filter the results returned but I also have to parse and either restrict or modify all queries that are run against the data to prevent a user from writing a query which acted on data that they aren't allowed to view.
The ideal solution would be if there is a low-level hook that allows intercepting the reads of nodes and relationships BEFORE a cypher query acts on those records. The interceptor would perform the security checks and if they fail then it would behave as though the node or relationship didn't exist at all. i.e. the same cypher query would have different results depending on who ran it. And this would apply to all possible queries e.g. count(n) not just those that returned the nodes/relationships.
Can something like this be done? If it's not supported already, is there a suitable place in the code that I could add such a security filter or would it require many code changes?
Thanks, Damon
As Chris stated, it's certainly not trivial on database level, but if you're looking for a solution on application level, you might have a look at Structr, a framework on top of and tightly integrated with Neo4j.
It provides node-level security based on ACLs, with users, groups, and different access levels. The security in Structr is implemented on the lowest level possible, f.e. we only instantiate objects if the querying user has the approriate access rights.
All higher access levels like REST API and UI see only the records available in the user's context.
[1] http://structr.org, https://github.com/structr/structr

ACL best practices, store roles in user object, or separate table/collection?

I am using nodejs, and have been researching acl/authorization for the past week. I have found only a couple, but none seem to have all the features I require. The closest has been https://github.com/OptimalBits/node_acl, but I don't think it supports protecting resources by id (for example, if I wanted to allow user 12345 and only user 12345 to access user/12345/edit). Hence, I think I will have to make a custom acl solution for myself.
My question regarding this is, what are some pros and cons to storing roles (user, admin, moderator, etc.) under each user object, as opposed to creating another collection/table that maps each user with their authorization rules? node_acl uses a separate collection, whereas most of the other ones depend on the roles array in user objects.
By the way, I am using Mongodb at the moment. However I have not researched the pros and cons yet of using relational vs. nonrelational databases for authentication yet, so if let me know if your answer depends on that.
As I was typing this up, I thought of one thing. If I store roles in a separate collection, it is more portable. I would be able to swap out the acl system much more easily. (I think?)
The question here seems like it could be abstracted from "where should I store my roles" to "how should I store related information in Mongo (or NoSQL in general)". It's a relation vs non-relational modeling issue.
Non-Relational
Using Node + Mongo, storing the roles on the user will make it really easy to determine if a user has access to the feature, given that you can just look in the 'roles' property. The trade off is that you have lots of duplicate information ('user_read' could be a role on every user account) and if you end up changing that property, you'll need to update it inside every user object.
You could store the roles in their own collection and then store the id for that entry in the Roles collection on your User model, but then you'll still need to fetch the actual record from the collection to display any of it's information (though arguably this could be a rare occurrence)
Relational
Storing these in a relational DB would be a more "traditional" approach in that you can establish the relationships between the tables (via FKs / join tables or what not). This can be a good solution, but then you no longer have the benefits of using a NoSQL database.
Summary
If the rest of your app is stored in Mongo and has to stay there (for performance or whatever constraint) then you are probably better off doing it all in Mongo. Most of the advice I've come across says don't mix & match data stores, e.g. use one or the other, but not both. That being said, I've done projects with both and it can get messy but sometimes the pros outweigh the cons.
I like #DavidWelch answer, but I'd like to tackle the question from another perspective because the library mentioned gives the option to use a different data store entirely.
Storing roles in a separate data store:
(Pro) Can make the system more performant if you are using a faster data store. (More advantageous in distributed environments?)
(Con) You will have to ensure consistency between the two data stores.
General notes:
You can add roles/permissions such as 'blog\123' in acl. You can also give a user permissions based on verbs such as put, delete, get, etc..
I think it is easier to create a pluggable solution that does not depend on your storage implementation. Perhaps that is why acl does not store roles in the same collections you have.
If you choose to keep the roles in your own collection, consider adding them to a token (JWT). That way, you will not have to check your collection for every request that needs authorization.
I hope that helped.

How to construct this database in mongodb?

say i have two collections in mongodb,one for users which contains users basic info,and one for apps which contains applications.now if users are allowed to add apps,and the next time when they login, the web should get the user added app for them.how should i construct this kind of database in mongodb.
users:{_id:ObjectId(),username:'username',password:'password'}
apps:{_id:ObjectId(),appname:'',developer:,description:};
and how should i get the user added app for them??? should i add something like addedAppId like:
users:{_id:ObjectId(),username:'username',password:'password',addedApppId:[]}
to indicate which app they have added,and then get the apps using addedAppId???
Yep, there's nothing wrong with the users collection keeping track of which apps a user has added like you've indicated.
This is known as linking in MongoDB. In a relational system you'd probably create a separate table, added_apps, which had a user ID, app ID and any other relevant information. But since you can't join, keeping this information in the users collection is entirely appropriate.
From the docs:
A key question when designing a MongoDB schema is when to embed and when to link. Embedding is the nesting of objects and arrays inside a BSON document. Links are references between documents.
There are no joins in MongoDB – distributed joins would be difficult on a 1,000 server cluster. Embedding is a bit like "prejoined" data. Operations within a document are easy for the server to handle; these operations can be fairly rich. Links in contrast must be processed client-side by the application; the application does this by issuing a follow-up query.
(this is the extra bit you'd need to do, fetch app information from the user's stored AppId.)
Generally, for "contains" relationships between entities, embedding should be be chosen. Use linking when not using linking would result in duplication of data.
...
Many to many relationships are generally done by linking.

Resources