PouchDB as a real live data tool for different collections - couchdb

I'm thinking of using PouchDB to automatically update the comments that users submit on papers.
It should mimic the behavior of a publish/subscribe service: whenever someone submits a comment from their client, the list of comments on another client should update automatically.
This is possible using PouchDB as described in the getting started guide:
var db = new PouchDB('paper');
var remoteCouch = 'http://user:pass@mname.iriscouch.com/paper';
function sync() {
  var opts = {live: true};
  // syncError is the error callback defined elsewhere in the guide
  db.replicate.to(remoteCouch, opts, syncError);
  db.replicate.from(remoteCouch, opts, syncError);
}
The app holds different papers, each with their own comments. When using PouchDB as my publish/subscribe service, I have these questions:
Is it a good idea to use PouchDB this way?
If I only want to sync the comments of the paper a user is currently working on, should I create a new database for each paper? (This would also mean losing the ability to query, for example, all of a user's comments across all papers from a single database.)
Is there a way to only sync a part of the database? This way I could still use the database to hold all the comments even for different papers.

Yep, PouchDB works fine for real-time stuff. It doesn't use web sockets, but it uses long-polling, which is fast enough for most use cases.
It sounds like you probably should create a separate database for each paper, assuming you want to restrict access on a per-paper basis. CouchDB authentication is kinda tricky, but basically if you want to control read access, you can either give users full read access or zero read access to an entire database. There's a writeup here.
Also don't worry about creating thousands of databases; a "database" is cheap in CouchDB.
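A minimal sketch of the database-per-paper idea, assuming a hypothetical naming scheme of "paper-<id>" and a hypothetical remote base URL. The PouchDB constructor is passed in as a parameter so the sketch does not depend on how PouchDB is loaded in your app; paperDbName and syncPaper are illustrative names, not part of PouchDB.

```javascript
// Hypothetical helper: map a paper id to a CouchDB-safe database name.
// CouchDB database names must start with a lowercase letter and may only
// contain lowercase letters, digits and the characters _ $ ( ) + - /
// Slashes are replaced here too, so the name never needs URL-encoding.
function paperDbName(paperId) {
  const safe = String(paperId).toLowerCase().replace(/[^a-z0-9_$()+-]/g, '-');
  return 'paper-' + safe;
}

// Start two-way live sync for a single paper.
// PouchDBCtor: the PouchDB constructor; remoteBase: assumed CouchDB base URL.
function syncPaper(PouchDBCtor, remoteBase, paperId) {
  const name = paperDbName(paperId);
  const local = new PouchDBCtor(name);
  // db.sync() replicates in both directions; retry reconnects after failures.
  return local.sync(remoteBase + '/' + name, { live: true, retry: true });
}
```

When the user opens another paper, you would cancel the returned sync handle and call syncPaper again with the new id.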
The only other thing I would advise is that maybe you would like the relational-pouch plugin, because then you could easily set up a relational-style database with a "paper" type and a "comment" type.


Couchdb apply filter server side

I'm developing a mobile app using PouchDB (client-side) and CouchDB (server-side).
I need to secure docs so that users can read and write only their own documents.
I did a filter for this, something like:
function(doc, req) {
  return doc.owner == req.userCtx.name || doc.sharedWith == req.userCtx.name;
}
and it works well, but only if the request from client includes the filter:
/somedatabase/_alldocs?filter=filter/secure
I need CouchDB to apply the filter to every request, whether or not the client asks for it, for obvious security reasons. Is this even possible? Otherwise, what is the correct approach to handling these security issues?
There is a similar question here, but the answer is not applicable in my case, since I need to share docs between users and replicating them between all databases is not a valid option.
So I don't know if you have looked at this wiki, but it lists a few of the available options. Some of them are outdated, though.
Per user database
Probably the most popular solution. As you said, you need to share documents with other users. This could be done in one of two ways:
Copy the document to the other users' databases when sharing. You could have a daemon that listens to the _changes feed and updates the author's document in the other users' databases.
Build a web service to access shared documents (very similar to the proxy solution).
Smart Proxy
Build a smart proxy in front of your database and do some business logic to fetch the documents. This gives you more control on your data flow but it will surely be slower.
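As a rough sketch of the business logic such a proxy could run on documents before returning them, here is the same owner/sharedWith rule as the filter in the question, written as plain functions. The names canRead and filterReadable are hypothetical, and allowing sharedWith to be a list of names is an assumption made to cover sharing with groups of users.

```javascript
// Decide whether a user may read a document.
// Mirrors the question's filter; sharedWith may be a single name or an
// array of names (the array form is an assumption for group sharing).
function canRead(doc, userName) {
  const sharedWith = [].concat(doc.sharedWith || []);
  return doc.owner === userName || sharedWith.includes(userName);
}

// The proxy would fetch docs from CouchDB with its own credentials and
// strip out everything the authenticated user is not allowed to see.
function filterReadable(docs, userName) {
  return docs.filter((doc) => canRead(doc, userName));
}
```

The important part is that the check runs on the server for every request, so it no longer matters whether the client asks for the filter.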
Note
The validate_doc_read server function might interest you, but it has never been part of a CouchDB release (due to the listed limitations).
Uhm, probably it isn't. The app that we are developing needs to share documents with different users. Any doc could be shared with a different group of users.

Best way to store answers from users in Facebook bot chat?

Building a Facebook messenger bot using Claudia JS and plan on hosting on AWS Lambda.
I want to ask the user a series of questions.
When a user responds with an answer, I need to save that for later and once I have all the information I need, I will pass the answers to a function.
What is the best way to save this information?
I was thinking of a caching layer such as Redis, but because that is stored in RAM I will lose it when the Lambda server shuts down. MongoDB apparently has a lot of overhead when connecting, but it will at least be persistent.
Perhaps just a simple MySQL server?
How does everybody else do it? I feel like there is a simple solution that I am missing.
I will first answer the part about how I'm doing it: I'm using a MongoDB. I toyed with the ideas you mentioned, but quickly crossed out in-memory solutions (Memcached, Redis) with the same reason. My final solution came down to either a relational DB or a noSQL like MongoDB. To be honest, at my project's scale, I did not think about robustly comparing performance between DB types.
With my particular feature "roadmap," I decided to go with Mongo to approach a more "OOP" style when dealing with the user "object" without having to explicitly define a user class, thanks to Mongo's schemaless document structure. I understand the same could be done with MySQL, too; it's just that processing JSON data feels more "object-like" to me in Flask, i.e. user = getUserFromMongo gives me a dict in Python, and then I can just do user['first_name']. The code below shows how simple this is:
(Somehow this was feeling like... not having to write SQL commands for simple database interaction in Rails)
My user object data on MongoDB
Finally, as to how I manage user input, I adopted Wit.ai's concept of context. I don't know how they do it exactly, but a context to me is the type of conversation purpose that is going on. I use it like a stack, and as soon as the current context is done, I pop it off the user's context data. For every message the bot receives, the program gets the current context and directs the flow accordingly. Whenever an unknown error occurs (exception handling), most likely because the user is saying something the bot doesn't understand, I clear the context data, too.
The good part about MongoDB is that I can shape the context however I want and treat it just as an object. A simple one is like {name: yelp-search, stage:ask-for-user-location}, and I imagine complex ones could be built on that structure, too. Of course, a stack implementation of the context does not deal with complex conversation with complex past reference.
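To make the stack idea concrete, here is a small sketch in JavaScript (the question uses Node). The user object stands in for the document stored in MongoDB, and every function name here is hypothetical; only the context shape {name, stage} comes from the description above.

```javascript
// Hypothetical user record, shaped like the per-user document in the database.
function newUser(facebookId) {
  return { facebookId, contexts: [], answers: {} };
}

function pushContext(user, context) { user.contexts.push(context); }
function currentContext(user) { return user.contexts[user.contexts.length - 1] || null; }
function popContext(user) { return user.contexts.pop() || null; }
function clearContexts(user) { user.contexts.length = 0; }

// Route one incoming message according to the context on top of the stack.
function handleMessage(user, text) {
  const ctx = currentContext(user);
  if (!ctx) return 'no-context';
  if (ctx.name === 'yelp-search' && ctx.stage === 'ask-for-user-location') {
    user.answers.location = text; // save the answer for later
    popContext(user);             // this context is done
    return 'location-saved';
  }
  // Unknown context: clear everything, as described for the error case.
  clearContexts(user);
  return 'reset';
}
```

In a real bot you would load the user document at the start of each request and write it back after handleMessage returns.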
I put my project on Github if you want to take a look at it.
I have also used MySQL for a chatbot, with Node.js for the backend app. The mysql module is very helpful for that.
You need to store each user's current state in the question-and-answer session, as well as the answers themselves. Then use a switch (or an if/else-if chain) on the state, as in switch(state), to decide which question to ask next, and update the state in each case. The chatbot's event object contains the user's Facebook ID, so you can store each user's data individually, with their state and their question-answer pairs in separate tables.
For example, define the states {1, 2, 3}.
The user's state is 1 at the beginning, so ask only question 1 and store the reply as answer 1. You can do this by checking the state. After that, update the state to 2.
In this way you can ask each individual user questions according to their state and respond to them.
I've done it in exactly this manner.
Hope this is helpful to you.
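The state-per-user flow above could look roughly like this. The in-memory Map only stands in for the MySQL tables so the sketch runs on its own; on Lambda the state would have to live in a real database. The question texts and the name handleAnswer are made up for illustration.

```javascript
// Hypothetical questions, keyed by state number.
const QUESTIONS = {
  1: 'What is your name?',
  2: 'What is your email?',
  3: 'Which city are you in?'
};
const LAST_STATE = 3;

// Stand-in for the database tables, keyed by the user's Facebook id.
const sessions = new Map();

// Handle one incoming message and return the bot's next reply.
function handleAnswer(facebookId, text) {
  let session = sessions.get(facebookId);
  if (!session) {
    // First contact: create the session in state 1 and ask question 1.
    session = { state: 1, answers: {} };
    sessions.set(facebookId, session);
    return { reply: QUESTIONS[1], done: false };
  }
  // Store the reply as the answer to the question for the current state.
  session.answers[session.state] = text;
  if (session.state === LAST_STATE) {
    sessions.delete(facebookId);
    return { reply: 'Thanks!', done: true, answers: session.answers };
  }
  session.state += 1; // move on to the next question
  return { reply: QUESTIONS[session.state], done: false };
}
```

Once done is true, the collected answers can be passed to whatever function needs them.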

Multiple remote databases, single local database (fancy replication)

I have a PouchDB app that manages users.
Users have a local PouchDB instance that replicates with a single CouchDB database. Pretty simple.
This is where things get a bit complicated. I am introducing the concept of "groups" to my design. Groups will be different CouchDB databases but locally, they should be a part of the user database.
I was reading a bit about "fancy replication" in the pouchDB site and this seems to be the solution I am after.
Now, my question is, how do I do it? More specifically, How do I replicate from multiple remote databases into a single local one? Some code examples will be super.
From my diagram below, you will notice that I need to essentially add databases dynamically based on the groups the user is in. A critique of my design will also be appreciated.
Should the flow be something like this:
Retrieve all user docs from his/her DB into localUserDB
var groupDB = new PouchDB('remote-group-url');
groupDB.replicate.to(localUserDB);
(any performance issues with multiple pouchdb instances 0_0?)
Locally, when the user makes a change related to a specific group, we determine the corresponding database and replicate by doing something like:
localUserDB.replicate.to(groupDB) (Do I need filtered replication?)
Replicate from many remote databases to your local one:
remoteDB1.replicate.to(localDB);
remoteDB2.replicate.to(localDB);
remoteDB3.replicate.to(localDB);
// etc.
Then do a filtered replication from your local database to the remote database that is supposed to receive changes:
localDB.replicate.to(remoteDB1, {
  filter: function (doc) {
    return doc.shouldBeReplicated;
  }
});
Why filtered replication? Because your local database contains documents from many sources, and you don't want to replicate everything back to the one remote database.
Why a filter function? Since you are replicating from the local database, there's no performance gain from using design docs, views, etc. Just pass in a filter function; it's simpler. :)
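Putting the two steps together for a dynamic list of groups could look like the sketch below. It assumes every group document carries a hypothetical group field naming the group it belongs to, and that remotesByGroup maps group names to remote PouchDB instances or URLs; syncGroups is an illustrative name, not a PouchDB API.

```javascript
// Pull from every group database and push back only that group's documents.
// localDB: a PouchDB instance; remotesByGroup: { groupName: remoteDbOrUrl }.
function syncGroups(localDB, remotesByGroup) {
  const handles = [];
  for (const [group, remoteDB] of Object.entries(remotesByGroup)) {
    // Everything in the group database comes down to the local database.
    handles.push(localDB.replicate.from(remoteDB, { live: true, retry: true }));
    // Only documents tagged with this group go back up.
    handles.push(localDB.replicate.to(remoteDB, {
      live: true,
      retry: true,
      filter: (doc) => doc.group === group // "group" is an assumed field
    }));
  }
  return handles; // call cancel() on each handle to stop replicating
}
```

One caveat with this kind of filter: a deletion only passes it if the deleted revision still carries the group field, so deleting by writing _deleted: true alongside the existing fields is safer than a bare remove.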
Hope that helps!
Edit: okay, it sounds like the names of the groups that the user belongs to are actually included in the first database, which is what you mean by "iterate over." No, you probably shouldn't do this. :) You are trying to circumvent CouchDB's built-in authentication/privilege system.
Instead you should use CouchDB's built-in roles, apply those roles to the user, and then use a "database per role" scheme to ensure users only have access to their proper group DBs. Users can always query the _users API to see what roles they belong to. Simple!
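For the "database per role" scheme, the client side could derive the group databases from the user's roles along these lines. The "group-" role prefix, the convention that each role has a database of the same name, and the function name groupDatabasesFor are all assumptions for illustration; the roles array would come from wherever your app reads the user's roles.

```javascript
// Map a user's roles to the URLs of the group databases they may access.
// Assumption: group roles are named "group-<name>" and each one has a
// database with exactly the same name under remoteBase.
function groupDatabasesFor(roles, remoteBase) {
  return roles
    .filter((role) => role.startsWith('group-'))
    .map((role) => remoteBase + '/' + encodeURIComponent(role));
}
```

Access control itself still happens in CouchDB: each group database's security settings would list the matching role as a member, so a user without the role is refused even if they guess the URL.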
For more details, read the pouchdb-authentication README.

how to generate a unique id in node.js for users registering a webservice?

I am building a web service with node.js, and I am registering users in my service. I am using node.js + mongodb for my db, and once I create a new user I also want to create a unique id for them and send it back in the response, just like the big services such as Facebook that send you back a Facebook id. I do not want to send back the _id from mongodb, so how do I generate a unique id for every user in node? Also, is it better to do it this way or to just send back the mongo _id?
is it better to do it this way or just send back the mongo _id
If you have to ask, it is better to just send back the mongo _id. Unless you have iron-clad reasoning behind why that simple, straightforward, make-life-easy-for-everyone technique is problematic, by all means, just send back the mongo _id.
If you decide the mongo _id is not good enough for you (probably due to FUD as opposed to any realistic reasoning), you have the following extra challenges you are adopting without any benefit:
You have to think more carefully about your indexes
Helper library functions like findById don't work for you anymore
Now you have 2 huge, hard-to-eyeball IDs to deal with on every record
Helper libraries like mongoose are also going to be challenging to leverage
You will have to map back and forth between _id and mySuperAwesomeExtraneousId constantly during debugging for the entire lifetime of your app
K.I.S.S.
That said, you can always just use an additional mongo ObjectId as they are perfectly valid unique Ids:
const mongo = require('mongodb')
// ObjectId is the current export name; ObjectID is a deprecated alias
let mySuperAwesomeExtraneousId = new mongo.ObjectId()
Use coupon-code. It is simple to use and covers most use cases for generating unique ids.
https://github.com/appsattic/node-coupon-code
https://github.com/broofa/node-uuid
There are numerous good reasons for not sending back the _id created by Mongoose by default for example. For one, those IDs could easily be guessed by a rogue entity or hacker. And it's also never a good idea to expose your database ids anyway.
For an app that relies solely on a unique ID for a password-less account access, generating a highly unique hash id may be the best option. The perfect node module for that is hashids
You may also give crypto a try:
require('crypto').randomBytes(48, function(err, buf) {
  if (err) throw err;
  // 48 random bytes become a 96-character hex string
  var token = buf.toString('hex');
});
good luck!
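If you would rather not pull in a module at all, Node's built-in crypto can produce both kinds of id. This is only a sketch: newPublicUserId and newToken are hypothetical names, and crypto.randomUUID needs a reasonably recent Node version.

```javascript
const crypto = require('crypto');

// A random (version 4) UUID, suitable as a public user id.
function newPublicUserId() {
  return crypto.randomUUID();
}

// A random hex token; each byte becomes two hex characters.
function newToken(bytes = 24) {
  return crypto.randomBytes(bytes).toString('hex');
}
```

Store the generated value on the user document and put a unique index on it, so lookups by the public id stay fast and collisions are rejected by the database.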
Use http://mongoosejs.com as an abstraction layer to MongoDB. It will ensure that you always have the _id available. It also will manage many other things, such as validation, connections, etc.
Pros: It is simpler to use than raw drivers, while at the same time maintaining all the power of a raw MongoDB driver. You will be up and running in about 20 minutes after visiting the mongoose website. It's one of the few times you don't trade off power for simplicity. It is a thin abstraction layer, and you will have a net gain in your code from having this single, robust layer in place rather than trying to re-invent the wheel in every single part of your code that needs to access the DB. It supports every single MongoDB feature. It provides validation. It provides a simple interface for managing indexes. It automatically creates databases and collections. Built by the same guys that made ExpressJS, Socket.IO and Mocha for node.js. The list goes on.
Cons: Does not support multiple MongoDB connections. This is usually not a problem though because you will most likely use MongoDB's sharding features before you need to create multiple connections to multiple MongoDB clusters. It has a silly name.
We have been using Mongoose in production environments for quite some time. If you want to see it in action, look at ZingProject.com. It's entirely node.js + mongoose over MongoDB. It's lightning fast.

ACL best practices, store roles in user object, or separate table/collection?

I am using nodejs, and have been researching acl/authorization for the past week. I have found only a couple, but none seem to have all the features I require. The closest has been https://github.com/OptimalBits/node_acl, but I don't think it supports protecting resources by id (for example, if I wanted to allow user 12345 and only user 12345 to access user/12345/edit). Hence, I think I will have to make a custom acl solution for myself.
My question regarding this is, what are some pros and cons to storing roles (user, admin, moderator, etc.) under each user object, as opposed to creating another collection/table that maps each user with their authorization rules? node_acl uses a separate collection, whereas most of the other ones depend on the roles array in user objects.
By the way, I am using MongoDB at the moment. However, I have not yet researched the pros and cons of relational vs. non-relational databases for authentication, so let me know if your answer depends on that.
As I was typing this up, I thought of one thing. If I store roles in a separate collection, it is more portable. I would be able to swap out the acl system much more easily. (I think?)
The question here seems like it could be abstracted from "where should I store my roles" to "how should I store related information in Mongo (or NoSQL in general)". It's a relational vs. non-relational modeling issue.
Non-Relational
Using Node + Mongo, storing the roles on the user will make it really easy to determine if a user has access to the feature, given that you can just look in the 'roles' property. The trade off is that you have lots of duplicate information ('user_read' could be a role on every user account) and if you end up changing that property, you'll need to update it inside every user object.
You could store the roles in their own collection and then store the id for that entry in the Roles collection on your User model, but then you'll still need to fetch the actual record from the collection to display any of its information (though arguably this could be a rare occurrence).
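To compare the two layouts side by side, here is a small sketch using plain objects. The Map stands in for a separate roles collection, and all names (hasEmbeddedRole, hasMappedRole, canEditUser) are hypothetical. The rule that admins may also edit other users is an assumption added for illustration; the question only asks for the owner check.

```javascript
// Option 1: roles embedded on the user document.
const userWithRoles = { _id: '12345', name: 'alice', roles: ['user', 'moderator'] };

function hasEmbeddedRole(user, role) {
  return user.roles.includes(role); // one lookup, no extra query
}

// Option 2: roles kept in a separate collection, keyed by user id.
const userRoles = new Map([['12345', ['user', 'moderator']]]);

function hasMappedRole(userId, role) {
  return (userRoles.get(userId) || []).includes(role); // needs a second fetch
}

// Protecting a resource by id, e.g. user/12345/edit: the owner may edit,
// and (as an assumption) so may anyone with the admin role.
function canEditUser(requester, targetUserId) {
  return requester._id === targetUserId || hasEmbeddedRole(requester, 'admin');
}
```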
Relational
Storing these in a relational DB would be a more "traditional" approach in that you can establish the relationships between the tables (via FKs / join tables or what not). This can be a good solution, but then you no longer have the benefits of using a NoSQL database.
Summary
If the rest of your app is stored in Mongo and has to stay there (for performance or whatever constraint) then you are probably better off doing it all in Mongo. Most of the advice I've come across says don't mix & match data stores, e.g. use one or the other, but not both. That being said, I've done projects with both and it can get messy but sometimes the pros outweigh the cons.
I like @DavidWelch's answer, but I'd like to tackle the question from another perspective because the library mentioned gives the option to use a different data store entirely.
Storing roles in a separate data store:
(Pro) Can make the system more performant if you are using a faster data store. (More advantageous in distributed environments?)
(Con) You will have to ensure consistency between the two data stores.
General notes:
You can add roles/permissions such as 'blog\123' in acl. You can also give a user permissions based on verbs such as put, delete, get, etc.
I think it is easier to create a pluggable solution that does not depend on your storage implementation. Perhaps that is why acl does not store roles in the same collections you have.
If you choose to keep the roles in your own collection, consider adding them to a token (JWT). That way, you will not have to check your collection for every request that needs authorization.
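To show what carrying roles in a token means, here is a bare-bones HS256 JWT sketch using only Node's crypto module. It is for illustration: signToken and verifyToken are hypothetical names, the payload fields are assumptions, and it does not check expiry or the header's algorithm, so in a real app you would likely use a maintained JWT library instead.

```javascript
const crypto = require('crypto');

// base64url-encode a string (needs a Node version with 'base64url' support).
const b64url = (input) => Buffer.from(input).toString('base64url');

// HMAC-SHA256 signature over "header.body", base64url-encoded.
function sign(data, secret) {
  return crypto.createHmac('sha256', secret).update(data).digest().toString('base64url');
}

// Build a token whose payload carries the user's roles.
function signToken(payload, secret) {
  const header = b64url(JSON.stringify({ alg: 'HS256', typ: 'JWT' }));
  const body = b64url(JSON.stringify(payload));
  return header + '.' + body + '.' + sign(header + '.' + body, secret);
}

// Return the payload if the signature is valid, otherwise null.
function verifyToken(token, secret) {
  const [header, body, signature] = token.split('.');
  if (!header || !body || !signature) return null;
  const expected = Buffer.from(sign(header + '.' + body, secret));
  const actual = Buffer.from(signature);
  // timingSafeEqual requires equal lengths, so compare lengths first.
  if (actual.length !== expected.length || !crypto.timingSafeEqual(actual, expected)) {
    return null;
  }
  return JSON.parse(Buffer.from(body, 'base64url').toString('utf8'));
}
```

On each request you would call verifyToken and read roles from the returned payload instead of querying the roles collection. The trade-off is that a role change only takes effect once the user gets a new token.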
I hope that helped.
