I am currently learning about CouchDB and PouchDB. Is it correct to assume that from CouchDB's perspective PouchDB is a normal CouchDB offline-client (i.e. it follows CouchDB's proprietary replication protocols), albeit one that is implemented in JavaScript instead of Erlang?
Small correction: CouchDB's protocol is by no means proprietary (see replication.io for the spec) and has several independent implementations - CouchDB, PouchDB, Couchbase Sync Gateway, Cloudant, rcouch, Couchbase Mobile, etc.
Otherwise yes, PouchDB is just another CouchDB. In fact PouchDB Server is functionally the same as CouchDB 1.6 in every way, down to the HTTP interface, the Fauxton UI, etc.
From CouchDB's perspective there is no difference between replicating with PouchDB or replicating with another CouchDB instance. PouchDB follows the standard replication protocols, and in fact is tested against the same test suite that CouchDB uses.
Before jumping into development, I'd like to get feedback on a change I'm thinking of making, moving from mongo to couch.
Basically I've got a webapp which is used to help organize users' activities (todo list, calendar, notes, journal). It currently uses MongoDB, but I'm thinking of moving it to CouchDB, mainly because of Couch's replication ability and client-side database interaction (PouchDB). I have a similar homegrown setup in the browser using localStorage, backed by Mongo, but am looking for a more mature solution.
Due to how couchdb differs from mongodb, I'm thinking that each user should have their own couch db, and their documents being each of my app components. Basically I have to move everything up a level with couch due to local db replication, and due to security.
I have 3 questions.
1) I assume that couch does not have document level security/authentication, correct? (Hence me moving each user assets to their own database, good idea?)
2) My plan is have users login to the website, then my backend nodejs code authenticates them, and sends them down some auth/session token. The javascript on the client then uses its local pouchdb data to set itself up, and also sends the replication request directly to the couchdb server (using the auth token it got from my server-side process). They should only have access to their database, since I can do per database auth access (correct?)
What do you think of that setup? It should work?
3) Regarding couchdb service providers, why do they vary so much on their couch version? IE, happycouch, 1.6.1, iris 1.5, cloudant, 1.0.2? And I also hear about couchdb 2.0 coming out soon... I'd like to use cloudant, but 1.0.2 is so many versions back from a 1.6 or 1.5, if I'm not doing anything exotic, does it matter?
Bonus question :p Continuing from the last question, do you know of any services that host node.js and have local instances of couchdb available? I'd like to use my backend server code as a proxy, but not at the expense of another network hop.
Thank you very much for your feedback,
Paul
Due to how couchdb differs from mongodb, I'm thinking that each user should have their own couch db
This is a CouchDB best practice. Good choice.
I assume that couch does not have document level security/authentication, correct? (Hence me moving each user assets to their own database, good idea?)
You are correct: https://github.com/nolanlawson/pouchdb-authentication
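To make the per-user-database idea concrete, here is a minimal sketch, assuming one database per user and a `_security` document that lists only that user as a member. The `userdb-<hex>` naming mirrors the convention of CouchDB's couch_peruser feature; the helper names are hypothetical and only build the values, they do not talk to a server.

```python
import json


def user_db_name(username: str) -> str:
    """Return a database name that is valid for any username.

    Hex-encoding sidesteps CouchDB's restrictions on database names
    (lowercase letters, digits and a few symbols only).
    """
    return "userdb-" + username.encode("utf-8").hex()


def security_doc(username: str) -> dict:
    """Build a body for PUT /{db}/_security that admits only `username`."""
    return {
        "admins": {"names": [], "roles": []},
        "members": {"names": [username], "roles": []},
    }


if __name__ == "__main__":
    print(user_db_name("paul"))
    print(json.dumps(security_doc("paul")))
```

Because access is granted per database, a user who is not listed as a member (and holds no listed role) cannot read or replicate that database.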
My plan is have users login to the website...
Yep. You can just pass the cookie headers straight through from Node.js to CouchDB, and it'll work fine. nano has some docs on how to do that: https://github.com/dscape/nano#using-cookie-authentication
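As a rough illustration of the cookie pass-through (sketched in Python rather than Node.js, with hypothetical helper names): CouchDB's cookie authentication returns an `AuthSession` cookie from `POST /_session`, and the backend only needs to pull that value out and forward it on later requests.

```python
from http.cookies import SimpleCookie
from typing import Optional


def extract_auth_session(set_cookie_header: str) -> Optional[str]:
    """Return the AuthSession token from a Set-Cookie header, if present."""
    cookie = SimpleCookie()
    cookie.load(set_cookie_header)
    morsel = cookie.get("AuthSession")
    return morsel.value if morsel is not None else None


def forward_headers(token: str) -> dict:
    """Headers to attach to a request that is proxied on to CouchDB."""
    return {"Cookie": f"AuthSession={token}"}


if __name__ == "__main__":
    # Example header value; the token here is a placeholder, not a real one.
    header = "AuthSession=abc123; Version=1; Path=/; HttpOnly"
    token = extract_auth_session(header)
    print(forward_headers(token))
```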
Regarding couchdb service providers, why do they vary so much on their couch version
The Couch community is one big happy fragmented family. :)
I'd like to use cloudant, but 1.0.2 is so many versions back from a 1.6 or 1.5, if I'm not doing anything exotic, does it matter?
1.0.2 refers to when Cloudant forked CouchDB. They've added so many of their own features since then, that they're pretty much feature-equivalent by now.
The biggest difference between the various Couch implementations is in authentication. Everybody (Cloudant, CouchDB, Couchbase) does it differently.
I've written an application with a CouchDB backend. I have invested a lot of time into CouchDB and so I'm reluctant to move everything over to a different NoSQL database (like Redis).
The problem is that I now need to implement a rate limiting (based on IP address) feature.
There are plenty of examples on how good Redis is for this kind of task, however because I don't want to drop CouchDB for other tasks this means I would essentially be running (and supporting) two databases (1 for most data, 1 for rate limiting) and so...
Is running CouchDB in tandem with Redis unheard of?
Is CouchDB itself suitable for handling rate limiting itself?
Is running CouchDB in tandem with Redis unheard of?
Redis is commonly used alongside other storage solutions (MySQL, PostgreSQL, MongoDB, CouchDB, etc.). Like many other NoSQL solutions, Redis is not suited to every kind of workload or situation. The authors of Redis are pragmatic and open people, and they routinely suggest using other solutions rather than Redis when those are a better fit for the situation.
Redis is therefore a good team player, and it is generally easy to integrate in an existing infrastructure.
Here is an example of usage of Redis with CouchDB.
Is CouchDB itself suitable for handling rate limiting itself?
CouchDB has a number of useful features to implement the rate limiting strategy described in Chris O'Hara's article. For instance, it supports bulk operations on several documents (with optional atomicity). A "bucket span" can be stored in a single document. In-place incrementation of counters can be covered by using update handlers.
IMO, the main missing feature would be automatic item expiration (which CouchDB does not provide AFAIK). So you would have to design a clever mechanism to get rid of obsolete data on top of CouchDB.
The main problem is CouchDB is not really designed for this kind of workload: it is a log structured document oriented database. Each time a counter has to be incremented, it would involve JSON unpacking/packing operations, some Javascript code to be executed, and writing a new revision of the whole document in append only files. You can find a good article describing how CouchDB stores its data here.
I suspect a rate limiting strategy implemented on top of CouchDB would not scale very well (too many I/Os, too much CPU consumption, inefficient network protocol). For instance, CouchDB is a RESTful server; I would not feel comfortable initiating client HTTP operations (REST queries to CouchDB) to rate limit each incoming HTTP query of my system.
Redis is much better suited to this kind of workload (fast, in-memory, no I/O, efficient client protocol, no JSON parsing/formatting, increments are native atomic operations, etc.).
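A minimal sketch of what such a counter-based limiter can look like, assuming a simple fixed window per IP address (a simplification, not the bucket scheme from the article mentioned above). The `store` argument is assumed to expose `incr` and `expire` the way a redis-py client does; `InMemoryCounterStore` is a hypothetical stand-in so the example runs without a server.

```python
import time


class InMemoryCounterStore:
    """Stand-in exposing the two calls the limiter needs: incr and expire."""

    def __init__(self):
        self._values = {}

    def incr(self, key):
        # Mirrors an atomic increment that creates the key at 1 if missing.
        self._values[key] = self._values.get(key, 0) + 1
        return self._values[key]

    def expire(self, key, seconds):
        # A real store would drop the key after `seconds`; omitted here.
        return True


def allow_request(store, ip, limit=100, window_seconds=60, now=None):
    """Return True if `ip` is still under `limit` in the current window."""
    now = time.time() if now is None else now
    window = int(now // window_seconds)
    key = f"ratelimit:{ip}:{window}"
    count = store.incr(key)
    if count == 1:
        # First hit in this window: let the store clean the key up later.
        store.expire(key, window_seconds * 2)
    return count <= limit


if __name__ == "__main__":
    store = InMemoryCounterStore()
    print([allow_request(store, "203.0.113.7", limit=3, now=0) for _ in range(4)])
```

The expiry call is what replaces the "clever mechanism to get rid of obsolete data" that a CouchDB-only design would need.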
You can do rate limiting with Memcached - it has a nice counter increment command as you mention, plus obsolete data is automatically purged from the cache in due course, so it has all the benefits of Redis for this application without the annoying duplication of capability (and complexity) that running Redis on top of CouchDB would bring.
http://simonwillison.net/2009/jan/7/ratelimitcache/
You could add memcached to your own setup easily enough or you could investigate CouchBase whose current server product integrates a CouchDB derived database with Memcached compatibility baked in:
http://www.couchbase.com/memcached
Personally I dislike the way Couchbase forked from CouchDB, but for your application it might be a perfect fit.
I am looking at CouchDB and I saw Ektorp, which presents a JPA-like interface to the database. However, I see that there are examples of how to write queries in JavaScript. I don't understand how the system works.
Do I query a database from web tier without a middle tier? How can security be done with that?
CouchDB uses JavaScript to define map and reduce functions for its views. Ektorp is simply providing you a convenient way to create those functions that will be used by CouchDB. You might want to read the CouchDB wiki page on views:
http://wiki.apache.org/couchdb/Introduction_to_CouchDB_views
Just because the views are javascript, does not imply that you have to create the views from a 'web tier'.
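To illustrate that point, here is a sketch (in Python rather than Java, with made-up document fields `type`, `owner` and `title`) of server-side code that builds a design document and the URL used to query its view. The JavaScript map function is just a string stored inside the design document; nothing here has to run in a browser.

```python
import json
from urllib.parse import quote, urlencode


def todo_view_design_doc() -> dict:
    """Design document holding one view whose map function is JavaScript."""
    map_fn = (
        "function (doc) {\n"
        "  if (doc.type === 'todo') {\n"
        "    emit(doc.owner, doc.title);\n"
        "  }\n"
        "}"
    )
    return {
        "_id": "_design/todos",
        "views": {"by_owner": {"map": map_fn}},
    }


def view_url(base_url: str, db: str, owner: str) -> str:
    """URL for GET /{db}/_design/todos/_view/by_owner?key="<owner>"."""
    # View keys are JSON values, so the string is JSON-encoded first.
    query = urlencode({"key": json.dumps(owner)})
    return f"{base_url}/{quote(db, safe='')}/_design/todos/_view/by_owner?{query}"


if __name__ == "__main__":
    print(json.dumps(todo_view_design_doc(), indent=2))
    print(view_url("http://localhost:5984", "app", "alice"))
```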
In terms of architecture, you have a couple of options. You can use a traditional three tier approach with a java front end, and in your middle tier call couchdb with ektorp. Then you are in full control of security.
You can also go with what is coming to be known as the 2.1 tier model, where users interact directly with CouchDB, mainly with a couchapp. You can then provide support services that listen to the changes feed. I have done this with Ektorp and it works very well. Others have used Node.js. It is a different way of thinking, but it can work. You can read a fun post about this model here:
http://markmail.org/thread/cfw7f3ef75aoqzin
Anyway, I just wanted to provide you with possible options in how you 'tier' your architecture.
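For the "support services that listen to the changes feed" part, a small sketch of the parsing side, assuming the continuous feed format (`GET /{db}/_changes?feed=continuous`), where each line is one JSON object and empty lines are heartbeats. The function name is hypothetical, and the network connection is left out so that the example stays self-contained.

```python
import json
from typing import Iterable, Iterator


def iter_changes(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per document change from a continuous changes feed."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # heartbeat line that keeps the connection alive
        row = json.loads(line)
        if "id" not in row:
            continue  # e.g. a trailing {"last_seq": ...} row
        yield row


if __name__ == "__main__":
    sample = [
        '{"seq":1,"id":"note-1","changes":[{"rev":"1-abc"}]}',
        "",
        '{"seq":2,"id":"note-2","changes":[{"rev":"2-def"}],"deleted":true}',
    ]
    for change in iter_changes(sample):
        print(change["id"], change.get("deleted", False))
```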
It is already the fourth day since I started diving into CouchDB, specifically Membase (Couchbase). Membase seems like a really interesting technology to me because of its simplicity of administration; the interface is as magical as it is informal and simple. The way you add/remove buckets is just fun.
Unfortunately I didn't manage to launch their .NET client on Mac OS X (on Windows it worked fine), and I also couldn't find a way to perform Map/Reduce queries, so it seemed that Membase Server is a little simpler than pure CouchDB. Anyway, everything changed when I recently stumbled upon the diagram that describes their technology:
The Image is explained here
It seems that "Couchbase server (Currently Membase Server)" plays role of some sort of Master database which isn't accessed directly, and also there is "Couchbase Single Server" which plays the role of client database which has all the features of CouchDB (such as Map/Reduce queries)
If so, then how is "CouchSync" performed? Is it possible to perform this "CouchSync" from code?
Before I describe to you how CouchSync works I think it would be beneficial for me to describe how the Couchbase product history evolved. This will make things more clear. About a year ago, Membase Server was first released. The idea behind Membase Server was to provide memcached with persistence (the persistence layer was sqlite) and simple to use clustering technology. Then about 6 months ago the companies Membase and CouchOne merged to form Couchbase. Directly after the merger Couchbase continued to provide Membase Server, but now also provided Couchbase Single Server. Couchbase Single Server is essentially CouchDB with GeoCouch packaged in by default along with many major performance improvements. On July 29th, 2011 Couchbase announced a developer preview of the first version of Couchbase Server. Couchbase Server is the combination of Couchbase Single Server and Membase. Basically what Couchbase did was replace sqlite with CouchDB as the persistence engine. So this basically caused the product to go from being a key-value store to a document store database.
So what is CouchSync?
CouchSync is basically what Couchbase is calling CouchDB replication. It is very simple to set up in Couchbase Server, Couchbase Single Server and CouchDB. All it is is a changes feed that is streamed from one server to another.
A note on using Membase. Since Membase doesn't provide any of the CouchDB support it doesn't actually fit into this diagram and therefore doesn't support CouchSync. You will actually want to look at the developer preview of Couchbase Server since this product has both Membase and CouchDB features. In the meantime if you are looking for something more stable to test then take a look at Couchbase Single Server as it will be able to give you a feel for some of the features (like CouchSync) that are in Couchbase Server.
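For plain CouchDB at least, a replication is described by a small JSON body posted to the server's `/_replicate` endpoint. The sketch below only builds that request (the server URL and database names are placeholders, and the helper name is hypothetical); whether the Couchbase products of the time accepted the same call is not something this example assumes.

```python
import json
import urllib.request


def replication_request(server_url, source, target, continuous=False):
    """Build a POST /_replicate request asking `server_url` to replicate."""
    body = {"source": source, "target": target}
    if continuous:
        # Keep following the source's changes feed instead of stopping
        # once the target has caught up.
        body["continuous"] = True
    return urllib.request.Request(
        url=f"{server_url}/_replicate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = replication_request(
        "http://localhost:5984", "notes", "http://example.com/notes", continuous=True
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Sending the request would be a matter of passing it to `urllib.request.urlopen`, with whatever credentials the server requires.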
Also, the point of this diagram is to show that you can do CouchSync across the entire Couchbase product line. You don't need to go through Couchbase Single Server to do CouchSync to Couchbase Mobile. You can do CouchSync directly from Couchbase Server.
Is it possible to perform CouchSync from code?
No. It's easier than that. You set it up in the web ui.
Hope that helps.
[EDIT]:
This diagram is now outdated. Couchbase the company no longer supports Couchbase Single Server (which is its version of CouchDB). The CouchSync feature will now sync directly with Couchbase Server.
I'm getting more into Node.js and am enjoying it. I'm moving more into web application development.
I have wrapped my head around Node.js and am currently using Backbone for the front end. I'm making a few applications that use Backbone to communicate with the server using a RESTful API. In Node.js, I will be using the Express framework.
I'm reaching a point where I need a simple database on the server. I'm used to PostgreSQL and MySQL with Django, but what I'm needing here is some simple data storage etc. I know about CouchDB, MongoDB and Redis, but I'm just not sure which one to use?
Is any one of them better suited for Node.js? Is any one of them better for beginners, moving from relational databases? I'm just needing some guidance on which to choose, I've come this far, but when it's coming to these sort of databases, I'm just not sure...
Is any one of them better suited for Node JS?
Better suited especially for node.js probably no, but each of them is better suited for certain scenarios based on your application needs or use cases.
Redis is an advanced key-value store and probably the fastest one among the three NoSQL solutions. Besides basic key data manipulation it supports rich data structures such as lists, sets, hashes or pub/sub functionality which can be really handy, namely in statistics or other real-time madness. It however lacks some sort of querying language.
CouchDB is a document-oriented store which is very durable and offers MVCC, a REST interface, a great replication system and map-reduce querying. It can be used for a wide range of scenarios and can substitute for your RDBMS; however, if you are used to ad hoc SQL queries then you may have certain problems with its map-reduce views.
MongoDB is also a document-oriented store like CouchDB, and it supports ad hoc querying besides map-reduce, which is probably one of the crucial reasons why people searching for an RDBMS substitute choose MongoDB over the other NoSQL solutions.
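To show the difference between ad hoc queries and map-reduce views, here is a toy illustration in plain Python with no database involved; the documents and function names are made up. An ad hoc query filters on whatever criteria are given at query time, while a view is defined up front by a map function (and an optional reduce) and is then looked up by key.

```python
from collections import defaultdict

DOCS = [
    {"type": "todo", "owner": "alice", "done": False},
    {"type": "todo", "owner": "alice", "done": True},
    {"type": "todo", "owner": "bob", "done": False},
]


def ad_hoc(docs, **criteria):
    """Filter on arbitrary fields chosen at query time."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


def build_view(docs, map_fn, reduce_fn):
    """Run map_fn over every document, group by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in grouped.items()}


def open_todos_by_owner(doc):
    """Map function: emit (owner, 1) for every unfinished todo."""
    if doc["type"] == "todo" and not doc["done"]:
        yield doc["owner"], 1


if __name__ == "__main__":
    print(ad_hoc(DOCS, owner="alice", done=False))
    print(build_view(DOCS, open_todos_by_owner, sum))
```

The view has to be designed before the question is asked, which is the adjustment that people coming from SQL tend to find awkward.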
Is any one of them better for beginners, moving from relational databases?
Since you are coming from the RDBMS world and are probably used to SQL, I think you should go with MongoDB because, unlike Redis or CouchDB, it supports ad hoc queries and its querying mechanism is similar to SQL. However, depending on your application scenarios, there may be areas where Redis or CouchDB is better suited to do the job.