Limit records synchronized in PouchDB/CouchDB

I have a news database that needs to work both online and offline. I'm using CouchDB (IBM Cloudant) with PouchDB to sync it with the app.
The problem is that the news items are relatively "heavy" because they contain photos, and I'm having sync problems because of the size of the docs. There is also no need to synchronize the entire news database; it would only fill the user's phone with unnecessary records.
I need to sync only some of the news, approximately five records. I wonder how I can do this in CouchDB or PouchDB.
I looked at the sync and filters documentation, but it doesn't answer my question about limiting the number of synced docs (or at least I didn't see whether it's possible).
I'm using a view to pull the news.

Since you're using a view to pull the news, you can use limit to cap the number of documents you fetch. You can also use since to determine when you need to fetch the next batch of documents (this will have to be executed periodically to check for new documents).
If you go down this route, and if your app doesn't need client -> server replication, then you could use something lighter than PouchDB to store the documents and other info on the client.
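A minimal sketch of that approach, assuming a remote view named news/by_date (the database URL and view name are illustrative; adjust them to your design doc). It pulls only the five newest docs through the view, then polls the changes feed with since to detect anything newer:

```js
const PouchDB = require('pouchdb');

const remote = new PouchDB('https://username.cloudant.com/news');
const local = new PouchDB('news');

let lastSeq = 0;

async function fetchLatest() {
  // Query the remote view, newest first, limited to five documents.
  const { rows } = await remote.query('news/by_date', {
    descending: true,
    limit: 5,
    include_docs: true
  });
  // new_edits: false stores the docs with their original revisions intact.
  await local.bulkDocs(rows.map(function (r) { return r.doc; }), { new_edits: false });
  lastSeq = (await remote.info()).update_seq;
}

async function checkForNew() {
  // Ask the remote changes feed whether anything happened since our last fetch.
  const changes = await remote.changes({ since: lastSeq, limit: 5 });
  if (changes.results.length > 0) {
    await fetchLatest();
  }
}

fetchLatest().then(function () {
  setInterval(checkForNew, 60 * 1000); // poll every minute
});
```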

Yes, you can. Use filtered replication for this: https://pouchdb.com/api.html#filtered-replication
Credits: Nolan Lawson (PouchDB core team).
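A minimal sketch of a filtered pull, assuming a filter function named news/featured already exists in a design doc on the remote database (the filter name and query_params are illustrative):

```js
const local = new PouchDB('news');
const remote = new PouchDB('https://username.cloudant.com/news');

local.replicate.from(remote, {
  filter: 'news/featured',          // runs server-side, so less data on the wire
  query_params: { category: 'top' } // available to the filter as req.query
});
```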

Related

Partial syncing in pouchdb / couchdb with a particular scenario

I have been reading docs and articles on PouchDB/CouchDB/Cloudant, but I am not able to put this simple architecture together in my head. I need help!
So there are many users on the app. Each user has a separate database (which I read is the usual approach in a Pouch/Couch/Cloudant setup).
Now let's just focus on a single user. This user already has some remote data on our server (CouchDB). He has 3 separate docs stored.
He accesses doc 1 and doc 2 from browser 1, and doc 2 and doc 3 from browser 2.
Content in both browsers must be in sync.
Should I be using the sync API of PouchDB? But as I read, it syncs the whole database. How can I use this API to sync only a subset of the central database? Is filtered replication the answer here?
Also, I don't want to push both docs in a single call; he can access docs as he needs them.
What is the correct approach to implement this logic with Pouch/Couch databases? If you can explain with a little code, that would be great. I just need the basic ideas.
Is this kind of problem easily solvable in the upcoming releases of CouchDB 2.0 and pouchdb-find?
Thanks a lot!
If you take a look at the PouchDB documentation, you should see the options.doc_ids parameter. It lets you set up replication over a fixed set of document IDs, which would solve the problem in your scenario.
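A minimal sketch (database names and doc IDs are illustrative):

```js
const local = new PouchDB('user_docs');
const remote = new PouchDB('https://example.com/userdb-a1b2c3');

// Browser 1 syncs doc 1 and doc 2; browser 2 would pass ['doc2', 'doc3'].
PouchDB.sync(local, remote, {
  doc_ids: ['doc1', 'doc2'],
  live: true,
  retry: true
});
```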

How to replicate only the user's documents

I need to sync some documents from a Cloudant server to my iOS app, written in Swift.
For that I use this official library:
https://github.com/cloudant/CDTDatastore#overview
I need to understand how to replicate only the user's documents.
I need to figure out the correct approach.
Imagine a ticket-assistance system for a company.
All users can create tickets, which are saved on the Cloudant/CouchDB server.
When a user is on a mobile platform, I would like to synchronize only his tickets.
How can I do it?
Thanks to all.
CDTDatastore is designed to sync the whole database, and Cloudant/CouchDB doesn't provide per-document ACLs. In order to sync only a specific user's data, you either need to use a filter function, which will significantly hurt replication performance, or use the one-database-per-user model.
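A minimal sketch of the filter-function route (the design doc name and the "owner" field are illustrative). CouchDB filter functions are written in JavaScript and run on the server, and a pull replication can then reference the filter by name:

```js
const designDoc = {
  _id: '_design/tickets',
  filters: {
    // Runs on the server for every changed doc during a filtered replication.
    by_owner: "function (doc, req) { return doc.owner === req.query.owner; }"
  }
};

// PUT this design doc to the remote database once, then replicate with the
// filter 'tickets/by_owner' and query params such as { owner: 'user42' }.
```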

Per document user access control for PouchDB / CouchDB

I wish to use PouchDB and CouchDB to save user data for my web application, but I cannot find a way to control access on a per-user basis. My DB would simply consist of documents keyed by user ID. I know there are some solutions:
One database per user - however, it requires monitoring for whenever a new user wants to save data in order to create a new DB, and it may create a lot of DBs;
A proxy between the client and CouchDB - however, I don't want PouchDB to sync changes for the whole DB, including other users' documents, in the _all_docs and _revs_diff requests.
Are there any suggestions for per-user access control in PouchDB for a user base of around 1 million (with only around 10 thousand active users)?
The topic of a million or more databases has come up on the mailing list in the past. The conclusion was that it depends on how your operating system deals with that many files. CouchDB is just accessing parts of the .couch file when requested. Performance is related to how quickly it can find, open, access, and close that file.
There are tricks for some file systems, like putting / delimiters in the database name, which will cause CouchDB to store the files in matching directory structures such as groupA/userA.couch, or using email-style database names like com/bigbluehat/byoung.couch.
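For example (server URL and credentials are illustrative), a database name containing / must be URL-encoded as %2F when the database is created over HTTP:

```js
const name = encodeURIComponent('groupA/userA'); // -> 'groupA%2FuserA'

await fetch('https://couch.example.com/' + name, {
  method: 'PUT',
  headers: { Authorization: 'Basic ' + btoa('admin:password') }
});
```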
If that's not sufficient, Apache CouchDB 2.0 brings in the BigCouch code (which IBM Cloudant uses) to provide a fully auto-sharded CouchDB. It's not done yet, but it will provide scalability across multiple nodes using an Amazon Dynamo-style sharding system.
Another option is to do your own username-based partitioning between multiple CouchDB servers or use IBM Cloudant (which is built for this level of scale).
All these options provide the same Apache CouchDB replication protocol and will work just fine with PouchDB sitting on the user's computer, phone, or tablet.
The user's device would then have their own database, plus any shared databases. The apps on those million user devices would only need to worry about the scalability of their own content (i.e., hard drive space). The app would replicate directly to the "cloud"-side user database for backup, web use, etc.
Hopefully something in there sounds promising. :)

Sync views between pouchdb and couchdb

I've been able to sync data from my Cloudant instance to my Node.js-based PouchDB. However, I need to set up a secondary search index, so I created a view on the CouchDB instance, but I am unable to see it in my synced PouchDB instance.
I see it in Cloudant, in all documents; however, after syncing and calling allDocs on PouchDB, it's not there. Also, I'm using the pouchdb-find plugin and I can't reference the secondary index's search fields. Of course, if I set up the secondary index from PouchDB itself, it works fine.
Am I missing something? Does sync not replicate design docs in PouchDB? If not, what's the best way to create a persistent secondary index?
Are there any good docs for this? (Nolan...?) Speaking of docs and support, is there an IRC room or some other live support for CouchDB from the user community?
Thanks for your attention,
Paul
pouchdb-find is a reimplementation of Cloudant Query Language, not their search index (which is what I think you're talking about). It's also not done; I've only written about half of the operators. :) You may also want to try the pouchdb-quick-search plugin, which is for full-text search.
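A minimal sketch of pouchdb-quick-search usage (the database and field names are illustrative):

```js
PouchDB.plugin(require('pouchdb-quick-search'));

const db = new PouchDB('news');

// Full-text search over the given fields, returning the matching docs.
const results = await db.search({
  query: 'couchdb replication',
  fields: ['title', 'body'],
  include_docs: true
});
```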
In general, the advice I usually give people is not to sync design documents at all: just replicate using a filter that skips design docs. Then you can create design documents that are optimized for whatever platform you happen to be on (PouchDB, CouchDB, Cloudant, the various PouchDB plugins, etc.).
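A minimal sketch of such a filter. Passed as a function like this it runs client-side; an equivalent filter in a server-side design doc would save more bandwidth:

```js
local.replicate.from(remote, {
  filter: function (doc) {
    // Skip design documents; everything else replicates normally.
    return doc._id.indexOf('_design/') !== 0;
  }
});
```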
And yeah, we are usually pretty responsive inside of the IRC channel and on the mailing list, but it's a small operation because we aren't sponsored by Cloudant or Couchbase or anybody. The core PouchDB team are all hobbyists. :)
Maybe this is a stupid question, but does the user accessing CouchDB have the admin role? Only admins can see and edit design documents.

Architecture for Redis cache & Mongo for persistence

The Setup:
Imagine a 'twitter like' service where a user submits a post, which is then read by many (hundreds, thousands, or more) users.
My question is regarding the best way to architect the cache and database to optimize for quick access and many reads, while still keeping the historical data so that users may (if they want) see older posts. The assumption is that 90% of users would only be interested in the new stuff, and that the old stuff will be accessed only occasionally. The other assumption is that we want to optimize for the 90%, and it's OK if the older 10% take a little longer to retrieve.
With this in mind, my research seems to strongly point toward using a cache for the 90%, and also storing the posts in another, longer-term persistent system. So my idea so far is to use Redis for the cache. The advantage is that Redis is very fast, and it has built-in pub/sub, which would be perfect for publishing posts to many people. I was then considering using MongoDB as a more permanent data store for the same posts, which will be accessed as they expire off of Redis.
Questions:
1. Does this architecture hold water? Is there a better way to do this?
2. Regarding the mechanism for storing posts in both Redis and MongoDB, I was thinking about having the app do two writes: first, write to Redis, so the post is immediately available to subscribers; second, after successfully storing it in Redis, immediately write it to MongoDB. Is this the best way to do it? Should I instead have Redis push the expired posts to MongoDB itself? I thought about this, but I couldn't find much information on pushing to MongoDB from Redis directly.
It is actually sensible to associate Redis and MongoDB: they are good team players. You will find more information here:
MongoDB with redis
One critical point is the resiliency level you need. Both Redis and MongoDB can be configured to achieve an acceptable level of resiliency, and these considerations should be discussed at design time. It may also put constraints on the deployment options: if you want master/slave replication for both Redis and MongoDB, you need at least four boxes (Redis and MongoDB should not be deployed on the same machine).
Now, it may be a bit simpler to keep Redis for queuing, pub/sub, etc., and store the user data in MongoDB only. The rationale is that you do not have to design similar data access paths (the difficult part of this job) for two stores featuring different paradigms. Also, MongoDB has built-in horizontal scalability (replica sets, auto-sharding, etc.), while Redis offers only do-it-yourself scalability.
Regarding the second question, writing to both stores would be the easiest way to do it. There is no built-in feature to replicate Redis activity to MongoDB. Designing a daemon that listens to a Redis queue (where activity would be posted) and writes to MongoDB is not that hard, though.
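A minimal sketch of such a daemon in Node.js (the queue name, connection strings, and post schema are all illustrative). It blocks on a Redis list and copies each post into MongoDB for long-term storage:

```js
const { createClient } = require('redis');
const { MongoClient } = require('mongodb');

async function run() {
  const redis = createClient({ url: 'redis://localhost:6379' });
  const mongo = new MongoClient('mongodb://localhost:27017');
  await Promise.all([redis.connect(), mongo.connect()]);
  const posts = mongo.db('app').collection('posts');

  for (;;) {
    // BRPOP blocks until the app LPUSHes a post onto the "posts:archive" list.
    const item = await redis.brPop('posts:archive', 0);
    await posts.insertOne(JSON.parse(item.element));
  }
}

run().catch(console.error);
```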
