Does CouchDB have a "bulk get all revisions" feature? - couchdb

I'm using CouchDB with PouchDB and have noticed that remote-to-remote replication (or replication to PouchDB) issues a lot of requests like
/db/doc?revs=true&open_revs=all&attachments=true&_nonce=...
Do any of CouchDB's bulk APIs fetch the revs and open_revs (revs=true&open_revs=all) of more than one document at a time?

I saw your issue on GitHub as well. This is really something that would be better asked on the CouchDB mailing list or in #couchdb on IRC.
If you do all_docs with keys, you can actually get the most recent revision information even for deleted documents, but for more than one revision, I don't think so.
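For illustration, a rough sketch of that all_docs-with-keys approach might look like this (the database URL and ids are placeholders):

```typescript
// Sketch: fetch the latest revision info for several documents in one request
// by POSTing a list of keys to _all_docs. URL and ids are placeholders.
async function latestRevs(docIds: string[]): Promise<void> {
  const res = await fetch('http://localhost:5984/mydb/_all_docs', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ keys: docIds }),
  });
  const body = await res.json();
  for (const row of body.rows) {
    // Deleted docs still return their winning rev, plus value.deleted: true.
    console.log(row.key, row.value?.rev, row.value?.deleted ?? false);
  }
}

// latestRevs(['doc1', 'doc2', 'doc3']);
```

That gets you one revision per document; it won't give you the full revision tree the way revs=true&open_revs=all does.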
If what you're really asking is whether we've gotten replication in PouchDB to go about as fast as it can go given the current CouchDB replication protocol, I think the answer is yes. :)

Related

CouchDB replication ignoring sporadic documents

I've got a CouchDB setup (CouchDB 2.1.1) for my app, which relies heavily on replication integrity. We are using the "one db per user" approach, with an additional layer of "role" DBs that group users.
Recently, while increasing the number of beta testers, we discovered that some documents had not been replicated as they should have been. We are unable to see any pattern in document size, creation/update time, user, or anything else. The errors seem to happen sporadically, with 2-3 successfully replicated docs followed by 4-6 non-replicated docs.
The server responds with {"error":"not_found","reason":"missing"} on those docs.
Most (but not all) of the user documents have been replicated to the corresponding Role DB, but very few made it all the way to the Master DB. This never happened when testing with < 100 documents (now we're at 1000-1200 docs in the db).
I discovered a problem with the "max open files" setting mentioned in the Performance chapter in the docs and fixed it, but the non-replicated documents are still not replicating. If I open a document and save it, it will replicate.
This is my current theory:
The replication process tried to copy new documents when the user went online
The write process failed because Linux's "max open files" limit was hit
The master DB still thinks the replication was successful
At a later replication, the master DB ignores those old documents and only tries to replicate new ones
Could this be correct? And can I somehow make the CouchDB server "double check" all documents and the integrity of previous replications?
Thank you for your time and any helpful comments!
I have experienced something similar in the past: when attempting to replicate documents without sufficient permissions, the replication fails, as it should. But once the permissions issue is fixed, the documents you attempted to replicate still cannot be replicated, although editing and saving the documents fixes the issue. I wonder if this is due to checkpoints? The CouchDB manual says this about the "use_checkpoints" flag:
Disabling checkpoints is not recommended as CouchDB will scan the source database’s changes feed from the beginning.
Scanning from the beginning sounds like it might actually fix the problem, though, so perhaps disabling checkpoints could help. I never got back to that issue at the time, so I am afraid this is not a proper answer, just a suggestion.
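If you want to experiment with it, a rough sketch of a one-off replication with checkpoints disabled might look like the following (database names and the server URL are placeholders):

```typescript
// Sketch: trigger a one-off replication with checkpoints disabled, so CouchDB
// re-reads the source changes feed from the beginning. Names are placeholders.
async function replicateWithoutCheckpoints(): Promise<void> {
  const res = await fetch('http://localhost:5984/_replicate', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      source: 'http://localhost:5984/user_db',
      target: 'http://localhost:5984/role_db',
      use_checkpoints: false,
    }),
  });
  // For a non-continuous replication this resolves when the run has finished.
  console.log(await res.json());
}
```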

Is it possible to get the latest seq number of PouchDB?

I am trying to cover for an issue where CouchDB is rolled back, causing PouchDB to be in the future. I want to find a way to detect this situation and force PouchDB to destroy and reload when this happens.
Is there a way to ask PouchDB for its current pull seq number? I am not able to find any documentation at all on this. My google-fu is not strong enough.
So far my only thought is to watch the sync.on('change') feed and record the seq number on every pull. Then on app reload, make an AJAX request to https://server/db/_changes?descending=true&limit=1 and verify that the seq number it returns is higher than the seq number I stored. If the stored seq is higher, then call pouchdb.destroy(), purge _pouch_ from IndexedDB, and probably figure out how to delete the WebSQL versions for this release: https://github.com/pouchdb/pouchdb/releases/tag/6.4.2.
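For what it's worth, a rough sketch of that idea might look like this (names are placeholders; I'm assuming the sync 'change' event exposes a last_seq on info.change, and that CouchDB 2.x seq tokens are opaque strings with a numeric prefix, so only that prefix is compared):

```typescript
// Sketch only: record the pull seq on every change, then compare it against
// the remote _changes feed on reload. URLs and keys are placeholders.
import PouchDB from 'pouchdb';

const local = new PouchDB('mydb');
const remote = new PouchDB('https://server/db');

// CouchDB 2.x sequences look like "123-abc..."; compare only the numeric prefix.
const seqPrefix = (seq: string | number): number =>
  typeof seq === 'number' ? seq : parseInt(String(seq).split('-')[0], 10);

const sync = local.sync(remote, { live: true, retry: true });
sync.on('change', (info: any) => {
  if (info.direction === 'pull' && info.change && info.change.last_seq != null) {
    localStorage.setItem('lastPullSeq', String(info.change.last_seq));
  }
});

// On app start: if the stored seq is ahead of the server, assume CouchDB was
// rolled back and the local database needs to be destroyed and re-synced.
async function remoteRolledBack(): Promise<boolean> {
  const stored = localStorage.getItem('lastPullSeq');
  if (stored == null) return false;
  const res = await fetch('https://server/db/_changes?descending=true&limit=1');
  const body = await res.json();
  return seqPrefix(body.last_seq) < seqPrefix(stored);
}
```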
Or is there a better way to solve situations where PouchDB ends up in the future ahead of CouchDB?
The problem seems to be in the replication checkpoint documents. When you recover a database from a backup, you are probably also recovering the checkpoint local documents.
You should remove all local docs, by finding them with the _local_docs endpoint and then removing them from the recovered database.
After doing this, your PouchDB should try to send its docs to CouchDB again, bringing PouchDB and CouchDB back in sync.
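A rough sketch of that cleanup (the database URL is a placeholder; each row returned by _local_docs carries the local doc's rev):

```typescript
// Sketch: list the recovered database's local (checkpoint) documents via
// _local_docs and delete each one. The database URL is a placeholder.
async function removeLocalDocs(dbUrl: string): Promise<void> {
  const res = await fetch(`${dbUrl}/_local_docs`);
  const body = await res.json();
  for (const row of body.rows) {
    // row.id looks like "_local/<checkpoint-id>"; row.value.rev is its revision.
    await fetch(`${dbUrl}/${row.id}?rev=${row.value.rev}`, { method: 'DELETE' });
  }
}

// removeLocalDocs('http://localhost:5984/recovered_db');
```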

Partial syncing in pouchdb / couchdb with a particular scenario

I have been reading docs and articles on pouchdb/couchdb/cloudant. I am not able to create this simple architecture in my head. I need help!
So there are many users on the app. Each user has a separate database (which I read is the recommended approach in a pouch/couch/cloudant setup).
Now let's just focus on a single user. This user has some remote data already present on our server (CouchDB). He has 3 separate docs stored.
He accesses doc 1 and doc 2 from browser 1, and doc 2 and doc 3 from browser 2.
Content in both the browsers must be in sync.
Should I be using the sync API of PouchDB? But as I read, it syncs the whole database. How can I use this API to sync only a subset of the central database? Is filtered replication the answer here?
And I also don't want to push both docs in a single call; he can access docs as he needs them.
What is the correct approach to implementing this logic with pouch/couch databases? If you can explain with a little code, that would be great. I just need the basic ideas.
Is this kind of problem easily solvable in upcoming releases of CouchDB 2.0 and PouchDB-find?
Thanks a lot!
If you take a look at the PouchDB documentation, you should see the options.doc_ids parameter. It lets you set up a replication limited to certain document ids. In your scenario, this would solve your problem.
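For example, a minimal sketch (database names and document ids are placeholders):

```typescript
// Sketch: sync only the documents a given browser actually needs.
// Database names and ids are placeholders.
import PouchDB from 'pouchdb';

const local = new PouchDB('user_db');
const remote = new PouchDB('https://server/user_db');

// Browser 1 only syncs doc1 and doc2; browser 2 would use ['doc2', 'doc3'].
local.sync(remote, {
  live: true,
  retry: true,
  doc_ids: ['doc1', 'doc2'],
});
```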

Sync views between pouchdb and couchdb

I've been able to sync data from my Cloudant instance to my Node.js-based PouchDB, however I need to set up a secondary search index, and therefore I created a view on the CouchDB instance. However, I am unable to see it in my synced PouchDB instance.
I see it in Cloudant, in all documents; however, after syncing and calling allDocs on PouchDB, it's not there. Also, I'm using the pouchdb-find plugin and I can't reference the secondary index search fields. Of course, if I set the secondary index from PouchDB, it works fine.
Am I missing something? Does sync not replicate design docs in PouchDB? If not, what's the best way to create a persistent secondary index?
Any good docs for this? (Nolan....?) Speaking of docs, or support, is there an IRC room or some other live support for CouchDB from the user community?
Thanks for your attention,
Paul
pouchdb-find is a reimplementation of Cloudant Query Language, not their search index (which is what I think you're talking about). It's also not done; I've only written about half of the operators. :) You may also want to try the pouchdb-quick-search plugin, which is for full-text search.
In general, the advice I usually give people is to not sync design documents at all – just replicate using a filter to avoid syncing design docs. Then you can create design documents that are optimized for whatever platform you happen to be on (PouchDB, CouchDB, Cloudant, the various PouchDB plugins, etc.).
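A minimal sketch of such a filter (names are placeholders; the filter here is a client-side function applied during replication):

```typescript
// Sketch: replicate everything except design documents, then build whatever
// local indexes you need separately. URLs and names are placeholders.
import PouchDB from 'pouchdb';

const local = new PouchDB('mydb');
const remote = new PouchDB('https://username.cloudant.com/mydb');

local.replicate.from(remote, {
  // Skip any _design/* document during replication.
  filter: (doc: any) => !doc._id.startsWith('_design/'),
});
```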
And yeah, we are usually pretty responsive inside of the IRC channel and on the mailing list, but it's a small operation because we aren't sponsored by Cloudant or Couchbase or anybody. The core PouchDB team are all hobbyists. :)
Maybe this is stupid, but does the user that accesses Couch have the admin role? Only admins can see and edit design documents.

CouchDB: bulk update best practices

My use case: I would like to set a flag ("read" or "unread") on a group of documents with only one request.
My first idea was to send a list of ids to an _update handler, but reading the docs, it seems to work on only one document at a time.
Am I wrong? How can I solve this case?
You are correct.
Currently (CouchDB 1.1.0 and to my knowledge the next release, 1.2 also), the only way to modify documents in bulk is to send the literal documents themselves to CouchDB using the CouchDB bulk document API.
In my experience, in practice, this is not a major problem because bulk operations tend to be done with offline tools or else with AJAX operations where there is no noticeable impact on the user experience.
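A rough sketch of that round-trip (URL, ids, and the read flag are placeholders): fetch the documents with their current _rev, flip the flag, and write them all back in one _bulk_docs request.

```typescript
// Sketch: mark a group of documents as read using the bulk document API.
// URL, ids, and the "read" field are placeholders.
async function markRead(dbUrl: string, ids: string[]): Promise<void> {
  // 1. Fetch the current documents (with their _rev) in a single request.
  const res = await fetch(`${dbUrl}/_all_docs?include_docs=true`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ keys: ids }),
  });
  const rows = (await res.json()).rows;

  // 2. Set the flag and send the full documents back in one bulk request.
  const docs = rows
    .filter((r: any) => r.doc)
    .map((r: any) => ({ ...r.doc, read: true }));

  await fetch(`${dbUrl}/_bulk_docs`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ docs }),
  });
}
```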
