Read quorum in CouchDB for _find and MapReduce queries

The CouchDB documentation indicates that, by default, both reads and writes to single documents are quorum reads and writes (e.g. r=2 and w=2 in a 3-replica system).
However, the documentation for _find says that r "defaults to 1, in which case the document found in the index is returned. If set to a higher value, each document is read from at least that many replicas before it is returned..." It is not entirely clear to me what that means in practice. If I run _find with r=2 and a document is found in the index on a single node, I think it's fairly clear that the document will also be fetched from a second node and the latest revision returned to me. However, I suspect the index itself is still only consulted on one node, so consistency in a healthy cluster isn't guaranteed.
For example, suppose I have a healthy 3-node cluster with no network partitions. The DB in this cluster has a Mango index that includes the field foo, and I query, via _find, for all documents with foo=bar. Say that initially document X has foo=baz, so X should not be returned. Now I update X, setting foo=bar, and I do this with w=2. I then immediately re-run my _find with r=2. If the index is only consulted on one node, then I'm not guaranteed to have X returned by my query, even with r=2. So does r=2 mean only that documents found in one node's index will also be looked up on a second node, or does it mean that the indexes on two nodes will both run the query and have their results merged?
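For concreteness, the query being described might look like this (a minimal sketch in Python; the host, database name mydb, and credentials are assumptions, not from the original question):

    import requests

    # Mango _find with an explicit read quorum. "r" is the parameter whose
    # semantics are in question: it controls how many replicas each matched
    # document is read from, but (per the discussion above) apparently not
    # how many nodes' indexes are consulted.
    resp = requests.post(
        "http://localhost:5984/mydb/_find",
        auth=("admin", "secret"),
        json={
            "selector": {"foo": "bar"},  # find all docs with foo=bar
            "r": 2,                      # read each matched doc from >= 2 replicas
        },
    )
    print(resp.json()["docs"])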
Also, it seems like the same index behaviour and r=1 default would apply to JavaScript MapReduce views, but I see no equivalent documentation for that case. Do MapReduce view queries default to r=1 or r=2?

I posted a link to the above on the CouchDB Slack and got a response:
map/reduce views (accessed by _view) are r=1 and the r parameter mentioned for _find I think only refers to when we fetch the document, not when the query itself runs, but I'm not 100% sure on that.
It's not quite a definitive answer, so I'm not marking this correct, but it is definitely more information than I had before.

Related

How does Cassandra know that it has completed QUORUM?

I have always used Cassandra in Spark applications, but I never wondered how it works internally. Reading the Cassandra documentation, a small doubt came up (which may be a beginner's doubt).
I read in a book (Cassandra: The Definitive Guide) and in the official Cassandra documentation that the formula is:
quorum = floor(RF / 2) + 1
So in theory, if I have a cluster with 6 nodes and a replication factor of 3, I would only need responses from 2 nodes.
And here come the small doubts:
1 - What would this response be? (The query return with the data?)
2 - If there was no data with the filters used in the query, is the empty return considered a response?
3 - And last but not least, if the empty return is considered a response and the two nodes that complete the QUORUM don't have the replica data yet, my application that did the SELECT will conclude that this data doesn't exist in the database, right?
Your reasoning sounds correct to me.
Basically, if you're reading at LOCAL_QUORUM and have an RF of 3, it's possible that the coordinator accepts a response from two replicas that are both inconsistent and leaves out the third replica that had consistent data.
It's one of the reasons Cassandra is considered an eventually consistent DB, and also why regular repairs of the data are so important for production databases. Of course, if consistency mattered above all else, you could always read with a CL of ALL, but you'd sacrifice some response time as a tradeoff. Assuming the DB is provisioned well, though, while it's certainly in the realm of the possible, it isn't likely that only a single replica receives an incoming write unless you make a habit of only writing at a CL of ONE/LOCAL_ONE. If consistency mattered, you'd be writing to the DB with a CL of at least LOCAL_QUORUM to avoid this very scenario.
To try and answer your questions directly: yes, having no data to return can be a valid response, and yes, if the two replicas chosen by the coordinator both agree there is no data to return, the app will report that result.
1 - What would this response be? (The query return with the data?)
The coordinator node will wait for 2 of the 3 replicas (because CL=QUORUM) to respond to the query with the request results. It will then send the response to the client.
2 - If there was no data with the filters used in the query, is the empty return considered a response?
Yes, the empty response is sufficient and is considered a valid response. Note that a last-write-wins mechanism (based on row write time) is used in case of conflict.
3 - And last but not least, if the empty return is considered a response and the two nodes that complete the QUORUM don't have the replica data yet, my application that did the SELECT will conclude that this data doesn't exist in the database, right?
You have to understand that Apache Cassandra uses eventual consistency, meaning that the client decides on the desired CL. If you want strong consistency, meaning the write CL and read CL overlap (write CL + read CL > RF), then you will always retrieve the latest data. I recommend watching this video: https://www.youtube.com/watch?v=Gx-pmH-b5mI
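To make the overlap concrete, here is a minimal sketch with the Python driver (the keyspace and table are hypothetical):

    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    cluster = Cluster(["127.0.0.1"])
    session = cluster.connect("my_keyspace")

    # Write at LOCAL_QUORUM: with RF=3, 2 of 3 replicas must acknowledge.
    insert = SimpleStatement(
        "INSERT INTO users (id, name) VALUES (%s, %s)",
        consistency_level=ConsistencyLevel.LOCAL_QUORUM,
    )
    session.execute(insert, (42, "alice"))

    # Read at LOCAL_QUORUM: 2 of 3 replicas must respond. Because
    # write CL + read CL (2 + 2) > RF (3), the read set must include at
    # least one replica that saw the write, so the latest value is returned.
    select = SimpleStatement(
        "SELECT name FROM users WHERE id = %s",
        consistency_level=ConsistencyLevel.LOCAL_QUORUM,
    )
    row = session.execute(select, (42,)).one()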

Inconsistent results when sorting documents by _doc

I want to fetch Elasticsearch hits using the sort + search_after paging mechanism.
The elasticsearch documentation states:
_doc has no real use-case besides being the most efficient sort order. So if you don’t care about the order in which documents are returned, then you should sort by _doc. This especially helps when scrolling.
However, when performing the same query multiple times, I get different results. More specifically, the first hit alternates randomly between two different hits, where the returned sort field is 0 for one hit, and some specific number for the other.
This obviously breaks the paging, as it relies on the value returned in sort being fed into search_after for the next query.
No data is being written to the index while I am querying it, so this is not because of refreshes.
My questions are therefore:
Is it wrong to sort by _doc for paging? The results I get seem inconsistent.
How does sorting by _doc work internally? The documentation is lacking in this regard, as it simply states that the sort is performed by "index order".
The data was written to the index in parallel using Spark. I thought the problem might have been the parallel write combined with the "index order" sorting; however, I did not manage to replicate this behavior with other indices that were also written to from Spark.
Elasticsearch 7; the index has two shard copies: one primary and one replica.
cheers.
The reason this happened is that the index consists of two shard copies: one primary and one replica. The documents were not indexed in the same order on both copies, so the order of the results depends on which copy they were returned from. This is fine when using scrolling, because Elasticsearch keeps internal state for the results, but not with paging, which is stateless.
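One way to work around this (not part of the answer above, but consistent with it) is to pin every page of the pagination to the same shard copies with the preference parameter. A sketch in Python; the host, index name, and session string are placeholders:

    import requests

    # Any stable string works as a preference value; all requests that use the
    # same value are routed to the same shard copies, so _doc order stays stable.
    body = {
        "size": 1000,
        "sort": ["_doc"],
        # "search_after": last_sort_values,  # from the previous page's last hit
    }
    resp = requests.post(
        "http://localhost:9200/my-index/_search",
        params={"preference": "my-session-id"},
        json=body,
    )
    hits = resp.json()["hits"]["hits"]
    last_sort_values = hits[-1]["sort"]  # feed into search_after for the next page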

How to minimise time for any operation in JanusGraph using Gremlin?

Every query is taking more than five minutes to return a result.
I am running a simple query like the following:
g.V().hasLabel("Label").has("pProperty","vValue").next()
When I had a smaller number of nodes it worked fine, but now I have more than 1 million nodes and the issue arises.
When using JanusGraph and a Gremlin query to search for a property, if no index has been created for that property, the query becomes a full scan over the data. Simple and composite indices can be created using the JanusGraph Management API. The Gremlin profile() step will show you whether your query used an index.
Seconding what Kelvin said about adding an index. To make things more efficient, you'll either need to filter on additional indexed properties or make sure that you're designating an appropriate "entry point" for your traversal.
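As a rough illustration of the index advice, here is a sketch that submits a management script from Python via gremlinpython (the server URL, index name, and the "graph" binding are assumptions; a new index over existing data must also be enabled/reindexed, which is omitted here):

    from gremlin_python.driver.client import Client

    client = Client("ws://localhost:8182/gremlin", "g")

    # Build a composite index on "pProperty" so the has() filter above can do
    # an index lookup instead of a full scan.
    script = """
    mgmt = graph.openManagement()
    key = mgmt.getPropertyKey('pProperty')
    mgmt.buildIndex('byPProperty', Vertex.class).addKey(key).buildCompositeIndex()
    mgmt.commit()
    """
    client.submit(script).all().result()

    # profile() on the original traversal shows whether the index is now used.
    print(client.submit(
        "g.V().hasLabel('Label').has('pProperty','vValue').profile()"
    ).all().result())
    client.close()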

CouchDB cluster write behaviour

I have CouchDB in a cluster setup: 3 nodes, all shards on all nodes, and w=2. We have code that creates a document in CouchDB and reads it back from a view. However, the view intermittently returns no corresponding data, even though the data is there when we check CouchDB directly. So my question is: why is the third node taking so long to write the value, and how long should I expect the write latency to be?
Thanks in advance.
If you query a view and do not use the stale parameter, views are supposed to always return fresh data: a view first brings itself up to date with the database and then returns the results for your query.
A view can be served from any node. If you query a view and don't get the expected fresh data, it means the updates are not yet available on the node that served it.
If you write a document with w=2, then at least two nodes out of three should successfully update that document. And if all nodes are up, internal synchronization between nodes should bring the update to all nodes within milliseconds or seconds. So the latency should be no more than a few seconds.
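To make the timing question concrete, the write-then-read pattern being described looks roughly like this (a sketch in Python; the endpoint, document, and design document names are made up):

    import requests

    base = "http://localhost:5984/mydb"  # placeholder cluster endpoint

    # Write with w=2: CouchDB acknowledges once 2 of the 3 replicas have the doc.
    requests.put(f"{base}/order-1", params={"w": 2},
                 json={"type": "order", "status": "new"})

    # Immediately read it back from a view. The view is freshened before
    # responding, but it may be served by the one node the write has not
    # reached yet, which would produce the intermittent empty results above.
    resp = requests.get(f"{base}/_design/orders/_view/by_status",
                        params={"key": '"new"'})
    print(resp.json()["rows"])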
How long was the latency that you experienced? Was your view finally able to produce the expected results after this latency?

CouchDB request never completes

I am trying to request a large number of documents from my database (which has over 400k documents). I started with the _all_docs built-in view. I first tried this query:
http://database:port/databasename/_all_docs?limit=100&include_docs=true
No problem. Completes as expected. Now to ramp it up:
http://database:port/databasename/_all_docs?limit=1000&include_docs=true
Still fine. Took longer, more data, etc. as expected. Ramp it up again:
http://database:port/databasename/_all_docs?limit=10000&include_docs=true
The request never completes. The dev tools in Chrome show Size = 5.3MB (which seems to be significant), and this occurs no matter what value over roughly 6500 I use for the limit parameter. Whether I specify 6500 or 10,000, it always shows 5.3MB downloaded, and the request stalls.
I have also tried other combinations, such as skip, and it seems that limit + skip must be < 6500 or I get the same stall.
My environment: CouchDB 1.6.1, Ubuntu 14.04.3 LTS, Azure A1 Standard.
You have to pre-warm your queries; just asking for 100k or more docs and expecting to get them out of CouchDB in one request won't work.
When you ask for some items from a view (in your case the default view), CouchDB will notice on the first read that the B-tree for the view doesn't exist yet, and it goes ahead and builds it. Depending on how many documents you have in your database, that can take a while and puts a good workload on your database.
On every subsequent read, CouchDB will check which documents have changed since the view was last updated and feed the changed documents to the map and reduce functions. So if you only query a view from time to time but have lots of changes in between, expect some delay on the next read.
There are two ways to handle this situation:
1. Pre-warm your view: run a cron job that queries the view periodically, so the B-tree for the view is always built and up to date.
2. Prepare your view in advance for a particular query before inserting the data into CouchDB.
And for now, if you really want to read all your docs, don't read them all at once; page through them with limit and range queries instead (see the sketch below).
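A sketch of that kind of paging in Python (this variant pages by startkey rather than an ever-growing skip, which stays cheap at any depth; the base URL is the placeholder from the question):

    import json
    import requests

    base = "http://database:port/databasename"  # placeholder from the question
    params = {"limit": 1000, "include_docs": "true"}

    while True:
        rows = requests.get(f"{base}/_all_docs", params=params).json()["rows"]
        for row in rows:
            pass  # process row["doc"] here
        if len(rows) < 1000:
            break  # last page
        # Resume just after the last doc id of this page instead of using skip.
        params["startkey"] = json.dumps(rows[-1]["id"])
        params["skip"] = 1  # don't repeat the startkey row itself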
