How to force CouchDB to use HTTP:POST on filtered replication - couchdb

I'm working on the one-database-per-user pattern using CouchDB. My main database contains all my data, and I want to create multiple sub-databases using the CouchDB replication feature with a filter (I'm filtering the bigger database using: DateMin, DateMax, an array of IDs, and some other conditions based on each user).
https://docs.couchdb.org/en/3.2.0/replication/index.html
https://docs.couchdb.org/en/3.2.0/json-structure.html#replication-settings
When I have a small amount of information for filtering, everything works perfectly.
My problem comes when the array of IDs that I'm using for filtering contains too many elements:
to run the filter, CouchDB calls an HTTP GET https://docs.couchdb.org/en/3.2.0/api/database/changes.html#db-changes
which is restricted by the number of characters in the URL (see
What is the maximum length of a URL in different browsers?).
But CouchDB exposes the same request over HTTP POST https://docs.couchdb.org/en/3.2.0/api/database/changes.html#post--db-_changes
and I want to know how to tell CouchDB to use the HTTP POST variant for filtering.
Additional information:
CouchDB v. 3.1.0
My replicator doc:
{
  "source": "DB_URL_SOURCE.com",
  "target": "DB_URL_TARGET.com",
  "filter": "mydesigndoc/by_date_and_activities",
  "query_params": {
    "weeksMin": 10,
    "weeksMax": 10,
    "clindividu": true,
    "activitesID": "3988814,3989866,3743378,3742882,2595215,1259813,2596111",
    "cl": true
  },
  "continuous": true
}
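For reference, a minimal sketch of what the mydesigndoc/by_date_and_activities filter function might look like (the design document isn't shown in the question, so the document fields doc.weeks and doc.activiteID are hypothetical); the query_params from the replicator doc are exposed to the filter on req.query:
function (doc, req) {
  // query_params from the replicator document arrive as req.query
  var weeksMin = parseInt(req.query.weeksMin, 10);
  var weeksMax = parseInt(req.query.weeksMax, 10);
  var ids = String(req.query.activitesID || "").split(",");
  // doc.weeks and doc.activiteID are hypothetical field names for illustration
  return doc.weeks >= weeksMin &&
         doc.weeks <= weeksMax &&
         ids.indexOf(String(doc.activiteID)) !== -1;
}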
HTTP GET request:
http://localhost:5984/my-bigdatabase/_changes
?filter=mydesigndoc%2Fby_date_and_activities
&activites=
&weeksMax=
&weeksMin=
&feed=continuous
&style=all_docs
&since=0
&timeout=10000
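For comparison, a minimal sketch of the documented POST form of the _changes request (using Node 18+ global fetch). Per the linked docs, the request body is used with the built-in _doc_ids filter, which is a different filtering approach than the custom design-doc filter above, and the doc IDs below are placeholders:
fetch('http://localhost:5984/my-bigdatabase/_changes?filter=_doc_ids&since=0', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  // the body carries the (potentially long) list of document IDs instead of the URL
  body: JSON.stringify({ doc_ids: ['placeholder-id-1', 'placeholder-id-2'] })
})
  .then(function (res) { return res.json(); })
  .then(function (changes) { console.log(changes.results.length); });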

Related

How to query many collections' elements in MongoDB

Good morning colleagues,
I want to ask how such a query should be formulated, and what the recommended and most efficient approach would be for queries over a large number of elements.
I have an API using Express that creates a new MongoDB model with a unique name and then inserts elements into it.
Example:
Collections
*product234 => includes elements => { _id: ...., name: ... }, { ...}
*product512 => includes elements => { _id: ...., name: ... }, { ...}
Each collection hosts more than 5000 elements and I want to create a search engine that returns all the results of all the collections that match the "name" that I will send in the request.
How could I perform this query using Mongoose? Would it be viable, and would it cause performance problems to have more than 200 collections with more than 5000 elements in each one?
Answer (Edit):
As I see in the comments, the best solution for fast queries is to create a copy of each element, containing only the name (or the values you need), a reference id, and the reference collection name, in a new collection named, for example, "ForSearchUse", and then run the search against that collection. If the complete info is needed, you can query the specific collection using the reference id and the name of the element.
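A minimal Mongoose sketch of that approach (the schema fields, the model name, and the lookup back into the original collection are assumptions for illustration):
const mongoose = require('mongoose');

// lightweight searchable copy of each element: name plus a reference back to its source
const forSearchUseSchema = new mongoose.Schema({
  name: { type: String, index: true },
  refId: mongoose.Schema.Types.ObjectId, // _id of the element in its original collection
  refCollection: String                  // e.g. "product234"
});
const ForSearchUse = mongoose.model('ForSearchUse', forSearchUseSchema);

async function searchByName(name) {
  // one query against the search collection instead of one per product collection
  const hits = await ForSearchUse.find({ name: name }).lean();
  // if the complete info is needed, follow the reference back to the source collection
  return Promise.all(hits.map(function (hit) {
    return mongoose.connection.db.collection(hit.refCollection).findOne({ _id: hit.refId });
  }));
}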

Data model for nested array of objects in Firestore

I need advice from experienced NoSQL engineers on how I should structure my data.
I want to model my SQL data structure to NoSQL for Google Cloud Firestore.
I have no prior experience with NoSQL databases but I am proficient with traditional SQL.
I use Node.js for writing queries.
So far, I converted three tables to JSON documents with example data:
{
  "session": {
    "userId": 99992222,
    "token": "jwttoken1191891j1kj1khjjk1hjk1kj1",
    "created": "timestamp"
  }
}
{
  "user": {
    "id": 99992222,
    "username": "userName",
    "avatarUrl": "https://url-xxxx.com",
    "lastLogin": "2019-11-23 13:59:48.884549",
    "created": "2019-11-23 13:59:48.884549",
    "modified": "2019-11-23 13:59:48.884549",
    "visits": 1,
    "profile": true,
    "basketDetail": { // I get this data from a third party API
      "response": {
        "product_count": 2,
        "products": [
          {
            "product_id": 111,
            "usageInMinutes_recent": 0,
            "usageInMinutes": 0,
            "usageInMinutes_windows": 0,
            "usageInMinutes_mac": 0,
            "usageInMinutes_linux": 0
          },
          {
            "product_id": 222, // no recent usage here
            "usageInMinutes": 0,
            "usageInMinutes_windows": 0,
            "usageInMinutes_mac": 0,
            "usageInMinutes_linux": 0
          }
        ]
      }
    }
  }
}
{
  "visitor": {
    "id": 999922221,
    "created": "2019-11-23 13:59:48.884549"
  }
}
My questions:
session.userId, user.id, visitor.id can all signify the same user. What is the Firestore equivalent to foreign keys in SQL? How would I connect/join these three collections in a query?
What do I do about the nested object basketDetail? Is it fine where it is or should I define its own collection?
I anticipate these queries:
occasionally add all the recent usage.
frequently check if a user owns a specific product_id.
frequently replace the whole basketDetail object with new data.
occasionally update one specific product_id.
How would I connect collections user with basketDetail in a query if I separated it?
Thanks for the advice!
session.userId, user.id, visitor.id can all signify the same user. What is the Firestore equivalent to foreign keys in SQL? How would I connect/join these three collections in a query?
Unfortunately, there is no JOIN clause in Firestore. Queries in Firestore are shallow, meaning they can only get documents from the collection the query is run against. There is no way to get documents from two collections in a single query, unless you use a collection group query, but that's not the case here since the collections in your project have different names.
If you have three collections, then three separate queries are required. There is no way you can achieve that in a single go.
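A minimal sketch of those three separate reads with the Node.js Admin SDK (the collection names, and the assumption that the user and visitor documents are keyed by the shared id, are illustrative):
const admin = require('firebase-admin');
admin.initializeApp();
const db = admin.firestore();

async function loadEverything(userId) {
  // three shallow queries, one per collection, tied together by the shared id
  const [sessionSnap, userSnap, visitorSnap] = await Promise.all([
    db.collection('session').where('userId', '==', userId).get(),
    db.collection('user').doc(String(userId)).get(),
    db.collection('visitor').doc(String(userId)).get()
  ]);
  return {
    sessions: sessionSnap.docs.map(d => d.data()),
    user: userSnap.data(),
    visitor: visitorSnap.data()
  };
}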
What do I do about the nested object basketDetail? Is it fine where it is or should I define its own collection?
There are some limits when it comes to how much data you can put into a document. According to the official documentation regarding usage and limits:
Maximum size for a document: 1 MiB (1,048,576 bytes)
As you can see, you are limited to 1 MiB of data in a single document. So if you think that the nested basketDetail object can stay within this limit, you can keep that schema; otherwise, move it to a subcollection. Besides that, all of the operations you listed are permitted in Firestore. If you have a hard time implementing them, post another question so we can take a look at it.
How would I connect collections user with basketDetail in a query if I separated it?
You cannot connect/join two collections. If you separate basketDetail into a subcollection, then two queries are required.
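If basketDetail is moved to a subcollection, the two reads might look like this (a sketch under the same assumptions as above, reusing the db handle from the previous sketch):
async function loadUserWithBasket(userId) {
  // read the user document and its basketDetail subcollection in parallel
  const userRef = db.collection('user').doc(String(userId));
  const [userSnap, basketSnap] = await Promise.all([
    userRef.get(),
    userRef.collection('basketDetail').get()
  ]);
  return {
    user: userSnap.data(),
    products: basketSnap.docs.map(d => d.data())
  };
}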

CouchDB View not returning documents when ?key specified

I'm coming to CouchDB from an SQL background and am trying to do the common "SELECT FROM DB where field = someValue". I've created the following design document:
{
  "_id": "_design/views",
  "views": {
    "byRubric": {
      "map": "function(doc) {if(doc.idRubric){emit(doc._id, doc.idRubric);} }"
    }
  }
}
If I query the CouchDB database using the following URL, it correctly returns all 15 documents:
http://localhost:5984/rubric_content/_design/views/_view/byRubric
If, however, I try to get the documents in this view that have a particular value in the idRubric field, one which I know is present, by executing the following URL, I get 0 documents back when, in fact, 12 of the 15 documents have this specific value in the idRubric field: http://localhost:5984/rubric_content/_design/views/_view/byRubric?key="9bf94452c27908f241ab559d2a0d46c5" (no, it doesn't make any difference if the " marks are replaced by %22). The URL does fail if I leave the quote marks off.
What am I missing? I'm running this locally for testing on OSX 10.12.3 with Apache CouchDB 1.6.1.
Your view is emitting the document id as the key (and the rubric id as the value).
Instead, you want to emit the rubric id as the key.
{
  "_id": "_design/views",
  "views": {
    "byRubric": {
      "map": "function(doc) {if(doc.idRubric){emit(doc.idRubric);} }"
    }
  }
}
Then the query will be the following:
http://localhost:5984/rubric_content/_design/views/_view/byRubric?key="rubric3"
When you use a map, you need to think of it as a dictionary: you have a key and a value. You search for a matching key and get the value.
If you don't emit any value, you can simply use the ?include_docs=true parameter to get the entire document.
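For example, combining both parameters (the key value here is the answer's example rubric id, not one of the question's actual values):
http://localhost:5984/rubric_content/_design/views/_view/byRubric?key="rubric3"&include_docs=true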

Mongodb: Dynamic query object in collection.find()

I am working on a Node.js + MongoDB application. The application inserts some records into MongoDB. For example, let's take the simple record below:
{
  "name": "Sachin",
  "age": 11,
  "class": 5,
  "percentage": 78,
  "rating": 5
}
Now the end user can set different rules for which they want to get a notification/alert when a specific condition is satisfied. For example, we can have a rule like:
1) Rule 1: Generate a notification/alert if "percentage" is less than 40
To achieve this, I am using replication and a tailable cursor, so whenever a new record gets added to the collection I receive a record through the tailable cursor.
// tail the replica set oplog for newly inserted records
var coll = db.collection('oplog.rs');
var options = {
  tailable: true,
  awaitdata: true,
  numberOfRetries: -1
};
// Rule 1: alert when "percentage" is less than 40
var qcond = { 'o.data.percentage': { $lt: 40 } };
coll.find(qcond, options, function (err, cur) {
  cur.each(function (err, doc) {
    // Perform some operations on the received document, like
    // adding it to another collection or generating an alert
  }); // cur.each
}); // find
Everything works fine up to this point.
Now the problem starts when the end user wants to add another rule at runtime, say:
2) Rule 2: Generate a notification/alert if "rating" is greater than 8
Now I would like to take this condition/rule into account as well when querying the tailable cursor, but the current cursor is already in a waiting state based on the conditions given for Rule 1 only.
Is there any way to update the query conditions dynamically so that I can include the conditions for Rule 2 as well?
I tried searching but couldn't find a way to achieve this.
Does anyone have any suggestion/pointers to tackle this situation?
No. You can't modify a cursor once it's open on the server. You'll need to terminate the cursor and reopen it to cover both conditions, or open a second cursor to cover the second condition.
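A minimal sketch of the reopen approach, assuming the same oplog field paths and driver style as the question; the rules are combined with $or and the cursor is closed and reopened whenever the rule set changes:
// rules maintained at runtime; rebuild the query and reopen the cursor when they change
var rules = [
  { 'o.data.percentage': { $lt: 40 } }, // Rule 1
  { 'o.data.rating': { $gt: 8 } }       // Rule 2, added later
];
var currentCursor = null;

function reopenTail(coll) {
  if (currentCursor) {
    currentCursor.close(); // terminate the old cursor
  }
  currentCursor = coll.find(
    { $or: rules },
    { tailable: true, awaitdata: true, numberOfRetries: -1 }
  );
  currentCursor.each(function (err, doc) {
    // generate the notification/alert for whichever rule matched
  });
}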

How to bulk save an array of objects in MongoDB?

I have looked a long time and not found an answer. The Node.JS MongoDB driver docs say you can do bulk inserts using insert(docs) which is good and works well.
I now have a collection with over 4,000,000 items, and I need to add a new field to all of them. Usually MongoDB can only write 1 transaction per 100ms, which means I would be waiting for days to update all those items. How can I do a "bulk save/update" to update them all at once? update() and save() seem to only work on a single object.
pseudo-code:
var stuffToSave = [];
db.collection('blah').find({}, function (err, stuff) {
  stuff.toArray().forEach(function (item) {
    item.newField = someComplexCalculationInvolvingALookup();
    stuffToSave.push(item);
  });
});
db.saveButNotSuperSlow(stuffToSave);
Sure, I'll need to put some limit on this, like doing 10,000 at a time rather than all 4 million at once, but I think you get the point.
MongoDB allows you to update many documents that match a specific query using a single db.collection.update(query, update, options) call; see the documentation. For example:
db.blah.update(
  { },
  {
    $set: { newField: someComplexValue }
  },
  {
    multi: true
  }
)
The multi option allows the command to update all documents that match the query criteria. Note that the exact same thing applies when using the Node.JS driver; see that documentation.
If you're performing many different updates on a collection, you can wrap them all in a Bulk() builder to avoid some of the overhead of sending multiple updates to the database.
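A minimal sketch of that with the Node.js driver's unordered bulk builder, reusing the field name and per-document calculation from the question's pseudo-code; for brevity it issues a single execute() at the end of the cursor, but in practice you would execute and re-initialize the builder every ~10,000 operations as the question suggests:
// queue one updateOne per document, then send them in a single round trip
var bulk = db.collection('blah').initializeUnorderedBulkOp();
db.collection('blah').find({}).each(function (err, item) {
  if (item == null) {
    // end of cursor: flush everything that was queued
    return bulk.execute(function (err, result) {
      console.log('updated', result.nModified, 'documents');
    });
  }
  bulk.find({ _id: item._id }).updateOne({
    $set: { newField: someComplexCalculationInvolvingALookup() }
  });
});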
