CouchDB count all documents and return a number? - node.js

We have a college project that uses CouchDB, and I'm using Node. I want to create a view that returns the number of my documents, counted by email.
I cannot find anything that works and I'm not sure what I'm missing. I have tried a lot of different reduce functions and emit calls.
Thanks for any answers.
The documents have 2 fields, name and email

Do not use the db endpoint: its doc_count response field includes design documents, along with any other documents that may not have an email field.
A straightforward way to do this is with a view. The code snippet below demonstrates the difference between the db info doc_count and a view's total_rows, using PouchDB. There are probably more interesting uses for the index, too.
The design doc is trivial:
{
_id: '_design/my_index',
views: {
email: {
map: function(doc) {
if (doc.email) emit(doc.email);
}.toString()
}
}
}
And the view query is very efficient and simple.
db.query('my_index/email', {
include_docs: false,
limit: 0
})
const gel = id => document.getElementById(id);
let db;
function setJsonToText(elId, json) {
gel(elId).innerText = JSON.stringify(json, undefined, 3);
}
async function view() {
// display db info
setJsonToText('info', await db.info());
// display total number of rows in the email index
const result = await db.query('my_index/email', {
include_docs: false,
limit: 0
});
setJsonToText('view', result);
}
// canned test documents
function getDocsToInstall() {
return [{
email: 'jerry#garcia.com',
},
{
email: 'bob#weir.com',
},
{
email: 'phil#lesh.com'
},
{
email: 'wavy#gravy.com'
},
{
email: 'samson#delilah.com'
},
{
email: 'cosmic#charlie.com'
},
// design doc
{
_id: '_design/my_index',
views: {
email: {
map: function(doc) {
if (doc.email) emit(doc.email);
}.toString()
}
}
}
]
}
// init example db instance
async function initDb() {
db = new PouchDB('test', {
adapter: 'memory'
});
await db.bulkDocs(getDocsToInstall());
};
(async() => {
await initDb();
await view();
})();
<script src="https://github.com/pouchdb/pouchdb/releases/download/7.1.1/pouchdb-7.1.1.min.js"></script>
<script src="https://github.com/pouchdb/pouchdb/releases/download/7.1.1/pouchdb.memory.min.js"></script>
<pre>Info</pre>
<pre id='info'></pre>
<div style='margin-top:2em'></div>
<pre>email view</pre>
<pre id='view'>
</pre>

You can use GET /{db}, which returns information about the specified database. This is a JSON object that contains the property doc_count.
doc_count (number) – A count of the documents in the specified database.
With Angular for example, this could be done with the following method:
async countDocuments(database: string): Promise<number> {
return this.http.get<any>(this.url('GET', database), this.httpOptions).toPromise()
.then(info => info['doc_count']);
}

Assumption:
Assuming that following documents are present in the Customers database:
[
{
"_id": "93512c6c8585ab360dc7f535ff00bdfa",
"_rev": "1-299289ee89275a8618cd9470733035f4",
"name": "Tom",
"email": "tom#domain.com"
},
{
"_id": "93512c6c8585ab360dc7f535ff00c930",
"_rev": "1-a676883d6f1b5bce3b0a9ece92da6964",
"name": "Tom Doe",
"email": "tom#domain.com"
},
{
"_id": "93512c6c8585ab360dc7f535ff00edc0",
"_rev": "1-09b5bf64cfe66af7e1134448e1a328c3",
"name": "John",
"email": "john#domain.com"
},
{
"_id": "93512c6c8585ab360dc7f535ff010988",
"_rev": "1-88e347af11cfd1e40e63920fa5806fd2",
"name": "Alan",
"email": "alan#domain.com"
}
]
If I understand your question correctly, then based on the data above you need the following result set:
{
"tom#domain.com": 2,
"alan#domain.com": 1,
"john#domain.com": 1
}
Solution:
To achieve this, consider the following design document, which contains a view with map and reduce functions:
{
"_id": "_design/Customers",
"views": {
"by-email": {
"map": "function (doc) { if (doc.email) { emit(doc.email, doc._id); } }",
"reduce": "_count"
}
},
"language": "javascript"
}
The map function above emits the document's email value as the key, provided the document has an email field.
The reduce function _count is a built-in reducer (provided by CouchDB) that does the counting.
Executing View Query:
In order to query this view, you need to: select the view function, mark reduce to be executed (as it is optional to run reduce) and set 1 as group level.
Here is how you can do it through the UI:
Result:
Here is the result given by above query:
[Image: result of map reduce query]
Hope this helped.
For more details about other reduce functions and group level, please refer to the CouchDB documentation.
Cheers.
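To make the effect of the _count reduce with grouping concrete, here is a small in-memory sketch in plain JavaScript. It is not CouchDB code: runView is a made-up name, the _id values are shortened placeholders, and the function only imitates what the by-email view computes for the sample documents above.

```javascript
// In-memory imitation of the "by-email" view: map emits (email, _id),
// then the emitted rows are grouped by key and counted, which is what
// reduce=_count does when the query groups by key.
function runView(docs) {
  const emitted = [];
  const emit = (key, value) => emitted.push({ key, value });

  // Same logic as the map function in the design document.
  for (const doc of docs) {
    if (doc.email) {
      emit(doc.email, doc._id);
    }
  }

  // Stand-in for _count grouped by key.
  const counts = {};
  for (const row of emitted) {
    counts[row.key] = (counts[row.key] || 0) + 1;
  }
  return counts;
}

const customers = [
  { _id: 'id-1', name: 'Tom', email: 'tom#domain.com' },
  { _id: 'id-2', name: 'Tom Doe', email: 'tom#domain.com' },
  { _id: 'id-3', name: 'John', email: 'john#domain.com' },
  { _id: 'id-4', name: 'Alan', email: 'alan#domain.com' },
  { _id: '_design/Customers' } // no email field, so it is never emitted
];

// counts: tom#domain.com is 2, john#domain.com is 1, alan#domain.com is 1
console.log(runView(customers));
```

Documents without an email field (including the design document itself) simply never reach the reduce step, which is why the view count can differ from the database's doc_count.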

Related

How to count and do jointure like stuff on couchDB

I'm currently trying to get the number of events for one organizer.
This is what my organizer document looks like:
{
"doc_type": "User",
"email": "xxx#gmail.com",
"blebleble": "blebleble"
}
This is what my event document looks like:
{
"doc_type": "Event",
"email": "xxx#gmail.com",
"blablabla": "blablabla"
}
I still can't figure out how to do some kind of jointure between both docs and count the events that share the same organizer. I think I can work with the email that both docs share, but I don't know how to do that. I'm still having trouble with CouchDB. It doesn't seem like a hard thing to do in SQL, but I can't work it out for NoSQL.
Thank you in advance.
"Jointure" is not a term I've encountered in my field, so I am left to guess that what is meant is join.
Joins are possible with CouchDB views, but what I read from the requirement in the OP is to get counts of events by email. See CouchDB's Joins With Views documentation. For that, I don't see documents with an ancestral relation but rather a one-to-many relation, i.e. user ==> events.
Consider this design document:
{
"_id": "_design/SO-68999682",
"views": {
"user_events": {
"map": `function (doc) {
if(doc.doc_type === 'Event') {
emit(doc.email);
}
}`,
"reduce": '_count'
}
}
}
The view's map function simply adds doc.email to the 'user_events' index when appropriate. Of particular interest the reduce function specifies the built-in reduce function _count.
Given such a view index one may apply the /db/_design/design-doc/_view/view-name endpoint to, for example,
View all events
{
reduce: false,
include_docs: true
}
Get a count of all events
{
reduce: true
}
Get a count of events for every email (summary)
{
reduce: true,
group_level: 1
}
Get a count of events for a specific email
{
reduce: true,
group_level: 1,
key: email
}
Get all events for a specific email
{
reduce: false,
include_docs: true,
key: email
}
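When calling the view over HTTP from Node rather than through PouchDB, these options become query-string parameters, and key (like startkey and endkey) has to be sent as a JSON value, so a string key travels with its double quotes. A minimal sketch of building such a URL; the helper name viewUrl, the host and the database name mydb are placeholders and not part of the original answer.

```javascript
// Query parameters that CouchDB expects to be JSON-encoded.
const JSON_PARAMS = ['key', 'keys', 'startkey', 'endkey'];

// Build the URL for GET /{db}/_design/{ddoc}/_view/{view}.
function viewUrl(baseUrl, db, ddoc, view, options = {}) {
  const params = new URLSearchParams();
  for (const [name, value] of Object.entries(options)) {
    params.set(
      name,
      JSON_PARAMS.includes(name) ? JSON.stringify(value) : String(value)
    );
  }
  const query = params.toString();
  const path = `${baseUrl}/${db}/_design/${ddoc}/_view/${view}`;
  return query ? `${path}?${query}` : path;
}

// Count of events for a specific email (hypothetical host and db name).
const countUrl = viewUrl('http://localhost:5984', 'mydb', 'SO-68999682', 'user_events', {
  reduce: true,
  group_level: 1,
  key: 'xxx#gmail.com'
});
console.log(countUrl);
```

The resulting URL can be passed to any HTTP client; the response is JSON with a rows array, as in the PouchDB results.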
The _count reduce built-in provides high performance. The snippet below demonstrates the above using the very handy and compatible PouchDB.
async function showAllEventDocs() {
let result = await db.query('SO-68999682/user_events', {
reduce: false,
include_docs: true
});
//show
gel('user_events_view').innerText = result.rows.map(row => [row.doc.email, row.doc.date].join('\t\t')).join('\n');
}
async function showEventCountTotal() {
let result = await db.query('SO-68999682/user_events', {
reduce: true
});
gel('event_count_total').innerText = result.rows[0].value;
}
async function showEventCountSummary() {
let result = await db.query('SO-68999682/user_events', {
reduce: true,
group_level: 1
});
//show key/value (email, count)
gel('event_count_summary').innerText = result.rows.map(row => [row.key, row.value].join('\t\t')).join('\n');
}
async function showUserEventCount(email, displayElement) {
let result = await db.query('SO-68999682/user_events', {
reduce: true,
group_level: 1,
key: email
});
//show value (count)
gel(displayElement).innerText = result.rows[0].value;
}
async function showUserEvents(email, displayElement) {
let result = await db.query('SO-68999682/user_events', {
reduce: false,
include_docs: true,
key: email
});
//show
gel(displayElement).innerText = result.rows.map(row => [row.doc.email, row.doc.date].join('\t\t')).join('\n');
}
function getDocsToInstall(count) {
const docs = [{
"doc_type": "User",
"email": "Jerry#gmail.com"
},
{
"doc_type": "User",
"email": "Bobby#gmail.com"
},
{
"doc_type": "Event",
"email": "Jerry#gmail.com",
"date": getDocDate().toISOString().slice(0, 10)
}, {
"doc_type": "Event",
"email": "Jerry#gmail.com",
"date": getDocDate().toISOString().slice(0, 10)
}, {
"doc_type": "Event",
"email": "Jerry#gmail.com",
"date": getDocDate().toISOString().slice(0, 10)
}, {
"doc_type": "Event",
"email": "Bobby#gmail.com",
"date": getDocDate().toISOString().slice(0, 10)
}, {
"doc_type": "Event",
"email": "Bobby#gmail.com",
"date": getDocDate().toISOString().slice(0, 10)
},
];
// design document
const ddoc = {
"_id": "_design/SO-68999682",
"views": {
"user_events": {
"map": `function (doc) {
if(doc.doc_type === 'Event') {
emit(doc.email);
}
}`,
"reduce": '_count'
}
}
};
docs.push(ddoc);
return docs;
}
const db = new PouchDB('SO-68999682', {
adapter: 'memory'
});
// install docs and show view in various forms.
(async() => {
await db.bulkDocs(getDocsToInstall(20));
await showAllEventDocs();
await showEventCountTotal();
await showEventCountSummary();
await showUserEventCount('Jerry#gmail.com', 'jerry_event_count');
await showUserEventCount('Bobby#gmail.com', 'bobby_event_count');
await showUserEvents('Jerry#gmail.com', 'jerry_events');
await showUserEvents('Bobby#gmail.com', 'bobby_events');
})();
const gel = id => document.getElementById(id);
function getDocDate() {
const today = new Date();
const day = Math.floor(Math.random() * today.getDate()) + 1; // random day of this month, up to today
return new Date(today.getFullYear(), today.getMonth(), day)
}
.bold {
font-weight: bold
}
.plain {
font-weight: normal
}
<script src="https://cdn.jsdelivr.net/npm/pouchdb@7.1.1/dist/pouchdb.min.js"></script>
<script src="https://github.com/pouchdb/pouchdb/releases/download/7.1.1/pouchdb.memory.min.js"></script>
<pre>All user_events (entire view)</pre>
<pre id='user_events_view'></pre>
<hr/>
<pre>Total number of events: <span id='event_count_total'></span> events</pre>
<hr/>
<pre>Event count summary (user, count)</pre>
<pre id='event_count_summary'></pre>
<hr/>
<pre>Event count by email (specific to user)</pre>
<pre>Bobby#gmail.com has <span id='bobby_event_count'></span> events</pre>
<pre>Jerry#gmail.com has <span id='jerry_event_count'></span> events</pre>
<hr/>
<pre>Events by email</pre>
<pre class="bold">Bobby#gmail.com <pre class="plain" id='bobby_events'></pre></pre>
<pre class="bold">Jerry#gmail.com <pre class="plain" id='jerry_events'></pre></pre>
<hr/>
Notice the demo snippet's documents have a date field. If such a field existed in the OP's Event documents, then changing the emit to
emit(doc.email + '/' + doc.date);
would allow all the aforementioned queries plus the option to query by a date or date range, an exercise which I'll leave readers to explore.
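One way to approach that exercise, sketched under the assumption that the view emits doc.email + '/' + doc.date with ISO YYYY-MM-DD dates: the keys are then strings that sort by email first and date second, so a startkey/endkey pair selects a range. The helper names below are hypothetical, and '\ufff0' is the commonly used high sentinel for "everything with this prefix". The local keysInRange check uses plain string comparison, which is only an approximation of CouchDB's collation.

```javascript
// Query options: all events for one email between two dates (inclusive).
function eventsInRange(email, fromDate, toDate) {
  return {
    reduce: false,
    include_docs: true,
    startkey: `${email}/${fromDate}`,
    endkey: `${email}/${toDate}`
  };
}

// Query options: count of all events for one email, whatever the date.
function eventCountFor(email) {
  return {
    reduce: true,
    startkey: `${email}/`,
    endkey: `${email}/\ufff0`
  };
}

// Local illustration of which keys such a range would select.
function keysInRange(keys, { startkey, endkey }) {
  return keys.filter(key => key >= startkey && key <= endkey).sort();
}

const sampleKeys = [
  'Bobby#gmail.com/2021-08-02',
  'Jerry#gmail.com/2021-07-30',
  'Jerry#gmail.com/2021-08-05',
  'Jerry#gmail.com/2021-09-01'
];
console.log(keysInRange(sampleKeys, eventsInRange('Jerry#gmail.com', '2021-08-01', '2021-08-31')));

// Usage against the database (not run here):
// await db.query('SO-68999682/user_events', eventCountFor('Jerry#gmail.com'));
```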

Mongodb update all the documents with unique id

I have a collection named products with almost 100k documents. I want to introduce a new key called secondaryKey, with a unique uuid value, in all the documents.
I am doing this with Node.js.
The problem I am facing:
When I try the query below,
db.collection('products').updateMany({},{"$set":{secondaryKey: uuid()}});
it updates all the documents with the same uuid value.
I tried a loop to update the documents one by one, but the issue is that I don't have a filter value for updateOne, because I want to update all the documents.
Can anyone please help me here?
Thanks :)
If you are using MongoDB version >= 4.4, you can try this:
db.products.updateMany(
{},
[
{
$set: {
secondaryKey: {
$function: {
body: function() {
return UUID().toString().split('"')[1];
},
args: [],
lang: "js"
}
}
}
}
]
);
Output
[
{
"_id": ObjectId("..."),
"secondaryKey": "f41b15b7-a0c5-43ed-9d15-69dbafc0ed29"
},
{
"_id": ObjectId("..."),
"secondaryKey": "50ae7248-a92e-4b10-be7d-126b8083ff64"
},
{
"_id": ObjectId("..."),
"secondaryKey": "fa778a1a-371b-422a-b73f-8bcff865ad8e"
}
]
Since you don't want to put the same value in each document, you have to use a loop.
In the loop, update the current document of the iteration, so filter on its _id in the updateOne call.
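A sketch of that loop using the Node.js driver, with the per-document updates batched into a single bulkWrite call. It assumes Node's built-in crypto.randomUUID() as the id generator (any uuid function would do) and that collection is an already-connected products collection; buildUuidUpdates is a made-up helper name.

```javascript
const { randomUUID } = require('node:crypto');

// One updateOne operation per document, each filtered by that document's _id
// and carrying its own freshly generated uuid.
function buildUuidUpdates(docs) {
  return docs.map(doc => ({
    updateOne: {
      filter: { _id: doc._id },
      update: { $set: { secondaryKey: randomUUID() } }
    }
  }));
}

// Usage against a live connection (not run here):
// const docs = await collection.find({}, { projection: { _id: 1 } }).toArray();
// await collection.bulkWrite(buildUuidUpdates(docs));

// Local demonstration with stand-in documents.
const demoOps = buildUuidUpdates([{ _id: 1 }, { _id: 2 }, { _id: 3 }]);
console.log(demoOps.map(op => op.updateOne.update.$set.secondaryKey));
```

For 100k documents it may be worth sending the operations in chunks rather than as one array.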
The above reply didn't work for me. It also weakens security, because it requires server-side JavaScript to be enabled on the database (see the documentation for $function and for enabling JavaScript on the database). A better way is to avoid loading the server and do the work locally, as below:
const { customAlphabet } = require('nanoid');
// Assumes `client` is an already-connected MongoClient.
async function asdf() {
  const movies = client.db("localhost").collection("productpost");
  // generator for 10-digit numeric ids
  const nanoid = customAlphabet('1234567890', 10);
  const result = await movies.find({}).toArray();
  const result2 = result.map(element => {
    console.log(element.price);
    element.price = 4;
    element.id = nanoid(); // unique id for this document
    return element;
  });
  console.log("out result2", result2);
  // replace the whole collection with the updated documents
  await movies.deleteMany({});
  await movies.insertMany(result2);
}
This deletes every document in the collection and inserts the updated ones, using nanoid to generate the unique ids.
This is the database object array after adding unique id:
{ "_id": { "$oid": "334a98519a20b05c20574dd1" }, "attach": "[\"http://localhost:8000/be/images/2022/4/bitfinicon.png\"]", "title": "jkn jnjn", "description": "jnjn", "price": 4, "color": "After viewing I am 48.73025772956596% more satisfied with life.", "trademark": "", "category": "[]", "productstate": "Published", "createdat": { "$date": "2022-04-03T17:40:54.743Z" }, "language": "en"}
P.S.: Please back up your collection before doing this, or filter the array to your needs so that you don't go through the whole collection.

$push and $set same sub-document in an Array

I'm trying to keep a history of states in a subdocument array with mongoosejs 4.9.5 and mongo 3.2.7.
Example of document structure:
company (Schema)
employees (Schema): [ ]
currentState: String
states (Schema): [ ]
state: String
starts: Date
ends: Date
When I change the employee state, I want to change the currentState, add the new state to the states array, and update the last state to set its 'ends' timestamp.
// I get the last state position from a previous find request
var lastStateIndex = employee.stateHistory.length - 1;
var changeStateDate = new Date();
// Prepare the update
var query = { _id: companyId, "employees._id": employeeId };
var update = {
$set: {
"employees.$.state": newState,
[`employees.$.stateHistory.${lastStateIndex}.ends`]: changeStateDate
},
$push: {
"employees.$.stateHistory": {
state: newState,
starts: changeStateDate
}
}
}
Company.findOneAndUpdate(query, update, { multi:false, new:true}, ... )
Mongo is returning the following error
{"name":"MongoError","message":"Cannot update 'employees.0.stateHistory.0.ends' and 'employees.0.stateHistory' at the same time","ok":0,"errmsg":"Cannot update 'employees.0.stateHistory.0.ends' and 'employees.0.stateHistory' at the same time","code":16837}
Any suggestions on how to avoid running two updates for this?
Is there any workaround to avoid storing the 'ends' date, while still being able to calculate it later from the 'starts' of the next item in the array?
Thank you,
I expected this to already be answered elsewhere, but no other reasonable response seems to exist. As commented, you cannot actually do this in a single update operation because the operations "conflict" on the same path. But .bulkWrite() allows "multiple updates" to be applied in a single request and response.
Company.bulkWrite([
{ "updateOne": {
"filter": { "_id": companyId, "employees._id": employeeId },
"update": {
"$set": {
"employees.$.state": newState,
[`employees.$.stateHistory.${lastStateIndex}.ends`]: changeStateDate
}
}
}},
{ "updateOne": {
"filter": { "_id": companyId, "employees._id": employeeId },
"update": {
"$push": {
"employees.$.stateHistory": {
"state": newState,
"starts": changeStateDate
}
}
}
}}
])
Now of course .bulkWrite() does not return the "modified document" like .findOneAndUpdate() does. So if you need to actually return the document, then you need to add to the Promise chain instead:
Company.bulkWrite([
{ "updateOne": {
"filter": { "_id": companyId, "employees._id": employeeId },
"update": {
"$set": {
"employees.$.state": newState,
[`employees.$.stateHistory.${lastStateIndex}.ends`]: changeStateDate
}
}
}},
{ "updateOne": {
"filter": { "_id": companyId, "employees._id": employeeId },
"update": {
"$push": {
"employees.$.stateHistory": {
"state": newState,
"starts": changeStateDate
}
}
}
}}
]).then( result => {
// maybe inspect the result
return Company.findById(companyId);
})
Of course noting that it is "possible" that another modification can be made to the document in between when the .bulkWrite() is applied and the .findById() is executed. But that is the cost of the operation you are doing.
It is generally best to consider if you actually need the returned document or not. In most instances you simply already have the information and any "updates" you should be aware of because you are "issuing them", and if you want "truly reactive" then you should be listening for other change events on the data through a socket instead.
Note you could simply "chain" the "multiple" .findOneAndUpdate() calls, but this is indeed "multiple" calls and responses from the server, as opposed to the one using .bulkWrite(). So there really isn't anything to gain by doing otherwise.
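On the question's second point (not storing 'ends' at all), one option is to keep only 'starts' and derive each 'ends' from the next entry when reading. A minimal sketch in plain JavaScript, assuming stateHistory is ordered by starts; withDerivedEnds is a hypothetical helper, not a Mongoose API.

```javascript
// Derive each entry's `ends` from the `starts` of the following entry.
// The last entry is the current state, so its `ends` stays null.
function withDerivedEnds(stateHistory) {
  return stateHistory.map((entry, index) => ({
    ...entry,
    ends: index + 1 < stateHistory.length ? stateHistory[index + 1].starts : null
  }));
}

const sampleHistory = [
  { state: 'hired', starts: new Date('2017-01-10T00:00:00Z') },
  { state: 'on-leave', starts: new Date('2017-03-01T00:00:00Z') },
  { state: 'active', starts: new Date('2017-03-15T00:00:00Z') }
];
console.log(withDerivedEnds(sampleHistory));
```

With that approach the update only needs a $set of the employee's current state plus a $push of the new history entry, and those two target different paths, so they do not conflict in a single update.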

Node.js, mongodb and filtered queries

I'm using the native API of mongodb and I'm trying to query the data on my collection.
This is my filter object:
{
email: 'admin#email.it',
login: { '$exists': true }
}
and this is one document that it should find:
{
"_id": "5829cd89a48a7813f0cc7429",
"timestamp": "2016-11-14T14:43:18.705Z",
"login": {
"clientIPaddr": "::1",
"clientProxy": "none"
},
"userData": {
"sessdata": {
"sessionID": "CRTZaqpaUs-ep0J6rvYMBlQTdDakGwle",
"email": "admin#email.it",
"token": "3PlfQBVBoftlIpl-FizeCW5TbYMgcYTl4ZPTkHMVyxqv-TldWb_6U3eusJ27gtI64v7EqjT-KPlUUwkJK7hPnQ"
}
}
}
But the query doesn't return anything! Why?
It doesn't return anything because the email field is in an embedded document within the userData field, so the query looks for a top-level email field, which does not exist in the document.
To make this work, you need to modify the filter, or create a new query object, so that it addresses the embedded field using dot notation, i.e. the query should resemble
{
"userData.sessdata.email": "admin#email.it",
"login": { "$exists": true }
}
You can use the bracket notation to create the required field. For example:
var filter = {
email: 'admin#email.it',
login: { '$exists': true }
},
query = {};
Object.keys(filter).forEach(function(key){
if (key === "email") {
query["userData.sessdata."+key] = filter[key];
} else {
query[key] = filter[key];
}
});
console.log(JSON.stringify(query, null, 4));
Output
{
"userData.sessdata.email": "admin#email.it",
"login": {
"$exists": true
}
}
You can then use the query object in your find() query
collection.find(query).toArray(function(err, docs) {
// access the docs array here
})
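To see why the original filter matched nothing, here is a plain-JavaScript illustration of how a dotted path is resolved against the sample document. It only mimics the lookup for simple nested objects (MongoDB's real matching also handles arrays and more); getPath is a made-up helper and the document is trimmed to the relevant fields.

```javascript
// Resolve a dotted path such as "userData.sessdata.email" against a plain object.
function getPath(obj, path) {
  return path.split('.').reduce(
    (value, part) => (value != null ? value[part] : undefined),
    obj
  );
}

const sampleDoc = {
  login: { clientIPaddr: '::1', clientProxy: 'none' },
  userData: { sessdata: { email: 'admin#email.it' } }
};

console.log(getPath(sampleDoc, 'email'));                   // undefined
console.log(getPath(sampleDoc, 'userData.sessdata.email')); // 'admin#email.it'
```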

Using $in in MongooseJS with nested objects

I've been successfully using $in in my node webservice when my mongo arrays only held ids. Here is sample data.
{
"_id": {
"$oid": "52b1a60ce4b0f819260bc6e5"
},
"title": "Sample",
"team": [
{
"$oid": "52995b263e20c94167000001"
},
{
"$oid": "529bfa36c81735b802000001"
}
],
"tasks": [
{
"task": {
"$oid": "52af197ae4b07526a3ee6017"
},
"status": 0
},
{
"task": {
"$oid": "52af197ae4b07526a3ee6017"
},
"status": 1
}
]
}
Notice that tasks is an array, but the id is nested in "task", while in team it is at the top level. This is where my question lies.
In my Node route, this is how I typically deal with fetching an array of IDs in my project. This works fine for the team example, but obviously not for my task example.
app.get('/api/tasks/project/:id', function (req, res) {
var the_id = req.params.id;
var query = req.params.query;
models.Projects.findById(the_id, null, function (data) {
models.Tasks.findAllByIds({
ids: data._doc.tasks,
query: query
}, function(items) {
console.log(items);
res.send(items);
});
});
});
That communicates with my model which has a method called findAllByIds
module.exports = function (config, mongoose) {
var _TasksSchema = new mongoose.Schema({});
var _Tasks = mongoose.model('tasks', _TasksSchema);
/*****************
* Public API
*****************/
return {
Tasks: _Tasks,
findAllByIds: function(data, callback){
var query = data.query;
_Tasks.find({_id: { $in: data.ids} }, query, function(err, doc){
callback(doc);
});
}
}
}
In this call I have $in: data.ids which works in the simple array like the "teams" example above. Once I nest my object, as with "task" sample, this does not work anymore, and I am not sure how to specify $in to look at data.ids array, but use the "task" value.
I'd like to avoid having to iterate through the data to create an array with only id, and then repopulate the other values once the data is returned, unless that is the only option.
Update
I had a thought of setting up my mongo document like this, although I'd still like to know how to do it the other way, in the event this isn't possible in the future.
"tasks": {
"status0": [
{
"$oid": "52995b263e20c94167000001"
},
{
"$oid": "529bfa36c81735b802000001"
}
],
"status1": [
{
"$oid": "52995b263e20c94167000001"
},
{
"$oid": "529bfa36c81735b802000001"
}
]
}
You can call map on the tasks array to project it into a new array with just the ObjectId values:
models.Tasks.findAllByIds({
ids: data.tasks.map(function(value) { return value.task; }),
query: query
}, function(items) { ...
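If the status values are needed again once the task documents come back, they can be re-attached by id afterwards. A sketch in plain JavaScript, assuming ids compare equal after String(); taskIds and attachStatus are hypothetical helper names, and the sample objects use simplified string ids.

```javascript
// Project the tasks array into just the ids, suitable for $in.
function taskIds(tasks) {
  return tasks.map(entry => entry.task);
}

// Re-attach each entry's status to the task document fetched for it.
function attachStatus(tasks, taskDocs) {
  const byId = new Map(taskDocs.map(doc => [String(doc._id), doc]));
  return tasks.map(entry => ({
    status: entry.status,
    task: byId.get(String(entry.task))
  }));
}

const projectTasks = [
  { task: 'task-a', status: 0 },
  { task: 'task-b', status: 1 }
];
// Stand-in for the documents returned by the $in query.
const fetchedTasks = [
  { _id: 'task-b', title: 'Second' },
  { _id: 'task-a', title: 'First' }
];

console.log(taskIds(projectTasks));
console.log(attachStatus(projectTasks, fetchedTasks));
```

The order of the original tasks array is preserved, regardless of the order in which the query returns the documents.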
Have you tried the $elemMatch operator in the find conditions? http://docs.mongodb.org/manual/reference/operator/query/elemMatch/

Resources