CouchDB View, list key with duplicate count - couchdb

In CouchDB I have a collection of articles. Each article has a tags property.
I wrote this map function to list all tags in the database:
function (doc) {
  for(var i = 0; i < doc.metaKeywords.length; i++)
    emit(doc.metaKeywords[i], 1)
}
But when it lists all the tags, it shows duplicates. I want each tag to appear only once, with its duplicate count, instead of emitting duplicate rows for the same key.
How should I modify this map function?

The map function is OK, but there is no reason to emit the value 1.
Regardless, the simple builtin reduce function _count and the right query do everything required.
Working with CouchDB demands understanding the B-tree, and its documentation has a great rundown of it in the Reduce/Rereduce section. I highly recommend grokking that information.
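The builtin _count is implemented natively inside CouchDB, but conceptually it behaves like the following JavaScript reduce function (a sketch for intuition only, not CouchDB's actual source): on the first reduce pass it counts emitted rows, and on rereduce it sums the partial counts produced by inner B-tree nodes.

```javascript
// Rough JavaScript equivalent of the builtin _count reduce function.
function count(keys, values, rereduce) {
  if (rereduce) {
    // values are partial counts from inner B-tree nodes: sum them
    return values.reduce((a, b) => a + b, 0);
  }
  // values are the emitted row values: count them
  return values.length;
}
```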
The snippet below highlights its usage via pouchdb. The design document specifies both map and reduce, e.g.
{
  "_id": "_design/SO-72078037",
  "views": {
    "tags": {
      "map": `function (doc) {
        if(doc.tags) doc.tags.forEach(tag => emit(tag));
      }`,
      "reduce": "_count"
    }
  }
}
Straightforward stuff. To get key (tag) counts, the query is as simple as:
{
  reduce: true,
  include_docs: false,
  group_level: 1
}
Again, the CouchDB documentation is great - read up on group level queries.
const gel = id => document.getElementById(id);

async function showReduceDocs(view) {
  let result = await db.query(view, {
    reduce: true,
    include_docs: false,
    group_level: 1
  });
  // show
  gel('view_reduce').innerText = result.rows.map(row => JSON.stringify(row))
    .join('\n');
}

async function showViewDocs(view) {
  let result = await db.query(view, {
    reduce: false,
    include_docs: false
  });
  gel('view_docs').innerText = result.rows.map(row => JSON.stringify(row))
    .join('\n');
}
function getDocsToInstall() {
  return [
    { tags: ["A", "B", "C"] },
    { tags: ["A", "B"] },
    { tags: ["A"] },
    {
      // design document
      "_id": "_design/SO-72078037",
      "views": {
        "tags": {
          "map": `function (doc) {
            if(doc.tags) doc.tags.forEach(tag => emit(tag));
          }`,
          "reduce": "_count"
        }
      }
    }
  ];
}

const db = new PouchDB('SO-72078037', {
  adapter: 'memory'
});

(async() => {
  // install docs and show view in various forms.
  await db.bulkDocs(getDocsToInstall());
  showReduceDocs('SO-72078037/tags');
  showViewDocs('SO-72078037/tags');
})();
<script src="https://cdn.jsdelivr.net/npm/pouchdb@7.1.1/dist/pouchdb.min.js"></script>
<script src="https://github.com/pouchdb/pouchdb/releases/download/7.1.1/pouchdb.memory.min.js"></script>
<div>View: Reduce</div>
<pre id='view_reduce'></pre>
<hr/>
<div>View</div>
<pre id='view_docs'></pre>
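For intuition, the grouped reduce query above is equivalent to the following client-side aggregation over the keys emitted by the map function (a sketch only; in practice the whole point is to let CouchDB's B-tree do this work incrementally):

```javascript
// Count distinct keys the way `reduce: true, group_level: 1` does,
// given the raw keys emitted by the map function.
function countByKey(emittedKeys) {
  const counts = new Map();
  for (const key of emittedKeys) {
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts].map(([key, value]) => ({ key, value }));
}

// The three demo docs above emit these keys:
const rows = countByKey(['A', 'B', 'C', 'A', 'B', 'A']);
// → [ { key: 'A', value: 3 }, { key: 'B', value: 2 }, { key: 'C', value: 1 } ]
```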


How can I rename or map multiple keys to a new key name?

I am working with JSON data:
[
  {
    "name": "Person1",
    "a_miles": "110 mi"
  },
  {
    "name": "Person2",
    "b_miles": "22 mi"
  },
  {
    "name": "Person3",
    "a_miles": "552 mi"
  }
]
I need to rename a_miles or b_miles to "total", but can't seem to get .map() to work since it won't allow mapping multiple keys to one final key. Each item will have either A or B.
This is what I tried so far:
.data(function(data) {
  console.log('checking for KMs');
  for (key in data) {
    console.log(key);
    if (data[key].includes(' km')) {
      console.log('KM found, deleting %s', key)
      delete data[key];
    }
  }
  //console.log(data)
  savedData.push(data);
})
.done(function() {
  var formattedJson = savedData.map(({
    name,
    a_miles: total,
    b_miles: total
  }) => ({
    name,
    total
  }));
Maybe I'm overcomplicating things; I just need a total key/value that replaces a or b so it's consistent throughout the whole array result.
Actually you can apply a .map to an object, as already answered here.
Here is an example:
a.map(obj => {
  Object.keys(obj).map(key => {
    if (key.match('_miles')) {
      obj['total'] = obj[key];
      delete obj[key];
    }
  })
})
console.log(a);
// [ { name: 'Person1', total: '110 mi' },
//   { name: 'Person2', total: '22 mi' },
//   { name: 'Person3', total: '552 mi' } ]
Where a is the array you proposed. Hope it helps.
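A non-mutating variant of the same idea (a sketch; renameMiles is my own name): build fresh objects and rename any key ending in _miles to total on the way through.

```javascript
// Non-mutating rename: any *_miles key becomes "total" in a new object.
const renameMiles = arr => arr.map(obj =>
  Object.fromEntries(
    Object.entries(obj).map(([key, value]) =>
      key.endsWith('_miles') ? ['total', value] : [key, value]
    )
  )
);

const data = [
  { name: 'Person1', a_miles: '110 mi' },
  { name: 'Person2', b_miles: '22 mi' },
  { name: 'Person3', a_miles: '552 mi' }
];
console.log(renameMiles(data));
// → [ { name: 'Person1', total: '110 mi' },
//     { name: 'Person2', total: '22 mi' },
//     { name: 'Person3', total: '552 mi' } ]
```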
Looking at your destructuring: if you JUST want a_miles or b_miles, you can keep your syntax and simply add an OR between them:
.done(function () {
  const formattedJson = savedData.map(({
    name,
    a_miles,
    b_miles
  }) => ({
    name,
    total: a_miles || b_miles
  }))
});

MongoDB - find one and add a new property

Background: Im developing an app that shows analytics for inventory management.
It gets an office Excel file uploaded, and as the file uploads the app converts it to an array of JSON objects. Then it compares each object with the objects in the DB, changes its quantity according to the XLS file, and adds a timestamp to the stamps array, which contains the changes in quantity.
For example:
{
  "_id": "5c3f531baf4fe3182cf4f1f2",
  "sku": 123456,
  "product_name": "Example",
  "product_cost": 10,
  "product_price": 60,
  "product_quantity": 100,
  "Warehouse": 4,
  "stamps": []
}
After the XLS upload, let's say we sold 10 units, it should look like this:
{
  "_id": "5c3f531baf4fe3182cf4f1f2",
  "sku": 123456,
  "product_name": "Example",
  "product_cost": 10,
  "product_price": 60,
  "product_quantity": 90,
  "Warehouse": 4,
  "stamps": [{ "1548147562": -10 }]
}
Right now I can't find the right MongoDB commands to do it. I'm developing in Node.js and Angular, and would love to read some ideas.
for (let i = 0; i < products.length; i++) {
  ProductsDatabase.findOneAndUpdate(
    { "_id": products[i]['id'] },
    //CHANGE QUANTITY AND ADD A STAMP
    ...
}
You would need two operations here. The first is to get an array of documents from the db that match the ones in the JSON array. From that list you compare the 'product_quantity' keys and, if there is a change, create a new array of objects with the product id and change in quantity.
The second operation is an update which uses this new array with the change in quantity for each matching product.
Armed with this new array of updated product properties, it would be ideal to use a bulk update, as looping through the list and sending each update request to the server individually can be computationally costly.
Consider using the bulkWrite method, which is on the model. This accepts an array of write operations and executes each of them; a typical update operation for your use case has the following structure:
{ updateOne :
  {
    "filter": <document>,
    "update": <document>,
    "upsert": <boolean>,
    "collation": <document>,
    "arrayFilters": [ <filterdocument1>, ... ]
  }
}
So your operations would follow this pattern:
(async () => {
  let bulkOperations = []
  const ids = products.map(({ id }) => id)
  const matchedProducts = await ProductDatabase.find({
    '_id': { '$in': ids }
  }).lean().exec()

  for (let product of products) { // for...of, not for...in
    const [matchedProduct, ...rest] = matchedProducts.filter(
      p => String(p._id) === String(product.id) // compare ObjectId and string safely
    )
    const { _id, product_quantity } = matchedProduct
    const changeInQuantity = product.product_quantity - product_quantity
    if (changeInQuantity !== 0) {
      const stamps = { [(new Date()).getTime()]: changeInQuantity }
      bulkOperations.push({
        'updateOne': {
          'filter': { _id },
          'update': {
            '$inc': { 'product_quantity': changeInQuantity },
            '$push': { stamps }
          }
        }
      })
    }
  }

  const bulkResult = await ProductDatabase.bulkWrite(bulkOperations)
  console.log(bulkResult)
})()
You can use mongoose's findOneAndUpdate to update the existing value of a document.
"use strict";

const ids = products.map(x => x._id);

let operations = products.map(xlProductData => {
  return ProductsDatabase.find({
    _id: { $in: ids }
  }).then(products => {
    return products.map(productData => {
      return ProductsDatabase.findOneAndUpdate({
        _id: xlProductData.id // or product._id
      }, {
        sku: xlProductData.sku,
        product_name: xlProductData.product_name,
        product_cost: xlProductData.product_cost,
        product_price: xlProductData.product_price,
        Warehouse: xlProductData.Warehouse,
        product_quantity: productData.product_quantity - xlProductData.product_quantity,
        $push: {
          stamps: {
            [new Date().getTime()]: -1 * xlProductData.product_quantity
          }
        },
        updated_at: new Date()
      }, {
        upsert: false,
        returnNewDocument: true
      });
    });
  });
});

Promise.all(operations).then(() => {
  console.log('All good');
}).catch(err => {
  console.log('err ', err);
});
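Both answers hinge on the same small calculation: the signed quantity delta and the timestamped stamp entry to push. Pulled out as a standalone helper (the function name is mine), it is easy to unit test:

```javascript
// Compute the quantity change and the stamp entry to push.
// Returns null when nothing changed so the caller can skip the update.
function makeStamp(dbQuantity, xlsQuantity, timestamp = Date.now()) {
  const changeInQuantity = xlsQuantity - dbQuantity;
  if (changeInQuantity === 0) return null;
  return { changeInQuantity, stamp: { [timestamp]: changeInQuantity } };
}

// Selling 10 units: the DB says 100, the uploaded sheet says 90.
console.log(makeStamp(100, 90, 1548147562));
// → { changeInQuantity: -10, stamp: { '1548147562': -10 } }
```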

How can I group items close to each other without knowing the distribution?

I'm preparing some data about sold apartment prices for regression analysis. One category is what street the houses are on, but some streets have very different areas, so I want to make a category with the combination of construction year and street name.
Broadway 1910
Broadway 2001
For example, my challenge is that sometimes the construction spans several years. The data is from Sweden, known for huge centralized housing projects. I would like to group these houses together into a period somehow. This is my current code. I know it's not very efficient, but it will only run once on a modest dataset.
(async () => {
  let client;
  try {
    client = await MongoClient;
    let collection = client.db("booliscraper").collection("sold");
    let docs = await collection.find();
    await docs.forEach((sale) => {
      sale.street = sale.location.address.streetAddress.split(/[0-9]/)[0] + sale.location.namedAreas[0]
      sale.streetYear = sale.street + " " + sale.constructionYear
      log(sale);
      collection.replaceOne({ _id: ObjectId(sale._id) }, sale) // was: doc (undefined)
    });
    client.close();
  } catch(err) {
    log(err)
  }
})()
As you correctly said, your current code is inefficient for huge datasets. Instead of making a server round-trip per document with replaceOne inside your forEach loop, you can create an aggregate query that computes the category fields you want with the $group pipeline and pushes the documents that fall into each category into an array, which you then use for a bulk update.
For the bulk update you can use bulkWrite method on the collection that will have multiple updateMany operations.
The following operation shows the intuition above in practice:
(async () => {
  try {
    let client = await MongoClient;
    let collection = client.db("booliscraper").collection("sold");
    let pipeline = [
      { '$group': {
        '_id': {
          'street': {
            '$concat': [
              // NB: $split only accepts a literal string delimiter, not a
              // regex, so splitting on "any digit" needs different handling
              // (e.g. client-side, as in the original code).
              { '$arrayElemAt': [
                { '$split': [ '$location.address.streetAddress', /[0-9]/ ] },
                0
              ] },
              { '$arrayElemAt': [ '$location.namedAreas', 0 ] }
            ]
          },
          // NB: '$street' here reads a field from the original document, not
          // the value computed above; $concat also requires string operands,
          // so a numeric constructionYear needs $toString first.
          'streetYear': { '$concat': [ '$street', ' ', '$constructionYear' ] }
        },
        'ids': { '$push': '$_id' }
      } }
    ]

    let docs = await collection.aggregate(pipeline).toArray();
    let ops = docs.map(({ _id, ids }) => ({
      'updateMany': {
        'filter': { '_id': { '$in': ids } },
        'update': { '$set': {
          'street': _id.street, 'streetYear': _id.streetYear
        } }
      }
    }));
    let result = await collection.bulkWrite(ops);
    log(result)
    client.close()
  } catch(err) {
    log(err)
  }
})()
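The grouping key itself is plain string manipulation; a client-side version of the derivation (a sketch mirroring the forEach code from the question, with a function name of my own) is much easier to test than the aggregation pipeline:

```javascript
// Derive the category keys used for grouping: street name (address up to the
// first digit, plus the first named area) and street + construction year.
function categoryKeys(sale) {
  const street =
    sale.location.address.streetAddress.split(/[0-9]/)[0] +
    sale.location.namedAreas[0];
  return { street, streetYear: `${street} ${sale.constructionYear}` };
}

const sale = {
  constructionYear: 1910,
  location: {
    address: { streetAddress: 'Broadway 12' },
    namedAreas: ['Midtown']
  }
};
console.log(categoryKeys(sale));
// → { street: 'Broadway Midtown', streetYear: 'Broadway Midtown 1910' }
```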

How to implement map function of Mongodb cursor in node.js (node-mondodb-native)

I am trying to implement following MongoDB query in NodeJS
db.tvseries.find({}).map(function(doc) {
  var userHasSubscribed = false;
  doc.followers && doc.followers.forEach(function(follower) {
    if (follower.$id == "abc") {
      userHasSubscribed = true;
    }
  });
  var followers = doc.followers && doc.followers.map(function(follower) {
    var followerObj;
    db[follower.$ref].find({
      "_id": follower.$id
    }).map(function(userObj) {
      followerObj = userObj;
    });
    return followerObj;
  });
  return {
    "id": doc.name,
    "userHasSubscribed": userHasSubscribed,
    "followers": followers || []
  };
})
Following is the db
users collection
{
  "id": ObjectId("abc"),
  "name": "abc_name"
},
{
  "id": ObjectId("def"),
  "name": "def_name"
},
{
  "id": ObjectId("ijk"),
  "name": "ijk_name"
}
tvseries collection
{
  "id": ObjectId("123"),
  "name": "123_name",
  "followers": [
    { "$ref": "users", "$id": ObjectId("abc") },
    { "$ref": "users", "$id": ObjectId("def") }
  ]
},
{
  "id": ObjectId("456"),
  "name": "456_name",
  "followers": [
    { "$ref": "users", "$id": ObjectId("ijk") }
  ]
},
{
  "id": ObjectId("789"),
  "name": "789_name"
}
I am not able to figure out how to execute the above MongoDB query in NodeJS with the node-mongodb-native driver.
I tried the code below, but I get TypeError: undefined is not a function at .map
var collection = db.collection('users');
collection.find({}).map(function(doc) {
  console.log(doc);
});
How to execute .map function in NodeJS?
Thanks in advance
I struggled with this for some time. I found that adding .toArray() after the map function works.
You could even skip map and just add .toArray() to get all the document fields.
const accounts = await _db
  .collection('accounts')
  .find()
  .map(v => v._id) // leaving this out gets you all the fields
  .toArray();
console.log(accounts); // [{_id: xxx}, {_id: xxx} ...]
Please note that for map to work the function used must return something - your example only console.logs without returning a value.
The forEach solution works, but I really wanted map to work.
I know that I'm pretty late, but I arrived here by searching Google for the same problem. Finally, I wasn't able to use the map function for it, but forEach did the trick.
An example using ES6 and StandardJS.
let ids = []
let PublicationId = ObjectID(id)

feeds_collection
  .find({ PublicationId })
  .project({ _id: 1 })
  .forEach((feed) => {
    ids.push(feed._id)
  }, () => done(ids))
To echo @bamse's answer, I got it working with .toArray(). Here is an async example:
async function getWordArray(query) {
  const client = await MongoClient.connect(url)
  const collection = client.db('personal').collection('wordBank')
  const data = await collection.find(query).map(doc => doc.word).toArray()
  return data
}
Then I use it in my Express route like this:
app.get('/search/:fragment', asyncMiddleware(async (req, res, next) => {
  const result = await getWordArray({ word: 'boat' })
  res.json(result)
}))
Finally, if you need a guide to async/await middleware in NodeJS, here is one: https://medium.com/@Abazhenov/using-async-await-in-express-with-node-8-b8af872c0016
map returns a cursor; toArray returns a Promise that will exhaust the cursor and return its results. That may be an array of the original query's results (find, limit, etc.) or a promise of those results piped through a function.
This is typically useful when you want to take the documents of the cursor and process them (maybe fetch something else) while the cursor is still fetching documents, as opposed to waiting until they have all been fetched into node memory.
Consider the example
let foos = await db.collection("foos")
  .find()
  .project({
    barId: 1
  })
  .toArray() // returns a Promise<{barId: ObjectId}[]>

// we now have all foos in memory, time to get bars
let bars = await Promise.all(foos.map(doc => db
  .collection("bars")
  .findOne({
    _id: doc.barId
  })))
this is roughly equivalent to
bars = await db.collection("foos")
  .find()
  .project({
    barId: 1
  })
  .toArray() // returns a Promise<{barId: ObjectId}[]>
  .then(docs => docs
    .map(doc => db
      .collection("bars")
      .findOne({
        _id: doc.barId
      })))
using map you can perform the operation asynchronously and (hopefully) more efficiently
bars = await db.collection("foos")
  .find()
  .project({
    barId: 1
  })
  .map(doc => db
    .collection("bars")
    .findOne({
      _id: doc.barId
    }))
  .toArray()
  .then(barPromises => Promise.all(barPromises)) // Promise<Bar[]>
The main point is that map is simply a function to be applied to the results fetched by the cursor. That function won't get executed until you drain the cursor, using either forEach or, more sensibly, toArray.
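The laziness is easy to demonstrate without a database using a tiny cursor-like stand-in (a sketch, not the driver's actual implementation): map only records the transform, and nothing runs until toArray drains the cursor.

```javascript
// Minimal stand-in for a driver cursor: map is lazy, toArray executes.
class FakeCursor {
  constructor(docs) { this.docs = docs; this.fns = []; }
  map(fn) { this.fns.push(fn); return this; } // record the transform, don't run it
  async toArray() {                           // apply recorded transforms on drain
    return this.docs.map(doc => this.fns.reduce((d, fn) => fn(d), doc));
  }
}

(async () => {
  const cursor = new FakeCursor([{ _id: 1, word: 'boat' }, { _id: 2, word: 'sea' }]);
  const mapped = cursor.map(doc => doc.word); // nothing has executed yet
  console.log(await mapped.toArray());        // → [ 'boat', 'sea' ]
})();
```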

Exclude fields from result in MongoDB monk

I want to exclude some fields from result.
I have code:
users = db.select('users');
users.find({}, { sort: { points: 1 }, privateKey: 0, publicKey: 0 }, function(err, data) {
  res.send(data);
});
I want to exclude private and public key from results.
Can I do that using monk?
You can also do it like this:
users.find({}, { sort: { points: 1 }, fields: { privateKey: 0, publicKey: 0 } },
  function(err, data) {
    res.send(data);
  }
);
According to the documentation, the first argument of find is the filter and the second is the projection, but you have put sort there, which it will not be able to interpret - you are confusing projection with sort. Sorting should come after find and projection.
You can write a projection like { field1: <boolean>, field2: <boolean>, ... }
Note:
The find() method always includes the _id field even if the field is not explicitly stated to return in the projection parameter.
users.find({}, { privateKey: 0, publicKey: 0 }).sort({ points: 1 }).toArray(
  function(err, data) {
    res.send(data);
  });
For me, I needed to use the .project() method:
const someFunction = async () => {
  const result = await users
    .find({}, { sort: { points: 1 } })
    .project({ privateKey: 0, publicKey: 0 });
};
This is what worked for me for excluding the _id field.
const courseChapters = await db
  .collection("users")
  .find({}, { projection: { _id: 0 } })
  .toArray();
So the example in the question would look something like this (note that MongoDB does not allow mixing inclusions and exclusions in one projection, except for _id):
users.find(
  {},
  { projection: { privateKey: 0, publicKey: 0 } },
  function (err, data) {
    res.send(data);
  }
);
Check out this other answer, which says you may need the fields option instead of projection depending on your driver.
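If the driver or monk version at hand ignores the projection option entirely, a client-side fallback is to strip the keys after fetching (a sketch; note the secret fields still cross the wire, so a real server-side projection is preferable):

```javascript
// Remove the listed keys from a result document without mutating it.
const omit = (doc, keys) =>
  Object.fromEntries(Object.entries(doc).filter(([key]) => !keys.includes(key)));

const users = [{ _id: 1, points: 5, privateKey: 'p1', publicKey: 'q1' }];
const safe = users.map(u => omit(u, ['privateKey', 'publicKey']));
console.log(safe); // → [ { _id: 1, points: 5 } ]
```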
