I am currently working on a Node.js project handling data with MongoDB via Microsoft Azure Cosmos DB.
For the purposes of the project, I have a sharded collection (with _id as the shard key) that I would like to empty regularly. I know that this is done using the "deleteMany" instruction with an empty object as its parameter.
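For reference, that call looks like this in the mongo shell (the collection name is illustrative):
db.myCollection.deleteMany({})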
So I tried, and I am now facing a recurring error:
query in command must target a single shard key
I understand the logic behind this error, but I don't know where to start looking for a solution, and I didn't find any help in the MongoDB documentation.
I've read about using hashed shard keys and how this makes their use more "flexible", but I would like to know if there is an easier solution, maybe something I've missed that would allow me to empty the collection without giving all the item ids one by one :)
Any ideas?
Thank you very much!
It appears that this is not currently possible, and that the Azure Cosmos DB team is working on it, with a tentative release date in the first months of this year (2019).
https://feedback.azure.com/forums/263030-azure-cosmos-db/suggestions/34813063-cosmosdb-mongo-api-delete-many-with-partition-ke
Thank you for the help, and sorry for the bother.
You should be able to make a query or delete command by matching any document in the collection that has an _id field:
db.collection.deleteMany({ _id: { $exists: true }})
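For completeness, here is a minimal Node.js sketch of the same workaround (connection string, database, and collection names are placeholders):

const { MongoClient } = require('mongodb');

async function emptyCollection() {
  const client = await MongoClient.connect('mongodb://localhost:27017');
  try {
    const collection = client.db('mydb').collection('myCollection');
    // Filtering on _id satisfies the shard key requirement,
    // since every document has an _id.
    const result = await collection.deleteMany({ _id: { $exists: true } });
    console.log(`Deleted ${result.deletedCount} documents`);
  } finally {
    await client.close();
  }
}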
Currently I'm working on an application that makes queries on a given MarkLogic database, the default one so to speak, but to provide some values on the screen I have to check the role of the logged-in user before displaying them. This can be done by querying the Security database, the one provided by MarkLogic itself, but I don't know how to explicitly declare in the query that I want to query that particular database instead of the default one. Do you know of a command that could help me? Thank you!
You can use eval to query against another database:
xdmp:eval("doc('/docs/mydoc.xml')", (),
  <options xmlns="xdmp:eval">
    <database>{xdmp:database("otherdb")}</database>
  </options>)
See: https://docs.marklogic.com/xdmp:eval
Also, if you are querying the security database specifically, then instead of xdmp:database you can use xdmp:security-database.
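For example, in MarkLogic's Server-Side JavaScript, the same idea looks roughly like this (a sketch; I am assuming xdmp.eval's options object accepts a database property the way the XQuery options element does):

// Evaluate a snippet against the security database instead of the default one.
// The evaluated code string is illustrative.
xdmp.eval(
  "cts.doc('/docs/mydoc.xml')",
  null,
  { database: xdmp.securityDatabase() }
);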
I have the following scenario in a CMS I am building in Node.js with MongoDB. I have three collections: customData, queries, and templates. Queries can depend on customData, and templates can depend on customData, queries, and other templates.

What I need to do is to be able to very quickly figure out all of the documents that depend on a particular item when that item changes. E.g. if a particular customData item changes, I need a list of all the queries and templates that depended on that customData, as well as, recursively, all the templates that depend on those queries and templates. I then need to take that list and flag all of those documents for processing/regeneration. This is accomplished by setting a regenerate property equal to true on each of the documents in the list.

This is the sort of thing that would be perfect for a stored procedure in any database with stored procedures, but I'm struggling to figure out the best solution using MongoDB. Every other need of my project is perfectly suited for Mongo. This is the only scenario that I'm having trouble with.
One solution I've come up with is to store the dependencies on each document as an array of named items as follows (e.g. a doc in the templates collection):
{
  name: "SomeTemplate",
  ...
  dependencies: [
    { type: "query", name: "top5Products" },
    { type: "template", name: "header" }
  ]
}
The object above denotes a template that depends on the query named "top5Products" as well as the template named "header". If either of those documents changes, this template needs to be flagged for regeneration. I can accomplish the above with a getAllDependentsOfItem function that calls the following on both the queries and templates collections, unions the results, and then recursively calls getAllDependentsOfItem on each result.
this.collection.find({dependencies: item })
For instance, if the query changes, I can call it as follows, then call something else to flag all of those results...
let dependents = this.dependencyService.getAllDependentsOfItem({type: "query", name: "top5Products"});
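To illustrate the shape of that recursive lookup (a sketch only; the collection names, the cycle guard, and the type mapping are my assumptions):

// Recursively collect everything that depends on `item`.
// `db` is a connected MongoDB driver Db instance.
async function getAllDependentsOfItem(db, item, seen = new Set()) {
  const dependents = [];
  for (const collName of ['queries', 'templates']) {
    const type = collName === 'queries' ? 'query' : 'template';
    const matches = await db.collection(collName)
      .find({ dependencies: item })  // exact match on an array element
      .toArray();
    for (const doc of matches) {
      const key = `${type}:${doc.name}`;
      if (seen.has(key)) continue;   // guard against cycles and duplicates
      seen.add(key);
      dependents.push({ type, name: doc.name });
      // Anything that depends on this doc is also a dependent of `item`.
      dependents.push(...await getAllDependentsOfItem(db, { type, name: doc.name }, seen));
    }
  }
  return dependents;
}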
This just feels very messy to me, especially wrestling with Promises and the recursion above. I haven't even finished the above, but the whole idea of Promises and recursion just seems like a whole lot of cruft for something that would have been so simple in a stored procedure. I just need the dependent documents flagged, and having to wade through all my layers of Node.js code (CustomDataService, QueryService, TemplateService, DependencyService) to accomplish this just feels wrong.

Not to mention the fact that I keep coming up with a circular dependency between DependencyService and the others. DependencyService needs to call QueryService and TemplateService to actually talk to those collections, and they need to notify DependencyService when something changes. I know there are ways around that, like using events, not having a DependencyService at all, or talking directly to the Mongo driver from DependencyService, but I haven't found a solution that feels right yet.
Another idea I had was to record the dependencies in a completely new collection called "dependencies". Perhaps using a document schema like the following...
{
  name: "SomeTemplate",
  type: "template",
  dependencies: [
    { type: "query", name: "top5Products" },
    { type: "template", name: "header" }
  ]
}
This way the dependencies can be tracked completely separately from the documents themselves. I haven't gotten very far on that solution though.
Any ideas will be greatly appreciated.
Update:
Anyone?
I've since written, in the mongo shell, all the JavaScript that, given the type and name of a changed item, will recursively find all the dependents of that item and update them, setting the regenerate flag on those documents to "1".
My problem now is: how do I run this code on the MongoDB server by calling it from Node.js? I need Node.js to control when this happens and to pass the changed item into it. I've been looking at the eval command, and that just looks like a bad idea; I believe it has been deprecated since MongoDB 3.0.
I can't imagine how this recursive code I wrote using cursors in the mongo shell could be anything but MUCH slower when run from inside Node.js on a different server than the database. All the queries recursively fetching each document, incurring the latency back and forth across servers, then looping through the results to update the regenerate flag on all the dependent documents... I just can't wrap my brain around why this can't and shouldn't be done on the server somehow. It seems like the perfect scenario for some sort of batch, server-side mechanism, like, I dunno, a stored procedure!
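For what it's worth, one way I can see to at least cut down the round trips from Node.js (a sketch, not the server-side mechanism I'm after; it assumes the dependents list produced by the recursive walk above) is to batch all the flag updates into one bulkWrite per collection:

// Flag every dependent in bulk instead of one update per document.
async function flagDependents(db, dependents) {
  const opsByCollection = { queries: [], templates: [] };
  for (const dep of dependents) {
    const collName = dep.type === 'query' ? 'queries' : 'templates';
    opsByCollection[collName].push({
      updateOne: {
        filter: { name: dep.name },
        update: { $set: { regenerate: 1 } }
      }
    });
  }
  for (const [collName, ops] of Object.entries(opsByCollection)) {
    if (ops.length > 0) {
      // One round trip per collection, regardless of document count.
      await db.collection(collName).bulkWrite(ops, { ordered: false });
    }
  }
}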
Please help me figure out either how to do this, or how to do it the "Mongo way". I can post the mongo shell code that is working if it would help.
I'm a newbie in Node.js and MongoDB, so please excuse my silly doubts :D but I need help right now.
{
  "_id": "someid",
  "data": "some_data",
  "subData": [
    {
      "_id": "someid",
      "data": "some_data"
    },
    {
      "_id": "some_id",
      "data": "some_data"
    }
  ]
}
I have a schema like the above, and imagine I have millions of documents in that schema. Now I want to update those documents.
Based on a condition, I want to select a set of them, modify their "subData" arrays, and update them.
I know there is no way to do that in one query (there is an issue at Jira for that feature), but my question now is: what is the most efficient way to update a million records in MongoDB?
Thanks in advance :)
Going by the schema you have posted here, it is good that you are maintaining a specific id for each sub-document; one is added automatically if you are using mongoose (in case the backend is node.js).
I would like to quote something from the issue linked in your post, alongside your main question:

It doesn't just not work for updating multiple items in one array, it also doesn't work for updating a single item in an array for multiple documents.
So our relevant option there goes out the window. There is no way to update large chunks in a single command; you'll have to target them individually.
If you are going to target them individually, it is advisable to target them using the specific unique ids that are generated; to automate the whole process, choose whichever efficient method suits the backend you are using.
You can run several processes in parallel, which will help you complete the task in less time, but it won't be possible to do everything in one go, because MongoDB doesn't support that.
It is also advisable that, in place of maintaining several sub-documents, you go for a separate collection instead, as it will ease the whole process. Maintain a field to map the two collections.
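To sketch what targeting each sub-document individually could look like with the Node.js driver (the collection name and the docsToFix list of { docId, subId, newData } entries are assumptions for illustration), batching the operations keeps the number of round trips down:

// One operation per sub-document, matched by the parent _id and the
// sub-document's own _id; '$' updates the array element matched in the filter.
const ops = docsToFix.map(({ docId, subId, newData }) => ({
  updateOne: {
    filter: { _id: docId, 'subData._id': subId },
    update: { $set: { 'subData.$.data': newData } }
  }
}));
await db.collection('myCollection').bulkWrite(ops, { ordered: false });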
References
https://jira.mongodb.org/browse/SERVER-831
https://jira.mongodb.org/browse/SERVER-1243
https://www.nodechef.com/docs/cloud-search/updates-documents-nested-in-arrays
During the development of my app, I very often add custom new fields to an existing schema, making the 'old' content in my MongoDB 'incomplete' and missing the new fields. This sometimes leads to null content where a value is required, which in my use case is very bad.
My question is: what command/utility do I need to use to make mongoose validate my old documents and potentially add those missing fields, with pre-defined defaults, to the old documents?
I remember reading something about that kind of functionality when I started learning how to use mongoose, but I just can't find it anywhere anymore...
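For context, the kind of one-off backfill I could fall back on looks something like this (the field name and default value are placeholders), but I'm hoping mongoose has something built in that reuses the schema defaults:

// Set a newly added field on all old documents that are missing it.
await MyModel.updateMany(
  { newField: { $exists: false } },
  { $set: { newField: 'someDefault' } }
);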
thanks in advance :)
Amit
I am working on a node.js app, and I've been searching for a way around using the Model.save() function, because I will want to save many documents at the same time, and saving them one by one would be a waste of network and processing.
I found a way to bulk insert. However, my model has two properties that make a document unique, an ID and a HASH (I am getting this info from an API, so I believe I need both pieces of information to make a document unique), so I want an already existing document to be updated instead of a new one being inserted.
Is there any way to do that? I was reading about making concurrent calls to save the objects using Q, but I still think this would generate an unwanted load on the Mongo server, wouldn't it? Does Mongo or Mongoose have a method to bulk insert or update, like it does with insert?
Thanks in advance
I think you are looking for the Bulk.find(<query>).upsert().update(<update>) function.
You can use it this way:
bulk = db.yourCollection.initializeUnorderedBulkOp();
for (<your for statement>) {
  bulk.find({ ID: <your id>, HASH: <your hash> }).upsert().update({ <your update fields> });
}
bulk.execute(<your callback>);
For each document, it will look for a document matching the { ID: <your id>, HASH: <your hash> } criteria. Then:
If it finds one, it will update that document using {<your update fields>}
Otherwise, it will create a new document
As you need, it will not make a round trip to the Mongo server on each iteration of the for loop. Instead, a single call is made when the bulk.execute() line runs.
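For instance, a concrete Node.js version might look like this (a sketch; items is assumed to be the array of { ID, HASH, ... } objects fetched from the API):

// Upsert each API item, keyed by its ID + HASH pair, in a single batch.
const bulk = db.collection('items').initializeUnorderedBulkOp();
for (const item of items) {
  bulk.find({ ID: item.ID, HASH: item.HASH })
      .upsert()
      .update({ $set: item });
}
// One network call sends the whole batch.
const result = await bulk.execute();
console.log(`${result.nUpserted} inserted, ${result.nModified} updated`);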