Batch updates reporting contention error using Node.js

I am trying to update a collection that contains 1400+ offices. After checking and running the query, I update a document in the collection and also update a subcollection with a few details, but sometimes I get this error:
10 ABORTED: Too much contention on these documents. Please try again.
I am simply using a batch for writing to the documents. Here is my code for the update in the collection:
batch.set(
  rootCollections.x.doc(getISO8601Date())
    .collection(subCollection.y)
    .doc(change.after.get('Id')),
  {
    officeId: change.after.get('Id'),
    office: change.after.get('office'),
    status: change.after.get('status'),
    creationTimestamp:
      change.after.get('creationTimestamp') ||
      change.after.get('createTimestamp') ||
      change.after.get('timestamp'),
    activeUsers: [...new Set(phoneNumbers)].length,
    confirmedUers: activityCheckinSubscription.docs.length,
    uniqueActivities: [...new Set(activities)].length,
    payments: 0,
    contact: [
      `${change.after.data().creator.phoneNumber},${
        change.after.data().attachment['First Contact'].value
      }`,
    ],
  },
  { merge: true },
);
batch.set(
  rootCollections.x.doc(getISO8601Date()),
  {
    Added: admin.firestore.FieldValue.increment(1),
  },
  { merge: true },
);
PromiseArray.push(batch.commit());
await Promise.all(PromiseArray);

It seems you are facing the same issue as in this similar case here, where thousands of records in the database were being updated. As clarified there, there is a limit on how many writes you can perform on a single document per second - more details here - and even though Firestore might keep up with faster writes for a while, it will eventually fail.
As this limit is hard-coded and imposed by Firestore, what you can try is the solution explained in this similar case here: either switch to the Realtime Database, where the limit is not the number of writes but the size of the data, or, if you are using a counter or some other data aggregation in Firestore, use a distributed counter solution, which you can read more about here.
To summarize, there is not much you can do other than work around it with that solution, as this is a documented limitation of Firestore.
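In this question the contention most likely comes from many parallel invocations incrementing the same daily document (the Added field). Below is a minimal sketch of the distributed counter idea using the firebase-admin SDK; the 'shards' subcollection name, the shard count, and the incrementAdded/getAddedTotal helpers are illustrative assumptions, not part of the original code.

const admin = require('firebase-admin');
const NUM_SHARDS = 10;

// Instead of incrementing a single 'Added' field on the daily doc (one hot
// document), increment one of N shard documents chosen at random.
async function incrementAdded(dailyDocRef) {
  const shardId = Math.floor(Math.random() * NUM_SHARDS).toString();
  return dailyDocRef
    .collection('shards')
    .doc(shardId)
    .set(
      { Added: admin.firestore.FieldValue.increment(1) },
      { merge: true },
    );
}

// The total is reconstructed by summing the shards when it is read.
async function getAddedTotal(dailyDocRef) {
  const snapshot = await dailyDocRef.collection('shards').get();
  return snapshot.docs.reduce((sum, doc) => sum + (doc.get('Added') || 0), 0);
}

Each write now lands on one of ten documents instead of one hot document, which raises the effective write throughput at the cost of a small read-time aggregation.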

Related

Index new document and get the indexed document in the same query

Is it possible to index a new document and return it after it has been indexed?
I tried to use the _id that is returned, but I'm using two queries, and since the index action takes some time, the second query does not always find the _id, so it does not work reliably.
This is the query that indexes the document:
const query = await elasticClient.index({
  routing: "dasdsad34_d",
  index: "milan",
  body: {
    text: "san siro",
    user: {
      user_id: "3",
      username: "maldini",
    },
    tags: ["Forza Milan", "grande milan"],
    publish_date: new Date(),
    likes: [],
    users_tags: [1, 5],
    type: {
      name: "comment",
      parent: "dasdsad34_d",
    },
  },
});
No, it's not possible with the default behavior. By default, Elasticsearch only offers near real-time search: its default refresh interval is 1 second, because an index refresh is a costly operation.
To overcome this, you can add refresh=true to your indexing operation. You can get further details from the links below.
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-refresh.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html
Please note that this is NOT a recommended option, as it comes with a huge overhead. Only use it if the insert rate into the index in question is very low.
The recommended way is to use refresh=wait_for on your indexing operation. The downside is that the call waits for the natural refresh to complete, which can take up to a second. If your refresh interval is the default 1 second and that is an acceptable trade-off, this is the way to go.
However, if you have a higher refresh interval set, the wait time for the indexing operation will be as high as the refresh interval, so choose your option carefully.
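For reference, a minimal sketch of what refresh=wait_for looks like with the Elasticsearch JavaScript client, reusing the index call from the question (the elasticClient instance and index name come from there; the follow-up comment is only illustrative):

const response = await elasticClient.index({
  index: "milan",
  routing: "dasdsad34_d",
  refresh: "wait_for", // resolve only once the document is searchable; use true to force an immediate (costly) refresh
  body: {
    text: "san siro",
    user: { user_id: "3", username: "maldini" },
  },
});

// The _id in the response is now safe to use in an immediate follow-up search.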

Mongo Updates being super slow

We are facing a timeout issue with our mongo updates. Our collection currently contains around 300 thousand documents. When we try to update a record via the UI, the server times out and the UI is stuck in limbo.
Lead.updateOne({
  _id: body.CandidateID
}, {
  $set: {
    ingestionStatus: 'SUBMITTED',
    program: body.program,
    participant: body.participant,
    promotion: body.promotion,
    addressMeta: body.addressMeta,
    CreatedByID: body.CreatedByID,
    entryPerson: body.entryPerson,
    lastEnteredOn: body.lastEnteredOn,
    zipcode: body.zipcode,
    state: body.state,
    readableAddress: body.readableAddress,
    promotionId: body.promotionId,
    programId: body.programId,
    phone1: body.phone1,
    personId: body.personId,
    lastName: body.lastName,
    hasSignature: body.hasSignature,
    firstName: body.firstName,
    city: body.city,
    email: body.email,
    addressVerified: body.addressVerified,
    address: body.address,
    accountId: body.accountId
  }
});
This is how we update a single record. We are using mlab and Heroku in our stack. Looking for advice on how to speed this up considerably.
Thank you.
If your indexes are fine, then you could try rebuilding the indexes on this collection.
For example, rebuild the lead collection indexes from the mongo command line:
db.lead.reIndex();
Reference:
https://docs.mongodb.com/v3.2/tutorial/manage-indexes/
https://docs.mongodb.com/manual/reference/command/repairDatabase/
If you are not already doing this, try the following.
Index builds can block write operations on your database, so you don't want to build indexes in the foreground on large collections during peak usage. You can build indexes in the background instead by specifying background: true when creating them:
db.collection.createIndex({ a:1 }, { background: true })
This will ultimately take longer to complete, but it will not block operations and will have less of an impact on performance.
1) Shard the Lead collection using _id as the shard key.
2) Check that the memory used by MongoDB's indexes is less than the memory available on the MongoDB server.
Have you tried what this answer suggests? Namely, updating with no write-concern?
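For completeness, here is a minimal, hedged sketch of an unacknowledged update using the native MongoDB Node.js driver; the db handle and the 'leads' collection name are assumptions for illustration (with Mongoose an equivalent write concern can be passed through the query options, and on older driver versions the option is the top-level w: 0 instead of writeConcern):

// Fire-and-forget update: w: 0 tells the server not to send an acknowledgement,
// so the call returns quickly, but you also get no confirmation the write succeeded.
const result = await db.collection('leads').updateOne(
  { _id: body.CandidateID },
  { $set: { ingestionStatus: 'SUBMITTED', program: body.program /* ...other fields... */ } },
  { writeConcern: { w: 0 } }
);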

MongoDB - two updates in sequence overlap each other

We are building a size calculation mechanism for our system.
In order to calculate sizes, we start with the first atomic operation - findAndModify - to find the object and add lock properties to it (to prevent other calculations for this object from touching it; since many calculations can run in parallel, the others should be postponed until this one finishes). Then we calculate the sizes of specific properties, and after this operation we add the metadata to the object and delete the locks.
However, it seems that sometimes, when we have a lot of calculations for a single object (especially when we calculate a lot of objects in parallel), some updates aren't executed.
_size metadata during calculation looks like this:
{
  _lockedAt: SomeDate,
  _transactionId: 'abc'
}
And after calculation it should look like this:
{
  somePropertySize: 123,
  anotherPropertySize: 1245,
  (...)
  _total: 131431523 // Some number
  // Notice that both _lockedAt and _transactionId should be missing
}
And this is what our update flow looks like:
return Promise.coroutine(function * () {
  yield object.findOneAndUpdate({
    '_id': gemId,
    '_size._lockedAt': {
      $exists: false
    }
  }, {
    $set: {
      '_size._lockedAt': moment.utc().toDate(),
      '_size._transactionId': transactionId
    }
  }).then(results => results.value);

  // Calculations are performed here, new _size object is built

  yield object.findOneAndUpdate({
    _id: gemId,
    _lockedAt: {
      $exists: true // We tried both with and without this property, does not change anything
    }
  }, {
    $set: {
      _size: newSizeObject
    }
  });
})()
An example real-life _size object JUST before the second update (truncated for brevity):
{
  title: 11,
  description: 2,
  detailedSection: 0,
  tags: 2,
  file: 5625898,
  _total: 5625913
}
For some reason, when we run multiple calculations close together (for new objects, without a _size property at all), the object sometimes ends up with a _size that looks exactly as it did right after locking, even though the logs show that everything went well (the calculations completed, the new sizes object was built, and the second DB update was called).
We use MongoDB 3.0 with two replica sets. Any ideas on what is happening?
Put the second update inside the then callback so it waits until the first promise resolves:
object.findOneAndUpdate({
  '_id': gemId,
  '_size._lockedAt': {
    $exists: false
  }
}, {
  $set: {
    '_size._lockedAt': moment.utc().toDate(),
    '_size._transactionId': transactionId
  }
}).then(results => {
  // Calculations are performed here, new _size object is built
  return object.findOneAndUpdate({
    _id: gemId,
    _lockedAt: {
      $exists: true // We tried both with and without this property, does not change anything
    }
  }, {
    $set: {
      _size: newSizeObject
    }
  });
}).catch(err => console.error(err));
Also make sure you have error handling for your promises using catch.
If you don't really need the lock or transaction fields, I would remove them. If you do need them, something like RethinkDB may work a little better, or PostgreSQL could give you real transactions.
All in all, I checked the code very carefully, and what was actually happening was that a completely different part of the code was querying the object from the DB and then, after a few other operations (mine included), writing the whole object back to the DB (hence overwriting my changes).
So, an important note for every MongoDB user - remember that MongoDB is not transactional, but its individual operations are atomic: it guarantees that a single operation is applied and persisted, but it does not guarantee that data read between operations stays unchanged.
To sum up, things I learned from this example:
NEVER update a whole object in the database with data obtained from it some time before (e.g. by querying it, changing some properties and saving it again).
USE $set, $inc, $unset and the other update operators (a short sketch follows this list). If you have a lot of parameters, use e.g. the mongo-dot-notation npm library to flatten your data into a $set selector.
If something unexpected is happening with your data (e.g. missing properties after saving), the first thing to investigate is other pending operations on those specific entities.
The least probable cause of your problems is MongoDB itself. It's usually code that does not follow the atomicity rules (which probably happens to a lot of people used to transactional DBs :)).
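As referenced above, a minimal sketch of the $set-only approach, based on the update from the question; object, gemId and newSize stand in for the question's own variables, and the call is assumed to run in an async context:

// Update only the fields this operation owns instead of writing back a whole
// object that was read earlier (which silently overwrites concurrent changes).
await object.findOneAndUpdate(
  { _id: gemId },
  {
    $set: {
      '_size.file': newSize.file,
      '_size._total': newSize._total,
    },
    $unset: {
      '_size._lockedAt': '',
      '_size._transactionId': '',
    },
  },
);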

Mongoose returning inconsistent results

I'm experiencing a strange problem in Mongoose related to find queries. When I run the query below, I get a variable number of results. I will get a consistent 210 results when querying in Mongo, but usually get between 198-210 results when doing the same thing through Mongoose. I've tried the query with and without indexes set.
Any suggestions on what might be causing this would be greatly appreciated.
Customer Model:
subscriptions: [
  {
    renewal: {
      type: Boolean,
      default: false
    }
  }
]
Query
Customer.find({ "subscriptions.renewal": true }, {}, { timeout: false })
The problem ultimately cleared up when I removed the Customer collection indexes from Mongo (not just the definitions in the schema). Anyone experiencing this issue might want to give that a try.
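For anyone who wants to try the same thing, a minimal sketch from the mongo shell, assuming the collection is named customers; dropIndexes() removes every index except the default one on _id, and the indexes you actually need can be recreated afterwards:

// Drop all non-_id indexes on the collection, then rebuild the ones you want.
db.customers.dropIndexes()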

How to bulk save an array of objects in MongoDB?

I have looked a long time and not found an answer. The Node.JS MongoDB driver docs say you can do bulk inserts using insert(docs) which is good and works well.
I now have a collection with over 4,000,000 items, and I need to add a new field to all of them. Usually mongodb can only write 1 transaction per 100ms, which means I would be waiting for days to update all those items. How can I do a "bulk save/update" to update them all at once? update() and save() seem to only work on a single object.
pseudo-code:
var stuffToSave = [];
db.collection('blah').find({}, function(err, stuff) {
  stuff.toArray().forEach(function(item) {
    item.newField = someComplexCalculationInvolvingALookup();
    stuffToSave.push(item);
  });
});
db.saveButNotSuperSlow(stuffToSave);
Sure, I'll need to put some limit on it, like doing 10,000 at a time rather than all 4 million at once, but I think you get the point.
MongoDB allows you to update many documents that match a specific query using a single db.collection.update(query, update, options) call, see the documentation. For example,
db.blah.update(
  { },
  {
    $set: { newField: someComplexValue }
  },
  {
    multi: true
  }
)
The multi option allows the command to update all documents that match the query criteria. Note that the exact same thing applies when using the Node.js driver; see that documentation.
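For reference, a minimal sketch of the same update through the Node.js driver, where updateMany is the modern equivalent of update with { multi: true }; the connection string, database name and someComplexValue are placeholders:

const { MongoClient } = require('mongodb');

async function addFieldToAll() {
  const client = await MongoClient.connect('mongodb://localhost:27017');
  try {
    const result = await client
      .db('test')
      .collection('blah')
      .updateMany({}, { $set: { newField: someComplexValue } });
    console.log(result.modifiedCount + ' documents updated');
  } finally {
    await client.close();
  }
}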
If you're performing many different updates on a collection, you can wrap them all in a Bulk() builder to avoid some of the overhead of sending multiple updates to the database.
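When each document needs its own computed value (as with the question's someComplexCalculationInvolvingALookup), a bulk write is the better fit. Here is a minimal sketch using the driver's bulkWrite, which supersedes the older Bulk() builder; passing the document to the lookup and the 1,000-operation chunk size are illustrative choices:

// Build one updateOne operation per document, then send them in chunks so a
// single round trip carries many updates instead of one per document.
const docs = await db.collection('blah').find({}).toArray();

const ops = docs.map(function(item) {
  return {
    updateOne: {
      filter: { _id: item._id },
      update: { $set: { newField: someComplexCalculationInvolvingALookup(item) } },
    },
  };
});

for (let i = 0; i < ops.length; i += 1000) {
  await db.collection('blah').bulkWrite(ops.slice(i, i + 1000), { ordered: false });
}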
