We're having an issue in our main data synchronization back-end function. Our client's mobile device pushes changes daily, but last week they warned us that some changes had not been applied in the main web app.
After investigating the logs, we found that there is indeed a single transaction that fails and rolls back. However, it appears that all the transactions before it are rolled back as well.
The code works this way. The data to synchronize is an array of "changesets", and each changeset can update multiple tables at once. It's important that a changeset be applied completely or not at all, so each is wrapped in a transaction. Then the transactions are executed one after the other. If a transaction fails, the others shouldn't be affected.
I suspect that all the transactions are actually combined somehow, possibly through the main db.task. Instead of just looping to execute the transactions, we're using a db.task to execute them in a batch and avoid update conflicts on the same tables.
Any advice on how we could execute these transactions in a batch and avoid this rollback issue?
Thanks, here's a snippet of the synchronization code:
// Begin task that will execute transactions one after the other
db.task(task => {
const transactions = [];
// Create a transaction for each changeset (propriete/fosse/inspection)
Object.values(data).forEach((change, index) => {
const logchange = { tx: index };
const c = {...change}; // Use a clone of the original change object
transactions.push(
task.tx(t => {
const queries = [];
// Propriete
if (Object.keys(c.propriete.params).length) {
const params = proprietes.parse(c.propriete.params);
const propriete = Object.assign({ idpropriete: c.propriete.id }, params);
logchange.propriete = { idpropriete: propriete.idpropriete };
queries.push(t.one(`SELECT ${Object.keys(params).join()} FROM propriete WHERE idpropriete = $1`, propriete.idpropriete).then(previous => {
logchange.propriete.previous = previous;
return t.result('UPDATE propriete SET' + qutil.setequal(params) + 'WHERE idpropriete = ${idpropriete}', propriete).then(result => {
logchange.propriete.new = params;
})
}));
}
else delete c.propriete;
// Fosse
if (Object.keys(c.fosse.params).length) {
const params = fosses.parse(c.fosse.params);
const fosse = Object.assign({ idfosse: c.fosse.id }, params);
logchange.fosse = { idfosse: fosse.idfosse };
queries.push(t.one(`SELECT ${Object.keys(params).join()} FROM fosse WHERE idfosse = $1`, fosse.idfosse).then(previous => {
logchange.fosse.previous = previous;
return t.result('UPDATE fosse SET' + qutil.setequal(params) + 'WHERE idfosse = ${idfosse}', fosse).then(result => {
logchange.fosse.new = params;
})
}));
}
else delete c.fosse;
// Inspection (rendezvous)
if (Object.keys(c.inspection.params).length) {
const params = rendezvous.parse(c.inspection.params);
const inspection = Object.assign({ idvisite: c.inspection.id }, params);
logchange.rendezvous = { idvisite: inspection.idvisite };
queries.push(t.one(`SELECT ${Object.keys(params).join()} FROM rendezvous WHERE idvisite = $1`, inspection.idvisite).then(previous => {
logchange.rendezvous.previous = previous;
return t.result('UPDATE rendezvous SET' + qutil.setequal(params) + 'WHERE idvisite = ${idvisite}', inspection).then(result => {
logchange.rendezvous.new = params;
})
}));
}
else delete c.inspection; // delete from the clone, like the other branches
// Cheminees
c.cheminees = Object.values(c.cheminees).filter(cheminee => Object.keys(cheminee.params).length);
if (c.cheminees.length) {
logchange.cheminees = [];
c.cheminees.forEach(cheminee => {
const params = cheminees.parse(cheminee.params);
const ch = Object.assign({ idcheminee: cheminee.id }, params);
const logcheminee = { idcheminee: ch.idcheminee };
queries.push(t.one(`SELECT ${Object.keys(params).join()} FROM cheminee WHERE idcheminee = $1`, ch.idcheminee).then(previous => {
logcheminee.previous = previous;
return t.result('UPDATE cheminee SET' + qutil.setequal(params) + 'WHERE idcheminee = ${idcheminee}', ch).then(result => {
logcheminee.new = params;
logchange.cheminees.push(logcheminee);
})
}));
});
}
else delete c.cheminees;
// Lock from further changes on the mobile device
// Note: this change will be sent back to the mobile in part 2 of the synchronization
queries.push(t.result('UPDATE rendezvous SET timesync = now() WHERE idvisite = $1', [c.idvisite]));
console.log(`transaction#${++transactionCount}`);
return t.batch(queries).then(result => { // Transaction complete
logdata.transactions.push(logchange);
});
})
.catch(function (err) { // Transaction failed for this changeset, rollback
logdata.errors.push({ error: err, change: change }); // Provide error message and original change object to mobile device
console.error(JSON.stringify(logdata.errors));
})
);
});
console.log(`Total transactions: ${transactions.length}`);
return task.batch(transactions).then(result => { // All transactions complete
// Log everything that was uploaded from the mobile device
log.log(res, JSON.stringify(logdata));
});
}); // end of db.task
I apologize, but it is almost impossible to give a good final answer when the question is wrong on so many levels...
"It's important that a change set be updated completely or not at all, so each is wrapped in a transaction."
If the change set requires data integrity, the whole thing must be one transaction, not a set of transactions.
"Then each transaction is executed one after the other. If a transaction fails, the others shouldn't be affected."
Again, data integrity is what a single transaction guarantees, so you need to make it one transaction, not several.
"I suspect that all the transactions are actually combined somehow, possibly through the main db.task."
They are combined, though not through the task but through the tx method.
"Any advice how we could execute these transactions in batch and avoid this rollback issue?"
By joining them into a single transaction.
You would use a single tx call at the top, and that's it, no tasks needed there. And in case the code underneath makes use of its own transactions, you can update it to allow conditional transactions.
Also, when building complex transactions, an app benefits a lot from using the repository patterns shown in pg-promise-demo. You can have methods inside repositories that support conditional transactions.
And you should rework your code to avoid the horrible things it does, like manual query formatting. For example, never use things like SELECT ${Object.keys(params).join()}; that's a recipe for disaster. Use the proper query formatting that pg-promise gives you, such as SQL Names in this case.
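A minimal sketch of the single-transaction approach, assuming `db` is a pg-promise database object. The function name `syncInOneTransaction` and the `{ table, idColumn, id, params, idvisite }` shape of each change are hypothetical, chosen only for illustration, and the dynamic UPDATE of `params` is left out for brevity. The column list goes through the `:name` filter instead of being concatenated by hand:

```javascript
// Sketch only: `db` is assumed to be a pg-promise database object.
// The shape of each change ({ table, idColumn, id, params, idvisite }) is hypothetical.
function syncInOneTransaction(db, changes) {
  // One db.tx call: either every query below is committed, or none is.
  return db.tx(async t => {
    const log = [];
    for (const c of changes) {
      // :name escapes SQL identifiers; an array becomes a comma-separated column list.
      const previous = await t.one(
        'SELECT ${columns:name} FROM ${table:name} WHERE ${idColumn:name} = ${id}',
        { columns: Object.keys(c.params), table: c.table, idColumn: c.idColumn, id: c.id }
      );
      // Lock the visit from further changes on the mobile device.
      await t.none('UPDATE rendezvous SET timesync = now() WHERE idvisite = $1', [c.idvisite]);
      log.push({ table: c.table, id: c.id, previous });
    }
    return log; // resolving commits; any rejection rolls everything back
  });
}
```

Because the queries are awaited one by one inside a single `tx` callback, they run sequentially on the same connection, so no separate task or batch is needed in this sketch.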
Problem:
The front-end page makes x parallel requests (let's call them the first group).
The next group (x requests) follows 5 seconds later. The first request of the first group populates the cache from the DB.
The other x-1 requests get an empty array instead of waiting for the first request to finish its job.
The second group and all later requests get proper data from the cache.
What is the best practice for making the other requests wait until the first one is done (or fails) in a stateless mechanism?
EDIT:
The cache module allows a trigger on cache set, but that doesn't work here because the mechanism is stateless.
const GetDataFromDB = async (req, res, next) => {
  // Declared once with let: the original redeclared it with const and then reassigned it
  let cachedTableName = undefined;
  // "lockFlag" is meant to keep parallel requests out of the critical section
  // (populating the cache from the DB takes time); it is set just before the query starts.
  if (!myCache.has("lockFlag") && !myCache.has("dbtable")) {
    // Only the first request of the first group arrives here;
    // the other x-1 requests of the first group fall through to the next condition.
    // This is where I would like a mechanism that waits until the first request is back from the DB (cache initialized).
    myCache.set("lockFlag", "1");
    const connection1 = await odbc.connect(connectionConfig);
    cachedTableName = await connection1.query(`select * from ${tableName}`);
    if (cachedTableName.length) {
      const success = myCache.set([
        { key: "dbtable", val: cachedTableName, ttl: 180 },
      ]);
      if (success) {
        cachedTableName = myCache.get("dbtable");
      }
    }
    myCache.take("lockFlag");
    connection1.close();
    return res.status(200).json(cachedTableName); // used for the first response
  }
  // The other x-1 requests of the first group arrive here and get nothing, because the cache is not set yet.
  if (myCache.has("dbtable")) {
    cachedTableName = myCache.get("dbtable");
  }
  return res.status(200).json(cachedTableName);
}
You can try the approach given here, with minor modifications to apply it for your case.
For brevity, I removed comments and shortened variables names.
Code, then explanation:
const EventEmitter = require('events');
const bus = new EventEmitter();
const getDataFromDB = async (req, res, next) => {
var table = undefined;
if (myCache.has("lockFlag")) {
await new Promise(resolve => bus.once("unlocked", resolve));
}
if (myCache.has("dbtable")) {
table = myCache.get("dbtable");
}
else {
myCache.set("lockFlag", "1");
const connection = await odbc.connect(connectionConfig);
table = await connection.query(`select * from ${tableName}`);
connection.close();
if (table.length) {
const success = myCache.set([
{ key: "dbtable", val: table, ttl: 180 },
]);
}
myCache.take("lockFlag");
bus.emit("unlocked");
}
return res.status(200).json(table);
}
This is how it should work:
At first, lockFlag is not present.
Then, some code calls getDataFromDB. That code evaluates the first if condition to false, so it continues: since the cache is still empty, it sets lockFlag ("1"), then goes on to retrieve the table data from the db. In the meantime:
Some other code calls getDataFromDB. That code, however, evaluates the first if condition to true, so it awaits the promise until an unlocked event is emitted.
Back to the first calling code: it finishes its logic, caches the table data, removes lockFlag, emits an unlocked event, and returns.
The other code can now continue its execution: it evaluates the second if to true, so it takes the table from the cache, and returns.
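The flow described above can be tried in isolation with the sketch below. It is not the answer's exact code: a plain Map stands in for myCache, a boolean stands in for lockFlag, and `loadFromDb` is a hypothetical stand-in for the odbc query. Two hedges are added as assumptions about what is desirable: a while loop re-checks the lock after waking, and try/finally releases the lock even when the load fails.

```javascript
const EventEmitter = require('events');

const bus = new EventEmitter();
bus.setMaxListeners(0);  // many requests may be waiting at the same time
const cache = new Map(); // stand-in for myCache
let locked = false;      // stand-in for the "lockFlag" key

// loadFromDb is a hypothetical async function that returns the table rows.
async function getTable(loadFromDb) {
  // Wait while another caller is loading; re-check after every wake-up.
  while (locked) {
    await new Promise(resolve => bus.once('unlocked', resolve));
  }
  if (cache.has('dbtable')) {
    return cache.get('dbtable');
  }
  locked = true;
  try {
    const table = await loadFromDb();
    if (table.length) {
      cache.set('dbtable', table);
    }
    return table;
  } finally {
    // Release the lock and wake the waiters even if the load failed.
    locked = false;
    bus.emit('unlocked');
  }
}
```

With several parallel calls, only the first one runs the loader; the others resume after the unlocked event and read from the cache. Like the original, this only coordinates requests handled by a single Node.js process.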
As a workaround, I added a "finally" block that removes the lock key from the cache after the first initialization, together with this loop:
while(myCache.has( "lockFlag" )){
await wait(1500);
}
And the "wait" function:
function wait(milliseconds) {
return new Promise(resolve => setTimeout(resolve, milliseconds))
}
(source)
This works, but there can still be a window (up to 1500 ms) in which the cache is already set and the waiting request is not aware of it.
I'd be happy to hear of a better solution.
I need some advice on how to structure this function as at the moment it is not happening in the correct order due to node being asynchronous.
This is the flow I want to achieve; I don't need help with the code itself but with the order to achieve the end results and any suggestions on how to make it efficient
1. Node routes a GET request to my controller.
2. Controller reads a .csv file on local system and opens a read stream using fs module
3. Then use csv-parse module to convert that to an array line by line (many 100,000's of lines)
4. Start a try/catch block
5. With the current row from the csv, take a value and try to find it in a MongoDB
6. If found, take the ID and store the line from the CSV and this id as a foreign ID in a separate database
7. If not found, create an entry into the DB and take the new ID and then do 6.
8. Print out to terminal the row number being worked on (ideally at some point I would like to be able to send this value to the page and have it update like a progress bar as the rows are completed)
Here is a small part of the code structure that I am currently using;
const fs = require('fs');
const parse = require('csv-parse');
function addDataOne(req, id) {
const modelOneInstance = new InstanceOne({ ...code });
const resultOne = modelOneInstance.save();
return resultOne;
}
function addDataTwo(req, id) {
const modelTwoInstance = new InstanceTwo({ ...code });
const resultTwo = modelTwoInstance.save();
return resultTwo;
}
exports.add_data = (req, res) => {
const fileSys = 'public/data/';
const parsedData = [];
let i = 0;
fs.createReadStream(`${fileSys}${req.query.file}`)
.pipe(parse({}))
.on('data', (dataRow) => {
let RowObj = {
one: dataRow[0],
two: dataRow[1],
three: dataRow[2],
etc,
etc
};
try {
ModelOne.find(
{ propertyone: RowObj.one, propertytwo: RowObj.two },
'_id foreign_id' // Mongoose projection strings are space-separated
).exec((err, searchProp) => {
if (err) {
console.log(err);
} else {
if (searchProp.length > 1) {
console.log('too many returned from find function');
}
if (searchProp.length === 1) {
addDataOne(RowObj, searchProp[0]).then((result) => {
searchProp[0].foreign_id.push(result._id);
searchProp[0].save();
});
}
if (searchProp.length === 0) {
let resultAddProp = null;
addDataTwo(RowObj).then((result) => {
resultAddProp = result;
addDataOne(req, resultAddProp._id).then((result) => {
resultAddProp.foreign_id.push(result._id);
resultAddProp.save();
});
});
}
}
});
} catch (error) {
console.log(error);
}
i++;
let iString = i.toString();
process.stdout.clearLine();
process.stdout.cursorTo(0);
process.stdout.write(iString);
})
.on('end', () => {
res.send('added');
});
};
I have tried to make the functions use async/await, but it seems to conflict with the fs.createReadStream or csv-parse functionality, probably due to my inexperience and incorrect use of the code...
I appreciate that this is a long question about the fundamentals of the code, but some tips/advice/pointers on how to get this going would be appreciated. I had it working when the data was sent one record at a time via a POST request from Postman, but I can't implement the next stage, which is to read from the CSV file, which contains many records.
First of all, you can merge the following two checks into one query:
if (searchProp.length === 1) {
if (searchProp.length === 0) {
Use the upsert option of the MongoDB findOneAndUpdate query to update the document or insert it if it doesn't exist.
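A minimal sketch of that upsert, assuming `ModelOne` is the Mongoose model from the question. The helper name `findOrCreate` is hypothetical, and initializing `foreign_id` to an empty array on insert is an assumption based on how the question pushes ids into it:

```javascript
// Sketch only: `ModelOne` is assumed to be a Mongoose model with the fields used in the question.
function findOrCreate(ModelOne, row) {
  return ModelOne.findOneAndUpdate(
    { propertyone: row.one, propertytwo: row.two }, // same filter as the original find()
    { $setOnInsert: { foreign_id: [] } },           // applied only when a new document is inserted
    { upsert: true, new: true }                     // insert if missing, return the resulting document
  );
}
```

The returned document can then be used the same way in both branches, so the separate `length === 1` and `length === 0` paths collapse into one.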
Secondly, don't do this on the main thread. Use a queue mechanism; it will be much more efficient.
The queue I personally use is Bull.
https://github.com/OptimalBits/bull#basic-usage
It also provides the progress reporting you need.
As for using async/await with a read stream, plenty of examples can be found online, such as: https://humanwhocodes.com/snippets/2019/05/nodejs-read-stream-promise/
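One way to combine a stream with async/await, sketched under the assumption that the parser stream can be consumed with `for await` like other Node.js readable streams. The names `processRows`, `handleRow` and `onProgress` are hypothetical, and an in-memory stream stands in for `fs.createReadStream(...).pipe(parse())`:

```javascript
const { Readable } = require('stream');

// Processes rows strictly one at a time.
// `rows` is any async iterable, for example a readable stream in object mode.
async function processRows(rows, handleRow, onProgress = () => {}) {
  let count = 0;
  for await (const row of rows) {
    await handleRow(row); // the next row is not taken until this resolves
    onProgress(++count);  // e.g. write the row number to the terminal
  }
  return count;
}

// Usage with an in-memory stream standing in for the parsed CSV file.
async function demo() {
  const rows = Readable.from([['a', '1'], ['b', '2']]);
  const seen = [];
  const total = await processRows(rows, async row => { seen.push(row[0]); });
  return { total, seen };
}
```

Since each row is awaited before the next one is read, the database calls finish in file order, and the response can be sent once `processRows` resolves.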
okay. I'm confused as to the best way to do this:
the following pieces are in play: a node js server, a client-side react(with redux), a MYSql DB.
in the client app I have lists (many but for this issue, assume one), that I want to be able to reorder by drag and drop.
in the mysql DB the items are stored to represent a linked list (with a nextKey, lastKey, and productionKey (primary), along with the data fields),
//mysql column [productionKey, lastKey,nextKey, ...(other data)]
the current issue I'm having is a render issue. it stutters after every change.
I'm using these two function to get the initial order and to reorder
function SortLinkedList(linkedList)
{
var sortedList = [];
var map = new Map();
var currentID = null;
for(var i = 0; i < linkedList.length; i++)
{
var item = linkedList[i];
if(item?.lastKey === null)
{
currentID = item?.productionKey;
sortedList.push(item);
}
else
{
map.set(item?.lastKey, i);
}
}
while(sortedList.length < linkedList.length)
{
var nextItem = linkedList[map.get(currentID)];
sortedList.push(nextItem);
currentID = nextItem?.productionKey;
}
const filteredSafe=sortedList.filter(x=>x!==undefined)
//undefined appear because server has not fully updated yet, so linked list is broken
//nothing will render without this
return filteredSafe
;
}
const reorder = (list, startIndex, endIndex) => {
const result = Array.from(list);
const [removed] = result.splice(startIndex, 1);
result.splice(endIndex, 0, removed);
const adjustedResult = result.map((x,i,arr)=>{
if(i==0){
x.lastKey=null;
}else{
x.lastKey=arr[i-1].productionKey;
}
if(i==arr.length-1){
x.nextKey=null;
}else{
x.nextKey=arr[i+1].productionKey;
}
return x;
})
return adjustedResult;
};
I've got this function to get the items
const getItems = (list,jobList) =>
{
return list.map((x,i)=>{
const jobName=jobList.find(y=>y.jobsessionkey==x.attachedJobKey)?.JobName;
return {
id:`ProductionCardM${x.machineID}ID${x.productionKey}`,
attachedJobKey: x.attachedJobKey,
lastKey: x.lastKey,
machineID: x.machineID,
nextKey: x.nextKey,
productionKey: x.productionKey,
content:jobName
}
})
}
my onDragEnd
const onDragEnd=(result)=> {
if (!result.destination) {
return;
}
// dropped outside the list
const items = reorder(
state.items,
result.source.index,
result.destination.index,
);
dispatch(sendAdjustments(items));
//sends update to server
//server updates mysql
//server sends back update events from mysql in packets
//props sent to DnD component are updated
}
so the actual bug looks like the graphics are glitching - as things get temporarily filtered in the sortLinkedList function - resulting in jumpy divs. is there a smoother way to handle this client->server->DB->server->client dataflow that results in a consistent handling in DnD?
UPDATE:
still trying to solve this. currently implemented a lock pattern.
useEffect(()=>{
if(productionLock){
setState({
items: SortLinkedList(getItems(data,jobList)),
droppables: [{ id: "Original: not Dynamic" }]
})
setLoading(false);
}else{
console.log("locking first");
setLoading(true);
}
},[productionLock])
where production lock is set to true and false from triggers on the server...
basically: the app sends the data to the server, the server processes the request, then sends new data back, when it's finished the server sends the unlock signal.
which should trigger this update happening once, but it does not, it still re-renders on each state update to the app from the server.
What’s the code for sendAdjustments()?
You should update locally first, otherwise DnD pulls it back to its original position while you wait for backend to finish. This makes it appear glitchy. E.g:
Set the newly reordered list locally as your state
Send network request
If it fails, reverse local list state back to the original list
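The three steps can be sketched as a small helper. The names `reorderOptimistically`, `setItems` and `save` are hypothetical: `setItems` stands for whatever updates the local list state, and `save` for the network request that sendAdjustments triggers.

```javascript
// Sketch only: optimistic reorder with rollback on failure.
async function reorderOptimistically({ items, setItems, save }, startIndex, endIndex) {
  const previous = items;          // keep the original list for a possible rollback
  const next = Array.from(items);
  const [moved] = next.splice(startIndex, 1);
  next.splice(endIndex, 0, moved);

  setItems(next);                  // 1. show the new order immediately
  try {
    await save(next);              // 2. send the network request
  } catch (err) {
    setItems(previous);            // 3. restore the original order if it fails
    throw err;
  }
  return next;
}
```

In this sketch the local state is the source of truth while the request is in flight, so intermediate server packets do not need to be rendered until the server confirms the final order.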
I am using Mongoose to access to my database. I need to use transactions to make an atomic insert-update.
95% of the time my transaction works fine, but 5% of the time an error is showing :
"Given transaction number 1 does not match any in-progress transactions"
It's very difficult to reproduce this error, so I really want to understand where it is coming from to get rid of it.
I could not find a very clear explanation about this type of behaviour.
I have tried to use async/await key words on various functions. I don't know if an operation is not done in time or too soon.
Here the code I am using:
export const createMany = async function (req, res, next) {
if (!isIterable(req.body)) {
res.status(400).send('Wrong format of body')
return
}
if (req.body.length === 0) {
res.status(400).send('The body is well formed (an array) but empty')
return
}
const session = await mongoose.startSession()
session.startTransaction()
try {
const packageBundle = await Package.create(req.body, { session })
const options = []
for (const key in packageBundle) {
if (Object.prototype.hasOwnProperty.call(packageBundle, key)) {
options.push({
updateOne: {
filter: { _id: packageBundle[key].id },
update: {
$set: {
custom_id_string: 'CAB' + packageBundle[key].custom_id.toLocaleString('en-US', {
minimumIntegerDigits: 14,
useGrouping: false
})
}
},
upsert: true // upsert is an option of updateOne, not part of the update document
}
})
}
}
await Package.bulkWrite(
options,
{ session }
)
for (const key in packageBundle) {
if (Object.prototype.hasOwnProperty.call(packageBundle, key)) {
packageBundle[key].custom_id_string = 'CAB' + packageBundle[key].custom_id.toLocaleString('en-US', {
minimumIntegerDigits: 14,
useGrouping: false
})
}
}
// Commit first, so that a failed commit is not reported as a success
await session.commitTransaction()
res.status(201).json(packageBundle)
} catch (error) {
await session.abortTransaction()
res.status(500).end()
throw error
} finally {
session.endSession()
}
}
I expect my code to insert the packages into the database and update them atomically, so that the database is never left in an unstable state.
This works perfectly most of the time, but I need to be sure that this bug won't show up anymore.
You should use the session.withTransaction() helper function to perform the transaction, as pointed out in the Mongoose documentation. It takes care of starting, committing, and retrying the transaction in case it fails.
const session = await mongoose.startSession();
await session.withTransaction(async () => {
// Your transaction methods
});
Explanation:
The multi-document transactions in MongoDB are relatively new and might be a bit unstable in some cases, such as described here. And certainly, it has also been reported in Mongoose here. Your error most probably is a TransientTransactionError due to a write-conflict happening when the transaction is committed.
However, this is a known and expected issue from MongoDB and these comments explain their reasoning behind why they decided it to be like this. Moreover, they claim that the user should be handling the cases of write conflicts and retrying the transaction if that happens.
Therefore, looking at your code, the Package.create(...) method seems to be the reason why the error gets triggered, since this method is executing a save() for every document in the array (from mongoose docs).
A quick solution might be using Package.insertMany(...) instead of create(), since the Model.insertMany() "only sends one operation to the server, rather than one for each document" (from mongoose docs).
However, MongoDB provides a helper function session.withTransaction() that will take care of starting and committing the transaction and retry it in case of any error, since release v3.2.1. Hence, this should be your preferred way to work with transactions in a safer way; which is, of course, available in Mongoose through the Node.js API.
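Putting both suggestions together, a minimal sketch might look like the following. The function name `createPackages` is hypothetical, and `mongoose` and `Package` are passed in as parameters only to keep the example self-contained; the bulkWrite step from the question is omitted.

```javascript
// Sketch only: insertMany inside session.withTransaction, with the session always released.
async function createPackages(mongoose, Package, docs) {
  const session = await mongoose.startSession();
  try {
    let created;
    // withTransaction starts and commits the transaction, and may re-run
    // the callback on transient errors, so the callback should be safe to repeat.
    await session.withTransaction(async () => {
      created = await Package.insertMany(docs, { session });
    });
    return created;
  } finally {
    await session.endSession(); // runs on success and on failure
  }
}
```

The HTTP response would then be sent by the caller after `createPackages` resolves, which is after the commit.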
The accepted answer is great. In my case, I was running multiple transactions serially within a session. I was still facing this issue every now and then. I wrote a small helper to resolve this.
File 1:
// do some work here
await session.withTransaction(() => {});
// ensure the earlier transaction is completed
await ensureTransactionCompletion(session);
// do some more work here
await session.withTransaction(() => {});
Utils File:
async function ensureTransactionCompletion(session: ClientSession, maxRetryCount: number = 50) {
// When we are trying to split our operations into multiple transactions
// Sometimes we are getting an error that the earlier transaction is still in progress
// To avoid that, we ensure the earlier transaction has finished
let count = 0;
while (session.inTransaction()) {
if (count >= maxRetryCount) {
break;
}
// Adding a delay so that the transaction can be committed
await new Promise(r => setTimeout(r, 100));
count++;
}
}
I want to perform a transaction that requires updating two documents using the previous values of those documents.
For the sake of the question, I'm trying to transfer 100 tokens from one app user to another. This operation must be atomic to keep the data integrity of my DB, so on the server side I thought of using admin.firestore().runTransaction.
As I understand it, runTransaction needs to perform all reads before performing any writes, so how do I read both users' balances before updating the data?
This is what I have so far:
const db = admin.firestore();
const user1Ref = db.collection('users').doc(user1Id);
const user2Ref = db.collection('users').doc(user2Id);
const transaction = db.runTransaction(t => {
return t.get(user1Ref).then(user1Snap => {
const user1Balance = user1Snap.data().balance;
// Somehow get the second user's balance (user2Balance)
t.update(user1Ref , {balance: user1Balance - 100});
t.update(user2Ref , {balance: user2Balance + 100});
return Promise.resolve('Transferred 100 tokens from ' + user1Id + ' to ' + user2Id);
});
}).then(result => {
console.log('Transaction success', result);
});
You can use getAll. See documentation at https://cloud.google.com/nodejs/docs/reference/firestore/0.15.x/Transaction?authuser=0#getAll
You can use Promise.all() to generate a single promise that resolves when all promises in the array passed to it have resolved. Use that promise to continue work after all your document reads are complete - it will contain all the results. The general form of your code should be like this:
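const p1 = t.get(user1Ref)
const p2 = t.get(user2Ref)
const pAll = Promise.all([p1, p2])
// Inside the runTransaction callback: return the promise so the transaction waits for it
return pAll.then(results => {
  const snap1 = results[0]
  const snap2 = results[1]
  // work with snap1 and snap2 here, make updates to refs...
})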
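For the token transfer in the question, a sketch using getAll could look like this. The function name `transferTokens` is hypothetical, `db` is assumed to be the object returned by `admin.firestore()`, and the insufficient-balance check is an added assumption that is not part of the question:

```javascript
// Sketch only: read both balances first, then write both updates, in one transaction.
function transferTokens(db, fromId, toId, amount) {
  const fromRef = db.collection('users').doc(fromId);
  const toRef = db.collection('users').doc(toId);

  return db.runTransaction(async t => {
    // All reads happen before any write.
    const [fromSnap, toSnap] = await t.getAll(fromRef, toRef);
    const fromBalance = fromSnap.data().balance;
    const toBalance = toSnap.data().balance;

    // Assumption: a transfer larger than the sender's balance should fail.
    if (fromBalance < amount) {
      throw new Error('Insufficient balance');
    }

    t.update(fromRef, { balance: fromBalance - amount });
    t.update(toRef, { balance: toBalance + amount });
    return { from: fromBalance - amount, to: toBalance + amount };
  });
}
```

A call such as `transferTokens(db, user1Id, user2Id, 100)` returns a promise that resolves with the new balances once the transaction has been committed.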