Problem:
The front-end page makes x parallel requests (call it the first group); the next group (another x requests) arrives 5 seconds later. The first request of the first group sets the cache from the DB.
The other x-1 requests get an empty array instead of waiting for the first request to finish its job.
The second group and all subsequent requests get proper data from the cache.
What is the best practice for blocking the other requests until the first one is done (or fails), in a stateless mechanism?
EDIT:
The cache module allows using a trigger on cache set, but that doesn't work here since this is a stateless mechanism.
const GetDataFromDB = async (req, res, next) => {
    var cachedTableName = undefined;
    // "lockFlag" is used to prevent parallel requests from entering the critical
    // section (setting the cache from the DB takes time); it acts as a
    // short-lived cache entry.
    if ( !myCache.has( "lockFlag" ) && !myCache.has( "dbtable" ) ){
        // Only the first request of the first group arrives here;
        // the other x-1 requests of the first group fall through to the next condition.
        // Here I would build a mechanism to wait until the first request
        // comes back from the DB (initializes the cache).
        myCache.set( "lockFlag", "1" )
        const connection1 = await odbc.connect(connectionConfig);
        const result = await connection1.query(`select * from ${tableName}`);
        if(result.length){
            const success = myCache.set([
                {key: "dbtable", val: result, ttl: 180},
            ])
            if(success)
            {
                cachedTableName = myCache.get( "dbtable" );
            }
        }
        myCache.take("lockFlag");
        connection1.close();
        return res.status(200).json(cachedTableName); // first response.
    }
    // The x-1 requests of the first group arrive here and get nothing,
    // because the cache is not set yet.
    if ( myCache.has( "dbtable" ) ){
        cachedTableName = myCache.get( "dbtable" );
    }
    return res.status(200).json(cachedTableName);
}
You can try the approach given here, with minor modifications to apply it for your case.
For brevity, I removed comments and shortened variables names.
Code, then explanation:
const EventEmitter = require('events');
const bus = new EventEmitter();

const getDataFromDB = async (req, res, next) => {
    var table = undefined;
    if (myCache.has("lockFlag")) {
        await new Promise(resolve => bus.once("unlocked", resolve));
    }
    if (myCache.has("dbtable")) {
        table = myCache.get("dbtable");
    }
    else {
        myCache.set("lockFlag", "1");
        const connection = await odbc.connect(connectionConfig);
        table = await connection.query(`select * from ${tableName}`);
        connection.close();
        if (table.length) {
            const success = myCache.set([
                { key: "dbtable", val: table, ttl: 180 },
            ]);
        }
        myCache.take("lockFlag");
        bus.emit("unlocked");
    }
    return res.status(200).json(table);
}
This is how it should work:
At first, lockFlag is not present.
Then, some code calls getDataFromDB. That code evaluates the first if block to false, so it continues: it sets lockFlag to true ("1"), then goes on to retrieve the table data from db. In the meantime:
Some other code calls getDataFromDB. That code, however, evaluates the first if block to true, so it awaits on the promise, until an unlocked event will be emitted.
Back to the first calling code: It finishes its logic, caches the table data, sets lockFlag back to false, emits an unlocked event, and returns.
The other code can now continue its execution: it evaluates the second if to true, so it takes the table from the cache, and returns.
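One caveat worth adding (my note, not part of the original answer): if the query throws while lockFlag is set, the flag is never removed and every parked request waits forever. A minimal hardening sketch of the same function, reusing the myCache, bus, odbc, connectionConfig and tableName names from above:

const getDataFromDBSafe = async (req, res, next) => {
    let table;
    if (myCache.has("lockFlag")) {
        // Park this request until the in-flight fetch finishes (or fails).
        await new Promise(resolve => bus.once("unlocked", resolve));
    }
    if (myCache.has("dbtable")) {
        table = myCache.get("dbtable");
    } else {
        myCache.set("lockFlag", "1");
        try {
            const connection = await odbc.connect(connectionConfig);
            try {
                table = await connection.query(`select * from ${tableName}`);
                if (table.length) {
                    myCache.set([{ key: "dbtable", val: table, ttl: 180 }]);
                }
            } finally {
                connection.close();
            }
        } finally {
            // Release the lock and wake the waiters even when the query throws,
            // otherwise every parked request would hang forever.
            myCache.take("lockFlag");
            bus.emit("unlocked");
        }
    }
    return res.status(200).json(table);
};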
As a workaround I added a "finally" block to remove the lock key from the cache after the first initialization, plus this:

while (myCache.has("lockFlag")) {
    await wait(1500);
}

And the "wait" function:

function wait(milliseconds) {
    return new Promise(resolve => setTimeout(resolve, milliseconds));
}

(source)

This works, but there can still be a window (up to 1500 ms) where the cache is already set and the waiting request is not yet aware of it.
I'd be happy to see a better solution.
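A different approach that closes that window completely (my sketch, not from the thread): memoize the in-flight promise itself instead of a lock flag, so every concurrent request awaits the same query. The odbc, connectionConfig and tableName names are carried over from the question; note this sketch keeps the result for the process lifetime, so a TTL would need to be added to match the 180-second expiry above:

let dbTablePromise = null; // in-flight or resolved fetch, shared by all requests

const getDataFromDB = async (req, res, next) => {
    if (!dbTablePromise) {
        dbTablePromise = (async () => {
            const connection = await odbc.connect(connectionConfig);
            try {
                return await connection.query(`select * from ${tableName}`);
            } finally {
                connection.close();
            }
        })();
        // Drop the memoized promise on failure so a later request can retry.
        dbTablePromise.catch(() => { dbTablePromise = null; });
    }
    const table = await dbTablePromise;
    return res.status(200).json(table);
};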
Related
I'm working on a proxy that caches files and I'm trying to add some logic that prevents multiple clients from downloading the same files before the proxy has a chance to cache them.
Basically, the logic I'm trying to implement is the following:
Client 1 requests a file. The proxy checks if the file is cached. If it's not, it requests it from the server, caches it, then sends it to the client.
Client 2 requests the same file after client 1 requested it, but before the proxy has a chance to cache it. So the proxy will tell client 2 to wait a few seconds because there is already a download in progress.
A better approach would probably be to give client 2 a "try again later" message, but let's just say that's currently not an option.
I'm using Nodejs with the anyproxy library. According to the documentation, delayed responses are possible by using promises.
However, I don't really see a way to achieve what I want using Promises. From what I can tell, I could do something like this:
module.exports = {
    *beforeSendRequest(requestDetail) {
        if (thereIsADownloadInProgressFor(requestDetail.url)) {
            return new Promise((resolve, reject) => {
                setTimeout(() => { // delay
                    resolve({ response: responseDetail.response });
                }, 10000);
            });
        }
    }
};
But that would mean simply waiting for a maximum amount of time and hoping the download finishes by then.
And I don't want that.
I would prefer to be able to do something like this (but with Promises, somehow):
module.exports = {
    *beforeSendRequest(requestDetail) {
        if (thereIsADownloadInProgressFor(requestDetail.url)) {
            var i = 0;
            for (i = 0; i < 10; i++) {
                JustSleep(1000);
                if (!thereIsADownloadInProgressFor(requestDetail.url))
                    return { response: responseDetail.response };
            }
        }
    }
};
Is there any way I can achieve this with Promises in Nodejs?
Thanks!
You can use a Map to cache your file downloads.
The mapping in Map would be url -> Promise { file }
// Map { url => Promise { file } }
const cache = new Map()
const thereIsADownloadInProgressFor = url => cache.has(url)
const getCachedFilePromise = url => cache.get(url)
const downloadFile = async url => {/* download file code here */}
const setAndReturnCachedFilePromise = url => {
    const filePromise = downloadFile(url)
    cache.set(url, filePromise)
    return filePromise
}
module.exports = {
    beforeSendRequest(requestDetail) {
        if (thereIsADownloadInProgressFor(requestDetail.url)) {
            return getCachedFilePromise(requestDetail.url).then(file => ({ response: file }))
        } else {
            return setAndReturnCachedFilePromise(requestDetail.url).then(file => ({ response: file }))
        }
    }
};
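One detail to be aware of with this pattern (my note, not part of the answer): if downloadFile rejects, the rejected promise stays in the Map, and every later request for that url is served the cached failure. A small tweak to the function above evicts failed downloads so they can be retried:

const setAndReturnCachedFilePromise = url => {
    const filePromise = downloadFile(url)
    // If the download fails, remove the entry so the next request retries
    // instead of being served the cached rejection forever.
    filePromise.catch(() => cache.delete(url))
    cache.set(url, filePromise)
    return filePromise
}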
You don't need to send a try again response, simply serve the same data to both requests. All you need to do is store the requests somewhere in the caching system and trigger all of them when the fetching is done.
Here's a cache implementation that does only a single fetch for multiple requests. No delays and no try-laters:
export class Cache {
    constructor() {
        this.resultCache = {}; // this object is the cache storage
    }

    async get(key, cachedFunction) {
        let cached = this.resultCache[key];
        if (cached === undefined) { // No cache so fetch data
            this.resultCache[key] = {
                pending: [] // This is the magic: store further
                            // requests in this pending array.
                            // This way pending requests are directly
                            // linked to this cache data.
            }
            try {
                let result = await cachedFunction(); // Wait for result
                // Once we get the result we need to resolve all pending
                // promises. Loop through the pending array and
                // resolve them. See the code below for how we store pending
                // requests.. it will make sense:
                this.resultCache[key].pending
                    .forEach(waiter => waiter.resolve(result));
                // Store the result of the cache so later we don't
                // have to fetch it again:
                this.resultCache[key] = {
                    data: result
                }
                // Return result to original promise:
                return result;
                // Note: yes, this means pending promises will get triggered
                // before the original promise is resolved, but normally
                // this does not matter. You will need to modify the
                // logic if you want promises to resolve in the original order.
            }
            catch (err) { // Error when fetching result
                // We still need to trigger all pending promises to tell
                // them about the error. Only we reject them instead of
                // resolving them:
                if (this.resultCache[key]) {
                    this.resultCache[key].pending
                        .forEach(waiter => waiter.reject(err));
                    // Drop the entry so a later call can retry the fetch:
                    delete this.resultCache[key];
                }
                throw err;
            }
        }
        else if (cached.data === undefined && cached.pending !== undefined) {
            // Here's the condition where there was a previous request for
            // the same data. Instead of fetching the data again we store
            // this request in the existing pending array.
            let wait = new Promise((resolve, reject) => {
                // This is the "waiter" object above: basically
                // the resolve and reject functions of this promise.
                cached.pending.push({
                    resolve: resolve,
                    reject: reject
                });
            });
            return await wait; // await response from the original request.
                               // The code above will cause this to return.
        }
        else {
            // Return cached data as normal
            return cached.data;
        }
    }
}
The code may look a bit complicated but it is actually quite simple. First we need a way to store the cached data. Normally I'd just use a regular object for this:
{ key : result }
Where the cached data is stored in the result. But we also need to store additional metadata such as pending requests for the same result. So we need to modify our cache storage:
{ key : {
data: result,
pending: [ array of requests ]
}
}
All this is invisible and transparent to code using this Cache class.
Usage:
const cache = new Cache();
// Illustrated with w3c fetch API but you may use anything:
cache.get( URL , () => fetch(URL) )
Note that wrapping the fetch in an anonymous function is important because we want the Cache.get() function to conditionally call the fetch to avoid multiple fetch being called. It also gives the Cache class flexibility to handle any kind of asynchronous operation.
Here's another example for caching a setTimeout. It's not very useful but it illustrates the flexibility of the API:
cache.get('example', () => {
    return new Promise((resolve, reject) => {
        setTimeout(resolve, 1000);
    });
});
Note that the Cache class above does not have any invalidations or expiry logic for the sake of clarity but it's fairly easy to add them. For example if you want the cache to expire after some time you can just store the timestamp along with the other cache data:
{ key : {
data: result,
timestamp: timestamp,
pending: [ array of requests ]
}
}
Then in the "no-cache" logic simply detect the expiry time:
if (cached === undefined || (cached.timestamp + timeout) < now) ...
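As a small sketch of that check (my illustration, with maxAge in milliseconds supplied by the caller):

// Sketch: returns true when a cache entry should be treated as missing.
// Assumes entries store { data, timestamp } as described above.
function isExpired(cached, maxAge, now = Date.now()) {
    return cached === undefined ||
        (cached.timestamp !== undefined && cached.timestamp + maxAge < now);
}

// Inside get(), replace `if (cached === undefined)` with
// `if (isExpired(cached, maxAge))`, and store `timestamp: Date.now()`
// next to `data` when caching the result.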
I have a list of promises and currently I am using Promise.all to resolve them.
Here is my code for now:
const pageFutures = myQuery.pages.map(async (pageNumber: number) => {
    const urlObject: any = await this._service.getResultURL(searchRecord.details.id, authorization, pageNumber);
    if (!urlObject.url) {
        // throw error
    }
    const data = await rp.get({
        gzip: true,
        headers: {
            "Accept-Encoding": "gzip,deflate",
        },
        json: true,
        uri: `${urlObject.url}`,
    })
    const objects = data.objects.filter((object: any) => object.type === "observed-data" && object.created);
    return new Promise((resolve, reject) => {
        this._resultsDatastore.bulkInsert(
            databaseName,
            objects
        ).then(succ => {
            resolve(succ)
        }, err => {
            reject(err)
        })
    })
})
const all: any = await Promise.all(pageFutures).catch(e => {
    console.log(e)
})
So as you can see here I use Promise.all and it works:

const all: any = await Promise.all(pageFutures).catch(e => {
    console.log(e)
})

However I noticed it affects the database performance-wise, so I decided to resolve every 3 of them at a time.
For that I was thinking of different ways, like cwait, async pool, or writing my own iterator, but I get confused on how to do that.
For example, when I use cwait:

let promiseQueue = new TaskQueue(Promise, 3);
const all = new Promise.map(pageFutures, promiseQueue.wrap(() => {}));

I do not know what to pass inside the wrap, so I pass () => {} for now, plus I get:

Property 'map' does not exist on type 'PromiseConstructor'.

So whatever way I can get it working (my own iterator or any library) I am OK with, as long as I get a good understanding of it.
I'd appreciate it if anyone could shed some light on this and help me out of this confusion.
First some remarks:
Indeed, in your current setup, the database may have to process several bulk inserts concurrently. But that concurrency is not caused by using Promise.all. Even if you had left out Promise.all from your code, it would still have that behaviour. That is because the promises were already created, and so the database requests will be executed anyway.
Not related to your issue, but don't use the promise constructor antipattern: there is no need to create a promise with new Promise when you already have a promise in your hands: bulkInsert() returns a promise, so return that one.
As your concern is about the database load, I would limit the work initiated by the pageFutures promises to the non-database aspects: they don't have to wait for each other's resolution, so that code can stay as it was.
Let those promises resolve with what you currently store in objects: the data you want to have inserted. Then concatenate all those arrays together into one big array, and feed that to a single database bulkInsert() call.
Here is how that could look:
const pageFutures = myQuery.pages.map(async (pageNumber: number) => {
    const urlObject: any = await this._service.getResultURL(searchRecord.details.id,
        authorization, pageNumber);
    if (!urlObject.url) { /* throw error */ }
    const data = await rp.get({
        gzip: true,
        headers: { "Accept-Encoding": "gzip,deflate" },
        json: true,
        uri: `${urlObject.url}`,
    });
    // Return here, don't access the database yet...
    return data.objects.filter((object: any) => object.type === "observed-data"
        && object.created);
});
const all: any = (await Promise.all(pageFutures).catch(e => {
    console.log(e);
    return []; // in case of error, still return an array
})).flat(); // flatten it, so all data chunks are concatenated in one long array

// Don't create a new Promise with `new`, only to wrap another promise.
// It is an antipattern. Use the promise returned by `bulkInsert`:
return this._resultsDatastore.bulkInsert(databaseName, all);
This uses .flat() which is rather new. In case you have no support for it, look at the alternatives provided on mdn.
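For example, a one-level flatten with reduce works everywhere (my sketch; flatten is a hypothetical helper name):

// One level of flattening, equivalent to arrays.flat():
const flatten = arrays => arrays.reduce((acc, chunk) => acc.concat(chunk), []);

const all = flatten(await Promise.all(pageFutures).catch(e => {
    console.log(e);
    return [];
}));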
First, you asked a question about a failing solution attempt. That is called an X/Y problem.
So in fact, as I understand your question, you want to delay some DB requests.
You don't want to delay the resolving of a Promise created by a DB request... Like, no! Don't try that! The promise will resolve when the DB returns a result. It's a bad idea to interfere with that process.
I banged my head for a while on the library you tried... but I could not solve your issue with it. So I came up with the idea of just looping over the data and setting some timeouts.
I made a runnable demo here: Delaying DB request in small batch
Here is the code. Notice that I simulated some data and a DB request. You will have to adapt it. You also will have to adjust the timeout delay. A full second certainly is too long.
// This part is to simulate some data you would like to save.
// Let's make it a random amount for fun.
let howMuch = Math.ceil(Math.random() * 20)

// A fake data array...
let someData = []
for (let i = 0; i < howMuch; i++) {
    someData.push("Data #" + i)
}

console.log("Some fake data")
console.log(someData)
console.log("")

// So we have some data that looks real. (lol)
// We want to save it in small groups.
// And this is to simulate your DB request.
let saveToDB = (data, dataIterator) => {
    console.log("Requesting DB...")
    return new Promise(function (resolve, reject) {
        resolve("Request #" + dataIterator + " complete.")
    })
}

// Ok, we have everything. Let's proceed!
let batchSize = 3 // The amount of requests to do at once.
let delay = 1000 // The delay between each batch.

// Loop through all the data you have.
for (let i = 0; i < someData.length; i++) {
    if (i % batchSize == 0) {
        console.log("Splitting in batch...")
        // Process a batch on one timeout.
        let timeout = setTimeout(() => {
            // An empty line to clarify the console.
            console.log("")
            // Grouping the requests by "batchSize", or less if we're almost done.
            for (let j = 0; j < batchSize; j++) {
                // If there still is data to process.
                if (i + j < someData.length) {
                    // Your real database request goes here.
                    saveToDB(someData[i + j], i + j).then(result => {
                        console.log(result)
                        // Do something with the result.
                        // ...
                    })
                } // END if there is still data.
            } // END sending requests for that batch.
        }, delay * (i / batchSize)) // Timeout delay: one `delay` step per batch.
    } // END splitting in batch.
} // END for each data.
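For comparison, here is a sketch of the same batching done with async/await instead of timeouts, so the next batch only starts after the previous one has actually finished; it reuses someData, batchSize, delay and saveToDB from the demo above:

// Process `items` in groups of `size`, pausing `pause` ms between groups.
async function processInBatches(items, size, pause) {
    for (let i = 0; i < items.length; i += size) {
        // Send one batch and wait for every request in it to complete:
        const batch = items.slice(i, i + size);
        const results = await Promise.all(
            batch.map((item, j) => saveToDB(item, i + j))
        );
        results.forEach(result => console.log(result));
        // Optional pause between batches:
        if (i + size < items.length) {
            await new Promise(resolve => setTimeout(resolve, pause));
        }
    }
}

processInBatches(someData, batchSize, delay);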
We're having an issue in our main data synchronization back-end function. Our client's mobile device is pushing changes daily, however last week they warned us some changes weren't updated in the main web app.
After some investigation in the logs, we found that there is indeed a single transaction that fails and rollback. However it appears that all the transactions before this one also rollback.
The code works this way. The data to synchronize is an array of "changesets", and each changeset can update multiple tables at once. It's important that a changeset be applied completely or not at all, so each is wrapped in a transaction. Then each transaction is executed one after the other. If a transaction fails, the others shouldn't be affected.
I suspect that all the transactions are actually combined somehow, possibly through the main db.task. Instead of just looping to execute the transactions, we're using a db.task to execute them in batch, to avoid update conflicts on the same tables.
Any advice how we could execute these transactions in batch and avoid this rollback issue?
Thanks, here's a snippet of the synchronization code:
// Begin task that will execute transactions one after the other
db.task(task => {
    const transactions = [];
    // Create a transaction for each changeset (propriete/fosse/inspection)
    Object.values(data).forEach((change, index) => {
        const logchange = { tx: index };
        const c = { ...change }; // Use a clone of the original change object
        transactions.push(
            task.tx(t => {
                const queries = [];
                // Propriete
                if (Object.keys(c.propriete.params).length) {
                    const params = proprietes.parse(c.propriete.params);
                    const propriete = Object.assign({ idpropriete: c.propriete.id }, params);
                    logchange.propriete = { idpropriete: propriete.idpropriete };
                    queries.push(t.one(`SELECT ${Object.keys(params).join()} FROM propriete WHERE idpropriete = $1`, propriete.idpropriete).then(previous => {
                        logchange.propriete.previous = previous;
                        return t.result('UPDATE propriete SET' + qutil.setequal(params) + 'WHERE idpropriete = ${idpropriete}', propriete).then(result => {
                            logchange.propriete.new = params;
                        })
                    }));
                }
                else delete c.propriete;
                // Fosse
                if (Object.keys(c.fosse.params).length) {
                    const params = fosses.parse(c.fosse.params);
                    const fosse = Object.assign({ idfosse: c.fosse.id }, params);
                    logchange.fosse = { idfosse: fosse.idfosse };
                    queries.push(t.one(`SELECT ${Object.keys(params).join()} FROM fosse WHERE idfosse = $1`, fosse.idfosse).then(previous => {
                        logchange.fosse.previous = previous;
                        return t.result('UPDATE fosse SET' + qutil.setequal(params) + 'WHERE idfosse = ${idfosse}', fosse).then(result => {
                            logchange.fosse.new = params;
                        })
                    }));
                }
                else delete c.fosse;
                // Inspection (rendezvous)
                if (Object.keys(c.inspection.params).length) {
                    const params = rendezvous.parse(c.inspection.params);
                    const inspection = Object.assign({ idvisite: c.inspection.id }, params);
                    logchange.rendezvous = { idvisite: inspection.idvisite };
                    queries.push(t.one(`SELECT ${Object.keys(params).join()} FROM rendezvous WHERE idvisite = $1`, inspection.idvisite).then(previous => {
                        logchange.rendezvous.previous = previous;
                        return t.result('UPDATE rendezvous SET' + qutil.setequal(params) + 'WHERE idvisite = ${idvisite}', inspection).then(result => {
                            logchange.rendezvous.new = params;
                        })
                    }));
                }
                else delete c.inspection;
                // Cheminees
                c.cheminees = Object.values(c.cheminees).filter(cheminee => Object.keys(cheminee.params).length);
                if (c.cheminees.length) {
                    logchange.cheminees = [];
                    c.cheminees.forEach(cheminee => {
                        const params = cheminees.parse(cheminee.params);
                        const ch = Object.assign({ idcheminee: cheminee.id }, params);
                        const logcheminee = { idcheminee: ch.idcheminee };
                        queries.push(t.one(`SELECT ${Object.keys(params).join()} FROM cheminee WHERE idcheminee = $1`, ch.idcheminee).then(previous => {
                            logcheminee.previous = previous;
                            return t.result('UPDATE cheminee SET' + qutil.setequal(params) + 'WHERE idcheminee = ${idcheminee}', ch).then(result => {
                                logcheminee.new = params;
                                logchange.cheminees.push(logcheminee);
                            })
                        }));
                    });
                }
                else delete c.cheminees;
                // Lock from further changes on the mobile device
                // Note: this change will be sent back to the mobile in part 2 of the synchronization
                queries.push(t.result('UPDATE rendezvous SET timesync = now() WHERE idvisite = $1', [c.idvisite]));
                console.log(`transaction#${++transactionCount}`);
                return t.batch(queries).then(result => { // Transaction complete
                    logdata.transactions.push(logchange);
                });
            })
            .catch(function (err) { // Transaction failed for this changeset, rollback
                logdata.errors.push({ error: err, change: change }); // Provide error message and original change object to mobile device
                console.error(JSON.stringify(logdata.errors));
            })
        );
    });
    console.log(`Total transactions: ${transactions.length}`);
    return task.batch(transactions).then(result => { // All transactions complete
        // Log everything that was uploaded from the mobile device
        log.log(res, JSON.stringify(logdata));
    });
});
I apologize, but it is almost impossible to give a good final answer when the question is wrong on too many levels...
It's important that a change set be updated completely or not at all, so each is wrapped in a transaction.
If the change set requires data integrity, the whole thing must be one transaction, and not a set of transactions.
Then each transaction is executed one after the other. If a transaction fails, the others shouldn't be affected.
Again, data integrity is what a single transaction guarantees, you need to make it into one transaction, not multiple.
I suspect that all the transactions are actually combined somehow, possibly through the main db.task.
They are combined, and not through task, but through method tx.
Any advice how we could execute these transactions in batch and avoid this rollback issue?
By joining them into a single transaction.
You would use a single tx call at the top, and that's it, no tasks needed there. And in case the code underneath makes use of its own transactions, you can update it to allow conditional transactions.
Also, when building complex transactions, an app benefits a lot from using the repository patterns shown in pg-promise-demo. You can have methods inside repositories that support conditional transactions.
And you should redo your code to avoid horrible things it does, like manual query formatting. For example, never use things like SELECT ${Object.keys(params).join()}, that's a recipe for disaster. Use the proper query formatting that pg-promise gives you, like SQL Names in this case.
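To make that concrete, here is a rough sketch (my reading of the advice, not the author's code) of a single transaction over all changesets, with pg-promise's helpers doing the query formatting; for brevity it only covers the propriete updates:

// Minimal sketch: one tx guards every changeset, so any failure
// rolls the whole synchronization back atomically.
db.tx(t => {
    const queries = Object.values(data).map(change => {
        const params = proprietes.parse(change.propriete.params);
        // helpers.update generates fully escaped SQL (column names included),
        // replacing the manual string concatenation:
        const update = pgp.helpers.update(params, null, 'propriete') +
            pgp.as.format(' WHERE idpropriete = $1', [change.propriete.id]);
        return t.none(update);
    });
    return t.batch(queries);
})
.then(() => { /* everything committed together */ })
.catch(err => { /* everything rolled back together */ });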
I'm developing an app with the following node.js stack: Express/Socket.IO + React. In React I have DataTables, wherein you can search and with every keystroke the data gets dynamically updated! :)
I use Socket.IO for data-fetching, so on every keystroke the client socket emits some parameters and the server then calls the callback to return data. This works like a charm, but it is not guaranteed that the returned data comes back in the same order as the client sent the requests.
To simulate this: when I type 'a', the server responds with that same 'a', and so on for every character.
I found the async module for node.js and tried to use the queue to return tasks in the same order it received it. For simplicity I delayed the second incoming task with setTimeout to simulate a slow performing database-query:
Declaration:
const async = require('async');
var queue = async.queue(function(task, callback) {
    if (task.count == 1) {
        setTimeout(function() {
            callback();
        }, 3000);
    } else {
        callback();
    }
}, 10);
Usage:
socket.on('result', function(data, fn) {
    var filter = data.filter;
    if (filter.length === 1) { // TEST SYNCHRONOUSLY
        queue.push({name: filter, count: 1}, function(err) {
            fn(filter);
            // console.log('finished processing slow');
        });
    } else {
        // add some items to the queue
        queue.push({name: filter, count: filter.length}, function(err) {
            fn(data.filter);
            // console.log('finished processing fast');
        });
    }
});
But the way I receive it in the client console, when I search for abc is as follows:
ab -> abc -> a (after 3 sec)
I want it to be returned like this: a (after 3 sec) -> ab -> abc
My thought is that the queue runs the setTimeout and then moves on, and the setTimeout eventually fires somewhere on the event loop later on. This results in later search filters being returned earlier than the slow-performing one.
How can I solve this problem?
First a few comments, which might help clear up your understanding of async calls:
Using "timeout" to try and align async calls is a bad idea, that is not the idea about async calls. You will never know how long an async call will take, so you can never set the appropriate timeout.
I believe you are misunderstanding the usage of queue from async library you described. The documentation for the queue can be found here.
Copy pasting the documentation in here, in-case things are changed or down:
Creates a queue object with the specified concurrency. Tasks added to the queue are processed in parallel (up to the concurrency limit). If all workers are in progress, the task is queued until one becomes available. Once a worker completes a task, that task's callback is called.
The above means that the queue can simply be used to priorities the async task a given worker can perform. The different async tasks can still be finished at different times.
Potential solutions
There are a few solutions to your problem, depending on your requirements.
You can only send one async call at a time and wait for the first one to finish before sending the next one
You store the results and only display the results to the user when all calls have finished
You disregard all calls except for the latest async call
In your case I would pick solution 3 as your are searching for something. Why would you use care about the results for "a" if they are already searching for "abc" before they get the response for "a"?
This can be done by giving each request a timestamp and then sort based on the timestamp taking the latest.
SOLUTION:
Server:
exports = module.exports = function(io) {
    io.sockets.on('connection', function (socket) {
        socket.on('result', function(data, fn) {
            var filter = data.filter;
            var counter = data.counter;
            if (filter.length === 1 || filter.length === 5) { // TEST SYNCHRONOUSLY
                setTimeout(function() {
                    fn({ filter: filter, counter: counter }); // return to client
                }, 3000);
            } else {
                fn({ filter: filter, counter: counter }); // return to client
            }
        });
    });
}
Client:
import React, { Component } from 'react';

export class FilterableDataTable extends Component {
    constructor(props) {
        super(props);
        this.state = {
            endpoint: "http://localhost:3001",
            filters: {},
            counter: 0
        };
        this.onLazyLoad = this.onLazyLoad.bind(this);
    }

    onLazyLoad(event) {
        var offset = event.first;
        if (offset === null) {
            offset = 0;
        }
        var filter = ''; // filter is the search character
        if (event.filters.result2 != undefined) {
            filter = event.filters.result2.value;
        }
        this.state.counter++;
        // Arrow function, so `this` still refers to the component inside the callback:
        this.socket.emit('result', {
            offset: offset,
            limit: 20,
            filter: filter,
            counter: this.state.counter
        }, (returnedData) => {
            console.log(returnedData);
            if (returnedData.counter === this.state.counter) {
                console.log('DATA: ' + JSON.stringify(returnedData));
            }
        });
    }
}
This, however, sends unneeded data to the client, which in return ignores it. Does anybody have ideas for further optimizing this kind of communication? For example, a method to keep old data at the server and only send the latest?
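One possible direction (a sketch of the idea, untested): track the highest counter per socket on the server, and for requests that are already stale when their query finishes, return only the counter so the heavy payload is never sent. runQuery here is a hypothetical stand-in for the real database lookup:

io.sockets.on('connection', function (socket) {
    var latestCounter = 0; // highest request counter seen on this socket

    socket.on('result', function (data, fn) {
        latestCounter = Math.max(latestCounter, data.counter);

        runQuery(data.filter, function (results) {
            if (data.counter < latestCounter) {
                // A newer search arrived meanwhile: skip the heavy payload.
                fn({ counter: data.counter, stale: true });
            } else {
                fn({ counter: data.counter, filter: data.filter, results: results });
            }
        });
    });
});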
I am adding user validation and data modification to a page in a node.js application.
In a synchronous universe, in a single function I would:
Lookup the original record in the database
Lookup the user in LDAP to see if they are the owner or admin
Do the logic and write the record.
In an asynchronous universe that won't work. To solve it I've built a series of hand-off functions:
router.post('/writeRecord', jsonParser, function(req, res) {
    post = req.post;
    var smdb = new AWS.DynamoDB.DocumentClient();
    var params = { ... };
    smdb.query(params, function(err, data) {
        if (err == null) writeRecordStep2(post, data);
    });
});

function writeRecordStep2(ru, post, data) {
    var conn = new LDAP();
    conn.search(
        'ou=groups,o=amazon.com',
        { ... },
        function(err, resp) {
            if (err == null) {
                writeRecordStep3(ru, post, data, ldap1);
            }
        }
    );
}

function writeRecordStep3(ru, post, data) {
    var conn = new LDAP();
    conn.search(
        'ou=groups,o=amazon.com',
        { ... },
        function(err, resp) {
            if (err == null) {
                writeRecordStep4(ru, post, data, ldap1, ldap2);
            }
        }
    );
}

function writeRecordStep4(ru, post, data, ldap1, ldap2) {
    // Do stuff with collected data
}
Additionally, because the LDAP and Dynamo logic are in their own source documents, these functions are scattered tragically around the code.
This strikes me as inefficient, as well as inelegant. I'm eager to find a more natural asynchronous pattern to achieve the same result.
Any promise library should sort your issue out. My preferred choice is bluebird. In summary, promises help you manage asynchronous operations.
If you haven't heard about bluebird, then just use it. It converts all the functions of a module into then-able, promise-returning versions. Simply put, it promisifies all functions.
Here is the mechanism:

Module1.someFunction() // do your job and finally pass the result to the next call
    .then()  // use the object returned from the first call, do your job and return the updated value
    .then()  // and so on
    .catch() // do your job when any error occurs
Hope you understand. Here is an example:
var Promise = require("bluebird");
var readFile = Promise.promisify(require("fs").readFile);

readFile("myfile.js", "utf8").then(function(contents) {
    return eval(contents);
}).then(function(result) {
    console.log("The result of evaluating myfile.js", result);
}).catch(SyntaxError, function(e) {
    console.log("File had syntax error", e);
// Catch any other error
}).catch(function(e) {
    console.log("Error reading file", e);
});
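The "promisifies all functions" part refers to Promise.promisifyAll, which attaches an Async-suffixed, promise-returning variant of every method on an object. A quick sketch:

var Promise = require("bluebird");
// Adds readFileAsync, writeFileAsync, ... alongside the original callback APIs:
var fs = Promise.promisifyAll(require("fs"));

fs.readFileAsync("myfile.js", "utf8")
    .then(contents => console.log(contents.length))
    .catch(e => console.error("Error reading file", e));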
I could not tell from your pseudo-code exactly which async operations depend upon results from other ones, and knowing that is key to the most efficient way to code a series of asynchronous operations. If two operations do not depend upon one another, they can run in parallel, which generally gets to an end result faster. I also can't tell exactly what data needs to be passed on to later parts of the async requests (too much pseudo-code and not enough real code to show us what you're really attempting to do).
So, without that level of detail, I'll show you two ways to approach this. The first runs each operation sequentially: run the first async operation, and when it's done, run the next one, accumulating all the results into an object that is passed along to the next link in the chain. This is general purpose, since all async operations have access to all the prior results.
This makes use of promises built into the AWS.DynamboDB interface and makes our own promise for conn.search() (though if I knew more about that interface, it may already have a promise interface).
Here's the sequential version:
// promisify the search method
const util = require('util');
LDAP.prototype.searchAsync = util.promisify(LDAP.prototype.search);

// utility function that does a search and adds the result to the object passed in
// returns a promise that resolves to the object
function ldapSearch(data, key) {
    var conn = new LDAP();
    return conn.searchAsync('ou=groups,o=amazon.com', { ... }).then(results => {
        // put our results onto the passed in object
        data[key] = results;
        // resolve with the original object (so we can collect data here in a promise chain)
        return data;
    });
}

router.post('/writeRecord', jsonParser, function(req, res) {
    let post = req.post;
    let smdb = new AWS.DynamoDB.DocumentClient();
    let params = { ... };

    // The latest AWS interface gets a promise with the .promise() method
    smdb.query(params).promise().then(dbresult => {
        return ldapSearch({post, dbresult}, "ldap1");
    }).then(result => {
        // result.dbresult
        // result.ldap1
        return ldapSearch(result, "ldap2");
    }).then(result => {
        // result.dbresult
        // result.ldap1
        // result.ldap2
        // doSomething with all the collected data here
    }).catch(err => {
        console.log(err);
        res.status(500).send("Internal Error");
    });
});
});
And, here's a parallel version that runs all three async operations at once and then waits for all three of the to be done and then has all the results at once:
// if the three async operations you show can be done in parallel
// first promisify things
const util = require('util');
LDAP.prototype.searchAsync = util.promisify(LDAP.prototype.search);

function ldapSearch(params) {
    var conn = new LDAP();
    return conn.searchAsync('ou=groups,o=amazon.com', { ... });
}

router.post('/writeRecord', jsonParser, function(req, res) {
    let post = req.post;
    let smdb = new AWS.DynamoDB.DocumentClient();
    let params = { ... };

    Promise.all([
        ldapSearch(...),
        ldapSearch(...),
        smdb.query(params).promise()
    ]).then(([ldap1Result, ldap2Result, queryResult]) => {
        // process ldap1Result, ldap2Result and queryResult here
    }).catch(err => {
        console.log(err);
        res.status(500).send("Internal Error");
    });
});
Keep in mind that due to the pseudo-code nature of the code in your question, this is also pseudo-code where implementation details (exactly what parameters you're searching for, what response you're sending, etc...) have to be filled in. This should be illustrative of promise chaining to serialize operations and the use of Promise.all() for parallelizing operations and promisifying a method that didn't have promises built in.
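A final note of my own: Promise.all rejects as soon as any single operation fails, discarding the other results. If partial success is acceptable, Promise.allSettled (Node 12.9+) reports every outcome individually. A sketch, meant to sit inside an async handler and reusing ldapSearch and smdb from above:

const outcomes = await Promise.allSettled([
    ldapSearch(...),
    ldapSearch(...),
    smdb.query(params).promise()
]);
outcomes.forEach(outcome => {
    if (outcome.status === "fulfilled") {
        // outcome.value holds this operation's result
    } else {
        console.log(outcome.reason); // this one failed; the others are unaffected
    }
});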