Promise.all vs looping in AVA test - node.js

I am using an in-memory Mongo test fixture loader that gets primed before each test. It normally works flawlessly.
I have a function getSample that makes a db call, which I test using AVA. Wanting to call it multiple times with different parameters (timestamps), I tried this:
const timestamps = [ '2017-08-14T00:00:00.000Z', '2017-08-13T00:00:00.000Z']
const tasks = timestamps.map(t => getSample(t))
const samples = await Promise.all(tasks)
This failed in an interesting way: the first call works (db results are there) and all the others return an empty set (no errors).
Changing the code to the following format works. Every loop iteration finds the collection and its content.
let samples = []
for (let t of timestamps) {
  samples.push(await getSample(t))
}
const getSample = async (t) => {
  const c = await getCollection('foo') // fetches open mongo connection and returns the collection
  return c.find().toArray() // the timestamp filter is elided in this simplified version
}
With a standard MongoDB both versions work fine. But evidently there is a difference in how these two pieces of code behave, and I'd like to understand what that is. To be clear, I am not looking for a fix for my in-memory db - I just want to understand what might be happening.
It might be related to this SO post, but I'm not certain.
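For what it's worth, the scheduling difference can be seen without Mongo at all. A minimal sketch (fakeGetSample and the delay are made up; the ordering of the logs is the point):
const delay = ms => new Promise(resolve => setTimeout(resolve, ms))

const fakeGetSample = async (t) => {
  console.log('start', t) // with Promise.all, every 'start' logs before any 'done'
  await delay(100) // stands in for the db call
  console.log('done', t)
  return []
}

// Promise.all: all calls start immediately and run interleaved.
await Promise.all(timestamps.map(t => fakeGetSample(t)))

// for..of + await: each call starts only after the previous one has finished.
for (const t of timestamps) {
  await fakeGetSample(t)
}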

ObjectionJS model used in knex migration reports "relation does not exist" when running batch of migrations

When running a batch of knex migrations, either through the API or via the CLI, the migrations may fail if they use ObjectionJS models. This can happen particularly when the knexfile itself is resolved as an asynchronous function.
Setup
To explain this better, here is an example:
database.js
// This file stores logic responsible for providing credentials.
async function getKnexfile() {
  // Some asynchronous behaviour that returns valid configuration.
  // A good use case for this can be retrieving a secret stored in AWS Secrets Manager
  // and passing it to the connection string part of the config.
  //
  // For this example, let's assume the following is returned:
  return {
    client: 'pg',
    connectionString: 'pg://user:password@host:5432/database'
  };
}
module.exports = { getKnexfile };
knexfile.js
module.exports = require('./database').getKnexfile();
Now let's consider two migration files that will be run concurrently.
001_build_schema.js
exports.up = async (knex) => {
  await knex.schema.createTable('mytable', (table) => {
    table.string('id').unique().notNullable().primary();
    table.string('text', 45);
  });
}
exports.down = async (knex) => {
  await knex.schema.dropTable('mytable');
}
And in the second migration file, we begin by importing one of the models. I'm not providing the complete source for that model because, ultimately, the way it is defined does not really matter for this example. The important part, however, is that (in my case) this model was making use of several plugins, such as knexSnakeCaseMappers(), which, together with the fact that my configuration was fetched asynchronously, required some creative coding. The partial source for that model is given at the end.
002_insert_data.js
const MyModel = require('./MyModel');
exports.up = async (knex) => {
  await MyModel.query().insert({text: 'My Text'});
}
exports.down = async (knex) => {
  // Do nothing, this part is irrelevant...
}
The problem
What does not work is running the two migrations as a batch. Triggering the batch of migrations (e.g. via the CLI) causes them to fail like so:
# We are currently at the base migration (i.e. migrations were not ran yet).
knex migrate:latest
The above will result in the following error:
migration file "002_insert_data.js" failed
migration failed with error: insert into "mytable" ("text") values ($1) returning "id" - relation "mytable" does not exist
DBError: insert into "mytable" ("text") values ($1) returning "id" - relation "mytable" does not exist
This seemed like the migrations were not being awaited (i.e. migration 002 was running before migration 001 had finished), but experimenting with it showed that this was not the case. Or at least, the problem was not as simple as the migrations not running one after another, since simple console.log statements showed that these files were in fact executed concurrently.
Moreover, running the migrations one by one (i.e. not in a batch) using a script similar to the following resulted in successful migrations, and the data was populated in the database appropriately:
knex migrate:up && knex migrate:up
Having made sure that the schema used was identical across the board (setting .withSchema('schema_name')), I figured the issue must be related to the migrations being run in transactions, but using the flag disableTransactions: true proved to be a poor solution, since in case of a crash the database would be left in an unknown state.
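For reference, a sketch of where that flag lives in the knex config (shown only for completeness, since it was rejected as a solution):
// knexfile.js (sketch) - disables wrapping each migration in a transaction.
module.exports = {
  client: 'pg',
  connection: 'pg://user:password@host:5432/database',
  migrations: {
    disableTransactions: true,
  },
};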
Here is the partial source for MyModel.js
const knex = require('knex');
const { Model, knexSnakeCaseMappers } = require('objection');
// The below line imports an async function that returns the connection string. This is
// needed as knex() expects the provided argument to be an object, and accepts an async function
// only for the connection field (which is why the previously defined getKnexfile cannot be used).
const getConnectionStringAsync = require('./database');
const db = knex({
  client: 'pg',
  connection: getConnectionStringAsync,
  ...knexSnakeCaseMappers(),
});
Model.knex(db);
module.exports = class MyModel extends Model {
  // The implementation of the model goes here...
  // The table name of this model is set to `mytable`.
}
I have managed to solve the problem by realising two things:
1. The migrations are run in transactions, which suggests that the actual knex object used to communicate with the database is shared across migrations. It therefore matters which knex object is used.
2. My setup, with asynchronous configuration fetching, resulted in multiple connections when running migrations that make use of the models, because the models would initialise their own connections.
From there, the solution was pretty obvious: use the same knex object across all the models and migration commands. This can be achieved relatively easily by tweaking the migration files that use models in the following way:
002_insert_data.js
// Import the model as previously (name can be changed for clarity).
const MyModelUnbound = require('./MyModel');
exports.up = async (knex) => {
  // Bind the existing knex connection to the model.
  const MyModel = MyModelUnbound.bindKnex(knex);
  await MyModel.query().insert({text: 'My Text'});
}
// ...
It's important to note that the above code binds the model to the knex object built from the knexfile configuration, which does not include the knexSnakeCaseMappers() plugin applied in MyModel.js. This can be fixed by moving that configuration into the getKnexfile() method (or, where the API is used, duplicating it in the knexfile configuration passed there).
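A sketch of that fix, moving the mappers into getKnexfile() so that the CLI-built knex object and the bound models share the same configuration:
// database.js (sketch)
const { knexSnakeCaseMappers } = require('objection');

async function getKnexfile() {
  // ...fetch the credentials asynchronously as before...
  return {
    client: 'pg',
    connectionString: 'pg://user:password@host:5432/database',
    ...knexSnakeCaseMappers(),
  };
}

module.exports = { getKnexfile };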
This fixed my issue completely, and running the migrations in batches now works fine. One thing I am still not entirely sure about is why the initial behaviour takes place at all. The way I imagined the transactions working was on a per-migration basis (i.e. 1 migration = 1 transaction), which would suggest that things should work one way or another.
My current theory is that there might be a race condition between when the first migration's transaction is committed and when the next connection is established for the models in the second migration. Either way, binding the original knex object (built during the call to the migrations API or CLI) solves the problem.
Taking into consideration Marceli's reply, you can also bind the transaction directly in the query, like:
exports.up = async (knex) => {
  await MyModel.query(knex).insert({text: 'My Text'});
}
This way works better if you have joins in your model.

pass result from AwaitReactions out of function discord.js

I am new to async functions. I want to use the code below to ask the user a question and have them react to it with an X or a check mark, to get the user's answer on whether or not to delete something to make room for a new entry.
The function below works perfectly fine. However, I want to pass the result out of the function so I can write an if/else statement based on it, and that is where I am stuck.
I've looked around online and seen several things related to callbacks being used, but each example I've seen is different for something I think is similar, so I am just confused. And none of those examples deal with reactions on Discord, so I'm just not sure where to go.
async function deleteDeckQuestion(message) {
  const agree = "✅"
  const disagree = "❌"
  let msg = await message.author.send("You have made the maximum number of decks. Would you like to delete one of your decks in order to make a new one? Please react with one of the following...")
  await msg.react(agree)
  await msg.react(disagree)
  const filter = (reaction, user) => {
    return ['✅', '❌'].includes(reaction.emoji.name) && user.id === message.author.id;
  };
  const reactions = await msg.awaitReactions(filter, {
    max: 1
  }).then(collected => {
    const result = collected.last();
  })
  return result;
}
deleteDeckQuestion(function(result){
console.log(result)
}).catch(err => console.error(err))
The above code results in 'undefined' being logged to the console when I run deleteDeckQuestion. No errors otherwise. I would like to make the result accessible outside the function so I can write an if/else statement based on which reaction the user added to the question.
I tried putting the if/else statement I wanted inside the async function and it worked fine, but inside the "Yes" branch of it I want to put another async function to ask which deck (1, 2 or 3) should be deleted, with the same reaction-determines-answer scenario. It just saves the user typing more than necessary, for the ease of mobile users.
Would it be easier to just put an async function inside another async function? Something tells me that isn't the best idea in terms of efficiency. Eventually these reactions will lead to mysql queries, which I am comfortable with using, but it will get pretty lengthy, and functions inside other functions just seems like a mess... not sure if that is part of the "callback hell" I've read the joys of, though...
Thanks for any help in advance.
collected within your then() callback and reactions are the exact same object. However, result's scope is limited to the callback.
In the example code below, collected is the result of msg.awaitReactions(...)'s fulfilled promise. result is then declared in the same scope as the surrounding code, and is therefore accessible where you need it to be.
const collected = await msg.awaitReactions(filter, { max: 1 })
  .catch(console.error);
const result = collected.first();
MDN: Async Programming, await, then(), scope
Discord.js: Message.awaitReactions()
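From there, the if/else the question asks for could branch on the collected reaction, e.g. (a sketch reusing the agree constant from the question):
if (result.emoji.name === agree) {
  // user reacted with ✅ - ask which deck (1, 2 or 3) to delete next
} else {
  // user reacted with ❌ - keep the existing decks
}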

mongodb stops while iterating over large DB

This is a follow-up to this Stack Overflow question: Async Cursor Iteration with Asynchronous Sub-task, with a slightly different twist this time.
While iterating over MongoDB documents, the task stops in the middle if the target DB is too large. (There are more than 3000 documents in a single collection, and each document consists of lengthy texts, so .toArray is not really feasible due to the memory limit. The 3000 documents are just part of the whole data set, and the full data might be more than 10,000 documents.) I've noticed that if the number of documents in a collection is larger than approx. 750, it just stops in the middle of the task.
I've searched previous Stack Overflow questions to solve this: some say iteration over a large collection requires using stream, each or map instead of for/while with a cursor. When I tried these recommendations in real life, none of them worked. They also just stop in the middle, with almost no difference from the for/while iteration. I don't really like the idea of extending the timeout, since it may leave the cursor drifting around in memory, but it didn't work either.
Every method below runs inside an async function.
stream method
const cursor = db.collection('mycollection').find()
cursor.on('data', async doc => {
  await doSomething(doc) // do something with doc here
})
while/for method (just replace while with for)
const cursor = db.collection('mycollection').find()
while ( await cursor.hasNext() ) {
  let doc = await cursor.next()
  await doSomething(doc)
}
map/each/forEach method (replace map with forEach/each)
const cursor = db.collection('mycollection').find()
cursor.map(async doc => {
  await doSomething(doc)
})
None of them behaves any differently from the others. They just stop after iterating over approx. 750 documents and hang. I've even tried registering each document's task in a Promise.all queue and doing the async/await work all at once afterwards, so that the cursor doesn't spend too much time iterating, but the same problem arises.
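For clarity, that last attempt might have looked something like this (a sketch; doSomething stands for the async sub-task):
const cursor = db.collection('mycollection').find()
const tasks = []
while (await cursor.hasNext()) {
  const doc = await cursor.next()
  tasks.push(doSomething(doc)) // registered, not awaited yet
}
await Promise.all(tasks) // the async work is resolved all at once afterwards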
EDIT: I think doSomething() confuses the other readers, so I have created a code sample so that you can reproduce the problem.
const MongoClient = require('mongodb').MongoClient
const MongoUrl = 'mongodb://localhost:27017/'
const MongoDBname = 'testDB'
const MongoCollection = 'testCollection'
const moment = require('moment')
const getDB = () =>
  new Promise((resolve, reject) => {
    MongoClient.connect(MongoUrl, (err, client) => {
      if (err) return reject(err)
      console.log('successfully connected to db')
      return resolve(client.db(MongoDBname))
    })
  })
;(async () => {
  console.log(`iteration begins on ${moment().format('YYYY/MM/DD hh:mm:ss')} ------------`)
  let db = await getDB() // receives mongodb
  // iterate through all db articles...
  const cursor = db.collection(MongoCollection).find()
  const maxDoc = await cursor.count()
  console.log('Amount of target documents: ' + maxDoc)
  let count = 0
  // replace this with stream/while/map... any other iteration method
  cursor.each((err, doc) => {
    count++
    console.log(`preloading doc No.${count} async ${(count / maxDoc * 100).toFixed(2)}%`)
  })
})()
My apologies. On the test run it actually iterated over all the documents... I think I really have done something wrong in the other parts. I'll elaborate on this once I've isolated the part that's causing the trouble.

Running knex queries synchronously

I have a complex solution and I just need to run knex synchronously. Is that possible?
I have a scenario where knex queries are run inside Promise.mapSeries over an array with an unknown number of elements. For each element some knex query is called, including an insert query.
So this insert could affect the result for the next element of the array.
var descriptionSplitByCommas = desc.split(",");
Promise.mapSeries(descriptionSplitByCommas, function (name) {
  // knex.select
  // knex.insert if select doesn't return results
});
This was not my initial code, so maybe even Promise.mapSeries should be removed. But I need the descriptionSplitByCommas array elements to be processed one after another.
Otherwise, while processing the next description in the array, I often get an SQL error because of duplicate elements being inserted into a column with a unique index. This would not happen if the queries ran sequentially.
I am using native promises, so I do not have experience with mapSeries; therefore I cannot tell you exactly what is going on in your current code.
However, running several asynchronous commands in series instead of in parallel is quite common. There is one important thing you have to know: once you create a Promise, you have no control over how and when it resolves. So if you create 100 Promises, they all start resolving in parallel.
This is the reason there is no method like Promise.series for native promises - it is not possible.
What are your options? If you need to "create the promise in one place, but run it in another", then a factory function is your friend:
const runPromiseLater = () => Promise.resolve(25);
// some code
const myRealPromise = runPromiseLater();
myRealPromise.then(value => console.log(value)); // the promise is only created (and run) here
Of course, you can create an array of these factory functions; the question then is how to run them in series.
If you can use a version of Node with support for async/await, then a for loop is good enough:
async function runInSeries(array) {
  for (let i = 0; i < array.length; i++) {
    await array[i]();
    // or, if the array holds plain values instead of factories,
    // get the value and call e.g. await myMethod(array[i])
  }
}
If you can't use that, then the async library is your friend: https://caolan.github.io/async/docs.html#series
If you need to use the value from previous calls, you can use .waterfall
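Applied to the knex scenario from the question, the async/await version might look like this (a sketch; the table and column names are made up):
async function processDescriptions(desc) {
  const names = desc.split(",");
  for (const name of names) {
    // Each iteration completes fully before the next starts, so an insert
    // here is already visible to the select of the next iteration.
    const existing = await knex('descriptions').where({ name }).first();
    if (!existing) {
      await knex('descriptions').insert({ name });
    }
  }
}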

Update value once write completes in Cloud Function

I'm trying to update one value after a write completes (in a Cloud Function), but it just won't work (I'm sure this is a stupidly simple problem). Code below:
const functions = require('firebase-functions');
const admin = require('firebase-admin');
const firebase = require('firebase');
admin.initializeApp(functions.config().firebase);
exports.createMessage = functions.https.onRequest((request, response) => {
  const json = JSON.parse(request.query.json); // == "{'start':0, 'end':0}"
  json.start = firebase.database.ServerValue.TIMESTAMP;
  admin.database().ref('/messages/').push(json).then(snapshot => {
    // Here is the problem. Whatever I try here, it won't work to retrieve the value.
    // So, how do I get the "start" value, which has been written to the DB (TIMESTAMP value)?
    var startValue = snapshot.ref.child('start').val();
    snapshot.ref.update({ end: (startValue + 85800000) }).then(snapshot2 => {
      response.redirect(303, snapshot.ref);
    });
  });
});
Is the problem that I'm using admin.database()?
This code:
var startValue = snapshot.ref.child('start').val();
doesn't actually retrieve any values. Take a look at the docs for DataSnapshot. Reach into that snapshot directly with child() - you don't need the ref. Maybe this is what you meant?
var startValue = snapshot.child('start').val();
I'm not sure if there's a bug in Firebase or if I'm using it wrong, but if I try to call any method on the snapshot reference I only get an error saying TypeError: snapshot.xxx is not a function, where xxx is the function name I try to use (for example child(...), forEach(...), etc.).
However, the following seems to fix the issue with the snapshot:
admin.database().ref('/messages/').push(json).once('value').then(snapshot => {
instead of:
admin.database().ref('/messages/').push(json).then(snapshot => {
My uneducated guess is that the then-promise for the push function returns some faulty snapshot, since the only thing that seems to work is snapshot.key.
Also, if I'm not mistaken, doesn't my solution make two reads now instead of one? push will write and then (supposedly) read and return the written value, and then I read it once more with once('value').
Does anyone have any further insights into this problem?
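For what it's worth, the once('value') workaround above can be written with async/await (a sketch assuming firebase-admin; the 85800000 ms offset is from the question):
exports.createMessage = functions.https.onRequest(async (request, response) => {
  const json = JSON.parse(request.query.json);
  json.start = admin.database.ServerValue.TIMESTAMP; // admin SDK equivalent of the client constant
  const ref = admin.database().ref('/messages/').push(json); // a ThenableReference, usable as a Reference
  const snapshot = await ref.once('value'); // read the written value back
  const startValue = snapshot.child('start').val();
  await ref.update({ end: startValue + 85800000 });
  response.redirect(303, ref.toString());
});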
