Mysql inserts with AWS Lambda + Node.js - node.js

I'm running a Node.js function in AWS Lambda. It is supposed to do an insert into a MySQL database after an HTTP GET. Everything seems to be fine: looking at the CloudWatch logs, the query is built correctly, and if I copy and paste the query into the MySQL console it does exactly what it is supposed to.
Essentially:
var mysql = require('mysql');
var connection = mysql.createConnection({ /* connection details */ });
connection.connect();
var query = connection.query('Insert into AAA select * \
from BBB where BBB.a = ?;', [parameter],
function(err, result) {}
);
connection.end();
The problem is that the Lambda version simply does nothing. The query is visible and correct and the function returns cleanly, but it never actually inserts anything. I have the same problem with an update query as well, but all the MySQL selects work and return data, so the connection itself does not seem to be the problem. The insert also works when I run it on my machine; the problem appears only when I push it to Lambda.
I tried to add a separate commit statement but couldn't get it working either. I'm clearly missing something but can't figure out what. Do I need to have a transaction block for updates?
EDIT: Per Mark B's request. I think I tried to be smarter than I am by showing only part of the code. The whole logic was:
exports.handler = function(event, context, callback){
if ( event.A == -1 ){
exports.updateDB(event, function(res) {
context.succeed(res)
});
}
};
exports.updateDB = function(event, callback) {
var mysql = require('mysql');
var connection = mysql.createConnection({ /* connection details */ });
connection.connect();
var query = connection.query( 'update products set A=? where product_id = ?;',
[parameters],
function(err,result){ });
var query = connection.query( 'insert into other_table select * from products where product_id = ?;',
[parameters],
function(err,result){ });
connection.commit(function(err) {
if(err) {
connection.rollback(function() {
throw(err);
});
}
connection.end();
});
callback({"ok":"ok"})
};
Following the advice given here, I made the following changes: I removed the last callback and put callbacks inside both connection.query calls:
var query = connection.query( 'insert into other_table select * from products where product_id = ?;',
[parameters],
function(err,result){
callback({"ok":"ok"})
});
And it seems to work. I'm guessing now that the commit part does nothing, but it doesn't seem to break anything either. It is probably obvious at this point that I'm not much of a developer, and even less familiar with Node.js, so I truly appreciate the help I got!

Please note that query is an asynchronous function, meaning that no result is available until the callback function is triggered. In your sample code, the connection is closed immediately after the query is issued, long before the callback is executed. Try changing the code so that the connection is closed inside the callback function, e.g.
var query = connection.query('Insert into AAA select * \
from BBB where BBB.a = ?;', [parameter],
function(err, result) {
// now it is ok to close the connection
connection.end();
if (err) {
// error handling
}
else {
// do something with the result
}
}
);
By the way, since you are working with Lambda, the same thing applies to the callback(), context.succeed() and context.fail() function handlers. In other words, it is likely that you would like to call them where I wrote the comments about error and result handling above.
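As a minimal sketch of that last point, the handler below reports back to Lambda only from inside the query callback, after closing the connection. The names makeHandler, createConnection and event.productId are hypothetical and exist only for illustration; connection.query() and connection.end() are the mysql module calls already used in the question.

```javascript
// Sketch: finish the Lambda invocation only after the query has completed.
// makeHandler and createConnection are hypothetical names for illustration;
// createConnection is assumed to return a mysql-style connection object.
function makeHandler(createConnection) {
  return function handler(event, context, callback) {
    var connection = createConnection();
    connection.query(
      'insert into other_table select * from products where product_id = ?;',
      [event.productId], // hypothetical event field
      function (err, result) {
        // The query is finished, so it is now safe to close the connection.
        connection.end();
        if (err) {
          return callback(err); // report the failure to Lambda
        }
        callback(null, { ok: 'ok' }); // report success to Lambda
      }
    );
    // Nothing is reported here: the query has not completed yet.
  };
}
```

In a real deployment this would presumably be wired up as exports.handler = makeHandler(function () { return mysql.createConnection({ /* connection details */ }); });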

Related

Why AWS Lambda execution time is long using pg-promise

I started using AWS Lambda to perform a very simple task: executing an SQL query to retrieve records from an RDS Postgres database and creating an SQS message based on the result.
Because Amazon is only providing aws-sdk module (using node 4.3 engine) by default and we need to execute this SQL query, we have to create a custom deployment package which includes pg-promise. Here is the code I'm using:
console.info('Loading the modules...');
var aws = require('aws-sdk');
var sqs = new aws.SQS();
var config = {
db: {
username: '[DB_USERNAME]',
password: '[DB_PASSWORD]',
host: '[DB_HOST]',
port: '[DB_PORT]',
database: '[DB_NAME]'
}
};
var pgp = require('pg-promise')({});
var cn = `postgres://${config.db.username}:${config.db.password}@${config.db.host}:${config.db.port}/${config.db.database}`;
if (!db) {
console.info('Connecting to the database...');
var db = pgp(cn);
} else {
console.info('Re-use database connection...');
}
console.log('loading the lambda function...');
exports.handler = function(event, context, callback) {
var now = new Date();
console.log('Current time: ' + now.toISOString());
// Select auction that need to updated
var query = [
'SELECT *',
'FROM "users"',
'WHERE "users"."registrationDate"<=${now}',
'AND "users"."status"=1',
].join(' ');
console.info('Executing SQL query: ' + query);
db.many(query, { status: 2, now: now.toISOString() }).then(function(data) {
var ids = [];
data.forEach(function(auction) {
ids.push(auction.id);
});
if (ids.length == 0) {
callback(null, 'No user to update');
} else {
var sqsMessage = {
MessageBody: JSON.stringify({ action: 'USERS_UPDATE', data: ids}), /* required */
QueueUrl: '[SQS_USER_QUEUE]', /* required */
};
console.log('Sending SQS Message...', sqsMessage);
sqs.sendMessage(sqsMessage, function(err, sqsResponse) {
console.info('SQS message sent!');
if (err) {
callback(err);
} else {
callback(null, ids.length + ' users were affected. SQS Message created:' + sqsResponse.MessageId);
}
});
}
}).catch(function(error) {
callback(error);
});
};
When I test my Lambda function and look at the CloudWatch logs, the function itself takes around 500 ms to run, but the reported duration is 30502.48 ms (cf. screenshots).
So I'm guessing it takes 30 seconds to unzip my 318 KB package and start executing it? That seems absurd to me, or am I missing something? I tried uploading the zip directly and also uploading the package to S3 to check whether it was faster, but I still see the same latency.
I noticed that the Python version can natively perform SQL requests without any custom packaging...
All our applications are written in Node, so I don't really want to move away from it; however, I have a hard time understanding why Amazon does not provide basic npm modules for database interactions.
Any comments or help are welcome. At this point I'm not sure Lambda would be beneficial for us if it takes 30 seconds to run a script that is triggered every minute...
Is anyone facing the same problem?
UPDATE: This is how you need to close the connection as soon as you don't need it anymore (thanks again to Vitaly for his help):
exports.handler = function(event, context, callback) {
[...]
db.many(query, { status: 2, now: now.toISOString() }).then(function(data) {
pgp.end(); // <-- This is important to close the connection directly after the request
[...]
The execution time should be measured based on the length of operations being executed, as opposed to how long it takes for the application to exit.
There are many libraries out there that make use of a connection pool in one form or another. Those typically terminate after a configurable period of inactivity.
In case of pg-promise, which in turn uses node-postgres, such period of inactivity is determined by parameter poolIdleTimeout, which defaults to 30 seconds. With pg-promise you can access it via pgp.pg.defaults.poolIdleTimeout.
If you want your process to exit after the last query has been executed, you need to shut down the connection pool, by calling pgp.end(). See chapter Library de-initialization for details.
It is also shown in most of the code examples, as those need to exit right after finishing.
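To make the idle-timeout behaviour concrete, here is a toy model of it. This is not how pg-promise or node-postgres is implemented; TinyPool and its methods are hypothetical and only illustrate why a pending idle timer delays process exit and why an explicit shutdown removes the delay.

```javascript
// Toy illustration (not pg-promise code): an idle timer keeps the Node.js
// event loop busy until it fires or is cleared. TinyPool is hypothetical.
function TinyPool(idleTimeoutMs) {
  this.idleTimeoutMs = idleTimeoutMs;
  this.idleTimer = null;
}

// Called when a query finishes: start counting down the idle period.
TinyPool.prototype.release = function () {
  var self = this;
  clearTimeout(this.idleTimer);
  this.idleTimer = setTimeout(function () {
    self.idleTimer = null; // idle period elapsed, connection would be dropped
  }, this.idleTimeoutMs);
};

// Explicit shutdown: clear the timer so nothing keeps the process alive.
TinyPool.prototype.end = function () {
  clearTimeout(this.idleTimer);
  this.idleTimer = null;
};

var pool = new TinyPool(30000);
pool.release(); // without the next line, the process would linger for 30 s
pool.end();     // with it, the process can exit right away
```

In this model, pool.end() plays the role that pgp.end() plays in the answer above.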

How can I execute queries one after the other and extract value from 1st query and use it in the 2nd using expressJS?

router.post("/application_action", function(req,res){
var Employee = req.body.Employee;
var conn = new jsforce.Connection({
oauth2 : salesforce_credential.oauth2
});
var username = salesforce_credential.username;
var password = salesforce_credential.password;
conn.login(username, password, function(err, userInfo, next) {
if (err) { return console.error(err); res.json(false);}
// I want this conn.query to execute first and then conn.sobject
conn.query("SELECT id FROM SFDC_Employee__c WHERE Auth0_Id__c = '" + req.user.id + "'" , function(err, result) {
if (err) { return console.error(err); }
Employee["Id"] = result.records[0].Id;
});
//I want this to execute after the execution of above query i.e. conn.query
conn.sobject("SFDC_Emp__c").update(Employee, function(err, ret) {
if (err || !ret.success) { return console.error(err, ret);}
console.log('Updated Successfully : ' + ret.id);
});
});
});
I have provided my code above. I need to modify Employee in the conn.query and use it in conn.sobject. I need to make sure that my first query executes before 2nd because I am getting value from 1st and using in the 2nd. Please do let me know if you know how to accomplish this.
New Answer Based on Edit to Question
To execute one query based on the results of the other, you put the second query inside the completion callback of the first like this:
router.post("/application_action", function (req, res) {
var Employee = req.body.Employee;
var conn = new jsforce.Connection({
oauth2: salesforce_credential.oauth2
});
var username = salesforce_credential.username;
var password = salesforce_credential.password;
conn.login(username, password, function (err, userInfo, next) {
if (err) {
return console.error(err);
res.json(false);
}
// I want this conn.query to execute first and then conn.sobject
conn.query("SELECT id FROM SFDC_Employee__c WHERE Auth0_Id__c = '" + req.user.id + "'", function (err, result) {
if (err) {
return console.error(err);
}
Employee["Id"] = result.records[0].Id;
//I want this to execute after the execution of above query i.e. conn.query
conn.sobject("SFDC_Emp__c").update(Employee, function (err, ret) {
if (err || !ret.success) {
return console.error(err, ret);
}
console.log('Updated Successfully : ' + ret.id);
});
});
});
});
The only place that the first query results are valid is inside that callback because otherwise, you have no way of knowing when those asynchronous results are actually available and valid.
Please note that your error handling is unfinished since you don't finish the response in any of the error conditions and even in the success case, you have not yet actually sent a response to finish the request.
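One way the unfinished error handling might be completed is sketched below. It is only an illustration under assumptions: updateEmployee is a hypothetical helper, the 500 and 404 status codes are my own choice, and conn and res are assumed to behave like the jsforce connection and Express response used above.

```javascript
// Sketch: run the update only after the lookup succeeds, and always answer
// the HTTP request. updateEmployee is a hypothetical helper; conn is assumed
// to be a logged-in jsforce-style connection, res an Express-style response.
function updateEmployee(conn, userId, employee, res) {
  var soql = "SELECT Id FROM SFDC_Employee__c WHERE Auth0_Id__c = '" + userId + "'";
  conn.query(soql, function (err, result) {
    if (err) {
      console.error(err);
      return res.status(500).json(false);
    }
    if (!result.records || result.records.length === 0) {
      return res.status(404).json(false); // no matching employee
    }
    employee.Id = result.records[0].Id;
    // The second call starts only once the first result is available.
    conn.sobject('SFDC_Emp__c').update(employee, function (err, ret) {
      if (err || !ret.success) {
        console.error(err, ret);
        return res.status(500).json(false);
      }
      res.json(true);
    });
  });
}
```

Inside the conn.login callback this would be called as updateEmployee(conn, req.user.id, Employee, res), so every branch ends the request exactly once.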
Original Answer
First off, your code shows a route handler, not middleware. So, if you really intend to ask about middleware, you will have to show your actual middleware. Middleware that does not end the request needs to declare next as an argument and then call it when it is done with its processing. That's how processing continues after the middleware.
Secondly, your console.log() statements are all going to show undefined because they execute BEFORE the conn.query() callback that contains the code that sets those variables.
conn.query() is an asynchronous operation. It calls its callback sometime IN THE FUTURE. Meanwhile, your console.log() statements execute immediately.
You can see the results of the console.log() by putting the statements inside the conn.query() callback, but that is probably only part of your problem. If you explain what you're really trying to accomplish, then we could probably help with a complete solution. Right now, you're just asking questions about flawed code, but not explaining the higher level problem you're trying to solve so you're making it hard for us to give you the best answer to your actual problem.
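The ordering described above can be reproduced without Salesforce at all. In this small sketch, fakeQuery is a hypothetical stand-in for conn.query() that delivers its result on a later turn of the event loop:

```javascript
// Sketch: code after an asynchronous call runs before its callback.
// fakeQuery is a hypothetical stand-in for conn.query().
var order = [];

function fakeQuery(soql, callback) {
  // Deliver the result on a later turn of the event loop, like real I/O.
  setImmediate(function () {
    callback(null, { records: [{ Id: 'a01' }] });
  });
}

var employeeId; // still undefined until the callback runs

fakeQuery('SELECT Id FROM SFDC_Employee__c', function (err, result) {
  employeeId = result.records[0].Id;
  order.push('callback sees ' + employeeId);
});

// This line runs first, even though it appears after the query call.
order.push('after call sees ' + employeeId);

setImmediate(function () {
  console.log(order);
  // prints [ 'after call sees undefined', 'callback sees a01' ]
});
```

The value is only usable inside the callback, which is why dependent work has to be started from there.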
FYI:
app.locals - properties scoped to your app, available to all request handlers.
res.locals - properties scoped to a specific request, available only to middleware or request handlers involved in processing this specific request/response.
req.locals - I can't find any documentation on this in Express or HTTP module. There is discussion of this as basically serving the same purpose as res.locals, though it is not documented.
Other relevant answers:
req.locals vs. res.locals vs. res.data vs. req.data vs. app.locals in Express middleware
Express.js: app.locals vs req.locals vs req.session
You are missing the basics of asynchronous flow in JavaScript. Callbacks are queued and run on a later turn of the event loop, so the callback of conn.query will be executed after the console.log calls outside it. Here is a good article where the basic concepts of asynchronous programming in JavaScript are explained.

NodeJS, wait.for module and mysql query

Trying to use the wait.for module with a mysql query ( can do it with callbacks but would be nice to be able to do it with 'wait.for' )
I know that the sql connection query is non-standard so I'm not sure how to convert it.
var getUserQuery = "SELECT * FROM x WHERE id= '5'";
connection.query(getUserQuery, function(err, rows, fields){
....
});
How would I go about waiting to get the rows of this before my code proceeds on ?
link to module 'wait.for'
The part I am not understanding is at the bottom of that page - (Notes on usage on non-standard callbacks. e.g.: connection.query from mysql ).
https://github.com/luciotato/waitfor#notes-on-usage-on-non-standard-callbacks-eg-connectionquery-from-mysql
Notes on usage on non-standard callbacks. e.g.: connection.query from mysql
wait.for expects standardized callbacks. A standardized callback always returns (err,data) in that order.
A solution for the sql.query method and other non-standard callbacks is to create a wrapper function standardizing the callback, e.g.:
connection.prototype.q = function(sql, params, stdCallback){
this.query(sql,params, function(err,rows,columns){
return stdCallback(err,{rows:rows,columns:columns});
});
}
usage:
try {
var getUserQuery = "SELECT * FROM x WHERE id= ?";
var result = wait.forMethod(connection, "q", getUserQuery, ['5']);
console.log(result.rows);
console.log(result.columns);
}
catch(err) {
console.log(err);
}
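The README snippet above assigns to connection.prototype; if connection is an instance rather than a constructor, it has no prototype property to extend. A variant that sidesteps this is a standalone wrapper function. The sketch below shows only the callback standardization; queryStd is a hypothetical name and is not part of wait.for or the mysql module.

```javascript
// Sketch: fold mysql's (err, rows, fields) callback into the standard
// (err, data) shape. queryStd is a hypothetical name, not part of wait.for
// or the mysql module.
function queryStd(connection, sql, params, stdCallback) {
  connection.query(sql, params, function (err, rows, fields) {
    if (err) {
      return stdCallback(err);
    }
    // Bundle both values into a single data argument.
    stdCallback(null, { rows: rows, columns: fields });
  });
}
```

Assuming wait.for's usual form of wait.for(fn, args...), this would presumably be called inside a fiber as wait.for(queryStd, connection, getUserQuery, ['5']).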

NodeJS with arangojs and sync: Everything after .sync() ignored?

I want to use NodeJS to read 60k records from a MySQL database and write them to an ArangoDB database. I will later use ArangoDB's aggregation features etc. to process my dataset.
Coming from PHP, where a script usually runs synchronously, and because I believe it makes sense here, my initial (naive) attempt was to make my NodeJS script run synchronously too. However, it doesn't work as expected:
I print to console, call a function via .sync() to connect to ArangoDB server and print all existing databases, then print to console again. But everything below the sync call to my ArangoDB function is completely ignored (does not print to console again, nor does it seem to execute anything else here).
What am I overlooking? Does .done() in the function called via .sync() cause trouble?
var mysql = require('node-mysql');
var arango = require('arangojs');
//var sync = require('node-sync'); // Wrong one!
var sync = require('sync');
function test_arango_query() {
var db = arango.Connection("http://localhost:8529");
db.database.list().done(function(res) {
console.log("Databases: %j", res);
});
return "something?";
}
sync(function() {
console.log("sync?");
var result = test_arango_query.sync();
console.log("done."); // DOES NOT PRINT, NEVER EXECUTED?!
return result;
}, function(err, result) {
if (err) console.error(err);
console.log(result);
});
Your function test_arango_query doesn't use a callback. sync only works with functions that take a callback, because it needs to know when the data is ready before it can return it from .sync(). If your function never calls the callback, then sync can't ever return a result.
Update your function to call a callback function when you want it to return:
function test_arango_query(callback) {
var db = arango.Connection("http://localhost:8529");
db.database.list().done(function(res) {
console.log("Databases: %j", res);
// node-style callback: error first, then the result
callback(null, 'something');
});
}

node.js, pg module and done() method

Using the pg module and a client pool, I need to call the done() method in order to return the client to the pool.
Once I connect to the server, I add the SQL query to the client's query queue and start handling the result asynchronously, row by row, in the row event:
// Execute SQL query
var query = client.query("SELECT * FROM categories");
// Handle every row asynchronously
query.on('row', handleRow );
When should I call the done() method?
Should I call it once I receive the end event and all rows are processed, or can I call it immediately after I add the SQL query to the client's query queue?
Going from an example on this project's page (https://github.com/brianc/node-pg-query-stream), I'd recommend calling it when you get the end event.
This makes sense, because you're not done with it until you've received the last row. If someone else got that same connection and tried using it, that would likely create odd errors.
The former makes sense: you would want to call it once you know you have processed all rows for your query.
// your DB connection info
var conString = "pg://admin:admin@localhost:5432/Example";
var pg = require("pg");
var client = new pg.Client(conString);
client.connect();
// Your own query
var query = client.query("SELECT * FROM mytable");
query.on("row", function (row, result) {
// do your stuff with each row
result.addRow(row);
});
query.on("end", function (result) {
// here you have the complete result
console.log(JSON.stringify(result.rows, null, 2));
// end when done ;)
client.end();
});
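The example above ends a standalone client. For the pooled case in the question, a sketch along the same lines might call done() from the end event instead. It assumes, as the code above does, that client.query() returns an event emitter (the older pg API); streamRows is a hypothetical helper name.

```javascript
// Sketch: hand the pooled client back once the query has ended or failed.
// streamRows is a hypothetical helper; client.query() is assumed to return
// an event emitter, as in the older pg API used above.
function streamRows(client, done, sql, handleRow, callback) {
  var query = client.query(sql);
  var finished = false;

  function finish(err) {
    if (finished) { return; } // guard against both 'error' and 'end' firing
    finished = true;
    done();        // return the client to the pool exactly once
    callback(err);
  }

  query.on('row', handleRow);
  query.on('error', finish);
  query.on('end', function () { finish(null); });
}
```

Releasing on error as well as on end keeps a failed query from leaking a client out of the pool.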
