node.js, pg module and done() method

Using the pg module with a client pool, I need to call the done() method in order to return the client to the pool.
Once I connect to the server, I add an SQL query to the client's query queue and start handling the result asynchronously, row by row, in the row event:
// Execute SQL query
var query = client.query("SELECT * FROM categories");
// Handle every row asynchronously
query.on('row', handleRow );
When should I call the done() method?
Should I call it once I receive the end event and all rows have been processed, or can I call it immediately after I add the SQL query to the client's query queue?

Going by an example on this project's page (https://github.com/brianc/node-pg-query-stream), I'd recommend calling it when you get the end event.
This makes sense, because you're not done with the client until you've received the last row. If someone else got that same connection from the pool and tried using it, that would likely create odd errors. A sketch of that pattern is below.
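A minimal sketch, assuming the classic pg.connect pooling API (where done is handed to the connect callback):
pg.connect(conString, function (err, client, done) {
    if (err) { return console.error(err); }
    var query = client.query("SELECT * FROM categories");
    // handle every row asynchronously
    query.on('row', handleRow);
    query.on('end', function () {
        done(); // only now is it safe to return the client to the pool
    });
});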

The former makes sense: you would want to call it once you know you have processed all rows for your query.
// your DB connection info
var conString = "pg://admin:admin@localhost:5432/Example";
var pg = require("pg");
var client = new pg.Client(conString);
client.connect();

// your own query
var query = client.query("SELECT * FROM mytable");
query.on("row", function (row, result) {
    // do your stuff with each row
    result.addRow(row);
});
query.on("end", function (result) {
    // here you have the complete result
    console.log(JSON.stringify(result.rows, null, 2));
    // close the connection when done ;)
    client.end();
});

Related

MySQL inserts with AWS Lambda + Node.js

I'm running a Node.js function in AWS Lambda. It is supposed to do an insert into a MySQL DB after an HTTP GET. Everything seems to be fine: looking at the CloudWatch logs, the query is parsed correctly, and if I copy-paste the query into the MySQL console it does exactly what it is supposed to.
Essentially:
var mysql = require('mysql');
var connection = mysql.createConnection({ /* connection details */ });
connection.connect();
var query = connection.query('insert into AAA select * \
    from BBB where BBB.a = ?;', [parameter],
    function (err, result) {}
);
connection.end();
The problem is that the Lambda version simply does nothing. The query is visible and correct and the function returns cleanly, but it never actually inserts anything. I have the same problem with an update query as well, but all the MySQL selects work and return data, so that is not the problem. The insert also works when I run it on my machine; when I push it to Lambda, the problem appears.
I tried to add a separate commit statement but couldn't get that working either. I'm clearly missing something but can't figure out what. Do I need a transaction block for updates?
EDIT: Per Mark B's request. I think I tried to be smarter than I am by showing only part of the code. The whole logic was:
exports.handler = function (event, context, callback) {
    if (event.A == -1) {
        exports.updateDB(event, function (res) {
            context.succeed(res);
        });
    }
};
exports.updateDB = function (event, callback) {
    var mysql = require('mysql');
    var connection = mysql.createConnection({ /* connection details */ });
    connection.connect();
    var query = connection.query('update products set A=? where product_id = ?;',
        [parameters],
        function (err, result) {});
    var query = connection.query('insert into other_table select * from products where product_id = ?;',
        [parameters],
        function (err, result) {});
    connection.commit(function (err) {
        if (err) {
            connection.rollback(function () {
                throw err;
            });
        }
        connection.end();
    });
    callback({"ok": "ok"});
};
Per the advice given here I made the following changes: I took the last callback away and put the callback inside both connection.query callbacks:
var query = connection.query('insert into other_table select * from products where product_id = ?;',
    [parameters],
    function (err, result) {
        callback({"ok": "ok"});
    });
And it seems to work. I'm guessing now that the commit part does nothing, but it doesn't seem to break anything either. It's probably obvious at this point that I'm not much of a developer, and even less familiar with Node.js, so I truly appreciate the help I got!
Please note that the query function is asynchronous, meaning that no result will be available until the callback function is triggered. In your sample code, the connection is closed immediately after the query is issued, long before the callback executes. Try changing the code so that the connection is closed in the callback function, e.g.
var query = connection.query('Insert into AAA select * \
    from BBB where BBB.a = ?;', [parameter],
    function (err, result) {
        // now it is OK to close the connection
        connection.end();
        if (err) {
            // error handling
        }
        else {
            // do something with the result
        }
    }
);
By the way, since you are working with Lambda, the same applies to the callback(), context.succeed() and context.fail() handlers. In other words, you will most likely want to call them where I wrote the comments about error and result handling above.
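Putting the two together, a minimal sketch of the handler might look like this (the table names and the event.a field are placeholders taken from the question, not a definitive implementation):
exports.handler = function (event, context, callback) {
    var mysql = require('mysql');
    var connection = mysql.createConnection({ /* connection details */ });
    connection.query('insert into AAA select * from BBB where BBB.a = ?;', [event.a],
        function (err, result) {
            // close the connection first, then report the outcome to Lambda
            connection.end();
            if (err) { return callback(err); }  // signal failure
            callback(null, {"ok": "ok"});       // signal success only now
        });
};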

How do I ensure a DB call is completed before using db.close()? [in Node.js accessing MongoDB]

var storedArticleArray = db.collection('storedArticle').find(query).toArray;
console.dir(storedArticleArray);
db.close();
How can I ensure that console.dir(storedArticleArray) displays its argument only after the database has completed the query and stored the result in storedArticleArray, and that db.close() does not close the connection before the query has completed?
Does this work:
var storedArticleArray = db.collection('storedArticle').find(query).toArray(function () {
    console.dir(storedArticleArray);
    db.close();
});
You must use Node.js callback functions when you query your MongoDB database, because they're asynchronous operations.
Following your example you can use:
var storedArticleArray = [];
db.collection('storedArticle').find(query).toArray(function (error, data) {
    // data holds all rows once the query has completed
    storedArticleArray = data;
    console.dir(storedArticleArray);
    db.close();
});
The callback function will be executed once the query has completed and returned its data (or an error, which you must always handle). In the callback you can safely close the DB connection without problems.

Node Postgres Module not responding

I have an Amazon Elastic Beanstalk Node app that uses Postgres on Amazon RDS. To interface Node with Postgres I use node-postgres. The code looks like this:
var pg = require('pg'),
    done, client;

function DataObject(config, success, error) {
    var PG_CONNECT = "postgres://" + config.username + ":" + config.password + "@" +
        config.server + ":" + config.port + "/" + config.database;
    self = this;
    pg.connect(PG_CONNECT, function (_error, client, done) {
        if (_error) { error(); }
        else {
            self.client = client;
            self.done = done;
            success();
        }
    });
}

DataObject.prototype.add_data = function (data, success, error) {
    self = this;
    this.client.query('INSERT INTO sample (data) VALUES ($1,$2)',
        [data], function (_error, result) {
            self.done();
            success();
        });
};
To use it I create my DataObject and then call add_data every time new data comes along. Within add_data I call this/self.done() to release the connection back to the pool. Now when I repeatedly make those requests, client.query never calls back. Under what circumstances could this lead to a blocking/non-responding database interface?
The way you are using the pool is incorrect.
You are asking for a connection from the pool in the DataObject function. That function acts as a constructor and is executed once per data object, so only one connection is requested from the pool.
When you call add_data the first time, the query is executed and the connection is returned to the pool. The subsequent calls therefore fail, since the connection has already been returned.
You can verify this by logging _error:
DataObject.prototype.add_data = function (data, success, error) {
    self = this;
    this.client.query('INSERT INTO sample (data) VALUES ($1,$2)',
        [data], function (_error, result) {
            if (_error) console.log(_error); // log the error to the console
            self.done();
            success();
        });
};
There are a couple of ways you can do it differently:
Ask for a connection for every query made. You'll need to move the code that asks the pool for a connection into add_data; see the sketch after this list.
Release the client only after performing all queries. This is trickier: since calls are made asynchronously, you need to be careful that the client is not shared, i.e. that no new request is made until the client.query callback has finished.
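A minimal sketch of the first option, assuming the constructor keeps the connection string on the object (here as self.PG_CONNECT, an assumed property) instead of holding a client:
DataObject.prototype.add_data = function (data, success, error) {
    var self = this;
    pg.connect(self.PG_CONNECT, function (_error, client, done) {
        if (_error) { return error(_error); }
        client.query('INSERT INTO sample (data) VALUES ($1)', [data],
            function (_error, result) {
                done(); // release this client back to the pool right away
                if (_error) { return error(_error); }
                success();
            });
    });
};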

Node.js: getting a delayed DB result only works with console.log

I'm aware of the event-driven/non-blocking nature of Node and I've been using it for about 2 years...
Recently I came across this problem and I can't solve it, even by forcing a closure...
I'm asking the database for a simple result: 'SELECT 1+1 AS theResult'
(...) // previous code
var dbRows1 = {}; // empty object to hold rows
var dbRows2 = {}; // idem
var mysql = require('mysql');
var db = mysql.createConnection(dbInfo); // dbInfo holds connection data
var test = function (a) { dbRows2 = a; };
db.connect(function (err) {
    if (err) throw err.stack;
    console.log('TID ->', db.threadId);
});
db.query('SELECT 1+1 AS XXX',
    function (err, rows, fields) {
        console.log('rows ->', rows); // works OK
        dbRows1 = rows; // doesn't work: dbRows1 is still empty afterwards
        test(rows);     // !!! should work, but dbRows2 is empty at program end !!!
    });
(...) // more code
console.log(dbRows2);
The last line prints an empty object for dbRows2 ({ })...
Any ideas why console.log() works but my test() function does not?
Your test function is working; the problem is that at the time your last console.log statement executes, test hasn't been called yet. This is because both .connect and .query are asynchronous.
The program flow looks something like this:
connect is called and queued
query is called and queued
console.log(dbRows2) is called
(time passes...)
connect succeeds and calls back
(time passes...)
query succeeds and calls back
test is called, assigning rows to dbRows2
You can verify this behavior yourself by placing some strategic logging in your code before each action and inside of the callbacks. If you were to do that you would see something like this in your console:
about to call connect
about to call query
logging dbRows2
connect succeeded
query succeeded
about to call test
Without knowing the exact structure of your code, the following provides a rough example of how you might structure your logic to ensure the code that depends on the query completing is only invoked after the query completes:
db.query('SELECT 1+1 AS XXX',
    function (err, rows, fields) {
        console.log('rows ->', rows);
        dbRows1 = rows;
        test(rows);
        // call the code that depends on the query result here,
        // now that the rows have actually arrived
        doStuffThatDependsOnQuery();
    });

(...) // more code that doesn't depend on db.query(...)

function doStuffThatDependsOnQuery() {
    console.log(dbRows2);
}

node-postgres 'event emitter style' vs 'callback style'

node-postgres states the following:
node-postgres supports both an 'event emitter' style API and a 'callback' style. The callback style is more concise and generally preferred, but the evented API can come in handy. They can be mixed and matched.
With the event emitter API, I can do the following:
var db = new pg.Client("insert-postgres-connection-info");
db.connect();
And then I can use db to execute queries throughout my web app using db.query('sql statement here'). With the callback style, I would do the following each time I want to run a query:
pg.connect(conString, function (err, client) {
    client.query("sql statement", function (err, result) {
        // do stuff
    });
});
So my question is: why is the callback style "generally preferred"? Isn't it inefficient to open a connection each time you do something with the database? What benefits are there to using the callback style?
EDIT
I might be mistaken as to what is meant by "callback style" (I'm not kidding, my JavaScript isn't very strong), but my question is about the method of connection. I assumed the following was the callback-style connection method:
// Simple, using the built-in client pool
var pg = require('pg');
// or native libpq bindings
// var pg = require('pg').native

var conString = "tcp://postgres:1234@localhost/postgres";

// error handling omitted
pg.connect(conString, function (err, client) {
    client.query("SELECT NOW() as when", function (err, result) {
        console.log("Row count: %d", result.rows.length); // 1
        console.log("Current year: %d", result.rows[0].when.getYear());
    });
});
and the following is the EventEmitter API connection method:
// Evented api
var pg = require('pg'); //native libpq bindings = `var pg = require('pg').native`
var conString = "tcp://postgres:1234#localhost/postgres";
var client = new pg.Client(conString);
client.connect();
If I'm just getting terms mixed up here, my question still remains: pg.connect(...) opens a new connection every time you use it (doesn't it?), whereas
var client = new pg.Client(conString);
client.connect();
opens a connection and then allows you to use client to run queries when necessary, no?
The EventEmitter style is more for this type of thing:
var query = client.query("SELECT * FROM beatles WHERE name = $1", ['John']);
query.on('row', function(row) {
console.log(row);
console.log("Beatle name: %s", row.name); //Beatle name: John
console.log("Beatle birth year: %d", row.birthday.getYear()); //dates are returned as javascript dates
console.log("Beatle height: %d' %d\"", Math.floor(row.height/12), row.height%12); //integers are returned as javascript ints
});
By mixing and matching, you should be able to do the following:
// Connect using the EE style
var client = new pg.Client(conString);
client.connect();

// Query using the callback style
client.query("SELECT NOW() as when", function (err, result) {
    console.log("Row count: %d", result.rows.length); // 1
    console.log("Current year: %d", result.rows[0].when.getYear());
});
Note that even when using the callback style, you wouldn't open a connection every time you want to execute a query; most likely, you'd open a connection when the application starts and use it throughout.
There are pros and cons and the one you choose depends on your use case.
Use case 1: Return the result set to the client row-by-row.
If you're going to return data to the client much in the same way it comes out of the database, row by row, then you can use the event emitter style to reduce latency, which I define here as the time between issuing the request and receiving the first row. If you used the callback style instead, latency would be increased, since nothing can be forwarded until every row has arrived. A sketch follows.
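A rough sketch of the streaming approach (assuming an Express-style res object, which is not part of the original answer):
var query = client.query('SELECT * FROM big_table');
query.on('row', function (row) {
    res.write(JSON.stringify(row) + '\n'); // forward each row as soon as it arrives
});
query.on('end', function () {
    res.end(); // the client saw the first rows long before the last one arrived
});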
Use case 2: Return a hierarchical data structure (e.g. JSON) based on the entire result set.
If you're going to return data to the client in a hierarchical data structure such as JSON (which you would do to save bandwidth when the result set is a flat representation of a hierarchy), you should use the callback style, because you can't return anything until you have received all rows. You could use the event emitter style and accumulate rows (node-postgres provides such a mechanism so you don't have to maintain a map of partially built results by query), but it would be a pointless waste of effort, because you can't return any results until you have received the last row. An example is sketched below.
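For example, a minimal sketch, where buildTree is a hypothetical helper that nests the flat rows and res is again an assumed Express-style response:
client.query('SELECT * FROM categories', function (err, result) {
    if (err) { return res.status(500).end(); }
    // nothing could have been sent earlier anyway, so the callback style loses nothing
    res.json(buildTree(result.rows)); // buildTree: hypothetical flat-rows-to-tree helper
});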
Use case 3: Return an array of hierarchical data structures.
When returning an array of hierarchical data structures, you will have a lot of rows to get through all at once if you use the callback style. This would block for a significant amount of time, which isn't good, because you have only one thread to service many clients. So you should use the event emitter style with a row accumulator. Your result set should be ordered so that when you detect a change in the value of a particular field, you know the current row represents the beginning of a new result, and everything accumulated so far represents a now-complete result, which you can convert to your hierarchical form and return to the client. A sketch of that accumulator idea follows.
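A sketch of the accumulator pattern (group_id is a hypothetical grouping column, and res is again an assumed Express-style response):
var current = null;
var query = client.query('SELECT * FROM items ORDER BY group_id');
query.on('row', function (row) {
    if (current && row.group_id !== current.id) {
        res.write(JSON.stringify(current) + '\n'); // flush the completed structure
        current = null;
    }
    if (!current) { current = { id: row.group_id, children: [] }; }
    current.children.push(row); // accumulate rows belonging to the current group
});
query.on('end', function () {
    if (current) { res.write(JSON.stringify(current) + '\n'); } // flush the last group
    res.end();
});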
