many queries postgres (node), no parallel queries? - node.js

I am running a node server with the postgres-node (pg) package.
I wrote a program that sends n queries (for instance 20,000) at once to my Postgres database.
When I do this with several clients that each want to run 20,000 queries at once, there is no parallelism: the requests of the second client are queued until the first client has finished all of its queries.
Is this normal behavior for Postgres? If so, how can I prevent one user from taking all the resources (while the others have to wait) when there is no parallelism?
This is my code:
const express = require('express');
const app = express();
const { Pool } = require("pg");
const pool = new Pool();

function benchmark() {
  pool.connect((err, client, done) => {
    if (err) throw err;
    client.query("SELECT * from member where m_id = $1", [1], (err, res) => {
      done();
      if (err) {
        console.log(err.stack);
      } else {
        console.log(res.rows[0]);
      }
    });
  });
}

app.get('/', function(req, res) {
  for (let i = 0; i < 20000; i++) {
    benchmark();
  }
});

First you need to create a connection pool. Here's an example with node's pg, kept in a separate module (node-pg-sql.js) for convenience:
node-pg-sql.js:
const { Pool } = require('pg');
// fileNameConfigPGSQL holds your pool configuration (user, host, database, password, max, ...)
const pool = new Pool(fileNameConfigPGSQL);

module.exports = {
  query: (text, params, callback) => {
    const start = Date.now()
    return pool.query(text, params, (err, res) => {
      const duration = Date.now() - start
      // console.log('executed query', { text, duration, rows: res.rowCount })
      callback(err, res)
    })
  },
  getClient: (callback) => {
    pool.connect((err, client, done) => {
      const query = client.query.bind(client)
      // monkey patch the query method to keep track of the last query executed
      client.query = function (...args) {
        client.lastQuery = args
        return query(...args)
      }
      // 5 second timeout
      const timeout = setTimeout(() => {
        // console.error('A client has been checked out for more than 5 seconds!')
        // console.error(`The last executed query on this client was: ${client.lastQuery}`)
      }, 5000)
      const release = (err) => {
        // 'done' returns the client to the pool
        done(err)
        // clear the timeout
        clearTimeout(timeout)
        // restore the original query method (undo the monkey patch)
        client.query = query
      }
      callback(err, client, release)
    })
  }
}
In your postgresql.conf (on Linux normally under /var/lib/pgsql/data/postgresql.conf), set max_connections to the desired value:
max_connections = 300
Keep in mind:
Each PostgreSQL connection consumes RAM for managing the connection or the client using it. The more connections you have, the more RAM is used that could otherwise be used to run the database.
While increasing max_connections, you also need to increase shared_buffers and kernel.shmmax for the connection increase to be effective.
Whenever you want to run a query in one of your routes/endpoints, just require the separate client-pool file:
const db = require('../../../node-pg-sql');

module.exports = (router) => {
  router.get('/someRoute', (req, res) => {
    console.log(`*****************************************`);
    console.log(`Testing pg..`);
    let sqlSelect = `SELECT EXISTS (
      SELECT 1
      FROM pg_tables
      WHERE schemaname = 'someschema'
    )`;
    // pass an empty params array so the callback lands in the right argument position
    db.query(sqlSelect, [], (errSelect, responseSelect) => {
      if (errSelect) {
        /* INFO: Error while querying table */
        console.log(`*****************************************`);
        console.log(`ERROR WHILE CHECKING CONNECTION: ${errSelect}`);
      }
      else {
        // INFO: No error from database
        console.log(`*****************************************`);
        console.log(`CONNECTION TO PGSQL WAS SUCCESSFUL..`);
        res.json({ success: true, message: responseSelect, data: responseSelect.rows[0].exists });
      }
    })
  });
}
EDIT:
"there is no parallelity.."
Node is asynchronous, you can either work with promises or spawn more clients/pools and tune your max-connections (as explained in my answer, but keep performance of your host-machine in mind), but with multiple clients running around 20.000 queries, they won't resolve with a result instantly or parallel. What is the exact goal you try to achieve?
"Is this a normal behavior for postgres?"
This is due to node's event-loop as well as due to certain performance-limitation of the host-machine running the Postgres.
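For illustration, here is a minimal sketch of the promise-based approach (the member table comes from the question; the pool size of 20 and the two-caller setup are assumptions): the queries are issued through a shared pool, so requests from several callers interleave on the pool's connections instead of one caller blocking the rest.
const { Pool } = require('pg');
// 'max' caps how many connections this Node process holds; tune it together with max_connections
const pool = new Pool({ max: 20 });

// Runs `count` queries through the shared pool and resolves once all of them are done.
// pool.query() checks a client out of the pool for each query and returns it automatically.
async function benchmark(count) {
  const tasks = [];
  for (let i = 0; i < count; i++) {
    tasks.push(pool.query('SELECT * FROM member WHERE m_id = $1', [1]));
  }
  const results = await Promise.all(tasks);
  return results.length;
}

// Two "clients" started back to back: their queries interleave on the pool,
// so the second caller does not wait for all 20,000 queries of the first.
Promise.all([benchmark(20000), benchmark(20000)])
  .then(counts => console.log('finished', counts))
  .catch(err => console.error(err.stack));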

Related

MongoError: pool destroyed when fetching all data without conditions

I am new to MongoDB and I am trying to query different collections. When I fetch data from the category collection (essentially a select * over the collection), it throws MongoError: pool destroyed.
As far as I understand, find({}) is using a pool that is being destroyed.
The code I am using inside the model is below:
const MongoClient = require('mongodb').MongoClient;
const dbConfig = require('../configurations/database.config.js');

export const getAllCategoriesApi = (req, res, next) => {
  return new Promise((resolve, reject) => {
    let finalCategory = []
    const client = new MongoClient(dbConfig.url, { useNewUrlParser: true });
    client.connect(err => {
      const collection = client.db(dbConfig.db).collection("categories");
      debugger
      if (err) throw err;
      let query = { CAT_PARENT: { $eq: '0' } };
      collection.find(query).toArray(function(err, data) {
        if (err) return next(err);
        finalCategory.push(data);
        resolve(finalCategory);
        // db.close();
      });
      client.close();
    });
  });
}
My finding so far: when I use
let query = { CAT_PARENT: { $eq: '0' } };
collection.find(query).toArray(function(err, data) {})
find(query) returns data, but with {} or $gte/$gt it throws the pool error.
The code I have written in the controller is below:
import { getAllCategoriesListApi } from '../models/fetchAllCategory';
const redis = require("redis");
const client = redis.createClient(process.env.REDIS_PORT);

export const getAllCategoriesListData = (req, res, next, query) => {
  // Try fetching the result from Redis first in case we have it cached
  return client.get(`allstorescategory:${query}`, (err, result) => {
    // If that key exist in Redis store
    if (false) {
      res.send(result)
    } else {
      // Key does not exist in Redis store
      getAllCategoriesListApi(req, res, next).then(function (data) {
        const responseJSON = data;
        // Save the Wikipedia API response in Redis store
        client.setex(`allstorescategory:${query}`, 3600, JSON.stringify({ source: 'Redis Cache', responseJSON }));
        res.send(responseJSON)
      }).catch(function (err) {
        console.log(err)
      })
    }
  });
}
Can anyone tell me what mistake I am making here, and how I can fix the pool issue?
Thanks in advance.
I assume that toArray is asynchronous, i.e. it invokes the callback you pass in once the results have been read from the network.
If that is true, the client.close() call gets executed before the results have been read, which likely yields your error.
The close call needs to happen after you have finished iterating the results.
Separately from this, you should probably not be creating the client instance in the request handler like this. Client instances are expensive to create (they must talk to all of the servers in the deployment before they can actually perform queries) and generally should be created per running process rather than per request.
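As a minimal sketch of both points (reusing the dbConfig module from the question; the promise-returning driver calls are assumptions about the 3.x API): the client is created once per process, and the cursor is fully read before anything is torn down.
const MongoClient = require('mongodb').MongoClient;
const dbConfig = require('../configurations/database.config.js');

// Create and connect the client once per process, not per request.
const client = new MongoClient(dbConfig.url, { useNewUrlParser: true });
const clientReady = client.connect(); // promise that resolves when the client is connected

export const getAllCategoriesApi = (req, res, next) => {
  return clientReady
    .then(() => {
      const collection = client.db(dbConfig.db).collection('categories');
      const query = { CAT_PARENT: { $eq: '0' } };
      // toArray resolves only after all documents have been read from the cursor,
      // so nothing is closed underneath it.
      return collection.find(query).toArray();
    })
    .then(data => [data]) // keep the same "array in an array" shape as the original finalCategory
    .catch(next);
};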

How to connect to Mongodb reliably in a serverless setup?

Eight out of ten times everything connects well. That said, I sometimes get a MongoClient must be connected before calling MongoClient.prototype.db error. How should I change my code so that it works reliably (100%)?
I tried a code snippet from one of the creators of the Zeit Now platform.
My handler
const { send } = require('micro');
const { handleErrors } = require('../../../lib/errors');
const cors = require('../../../lib/cors')();
const qs = require('micro-query');
const mongo = require('../../../lib/mongo');
const { ObjectId } = require('mongodb');
const handler = async (req, res) => {
  let { limit = 5 } = qs(req);
  limit = parseInt(limit);
  limit = limit > 10 ? 10 : limit;
  const db = await mongo();
  const games = await db
    .collection('games_v3')
    .aggregate([
      {
        $match: {
          removed: { $ne: true }
        }
      },
      { $sample: { size: limit } }
    ])
    .toArray();
  send(res, 200, games);
};

module.exports = handleErrors(cors(handler));
My mongo script that reuses the connection in case the lambda is still warm:
// Based on: https://spectrum.chat/zeit/now/now-2-0-connect-to-database-on-every-function-invocation~e25b9e64-6271-4e15-822a-ddde047fa43d?m=MTU0NDkxODA3NDExMg==
const MongoClient = require('mongodb').MongoClient;

if (!process.env.MONGODB_URI) {
  throw new Error('Missing env MONGODB_URI');
}

let client = null;

module.exports = function getDb(fn) {
  if (client && !client.isConnected) {
    client = null;
    console.log('[mongo] client discard');
  }
  if (client === null) {
    client = new MongoClient(process.env.MONGODB_URI, {
      useNewUrlParser: true
    });
    console.log('[mongo] client init');
  } else if (client.isConnected) {
    console.log('[mongo] client connected, quick return');
    return client.db(process.env.MONGO_DB_NAME);
  }
  return new Promise((resolve, reject) => {
    client.connect(err => {
      if (err) {
        client = null;
        console.error('[mongo] client err', err);
        return reject(err);
      }
      console.log('[mongo] connected');
      resolve(client.db(process.env.MONGO_DB_NAME));
    });
  });
};
I need my handler to be 100% reliable.
if (client && !client.isConnected) {
client = null;
console.log('[mongo] client discard');
}
This code can cause problems! Even though you set client to null, that client still exists and will keep connecting to Mongo; it will not be garbage collected, and its connect callback will still run. By then, however, client refers to the next client you created, which is not necessarily connected.
A common pattern for this kind of code is to only ever return a single promise from the getDB call:
let clientP = null;

function getDb(fn) {
  if (clientP) return clientP;
  clientP = new Promise((resolve, reject) => {
    const client = new MongoClient(process.env.MONGODB_URI, {
      useNewUrlParser: true
    });
    client.connect(err => {
      if (err) {
        console.error('[mongo] client err', err);
        return reject(err);
      }
      console.log('[mongo] connected');
      resolve(client.db(process.env.MONGO_DB_NAME));
    });
  });
  return clientP;
}
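A short usage sketch (the require path and route shape are hypothetical): every caller awaits the same cached promise, so concurrent warm invocations share a single connection attempt.
const getDb = require('../../../lib/mongo'); // hypothetical path to the module above

module.exports = async (req, res) => {
  const db = await getDb(); // the same cached promise for every concurrent call
  const games = await db.collection('games_v3').find({}).limit(5).toArray();
  res.end(JSON.stringify(games));
};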
I had the same issue. In my case it was caused by calling getDb() before a previous getDb() call had returned. In this case, I believe that 'client.isConnected' returns true, even though it is still connecting.
This was caused by forgetting to put an 'await' before the getDb() call in one location. I tracked down which one by logging a call stack from getDb using:
console.log(new Error().stack);
I don't see the same issue in the sample code in the question, though it could be triggered by another bit of code that isn't shown.
I have written this article about serverless, Lambda and DB connections. It covers some concepts that could help you find the root cause of your problem, plus examples and use cases of how to mitigate connection pool issues.
Just by looking at your code I can tell it is missing this:
context.callbackWaitsForEmptyEventLoop = false;
Serverless: Dynamodb x Mongodb x Aurora serverless
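That flag belongs to the AWS Lambda handler signature rather than to the micro handler shown in the question, so the following is only a hedged sketch of where it typically goes when a cached connection is reused across warm invocations (getDb is the promise-caching helper from above; the path is hypothetical):
const getDb = require('./lib/mongo'); // hypothetical path

exports.handler = async (event, context) => {
  // Don't keep the invocation alive waiting for the open MongoDB socket in the event loop;
  // let the handler return as soon as its promise resolves.
  context.callbackWaitsForEmptyEventLoop = false;

  const db = await getDb();
  const games = await db.collection('games_v3').find({}).limit(5).toArray();
  return { statusCode: 200, body: JSON.stringify(games) };
};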

node.js Global connection already exists. Call sql.close() first

I'm trying to create web services using Node.js against a SQL Server database. In the frontend, when I call those two web services simultaneously, it throws the error Global connection already exists. Call sql.close() first.
Any solution?
var express = require('express');
var router = express.Router();
var sql = require("mssql");

router.get('/Plant/:server/:user/:password/:database', function(req, res, next) {
  user = req.params.user;
  password = req.params.password;
  server = req.params.server;
  database = req.params.database;
  // config for your database
  var config = {
    user: user,
    password: password,
    server: server,
    database: database
  };
  sql.connect(config, function (err) {
    // create Request object
    var request = new sql.Request();
    // query to the database and get the records
    request.query("SELECT distinct PlantName FROM MachineryStateTable"
      , function (err, recordset) {
        if (err) console.log(err)
        else {
          for (i = 0; i < recordset.recordsets.length; i++) {
            res.send(recordset.recordsets[i])
          }
        }
        sql.close();
      });
  });
});

router.get('/Dep/:server/:user/:password/:database/:plantname', function(req, res, next) {
  user = req.params.user;
  password = req.params.password;
  server = req.params.server;
  database = req.params.database;
  plantname = req.params.plantname;
  // config for your database
  var config = {
    user: user,
    password: password,
    server: server,
    database: database
  };
  sql.connect(config, function (err) {
    // create Request object
    var request = new sql.Request();
    // query to the database and get the records
    request.query("SELECT distinct DepName FROM MachineryStateTable where PlantName= '" + plantname + "'"
      , function (err, recordset) {
        if (err) console.log(err)
        else {
          for (i = 0; i < recordset.recordsets.length; i++) {
            res.send(recordset.recordsets[i])
          }
          sql.close();
        }
      });
  });
});

module.exports = router;
You have to create a connection pool.
Try this:
new sql.ConnectionPool(config).connect().then(pool => {
  return pool.request().query("SELECT * FROM MyTable")
}).then(result => {
  let rows = result.recordset
  res.setHeader('Access-Control-Allow-Origin', '*')
  res.status(200).json(rows);
  sql.close();
}).catch(err => {
  res.status(500).send({ message: `${err}` })
  sql.close();
});
According to the documentation, the close method should be used on the connection, not on the required module.
So it should be used like:
var connection = new sql.Connection({
  user: '...',
  password: '...',
  server: 'localhost',
  database: '...'
});
connection.close();
Also, a couple of suggestions:
1. Putting res.send in a loop isn't a good idea. You could reply with the entire recordsets, or do operations over them, store the result in a variable and send that back once.
2. Try using promises instead of callbacks; it would make the flow neater.
You must use ConnectionPool.
The following function returns a recordset with my query results.
async function execute2(query) {
  return new Promise((resolve, reject) => {
    new sql.ConnectionPool(dbConfig).connect().then(pool => {
      return pool.request().query(query)
    }).then(result => {
      resolve(result.recordset);
      sql.close();
    }).catch(err => {
      reject(err)
      sql.close();
    });
  });
}
Works fine in my code!
If this problem still bothers you, you can change the core API:
Go to node_modules\mssql\lib\base.js.
At line 1723, add the code below before the if condition:
globalConnection = null
In case someone comes here trying to find out how to use a SQL Server pool connection with parameters:
var executeQuery = function(res, query, parameters) {
  new sql.ConnectionPool(sqlConfig).connect().then(pool => {
    // create request object
    var request = new sql.Request(pool);
    // Add parameters
    parameters.forEach(function(p) {
      request.input(p.name, p.sqltype, p.value);
    });
    // query to the database
    request.query(query, function(err, result) {
      res.send(result);
      sql.close();
    });
  })
}
Don't read their documentation; I don't think it was written by someone who actually uses the library :) Also, don't pay much attention to the names of things: a 'ConnectionPool' doesn't seem to actually be a connection pool of any sort. If you try to create more than one connection from a pool, you will get an error. This is the code that I eventually got working:
const sql = require('mssql');
let pool = new sql.ConnectionPool(config); // some object that lets you connect ONCE
let cnn = await pool.connect(); // create single allowed connection on this 'pool'
let result = await cnn.request().query(query);
console.log('result:', result);
cnn.close(); // close your connection
return result;
This code can be run multiple times in parallel and seems to create multiple connections and correctly close them.
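For completeness, a hedged sketch of the pattern the answers above converge on (the config module and query are placeholders): create one ConnectionPool per process, cache its connect() promise, and let every route share it instead of calling the global sql.connect():
// db.js - one pool per process; every route shares it
const sql = require('mssql');
const config = require('./sql-config'); // hypothetical config module

const pool = new sql.ConnectionPool(config);
const poolConnect = pool.connect(); // start connecting once and cache the promise

async function runQuery(query) {
  await poolConnect; // waits only until the pool is ready
  const result = await pool.request().query(query);
  return result.recordset;
}

module.exports = { runQuery };
With this, two routes hitting the database at the same time simply check out different connections from the same pool, and no global connection is created or closed per request.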

Querying Large Dataset in Oracle Database from NodeJS

I'm currently working on a project at work where I have an Oracle 10 database table with about 310K rows, give or take 10-30K.
The goal is to display those rows in an Angular frontend, but returning all of them through Node.js takes a lot of time.
Given that I'm using both Node.js and oracledb for the first time, I assume I must be missing something?
var oracledb = require('oracledb');
var config = require(__dirname + '/../db.js');

function get(req, res, next) {
  var table = req.query.table;
  var meta;
  oracledb.getConnection(config.oracle)
    .then(function(connection) {
      var stream = connection.queryStream('SELECT * FROM ' + table);
      stream.on('error', function (error) {
        console.error(error);
        return next(error);
      });
      stream.on('metadata', function (metadata) {
        console.log(metadata);
      });
      stream.on('data', function (data) {
        console.log(data);
      });
      stream.on('end', function () {
        connection.release(function(err) {
          if (err) {
            console.error(err.message);
            return next(err);
          }
        });
      });
    })
    .catch(function(err) {
      if (err) {
        console.error(err.message);
        return next(err);
      }
    });
}

module.exports.get = get;
30 MB is a lot of data to load into the front end. It can work in some cases, such as desktop web apps where the benefits of "caching" the data offset the time needed to load it (and increased stale data is okay). But it will not work well in other cases, such as mobile.
Keep in mind that the 30 MB must be moved from the DB to Node.js and then from Node.js to the client. The network connections between these will greatly impact performance.
I'll point out a few things that can help performance, though not all are exactly related to this question.
First, if you're using a web server, you should be using a connection pool, not dedicated/one-off connections. Generally, you'd create the connection pool in your index/main/app.js and start the web server after that's done and ready.
Here's an example:
const oracledb = require('oracledb');
const express = require('express');
const config = require('./db-config.js');
const thingController = require('./things-controller.js');

// Node.js uses 4 background threads by default, increase to handle max DB pool.
// This must be done before any other calls that will use the libuv threadpool.
process.env.UV_THREADPOOL_SIZE = config.poolMax + 4;

// This setting can be used to reduce the number of round trips between Node.js
// and the database.
oracledb.prefetchRows = 10000;

function initDBConnectionPool() {
  console.log('Initializing database connection pool');
  return oracledb.createPool(config);
}

function initWebServer() {
  console.log('Initializing webserver');
  app = express();
  let router = new express.Router();
  router.route('/things')
    .get(thingController.get);
  app.use('/api', router);
  app.listen(3000, () => {
    console.log('Webserver listening on localhost:3000');
  });
}

initDBConnectionPool()
  .then(() => {
    initWebServer();
  })
  .catch(err => {
    console.log(err);
  });
That will create a pool which is added to the internal pool cache in the driver. This allows you to easily access it from other modules (example later).
Note that when using connection pools, it's generally a good idea to increase the thread pool available to Node.js to allow each connection in the pool to work concurrently. An example of this is included above.
In addition, I'm increasing the value of oracledb.prefetchRows. This setting is directly related to your question. Network round trips are used to move the data between the DB and Node.js, and this setting lets you adjust the number of rows fetched with each round trip. So as prefetchRows goes higher, fewer round trips are needed and performance increases. Just be careful you don't go too high, given the memory available on your Node.js server.
I ran a generic test that mocked the 30 MB dataset size. When oracledb.prefetchRows was left at the default of 100, the test finished in 1 minute 6 seconds. When I bumped this up to 10,000, it finished in 27 seconds.
Okay, moving on to "things-controller.js" which is based on your code. I've updated the code to do the following:
Assert that table is a valid table name. Your current code is vulnerable to SQL injection.
Use a promise chain that emulates a try/catch/finally block to close the connection just once and return the first error encountered (if needed).
Work so I could run the test.
Here's the result:
const oracledb = require('oracledb');

function get(req, res, next) {
  const table = req.query.table;
  const rows = [];
  let conn;
  let err; // Will store the first error encountered

  // You need something like this to prevent SQL injection. The current code
  // is wide open.
  if (!isSimpleSqlName(table)) {
    next(new Error('Not simple SQL name'));
    return;
  }

  // If you don't pass a config, the connection is pulled from the 'default'
  // pool in the cache.
  oracledb.getConnection()
    .then(c => {
      return new Promise((resolve, reject) => {
        conn = c;
        const stream = conn.queryStream('SELECT * FROM ' + table);
        stream.on('error', err => {
          reject(err);
        });
        stream.on('data', data => {
          rows.push(data);
        });
        stream.on('end', function () {
          resolve();
        });
      });
    })
    .catch(e => {
      err = err || e;
    })
    .then(() => {
      if (conn) { // conn assignment worked, need to close/release conn
        return conn.close();
      }
    })
    .catch(e => {
      console.log(e); // Just log, error during release doesn't affect other work
    })
    .then(() => {
      if (err) {
        next(err);
        return;
      }
      res.status(200).json(rows);
    });
}

module.exports.get = get;

function isSimpleSqlName(name) {
  if (name.length > 30) {
    return false;
  }
  // Fairly generic, but effective. Would need to be adjusted to accommodate quoted identifiers,
  // schemas, etc.
  if (!/^[a-zA-Z0-9#_$]+$/.test(name)) {
    return false;
  }
  return true;
}
I hope that helps. Let me know if you have questions.

increase number response per second

I have an Android game with 40,000 users online, and each user sends a request to the server every 5 seconds.
I wrote this code to test requests:
const express = require('express')
const app = express()
const pg = require('pg')
const conString = 'postgres://postgres:123456@localhost/dbtest'

app.get('/', function (req, res, next) {
  pg.connect(conString, function (err, client, done) {
    if (err) {
      return next(err)
    }
    client.query('SELECT name, age FROM users limit 1;', [], function (err, result) {
      done()
      if (err) {
        return next(err)
      }
      res.json(result.rows)
    })
  })
})

app.listen(3000)
Demo
And to test this code with 40,000 requests, I wrote this ajax code:
for (var i = 0; i < 40000; i++) {
  var j = 1;
  $.ajax({
    url: "http://85.185.161.139:3001/",
    success: function(reponse) {
      var d = new Date();
      console.log(j++, d.getHours() + ":" + d.getMinutes() + ":" + d.getSeconds());
    }
  });
}
Server detail (I know this is poor)
Questions:
This code (Node.js) only responds to 200 requests per second!
How can I improve my code to increase the number of responses per second?
Is this way (ajax) of simulating 40,000 online users correct or not?
Would using sockets be better?
You should use a divide-and-conquer approach for solving such problems: find the most resource-inefficient operation and try to replace it or reduce the number of calls to it.
The main problem I see here is that the server opens a new connection to the database on each request, which probably takes most of the time and resources.
I suggest opening the connection when the server boots up and reusing it across requests.
const express = require('express')
const app = express()
const pg = require('pg')
const conString = 'postgres://postgres:123456@localhost/dbtest'

let pgClient

pg.connect(conString, function (err, client, done) {
  if (err) {
    throw err
  }
  pgClient = client
})

app.get('/', function (req, res, next) {
  pgClient.query('SELECT name, age FROM users limit 1;', [], function (err, result) {
    if (err) {
      return next(err)
    }
    res.json(result.rows)
  })
})

app.listen(3000)
For proper stress/load testing, better to use specialized utilities such as ab from Apache. Finally, sockets are better for rapid, small data transfers, but remember they have problems with scaling and in most cases become very inefficient at 10K+ simultaneous connections.
EDIT: As @robertklep pointed out, it's better to use client pooling in this case and retrieve a client from the pool.
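As a hedged sketch of that pooling suggestion (the connection string and query are taken from the question; the pool size is an assumption), using pg's built-in Pool so each request checks out a client and returns it automatically:
const express = require('express')
const { Pool } = require('pg')
const app = express()

// One pool per process; 'max' is an assumed value to tune against Postgres max_connections
const pool = new Pool({
  connectionString: 'postgres://postgres:123456@localhost/dbtest',
  max: 20
})

app.get('/', function (req, res, next) {
  // pool.query acquires a client, runs the query and releases the client for you
  pool.query('SELECT name, age FROM users limit 1;', [], function (err, result) {
    if (err) {
      return next(err)
    }
    res.json(result.rows)
  })
})

app.listen(3000)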
