Preventing race conditions with Node.js

I'm writing an application using Node.js 6.3.0 and AWS DynamoDB.
DynamoDB holds statistics that are written by 10 different functions (10 different statistic measures). The interval is set to 10 seconds, which means that every 10 seconds, 10 calls are made to my function to add all the relevant information.
The putItem function:
function putItem(tableName, itemData, callback) {
    var params = {
        TableName: tableName,
        Item: itemData
    };
    docClient.put(params, function (err, data) {
        if (err) {
            logger.error(params, "putItem failed in dynamodb");
            callback(err, null);
        } else {
            callback(null, data);
        }
    });
}
Now I created a queue:
var queue = require('./dynamoDbQueue').queue;
that implements a simple fixed-size queue, which I took from http://www.bennadel.com/blog/2308-creating-a-fixed-length-queue-in-javascript-using-arrays.htm.
The idea is that if there is a network problem, let's say for a minute, I want all the events to be pushed to the queue, and when the problem is resolved, to send the queued information to DynamoDB and free the queue.
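For reference, a minimal sketch of what such a fixed-length queue module (./dynamoDbQueue) might look like, following the pattern from the linked post; the size constant and export name are assumptions:
// Hedged sketch of ./dynamoDbQueue, based on the linked fixed-length-queue
// pattern; FIXED_SIZE and the export name are assumptions.
var FIXED_SIZE = 1000;
var queue = [];
queue.push = function (item) {
    // Drop the oldest entry once the queue reaches its fixed size.
    if (this.length >= FIXED_SIZE) {
        this.shift();
    }
    return Array.prototype.push.call(this, item);
};
exports.queue = queue;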
So I modified my original function to the following:
function putItem(tableName, itemData, callback) {
    var params = {
        TableName: tableName,
        Item: itemData
    };
    if (queue.length > 0) {
        queue.push(params);
        callback(null, null);
    } else {
        docClient.put(params, function (err, data) {
            if (err) {
                queue.push(params);
                logger.error(params, "putItem failed in dynamodb");
                handleErroredQueue(); // imaginary function that I need to implement
                callback(err, null);
            } else {
                callback(null, data);
            }
        });
    }
}
But since I have 10 insert functions that run in the same second, there is a chance of race conditions, which means that:
execute1 - one function validated that the queue is empty and is about to execute the docClient.put() function.
execute2 - at the same time, another function returned from docClient.put() with an error, and as a result it adds its first row to the queue.
execute1 - by the time the first function calls docClient.put(), the problem has been resolved and it successfully inserts its data into DynamoDB, which leaves the queue with previous data that will be released in the next iteration.
So, for example, if I inserted 4 rows with ids 1, 2, 3, 4, the order of rows inserted into DynamoDB would be 1, 2, 4, 3.
Is there a way to resolve that?
Thanks!

I think you are on the right track, but instead of checking for an error and then adding to the queue, what I would suggest is to add every operation to the queue first and then read from the queue every time.
For instance, in your case you call functions 1, 2, 3, 4 and it results in 1, 2, 4, 3 because you are only using the queue at the time of an error/abrupt operation.
Step 1: All your functions make an entry into a queue -> 1, 2, 3, 4
Step 2: Read your queue and make an insert; if it succeeds, remove the element, else redo the operation. This way it will insert in the desired sequence.
Another advantage is that because you are using a queue you don't have to provision very high throughput for the table.
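A minimal sketch of this queue-first approach, reusing the queue, docClient and logger from the question (the drainQueue helper and the retry delays are assumptions, not part of the original code):
// Hedged sketch: every write goes through the queue first, and a single
// drain loop sends items to DynamoDB strictly in order.
function putItem(tableName, itemData, callback) {
    queue.push({ TableName: tableName, Item: itemData }); // queue first, always
    callback(null, null);
}

function drainQueue() {
    if (queue.length === 0) {
        setTimeout(drainQueue, 1000); // nothing queued; check again soon
        return;
    }
    var params = queue[0]; // peek; remove only after a confirmed success
    docClient.put(params, function (err) {
        if (err) {
            logger.error(params, "putItem failed in dynamodb, will retry");
        } else {
            queue.shift(); // confirmed write: drop it from the queue
        }
        setTimeout(drainQueue, err ? 5000 : 0); // back off a little on error
    });
}

drainQueue();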
Edit:
I guess you just need to ensure that on completion of your first operation you perform the next one, and not before that.
e.g.: fn 1 -> read from queue (don't delete from the queue right now) -> operation completed (if not, perform it again) -> delete from queue -> perform the next operation.
You just have to make sure you read from the queue and wait until you get a response from DynamoDB.
Hope this helps.

Related

Run a Cron Job every 30mins after onCreate Firestore event

I want to have a cron job/scheduler that will run every 30 minutes after an onCreate event occurs in Firestore. The cron job should trigger a cloud function that picks the documents created in the last 30 minutes, validates them against a JSON schema, and saves them in another collection. How do I achieve this, programmatically writing such a scheduler?
What would also be a fail-safe mechanism, and some sort of queuing/tracking of the documents created before the cron job runs, to push them to another collection?
Building a queue with Firestore is simple and fits your use case perfectly. The idea is to write tasks to a queue collection, with a due date, that will then be processed when due.
Here's an example.
Whenever your initial onCreate event for your collection occurs, write a document with the following data to a tasks collection:
duedate: new Date() + 30 minutes
type: 'yourjob'
status: 'scheduled'
data: '...' // <-- put whatever data here you need to know when processing the task
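For illustration, a hedged sketch of the onCreate trigger that enqueues such a task document (the watched collection items and the export name enqueueTask are assumptions):
// Hedged sketch: on every create in the watched collection, enqueue a task
// due in 30 minutes. 'items' and 'enqueueTask' are assumed names.
const functions = require('firebase-functions');
const admin = require('firebase-admin');
admin.initializeApp();
const db = admin.firestore();

exports.enqueueTask = functions.firestore
    .document('items/{docId}')
    .onCreate((snapshot, context) => {
        const due = new Date(Date.now() + 30 * 60 * 1000); // now + 30 minutes
        return db.collection('tasks').add({
            duedate: admin.firestore.Timestamp.fromDate(due),
            type: 'yourjob',
            status: 'scheduled',
            data: { sourceId: context.params.docId } // whatever you need when processing
        });
    });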
Have a worker pick up available work regularly, e.g. every minute, depending on your needs:
// Define what happens on what task type.
// Workers maps a task type to its handler, roughly:
// type Workers = { [type: string]: (data: any) => Promise<any> };
const workers: Workers = {
    yourjob: (data) => db.collection('xyz').add({ foo: data }),
}

// The following needs to be scheduled
export const checkQueue = functions.https.onRequest(async (req, res) => {
    // Consistent timestamp
    const now = admin.firestore.Timestamp.now();
    // Check which tasks are due
    const query = db.collection('tasks').where('duedate', '<=', now).where('status', '==', 'scheduled');
    const tasks = await query.get();
    // Process tasks and mark them in the queue as done
    const jobs: Promise<any>[] = [];
    tasks.forEach(snapshot => {
        const { type, data } = snapshot.data();
        console.info('Executing job for task ' + JSON.stringify(type) + ' with data ' + JSON.stringify(data));
        const job = workers[type](data)
            // Update task doc with status or error
            .then(() => snapshot.ref.update({ status: 'complete' }))
            .catch((err) => {
                console.error('Error when executing worker', err);
                return snapshot.ref.update({ status: 'error' });
            });
        jobs.push(job);
    });
    return Promise.all(jobs).then(() => {
        res.send('ok');
        return true;
    }).catch((onError) => {
        console.error('Error', onError);
    });
});
You have different options to trigger the checking of the queue if there is a task that is due:
Using an HTTP callable function as in the example above. This requires you to perform an HTTP call to this function regularly so it executes and checks if there is a task to be done. Depending on your needs you could do it from your own server or use a service like cron-job.org to perform the calls. Note that the HTTP callable function will be available publicly, and potentially others could also call it. However, if you make your check code idempotent, it shouldn't be an issue.
Use the Firebase "internal" cron option that uses Cloud Scheduler internally. Using that you can directly trigger the queue checking:
export const scheduledFunctionCrontab =
    functions.pubsub.schedule('* * * * *').onRun((context) => {
        console.log('This will be run every minute!');
        // Include code from checkQueue here from above
    });
Using such a queue also makes your system more robust: if something goes wrong in between, you will not lose tasks that would otherwise only exist in memory; as long as they are not marked as processed, a fixed worker will pick them up and reprocess them. This of course depends on your implementation.
You can trigger a cloud function on the Firestore onCreate event, which will schedule a Cloud Task after 30 minutes. This gives you a queuing and retry mechanism.
An easy way is that you could add a created field with a timestamp, and then have a scheduled function run at a predefined period (say, once a minute) and execute certain code for all records where created >= NOW - 31 mins AND created <= NOW - 30 mins (pseudocode). If your time precision requirements are not extremely high, that should work for most cases.
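A hedged sketch of that scheduled windowed query, assuming the same functions/admin/db setup as the examples above (collection and field names are assumptions):
// Hedged sketch: every minute, process documents whose 'created' timestamp
// fell into the 30-to-31-minutes-ago window. Names are assumptions.
exports.processWindow = functions.pubsub.schedule('every 1 minutes').onRun(async () => {
    const now = Date.now();
    const lower = admin.firestore.Timestamp.fromMillis(now - 31 * 60 * 1000);
    const upper = admin.firestore.Timestamp.fromMillis(now - 30 * 60 * 1000);
    const snap = await db.collection('items')
        .where('created', '>=', lower)
        .where('created', '<', upper)
        .get();
    // Validate each document against your schema and copy it to the target collection.
    await Promise.all(snap.docs.map((doc) => db.collection('validated').add(doc.data())));
});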
If this doesn't suit your needs, you can add a Cloud Task (Google Cloud product). The details are specified in this good article.

Do not process next job until previous job is completed (BullJS/Redis)?

Basically, each of the clients (each has a clientId associated with it) can push messages, and it is important that a second message from the same client isn't processed until the first one has finished processing (even though a client can send multiple messages in a row, and they are ordered, and multiple clients sending messages should ideally not interfere with each other). And, importantly, a job shouldn't be processed twice.
I thought that using Redis I might be able to fix this issue. I started with some quick prototyping using the bull library, but I am clearly not doing it well; I was hoping someone would know how to proceed.
This is what I tried so far:
Create jobs and add them to the same queue name for one process, using the clientId as the job name.
Consume jobs while waiting large random amounts of time, on 2 separate processes.
I tried adding the default locking provided by the library that I am using (bull), but it locks on the jobId, which is unique for each job, not on the clientId.
What I would want to happen:
One of the consumers can't take the job from the same clientId until the previous one is finished processing it.
They should be able to, however, get items from different clientIds in parallel without problem (asynchronously). (I haven't gotten this far; right now I am simply dealing with only one clientId.)
What I get:
Both consumers consume as many items as they can from the queue, without waiting for the previous item for a clientId to be completed.
Is Redis even the right tool for this job?
Example code
// ./setup.ts
import Queue from 'bull';
import * as uuid from 'uuid';

// Check that when a message is taken from a place, no other message is taken
// To do that test, have two processes that process messages and one that sets messages, and make the job take a long time
// queue for each room https://stackoverflow.com/questions/54178462/how-does-redis-pubsub-subscribe-mechanism-works/54243792#54243792
// https://groups.google.com/forum/#!topic/redis-db/R09u__3Jzfk
// Make a job not be called stalled, waiting enough time https://github.com/OptimalBits/bull/issues/210#issuecomment-190818353

export async function sleep(ms: number): Promise<void> {
    return new Promise((resolve) => {
        setTimeout(resolve, ms);
    });
}

export interface JobData {
    id: string;
    v: number;
}

export const queue = new Queue<JobData>('messages', 'redis://127.0.0.1:6379');

queue.on('error', (err) => {
    console.error('Uncaught error on queue.', err);
    process.exit(1);
});

export function clientId(): string {
    return uuid.v4();
}

export function randomWait(minms: number, maxms: number): Promise<void> {
    const ms = Math.random() * (maxms - minms) + minms;
    return sleep(ms);
}

// Make a job not be called stalled, waiting enough time https://github.com/OptimalBits/bull/issues/210#issuecomment-190818353
// eslint-disable-next-line @typescript-eslint/ban-ts-comment
// @ts-ignore
queue.LOCK_RENEW_TIME = 5 * 60 * 1000;
// ./create.ts
import { queue, randomWait } from './setup';

const MIN_WAIT = 300;
const MAX_WAIT = 1500;

async function createJobs(n = 10): Promise<void> {
    await randomWait(MIN_WAIT, MAX_WAIT);
    // always same Id
    const clientId = Math.random() > 1 ? 'zero' : 'one';
    for (let index = 0; index < n; index++) {
        await randomWait(MIN_WAIT, MAX_WAIT);
        const job = { id: clientId, v: index };
        await queue.add(clientId, job).catch(console.error);
        console.log('Added job', job);
    }
}

export async function create(nIds = 10, nItems = 10): Promise<void> {
    const jobs = [];
    await randomWait(MIN_WAIT, MAX_WAIT);
    for (let index = 0; index < nIds; index++) {
        await randomWait(MIN_WAIT, MAX_WAIT);
        jobs.push(createJobs(nItems));
        await randomWait(MIN_WAIT, MAX_WAIT);
    }
    await randomWait(MIN_WAIT, MAX_WAIT);
    await Promise.all(jobs);
    process.exit();
}

(function mainCreate(): void {
    create().catch((err) => {
        console.error(err);
        process.exit(1);
    });
})();
// ./consume.ts
import { queue, randomWait, clientId } from './setup';

function startProcessor(minWait = 5000, maxWait = 10000): void {
    queue
        .process('*', 100, async (job) => {
            console.log('LOCKING: ', job.lockKey());
            await job.takeLock();
            const name = job.name;
            const processingId = clientId().split('-', 1)[0];
            try {
                console.log('START: ', processingId, '\tjobName:', name);
                await randomWait(minWait, maxWait);
                const data = job.data;
                console.log('PROCESSING: ', processingId, '\tjobName:', name, '\tdata:', data);
                await randomWait(minWait, maxWait);
                console.log('PROCESSED: ', processingId, '\tjobName:', name, '\tdata:', data);
                await randomWait(minWait, maxWait);
                console.log('FINISHED: ', processingId, '\tjobName:', name, '\tdata:', data);
            } catch (err) {
                console.error(err);
            } finally {
                await job.releaseLock();
            }
        })
        .catch(console.error); // Catches initialization errors
}

startProcessor();
This is run using 3 different processes, which you might call like this (although I use different tabs for a clearer view of what is happening):
npx ts-node consume.ts &
npx ts-node consume.ts &
npx ts-node create.ts &
I'm not familiar with node.js. But for Redis, I would try this.
Let's say you have client_1 and client_2; they are both publishers of events.
You have three machines: consumer_1, consumer_2, consumer_3.
Establish a list of tasks in Redis, e.g. JOB_LIST.
Clients put (LPUSH) jobs into this JOB_LIST, in a specific form, like "CLIENT_1:[jobcontent]", "CLIENT_2:[jobcontent]".
Each consumer takes out jobs blockingly (Redis's BRPOP command) and processes them.
For example, consumer_1 takes out a job whose content is CLIENT_1:[jobcontent]. It parses the content and recognizes it's from CLIENT_1. Then it checks whether some other consumer is processing CLIENT_1 already; if not, it locks a key to indicate that it's processing CLIENT_1.
It goes on to set a key "CLIENT_1_PROCESSING", with content "consumer_1", using the Redis SETNX command (set if the key does not exist), with an appropriate timeout. For example, if the task normally takes one minute to finish, you set a timeout of five minutes on the key, just in case consumer_1 crashes and holds the lock indefinitely.
If the SETNX returns 0, it means it failed to acquire the lock for CLIENT_1 (someone is already processing a job of client_1). Then it returns the job (the value "CLIENT_1:[jobcontent]") to the left side of JOB_LIST, using the Redis LPUSH command. Then it might wait a bit (sleep a few seconds) and RPOP another task from the right side of the list. If this time SETNX returns 1, consumer_1 acquires the lock. It goes on to process the job; after it finishes, it deletes the key "CLIENT_1_PROCESSING", releasing the lock. Then it goes on to RPOP another job, and so on.
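A minimal sketch of this consumer loop in Node, using the ioredis client (an assumption; any Redis client exposing the same commands works, and processJob is a hypothetical helper):
// Hedged sketch of the consumer loop described above; ioredis and the
// processJob helper are assumptions.
const Redis = require('ioredis');
const redis = new Redis('redis://127.0.0.1:6379');

async function consumeOne() {
    // Blocking right-pop: wait up to 5 seconds for a job.
    const res = await redis.brpop('JOB_LIST', 5);
    if (!res) return; // no job was due
    const job = res[1]; // e.g. "CLIENT_1:[jobcontent]"
    const client = job.split(':', 1)[0];

    // SET ... NX EX combines SETNX with a timeout in one atomic call.
    const locked = await redis.set(client + '_PROCESSING', 'consumer_1', 'EX', 300, 'NX');
    if (!locked) {
        // Someone else holds the lock: push the job back and try later.
        await redis.lpush('JOB_LIST', job);
        return;
    }
    try {
        await processJob(job); // your actual work goes here (hypothetical)
    } finally {
        await redis.del(client + '_PROCESSING'); // release the lock
    }
}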
Some things to consider:
The JOB_LIST is not fair; e.g., earlier jobs might be processed later.
The locking part is a bit rudimentary, but it will suffice.
Update:
I've figured out another way to keep tasks in order.
For each client (producer), build a list, like "client_1_list", and push jobs into the left side of the list.
Save all the client names in a list "client_names_list", with values "client_1", "client_2", etc.
For each consumer (processor), iterate over the "client_names_list". For example, consumer_1 gets "client_1" and checks if the key of client_1 is locked (someone is processing a task of client_1 already); if not, it right-pops a value (job) from client_1_list and locks client_1. If client_1 is locked, it can sleep one second and iterate to the next client, "client_2" for example, and check the keys, and so on.
This way, each client's (task producer's) tasks are processed in their order of arrival.
EDIT: I found the problem regarding BullJS starting jobs in parallel on one processor: we are using named jobs and were defining many named process functions on one queue/processor. The default concurrency factor for a queue/processor is 1, so the queue should not process any jobs in parallel.
The problem with our mentioned setup is that if you define many (named) process handlers on one queue, the concurrency is added up with each process-handler function: so if you define three named process handlers, you get a concurrency factor of 3 for the given queue, for all the defined named jobs.
So just define one named job per queue for queues where parallel processing should not happen and all jobs should run sequentially, one after the other.
That can be important, e.g., when pushing a high number of jobs onto the queue where the processing involves API calls that would give errors if handled in parallel.
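A minimal sketch of such a configuration (the queue name, job name, and doWork helper are assumptions):
// Hedged sketch: one queue with a single named process handler, so the
// effective concurrency stays at 1 and jobs run strictly sequentially.
const Queue = require('bull');

const sequentialQueue = new Queue('sequential-work', 'redis://127.0.0.1:6379');

// Only ONE named handler on this queue; each additional named handler
// would add 1 to the queue's effective concurrency.
sequentialQueue.process('only-job', 1, async (job) => {
    return doWork(job.data); // hypothetical worker function
});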
The following text was my first approach to answering the OP's question and describes just a workaround for the problem. So better just go with my edit above :) and configure your queues the right way.
I found an easy solution to the OP's question.
In fact, BullJS was processing many jobs in parallel on one worker instance:
Let's say you have one worker instance up and running and push 10 jobs onto the queue; then possibly that worker starts all processes in parallel.
My research on BullJS queues showed that this is not intended behavior: one worker (also called a processor by BullJS) should only start a new job from the queue when it's in an idle state, i.e. not processing a former job.
Nevertheless, BullJS kept starting jobs in parallel on one worker.
In our implementation that led to big problems during API calls, most likely caused by too many API calls at a time. Tests showed that when starting only one worker the API calls finished just fine and returned status 200.
So how do you process one job after the other once the previous one is finished, if BullJS does not do that for us (just what the OP asked)?
We first experimented with delays and other BullJS options, but that's a kind of workaround and not the exact solution to the problem we were looking for. At least we did not get BullJS to stop processing more than one job at a time.
So we did it ourselves and started one job after the other.
The solution was rather simple for our use case after looking into the BullJS API reference (BullJS API Ref).
We just used a for loop to start the jobs one after another. The trick was to use BullJS's
job.finished
method to get a Promise that resolves once the job is finished. By using await inside the for loop, the next job gets started immediately after the job.finished Promise is awaited (resolved). That's the nice thing about for loops: await works in them!
Here a small code example on how to achieve the intended behavior:
for (let i = 0; i < theValues.length; i++) {
    jobCounter++
    const job = await this.processingQueue.add(
        'update-values',
        {
            value: theValues[i],
        },
        {
            // delay: i * 90000,
            // lifo: true,
        }
    )
    this.jobs[job.id] = {
        jobType: 'socket',
        jobSocketId: BackgroundJobTasks.UPDATE_VALUES,
        data: {
            value: theValues[i],
        },
        jobCount: theValues.length,
        jobNumber: jobCounter,
        cumulatedJobId
    }
    await job.finished()
        .then((val) => {
            console.log('job finished:: ', val)
        })
}
The important part is really
await job.finished()
inside the for loop. theValues.length jobs get started, all just one after the other as intended.
That way, horizontally scaling jobs across more than one worker is not possible anymore. Nevertheless, this workaround is okay for us at the moment.
I will get in contact with optimalbits, the maker of BullJS, to clear things up.

Querying DB2 every 15 seconds causing memory leak in NodeJS

I have an application which checks for new entries in DB2 every 15 seconds on the iSeries using IBM's idb-connector. I have async functions which return the result of the query to socket.io which emits an event with the data included to the front end. I've narrowed down the memory leak to the async functions. I've read multiple articles on common memory leak causes and how to diagnose them.
MDN: memory management
Rising Stack: garbage collection explained
Marmelab: Finding And Fixing Node.js Memory Leaks: A Practical Guide
But I'm still not seeing where the problem is. Also, I'm unable to get permission to install node-gyp on the system, which means most memory management tools are off limits, as memwatch, heapdump, and the like need node-gyp to install. Here's an example of the functions' basic structure:
const { dbconn, dbstmt } = require('idb-connector'); // require idb-connector

async function queryDB() {
    const sSql = `SELECT * FROM LIBNAME.TABLE LIMIT 500`;
    // create new promise
    let promise = new Promise(function (resolve, reject) {
        // create new connection
        const connection = new dbconn();
        connection.conn("*LOCAL");
        const statement = new dbstmt(connection);
        statement.exec(sSql, (rows, err) => {
            if (err) {
                throw err;
            }
            let ticks = rows;
            statement.close();
            connection.disconn();
            connection.close();
            resolve(ticks.length); // resolve promise with varying data
        })
    });
    let result = await promise; // await promise
    return result;
};

async function getNewData() {
    const data = await queryDB(); // get new data
    io.emit('newData', data); // push to front end
    setTimeout(getNewData, 2000); // check again in 2 seconds
};
Any ideas on where the leak is? Am I using async/await incorrectly? Or am I creating/destroying DB connections improperly? Any help figuring out why this code is leaky would be much appreciated!
Edit: Forgot to mention that I have limited control over the backend processes, as they are handled by another team. I'm only retrieving the data they populate the DB with and adding it to a web page.
Edit 2: I think I've narrowed it down to the DB connections not being cleaned up properly. But as far as I can tell, I've followed the instructions suggested in their GitHub repo.
I don't know the answer to your specific question, but instead of issuing a query every 15 seconds, I might go about this in a different way. Reason being that I don't generally like fishing expeditions when the environment can tell me an event occurred.
So in that vein, you might want to try a database trigger that loads the key of the row into a data queue on add (or even on change or delete if necessary). Then you can just make an async call to wait for a record on the data queue. This is more real-time, and the event handler is only called when a record shows up. The handler can get the specific record from the database since it knows its key. Data queues are much faster than database IO, and place little overhead on the trigger.
I see a couple of potential advantages with this method:
You aren't issuing dozens of queries that may or may not return data.
The event would fire the instant a record is added to the table, rather than 15 seconds later.
You don't have to code for the possibility of one or more new records; it will always be 1, the one mentioned in the data queue.
Yes, you have to close the connection.
Don't make const data; you don't need a promise, as statement.exec is async by default and handles it via return result;.
Keep the setTimeout(getNewData, 2000); // check again in 2 seconds
line outside getNewData, otherwise it becomes a recursive infinite loop.
Sample code
const { dbconn, dbstmt } = require('idb-connector');
const sql = 'SELECT * FROM QIWS.QCUSTCDT';
const connection = new dbconn(); // Create a connection object.
connection.conn('*LOCAL'); // Connect to a database.
const statement = new dbstmt(connection); // Create a statement object of the connection.

statement.exec(sql, (result, error) => {
    if (error) {
        throw error;
    }
    console.log(`Result Set: ${JSON.stringify(result)}`);
    statement.close(); // Clean up the statement object.
    connection.disconn(); // Disconnect from the database.
    connection.close(); // Clean up the connection object.
    return result;
});
async function getNewData() {
    const data = await queryDB(); // get new data
    io.emit('newData', data); // push to front end
    setTimeout(getNewData, 2000); // check again in 2 seconds
};
change to
async function getNewData() {
    const data = await queryDB(); // get new data
    io.emit('newData', data); // push to front end
};
setTimeout(getNewData, 2000); // check again in 2 seconds
The first thing to notice is a possibly open database connection in case of an error:
if (err) {
    throw err;
}
Also, in case of success, connection.disconn(); and connection.close(); return boolean values that tell whether the operation was successful (according to the documentation).
An always-possible scenario is piling up connection objects in the 3rd-party library.
I would check those.
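A minimal sketch of cleaning up on the error path as well, rejecting instead of throwing inside the callback (reusing the idb-connector API shown in the question):
// Hedged sketch: release resources on both success and failure, and reject
// rather than throw inside the callback so errors propagate to the caller.
const { dbconn, dbstmt } = require('idb-connector');

function queryDB(sSql) {
    return new Promise((resolve, reject) => {
        const connection = new dbconn();
        connection.conn('*LOCAL');
        const statement = new dbstmt(connection);
        statement.exec(sSql, (rows, err) => {
            // Clean up unconditionally, whether the query failed or not.
            statement.close();
            connection.disconn();
            connection.close();
            if (err) return reject(err);
            resolve(rows);
        });
    });
}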
This was confirmed to be a memory leak in the idb-connector library that I was using. Link to the GitHub issue Here. Basically, there was a C++ array that never had its memory deallocated. A new version was released, and the commit can be viewed Here.

Node calling postgres function with temp tables causing "memory leak"

I have a node.js program calling a Postgres (Amazon RDS micro instance) function, get_jobs within a transaction, 18 times a second using the node-postgres package by brianc.
The node code is just an enhanced version of brianc's basic client pooling example, roughly like...
var pg = require('pg');
var conString = "postgres://username:password@server/database";

function getJobs(cb) {
    pg.connect(conString, function (err, client, done) {
        if (err) return console.error('error fetching client from pool', err);
        client.query("BEGIN;");
        client.query('select * from get_jobs()', [], function (err, result) {
            client.query("COMMIT;");
            done(); // call `done()` to release the client back to the pool
            if (err) console.error('error running query', err);
            cb(err, result);
        });
    });
}

function poll() {
    getJobs(function (jobs) {
        // process the jobs
    });
    setTimeout(poll, 55);
}

poll(); // start polling
So Postgres is getting:
2016-04-20 12:04:33 UTC:172.31.9.180(38446):XXX@XXX:[5778]:LOG: statement: BEGIN;
2016-04-20 12:04:33 UTC:172.31.9.180(38446):XXX@XXX:[5778]:LOG: execute <unnamed>: select * from get_jobs();
2016-04-20 12:04:33 UTC:172.31.9.180(38446):XXX@XXX:[5778]:LOG: statement: COMMIT;
... repeated every 55ms.
get_jobs is written with temp tables, something like this:
CREATE OR REPLACE FUNCTION get_jobs (
) RETURNS TABLE (
    ...
) AS
$BODY$
DECLARE
    _nowstamp bigint;
BEGIN
    -- take the current unix server time in ms
    _nowstamp := (select extract(epoch from now()) * 1000)::bigint;

    -- 1. get the jobs that are due
    CREATE TEMP TABLE jobs ON COMMIT DROP AS
    select ...
    from really_big_table_1
    where job_time < _nowstamp;

    -- 2. get other stuff attached to those jobs
    CREATE TEMP TABLE jobs_extra ON COMMIT DROP AS
    select ...
    from really_big_table_2 r
    inner join jobs j on r.id = j.some_id;

    ALTER TABLE jobs_extra ADD PRIMARY KEY (id);

    -- 3. return the final result with a join to a third big table
    RETURN query (
        select je.id, ...
        from jobs_extra je
        left join really_big_table_3 r on je.id = r.id
        group by je.id
    );
END
$BODY$ LANGUAGE plpgsql VOLATILE;
I've used the temp table pattern because I know that jobs will always be a small extract of rows from really_big_table_1, in hopes that this will scale better than a single query with multiple joins and multiple where conditions. (I used this to great effect with SQL Server and I don't trust any query optimiser now, but please tell me if this is the wrong approach for Postgres!)
The query runs in 8ms on small tables (as measured from node), ample time to complete one job "poll" before the next one starts.
Problem: After about 3 hours of polling at this rate, the Postgres server runs out of memory and crashes.
What I tried already...
If I re-write the function without temp tables, Postgres doesn't run out of memory, but I use the temp table pattern a lot, so this isn't a solution.
If I stop the node program (which kills the 10 connections it uses to run the queries) the memory frees up. Merely making node wait a minute between polling sessions doesn't have the same effect, so there are obviously resources that the Postgres backend associated with the pooled connection is keeping.
If I run a VACUUM while polling is going on, it has no effect on memory consumption and the server continues on its way to death.
Reducing the polling frequency only changes the amount of time before the server dies.
Adding DISCARD ALL; after each COMMIT; has no effect.
Explicitly calling DROP TABLE jobs; DROP TABLE jobs_extra; after RETURN query () instead of ON COMMIT DROPs on the CREATE TABLEs. Server still crashes.
Per CFrei's suggestion, added pg.defaults.poolSize = 0 to the node code in an attempt to disable pooling. The server still crashed, but took much longer and swap went much higher (second spike) than all the previous tests which looked like the first spike below. I found out later that pg.defaults.poolSize = 0 may not disable pooling as expected.
On the basis of this: "Temporary tables cannot be accessed by autovacuum. Therefore, appropriate vacuum and analyze operations should be performed via session SQL commands.", I tried to run a VACUUM from the node server (as some attempt to make VACUUM an "in session" command). I couldn't actually get this test working. I have many objects in my database, and VACUUM, operating on all objects, was taking too long to execute each job iteration. Restricting VACUUM just to the temp tables was impossible: (a) you can't run VACUUM in a transaction, and (b) outside the transaction the temp tables don't exist. :P EDIT: Later, on the Postgres IRC forum, a helpful chap explained that VACUUM isn't relevant for temp tables themselves, but can be useful to clean up the rows created and deleted in pg_attribute that TEMP TABLES cause. In any case, VACUUMing "in session" wasn't the answer.
DROP TABLE ... IF EXISTS before the CREATE TABLE, instead of ON COMMIT DROP. Server still dies.
CREATE TEMP TABLE (...) and insert into ... (select...) instead of CREATE TEMP TABLE ... AS, instead of ON COMMIT DROP. Server dies.
So is ON COMMIT DROP not releasing all the associated resources? What else could be holding memory? How do I release it?
I used this to great effect with SQL Server and I don't trust any query optimiser now
Then don't use them. You can still execute queries directly, as shown below.
but please tell me if this is the wrong approach for Postgres!
It is not a completely wrong approach, it's just a very awkward one, as you are trying to create something that's been implemented by others for a much easier use. As a result, you are making many mistakes that can lead to many problems, including memory leaks.
Compare to the simplicity of the exact same example that uses pg-promise:
var pgp = require('pg-promise')();
var conString = "postgres://username:password@server/database";
var db = pgp(conString);
function getJobs() {
    return db.tx(function (t) {
        return t.func('get_jobs');
    });
}

function poll() {
    getJobs()
        .then(function (jobs) {
            // process the jobs
        })
        .catch(function (error) {
            // error
        });
    setTimeout(poll, 55);
}

poll(); // start polling
Gets even simpler when using ES6 syntax:
var pgp = require('pg-promise')();
var conString = "postgres://username:password@server/database";
var db = pgp(conString);
function poll() {
    db.tx(t => t.func('get_jobs'))
        .then(jobs => {
            // process the jobs
        })
        .catch(error => {
            // error
        });
    setTimeout(poll, 55);
}

poll(); // start polling
The only thing that I didn't quite understand in your example is the use of a transaction to execute a single SELECT. This is not what transactions are generally for, as you are not changing any data. I assume you were trying to shrink a real piece of code that also changes some data.
In case you don't need a transaction, your code can be further reduced to:
var pgp = require('pg-promise')();
var conString = "postgres://username:password@server/database";
var db = pgp(conString);
function poll() {
    db.func('get_jobs')
        .then(jobs => {
            // process the jobs
        })
        .catch(error => {
            // error
        });
    setTimeout(poll, 55);
}

poll(); // start polling
UPDATE
It would be a dangerous approach, however, not to control the end of the previous request, as that may also create memory/connection issues.
A safe approach should be:
function poll() {
    db.tx(t => t.func('get_jobs'))
        .then(jobs => {
            // process the jobs
            setTimeout(poll, 55);
        })
        .catch(error => {
            // error
            setTimeout(poll, 55);
        });
}
Use CTEs to create partial result sets instead of temp tables.
CREATE OR REPLACE FUNCTION get_jobs (
) RETURNS TABLE (
    ...
) AS
$BODY$
DECLARE
    _nowstamp bigint;
BEGIN
    -- take the current unix server time in ms
    _nowstamp := (select extract(epoch from now()) * 1000)::bigint;

    RETURN query (
        -- 1. get the jobs that are due
        WITH jobs AS (
            select ...
            from really_big_table_1
            where job_time < _nowstamp
        -- 2. get other stuff attached to those jobs
        ), jobs_extra AS (
            select ...
            from really_big_table_2 r
            inner join jobs j on r.id = j.some_id
        )
        -- 3. return the final result with a join to a third big table
        select je.id, ...
        from jobs_extra je
        left join really_big_table_3 r on je.id = r.id
        group by je.id
    );
END
$BODY$ LANGUAGE plpgsql VOLATILE;
The planner will evaluate each block in sequence, the way I wanted to achieve with temp tables.
I know this doesn't directly solve the memory leak issue (I'm pretty sure there's something wrong with Postgres' implementation of them, at least the way they manifest in the RDS configuration).
However, the query works, it is planned the way I intended, and memory usage is stable now after 3 days of running the job; my server doesn't crash.
I didn't change the node code at all.

Concurrent writing to redis in node.js

In my node.js application I read messages from an AWS Kinesis stream, and I need to store all messages for the last minute in a cache (Redis). I run the following code in one Node worker:
var loopCallback = function (record) {
    var nowMinute = moment.utc(record.Data.ts).minute();
    // get all cached kinesis records
    var key = "kinesis";
    cache.get(key, function (err, cachedData) {
        if (err) {
            utils.logError(err);
        } else {
            if (!cachedData) {
                cachedData = [];
            } else {
                cachedData = JSON.parse(cachedData);
            }
            // get records with the same minute
            var filtered = _.filter(cachedData, function (item) {
                return moment.utc(item.ts).minute() === nowMinute;
            });
            filtered.push(record.Data);
            cache.set(key, JSON.stringify(filtered), function (saveErr) {
                if (saveErr) {
                    utils.logError(saveErr);
                }
                // do other things with record;
            });
        }
    });
};
I receive most of the records (a few dozen) at exactly the same moment, so when I try to save them, some records are not stored.
I understand this happens due to a race condition:
Node reads an old version of the array from Redis and overwrites the array while it writes another record to the cache.
I have read about Redis transactions, but as I understand it, that will not help me, because only one transaction will complete and the others will be rejected.
Is there a way to save all records to the cache in my case?
Thank you
You could use a sorted set, with the score being a Unix timestamp:
ZADD kinesis <unixtimestamp> "some data to be cached"
To get the elements added less than one minute ago, create a timestamp for (now - 60 seconds), then use ZRANGEBYSCORE to get the oldest element first:
ZRANGEBYSCORE kinesis (timestamp +inf
or ZREVRANGEBYSCORE if you want the newest element first:
ZREVRANGEBYSCORE kinesis +inf (timestamp
To remove the elements older than one minute, create a timestamp for (now - 60 seconds), then use ZREMRANGEBYSCORE:
ZREMRANGEBYSCORE kinesis -inf (timestamp
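A hedged sketch of this approach in Node, using the ioredis client (an assumption; the key name kinesis follows the ZADD example above):
// Hedged sketch: ZADD is atomic, so concurrent writers cannot overwrite
// each other the way the get/filter/set pattern can.
const Redis = require('ioredis');
const redis = new Redis();

async function addRecord(record) {
    // The record's timestamp in ms is the score.
    await redis.zadd('kinesis', record.Data.ts, JSON.stringify(record.Data));
}

async function getLastMinute() {
    const cutoff = Date.now() - 60 * 1000;
    // Members with score greater than the cutoff, i.e. added within the last minute.
    const members = await redis.zrangebyscore('kinesis', '(' + cutoff, '+inf');
    // Trim entries older than one minute while we're here.
    await redis.zremrangebyscore('kinesis', '-inf', '(' + cutoff);
    return members.map(function (m) { return JSON.parse(m); });
}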
