Redis: 'Get - Check Condition - Increase' with locking key - node.js

I'm trying to build a task scheduling program on a Node.js server.
The server will run on multiple machines, so I need an in-memory database to share state between them.
I wrote the code below.
const { createClient } = require('redis');
const redis = createClient();
const notify = () => {
const runningTaskCount = redis.get('running_task_count');
if (runningTaskCount >= 10) return;
redis.incr('running_task_count');
const taskId = redis.lpop('task_id_queue');
const task = new Task(taskId);
task.run();
task.on('end', () => {
redis.decr('running_task_count');
notify();
});
};
const add = (taskId) => {
redis.rpush('task_id_queue', taskId);
notify();
};
With only one server running, this code works fine.
With multiple servers running, however, running_task_count can exceed 10.
So I want to do something like this:
Lock 'running_task_count'
Get 'running_task_count'
if (running_task_count >= 10) {
Unlock 'running_task_count'
return
}
Incr 'running_task_count'
Unlock 'running_task_count'
(go on...)
How can I implement this?

Redis has the ability to run server-side logic via Lua scripts (see the EVAL command). Script execution is atomic, so race conditions are eliminated. A script like the following can do the job:
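-- GET returns a string (or false when the key is missing), so convert before comparing
local val = tonumber(redis.call('GET', KEYS[1])) or 0
if val >= 10 then
  return 0
end
return redis.call('INCR', KEYS[1])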
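A minimal sketch of how such a script might be called from Node.js, assuming node-redis v4, where `client.eval(script, { keys, arguments })` wraps EVAL. The name `tryAcquireSlot` and passing the limit as ARGV[1] are illustrative choices, not part of the answer above.

```javascript
// Lua script: atomically increment KEYS[1] only while it is below ARGV[1].
// Returns the new count, or 0 when the limit has already been reached.
const ACQUIRE_SLOT = `
local val = tonumber(redis.call('GET', KEYS[1])) or 0
if val >= tonumber(ARGV[1]) then
  return 0
end
return redis.call('INCR', KEYS[1])
`;

// `client` is assumed to be a connected node-redis v4 client.
async function tryAcquireSlot(client, key, limit) {
  const count = await client.eval(ACQUIRE_SLOT, {
    keys: [key],
    arguments: [String(limit)],
  });
  return count > 0; // true when this caller obtained a slot
}
```

In `notify`, something like `if (!(await tryAcquireSlot(redis, 'running_task_count', 10))) return;` would then stand in for the separate get, check, and incr calls.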

You're looking for something like redlock:
// the string identifier for the resource you want to lock
var resource = 'running_task_count';
// the maximum amount of time you want the resource locked in milliseconds,
// keeping in mind that you can extend the lock up until
// the point when it expires
var ttl = 1000;
redlock.lock(resource, ttl).then(function(lock) {
// ...do something here...
redis.incr('running_task_count')
// unlock your resource when you are done
return lock.unlock()
.catch(function(err) {
// we weren't able to reach redis; your lock will eventually
// expire, but you probably want to log this error
console.error(err);
});
});

Related

How to deal with re-usable connection objects in functional programming (in Node.js)?

How does one deal with persistent objects like socket connections in functional programming?
We have several functions like this
function doSomething(argument1, argument2) {
const connection = createConnection(argument1); // connection across the network
const result = connection.doSomething();
connection.close()
return result;
}
Each one recreating the connection object, which is a fairly expensive operation. How could one persist a connection like that in functional programming? Currently, we simply made the connection global.
Your program is going to have state. Always. Your program is going to do some I/O. Pretty much always. Functional programming is not about not doing those things, it's about controlling them: doing them in a way where such things that tend to complicate code maintenance and reason-ability are reasonably confined.
As for your particular function, I would argue that it has a problem: you've conflated creating a connection with doing something with that connection.
You probably want to start with something more like this:
const createConn = (arg) => createConnection(arg);
const doSomething = (conn, arg) => conn.doSomething(arg);
Note that this is easier to test: you can pass a mock in a unit test in a way that you can't with your original. An even better approach would be to have a cache:
const cache = new Map(); // Map, not WeakMap: WeakMap keys must be objects, so a string argument would throw
const getConn = (arg) => {
const exists = cache.get(arg);
let conn;
if (!exists) {
conn = createConnection(arg);
cache.set(arg, conn);
} else {
conn = exists;
}
return conn;
}
Now your getConn function is idempotent. And a better approach still would be to have a connection pool:
const inUse = Symbol();
const createPool = (arg, max=4) => {
// stateful, captured in closure, but crucially
// this is *opaque to the caller*
const conns = [];
return async () => {
// check the pool for an available connection
const available = conns.find(x => !x[inUse]);
if (available) {
available[inUse] = true;
return available;
}
// lazily populate the cache
if (conns.length < max) {
const conn = createConn(arg);
conn.release = function() { this[inUse] = false };
conn[inUse] = true;
conns.push(conn);
return conn;
}
// If we don't have an available connection to hand
// out now, return a Promise of one and
// poll 4 times a second to check for
// one. Note this is subject to starvation even
// though we're single-threaded, i.e. this isn't
// a production-ready implementation so don't
// copy-pasta.
return new Promise(resolve => {
const check = () => {
const available = conns.find(x => !x[inUse]);
if (available) {
available[inUse] = true;
resolve(available);
} else {
setTimeout(check, 250);
}
};
setTimeout(check, 250);
});
};
}
Now the details of the creation are abstracted away. Note that this is still stateful and messy, but now the consuming code can be more functional and easier to reason about:
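const doSomething = async (pool, arg) => {
  // createPool returns the acquire function itself, so call it directly
  const conn = await pool();
  try {
    return await conn.doSomething(arg);
  } finally {
    conn.release(); // hand the connection back even if doSomething throws
  }
};
// Step 3: profit!
const pool = createPool(whatever);
const result = await doSomething(pool, something); // inside an async function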
As a final aside, when trying to be functional (especially in a language not built on that paradigm) there is only so much you can do with sockets. Or files. Or anything else from the outside world. So don't: don't try to make something inherently side-effective functional. Instead put a good API on it as an abstraction and properly separate your concerns so that the rest of your code can have all of the desirable properties of functional code.
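One way to keep the acquire/release bookkeeping out of consuming code is a small wrapper. This is only a sketch: it assumes the pool is the async function returned by `createPool` above and that connections expose `release()`. The name `withConnection` is hypothetical.

```javascript
// Acquire a connection, run `fn` with it, and always release it afterwards,
// even if `fn` throws. `pool` is an async function that yields a connection.
const withConnection = async (pool, fn) => {
  const conn = await pool();
  try {
    return await fn(conn);
  } finally {
    conn.release();
  }
};
```

A caller would then write `const result = await withConnection(pool, conn => conn.doSomething(arg));` and never touch the connection lifecycle directly.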

await for Lock() on stateless action

Problem:
The front-end page makes x parallel requests (let's call them the first group).
The next group (x requests) arrives 5 seconds later. The first request of the first group populates the cache from the DB.
The other x-1 requests get an empty array instead of waiting for the first request to finish its job.
The second group and all later requests get the proper data from the cache.
What is the best practice for holding back the other requests until the first one is done (or fails) in a stateless mechanism?
EDIT:
The cache module allows using a trigger on cache set, but that doesn't work here since the mechanism is stateless.
const GetDataFromDB= async (req, res, next) => {
var cachedTableName = undefined;
// "lockFlag" prevents parallel requests from entering the critical section (populating the cache from the DB takes time).
// It is a short-lived cache entry that exists only while the cache is being initialized.
//
if ( !myCache.has( "lockFlag" ) && !myCache.has( "dbtable" ) ){
// only the first request of the first group arrives here
// the other x-1 requests of the first group fall through to the next condition
// here I would like a mechanism that waits until the first request comes back from the DB (cache initialized)
myCache.set( "lockFlag", "1" )
const connection1 = await odbc.connect(connectionConfig);
// assign to the outer variable; redeclaring it with const would make the later reassignment throw
cachedTableName = await connection1.query(`select * from ${tableName}`);
if(cachedTableName.length){
const success = myCache.set([
{key: "dbtable", val: cachedTableName, ttl: 180},
])
if(success)
{
cachedTableName = myCache.get( "dbtable" );
}
}
myCache.take("lockFlag");
connection1.close();
return res.status(200).json(cachedTableName ); // uses for first response.
}
// the other x-1 requests of the first group reach this condition and get nothing, because the cache is not set yet
//
if ( myCache.has( "dbtable" ) ){
cachedTableName = myCache.get( "dbtable" );
}
return res.status(200).json(cachedTableName );
}
You can try the approach given here, with minor modifications to apply it for your case.
For brevity, I removed comments and shortened variables names.
Code, then explanation:
const EventEmitter = require('events');
const bus = new EventEmitter();
const getDataFromDB = async (req, res, next) => {
var table = undefined;
if (myCache.has("lockFlag")) {
await new Promise(resolve => bus.once("unlocked", resolve));
}
if (myCache.has("dbtable")) {
table = myCache.get("dbtable");
}
else {
myCache.set("lockFlag", "1");
const connection = await odbc.connect(connectionConfig);
table = await connection.query(`select * from ${tableName}`);
connection.close();
if (table.length) {
const success = myCache.set([
{ key: "dbtable", val: table, ttl: 180 },
]);
}
myCache.take("lockFlag");
bus.emit("unlocked");
}
return res.status(200).json(table);
}
This is how it should work:
At first, lockFlag is not present.
Then, some code calls getDataFromDB. That code evaluates the first if block to false, so it continues: it sets lockFlag to true ("1"), then goes on to retrieve the table data from db. In the meantime:
Some other code calls getDataFromDB. That code, however, evaluates the first if block to true, so it awaits on the promise, until an unlocked event will be emitted.
Back to the first calling code: It finishes its logic, caches the table data, removes lockFlag, emits an unlocked event, and returns.
The other code can now continue its execution: it evaluates the second if to true, so it takes the table from the cache, and returns.
As a workaround, I added a "finally" block to remove the lock key from the cache after the first initialization, plus this:
while(myCache.has( "lockFlag" )){
await wait(1500);
}
And the "wait" function:
function wait(milliseconds) {
return new Promise(resolve => setTimeout(resolve, milliseconds))
}
(source)
This works, but there can still be a window (< 1500 ms) in which the cache is populated and the waiting request is not yet aware of it.
I'd be happy to hear of a better solution.
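One possible alternative to the lock flag and polling is to share the in-flight promise, so that concurrent callers wait on the same load and are released as soon as it settles. This is a sketch that only helps within a single Node.js process. `getTableOnce` is a hypothetical name, and `loadTable` stands in for the ODBC query plus the cache write.

```javascript
// Share one in-flight load among concurrent callers (single process only).
let pending = null; // promise for the load currently in progress, if any

function getTableOnce(loadTable) {
  if (!pending) {
    pending = Promise.resolve()
      .then(loadTable)
      // Clear the slot on success or failure so a later call can load again.
      .finally(() => { pending = null; });
  }
  return pending;
}
```

The handler would still check the cache first and fall back to `await getTableOnce(...)` only on a miss. Because the slot is cleared in `finally`, a failed load does not leave later requests waiting forever.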

react-beautiful-dnd and mysql state management

Okay, I'm confused about the best way to do this.
The following pieces are in play: a Node.js server, a client-side React app (with Redux), and a MySQL DB.
In the client app I have lists (many, but for this issue assume one) that I want to be able to reorder by drag and drop.
In the MySQL DB the items are stored as a linked list (with a nextKey, a lastKey, and a productionKey (primary), along with the data fields):
//mysql column [productionKey, lastKey,nextKey, ...(other data)]
The current issue is a rendering issue: the list stutters after every change.
I'm using these two functions to get the initial order and to reorder:
function SortLinkedList(linkedList)
{
var sortedList = [];
var map = new Map();
var currentID = null;
for(var i = 0; i < linkedList.length; i++)
{
var item = linkedList[i];
if(item?.lastKey === null)
{
currentID = item?.productionKey;
sortedList.push(item);
}
else
{
map.set(item?.lastKey, i);
}
}
while(sortedList.length < linkedList.length)
{
var nextItem = linkedList[map.get(currentID)];
sortedList.push(nextItem);
currentID = nextItem?.productionKey;
}
const filteredSafe=sortedList.filter(x=>x!==undefined)
//undefined appear because server has not fully updated yet, so linked list is broken
//nothing will render without this
return filteredSafe
;
}
const reorder = (list, startIndex, endIndex) => {
const result = Array.from(list);
const [removed] = result.splice(startIndex, 1);
result.splice(endIndex, 0, removed);
const adjustedResult = result.map((x,i,arr)=>{
if(i==0){
x.lastKey=null;
}else{
x.lastKey=arr[i-1].productionKey;
}
if(i==arr.length-1){
x.nextKey=null;
}else{
x.nextKey=arr[i+1].productionKey;
}
return x;
})
return adjustedResult;
};
I've got this function to get the items
const getItems = (list,jobList) =>
{
return list.map((x,i)=>{
const jobName=jobList.find(y=>y.jobsessionkey==x.attachedJobKey)?.JobName;
return {
id:`ProductionCardM${x.machineID}ID${x.productionKey}`,
attachedJobKey: x.attachedJobKey,
lastKey: x.lastKey,
machineID: x.machineID,
nextKey: x.nextKey,
productionKey: x.productionKey,
content:jobName
}
})
}
my onDragEnd
const onDragEnd=(result)=> {
// dropped outside the list
if (!result.destination) {
return;
}
const items = reorder(
state.items,
result.source.index,
result.destination.index,
);
dispatch(sendAdjustments(items));
//sends update to server
//server updates mysql
//server sends back update events from mysql in packets
//props sent to DnD component are updated
}
So the actual bug looks like a graphics glitch: items are temporarily filtered out in SortLinkedList, resulting in jumpy divs. Is there a smoother way to handle this client -> server -> DB -> server -> client data flow that gives consistent behavior in DnD?
UPDATE:
I'm still trying to solve this. I have currently implemented a lock pattern.
useEffect(()=>{
if(productionLock){
setState({
items: SortLinkedList(getItems(data,jobList)),
droppables: [{ id: "Original: not Dynamic" }]
})
setLoading(false);
}else{
console.log("locking first");
setLoading(true);
}
},[productionLock])
where productionLock is set to true and false by triggers on the server.
Basically: the app sends the data to the server, the server processes the request and sends new data back, and when it has finished the server sends the unlock signal.
This should make the update happen once, but it does not; the component still re-renders on each state update the app receives from the server.
What’s the code for sendAdjustments()?
You should update locally first, otherwise DnD pulls it back to its original position while you wait for backend to finish. This makes it appear glitchy. E.g:
Set the newly reordered list locally as your state
Send network request
If it fails, revert the local list state to the original list
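The three steps above could be sketched roughly as follows. `setItems` and `saveOrder` are hypothetical stand-ins for the local state setter and the network request, and `moveItem` is a non-mutating helper written for this example.

```javascript
// Move one element to a new index without mutating the input array.
const moveItem = (list, from, to) => {
  const result = Array.from(list);
  const [removed] = result.splice(from, 1);
  result.splice(to, 0, removed);
  return result;
};

// Optimistic update: apply locally, then persist, and roll back on failure.
const applyReorder = async (items, from, to, setItems, saveOrder) => {
  const next = moveItem(items, from, to);
  setItems(next); // 1. update local state immediately
  try {
    await saveOrder(next); // 2. send the network request
  } catch (err) {
    setItems(items); // 3. revert to the original list on failure
  }
};
```

With this shape, the drag-and-drop component renders the new order right away, and the server round trip only matters if it fails.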

setTimeout or child_process.spawn?

I have a REST service in Node.js with one specific request running a bunch of DB commands and other file processing that could take 10-15 seconds to run. Since I didn't want to hold up my browser request thread, I wrote a separate .js script to do the needful, called the script using child_process.spawn() in my Node.js code and immediately returned OK back to the client. This works fine, but then so does calling the same script (as a local function) by just using a simple setTimeout.
router.post("/longRequest", function(req, res) {
console.log("Started long request with id: " + req.body.id);
var longRunningFunction = function() {
// Usually runs a bunch of things that take time.
// Simulating a 10 sec delay for sample code.
setTimeout(function() {
console.log("Done processing for 10 seconds")
}, 10000);
}
// Below line used to be
// child_process.spawn('longRunningFunction.js'
setTimeout(longRunningFunction, 0);
res.json({status: "OK"})
})
So, this works for my purpose. But what's the downside? I probably can't monitor the background work as easily as with child_process.spawn, which would give me a process ID. But does this cause problems in the long run? Will it hold up Node.js processing if the 10-second processing time grows a lot in the future?
The actual longRunningFunction is something that reads an Excel file, parses it and does a bulk load using tedious to a MS SQL Server.
var XLSX = require('xlsx');
var FileAPI = require('file-api'), File = FileAPI.File, FileList = FileAPI.FileList, FileReader = FileAPI.FileReader;
var Connection = require('tedious').Connection;
var Request = require('tedious').Request;
var TYPES = require('tedious').TYPES;
var importFile = function() {
var file = new File(fileName);
if (file) {
var reader = new FileReader();
reader.onload = function (evt) {
var data = evt.target.result;
var workbook = XLSX.read(data, {type: 'binary'});
var ws = workbook.Sheets[workbook.SheetNames[0]];
var headerNames = XLSX.utils.sheet_to_json( ws, { header: 1 })[0];
var data = XLSX.utils.sheet_to_json(ws);
var bulkLoad = connection.newBulkLoad(tableName, function (error, rowCount) {
if (error) {
console.log("bulk upload error: " + error);
} else {
console.log('inserted %d rows', rowCount);
}
connection.close();
});
// setup your columns - always indicate whether the column is nullable
Object.keys(columnsAndDataTypes).forEach(function(columnName) {
bulkLoad.addColumn(columnName, columnsAndDataTypes[columnName].dataType, { length: columnsAndDataTypes[columnName].len, nullable: true });
})
data.forEach(function(row) {
var addRow = {}
Object.keys(columnsAndDataTypes).forEach(function(columnName) {
addRow[columnName] = row[columnName];
})
bulkLoad.addRow(addRow);
})
// execute
connection.execBulkLoad(bulkLoad);
};
reader.readAsBinaryString(file);
} else {
console.log("No file!!");
}
};
So, this works for my purpose. But what's the downside ?
If you actually have a long running task capable of blocking the event loop, then putting it on a setTimeout() is not stopping it from blocking the event loop at all. That's the downside. It's just moving the event loop blocking from right now until the next tick of the event loop. The event loop will be blocked the same amount of time either way.
If you just did res.json({status: "OK"}) before running your code, you'd get the exact same result.
If your long running code (which you describe as file and database operations) is actually blocking the event loop and it is properly written using async I/O operations, then the only way to stop blocking the event loop is to move that CPU-consuming work out of the node.js thread.
That is typically done by clustering, moving the work to worker processes or moving the work to some other server. You have to have this work done by another process or another server in order to get it out of the way of the event loop. A setTimeout() by itself won't accomplish that.
child_process.spawn() will accomplish that. So, if you have an actual event loop blocking problem to solve and the I/O is already as async optimized as possible, then moving it to a worker process is a typical node.js solution. You can communicate with that child process in a number of ways, but one possibility would be via stdin and stdout.
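As a rough illustration of the stdin/stdout approach, the sketch below runs a job in a separate Node.js process and exchanges JSON with it. `runInChild` and `sumScript` are hypothetical names. Passing the source via `-e` just keeps the example in one file; a real job would more likely live in its own script.

```javascript
const { spawn } = require('child_process');

// Run `script` (JavaScript source) in a separate Node.js process, send
// `input` as JSON on stdin, and resolve with the JSON the child prints.
function runInChild(script, input) {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ['-e', script]);
    let out = '';
    child.stdout.on('data', (chunk) => { out += chunk; });
    child.on('error', reject);
    child.on('close', (code) => {
      if (code !== 0) return reject(new Error(`child exited with code ${code}`));
      try {
        resolve(JSON.parse(out));
      } catch (err) {
        reject(err);
      }
    });
    child.stdin.end(JSON.stringify(input));
  });
}

// Hypothetical stand-in for the heavy job: sum the numbers received on stdin.
const sumScript = `
let raw = '';
process.stdin.on('data', (c) => { raw += c; });
process.stdin.on('end', () => {
  const nums = JSON.parse(raw);
  console.log(JSON.stringify(nums.reduce((a, b) => a + b, 0)));
});
`;
```

The route handler could respond immediately and call `runInChild(...)` in the background, while the parent's event loop stays free for other requests.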

A better way to structure a Mongoose connection module

I have refactored some code to place all my mongoose.createConnection(...) in a single file. This file is then required in other files that use connections to the various databases specified. The connections are lazily created and are used in both an http server and in utility scripts.
The connection file looks like this:
var mongoose = require("mongoose");
var serverString = "mongodb://localhost:27017";
var userDBString = "/USER";
var customerDBString = "/CUSTOMER";
var userConnection = null;
exports.getUserConnection = function () {
if (userConnection === null) {
userConnection = mongoose.createConnection(serverString + userDBString, {server: { poolSize: 4 }});
}
return userConnection;
};
var customerConnection = null;
exports.getCustomerConnection = function () {
if (customerConnection === null) {
customerConnection = mongoose.createConnection(serverString + customerDBString, { server: { poolSize: 4 }});
}
return customerConnection;
};
My models are stored in a separate files (based on their DB) that looks a bit like this:
exports.UserSchema = UserSchema; //Just assume I know how to define a valid schema
exports.UserModel = connection.getUserConnection().model("User", UserSchema);
Later, I use getUserConnection() to refer to the connection I have created and actually do work with the model.
TL;DR
In utilities that use this connection format, I have to call
connection.getUserConnection().on("open", function() {
logger.info("Opened User DB");
//Do What I Need To Do
});
It is possible that in some scenarios the task processor will have already broadcast the open event. In some, it won't be guaranteed to have happened yet. I noticed that it doesn't queue work if the connection isn't open (specifically, dropCollection) so I feel stuck.
How can I be certain that the connection is open before proceeding given that I can't depend on subscribing to the open event before the task processor runs?
Is there a better pattern for centralizing the managing of multiple connections?
I can answer part of my own question
How can I be certain that the connection is open before proceeding
given that I can't depend on subscribing to the open event before the
task processor runs?
if (connection.getUserConnection().readyState!==1) {
logger.info("User connection was not open yet. Adding open listener");
connection.getUserConnection().on("open", function () {
logger.info("User open event received");
doStuff();
});
} else {
logger.info("User is already open.");
doStuff();
}
function doStuff() {
logger.info("Doing stuff");
}
If you see a better way then please comment or offer up an answer. I would still like to hear how other people manage connections without rebuilding the connection every time.
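The readyState check above can also be wrapped in a promise so callers do not repeat the branching. This is a sketch, assuming `conn` is a Mongoose connection (where a readyState of 1 means connected) that emits 'open' and 'error'. The name `whenOpen` is hypothetical.

```javascript
// Resolve once the connection is open, whether or not 'open' already fired.
function whenOpen(conn) {
  if (conn.readyState === 1) return Promise.resolve(conn);
  return new Promise((resolve, reject) => {
    conn.once('open', () => resolve(conn));
    conn.once('error', reject);
  });
}
```

A utility script could then write `whenOpen(connection.getUserConnection()).then(doStuff);` regardless of whether the open event has already been broadcast.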
