I'm currently trying to use Node.js Kue for processing jobs in a queue, but I believe I'm not doing it right.
The way I'm working now, I have two different services (which in this case I'm running with Docker Compose): a Web API built with Express that sends jobs to the queue, and a processing module. The issue here is with the processing module.
I've coded it as follows:
var kue = require('kue');
var config = require('./config');

var queue = kue.createQueue({
    prefix: config.redis.queuePrefix,
    redis: {
        port: config.redis.port,
        host: config.redis.host
    }
});

queue.process('jobType', function (job, done) {
    // do processing here...
});
When we run this with Node, it sits there waiting for jobs to be placed on the queue and processes them as they arrive.
There are two issues, however:
It requires Redis to be available before this module starts. If we run it without Redis already available, it crashes because the host is not reachable, and the process exits.
If Redis suddenly becomes unavailable, the processing module also crashes because it cannot establish the connection, and the process is killed.
How can I avoid these problems?
My guess is that I should somehow make the code "wait" for Redis, but I have no idea how to do this.
How can this be done in this case?
You can use a promise to wait until Redis is available, and then load your module.
loadRedis().then(() => {
    // load your module
});
Or you can use a generator (driven by a runner such as co) to "pause" until Redis is loaded.
const co = require('co');

co(function* () {
    const redisLoaded = yield loadRedis();
    // load your module
});
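In both snippets, loadRedis is a placeholder for code you would write yourself. A minimal sketch of such a function, assuming the node_redis client and the same config module as above, could look like this:

var redis = require('redis');
var config = require('./config');

// Hypothetical loadRedis(): resolves once a Redis connection is ready,
// retrying every second instead of crashing when the host is unreachable.
function loadRedis() {
    return new Promise(function (resolve) {
        var client = redis.createClient({
            host: config.redis.host,
            port: config.redis.port,
            retry_strategy: function () { return 1000; }
        });
        client.once('ready', function () {
            resolve(client);
        });
        client.on('error', function (err) {
            console.error('Redis not ready yet:', err.message);
        });
    });
}

Once the promise resolves you can safely call kue.createQueue and queue.process; for Redis dropping out later, listen for the queue's error events (queue.on('error', ...)) instead of letting the process die.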
When making a REST service using Express in Node, how do I prevent a blocking task from blocking the entire REST service? Take as an example the following Express REST service:
const express = require('express');
const app = express();
app.get('/', (req, res) => res.send('Hello, World'));
const blockService = async function () {
    return new Promise((resolve, reject) => {
        const end = Date.now() + 20000;
        while (Date.now() < end) {
            const doSomethingHeavyInJavaScript = 1 + 2 + 3;
        }
        resolve('I am done');
    });
};

const blockController = function (req, res) {
    blockService().then((val) => {
        res.send(val);
    });
};
app.get('/block', blockController);
app.listen(3000, () => console.log('app listening on port 3000'));
In this case, a call to /block will render the entire service unreachable for 20 seconds. This is a big problem if there are many clients using the service, since no other client will be able to access it during that time. This is obviously a problem of the while loop being blocking code and thus hanging the main thread. The code might be confusing, since, despite using a promise in blockService, the main thread still hangs. How do I ensure that blockService runs on a worker thread and not on the event loop?
By default node.js runs your Javascript code in a single thread. So, if you really have CPU intensive code in a request handler (like you show above), then that is indeed a problem. Your options are as follows:
Start up a Worker Thread and run the CPU-intensive code in a worker thread. Since version 10, node.js has had worker threads for this purpose. You then communicate back the result to the main thread with messaging.
Start up any other process that runs node.js code or any type of code and compute the result in that other process. You then communicate back the result to the main thread with messaging.
Use node clustering to start N processes so that if one process is stuck with a CPU-intensive operation, at least one of the others is hopefully free to run other requests.
Please note that a lot of things that servers do like read files, do networking, make requests to databases are all asynchronous and non-blocking so it's not incredibly common to actually have lots of CPU intensive code. So, if this is just a made up example for your own curiosity, you should make sure you actually have a CPU-intensive problem in your server before you go designing threads or clusters.
Node.js is an event-based model that uses a single runtime thread. For the reasons you've discovered, Node.js is not a good choice for CPU bound tasks (or synchronously blocking tasks). Node.js works best for coordinating I/O asynchronously.
Worker threads have been available since Node.js v10.5 (initially behind a flag) and were stabilized in the v12 line. They allow you to use another thread for blocking tasks. They are relatively simple to use and could work if you absolutely need to offload blocking tasks.
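A minimal sketch of the worker_threads approach, assuming Node.js 12+ and a separate block-worker.js file (the file name and structure here are illustrative, not the original poster's code):

// block-worker.js — runs the CPU-heavy loop off the main thread
const { parentPort } = require('worker_threads');

const end = Date.now() + 20000;
while (Date.now() < end) {
    const doSomethingHeavyInJavaScript = 1 + 2 + 3;
}
parentPort.postMessage('I am done');

// in the Express app — blockService now spawns a worker instead of blocking
const { Worker } = require('worker_threads');

const blockService = () =>
    new Promise((resolve, reject) => {
        const worker = new Worker('./block-worker.js');
        worker.once('message', resolve);
        worker.once('error', reject);
    });

With this change a request to /block still takes about 20 seconds, but other requests keep being served because the event loop is never blocked.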
I am running a Node.js application using the forever npm module.
The Node application also connects to a Redis DB for cache checks. Quite often the API stops working with the following error in the forever log.
{ ReplyError: Ready check failed: ERR max number of clients reached
at parseError (/home/myapp/core/node_modules/redis/node_modules/redis-parser/lib/parser.js:193:12)
at parseType (/home/myapp/core/node_modules/redis/node_modules/redis-parser/lib/parser.js:303:14)
at JavascriptRedisParser.execute (/home/myapp/ecore/node_modules/redis/node_modules/redis-parser/lib/parser.js:563:20) command: 'INFO', code: 'ERR' }
When I execute the client list command on the Redis server, it shows too many open connections. I have also set timeout = 3600 in my Redis configuration.
I do not have any unclosed Redis connection objects in my application code.
This happens once or twice a week depending on the application load; as a stopgap solution I restart the node server (it works).
What could be the permanent solution in this case?
I have figured out why. This has nothing to do with Redis; increasing the OS file descriptor limit was just a temporary solution. I was using Redis in a web application, and a new connection was created for every request.
When the server was occasionally restarted, all the connections held up by the express server were released.
I solved this by creating a global connection object and re-using it. A new connection is created only when necessary.
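In a Node.js app like the one in the question, a minimal sketch of that pattern (assuming the node_redis client; file and variable names are illustrative) could look like this:

// redis-client.js — create the connection once at startup and reuse it everywhere
const redis = require('redis');

const client = redis.createClient({
    host: process.env.REDIS_HOST,
    port: process.env.REDIS_PORT
});
client.on('error', (err) => console.error('Redis error:', err));

module.exports = client;

Every request handler then does require('./redis-client') and gets the same connection instead of opening a new one.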
You can do this by creating a global connection object, connecting once, and making sure it is connected each time before you use it. Check whether an already-coded solution exists for your programming language. In my case it was Perl with the Dancer framework, and I used a module called Dancer2::Plugin::Redis.
redis_plugin
Returns a Dancer2::Plugin::Redis instance. You can use redis_plugin to
pass the plugin instance to 3rd party modules (backend api) so you can
access the existing Redis connection there. You will need to access
the actual methods of the plugin instance.
If you are not running a web server but a worker process or any other background job process, you could use a simple helper function like this to re-use the connection.
Perl example:
sub get_redis_connection {
    my $redis = Redis->new(server => "www.example.com:6372", debug => 0);
    $redis->auth('abcdefghijklmnop');
    return $redis;
}
...
## when required
unless ($redisclient->ping) {
    warn "creating new redis connection";
    $redisclient = get_redis_connection();
}
I was running into this issue in my chat app because I was creating a new Redis instance each time something connected rather than just creating it once.
// THE WRONG WAY
export const getRedisPubSub = () => new RedisPubSub({
    subscriber: new Redis(REDIS_CONNECTION_CONFIG),
    publisher: new Redis(REDIS_CONNECTION_CONFIG),
});
and where I wanted to use the connection I was calling
// THE WRONG WAY
getRedisPubSub();
I fixed it by just creating the connection once when my app loaded.
export const redisPubSub = new RedisPubSub({
    subscriber: new Redis(REDIS_CONNECTION_CONFIG),
    publisher: new Redis(REDIS_CONNECTION_CONFIG),
});
and then I passed the one-time initialized redisPubSub object to my createServer function.
It was this article here that helped me see my error: https://docs.upstash.com/troubleshooting/max_concurrent_connections
See this example node.js code:
const http = require('http');

const server = http.createServer(function (req, res) {
    if (req.url === '/loop') {
        console.log('LOOP');
        while (true) {}
    }
    res.write('Hello World');
    res.end();
});

server.listen(3000);
In my script each request takes 3 to 5 seconds to process; while (true) {} is just an example.
The problem is that Node.js does not process another request while one request is being processed.
I want to handle multiple requests at the same time, but the server handles only one request at a time.
NOTE: I don't want to spawn a cluster worker or child_process for each request, because Node.js takes another ~65 ms to start one.
When you create a server (and listen), Node.js processes requests on a single event loop. You cannot use an infinite loop in a handler, since it will block the event loop your server is running on.
I hope you are not dealing with a true infinite loop, but with a task that takes time; for that you can make use of modules like async.
In the request/response handler, use the async module like this:
const async = require('async');

async.map(['param1', 'param2', 'param3'], task, function (err, results) {
    // results of the task function
});
What it does is make use of the already running event loop to run the task.
Points to note:
Most JavaScript VMs are single threaded (including Node.js), hence you can also make use of the setTimeout function instead of an infinite while loop.
You will not be able to create a thread in Node.js; instead, use a process-based solution like cluster or child_process (single-threaded VM).
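For example, instead of a long-running while loop, the work can be split into chunks that hand control back to the event loop between iterations; a minimal sketch (the chunk of work itself is hypothetical):

// Process a long task in small slices so other requests can be served in between.
function processInChunks(totalIterations, done) {
    let i = 0;
    function doChunk() {
        const end = Math.min(i + 1000, totalIterations);
        for (; i < end; i++) {
            // ... one slice of the heavy work goes here ...
        }
        if (i < totalIterations) {
            setImmediate(doChunk); // yield to the event loop, then continue
        } else {
            done();
        }
    }
    doChunk();
}

This keeps a single Node.js process responsive, although truly CPU-bound work is still better moved to another process (cluster or child_process) as noted above.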
I have a redis queue of long and boring stuff to do, that gets filled by my main http server. This queue is then slowly processed by another server that I use as a worker (heroku worker). When this process has finished an item, it saves the result to the database.
Is it ok to code my nodejs worker process in a synchronous way? It does make sense to me since it does one thing at a time anyway and does not have to answer to any request.
Yes, you can use synchronous code to develop a worker process for Redis. First, make a keys file for your Redis host and port:
module.exports = {
    redisHost: process.env.REDIS_HOST,
    redisPort: process.env.REDIS_PORT
};
So when you want to connect to Redis, you read the hostname (or URL) and the port from your environment variables.
Then require that file along with redis:
const keys = require("./keys");
const redis = require("redis");
Then create your redisClient like so:
const redisClient = redis.createClient({
    host: keys.redisHost,
    port: keys.redisPort,
    retry_strategy: () => 1000
});
The retry_strategy option tells the client that if we lose the connection, it should retry connecting once every second.
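On top of that client, the worker itself can stay fairly simple. As a sketch (assuming node_redis v3, a hypothetical 'insert' channel, and a hypothetical doWork function for the actual job), it might subscribe for new jobs and write results back:

// Subscribe on a duplicated connection so redisClient stays free for normal commands.
const sub = redisClient.duplicate();

sub.on('message', (channel, message) => {
    // One job at a time; synchronous processing is fine for a worker like this.
    const result = doWork(message); // doWork is whatever your job actually does
    redisClient.hset('values', message, result);
});

sub.subscribe('insert');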
I'm developing a node.js app and I am in need of heavy Redis usage. The app will be clustered across 8 CPU cores.
Right now I have 100 concurrent connections to Redis because every worker per CPU has several modules running require('redis').createClient().
Scenario A:
file1.js:
var redis = require('redis').createClient();
file2.js
var redis = require('redis').createClient();
Scenario B:
redis.js
var redis = require('redis').createClient();
module.exports = redis;
file1.js
var redis = require('./redis');
file2.js
var redis = require('./redis');
Which approach is better: creating a new Redis instance in every file I introduce (Scenario A), or creating one Redis connection globally (Scenario B) and sharing it across all my modules? What are the drawbacks/benefits of each solution?
Thanks in advance!
When I face a question such as this I generally think about three basic questions.
Which is more readable?
Which allows better code reuse?
Which is more efficient?
Not necessarily in this order as it depends on the scenario, but I believe in this case all three of these questions are in favor of option B.
If you ever needed to modify the options for createClient, with option A you would have to edit every file that uses Redis, whereas with option B you would only edit redis.js. Also, if a newer or different product comes out and you want to replace Redis, it would be feasible to make redis.js a wrapper for a different package, or even a newer Redis client, substantially cutting down conversion time.
Globals are generally a bad thing, but in this example redis.js should not be storing mutable state, so there is no problem having a global/singleton in this context.
Both Node and Redis can handle lots of connections pretty well, so that's not a problem.
In your situation, you're creating Redis connections at the startup of your application, so the number of connections you're setting up is limited (in the sense that after your application is started, the number of connections will be constant).
Situations where you'd want to reuse the same connection is in highly dynamic situations, for instance with an HTTP-server where you need to query Redis for every request. Creating a new connection for each request would be a waste of resources (creating and destroying connections all the time) and reusing one connection for each request would be preferable.
As for which of the two scenarios I'd prefer, I'm leaning towards Scenario A myself.
You can create a file to handle the connection and your Redis functions.
redis-con.js
const redis = require('redis');
let redisClient;
(async () => {
    redisClient = redis.createClient();
    redisClient.on("error", (error) => console.error(`Error Redis: ${error}`));
    await redisClient.connect();
})();
module.exports = redisClient;
Then you need to create functions to handle set, get, and del, and just import the connection wherever it's needed.
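For example, a minimal sketch of those helpers, assuming node-redis v4 and the redis-con.js module above (the helper file name is illustrative):

// redis-helpers.js — thin wrappers around the shared client
const redisClient = require('./redis-con');

const setValue = (key, value) => redisClient.set(key, value);
const getValue = (key) => redisClient.get(key);
const delValue = (key) => redisClient.del(key);

module.exports = { setValue, getValue, delValue };

Any module that needs Redis then requires redis-helpers.js (or redis-con.js directly) instead of creating its own connection.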