Need of a lock in Nodejs - node.js

As i understand Nodejs has a event loop based mechanism, therefore the main nodejs code is single threaded which just tries to execute the methods and callbacks present in the call stack, therefore at a single point of time in the nodejs runtime only one thread executes(main thread), so if i define a data structure like a queue or map i need not to worry about locking it or blocking others accessing it when one of the callbacks is using because
unless a callback is done executing the main thread will keep executing ( unlike multithreaded application where a thread could execute for a bit and then contexted switched out to give chance to some other thread.)
no 2 callbacks are executed in parallel
Example:
let arr = [];
const fun = async (website) => {
const result = await fetch(website, {}, "POST");
arr.push(...result); }
const getData = async () => {
let promises = [];
for (let i = 0; i < 10; i++) {
promises.push(fun(`website${i}`));
}
Promise.all(promises);
}
So my question is do i ever need to lock a resource on the nodejs side of thing,
i do understand the libuv is multithreaded therefore when calling async operations that use libuv (ex writing to file) could cause problems if i don't lock.
But my question is do i ever need a lock in the nodejs runtime.

Related

how to run multiple infinite loops without blocking each other node.js?

I need to run multiple parallel tasks (infinite loops) without blocking each other in node.js. I'm trying now to do:
const test = async () => {
let a = new Promise(async res => {
while (true) {
console.log('test1')
}
})
let b = new Promise(async res => {
while (true) {
console.log('test2')
}
})
}
test();
But it does not work, only 'test1' appears in the console. What am I doing wrong?
You can't. You need to stop thinking in loops but start thinking in event loop instead.
There are two functions that can schedule functions to run in the future that can be used for this: setTimeout() and setInterval(). Note that in Node.js there is also setImmediate() but I suggest you avoid using it until you really know what you are doing because setImmediate() has to potential to block I/O.
Note that neither setTimeout() not setImmediate() are Promise aware.
The minimal code example that does what you want would be something like:
const test = () => {
setInterval(() => {
console.log('test1')
},
10 // execute the above code every 10ms
)
setInterval(() => {
console.log('test2')
},
10 // execute the above code every 10ms
)
}
test();
Or if you really want to run the two pieces of code as fast as possible you can pass 0 as the timeout. It won't run every 0ms but just as fast as the interpreter can. The minimal interval differs depending on OS and how busy your CPU is. For example Windows can run setInterval() down to every 1ms while Linux typically won't run any faster than 10ms. This is down to how fast the OS tick is (or jiffy in Linux terminology). Linux is a server oriented OS so sets its tick bigger to give it higher throughput (each process gets the CPU for longer thus can finish long tasks faster) while Windows is a UI oriented (some would say game oriented) OS so sets its tick smaller for smoother UI experience.
To get something closer to the style of code you want you can use a promisified setTimeout() and await it:
function delay (x) {
return new Promise((done, fail) => setTimeout(done,x));
}
const test = async () => {
let a = async (res) => {
while (true) {
console.log('test1')
await delay(0) // this allows the function to be async
}
}
let b = async (res) => {
while (true) {
console.log('test2')
await delay(0) // this allows the function to be async
}
}
a();
b();
}
test();
However, this is no longer the minimal possible working code, though not by much.
Note: After writing the promisified example above it suddenly reminded me of the programming style on early cooperative multitasking OSes. I think Windows3.1 did this though I never wrote anything on it. It reminds me of MacOS Classic. You had to periodically call the WaitNextEvent() function to pass control of the CPU back to the OS so that the OS can run other programs. If you forgot to do that (or your program gets stuck in a long loop with no WaitNextEvent() call) your entire computer would freeze. This is exactly what you are experiencing with node where the entire javascript engine "freezes" and executes only one loop while ignoring other code.

Nodejs redis method BLPOP blocks the event-loop

I use the library node-redis: https://github.com/NodeRedis/node-redis
let client = redis.createClient();
let blpopAsync = util.promisify(client.blpop).bind(client);
let rpushAsync = util.promisify(client.rpush).bind(client);
async function consume() {
let data = await blpopAsync('queue', 0); // the program is blocked at here
}
async function otherLongRunningTasks() {
// call some redis method
await rpushAsync('queue', data);
}
I expect that method blpopAsync will be blocked until a element is popped. And in this time, event loop will run other async functions. However, my program is blocked at await blpopAsync forever.
Could you tell me how to use this function correctly to not block event-loop?
I find that blpop blocks other non-blocking methods of redis client, so other functions which use redis will wait forever. The event loop is still running.
As #AnonyMouze mentioned, doing client.duplicate().blpop(...) is the trick to freeing the connection.
i currently cannot comment so i will have to post it here. blpop will block when there is no data. Another thing to consider is the Redis's version . blpop behave differently between versions. You can refer to redis documentation .

Node.js Spawning multiple threads within a class method

How can I run a single method multiple times multi-threaded when called as a method of a class?
At first I tried to use the cluster module, but I realize it just re-runs the whole process from the start, rightfully so.
How can I achieve something like what's outlined below?
I want a class's method to spawn n processes, and when the parallel tasks are completed, I can resolve a promise which the method returns.
The problem with the code below is that calling cluster.fork() will fork index.js process.
index.js
const Person = require('./Person.js');
var Mary = new Person('Mary');
Mary.run(5).then(() => {...});
console.log('I should only run once, but I am called 5 times too many');
Person.js
const cluster = require('cluster');
class Person{
run(distance){
var completed = 0;
return new Promise((resolve, reject) => {
for(var i = 0; i < distance; i++) {
// run a separate process for each
cluster.fork().send(i).on('message', message => {
if (message === 'completed') { ++completed; }
if (completed === distance) { resolve(); }
});
}
});
}
}
I think the short answer is impossible. It's even worse - this has nothing to do with js. To multi (process or thread) in your particular problem you will essentially need a copy of the object in every thread, since it needs (maybe) access to fields - in this case you would need to either initialize it in every thread or share memory. That last one I don't think is provided in cluster, and not trivial in other languages in every use case.
If the calculation is independent of the Person I suggest you extract it, and use the usual (in index.js):
if(cluster.isWorker) {
//Use the i for calculation
} else {
//Create Person, then fork children in for loop
}
You then collect the results and change the Person as needed. You will be copying index.js, but this is standard and you only run what you need.
The problem is if results are dependent on Person. If these are constant for all i you can still send them to your forks independently. Otherwise what you have is the only way to fork. In general forking in cluster is not meant for methods, but for the app itself, which is the standard forking behavior.
Another solution
Following your comment, I suggest you checkout child_process.execFile or child_process.exec on same file.
This way you can spawn a totally independent process on the fly. Now instead of calling cluster.fork you call execFile. You can use either the exit code or stdout as return values (stderr etc.). Promise is now replaced with:
var results = []
for(var i = 0; i < distance; i++) {
// run a separate process for each
results.push(child_process.execFile().child.execFile('node', 'mymethod.js`,i]));
}
//... catch the exit event from all results or return a callback using results.
Inside mymethod.js Have your code that takes i and returns what you want either in the exit code or through stdout, both properties of the returned child_process. This is a bit un-node.js-y since you're waiting on asynchronous calls, but you're requirements are non standard. Since I'm not sure how you use this perhaps returning a callback with the array is a better idea.

.wait() on a task in c++/cx throws exception

I have a function which calls Concurrency::create_task to perform some work in the background. Inside that task, there is a need to call a connectAsync method on the StreamSocket class in order to connect a socket to a device. Once the device is connected, I need to grab some references to things inside the connected socket (like input and output streams).
Since it is an asynchronous method and will return an IAsyncAction, I need to create another task on the connectAsync function that I can wait on. This works without waiting, but complications arise when I try to wait() on this inner task in order to error check.
Concurrency::create_task( Windows::Devices::Bluetooth::Rfcomm::RfcommDeviceService::FromIdAsync( device_->Id ) )
.then( [ this ]( Windows::Devices::Bluetooth::Rfcomm::RfcommDeviceService ^device_service_ )
{
_device_service = device_service_;
_stream_socket = ref new Windows::Networking::Sockets::StreamSocket();
// Connect the socket
auto inner_task = Concurrency::create_task( _stream_socket->ConnectAsync(
_device_service->ConnectionHostName,
_device_service->ConnectionServiceName,
Windows::Networking::Sockets::SocketProtectionLevel::BluetoothEncryptionAllowNullAuthentication ) )
.then( [ this ]()
{
//grab references to streams, other things.
} ).wait(); //throws exception here, but task executes
Basically, I have figured out that the same thread (presumably the UI) that creates the initial task to connect, also executes that task AND the inner task. Whenever I attempt to call .wait() on the inner task from the outer one, I immediately get an exception. However, the inner task will then finish and connect successfully to the device.
Why are my async chains executing on the UI thread? How can i properly wait on these tasks?
In general you should avoid .wait() and just continue the asynchronous chain. If you need to block for some reason, the only fool-proof mechanism would be to explicitly run your code from a background thread (eg, the WinRT thread pool).
You could try using the .then() overload that takes a task_options and pass concurrency::task_options(concurrency::task_continuation_context::use_arbitrary()), but that doesn't guarantee the continuation will run on another thread; it just says that it's OK if it does so -- see documentation here.
You could set an event and have the main thread wait for it. I have done this with some IO async operations. Here is a basic example of using the thread pool, using an event to wait on the work:
TEST_METHOD(ThreadpoolEventTestCppCx)
{
Microsoft::WRL::Wrappers::Event m_logFileCreatedEvent;
m_logFileCreatedEvent.Attach(CreateEventEx(nullptr, nullptr, CREATE_EVENT_MANUAL_RESET, WRITE_OWNER | EVENT_ALL_ACCESS));
long x = 10000000;
auto workItem = ref new WorkItemHandler(
[&m_logFileCreatedEvent, &x](Windows::Foundation::IAsyncAction^ workItem)
{
while (x--);
SetEvent(m_logFileCreatedEvent.Get());
});
auto asyncAction = ThreadPool::RunAsync(workItem);
WaitForSingleObjectEx(m_logFileCreatedEvent.Get(), INFINITE, FALSE);
long i = x;
}
Here is a similar example except it includes a bit of Windows Runtime async IO:
TEST_METHOD(AsyncOnThreadPoolUsingEvent)
{
std::shared_ptr<Concurrency::event> _completed = std::make_shared<Concurrency::event>();
int i;
auto workItem = ref new WorkItemHandler(
[_completed, &i](Windows::Foundation::IAsyncAction^ workItem)
{
Windows::Storage::StorageFolder^ _picturesLibrary = Windows::Storage::KnownFolders::PicturesLibrary;
Concurrency::task<Windows::Storage::StorageFile^> _getFileObjectTask(_picturesLibrary->GetFileAsync(L"art.bmp"));
auto _task2 = _getFileObjectTask.then([_completed, &i](Windows::Storage::StorageFile^ file)
{
i = 90210;
_completed->set();
});
});
auto asyncAction = ThreadPool::RunAsync(workItem);
_completed->wait();
int j = i;
}
I tried using an event to wait on Windows Runtime Async work, but it blocked. That's why I had to use the threadpool.

Node.js Synchronous Library Code Blocking Async Execution

Suppose you've got a 3rd-party library that's got a synchronous API. Naturally, attempting to use it in an async fashion yields undesirable results in the sense that you get blocked when trying to do multiple things in "parallel".
Are there any common patterns that allow us to use such libraries in an async fashion?
Consider the following example (using the async library from NPM for brevity):
var async = require('async');
function ts() {
return new Date().getTime();
}
var startTs = ts();
process.on('exit', function() {
console.log('Total Time: ~' + (ts() - startTs) + ' ms');
});
// This is a dummy function that simulates some 3rd-party synchronous code.
function vendorSyncCode() {
var future = ts() + 50; // ~50 ms in the future.
while(ts() <= future) {} // Spin to simulate blocking work.
}
// My code that handles the workload and uses `vendorSyncCode`.
function myTaskRunner(task, callback) {
// Do async stuff with `task`...
vendorSyncCode(task);
// Do more async stuff...
callback();
}
// Dummy workload.
var work = (function() {
var result = [];
for(var i = 0; i < 100; ++i) result.push(i);
return result;
})();
// Problem:
// -------
// The following two calls will take roughly the same amount of time to complete.
// In this case, ~6 seconds each.
async.each(work, myTaskRunner, function(err) {});
async.eachLimit(work, 10, myTaskRunner, function(err) {});
// Desired:
// --------
// The latter call with 10 "workers" should complete roughly an order of magnitude
// faster than the former.
Are fork/join or spawning worker processes manually my only options?
Yes, it is your only option.
If you need to use 50ms of cpu time to do something, and need to do it 10 times, then you'll need 500ms of cpu time to do it. If you want it to be done in less than 500ms of wall clock time, you need to use more cpus. That means multiple node instances (or a C++ addon that pushes the work out onto the thread pool). How to get multiple instances depends on your app strucuture, a child that you feed the work to using child_process.send() is one way, running multiple servers with cluster is another. Breaking up your server is another way. Say its an image store application, and mostly is fast to process requests, unless someone asks to convert an image into another format and that's cpu intensive. You could push the image processing portion into a different app, and access it through a REST API, leaving the main app server responsive.
If you aren't concerned that it takes 50ms of cpu to do the request, but instead you are concerned that you can't interleave handling of other requests with the processing of the cpu intensive request, then you could break the work up into small chunks, and schedule the next chunk with setInterval(). That's usually a horrid hack, though. Better to restructure the app.

Resources