Differences between gevent and tornado - wsgi

I understand that both tornado and gevent are asynchronous python frameworks.
While reading the bottle documentation I found that gevent actually is NOT asynchronous, and you can create thousands of pseudo-threads that work synchronously.
Secondly, in gevent you cannot terminate the request handler early; you need to return the full response, while in Tornado you can. (Correct me if I'm wrong here.)
Can someone describe in detail how these systems work internally, and in what ways they differ? Also, how does WSGI play with the asynchronous nature of these systems? Do these frameworks conform to WSGI, and if so, how?

Have a read of:
http://en.wikipedia.org/wiki/Coroutines
and:
http://en.wikipedia.org/wiki/Event-driven_architecture
http://en.wikipedia.org/wiki/Event-driven_programming
The gevent package uses coroutines and Tornado is event driven.
Event-driven systems don't readily map to WSGI, but a coroutine system, because it looks like threads, can be made to support WSGI if blocking points can be patched to switch coroutines when things would block.

gevent and Tornado are a bit different. gevent is much more like Twisted - an asynchronous network framework, whereas Tornado is a web only framework.
The main highlight of gevent is that it utilizes coroutines and makes code look like it's running synchronously, but in fact most IO blocking functions are non-blocking and return control to the gevent main loop. This is very important for IO bound programming since it allows you to write highly efficient single thread code the same way you would write multithreaded code, which is much more resource hungry.
gevent also includes a WSGI request handler so it can be used to handle HTTP requests in a standalone manner, like Tornado.
Tornado is an asynchronous web framework which relies on the programmer to write asynchronous code in Python, which is often a pain because Python has no multiline anonymous closures or classes of the kind JavaScript and Java have. Therefore, writing good code using Tornado is really hard. For example, using blocking libraries becomes a pain.
Indeed both frameworks are asynchronous at their core, but the resulting code looks a bit different (easier to program with gevent).
You can actually use Tornado and gevent together, but I haven't tried it out (yet).
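To make the "resulting code looks a bit different" point concrete, here is a rough analogy written in JavaScript rather than Python. It is only an analogy: gevent switches coroutines implicitly at patched blocking calls, with no await keyword, and fetchValue is a hypothetical stand-in for any non-blocking operation.

```javascript
const { promisify } = require('util');

// Hypothetical non-blocking operation: "fetches" a value after a short delay.
function fetchValue(key, callback) {
  setTimeout(() => callback(null, 'value-of-' + key), 10);
}

// Event-driven (callback) style: the control flow is split across callbacks.
function handleWithCallbacks(done) {
  fetchValue('a', (err, a) => {
    if (err) return done(err);
    fetchValue('b', (err2, b) => {
      if (err2) return done(err2);
      done(null, a + ',' + b);
    });
  });
}

// Coroutine-like style: reads top to bottom, but yields to the
// event loop at every await instead of blocking the thread.
const fetchValueAsync = promisify(fetchValue);

async function handleWithCoroutines() {
  const a = await fetchValueAsync('a');
  const b = await fetchValueAsync('b');
  return a + ',' + b;
}
```

Both handlers do the same work; the second one is roughly what gevent-style code feels like to write.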

Related

How does Hystrix for NodeJS work despite being single threaded?

hystrixjs is an implementation of Hystrix for NodeJS.
But I'm not able to understand how it works in a single threaded js. How does it handle timing out the task?
The documentation talks about that a bit
Since this library targets nodejs application, it is by far not as
complex as the java implementation (remember... just one thread... no
concurrency, at least most of the time...). So how does it accomplish
the goals outlined above:
wraps all promise-based calls to external systems (or "dependencies") in a "Command"
Does that mean the library just runs the promise and lets node handle the multithreading part (like network calls being handled by OS threads)?
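For what it's worth, timing out a promise does not need a second thread, because timers are scheduled on the same event loop. The following is a minimal sketch of that general technique, not hystrixjs's actual code; withTimeout and slowCall are hypothetical names made up for the illustration.

```javascript
// Reject if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error('timed out after ' + ms + ' ms')),
      ms
    );
  });
  // Whichever settles first wins. The losing promise is ignored,
  // not cancelled: the underlying call keeps running.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical slow dependency that resolves after `delay` ms.
function slowCall(delay) {
  return new Promise((resolve) => setTimeout(() => resolve('response'), delay));
}
```

Calling withTimeout(slowCall(200), 10) rejects after about 10 ms, while withTimeout(slowCall(10), 200) resolves with the response.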

Does microservices with node.js prevent the main thread from blocking

I have just started to understand how the node.js event loop and microservices work, and I was wondering whether microservices can prevent the main thread of a node.js application from blocking. What I mean is that we can run synchronous code in a separate microservice, which sends the response back when done, and we can scale only that microservice.
Is my understanding correct or please let me know if I got something wrong?
I think you're mixing up two concepts.
Microservices are relatively small, loosely-coupled services written in whatever language. For example if I work at BigEcommerceCompany I might have a variety of microservices written in a variety of technologies to manage, such as an auth service, cart service, payments service, reviews service, etc., and they might all be in the same language or all be in different languages.
Node's event loop is single-threaded, but Node also has a worker pool that can be used for work without blocking the event loop, and Node can be clustered with the built-in cluster module (or various wrappers) across available CPUs. An example of a blocking function call in Node would be child_process.spawnSync; an example of a non-blocking call would be child_process.spawn. It's common when writing Node code to use a lot of Promises or callbacks to avoid blocking the event loop as much as possible.
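The spawnSync versus spawn difference can be sketched roughly like this. The child command here (a second node process that prints one line) and the function names are assumptions chosen only to keep the example self-contained.

```javascript
const { spawn, spawnSync } = require('child_process');

// Hypothetical child command: another node process that prints one line.
const childArgs = ['-e', 'console.log("child done")'];

// Blocking: nothing else runs on the event loop until the child exits.
function runBlocking() {
  const result = spawnSync(process.execPath, childArgs, { encoding: 'utf8' });
  return result.stdout.trim();
}

// Non-blocking: returns a promise right away; the event loop keeps
// serving other work while the child is running.
function runNonBlocking() {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, childArgs);
    let output = '';
    child.stdout.setEncoding('utf8');
    child.stdout.on('data', (chunk) => { output += chunk; });
    child.on('error', reject);
    child.on('close', (code) => {
      if (code === 0) resolve(output.trim());
      else reject(new Error('child exited with code ' + code));
    });
  });
}
```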
The two concepts aren't really related except in that by writing small microservices it may be easier to find, isolate, and fix problems with Node performance.

Node.js Express server: running res.render() / ejs.render() using Node.js threadpool

We have an app that uses server-side rendering for SEO purposes using EJS templating.
I am well-versed with Node.js and know that it's probably possible to tap into the Node.js threadpool for asynchronous I/O for whatever purpose you want, whether it's a good idea or a bad idea. Currently I am wondering if it is possible to run ejs.render() or res.render() with a thread in the threadpool instead of the main thread in Node.js?
We are doing a lot of heavy computational lifting in the render functions and we definitely want that off the main thread, otherwise we will be paying $$$ for more servers.
Is it just the rendering that is concerning you? There are other template engines which should produce better results; being that template rendering should be an idempotent operation, you could additionally distribute across a cluster.
V8 will compile your code to assembly and, if you're not hitting any deoptimizations or getting stalled by the garbage collector, I believe you should be in the neighborhood of your network I/O limits. I would definitely recommend you try other template engines, adding a caching HTTP reverse proxy at the front and running some benchmarks first.
EJS is known to be synchronous, and that's not going to change, so basically it's an inefficient rendering engine for Node.js since it blocks the JS thread whenever it renders a view, which degrades your overall throughput, especially if your rendering is CPU heavy.
You should definitely think about some other options. E.g. https://github.com/ericf/express-handlebars
If you really have CPU-heavy computation in your webserver, then Node.js is definitely not the right tool for the job anyway. There are much better servers to handle multi-threading and parallel processing. You could just set up Node to be a controller and forward your CPU-heavy requests to a backend service/server that can do the heavy lifting.
It would be helpful to see what kind of computation you are doing during render to provide a better answer.
Tapping into the thread pool (which is handled by libuv) would probably be a bad idea, but it is of course possible. You just need some C++ skills and the uv_queue_work() function of the libuv library to schedule work on a worker thread.
I have experimented with building a scripting engine that is run in a forked process (Read on node's child process module here). I find that to be an attractive proposition for implementing rendering engines. Yes there are issues of passing parameters (post/get query strings, session status, etc) but they are easy to deal with, especially if you use the fork option (as opposed to exec or spawn). There are standard messaging methods to communicate between the child and parent.
The only overhead is starting the additional instance of node (the rendering engine itself). If you are doing extensive computation in the scripting engine, then this constant, once-per-request cost of forking a new process will be minor compared to the time taken to render.
If EJS rendering blocks the main node thread, then that alone is sufficient reason NOT to use it if you are doing any significant computation during rendering.
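A minimal sketch of the forked-process idea, under some loud assumptions: renderTemplate is a hypothetical stand-in for the real (CPU-heavy) template render, and the child script is written to a temporary file only so the example fits in one file. In a real app it would be its own module.

```javascript
const { fork } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Source of the child process. renderTemplate is a hypothetical
// placeholder for an expensive render such as ejs.render().
const childSource = `
  function renderTemplate(data) {
    return '<h1>' + data.title + '</h1>';
  }
  process.on('message', (data) => {
    process.send({ html: renderTemplate(data) });
  });
`;

// Render in a freshly forked node process and resolve with the HTML.
// (Temporary files are not cleaned up in this sketch.)
function renderInChild(data) {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'render-'));
  const script = path.join(dir, 'render-child.js');
  fs.writeFileSync(script, childSource);

  return new Promise((resolve, reject) => {
    const child = fork(script);
    child.once('message', (msg) => {
      child.kill(); // one process per render request
      resolve(msg.html);
    });
    child.once('error', reject);
    child.send(data); // standard parent-to-child messaging
  });
}
```

The main thread only pays for sending and receiving the message; the render itself happens in the child.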

Instances where Node.js operations need to be synchronous

JavaScript is single-threaded (besides web workers and spawning multiple processes), and it's best not to wait for long-running operations as it blocks the thread. But still, when taking a peek into several modules on GitHub, they actually use these synchronous operations, most of the time in file operations.
Am I staring at bad code/practice? Or is there some real need for synchronous operations in JavaScript that I am not aware of?
Can you post an example? You are most likely looking at:
a call that is actually asynchronous such as fs.read. Note that all synchronous calls in the node core API end with the word "Sync" like fs.readSync.
synchronous code such as require('somemodule') that runs before the application begins accepting network requests
And yes, if you are seeing code doing something like fs.readSync while responding to an HTTP request, that is bad code/practice and that application will lock up while that synchronous operation happens.
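A small sketch of the distinction, using fs.readFileSync and fs.readFile (the whole-file cousins of fs.readSync and fs.read). The file path and function name are made up for the illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical file, created here only so the example can run.
const file = path.join(os.tmpdir(), 'sync-vs-async-example.txt');
fs.writeFileSync(file, 'hello');

// Startup-time work: blocking is acceptable because the application
// is not accepting network requests yet.
const atStartup = fs.readFileSync(file, 'utf8');

// Request-time work: use the asynchronous variant so the event loop
// can keep serving other requests while the read is in progress.
function readDuringRequest(callback) {
  fs.readFile(file, 'utf8', callback);
}
```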
Node.js is not single-threaded. It uses a pool of threads, but they are exposed to the JavaScript layer as a single thread; otherwise it would be impossible to write asynchronous code. Any blocking I/O call blocks the thread it runs on.
Threads are used internally to fake the asynchronous nature of all the
system calls. libuv also uses threads to allow you, the application,
to perform a task asynchronously that is actually blocking, by
spawning a thread and collecting the result when it is done.
http://nikhilm.github.com/uvbook/threads.html
Node.js has decided to include synchronous functions to maintain a similitude with other common languages, but they shouldn't be used, never, never, never!
Node.js in its nature is asynchronous, it's pure javascript. Javascript is synonym of closure, of callback. If you want to write synchronous code with Node.js perhaps you should try another scripting language like python.
There's an excellent module called async that eases the pain of nested callbacks. Then, why should I use synchronous code? Silly. If I use synchronous code I lose all the benefits that Node provides to me. The only exception are CLI apps, but again, I'll prefer to write all the code asynchronously. It's not really hard.
At some level, a synchronous operation has to occur. Node.js puts those in threads so that the whole server isn't blocked.
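As a rough illustration of work being pushed onto a thread: crypto.pbkdf2 is one of the core calls that Node runs on libuv's thread pool, while crypto.pbkdf2Sync computes the same key on the JavaScript thread. The password, salt and iteration count below are arbitrary example values.

```javascript
const crypto = require('crypto');

const events = [];

// The asynchronous form hands the hashing to a pool thread, so the
// JavaScript thread stays free while the key is being computed.
function deriveKey(done) {
  crypto.pbkdf2('password', 'salt', 100000, 32, 'sha256', (err, key) => {
    if (err) return done(err);
    events.push('key derived');
    done(null, key);
  });
  // Typically runs while the hash is still being computed elsewhere.
  setImmediate(() => events.push('event loop still running'));
}
```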

Simulate website load on Node.JS

I am thinking of creating my own simple load test, where I can hit my website with multiple requests (like 100-1000 concurrent users) to see how it performs. I want to try Node.js out, but I don't know if it is the wrong technology for the job, since Node.js doesn't use threads.
Can I simulate the many user requests with the async model that Node.js uses, or would that be more appropriate to do in another language like Ruby/.NET/Python?
Node.js ought to be perfect for the task. I do this at work. The one crucial piece that you will have to change is the http socket pool. The following code snippet will disable pooling entirely, letting you starve your Node.js process if you want to.
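var http = require('http');
// agent: false opts this request out of the shared connection pool
var req = http.request({ /* ...other request options... */ agent: false });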
You can read about this more at the http.Agent documentation.
Your concern about threads is astute, but even if you hit that limit (Node is very good at keeping your resources efficient) the solution is simple: start multiple instances (processes) of your load test. As it is, you may have to use multiple machines entirely to correctly simulate load.
In any case, you will not win automatically by using Ruby or Python for this. Asynchronous programming is ideal for I/O and network-bound tasks, and Node excels at this. Similarly, while Ruby and Python have third-party asynchronous frameworks, they're by definition more obscure than the standard asynchronous framework given in Node.
Node can fire off pretty much as many requests as you want it to (though you may have to change the defaults for http.Agent). You're more likely to be limited by what your OS can do than by anything inherent in node (and of course such limitations will apply in any other language you use).
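A bare-bones sketch of what such a load test might look like, assuming a plain HTTP target. fireRequests and demo are hypothetical helper names, and the throwaway local server is there only so the example runs on its own; point fireRequests at your own site instead.

```javascript
const http = require('http');

// Fire `count` concurrent GET requests at `url`; resolve with the status codes.
function fireRequests(url, count) {
  const one = () => new Promise((resolve, reject) => {
    // agent: false opts each request out of the shared connection pool.
    const req = http.get(url, { agent: false }, (res) => {
      res.resume(); // discard the body
      res.on('end', () => resolve(res.statusCode));
    });
    req.on('error', reject);
  });
  return Promise.all(Array.from({ length: count }, one));
}

// Demo against a throwaway local server on a random free port.
function demo(count) {
  return new Promise((resolve, reject) => {
    const server = http.createServer((req, res) => res.end('ok'));
    server.listen(0, '127.0.0.1', () => {
      const url = 'http://127.0.0.1:' + server.address().port + '/';
      fireRequests(url, count)
        .then((codes) => server.close(() => resolve(codes)))
        .catch((err) => server.close(() => reject(err)));
    });
  });
}
```

For example, demo(100) resolves with an array of 100 status codes. A real test would also record timings and failures per request.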
It's simple to create load tests with nodeload.

Resources