Node WASI vs spawning a child process

The Node.js docs state the following:
The WASI API provides an implementation of the WebAssembly System Interface specification. WASI gives sandboxed WebAssembly applications access to the underlying operating system via a collection of POSIX-like functions.
My question is:
What is the biggest benefit of using the WASI API over, say, spawning some other child process or similar methods of running non-Node.js code?
I would have to assume it's faster than spawning a child process, or using some C code with bindings, due to the native API.
Maybe I'm simply misunderstanding the entire idea behind WASI, which is plausible, given that part of what makes WASM so amazing is the ability to use a full-blown, server-side programming language on the web (mostly), like all the crazy tools we've seen with Go/Rust.
Is this more so for the benefit of running WASM in node, natively, and again, if so, what are the benefits compared to running child processes?

I ended up getting my answer from a post here that was removed.
In really high-level terms, WASI is simply a system interface for WASM.
I ended up finding this short 'article', if you will, super helpful too!
Just as WebAssembly is an assembly language for a conceptual machine, WebAssembly needs a system interface for a conceptual operating system, not any single operating system. This way, it can be run across all different OSs.
This is what WASI is — a system interface for the WebAssembly platform.

Related

Node js vs Kotlin for REST APIs

I couldn't find any proper explanation of the differences between Kotlin and Node.js for REST APIs.
Which one performs better?

Let me set the context: this is Kotlin/JVM vs. JavaScript/Node.js. We cannot blindly say that one language is better. In general, Kotlin is expected to be faster, since it is a compiled language, whereas JavaScript is traditionally classed as an interpreted language (although V8 JIT-compiles it at run time).
Irrespective of the language used, let's discuss the API architecture. APIs can be served in either a blocking or a non-blocking way (I am not going to explain what that means here). Until a few years ago, Java/Kotlin with Spring used the blocking architecture, which delivered performance X. In contrast, Node.js is based on a non-blocking architecture, which gave better performance than the blocking one, and this architectural style is the only reason Node.js performed better. Spring later released a newer version of the framework that supports the non-blocking architecture. This non-blocking style is called reactive programming (Spring WebFlux).
So now both platforms support a non-blocking architecture. In terms of raw language performance, Kotlin should be better, since it is a compiled language, and in theory interpreted languages are slower. But we cannot say which is better without testing.
Personally, I am a fan of Java/Spring because of OOP, and at one point I later started using TS/Node.js. TS eliminates most runtime issues with its type checking, but it still cannot be compared with the type system available in Java/Kotlin. As a language, I feel Java/Kotlin is superior; the thing I like most about JavaScript is how it handles objects/JSON. Check out "Kotlin for JavaScript" as well, which lets you write Kotlin and transpile it to JS. Setting that feature aside, I am planning to try Kotlin/Spring with the non-blocking architecture for my future projects. If you have use cases involving WebSockets, I think Node.js will perform better; I am not sure whether there are libraries for this in Java/Kotlin, since I haven't explored it.
One disadvantage of the non-blocking style is that I need to pass the login context object to almost every method in the project. In the blocking architecture, we put the login context information in a thread-local so that we can access it anywhere until the request is completed.
I am sure that I did not answer your question completely, but I hope the information I have given is useful.
Correct me if I am wrong on any of these points.

What are the security risks associated with WASM?

Using Deno you can execute WASM on a server. WASM is sandboxed for the user's safety. From my understanding, WASM code cannot make HTTP requests or modify the DOM.
Is safety guaranteed server side too? I'm looking to run arbitrary Python code from user input on servers using Pyodide, but I am concerned that I may have missed some important security flaw.

Using Deno you can run WebAssembly modules on a server because the Deno wasi module provides an implementation of WASI, the WebAssembly system interface. Using Deno is just one way of running wasm modules on a server. You could choose between many other implementations of WASI, like the wasi module in Node.js, wasmtime, lucet, wasmer, etc.
Code [running] outside of a browser needs a way to talk to the system — a system interface.
As for your security concerns, keep in mind that your WebAssembly code runs in a sandboxed environment. Your host system does not directly execute the code in your wasm module. The wasm runtime, which implements the WASI interface, is what runs it. And as far as I know, the only way for your code to produce side effects (e.g. perform an HTTP call, access files) is to go through the appropriate APIs defined by WASI.

General server-side use of V8: Isolates

Google's open-source V8 engine is a mature, performant JIT compiler.
It is implemented primarily in C++ and acts as a JS-centric execution runtime.
It has an isolation mechanism (V8 Isolates) that provides isolation granularity within a single process.
This leads to a two-part question.
(Generic)
Can this capability be broadly used for isolation across server-side web application engines (e.g. nginx, apache) and programming languages?
(And more specific ->)
What I've grasped of V8 is that it's designed for the JS scripting language (even though it compiles directly to machine code).
If I want to use another programming language for the source code, say Haskell or C/C++, there still tends to be a JS interface in between.
Would there be a more direct way to generate machine code while still using V8 Isolates?

V8 is a JavaScript (and WebAssembly, in recent versions) engine and as such cannot be used to compile or execute any other languages.
If you have C++ code, you'll need to use a C++ compiler to generate executable machine code for it. Haskell code needs a Haskell compiler.
Depending on your requirements, WebAssembly might be interesting to you: it is a portable compilation target for languages like C++ that is more suitable for this purpose than JavaScript.
This should answer both your "more specific" and the "generic" question.
Note that there isn't really any magic in V8's Isolates that one might want to use for other purposes; the term mostly describes the ability to have several separate instances of V8 in the same process. That's rather easy to pull off if you start your own project from scratch (no matter what its purpose is), you just have to maintain a bit of coding discipline; for an existing codebase it requires refactoring of all global state (static variables etc).
Also, note that the world has learned this year that from a security point of view, there really is no such thing as in-process isolation. If you have strong security requirements, then at the very least you'll have to run separate processes for different security domains. (To be clear, V8's Isolates do not provide protection from side-channel attacks.)

Why was Node.js written in the C/C++ programming language?

Unfortunately, JavaScript is the only programming language I have experience with. So naturally my gut instinct is to wonder why you wouldn't write a programming language (in this case Node) in JavaScript.
Why C? What benefits are you getting?

C is a low-level language suited to systems programming--i.e. the construction of operating systems, database engines, and other code that must be highly efficient (in both time and space used to complete a given task). C is "close to the bare metal," compiling very effectively into machine code and CPU instructions.
You can certainly write compilers and middleware in higher-level languages than C. While there can be a speed-of-development advantage for doing so, they will almost always run slower and consume far more memory. Many languages (Python, PHP, JavaScript, ...) are implemented in C (or C++) as a result.
If you wanted to implement something like Node in another language, you would probably best look to another language that majors on systems programming, such as C++, C#, Rust, D, ...

Node.js is built on Chrome's V8 engine (which allows it to execute JavaScript), so you should really ask why V8 was written in C++.
This answer on Quora might help you for the 2nd question

Node.js is partly written in JavaScript, and it lets JavaScript run on the desktop to build applications. Node.js is also written in C++, because a web server needs access to internal system functionality such as networking.
C++ has many features that let it interact directly with the OS.
JavaScript does not! So it has to work with C++ to control these system features.
Consider a client/server architecture example (here Mick is the client): Mick's Mac/Windows machine needs access to a website that is hosted somewhere on the internet on a server, which is basically a computer.

Managing multiple-processes: What are the common strategies?

While multithreading is faster in some cases, sometimes we just want to spawn multiple worker processes to do work. This has the benefits that the main app does not crash if one of the workers crashes, and that the user doesn't need to worry much about locking.
COM+'s Application Pooling seems like a good way to achieve this on Windows. The downside is that we need to write a COM+ wrapper for the worker process.
However, when I search for Application Pooling on Google, it seems like most of its usages are related to IIS. Don't other applications (such as scientific/graphics) find it useful to spawn multiple worker processes?
So there are several questions:
Why isn't COM+ more popular in areas other than IIS? If I write a non-IIS application and want to use process management on Windows, should I go with COM+ or are there better alternatives out there?
What would be the cross-platform way to do it? Are there libraries out there that give me a "process pool" (worker processes will intelligently pick up work, can be managed, etc.)?

I can't offer any answers to the COM aspect of your question, but it's worth noting there's another world (besides HPC MPI) where multi-processing (rather than the more common multi-threading approach) is apparently alive, well and thriving: Python.
Why? Python's GIL ("global interpreter lock") cripples most attempts to multithread Python code so badly that multiprocessing is the generally recommended approach to parallelising Python on SMP. The standard library includes process pools; there are various other options too.
Python certainly ought to satisfy any multi-platform requirement!
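As a rough illustration of the standard-library process pools mentioned above, here is a minimal sketch using `concurrent.futures.ProcessPoolExecutor`. The names `sum_of_squares` and `run_in_pool`, the inputs, and the worker count are made up for the example.

```python
from concurrent.futures import ProcessPoolExecutor


def sum_of_squares(n: int) -> int:
    """CPU-bound work; in CPython with the GIL, threads would not speed this up."""
    return sum(i * i for i in range(n))


def run_in_pool(inputs, workers=2):
    # Each input is handed to whichever worker process is free;
    # results come back in the same order as the inputs.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(sum_of_squares, inputs))


if __name__ == "__main__":
    # The guard matters: on platforms that start workers by re-importing
    # this module, unguarded code would try to create pools recursively.
    print(run_in_pool([10, 100, 1000]))
```

The same pattern works on Windows, Linux and macOS, which is the multi-platform point made above; the worker function has to be defined at module level so it can be sent to the worker processes.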

You might want to investigate how the Apache web server manages process pools. From version 2.0 it runs natively on Windows, and one of the multi-processing models it supports is process pools. Apache also includes APR (Apache Portable Runtime), which handles platform-specific issues.

No one can answer why something is not popular, because maybe nobody else is looking for what you are looking for. After .NET came into the picture, people shifted from COM to the managed environment. Before .NET, COM, ATL and other related technologies were quite painful to implement; they would crash and were also quite difficult to debug.
That is the reason the managed environment came into existence.
From .NET 4 onwards, however, the parallel libraries give the user much more power for parallel programming, and you can also spawn and control other processes.
For a multi-platform approach, see zvrba's answer.

Yes, other applications--especially science applications--find it useful to spawn multiple processes. Since few super-computers run Microsoft Windows, scientists generally avoid using anything that ties them to a Microsoft platform. Nothing related to COM will help scientists leverage their enormous existing code base written in Fortran.
People who choose to run IIS have generally already drunk the Microsoft Koolaid, so they have fewer inhibitions to tying themselves to Microsoft's proprietary platforms, which is why COM-specific terminology will get lots of hits related to IIS.
One of the open standards for doing what you want is the Message Passing Interface. Several implementations exist and some of them run on supercomputers using Fortran. Some of them run on cheaper computers using sexier languages.
See http://en.wikipedia.org/wiki/Message_Passing_Interface

There hasn't been a mob rushing through the doors of COM application pooling primarily because of two factors:
COM is a pain in the ass to deal with compared to just about anything else
Threading can be a headache, but it's a lot easier and more convenient to manage than inter-process communication
COM application pooling was essentially created for IIS. It has one very specific benefit over normal multithreading: the multiple processes are fully isolated from each other. This is important for data security and for app stability when dealing with third party plugins of questionable stability.
Scientific computing generally doesn't need strong data security isolation between operations, and I would venture to guess that scientific computing doesn't rely much on third party plugins of questionable stability. When doing big math operations, you're either using a sexy numerics library that had better be rock solid to be taken seriously, or you're using your own code, in which case crashes should be fixed and repeat offenders should be spanked.
Oh, and all crashes except stack overflow can be trapped and dealt with within a multithreaded app, especially if it's your own code.
In short, COM app pooling is overkill for just about anything other than IIS.
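To illustrate the process-isolation benefit described above (a crashing worker does not take the parent down), here is a minimal sketch that uses plain child processes rather than COM+. It is written in Python only for brevity; `WORKER_SOURCE` and `run_worker` are hypothetical names, and the "crash" is simulated with an unhandled exception.

```python
import subprocess
import sys

# Hypothetical worker script: doubles its argument, but "crashes" with an
# unhandled exception when the input is negative.
WORKER_SOURCE = """
import sys
value = int(sys.argv[1])
if value < 0:
    raise RuntimeError("worker crashed")
print(value * 2)
"""


def run_worker(value: int):
    """Run one job in its own process and return (exit_code, stdout)."""
    result = subprocess.run(
        [sys.executable, "-c", WORKER_SOURCE, str(value)],
        capture_output=True,  # keep the worker's output away from our console
        text=True,
    )
    return result.returncode, result.stdout.strip()


if __name__ == "__main__":
    for job in (21, -1, 4):
        code, out = run_worker(job)
        # The parent keeps running and can decide how to handle a failed job.
        status = "ok" if code == 0 else f"failed (exit code {code})"
        print(job, status, out)
```

The parent only sees a non-zero exit code for the failed job and carries on with the next one; the price is the inter-process communication overhead that the answer above points out.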

Google's web browser Chrome has a multi-process architecture. It is open source (as Chromium), so you can check out its code and see how it manages processes.
