tokio-tungstenite with tokio::select! macro?

The documentation for the tokio-tungstenite crate mostly just directs you to the examples section.
The client example hasn't been updated in a few years, and I saw some unfamiliar async code in it. Because Rust's async APIs are still in flux, I wanted to check whether the use of futures_util::future::select is still idiomatic in this situation.
In particular, it seems to work fine if I replace it with the more commonly seen tokio::select! macro, which doesn't require pinning the passed futures.
I'm quite familiar with tokio's APIs, but not so much with the lower-level futures ones. Is there a reason to manually pin here and use future::select instead? More generally, in modern idiomatic async Rust, when would one use the latter?

It should be fine to replace any use of futures::future::select() with tokio::select! (or futures::select!); they both do the same thing. select! can wait on more than two futures, while select() gives you a named future you can store or return. I prefer select! when I can (I think it is also more performant, but I'm not sure).
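For illustration, here is a minimal sketch contrasting the two. The sleeps stand in for the tungstenite read/write futures from the example; the durations are arbitrary:

use futures_util::future::{self, Either};
use futures_util::pin_mut;
use tokio::time::{sleep, Duration};

#[tokio::main]
async fn main() {
    // future::select() requires Unpin futures, hence the manual pinning:
    let a = sleep(Duration::from_millis(10));
    let b = sleep(Duration::from_millis(20));
    pin_mut!(a, b);
    match future::select(a, b).await {
        // Either hands back the unfinished future, so it can be resumed later
        Either::Left((_, _b_unfinished)) => println!("a finished first"),
        Either::Right((_, _a_unfinished)) => println!("b finished first"),
    }

    // tokio::select! pins its branches internally, so no pin_mut! is needed,
    // but the losing branch is simply dropped:
    let a = sleep(Duration::from_millis(10));
    let b = sleep(Duration::from_millis(20));
    tokio::select! {
        _ = a => println!("a finished first"),
        _ = b => println!("b finished first"),
    }
}

One substantive difference: future::select() returns an Either carrying the future that didn't finish, so you can keep polling it, whereas a plain tokio::select! drops the other branch. If you need the loser to survive, both approaches let you pin the future once outside a loop and select on &mut fut instead.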

Related

Why are asynchronous runtimes like Tokio necessary?

My first experience with a computer systems project was building a server in vanilla Java and then a client on an Android phone. Since then, I've found that there are a lot of frameworks that help manage scalability and remove the need to write boilerplate code.
I'm trying to understand what services like Tokio and Rayon enable.
I came across this paragraph on the Tokio tutorial page, and I'm having a hard time understanding it:
When you write your application in an asynchronous manner, you enable it to scale much better by reducing the cost of doing many things at the same time. However, asynchronous Rust code does not run on its own, so you must choose a runtime to execute it.
I first thought a "runtime" might refer to where the binary can run, but it looks like Tokio just provides functions that are already available in the Rust standard library while Rayon implements functions that aren't in the standard library.
Are the standard implementations for asynchronous functions written poorly in the standard library or am I not understanding what service Tokio is providing?
Rust currently does not provide an async runtime in the standard library. For full details, see Asynchronous Programming in Rust, and particularly the chapter on "The Async Ecosystem."
Rust currently provides only the bare essentials for writing async code. Importantly, executors, tasks, reactors, combinators, and low-level I/O futures and traits are not yet provided in the standard library. In the meantime, community-provided async ecosystems fill in these gaps.
Rust has very strict backward-compatibility requirements, and the project hasn't chosen to lock in a specific runtime. There are reasons to pick one over another (features versus size, for example), and putting one in the standard library would impose choices that aren't clearly right for all projects. This may change in the future as community projects explore the space further and help determine the best mix of choices, without the standard library's strong backward-compatibility promises constraining the experiment.
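To make that concrete, here is a minimal sketch (mine, not from the tutorial) of what "asynchronous Rust code does not run on its own" means in practice:

use tokio::runtime::Runtime;

async fn say_hello() {
    println!("hello from an async fn");
}

fn main() {
    // Calling an async fn only constructs a future; nothing has run yet.
    let fut = say_hello();

    // The runtime supplies the executor that polls the future to completion
    // (and the reactor that would wake it on I/O events). Without a runtime,
    // `fut` is inert.
    let rt = Runtime::new().expect("failed to build a tokio runtime");
    rt.block_on(fut); // only now is "hello from an async fn" printed
}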

How to choose between block_in_place and spawn_blocking?

I'm working a lot with tokio, and I've been using spawn_blocking for code that is going to block the thread. Then I saw the documentation for block_in_place, and it seems like a version of the former without the Send and 'static restrictions.
My question is: if I'm already on a multi-threaded runtime, when is using block_in_place not advisable? What are the differences and advantages of each way of driving sync code? Can it be a problem if I call block_in_place a lot, for example on all my worker threads at the same time? How does it work?
I read all of the tokio documentation and didn't find the answer to these questions, so it felt right to ask here.
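For reference, a minimal sketch of how the two are invoked; the closure bodies are placeholder work, and the comments summarize the contract as I understand it from the tokio docs:

use tokio::task;

#[tokio::main(flavor = "multi_thread")]
async fn main() {
    // spawn_blocking ships the closure to tokio's dedicated blocking thread
    // pool, which is why it demands Send + 'static.
    let sum = task::spawn_blocking(|| (0u64..1_000_000).sum::<u64>())
        .await
        .expect("blocking task panicked");

    // block_in_place runs the closure on the *current* worker thread after
    // asking the runtime to move other queued tasks off of it. It can borrow
    // from the enclosing scope, but it panics on a current_thread runtime.
    let data = vec![1, 2, 3];
    let total = task::block_in_place(|| data.iter().sum::<i32>());

    println!("{sum} {total}");
}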

How do I write an async method with Tokio?

I'm trying to write a library that will connect to remote servers and exchange data. I did this in C++ using Boost::Asio and am trying to do the same with Rust.
One of the problems I have is mapping concepts from Asio, like async_write/async_read, to Tokio, starting with the fact that seemingly all Tokio examples demand that I replace my main() with an async main(), while I would like to encapsulate all my async code in structures and their associated implementations.
Is it possible to use Tokio without replacing main()? Is mio perhaps the only way?
You can create a runtime manually using Runtime::new(), which is what the #[tokio::main] macro does under the hood. It's just that, for an awful lot of apps, and especially for examples, the setup is pure boilerplate, so the macro automates the simple case.
However, depending on the context of your library, it may be more idiomatic to provide a future-based API and leave it to the consuming application to set up the runtime.
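A minimal sketch of that split (fetch_data is a hypothetical stand-in for your connection logic):

use tokio::runtime::Runtime;

// The library exposes plain async fns and stays runtime-agnostic...
async fn fetch_data(endpoint: &str) -> String {
    // hypothetical placeholder; imagine real socket I/O here
    format!("data from {endpoint}")
}

// ...and the application keeps an ordinary, non-async main() by
// constructing the runtime itself instead of using #[tokio::main]:
fn main() {
    let rt = Runtime::new().expect("failed to build runtime");
    let data = rt.block_on(fetch_data("example.com:9000"));
    println!("{data}");
}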

Why do coroutines have futures?

Once you have coroutines you can create pipelines (haskell: pipes, conduits; python: generators) or cooperative event loops (python: curio). Once you have futures, it appears you can do the same; pipelines (rust: futures-rs) and event loops (rust: tokio). Since futures aren't cooperative they require a callback-based (even poll-based futures require callbacks) scheduler to execute blocking tasks within a thread or process pool. What benefits are there to combining futures (library-level) with coroutines (language-level) as these languages do: (python: asyncio), (rust: rfc), (ecmascript 6+). Fundamentally they seem to be conflicting solutions to the same problem.
I'm not looking for a pro/con comparison, and I don't buy the argument that futures are "one-shot" coroutines; just look at Rust, which built an entire state-machine-based event framework using just futures. I want to know why Python's asyncio and JavaScript both require coroutines together with futures, and why Rust is planning to add coroutines on top of its futures. Does it have to do with composability of events? With the implicit stack of coroutines versus the explicit stack of continuation-passing futures? (Not that I completely understand this argument, as both futures and coroutines are implemented using continuations.) Or does it have something to do with direct versus indirect style?
These are all different (though related) ideas with different amounts of power.
A future is an abstraction that lets you begin a process and then, when the process is done, yield back to a handler chosen by the original caller.
A generator is more powerful than a future because it can yield multiple times. You can implement futures on top of generators.
A coroutine is more powerful than a generator because it can choose who to yield to, instead of only to the caller. For example it can yield to another coroutine. You can implement generators on top of coroutines.
Why would you use the less powerful tool, when more powerful ones are available? Sometimes the less powerful tool is the right tool for the job. It's useful to statically encode your program's invariants using types, because it can give you certainty about what something can't do.
For example, when making a REST call to a remote server, a future is probably sufficient. If the REST client exposed a generator, you'd have to deal with the possibility that it could yield multiple times, even though you know there is only going to be one result. If it exposed a coroutine, you'd have to consult the documentation to work out exactly how you're supposed to interact with it - even though you actually only need to do one thing, which is obvious when you're dealing with a future.
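To ground the "one result, then done" contract in Rust specifically: a future is just a value you can poll until it resolves exactly once. A minimal hand-written example (a sketch; real futures are rarely written by hand like this):

use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

// A one-shot future: it resolves a single time, and the type encodes
// that the caller can expect exactly one result.
struct Ready(Option<u32>);

impl Future for Ready {
    type Output = u32;

    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<u32> {
        let this = self.get_mut(); // fine because Ready is Unpin
        // Yields its value exactly once; polling again afterwards is a bug.
        Poll::Ready(this.0.take().expect("polled after completion"))
    }
}

#[tokio::main]
async fn main() {
    // The caller chooses the handler: awaiting resumes *this* function.
    let n = Ready(Some(42)).await;
    assert_eq!(n, 42);
}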

node.js modules: Async vs Fibers.promise vs Q_oper8

Just wondering if anyone could give me a comparison of the trade-offs between these modules for handling async events. Specifically, I'm interested in reasons to use Async instead of Fibers.promise, which I'm using quite extensively, at least in my test code, right now. In particular, one of the major pluses I see in Fibers.promise is that I can keep the stack chain from bifurcating, making it possible to use try { } catch { } finally, and also allowing me to ensure that after a request has been handled, the response is ended.
Is anyone using Q_oper8? I found it mentioned on another page and was wondering whether it's already dead or something I should check out.
I've never heard of Q_oper8, so I can't comment on it, but I'll come at this from the other direction. I heard about async first and Fiber (and its helper libraries) second, and I don't like the latter, actually.
The Downsides of Fiber
Unfamiliarity for other Javascript developers
Fiber introduces the concept of co-routines to Javascript via a compiled Fiber native method that takes over the interpretation of the Javascript code passed to it, intercepting calls to yield to jump back to the waiting co-routine.
This may not matter to you, but if you need to work on a team, you'll have to teach the concept to your members (or hope they have experience with the concept from other languages, like Go).
No Windows Support
In order to use Fiber or any of the libraries written on top of it, you'll have to compile it natively for your platform first. I don't use Windows, but note that Fiber is not supported on Windows, which restricts the utility of your own library off the bat. It also means you won't find general-purpose Node.js libraries written with Fiber (and you probably wouldn't have anyway, since it adds a costly compilation step that you'd otherwise avoid with async).
Browser Incompatible
This means any code you write using Fiber will not be able to run in the browser, because you can't mix native code with the browser (nor would I, as a browser user, want you to), even if everything you write is "Javascript" (it's syntactically Javascript, but semantically not).
More Difficult Debugging
While "callback hell" may be less visually pleasing, Continuation-Passing Style does have one very good thing going for it over co-routines: you know exactly where a problem occurred from the call stack and can trace backwards. Co-routines enter a function at more than one point in the program, and can exit from three kinds of calls: return, throw and yield(), where the latter is also a return point.
With co-routines, you have cross-execution between two or more functions running "simultaneously", and you may have more than one set of co-routines running at the same time on the event loop. With traditional callbacks, you're guaranteed that the outer scope of the function is static during the execution of said function, so you only need to check those outer variables once if they're needed. Co-routines need those checks to be re-run after every yield() (since the co-routine's interleaving with the originating co-routine would be translated into a callback chain in real Javascript).
Basically, I think the co-routine concept is made more difficult to work with because it has to exist inside of the Javascript event loop, rather than being a method to implement one.
What makes Async "better"?
Worse is Better
It's sort of the "worse-is-better" idea, actually. Rather than extend the Javascript language to try and get rid of its warts (and create new ones, in my opinion), Async is a pure-Javascript solution to cover them up, like makeup.
Control flow explicit
The Async functions describe different types of logic flow that need to cross the event-loop barrier. The library covers up the implementation details of the callback code needed to implement that logic; you just provide the functions it should run, in roughly the linear order they will execute across the event loop.
If you're willing to drop the first indentation level around the async methods' arguments, you have no extra indentation versus Co-Routines and only a minor number of extra lines of function(callback) { declarations, like this:
var async = require('async');
var someArray = [1, 2, 3, 4, 5, 6, 7, 8, 9];

async.forEach(someArray, function(number, callback) {
    // Do something with the number
    callback();
}, function(err) {
    // Done doing stuff, or one of the calls to the previous
    // function passed back an error that needs handling
});
In this case, you know that the outer variables your code uses can only have changed before your code runs (unless your own code changes them), so debugging is easier, and there is only one "return" mechanism: callback(). You either call back with nothing on success or pass the callback an error when something has gone wrong.
Code reuse not difficult
The above example may make code reuse look difficult, but it doesn't have to be. You can always pass in named functions as the parameters:
var async = require('async');
var someArray = [1, 2, 3, 4, 5, 6, 7, 8, 9];

// Function declarations are hoisted, so they may be defined after their
// first use; order the declarations in a way that's most readable to you
async.forEach(someArray, frazzleNumber, doneFrazzling);

function frazzleNumber(number, callback) {
    // Do something to number
    callback();
}

function doneFrazzling(err) {
    // Do something, or handle the error
}
Functional, not imperative
The async module discourages the use of imperative-style flow control and encourages (requires, for the parts that cross the event loop) the use of functions for flow control.
The advantage of the functional style is that you can easily re-use the body of your loop or your conditional, and that you can create new control-flow "verbs" that better match the flow of your code (demonstrated by the very existence of the async library), like the async.auto method, which implements dependency-graph resolution for function call order: you specify a set of named functions and list the other functions, if any, that each depends on, and auto runs the "independent" functions first, then each remaining function as soon as its dependencies have finished.
Rather than writing your code to fit the imperative style dictated by your language, you write your code as the logic of the problem dictates, and implement the "glue" control flow to get it to happen.
In Summary
Fiber, by its very nature of extending the Javascript language, cannot develop a large ecosystem within Node.js, especially when Async gets 80% of the way there in the looks department and has none of the other downsides of co-routines in Javascript.
The short answer:
Async is a pure/classic javascript solution to managing single-thread asynchronicity
Fibers is a node.js extension for creating coroutines. It includes a futures library for managing single-thread asynchronicity.
There are many other futures libraries (listed below) that don't require an extension of javascript.
Q_oper8 is a node.js module for managing multi-process concurrency
Note that none of these offer "threads" and so none can be said to do multithreading (though there is a node.js extension for that too: threads_a_gogo).
Async vs Fiber/futures
Async
Async and Fibers/futures are different ways to solve the same problem: managing asynchronously resolving dependencies. Async seems to have many more "bells and whistles" than most other libraries that try to solve this problem, which in my opinion makes it worse (much more cognitive overhead - i.e. more crap to learn).
In javascript, basic asynchronicity looks like this:
asyncCall(someParam, function(result) {
    useThe(result);
});
If you have a situation that requires more than basic asynchronicity, like needing the results of two asynchronous calls, you might do something like this:
asyncCall1(someParam, function(result0) {
    asyncCall2(someParam, function(result1) {
        use(result0, result1);
    });
});
This already starts to look like callback hell. It's also inefficient, because the second call waits for the first to complete even though it doesn't depend on it, not to mention that the code doesn't do any sort of reasonable error handling. Async provides one way to write it a little more efficiently:
async.parallel([
    function(callback) {
        asyncCall1(someParam, function(result0) {
            callback(null, result0);
        });
    },
    function(callback) {
        asyncCall2(someParam, function(result1) {
            callback(null, result1);
        });
    }
], function(err, results) {
    use(results[0], results[1]);
});
So to me, that's rather worse than callback hell, but to each his own, I suppose. Despite being ugly, it allows both calls to happen simultaneously (as long as they make non-blocking I/O calls or something like that). Async has many more options for managing asynchronous code, so if you're interested, take a look at the documentation.
Enter fiber/futures
The Fibers module includes a futures library that uses coroutines to re-inject asynchronous events back into the current continuation (future.wait()).
Fibers is different from most other futures libraries because it allows the current continuation to wait on an asynchronous event - meaning it doesn't require the use of callbacks in order for you to get a value back from an async request - allowing asynchronous code to become synchronous-like. Read about coroutines for more about that.
Node.js has I/O functions like readFileSync, which lets you wait on the function in-line while it gets the file for you. Blocking like that on an asynchronous event is not something that is normally done in javascript, and it isn't something you can write in pure javascript for your own functions - it requires an extension like Fibers.
Going back to the same asynchronous example above, this is what it would look like with fibers/futures:
var future0 = asyncCall1(someParam);
var future1 = asyncCall2(someParam);
use(future0.wait(), future1.wait());
This is drastically simpler than, and just as efficient as, the Async mess up there. It avoids callback hell in an elegant, efficient way. There are (minor) downsides though. David Ellis overstated many of them, so I'll repeat the valid ones here:
Browser Incompatibility
By virtue of Fibers being a node.js extension, it will not be compatible with browsers, which makes it impossible to share code that uses fibers between a node.js server and the browser. However, there is a strong argument that most asynchronous code you want on the server (filesystem, database, network calls) is not the same code you want in a browser (ajax calls). Maybe timeouts collide, but that seems to be about it.
Beyond that, the streamline.js project has the ability to bridge this gap. It has a compilation step that can transform streamline.js code using synchronization and futures into pure callback-style javascript, similar to the now-unsupported Narrative Javascript. Streamline.js can use a couple of different mechanisms behind the scenes: node.js Fibers, ECMAScript 6 generators, or translation into the callback-style javascript I already mentioned.
More difficult debugging
This one seems like a valid, if minor, gripe. Even if you're just planning on using fibers/futures, and not using coroutines for anything else, there might still be confusing context switches because of unexpected function exit (and entry) points.
Introduces pre-emptiveness into javascript
This is probably the most significant problem with fibers, since it has the possibility (however unlikely) of introducing hard-to-understand bugs. Basically, because a Fiber yield can cause a temporary exit from a stretch of code into some other, undetermined function, it's possible for invalid state to be read or introduced. See this article for more info. Personally, I think the incredible cleanness of fibers/futures and similar structures is well worth the rare insidious bug; many more bugs are caused by awful concurrency code.
Invalid gripes
Not on Windows: this just isn't true anymore.
Unfamiliarity with coroutines: A. Unfamiliarity is never a reason to shun something; if it's good, it's good, regardless of how familiar you are with it. B. While coroutines and yields may be unfamiliar, futures are an easy concept to understand.
Other futures libraries
There are many libraries that implement futures, where the concept may be called "futures", "deferred objects", or "promises". This includes libraries like async-future, streamline.js, Q, when.js, promiscuous, jQuery's deferred, coolaj86's futures, kriszyp's promises, and Narrative Javascript.
Most of these use callbacks to resolve the futures, which gets around many of the problems Fibers introduces. However, they aren't quite as clean as fibers/futures, though they are far cleaner than Async. Here's the same example again, using my own async-future:
var future0 = asyncCall1(someParam);
var future1 = asyncCall2(someParam);

Future.all([future0, future1]).then(function(results) {
    use(results[0], results[1]);
}).done();
Q_oper8
Q_oper8 is really a different beast. It runs jobs in a queue using a pool of processes. Since javascript is single-threaded and node.js offers no native threading, processes are the usual way to take advantage of more than one processor in node.js. Q_oper8 is intended as an alternative to managing processes with node.js's child_process module.
You should also check out Step.
It handles only a small subset of what async can do, but I think the code is much easier to read. It's great for just handling the normal case of doing a sequence of things, with some of those things happening in parallel.
I tend to use Step for the bulk of my logic, and then use async occasionally when I need to apply methods repeatedly in serial or parallel execution (i.e. call this function until a test passes, or call this function on each element of this array).
I'm using jQuery's Deferred functionality on the client and jQuery Deferred for node.js on the server in place of nested callbacks. It has greatly reduced the code and made things far more readable.
http://techishard.wordpress.com/2012/05/23/promises-promises-a-concise-pattern-for-getting-and-showing-my-json-array-with-jquery-and-underscore/
http://techishard.wordpress.com/2012/05/29/making-mongoose-keep-its-promises-on-the-server/
