A Zero-Runtime-Cost Mixed List in Rust outlines how to create a heterogeneous list in Rust using tuples and ordinary traits (not trait objects, as this question suggests). The list relies heavily on shadowing: the entire type of the list effectively changes every time a new element is added.
The implementation seems brilliant to me, but after reviewing the Rust homepage and a few other resources, I could not find anywhere that explicitly describes shadowing as zero-cost. As far as I know, repeatedly abandoning data on the stack is less costly than indirection, but repeatedly copying and adding to existing data instead of mutating it sounds pretty expensive.
What you don’t use, you don’t pay for. And further: What you do use, you couldn’t hand code any better.
Bjarne Stroustrup
Shadowing seems to fulfill the first requirement, but the second?
Is Rust's shadowing actually zero-cost?
Official Rust material tries very hard to never talk about "zero cost" by itself, so you'll have to cite where you see "zero cost" without further qualification. The article states zero runtime cost, so the author of the post is aware of that. In most cases, "zero cost" is used in the context of zero-cost abstractions.
Your Stroustrup quote only partially and obliquely deals with zero-cost abstractions. A better explanation, emphasis mine:
It means paying no penalty for the abstraction, or said otherwise, it means that whether you use the abstraction or instead go for the "manual" implementation you end up having the same costs (same speed, same memory consumption, ...).
Matthieu M.
This means that any time you see "zero-cost abstraction", you have to have something to compare the abstraction against; only then can you tell if it's truly zero-cost.
I don't think that shadowing even counts as an abstraction, but let's pretend it does (and I'll word the rest of my answer as if I believe it is).
Shadowing a variable means having multiple distinct variables with the same name, with the later ones precluding access to the previous ones. The non-"abstract" version of that is having multiple distinct variables of different names. I'd say that having two variables of the same name is the same cost as having two variables of different names, so it is a zero-cost abstraction.
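For concreteness, here's a minimal sketch of both versions; they should compile to the same code:

fn shadowed() -> i32 {
    let x = 1;
    let x = x + 1; // a second, distinct variable that happens to reuse the name `x`
    x
}

fn not_shadowed() -> i32 {
    let x = 1;
    let y = x + 1; // the same program, with a different name
    y
}

fn main() {
    assert_eq!(shadowed(), not_shadowed());
}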
See also:
Why do I need rebinding/shadowing when I can have mutable variable binding?
In Rust, what's the difference between "shadowing" and "mutability"?
Playing the game further, you can ask "is having two variables a zero-cost abstraction?". I'd say that this depends on what the variables are and how they relate.
In this example, I'd say that this is a zero-cost abstraction as there's no more efficient way I could have written the code.
fn example() {
    let a = String::new();
    let a = a; // moves the value into a new binding; the heap data is untouched
}
On the other hand, I'd say that this is not a zero-cost abstraction, as the first a will not be deallocated until the end of the function:
fn example() {
    let a = String::new();
    let a = String::new(); // the first string stays allocated until the end of the function
}
A better way I could choose to write it would be to call drop in the middle. There are good reasons that Rust doesn't do this automatically, but it's not as efficient with regard to memory usage as a hand-written implementation could be.
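A sketch of that hand-written version:

fn example() {
    let a = String::new();
    drop(a); // deallocate the first string right away
    let a = String::new();
    // the second `a` is dropped at the end of the function as usual
}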
See also:
Is it possible in Rust to delete an object before the end of scope?
Does Rust free up the memory of overwritten variables?
Is the resource of a shadowed variable binding freed immediately?
Given a Rust object, is it possible to wrap it so that multiple shared references and a mutable reference are allowed but do not cause problems?
For example, a Vec that has multiple references and a single mutable reference.
Yes, but...
The type you're looking for is RefCell, but read on before jumping the gun!
Rust is a single-ownership language. It always will be. It's exactly that feature that makes Rust as thread-safe and memory-safe as it is. You cannot fully circumvent this, short of wrapping your entire program in unsafe and using raw pointers exclusively, and if you're going to do that, just write C since you're no longer getting any benefits out of using Rust.
So, at any given moment in your program, there must either be one thing writing to this memory or several things reading. That's the fundamental law of single-ownership. Keep that in mind; you cannot get around that. What I'm about to say still follows that rule.
Usually, we enforce this with our type signatures. If I take a &T, then I'm just an alias and won't write to it. If I take a &mut T, then nobody else can see what I'm doing till I forfeit that reference. That's usually good enough, and if we can, we want to do it that way, since we get guarantees at compile-time.
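In code (uncommenting the last line produces the compile-time error shown):

fn main() {
    let mut v = vec![1, 2, 3];

    let r1 = &v; // any number of shared borrows may coexist...
    let r2 = &v;
    println!("{r1:?} {r2:?}");

    let w = &mut v; // ...but an exclusive borrow ends their usable lifetimes
    w.push(4);
    // println!("{r1:?}"); // error[E0502]: cannot borrow `v` as mutable
    //                     // because it is also borrowed as immutable
}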
But it doesn't always work that way. Sometimes we can't prove that what we're doing is okay. Sometimes I've got two functions each holding an ostensibly mutable reference, but I know, due to some other guarantee Rust doesn't know about, that only one will be writing to it at a time. Enter RefCell. RefCell<T> contains a single T and pretends to be immutable, but lets you borrow the thing inside either mutably or immutably with try_borrow_mut and try_borrow. When we call one of these functions, we get back a reference-like value that can read (and write, in the mutable case) the original data, even though we started with a &RefCell<T> that doesn't look mutable.
But the fundamental law still holds. Note that those try_* functions return a Result, i.e. they might fail. If two functions simultaneously try to get try_borrow_mut references, the second one will fail, and it's your job to deal with that eventuality (even if "deal with that" means panic! in your particular use case). All we've done is move the single-ownership rules from compile-time to runtime. We haven't gotten rid of them; we've just changed who's responsible for enforcing them.
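A minimal sketch of that runtime enforcement:

use std::cell::RefCell;

fn main() {
    let v = RefCell::new(vec![1, 2, 3]);

    // Several simultaneous readers are fine.
    let r1 = v.try_borrow().unwrap();
    let r2 = v.try_borrow().unwrap();
    assert_eq!(*r1, *r2);
    drop(r1);
    drop(r2);

    // One writer is fine, so long as no other borrow is live.
    let mut w = v.try_borrow_mut().unwrap();
    w.push(4);

    // While the writer lives, any further borrow fails at runtime,
    // and handling that Err is your job.
    assert!(v.try_borrow().is_err());
    assert!(v.try_borrow_mut().is_err());
}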
I ran across a comment on Reddit indicating that using Cell<T> prevents certain optimizations from occurring:
Cell works with no memory overhead (Cell is the same size as T) and little runtime overhead (it "just" inhibits optimisations, it doesn't introduce extra explicit operations)
This seems counter to other things I've read about Cell<T>, in particular that it's "zero-cost." The first place I encountered this categorization is here.
With all that said, I'd like to understand the actual cost of using Cell<T>, including whatever optimizations it may prevent.
TL;DR: Cell is a Zero-Overhead Abstraction; that is, the same functionality implemented manually has the same cost.
The term Zero-Cost Abstractions is not plain English; it's jargon. The idea of Zero-Cost Abstractions is that the layer of abstraction itself does not add any cost compared to manually doing the same thing.
There are various misunderstandings that have sprung up: most notably, I have regularly seen zero-cost understood as "the operation is free", which is not the case.
To add to the confusion, the exception mechanism used by most C++ implementations, and which Rust uses for panic = unwind is called Zero-Cost Exceptions, and purports1 to add no overhead on the non-throwing path. It's a different kind of Zero-Cost...
Lately, my recommendation is to switch to using the term Zero-Overhead Abstractions: first because it's a distinct term from Zero-Cost Exceptions, so less likely to be mistaken, and second because it emphasizes that the Abstraction does not add Overhead, which is what we are trying to convey in the first place.
1 The objective is only partially achieved. While the same assembly executed with and without the possibility of throwing indeed has the same performance, the presence of potential exceptions may hinder the optimizer and cause it to generate sub-optimal assembly in the first place.
With all that said, I'd like to understand the actual cost of using Cell<T>, including whatever optimizations it may prevent.
On the memory side, there is no overhead:
size_of::<Cell<T>>() == size_of::<T>(),
given a value cell: Cell<T>, &cell points to the same address as cell.as_ptr().
(You can peek at the source code)
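Both claims are easy to check; a small sketch:

use std::cell::Cell;
use std::mem::size_of;

fn main() {
    // No size overhead: Cell<T> is exactly as large as T.
    assert_eq!(size_of::<Cell<u64>>(), size_of::<u64>());

    // No indirection: the cell's address is the address of its contents.
    let cell = Cell::new(42u64);
    assert_eq!(&cell as *const Cell<u64> as usize, cell.as_ptr() as usize);
}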
On the access side, Cell<T> does incur a run-time cost compared to T; the cost of the extra functionality.
The most immediate cost is that manipulating the value through a &Cell<T> requires copying it back and forth1. This is a bitwise copy, so the optimizer may elide it, if it can prove that it is safe to do so.
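For example, the only way to "mutate" through a &Cell<T> is a read-modify-write of the whole value:

use std::cell::Cell;

fn increment(counter: &Cell<u32>) {
    // No in-place mutation through a shared reference:
    // copy the value out, modify the copy, copy it back in.
    let v = counter.get();
    counter.set(v + 1);
}

fn main() {
    let c = Cell::new(0);
    increment(&c);
    assert_eq!(c.get(), 1);
}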
Another notable cost is that UnsafeCell<T>, on which Cell<T> is based, breaks the rule that &T means T cannot be modified.
When a compiler can prove that a portion of memory cannot be modified, it can optimize out further reads: read t.foo in a register, then use the register value rather than reading t.foo again.
In traditional Rust code, a &T gives such a guarantee: no matter if there are opaque function calls, calls to C code, etc... between two reads to t.foo, the second read will return the same value as the first, guaranteed. With a &Cell<T>, there is no such guarantee any longer, and thus unless the optimizer can prove beyond doubt that the value is unmodified2, then it cannot apply such optimizations.
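A sketch of the kind of code affected (whether the fold actually happens depends on what the optimizer can prove):

use std::cell::Cell;

// With &u32, the two reads can be folded into one: a shared reference to
// non-Cell data guarantees the value cannot change, even across `opaque`.
fn read_twice(t: &u32, opaque: impl Fn()) -> (u32, u32) {
    let a = *t;
    opaque();
    let b = *t; // may reuse the register already holding `a`
    (a, b)
}

// With &Cell<u32>, `opaque` might hold another reference to the same cell
// and mutate it, so the second read must actually load from memory again.
fn read_twice_cell(t: &Cell<u32>, opaque: impl Fn()) -> (u32, u32) {
    let a = t.get();
    opaque();
    let b = t.get();
    (a, b)
}

fn main() {
    let x = 1u32;
    assert_eq!(read_twice(&x, || ()), (1, 1));

    let c = Cell::new(1u32);
    assert_eq!(read_twice_cell(&c, || c.set(2)), (1, 2));
}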
1 You can manipulate the value at no cost through &mut Cell<T> or using unsafe code.
2 For example, if the optimizer knows that the value resides on the stack, and it never passed the address of the value to anyone else, then it can reasonably conclude that no one else can modify the value. Although a stack-smashing attack may, of course.
In the section on ownership in The Rust Programming Language, Strings are represented as a structure with 3 fields (with one of the 3 fields being a pointer to the actual byte vector). There is an example:
let s1 = String::from("hello");
let s2 = s1; // s1 is moved into s2 and may no longer be used
The book explains this as copying the 3-field structure contained in s1 to s2 (but not the byte-vector) and then marking the structure contained in s1 as "invalid" (figure 4-4).
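Conceptually, that 3-field structure looks something like this (the field names are illustrative; String's actual fields are private):

struct StringRepr {
    ptr: *mut u8, // pointer to the heap-allocated bytes
    len: usize,   // number of bytes currently in use
    cap: usize,   // number of bytes allocated
}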
Why is it presented that way instead of presenting s2 as pointing to the same top-level structure as s1 and then marking s1 as "invalid"?
Would this alternate presentation result in a visible difference in semantics (or would it even cause problems)? If not, is it because it better reflects the underlying implementation? And if so, why would the implementation make such a copy operation?
Why is it presented that way
Because that's a very close (if not exact) way of modeling Rust's ownership and moving semantics.
Would this alternate presentation result in a visible difference in semantics
Yes. Rust's current semantics indicate that when a variable is moved, there's no guarantee that it remains at the same address. Your alternate presentation would suggest to readers that the address is guaranteed to be the same ("because the picture told me so!").
This cannot be the case for every move, so it's not worth teaching people misleading semantics. It's hard to pinpoint specifics, but cases I'd expect to have a higher chance of the value moving:
Transferring them across threads
Returning values from a function — although (Named) Return Value Optimization can prevent this.
When the value is "very small" — it's cheaper to copy it than to dereference memory.
why would the implementation make such a copy operation?
The implementation doesn't necessarily make a copy. While the semantics provide no guarantee that the address stays the same, they also don't enforce that it must change. In fact, the optimizer spends time attempting to minimize all sorts of needless copies where it can. The particular example in question is extremely likely to not involve any copies.
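One observable consequence: moving a String copies (at most) the 3-field header, never the heap data:

fn main() {
    let s1 = String::from("hello");
    let heap = s1.as_ptr(); // address of the heap bytes
    let s2 = s1;            // move: s1 is now invalid
    assert_eq!(heap, s2.as_ptr()); // the heap buffer itself did not move
}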
Why is immutability the default in Rust, unless you specify mut? Is this a design choice for safety? Do you consider this how it should naturally be in other languages, too?
I should probably clarify: I'm still a newbie at Rust. Is this a design choice related to another feature in the language?
The Rust Book actually addresses this topic:
There is no single reason that bindings are immutable by default, but we can think about it through one of Rust’s primary focuses: safety. If you forget to say mut, the compiler will catch it, and let you know that you have mutated something you may not have intended to mutate. If bindings were mutable by default, the compiler would not be able to tell you this. If you did intend mutation, then the solution is quite easy: add mut.
There are other good reasons to avoid mutable state when possible, but they’re out of the scope of this guide. In general, you can often avoid explicit mutation, and so it is preferable in Rust. That said, sometimes, mutation is what you need, so it’s not verboten.
Basically, it is the C++ mantra that everything you don't want to modify should be const, done properly by flipping the default. Also see this Stack Overflow question about C++.
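A tiny example of the compiler catching an unintended mutation:

fn main() {
    let x = 5;
    // x = 6; // error[E0384]: cannot assign twice to immutable variable `x`
    assert_eq!(x, 5);

    let mut y = 5; // mutation is opt-in
    y += 1;
    assert_eq!(y, 6);
}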
Learning Rust (yay!) and I'm trying to understand the intended idiomatic programming required for certain iterator patterns, while scoring top performance. Note: not Rust's Iterator trait, just a method I've written accepting a closure and applying it to some data I'm pulling off of disk / out of memory.
I was delighted to see that Rust (+LLVM?) took an iterator I had written for sparse matrix entries, and a closure for doing sparse matrix vector multiplication, written as
iterator.map_edges({ |x, y| dst[y] += src[x] });
and inlined the closure's body in the generated code. It went quite fast. :D
If I create two of these iterators, or use the first one a second time (not a correctness issue), each instance slows down quite a lot (about 2x in this case), presumably because the optimizer no longer chooses to specialize given the multiple call sites, and you end up doing a function call for each element.
I'm trying to understand if there are idiomatic patterns that keep the pleasant experience above (I like it, at least) without sacrificing the performance. My options seem to be (none satisfying this constraint):
Accept dodgy performance (2x slower is not fatal, but no prizes either).
Ask the user to supply a batch-oriented closure, so acting on an iterator over a small batch of data. This exposes a bit much of the internals of the iterator (the data are compressed nicely, and the user needs to know how to unwrap them, or the iterator needs to stage an unwrapped batch in memory).
Make map_edges generic in a type implementing a hypothetical EdgeMapClosure trait, and ask the user to implement such a type for each closure they want to inline. Not tested, but I would guess this exposes distinct methods to LLVM, each of which get nicely inlined. Downside is that the user has to write their own closure (packing relevant state up, etc).
Horrible hacks, like making distinct methods map_edges0, map_edges1, .... Or adding a generic parameter the programmer can use to make the methods distinct, but which is otherwise ignored.
Non-solutions include "just use for pair in iterator.iter() { /* */ }"; this is prep work for a data/task-parallel platform, and I would like to be able to capture/move these closures to work threads rather than capturing the main thread's execution. Maybe the pattern I should be using is to write the above, put it in a lambda/closure, and ship it around instead?
In a perfect world, it would be great to have a pattern which causes each occurrence of map_edges in the source file to result in different specialized methods in the binary, without forcing the entire project to be optimized at some scary level. I'm coming out of an unpleasant relationship with managed languages and JITs where generics would be the only way (I know of) to get this to happen, but Rust and LLVM seem magical enough that I thought there might be a good way. How do Rust's iterators handle this to inline their closure bodies? Or don't they (they should!)?
It seems that the problem is resolved by Rust's new approach to closures outlined at
http://smallcultfollowing.com/babysteps/blog/2014/11/26/purging-proc/
In short, Option 3 above (making functions generic with respect to a closure type) is now implemented transparently: when you make a method generic over the new closure traits (Fn, FnMut, FnOnce), Rust produces the closure's type behind the scenes for you.
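A sketch of what that looks like today (EdgeIter and map_edges are illustrative names, not a real API): because the method is generic over the closure type F, each distinct closure yields its own monomorphized copy of map_edges, which LLVM can inline independently.

struct EdgeIter {
    edges: Vec<(usize, usize)>,
}

impl EdgeIter {
    // Generic over F: every call site with a distinct closure type
    // gets its own specialized, independently inlinable copy.
    fn map_edges<F: FnMut(usize, usize)>(&self, mut logic: F) {
        for &(x, y) in &self.edges {
            logic(x, y);
        }
    }
}

fn main() {
    let iter = EdgeIter { edges: vec![(0, 1), (1, 2)] };
    let src = vec![1u32, 1, 1];
    let mut dst = vec![0u32; 3];
    iter.map_edges(|x, y| dst[y] += src[x]);
    assert_eq!(dst, vec![0, 1, 1]);
}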