I have a wasm binary I am trying to execute in Rust. I am not sure if it has integer overflow. Are there any tools that can help me detect integer overflow in Rust itself?
There are a few methods that can help you:
checked_add, which performs an addition. If the addition would overflow, it returns None; otherwise it returns Some(sum).
overflowing_add, which performs an addition and returns a tuple: the first element is the wrapped sum, and the second element is a boolean indicating whether an overflow happened.
You can also look at saturating_add and wrapping_add and see if they fit your needs.
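As a minimal sketch of how these methods behave (the values are arbitrary examples):

fn main() {
    let a: u8 = 250;
    let b: u8 = 10;

    // checked_add: None on overflow, Some(sum) otherwise
    assert_eq!(a.checked_add(b), None);
    assert_eq!(a.checked_add(5), Some(255));

    // overflowing_add: (wrapped sum, did it overflow?)
    assert_eq!(a.overflowing_add(b), (4, true));

    // saturating_add: clamps at the type's maximum
    assert_eq!(a.saturating_add(b), u8::MAX);

    // wrapping_add: always wraps modulo 2^n
    assert_eq!(a.wrapping_add(b), 4);
}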
Related
Rust treats signed integer overflow differently in debug and release mode: when it happens, Rust panics in debug mode but silently performs two's complement wrapping in release mode.
As far as I know, C/C++ treats signed integer overflow as undefined behavior partly because:
At the time of C's standardization, different underlying representations of signed integers, such as one's complement, might still have been in use somewhere, so compilers could not make assumptions about how overflow is handled by the hardware.
Later compilers thus make assumptions, such as that the sum of two positive integers must also be positive, in order to generate optimized machine code.
So if Rust compilers do perform the same kind of optimization as C/C++ compilers regarding signed integers, why does The Rustonomicon state:
No matter what, Safe Rust can't cause Undefined Behavior.
Or even if Rust compilers do not perform such optimization, Rust programmers still do not anticipate seeing a signed integer wrapping around. Can't it be called "undefined behavior"?
Q: So if Rust compilers do perform the same kind of optimization as C/C++ compilers regarding signed integers
Rust does not. As you noticed, it cannot perform these optimizations, because integer overflow is well defined.
For an addition in release mode, Rust will emit the following LLVM instruction (you can check on Playground):
add i32 %b, %a
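(The IR above corresponds to a plain addition, roughly this sketch of what one might paste into the Playground:)

pub fn add(a: i32, b: i32) -> i32 {
    a + b
}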
On the other hand, clang will emit the following LLVM instruction (you can check via clang -S -emit-llvm add.c):
add nsw i32 %6, %8
The difference is the nsw (no signed wrap) flag. As specified in the LLVM reference about add:
If the sum has unsigned overflow, the result returned is the mathematical result modulo 2^n, where n is the bit width of the result.
Because LLVM integers use a two’s complement representation, this instruction is appropriate for both signed and unsigned integers.
nuw and nsw stand for “No Unsigned Wrap” and “No Signed Wrap”, respectively. If the nuw and/or nsw keywords are present, the result value of the add is a poison value if unsigned and/or signed overflow, respectively, occurs.
The poison value is what leads to undefined behavior. If the flags are not present, the result is well defined as 2's complement wrapping.
Q: Or even if Rust compilers do not perform such optimization, Rust programmers still do not anticipate seeing a signed integer wrapping around. Can't it be called "undefined behavior"?
"Undefined behavior" as used in this context has a very specific meaning that is different from the intuitive English meaning of the two words. UB here specifically means that the compiler can assume an overflow will never happen and that if an overflow will happen, any program behavior is allowed. That's not what Rust specifies.
However, an integer overflow via the arithmetic operators is considered a bug in Rust. That's because, as you said, it is usually not anticipated. If you intentionally want the wrapping behavior, there are methods such as i32::wrapping_add.
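A small sketch of opting into wrapping explicitly, either with the wrapping_* methods or with the std::num::Wrapping newtype:

use std::num::Wrapping;

fn main() {
    // Explicit method call: wraps instead of panicking in debug mode
    assert_eq!(i32::MAX.wrapping_add(1), i32::MIN);

    // The Wrapping<T> newtype makes the intent part of the type itself
    let x = Wrapping(i32::MAX) + Wrapping(1);
    assert_eq!(x.0, i32::MIN);
}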
Some additional resources:
RFC 560 specifies everything about integer overflows in Rust. In short: panic in debug mode, 2's complement wrap in release mode.
Myths and Legends about Integer Overflow in Rust. Nice blog post about this topic.
Consider this example
fn main() {
    let mut i: Option<i32> = None;
    // after some processing it got some value of 55
    i = Some(55);
    println!("value is {:?}", i.unwrap());
}
In Go, nil represents the zero value of that type.
However, in Rust, None represents the absence of a value. How is absence of a value useful in practice?
When a variable of some type is declared, it must have some value, whether initialized or uninitialized. Why would one declare it with the value absent?
Also, please explain: at what point is the memory for i allocated, during the initial declaration or when i gets some value?
I might be asking a stupid question, but I want to get my head around the need for this concept.
How is absence of a value useful in practice?
A simple example is a function that looks for the first matching element in a collection. It may find it, and return it, or not find any.
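For instance, a sketch using Iterator::find:

fn main() {
    let numbers = [1, 3, 8, 5];

    // find returns Some(first match) or None if nothing matches
    let first_even = numbers.iter().find(|&&n| n % 2 == 0);
    assert_eq!(first_even, Some(&8));

    let first_negative = numbers.iter().find(|&&n| n < 0);
    assert_eq!(first_negative, None);
}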
The docs give a few more cases:
Initial values
Return values for functions that are not defined over their entire input range (partial functions)
Return value for otherwise reporting simple errors, where None is returned on error
Optional struct fields
Struct fields that can be loaned or "taken"
Optional function arguments
Nullable pointers
Swapping things out of difficult situations
Now, you may ask: why don't we use one of the values to mark an empty one? For two reasons:
There are cases where you do not have a valid "zero-value" or a valid "invalid" value. In this case, you have to use some flag somewhere else to store the fact that something is invalid.
In general, it is simpler to use the same solution everywhere than having to mark and document which is the "none" value.
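As a hypothetical illustration, compare a sentinel-based lookup with an Option-based one:

// With a sentinel, -1 means "not found"; every caller must remember to check for it.
fn index_of_sentinel(haystack: &[i32], needle: i32) -> isize {
    for (i, &x) in haystack.iter().enumerate() {
        if x == needle {
            return i as isize;
        }
    }
    -1
}

// With Option, "not found" is a distinct case that the type system makes explicit.
fn index_of(haystack: &[i32], needle: i32) -> Option<usize> {
    haystack.iter().position(|&x| x == needle)
}

fn main() {
    assert_eq!(index_of_sentinel(&[4, 5, 6], 7), -1);
    assert_eq!(index_of(&[4, 5, 6], 7), None);
    assert_eq!(index_of(&[4, 5, 6], 6), Some(2));
}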
Why would one declare it with the value absent?
This is different from initialized/uninitialized values. Option is simply a type that contains either "nothing" (None) or a "value" of some type (Some(value)).
You can conceptually see it as a struct with a flag and some space for the value itself.
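Roughly like this (a hypothetical illustration only; the real standard-library definition is the enum Option<T> { None, Some(T) }, and the actual memory layout is up to the compiler):

#[allow(dead_code)]
struct ConceptualOption<T> {
    is_some: bool, // the flag: is there a value?
    value: T,      // the space for the value itself
}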
Also, please explain: at what point is the memory for i allocated, during the initial declaration or when i gets some value?
That depends on the implementation. One could decide to implement Option using a pointer to the value, which means it could delay allocating.
However, the most likely implementation is avoiding pointers and keeping the value plus an extra flag. Note that, for some types, you can also optimize further and avoid the flag altogether. For instance, if you have an Option of a pointer, you can simply use the zero value for None. In fact, Rust does such a thing for types like Option<Box<T>>.
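You can observe this with std::mem::size_of (the sizes noted in the comments are what one would typically see on a 64-bit target):

use std::mem::size_of;

fn main() {
    // Box<T> is never null, so None can be encoded as the null pointer:
    // Option<Box<i32>> is no bigger than Box<i32> (typically 8 bytes).
    println!("Box<i32>:         {}", size_of::<Box<i32>>());
    println!("Option<Box<i32>>: {}", size_of::<Option<Box<i32>>>());

    // i32 has no spare bit patterns, so Option<i32> needs room for the flag
    // (typically 8 bytes due to alignment, versus 4 for a bare i32).
    println!("i32:              {}", size_of::<i32>());
    println!("Option<i32>:      {}", size_of::<Option<i32>>());
}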
I have to do integer arithmetic in the kernel; specifically, I need to increment a size_t object by some delta, and this will happen quite often. So I'm wondering if I need to guard against possible integer overflows in the kernel, and if so, does the kernel provide macros or APIs for this?
size_t doesn't overflow; it is an unsigned type with well-defined "wraparound" semantics. Incrementing the highest value of a size_t results in zero.
For simple operations on size_t, like adding two sizes together, it is usually enough to just check whether the result is smaller than one of the two source operands. If ((size3 = size1 + size2) < size1), you have a wrap.
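For comparison, here is the same wrap-check idiom sketched in Rust (the kernel code itself would of course be C; wrapping_add is used because a plain + would panic in debug builds):

fn add_sizes(size1: usize, size2: usize) -> Option<usize> {
    let size3 = size1.wrapping_add(size2);
    if size3 < size1 {
        None // the addition wrapped around
    } else {
        Some(size3)
    }
}

fn main() {
    assert_eq!(add_sizes(1, 2), Some(3));
    assert_eq!(add_sizes(usize::MAX, 2), None);
}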
If an unsigned type is used as a clock value which goes around a "wheel", there are macros for doing "time before" calculations correctly. For instance, we want the time 0xFFFFFFFE to be treated as being a few time units in the past w.r.t. the time 0x00000003. If you're using the "jiffies" time in the kernel, then you can use the time_before inline function, and others in that family. (Note that there are "classic jiffies" (my term) represented as long and 64 bit jiffies represented as u64, with separate functions like time_before versus time_before64).
But are there some general macros for doing math with overflow checks? Casually combing through a kernel tree (3.18.31 that I have at my convenience), it doesn't appear that way. grep -i overflow on the include subtree doesn't come up with anything and similar searches in code areas like fs reveal the use of ad hoc locally coded overflow checks. It's a shame, really; you'd think the problem of "if I add these two int values together, is there a problem" is common enough that there would be a solution in place that everyone can just use like some addv(x_int, y_int, &overflow_flag) or whatever.
integer overflow in kernel — possible?
Yes. It doesn't matter whether it's user space or kernel -- it's just how the CPU works.
I'm wondering if I need to guard against possible integer overflows in the kernel
If you think that it can happen and it's not acceptable in your case -- then yes. For signed integers it can even lead to undefined behavior.
does the kernel provide macros or APIs for this
No, there are no ready-to-use functions in the kernel for dealing with integer overflows. Well, there are some GCC wrappers for overflow detection... but be sure not to use them, or Linus Torvalds will come and yell at you, like he did here :)
Anyway, it's quite easy to detect integer overflows manually, when you really need that. Look here for example. In your case, size_t is unsigned, so you only need to ensure that it doesn't wrap, or else handle the wrapped value: details.
I am finding what I think is a very strange behaviour. Rustc panics when a variable overflows at runtime; this makes sense to me. However, it only raises a warning when a value that overflows is assigned at compile time. Shouldn't that be a compile-time error? Otherwise, the two behaviours seem inconsistent.
I expect a compile time error:
fn main() {
    let b: i32 = 3_000_000_000;
    println!("{}", b);
}
Produces:
<anon>:2:18: 2:31 warning: literal out of range for i32, #[warn(overflowing_literals)] on by default
<anon>:2 let b: i32 = 3_000_000_000;
Playground 1
This makes sense to me:
fn main() {
    let b: i32 = 30_000;
    let c: i32 = 100_000;
    let d = b * c;
    println!("{}", d);
}
Produces:
thread '<main>' panicked at 'arithmetic operation overflowed', <anon>:4
playpen: application terminated with error code 101
Playground 2
Edit:
Given the comment by FrancisGagné, and having discovered that Rust provides operations that check for overflow, for example checked_mul, I see that one needs to implement overflow checks oneself. This makes sense, because the release version should be optimized, and constantly checking for overflows could get expensive. So I no longer see the "inconsistency". However, I am still surprised that assigning a value that would overflow does not lead to a compile-time error. In Go it would: Go Playground
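(For reference, a minimal sketch of checked_mul applied to the second example:)

fn main() {
    let b: i32 = 30_000;
    let c: i32 = 100_000;

    // checked_mul returns None instead of panicking or silently wrapping
    match b.checked_mul(c) {
        Some(d) => println!("{}", d),
        None => println!("overflow!"),
    }
}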
Actually, your comments are not consistent with the behavior you observe:
in your first example: you get a compile-time warning, which you ignore, and thus the compiler deduces that you want wrapping behavior
in your second example: you get a run-time error
The Go example is similar to the first Rust example (except that Go, by design, does not have warnings).
In Rust, an underflow or overflow results in either an unspecified value or in ! ("bottom" in computer-science terms), a special value indicating that control flow diverges, which in general means either abortion or an exception.
This specification allows:
instrumenting the Debug mode to catch all overflows at the very point at which they occur
not instrumenting [1] the Release mode (and using wrapping arithmetic there)
and yet have both modes be consistent with the specification.
[1] Not instrumented by default; if you choose, and for a relatively modest performance cost outside of heavy numeric code, you can activate overflow checks in Release with a simple flag.
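For example, with the current toolchain the switch can be set either in Cargo.toml or on the rustc command line:

# In Cargo.toml, for the release profile:
[profile.release]
overflow-checks = true

# Or directly when invoking the compiler:
rustc -C overflow-checks=on main.rs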
On the cost of overflow checks: the current Rust/LLVM situation is helpful for debugging but has not really been optimized. Thus, in this framework, overflow checks have a cost. If the situation improves, then rustc might decide, one day, to activate overflow checking by default even in Release.
In Midori (a Microsoft experimental OS developed in a language similar to C#), overflow check was turned on even in Release builds:
In Midori, we compiled with overflow checking on by default. This is different from stock C#, where you must explicitly pass the /checked flag for this behavior. In our experience, the number of surprising overflows that were caught, and unintended, was well worth the inconvenience and cost. But it did mean that our compiler needed to get really good at understanding how to eliminate unnecessary ones.
Apparently, they improved their compiler so that:
it would reason about the ranges of variables, and statically eliminate bounds checks and overflow checks when possible
it would aggregate checks as much as possible (a single check for multiple potentially overflowing operations)
The latter is only to be done in Release (you lose precision) but reduces the number of branches.
So, what costs remain?
Potentially different arithmetic rules that get in the way of optimizations:
in regular arithmetic, 64 + x - 128 can be optimized to x - 64; with overflow checks activated the compiler might not be able to perform this optimization
vectorization can be hampered too, if the compiler does not have overflow checking vector built-ins
...
Still, unless the code is heavily numeric (scientific simulations or graphics, for example), the impact is likely to be modest.
UPDATE
Hm, I have an update. Apparently my huge array "unsigned long long fhash[105][100555]" was not getting initialized to zero automatically in VC++... It worked when I did = {0}. Isn't it supposed to be initialized automatically?
I'm doing contest programming, and I usually compile with g++ at school/ideone etc... but I have to use a VC++ 2010 compiler.
That said, I have code to do polynomial rolling hashing (like used in Rabin-Karp), but do these overflow differently on these compilers?
Code is here: http://pastebin.com/UFdpwHCt (hashing is around line 67)
Output is here: http://i.imgur.com/KCcvI.png
How come "bhash" is equal between the two compilers, but "fhash" isn't? They are hashed using the same method... In the G++-3 output, the "fhash" and "bhash" outputs are the same (they are supposed to be) but in the VC++-10 output the "fhash" and "bhash" aren't the same...
I'm using the overflow to let it mod itself naturally, to speed up execution, instead of explicitly modding it with a large prime.
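(In Rust, where plain + and * panic on overflow in debug builds, the same "let it wrap" trick has to be spelled out; a sketch with an arbitrary base of 31:)

fn poly_hash(s: &str) -> u64 {
    let mut h: u64 = 0;
    for b in s.bytes() {
        // wrapping_* makes the implicit "mod 2^64" explicit
        h = h.wrapping_mul(31).wrapping_add(b as u64);
    }
    h
}

fn main() {
    println!("{}", poly_hash("abracadabra"));
}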
Wasn't an overflow issue: the problem was that the array wasn't getting initialized to zero. Fixed it using memset.