The Rust compiler checks for lots of security issues (e.g. via the borrow checker and ownership rules) and prevents insecure code from even compiling. This is amazing.
However, what if hackers who want to publish malware compile their code using their own manipulated compiler, one which no longer checks any of those rules?
There will soon be lots of Rust crates in repositories, and developers will be relying on their code's security just because of the Rust compiler.
The problem of users getting shady binaries from untrusted sources is completely orthogonal to Rust's promise of memory and thread safety.
Rust aims to make it harder to accidentally write buggy code that could be exploited by a hacker, not to make it impossible to write malicious code.
Crates from crates.io are distributed as source and compiled again on your machine when you use them, so it is impossible for a crate author to introduce memory unsafety that way.
Of course, if a crate uses unsafe code, then it's up to you to be comfortable with what the author has done.
Let's break this down:
We know that the Rust compiler checks for lots of security issues (e.g. via the borrow checker and ownership rules) and prevents insecure code from even compiling. This is amazing.
Yes, this is true. The Rust compiler will enforce type and memory safety.
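As a minimal sketch of what that enforcement looks like (nothing here beyond the standard library), the following snippet is rejected at compile time because it mixes a mutable and an immutable borrow:

    fn main() {
        let mut v = vec![1, 2, 3];
        let first = &v[0];   // immutable borrow of `v`
        v.push(4);           // error[E0502]: cannot borrow `v` as mutable
                             // because it is also borrowed as immutable
        println!("{first}"); // immutable borrow is still live here
    }

These checks run on whichever machine compiles the code, which is what the next point turns on.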
However, what if hackers who want to publish malware compile their code using their own manipulated compiler, one which no longer checks any of those rules?
If you download a crate that was published on crates.io, the crate will be recompiled locally from source by your own compiler. It is true that a downloaded crate may use unsafe code that bypasses some memory-safety enforcement.
In the case of downloading a binary directly off the internet (not through crates.io), you cannot be sure that what you are downloading is what it says it is. A file marked as a Rust library may in fact be a virus written in assembly. This problem has nothing to do with Rust, or any other language.
There will soon be lots of Rust crates and libs in repositories, and developers will be relying on their code's security just because of the Rust compiler.
This is not true. Developers may be relying on the type and memory safety of their code that is enforced by the compiler. However, just because the compiler enforces safety within your code does not mean it will stop viruses or malware from infecting your device. Anything that you download off the internet can potentially harm your device, regardless of the language.
The program I have ends up using over 110 crates in its build. However, the core performance gains (80+% of the benefit) reside in only a few crates. Yeah, I took the time to drill down through crates that used crates. Consequently, I'd like the linker to use LTO options for only those 5-6 crates, rather than looking at all 110+. Does anyone know if this is possible? And, if it is, how do I direct the linker to do it? Yes, the difference is just build time, but it's only going to get worse as I add more crates.
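For context, LTO is normally configured per build profile in Cargo.toml rather than per dependency; a minimal sketch of the usual whole-build knobs (this is not the per-crate selection asked about):

    # Cargo.toml
    [profile.release]
    lto = "thin"       # or `true` ("fat") for full cross-crate LTO
    codegen-units = 1  # fewer codegen units give LTO more to work with,
                       # at the cost of longer build times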
There are a number of tricks I've learned about for "getting around" Rust's restrictions without using unsafe. For example:
Option::unwrap
RefCell
There are probably others I'm forgetting.
In cases like these, responsibility for specific aspects of correctness is shifted from the compiler to the programmer. Things that would have been compilation errors become panics, and the programmer is expected to just "know that their logic is right".
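A minimal sketch of that shift, using only the standard library: both "trap-doors" turn obligations the compiler would otherwise force you to discharge into runtime panics:

    use std::cell::RefCell;

    fn main() {
        // Option::unwrap: instead of being forced to handle the None case,
        // you assert it cannot happen, and panic if it does.
        let maybe: Option<i32> = None;
        // maybe.unwrap(); // panics: called `Option::unwrap()` on a `None` value

        // RefCell: aliasing rules are checked at run time, so a second
        // mutable borrow panics instead of failing to compile.
        let cell = RefCell::new(5);
        let _a = cell.borrow_mut();
        let _b = cell.borrow_mut(); // panics: already borrowed: BorrowMutError
    }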
Panics are better than memory corruption, but given Rust's branding as a fully-safe language, I would think these "trap-doors" would be formally marked somehow - in the type system, documentation, or otherwise - for easy identification. The programmer should know when they're using a shortcut and taking on added responsibility.
Does this kind of distinction exist? Even just an explicit list somewhere in the documentation? Is my mental model wrong, making such a thing unnecessary?
No, there is no formal distinction.
I believe you are asking if there is an effect system. While this has been talked about by compiler developers for a while, there is no consensus about whether it would truly be beneficial or detrimental in the long run.
"getting around" Rust's restrictions
These "get around" nothing. The methods themselves ensure the requirements are upheld.
shifted from the compiler to the programmer
I disagree with this assessment. Responsibility has been shifted from compile time to run time, but the compiler and the library code still ensure that safety is upheld.
using unsafe
Unsafe code truly moves the responsibility to the programmer. However, then that programmer builds safe abstractions that other programmers can make use of. Ideally, they build the abstractions using tools that are checked at compile time, helping to reduce runtime errors.
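A classic illustration, adapted from the standard library's split_at_mut (the version walked through in the Rust book): the implementation needs unsafe because the borrow checker cannot prove the two halves never overlap, yet callers get an entirely safe API:

    use std::slice;

    // Safe API, unsafe implementation: the non-overlap invariant is
    // upheld manually with raw pointers.
    fn split_at_mut(values: &mut [i32], mid: usize) -> (&mut [i32], &mut [i32]) {
        let len = values.len();
        let ptr = values.as_mut_ptr();
        assert!(mid <= len); // keeps the raw-pointer arithmetic in bounds
        unsafe {
            (
                slice::from_raw_parts_mut(ptr, mid),
                slice::from_raw_parts_mut(ptr.add(mid), len - mid),
            )
        }
    }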
Rust's branding as a fully-safe language
responsibility for specific aspects of correctness
Yes, Rust intends to be a memory-safe language, which does not mean that code written in Rust is correct. The branding emphasizes memory safety; other people assume that means things like "cannot crash", but we cannot prevent all mistaken interpretations.
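For instance, this sketch is entirely safe Rust and still aborts at run time; memory safety guarantees a controlled panic instead of an out-of-bounds read, not the absence of crashes:

    fn main() {
        let v = vec![1, 2, 3];
        let i = 99; // a logic error the compiler cannot catch
        // No memory corruption occurs; the program panics with
        // "index out of bounds" rather than reading garbage.
        println!("{}", v[i]);
    }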
See also:
Why does Rust consider it safe to leak memory?
This question was asked before Rust officially supported incremental compilation. Rust 1.24.0 and later enable incremental compilation by default for development (debug) builds.
I'm an outsider trying to see if Rust is appropriate for my projects.
I've read that Rust lacks incremental compilation (beta features notwithstanding).
Is this similar to having everything be implemented in the headers in C++ (like in much of Boost)?
If the above is correct, does this limit Rust to rather small projects with small dependencies? (If, say, Qt or KDE were header-only libraries, then programs using them would be extremely painful to develop, since you'd effectively recompile Qt/KDE every time you want to compile your own code.)
In C and C++, a compilation unit is usually a source file and all the header files it transitively includes. An application or library usually comprises multiple compilation units that are linked together. An application or library can additionally be linked with other libraries. This means that changing a source file requires recompiling that source file only and then relinking; changing an external library only requires relinking; but changing a header file (whether it's part of the project or external; the compiler can't tell the difference) requires recompiling all source files that use it and then relinking.
In Rust, the crate is the compilation unit. (A crate can be an application or a library.) Rust doesn't use header files; instead, the equivalent information is stored as metadata in the compiled crates (which is faster to parse, and has the same effect as precompiled headers in C/C++). A crate can additionally be linked with other crates. This means that changing any of the source files for a crate requires recompiling the whole crate, and changing a crate requires recompiling all crates that depend on it (currently, this means recompiling from source, even if the API happens to not have changed).
To answer your questions, no, Rust doesn't recompile all dependencies every time you recompile your project; quite the opposite in fact.
Incremental compilation in Rust is about reusing the work done in previous compilations of a crate to speed up compilation times. For example, if you change a module and it doesn't affect the other modules, the compiler is able to reuse the data that was generated when the other modules were compiled last time. The lack of incremental compilation is usually only a problem with large or complex crates (e.g. those that make heavy use of macros).
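Since Rust 1.24, incremental compilation is enabled by default for dev builds; it can also be toggled per profile, as in this minimal Cargo.toml sketch:

    # Cargo.toml
    [profile.dev]
    incremental = true   # already the default for dev builds

    [profile.release]
    incremental = true   # opt in for release builds too: faster rebuilds,
                         # possibly slightly less optimized output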
I'm looking for a way to run arbitrary Haskell code safely (or refuse to run unsafe code).
Must have:
module/function whitelist
timeout on execution
memory usage restriction
Capabilities I would like to see:
ability to kill thread
compiling the modules to native code
caching of compiled code
running several interpreters concurrently
complex datatype for compiler errors (instead of a simple message in a String)
With that sort of functionality it would be possible to implement a browser plugin capable of running arbitrary Haskell code, which is the idea I have in mind.
EDIT: I've got two answers, both great. Thanks! The sad part is that there doesn't seem to be a ready-to-go library, just a similar program. It's a useful resource, though. Anyway, I think I'll wait for 7.2.1 to be released and try to use SafeHaskell in my own program.
We've been doing this for about 8 years now in lambdabot, which supports:
a controlled namespace
OS-enforced timeouts
native code modules
caching
concurrent interactive top-levels
custom error message returns.
This set of rules is documented; see:
Safely running untrusted Haskell code
mueval, an alternative implementation based on ghc-api
The approach to safety taken in lambdabot inspired the Safe Haskell language extension work.
For approaches to dynamic extension of compiled Haskell applications, in Haskell, see the two papers:
Dynamic Extension of Typed Functional Languages, and
Dynamic applications from the ground up.
GHC 7.2.1 will likely have a new facility called SafeHaskell which covers some of what you want. SafeHaskell ensures type-safety (so things like unsafePerformIO are outlawed), and establishes a trust mechanism, so that a library with a safe API but implemented using unsafe features can be trusted. It is designed exactly for running untrusted code.
For the other practical aspects (timeouts and so on), lambdabot as Don says would be a great place to look.
I received a library from an external developer in the form of a well-defined API (in C++ and Java). What are some tests to check whether the library is thread-safe?
Basically, you can't; it's more or less impossible to test for thread-safety.
Also, if you don't have the author's guarantee that the library is thread-safe, then they aren't going to fix threading issues, so future versions might be less thread-safe.
If you've got the source code, then you can investigate common thread-safety issues: shared state, locks, etc. But if you've only got binaries, then the best you can hope for is to show that the library is not thread-safe. Even then, reproducing the problems reliably might be extremely difficult.