I'm currently in the process of learning Rust. I'm mainly using The Rust Programming Language book and this nice reference, which relates Rust features/syntax to C++ equivalents.
I'm having a hard time understanding where the core language stops and the standard library starts. I've encountered a lot of operators and/or traits which seem to have a special relationship with the compiler. For example, Rust has a trait (which, from what I understand, is like an interface) called Deref which lets a type implementing it be dereferenced using the * operator:
fn main() {
    let x = 5;
    let y = Box::new(x);
    assert_eq!(5, x);
    assert_eq!(5, *y);
}
Another example is the ? operator, which seems to depend on the Result and Option types.
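For example, my (possibly naive) understanding of ? with Result is that it early-returns the error, roughly like this:

use std::num::ParseIntError;

fn parse_doubled(s: &str) -> Result<i32, ParseIntError> {
    // `?` early-returns the Err variant; otherwise it unwraps the Ok value.
    let n: i32 = s.parse()?;
    Ok(n * 2)
}

fn main() {
    assert_eq!(parse_doubled("21"), Ok(42));
    assert!(parse_doubled("abc").is_err());
}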
Can code that uses those operators be compiled without the standard library? And if not, what parts of the Rust language depend on the standard library? Is it even possible to compile any Rust code without it?
The Rust standard library is in fact separated into three distinct crates:
core, which is the glue between the language and the standard library. All types, traits and functions required by the language are found in this crate. This includes operator traits (found in core::ops), the Future trait (used by async fn), and compiler intrinsics. The core crate does not have any dependencies, so you can always use it.
alloc, which contains types and traits related to or requiring dynamic memory allocation. This includes dynamically allocated types such as Box<T>, Vec<T> and String.
std, which contains the whole standard library, including things from core and alloc but also things with further requirements, such as file system access, networking, etc.
If your environment does not provide the functionality required by the std crate, you can choose to compile without it. If your environment also does not provide dynamic memory allocation, you can choose to compile without the alloc crate as well. This option is useful for targets such as embedded systems or for writing operating systems, where you usually won't have everything the standard library requires.
You can use the #![no_std] attribute in the root of your crate to tell the compiler to compile without the standard library (only core). Many libraries also support "no-std" compilation (e.g. base64 and futures), where functionality may be restricted but will still work when compiling without the std crate.
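For illustration, here is a minimal sketch of what a #![no_std] library crate can look like (the function is made up for the example):

#![no_std]

use core::ops::Add;

// Only `core` is available here: no Vec, String, println! or file I/O.
// Opting into `extern crate alloc;` (with an allocator provided by the
// final binary) would bring back heap types like Box, Vec and String.
pub fn double<T: Add<Output = T> + Copy>(x: T) -> T {
    x + x
}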
DISCLAIMER: This is likely not the answer you're looking for. Consider reading the other answers about no_std if you're trying to solve a concrete problem; read on only if you're interested in trivia about the inner workings of Rust.
If you really want full control over the environment you use, it is possible to use Rust without the core library using the no_core attribute.
If you decide to do so, you will run into some problems, because the compiler is integrated with some items defined in core.
This integration works by applying the #[lang = "..."] attribute to those items, making them so-called "lang items".
If you use no_core, you'll have to define your own lang items for the parts of the language you'll actually use.
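As a rough, nightly-only sketch (the exact set of required lang items changes between compiler versions, so treat this as illustrative rather than a working recipe):

#![feature(no_core, lang_items)]
#![no_core]

// With no_core, even the most basic traits the compiler relies on must be
// supplied by hand as lang items. Depending on the nightly version and on
// which features you use, more lang items than these will be required.
#[lang = "sized"]
pub trait Sized {}

#[lang = "copy"]
pub trait Copy {}

pub fn identity(x: i32) -> i32 {
    x
}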
For more information I suggest the following blog posts, which go into more detail on the topic of lang items and no_core:
Rust Tidbits: What Is a Lang Item?
Oxidizing the technical interview
So yes, in theory it is possible to run Rust code without any sort of standard library and supplied types, but only if you then supply the required types yourself.
Also, this is not stable, will likely never be stabilized, and is generally not a recommended way of using Rust.
When you're not using std, you rely on core, a subset of the std library that is (almost) always available. This is what's called a no_std environment, which is commonly used for some types of "embedded" programming. You can find more about no_std in the Rust Embedded book, including some guidance on how to get started with no_std programming.
I'm pretty new to Rust and have been working on some mathematical problems. For one of these problems I needed ceilf32 and sqrtf32. I was surprised to find that these functions are unsafe; both are fairly simple mathematical functions and my understanding is that unsafe Rust is used only as necessary to work around either the conservatism of the compiler or to allow inherently unsafe OS operations. I can't see any reason either function would run into either issue, thus I can't understand what would stop them being implemented with memory safety.
Could someone please enlighten me?
The functions you're looking at are in core::intrinsics, which are low-level compiler instructions. I don't see any official documentation on why they're marked unsafe, but my guess is that all of the compiler intrinsics were marked that way as a rule, since they're lower-level than most of Rust proper.
Regardless, for normal operation, you're looking for the inherent methods f32::ceil and f32::sqrt. These are the Rust standard library implementations that presumably[1] call the intrinsics under the hood, and these methods are not marked unsafe.
Since they're inherent methods, you can either call them on f32 objects (my_number.sqrt()) or directly with the namespace (f32::sqrt(my_number)).
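For instance, this compiles without any unsafe block:

fn main() {
    let x: f32 = 2.0;
    // Safe inherent methods; no `unsafe` needed.
    let a = x.sqrt();        // method-call syntax
    let b = f32::sqrt(x);    // fully-qualified syntax
    let c = x.ceil();
    println!("{a} {b} {c}");
}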
[1] In fact, a look at the source code for the current implementations indicates that both of these simply delegate to their intrinsic counterpart, wrapping it in an unsafe block to guarantee safety.
If you check the language percentages in the GitHub repository for the Rust compiler, it says that 97.6% of the Rust compiler is written in Rust. So how exactly does this work? How can you create a programming language (I think this is really about the compiler, since that's what reads the code, isn't it?) written in itself?
This is called self-hosting or bootstrapping. The basic idea goes like this:
1. Write an initial compiler for a small subset of Rust using your Other Programming Language of Choice. You now have compiler C0.
2. Using the subset of Rust you have a compiler for, rewrite the source for C0 purely in Rust. Compile that program using compiler C0 to form compiler C1.
3. Add features to Rust by adding code to the compiler you just wrote to properly parse and implement those features. Compile that Rust program with C1 to form compiler C2.
By repeating step (3) as many times as you’d like, you can add progressively more and more features to the Rust language, with the Rust compiler always being written in Rust itself.
There’s a famous talk called Reflections on Trusting Trust that talks about how this process works, as well as how you can use this process to do Nefarious Things.
These two traits (std::ops::Add, core::ops::Add) provide the same functionality, and they both use the same example (both utilize std::ops::Add). Their sets of implementors differ somewhat.
Should one default to using std::ops::Add? Why do both, as opposed to one, of them exist?
There aren't two traits. There is one trait which is exported under several interchangeable names. This is far from unique. Virtually everything in core is also exported from std, and virtually always under exactly the same path (i.e., you can just replace the "core" prefix with "std").
As for which one you should use: If you have a reason to not link to the standard library (#![no_std]), then the std::* one isn't available so obviously you use core::*. If on the other hand you do use the standard library, you should use the std::* re-export. It is more customary and requires less typing.
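To see that the two paths really name the same trait: a bound written against one path is satisfied by an implementation written against the other. A small, contrived sketch (the Meters type is made up for the example):

use core::ops::Add;

#[derive(Clone, Copy, PartialEq, Debug)]
struct Meters(f64);

// Implement the trait via its `std` path...
impl std::ops::Add for Meters {
    type Output = Meters;
    fn add(self, rhs: Meters) -> Meters {
        Meters(self.0 + rhs.0)
    }
}

// ...and use it through a bound on the `core` path: same trait, same impl.
fn total<T: Add<Output = T>>(a: T, b: T) -> T {
    a + b
}

fn main() {
    assert_eq!(total(Meters(1.0), Meters(2.0)), Meters(3.0));
}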
They're in fact exactly the same, despite the set of implementors being listed as slightly different.
The core library is designed for bare-metal/low-level tasks, and is thus more barebones than what std can provide by assuming an operating system exists. However, people using std will want the stuff that's in core too (e.g. Add or Option or whatever), and so to avoid having to load both std and core, std reexports everything from core, via pub use. That is, std provides aliases/import paths for the things in core.
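The mechanism is just pub use, which any crate can employ; a tiny sketch (the crate and module names are invented, this is not the actual std source):

// In a hypothetical crate `mystd`:
pub mod prelude {
    // `pub use` re-exports an item under a new path. Downstream code can
    // now name it `mystd::prelude::Add`, yet it is still the very same
    // trait as `core::ops::Add`.
    pub use core::ops::Add;
}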
There are some unfortunate error messages where the compiler points to the original source of an item, not the reexport, which might not be in a crate you're extern crateing.
Is it possible to compile a Rust library crate so that the user can't see the source code but can still use the library?
If it is, are all the generics provided as "Source code" or some IR, or does Rust implement generics differently from C++ templates?
A lot of metadata is included with each library crate, be it statically linked (.rlib) or dynamically linked (.so/.dylib/.dll):
module structure
exported macro_rules macros
type and trait definitions
constants with their initializer expressions
signatures for all functions
the entire body of each function that is marked as #[inline] or is generic (default trait methods are considered generic over Self)
All of this is enough to reproduce some of the original source (how much depends on the usage of generics), albeit with no comments or other whitespace.
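As a rough illustration of which function bodies end up in the metadata (function names invented for the example):

// Hypothetical library crate `mylib`:

pub fn opaque(x: u32) -> u32 {
    // Not generic, not #[inline]: only the signature is exported;
    // the body stays as machine code in the compiled artifact.
    x ^ 0xdead_beef
}

#[inline]
pub fn shipped_inline(x: u32) -> u32 {
    // #[inline]: the body is serialized into the crate metadata.
    x.wrapping_add(1)
}

pub fn shipped_generic<T: Clone>(x: &T) -> T {
    // Generic: the body must be shipped so downstream crates
    // can monomorphize it for their own types.
    x.clone()
}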
The function bodies are serialized in the compiler's internal AST structure - you can see a pretty form of it with rustc -Z ast-json lib.rs.
While the metadata is binary rather than JSON, using librustc to extract all exported function definitions from a compiled crate and pretty-print the ASTs is fairly easy.
In the future, there might not be any AST past type-checking, so the metadata would encode an IR of sorts – one possibility is CFG, i.e. "control flow graph", which is already used internally in a couple places.
However, that would still expose more information than Java bytecode does; even though the switch would be an optimization, you could still approximate the original code (and easily get something which compiles).
As such, there are only two options I can recommend:
expose a C API; it has the advantage of being a stable ABI, but it's quite limiting and brittle;
expose a Rust API using only trait objects, instead of generics; this way you get to keep memory safety and all monomorphic functions would still work normally, but trait objects (dynamic dispatch) cannot express all the patterns possible with generics: in particular, generic trait methods are not callable on trait objects (C++ should have a similar restriction for mixing template and virtual, with workarounds potentially available on a case-by-case basis).
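A tiny sketch of the second option (the trait and functions are invented for the example): the public surface uses trait objects, so downstream crates never need the bodies of generic functions.

// Public API exposed by the hypothetical library:
pub trait Codec {
    fn encode(&self, input: &[u8]) -> Vec<u8>;
}

// Monomorphic entry point: it works through dynamic dispatch only,
// so its body never has to be shipped to users in the metadata.
pub fn encode_all(codec: &dyn Codec, chunks: &[&[u8]]) -> Vec<Vec<u8>> {
    chunks.iter().map(|c| codec.encode(c)).collect()
}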
I wondered if there is a programming language which compiles to machine code/binary (not bytecode that is then executed by a VM; that's something completely different when considering typing) and features dynamic and/or weak typing, e.g.:
Think of a compiled language where:
Variables don't need to be declared
Variables can be created during runtime
Functions can return values of different types
Questions:
Is there such a programming language?
If not, why not?
I think that a dynamically yet strongly typed, compiled language would really make sense, but is it possible?
I believe Lisp fits that description.
http://en.wikipedia.org/wiki/Common_Lisp
Yes, it is possible. See Julia. It is a dynamic language (you can write programs without types) but it never runs on a VM. It compiles the program to native code at runtime (JIT compilation).
Objective-C might have some of the properties you seek. Classes can be opened and altered in runtime, and you can send any kind of message to an object, whether it usually responds to it or not. In that way, you can implement duck typing, much like in Ruby. The type id, roughly equivalent to a void*, can be endowed with interfaces that specify a contract that the (otherwise unknown) type will adhere to.
C# 4.0 has many, if not all of these characteristics. If you really want native machine code, you can compile the bytecode down to machine code using a utility.
In particular, the use of the dynamic keyword allows objects and their members to be bound dynamically at runtime.
Check out Anders Hejlsberg's video, The Future of C#, for a primer:
http://channel9.msdn.com/pdc2008/TL16/
Objective-C has many of the features you mention: it compiles to machine code and is effectively dynamically typed with respect to object instances. The id type can store any class instance and Objective-C uses message passing instead of member function calls. Methods can be created/added at runtime. The Objective-C runtime can also synthesize class instance variables at runtime, but local variables still need to be declared (just as in C).
C# 4.0 has many of these features, except that it is compiled to IL (bytecode) and interpreted using a virtual machine (the CLR). This brings up an interesting point, however: if bytecode is just-in-time compiled to machine code, does that count? If so, it opens the door to not only any of the .NET languages, but Python (see PyPy or Unladen Swallow or IronPython) and Ruby (see MacRuby or IronRuby) and many other dynamically typed languages, not to mention many LISP variants.
In a similar vein to Lisp, there is Factor, a concatenative* language with no variables by default, dynamic typing, and a flexible object system. Factor code can be run in the interactive interpreter, or compiled to a native executable using its deploy function.
* point-free functional stack-based
VB 6 has most of that
I don't know of any language that has exactly those capabilities. I can think of two that have a significant subset, though:
D has type inference, garbage collection, and powerful metaprogramming facilities, yet compiles to efficient machine code. It does not have dynamic typing, however.
C# can be compiled directly to machine code via the mono project. C# has a similar feature set to D, but again without dynamic typing.
Python compiled to C probably meets these criteria.
Write in Python.
Compile Python to Executable. See Process to convert simple Python script into Windows executable. Also see Writing code translator from Python to C?
Elixir does this. The flexibility of dynamic variable typing helps with doing hot-code updates (for which Erlang was designed). Files are compiled to run on the BEAM, the Erlang/Elixir VM.
C/C++ both indirectly support dynamic typing using void*. C++ example:
#include <cstdlib>
#include <new>
#include <string>

int main() {
    // Reuse one untyped allocation for values of different types.
    void* x = std::malloc(sizeof(int));
    *static_cast<int*>(x) = 5;
    std::free(x);

    x = std::malloc(sizeof(std::string));
    // Non-trivial types must be constructed in place (placement new)...
    std::string* s = new (x) std::string("Hello world");
    // ...and destroyed explicitly before the memory is freed.
    s->~basic_string();
    std::free(x);
    return 0;
}
In C++17, std::any can be used as well:
#include <string>
#include <any>

int main() {
    std::any x = 5;
    x = std::string("Hello world");
    return 0;
}
Of course, duck typing is rarely used or needed in C/C++, and both of these options have issues (void* is unsafe, and std::any carries type-erasure and possible heap-allocation overhead).
Another example of what you may be looking for is the V8 engine for JavaScript. It is a JIT compiler, meaning the source code is compiled to bytecode and then machine code at runtime, although this is hidden from the user.