An easy way to see all (sub)dependencies of a Rust crate (online)? - rust

On crates.io we can easily see the direct dependencies of a crate by just clicking on the Dependencies tab. Is there a way to also easily see the sub-dependencies of a crate? Perhaps in a tree-like view, similar to what cargo tree would display. Or at least the number of all (sub)dependencies.
I think that can be helpful, for example, when we need to decide which crate to use among alternatives. By having an indicator of the total number of (sub)dependencies, we would have a better idea on how "heavy" a library actually is. I think that can be especially useful for a language like Rust where the build speed seems to heavily depend on the number of dependencies.

Related

What is the exact difference between a Crate and a Package?

I come from a Java background and have recently started with Rust.
The official Rust doc is pretty self-explanatory except the chapter that explains Crates and Packages.
The official doc complicates it with so many ORs and ANDs while explaining the two.
This reddit post explains it a little better, but is not thorough.
What is the exact difference between a Crate and Package in Rust? Where/When do we use them?
Much thanks!
Crates
From the perspective of the Rust compiler, "crate" is the name of the compilation unit. A crate consists of an hierarchy of modules in one or multiple files. This is in contrast to most "traditional" compiled languages like Java, C or C++, where the compilation unit is a single file.
From the perspective of an user, this definition isn't really helpful. Indeed, in most cases, you will need to distinguish between two types of crates:
binary crates can be compiled to executables by the Rust compiler. For example, Cargo, the Rust package manager, is a binary crate translated by the Rust compiler to the executable that you use to manage your project.
library crates are what you'd simply call libraries in other languages. A binary crate can depend on library crates to use functionality supplied by the libraries.
Packages
The concept of packages does not originate in the Rust compiler, but in Cargo, the Rust package manager. At least for simple projects, a package is also what you will check into version control.
A package consists of one or multiple crates, but no more than one library crate.
Creating packages
to create a new package consisting of one binary crate, you can run cargo new
to create a new package consisting of one library crate, you can run cargo new --lib
to create a package consisting of a library as well as one or multiple binaries, you can run either cargo new or cargo new --lib and then modify the package directory structure to add the other crate
When should you use crates, and when should you use packages?
As you can see now, this question doesn't really make sense – you should and must always use both. A package can't exist without at least one crate, and a crate is (at least if you are using Cargo) always part of a package.
Therefore, a better question is this:
When should you put multiple crates into one package?
There are multiple reasons to have more than one crate in a package. For example:
If you have a binary crate, it is idiomatic to have the "business logic" in a library in the same package. This has multiple advantages:
Libraries can be integration tested while binaries can't
If you later decide that the business logic needs to also be used in another binary, it is trivial to add this second binary to the package and also use the library
If you have a library crate that generates some files (a database engine or something like that), you may want to have a helper binary to inspect those files
Note that if you have a very big project, you may instead want to use the workspace feature of Cargo in these cases.

When building a Rust binary with lto=true, is there a way to limit the crates the linker examines?

The program I have ends up using over 110 crates in its build. However, the core performance gains (where 80+%) of the benefit is, resides in only a few 'crates'. Ya, I took the time to drill down through crates that used crates. Consequently, I'd like the linker to use lto options for only those 5-6 crates, rather that looking at all +110. Does anyone know if this is possible? And, if it is, how do I direct the linker to do it? Yes, the difference is just build-time, but its only going to get worse as I add more crates.

Is it documented that Cargo can download and bundle multiple versions of the same crate?

By forking and playing with some code, I noticed that Cargo can download and bundle multiple versions of the same crate in the same project (native-tls 0.1.5 and 0.2.1, for example). I have wasted so much time by looking at the documentation of the wrong version.
I have looked for some information about this behaviour but I was not able to find anything. Is this documented somewhere?
Is there an easy way to determine/detect the version used by the code you're working on (current edited file)? Or can we tell Cargo to show some warnings/prevent the build if two versions the same crate are required?
It was a conscious decision when designing Rust to allow multiple versions of the same crate.
You have probably heard of Dependency Hell before, which occurs when 2 (or more) dependencies A and B have a common dependency C but each require a version which is incompatible with the other.
Rust was designed to ensure that this would not be an issue.
In general, cargo will attempt to find a common version which satisfies all requirements. As long as crate authors use SemVer correctly, and the requirements give enough leeway, a single version of the dependency can be computed and used successfully.
Sometimes, however, multiple versions of the same dependency are necessary, such as in your case since 0.1.x and 0.2.x are considered two different major versions. In this case, Rust has two features to allow the two versions to be used within the same binary:
a unique hash per version is appended to each symbol.
the type system considers the same type Foo from two versions of C to be different types.
There is a limitation, of course. If a function of A returns an instance of C::Foo and you attempt to pass it to a function of B, the compiler will refuse it (it considers the two types to be different). This is an intractable problem1.
Anytime the dependency on C is internal, or the use of C is isolated, it otherwise works automatically. As your experience shows, it is so seamless that the user may not even realize it is happening.
1 See the dtolnay trick that crate authors can use to allow some types to be interchangeable.
Cargo can indeed link multiple versions of some crate, but only one of those versions can be a direct dependency. The others are indirect references.
The direct reference is always the version referenced by Cargo.toml and on top-level of Cargo.lock (while the indirect references are in the dependencies subsections).
I am not sure how much it is documented, unfortunately.

Options for adding a minor feature to a crate that necessitates a new dependency

I want to add a relatively small feature to crate A which necessitates depending on a second crate B. A's maintainer reasonably objects about introducing a whole new dependency for a small, although potentially useful, feature.
What are the best ways of handling this? The options I can think of are:
make a new crate with the new feature (with dependencies on both A and B)
use Cargo "features" somehow so users have to opt in to the extra dependency.
1 seems suboptimal since it's a small feature that's pretty tied in to A, and it seems annoying to have to specially depend on a third crate, but 2 seems almost as bad...
Is there some way to compile my minor feature if and only if a crate explicitly depends on both A and B?
This is exactly the case optional features solve:
Cargo supports features to allow expression of:
conditional compilation options (usable through cfg attributes);
optional dependencies, which enhance a package, but are not required; and
clusters of optional dependencies, such as postgres, that would include the postgres package, the postgres-macros package, and
possibly other packages (such as development-time mocking libraries,
debugging tools, etc.).
Unfortunately, you seem to have already dismissed this option without any explanation as to why it's "almost bad", so... I guess the only option you have left is to fork the crate and apply your changes.

How are the Haddock module fields Portability, Stability and Maintainer used?

In lots of Haddock-generated module documentation (e.g. Prelude), a small box in the top-right can be seen, containing portability, stability and maintainer information:
From looking at the source code to such modules and experimentation, I confirmed that this information is generated from lines like the following in the module description:
-- Maintainer : libraries#haskell.org
-- Stability : stable
-- Portability : portable
There are several strange things about this:
The fields only seem to work in this order — any fields put out of order are simply treat as part of the module description itself. This is despite the fact that the order in the source file is the opposite of the order in the generated documentation!
I have been unable to find any official documentation of these fields. There is a Cabal package property named stability, the example values of which match the values I've seen in the equivalent Haddock fields, but beyond that, I've found nothing.
So: How are these fields intended to be used, and are they documented anywhere?
In particular, I'd like to know:
The full list of commonly-used values for Portability and Stability. This HaskellWiki page has a list, but I'd like to know where this list originated from.
The criteria for deciding whether a module is portable or non-portable. In particular, the package I would like the answers to these questions for, acme-strfry, is an FFI binding to strfry, a function only available in glibc. Is the package non-portable, because it only works on glibc systems, or portable, because it does not use any Haskell language extensions? The common usage seems to imply the latter.
Why a specific order of fields is required in the source file, and why it's the opposite of the ordering in the generated documentation.
Oh, I thought those fields were from the cabal package description. They don't seem to be documented at all on Haddock's docs. I've found this, which doesn't really answer your question but:
http://trac.haskell.org/haddock/ticket/71
So if it's freeform anyway, why not just write "non-portable (depends on glibc)"? I've seen even "portable (depends on ghc)", which is odd. I also wonder what happens with modules that were non-portable due to non-Haskell98 extension Foo, after Foo was added to Haskell 2010.
Note that the Cabal documenation you link to also says stability is freeform. Of course, even if haddock or cabal were to define what are the acceptable values, it'd still be up to the maintainer to subjectively select one.
About the specific order, you should probably just ask at the haddock mailing list, or check the source and file a bug.
PS: strfry is an invaluable contribution to the Haskell community, but it should be pure and portable, don't you think?
Ah yes, one of the more obscure and crufty features of Haddock.
As best as I can tell, it's just an undocumented hack. There's no sane reason why the order of the fields should matter, but it does. The specific choice of formatting (i.e., as a special form inside the module comment rather than as a separate block of some kind) isn't the best either. My guess is that somebody wanted to quickly add this feature one day, so they hacked up something minimal but functioning, and left it at that. (Without bothering to document it.)
Personally, I just don't bother with these fields at all. The information is available from Cabal, so I don't bother duplicating it in Haddock as well. Perhaps some day Cabal will pass this information to Haddock automatically...

Resources