Usize breaking zksync when compiling for ARM - rust

This is a complex issue, but I will break it down to the best of my ability. It comes down to compiling a Rust project for ARM64 (the goal is to run on a Raspberry Pi 4).
The large majority of the libraries compile (704/740), but the build breaks when it reaches a zksync directory. I am compiling the yagna client for Golem, using:
target - target.arm-unknown-linux-musleabi
linker - arm-linux-gnueabihf-ld
I'd love to hear ideas or solutions, or to find out what I am doing wrong, so I can get this project running on ARM.
The error I am getting is:
Ok(stat.blocks_available() as u64 * stat.fragment_size())
                                    ^^^^^^^^^^^^^^^^^^^^ expected `u64`, found `u32`
among other errors, all revolving around bit-width differences when converting integers. This has led me to suspect that usize is the culprit, since its size depends on the CPU architecture, which would explain why the ARM build breaks, and why the problem only shows up when the integer has to be handled (at conversion).
Let me know if there is any more information you need; I've tried my best to encapsulate the problem.

stat is a Statvfs structure; the return type of Statvfs::blocks_available() is fsblkcnt_t, and the return type of Statvfs::fragment_size() is c_ulong. These two types are defined in the libc crate, which is a paper-thin wrapper around the low-level C OS calls. The types are equivalent to the C types in the OS-specific *.h files, and their sizes vary from platform to platform.
The library you're compiling appears to have some assumptions baked in about these sizes and their arithmetic compatibility.
Some bug reports to the library authors would be in order.
If you're willing to patch the library yourself, then you should first investigate your platform's libc package and check the definitions of all types involved in the errors you're seeing. Then fix the arithmetic expressions so the types are compatible and they don't overflow.
I don't think Rust's type system is smart enough to support one body of code that handles all potential combinations of type sizes both safely (without overflow or truncation) and efficiently (without needlessly casting everything up to u128). It might be that platform-specific patches are necessary.
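To illustrate the kind of patch involved, here is a hedged sketch (the helper name is made up, and it assumes stat is the nix crate's Statvfs, consistent with the methods above) that widens both operands losslessly before multiplying, so the same expression compiles whether the underlying C types are 32 or 64 bits wide:
use nix::sys::statvfs::Statvfs;

// Sketch of a portable fix: u64::from() is a lossless widening conversion
// that also accepts u64 itself (via the identity From impl), so this
// compiles on targets where fsblkcnt_t and c_ulong are u32 as well as on
// targets where they are already u64.
fn available_bytes(stat: &Statvfs) -> u64 {
    u64::from(stat.blocks_available()) * u64::from(stat.fragment_size())
}
Unlike an as-cast, u64::from refuses to compile if a future type change would make the conversion lossy, which is exactly the class of bug these errors are about.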

Related

Can Rust code compile without the standard library?

I'm currently in the process of learning Rust. I'm mainly using The Rust Programming Language book and this nice reference which relates Rust features/syntax to C++ equivalents.
I'm having a hard time understanding where the core language stops and the standard library starts. I've encountered a lot of operators and/or traits which seem to have a special relationship with the compiler. For example, Rust has a trait (which, from what I understand, is like an interface) called Deref which lets a type implementing it be dereferenced using the * operator:
fn main() {
    let x = 5;
    let y = Box::new(x);
    assert_eq!(5, x);
    assert_eq!(5, *y);
}
Another example is the ? operator, which seems to depend on the Result and Option types.
Can code that uses those operators be compiled without the standard library? And if not, which parts of the Rust language depend on the standard library? Is it even possible to compile any Rust code without it?
The Rust standard library is in fact separated into three distinct crates:
core, which is the glue between the language and the standard library. All types, traits and functions required by the language are found in this crate. This includes operator traits (found in core::ops), the Future trait (used by async fn), and compiler intrinsics. The core crate does not have any dependencies, so you can always use it.
alloc, which contains types and traits related to or requiring dynamic memory allocation. This includes dynamically allocated types such as Box<T>, Vec<T> and String.
std, which contains the whole standard library, including things from core and alloc but also things with further requirements, such as file system access, networking, etc.
If your environment does not provide the functionality required by the std crate, you can choose to compile without it. If your environment also does not provide dynamic memory allocation, you can compile without the alloc crate as well. This option is useful for targets such as embedded systems, or for writing operating systems, where you usually won't have everything the standard library requires.
You can use the #![no_std] attribute in the root of your crate to tell the compiler to compile without the standard library (only core). Many libraries also support "no-std" compilation (e.g. base64 and futures), where functionality may be restricted but the crate still works without std.
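For illustration, here is a minimal sketch of a no_std library crate (the contents are invented for this example); everything it uses lives in core:
#![no_std]

use core::ops::Add;

// Works without std: the Add trait lives in core::ops.
pub fn sum3<T: Add<Output = T>>(a: T, b: T, c: T) -> T {
    a + b + c
}

// The ? operator also works here, because Option and the associated
// machinery live in core.
pub fn checked_sum(a: u32, b: u32) -> Option<u32> {
    Some(a.checked_add(b)?)
}
A library like this compiles as-is; a no_std binary additionally needs a few extras, such as a #[panic_handler].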
DISCLAIMER: This is likely not the answer you're looking for. Consider reading the other answers about no_std if you're trying to solve a problem. Read on only if you're interested in trivia about the inner workings of Rust.
If you really want full control over the environment you use, it is possible to use Rust without the core library using the no_core attribute.
If you decide to do so, you will run into some problems, because the compiler is integrated with some items defined in core.
This integration works by applying the #[lang = "..."] attribute to those items, making them so called "lang items".
If you use no_core, you'll have to define your own lang items for the parts of the language you'll actually use.
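As a taste of what that involves, here is a nightly-only sketch (the exact set of required lang items is unstable and changes between compiler versions, so treat this as illustrative rather than guaranteed to compile on today's nightly):
#![feature(no_core, lang_items)]
#![no_core]

// Even an empty library needs the most fundamental lang item: the
// compiler implicitly inserts Sized bounds on type parameters, so it
// has to know which trait plays that role. Newer compilers may require
// additional (or different) lang items.
#[lang = "sized"]
pub trait Sized {}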
For more information I suggest the following blog posts, which go into more detail on the topic of lang items and no_core:
Rust Tidbits: What Is a Lang Item?
Oxidizing the technical interview
So yes, in theory it is possible to run Rust code without any sort of standard library and supplied types, but only if you supply the required items yourself.
Also, this is not stable, will likely never be stabilized, and is generally not a recommended way of using Rust.
When you're not using std, you rely on core, which is a subset of the std library that is always (?) available. This is what's called a no_std environment, and it is commonly used for some types of "embedded" programming. You can find more about no_std in the Rust Embedded book, including some guidance on how to get started with no_std programming.

When should inline be used in Rust?

Rust has an "inline" attribute that can be used in one of these three flavors:
#[inline]
#[inline(always)]
#[inline(never)]
When should they be used?
In the Rust Reference, the inline attributes section says:
The compiler automatically inlines functions based on internal heuristics. Incorrectly inlining functions can actually make the program slower, so it should be used with care.
In the Rust internals forum, huon was also conservative about specifying inline.
But we see considerable usage in the Rust source, including the standard library. A lot of inline attributes are added to one-line functions, which should be easy for the compiler to spot and optimize through heuristics, according to the reference. Are those attributes in fact unnecessary?
One limitation of the current Rust compiler is that, if you're not using LTO (link-time optimization), it will never inline a function not marked #[inline] across crates. Rust uses a separate compilation model similar to C++ because LLVM's LTO implementation doesn't scale well to large projects. Therefore, small functions exposed to other crates need to be marked by hand. This isn't a great situation, and it's likely to be fixed in the future by some combination of improvements to LTO and MIR inlining.
#[inline(never)] is sometimes useful for debugging (separating a piece of code which isn't working as expected). In theory, it can be used for benchmarking, but that's usually a bad idea: turning off inlining doesn't prevent other inter-procedural optimizations like constant propagation. In terms of normal code, it can reduce code size if you have a frequently used helper function which is only used for error handling.
#[inline(always)] is generally a bad idea; if a function is big enough that the compiler won't inline it by default, it's big enough that the overhead of the call doesn't matter (and excessive inlining increases instruction cache pressure). There are exceptions, but you need performance measurements to justify it. This example is the sort of situation where it's worth considering. #[inline(always)] can also be used to improve -O0 code quality, but that's not usually worth worrying about.
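To make the cross-crate point concrete, here is a hedged sketch (function names invented) of the two cases most often worth annotating in a library crate:
// Small helper exposed from a library crate. Without LTO, a downstream
// crate sees only the signature, so the call cannot be inlined there
// unless #[inline] makes the body available across the crate boundary.
#[inline]
pub fn is_power_of_two(x: u64) -> bool {
    x != 0 && (x & (x - 1)) == 0
}

// Cold error path: keeping it out-of-line shrinks its callers and keeps
// it visible as a distinct frame in backtraces and profiles.
#[inline(never)]
pub fn report_error(msg: &str) {
    eprintln!("error: {msg}");
}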

Why can't I replace libraries distributed with GHC? What would happen if I did?

I see in this answer and this one that "everything will break horribly" and Stack won't let me replace base, but it will let me replace bytestring. What's the problem with this? Is there a way to do this safely without recompiling GHC? I'm debugging a problem with the base libraries and it'd be very convenient.
N.B. when I say I want to replace base I mean with a modified version of base from the same GHC version. I'm debugging the library, not testing a program against different GHC releases.
Most libraries are collections of Haskell modules containing Haskell code. The meaning of those libraries is determined by the code in the modules.
The base package, though, is a bit different. Many of the functions and data types it offers are not implemented in standard Haskell; their meaning is not given by the code contained in the package, but by the compiler itself. If you look at the source of the base package (and the other boot libraries), you will see many operations whose complete definition is simply undefined. Special code in the compiler's runtime system implements these operations and exposes them.
For example, if the compiler didn't offer seq as a primitive operation, there would be no way to implement seq after the fact: no Haskell term that you can write down will have the same type and semantics as seq unless it uses seq (or one of the Haskell extensions defined in terms of seq). Likewise, many of the pointer operations, ST operations, concurrency primitives, and so forth are implemented in the compiler itself.
Not only are these operations typically unimplementable, they also are typically very strongly tied to the compiler's internal data structures, which change from one release to the next. So even if you managed to convince GHC to use the base package from a different (version of the) compiler, the most likely outcome would simply be corrupted internal data structures with unpredictable (and potentially disastrous) results -- race conditions, trashing memory, space leaks, segfaults, that kind of thing.
If you need several versions of base, just install several versions of GHC. It's been carefully architected so that multiple versions can peacefully coexist on a single machine. (And in particular installing multiple versions definitely does not require recompiling GHC or even compiling GHC a first time, which seems to be your main concern.)

Discover if structure initialization doesn't modify all members

Consider the following code:
typedef struct _sMYSTRUCT_BASE
{
    int b_a;
    int b_b;
    int b_c;
} sMYSTRUCT_BASE;

typedef struct _sMYSTRUCT
{
    sMYSTRUCT_BASE base;
    int a;
    int b;
} sMYSTRUCT;

Private const sMYSTRUCT mystruct_init =
{
    0,
    1,
    3,
    4
};
I am looking for a way to generate an error (compile-time or runtime) to indicate that the structure initialization hasn't explicitly 'touched' all structure members.
There are 5 integers in the structure, but mystruct_init only has 4 values.
I know that last member (mystruct_init.b) will be zero, but I need some kind of warning/error to inform the programmer about the mistake.
This has to work on a very old compiler (maybe not even ansi-c compliant).
Modern compilers are capable of producing such a warning. In gcc, it's turned on with -Wmissing-field-initializers (which warns about initializers that exist but do not initialize all members, though not about structs with no initializer expression at all; those can at least sometimes be caught by turning on -Wuninitialized, which warns if it sees you reading a potentially uninitialized value, at least when the read happens in the same function where the variable was declared).
If your very old compiler happens to supply such a warning, you could of course just turn it on, but that seems unlikely from your description.
Your best option, I think, if you want to do an exhaustive search for them, would be to see whether you can get the code to compile with some version of gcc -- it wouldn't have to compile well enough to actually run on your target platform in order to get the warnings. I can't guarantee that it will be able to compile your pre-ANSI C code, particularly if it widely uses compiler-specific extensions, but I can at least say that support for the legacy K&R syntax is still present in the modern C standard, so I wouldn't be surprised if your code compiles better than you might think.
If that works, then to consistently produce the warnings in your IDE, you could modify the build script so that it both compiles and links the code with the real compiler you're targeting, and also compiles it (but not necessarily links it) with gcc, just to generate additional warnings that can be picked up and displayed by the IDE.
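To illustrate that approach on a reduced version of the structures above (simplified: the typedef names and the Private qualifier are dropped), here is what the extra gcc pass would flag:
/* init_check.c -- reduced sketch of the structures above.
 * gcc -c -Wmissing-field-initializers init_check.c
 * (or -Wextra, which implies it) warns that the initializer
 * leaves member 'b' uninitialized. */
struct base  { int b_a; int b_b; int b_c; };
struct outer { struct base base; int a; int b; };

static const struct outer mystruct_init = { {0, 1, 3}, 4 };  /* 'b' missing */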
The other option would be to see if you can find a compatible static analyzer that can perform such a check; I work on a tool called EnSoft Atlas that builds a data-flow graph which, together with a simple script, could be used to enforce initialization more thoroughly than the gcc warnings allow, by checking whether flow of uninitialized values to fields of structs occurs.
However, our support for C is still in beta. Atlas requires that Eclipse CDT (or JDT for Java) be able to parse your code, and the current C beta only fully supports modern strongly-typed struct initializers (i.e. struct foo f = (struct foo) {...} has fully connected data-flow, but support for the older initializer list syntax struct foo f = {...} was not implemented in our first pass), so I'm not sure it would be able to meet your needs at this time.

Why is /Wp64 deprecated?

Why is the /Wp64 flag in Visual C++ deprecated?
cl : Command line warning D9035 :
option 'Wp64' has been deprecated and will be removed in a future release
I think that /Wp64 is deprecated mainly because compiling for a 64-bit target will catch the kinds of errors it was designed to catch (/Wp64 is only valid in 32-bit compiles). The option was added back when 64-bit targets were emerging, to help people migrate their programs to 64-bit and to help detect code that wasn't '64-bit clean'.
Here's an example of the kinds of problems with /Wp64 that Microsoft just isn't interested in fixing - probably rightly so (from http://connect.microsoft.com/VisualStudio/feedback/details/502281/std-vector-incompatible-with-wp64-compiler-option):
Actually, the STL isn't intentionally incompatible with /Wp64, nor is it completely and unconditionally incompatible with /Wp64. The underlying problem is that /Wp64 interacts extremely badly with templates, because __w64 isn't fully integrated into the type system. Therefore, if vector<unsigned int> is instantiated before vector<__w64 unsigned int>, then both of them will behave like vector<unsigned int>, and vice versa. On x86, SOCKET is a typedef for __w64 unsigned int. It's not obvious, but vector<unsigned int> is being instantiated before your vector<SOCKET>, since vector<bool> is backed (in our implementation) by vector<unsigned int>.
Previously (in VC9 and earlier), this bad interaction between /Wp64 and templates caused spurious warnings. In VC10, however, changes to the STL have made this worse. Now, when vector::push_back() is given an element of the vector itself, it figures out the element's index before doing other work. That index is obtained by subtracting the element's address from the beginning of the vector. In your repro, this involves subtracting const SOCKET * - unsigned int *. (The latter is unsigned int * and not SOCKET * due to the previously described bug.) This /should/ trigger a spurious warning, saying "I'm subtracting pointers that point to the same type on x86, but to different types on x64". However, there is a SECOND bug here, where /Wp64 gets really confused and thinks this is a hard error (while adding constness to the unsigned int *).
We agree that this bogus error message is confusing. However, since it's preceded by an un-silenceable command line deprecation warning D9035, we believe that that should be sufficient. D9035 already says that /Wp64 shouldn't be used (although it doesn't go on to say "this option is super duper buggy, and completely unnecessary now").
In the STL, we could #error when /Wp64 is used. However, that would break customers who are still compiling with /Wp64 (despite the deprecation warning) and aren't triggering this bogus error. The STL could also emit a warning, but the compiler is already emitting D9035.
/Wp64 on 32-bit builds is a waste of time. It is deprecated, and this deprecation makes sense. The way /Wp64 worked on 32-bit builds is that it would look for a __w64 annotation on a type. This annotation tells the compiler that, even though the type is 32 bits wide in 32-bit mode, it is 64 bits wide in 64-bit mode. This turned out to be really flaky, especially where templates are involved.
/Wp64 on 64-bit builds is extremely useful. The documentation (http://msdn.microsoft.com/en-us/library/vstudio/yt4xw8fh.aspx) claims that it is on by default in 64-bit builds, but this is not true. Compiler warnings C4311 and C4312 are only emitted if /Wp64 is explicitly set. Those two warnings indicate when a 32-bit value is put into a pointer, or vice versa. They are very important for code correctness, and claim to be at warning level 1. I have found bugs in very widespread code that would have been stopped if the developers had turned on /Wp64 for 64-bit builds. Unfortunately, you also get the command-line warning that you have observed. I know of no way to squelch this warning, and I have learned to live with it. On the bright side, if you build with warnings as errors, this command-line warning doesn't turn into an error.
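For reference, a minimal sketch of the kind of code those two warnings catch (a hypothetical example; per the paragraph above, assume a 64-bit compile with /Wp64 explicitly enabled):
int main(void)
{
    int x = 42;
    int *p = &x;

    /* C4311: pointer truncation. unsigned long stays 32 bits wide on
       64-bit Windows (LLP64), so the high half of the pointer is lost. */
    unsigned long u = (unsigned long)p;

    /* C4312: conversion to a pointer of greater size; the round trip is
       not guaranteed to reproduce the original pointer. */
    int *q = (int *)u;

    return p == q;
}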
Because when using the 64-bit compiler from VS2010, the compiler detects 64-bit problems automatically... this switch is from back in the day when you could try to detect 64-bit problems while running the 32-bit compiler...
See http://msdn.microsoft.com/en-us/library/yt4xw8fh%28v=VS.100%29.aspx
You could link to the deprecation warning, but couldn't go to the /Wp64 documentation?
By default, the /Wp64 compiler option is off in the Visual C++ 32-bit compiler and on in the Visual C++ 64-bit compiler.
If you regularly compile your application by using a 64-bit compiler, you can just disable /Wp64 in your 32-bit compilations because the 64-bit compiler will detect all issues.
Emphasis added
