How to declare/access an extern function that uses LPCTSTR? - rust

How do I declare an external MFC function that has LPCTSTR tokens?
I have the following C function in a library I wish to use:
DLL_PUBLIC void output( LPCTSTR format, ... )
Where DLL_PUBLIC is resolved to
__declspec(dllimport) void output( LPCTSTR format, ...)
In Rust, I was able to get things to build with:
use winapi::um::winnt::CHAR;
type LPCTSTR = *const CHAR;
#[link(name="mylib", kind="static")]
extern {
#[link_name = "output"]
fn output( format:LPCTSTR, ...);
}
I'm not sure this is the best approach, but it seems to get me part way down. Although the module is declared in a DLL module, the symbol decorations in the native DLL are such that it does not have _imp prepended to it in the binary. So I find that "static" seems to look for the correct module, although I still have not been able to get it to link.
The dumpbin output for the module (this target function) in "mylib.dll" is:
31 ?output##YAXPEBDZZ (void __cdecl output(char const *,...))

This is assuming that what you are trying to accomplish here is to link Rust code against a custom DLL implementation. If that is the case, then things are looking good.
First things first, though, you'll need to do some sanity cleanup. LPCTSTR is not a type. It is a preprocessor symbol that either expands to LPCSTR (aka char const*) or LPCWSTR (aka wchar_t const*).
When the library gets built the compiler commits to either one of those, and that decision is made for all eternity. Clients, on the other hand, that #inlcude the header are still free to choose, and you have no control over this. Lucky you, if you're using C++ linkage, and have the linker save you. But we aren't using C++ linkage.
The first order of action is to change the C function signature using an explicit type, so that clients and implementation always agree. I will be using char const* here.
C++ library
Building the library is fairly straight forward. The following is a bare-bones C++ library implementation that simply outputs a formatted string to STDOUT.
dll.cpp:
#include <stdio.h>
#include <stdarg.h>
extern "C" __declspec(dllexport) void output(char const* format, ...)
{
va_list argptr{};
va_start(argptr, format);
vprintf(format, argptr);
va_end(argptr);
}
The following changes to the original code are required:
extern "C": This is requesting C linkage, controlling how symbols are decorated as seen by the linker. It's the only reasonable choice when planning to cross language boundaries.
__declspec(dllexport): This is telling the compiler to inform the linker to export the symbol. C and C++ clients will use a declaration with a corresponding __declspec(dllimport) directive.
char const*: See above.
This is all that's required to build the library. With MSVC the target architecture is implied by the toolchain used. Open up a Visual Studio command prompt that matches the architecture eventually used by Rust's toolchain, and run the following command:
cl.exe /LD dll.cpp
This produces, among other artifacts, dll.dll and dll.lib. The latter being the import library that needs to be discoverable by Rust. Copying it to the Rust client's crate's root directory is sufficient.
Consuming the library from Rust
Let's start from scratch here and make a new binary crate:
cargo new --bin client
Since we don't need any other dependencies, the default Cargo.toml can remain unchanged. As a sanity check you can cargo run to verify that everything is properly set up.
If that all went down well it's time to import the only public symbol exported by dll.dll. Add the following to src/main.rs:
#[link(name = "dll", kind = "dylib")]
extern "C" {
pub fn output(format: *const u8, ...);
}
And that's all there is to it. Again, a few details are important here, namely:
name = "dll": Specifies the import library. The .lib extension is implied, and must not be appended.
kind = "dylib": We're importing from a dynamic link library. This is the default and can be omitted, though I'm keeping it for posterity.
extern "C": As in the C++ code this controls name decoration and the calling convention. For variadic functions the C calling convention (__cdecl) is required.
*const u8: This is Rust's native type that corresponds to char const* in C and C++. Using type aliases (whether those provided by the winapi crate or otherwise) is not required. It wouldn't hurt either, but let's just keep this simple.
With that everything is set up and we can take this out for a spin. Replace the default generated fn main() with the following code in src/main.rs:
fn main() {
unsafe { output("Hello, world!\0".as_ptr()) };
}
and there you have it. cargo running this produces the famous output:
Hello, world!
So, all is fine, right? Well, no, not really. Actually, nothing is fine. You could have just as well written, compiled, and executed the following:
fn main() {
unsafe { output(b"Make sure this has reasons to crash: %f".as_ptr(), "💩") };
}
which produces the following output for me:
Make sure this has reasons to crash: 0.000000💩
though any other observable behavior is possible, too. After all, the behavior is undefined. There are two bugs: 1 The format specifier doesn't match the argument, and 2 the format string isn't NUL terminated.
Either one can be fixed, trivially even, though you have opted out of Rust's safety guarantees. Rust can't help you detect either issue, and when control reaches the library implementation, it cannot detect this either. It will just do what it was asked to do, subverting each and every one of Rust's safety guarantees.
Remarks
A few words of caution: Getting developers interested in Rust is great, and I will do my best to try whenever I get a chance to. Getting Rust-curious developers excited about Rust is often just a natural progression.
Though I will say that trying to get developers excited about Rust by starting out with unsafe Rust isn't going to be successful. It's eventually going to provoke a response like: "Look, ma, a steep learning curve with absolutely no benefit whatsoever, who could possibly resist?!" (I'm exaggerating, I know).
If your ultimate goal is to establish Rust as a superior alternative to C (and in part C++), don't start by evaluating how not to benefit from Rust. Specifically, trying to import a variadic function (the unsafest language construct in all of C++) and exposing it as an unsafe function to Rust is almost guaranteed to be the beginning of a lost battle.
Now, this may read bad as it is already, but this isn't over yet. In an attempt to make your C++ code accessible from Rust, things have gotten worse! With a C++ compiler and static code analysis tools (assuming the format string is known at compile time, and the tools understand the semantics), the tooling can and frequently will warn about mismatches. That option is now gone, forever, and there's not even a base level of protection.
If you absolutely want to make some sort of logging available to Rust, export a function from the library that takes a single char const*, use Rust's format! macro, and provide a variadic wrapper to C and C++ clients.

Related

Is it possible to export a function from rust dll which takes a reference as a parameter

I'm new to learning rust and I saw that rust uses generic lifetimes to validate references at compile time which is very nice. I didn't dig much around building a dll or shared library in rust but from what I saw a function can be exported from dll (like in c++). But if I want to export a function that takes a reference as a parameter or returns a reference other than static references, how will this work ? the dll and the app are not in the same project so the compiler can't check lifetimes in this case so this is simply not possible like exporting templated functions in c++ is not possible without instantiation but I don't know how instantiate a generic lifetime function in rust.
Why ask this question ? rust is aimed to be a system programming language and shared libraries are mandatory in any system so how a system built in rust will provide its api in pure rust without going through c ?
In c++ this is possible like in c :
struct some_type_c
{
int i;
float f;
void* p;
};
// vary between windows and unix likes
#define EXPORT some_platform_export
EXPORT void use_some_type_c(some_type_c* s) { ++s->i; }
struct some_type_cpp
{
int i;
float f;
void* p;
// may be implemented in a cpp file
EXPORT void use() { ++i; }
};
Since rust has no stable ABI, exported functions need to go through the C ABI and deal with raw pointers along the way.
When going through C, it's common practice to create more ergonomic functions that wrap the underlying unsafe FFI (foreign function interface) calls and have the right lifetime annotations in order to represent the relationship between different references.
Crates that dynamically load system libraries are often composed of two crates, one with a -sys suffix, containing the FFI bindings, and the more ergonomic version (e.g. openssl depends on openssl-sys, containing generated C-compatible types, using bindgen).
When exporting functions, the references lifetimes are lost when converting them to raw pointers.
However it's worth noting that all rust crates are statically linked at compile time, and lifetime checks are enforced across crate boundaries.

Using old std::string in gcc

I have program that uses std::string, but memmove the std::string` instances.
It worked fine until gcc 5.1.
However this no longer works as of gcc 5.3. I think developers finally did SSO with internal pointer.
I will definitely fix that, but is there easy way to fix it with some define or pragma?
Code looks similar to this:
// MyClass have std::string inside
MyClass *a = malloc(MAX * sizeof(MyClass));
// ...
// placement new on a[0]
// ...
memmove(&a[1], &a[0], sizeof(MyClass));
// ...
process(a[1]);
This is old code, please do not comment about malloc usage.
I will refactor or switch to std::vector, but I want the code to work until I do so.
You are experiencing effects of undefined behavior, but I think you know this. You cannot rely on the effects of byte-wise copying non-POD resp. not trivially copyable types, and the compiler is free to change that behavior.
I think it may be possible to define a safe overload for memmove with your class as arguments and use the copy-constructor inside it. I don't know if that is strictly legal, but you seem to be using the C-function instead of the C++ version in namespace std, so at least you are not changing namespace std which is not allowed.
void memmove(MyClass* a, MyClass* b, size_t)
{
*a = *b;
}
Strictly speaking, I think this is still undefined behavior because 17.6.4.3 of the C++ standard specifies that
If a program declares or defines a name in a context where it is
reserved, other than as explicitly allowed by this Clause, its
behavior is undefined.
In addition, all names in C library are reserved names and shall not be used by the program (17.6.4.3.2). Practically, I think this will work.
You may need to compile with -fno-builtin to prevent gcc from replacing memmove globally. If it is illegal to overwrite the function, you can replace it dynamically with LD_PRELOAD.
This is a hack solution! Your code may still not work because the compiler makes the assumption that, when you memmove it is a POD/TriviallyCopyable object and uses that for some optimisation, e.g. by assuming that after the memmove, both objects are represented by identical bytes. This is broken when you re-implement memmove with the copy-constructor.

Discover if structure initialization doesn't modify all members

Consider the following code:
typedef struct _sMYSTRUCT_BASE
{
int b_a;
int b_b;
int b_c;
} sMYSTRUCT_BASE;
typedef struct _sMYSTRUCT
{
sMYSTRUCT_BASE base;
int a;
int b;
} sMYSTRUCT;
Private const sMYSTRUCT mystruct_init =
{
0,
1,
3,
4
};
I am looking for a way to generate an error (compile-, or runtime) to indicated that the structure initialization hasn't explicitly 'touched' all structure members.
There are 5 integers in the structure, but 'mystruct_init' only have 4 values.
I know that last member (mystruct_init.b) will be zero, but I need some kind of warning/error to inform the programmer about the mistake.
This has to work on a very old compiler (maybe not even ansi-c compliant).
Modern compilers are capable of producing such a warning...in gcc, it's turned on with -Wmissing-field-initializers (which warns about initializers that exist but do not initialize all members, but not about structs with no initializer expression; these can at least sometimes be caught by turning on -Wuninitialized, which will warn you if it sees you reading a potentially uninitialized value, at least if you read it in the same function the variable was declared in).
If your very old compiler happens to supply such a warning, you could of course just turn it on, but that seems unlikely from your description.
Your best option, I think, if you want to do an exhaustive search for them, would be to see whether you can get the code to compile with some version of gcc -- it wouldn't have to compile well enough to actually run on your target platform in order to get the warnings. I can't guarantee that it will be able to compile your pre-ANSI C code, particularly if it widely uses compiler-specific extensions, but I can at least say that support for the legacy K&R syntax is still present in the modern C standard, so I wouldn't be surprised if your code compiles better than you might think.
If that works, then to consistently produce the warnings in your IDE, you could modify the build script so that it both compiles and links the code with the real compiler you're targeting, and also compiles it (but not necessarily links it) with gcc, just to generate additional warnings that can be picked up and displayed by the IDE.
The other option would be to see if you can find a compatible static analyzer that can perform such a check; I work on a tool called EnSoft Atlas that builds a data-flow graph which, together with a simple script, could be used to enforce initialization more thoroughly than the gcc warnings allow, by checking whether flow of uninitialized values to fields of structs occurs.
However, our support for C is still in beta. Atlas requires that Eclipse CDT (or JDT for Java) be able to parse your code, and the current C beta only fully supports modern strongly-typed struct initializers (i.e. struct foo f = (struct foo) {...} has fully connected data-flow, but support for the older initializer list syntax struct foo f = {...} was not implemented in our first pass), so I'm not sure it would be able to meet your needs at this time.

Is there a workaround for _Complex syntax in VC++?

I've got a library compiled with MinGW which supports the C99 Keywords, _Complex. I'd like to use this library with MSVC++ 2010 compiler. I've tried to temporarily switch off all the _Complex syntax code, so that it compiles. I found most of the other functions worked fine in MSVC++. Now I want to enable the parts with _Complex definition, but really don't know how to.
Obviously I can't recompile it in MSVC++ as the library asks for C99 features, etc. However, I feel like it is such a waste to give it up, and look for substutions, because it works perfect with most other functions.
I think I can write wrappers of the APIs that require _Complex syntax and compile it with MinGW GCC then it will be able to import into my MSVC project. But I still want to know if there is any better workaround of this problem, like what is the "standard" way people dealing with problem when compile C99 complex number syntax in VC++?
Xing.
From the C Standard (C11 §6.2.5 ¶13; C99 has approximately the same language):
Each complex type has the same representation and alignment requirements as an array
type containing exactly two elements of the corresponding real type; the first element is
equal to the real part, and the second element to the imaginary part, of the complex
number.
I don’t have the C++ Standard in front of me, but the complex type templates defined in <complex> have the same requirement; this is intended for compatibility.
You can therefore re-write C functions taking & returning values of type double _Complex as C++ functions taking & returning values of type std::complex<double>; so long as name-mangling on the C++ side has been turned off (via extern "C") both sides will be compatible.
Something like this might help:
#ifdef __cplusplus
#include <complex>
#define std_complex(T) std::complex<T>
#else
#define std_complex(T) T _Complex
#endif

Passing std::string from VC++2005 to VC++6 DLL results in garbage

I have a dynamic link library written in VC++6. I wrote some code with VC++2005 which calls the native VC++6 library. Whenever I pass std::string to the native library, the result is always garbage. However, this does not happen if I pass other types like char *, int, etc. Any ideal what is causing this?
The following code illustrates this.
// VC++ 6 Code
class __declspec(dllexport) VC6
{
public:
VC6();
void DoSomething(const std::string &s);
}
VC6()::VC6() {}
void VC6::DoSomething(const std::string &s)
{
std::cout << s; // Resulting output on screen is garbage
}
// VC++ 2005 Code
void VC2005::DoSomething()
{
VC6 *vc6 = new VC6();
std::string s("Test String");
vc6->DoSomething(s);
delete vc6;
}
Classes such as std::string aren't necessarily defined the same way in every version of the runtime library (even though they have the same name), so you shouldn't mix the libraries this way. On the other hand, types such as int and char* are the same for a given platform so you can pass them.
In your example, it's better to pass the string as a (pointer,size) pair or simply as a null-terminated string.
Edit: forgot to mention the obvious solution of using the same compiler version. Do this if you want to pass objects around.
Don't do this. It doesn't work.
C++ does not define a fixed ABI, so you can't in general pass non-POD types between libraries or translation units compiled by different compilers.
In your case, VC6 and VC8 have different definitions of std::string (and the compilers may also insert different padding and other changes), and the result is garbage, and/or unpredictable behavior and crashes.
If you need to pass data to a VC6 DLL (a better option might be to recompile that code under a sane compiler), you have to stick to types where you can be sure it'll work. That means 1) POD types (either built-in primitives such as char*, or C structs containing only POD types), or COM objects.
You can write a wrapper DLL with a C interface. A C++/CLI wrapper may be needed if p-invoke alone can not handle the interop. Writing a COM server in ATL is probably a better choice to provide an object oriented interface in native code and to avoid writing another wrapper DLL in C++/CLI).

Resources