How to programmatically wrap a C++ dll with Python - python-3.x

I know how to use ctypes to call a function from a C++ .dll in Python by creating a "wrapper" function that casts the Python input types to C. I think of this as essentially recreating the function signatures in Python, where the function body contains the type cast to C and a corresponding .dll function call.
I currently have a set of C++ .dll files. Each library contains many functions, some of which are overloaded. I am tasked with writing a Python interface for each of these .dll files. My current way forward is to "use the hammer I have" and go through each function, lovingly crafting a corresponding Python wrapper for each... this will involve my looking at the API documentation for each of the functions within the .dlls and coding them up one by one. My instinct tells me, though, that there may be a much more efficient way to go about this.
My question is: Is there a programmatic way of interfacing with a Windows C++ .dll that does not require crafting corresponding wrappers for each of the functions? Thanks.

I would recommend using Cython to do your wrapping. Cython lets you use C/C++ code directly with very few changes (plus some boilerplate). For wrapping large libraries, it's often straightforward to get something up and running quickly, without the per-function wrapping work that ctypes requires. It's also been my experience that Cython scales better: although it takes more up-front work to stand up Cython than ctypes, it is in my opinion more maintainable and lends itself well to the programmatic generation of wrapping code to which you allude.
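If you stay with ctypes, one way to avoid hand-writing a wrapper per function is to describe the signatures in a table and bind them in a loop. The following is only a minimal sketch: the SIGNATURES table and the bind_signatures helper are hypothetical names, and ctypes.pythonapi stands in for a real DLL so the example runs anywhere. It also assumes the functions are exported with C linkage; overloaded C++ functions have mangled names and would need an extern "C" shim first.

```python
import ctypes
import sys

# Hypothetical signature table: name -> (return type, argument types).
# In practice this could be generated from headers or API documentation.
SIGNATURES = {
    "Py_GetVersion": (ctypes.c_char_p, []),
    "Py_IsInitialized": (ctypes.c_int, []),
}


def bind_signatures(lib, signatures):
    """Set restype/argtypes on each listed function and return them by name."""
    bound = {}
    for name, (restype, argtypes) in signatures.items():
        func = getattr(lib, name)  # raises AttributeError if the symbol is missing
        func.restype = restype
        func.argtypes = argtypes
        bound[name] = func
    return bound


# ctypes.pythonapi is used here only as a library that is always present;
# for a real DLL you would pass something like ctypes.CDLL("yourlib.dll").
api = bind_signatures(ctypes.pythonapi, SIGNATURES)
print(api["Py_GetVersion"]().decode())
```

The table is the only part that grows with the library, so it is the natural thing to generate from documentation or headers.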

Related

Is it possible to "customize" python?

Can I change the core functionality of Python, for example, rewrite it to use say("Hello world") instead of print("Hello world")?
If this is possible, how can this be done?
I see a few possibilities as to how to accomplish this. I've arranged them in order of how much programming is needed/how obnoxious they are:
Renaming builtins
If, as in your example, you are simply more comfortable using say() or printf() than print(), then you can, as others have answered, just alias the builtin function to your own function with something like say=print.
Rewriting builtins
Let's pretend we don't trust the official implementation of print() and we want to implement our own. A lot of Python's internals, such as stdin and stdout, are exposed through the sys module, so you could, if you wanted, implement your own. I asked a question a couple of years ago here that discussed how to rename the _ variable to ans, which might be illuminating to take a look at.
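As a rough illustration of writing your own output function on top of sys, here is a minimal sketch. The name say and its parameters are just an example mirroring print(); this is not how the builtin is actually implemented.

```python
import sys


def say(*args, sep=" ", end="\n", file=None):
    """A hand-rolled stand-in for print() that writes to a stream directly."""
    stream = sys.stdout if file is None else file
    stream.write(sep.join(str(arg) for arg in args) + end)


say("Hello", "world")
```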
Sending your code through a preprocessor
Ok, so gcc doesn't require C code as input. If you use the right preprocessor flags, you could get away with evaluating #define macros in your source code before you send it to Python. Technically a valid answer, but obnoxious as heck.
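The same preprocessing idea can be sketched without gcc by rewriting tokens with the standard tokenize module before the code is executed. This is a swapped-in technique, not the gcc approach described above, and the ALIASES table and preprocess function are hypothetical names. Python itself never sees the alias; the rewrite happens before compilation.

```python
import io
import tokenize

# Hypothetical keyword aliases. Only NAME tokens are rewritten,
# so strings and comments are left untouched.
ALIASES = {"whilst": "while"}


def preprocess(source):
    """Return source with aliased NAME tokens replaced by real keywords."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string in ALIASES:
            tok = tok._replace(string=ALIASES[tok.string])
        tokens.append(tok)
    return tokenize.untokenize(tokens)


source = "n = 0\nwhilst n < 3:\n    n += 1\n"
namespace = {}
exec(preprocess(source), namespace)
print(namespace["n"])  # 3
```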
Writing modules in another language
CPython (the reference implementation of Python, written in C) can load extension modules written in C. You could build a wrapper for printf in C (or assembly, if you'd rather) and use that library in your Python code.
Recompiling Python
Unfortunately, doing the above is not possible with all tokens. What if, in a fit of fancy, we'd like to use whilst loops instead of while loops? Short of preprocessing the source, the only way to accomplish this is to alter Python itself. Now, this isn't for the faint of heart or the new programmer. Compilers are really complicated.
Since, however, Python is open source and you can download the source code here, in theory, you could go into the compiler and manually make all the edits you want, then compile your version of python and use that. By no means would your code be portable (as essentially you'd be making a fork of python) but you could technically do it.
Or just conform to the Python standards. That works too.
Writing a PEP
Python is a living language. It's constantly being updated. The ruling body of "What gets included" is the BDFL-Delegate and the Steering Council, but anyone can write a Python Enhancement Proposal that proposes to change the language in some way. Most features in Python started out as a PEP. See PEP 1 for more details.
Yes, you can just write:
say = print
say("hello")

Can I use the Rust lexer or parser to retrieve a list of functions within a Rust file?

The lexer/parser file located here is quite large and I'm not sure if it is suitable for just retrieving a list of Rust functions. Perhaps writing my own/using another library would be a better route to take?
The end objective would be to create a kind of execution manager. To contextualise, it would read a list of function calls wrapped in a function. The function calls within that function could then be reordered from some web interface. I thought it might be nice to manage larger applications this way.
No. I mean, not really. Whether you write your own parser or re-use syntex, you're going to hit a fundamental limitation: macros.
So let's say you go all-out and expand macro_rules!-based macros, including the ones defined in external crates (which means you'll also need to extract rustc's crate metadata loading... which isn't stable). What about procedural macros and custom derive attributes? Those are defined in code and depend on compiler-internal interfaces to function.
The only way this is likely to ever work correctly is if you build on top of the compiler, or duplicate a huge amount of work (which also involves unstable binary interfaces).
You could use syntex to parse the Rust code in a build script.
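If an approximate list is good enough, a plain text scan can find fn items that are written out literally in a file. The sketch below is a naive illustration only, written in Python rather than Rust: list_rust_functions and FN_RE are hypothetical names, and the scan cannot see functions generated by macros (the limitation described above) and will be fooled by fn appearing inside comments or string literals.

```python
import re

# Matches an optional visibility, optional qualifiers, then `fn name`.
FN_RE = re.compile(
    r'^[ \t]*(?:pub(?:\([^)]*\))?\s+)?'      # pub, pub(crate), ...
    r'(?:(?:const|async|unsafe)\s+)*'         # qualifiers
    r'(?:extern\s+"[^"]*"\s+)?'               # extern "C"
    r'fn\s+([A-Za-z_]\w*)',
    re.MULTILINE,
)


def list_rust_functions(source):
    """Return the names of fn items found literally in the source text."""
    return FN_RE.findall(source)


sample = '''
pub fn alpha() {}
fn beta(x: i32) -> i32 { x }
pub(crate) async fn gamma() {}
struct S;
impl S {
    unsafe fn delta(&self) {}
}
'''
print(list_rust_functions(sample))
```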

GObject Introspection across multiple languages

The Wiki page of the old PyGTK 2.8 binding states that an object properly written in Python
should also be easily usable from C code, or even other language bindings.
But PyGTK is outdated and should be replaced by PyGObject. Is it possible to mix and match languages with the newer introspection-based binding, too? For example, can I write a GObject class in Vala, extend it with Python and use the result in Java?
I've written a C based plugin library that does essentially this. It does use GObject Introspection and in theory it is possible. Right now there's C/C++, Python, Lua, and SeedJS all playing together in the same memory space, but I haven't tried to subclass or call anything other than C in the other languages.
Anyways, feel free to tinker if you like. GPlugin
In theory, yes, it should be possible. In practice, no, not really. Mixing multiple runtimes like that is extremely difficult, and extremely wasteful of resources. If you want your code to be usable in multiple languages, you pretty much need to write it in C or Vala. Or C++, as long as you expose a C API.
The closest thing you're really going to get is something like libpeas, where you create well-defined extension-points, and are then free to implement those extensions in whatever language you choose.

What would be involved in calling ARPACK++ (a C++ library) from Haskell?

I've spent a couple of days developing a program in Haskell, while learning the language. Now I realize that I'll need to call Arpack (a Fortran library) or Arpack++ (a C++ wrapper to Arpack) -- I can't find a good implementation of the Lanczos method with Haskell bindings. Do any more experienced Haskell programmers have an opinion of how difficult this would be?
I've been able to get ".so" ("shared object") versions of libarpack and libarpack++ installed through Ubuntu's repository, but I'm not sure that will suffice. I suspect I'm going to ultimately need to build Arpack++ from source code, which is possible, but I'm getting a lot of build errors, so it will take time. Is there any way to use just the ".so" files, without knowing exactly which version of the header files were used to generate them?
I'm considering using GreenCard, because it looks like the most well maintained Haskell/C bridge. I can't find much documentation though, so I'm wondering whether it will support C++ too.
I'm also starting to wonder whether I should rewrite my program in Python, and use scipy to call Arpack, but I've already sunk a couple of days into writing Haskell. I really like Haskell too, so I'm hoping I can make this work. I guess my overall question is this: What would be involved in making this work with Haskell?
Thanks much.
ELF is the standard format for executables and shared libraries on Linux, so accessing the code in these compiled modules is only a matter of knowing the function names. If I understand correctly, Fortran is interoperable with C. As a consequence, Fortran should be interoperable with any language that can use C bindings, including Haskell. FYI, you can find all names exported by a module (executable, shared object or simple object archive) using the nm tool (it is usually available in all Linux distros by default). This of course only works if the binary file was not "stripped", but AFAIK stripping is not common practice.
However, Haskell cannot use C++ bindings in a sane way, since C++ polymorphic features require name mangling, and the method of this name transformation is highly compiler-dependent. It is a well-known problem which is not specific to Haskell. Of course, you could try to get a list of exported symbols from the C++ shared object and then bind them using FFI, but... It isn't worth it.
As dsign said, you can use GHC's Foreign Function Interface to create bindings to foreign code. All you need is the library headers (and the library itself, of course). For a C library that would be the header files (*.h), but since your library is written in Fortran, you have to find the equivalent of header files in the library sources, refer to this page to match Fortran and C types, and then use this information to write FFI bindings. It would be helpful to write C bindings first, i.e. write a C header. Then you can even use automatic FFI binding programs like c2hs.
It may also be helpful to look through the C++ bindings. It is possible that they include the header file I've described above. If so, then writing FFI bindings will be no more difficult than writing them for any other library.
So, it is not entirely impossible, but it may require some thorough work. Writing bindings to scientific or purely computational libraries is way easier than writing them for some system library that does a lot of IO and keeps its own internal state, but since this library is not written in C... Well, it may be advisable to invest your time in easier alternatives. I cannot say anything about scipy, as I've never used it, but since Python as a language is much simpler than Haskell, it may be a good alternative.
I can tell you that using a C/Fortran library from Haskell, with the help of the Foreign Function Interface would be certainly possible and not terribly complicated. Here is an introduction. In my understanding, you should be able to call anything with a C calling convention, and perhaps even Fortran, without need of recompiling the code. The only exception is with things that look like function calls but are indeed macros, in which case you will have to figure out what the macros do and reproduce them in Haskell.
As for GreenCard, I have never used it, so I cannot vouch for it.
Your second idea of using Python could potentially save you more than a couple of days. Sad as it is, I have never managed to adapt Haskell code easily to my changing requirements, while I find that trivial in Python. Of course, that could be a limitation of my skills with Haskell or of my thinking process rather than something to blame on the language.

Is there a way to convert from a string to pure code in C++?

I know that it's possible to read from a .txt file and then convert various parts of that into string, char, and int values, but is it possible to take a string and use it as real code in the program?
Code:
string codeblock1 = "cout << \"This is a test\";";
string codeblock2 = "int array[5] = {0, 6, 6, 3, 5};";
int i;
cin >> i;
if (i)
{
    execute(codeblock1);
}
else
{
    execute(codeblock2);
}
Where execute is a function that converts from text to actual code (I don't know if there actually is a function called execute, I'm using it for the purpose of my example).
In C++ there's no simple way to do this. This feature is available in higher-level languages like Python, Lisp, Ruby and Perl (usually with some variation of an eval function). However, even in these languages this practice is frowned upon, because it can result in very unreadable code.
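For comparison, this is roughly what the question's pseudocode looks like in a language that does expose this feature. It is a minimal Python sketch: execute is a hypothetical helper mirroring the name used in the question, and choice stands in for the value read from input. Strings passed to exec run with full privileges, so this should never be fed untrusted input.

```python
# The two snippets from the question, rewritten as Python source strings.
codeblock1 = 'print("This is a test")'
codeblock2 = "array = [0, 6, 6, 3, 5]"


def execute(code):
    """Run a source string in a fresh namespace and return that namespace."""
    namespace = {}
    exec(code, namespace)  # executes arbitrary code: trusted input only
    return namespace


choice = 1  # stands in for the value read with cin >> i
result = execute(codeblock1 if choice else codeblock2)
```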
It's important that you ask yourself (and perhaps tell us) why you want to do it.
Or do you only want to know if it's possible? If so, it is, though in a hairy way. You can write a C++ source file (generate whatever you want into it, as long as it's valid C++), then compile it and link it to your code. All of this can be done automatically, of course, as long as a compiler is available to you at runtime (and you just execute it with system). I know someone who did this for some heavy optimization once. It's not pretty, but it can be made to work.
You can create a function that parses whatever strings you like and builds a data structure from them. This is known as a parse tree. You can then examine your parse tree and generate the dynamic structures needed to perform the logic therein. The parse tree is subsequently converted into a runtime representation that is executed.
All compilers do exactly this. They take your code and they produce machine code based on this. In your particular case you want a language to write code for itself. Normally this is done in the context of a code generator and it is part of a larger build process. If you write a program to parse your language (consider flex and bison for this operation) that generates code you can achieve the results you desire.
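The parse-then-execute pipeline can be illustrated on a very small scale. The sketch below is in Python and uses its standard ast module as the parser instead of flex and bison, so it shows the idea rather than the C++ tooling mentioned above. The names evaluate, run_expression and BINARY_OPS are hypothetical, and only basic arithmetic is accepted.

```python
import ast
import operator

# Whitelisted binary operators; any other node type is rejected.
BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(node):
    """Walk the parse tree and compute the value of an arithmetic expression."""
    if isinstance(node, ast.Expression):
        return evaluate(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
        return BINARY_OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -evaluate(node.operand)
    raise ValueError(f"unsupported syntax: {type(node).__name__}")


def run_expression(text):
    """Parse text into a tree, then execute the tree."""
    return evaluate(ast.parse(text, mode="eval"))


print(run_expression("2 + 3 * 4"))  # 14
```

Because the evaluator only handles the node types it knows, anything outside the small language is refused instead of executed.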
Many scripting languages offer this sort of feature, going all the way back to eval in LISP - but C and C++ don't expose the compiler at runtime.
There's nothing in the spec that stops you from creating and executing some arbitrary machine language, like so:
char code[] = { 0x2f, 0x3c, 0x17, 0x43 }; // some machine code of some sort
typedef void (*FuncType)(); // define a function pointer type
FuncType func = (FuncType)code; // treat the address of the bytes as a function
func(); // and jump to it!
but most environments will crash if you try this, for security reasons. (Many viruses work by convincing ordinary programs to do something like this.)
In a normal environment, one thing you could do is create a complete program as text, then invoke the compiler to compile it and invoke the resulting executable.
If you want to run code in your own memory space, you could invoke the compiler to build you a DLL (or .so, depending on your platform) and then link in the DLL and jump into it.
First, I want to say that I have never implemented something like that myself and I may be way off. However, did you try the CodeDomProvider class in the System.CodeDom.Compiler namespace? I have a feeling the classes in System.CodeDom can provide you with the functionality you are looking for.
Of course, it will all be .NET code, not any other platform.
Go here for sample
Yes, you just have to build a compiler (and possibly a linker) and you're there.
Several languages such as Python can be embedded into C/C++ so that may be an option.
It's kind of sort of possible, but not with just straight C/C++. You'll need some layer underneath such as LLVM.
Check out c-repl and ccons
One way that you could do this is with Boost Python. You wouldn't be using C++ at that point, but it's a good way of allowing the user to use a scripting language to interact with the existing program. I know it's not exactly what you want, but perhaps it might help.
Sounds like you're trying to create "C++Script", which doesn't exist as far as I know. C++ is a compiled language. This means it must always be compiled to native machine code before being executed. You could wrap the code as a function, run it through a compiler, then execute the resulting DLL dynamically, but you're not going to get access to anything a compiled DLL wouldn't normally get.
You'd be better off trying to do this in Java, JavaScript, VBScript, or .NET, which are at one stage or another interpreted languages. Most of these languages either have an eval or execute function for just that, or can just be included as text.
Of course executing blocks of code isn't the safest idea - it will leave you vulnerable to all kinds of data execution attacks.
My recommendation would be to create a scripting language that serves the purposes of your application. This would give the user a limited set of instructions for security reasons, and allow you to interact with the existing program much more dynamically than a compiled external block.
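A restricted command language of this kind can be very small. The following Python sketch is an illustration only, not a C++ implementation: make_interpreter, set_volume and greet are hypothetical names, and the script can only invoke the commands it has been explicitly given.

```python
import shlex


def make_interpreter(commands):
    """Build a runner that executes one whitelisted command per script line."""
    def run_script(script):
        results = []
        for lineno, line in enumerate(script.splitlines(), start=1):
            parts = shlex.split(line, comments=True)  # handles quotes and # comments
            if not parts:
                continue
            name, *args = parts
            if name not in commands:
                raise ValueError(f"line {lineno}: unknown command {name!r}")
            results.append(commands[name](*args))
        return results
    return run_script


# Hypothetical application state and commands exposed to scripts.
state = {"volume": 5}


def set_volume(level):
    state["volume"] = int(level)
    return state["volume"]


def greet(name):
    return f"hello {name}"


run_script = make_interpreter({"set_volume": set_volume, "greet": greet})
print(run_script('greet "Ada Lovelace"\n# a comment\nset_volume 7'))
```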
Not easily, because C++ is a compiled language. Several people have pointed out roundabout ways to make it work: either execute the compiler, or incorporate a compiler or interpreter into your program. If you want to go the interpreter route, you can save yourself a lot of work by using an existing open source project, such as Lua.
