How to resolve current process symbols in LLVM MCJIT based JIT? - rust

I'm creating a simple MCJIT-based JIT (implementing the Kaleidoscope tutorial in Rust, to be more precise). I'm using SectionMemoryManager::getSymbolAddress for symbol resolution. It sees symbols from libraries (e.g. the sin function), but fails to resolve functions from my own program (global, visible with nm, marked there by T). Is this the expected behavior, or is there an error in my code?
If this is the expected behavior, how should I properly resolve symbols from the current process? I'm adding symbols from the process with LLVMAddSymbol now, which makes resolution work. Is this the right solution?
For those who read my code: the problem is not related to name mangling. When I was trying to make SectionMemoryManager::getSymbolAddress work, I used the no_mangle attribute, so the symbols were named properly.

Thanks to Lang Hames, who answered my question elsewhere. I'm quoting the answer here in case somebody runs into the same problem:
In answer to your question: SectionMemoryManager::getSymbolAddress eventually (through the RTDyldMemoryManager base class) makes a call to llvm::sys::DynamicLibrary::SearchForAddressOfSymbol, which searches all previously loaded dynamic libraries for the symbol. You can call llvm::sys::DynamicLibrary::LoadLibraryPermanently(nullptr) as part of your JIT initialisation (before any calls to getSymbolAddress) to import the program's symbols into DynamicLibrary's symbol tables.
If you really want to expose all functions in your program to the JIT'd code this is a good way to go. If you only want to expose a limited set of runtime functions you can put them in a shared library and just load that.

Related

Dynamically loading CUDA [closed]

I'm trying to add CUDA functionality to an existing code. The desired result is that if the user has cuda runtime installed on their machine, the code will use their cuda runtime (using dlopen) to check if a CUDA enabled GPU is available and then run the CUDA code on it if that's true. Otherwise, run the original non-GPU accelerated code. However, there are some gaps in my understanding of libraries and CUDA that make this tricky for me.
The code compiles just fine if I specify the location of the required CUDA libraries (cudart and cublas) and dynamically link them. However, when I instead tried not linking these libraries and wrapping everything I need with dlopen and dlsym to get handles to the functions, compilation fails when it gets to actual device code (the definitions behind the angle-bracket <<<...>>> launches), because it's looking for things like __cudaRegisterFunction at compile time. I've replaced the angle-bracket calls with a wrapped version of cudaLaunchKernel but still get this issue, possibly because the compiled device code itself requires some special registration calls.
Some fundamental things I'm unsure about concern when the symbols in a shared library have to be resolved. For example, if the user does not have cudart.so, can I simply not run any cudart/CUDA code and avoid runtime errors about unresolved references to functions from that library? Or do all cudart.so functions need to be found regardless of whether they're used? If only the functions actually called need to be resolved, wouldn't that obviate the need for wrapping functions via dlopen/dlsym? A related question: can you compile CUDA code without linking to cudart? I may be conflating two separate issues, in that it might be necessary to link against cudart.so when compiling CUDA code even if cudart.so is never actually used at runtime.
It's entirely possible I'm going about this the entirely wrong way so hopefully the general statement of what I'm trying to do can get me to some working answer.

Relation between MSVC Compiler & linker option for COMDAT folding

This question has some answers on SO, but mine is slightly different. Before marking it as a duplicate, please give it a shot.
MSVC has always provided the /Gy compiler option to package individual functions into COMDAT sections. At the same time, the linker provides the /OPT:ICF option. Is my understanding right that these two options must be used in conjunction? That is, while the former packages functions into COMDATs, the latter eliminates redundant COMDATs. Is that correct?
If yes, then should we either use both or turn off both?
Answer from someone who communicated with me off-line. Helped me understand these options a lot better.
===================================
That is essentially true. Suppose we talk just C, or C++ with no member functions. Without /Gy, the compiler creates object files that are in some sense irreducible: if the linker wants just one function from the object, it gets them all. This is especially a consideration when writing libraries; if you mean to be kind to the library's users, you should write your library as lots of small object files, typically one non-static function per object, so that the user of the library doesn't bloat from having to carry code that never executes.
With /Gy, the compiler creates object files that have COMDATs. Each function is in its own COMDAT, which is to some extent a mini-object. If the linker wants just one function from the object, it can pick out just that one. The linker's /OPT switch gives you some control over what the linker does with this selectivity - but without /Gy there's nothing to select.
Or very little. It's at least conceivable that the linker could, for instance, fold functions that are each the whole of the code in an object file and happen to have identical code. It's certainly conceivable that the linker could eliminate a whole object file that contains nothing that's referenced. After all, it does this with object files in libraries. The rule in practice, however, used to be that if you add a non-COMDAT object file to the linker's command line, then you're saying you want that in the binary even if unreferenced. The difference between what's conceivable and what's done is typically huge.
Best, then, to stick with the quick answer. The linker options benefit from being able to separate functions (and variables) from inside each object file, but the separation depends on the code and data to have been organised into COMDATs, which is the compiler's work.
===================================
As answered by Raymond Chen in Jan 2013
As explained in the documentation for /Gy, function-level linking allows functions to be discardable during the "unused function" pass, if you ask for it via /OPT:REF. It does not alter the actual classical model for linking. The flag name is misleading. It's not "perform function-level linking". It merely enables it by telling the linker where functions begin and end. And it's not so much function-level linking as it is function-level unlinking. -Raymond
(This snippet might make more sense with some further context; here are the posts about the classical linking model: 1, 2.)
So in a nutshell: yes. If you activate one switch without the other, there will be no observable impact.

Access Linux kernel symbols that are not exported via EXPORT_SYMBOL*

We have a need to access kernel global vars in net/ipv4/af_inet.c that are not exported explicitly from a loadable kernel module. We are using 2.6.18 kernel currently.
kallsyms_lookup_name doesn't appear to be available anymore (not exported)
__symbol_get returns NULL (upon further reading, symbol_get/__symbol_get looks through the symbol tables of the kernel and of existing modules, which contain only exported symbols; it is there to make sure that the module from which a symbol is exported is actually loaded)
Is there any way to access symbols that are not exported from a kernel module?
After doing a lot of reading and looking at the answers people provided, it appears that it would be very hard to find one method that works across many kernel versions, since the kernel API changes significantly over time.
You can use the method you mentioned, getting the address from /proc/kallsyms, or just use the address given in System.map (which is the same thing). It may seem hackish, but this is how I've seen it done before (I've never really had to do it myself). Either that, or you can build your own custom kernel where you actually EXPORT_SYMBOL whatever it is you want exported, but this is not as portable.
If performance is not a big concern, you can traverse the whole list of symbols with kallsyms_on_each_symbol() (exported by the kernel for GPL'd modules) and check the names to get the ones you need. I would not recommend doing so unless there is no other choice though.
If you would like to go this way, here is an example from one of our projects. See the usage of kallsyms_on_each_symbol() as well as the code of symbol_walk_callback(), the other parts are irrelevant to this question.

Is there a technical term for the part of an IDE which maintains a dynamic symbol table as you code?

My context is MSVC 6.
Starting with a successfully compiled program, with browse information built, I can go into an existing function and hover over a variable, and the IDE will show me the data type and variable name. One could well imagine that this information comes from the browse file.
In practice, if I create a new variable,
int z;
and hover over the z, the IDE will still show me the data type and variable name. I have not compiled the program yet, so the browse file has not been updated. This seems to suggest that there is a portion of the IDE that watches as you type and stays aware of the data types and functions as you enter them. For all I know, it may compile them internally as well.
I have also noticed, that syntax errors can effectively disable this functionality.
I haven't seen this discussed anywhere. Is there a term for this sort of functionality?
It's probably the lexical and syntactic analysis at work, building up its own symbol table; this is part of the parsing phase of most compilers. That would explain why the functionality breaks when you have syntax errors: the parsing needs to succeed for the symbol table to be reliable.
In compilers, it's usually called a symbol table.
I'm not sure that there's a term common to all integrated development environments.

What are the porting issues going from VC8 (VS2005) to VC9 (VS2008)?

I have inherited a very large and complex project (actually, a 'solution' consisting of 119 'projects', most of which are DLLs) that was built and tested under VC8 (VS2005), and I have the task of porting it to VC9 (VS2008).
The porting process I used was:
1) Copy the VC8 .sln file and rename it to a VC9 .sln file.
2) Copy all of the VC8 project files, and rename them to VC9 project files.
3) Edit all of the VC9 project files, s/vc8/vc9/.
4) Edit the VC9 .sln, s/vc8/vc9/.
5) Load the VC9 .sln with VS2008, and let the IDE 'convert' all of the project files.
6) Fix compiler and linker errors until I got a good build.
So far, I have run into the following issues in that last step.
1) A change in the way decorated names are calculated, causing truncation of the names.
This is more than just a warning (http://msdn.microsoft.com/en-us/library/074af4b6.aspx). Libraries built with this warning will not link with other modules. Applying the solution given in MSDN was non-trivial, but doable. I addressed this problem separately in How do I increase the allowed decorated name length in VC9 (MSVC 2008)?
2) A change that does not allow the assignment of zero to an iterator. This is per the spec, and it was fairly easy to find and fix these previously-allowed coding errors. Instead of assignment of zero to an iterator, use the value end().
3) for-loop scope is now per the ANSI standard. Another easy-to-fix problem.
4) More space required for pre-compiled headers. In some cases a LOT more space was required. I ended up using /Zm999 to provide the maximum PCH space. If PCH memory usage gets bumped up again, I assume that I will have to forgo PCH altogether, and just endure the increase in what is already a very long build time.
5) A change in requirements for copy ctors and default dtors. It appears that in template classes, under certain conditions that I haven't quite figured out yet, the compiler no longer generates a default ctor or a default dtor. I suspect this is a bug in VC9, but there may be something else that I'm doing wrong. If so, I'd sure like to know what it is.
6) The GUIDs in the sln and vcproj files were not changed. This does not appear to impact the build in any way that I can detect, but it is worrisome nevertheless.
Note that despite all of these issues, the project built, ran, and passed extensive QA testing under VC8. I have also back-ported all of the changes to the VC8 projects, where they still build and run just as happily as they did before (using VS2005/VC8). So, all of my changes required for a VC9 build at least appear to be backward-compatible, although the regression testing is still underway.
Now for the really hard problem: I have run into a difference in the startup sequence between VC8 and VC9 projects. The program uses a small-object allocator modeled after Loki, from Andrei Alexandrescu's book Modern C++ Design. This allocator is initialized using a global variable defined in the main program module.
Under VC8, this global variable is constructed at the very beginning of the program startup, from code in a module crtexe.c. Under VC9, the first module that executes is crtdll.c, which indicates that the startup sequence has been changed. The DLLs that are starting up appear to be confusing the small-object allocator by allocating and deallocating memory before the global object can initialize the statistics, which leads to some spurious diagnostics. The operation of the program does not appear to be materially affected, but the QA folks will not allow the spurious diagnostics to get past them.
Is there some way to force the construction of a global object prior to loading DLLs?
What other porting issues am I likely to encounter?
Is there some way to force the construction of a global object prior to loading DLLs?
How about the DELAYLOAD option? So that DLLs aren't loaded until their first call?
That is a tough problem, mostly because you've inherited a design that's inherently dangerous because you're not supposed to rely on the initialization order of global variables.
It sounds like something you could try to work around by replacing the global variable with a singleton that other functions retrieve by calling a global function or method that returns a pointer to the singleton object. If the object exists at the time of the call, the function returns a pointer to it. Otherwise, it allocates a new one and returns a pointer to the newly allocated object.
The problem, of course, is that I can't think of a singleton implementation that would avoid the problem you're describing. Maybe this discussion would be useful: http://www.oneunified.net/blog/Personal/SoftwareDevelopment/CPP/Singleton.article
That's certainly an interesting problem. I don't have a solution other than perhaps to change the design so that there is no dependence on the undefined order of link/DLL startup. Have you considered linking with the older linker? (or whatever the VS.NET term is)
Because the behavior of your variable and allocator relied on some arbitrary startup order (unknown at the time), I would probably fix that so that it is not an issue in the future. I guess you are really asking whether anyone knows how to do some voodoo in VC9 to make the problem disappear; I am interested in hearing that as well.
How about this,
Make your main program a DLL too, call it main.dll, linked to all the other ones, and export the main function as say, mainEntry(). Remove the global variable.
Create a new main exe which has the global variable and its initialization, but doesn't link statically to any of the other application DLLs (except for the allocator stuff).
This new main.exe then dynamically loads the main.dll using LoadLibrary(), then uses GetProcAddress to call mainEntry().
The solution to the problem turned out to be more straightforward than I originally thought. The initialization order problem was caused by the existence of several global variables of types derived from std container types (a basic design flaw that predated my position with that company). The solution was to replace all such globals with singletons. There were about 100 of them.
Once this was done, the initialization (and destruction) order was under programmer control.
