Can I use the Rust lexer or parser to retrieve a list of functions within a Rust file? - rust

The lexer/parser file located here is quite large and I'm not sure if it is suitable for just retrieving a list of Rust functions. Perhaps writing my own/using another library would be a better route to take?
The end objective would be to create a kind of execution manager. To contextualise, it would be able to read a list of function calls wrapped in a function. The function calls that are within the function will then be able to be re/ordered from some web interface. Thought it might be nice to manage larger applications this way.

No. I mean, not really. Whether you write your own parser or re-use syntex, you're going to hit a fundamental limitation: macros.
So let's say you go all-out and expand macro_rules!-based macros, including the ones defined in external crates (which means you'll also need to extract rustc's crate metadata loading... which isn't stable). What about procedural macros and custom derive attributes? Those are defined in code and depend on compiler-internal interfaces to function.
The only way this is likely to ever work correctly is if you build on top of the compiler, or duplicate a huge amount of work (which also involves unstable binary interfaces).

You could use syntex to parse the Rust code in a build script.

Related

Alternative to ctor/inventory for when compiling to wasm?

The ctor crate doesn't support web assembly currently, although there is active discussion about how to fix this.
Although I am well aware of the issues associated with static initialization coming from C++, being able to register things with a factory at startup is a very handy capability, and is necessary to avoid violating the DRY principle in a lot of cases. Without it every time you want to add a new possibility to your factory you have to add a separate line of code to main(), possibly more lines if you need to import your new function. This quickly gets tedious.
I am wondering if it is possible to build something equivalent (at least when everything you want to register is concentrated inside a single crate) using a procedural attribute macro and build.rs. The macro would be used to mark the functions that I want to register, and it's implementation would save the module paths ("crate::your::registered_function") of those functions off to the side somewhere in a file but otherwise just be a passthrough. Then build.rs would generate a function that calls all the functions listed in the file, and I would have main() call that one function by hand.
Is there a different trick that would work instead?
Does an implementation of this already exist somewhere I could use as a reference?
How would the procedural macro actually generate the module path for the function to be called? There is module_path! but if invoked from inside the definition of the procedural macro it will give the module path for the macro, not the module path associated with what the TokenStream is going to expand into. The macro could generate a call to module_path! but that won't evaluate until later when the final program is run.

template instantiation statistics from compilers

Is there a way to get a summary of the instantiated templates (with what types and how many times - like a histogram) within a translation unit or for the whole project (shared object/executable)?
If I have a large codebase and I want to take advantage of the C++11 extern keyword I would like to know which templates are most used within my project (or from the internals of stl - like std::less<MyString> for example).
Also is it possible to have a weight assigned to each template instantiation (time spent by the compiler)?
Even if only one (c++11 enabled) compiler gives me such statistics I would be happy.
How difficult would it be to implement such a thing with Clang's LibTooling?
And is this even reasonable? Many people told me that I can reason which template instantiations I should extern without the use of a tool...
There are several ways to attack this problem.
If you are working with an open-source compiler, it's not hard to make a simple change to the source code that will trace all template substantiations.
If that sounds like too much hassle, you can also try to force the compiler to produce a warning on each template instantiation for a given symbol. Steven Watanabe has written a set of tools that can help you with that.
Finally, possibly the best options is to use the debugging symbols (or map files), generated by the compiler, to track down how many times each function appears in the final image and more importantly how much does it add to the weight in bytes. The best example for such a tool is Andrian Stone's SymbolSort, which is based on the Microsoft's toolset. Another similar tool is the Map File Browser.

Can you disallow a common lisp script called from common lisp to call specific functions?

Common Lisp allows to execute/compile code at runtime. But I thought for some (scripting-like) purposes it would be good if one could disallow a user-script to call some functions (especially for application extensions). One could still ask the user if he will allow an extension to access files/... I'm thinking of something like the Android permission system for Common Lisp. Is this possible without rewriting the evaluation code?
The problem I see is, that in Common Lisp you would probably want a script to be able to use reader macros and normal macros and for the latter operators like intern, but those would allow you to get arbitrary symbols (by string manipulation & interning), so simply scanning the code before evaluation won't suffice to ensure that specific functions aren't called.
So, is there something like a lock for functions? I thought of using fmakunbound / makunbound (and keeping the values in a local variable), but would that be possible in a multi-threaded environment?
Thanks in advance.
This is not part of the Common Lisp specification and there is no Common Lisp implementation that is extended to make this kind of restriction easy.
It seems to me like it would be easier to use operating system restrictions (e.g. rlimit, capabilities, etc) to enforce what you want on the Common Lisp process.
This is not an unusual desire, i.e. to run untrusted 3rd party code in a sandbox.
You can hand craft a sandbox by creating a custom parser and interpreter for your scripting language. It is pedantic, but true, than any program with an API is providing such a service. API designers and implementors needs to worry about the vile users.
You can still call eval or the compiler to run your sandbox scripts. It just means you need to assure that your reader, parser and language decline to provide access to any risky functionality.
You can use a lisp package to create a good sandbox. You can still use s-expressions for your scripting language's syntax, but you must cripple the standard reader so the user can't escape package-sandbox. You can still use the evaluator and the compiler, but you need to be sure the package you have boxed the user into contains no functionality that he can use to do inappropriate things.
Successful sandbox design and construction is easier when you start with an empty sandbox and slowly add functionality. Common Lisp is a big language and that creates a huge surface for attacker to poke at. So if you create a sandbox out of a package it's best to start with an empty package and add functions one at a time. Thinking thru what risks they create. The same approach is good when creating your crippled reader. Don't start with the full reader and throw things away, start with a useless reader and add things. Sadly taking that advice creates a pretty significant cost to getting started. But, if you look around I suspect you can find an existing safe reader.
Xach's suggestion is another way to go and in many case more straight forward.

Implicitly registering unit tests in Haskell

I'm new to Haskell, with a C++ background. I'm doing some exercises in Haskell, and I want to implement them as a bunch of functions covered with unit tests, so the testing driver is my only app.
And with my background, I'm looking for something like GTest. HUnit is its analog in Haskell world. But need to explicitly register tests is really annoying - thats tedious and violates DRY principle.
So I was thinking about experimenting with custom testing framework.Seems that template Haskell can be used to automate providing assertion descriptions and registering tests within one module. But how can automatically collect all tests from all linked modules?
Of course, it is always possible to write build script that would grep sources and generate required code, but I wonder, if this can be done in Haskell only?
test-framework-th provides this functionality for test-framework. The simplest thing is to use the defaultMainGenerator function to collect all top-level definitions prefixed with case_ (HUnit) or prop_ (QuickCheck) into test groups.
If you have multiple test groups, you do still need to list them in a main entry point for your tests. There’s probably a way around that, and I guess that’s what you’re really asking about, but honestly I have found little need to break tests into more than a handful of modules. The effort needed to avoid repetition is sometimes less than the effort needed to maintain it.

Is there a way to convert from a string to pure code in C++?

I know that its possible to read from a .txt file and then convert various parts of that into string, char, and int values, but is it possible to take a string and use it as real code in the program?
Code:
string codeblock1="cout<<This is a test;";
string codeblock2="int array[5]={0,6,6,3,5};}";
int i;
cin>>i;
if(i)
{
execute(codeblock1);
}
else
{
execute(codeblock2);
}
Where execute is a function that converts from text to actual code (I don't know if there actually is a function called execute, I'm using it for the purpose of my example).
In C++ there's no simple way to do this. This feature is available in higher-level languages like Python, Lisp, Ruby and Perl (usually with some variation of an eval function). However, even in these languages this practice is frowned upon, because it can result in very unreadable code.
It's important you ask yourself (and perhaps tell us) why you want to do it?
Or do you only want to know if it's possible? If so, it is, though in a hairy way. You can write a C++ source file (generate whatever you want into it, as long as it's valid C++), then compile it and link to your code. All of this can be done automatically, of course, as long as a compiler is available to you in runtime (and you just execute it with system). I know someone who did this for some heavy optimization once. It's not pretty, but can be made to work.
You can create a function and parse whatever strings you like and create a data structure from it. This is known as a parse tree. Subsequently you can examine your parse tree and generate the necessary dynamic structures to perform the logic therin. The parse tree is subsequently converted into a runtime representation that is executed.
All compilers do exactly this. They take your code and they produce machine code based on this. In your particular case you want a language to write code for itself. Normally this is done in the context of a code generator and it is part of a larger build process. If you write a program to parse your language (consider flex and bison for this operation) that generates code you can achieve the results you desire.
Many scripting languages offer this sort of feature, going all the way back to eval in LISP - but C and C++ don't expose the compiler at runtime.
There's nothing in the spec that stops you from creating and executing some arbitrary machine language, like so:
char code[] = { 0x2f, 0x3c, 0x17, 0x43 }; // some machine code of some sort
typedef void (FuncType*)(); // define a function pointer type
FuncType func = (FuncType)code; // take the address of the code
func(); // and jump to it!
but most environments will crash if you try this, for security reasons. (Many viruses work by convincing ordinary programs to do something like this.)
In a normal environment, one thing you could do is create a complete program as text, then invoke the compiler to compile it and invoke the resulting executable.
If you want to run code in your own memory space, you could invoke the compiler to build you a DLL (or .so, depending on your platform) and then link in the DLL and jump into it.
First, I wanted to say, that I never implemented something like that myself and I may be way off, however, did you try CodeDomProvider class in System.CodeDom.Compiler namespace? I have a feeling the classes in System.CodeDom can provide you with the functionality you are looking for.
Of course, it will all be .NET code, not any other platform
Go here for sample
Yes, you just have to build a compiler (and possibly a linker) and you're there.
Several languages such as Python can be embedded into C/C++ so that may be an option.
It's kind of sort of possible, but not with just straight C/C++. You'll need some layer underneath such as LLVM.
Check out c-repl and ccons
One way that you could do this is with Boost Python. You wouldn't be using C++ at that point, but it's a good way of allowing the user to use a scripting language to interact with the existing program. I know it's not exactly what you want, but perhaps it might help.
Sounds like you're trying to create "C++Script", which doesn't exist as far as I know. C++ is a compiled language. This means it always must be compiled to native bytecode before being executed. You could wrap the code as a function, run it through a compiler, then execute the resulting DLL dynamically, but you're not going to get access to anything a compiled DLL wouldn't normally get.
You'd be better off trying to do this in Java, JavaScript, VBScript, or .NET, which are at one stage or another interpreted languages. Most of these languages either have an eval or execute function for just that, or can just be included as text.
Of course executing blocks of code isn't the safest idea - it will leave you vulnerable to all kinds of data execution attacks.
My recommendation would be to create a scripting language that serves the purposes of your application. This would give the user a limited set of instructions for security reasons, and allow you to interact with the existing program much more dynamically than a compiled external block.
Not easily, because C++ is a compiled language. Several people have pointed round-about ways to make it work - either execute the compiler, or incorporate a compiler or interpreter into your program. If you want to go the interpreter route, you can save yourself a lot of work by using an existing open source project, such as Lua

Resources