I am starting out learning Rust macros, but the documentation is somewhat limited. Which is fine — they're an expert feature, I guess. While I can do basic code generation, implementation of traits, and so on, some of the built-in macros seem well beyond that, such as the various print macros, which examine a string literal and use that for code expansion.
I looked at the source for print! and it calls another macro called format_args. Unfortunately this doesn't seem to be built in "pure Rust" the comment just says "compiler built-in."
Is it possible to write something as complex as print! in a pure Rust macro? If so, how would it be done?
I'm actually interested in building a "compile time trie" -- basically recognizing certain fixed strings as "keywords" fixed at compile time. This would be performant (probably) but mostly I'm just interested in code generation.
format_args is implemented in the compiler itself, in the libsyntax_ext crate. The name is registered in the register_builtins function, and the code to process it has its entry point in the expand_format_args function.
Macros that do such detailed syntax processing cannot be defined using the macro_rules! construct. They can be defined with a procedural macro; however, this feature is currently unstable (can only be used with the nightly compiler and is subject to sudden and unannounced changes) and rather sparsely documented.
Rust macros cannot parse string literals, so it's not possible to create a direct Rust equivalent of format_args!.
What you could do is to use a macro to transform the function-call-like syntax into something that represents the variadic argument list in the Rust type system in some way (say, as a heterogeneous single-linked list, or a builder type). This can then be passed to a regular Rust function, along with the format string. But you will not be able to implement compile-time type checking of the format string this way.
I have a TH-heavy file which takes around 30 seconds to compile. What are some techniques I can use to help debug the performance of my Template Haskell?
If I understand compile flow of TH correctly, the ordinary haskell functions are being executed while splicing at compile time. But you can run then at the runtime on your own, of course.
For example you have something like $(foo x y ...) in your TH-heavy file. Create another file and call 'foo x y' there but don't splice the result. Then you'll be able to profile 'foo' as usual. If the bottleneck is at the AST generation stage you'll locate it. Don't forget to consider lazyness.
As of GHC 8, this can be done with -fexternal-interpreter.
Compile the library defining the TH function with profiling enabled, then compile the code* which uses the TH function in a splice with GHC options -fexternal-interpreter -opti+RTS -opti-p. This should produce a file called ghc-iserv-prof.prof.
This approach has the advantage that you can use the full functionality of the Q monad.
* A benchmark suite in the same cabal project as the TH library (but in a different hs-source-dir) also works. It might even work with a TH function defined and used in the same library, but I think you'll be profiling interpreted code then.
I'm writing a small tool for generating php checks from javascript code, and I would like to know if anyone knows of a standard way of transforming functional code into imperative code?
I found this paper: Defunctionalization at Work it explains defunctionalization pretty well.
Lambdalifting and defunctionalization somewhat answered the question, but what about datastructures, we are still parsing lists as if they are all linkedlists. Would there be a way of transforming the linkedlists of functional languages into other high-level datastructures like c++ vectors or java arraylists?
Here are a few additions to the list of #Artyom:
you can convert tail recursion into loops and assignments
linear types can be used to introduce assignments, e.g. y = f x can be replaced with x := f x if x is linear and has the same type as y
at least two kinds of defunctionalization are possible: Reynolds-type defunctionalization when you replace a high-order application with a switch full of first-order applications, and inlining (however, recursive functions is not always possible to inline)
Perhaps you are interested in removing some language elements (such as higher-order functions), right?
For eliminating HOFs from a program, there are techniques such as defunctionalization. For removing closures, you can use lambda-lifting (aka closure conversion). Is this something you are interested in?
I think you need to provide a concrete example of code you have, and the target code you intend to produce, so that others may propose solutions.
Added:
Would there be a way of transforming the linkedlists of functional languages into other high-level datastructures like c++ vectors or java arraylists?
Yes. Linked lists are represented with pointers in C++ (a structure "node" with two fields: one for the "payload", another for the "next" pointer; empty list is then represented as a NULL pointer, but sometimes people prefer to use special "sentinel values"). Note that, if the code in the source language does not rely on the representation of singly linked lists (in the source language implementation), you can also implement the "cons"/"nil" operations using a vector in the target language (not sure if this suits your needs, though). The idea here is to give an alternative implementations for the familiar operations.
No, there is not.
The reason is that there is no such concrete and well defined thing like functional code or imperative code.
Such transformations exist only for concrete instances of your abstraction: for example, there are transformations from Haskell code to LLVM bytecode, F# code to CLI bytecode or Frege code to Java code.
(I doubt if there is one from Javascript to PHP.)
Depends on what you need. The usual answer is "there is no such tool", because the result will not be usable. However look at this from this standpoint:
The set of Assembler instructions in a computer defines an imperative machine. Hence the compiler needs to do such a translation. However I assume you do not want to have assembler code but something more readable.
Usually these kinds of heavy program transformations are done manually, if one is interested in the result, or automatically if the result will never be looked at by a human.
Ninth bullet point in Paul Graham's What Made Lisp Different says,
9. The whole language always available.
There is no real distinction between read-time, compile-time, and runtime. You can compile or run code while reading, read or run code while compiling, and read or compile code at runtime.
Running code at read-time lets users reprogram Lisp's syntax; running code at compile-time is the basis of macros; compiling at runtime is the basis of Lisp's use as an extension language in programs like Emacs; and reading at runtime enables programs to communicate using s-expressions, an idea recently reinvented as XML.
Does this last bullet point hold for Clojure?
You can mix runtime and compile-time freely in Clojure, although Common Lisp is still somewhat more flexible here (due to the presence of compiler macros and symbol macros and a fully supported macrolet; Clojure has an advantage in its cool approach to macro hygiene through automagic symbol resolution in syntax-quote). The reader is currently closed, so the free mixing of runtime, compile-time and read-time is not possible1.
1 Except through unsupported clever hacks.
It does hold,
(eval (read-string "(println \"Hello World!!\")"))
Hello World!!
nil
Just like emacs you can have your program configuration in Clojure, one project that I know Clojure for is static which allows you to have your template as a Clojure vector along with arbitrary code which will be executed at read time.
C++ is probably the most popular language for static metaprogramming and Java doesn't support it.
Are there any other languages besides C++ that support generative programming (programs that create programs)?
The alternative to template style meta-programming is Macro-style that you see in various Lisp implementations. I would suggest downloading Paul Graham's On Lisp and also taking a look at Clojure if you're interested in a Lisp with macros that runs on the JVM.
Macros in Lisp are much more powerful than C/C++ style and constitute a language in their own right -- they are meant for meta-programming.
let me list a few important details about how metaprogramming works in lisp (or scheme, or slate, or pick your favorite "dynamic" language):
when doing metaprogramming in lisp you don't have to deal with two languages. the meta level code is written in the same language as the object level code it generates. metaprogramming is not limited to two levels, and it's easier on the brain, too.
in lisp you have the compiler available at runtime. in fact the compile-time/run-time distinction feels very artificial there and is very much subject to where you place your point of view. in lisp with a mere function call you can compile functions to machine instructions that you can use from then on as first class objects; i.e. they can be unnamed functions that you can keep in a local variable, or a global hashtable, etc...
macros in lisp are very simple: a bunch of functions stuffed in a hashtable and given to the compiler. for each form the compiler is about to compile, it consults that hashtable. if it finds a function then calls it at compile-time with the original form, and in place of the original form it compiles the form this function returns. (modulo some non-important details) so lisp macros are basically plugins for the compiler.
writing a lisp function in lisp that evaluates lisp code is about two pages of code (this is usually called eval). in such a function you have all the power to introduce whatever new rules you want on the meta level. (making it run fast is going to take some effort though... about the same as bootstrapping a new language... :)
random examples of what you can implement as a user library using lisp metaprogramming (these are actual examples of common lisp libraries):
extend the language with delimited continuations (hu.dwim.delico)
implement a js-to-lisp-rpc macro that you can use in javascript (which is generated from lisp). it expands into a mixture of js/lisp code that automatically posts (in the http request) all the referenced local variables, decodes them on the server side, runs the lisp code body on the server, and returns back the return value to the javascript side.
add prolog like backtracking to the language that very seamlessly integrates with "normal" lisp code (see screamer)
an XML templating extension to common lisp (includes an example of reader macros that are plugins for the lisp parser)
a ton of small DSL's, like loop or iterate for easy looping
Template metaprogramming is essentially abuse of the template mechanism. What I mean is that you get basically what you'd expect from a feature that was an unplanned side-effect --- it's a mess, and (although tools are getting better) a real pain in the ass because the language doesn't support you in doing it (I should note that my experience with state-of-the-art on this is out of date, since I essentially gave up on the approach. I've not heard of any great strides made, though)
Messing around with this in about '98 was what drove me to look for better solutions. I could write useful systems that relied on it, but they were hellish. Poking around eventually led me to Common Lisp. Sure, the template mechanism is Turing complete, but then again so is intercal.
Common Lisp does metaprogramming `right'. You have the full power of the language available while you do it, no special syntax, and because the language is very dynamic you can do more with it.
There are other options of course. No other language I've used does metaprogramming better than Lisp does, which is why I use it for research code. There are lots of reasons you might want to try something else though, but it's all going to be tradeoffs. You can look at Haskell/ML/OCaml etc. Lots of functional languages have something approaching the power of Lisp macros. You can find some .NET targeted stuff, but they're all pretty marginal (in terms of user base etc.). None of the big players in industrially used languages have anything like this, really.
Nemerle and Boo are my personal favorites for such things. Nemerle has a very elegant macro syntax, despite its poor documentation. Boo's documentation is excellent but its macros are a little less elegant. Both work incredibly well, however.
Both target .NET, so they can easily interoperate with C# and other .NET languages -- even Java binaries, if you use IKVM.
Edit: To clarify, I mean macros in the Lisp sense of the word, not C's preprocessor macros. These allow definition of new syntax and heavy metaprogramming at compiletime. For instance, Nemerle ships with macros that will validate your SQL queries against your SQL server at compiletime.
Nim is a relatively new programming language that has extensive support for static meta-programming and produces efficient (C++ like) compiled code.
http://nim-lang.org/
It supports compile-time function evaluation, lisp-like AST code transformations through macros, compile-time reflection, generic types that can be parametrized with arbitrary values, and term rewriting that can be used to create user-defined high-level type-aware peephole optimizations. It's even possible to execute external programs during the compilation process that can influence the code generation. As an example, consider talking to a locally running database server in order to verify that the ORM definition in your code (supplied through some DSL) matches the schema of the database.
The "D" programming language is C++-like but has much better metaprogramming support. Here's an example of a ray-tracer written using only compile-time metaprogramming:
Ctrace
Additionally, there is a gcc branch called "Concept GCC" that supports metaprogramming contructs that C++ doesn't (at least not yet).
Concept GCC
Common Lisp supports programs that write programs in several different ways.
1) Program data and program "abstract syntax tree" are uniform (S-expressions!)
2) defmacro
3) Reader macros.
4) MOP
Of these, the real mind-blower is MOP. Read "The Art of the Metaobject Protocol." It will change things for you, I promise!
I recommend Haskell. Here is a paper describing its compile-time metaprogramming capabilities.
Lots of work in Haskell: Domain Specific Languages (DSL's), Executable Specifications, Program Transformation, Partial Application, Staged Computation. Few links to get you started:
http://haskell.readscheme.org/appl.html
http://www.cse.unsw.edu.au/~dons/papers/SCKCB07.html
http://www.haskell.org/haskellwiki/Research_papers/Domain_specific_languages
The ML family of languages were designed specifically for this purpose. One of OCaml's most famous success stories is the FFTW library for high-performance FFTs that is C code generated almost entirely by an OCaml program.
Cheers,
Jon Harrop.
Most people try to find a language that has "ultimate reflection"
for self-inspection and something like "eval" for reifying new code.
Such languages are hard to find (LISP being a prime counterexample)
and they certainly aren't mainstream.
But another approach is to use a set of tools that can inspect,
generate, and manipulate program code. Jackpot is such a tool
focused on Java. http://jackpot.netbeans.org/
Our DMS software reengineering toolkit is
such a tool, that works on C, C++, C#, Java, COBOL, PHP,
Javascript, Ada, Verilog, VHDL and variety of other languages.
(It uses production quality front ends to enable it to read
all these langauges).
Better, it can do this with multiple languages at the same instant.
See http://www.semdesigns.com/Products/DMS/DMSToolkit.html
DMS succeeds because it provides a regular method and support infrastructure for complete access to the program structure as ASTs, and in most cases additional data such a symbol tables, type information, control and data flow analysis, all necessary to do sophisticated program manipulation.
'metaprogramming' is really a bad name for this specific feature, at least when you're discussing more than one language, since this feature is only needed for a narrow slice of languages that are:
static
compiled to machine language
heavily optimised for performance at compile time
extensible with user-defined data types (OOP in C++'s case)
hugely popular
take out any one of these, and 'static metaprogramming', just doesn't make sense. therefore, i would be surprised if any remotely mainstream language had something like that, as understood on C++.
of course, dynamic languages, and several functional languages support totally different concepts that could also be called metaprogramming.
Lisp supports a form of "metaprogramming", although not in the same sense as C++ template metaprogramming. Also, your term "static" could mean different things in this context, but Lisp also supports static typing, if that's what you mean.
The Meta-Language (ML), of course: http://cs.anu.edu.au/student/comp8033/ml.html
It does not matter what language you are using -- any of them is able to do Heterogeneous Generative Metaprogramming. Take any dynamic language such as Python or Clojure, or Haskell if you are a type-fan, and write models in this host language that are able to compile themself into some mainstream language you want or forced to use by your team/employer.
I found object graphs a good model for internal model representation. This graph can mix attributes and ordered subgraphs in a single node, which is native to attribute grammar and AST. So, object graph interpretation can be an effective layer between your host and target languages and can act itself as some sort of no-syntax language defined over data structures.
The closest model is an AST: describe AST trees in Python (host language) targets to C++ syntax (target language):
# from metaL import *
class Object:
def __init__(self, V):
self.val = V
self.slot = {}
self.nest = []
class Module(Object):
def cc(self):
c = '// \ %s\n' % self.head(test=True)
for i in self.nest:
c += i.cc()
c += '// / %s\n' % self.head(test=True)
return c
hello = Module('hello')
# <module:hello> #a04475a2
class Include(Object):
def cc(self):
return '#include <%s.h>\n' % self.val
stdlib = Include('stdlib')
hello // stdlib
# <module:hello> #b6efb657
# 0: <include:stdlib> #f1af3e21
class Fn(Object):
def cc(self):
return '\nvoid %s() {\n}\n\n' % self.val
main = Fn('main')
hello // main
print(hello.cc())
// \ <module:hello>
#include <stdlib.h>
void main() {
}
// / <module:hello>
But you are not limited with the level of abstraction of constructed object graph: you not only can freely add your own types but object graph can interpret itself, can thus can modify itself the same way as lists in a Lisp can do.