Introduction to Metaprogramming in Nim
2016-06-06 · Nim · Programming
Introduction to the Introduction (Meta-Introduction)
Wikipedia gives us a nice description of metaprogramming:
Metaprogramming is the writing of computer programs with the ability to treat programs as their data. It means that a program could be designed to read, generate, analyse and/or transform other programs, and even modify itself while running.
In this article we will explore Nim’s metaprogramming capabilities, which are quite powerful and yet still easy to use. After all, great metaprogramming is one of Nim’s main features. The general rule is to use the least powerful construct that is still powerful enough to solve a problem, in this order: procs and iterators first, then templates, and finally macros.
So before looking at Nim’s two main metaprogramming constructs, templates and macros, we’ll look at what we can do with procs and iterators as well.
Regular Programming Constructs
Normal procs
We’re in normal programming land here. Regular procedures are what you know as functions elsewhere and they’re pretty easy to define and use:
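As a minimal illustration (the name square is just a placeholder chosen for this sketch), a regular proc and a call to it could look like this:

```nim
# A plain procedure: the last expression is the implicit return value.
proc square(x: int): int =
  x * x

echo square(7)  # prints 49
```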
Generic procs
With generics we can define procs that work on multiple types. Actually a new proc will be generated based on our generic definition for each instantiation:
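A small sketch of a generic proc, using the made-up name maxOf. Each call with a new argument type makes the compiler instantiate a separate concrete proc:

```nim
# T is a type parameter; the body only requires that `<` exists for T.
proc maxOf[T](a, b: T): T =
  if a < b: b else: a

echo maxOf(3, 8)          # instantiates maxOf[int]
echo maxOf(2.5, 1.5)      # instantiates maxOf[float]
echo maxOf("abc", "abd")  # instantiates maxOf[string]
```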
Inline iterators
Inline iterators are the default iterators in Nim. They get compiled into high performance loops:
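Here is a minimal sketch of an inline iterator and a for-loop that consumes it. The iterator name countTo is an assumption made for this example:

```nim
# An inline iterator: each `yield` hands one value to the for-loop body.
iterator countTo(n: int): int =
  var i = 0
  while i <= n:
    yield i
    inc i

var total = 0
for x in countTo(4):
  total += x

echo total  # 0 + 1 + 2 + 3 + 4 = 10
```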
So this code gets compiled into:
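Conceptually, the for-loop body is pasted in place of the yield. The following hand-written loop is only a sketch of that idea for a hypothetical iterator countTo(n) that yields 0 up to n; the code the compiler really produces will differ in detail:

```nim
# Hand-written equivalent of:
#   for x in countTo(4): total += x
var total = 0
block:
  var i = 0
  while i <= 4:
    let x = i   # `yield i` becomes a binding of the loop variable
    total += x  # the for-loop body replaces the yield
    inc i

echo total  # prints 10
```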
Of course we can make iterators generic too:
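A possible generic iterator, sketched here under the invented name everySecond, works for any element type:

```nim
# Yields every second element of any array or seq, whatever its element type.
iterator everySecond[T](xs: openArray[T]): T =
  var i = 0
  while i < xs.len:
    yield xs[i]
    i += 2

var picked: seq[string] = @[]
for s in everySecond(["a", "b", "c", "d", "e"]):
  picked.add s

echo picked  # @["a", "c", "e"]
```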
Closure iterators
Inline iterators simultaneously have the advantage and disadvantage of being translated into loops. This means you cannot pass them around. This limitation can be lifted by using closure iterators instead:
As you can see closure iterators keep their state. You can call them again and get the next value, or use them inside of a for-loop to get out many values.
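The following sketch shows that behaviour. The factory proc makeCounter is a name invented for this example; it returns a closure iterator that remembers where it stopped:

```nim
# A proc that returns a closure iterator; the iterator keeps its own state.
proc makeCounter(n: int): iterator(): int =
  result = iterator(): int =
    var i = 0
    while i <= n:
      yield i
      inc i

let next = makeCounter(5)
echo next()  # 0
echo next()  # 1

# A for-loop continues from the current state and drains the rest.
var rest: seq[int] = @[]
for x in next():
  rest.add x
echo rest  # @[2, 3, 4, 5]
```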
Templates
You can think of templates as Nim’s equivalent to the C preprocessor. But templates are written in Nim itself and fit well into the rest of the language.
Templates simply insert their code at the invocation site, working at the level of the abstract syntax tree. They can be used in just the same way as procs.
Logger
A common example is a logger, which we looked at in another article already. Suppose you want to have extensive debug logging in your program. A trivial implementation would look like this:
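What follows is a minimal sketch of such a proc-based logger, not the original listing. The names Level, logLevel, debug and expensiveDebuggingInfo come from the text; the calls counter and the message format are assumptions added to make the cost visible:

```nim
import std/[os, times]

type Level {.pure.} = enum
  debug, info, warn, error, fatal

var logLevel = Level.info  # higher than debug, so nothing gets printed
var calls = 0              # counts how often the expensive proc runs

proc expensiveDebuggingInfo(): string =
  inc calls
  sleep(1000)  # stand-in for a computation that takes a full second
  "Everything looking good!"

proc debug(msg: string) =
  if logLevel <= Level.debug:
    echo "[", now().format("yyyy-MM-dd HH:mm:ss"), "]: ", msg

# The argument is evaluated before debug is entered, even though
# debug will discard it at this log level.
debug expensiveDebuggingInfo()
echo calls  # 1
```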
[2016-06-05 22:00:50]: Everything looking good!
We have to call expensiveDebuggingInfo to get the debugging info, which is fine right now since our logLevel is set to Level.debug. But it stops being fine when we instead set logLevel to anything higher than debug. Then it still takes a full second to evaluate the expensiveDebuggingInfo parameter for debug, but inside of debug nothing is done with that information. This is of course a consequence of call-by-value argument evaluation, which Nim uses, just as most other languages do. A notable exception would be lazy evaluation in Haskell, where this kind of logger would work perfectly fine, only calling expensiveDebuggingInfo when its value is actually needed.
But let’s stay in Nim-land and use a template instead of a proc to magically fix this:
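A template-based version might look like the sketch below. As before, the calls counter and the output format are assumptions for illustration; instantiationInfo() is the system routine mentioned in the text:

```nim
import std/[os, times]

type Level {.pure.} = enum
  debug, info, warn, error, fatal

var logLevel = Level.info
var calls = 0

proc expensiveDebuggingInfo(): string =
  inc calls
  sleep(1000)  # would cost a full second if it were ever called
  "Everything looking good!"

template debug(msg: string) =
  # The whole body, including the msg expression, is pasted at the call site.
  if logLevel <= Level.debug:
    const pos = instantiationInfo()  # where the template was instantiated
    echo "[", now().format("yyyy-MM-dd HH:mm:ss"), "][", pos.filename, "]: ", msg

# The call expression now lives inside the `if`, so it is skipped entirely.
debug expensiveDebuggingInfo()
echo calls  # 0
```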
[2016-06-05 22:01:30][logger]: Everything looking good!
Note that we also conveniently use instantiationInfo() to find out at what location in the program our template was instantiated, something we could not do using a procedure.
We can still call the template in the exact same way as the proc. But now we have the advantage that the template is inlined at compile time, so expensiveDebuggingInfo is only called if the runtime logLevel actually requires it. Perfect.
Safe locking
Another problem that can be solved with a template is automatically acquiring and releasing a system lock:
Compile with --threads:on for platform independent lock support.
This looks pretty simple: we just acquire the lock, execute the passed statements and finally release the lock, even if exceptions have been thrown. We can pass any set of statements as the body. The usage is as easy as using a built-in if statement:
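A sketch of such a template together with its usage is shown below. It assumes a current Nim toolchain, where a code block parameter is declared as untyped, and it excludes the withLock that the standard locks module already exports so that the two definitions do not clash:

```nim
import std/locks except withLock

template withLock(lock: Lock, body: untyped) =
  acquire lock
  try:
    body
  finally:
    release lock  # runs even if body raises an exception

var lock: Lock
initLock lock
var counter = 0

# The indented block after the colon is passed as `body`.
withLock lock:
  inc counter

echo counter  # 1
```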
When our template accepts a value of type stmt, we can use the colon to pass an entire indented block of code. When we have multiple parameters of type stmt, the do notation can be used.
This gets transformed into:
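For a body that merely increments a counter (an assumption made for this sketch), the expanded code would be roughly equivalent to writing the following by hand:

```nim
import std/locks

var lock: Lock
initLock lock
var counter = 0

# Hand-written equivalent of `withLock lock: inc counter`
acquire lock
try:
  inc counter
finally:
  release lock

echo counter  # 1
```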
Now we will never forget to call release lock. You could use this to make a higher level locking library that only exposes withLock instead of the lower-level acquire and release primitives.
Macros
Just like templates, macros are executed at compile time. But with templates you can only do constant substitutions in the AST. With macros you can analyze the passed arguments and create a new AST at the current position in any way you want. A nice property of Nim is that these compile-time macros are also written in the regular Nim language, so there is no need to learn another language.
A simple way to create an AST is to use parseStmt and parseExpr to parse the regular textual representation into a NimNode. For example parseStmt("result = 10") returns this AST:
StmtList
  Asgn
    Ident !"result"
    IntLit 10
A very useful way to find the AST of a piece of code is dumpTree:
This is the same output as you get with treeRepr:
Alternatively you can use lispRepr to get a lisp-like representation:
StmtList(Asgn(Ident(!"result"), IntLit(10)))
Finally there is also the repr proc, which turns a NimNode AST back into its textual representation.
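The inspection helpers mentioned above can be tried out in a few lines. This is a sketch that prints at compile time; the exact output format depends on the Nim version, so none is shown here:

```nim
import std/macros

# dumpTree prints the AST of the given block while compiling.
dumpTree:
  result = 10

# The same AST, obtained by parsing a string inside a compile-time block.
static:
  let ast = parseStmt("result = 10")
  echo ast.treeRepr  # indented tree, like dumpTree
  echo ast.lispRepr  # lisp-like single line
  echo ast.repr      # back to Nim source text
  doAssert ast.kind == nnkStmtList
  doAssert ast[0].kind == nnkAsgn
```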
Many beginners start by piecing strings together and finally calling parseStmt on them. While this works, it is inefficient and prone to bugs. Instead you can use the macros module to create NimNodes of all kinds yourself. dumpTree gives you a hint if you’re not sure how a specific piece of code will look in its AST representation.
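As a small sketch of building nodes directly, the macro below (makeAnswer and answer are made-up names) constructs the AST for a let statement without going through a string:

```nim
import std/macros

macro makeAnswer(): untyped =
  # Same effect as parseStmt("let answer = 10"), but built node by node.
  result = newStmtList(
    newLetStmt(ident("answer"), newLit(10))
  )

makeAnswer()
echo answer  # 10
```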
JSON Parsing
JSON is pretty popular, so let’s improve the support for it in Nim. What we want is to have a magical %* so that we can write JSON directly in Nim source code and have it checked at compile time, like this:
So far, if you want to use JSON in Nim, you have to use the JSON constructor % a lot:
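To compare the two notations side by side, here is a sketch that relies on the %* already shipped in the standard json module. The variable names j1 and j2 and the sample data follow the AST dumps in this article:

```nim
import std/json

# Desired notation: plain literals, converted by the %* macro.
let j1 = %*[
  {"name": "John", "age": 30},
  {"name": "Susan", "age": 31}
]

# Manual notation: every value is wrapped with the % constructor.
let j2 = %[
  %{"name": %"John", "age": %30},
  %{"name": %"Susan", "age": %31}
]

echo j1 == j2  # true
```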
Looks annoying. How can we implement %*? As a macro, of course:
Ok, that doesn’t do anything interesting yet. We just call the still unspecified compile time proc toJson and return the result. We want toJson to traverse the passed AST x and create a new AST, which inserts a % call at just the right places, exactly as it would happen if we added the % calls manually.
For this purpose we print the AST of j2 by putting it into dumpTree from the macros module:
We get the following AST printed when compiling this program:
Prefix
  Ident !"%"
  Bracket
    Prefix
      Ident !"%"
      TableConstr
        ExprColonExpr
          StrLit name
          Prefix
            Ident !"%"
            StrLit John
        ExprColonExpr
          StrLit age
          Prefix
            Ident !"%"
            IntLit 30
    Prefix
      Ident !"%"
      TableConstr
        ExprColonExpr
          StrLit name
          Prefix
            Ident !"%"
            StrLit Susan
        ExprColonExpr
          StrLit age
          Prefix
            Ident !"%"
            IntLit 31
This turned out quite big, but from here we can see what the AST we want to construct looks like. We do the same for j1 to see what we’re working with:
StmtList
  Bracket
    TableConstr
      ExprColonExpr
        StrLit name
        StrLit John
      ExprColonExpr
        StrLit age
        IntLit 30
    TableConstr
      ExprColonExpr
        StrLit name
        StrLit Susan
      ExprColonExpr
        StrLit age
        IntLit 31
The idea now is to insert a % at each level, except in front of the first element of each colon expression, which in our case means the keys "name" and "age".
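One way to implement this traversal is sketched below. It is a reconstruction from the description rather than the article's original listing, and it excludes the standard library's own %* on import so that the macro defined here is the one being called:

```nim
import std/macros
import std/json except `%*`

proc toJson(x: NimNode): NimNode {.compileTime.} =
  case x.kind
  of nnkBracket:
    # [a, b, ...]: convert every element
    result = newNimNode(nnkBracket)
    for child in x:
      result.add toJson(child)
  of nnkTableConstr:
    # {key: value, ...}: keep the key, convert only the value
    result = newNimNode(nnkTableConstr)
    for pair in x:
      pair.expectKind(nnkExprColonExpr)
      result.add newTree(nnkExprColonExpr, pair[0], toJson(pair[1]))
  else:
    result = x
  # Wrap whatever was built at this level in a call to %
  result = prefix(result, "%")

macro `%*`(x: untyped): untyped =
  toJson(x)

let j1 = %*[
  {"name": "John", "age": 30},
  {"name": "Susan", "age": 31}
]
echo j1
```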
And that’s it! Now our %* works just as we want it to. If we did anything wrong, we can modify the macro to check the actual code it produces:
This prints:
% [% {"name": % "John", "age": % 30}, % {"name": % "Susan", "age": % 31}]
Perfect! This macro we just developed landed in Nim’s json module already.
Enum Parsing optimization
With enums we can create new types that contain ordered values, just like this:
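A minimal sketch, using an invented Fruit type:

```nim
type Fruit = enum
  Apple, Banana, Cherry

echo Banana          # prints Banana
echo ord(Banana)     # prints 1, the position in declaration order
echo Apple < Cherry  # prints true, enum values are ordered
```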
Strings can be parsed to an enum using parseEnum from strutils:
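For example, with a hypothetical Fruit enum:

```nim
import std/strutils

type Fruit = enum
  Apple, Banana, Cherry

let f = parseEnum[Fruit]("Banana")
echo f       # prints Banana
echo ord(f)  # prints 1

# An unknown name raises a ValueError.
try:
  discard parseEnum[Fruit]("Durian")
except ValueError:
  echo "not a fruit we know"
```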
If we do this a lot, we notice that it’s kind of slow though:
This takes 2.2 seconds on my machine. Let’s look at the definition of parseEnum to find out why:
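The standard library source is not reproduced here. The sketch below, under the made-up name slowParseEnum, only mirrors the behaviour this section describes: a loop over all enum values with a string conversion on every iteration.

```nim
import std/strutils

type Fruit = enum
  Apple, Banana, Cherry

proc slowParseEnum[T: enum](s: string): T =
  for e in low(T) .. high(T):
    # `$e` builds a fresh string for every value on every call
    if cmpIgnoreStyle(s, $e) == 0:
      return e
  raise newException(ValueError, "invalid enum value: " & s)

echo slowParseEnum[Fruit]("Banana")  # prints Banana
```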
We can see the problem already. We iterate through all the values inside the enum type, from low(T) to high(T). Then $e creates a string of each enum value, which is quite expensive. Since we already know the type of the enum at compile time, we could create the strings at compile time as well.
Again, let’s think about what we want the result to look like before writing the macro. Basically what we want to do is unroll the for loop at compile time:
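For a hypothetical three-value Fruit enum, the unrolled code we are aiming for could look like this hand-written sketch, in which the strings are literals known at compile time:

```nim
import std/strutils

type Fruit = enum
  Apple, Banana, Cherry

# What a generated parser for Fruit might expand to: one comparison per
# value, with no loop and no `$e` conversions at runtime.
proc parseFruit(s: string): Fruit =
  if cmpIgnoreStyle(s, "Apple") == 0: return Apple
  if cmpIgnoreStyle(s, "Banana") == 0: return Banana
  if cmpIgnoreStyle(s, "Cherry") == 0: return Cherry
  raise newException(ValueError, "invalid enum value: " & s)

echo parseFruit("Cherry")  # prints Cherry
```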
Now we can create the proc. Unlike in the last example, we won’t create the AST manually this time. Instead we use parseStmt to create a statement AST from a string containing Nim code. An equivalent parseExpr for expressions exists as well. Here’s how the final proc with a macro inside looks:
Running the same code with our new implementation of parseEnum takes 0.5 seconds now, about 4 times faster than before. Great!
HTML DSL
We can use Nim’s templates and macros to create domain specific languages (DSLs) that are translated into Nim code at compile time. Nim’s syntax is quite flexible, so this is a powerful tool. As an example we build a simple HTML DSL.
The goal is to be able to write this:
And thus print the following HTML:
For convenience we want to use the htmlTemplate macro as a pragma, annotated as {.htmlTemplate.}. Instead we could also write it in this way:
The htmlTemplate macro shall transform the page proc, adding a string return type and creating a new body out of the DSL definition, into this:
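To give a rough idea of the shape of such a generated proc, here is a hand-written sketch. The parameters title and content and the tag layout are assumptions made for illustration, not the article's actual template:

```nim
# Hypothetical result of the transformation: a string return type has been
# added and the body appends HTML fragments to result.
proc page(title, content: string): string =
  result = ""
  result.add "<html>\n"
  result.add "  <head><title>" & title & "</title></head>\n"
  result.add "  <body>" & content & "</body>\n"
  result.add "</html>\n"

echo page("My own website", "Hello")
```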
Looks simple enough, here’s how the macro works:
The real magic of recursively handling the HTML tags happens in htmlInner of course, a compile-time proc that calls itself recursively to iterate over the body definition:
We can check that we get the expected output by adding a simple echo result.repr at the end of htmlTemplate:
Where \x0A is just the newline character. Looks good and the output works!
emerald is a much more complete HTML DSL that works in a similar manner. A simpler HTML generator is included in the standard library in the htmlgen module.
Conclusion
I hope you enjoyed this trip through Nim’s metaprogramming capabilities. Always remember: With great power comes great responsibility, so use the least powerful construct that does the job. This reduces complexity and makes it easier to understand the code and keep it maintainable.
For further information and reference see: