Wednesday, May 28, 2008

Introduction to staged metaprogramming and metalinguistic abstraction

What is metaprogramming?
According to Wikipedia, metaprogramming is "the writing of computer programs that write or manipulate other programs (or themselves) as their data... In many cases, this allows programmers to get more done in the same amount of time as they would take to write all the code manually."
Metaprogramming is not a "silver bullet", but it is well known as a technique that can significantly improve programmer productivity.
You probably don't write machine code by hand. You use the best programming language, insert_name_here. The compiler or interpreter of your favorite programming language is the brightest example of metaprogramming that really improves your productivity.
Moreover, I suppose that metaprogramming is the only known way to significantly improve productivity by reducing complexity [1].
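To make "a program that writes a program" concrete, here is a minimal sketch in Python. It builds the source text of a specialized function and then loads it. The names make_power_source and power_3 are invented for this illustration and are not taken from any library.

```python
def make_power_source(n: int) -> str:
    """Return Python source for a function that raises x to the fixed power n."""
    # The generated body is an unrolled product such as "x * x * x".
    body = " * ".join(["x"] * n) if n > 0 else "1"
    return f"def power_{n}(x):\n    return {body}\n"


# The "metaprogram" produces program text as ordinary data...
source = make_power_source(3)

# ...and the text is then turned into a callable function.
namespace = {}
exec(source, namespace)
power_3 = namespace["power_3"]

print(source)
print(power_3(2))  # 8
```

A compiler does the same thing on a much larger scale: it treats your program as data and produces another program from it.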

Metaprogramming and software development industry
Why isn't metaprogramming in the mainstream now? I put this question to my colleagues and got the following answers:
  • "Metaprogramming is only for GURUs"
  • "To do metaprogramming we need to write parser/compiler/interpreter/translator/whatever. These are too complicated"
  • "Programs that write other programs are hard to understand and debug"
  • "Metaprogramming: most people don't know what it is... This is both a reason and a consequence."
  • "Many people think that metaprogramming means that C++ templates will be used :)"
  • "It is hard to find people that can do this. It's more practical to hire 100 Indians and solve the task in the traditional way"
  • "Metaprogramming is suitable only for large-scale projects/systems/problems"
  • "It's idiotic (and horrible) to learn a hundred languages..."
  • "Advertising from big companies pushes people to use other methods"
  • "Lack of tools. Lack of theory"
Some of those statements are true, but some of them are just commonly shared myths.
One of the purposes of this series of articles is to collect and systematize materials that show the following:
  • Metaprogramming can be simple. It can be accessible to more than just GURUs
  • Metaprogramming can be used effectively across domains of different scale and problems of different size
  • Metaprogramming is entering the mainstream right now, as the complexity of problems rises
Metalinguistic abstraction
Metalinguistic abstraction is the process of solving complex problems by creating a new language or vocabulary to better understand the problem space. Such languages have recently come to be called domain-specific languages (DSLs). While "DSL" is a relatively new term, metalinguistic abstraction is an old and well-known way of solving problems [2].
Metalinguistic abstraction uses language as the main instrument of abstraction, in the same way as the procedural approach uses procedures, OOP uses objects/classes and AOP uses aspects.
With metalinguistic abstraction we use metaprogramming in a cascaded way: we translate a program written in a high-level DSL into a program in another DSL, then we translate it into lower-level DSLs until we reach the target platform. This is called staged metaprogramming. A top-down algorithm for solving problems with this approach can look like this:
  1. Describe the solution to your problem using a high-level DSL
  2. Choose the target platform. This can be whatever you want or whatever is required (examples: another DSL, machine language, a general-purpose programming language)
  3. If your DSL can be mapped easily and naturally to platform primitives, do so and finish
  4. Replace the original problem with the (smaller) problem of making the DSL runnable on the selected platform. Go to step 1.
In simple cases DSLs can be stacked one on top of another. In more complex cases DSLs can form a hierarchy.
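The cascade described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the "discount" node, the tuple encoding of programs and the function names lower and emit are all hypothetical. Stage 1 translates a domain-level DSL into a plain arithmetic DSL, and stage 2 maps the arithmetic DSL to the target platform (here, Python source text).

```python
def lower(expr):
    """Stage 1: rewrite domain-level nodes into plain arithmetic nodes."""
    if not isinstance(expr, tuple):
        return expr  # a number or a variable name
    op, *args = expr
    args = [lower(a) for a in args]
    if op == "discount":  # ("discount", price, percent)
        price, percent = args
        return ("sub", price, ("div", ("mul", price, percent), 100))
    return (op, *args)


SYMBOLS = {"add": "+", "sub": "-", "mul": "*", "div": "/"}


def emit(expr):
    """Stage 2: translate arithmetic nodes into Python source text."""
    if not isinstance(expr, tuple):
        return str(expr)
    op, left, right = expr
    return f"({emit(left)} {SYMBOLS[op]} {emit(right)})"


# High-level program: a 20% discount on the price, plus a fee of 5.
program = ("add", ("discount", "price", 20), 5)

lowered = lower(program)  # program in the lower-level DSL
source = emit(lowered)    # program for the target platform
result = eval(source, {"price": 100})

print(source)  # ((price - ((price * 20) / 100)) + 5)
print(result)  # 85.0
```

Each stage is small because it only has to bridge the gap to the next layer down, which is step 4 of the algorithm applied once.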
What are the benefits of metalinguistic abstraction?
  • Programs can be more readable and maintainable if they are specified in a language that is close to the requirements
  • Less duplication: it becomes possible to follow the DRY (Don't Repeat Yourself) rule.
  • Solutions tend to cope better with radical and unpredicted requirement changes.
    Why can this be true? Functional requirements mostly reside at the higher metalinguistic abstraction layers. They are "encoded" in a platform-independent, close-to-domain form.
    Non-functional requirements tend to reside at the lower metalinguistic abstraction layers. Even an unpredictable requirement to move the program to an exotic platform can often be satisfied by replacing the low-level layers without rewriting or affecting the high-level ones.
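A rough Python sketch of that last point: the high-level program stays untouched while the low-level layer is swapped. Everything here is assumed for the sake of the example. The recipe format, the operation names, the C-like firmware calls and the opcode numbers are hypothetical and do not describe any real device.

```python
# High-level layer: the functional requirement, independent of any platform.
RECIPE = [("heat_water", 92), ("grind", 18), ("brew", 25)]


def to_c_like(recipe):
    """Backend A: emit C-like calls for a hypothetical firmware API."""
    return "\n".join(f"{op}({arg});" for op, arg in recipe)


# Hypothetical opcode table for an imaginary byte-oriented controller.
OPCODES = {"heat_water": 0x01, "grind": 0x02, "brew": 0x03}


def to_bytecode(recipe):
    """Backend B: emit (opcode, argument) byte pairs for that controller."""
    out = bytearray()
    for op, arg in recipe:
        out += bytes([OPCODES[op], arg])
    return bytes(out)


c_source = to_c_like(RECIPE)
bytecode = to_bytecode(RECIPE)

print(c_source)
print(bytecode.hex())  # 015c0212031 9 without the space: 015c02120319
```

Moving to the second platform meant writing to_bytecode; RECIPE itself did not change.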
Is metaprogramming really complicated?
Any simple problem can be solved in a complex way using any technique, and metaprogramming is no exception. Just don't go this way. Anyone who wants to avoid it should not:
  • Invent yet another general-purpose language while solving the problem of controlling a coffee machine
  • Invent exotic syntax for your languages that is hard to parse
  • Select tools/languages/frameworks that don't let you manipulate and translate syntax trees smoothly and naturally, without overhead (examples of such overhead: maintaining a class for every AST node plus the machinery for the visitor pattern, manually maintained bridge class hierarchies, and so on)
  • Select tools/languages/instruments that require 5+ years of training before a programmer can do something serious.
To avoid building overcomplicated solutions it's reasonable to:
  • Put as little as possible into each DSL.
  • Think about any DSL in terms of its AST and keep its syntax as close to the AST as possible. This might let you avoid writing and debugging parsers altogether.
  • Select simple, easy-to-learn tools and languages (or subsets of languages) so that new people can join the project quickly
  • Choose one of the "programmable programming languages", i.e. an extensible language that directly supports homogeneous metaprogramming [3].
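Homogeneous metaprogramming means the metaprogram and the program it manipulates are written in the same language. Python is not an extensible language in the Lisp sense, but its standard ast module is enough for a small sketch of the idea: we get the syntax tree for free, without writing a parser, and rewrite it directly. The SquareToMul transformer is a made-up example; ast.unparse needs Python 3.9 or later.

```python
import ast
import copy


class SquareToMul(ast.NodeTransformer):
    """Rewrite every `e ** 2` into `e * e` at the syntax-tree level."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # rewrite nested expressions first
        if (
            isinstance(node.op, ast.Pow)
            and isinstance(node.right, ast.Constant)
            and node.right.value == 2
        ):
            # Copy the operand so the new tree does not share node objects.
            return ast.BinOp(
                left=node.left, op=ast.Mult(), right=copy.deepcopy(node.left)
            )
        return node


tree = ast.parse("y = (x + 1) ** 2")  # the standard library does the parsing
tree = ast.fix_missing_locations(SquareToMul().visit(tree))

rewritten = ast.unparse(tree)  # back to source text, for inspection
namespace = {"x": 3}
exec(compile(tree, "<rewritten>", "exec"), namespace)

print(rewritten)
print(namespace["y"])  # 16
```

Note that there is no class per node type to maintain and no hand-written visitor plumbing: the transformer only mentions the one node kind it cares about.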
Should we minimize the number of metalinguistic abstraction layers in order to reduce complexity? If your answer is "yes", consider the following analogy: should we minimize the number of functions in the procedural approach, or the number of classes in OOP, to reduce complexity? Of course the answer is "no". We can benefit from a clean, fine-grained design with thin, simple and highly isolated metalinguistic abstraction layers. But of course the cost of maintaining them should not grow exponentially with the number of these languages. For example, procedural programming languages make it easy to split a procedure into several parts when needed. The analogous operation should be just as easy for a DSL.

To be continued...

References:
1. A similar idea is highlighted in the article "Using a hierarchy of Domain Specific Languages in complex software systems design" by V.S. Lugovsky
2. "Structure and Interpretation of Computer Programs" (the book also known as SICP)
3. A Taxonomy of meta-programming systems
4. Lambda the Ultimate - The Programming Languages Weblog