ANTLR is a DSL tool that can generate a language parser based on a grammar (*.g) file. From there, you hook into the parser so that your code can interpret the language into constructs your application can understand. But, in order to get your grammar file correct, you make end up using a variety of development methods such as TDD, and all of them will have the same step: run the ANTLR tool against your grammar file to generate your parser code.The ANTLR web site has an article on its wiki that will let you integrate the ANTLR tool into your Visual Studio Build process. Follow the steps outlined in that article first.  Since VS 2008 project files are basically MS Build files, the steps work almost verbatim. However, here are a few caveats I encountered:The Exec command in the GenerateAntlrCode target needed some classpath modifications. Basically, I needed to tell java where the Antlr jar file was located. Here is what I ended up with:

1:  <Target Name="GenerateAntlrCode" Inputs="@(Antlr3)" Outputs="%(Antlr3.OutputFiles)">     2:      <Exec Command="java -cp %22$(Antlr3ToolPath)\antlr.jar;$(Antlr3ToolPath)\antlr-3.0.1.jar;$(Antlr3ToolPath)\stringtemplate-3.1b1.jar%22 org.antlr.Tool -message-format vs2005 -lib $(AntlrGrammarPath)  @(Antlr3)" Outputs="%(Antlr3.OutputFiles)" />     3:    </Target>

A couple of notes regarding this target:

  • %22 is the escaped form of double-quotes (“)
  • I decided to create a build variable so that any changes to a path was updated everywhere (DRY principle).
  • You need to reference several jars:
    • antlr.jar -> The Antlr 2.7.7 tool
    • antlr-3.0.1.jar -> The Antlr 3.0.1 tool
    • stringtemplate-3.1b1.jar -> The String Template library

The antlr 2.7.7 jar is needed for the StreamToken class that seems to be in a different namespace than the 3.0.1 jar.The properties called Antlr3ToolPath and Antlr3GrammarPath are defined in a property group like so:

1:    <PropertyGroup>     2:      <Antlr3ToolPath>$(MSBuildProjectDirectory)\..\..\Tools\ANTLR</Antlr3ToolPath>    3:      <AntlrGrammarPath>$(MSBuildProjectDirectory)\Model\Parsers</AntlrGrammarPath>     4:      <BuildDependsOn>GenerateAntlrCode;$(BuildDependsOn)</BuildDependsOn>     5:    </PropertyGroup>

The reason the GrammarPath is basically pointing to the output folder of the parser. This is needed because I am using the ANTLR output option of AST, which generates an Abstract Syntax Tree that much be consumed by another ANTLR grammar. For more details on why you need two grammar files for this process, I recommend the book The Definitive ANTLR Reference, by Terrence Parr.After that, I was able to use Visual Studio to generate the code on every build of the code. When using TDD tools like resharper or testdriven.net, this is an added bonus in that any changes made to the grammar are incorporated in my unit tests. Bonus number 2: since the changes are within the project file itself, and since my continuous integration server is based on my Visual Studio Solution file, the grammar files are also included on my build server and will cause the build to fail if the grammar fails to compile.I hope your build goes just as nicely.

As a student in the Engineering college, I thought all the computer science classes were a waste of time. I mean, who cares how to properly define a push down automata and prove that it is, in fact, deterministic? Can P = NP? How do you write a grammar that defines this language? This class was the engineering equivalent of the touchy-feely hippie course that was required to complete my Computer Engineering degree. I mean, we were engineers, we weren’t going to be doing proofs in our future careers. We were going to be paid to implement; and I am being paid to implement software now. 

In much of the code I have seen, there have been far too many examples where a background in computer science was not present in the developer’s mind (minds when pairing).  Not that you have to have a formal degree, or classes, but a fundamental understanding that there are good ways to do things, and bad ways. It goes back to the analogy of having all the tools in your toolbox to use so that every problem does not become the nail to your hammer. If you understand the concept and application of grammars and languages, then you can use the tools available to you to create one. 

I have learned in the time since I completed college and have become a full time software engineer is that you need that background in computer science in order to understand the tools available to you, and to implement them properly.  This is the difference between a software engineer with a college education and a hacker who just writes code.

So, to bring this post to something concrete, let’s look at what I’m working on now. I’ve been tasked with parsing some stuff for an application. I could have just as well created my own parser using my own TDD methods so that at least when I need to make changes to the parsing algorithm or the language changes, I can make changes easily and know where things break, FAST. But I thought to myself. “There is no way I am the first person to encounter this problem, someone else has to have some experience to share” So, poking around, I found an open-source project called ANTLR. Now, to be honest, I’m still trying to wrap my head around certain aspects of what is done by the tool, and how to use it effectively. But, by using this tool, I effectively get a high end custom language parser generated for me based on my language specification. This is also a section of the code I don’t have to directly maintain. I do have to fix language inaccuracies, however, I get a lot of functionality for free. Things like error-checking, multi-language compatibility, and efficient parsing algorithms are useful, and completely maintained by someone else!

This post has turned into an agile leaning example, but this is one of the many tips included in The Pragmatic Programmer. Be sure to learn the tools available to you because while it will be slower at first to grok something new, once you get over the hump, you will be able to use that tool at a moment’s notice, and move on to other problems that need solving.