This Blog has Moved

June 21, 2008

I’m moving this blog to blog.codedemora.com. I’ll be keeping this one up for historical reasons. Please update your bookmarks and RSS feeds.

Thanks

There was quite a bit of uproar regarding my rant about C# regions. At the time, I was looking through a lot of really bad code, and I was pretty peeved at how all the pieces of complex code were hidden away by Visual Studio. However, some of the comments seemed to imply that there is no such thing as “perfect” or “good” code when you refer to code written by someone other than yourself. And, to be honest, that is true: I like to think the code I write is better than most other developers’. However, I want to qualify what “better” is:

Better code is:

  • Code that is tested
  • Code that is maintainable
  • Code that is understandable by someone other than me
  • Code that is changeable by someone other than me

Even with those guidelines, there is still some room to define more terms, but we have to stop at some point. Notice how none of these points refers to lines of code, the degree of algorithm obfuscation, or how low the average cyclomatic complexity is for your codebase. These bullet points take a professional approach to software development. That is, I and many others are professionals, and we should write our code as professionals. I hold myself to the same high standards as I hold others. Granted, I don’t meet those standards all the time either. I try not to beat myself up about it, because I would never win.

However, I wanted to be less emotional about why regions are really not that great. Just flat out, the main reason I see is that when you encapsulate code in region blocks, you miss an opportunity for one of many refactorings. In other words, you are knowingly violating principles of good software development by sweeping the “bad” code under the rug.

Regions are simply preprocessor-style comments. As with normal comments, they are stripped out before the code is compiled, and they are only there to help our non-digital brains cope with the mess we created in the first place.
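To illustrate (a trivial sketch with made-up class names): the two versions below compile to identical output, because the region directives are discarded during lexing, exactly like the comment.

public class ReportRepository
{
    #region Persistence helpers
    public void Save() { /* ... */ }
    #endregion
}

public class ReportRepository2   // identical compiled output, minus the fold marker
{
    // Persistence helpers
    public void Save() { /* ... */ }
}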

I try my best to follow the TDD (and, by extension, BDD) methodology, and so far, I have found no need for regions. Ever. I find that they clutter the code. I also find that TDD manages the mess I create much better than regions do, because the tests are like little pieces of thought that stick around to keep verifying my intentions several features back. TDD is by no means a do-all methodology, but it has saved my butt several times, and it does tend to make you write code that is neat, well tested, well organized, and region-free.
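As a trivial illustration of what I mean (a sketch assuming NUnit and a made-up Order class), a test like this sticks around and re-verifies my original intent on every single build, long after I’ve moved on to other features:

using NUnit.Framework;

[TestFixture]
public class OrderTests
{
    [Test]
    public void Total_Is_The_Sum_Of_Line_Items()
    {
        // Order is a hypothetical class, used purely for illustration.
        Order order = new Order();
        order.AddLineItem("widget", 2.50m);
        order.AddLineItem("gadget", 7.50m);

        // This assertion guards my intent for as long as the test lives.
        Assert.AreEqual(10.00m, order.Total);
    }
}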

This is very last-minute notice, but tomorrow is the Desert Code Camp in Tempe, AZ. I’ve attended the last few code camps, and they’ve been loads of help to me and my career in software development. This year, I decided that I should try to give something back, so I’m presenting on a couple of topics: Test Driven Development and Behavior Driven Development. I’ve been practicing and evangelizing TDD for a couple of years now, and this is yet another outlet to try to convince others that this is the path toward code enlightenment.

Spoiler Alert: BDD, on the other hand, is somewhat new to me. However, having read many of the descriptions and looked at the BDD testing frameworks, it’s really all about the way you look at tests, and the eventual evolution of what your first-written unit tests become: behavior specifications. I will be talking about the little I’ve learned about rspec, nspec, and nbehave tomorrow.

See you all tomorrow!

Just say No! to C# Regions

February 29, 2008

I’ve seen lots of C# code over the last few days, and 100% of it contains the #region construct. So far, I have seen it used in only one way: to hide big code blocks. Some of you might be saying “Duh! That’s what they’re for!”.

Well, let’s think about that for a minute:

You need a language construct so that your IDE can help you hide your big messy code block from you (because it’s just so hideous, you don’t even want to look at it anymore)?

If your code is so bad that you just want to shove it under the covers, then I would argue that your design, and the solution to said design, are too complex.

Most of the uses I’ve seen of regions are to group member fields, constructors (and related overloads), methods, and properties together. This has to be the most misguided way to use a tool, because you have other tools to perform this same function. Granted, not everyone can buy ReSharper, but its File Structure window provides all that same information. Heck, if you simply followed a simple code standard such as:

class <className>
{
    // private members
    // constructors
    // properties
    // methods
}

this would do the same job. Honestly, if you can’t LOOK at your code and tell a constructor apart from a property or field, then you should be looking for a different line of work. Please, stop commenting the obvious!

Now, you will eventually get classes that are thousands of lines long; however, there are other constructs in the language that can help you keep things neat without using this lame excuse for a preprocessor tag. When you DO find your class encroaching on that large line-count threshold, that is a prime time to look at your code for copy/paste refactoring tragedies and do it right. It might be a little painful now, but it could be the thing that saves your project in the long run.

Another lame way I’ve seen regions used is within a method. Usually there is a large block of setup code that has been copied and pasted all over the place, with the same region tags. C’mon, people: at that point, read about the Extract Method refactoring and actually REUSE that block of code intelligently!
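For instance (a made-up sketch using System.Data.SqlClient; the names are hypothetical), instead of repeating this region in every data-access method:

#region Setup command
SqlConnection connection = new SqlConnection(_connectionString);
connection.Open();
SqlCommand command = connection.CreateCommand();
command.CommandTimeout = 30;
#endregion

extract it once and call it everywhere:

private SqlCommand CreateOpenCommand()
{
    // The setup code now lives in exactly one place, so a change to
    // the timeout (or anything else) happens once instead of N times.
    SqlConnection connection = new SqlConnection(_connectionString);
    connection.Open();
    SqlCommand command = connection.CreateCommand();
    command.CommandTimeout = 30;
    return command;
}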

I honestly can’t think of any reason to use #region tags, because all the things they are hiding are just really bad code. When you start to hide bad code away, you are less inclined to go back and refactor it properly.

Let’s try this analogy: if you are looking to lose weight before the summer beach season, the best thing to do is to get a mirror. This way, you will be re-motivated on a regular basis.

The mere act of having the bad code stare back at you will make sure that it is at least not forgotten. Nearly all codebases have some code that is not kosher. Let’s try to make that lump of code as small as possible. By then, you may even find a way to get rid of that eyesore (and system-sore) altogether…after all, it was probably some hack anyway.

Antlr + log4net = ?

February 14, 2008

I’ve been using log4net in nearly every application I’ve written for the past couple of years. It has met my needs in every single instance. From logging to a file, to sending emails, to writing to a console, to doing all three at the same time, you can’t beat log4net (or your own flavor).

I’ve also started to use ANTLR for, what else, parsing some input. I got the hang of parsing input and converting it to an Abstract Syntax Tree without too many issues. However, when it came time to include actions while walking the tree, and to find out the values being created or parsed, nothing beats the old printf function. Or rather, WriteLine. However, in all the ANTLR examples I found, everyone was writing to the Console. I don’t like writing to the console directly anymore, as I find there may be a case where I want to pipe that output to the System Event Log (for example). So, I didn’t think too hard about what needed to come next: a reference to a log4net Logger in the grammar file.

In your grammar file, start with:

@header {
   using System.Reflection;
   using System.Text;
   using log4net;
}

Then add a dash of

@members {
   private static readonly ILog Log =
      LogManager.GetLogger(MethodBase.GetCurrentMethod().DeclaringType);
}

Then, in your actions, you can reference your Log object like so:

myRule
   : methodName '(' oneParam ')'
     {
        Log.Debug("Who cares where I am, I can send my parse output to log4net!");
     }
   ;

Parse with glee now that you can watch the parser do what it does best to your DSLs…
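One step the grammar file can’t do for you: the application hosting the generated parser still has to configure log4net. Here is a minimal sketch, assuming generated classes named MyLexer and MyParser (hypothetical names based on your grammar) and the myRule rule from above:

using Antlr.Runtime;
using log4net.Config;

class Program
{
    static void Main(string[] args)
    {
        // Quick console appender; swap in XmlConfigurator.Configure() to
        // route the parser's output anywhere log4net can reach.
        BasicConfigurator.Configure();

        ANTLRStringStream input = new ANTLRStringStream("methodName(oneParam)");
        MyLexer lexer = new MyLexer(input);         // hypothetical generated lexer
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        MyParser parser = new MyParser(tokens);     // hypothetical generated parser
        parser.myRule();                            // fires the Log.Debug action above
    }
}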

I’ve been a huge fan of Rhino Mocks since I heard about it on a way-back episode of HanselMinutes. One of the more awesome features of Rhino Mocks is that it utilizes Record/Playback semantics, so you can set up your mocks with actual calls to the mock objects, then play them back in your tests. This is great if you rely on IntelliSense when calling your object instances, or if you use ReSharper or another code refactoring tool.

Recently, I was trying to use Rhino Mocks with DataTables, like so:

MockRepository mockRepository = new MockRepository();

DataTable a = new DataTable();
DataTable b = new DataTable();

MockObject mock = mockRepository.CreateMock<MockObject>();

using (mockRepository.Record())
{
   Expect.Call(mock.CreateRecord("TableName", a, b)).Return(1);
}
using (mockRepository.Playback())
{
   MyObject obj = new MyObject(mock);
   obj.Save();
}

When I ran the test, Rhino Mocks said that the DataTable created during the obj.Save() call wasn’t the same as the one expected. This was correct, as there were now two instances, and it seems that the default check Rhino Mocks performs is an Object.Equals(). Since I needed to do some deeper comparisons of the Expected and Actual DataTables, a fellow developer and I started on a very simple DataTableConstraint:

    using System.Data;
    using Rhino.Mocks.Constraints;

    class DataTableConstraint : AbstractConstraint
    {
        private readonly DataTable _expected;
        private string _errorMessage;

        public DataTableConstraint(DataTable expected)
        {
            _expected = expected;
        }

        public override bool Eval(object obj)
        {
            DataTable actual = obj as DataTable;
            if (actual == null)
                return false;

            if (!CheckColumns(actual))
            {
                return false;
            }
            if (!CheckData(actual))
            {
                return false;
            }
            return true;
        }

        private bool CheckData(DataTable actual)
        {
            if (actual.Rows.Count == 0)
            {
                _errorMessage = "Actual table has no data to compare";
                return false;
            }
            try
            {
                for (int i=0; i < _expected.Rows.Count; i++)
                {
                    foreach (DataColumn column in _expected.Columns)
                    {
                        object expectedCell = _expected.Rows[i][column];
                        object actualCell = actual.Rows[i][column];

                        // Compare with Equals(): the != operator on object compares
                        // references, and boxed cell values are never reference-equal.
                        if (!Equals(expectedCell, actualCell))
                        {
                            _errorMessage = string.Format("Expected {0} in Row ({1}), Column ({2}), but was {3}", expectedCell, i, column.ColumnName, actualCell);
                            return false;
                        }
                    }
                }
                return true;
            }
            catch (System.Exception e)
            {
                _errorMessage = e.Message;
                return false;
            }
        }

        private bool CheckColumns(DataTable actual)
        {
            foreach (DataColumn column in _expected.Columns)
            {
                if (!actual.Columns.Contains(column.ColumnName))
                {
                    _errorMessage = string.Format("Could not find column {0} in {1}", column, actual);
                    return false;
                }
            }
            return true;
        }

        public override string Message
        {
            get { return _errorMessage; }
        }
    }

To use this with the sample code above, do:

Expect.Call(mock.CreateRecord(null, null, null)).Constraints
(
   Is.Anything(),
   new DataTableConstraint(a),
   new DataTableConstraint(b)
).Return(1);

Now, this is just a start, and it will probably evolve as I get more into verifying the data in two tables. Perhaps Oren will create a new syntax helper along the lines of Data.Equals(expectedDataTable), since using a new Constraint() call is a little awkward.
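In the meantime, a tiny helper can hide the constructor call. A minimal sketch (the Data class and TableEqual names are just my guesses at what such a helper might look like):

using System.Data;
using Rhino.Mocks.Constraints;

public static class Data
{
    // Wraps the constraint so expectations read a little more fluently.
    public static AbstractConstraint TableEqual(DataTable expected)
    {
        return new DataTableConstraint(expected);
    }
}

With that, the expectation above would read Constraints(Is.Anything(), Data.TableEqual(a), Data.TableEqual(b)).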

ANTLR is a DSL tool that can generate a language parser based on a grammar (*.g) file. From there, you hook into the parser so that your code can interpret the language into constructs your application can understand. But in order to get your grammar file correct, you may end up using a variety of development methods, such as TDD, and all of them will have the same step: run the ANTLR tool against your grammar file to generate your parser code.

The ANTLR web site has an article on its wiki that will let you integrate the ANTLR tool into your Visual Studio build process. Follow the steps outlined in that article first. Since VS 2008 project files are basically MSBuild files, the steps work almost verbatim. However, here are a few caveats I encountered.

The Exec command in the GenerateAntlrCode target needed some classpath modifications. Basically, I needed to tell java where the Antlr jar files were located. Here is what I ended up with:

<Target Name="GenerateAntlrCode" Inputs="@(Antlr3)" Outputs="%(Antlr3.OutputFiles)">
  <Exec Command="java -cp %22$(Antlr3ToolPath)\antlr.jar;$(Antlr3ToolPath)\antlr-3.0.1.jar;$(Antlr3ToolPath)\stringtemplate-3.1b1.jar%22 org.antlr.Tool -message-format vs2005 -lib $(AntlrGrammarPath) @(Antlr3)" Outputs="%(Antlr3.OutputFiles)" />
</Target>

A couple of notes regarding this target:

  • %22 is the escaped form of a double-quote (")
  • I created a build variable so that any change to a path is picked up everywhere (DRY principle).
  • You need to reference several jars:
    • antlr.jar -> The Antlr 2.7.7 tool
    • antlr-3.0.1.jar -> The Antlr 3.0.1 tool
    • stringtemplate-3.1b1.jar -> The String Template library

The antlr 2.7.7 jar is needed for the StreamToken class, which seems to live in a different namespace than in the 3.0.1 jar.

The properties called Antlr3ToolPath and AntlrGrammarPath are defined in a property group like so:

<PropertyGroup>
  <Antlr3ToolPath>$(MSBuildProjectDirectory)\..\..\Tools\ANTLR</Antlr3ToolPath>
  <AntlrGrammarPath>$(MSBuildProjectDirectory)\Model\Parsers</AntlrGrammarPath>
  <BuildDependsOn>GenerateAntlrCode;$(BuildDependsOn)</BuildDependsOn>
</PropertyGroup>
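For completeness, the @(Antlr3) items that feed the target are declared in an ItemGroup in the same project file. A sketch, assuming a hypothetical grammar named MyGrammar.g (your file names and exact metadata will depend on how you followed the wiki article):

<ItemGroup>
  <!-- each grammar file, plus the code files ANTLR generates from it -->
  <Antlr3 Include="Model\Parsers\MyGrammar.g">
    <OutputFiles>Model\Parsers\MyGrammarLexer.cs;Model\Parsers\MyGrammarParser.cs</OutputFiles>
  </Antlr3>
</ItemGroup>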

The reason the GrammarPath points to the output folder of the parser is that I am using the ANTLR output option of AST, which generates an Abstract Syntax Tree that must be consumed by another ANTLR grammar. For more details on why you need two grammar files for this process, I recommend the book The Definitive ANTLR Reference, by Terence Parr.

After that, I was able to use Visual Studio to generate the code on every build. When using TDD tools like ReSharper or TestDriven.NET, this is an added bonus in that any changes made to the grammar are incorporated into my unit tests. Bonus number 2: since the changes are within the project file itself, and since my continuous integration server builds from my Visual Studio solution file, the grammar files are also included on my build server and will cause the build to fail if the grammar fails to compile.

I hope your build goes just as nicely.