Code Spelunking Techniques


When trying to understand other people’s code, I find a few different techniques helpful. Since I’ve been doing this a lot lately, they’re fresh in my mind.

  1. Categorize class state – I find it is often helpful to classify the fields in a class into a few common categories (some of which may not exist):
    • Configuration – Configuration consists of properties (typically simple types) indicating how this class should behave. Configuration attributes are passed to the class (in the constructor or via a setter) and are often constant throughout the lifetime of the class.
    • Resources – Resources are typically other (complex) objects that manage state or provide access to other components. They are similar in some ways to configuration but tend to have a weightier feel to them. They are often constant throughout the lifetime of the class.
    • State – State consists of attributes that hold the internal data pertinent to this class and typically change through the lifetime of the class. They may be initialized from configuration.

    You may see “derived” forms of all of these as well. Derived configuration may be computed from other configuration (by converting seconds to milliseconds, say, or a string value to an int). Derived resources may be obtained through the initial resources you are given. And derived state may be properties (age) computed from other properties (birth date).

    I find most attributes fall into these three broad categories. Separating the fields of a class into these three categories (and labeling them thusly) can be immensely helpful in understanding a class with a bunch of fields.
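
    As a rough sketch of what that labeling might look like, here is a made-up Java class with its fields grouped and commented by category. SessionCache and all of its fields and methods are hypothetical, invented purely for illustration; they are not from any real code base.

    ```java
    import java.time.Clock;
    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical class, invented for illustration only.
    class SessionCache {
        // Configuration: simple values handed in at construction, constant afterwards.
        private final int timeoutSeconds;
        private final int maxEntries;

        // Derived configuration: computed once from the configuration above.
        private final long timeoutMillis;

        // Resources: collaborating objects that give access to other components.
        private final Clock clock;

        // State: internal data that changes over the object's lifetime.
        private final Map<String, Long> lastAccessMillis = new HashMap<>();
        private int hits;

        SessionCache(int timeoutSeconds, int maxEntries, Clock clock) {
            this.timeoutSeconds = timeoutSeconds;
            this.maxEntries = maxEntries;
            this.timeoutMillis = timeoutSeconds * 1000L; // seconds -> milliseconds
            this.clock = clock;
        }

        // Records an access; counts a hit if the session was already known.
        void touch(String sessionId) {
            if (lastAccessMillis.containsKey(sessionId)) {
                hits++;
            }
            lastAccessMillis.put(sessionId, clock.millis());
        }

        // Derived state: computed on demand from state plus configuration.
        boolean isFull() {
            return lastAccessMillis.size() >= maxEntries;
        }

        // Derived state: unknown sessions are treated as expired.
        boolean isExpired(String sessionId) {
            Long last = lastAccessMillis.get(sessionId);
            return last == null || clock.millis() - last > timeoutMillis;
        }

        long timeoutMillis() { return timeoutMillis; }
        int hits() { return hits; }
    }
    ```

    Even in a class this small, the comments make it obvious which fields you can treat as fixed (configuration and resources) and which ones you need to track as the object is used (state).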

  2. Examine method callers / class users – search for who is calling a method in an interface or class to see how it is being used. In Eclipse, you can either open the method call hierarchy (Eclipse: Ctrl-Alt-H) or search for references on a class or interface (Eclipse: Ctrl-Shift-G or Cmd-Shift-G).
  3. Open method implementors – often you will look at a method in an interface and want to look at one or more of the implementations of that interface method. The easiest way to do this in Eclipse is to highlight the method, open the quick hierarchy (Eclipse: Ctrl-T or Cmd-T), then pick an implementation of the interface. Eclipse will open the implementation AND put you in the implementation of the method you had selected. I only learned this trick recently but it’s made the world a better place.
  4. Project dependency diagramming – sometimes it’s helpful to get a picture of the project dependencies. You can find this information in the files of your IDE or build system. In the past I have written crappy little programs to scrape it out of Eclipse .classpath and build XML files. Generally, this shouldn’t take too long as it can be pretty quick and dirty. Then just build a DOT file like this:

    digraph dependencies {
        "ui" -> "server"
        "tools" -> "server"
        "server" -> "data"
        "data" -> "common"
    }

    and feed it to any DOT tool such as Graphviz, which will generate a dependency graph for you.

    I often find there are projects like “common” that effectively everything depends on. Those projects generate a lot of lines but not much information, so I typically add a filter to whatever code I use to generate the .dot file that drops any dependency involving “common”. That cuts down on the noise and helps bring out the real structure.
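
    Here is a minimal sketch of the generating-and-filtering step, assuming the dependencies have already been scraped into a map of project name to the projects it depends on. The DependencyDot class, the toDot method and the sample data are all made up for illustration; the scraping itself is left out since it depends on your IDE or build files.

    ```java
    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;

    // Hypothetical helper, invented for illustration only.
    class DependencyDot {
        // Builds DOT text from a project -> dependencies map, skipping every
        // edge that starts or ends at a project listed in `ignored`.
        static String toDot(Map<String, List<String>> deps, Set<String> ignored) {
            StringBuilder sb = new StringBuilder("digraph dependencies {\n");
            for (Map.Entry<String, List<String>> entry : deps.entrySet()) {
                String from = entry.getKey();
                if (ignored.contains(from)) {
                    continue;
                }
                for (String to : entry.getValue()) {
                    if (ignored.contains(to)) {
                        continue;
                    }
                    sb.append("    \"").append(from).append("\" -> \"")
                      .append(to).append("\"\n");
                }
            }
            return sb.append("}\n").toString();
        }

        public static void main(String[] args) {
            // Sample data only; a real run would load this from project files.
            Map<String, List<String>> deps = new LinkedHashMap<>();
            deps.put("ui", List.of("server", "common"));
            deps.put("tools", List.of("server", "common"));
            deps.put("server", List.of("data", "common"));
            deps.put("data", List.of("common"));
            System.out.print(toDot(deps, Set.of("common")));
        }
    }
    ```

    If you save the output as deps.dot, Graphviz can render it with something like `dot -Tpng deps.dot -o deps.png`.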

    Of course the above pic is a toy example. Here’s a (partial) pic of the Terracotta project dependency graph, which is about mid-level complexity in stuff I’ve worked on. It’s messy but with not much effort you can get a pretty good idea how the code layers and where to look for the juicy stuff.

  5. Debug an example – Walking through a running test in the debugger can also be very enlightening about how things are put together. This is a good way to discover the run-time structure of the code (the previous item tells you more about the compile-time structure).

I hope some of that was useful! I’m sure there are other techniques but these are the ones I’ve used the most recently.

Comments

8 Responses to “Code Spelunking Techniques”
  1. Excellent categories of properties. I’ll have some of my client teams use these named categories for new and existing code.

    Thanks for taking the time to write this all out in a semi-formal fashion.

  2. Alex Barady says:

    Thanks for the point “Open method implementors”. It is really useful. I knew this Ctrl-T combination before but had no idea how to use the quick hierarchy in practice.

  3. Mark Bradley says:

    This may be good for OO based languages, but what techniques can be applied to all programming language paradigms?

  4. Alex says:

    Unfortunately, I don’t program in all programming language paradigms regularly so I don’t know. :) If you do, please blog it!

  5. Great analogy (re: code spelunking).

    I hope to blog on this as well, but for now:

    @ Mark B. It seems pretty easy to abstract these ideas for non-OO projects. e.g. All projects have some kind of organization. Also, the code base can probably be categorized either into layers (surely) or into similar categories (configuration, resources), even if the “units” aren’t members of a class. Finally, debugging (or tracing) a program is almost universal, no? (an exception being something like a realtime system but that’s a whole new world of hurt).

  6. There are now some tools that can help a lot with understanding a code base. Have a look at the features of the tool NDepend, for example:
    http://www.ndepend.com/Features.aspx

Trackbacks

Check out what others are saying about this post...
  1. […] Alex Miller writes about code spelunking techniques for when you’re digesting someone else’s code. He offers some good advice, particularly for tracking down call hierarchies and implementors (and some useful Eclipse shortcuts). What was missing, I thought, was the easiest one of all: […]

  2. […] Code Spelunking Techniques – Alex Miller talks about a few ways to get started with an unfamiliar codebase. […]