Writing Code Layer by Layer

The Value of Abstraction

Abstraction is one of the most powerful concepts in software development. Without it, large software projects would be impossible to develop and debug. Hiding as many implementation details as possible limits the work required to understand how to use a software component and prevents the errors caused by uncontrolled alteration of key data structures. This is called "functional isolation" and is commonly implemented by defining an Application Programming Interface (API) that provides the only authorized methods to access the component.

Benefits of functional isolation include testability, scalability, reliability, and reusability:

Limiting access to code and data structures improves testability by reducing the number of tests required to exercise all possible conditions within the component. Only certain states and state changes are allowed.
Isolation makes large projects scalable. Multi-million line software projects are impossible to coordinate when every piece of code relies on every other piece of code. Development and test milestones would require that everyone finish at the same time - an impossible goal.
Testable code is reliable code; if you can't test code, you can't rely on it (see Software Design for Testability).
Testable code is reusable code; if you can verify that the code performs as intended, you can use the subsystem elsewhere without restarting the test process. Reusability is also improved when you can use a subsystem without importing vast quantities of unrelated code.
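
As a minimal sketch of functional isolation, consider a small name table whose data can be reached only through its public methods. The class name NameTable and its methods are hypothetical, chosen for illustration; they are not taken from any real package.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical component: a name table that assigns dense integer ids.
// Callers can reach the data only through the public methods, so the
// invariant "ids_[names_[i]] == i" cannot be broken from outside.
class NameTable {
public:
    // Returns the id for `name`, adding the name if it is not present yet.
    std::size_t intern(const std::string& name) {
        auto it = ids_.find(name);
        if (it != ids_.end()) return it->second;
        std::size_t id = names_.size();
        names_.push_back(name);
        ids_.emplace(name, id);
        return id;
    }

    // Looks up a name by id; the id must have come from intern().
    const std::string& name_of(std::size_t id) const {
        assert(id < names_.size());
        return names_[id];
    }

    std::size_t size() const { return names_.size(); }

private:
    std::vector<std::string> names_;                    // id -> name
    std::unordered_map<std::string, std::size_t> ids_;  // name -> id
};
```

Because only three entry points exist, the tests need to cover only the states those entry points can produce: a new name, a repeated name, and a lookup by id.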

Top Down vs. Bottom Up

Rare is the subsystem in a modern application that can be implemented in a single layer of code. Even straightforward data processing code has file I/O routines, data extraction routines, data checking, and data manipulation routines. While it is tempting to combine several of these routines into one large routine (particularly if they are small), the result is much harder to understand, test, debug, optimize, and extend.

In programming classes you were probably taught to make a flowchart or state diagram describing the top-level functionality of the code, then write code from the top down, moving to lower levels only when all of the levels above were completed. This is how managers like to think code is written, and sometimes it actually works for something larger than a classroom exercise.

The reality, particularly in the research and development (R&D) work that I do, is much different. There is a broad functional goal, and perhaps some performance requirements, but quite often there will be significant uncertainty about how something can be done or how well it will work. If you are developing an entirely new product, or even a new algorithm, some exploratory work will be necessary.

You may also be tempted to start from the bottom up, writing the lowest levels of code because you know what they need to do. You need something working, after all, to support the higher level code that provides all of the value in your product. This too can lead to trouble, because there is a temptation to implement only what you know, unduly constraining the features of the product.

For any project of significant size, I end up working from both directions. At the very top level, I sketch out the program's data flow. Usually this is in pseudocode or informal C/C++ code. Each time a subsystem call is needed, the requirements are noted as part of the stub call to that subsystem. Working in broad strokes to avoid spending too much time, I describe as much of this level as I can, then start looking at the subsystems that are needed.

For each subsystem, I gather together the calls (with their requirements) and repeat the process. The routines being called are the subsystem's API. I write pseudocode for each of the API routines, noting the requirements of the next level below this point.
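
The sketching process described above might look like the following in informal C++. This is only an illustration under assumed names: Shape, read_shapes, count_width_violations, and check_file are hypothetical. The top level records data flow, each subsystem call carries its requirements as a comment, and the subsystem that is not yet understood is left as a stub that fails loudly instead of pretending to work.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

struct Shape { int layer; int width; };  // hypothetical data carried between layers

// Subsystem API discovered while sketching the top level.
// REQ: return every shape in the file; throw on unreadable input.
std::vector<Shape> read_shapes(const std::string& path);
// REQ: count the shapes whose width is below min_width.
int count_width_violations(const std::vector<Shape>& shapes, int min_width);

// Top level: data flow only, no implementation detail.
int check_file(const std::string& path, int min_width) {
    std::vector<Shape> shapes = read_shapes(path);
    return count_width_violations(shapes, min_width);
}

// The requirements for this subsystem are clear, so it can be written
// and tested now.
int count_width_violations(const std::vector<Shape>& shapes, int min_width) {
    assert(min_width >= 0);
    int violations = 0;
    for (const Shape& s : shapes) {
        if (s.width < min_width) ++violations;
    }
    return violations;
}

// The requirements for the reader are still open, so it stays a stub.
std::vector<Shape> read_shapes(const std::string&) {
    throw std::logic_error("read_shapes: not implemented yet");
}
```

The two prototypes at the top are the beginning of the subsystem API; the REQ comments are what gets gathered when work moves down a level.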

Sometimes several subsystems must work together, in which case the pseudocode for each is written at the same time. In other cases, the requirements of a particular subsystem crystallize and I can focus on that subsystem by itself.

At some point as I work my way downward, an "aha!" mechanism kicks in, and I can see all of the requirements for a subsystem. Now the code for the subsystem can be written and tested. Going back to its caller(s), details can be filled in, and now I can work on another subsystem or begin to implement one or more of the callers.

Note that each subsystem is tested as it is written. This is crucial in a strict layer-by-layer development system, because you can't test code until all its prerequisites are working. It's very tempting to put in stub routines for subsystems so that you can show a prototype, but that presumes you will be able to get the subsystems working!

The other problem with writing code from the top down is that you have to test multiple levels of code at once. I have found that the time required to debug multiple levels of code increases exponentially with the number of levels (see The Cost of Debugging Software). Particularly in optimization programs, when your algorithms and data structures are experimental, you have to decide quickly whether a failure is an error in the implementation or a weakness in the design.

It may seem wasteful to test low-level code before its callers are complete, because some requirements may have been omitted. I have found that even when an implementation must be replaced completely, nearly all of the test cases can be retained. I had to rewrite a geometric design rule checking subsystem for one of my research projects - several thousand lines of precision high-performance code. I was able to keep about 80% of the test cases, because no matter how a width or spacing check is implemented, the shapes and values to be checked will still be the same. The requirements of a subsystem do not change nearly as much as the implementation.
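
One way to see why test cases outlive an implementation is to keep them in a table that refers only to shapes and expected answers. The sketch below is hypothetical (Rect, WidthCase, and both width_ok functions are invented for illustration and are far simpler than a real design rule checker), but it shows the same table being run against an original routine and its replacement.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

struct Rect { int x1, y1, x2, y2; };  // axis-aligned, x1 <= x2 and y1 <= y2

using WidthCheck = std::function<bool(const Rect&, int)>;

// First implementation: compare each dimension separately.
bool width_ok_v1(const Rect& r, int min_width) {
    assert(r.x1 <= r.x2 && r.y1 <= r.y2);
    return (r.x2 - r.x1) >= min_width && (r.y2 - r.y1) >= min_width;
}

// Replacement implementation: compare only the narrower dimension.
bool width_ok_v2(const Rect& r, int min_width) {
    assert(r.x1 <= r.x2 && r.y1 <= r.y2);
    return std::min(r.x2 - r.x1, r.y2 - r.y1) >= min_width;
}

// The test cases mention shapes and expected results, never an implementation.
struct WidthCase { Rect shape; int min_width; bool expected; };

const std::vector<WidthCase>& width_cases() {
    static const std::vector<WidthCase> cases = {
        {{0, 0, 10, 10}, 10, true},   // exactly at the limit
        {{0, 0, 9, 50}, 10, false},   // too narrow in x
        {{0, 0, 50, 9}, 10, false},   // too narrow in y
        {{-5, -5, 5, 5}, 10, true},   // negative coordinates
    };
    return cases;
}

// Runs every case against the given checker and counts the mismatches.
int count_failures(const WidthCheck& check) {
    int failures = 0;
    for (const WidthCase& c : width_cases()) {
        if (check(c.shape, c.min_width) != c.expected) ++failures;
    }
    return failures;
}
```

Calling count_failures(width_ok_v1) and count_failures(width_ok_v2) exercises both versions with the same data; only the function passed in changes.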

So why is this top-down design, bottom-up co-design strategy necessary? A strict top-down programming model assumes that you will be able to implement at the lower levels what you need for the upper levels. If there are multiple levels of code, this is never certain, and you may be forced to drop some features from the project after you have already committed them. A strict bottom-up programming model assumes that you know all of the requirements from the upper levels of code. This can get you in trouble when the data structures or algorithms are not extensible - "oops" requirements added as you implement upper levels can force total rewrites of lower levels.

I used this development strategy to implement a reader/writer package for the OASIS chip design file format. OASIS (Open Artwork System Interchange Standard) is a SEMI (Semiconductor Equipment and Materials International) standard for representing integrated circuit polygon data. It replaces the earlier GDSII file format and includes many methods for reducing the size of integrated circuit layout design files. A typical OASIS file is 10-20 times smaller than an equivalent GDSII file.

The compression methods come at a cost: lots of code. The OASIS reader/writer comprises 123 files totalling about 124,000 lines of code (including comments). Roughly half of the code is in test programs. The code is split into six layers, from byte I/O at the bottom to an abstract API at the top. For comparison, my GDSII reader/writer required about 8,000 lines of code including comments and test programs.

Layer by Layer

Programming discipline is key to a layer-by-layer development strategy:

Define a strict ordering of modules to help keep track of the layers. I build modules in order by layer. Sometimes different layers are in different directories.
Given this ordering, never let one module make a forward reference to another unless they are part of a tightly coupled unit. Such a unit is typically developed and tested together, which of course is harder than developing a smaller, simpler unit.
Assign an abstraction concept to each layer (e.g. "parsing of records within a file"). If a layer has multiple unrelated purposes, it will be harder to reuse that code elsewhere.
Ensure that the abstraction concept for each layer is progressively closer to something that an "average" user (whether a consumer or a developer) can comprehend. Users of the top-level code in my OASIS file reader/writer need to know only a few details of how the files are represented (namely record properties, which are user-defined anyway). All other details are abstracted away.
Strictly limit the interactions between modules in the same layer. Whenever a module in a layer calls another module in that same layer, you are in effect creating an intermediate layer.
Code in one layer of a subsystem should not directly invoke code more than one or two layers down. Doing so violates the abstraction concept. Calls to globally visible library code (e.g. object containers or assertions) do not fall under this rule; strictly speaking library code is not part of the subsystem.
Help code maintainers by creating a file that describes the layers, naming the files in each one.
When adding new features, don't let features "move forward" in the dependency chain. For example, I defined an error manager with message replacement. It used a table of messages loaded from a file. I later added assertions to the table manager, thus creating a dependency loop: when the error manager wanted to retrieve a replacement message, an assertion failure in the table manager would result in a recursive call to the error manager.

This seems like a lot of work, but it is small compared to the effort of developing a large subsystem such as the OASIS reader/writer. Reading and writing integrated circuit design files, though critical, is a tiny part of the chip design process. If you and your development team aren't disciplined everywhere, the millions of lines of code needed to design a chip or search the Web or simulate the weather will never work together.
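
Some of these rules can be checked mechanically. The following is a rough sketch, assuming the layer description is available as a table of module names, layer numbers, and the modules each one calls; Module and check_layers are hypothetical names, and the limit of two layers is simply the "one or two layers down" guideline from the list above. Names that do not appear in the table are treated as library code and ignored.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct Module {
    int layer;                      // 0 is the lowest layer
    std::vector<std::string> uses;  // names of the modules this one calls
};

// Returns one message per rule violation found in the table.
std::vector<std::string> check_layers(const std::map<std::string, Module>& modules,
                                      int max_depth = 2) {
    assert(max_depth >= 0);
    std::vector<std::string> problems;
    for (const auto& entry : modules) {
        const std::string& name = entry.first;
        const Module& mod = entry.second;
        for (const std::string& dep : mod.uses) {
            auto it = modules.find(dep);
            if (it == modules.end()) continue;  // library code: exempt
            int drop = mod.layer - it->second.layer;
            if (drop < 0) {
                problems.push_back(name + " -> " + dep + ": forward reference");
            } else if (drop > max_depth) {
                problems.push_back(name + " -> " + dep + ": skips too many layers");
            }
        }
    }
    return problems;
}
```

A check like this would report a low-layer table manager that calls a higher-layer error manager as a forward reference. Calls within the same layer pass, since the rule for those is to limit them, not to forbid them.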

As an example, here are the layers of code within the OASIS file reader/writer:

layer 0: read or write file bytes, possibly compressed
layer 1: read a primitive (integer, float, or string)
layer 2: read the fields of a record
layer 3: manage relationships between records
layer 4: manage file-level structures; validate records
layer 5: API

Layer 0 handles file buffering, especially when compressing data for writing. Layer 1 manages the representations of numbers and strings in a machine-independent manner. Layer 2 contains the individual record parsers, which read fields based on a flag field. Layer 3 handles modal variables for repeated coordinates and conversion of OASIS-specific geometry representations to generic polygons and paths usable by any application. Layer 4 handles name tables for indexing and implements record-by-record validation. Finally, layer 5 is the API that callers can use to read or write OASIS data without knowing anything about the OASIS specification except record properties.

Note that layer 4 has two main functions: management of file-level structures and record validation. Record validation includes cross-checking the records against the file-level structures (name tables in particular). Strict module ordering still applies here (name table management does not validate records) but the definition of "layer" is blurred both because there are two functions at this level and because the validation module (one file) calls the others. The distinction here is that the next layer (the top-level API) can access both sets of routines in the layer, and some validation routines are optional.
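
To make the layering concrete, here is a greatly simplified sketch of what the bottom three layers of such a reader could look like. Nothing here is taken from the actual reader/writer: ByteSource, read_unsigned, PointRecord, and read_point_record are hypothetical, the record layout is invented, and the integer format is assumed to be a generic encoding of seven data bits per byte with the high bit marking continuation. Each layer calls only the layer directly beneath it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Layer 0: hands out raw bytes. A real version would buffer a file
// and handle compression.
class ByteSource {
public:
    explicit ByteSource(std::vector<std::uint8_t> data) : data_(std::move(data)) {}
    std::uint8_t next() {
        if (pos_ >= data_.size()) throw std::runtime_error("unexpected end of data");
        return data_[pos_++];
    }
    bool at_end() const { return pos_ == data_.size(); }
private:
    std::vector<std::uint8_t> data_;
    std::size_t pos_ = 0;
};

// Layer 1: decodes one primitive. Assumed format: low-order seven bits
// first, high bit set on every byte except the last.
std::uint64_t read_unsigned(ByteSource& in) {
    std::uint64_t value = 0;
    for (int shift = 0; shift < 64; shift += 7) {
        std::uint8_t b = in.next();
        value |= static_cast<std::uint64_t>(b & 0x7F) << shift;
        if ((b & 0x80) == 0) return value;
    }
    throw std::runtime_error("integer too long");
}

// Layer 2: reads the fields of one (invented) record. A flag field says
// which optional fields follow: bit 0 = x present, bit 1 = y present.
struct PointRecord {
    bool has_x = false, has_y = false;
    std::uint64_t x = 0, y = 0;
};

PointRecord read_point_record(ByteSource& in) {
    PointRecord rec;
    std::uint64_t flags = read_unsigned(in);
    if (flags & 1) { rec.x = read_unsigned(in); rec.has_x = true; }
    if (flags & 2) { rec.y = read_unsigned(in); rec.has_y = true; }
    return rec;
}
```

The record parser never touches a byte directly, so layer 1 can be tested with byte arrays alone and layer 2 with sequences of already-verified primitives.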

Implementation Issues

The abstractions involved in layer-by-layer programming can sometimes have a noticeable impact on runtime. A routine that simply repackages its parameters to pass down to a lower layer (a "wrapper") is not doing useful work; it only preserves the abstraction concept.

In C, you can reduce the runtime overhead of abstraction layers by defining macros that execute "wrapper" functionality. These can be tricky to use, of course. A syntax error in a deeply nested macro invocation can be very difficult to fix.

In C++, you can define inline functions. These are expanded into the caller at the option of the compiler, preserving the abstraction but reducing function call overhead and allowing for code optimization. These work so well that I almost never use macros in C++ programming.

Both of these methods allow you to replace the inline code with standalone routines if the task grows. It's much easier to recompile everything than to rewrite all of the code which used private routines or data structures from lower layers.
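
A small C++ sketch of such a wrapper follows. ByteSink and write_record_type are hypothetical names, and the one-byte limit is an assumption made only to keep the example short.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Lower layer (hypothetical): appends raw bytes to a buffer.
class ByteSink {
public:
    void put(std::uint8_t b) { bytes_.push_back(b); }
    const std::vector<std::uint8_t>& bytes() const { return bytes_; }
private:
    std::vector<std::uint8_t> bytes_;
};

// Upper layer: callers speak in terms of record types, not bytes.
// Today this only repackages its argument, so it is declared inline and
// the compiler may expand it into the caller.
inline void write_record_type(ByteSink& out, unsigned type) {
    assert(type < 256);  // assumed limit for this sketch
    out.put(static_cast<std::uint8_t>(type));
}
```

If record types later need a multi-byte encoding, write_record_type can become an ordinary out-of-line function with the same signature; callers are recompiled, not rewritten.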

Use inline functions and macros judiciously, of course; "code bloat" can have its own costs as heavily executed loops grow beyond the size of the processor's cache. If you are usually skipping over a function call, there is no gain to having it inline. Profile your code before resorting to these methods for anything but the most trivial routine.

Conclusions

Informal top-down code architecture combined with bottom-up implementation allows you to develop testable, reusable code that is easy to understand. Testing the code as you go helps improve code quality still further (see Yes, You Can Test Every Line of Code), as you don't have to debug multiple levels of code simultaneously.

The certainty you get from a fully tested infrastructure helps you build in quality all the way up. And even if the preliminary specification passed down has to be revised, you haven't wasted a lot of time - requirements don't change that much and most of the test code can be reused. The new version will be up and running much faster.

You need discipline to build modern software projects with millions of lines of code. Layer-by-layer code development lets you start those good habits early. Your customers will thank you.

Chapman Consulting

Software Development Done Right.
