Design for total cost


Approximate time needed to find a software defect, by the stage at which it is found:

Requirement Review 10 Minutes
Code Review 20 Minutes
Unit Level Testing 1 Hour
Automated Tests 10 Hours
Manual Tests 15 Hours
User Testing 20 Hours
Customer Finds 30+ Hours
(from “Quality through Change-Based Test Management” — IBM)

Most software projects think they are doing well once they have a process for user testing in place. That puts them at a cost of 20 hours per defect. Does that sound good to you?

Spending money in the maintenance ball is spending it wrong. Spending it right means focusing on the little requirements ball. That is where you find real leverage (10 minutes to find a defect vs. 30+ hours if an end user finds it).
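To put a number on that leverage, here is a minimal sketch in Python using the figures from the table above. The function name leverage and the dictionary layout are my own illustration, not part of the IBM material, and “30+ hours” is treated as exactly 30 hours, so the first result is a lower bound.

```python
# Approximate time to find one defect at each stage, in minutes
# (figures taken from the table above).
COST_MINUTES = {
    "requirement review": 10,
    "code review": 20,
    "unit level testing": 60,
    "automated tests": 10 * 60,
    "manual tests": 15 * 60,
    "user testing": 20 * 60,
    "customer finds": 30 * 60,  # "30+ hours", so this is a lower bound
}


def leverage(early: str, late: str) -> float:
    """How many times cheaper it is to find a defect at `early` than at `late`."""
    return COST_MINUTES[late] / COST_MINUTES[early]


print(leverage("requirement review", "customer finds"))  # 180.0
print(leverage("requirement review", "user testing"))    # 120.0
```

In other words, by these figures a defect caught in a requirement review costs at least 180 times less to find than one caught by a customer.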

Cost to user

The cost to user is difficult to estimate, but very important. Sometimes the cost is visible; for example, a software defect that prevents 1000 users from working would be quite noticeable.

Sometimes the cost is hidden and more insidious. A design flaw that makes something difficult to accomplish carries a cost to user as well, but in the form of stress. Imagine a bad design flaw staying with the application for five years, stressing out thousands of users who have to deal with it on a daily basis.

  • Aim for user friendliness
  • Thorough testing before release
  • Easy way to report bugs
  • Knowledgeable help-desk

Avoid maintenance cost through increased reliability

Software should be designed with maintenance goals in mind. Most of the time we design for unneeded complexity. Use a simple infrastructure, and make it complex only if you need to. Every item you add to the chain of items needed for operation increases the risk of failure of said chain.

  • 24h service, or office hours?
  • Automated surveillance reduces downtime
  • Avoid complexity in server infrastructure!
  • Surveillance tools for server admins
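The chain argument can be made concrete with a little arithmetic. The sketch below assumes that every component must be up for the application to work and that components fail independently of each other. The 99% availability figure and the component counts are made-up numbers for illustration only.

```python
from math import prod


def chain_availability(component_availabilities):
    """Availability of a chain in which every component must be up.

    Assumes the components fail independently, so the availabilities multiply.
    """
    return prod(component_availabilities)


# Hypothetical infrastructures: each component is up 99% of the time.
simple_chain = [0.99] * 2    # e.g. one web server and one database
complex_chain = [0.99] * 6   # e.g. extra proxies, queues and caches in the path

print(round(chain_availability(simple_chain), 4))   # 0.9801
print(round(chain_availability(complex_chain), 4))  # 0.9415
```

Under these assumptions, going from two to six required components takes the expected downtime from roughly 2% to almost 6%, even though no single component got any worse.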

Minimize developer hours spent on data repair

Sometimes we need to alter the data (or metadata) of the database. The need can come from a requirements change, from a database crash, from badly designed user input validation, or from external data-load. If you are using very complex data structures in the database, these changes will be very costly to carry out.

  • A sufficiently normalized database is easier to salvage
  • Easier to repair non-dynamic data structures
  • Establish data contracts for integration early
  • Use strict database validation. Never allow input or import of erroneous data
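As one possible illustration of strict database validation, the sketch below uses SQLite through the sqlite3 module in the Python standard library. The invoice table, its columns, and the try_insert helper are hypothetical. The point is only that constraints declared in the schema reject bad rows no matter which application or import job tries to write them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE invoice (
        id           INTEGER PRIMARY KEY,
        customer     TEXT    NOT NULL CHECK (length(customer) > 0),
        amount_cents INTEGER NOT NULL CHECK (amount_cents >= 0)
    )
""")


def try_insert(customer, amount_cents):
    """Return True if the row was accepted, False if the database rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO invoice (customer, amount_cents) VALUES (?, ?)",
                (customer, amount_cents),
            )
        return True
    except sqlite3.IntegrityError:
        # Raised by SQLite for NOT NULL and CHECK constraint violations.
        return False


print(try_insert("ACME", 1250))  # True
print(try_insert("ACME", -5))    # False, negative amount violates the CHECK
print(try_insert(None, 100))     # False, missing customer violates NOT NULL
```

Bad data that never gets in is data that no developer has to repair later.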

Turnover costs

The developer team will experience turnover sooner or later. These costs can be reduced by relevant documentation and use of standards and standard technology. If the technology is special-special, and no one knows how to do anything except for the guy who is leaving, replacing him will obviously be very costly.

  • Good specifications make team turnover and knowledge transfer less costly
  • Standard technology makes it easier to replace team members.
  • Standard guidelines make it easier to replace team members.

Software design can increase maintainability

It is no surprise that a good design increases maintainability. What does that really mean? It means making changes to the application will be cheaper. Changes can be both developing something new, and repairing something that is broken. The challenge here is to keep the design working while adding complexity. Shortcuts introduce software rot and must be avoided.

  • Domain driven design will reduce complexity
  • If two things can be separate they should be separate. DRY (don’t repeat yourself) is a double-edged sword.
  • Good design is not an awesome ball of yarn that can do anything
  • Good design is as simple as possible, but not simpler.

Be aware of software entropy

The development team is in a constant fight against software entropy. Every change to the codebase carries with it the chance to mess something up. It can be by introducing new bugs, or simply by destroying the design. A lot of team turnover coupled with feature creep will guarantee software rot.

The team must be made aware of this entropy, because the application is constantly moving towards it. If it sets in too far, you will have an unmaintainable application where changes are so costly and risky that it’s not worth making them.

When you get the feeling that developers are very hesitant to add any new features, that they prefer to poke at the application from afar (with a tall rod), you are probably looking at the putrid pile of an unmaintainable application.

That is why every new feature must not only be weighed against the direct cost to code it, but also against the hidden cost of adding complexity to the application as a whole, making it more expensive to maintain.

  • Allocate time for redesign and the fight against software rot
  • Be extremely wary of feature creep
  • Keep conceptual integrity in mind when adding new features

Fixing broken windows


The broken windows theory is a criminological theory of the norm-setting and signalling effects of urban disorder and vandalism on additional crime and anti-social behavior. The theory states that monitoring and maintaining urban environments in a well-ordered condition may prevent further vandalism as well as an escalation into more serious crime.

Wikipedia, Broken windows theory

The theory is that fixing broken windows could be a method to combat further vandalism. Instead of letting a broken window turn into a broken door turn into a broken wall turn into a wrecked house, you would simply repair the windows as they break.

The basic idea can apply to many different areas of life, for example litter on the streets, neglecting to pay your bills, and so on. This is entropy at work: whatever requires effort to maintain will break down without a steady input of effort. If you charted the phenomenon, it would look something like an exponential function bottoming out due to diminishing returns (it requires less effort to maintain a low-entropy state).

The maintenance of software is also subject to entropy. Software needs a constant input of effort to remain maintainable.

Building begins

The team agrees on rules regarding documentation and specifications. They set up rules as to what type of code goes into which parts, and which functions go into which domains. The team agrees on rules for testing and QC, and promises to refactor early and often. All the good stuff, yeah.

The application is being developed, and its quality is according to the rules set up by the team. If the rules are good and the team follows them, good software will result.

After development the application goes into production and the team scales down, sometimes leaving a group of junior developers to maintain the application.

After release a host of bugs and design shortcomings appear, often within the first few months. Some will require little effort to fix, some will require more. Suddenly the team is dealing with fewer hours yet harsher deadlines. Users are phoning in demanding new releases.

At this point the senior developer (or architect) might have left the team and is now allocated to other projects.

Keeping software working

Without an understanding of the design, and an understanding of discipline, will the junior team continue refactoring? Updating specifications? Separation of code? Documentation? Testing? QC?

All the good habits that were agreed upon as critical during development are soon forgotten in the maintenance phase. Entropy and software rot set in, shortcuts are taken, and the codebase goes beyond repair.

The truth is that it takes a good deal of effort to keep a codebase workable.

Graphing the maintainability of a codebase over time might look like a sawtooth. With every change, maintainability goes down. You need effort to push it back up again: effort to keep the design working, keep the specs updated, test the changes, do QC, and so on.

If the team neglects pushing it back to “good as new” for too long, it will degenerate past an event horizon where the only way out is to do it all over again. There are too many broken windows and too much effort is required to make it “whole” again.

In the figure above, the red line describes changes to code. For every change, quality and maintainability go down as complexity increases. That is why every new feature comes with an associated effort cost to make sure the design is still working. That means reevaluating object models, refactoring code, rewriting automated tests, and so on.

Degenerating codebase:

  • Team lacking understanding of overall design
  • Inexperienced coders
  • No specifications
  • Time pressure
  • No method in place to deal with code maintenance

In the figure above there is a sufficient input of effort to bring the codebase back to quality after each change. The codebase remains maintainable, and there are fewer bugs.

Regenerating codebase:

  • Coders understand value of refactoring
  • Coders understand overall design
  • Structure is in place to deal with redesign and refactoring
  • Unit testing is in place
  • Team will question feature creep
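The difference between the two cases can be illustrated with a toy model. Everything in it is an assumption made for illustration: maintainability is a single score between 0 and 1, every change lowers it by a fixed amount, and the team then spends a fixed amount of effort to raise it again. Real codebases are not this tidy, but the sawtooth shape and the long-term drift follow the same idea.

```python
def simulate(changes, decay_per_change, repair_per_change):
    """Track a hypothetical maintainability score in [0, 1] over a series of changes.

    Each change lowers the score by `decay_per_change`; the follow-up effort
    (refactoring, tests, updated specs) raises it by `repair_per_change`.
    The score is recorded after each step, which produces a sawtooth.
    """
    score = 1.0
    history = [score]
    for _ in range(changes):
        score = max(0.0, score - decay_per_change)   # the change itself
        history.append(score)
        score = min(1.0, score + repair_per_change)  # effort to push it back up
        history.append(score)
    return history


# Made-up numbers: both teams make the same ten changes.
degenerating = simulate(10, decay_per_change=0.08, repair_per_change=0.02)
regenerating = simulate(10, decay_per_change=0.08, repair_per_change=0.08)

print(round(degenerating[-1], 2))  # 0.4
print(round(regenerating[-1], 2))  # 1.0
```

With too little effort after each change the score ratchets downward. With enough effort it returns to where it started every time.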

Fighting software rot

Software rot will set in sooner or later. If it goes too far, the application becomes unmaintainable. We want to keep the quality high for as long as possible, and there are a few things that can be done to prolong the active life of software. Many of these items come from keeping a culture of good practice in the project, where you don’t allow for broken windows.

  • Understand that redesign takes time
  • Be extremely wary of feature creep
  • Keep conceptual integrity in mind when adding new features
  • Don’t add new features when stability is an existing problem
  • Team members with knowledge of the overall design should make the big changes
  • Separation of code
  • Unit testing in place will encourage refactoring
  • Good specifications will encourage refactoring

You could argue that these items aren’t practical, and yes, some applications are just too far gone to help. What is important is to understand that some applications are vastly more maintainable than others. With good maintainability, adding new and complex features actually takes very little time. On the other hand, if the team chooses to ignore software rot, you will quickly find yourself with an unmaintainable application where even the smaller features take a long time to implement.

Version 1.0

The most critical period in an application’s life cycle is not so much its infancy as its adolescence.

Before release the customer accepts bugs. Milestone deadlines are MS Project phenomena.

After release the nice customer is replaced by angry users. Deadlines? Yesterday.

Before release you have whiteboard meetings, cookies and pleasantries.

After release you get nasty phone calls and late nights at work.

The truth is: after release is where the design will be tested and where the difficult redesign will occur.

Is this, the period between release and maturity, where you need team discipline more than ever? Is this where you need experienced people who understand the application’s original design and goals?

I think so.

At release the application isn’t “done”; it has only just gotten started.