nicolas @ uucidl

Anti-Pattern: Blobs of test data

April 29, 2015 by nicolas, tagged testing and programming, filed under projects

During the development of automated tests, test data is sometimes represented in blobs, stored in central repositories. They are often shared across automated tests and help set them up. The repositories can take the form of code (constructing a complete tree of objects), files or even relational databases.

A shared repository of test data is often introduced because creating and setting up test data is difficult or costly, both at development and execution time. Some reasons:

1) the domain objects and their collaborators are hard to construct, fake or wire in a test,
2) the domain is itself very complex and test writers have to master many aspects of the domain to create the correct test data at runtime,
3) the creation of these objects takes time to execute.

Check whether those reasons really apply to your software project. Is 2) inherent to your domain? Can 1) and 3) be remedied? Could they even be a result of applying this anti-pattern?

Personal experience

I have worked on a project that had shared data in the form of a centralized database against which every unit + acceptance test suite would be run.

The database had been created at a certain date, then updated over time: sometimes by hand (handwritten SQL), sometimes by code, as well as through standard database schema migrations.

When a test failed, it would be either because the behavior of the code had changed (intentionally or not) or because the test data had not been migrated. Finding out the intent behind a piece of test data was also difficult. Did that person write this test against object O because it was an object with a precise, intended setup, or because it happened to have some property that the writer of the test liked? Those aspects were almost never documented. In effect, it meant that the test would often not document what it was constructed against.

Also, the test data always grew, because modifying items would mean taking the risk of breaking tests that you had no idea how to fix.

Why is a blob of test-data a unit-test anti-pattern?

A good unit-test is fast, precise, readable and isolated. It gives confidence in the working state of the system under test.

Tests become hard to read, imprecise and poorly isolated

Unit tests written against a blob of test data tend to be hard to read, poorly isolated and imprecise.

When a unit-test refers to the entire blob, or even to part of it, it potentially depends on the entire tree rather than isolating only a part of the system.

When the test cherry-picks one particular item of the test data blob, the precise setup that the test is using is barely described. One must read the data to find out what the test is actually doing.

When creating a new test, it is very tempting to just look around and reuse a piece of data that someone else has written. This becomes a liability if this item is later modified, and it couples the two tests implicitly (i.e. their failures are correlated).

It also means the test in question never really can state what its starting state is.

And if one does cherry-pick the correct data within the blob, then in practice each test gets its own test data within the entire blob. This means that the blob grows with the number of tests and never shrinks.

Tests become hard to trust

Unit tests written against a blob of test data also tend to be harder to trust.

In the long run, as the application changes, so must the test data. When the test data is not correctly versioned or updated, it becomes difficult to trust. Code-generated data is superior in this respect, because at least it can be made to use the basic operations of the data model, leading to well-formed test data. In practice, however, the blob is always a bit of a mixture of static and generated data.

Tests are still slow

Finally, performance-wise: although these blobs are often brought in to solve performance issues with setting up the tests, if the test data is mutable, all modifications made to the blobs must be rolled back so as to keep each test isolated. This may undermine the expected performance benefits of the shared data.

It goes further: when the test data repository is actually a shared resource such as a database, then it is inefficient under heavy parallel testing, making the unit test suite run slowly.

Why is a blob of test-data an acceptance-test anti-pattern as well?

While a unit test tests a system, an acceptance test tests a product.

A good acceptance test embodies the specification of the product in user terms.

When written against a blob of test data, an acceptance test becomes poorly specified. It starts depending on implicit properties of the test data.

Suggestions & Example

Write tests which directly construct their own starting state.

Unit-Test Example: specifications-based setup

A concrete alternative is to write your unit-test in this way:

a setup phase that constructs the objects out of a concise specification (a compressed version of your test data),
a test phase which operates on the resulting domain objects and verifies its expectations,
an unwind phase where the domain objects are destructed.

An example in JavaScript:

function test_thatNotesCanBeDeletedWithADoubleClick() {
    withMidiEditorOnNotes(
        // specification for this test's data:
        [
            { midiPitch: 64, startTime: 7.0 },
        ],
        function (midiEditor, midiNotes) {
            doubleClick(midiEditor, timeToX(7.0), midiToY(64));
            verify(midiNotes.isEmpty());
        }
    );
}
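
The helper used above is not shown in this post. What follows is only a minimal sketch of how such a helper could implement the three phases. Everything in it is an assumption made for illustration: `createNoteList`, `createMidiEditor`, `doubleClickNote` and `dispose` are hypothetical names, the editor is a simplified in-memory stand-in, and the conversion from time and pitch to screen coordinates is skipped.

```javascript
// Hypothetical in-memory note list, built from the test's specification.
function createNoteList(noteSpecs) {
    const notes = noteSpecs.map((spec) => ({ ...spec }));
    return {
        isEmpty: () => notes.length === 0,
        remove: (midiPitch, startTime) => {
            const index = notes.findIndex(
                (note) => note.midiPitch === midiPitch && note.startTime === startTime
            );
            if (index !== -1) {
                notes.splice(index, 1);
            }
        },
    };
}

// Hypothetical editor stand-in: assumes a double click on a note deletes it.
function createMidiEditor(midiNotes) {
    let disposed = false;
    return {
        doubleClickNote: (midiPitch, startTime) => {
            if (disposed) {
                throw new Error("editor used after unwind");
            }
            midiNotes.remove(midiPitch, startTime);
        },
        dispose: () => {
            disposed = true;
        },
        isDisposed: () => disposed,
    };
}

function withMidiEditorOnNotes(noteSpecs, testBody) {
    // Setup phase: build only the objects this test asks for.
    const midiNotes = createNoteList(noteSpecs);
    const midiEditor = createMidiEditor(midiNotes);
    try {
        // Test phase: the body operates on the domain objects and verifies.
        testBody(midiEditor, midiNotes);
    } finally {
        // Unwind phase: runs even when the test body throws.
        midiEditor.dispose();
    }
}
```

With a helper shaped like this, the starting state of every test is spelled out in the test itself, and nothing survives from one test to the next.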

Commentary on suggestion

For unit-tests this means constructing the smallest number of domain objects necessary for the system under test.

For acceptance tests this means dedicated setup code to move the product into a desired state via domain object manipulation. It is acceptable here to use dedicated shortcuts (using model operations) to bring the product efficiently into this state.

All in all, creating well-formed domain objects should not be an afterthought anyway. Types with a good specification and defaults that create well-formed values allow the creation of domain object values which can be used directly by tests.
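
As an illustration of defaults that yield well-formed values, here is a small sketch of a note factory. The names `NOTE_DEFAULTS` and `makeNote`, the `duration` and `velocity` fields and the chosen default values are hypothetical and only serve the example. The 0 to 127 range for pitch and velocity is the standard MIDI range.

```javascript
// Hypothetical defaults: any field a test does not care about gets a
// sensible, well-formed value.
const NOTE_DEFAULTS = Object.freeze({
    midiPitch: 60,
    startTime: 0.0,
    duration: 1.0,
    velocity: 100,
});

function isMidiValue(value) {
    return Number.isInteger(value) && value >= 0 && value <= 127;
}

// Builds a standalone note value; a test overrides only what it needs.
function makeNote(overrides = {}) {
    const note = { ...NOTE_DEFAULTS, ...overrides };
    if (!isMidiValue(note.midiPitch) || !isMidiValue(note.velocity)) {
        throw new RangeError("midiPitch and velocity must be integers in 0..127");
    }
    if (!(note.startTime >= 0) || !(note.duration > 0)) {
        throw new RangeError("startTime must be >= 0 and duration must be > 0");
    }
    // Frozen so that a value shared by accident cannot couple two tests.
    return Object.freeze(note);
}
```

A test that only cares about pitch and start time can then write `makeNote({ midiPitch: 64, startTime: 7.0 })` and still get a complete, valid note.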

It translates into domain objects that can be created anywhere (in C++: on the stack or on the heap), and objects that can live standalone without being part of a complex network of other objects. These are properties of a modular code base.

A proposal for tracking the health of a code base

September 13, 2014 by nicolas, tagged management and programming, filed under projects

Code as Liability, features as Asset

For a peer-reviewed software development project (ideally a module/sub-module) we introduce a dashboard to track its health.

The dashboard is regularly compiled and updated and includes:

A balance listing

“mass of code” as liability [EWD.1]
“user features” as asset

An indicator:

“feature density” the ratio “user features” per “mass of code” unit

It must be applied to peer-reviewed projects, where the review process exists to guarantee that code is and will remain easy to understand by all peers.

Of course, only features which are validated / tested in the software can be included in the dashboard.

Motivation

This metric encourages reducing the “mass of code” and/or producing a fine-grained list of “user features”, as both raise the feature density metric. It acts as both a trigger and a reward for the removal of cruft.

For a given module with a defined business scope, reducing the mass of code encourages finding simpler, more factored expressions of the user features in code, more compact documentation, as well as factoring out in other modules/products what is not directly linked to the domain.

For the same module, producing fine-grained lists of user features encourages the understanding of its scope, and can help break down development into smaller deliverable units.

Application

The metric is not intended for comparisons between software projects.

It is meant to be used by the developers themselves (software engineers, designers, documenters) to detect when and where they should direct their efforts. [EWD.2]

As for many other metrics, tracking the derivative of this metric (its variation over time) makes it easier to act upon.
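
A minimal sketch of tracking that variation, assuming the dashboard keeps one feature density value per compiled report. The `history` shape and the function name `densityVariations` are hypothetical.

```javascript
// Hypothetical history format: [{ date, density }, ...] in chronological order.
// Returns the change in feature density between consecutive reports.
function densityVariations(history) {
    const variations = [];
    for (let i = 1; i < history.length; i++) {
        variations.push({
            date: history[i].date,
            change: history[i].density - history[i - 1].density,
        });
    }
    return variations;
}

const variations = densityVariations([
    { date: "2014-07-01", density: 0.5 },
    { date: "2014-08-01", density: 0.75 },
    { date: "2014-09-01", density: 0.625 },
]);
// A positive change rewards cleanup or newly validated features;
// a negative change is a trigger to look at what was added.
console.log(variations);
```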

Mass of code unit

Mass of code is deliberately vague. Define it as you see fit. I would for instance include the code, its tests as well as its documentation. All of these need to be maintained in the name of the delivered features.

If code is by default “peer reviewed” then using lines of code is reasonable. With the peer review an additional control already exists for the readability of the code, and thus the lines of code are themselves already somewhat normalized.

User features unit

Inside a focused module, user features can be considered equivalent and simply counted.
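
Under these definitions, one dashboard entry could be computed as in the sketch below. It assumes that mass of code is counted in lines of code, tests and documentation, and that density is expressed per 1000 lines. The function name, the field names and the per-1000 scaling are choices made for this example only.

```javascript
// Hypothetical dashboard entry: liability is the mass of code,
// asset is the number of validated user features.
function featureDensity({ validatedFeatures, codeLines, testLines, docLines }) {
    const massOfCode = codeLines + testLines + docLines;
    if (massOfCode <= 0) {
        throw new RangeError("mass of code must be positive");
    }
    return {
        massOfCode,
        userFeatures: validatedFeatures,
        // User features per 1000 lines of maintained material.
        densityPerThousandLines: (validatedFeatures * 1000) / massOfCode,
    };
}

const entry = featureDensity({
    validatedFeatures: 10,
    codeLines: 3000,
    testLines: 1500,
    docLines: 500,
});
console.log(entry);
```

Removing 1000 lines of cruft from this module, with the same 10 features, would raise the density from 2 to 2.5 features per 1000 lines.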

References

[EWD.1]: Inspired by a quote from E.W. Dijkstra:

From there it is only a small step to measuring “programmer productivity” in terms of “number of lines of code produced per month”. This is a very costly measuring unit because it encourages the writing of insipid code, but today I am less interested in how foolish a unit it is from even a pure business point of view. My point today is that, if we wish to count lines of code, we should not regard them as “lines produced” but as “lines spent”: the current conventional wisdom is so foolish as to book that count on the wrong side of the ledger. — E.W. Dijkstra [EWD1036]

[EWD.2]: simplicity is difficult

Firstly, simplicity and elegance are unpopular because they require hard work and discipline to achieve and education to be appreciated. — E.W. Dijkstra [EWD1243a]

Acknowledgement

Thanks to Julien Kirch for his feedback

A word on my github repositories

March 16, 2014 by nicolas, filed under projects

Historically, my programming output has been largely hidden: the code I develop professionally is rarely open source, and neither were my personal projects (Demos, Visualism, Net) or even my personal tools.

As you probably have seen, I have started publishing more code on github. In particular my goal is to publish libraries or code that I use in personal or business projects, when they are worth sharing with the larger community.

I would also love to see outside contributions of any form: code contributions, comments, criticisms, etc.

To help you make sense of the degree of maturity of my public repositories, I’ve been marking my work-in-progress repositories with a pre prefix, and explorations/experiments with an exp prefix.

All the other repositories, with either no prefix or another prefix such as uu, are those I consider ready for external consumption.

Summary

uu. : mature repositories
pre. : work in progress
exp. : explorations / hacks

motivation-hacking (activity tracking)

February 14, 2010 by nicolas, tagged motivation and activity, filed under projects

An experiment in motivation-hacking

In the spirit of:

http://lifehacker.com/281626/jerry-sein … ity-secret

One night I was in the club where Seinfeld was working, and before he went on stage, I saw my chance. I had to ask Seinfeld if he had any tips for a young comic. (…)
He said the way to be a better comic was to create better jokes and the way to create better jokes was to write every day. (…)

He told me to get a big wall calendar that has a whole year on one page and hang it on a prominent wall. The next step was to get a big red magic marker.

He said for each day that I do my task of writing, I get to put a big red X over that day. "After a few days you’ll have a chain. Just keep at it and the chain will grow longer every day. You’ll like seeing that chain, especially when you get a few weeks under your belt. Your only job next is to not break the chain."

"Don’t break the chain," he said again for emphasis.

And so this daily tracking page was born. It shows the overall activity of our code trees. Each day on which a commit has been performed is darkened. As the number of updated files grows, so does the darkness of each square.
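
The code behind the tracking page is not shown here. The following is only a sketch of how the darkness of each square could be derived from a commit log. The commit format, the linear scale and the saturation point (`maxFiles`) are all assumptions, and `dailyDarkness` is a hypothetical name.

```javascript
// Hypothetical commit format: { day: "YYYY-MM-DD", filesChanged: number }.
// Returns a Map from day to a darkness value in [0, 1].
// Days without any commit have no entry and stay blank.
function dailyDarkness(commits, maxFiles = 20) {
    const filesPerDay = new Map();
    for (const { day, filesChanged } of commits) {
        filesPerDay.set(day, (filesPerDay.get(day) ?? 0) + filesChanged);
    }
    const darkness = new Map();
    for (const [day, files] of filesPerDay) {
        // Assumed linear scale, saturating once maxFiles files were updated.
        darkness.set(day, Math.min(files, maxFiles) / maxFiles);
    }
    return darkness;
}

const squares = dailyDarkness([
    { day: "2010-02-13", filesChanged: 5 },
    { day: "2010-02-13", filesChanged: 5 },
    { day: "2010-02-14", filesChanged: 40 },
]);
console.log(squares);
```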

Do something every day and you will progress. Do it for a long time and you will become an expert.