Test Automation for a Legacy IBM Emulator
June 2019 – July 2020
TL;DR
- One of eight developers on the emulator team, in a company of about twenty
- Replaced hours of manual validation with a fully automated, unattended test suite
- Only Linux user on the team: uncovered OS-dependent bugs baked in for years
- Introduced git and Maven to modernise practices nobody had questioned before
The context
BASE100 develops Caravel, an emulator and converter for IBM iSeries (AS/400) legacy software. The tool converts and runs decades-old business code on modern infrastructure, the kind of work that demands very high correctness guarantees.
The team building the emulator was around eight people, at a company of about twenty.
The test process when I arrived was entirely manual: load data, run conversions, inspect outputs, compare against expected results. It was slow, inconsistent, and only as accurate as the person running it on any given day.
What I built
I automated the entire workflow. A single script could now fetch remote code and data, populate the test databases, execute the conversion and emulation runs, validate outputs against expected results, and report the outcome. What had been hours of manual work became a repeatable, unattended process.
The scripts were written in Python and shell, driven by plain-text test definitions, which also meant anyone could add a new test case without touching the tooling.
I also contributed to the Caravel core itself: new emulator features and refactoring of decade-old Java code to reduce technical debt. And I introduced git and Maven to a team still working with CVS and Ant, not because I was told to, but because the alternative was painful.
The Linux angle
I was the only person on the team using Linux. This turned out to be a useful accident.
The codebase had hardcoded path strings scattered throughout: the kind of thing that works fine on Windows and loudly breaks everywhere else. I replaced them with proper Java constructs (File.separator, Paths.get()) and documented the rest. It was a small change technically, but it revealed an assumption baked into the codebase: that it would always run on Windows, forever.
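To illustrate the kind of change involved, here is a minimal sketch. It is not the actual Caravel code: the class name, method names, and file names are hypothetical, and only File.separator and Paths.get() come from the real fix.

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, for illustration only.
final class PortablePaths {
    private PortablePaths() {}

    // Fragile: the backslash is a path separator on Windows only.
    // On Linux it is an ordinary character inside a single file name.
    static String hardcoded(String baseDir, String fileName) {
        return baseDir + "\\" + fileName;
    }

    // Portable: Paths.get joins the parts with the platform's separator.
    static Path portable(String baseDir, String... more) {
        return Paths.get(baseDir, more);
    }

    // Also portable, for code that still has to work with plain strings.
    static String joinWithSeparator(String baseDir, String fileName) {
        return baseDir + File.separator + fileName;
    }

    public static void main(String[] args) {
        System.out.println(hardcoded("data", "orders.dat"));
        System.out.println(portable("data", "input", "orders.dat"));
        System.out.println(joinWithSeparator("data", "orders.dat"));
    }
}
```

The portable variants print a path that the host operating system can resolve. The hardcoded one only does so on Windows.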
I also found a Boolean (capital B, the boxed type) being used as a three-state variable: true, false, or null, each meaning something different. I proposed replacing it with a proper enum. The tech lead disagreed. I noted it, moved on, and filed it under “lessons in picking battles in legacy codebases.”
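For readers who have not met this pattern, a minimal sketch of what such a replacement could look like. The enum name, its constants, and the meaning given to each state are assumptions made up for illustration; the real variable and its semantics are not reproduced here.

```java
// Hypothetical three-state enum standing in for a nullable Boolean.
enum ConversionFlag {
    ENABLED, DISABLED, UNSPECIFIED;

    // Bridge for legacy call sites that still pass a boxed Boolean.
    static ConversionFlag fromBoxed(Boolean legacy) {
        if (legacy == null) {
            return UNSPECIFIED; // null used to be the implicit third state
        }
        return legacy ? ENABLED : DISABLED;
    }
}

final class TriStateDemo {
    private TriStateDemo() {}

    // Every state is named, so no branch depends on a null check.
    static String describe(ConversionFlag flag) {
        switch (flag) {
            case ENABLED:
                return "enabled";
            case DISABLED:
                return "disabled";
            case UNSPECIFIED:
                return "not specified";
            default:
                throw new AssertionError("unknown state: " + flag);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(ConversionFlag.fromBoxed(Boolean.TRUE)));
        System.out.println(describe(ConversionFlag.fromBoxed(Boolean.FALSE)));
        System.out.println(describe(ConversionFlag.fromBoxed(null)));
    }
}
```

The appeal of the enum is that the third state gets a name and shows up in the type, instead of hiding in a null that unboxing can turn into a NullPointerException.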