Many of my philosophies in work life [1], my volunteer life [2], and my personal life stem from experiences. As a developer, many of those come from being burnt by rough edges or mistakes. Just as health & safety principles come from accidents, my development practices come from bugs, errors, and mistakes.
With that in mind, here’s a war story from my days running an R&D startup, when we lost all the data we thought we had gathered from a psychology study.
I founded an R&D startup that worked closely with psychology researchers. We would build tools for gathering data from psychology exercises, which the researchers would use in their studies or assessments.
One of these tests was the Stroop test - users would be shown four words in different colours, with one of those colours shown in the middle of the four words. The user would have to respond as quickly as possible to pick the right word, by pressing the up/down/left/right keys on their keyboard. The user would first do this with some test words (usually colours, office supplies, or words without significant meaning to the user). Once they had an average response time within an acceptable range (in milliseconds), we would introduce some words that we wanted to measure the user's reaction to. For example, if a user was particularly addicted to tea, then their reaction time would be different when "tea" was one of the words they could choose from, compared with a control wordset without any mention of tea. These tests require complete concentration for a significant period of time.

We built the Stroop test tooling to work with any group of users that the researchers wished to analyze. Sometimes it'd be those addicted to substances, sometimes it would be to measure societal factors (e.g. fear of terrorism). In the normal workflow, we'd get the users to add their study email address so that the researchers could follow up with their results. We'd also send the participants their own results via email, in case they were curious. The tech was boring: a PHP backend, MySQL database, and a minimal JavaScript frontend. The results would be sent after the user had completed the full test, then stored in the database.
One of the studies assessed a group of people who worked in a high-security workplace. They would not have internet access, and we would need to provide a locked-down device that would only be used for the study. No problem: we set up a laptop with a locked-down Linux install, and put the code on there to run locally, just as it had in development. We disabled WiFi, Bluetooth, and anything else we could. These instructions came on the same morning as the test, so as we drove to the location, we were fixing the code and testing.
We arrived and set up the test, then the study participants all came and took it. It all seemed good: each participant managed to complete the test successfully. We left feeling good.
As we arrived back at the office, we realized that the data was missing. All of the data. We had not saved any of the participant responses, and it wasn't clear why. The code had worked locally on the machine when we tested, but not in the secure location. We did, however, have logs. The logs seemed okay, nothing particularly special. Then we decided to test without internet access. The problem became clear.
function saveResults($userEmail, $results) {
    sendUserTheirResults($userEmail, $results);
    saveResultsToDatabase($userEmail, $results);
}

function saveResultsToDatabase($userEmail, $results) {
    ...
}

function sendUserTheirResults($userEmail, $results) {
    mail($userEmail, 'Results', $results) or die();
}
PHP had a popular pattern: do_something() or die(). If do_something failed, then the program would exit early. We used mail() or die(). When the email failed to send since there was no internet access, the script would exit before saving the results to the database. We used or die() incorrectly - we should've handled the error gracefully instead of exiting the entire script. We had not saved any of the data from any of the participants. And worse: to run the study again, we'd have to book more time in their calendar, with approval steps all over again.
Our first idea was to see if we had logged the user interactions somewhere. The frontend would log the results in the console, but since we had many participants, we would run each session in an incognito window, then close the window. We looked at the backend: we had logs, but the requests were sent as POST, with no logs of the body. The good feeling of "we pulled this off" turned into the bad feeling of "we've messed up".
There was no choice but to own the error, and run the test again. The second time, we made sure to test on-site before getting any participants in. The only change in the code was to replace or die() with some proper error handling.
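To make "proper error handling" concrete, here is a minimal sketch of the idea. It is written in JavaScript rather than the original PHP, and it is not the code we shipped: `saveToDatabase` and `sendEmail` are hypothetical stand-ins passed in by the caller, and the sample data is made up. It also saves before emailing, which is an extra precaution beyond the one-line change described above.

```javascript
// Hypothetical dependencies: `deps.saveToDatabase` and `deps.sendEmail` stand in
// for the real storage and mail code, which is not shown in this post.
function saveResults(userEmail, results, deps) {
  // Persist first: this is the step the study cannot afford to lose.
  deps.saveToDatabase(userEmail, results);

  // Emailing the participant is optional, so a failure is logged, not fatal.
  try {
    deps.sendEmail(userEmail, "Results", results);
    return { saved: true, emailed: true };
  } catch (err) {
    console.error(`Could not email results to ${userEmail}: ${err.message}`);
    return { saved: true, emailed: false };
  }
}

// Simulate the locked-down laptop: storage works, the mailer always fails.
const stored = [];
const offlineDeps = {
  saveToDatabase: (email, results) => {
    stored.push({ email, results });
  },
  sendEmail: () => {
    throw new Error("network unreachable");
  },
};

// Made-up sample data, for illustration only.
const outcome = saveResults("participant@example.com", { meanMs: 612 }, offlineDeps);
```

With the mailer failing, `outcome.saved` is still true and the results are in `stored`; only the email is skipped.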
While this bug was a costly mistake, we learned from it. Whenever we had to deploy code at the last minute, we'd try to test it more rigorously. If we were running a study without internet access, we'd make sure to test in the same environment. We hadn't accounted for the environment change, partially due to the short notice for the locked-down machine, but also just because we didn't test with the exact same restrictions.
But we also started moving all of our backend code away from PHP. It was too easy to make a simple but costly mistake. Our main languages became JavaScript, Go, and Python.
Consider the same pattern in Node.js:
function sendUserTheirResults(userEmail, results) {
  if (!mail(userEmail, "Results", results)) {
    // this would be a very weird thing to do
    process.exit();
  }
}
The design of languages, APIs, and frameworks can often lead developers to make mistakes. It applies outside of code, too. When you see a trashcan with a circle hole next to one with a rectangular hole, you know instinctively that the bottles go into the circle hole, while other trash goes into the rectangular one. In psychology and design, this is known as an object's affordance. To go against an object's affordance feels unnatural. It feels wrong to push on a door that has a handle for pulling.
The same effect applies to code: you could call process.exit() when an optional part of the program has an error, but why would you? That feels wrong. Whereas extending a line with or die seems like a natural choice in a language where that's already an established pattern.
A principle I follow out of this experience: If you don't make it easy to do the right thing and awkward to do the wrong thing, people with good intentions will do the wrong thing.
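One way to picture that principle in code is a small helper that forces the author to say, for every step, whether a failure should stop everything. This is only a sketch under my own assumptions: `runSteps` and the step objects are hypothetical names invented for this example, not part of any library or of the code from the study.

```javascript
// Hypothetical helper: runs steps in order. Every step must declare whether it
// is required; only a failing required step is allowed to stop the run.
function runSteps(steps) {
  // Refuse to run anything until each step has made the decision explicit.
  for (const step of steps) {
    if (typeof step.required !== "boolean") {
      throw new TypeError(`Step "${step.name}" must say whether it is required`);
    }
  }

  const optionalFailures = [];
  for (const step of steps) {
    try {
      step.run();
    } catch (err) {
      if (step.required) throw err; // a required step failing is a real error
      optionalFailures.push({ name: step.name, message: err.message });
    }
  }
  return optionalFailures;
}

const savedRows = [];
const skipped = runSteps([
  {
    name: "email participant",
    required: false,
    run: () => {
      throw new Error("network unreachable");
    },
  },
  {
    name: "save to database",
    required: true,
    run: () => {
      savedRows.push("results");
    },
  },
]);
```

Here the failed email ends up in `skipped` and the save still runs. Exiting early on an optional step would mean deliberately writing `required: true` next to it, which is the awkward-looking choice.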
If this chapter was fun to read, feel free to sign up for my book waiting list. This is one of the chapters that will be included in a section dedicated to moments that shape my philosophy. It will be released for free as HTML when I'm done - with ebook/physical editions available for a minimal price. The waiting list is just a handy way for me to let you know “the book is ready”, with a pre-release copy. Otherwise, have a great summer!
[1] Tech Enabler in the CTO Office for Schibsted, a news company
[2] I’m the board leader for Tekna’s Network for Developers. Tekna’s a STEM union in Norway. We host normal tech events, political events, and represent union members in union and political discussions.