Bad practices and lucky streaks

Imagine you’re at a poker table. You get dealt a 2 and a 7, objectively the worst hand in the game. And just like that, you decide to go all in. The other players call your bluff and before you know it, there’s a boatload of money on the table. When all the cards are dealt, you end up with a full house and take the entire pot. Ka-ching!

The outcome of this game is completely detached from the move you’ve made. Holding all the chips does not change the fact that this was a dumb move. Not ballsy. Not crazy. Dumb.

You got very lucky and won’t be able to repeat this. But your brain has now been tricked into believing that you can get away with it. That you are on a lucky streak. This kind of win will push you to make even riskier moves in the future. In a game of chance, it sets you up for failure.

We see this kind of reinforcement all the time in software development.

A team does something that they know they shouldn’t be doing and gets away with it. They start believing it’s not really an issue.

An example

After months of delays and bug fixing in a staging environment, our project finally went live.
The lesson we should learn is that we don’t want that struggle: smaller releases are better.
The lesson we learn is that we can get away with Big Bang releases.

Another one

We don’t have unit tests, so refactoring is hard. Moving from Mongo to Postgres requires exhaustive manual testing and a few sleepless nights.
The lesson we should learn is that we need automated tests to act as guard rails.
The lesson we learn is that it was a one-off and hey, we managed, right?

A project I once worked on required a pretty risky database migration. While the script had been tested in a staging environment, the data there was nowhere near representative of the real thing. Starting the migration script felt dodgy, so just to be sure, I stored a dump of the production database on my laptop. When it turned out the script had issues and corrupted the database, we used this dump to restore it. Phew! No data was lost, crisis averted. The lessons we should have learned were crystal clear. Improve the data quality in staging. Verify that our backups work. Dry-run the migration script on a copy of production.

But we didn’t learn those lessons because we got away with it.

The belief in lucky streaks becomes ingrained in a team’s culture. “We know we should set up a deployment pipeline, but our manual deploys don’t have any issues, so it’s OK.”

In poker, betting it all on a bad hand will eventually make us go bust. We’ll cut our losses, have a drink, and go home. It’s just a game, after all.

Your product is not a game. Going bust means losing customers. Or jobs.

The question we should ask ourselves from time to time is: what are we getting away with here? What is our lucky streak?