You’re a developer and darned proud of the code you write. You follow the specs and build what the stakeholders and designers want. You’ve tested it, and all the test scenarios work as expected. You deploy and the app goes into the wild. But what happens when there’s a problem that no one anticipated? Not you, not the app owner, not QA, not ops, not anyone.
After having a problem with my mobile phone, I visited the local store of the phone manufacturer, a giant company named for a fruit. None of the store’s so-called geniuses could figure out why the docs and data on my phone kept ballooning to fill, and even attempt to exceed, the 128 GB device’s available storage. It turns out that this giant hardware company (or is it a software company?) had no diagnostic software that could peek into the device and see what was suddenly occupying so much storage, more than 23 GB. Manually adding up the data use reported by all the installed apps, plus recently deleted photos still on the device, came to less than 1 GB.
Multiple reboots did not help. The only alternative, they said, was a reset and restore from my last backup. After doing that, I attempted to re-add a credit card to the phone’s wallet. That resulted in a “card is already in wallet” error though it clearly was not. Multiple calls with said fruit company’s tech support experts (on my land line) could not solve the problem. Again, there was no way to peer into the phone to get a snapshot.
Four calls later, one young man in the fruit company’s Jacksonville, Fla., call center suggested the entirely undocumented move of logging out of the cloud account associated with my phone, then logging back in. Voilà! I could now add my credit card. Why did this work? He couldn’t say.
Why did the phone operating software and its wallet app behave in this manner? No one knows. Why did logging out and back in solve the problem? No one knows. Was that the cloud equivalent of a three-fingered Ctrl-Alt-Del forced-reboot salute? No one knows. With tens of millions of phones in use that support wallets and credit cards, why hadn’t this been seen before? Or if it had been, why wasn’t it documented? No one knows. Web searches did yield some highly convoluted suggestions, though.
The innocent bystanders in all this are the developers, who build software based on a set of specs. I’m not saying that the developers were surrounded by fools to the left and jokers to the right, but it’s clear that neither the specs nor the test scripts anticipated this situation. Perhaps it couldn’t have been imagined. It’s not like the famous spec failure that says how to process a payment if it’s received fewer than 29 or more than 29 days after the due date, but fails to specify what to do if the payment is received exactly 29 days after the due date. That’s just bad design.
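To make that 29-day gap concrete, here is a minimal Python sketch. The two outcomes (`GRACE` and `LATE_FEE`) and the function names are hypothetical stand-ins, since the spec in question is only described in outline; the point is simply that a literal reading of "fewer than 29" and "more than 29" leaves one input with no rule, and that a brute-force sweep over the input range can expose such a hole before a customer does.

```python
from enum import Enum
from typing import Callable, List, Optional


class Action(Enum):
    # Hypothetical outcomes; the original spec's actual actions are not known.
    GRACE = "grace"
    LATE_FEE = "late_fee"


def classify_as_written(days_after_due: int) -> Optional[Action]:
    """Literal reading of the flawed spec: two rules, neither covers day 29."""
    if days_after_due < 29:
        return Action.GRACE
    if days_after_due > 29:
        return Action.LATE_FEE
    return None  # exactly 29 days: the spec says nothing


def find_spec_gaps(
    classify: Callable[[int], Optional[Action]], lo: int = 0, hi: int = 60
) -> List[int]:
    """Sweep every day in [lo, hi] and report inputs that no rule handles."""
    return [day for day in range(lo, hi + 1) if classify(day) is None]


if __name__ == "__main__":
    print(find_spec_gaps(classify_as_written))  # [29]
```

A sweep like this only works when the input space is small and known in advance, which is exactly why it catches the 29-day hole but would never have caught the phone problem.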
Burke Holland, director of developer relations at Progress Software, has reminded me more than once that great software isn’t finished until it’s fully tested. But, just as you can never prove that an app is completely secure, only that it is not secure, you can’t test for scenarios beyond your wildest imagination.
What really would have helped is powerful diagnostic software to do an autopsy of the phone’s storage. It didn’t exist. And that makes me wonder what we should be doing in terms of creating diagnostics for the apps we build. Do you do that? Share your thoughts; we’d like to hear from you.