IT industry experts said the launch of Healthcare.gov this month was hobbled by a byzantine, "old-school" infrastructure. Could cloud computing have solved the site's performance problems?
In a world where hundreds of millions of users log on to Facebook every day without a second thought, the idea of a website collapsing under a load of visitors seems foreign.
But Healthcare.gov, the portal through which Americans shop for health insurance plans under the Affordable Care Act, experienced freezes, crashes and other glitches when it officially opened Oct. 1. Problems persisted throughout the ensuing week as government IT staff scrambled to stabilize the site amid the ongoing government shutdown.
Healthcare.gov: What went wrong?
The user account creation portion of the website was to blame for the problems; it collapsed under heavy traffic from millions of people trying to sign up for health insurance plans, Todd Park, United States chief technology officer (CTO), told the New York Times the week after the failure.
Servers associated with the faulty application were to be moved from virtual machines (VMs) to dedicated hardware to fix the performance issues, according to public statements from spokespeople for the Department of Health and Human Services, which oversaw the website rollout.
But it didn't have to be this way, technical experts said.
"These are very manageable, typical ecommerce scaling problems," said John Engates, Rackspace Hosting CTO. Rackspace had nothing to do with the Healthcare.gov mess, but it has supported other, larger Web properties in the past, Engates said.
An application designed to scale out in response to high demand is hardly a rare species as cloud computing has become mainstream, Engates said.
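To make the scale-out idea concrete, here is a minimal sketch of the kind of sizing rule an autoscaler might apply: add identical servers as traffic rises, within a floor and a ceiling. The function name, parameters and the 500-requests-per-second capacity figure are hypothetical, chosen only for illustration; nothing here describes how Healthcare.gov or any particular cloud provider actually works.

```python
import math


def desired_instances(requests_per_second, capacity_per_instance,
                      min_instances=2, max_instances=100):
    """Return how many identical app servers to run for the current load.

    All names and default values are illustrative assumptions.
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    # Round up so the fleet always has at least enough capacity.
    needed = math.ceil(requests_per_second / capacity_per_instance)
    # Keep a small floor for redundancy and a ceiling to cap spending.
    return max(min_instances, min(needed, max_instances))


if __name__ == "__main__":
    # Hypothetical traffic levels, each server assumed to handle 500 req/s.
    for load in (0, 5000, 5001, 1_000_000):
        print(load, desired_instances(load, 500))
```

A static deployment, by contrast, fixes the server count in advance, so a launch-day spike beyond that capacity has nowhere to go.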
Healthcare.gov's outdated infrastructure to blame
Cloud computing services support some areas of the highly complex government healthcare site, which connects to numerous federal agencies, such as the Internal Revenue Service, to determine an applicant's eligibility for federal healthcare subsidies.
But tracing IP addresses back to their owners with ping, tracert, WHOIS, "view source" on the Healthcare.gov website and other forensic analysis tools suggested that cloud computing and scale-out application code were probably not used in most of the site's infrastructure.
For example, one portion of the site, data.healthcare.gov, is running in a data center belonging to CenturyLink's Savvis. The IP address is associated with a company that uses Savvis for colocation services, a Savvis spokesperson said.
Other reports say one federal agency that connects with the website, the Centers for Medicare and Medicaid Services, is hosted on Verizon Terremark's cloud, but company spokespeople did not respond to attempts to clarify whether that service is cloud or managed hosting.
Industry experts said these portions of the site are more likely to be based on managed hosting than on cloud computing services.
"In general, I haven't seen any of these exchanges use cloud services," said Shlomo Swidler, CEO of consulting firm Orchestratus Inc. "Several states, including California, host their exchanges in government data centers. Others use Akamai or other content delivery networks, so it's not clear where the actual hosting lives."
The way the website is built, using managed hosting and a static application infrastructure, "is the anti-cloud solution," said Carl Brooks, an analyst at New York-based 451 Research. "Totally old-school for a big Web property these days."
Brooks sees the site as an amalgamation of the Akamai content delivery network, colocated server hardware, private Ethernet and "some other hops over backwater networks."
"Akamai is actually the strongest link in the chain, but the back end is just not there," Brooks said. "It's not the [service] providers' fault; that [architecture is] for online applications built in 2003, not 2013."
People who build sites on Amazon Web Services (AWS) or similar environments don't have these problems, Brooks said.
"Compared with the much-vaunted Obama election night systems that were built on [AWS] and worked flawlessly," Brooks asked, "why weren't those folks in charge of a big site launch like this?"
Politics trump technical best practices
Indeed, the technology exists to create scale-out, high-performance Web properties using cloud computing, so why didn't the government take a page out of the Obama campaign's playbook when it came to building Healthcare.gov?
The fact that disparate parts of the site are hosted in different data centers is a clue as to what really went wrong here: There were too many cooks in the kitchen due to the balkanized way federal IT procurement contracts work.
Unlike the Obama campaign, which was run by a single unified organization, a total of 47 different contractors worked on building Healthcare.gov, according to the Sunlight Foundation, a nonprofit, Washington, D.C.-based organization that reports on government.
The result was predictably fragmented dysfunction. For example, using the community edition of a tool called Maltego, David Campbell, CEO of cloud security startup JumpCloud, determined that local.healthcare.gov was built using Apache, jQuery, optimize.ly and Pingdom, while data.healthcare.gov is based on NGINX, Apache and Ruby on Rails.
"It's obvious that these two pieces probably came from completely different development teams working for completely different organizations," Campbell said.
The two main contractors working on the site were CGI Group, a Canadian consulting company, and Quality Software Services, a Maryland-based health care IT company, according to the Washington Post. Neither company responded to requests for comment on Healthcare.gov this week.