Harvard University has a cloud -- a holistically-managed, hyper-dense IT infrastructure delivered over a fast fiber link for research scientists, to be exact. Several years of effort have collapsed its Faculty of Arts and Sciences IT into six data centers,
It's a harbinger of things to come for managing and provisioning IT infrastructure, with a twist: Harvard's next move is bailing out of the Boston data center altogether and moving into a "community cloud" with four other universities at the Massachusetts Green High Performance Computing Center Project, 90 miles away. When the project is complete in 2012, it might be the last data center Harvard ever builds, according to James Cuff, director of research computing and CTO at Harvard.
"I don't need to be in the hardware business at this point," he said in an interview at the Faculty of Arts and Sciences (FAS) computing offices, formerly the site of Harvard's storied cyclotron, decommissioned in 2002. The building is an appropriate place for Cuff's high-tech operations.
Sitting in his office with a conspicuously large monitor on the wall and a laptop to peck on, Cuff can list active computing projects and resources with a few commands. He can show everything from a few CPUs and some file stores to a project currently engaging hundreds of terabytes and virtual servers at once, all listed by the identity of the 'owner' who provisioned the resources.
Cuff is in charge, of course; this isn't Amazon Web Services. His resources are far more finite in practice, so he keeps control over his cloud. "We have about 4,000 subscribers with around 1,000 active at any one time," he said.
Cuff is out in front on how his team handles infrastructure, experimenting constantly with new models and techniques for different kinds of infrastructure. He's fond of saying that an undergrad with a "dodgy Perl script" could take down his entire operation, which is prime motivation for him to have excellent insight into, and control over, what his infrastructure is doing.
The results are pretty spectacular from an IT standpoint. Several years into Harvard FAS' consolidation efforts, Cuff and a cheerful pirate crew of operators drawn from all sorts of disciplines, not just computer science, manage hundreds of servers, petabytes of storage, and thousands of virtual machines with minimum effort and with a supply channel anyone can use. He said he could run his operations by ordering parts on discount retailer Newegg (Harvard actually has a couple of distributors that give them academic discounts), and he pitches his operation to researchers even as he is busy taking away their existing computer labs.
They can spend their grant money on their own equipment if they like, Cuff said, but at the scale at which he operates, he can give them in a few minutes more than they could ever get on their own. And that's an easy sell compared to weeks or months of cluster building by hand. Cuff said the heart of cloud-style operations is the network.
The biggest barrier to doing a cloud for researchers is bandwidth, according to Cuff. Grid computing and supercomputing have been around forever, but until recently, you had to be close by to take advantage of that, as in, you had to be in the room with the computers. The issue was finding a way around that fairly binary problem so Cuff could leverage both scale and a modern high-density data center.
The answer was a 40 Gbps fiber link to a facility in the heart of downtown Boston. The bottom floor is Macy's, but the rest of the building is far more interesting.
It's a data center space The Markley Group runs, and Cuff uses his corner of the 7th floor for high-density computing, running everything from blade servers on InfiniBand to a 576 TB array of SATA drives to commodity servers on 40 Gbps Ethernet, a "commodity cloud" as he puts it. Touring the data center, Cuff said it sometimes boggles his mind how much activity takes place in such a small space.
"I like to show off the star simulation, that's always my favorite," he said with a smile, waving his hands in a small circle at the middle of one rack. "That's it! That's the entire galaxy in there." He's referring to a recent Harvard project to simulate the spiral formation of the galaxy, an experiment that postulated, calculated and extrapolated the movement of billions of stars over time.
Check out part two of this piece for more information on the community cloud.
Carl Brooks was Senior Technology Writer for SearchCloudComputing.com.