When it comes to managing on-premises systems, virtualization and cloud computing shops want quick answers to complex queries -- without dealing with the underlying infrastructure.
Some of those shops have taken a leap of faith and adopted emerging products that offer such "big data" analytics.
"They can simplify views of very complex information into a webpage that anyone can look at," said Nathan Smith, senior engineer at Centered Networks, a San Francisco-based Desktop as a Service provider.
Smith has experimented with CloudPhysics, a cloud-based beta offering that aggregates data using big data techniques. It aims to compare data from different customers to provide deeper insights into virtual and cloud infrastructures, and it draws from VMware Inc.'s vCenter statistics rather than machine logs.
"You don't need to be an expert … because it's broken down in [a] very simple, tabulated format," Smith said.
Centered Networks hopes CloudPhysics' big data analytics Software as a Service, once it reaches general availability, will make it easier to troubleshoot storage performance issues in its VMware infrastructure and speed up its customer service responses, Smith said.
"We can see from a storage perspective how the storage thinks it's doing as far as performance, but that doesn't necessarily tie up with ESX thinking that the storage is performing the way it needs to," he added.
How big data analytics Software as a Service works
Other companies offering big data analytics Software as a Service include Sumo Logic, Splunk, AppFirst and ScaleXtreme. The information their services gather is anything but simple.
LimeLight Networks, a Web presence management company, supports customers through its own SaaS offering, behind which is a pool of about 600 physical servers. Within that infrastructure is a Web content management business, which accounts for about half of those servers. The company became an early adopter of Sumo Logic's service after failed attempts to aggregate its machine logs on-premises.
"I can start typing in a query that will give me data across two different data centers, multiple different apps, and it's all essentially happening in real time," said Tom Cignarella, vice president of SaaS technical operations for Tempe, Ariz.-based LimeLight Networks.
LimeLight has so far deployed Sumo Logic on the servers in its Web content management business.
"We'd always had issues with disk space and how to ship things around and all the headaches that come along with that," Cignarella said.
Sumo Logic, officially launched in January, aggregates machine log data in the cloud and makes it available for search by customers. In the background, it uses some components of the Hadoop Distributed File System and Cassandra, along with its own distributed database. Also, Sumo Logic uses a patent-pending algorithm it claims can shrink a million log messages down to 10 to 15 categories, thereby getting to the root cause of issues in the customer's cloud infrastructure faster.
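The idea of collapsing many log messages into a handful of categories can be illustrated with a minimal Python sketch. This is not Sumo Logic's patent-pending algorithm, whose details are not public in this article; it is a simple stand-in technique that masks the variable parts of each message (IP addresses, hex values, numbers) and counts the resulting templates. The masking rules, function names and sample log lines are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical masking rules (an assumption for illustration only):
# each rule replaces a variable part of a message with a placeholder.
# Order matters: IP addresses are masked before bare numbers.
_MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),
    (re.compile(r"\d+"), "<NUM>"),
]


def signature(message: str) -> str:
    """Return the message with its variable parts masked out."""
    for pattern, placeholder in _MASKS:
        message = pattern.sub(placeholder, message)
    return message


def categorize(messages):
    """Count how many messages fall under each masked template."""
    return Counter(signature(m) for m in messages)


# Made-up sample log lines, not taken from any real system.
logs = [
    "Connection from 10.0.0.5 timed out after 30 seconds",
    "Connection from 10.0.0.9 timed out after 45 seconds",
    "Disk /dev/sda1 is 91 percent full",
    "Disk /dev/sda2 is 97 percent full",
    "Connection from 192.168.1.20 timed out after 30 seconds",
]

categories = categorize(logs)
for template, count in categories.most_common():
    print(count, template)
```

Here five messages reduce to two templates. A production service would presumably need far more robust grouping than fixed regular expressions, but the sketch shows why a short list of categories is easier to scan than raw log lines.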
"It's made us better at doing our job," Cignarella said. "We could always dig through individual log files, but this has made it easier to narrow things down more quickly."
Big data analytics Software as a Service still evolving
Some of Sumo Logic's capabilities remain a promise for now, as it's still a very young company. For example, Cignarella and his staff can run search queries on aggregated log data, but dashboards that would track the most common queries are still on Sumo's roadmap. Comparative analysis of LimeLight's log data with other Sumo customers, another facet of Sumo's big data analysis, is also something Cignarella hasn't looked into. Another common aspect of big data analytics, anomaly detection, has yet to be incorporated into the product as well.
Sumo Logic officials declined to comment on product features that haven't yet been announced. CloudPhysics also has a way to go before it's a finished product, Smith said.
"It's kind of an evolving product, and it's a process for us, where we're trying to figure out what else we can get from it," he said.
CloudPhysics' software boils down its assessments of individual customers' environments and presents them in a tiled interface, with each tile (called a "card") showing information about a different aspect of the infrastructure. Centered Networks has been helping CloudPhysics develop its storage performance card, for example, and during that process, ideas for new cards have sprung up, Smith said.
"Right now the card we're working on is changing the HA configuration, so [like a] simulator … it shows you what the actual impact would be to your hosts," he said.