VMware vSphere offers 'big data' fault tolerance

Hadoop's success in the cloud may rest on virtualization. As IT pushes for more "big data" support, VMware claims it has the answer.

Although many IT shops want to host applications that process ginormous amounts of data in the cloud, the most popular “big data” platform requires dedicated hardware, which leads to reliability issues, among other concerns.

That could change with VMware’s Apache Software Foundation (ASF) open source project dubbed Serengeti. It will allow enterprises to deploy and manage Apache Hadoop on vSphere 5.0 in both cloud and virtual environments.

Hadoop on a virtual infrastructure could remove reliability concerns; with vSphere, a Hadoop application will be able to automatically restart if a node fails, according to company statements.

Additionally, the virtualization giant is working with members of the Hadoop community, including Cloudera Inc., Greenplum, Hortonworks, IBM and MapR, to contribute extensions to the ASF to make significant parts of Hadoop "virtualization-aware."

VMware’s Hadoop strategy: smart or misguided?
Some say VMware is wise to make adaptations for Hadoop on vSphere and become a player in the big data space.

"With big data getting bigger every day, it is clear that there is a significant virtualization opportunity for big data-crunching workloads," said Al Hilwa, program director for application development software at Framingham, Mass.-based IDC.

Big data platforms such as Hadoop and other distributed databases were the missing piece of the modern application stack in VMware’s vFabric application software, said Jeffrey Reed, director of application development for Logicalis Group, an enterprise cloud provider based in the U.K.

"If [VMware isn’t] going to provide [its] own Hadoop or Hadoop-like solution, it is critical that [it has] a strategy around Hadoop and its ecosystem of distribution vendors," Reed said.

Not everyone agrees with that analysis, however.

"VMware's approach to highly available Hadoop is misguided," said Shlomo Swidler, CEO of Orchestratus Inc., a cloud computing consultancy in West Hempstead, N.Y. "It offers high availability via infrastructure-level support, whereas software-level HA is the norm for modern applications," Swidler added.

Still, the moves constitute two halves of a strategy to cement VMware's position vis-à-vis Hadoop and HA, said one analyst.

"Most important is making Hadoop a first-class corporate citizen," said Tony Baer, principal analyst at research firm Ovum based in London. "Hadoop is not very fault tolerant and virtualization is one of the technologies [that will help accomplish that]," Baer added.

Serengeti, which is available as a free download under the Apache 2.0 license, allows admins to deploy a Hadoop cluster with a single click in a matter of minutes.

Further, VMware is working with its Hadoop partners to contribute changes developed for the Hadoop Distributed File System and Hadoop MapReduce to the Hadoop community. VMware also announced an initiative last month to support its Cloud Foundry on OpenStack cloud environments.

Stuart J. Johnston is Senior News Writer for SearchCloudComputing.com. Contact him at sjohnston@techtarget.com.

