Is private, public or hybrid cloud better for a big data project?
Organizations can deploy a big data project on private, public or hybrid clouds. However, the choice of cloud environment can significantly affect cost, technical demands and other factors.
Private clouds use virtualized on-premises storage and computing resources to provide a dedicated cloud that the business owns and operates. Organizations requiring direct cloud environment control -- usually due to security or regulatory constraints -- typically choose private cloud. Since private clouds exist within on-premises data centers, these environments require additional storage and computing resources, as well as software like Hadoop, to support big data. The business must absorb those infrastructure costs and handle any technical or architectural issues that arise. As a result, businesses typically don't deploy big data on private clouds.
Third-party providers create and operate public clouds, which share physical resources for networking, storage and computing among many customers. Users upload and operate workloads in the provider's cloud. Since public cloud providers support many users, the computing infrastructure is far larger and more scalable than that of a private cloud. Users can scale up to harness massive amounts of computing power with distributed computing software, but pay only for the resources they use. To lower operating costs, users release unused resources once the computing job is finished. Public cloud represents a "utility" computing model and is ideal for on-demand big data tasks.
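The pay-per-use arithmetic can be made concrete with a rough sketch. Every figure and function name below is a hypothetical placeholder chosen for illustration, not a real provider's price or API; the point is only that a short, occasional job is billed for the node-hours it consumes, while an owned cluster costs the same whether it is busy or idle.

```python
def pay_per_use_cost(nodes: int, hours: float, rate_per_node_hour: float) -> float:
    """Cost of renting `nodes` machines for `hours`, billed per node-hour."""
    return nodes * hours * rate_per_node_hour


def owned_cluster_cost(nodes: int, monthly_cost_per_node: float, months: int = 1) -> float:
    """Cost of owning `nodes` machines, paid whether or not they are in use."""
    return nodes * monthly_cost_per_node * months


# Hypothetical scenario: one 6-hour analytics job per month on 100 nodes.
# The rates are assumptions, not quotes from any provider.
public_cost = pay_per_use_cost(nodes=100, hours=6, rate_per_node_hour=0.50)
private_cost = owned_cluster_cost(nodes=100, monthly_cost_per_node=200.0)

print(f"Public cloud, one job:   ${public_cost:,.2f}")
print(f"Owned cluster, one month: ${private_cost:,.2f}")
```

With these assumed numbers, the rented job costs far less than a month of owned capacity. The comparison shifts as utilization rises, so a cluster that runs near full load around the clock would need its own calculation.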
Hybrid clouds merge private and public cloud, allowing workloads to migrate between the two through orchestration. When additional compute is necessary, hybrid clouds use public cloud resources -- a feature known as cloud bursting. Private cloud supports baseline workloads, while public cloud resources temporarily accommodate spikes in demand. This feature can also support big data analytics. However, organizations rarely use hybrid clouds for big data projects because running entirely in public cloud is simpler and lets the organization take full advantage of any long-term price concessions from the public cloud provider.
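The bursting decision described above comes down to a simple placement rule: fill private capacity first, then send any overflow to public cloud. The sketch below illustrates that rule only. The `Placement` type, the `place_workload` function and the abstract "units" of compute are all hypothetical names invented for this example; a real orchestrator would also weigh data locality, transfer costs and security policy.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    private_units: int  # compute units served by the private cloud
    public_units: int   # overflow units "burst" to the public cloud


def place_workload(demand_units: int, private_capacity: int) -> Placement:
    """Fill private capacity first and burst the remainder to public cloud."""
    if demand_units < 0 or private_capacity < 0:
        raise ValueError("demand and capacity must be non-negative")
    private = min(demand_units, private_capacity)
    return Placement(private_units=private, public_units=demand_units - private)


# Hypothetical demand levels against a private cloud with 100 units of capacity.
for demand in (80, 100, 250):
    print(demand, place_workload(demand, private_capacity=100))
```

Under this rule, demand at or below private capacity never touches public cloud, and a spike to 250 units would send 150 units to the public provider until demand falls again.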
As organizations rely more on diverse data sets to make decisions, big data is growing in importance. But this is just the beginning. Technologies like the Internet of Things promise a tsunami of new data for businesses, scientists and governments to analyze. While big data is not contingent on the cloud, cloud facilitates big data storage and analytics, providing scalable, on-demand computing at a reasonable cost.
About the author:
Stephen J. Bigelow is the senior technology editor of the Data Center and Virtualization Media Group. He can be reached at firstname.lastname@example.org.