Private clouds offer agility, flexibility and an operational expenditure pricing model -- but performance, especially where storage is concerned, can be a challenge. When managing a private cloud, admins need to look under the hood to avoid storage gridlock.
All users in an organization share private cloud storage. This is possible because it is networked storage and, depending on the data center configuration, may be several network steps away from the server instance. Network and protocol latencies reduce the effective bandwidth, but the real hit to private cloud storage performance comes from the fact that the storage farm is shared. Hard disk drives -- still the dominant form of storage in cloud environments -- can each deliver around 150 I/O operations per second (IOPS), irrespective of capacity. With installed drives now at around 4 terabytes (TB), each TB gets just a quarter of the IOPS it did three years ago. In other words, relative to capacity, storage is slowing down.
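The IOPS-per-TB arithmetic can be sketched in a few lines of Python. The 150 IOPS and 4 TB figures come from the text; the 1 TB capacity for the older drive is an assumption, inferred from the "a quarter" claim, and `iops_per_tb` is a name made up for this illustration.

```python
def iops_per_tb(drive_iops: float, capacity_tb: float) -> float:
    """Return the IOPS available per terabyte of capacity on one drive."""
    return drive_iops / capacity_tb


HDD_IOPS = 150  # approximate per-drive figure quoted in the article

# Assumption: drives of three years ago held about 1 TB.
older_drive = iops_per_tb(HDD_IOPS, capacity_tb=1.0)
current_drive = iops_per_tb(HDD_IOPS, capacity_tb=4.0)

print(f"older drive:   {older_drive:.1f} IOPS per TB")
print(f"current drive: {current_drive:.1f} IOPS per TB")
print(f"ratio:         {current_drive / older_drive:.2f}")
```

Because per-drive IOPS stays roughly flat while capacity grows, the ratio falls in direct proportion to the capacity increase.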
Performance variability problems also come from the shared nature of private cloud storage. I/O tends to be bursty rather than steady. If traffic is light, a burst of I/O will get rapid service -- well in excess of average rates. Sometimes, though, a noisy neighbor hogs the I/O system for an extended period, causing lags for all other tenants on a server and longer runtimes for jobs.
SSDs boost cloud storage performance
One way to tune your private cloud storage farm is to emulate the large cloud service providers (CSPs). CSPs present a variety of instance choices for both compute and storage. Persistent solid-state drive (SSD) storage and all-flash caches are options you can deploy to increase IOPS and, with SSD prices becoming more comparable to hard disk drive (HDD) prices, we can expect SSD persistent storage to become the option of choice within the next three years.
SSDs boost IOPS by huge factors, ranging from 100x to 1,000x over HDDs. This performance jump removes the drives themselves as the source of slowness in shared network storage, but the network and the storage appliance controllers then become the bottleneck, and they remain subject to congestion and noisy neighbors. Networks faster than today's typical 10 Gigabit Ethernet (GbE) are just reaching the market, such as 25 GbE and backbone versions with four lanes per link.
Faster networking can remove bottlenecks and boost cloud storage performance, though deployment will take some time and the noisy neighbor issue remains. One solution, which is an alternative to using high-priced SSD persistent storage, is to have local instance SSD stores in the servers. Instance storage is nonpersistent and requires network replication for writes, but serving reads locally substantially reduces the networked I/O load in use cases with the typical one-write-to-eight-read ratio. While it is a good practice to mirror all writes to networked persistent storage, most reads can be local.
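As a rough illustration of that reduction, the sketch below estimates what fraction of total I/O still has to cross the network when writes are mirrored and reads are served locally. The function name and the `local_read_hit_rate` parameter are assumptions introduced here, not figures from any vendor.

```python
def networked_io_fraction(writes: int, reads: int,
                          local_read_hit_rate: float = 1.0) -> float:
    """Fraction of all I/O operations that must go to networked storage.

    Every write is mirrored over the network; only reads that miss the
    local instance store go remote.
    """
    total = writes + reads
    remote_reads = reads * (1.0 - local_read_hit_rate)
    return (writes + remote_reads) / total


# One write for every eight reads, with all reads served locally.
best_case = networked_io_fraction(writes=1, reads=8)
# Same ratio with no local store: every operation crosses the network.
no_local_store = networked_io_fraction(writes=1, reads=8,
                                       local_read_hit_rate=0.0)

print(f"with local reads:    {best_case:.1%} of I/O is networked")
print(f"without local store: {no_local_store:.1%} of I/O is networked")
```

In the best case, only about one operation in nine reaches the shared storage farm, which is why the approach relieves both the network and the storage controllers.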
With instance storage, there are enough I/Os from the SSD to mask noisy neighbors and provide a massive performance boost to apps. You can avoid rewriting I/O routines if the operating system (OS) supports an asymmetric mirror I/O mode.
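Where the OS does not offer such a mode, the behavior can be approximated in application code. The following is a minimal sketch of the idea only: `AsymmetricMirror` is a hypothetical class, and plain dictionaries stand in for the local SSD store and the networked persistent store.

```python
class AsymmetricMirror:
    """Write to both stores; read from the local store when possible."""

    def __init__(self, local: dict, remote: dict):
        self.local = local      # stand-in for nonpersistent instance SSD
        self.remote = remote    # stand-in for networked persistent storage
        self.remote_reads = 0   # counts reads that had to cross the network

    def write(self, key: str, data: bytes) -> None:
        # Persist remotely first so a local failure never loses data.
        self.remote[key] = data
        self.local[key] = data

    def read(self, key: str) -> bytes:
        if key in self.local:
            return self.local[key]
        # Local miss (for example, after the instance store was wiped):
        # fall back to the persistent copy and repopulate the local store.
        self.remote_reads += 1
        data = self.remote[key]
        self.local[key] = data
        return data


mirror = AsymmetricMirror(local={}, remote={})
mirror.write("block-1", b"payload")
first = mirror.read("block-1")    # served locally
mirror.local.clear()              # simulate losing the instance store
second = mirror.read("block-1")   # served remotely, then cached again
third = mirror.read("block-1")    # local again
print(mirror.remote_reads)
```

Writing to the persistent store before the local one is a design choice in this sketch: it keeps the networked copy authoritative, so the local SSD can be treated purely as a disposable read accelerator.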
Storage lessons learned from Google Cloud Platform
Google Cloud Platform is notable for having 2% variability in I/O rates, according to a report from cloud analyst firm Cloud Spectator. This suggests the cloud provider has internal mechanisms to deal with noisy neighbor problems, as well as efficient I/O delivery mechanisms. While Google treats tuning as a trade secret, we do know that the company localizes instances to its stored data, which reduces latencies.
More than likely, Google's orchestration software monitors I/O rates by instance and can adjust quality of service to compensate for heavy I/O users, or even move instances away from heavily loaded servers. In a private cloud, this may require special, high-I/O instances with constraints on server placement.
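A private cloud could start with something far simpler than whatever Google runs internally. The sketch below is not Google's mechanism; it is a hypothetical check that flags any instance consuming more than a chosen share of a server's measured IOPS. The function name, the threshold and the sample figures are all assumptions.

```python
def find_noisy_neighbors(iops_by_instance: dict[str, float],
                         share_threshold: float = 0.5) -> list[str]:
    """Return instances whose share of total IOPS exceeds the threshold."""
    total = sum(iops_by_instance.values())
    if total == 0:
        return []
    return sorted(
        name
        for name, iops in iops_by_instance.items()
        if iops / total > share_threshold
    )


# Hypothetical one-minute IOPS averages for the instances on one server.
samples = {"web-1": 50.0, "batch-7": 900.0, "db-2": 50.0}
noisy = find_noisy_neighbors(samples)
print(noisy)
```

Instances flagged this way would be candidates for a quality-of-service cap or for migration to a less loaded server.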
The role of in-memory, container technology
In-memory alternatives will likely drive the future of I/O in the cloud. Google, for example, already has RAM disk storage options, and nonvolatile dual in-line memory modules (NVDIMMs) will make that approach much more affordable, while extending memory to many TB. NVDIMM memory will replace local instance storage as the preferred way to speed up I/O. The NVDIMM approach fits well with a hyper-converged topology, too, bringing all storage and compute into close proximity.
Containers add a little zest to the cloud storage performance issue. The containerized approach to virtualization increases the instance count on any given server by three to five times. While only one OS image is needed, ongoing I/O per server will increase considerably. However, with the SSD and NVDIMM changes described above, there should be plenty of I/O to go around.
Pay attention to storage software choices when building up private cloud storage. Red Hat and SanDisk have partnered to tune Ceph when using SSDs, which will play well in the OpenStack community. Scality and Caringo are also active in this area, while on the integrated appliance side, DDN's WOS system is reporting very high performance.