Two of the most common OpenStack storage options are Swift, which is developed as part of the OpenStack project, and Ceph, an independent open source system. Both offer object storage and both can be downloaded for free, so it can be difficult to choose between the two. Here are some considerations for evaluating Swift vs. Ceph for OpenStack storage.
Support can be a challenge for both Swift and Ceph, and there are two ways to handle it. Organizations can add staff to handle both the underlying hardware and the open source software, or buy a supported code distribution, which comes with software support and configuration expertise.
Many vendors support Swift, each offering their own OpenStack distribution. Support can be software-only or include hardware, as well, if you buy a vendor's pre-integrated OpenStack system. Until a couple of years ago, Ceph was supported by a startup company, Inktank, but is now fully supported by Red Hat. There are plenty of vendors selling pre-integrated Ceph appliances and addressing hardware support.
Acquisition and support are on a somewhat level playing field. Ensure that after-sales drive add-ons are reasonably priced, as some major vendors ask for huge markups on drives. Generally, Ceph vendors use commercial off-the-shelf drives and allow users to purchase standard drives from distribution, while some Swift vendors are more proprietary and require you to buy their drives.
Compare the functionality and maturity of Swift vs. Ceph
Ceph is a mature product that is already widely used. But it isn't wrinkle-free, as some parts of Ceph, such as the object storage daemon (OSD) code, are still under major renovation. Ceph also supports file and block-IO access modes, and CERN has demonstrated that it scales to large sizes.
Swift is also mature. However, large OpenStack deployments are still rare, so Swift scalability remains somewhat untested. Swift also entered the arena a couple of years after Ceph and has been playing catch-up since. As a result, some Swift developers are now focused on roadmap details that could help further differentiate Swift from Ceph.
This is currently leading to the development of proprietary Swift APIs that differ not only from Ceph, but also from Amazon Simple Storage Service (S3). Resistance to yet another set of interfaces is building and, unless there are strong reasons for the divergence, Ceph's market share might grow.
Looking at roadmaps, the Ceph Special Interest Group is articulating a good story. Red Hat and SanDisk recently partnered to improve SSD and flash performance in Ceph, in anticipation of hard drive usage declining in the next few years. One known deficit of Ceph, however, is the intense back-end traffic that can create performance bottlenecks. Erasure coding, as opposed to replication, reduces that traffic, and a Red Hat partnership with Mellanox added remote direct memory access (RDMA) and fast LAN links to improve throughput and response time.
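To make the replication vs. erasure coding comparison concrete, here is a rough back-of-the-envelope sketch in Python. It only counts the bytes that have to reach the back end for one object, and it ignores metadata, journaling and recovery traffic. The function names, the 4 MiB object size and the 4+2 coding profile are illustrative assumptions, not settings taken from any Ceph deployment.

```python
def replication_bytes(object_size: int, copies: int) -> int:
    """Bytes written to the back end when every replica holds a full copy."""
    return object_size * copies


def erasure_coded_bytes(object_size: int, k: int, m: int) -> float:
    """Bytes written when the object is split into k data chunks plus m coding chunks."""
    chunk_size = object_size / k
    return chunk_size * (k + m)


if __name__ == "__main__":
    size = 4 * 1024 * 1024  # hypothetical 4 MiB object
    print("3x replication:", replication_bytes(size, 3), "bytes")
    print("4+2 erasure coding:", erasure_coded_bytes(size, 4, 2), "bytes")
```

Under these assumptions, three-way replication writes three times the object size, while a 4+2 erasure-coded layout writes one and a half times the object size. The trade-off is extra computation to encode and rebuild chunks.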
Further improvement is in the works, according to Red Hat. For example, Ceph's OSD code, which drives storage devices, is being rewritten and tuned for higher performance. Ceph code is also already structured for software-defined infrastructure and can be easily virtualized and replicated. This makes Ceph suitable for hyper-converged configurations.
Data consistency in Swift vs. Ceph
Swift and Ceph differ in terms of data consistency management. Swift offers eventual consistency, where some of the replicas of a data object are written asynchronously from the first copy. This exposes the possibility of an incomplete update returning wrong data, but it works well when the replicas are in different geographical regions.
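The stale-read risk described above can be illustrated with a toy model. This is a minimal sketch of eventual consistency in general, not Swift's actual replication code: the class name, the `sync` method and the three-replica layout are all hypothetical.

```python
class EventuallyConsistentStore:
    """Toy store: the first replica is written immediately, the rest later."""

    def __init__(self, replica_count: int = 3):
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = []  # queued (replica_index, key, value) updates

    def put(self, key, value):
        # The first copy is written right away; the others are queued.
        self.replicas[0][key] = value
        for index in range(1, len(self.replicas)):
            self.pending.append((index, key, value))

    def get(self, key, replica_index):
        # A read may land on any replica, including one that is behind.
        return self.replicas[replica_index].get(key)

    def sync(self):
        # Stand-in for the asynchronous background replication.
        for index, key, value in self.pending:
            self.replicas[index][key] = value
        self.pending.clear()


if __name__ == "__main__":
    store = EventuallyConsistentStore()
    store.put("report", "v1")
    store.sync()
    store.put("report", "v2")
    print(store.get("report", 0))  # first replica already has the update
    print(store.get("report", 2))  # remote replica still holds the old value
    store.sync()
    print(store.get("report", 2))  # replicas have converged
```

Until the background step runs, a read served by a lagging replica returns the older value, which is the exposure the paragraph describes.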
Ceph uses a synchronous process that requires a quorum of replicas to be written before acknowledging that a write is complete. This guarantees consistency, but adds latency if the remote site has to be part of the quorum. You can overcome these issues by choosing the right replica placement or by setting controls. The same applies to Swift's exposure to incomplete writes, where the write_affinity setting can be used to force a quorum based on multiple local writes.
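A small sketch shows why replica placement matters for a quorum-based write. It assumes the write is acknowledged as soon as the required number of replicas have finished; the simple-majority rule and the latency figures are illustrative assumptions, not measured values or the exact rule either system applies.

```python
def quorum_size(replica_count: int) -> int:
    """Simple majority of the replicas (an assumption for this sketch)."""
    return replica_count // 2 + 1


def ack_latency_ms(replica_latencies_ms, required: int) -> float:
    """Time until `required` replicas have completed, i.e. the
    latency of the required-th fastest replica."""
    return sorted(replica_latencies_ms)[required - 1]


if __name__ == "__main__":
    # Hypothetical layout: two local replicas and one at a remote site.
    latencies = [2, 3, 80]
    local_quorum = quorum_size(len(latencies))
    print("quorum met locally:", ack_latency_ms(latencies, local_quorum), "ms")
    print("remote site in quorum:", ack_latency_ms(latencies, len(latencies)), "ms")
```

With two of three replicas placed locally, the quorum is satisfied without waiting for the remote site, so the slow link no longer sets the write latency.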
While the write quorum issue has a huge impact on performance, in either case it can be resolved by confining the quorum to local storage.
In the Swift vs. Ceph race for OpenStack storage, it would seem that Ceph is winning -- at least right now. But to complete the OpenStack storage story, it's important to address block-IO. The OpenStack Cinder project addresses this, providing a front end for a wide variety of SAN- and LAN-based networked storage. Traditional block-IO software, such as iSCSI, is used in these boxes. There is no competitive software stack to Cinder.