Ceph: the future of Storage

Overview

Since joining Red Hat in 2015, I have intentionally stayed away from the topic of storage. My background is in storage, but I wanted to do something else, as storage had become completely mundane and, frankly, boring. Why?

Storage hasn't changed much in 20 years. I started my career as a Linux and storage engineer in 2000, and everything that existed then exists today. Things have only become bigger, faster and cheaper, thanks to incremental improvements from technologies such as flash. There comes a point, however, where minor incremental improvements are no longer good enough and a completely new way of addressing challenges is the only way forward.

I realized in late 2015 that the storage industry was entering a challenging period for all vendors, but I didn't really have a feeling for when that could lead to real change. I did know that the monolithic storage array we all know and love, built on proprietary Linux/Unix with proprietary x86 hardware, was a thing of the past. If you think about it, storage is a scam today: you get open source software running on x86 hardware, packaged as a proprietary solution that doesn't interoperate with anything else. So you get none of the value of open source and pay extra for it. I like to think that economics, like gravity, always wins eventually.

Challenges

There are many challenges facing our storage industry, but I will focus on three: cost, scale and agility.

Cost

Linux has become a great equalizer in storage and has fueled storage startups over the past 3-5 years, allowing them to quickly build innovative products that challenge traditional vendors. Storage companies not building on Linux are forced to be operating system companies in addition to storage companies, a distinct disadvantage. Their R&D costs to maintain proprietary storage operating systems reduce overall value, slow innovation and, in the end, increase costs. In addition, most storage platforms today have high vendor lock-in, offering little choice. You usually can't choose the hardware, so you are often paying a premium for standard x86. Disks are a great example: they typically cost twice what you would pay through Amazon.

“Oh but our disks are rigorously tested and have a mean time between failure of xxxxxxx so the cost is justified.”

No, it isn't! We need a new way of thinking: what if disk failures weren't a big deal, didn't cause impact and the storage system automatically adjusted?

While these points may be interesting, at the end of the day cost comes down to one thing: everyone is measured against Amazon S3. You are either cheaper than S3 (3 cents per GB per month) or you have some explaining to do. In small or medium environments it may be doable with hyper-converged storage, but as soon as scale and complexity come into play (multiple use-cases with block, file and object) those costs explode. If we are talking about large environments, forget it. Even if the cost could get down to the S3 range, things wouldn't scale across the many use-cases in a large enterprise.
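
To make that benchmark concrete, here is a rough back-of-the-envelope sketch in Python; the 1 PB capacity is an illustrative assumption, not a figure from any particular environment:

    # Rough, illustrative math against the S3 benchmark of 3 cents per GB per month.
    # The capacity is an assumed example, not data from a real environment.
    capacity_gb = 1000 * 1000              # 1 PB expressed in GB
    s3_rate_per_gb_month = 0.03            # USD, the "3 cents per GB per month" benchmark
    monthly_cost = capacity_gb * s3_rate_per_gb_month
    yearly_cost = monthly_cost * 12
    print("S3 equivalent: $%.0f per month, $%.0f per year" % (monthly_cost, yearly_cost))
    # -> S3 equivalent: $30000 per month, $360000 per year

That is the line any on-premise alternative has to come in under once hardware, support and operations are all counted.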

Scale

Scalability is a very complex problem to solve. Sure, everything scales until it doesn't, but jokes aside, we have reached a point where scalability needs to be applied more generally, especially in storage. Storage arrays today are use-case driven. Each customer has many use-cases, and the larger the customer, the more use-cases. This means many types of dissimilar storage systems. While a single storage system may scale to some degree, many storage systems together don't. Customers are typically stuck in a three-year cycle of forklift upgrades, because storage systems don't truly scale out, they only scale up.

Agility

The result of many storage appliances and arrays is technology sprawl, which in turn creates technology debt. Storage vendors today don't even have a vision for data management within their own ecosystem, let alone for interoperating with other vendors. Everything is completely fragmented, to the point where a few use-cases equal a new storage system.

Storage systems today require a fairly low entry cost for management overhead, but as the environment grows and scales those costs increase, reducing agility. They don't stay consistent; how could they, when a unified data management strategy is missing?

Storage systems are designed to prevent failure at all costs; they certainly don't anticipate failure. At scale, of course, we have more failures, which in turn means more time spent keeping the lights on. The goal of every organization is to reduce that and maximize time spent on innovation. Automation suffers, and as a result it becomes increasingly harder to build blueprints around storage. There is just too much variation.

I could certainly go on, and there are other challenges to discuss, but I think you get the picture. We have reached the dead end of storage.

Why Software-defined?

As mentioned, the main problem I see today is that for every use-case there is a storage array or appliance. The startups are all one-trick ponies, solving only a few use-cases. Traditional storage vendors, on the other hand, throw different storage systems at each use-case, cobble them together in some UI and call that a solution. You end up with no real data management strategy, and that is truly what is needed. I laugh when I hear vendors talking about data lakes. If you buy into the storage-array mindset, you end up in the same place: a completely unmanageable environment at scale, where operational costs are not linear but constantly going up. Welcome to most storage environments today.

As complexity increases you reach a logical point where abstractions are needed. Today storage not only needs to provide file, block and object but also needs to interoperate with a large ecosystem of vendors, cloud providers and applications. Decoupling the data management software from the hardware is the logical next step. This is the same thing we have already observed with server virtualization and are observing in networking with NFV. The economics and the advantages of decoupling hardware and software simply make sense. Organizations have been burned over and over, making technology decisions that are later replaced or reverted because newer, better technologies become available on other platforms. Software-defined storage allows an easy introduction of new technologies, without having to purchase a new storage system because your old storage was designed before that technology was invented. Finally, storage migrations. Aren't we tired of always migrating data when changing storage systems? A common data management platform using common x86 hardware and Linux could do away with storage migrations forever.

Why Ceph?

Ceph has become the de facto standard for software-defined storage. The storage industry is at the beginning of a major disruption period in which software-defined storage will drive out traditional proprietary storage systems. Ceph is of course open source, which enables a rich ecosystem of vendors providing storage systems based on Ceph. The software-defined world is not possible without open source and doing things the open source way.

Ceph delivers exactly what is needed to disrupt the storage industry. Ceph provides a unified, scale-out storage system based on common x86 hardware, is self-healing and not only anticipates failures but expects them. Ceph does away with storage migrations and, since hardware is decoupled, gives you the choice of when to deploy new hardware technologies.

Since Ceph can be purchased separately from hardware, you have choice, not only in whom you buy Ceph from (Red Hat, SUSE, Mirantis, Ubuntu, etc.) but also in whom you purchase hardware from (HP, Dell, Fujitsu, IBM, etc.). In addition, you can even buy Ceph together with hardware as an integrated appliance (SanDisk, Fujitsu, etc.). You have choice and are free from vendor lock-in.

Ceph is extremely cost efficient. Even the most expensive all-flash, integrated solutions come in below S3 (3 cents per GB per month). If you really want to go cheap, you can purchase off-the-shelf commodity hardware from companies like Supermicro and still get enterprise Ceph from Red Hat, SUSE, Ubuntu, etc., while being a lot cheaper than S3.

Ceph scales. One example I will give is CERN's 30 PB test. Ceph can be configured to optimize for different workloads such as block, file and object. You can create storage pools and decide to co-locate journals with data, or put journals on SSDs for optimal performance. Ceph allows you to tune your storage to specific use-cases, while maintaining a unified approach. In tech-preview is a new feature called BlueStore, which allows Ceph to completely bypass the file-system layer and store data directly on raw devices. This will greatly increase performance, and there are plenty of optimizations planned after that; this is just the beginning!
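
As a small illustration of how little ceremony a storage pool takes, here is a minimal sketch using the python-rados bindings; the ceph.conf path and the pool name are assumptions for the example, not recommendations:

    import rados

    # Connect to the cluster using the standard config file and keyring.
    # The path and the pool name below are illustrative assumptions.
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    print("Cluster FSID: %s" % cluster.get_fsid())

    # Create a pool if it does not already exist.
    pool_name = 'blog-demo'
    if not cluster.pool_exists(pool_name):
        cluster.create_pool(pool_name)

    # Basic capacity statistics for the whole cluster (values in KB).
    stats = cluster.get_cluster_stats()
    print("Used %d KB of %d KB" % (stats['kb_used'], stats['kb']))

    cluster.shutdown()

Per-pool details such as replica count, CRUSH placement rules and journal location live in the cluster configuration and CRUSH map rather than in client code, which is what keeps the client side this simple.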

Ceph enables agility, providing a unified storage system that supports all three storage types: file, block and object. Ceph provides a unified storage management layer for anything you can present as a disk device. Finally, Ceph simplifies management: it is the same level of effort to manage a 10 node Ceph cluster as a 100 node Ceph cluster, so running costs remain linear.
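
To make the unified point concrete, here is a minimal sketch that stores a native RADOS object and then carves a block device image out of the very same pool, using the python-rados and python-rbd bindings; the pool and image names are made up for the example:

    import rados
    import rbd

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('blog-demo')   # the example pool from the previous sketch

    # Object storage: write and read back a native RADOS object.
    ioctx.write_full('hello-object', b'Ceph speaks object natively')
    print(ioctx.read('hello-object'))

    # Block storage: create a 4 GiB RBD image backed by the same pool.
    rbd.RBD().create(ioctx, 'demo-image', 4 * 1024 ** 3)

    ioctx.close()
    cluster.shutdown()

File access comes from CephFS, which layers a POSIX file system on top of the same RADOS objects, so all three interfaces end up on one cluster with one management model.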

Below is a diagram showing how Ceph addresses file, block and object storage using a unified architecture built around RADOS.

[Diagram: Ceph's unified architecture, with file, block and object interfaces layered on top of RADOS]

source: http://docs.ceph.com/docs/hammer/architecture

Summary

In this article we discussed current storage challenges and the value not just of software-defined storage but of Ceph specifically. We can't keep doing what we have been doing for the last 20+ years in the storage industry. The economics of scale have brought down barriers and paved the way for a software-defined world. Storage is simply the next logical boundary. Ceph, being an open source project, is already the de facto software-defined storage standard and is positioned to be the key beneficiary as software-defined storage becomes more mainstream. If you are interested in Ceph, I will be producing some how-to guides soon, so stay tuned. Please feel free to debate whether you agree or disagree with my views. I welcome all feedback.

(c) 2016 Keith Tenzer

5 thoughts on "Ceph: the future of Storage"

  1. Are you sure about this sentence?
    "Ceph can run on VMs or physical hardware so you can easily bridge private and public clouds."
    I know that Red Hat does not support Ceph on virtual machines, and therefore not on public clouds. In this scenario only GlusterFS is officially supported, isn't it?

  2. Yo Keith, what's up?! Been a long time, thanks for this article. I like what I see with Ceph, although I believe there are still specific use-cases that justify the siloed, super-high-performance array solutions. But by and large I agree that SDS is the future.

    One thing I'm concerned about with Ceph is the note about the latest release (Jewel) that the OSD backend BlueStore is experimental:
    http://ceph.com/releases/v10-2-0-jewel-released/
    I'd be very hesitant about running this in production while that object store component of Ceph is experimental. That said, it looks like it'll be considered stable in the next release or the one after.

    • Hey Dave! Great to hear from you after all these years. Yeah, I agree there will always be applications that need tier-1 storage arrays, but I think we have reached the 80% (or maybe 50%) point where Ceph is good enough.
      As for BlueStore, it is tech-preview in Jewel, same with CephFS as you mention. Tech-preview means never production; the purpose is to release technology before it is ready for prod so customers can experiment and we can have a quick feedback loop. I think BlueStore will mature quickly though; the Ceph community has given up on Btrfs, since it looks like that won't mature anytime soon, and is focused on BlueStore. I will update my blog to make it clear I meant testing only and tech-preview, thanks for pointing that out.
      Let's meet up around Xmas? I'll be in town for sure. Tell everyone I send my best and miss 'em.

      Keith
