https://new.jameshunt.us

The Future of BOSH & Kubernetes

There's been a lot of talk over the past few years about the relevance of Cloud Foundry and BOSH in the face of that titan of container runtimes, Kubernetes. I'd like to throw my predictions out there, and hopefully articulate my vision of whither BOSH, whither Kubernetes.

Some Background & Context

BOSH is, at its core, a VM orchestration engine. It operates on an impressive selection of clouds / IaaSes, including the big ones: AWS, Azure, GCP, vSphere. It creates VMs, provides them with software and configuration, and supervises them. If any VMs crash or go missing, BOSH recreates them.

Cloud Foundry currently sits on top of BOSH (it's packaged exclusively for BOSH, and canonically runs as a bunch of VMs). It provides what we call the cf push story.

It goes like this:

$ cd ~/code/killer-app
$ cf push

Yup. That's it. The power of Cloud Foundry is that deploying applications really can be that simple. Even with persistent data services, custom domains, path-based routing, and raw TCP requirements. cf push.

Under the hood, Cloud Foundry has its own container runtime, called Diego, which supervises and scales application instances (really: containers). If any application instances crash or go missing, Diego recreates them.

Kubernetes is the darling of the container world, and provides a large and flexible toolkit for making containers and OCI-compliant images, like those created by docker build ..., into a usable system, built on pods (really: containers). If any pods crash or go missing, Kubernetes recreates them.

So we've got three supervisors, each tasked with keeping shit running. Two of them leverage containers, and one deals almost exclusively in VMs.
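All three supervisors share the same basic shape: compare what should be running against what is actually running, and recreate whatever is missing. Here's a minimal sketch of that loop in Python. The names (reconcile, desired, running) are made up for illustration; none of this is actual BOSH, Diego, or Kubernetes code.

```python
def reconcile(desired, running, create):
    """Recreate anything that should be running but isn't.

    desired: set of instance names we want to exist
    running: set of instance names that currently exist
    create:  callback that (re)creates one instance by name
    Returns the sorted list of names that were recreated.
    """
    missing = sorted(desired - running)
    for name in missing:
        create(name)
    return missing


if __name__ == "__main__":
    desired = {"web/0", "web/1", "db/0"}
    running = {"web/0", "db/0"}  # web/1 crashed or went missing

    # First pass notices the gap and fills it.
    print(reconcile(desired, running, running.add))  # ['web/1']
    # Second pass finds nothing to do.
    print(reconcile(desired, running, running.add))  # []
```

Whether the "instance" is a VM, an application instance, or a pod is the only real difference between the three.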

The $64M Question - How do I deploy X?

Do I use BOSH? Do I create a BOSH release for the software and deploy it via my favorite director on my chosen IaaS?

Do I use Cloud Foundry + Diego, and package my software up in a form that can be containerized for me (via a buildpack)? Do I just push a Docker container into CF?

Do I use Kubernetes? Do I package up my software in Docker images, and write a Helm chart to handle the nitty-gritty of rollout?

There doesn't seem to be a lot of consensus on where to go from here. Here's what I've seen:

1. Apps on Cloud Foundry; Services on Kubernetes

Cloud Foundry holds to the 12-factor path, one tenet of which is thou shalt not keep state. After all, you can't scale a node that has to replicate its local state across an ever-growing cluster. In CF land, your persistent data goes in your services.

Under this approach, Kubernetes is where you deploy your services, since Kubernetes can deal with state. Everything is still a container, so operators get to multiplex lots of tenants onto a shared set of machines. The advanced resource- and CPU-share capabilities of Kubernetes help to mitigate noisy-neighbor problems (which often drove us to want on-demand, dedicated VMs for our services).

There are two downsides to this:

  1. Operators still have to figure out how to stand up (and maintain!) their own Kubernetes clusters.
  2. Cloud Foundry is still using Diego, so you have two distinct runtimes, leading to more waste in unused slack space for future expansion.
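To put rough numbers on that second point, here's a back-of-the-envelope sketch. It assumes each runtime rounds its capacity up to whole cells and keeps one spare cell for failover and growth. The workload sizes, cell size, and the one-spare rule are all invented for illustration; they are not measurements from any real deployment.

```python
import math


def cells_needed(peak_load, cell_capacity, spare_cells=1):
    """Whole cells required to carry peak_load, plus spare cells kept as slack."""
    return math.ceil(peak_load / cell_capacity) + spare_cells


if __name__ == "__main__":
    # Made-up numbers, in GiB of memory.
    apps_peak, services_peak, cell = 70, 50, 32

    # Two runtimes: each one rounds up and keeps its own spare cell.
    separate = cells_needed(apps_peak, cell) + cells_needed(services_peak, cell)
    # One runtime: round up once, keep one spare cell.
    shared = cells_needed(apps_peak + services_peak, cell)

    print(separate, shared)  # 7 5
```

Under those assumptions, running two runtimes costs two extra cells for the same workload.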

2. Use BOSH to Deploy Kubernetes

As near as I can tell, this is the entire strategy of CFCR, and its commercial big brother, PKS. Use BOSH to deploy a Kubernetes cluster. This gets you all of the value of the BOSH lifecycle - VM replacement, semi-immutable VM deployments, and IaaS-agnosticism.

The main problem I see here is that Kubernetes is still a walled garden, separate from Cloud Foundry's Diego.

3. Replace Diego with Kubernetes

Diego is a container runtime. Kubernetes is a container runtime. As they say in old Westerns, this town ain't big enough for the both of us. As an operator, I'd really rather have one container runtime that everyone gets to use.

That's what project Eirini from SUSE/IBM is setting out to do. Since runtimes / container orchestrators are a commodity, why not unhitch Diego from Cloud Foundry's API and allow system designers to swap in Kubernetes? (Or Swarm? Or Nomad?)

This neatly solves the waste problem of having more than one runtime. Cloud Foundry itself (the routing layer, UAA, database, API, etc.) is all still deployed via BOSH, and you can do what you want with services. I'd wager, however, that more operators and service integrators will move to Kubernetes for the sheer audience size, and abandon BOSH for non-CF things.

4. Treat Kubernetes as an IaaS for BOSH to use

This one is a bit odd, but there is a project out there on GitHub that provides a BOSH Cloud Provider Interface for Kubernetes. BOSH uses the CPI abstraction to deal with an idealized IaaS, and uses CPI plugins to adapt to different real-world clouds. There is an AWS CPI, for example, and one for OpenStack.

The Kubernetes CPI for BOSH translates VM operations into Pod operations.

This would work, and probably work well, were it not for the fact that BOSH is predisposed to thinking in terms of virtual machines. Since VMs take a while to provision, BOSH does in-place upgrades. In a purely container-based solution, this is undesirable, as it breaks the promise of immutable containers.
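To make the translation concrete, here's a rough sketch of what "VM operations become Pod operations" could look like. The method names create_vm, delete_vm, and has_vm come from the BOSH CPI contract, but the argument lists here are simplified, and FakeCluster is a hypothetical in-memory stand-in for a Kubernetes API. This is not the code of the actual Kubernetes CPI project.

```python
import itertools


class FakeCluster:
    """Hypothetical in-memory stand-in for a Kubernetes API; not a real client."""

    def __init__(self):
        self.pods = {}

    def create_pod(self, name, image, labels):
        self.pods[name] = {"image": image, "labels": labels}

    def delete_pod(self, name):
        self.pods.pop(name, None)

    def pod_exists(self, name):
        return name in self.pods


class PodBackedCPI:
    """Answers BOSH-style VM calls by creating and deleting pods.

    Argument lists are simplified for illustration; a real CPI takes more
    (networks, disks, cloud properties, and so on).
    """

    def __init__(self, cluster):
        self.cluster = cluster
        self._ids = itertools.count(1)

    def create_vm(self, agent_id, stemcell_image):
        # "Create a VM" becomes "create a pod", returning an ID BOSH can track.
        vm_cid = f"vm-{next(self._ids)}"
        self.cluster.create_pod(vm_cid, stemcell_image, {"agent_id": agent_id})
        return vm_cid

    def delete_vm(self, vm_cid):
        self.cluster.delete_pod(vm_cid)

    def has_vm(self, vm_cid):
        return self.cluster.pod_exists(vm_cid)


if __name__ == "__main__":
    cpi = PodBackedCPI(FakeCluster())
    cid = cpi.create_vm("agent-1", "stemcell:latest")
    print(cid, cpi.has_vm(cid))  # vm-1 True
    cpi.delete_vm(cid)
    print(cpi.has_vm(cid))       # False
```

Notice what's missing: an in-place upgrade never goes through delete_vm and create_vm. It changes the software inside a pod that already exists, and that's exactly where the VM mindset and the container mindset collide.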

The Future: Shiny and Bright

I'm an architect by trade, so I live in the future where everything works and nothing is broken. Here's where I see us going:

BOSH Deploys my Kubernetes Clusters

Let's let BOSH do what BOSH does best: provision and maintain virtual machines. It's really good at it, and it has lots of experience. I have other tools at my disposal that build on top of the BOSH paradigm, like Genesis. Those continue to work really well for the Kubernetes clusters themselves.

Kubernetes Deploys ... Everything Else

All the other bits of the infrastructure that I'm used to get deployed on top of Kubernetes.

SHIELD becomes a Helm chart.

My monitoring system is a set of containers, either in situ with the monitored workloads, or off-cluster (in another k8s).

Blacksmith is both deployed on Kubernetes, and also deploys service instances to it.

Concourse no longer has workers; pipeline jobs are one-off tasks inside the k8s cluster.

Cloud Foundry Leverages Kubernetes

I don't deploy any Diego cell VMs via BOSH. I don't deploy Diego cell containers and nest my containerization. Cloud Foundry (via Eirini or something like it) is scheduling application instance containers directly onto the Kubernetes substrate.

Parting Thoughts

BOSH and Kubernetes play important and complementary roles in the modern cloud data center. Let's let BOSH deal with the VMs and let Kubernetes handle the containers.

In the end, I think we'll all be better off.

James (@iamjameshunt) works on the Internet, spends his weekends developing new and interesting bits of software and his nights trying to make sense of research papers.

Currently exploring Kubernetes, as both a floor wax and a dessert topping.