How to scale out any Cloud Foundry service

Introducing…

logo

TL;DR

  1. OMG YOU DIDN’T READ THE POST?!

  2. This post introduces Subway, a Cloud Foundry service broker that lets you horizontally scale out simple service brokers that don’t support horizontal scaling themselves.

A single-server service broker can only hold a limited number of service instances. To run more service instances, you need to scale out with more servers.

For example, if you’ve used the Docker BOSH release then you’ve hit this problem. Subway is the solution.

It also makes it easy for you to write new Service Brokers without having to implement your own horizontal scaling solution.

UPDATE: We posted a demo animated gif.

Background

Since the dawn of time, Cloud Foundry deployments could install a set of primitive services to suit all developer needs. There is a simple Service Broker API and there was a collection of pre-implemented services which could grow and scale out with demand (cf-contrib-services).

cf-contrib-services did have a lot of features – chief amongst them was that you could run one or more servers to support the actual primitive service. If your Cloud Foundry users wanted a lot of Redis services then you could run a lot of Redis nodes.

Unfortunately, cf-contrib-services was awkwardly architected and difficult to extend with new primitive services, new versions of those services, or new operations functionality.

In 2013 the Cloud Foundry core services team decided to abandon the project and commenced a single-service MySQL project. Stark & Wayne’s Ruben Koster has maintained cf-contrib-services over the years to keep it alive for those who used it. Thanks Ruben!

Fortunately, Pivotal’s Ferran Rodenas wrote a new service broker that could support any primitive service you could code up in a Docker image – cf-containers-broker. Relatively quickly there were 16 different primitive services supported via 16 different Docker images and some example deployment manifests for people using BOSH. To me it was the successor to the cf-contrib-services project.

And it used Docker, so extending it with new primitive services was relatively easy (I wrote 4 of the 16 extensions in half a day). The containers were stored directly on persistent disks, and BOSH’s resurrector would recreate the server if anything bad happened to it. Impressive project, thanks Ferdy!

The problem

Unfortunately, the cf-containers-broker project targets a local Docker daemon. That means the broker can only support a single node: one server running many Docker containers. You can only scale UP, not OUT. You can run it on the biggest server and biggest persistent disk you can provision, but no bigger.

Other community services suffer a similar problem.

Pivotal’s Redis service, cf-redis-release, can run multiple redis-server processes on a single server, but only on that one server. It cannot scale out the cluster of servers on which redis-server processes run and store their data.

The solution

The solution we went for was to introduce a proxy service broker, a multiplexer, that sits between the Cloud Controller and the single-node backend service brokers.

We call it Subway – an underground tunnel between Cloud Foundry and Service Brokers.

It makes it very easy for anyone to write a simple Service Broker and automatically get a scalable deployment solution. The operator can run one or more nodes of your Service Broker, and you don’t have to implement or maintain the scaling functionality yourself.

diagram

Deployment

Currently Subway assumes it is brokering to a homogeneous set of brokers: each supports the same catalog of services and plans. It is not necessary, though, for every backend broker to actually have capacity for every plan.

If your backend brokers update their catalog, then Subway will automatically pick up the new catalog the next time you run cf update-service-broker.
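Because Subway expects every backend to advertise the same services and plans, it can be worth checking that before you add a backend. The following is a minimal sketch, not part of Subway itself: it assumes you have already fetched each backend’s catalog (the Service Broker API returns it from GET /v2/catalog as a "services" list with nested "plans"), and the broker names and IDs are made up for illustration.

```python
from typing import Dict, List, Set, Tuple


def plan_keys(catalog: dict) -> Set[Tuple[str, str]]:
    """Return the set of (service id, plan id) pairs found in a catalog."""
    return {
        (service["id"], plan["id"])
        for service in catalog.get("services", [])
        for plan in service.get("plans", [])
    }


def mismatched_backends(catalogs: Dict[str, dict]) -> List[str]:
    """Names of backends whose catalog differs from the first backend's (by name)."""
    names = sorted(catalogs)
    if not names:
        return []
    reference = plan_keys(catalogs[names[0]])
    return [name for name in names[1:] if plan_keys(catalogs[name]) != reference]


# Hypothetical catalogs, keyed by a made-up backend name.
redis_catalog = {
    "services": [
        {"id": "svc-redis", "name": "redis",
         "plans": [{"id": "plan-free", "name": "free"}]}
    ]
}
catalogs = {
    "broker-1": redis_catalog,
    "broker-2": redis_catalog,
    "broker-3": {"services": []},  # this one offers nothing, so it does not match
}
print(mismatched_backends(catalogs))
```

The comparison uses IDs rather than names because the Cloud Controller identifies services and plans by ID.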

See the README for deployment instructions of Subway as a Cloud Foundry application.

Upgrading to Subway

If you have an existing Docker service broker then it is easy to upgrade it to use Subway, and then commence scaling out your Docker service.

First, deploy Subway to your Cloud Foundry and configure it for the one backend that you have running today.

Second, update the Service Broker record in the Cloud Controller:

cf update-service-broker <current-name> <user> <pass> <subway-url>

The Cloud Controller will stop sending API requests to the backend broker and start sending them to your new Subway broker. Nothing else changes. Subway automatically offers the same service catalog as your backend broker.

Now you can deploy new backend service brokers – that is, more Docker broker servers.

Finally, add the new backend brokers to your Subway app (via environment variables at the moment, see README) and restart the app. Very simple.
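To give a feel for environment-variable configuration, here is a rough sketch of how an app could collect a list of backends from its environment. The variable prefix BACKEND_BROKER_ and the URL-with-credentials format are assumptions made for this example only; the README is the authority on the names Subway actually reads.

```python
from typing import List, Mapping, NamedTuple
from urllib.parse import urlsplit

PREFIX = "BACKEND_BROKER_"  # hypothetical prefix, not taken from the README


class Backend(NamedTuple):
    url: str
    username: str
    password: str


def backends_from_env(env: Mapping[str, str]) -> List[Backend]:
    """Build the backend list from every variable that starts with PREFIX."""
    backends = []
    for key in sorted(k for k in env if k.startswith(PREFIX)):
        parts = urlsplit(env[key])
        if not parts.hostname:
            raise ValueError(f"{key} is not a valid URL")
        host = parts.hostname
        if parts.port:
            host = f"{host}:{parts.port}"
        # Keep credentials separate from the URL so they are not logged by accident.
        backends.append(
            Backend(f"{parts.scheme}://{host}", parts.username or "", parts.password or "")
        )
    return backends


# Hypothetical environment; in a real app you would pass os.environ.
example_env = {
    "BACKEND_BROKER_1": "http://admin:secret@10.0.0.10:8080",
    "BACKEND_BROKER_2": "http://admin:secret@10.0.0.11:8080",
    "PORT": "3000",
}
for backend in backends_from_env(example_env):
    print(backend.url, backend.username)
```

With this style of configuration, adding a backend means setting one more variable and restarting the app, which matches the workflow described above.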

How does it work?

See the README.

How does Subway store state?

Subway is stateless – it does not remember anything.

When it assigns a provisioned service instance to one of the backend brokers, it then forgets about it.

When a binding request arrives for a service instance, it sends off the request to all backend brokers until one of them says "thanks" and returns the binding credentials.

Perhaps this is the "spammer scheduling system".

But Subway is stateless and every sub-system that has no state is a wonderful thing for operations.
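The stateless routing described in this section can be sketched in a few lines. This is an illustration, not Subway’s actual code: FakeBackend is an in-memory stand-in for a real backend broker, and both the random placement order for provisioning and the one-at-a-time order for binding are assumptions, since the post does not say how backends are picked.

```python
import random
from typing import Dict, List, Optional


class FakeBackend:
    """In-memory stand-in for a single-node backend broker (hypothetical)."""

    def __init__(self, name: str, capacity: int) -> None:
        self.name = name
        self.capacity = capacity
        self.instances: set = set()

    def provision(self, instance_id: str) -> bool:
        if len(self.instances) >= self.capacity:
            return False  # this backend is full
        self.instances.add(instance_id)
        return True

    def bind(self, instance_id: str, binding_id: str) -> Optional[Dict[str, str]]:
        if instance_id not in self.instances:
            return None  # this backend does not own the instance
        return {"host": self.name, "binding": binding_id}


class StatelessRouter:
    """Routes requests without recording which backend owns which instance."""

    def __init__(self, backends: List[FakeBackend]) -> None:
        self.backends = backends  # configuration only, no instance map

    def provision(self, instance_id: str) -> bool:
        # Assumption: try backends in random order until one accepts.
        for backend in random.sample(self.backends, len(self.backends)):
            if backend.provision(instance_id):
                return True
        return False

    def bind(self, instance_id: str, binding_id: str) -> Optional[Dict[str, str]]:
        # Ask each backend in turn; the owner is the one that answers.
        for backend in self.backends:
            credentials = backend.bind(instance_id, binding_id)
            if credentials is not None:
                return credentials
        return None


router = StatelessRouter([FakeBackend("broker-1", 1), FakeBackend("broker-2", 1)])
router.provision("instance-a")
router.provision("instance-b")
print(router.bind("instance-a", "binding-1"))
print(router.provision("instance-c"))  # False: both backends are full
```

The trade-off is that a bind may cost several requests instead of one, but the router can be restarted or scaled at any time because the only thing it holds is the list of backends.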

Does it work?

I’ve got a simple acceptance test harness running on Stark & Wayne CI that you can review:

example
