Jul 10, 2016 Speeding up bosh create-env in Production with Proto-BOSH
bosh create-env is AWESOME. It lets you deploy BOSH itself using a BOSH manifest, making it really easy to customize your BOSH deployment as you see fit. You can add a backup agent, some monitoring, and some troubleshooting tools, or even swap out the database for an HA alternative.
However, there is one large downside with bosh create-env. Whenever you update the director, it has to recompile everything deployed to it, which in practice leads to 20-30 minutes of downtime on the BOSH director. While your deployed VMs will keep running during this time, no one can operate on them using bosh. Additionally, if any VMs fail, they won't be resurrected until the bosh create-env update has completed.
Fortunately, the folks at Pivotal's Ops team offered a suggestion a while back that we really like (Thank you, Pivotal!). I'm not sure if they have a name for it, but we have taken to calling it Proto-BOSH. In essence, you use bosh create-env to deploy a BOSH director (proto-BOSH) which will be used only for deploying/updating other BOSH directors (regular BOSH). The regular BOSH directors will then deploy your production services, like Cloud Foundry. Now, when you go to upgrade your production BOSH director, it will stay online during the package compilations and will be down only for the duration of the VM rebuild (if necessary) and job/package updates.
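To make the downtime difference concrete, here is a rough back-of-the-envelope sketch in Python. The phase names and minute values are hypothetical, chosen only so the create-env total lands in the 20-30 minute range mentioned above; they are not measurements from a real director.

```python
from dataclasses import dataclass


@dataclass
class UpgradePhases:
    """Minutes spent in each phase of a director upgrade (hypothetical values)."""
    compile_packages: float
    rebuild_vm: float
    update_jobs: float


def downtime_create_env(phases: UpgradePhases) -> float:
    # Updating with bosh create-env: the director is unavailable for every
    # phase, including package compilation.
    return phases.compile_packages + phases.rebuild_vm + phases.update_jobs


def downtime_proto_bosh(phases: UpgradePhases, vm_rebuild_needed: bool = True) -> float:
    # Updating via a Proto-BOSH: the director stays online while packages
    # compile, so only the VM rebuild (if any) and job/package updates count.
    rebuild = phases.rebuild_vm if vm_rebuild_needed else 0.0
    return rebuild + phases.update_jobs


# Assumed numbers for illustration only.
phases = UpgradePhases(compile_packages=20.0, rebuild_vm=3.0, update_jobs=2.0)

print(downtime_create_env(phases))                           # 25.0
print(downtime_proto_bosh(phases))                           # 5.0
print(downtime_proto_bosh(phases, vm_rebuild_needed=False))  # 2.0
```

Whatever the real numbers are in your environment, the point is the same: compilation is the dominant cost, and Proto-BOSH takes it out of the downtime window for your production directors.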
To make this easier to get going, we've created the bosh-genesis-kit. It is a Genesis Kit for deploying your BOSH director using templates that work for both the Proto-BOSH and the subsequent regular BOSH directors. Instructions on bootstrapping environments with the Proto-BOSH methodology can be found in the bosh-genesis-kit README.