Stark & Wayne

Company A - Avoiding Doomsday


Primary Client

A large industrial farm equipment manufacturer.

Background

Company A had three separate Cloud Foundry environments up and running: two production environments, both with assigned data centers that had been running for a year, and a development environment that was not yet linked to a data center.

The production environments were critically important to the day-to-day operation of the business, as they hosted dealership-related services.

With the loss of a key employee and the institutional knowledge that went with him, the remaining team members at Company A weren't actively monitoring certificates or their re-issuance dates.

Due to problems during this transition of responsibility, some key re-issuance dates had already passed, leading to the expiration of the BOSH-related certificates for the development environment, thus making it inoperable.

Soon after the development environment went down, the BOSH-related certificates for the first production environment also expired, leading to another outage.

Fortunately, Company A's high-availability design routed all traffic through the functioning second production environment, enabling Company A's operations to continue running smoothly.

These outages acted as a clear warning to the Company A team, alerting them to the problem of expiring certificates along with the potentially disastrous consequences of having the second production environment become inoperable.

The sequence of certificate expirations also gave the team a three-day warning period before the BOSH certificate for the second production environment would expire.

The clock was ticking and, with the possibility of having day-to-day operations grinding to a halt, the team at Company A called on Stark & Wayne for assistance.

Problem Area(s)

With uptime critical to operations and a tight deadline to avoid having all three environments become inoperable, Stark & Wayne engineers called on their experience to find a solution that would keep the second production environment running smoothly and get the other two environments working again.

With only three days until the third BOSH certificate expired, the initial plan to have certificates re-issued would have taken too long, so another approach was needed to keep the environments running.

That's when Stark & Wayne's experience with Cloud Foundry and our collective approach enabled us to find a solution.

Solution

Stark & Wayne engineers quickly gained access to BOSH and identified each certificate that had expired.

With re-issuing the certificates ruled out as a viable option, the experts at Stark & Wayne were undeterred and devised another plan.

Relying on their experience, the Stark & Wayne team decided to extract the certificates from the existing deployments and create renewal requests.

By taking this approach, Stark & Wayne was able to extend the expiration dates on the existing certificates, leveraging BOSH deployment paradigms to update the certificates for each VM through purpose-built software using Company A's last standing production environment.

This secondary approach, unfamiliar to those with little Cloud Foundry experience, helped Company A avoid what could have been a catastrophic period of downtime, allowing Stark & Wayne to keep the one production environment running and restore the two inoperable environments.

During this engagement, Stark & Wayne leveraged their Collective approach, through which the onsite team was advised to use Stark & Wayne's Safe certificate management tool for the job.

This choice of tool, based on all of Stark & Wayne's considerable experience, was an important factor in having the onsite team manage the process more effectively, allowing them to complete the job of renewing certificates a full day ahead of the project's three-day deadline.

To help prevent future outages, Stark & Wayne also taught Company A's operation team how to use Doomsday, a Stark & Wayne-developed tool used to alert personnel automatically when certificates are nearing their expiration date. This reduces the team's reliance on tracking certificate expiration manually.
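The idea behind automated expiry alerting can be illustrated with a minimal Python sketch. This is not Doomsday's implementation or interface; the function name, the 30-day warning window, and the inventory of certificate names and dates are all hypothetical, chosen only to mirror the situation described in this case study. It assumes the expiration dates have already been collected from wherever the certificates are stored.

```python
from datetime import datetime, timedelta, timezone


def classify_certificates(expiry_by_name, now, warn_within=timedelta(days=30)):
    """Group certificate names into expired, expiring_soon, and ok.

    expiry_by_name maps a certificate name to its expiration datetime.
    warn_within is a hypothetical alert window, not a Doomsday default.
    """
    report = {"expired": [], "expiring_soon": [], "ok": []}
    for name, not_after in sorted(expiry_by_name.items()):
        if not_after <= now:
            report["expired"].append(name)
        elif not_after - now <= warn_within:
            report["expiring_soon"].append(name)
        else:
            report["ok"].append(name)
    return report


# Hypothetical inventory: names and dates are invented for illustration.
now = datetime(2020, 1, 10, tzinfo=timezone.utc)
inventory = {
    "dev-bosh": now - timedelta(days=5),       # already expired
    "prod1-bosh": now - timedelta(days=1),     # already expired
    "prod2-bosh": now + timedelta(days=3),     # inside the warning window
    "prod2-router": now + timedelta(days=200), # healthy
}

report = classify_certificates(inventory, now)
for status, names in report.items():
    print(status, names)
```

Run on a schedule, a check of this kind would flag a certificate well before it expires, so the team would not have to rely on someone remembering the dates.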

Next Steps

Stark & Wayne supplied best-practice architecture recommendations to Company A, enabling all the features of Cloud Foundry and helping the client avoid BOSH certificate problems going forward.