Keeping an Eye on Cloud Foundry
This is the third post in the series about keeping an eye on Cloud Foundry. Click here for the previous post.
What have we done?
At Stark & Wayne we help our clients integrate cutting-edge technologies into their stack. Each of us is also tasked with extending these projects for the good of the community. Often we get to do both. Recently the good folks at Swisscom needed to aggregate information about Cloud Foundry into their Consul cluster. You can get more information on Consul here.
In this case Consul is being used as part of a larger health monitoring system, so our primary use case is to sync the BOSH Monitor heartbeats for each deployed component with TTL checks in Consul. Along the way we have also added the ability to forward all events on the BOSH NATS bus to your Consul cluster.
How does it work?
The Consul plugin works by forwarding NATS heartbeat events and alerts to a Consul server or agent. The NATS messages can be forwarded as TTL checks and as events. Heartbeat messages are forwarded as TTL checks: each time a heartbeat occurs, the plugin updates the TTL check with the component's status. If Consul does not receive a success message within the TTL threshold, it puts that component into a failing status.
When a non-heartbeat event or alert occurs, it can also be forwarded to Consul as an event. In our case we are using this information for event correlation to provide an appropriate automated response.
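To make the flow above more concrete, here is a minimal sketch of the Consul agent HTTP endpoints involved: registering a TTL check, refreshing it with a pass or fail status, and firing an event. The helper names, the example check name, and the rule that only a "running" job counts as passing are assumptions for illustration, not taken from the plugin's source. The sketch only builds request paths and bodies; it does not contact a Consul agent.

```ruby
require "json"

# Hypothetical helpers; names are illustrative, not taken from the plugin.

# JSON body for registering a TTL check with a Consul agent
# (sent with PUT to /v1/agent/check/register).
def ttl_check_registration(name, ttl, note = nil)
  body = { "Name" => name, "TTL" => ttl }
  body["Notes"] = note if note
  JSON.generate(body)
end

# Assumed mapping: a "running" job passes, anything else fails.
def ttl_status_for(job_state)
  job_state == "running" ? "pass" : "fail"
end

# Path that refreshes a TTL check, e.g. /v1/agent/check/pass/<check_id>.
def ttl_update_path(check_id, job_state)
  "/v1/agent/check/#{ttl_status_for(job_state)}/#{check_id}"
end

# Path that fires a custom event (sent with PUT to /v1/event/fire/<name>).
def event_path(name)
  "/v1/event/fire/#{name}"
end

# Example check name is made up for this sketch.
puts ttl_check_registration("cf1_api_z1_0", "600s")
puts ttl_update_path("cf1_api_z1_0", "running")
puts event_path("cf1_api_z1_0")
```

If no heartbeat refreshes the check before the TTL expires, Consul itself moves the check to a failing state, which is what makes a missed heartbeat visible.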
How do I use it?
Using a BOSH Monitor plugin is covered in part one of this series. The option to enable this feature is consul_event_forwarder_enabled.
consul_event_forwarder_enabled: true
consul_event_forwarder:
  host: hostname.of.consul
  events: true
  ttl: 600s
  namespace: ns/
  heartbeats_as_alerts: true
The options available to the plugin are as follows:
- host: The hostname of your Consul cluster
- namespace: A namespace that identifies a single Cloud Foundry
- port: Defaults to 8500
- protocol: Defaults to HTTP
- params: Can be used to pass an access token, e.g. “token=MYACCESSTOKEN”
- ttl: TTL checks will be used if a TTL period is set here. Example: “120s”.
- events: If set to true, heartbeats will be forwarded as events to Consul
- ttl_note: A note that will be passed back to Consul with a TTL check
- heartbeats_as_alerts: If set to true, all heartbeats will also be forwarded as events
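As a rough illustration of how the connection options above fit together, the sketch below combines host, port, protocol, and params into a request URI. The `consul_uri` helper and the `DEFAULTS` constant are hypothetical names for this example, and the way the plugin actually assembles its URLs may differ. The defaults are the ones listed above.

```ruby
require "uri"

# Defaults taken from the option list above.
DEFAULTS = { "port" => 8500, "protocol" => "http" }.freeze

# Hypothetical helper: combine plugin-style options and an API path into a URI.
def consul_uri(options, path)
  opts = DEFAULTS.merge(options)
  uri = URI.parse("#{opts["protocol"]}://#{opts["host"]}:#{opts["port"]}#{path}")
  # params is appended as the query string, e.g. an access token.
  uri.query = opts["params"] if opts["params"]
  uri
end

options = {
  "host"   => "hostname.of.consul",
  "params" => "token=MYACCESSTOKEN"
}
puts consul_uri(options, "/v1/agent/checks")
```

Passing the token through params keeps it out of the path, so the same helper works whether or not access control is enabled on the cluster.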
BOSH Talks too much!
It turns out that when we added the event forwarding feature, many of the events were too large for Consul to handle, which resulted in errors in the Consul logs. So we had to put the NATS messages on a diet. When heartbeats are sent as alerts, the format is made more concise so that it comes in under the event payload byte size limit that Consul enforces:
{
  :agent => agent_id,
  :name => "job_name / index",
  :state => job_state,
  :data => {
    :cpu => [sys, user, wait],
    :dsk => {
      :eph => [inode_percent, percent],
      :sys => [inode_percent, percent]
    },
    :ld => load,
    :mem => [kb, percent],
    :swp => [kb, percent]
  }
}
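The following sketch shows how a verbose heartbeat could be squeezed into the compact layout above and checked against a size budget. The shape of the incoming heartbeat hash, the helper names, and the `MAX_EVENT_BYTES` value are all assumptions made for illustration. Consult the Consul documentation for the actual payload limit in your version.

```ruby
require "json"

# Hypothetical budget for illustration only, not Consul's documented limit.
MAX_EVENT_BYTES = 512

# Build the compact form from an (assumed) verbose heartbeat hash.
def compact_heartbeat(hb)
  vitals = hb[:vitals]
  {
    :agent => hb[:agent_id],
    :name  => "#{hb[:job]} / #{hb[:index]}",
    :state => hb[:job_state],
    :data  => {
      :cpu => vitals[:cpu].values_at(:sys, :user, :wait),
      :dsk => {
        :eph => vitals[:disk][:ephemeral].values_at(:inode_percent, :percent),
        :sys => vitals[:disk][:system].values_at(:inode_percent, :percent)
      },
      :ld  => vitals[:load],
      :mem => vitals[:mem].values_at(:kb, :percent),
      :swp => vitals[:swap].values_at(:kb, :percent)
    }
  }
end

# True when the JSON-encoded payload fits within the given byte budget.
def fits_in_event?(payload, limit = MAX_EVENT_BYTES)
  JSON.generate(payload).bytesize <= limit
end

# Sample values are made up for this sketch.
heartbeat = {
  :agent_id => "agent-1", :job => "api_z1", :index => 0, :job_state => "running",
  :vitals => {
    :cpu  => { :sys => "1.0", :user => "2.0", :wait => "0.1" },
    :disk => {
      :ephemeral => { :inode_percent => "3", :percent => "12" },
      :system    => { :inode_percent => "30", :percent => "45" }
    },
    :load => ["0.10", "0.20", "0.30"],
    :mem  => { :kb => "512000", :percent => "25" },
    :swap => { :kb => "0", :percent => "0" }
  }
}

payload = compact_heartbeat(heartbeat)
puts JSON.generate(payload)
puts fits_in_event?(payload)
```

Short keys and positional arrays trade readability for size, which is the point of the diet: the consumer needs to know the field order, but each event stays small.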
Whatever! Show me the code already!
This plugin is available as of BOSH Release stable-2980 and should be available in MicroBOSH instances at that version or later.