Monitoring-as-code (MaC)

Last updated: 31 May 2023
Relates to (tags): Observability, Monitoring, Alerting, SRE

At the Home Office we follow the GDS Service Manual guidance on how to monitor the status of services and set performance metrics.

Teams should follow site reliability engineering (SRE) best practices to manage the reliability of their services, using service level indicators (SLIs), service level objectives (SLOs) and error budgets.
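
As a rough illustration of how these three concepts relate, the sketch below computes an SLI as the proportion of good events, compares it with an SLO target, and reports how much of the error budget has been used. The function name, the field names and the example figures are assumptions made for illustration; they are not taken from MaC.

```python
def error_budget(slo_target: float, total_events: int, bad_events: int) -> dict:
    """Summarise SLI attainment and error budget use for one reporting period."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be between 0 and 1 (exclusive)")
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0 <= bad_events <= total_events:
        raise ValueError("bad_events must be between 0 and total_events")

    # SLI: the proportion of events that were good.
    sli = (total_events - bad_events) / total_events
    # Error budget: the number of bad events the SLO tolerates in this period.
    allowed_bad = (1 - slo_target) * total_events
    # Fraction of the budget already spent (above 1.0 means the SLO is missed).
    budget_used = bad_events / allowed_bad

    return {
        "sli": sli,
        "allowed_bad_events": allowed_bad,
        "budget_used": budget_used,
        "slo_met": sli >= slo_target,
    }


# Hypothetical figures: a 99% SLO, 100,000 requests, 250 of them failed.
report = error_budget(slo_target=0.99, total_events=100_000, bad_events=250)
print(report)
```

With these example figures the SLO tolerates roughly 1,000 failed requests, so 250 failures use about a quarter of the budget.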


Solution

The Home Office Monitoring-as-Code (MaC) is a monitoring and alerting framework. It’s a Jsonnet Mixin implementation of Service Level Indicators (SLIs), Service Level Objectives (SLOs) and Error Budgets. It uses Prometheus and Grafana, which are open-source monitoring and alerting systems.

Monitoring-as-code allows platform teams to:

  • create consistent Grafana dashboards and Prometheus rules across the entire service portfolio
  • monitor defined SLO targets
  • measure service reliability
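
The idea of consistent rules across a portfolio can be sketched as one shared template applied to every service. The example below is not MaC's Jsonnet implementation or its API: it is a minimal Python illustration that builds a Prometheus-style rule group (a recording rule and an alerting rule) per service. The metric name `http_requests_total`, its `service` and `code` labels, the rule names and the service list are all assumptions.

```python
import json


def availability_rules(service: str, slo_target: float) -> dict:
    """Build one Prometheus-style rule group for a service from a shared template."""
    # Hypothetical metric and labels; substitute whatever your services expose.
    total = f'sum(rate(http_requests_total{{service="{service}"}}[5m]))'
    errors = f'sum(rate(http_requests_total{{service="{service}",code=~"5.."}}[5m]))'
    return {
        "name": f"{service}-availability-slo",
        "rules": [
            {
                # Recording rule: availability SLI over a 5 minute window.
                "record": "service:availability:ratio_rate5m",
                "expr": f"1 - ({errors} / {total})",
                "labels": {"service": service},
            },
            {
                # Alerting rule: fires when the SLI stays below the SLO target.
                "alert": "AvailabilityBelowSLO",
                "expr": (
                    f'service:availability:ratio_rate5m{{service="{service}"}}'
                    f" < {slo_target}"
                ),
                "for": "10m",
                "labels": {"service": service, "severity": "warning"},
            },
        ],
    }


# Hypothetical portfolio: service name mapped to its availability SLO target.
services = {"checkout": 0.999, "search": 0.99}
rule_file = {"groups": [availability_rules(name, slo) for name, slo in services.items()]}

# Printed as JSON for inspection; Prometheus rule files are normally YAML.
print(json.dumps(rule_file, indent=2))
```

Because every group comes from the same function, adding a service or changing the alert threshold is a single change that applies consistently everywhere.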

MaC is an open-source framework available on GitHub.

Considerations

If you are only charting CPU and memory usage, you won’t see the full benefits of this framework. It is better suited to teams that have identified their SLIs and SLOs ahead of time. MaC provides some sensible defaults, but it is worth reading the accompanying guidance on adopting SRE with Service Level Objectives.