rsionnach/README.md

Hi, I'm Rob 👋

Senior SRE • Shift-Left Reliability • Open Source
Creator & Maintainer of NthLayer


NthLayer: shift-left reliability for platform teams.

NthLayer is an open-source Operations-as-Code engine that generates the entire observability and reliability stack from a single YAML file.

Most reliability decisions happen too late (after deployment, during incidents, in postmortems). NthLayer moves them earlier:

| Problem | NthLayer Solution |
| --- | --- |
| SLOs set in isolation | Validate against dependency chains |
| Alert when budget exhausted | Predict exhaustion with drift detection |
| Missing metrics found in incidents | Enforce before deployment |
| "Is this ready?" = opinion | "Is this ready?" = deterministic CI check |

    pip install nthlayer
    nthlayer check-deploy --service payment-api

→ github.com/rsionnach/nthlayer
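
To make "validate against dependency chains" concrete, here is a minimal sketch of the underlying arithmetic. It is not NthLayer's implementation: the function names are hypothetical, and it assumes every dependency is a hard (serial) dependency with independent failures, so the best achievable availability is the product of the dependencies' availabilities.

```python
from math import prod


def achievable_availability(dependency_slos: list[float]) -> float:
    """Upper bound on availability for a service that needs ALL dependencies.

    Assumption (illustrative only): failures are independent and every
    dependency is on the critical path, so availabilities multiply.
    """
    return prod(dependency_slos)


def validate_slo(target: float, dependency_slos: list[float]) -> bool:
    """Return True if the target does not exceed what the chain can deliver."""
    return target <= achievable_availability(dependency_slos)


# Hypothetical dependency SLOs for a payment service.
deps = [0.999, 0.9995, 0.9999]
ceiling = achievable_availability(deps)

print(f"ceiling: {ceiling:.4%}")
print(validate_slo(0.9999, deps))  # False: target is above the ceiling
print(validate_slo(0.998, deps))   # True: target is below the ceiling
```

A "four nines" target looks reasonable in isolation, but it cannot be met when the chain beneath it tops out at roughly 99.84%.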


💡 The Thesis

Reliability has a timing problem. We've invested heavily in incident response: better alerting, faster recovery, thorough postmortems. But when in a service's lifecycle do we define reliability?

GitHub gave us version control for code. Terraform gave us version control for infrastructure. Security has shift-left. Reliability should too.

I wrote about this: Shift-Left Reliability


🔭 Current Work & Focus

  • Drift detection: Predict SLO exhaustion before it happens
  • Dependency intelligence: Calculate what SLO targets are actually achievable
  • CI/CD gates: Block deploys when error budget is exhausted
  • Metric enforcement: Validate OpenTelemetry conventions before production
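
As an illustration of how drift detection and a CI/CD gate could fit together, the sketch below fits a straight line to error-budget consumption and projects when it reaches 100%. It is a simplified stand-in, not NthLayer's algorithm: `BudgetSample`, `hours_until_exhaustion`, `deploy_allowed`, and the 24-hour runway threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BudgetSample:
    hour: float      # hours since the start of the SLO window
    consumed: float  # fraction of the error budget consumed (0.0 to 1.0)


def hours_until_exhaustion(samples: list[BudgetSample]) -> Optional[float]:
    """Project hours until the budget is fully consumed.

    Uses a least-squares linear fit of consumption over time. Returns None
    when there is too little data or consumption is not increasing.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(s.hour for s in samples) / n
    mean_c = sum(s.consumed for s in samples) / n
    var_t = sum((s.hour - mean_t) ** 2 for s in samples)
    if var_t == 0:
        return None
    slope = sum((s.hour - mean_t) * (s.consumed - mean_c) for s in samples) / var_t
    if slope <= 0:
        return None
    remaining = 1.0 - samples[-1].consumed
    return max(0.0, remaining / slope)


def deploy_allowed(samples: list[BudgetSample], min_runway_hours: float = 24.0) -> bool:
    """Gate: block if the budget is exhausted or projected to run out too soon."""
    if samples and samples[-1].consumed >= 1.0:
        return False
    runway = hours_until_exhaustion(samples)
    return runway is None or runway >= min_runway_hours


slow_burn = [BudgetSample(0, 0.10), BudgetSample(24, 0.20), BudgetSample(48, 0.30)]
fast_burn = [BudgetSample(0, 0.50), BudgetSample(1, 0.60), BudgetSample(2, 0.70)]

print(deploy_allowed(slow_burn))  # True: about a week of runway left
print(deploy_allowed(fast_burn))  # False: about three hours of runway left
```

Blocking on projected runway rather than only on an exhausted budget is a design choice made for this example; a stricter or looser threshold would suit different release cadences.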

📫 Connect

Pinned

  1. nthlayer (Public)

    Generate the complete reliability stack from a service spec in 5 minutes. Dashboards, alerts, SLOs, PagerDuty - zero toil.

    Python · 14 stars · 1 fork