The ownership trio

Knowledge

  • Have a good grasp of the domain and the use cases that the product supports
  • Know the technology the product is built on, including the runtime and infrastructure that serve it, well enough to reason about system behavior
  • Understand the big-picture architecture and how the different pieces of the solution interact with each other (this knowledge comes in handy during triage or root cause analysis)
  • Be able to interpret observability data (logs, metrics, traces) and use it both to get the pulse of the system and to perform incident root cause analysis
  • Understand the docs (e.g. API docs, runbooks, system diagrams, user docs)
  • Have access to customer feedback (e.g. App Store reviews, UX research results)
  • Understand the value of experimentation and how to run experiments to validate a hypothesis
  • Be able to analyze the results of A/B tests, user research, etc.
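
As an illustration of analyzing an A/B test result, here is a minimal sketch of a two-proportion z-test using only the Python standard library. The function name and the sample numbers are hypothetical, and the test assumes large, independent samples; a real experiment platform will likely apply its own statistics.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / n_a: conversions and visitors in variant A (control)
    conv_b / n_b: conversions and visitors in variant B (treatment)
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis "A and B convert equally"
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; rates are all 0% or 100%")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical numbers: 100/1000 conversions in A vs 150/1000 in B
z, p = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference is unlikely to be noise, but whether the change is worth shipping remains a judgment call for the owning team.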

Mandate

  • Have control over defining observability requirements (metrics, logs, traces) as well as access to the data
  • Have control over defining the service level objectives (SLOs) and service level agreements (SLAs)
  • Have control over evolving the architecture as the requirements grow or change
  • Have control over changing the configs, code and infrastructure as code (IaC) without having to go through other teams
  • Have adequate access to the relevant systems in accordance with the principle of least privilege
  • Have control over deciding the tradeoffs based on their impact and consequences (this mandate comes in handy in the heat of the battle, for example when dealing with an incident)
  • Decide how best to react to customer reviews
  • Have the power to approve or disapprove initiatives, including hypotheses and tests (e.g. A/B tests)
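
To make the SLO mandate concrete, the sketch below converts an availability target into the downtime it tolerates over a window. The function name, the 30-day default window and the example targets are assumptions chosen for illustration, not values prescribed by this article.

```python
def downtime_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Minutes of downtime an availability SLO tolerates over the window.

    slo_target is a fraction, e.g. 0.999 for "three nines".
    """
    if not 0.0 < slo_target <= 1.0:
        raise ValueError("slo_target must be in the range (0, 1]")
    window_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * window_minutes


# Hypothetical targets the owning team might weigh against each other
for target in (0.99, 0.999, 0.9999):
    print(f"{target:.2%} -> {downtime_budget_minutes(target):.1f} min per 30 days")
```

Seeing the target as minutes of tolerated downtime makes it easier for the team to decide which level it can realistically commit to.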

Responsibility

  • Be responsible for instrumenting the observability tooling and keeping it up to date and functioning
  • Be responsible for on-call duty as well as addressing incidents
  • Be responsible for taking proper action when the error budget is burned (e.g. blocking deploys until more budget is available)
  • Have access to the right dashboards to diagnose any issues, and be able to update them as needed
  • Have access to the runbooks or any automation in place for troubleshooting, and be able to update them as needed
  • Be the point of contact for supporting the service
  • Be held accountable for customer reviews
  • Be responsible for reacting to the data that comes out of testing hypotheses (e.g. usability tests, user research, A/B tests)
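
The error budget responsibility can be sketched as a simple deploy gate. This is a minimal illustration that assumes a request-based SLO; the class, the function and the numbers are hypothetical, and real tooling would read these counts from the monitoring system.

```python
from dataclasses import dataclass


@dataclass
class ErrorBudget:
    slo_target: float      # e.g. 0.999 means 99.9% of requests must succeed
    total_requests: int    # requests served in the current SLO window
    failed_requests: int   # requests that violated the SLO in that window

    @property
    def allowed_failures(self) -> float:
        """Number of failures the SLO tolerates in this window."""
        return (1.0 - self.slo_target) * self.total_requests

    @property
    def remaining_fraction(self) -> float:
        """Share of the budget still unspent; negative once it is overspent."""
        if self.allowed_failures == 0:
            return 0.0 if self.failed_requests else 1.0
        return 1.0 - self.failed_requests / self.allowed_failures


def deploys_allowed(budget: ErrorBudget) -> bool:
    """Block deploys once the error budget is fully burned."""
    return budget.remaining_fraction > 0.0


healthy = ErrorBudget(slo_target=0.999, total_requests=1_000_000, failed_requests=400)
burned = ErrorBudget(slo_target=0.999, total_requests=1_000_000, failed_requests=1_200)
print(deploys_allowed(healthy), deploys_allowed(burned))
```

The gate itself is trivial; the point is that the team that carries the responsibility also needs the mandate to enforce it.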

Broken ownership trio

Only Mandate

Only Knowledge

Only Responsibility

Mandate + Knowledge but no Responsibility

Mandate + Responsibility but no Knowledge

Knowledge + Responsibility but no Mandate

Conclusion
