We meet virtually on the third Tuesday of every month.

9:30am US Pacific / 12:30pm US Eastern [Add to Google Calendar] [Download .ics]

If a meeting is now in session, you can Join with Zoom

About

This group brings together operations and development engineers from different organizations to discuss challenges and solutions around engineering for reliability. Share successes and failures, learn about best practices, and meet others who are also on a journey to implement a sustainable practice of reliability engineering.

Format

  • Monthly sessions via Zoom – all are welcome!
  • We use a Lean Coffee format where discussion topics are generated by the group

Also, the mailing list is available any time for asynchronous discussion.

Topics

This conversation is what you choose it to be! Participants are welcome to propose any topic, and whatever gets votes will be discussed. Here are some topics from recent discussions:

  • Reliability scanning & recommendation tools
  • Dashboards – useful, useless, somewhere in between?
  • How to transform a traditional operations team into SRE, in a traditional enterprise
  • Alerting philosophy – how many is too many, what medium do you use and why, and how do you scope the audience for different types of alerts?
  • Is platform engineering becoming part of/intersecting with reliability roles/topics?
  • What are the best practices for doing capacity management?
  • How can we properly use postmortem reports and avoid making them a /dev/null bucket for solved issues?

Summaries of recent discussions

Join in!

To be notified of upcoming events, join “reliability-discuss” on Google Groups.