Date and Time: 16 January 2024

Topics discussed:

Oncall Handovers
- ref Chad Todd's talk at SREcon23 - https://www.usenix.org/conference/srecon23americas/presentation/todd
- handoff meeting --> create tasks to improve the system, each with a severity/urgency (e.g. a non-actionable alert)

Test to Production Environments. To have reliable code delivery, does anyone leverage a production alpha environment?
- Option: Preview mode that customers can opt in to, to try new things.
- We use our own CIO / IT organization as Client-0 before we open to clients. For some critical services, we ask any/all employees to hammer the system in a defined window.

Hiring market right now?

What AI tools could we use to improve reliability?
- Idea: start with the data! What do you have already?
- Similar incidents, commands, culprits. Risk assessments of a given release.
- https://www.usenix.org/conference/srecon19emea/presentation/underwood
- Anomaly detection - is this already AI, or simply statistics?
- Story generation: it might be right? At least a good place to start.
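
Sketch for the anomaly detection question: a purely statistical detector can be very small, which is one way to frame the "AI or simply statistics?" point. The example below flags values that deviate from a trailing window by more than a chosen number of standard deviations. It was not presented in the discussion; the function name, window size, threshold, and sample latencies are all hypothetical.

```python
from statistics import mean, stdev


def zscore_anomalies(values, window=5, threshold=3.0):
    """Return indices of values that deviate strongly from their trailing window.

    Hypothetical helper: window and threshold are illustrative defaults,
    not recommendations.
    """
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)  # sample standard deviation of the window
        if sigma == 0:
            # Flat history: treat any change at all as an anomaly.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Made-up latency samples in milliseconds, with one obvious spike.
latencies_ms = [100, 102, 98, 101, 99, 100, 250, 101, 100]
print(zscore_anomalies(latencies_ms))  # → [6]
```

A trailing window keeps the baseline local, but note that a spike inflates the window's standard deviation afterwards, so the points right after it are less likely to be flagged.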