Browsing: Resources
This code demonstrates the implementation of logging in a Python script for AI operations.
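As a rough illustration of what logging in such a script might look like, here is a minimal sketch using only Python's standard `logging` module. The logger name `aiops.demo`, the helper `build_logger`, and the messages are hypothetical and chosen for this example; they are not taken from the linked article.

```python
import io
import logging


def build_logger(name, stream, level=logging.INFO):
    """Return a logger that writes timestamped records to `stream`."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.propagate = False  # keep records out of the root logger's handlers
    if not logger.handlers:  # avoid attaching duplicate handlers on repeat calls
        handler = logging.StreamHandler(stream)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger


# Hypothetical usage: an in-memory stream stands in for a log file or stderr.
buffer = io.StringIO()
log = build_logger("aiops.demo", buffer)
log.info("model scoring started")
log.debug("this record is below INFO and is dropped")
log.warning("latency above threshold: %d ms", 850)
log_output = buffer.getvalue()
```

In a real script you would likely pass `sys.stderr` or use a `logging.FileHandler` instead of the in-memory buffer, which is used here only so the output can be inspected.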
Python can be used to write scripts that collect and aggregate data from various sources, such as log files, metrics, and monitoring tools.
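One way such aggregation could look, sketched with the standard library only: the line layout (`timestamp LEVEL service message`), the sample lines, and the function name `aggregate_by_level` are all assumptions made for this example, not a format used by any particular tool.

```python
import re
from collections import Counter

# Assumed line layout: "<timestamp> <LEVEL> <service> <message>"
LINE_PATTERN = re.compile(
    r"^(?P<ts>\S+) (?P<level>[A-Z]+) (?P<service>\S+) (?P<msg>.*)$"
)


def aggregate_by_level(lines):
    """Count log records per (service, level), skipping unparseable lines."""
    counts = Counter()
    for line in lines:
        match = LINE_PATTERN.match(line.strip())
        if match is None:
            continue  # ignore lines that do not follow the assumed layout
        counts[(match["service"], match["level"])] += 1
    return counts


# Made-up sample input; in practice these lines would be read from log files.
SAMPLE_LINES = [
    "2024-01-01T00:00:00Z INFO api request served",
    "2024-01-01T00:00:01Z ERROR api upstream timeout",
    "2024-01-01T00:00:02Z ERROR db connection refused",
    "not a structured line",
    "2024-01-01T00:00:03Z ERROR api upstream timeout",
]
summary = aggregate_by_level(SAMPLE_LINES)
```

Because the function accepts any iterable of strings, the same code can be fed an open file object or lines gathered from several sources.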
Using a runbook template involves customizing the template to match your organization’s needs, creating a new document, and copying the…
Observability tracing involves instrumenting the code across different services and components of a system to capture and propagate trace data.
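To make "capture and propagate" concrete, here is a hand-rolled sketch of passing trace context between two services through a request header. It follows the general shape of the W3C `traceparent` header (version, trace ID, parent span ID, flags) but is not a tracing library; a production system would more likely use an instrumentation SDK. The functions `new_span` and `inject` and the two-service scenario are hypothetical.

```python
import secrets


def new_span(parent_header=None):
    """Start a span, continuing the caller's trace if a header is given."""
    if parent_header:
        # Header layout: version-trace_id-parent_span_id-flags
        _version, trace_id, parent_span_id, _flags = parent_header.split("-")
    else:
        trace_id = secrets.token_hex(16)  # 32 hex characters
        parent_span_id = None
    span_id = secrets.token_hex(8)  # 16 hex characters
    return {
        "trace_id": trace_id,
        "span_id": span_id,
        "parent_span_id": parent_span_id,
    }


def inject(span, headers):
    """Write the span's context into outgoing request headers."""
    headers["traceparent"] = f"00-{span['trace_id']}-{span['span_id']}-01"
    return headers


# Service A starts a trace and attaches the context to its outgoing request.
root = new_span()
outgoing = inject(root, {})

# Service B reads the header and creates a child span in the same trace.
child = new_span(outgoing.get("traceparent"))
```

The child span shares the root's trace ID and records the root's span ID as its parent, which is what lets a tracing backend stitch the two services into one trace.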
An example of Python code that uses the spaCy NLP library to analyze incoming support tickets and automatically assign them to the appropriate IT teams based on each ticket's content.
Google’s SRE books offer practical insights and strategies that enhance professionals’ knowledge and problem-solving abilities and foster a culture of continuous improvement in site reliability engineering.