Job Description
Experience in Observability, SRE, or DevOps roles, with proven expertise across infrastructure and application-level reliability. Key tools and practices: Dynatrace, ELK, Splunk, and PagerDuty; SLI/SLO frameworks; Azure Kubernetes Service; Terraform.
What you will do
Design and implement observability-as-code solutions using Terraform to deploy monitoring pipelines, dashboards, and alerting strategies across distributed systems.
Drive observability improvements leveraging industry-leading tools (Dynatrace, ELK, Splunk, PagerDuty) to achieve real-time performance insights and comprehensive system visibility.
Instrument applications for end-to-end observability by implementing distributed tracing, metrics collection, and log aggregation across Node.js and .NET microservices and event-driven architectures.
Troubleshoot complex incidents in production environments, diagnosing root causes across multiple service layers, databases, caches, and APIs under load using SLI/SLO frameworks.
...