Site Reliability Engineer

ManTech · Washington, DC, United States

Location: Washington
Job Type: Full-time
Posted: May 27, 2026

Job Description

MANTECH seeks a motivated, career- and customer-oriented **Site Reliability Engineer (SRE)** for a new initiative. This effort supports the rapid design, deployment, operation, and sustainment of enterprise-scale AI, data, and mission platform capabilities across cloud, edge, and classified operational environments.

This role supports operational reliability, scalability, monitoring, and incident response for enterprise AI systems. You will focus on operational outcomes and on optimizing system performance.

**Responsibilities include but are not limited to:**

+ Apply core reliability engineering principles to ensure high availability and stability of production systems.
+ Manage incident response, root cause analysis, and post-mortem processes for the AI platform.
+ Implement and optimize observability operations using OpenTelemetry, Prometheus, Grafana, Loki, or Tempo.
+ Oversee capacity planning, performance optimization, and FinOps practices.
