Site Reliability Engineer

Birmingham, Alabama · Information Technology

Fusion HCR is hiring! Direct Hire: Site Reliability Engineer. This is a hybrid (possibly remote) opportunity working with our client in the industrial sector.

Position Summary
As a Site Reliability Engineer III, you will bridge the gap between software engineering and systems administration to enhance the reliability, scalability, and resilience of our critical cloud platforms. You will focus on driving automation, eliminating manual operational work, and optimizing performance across large-scale, distributed environments. Working closely with development and infrastructure teams, you will balance rapid feature velocity with rock-solid system stability.
Key Responsibilities

Automation & Scalability: Design, build, and maintain automated solutions to improve platform stability, capacity management, and deployment velocity.
Observability & Performance: Leverage monitoring platforms to proactively identify risks, resolve complex performance bottlenecks, and ensure systems meet strict SLOs.
Incident Response & Prevention: Troubleshoot complex infrastructure, application, and network issues; participate in root-cause analysis to minimize downtime and prevent recurrence.
Traffic Management: Detect, investigate, and mitigate malicious or anomalous traffic patterns to secure enterprise applications.
Cloud Architecture: Collaborate with engineering teams on microservices architecture, platform administration, and cloud modernization initiatives.

Required Qualifications

Experience: 5+ years in Site Reliability Engineering (SRE), DevOps, or Systems Engineering supporting enterprise-scale production environments.
Cloud & Containers: Strong, hands-on experience with Google Cloud Platform (GCP) and Kubernetes (containerization, cluster management, elastic scaling).
Infrastructure as Code & CI/CD: Proven experience with Terraform and with Azure DevOps or a similar CI/CD platform.
Observability Stack: Experience with monitoring and diagnostics tools like Dynatrace, Prometheus, and Grafana.
Technical Foundations: Deep knowledge of Linux/Windows administration, networking (HTTP, proxies), Java-based applications, and microservices architecture.

Must be authorized to work in the U.S. without current or future visa sponsorship. Education in Computer Science, IT, or a related field preferred.
