Join the Peraton Team as a Site Reliability Engineer (SRE3) and Help Secure Mission-Critical Systems!
We are seeking a highly experienced Site Reliability Engineer (SRE) to support large-scale, highly distributed systems in a mission-critical environment.
This role requires a strong blend of software development and system administration expertise, with a focus on designing and implementing sustainable automation solutions that improve reliability, efficiency, and operational consistency.
The ideal candidate will leverage extensive experience managing large systems to develop tools that:
- Reduce risk to production environments
- Minimize human error
- Eliminate labor-intensive and repetitive manual processes
- Improve adherence to operational procedures
- Serve as a force multiplier for monitoring and system administration teams
Automation solutions may include configuration management tools (e.g., Salt, Puppet), custom-developed GUIs for shift operations, or fully automated cluster-level solutions. The goal is to deliver sustainable tools that are at least as reliable as the manual processes they replace.
Key Responsibilities
- Design and implement automation solutions for large-scale distributed systems
- Develop software tools to support monitoring and system administration teams
- Provide technical direction for development, integration, and testing of hardware/software systems
- Manage and monitor large cloud-based environments
- Conduct postmortem analysis and support incident management processes
- Improve operational processes and system health visibility
- Support distributed, massively parallel data environments
Peraton offers enhanced benefits to employees supporting our critical National Security programs, including:
- Heavily subsidized medical, dental, and vision coverage for employees and their dependents
- Eligibility to participate in a competitive bonus plan
- Generous PTO plan