As a lead site reliability and DevOps engineer on the team, you’ll be responsible for the design, deployment, and maintenance of production-scale systems. You’ll support services before they go live through activities including system design consulting, developing software platforms and frameworks, capacity planning, and launch reviews. You’ll continuously measure, monitor, and maximize the availability, efficiency, and performance of live services, applications, and systems. You’ll lead efforts to incorporate open source tools, automation, and cloud resources to cut down on tedious tasks and free up the team’s developers to do what they do best: innovate. You’ll implement continuous integration and delivery to limit manual testing and troubleshooting. This is an opportunity to broaden your skill set into areas like site reliability engineering while leading software development efforts that help the organization deliver comprehensive, best-value, commercial-class products and services. As a technical leader, you’ll identify new opportunities to build software solutions that help your customers meet their toughest challenges. Join our team as we build tools to transform the future for our clients.
Nice If You Have:
- 5+ years of experience managing infrastructure and its configuration with tools including Ansible, AWS CloudFormation, or HashiCorp Terraform
- 3+ years of experience as a system administrator or DevOps engineer in Amazon Web Services (AWS)
- 3+ years of experience with Linux system administration or systems engineering
- 3+ years of experience with scripting in Bash, Python, or Go
- 2+ years of experience with process orchestration using tools including GitLab CI or Jenkins, and with managing container orchestration platforms including Kubernetes, OpenShift, or Docker Swarm
- 2+ years of experience building or managing software applications throughout their software development life cycles
- 2+ years of experience with virtualization technology, including VirtualBox and Docker
- 1+ years of experience with declarative system management tools, including Ansible, Salt, or Terraform
- Ability to obtain a security clearance
- BA or BS degree
- Experience with Linux servers, including installing and maintaining applications, troubleshooting, reviewing logs, and patching
- Experience with developing software applications using frameworks, including Spring or Django
- Experience with managing databases or search engines, including Postgres, MySQL, Oracle, Cassandra, or Elasticsearch
- Ability to research, implement, and document complex troubleshooting steps to resolve SRE issues
Applicants selected will be subject to a security investigation and may need to meet eligibility requirements for access to classified information.
We’re an EOE that empowers our people—no matter their race, color, religion, sex, gender identity, sexual orientation, national origin, disability, veteran status, or other protected characteristic—to fearlessly drive change.