DevOps/Site Reliability Engineering FULL TIME (not agency) position
Day-to-day activity requirements:
- Structure and maintain the software configuration management system (Jenkins)
- Automate and maintain the software build process
- Automate software deployment and monitoring (Jenkins, SaltStack, Ansible, CloudWatch, DataDog, Linux/Windows systems)
- Automate software testing at multiple levels (component, configuration item, subsystem, system) and monitor results
- Monitor site stability and performance and troubleshoot site issues
- Scale infrastructure to meet rapidly increasing demand
- Collaborate with developers to bring new features and services into QA, Staging and Production
- Provide support to development teams that use the automated infrastructure
- Develop and improve operational practices and procedures
Experience and skill requirements:
- At least 3 years of demonstrated experience building and operating private and public cloud infrastructure environments (e.g., AWS, VMware)
- Experience in 24x7 production operations, preferably supporting a highly available environment for a SaaS/cloud service provider, as well as on-premises environments
- Release automation (e.g., Jenkins), system administration, system configuration, and system debugging experience
- Experience using scripting languages (Python, Bash, etc.), configuration management tools (Git, Chef, SaltStack, etc.) and command execution frameworks
- Deep understanding of AWS provisioning, monitoring and management
- Experience working with multi-geo teams
- Experience managing Windows and Linux environments
- Good verbal communication skills
- Knowledge of workflow tools (e.g., Atlassian)
Job Type: Full-time
Experience:
- AWS: 3 years (Required)
- CI/CD: 3 years (Required)