The Operations Engineer will support the monitoring and infrastructure tools platforms, including the hardware, software and network technologies that comprise these systems. Frequently collaborate with internal, vendor and contractor partners to develop and implement detailed design, configuration and engineering strategies and solutions that resolve issues/incidents and meet business needs/requirements while remaining focused on security, uptime and performance. Troubleshoot and resolve routine and semi-complex problems.
Troubleshooting & Incident Management
- Perform moderately difficult and independent assignments in the troubleshooting, problem diagnosis, problem resolution and ongoing production support for one or more technologies within the technical area of expertise.
- Design, review, approve and deploy robust, stable and manageable solutions while minimizing hardware/software/network downtime.
- Periodically assist in the procurement, configuration, and integration of new technologies.
Proactive Monitoring & Preventative Maintenance
- Ensure that uptime and response-time SLAs/OLAs for services are met or exceeded.
- Proactively monitor the stability and performance of various technologies within area of expertise and take appropriate corrective action before an incident or problem occurs.
- Ensure patching and regular maintenance are performed as required.
- Actively collaborate with fellow members of the team and contractors/vendors on bridge calls to prevent or resolve incidents/problems in an expeditious manner.
- Recommend, deploy and document strategies and solutions for software/hardware/network engineering problems/incidents based upon comprehensive and thoughtful analysis of business goals, objectives, requirements and existing technologies.
- Independently identify key issues, patterns and deviations during the analysis.
- Recommend robust solutions that comprehensively meet the needs of the business, drawing on pragmatic judgement, creativity, and in-depth technical knowledge and evaluation.
Leadership & Partnerships
- Manage effective relationships and work in partnership with leadership, team members, vendors, and contractors to deliver robust technical solutions, ensuring that service level commitments and project timelines are maintained.
Processes, Standards & Best Practices
- Participate in and provide input to the continual refinement of processes, policies and best practices to ensure the highest possible performance and availability of technologies.
- Create, maintain and update documentation, including detailed design documents, diagrams, engineering specifications, build changes, models, troubleshooting and support guides, systems metrics and Standard Operating Procedures, as required to ensure operational excellence.
- Continuously develop specialized knowledge and technical subject matter expertise by remaining apprised of industry trends, the direction of emerging technologies, and their potential value to the business.
Qualifications
- Bachelor’s degree in Computer Science, Engineering or related field; or equivalent work experience.
- 5-7 years of relevant experience.
- 5-7 years of proven engineering expertise within the subject matter domain.
- Ability to work outside normal business hours to provide after-hours or "on-call" support when necessary to resolve high-profile incidents/problems.
- Highly innovative problem solver with strong analytical and customer service abilities.
- Ability to communicate and articulate technical information across various organizational levels.
- High reasoning aptitude and ability to quickly understand complex operating environments.
- 5-7 years of relevant experience, including designing and implementing enterprise monitoring solutions.
- Experience working with various operating systems, middleware platforms and databases.
- Exposure to monitoring cloud services (AWS, Azure, etc.).
- Familiarity with ServiceNow; experience with the Evanios event platform is a plus.
- In-depth knowledge of Dynatrace, SumoLogic, and/or ScienceLogic.
- Strong verbal and written communication skills.