IT Operations Monitoring Analyst #2874

Costco Wholesale - Issaquah, WA

Position Summary

This position is responsible for monitoring and assessing potential issues relating to critical IT applications, systems, and devices. The IT Operations Monitoring Analyst will monitor all incoming alerts using various tools and perform actions based on predefined instructions. This position includes performing 1st and 2nd level troubleshooting, manual alert correlation, and resolving issues as required.

The successful candidate must understand how different types of Costco systems integrate and be able to pinpoint where the actual issue is occurring. The IT Operations Monitoring Analyst must also be able to determine which alerts are false positives, document them, and work with other teams to get these types of alerts suppressed.

Pay based on experience.

Job Duties/Essential Functions

Participates in the ongoing process of investigating, troubleshooting, and providing resolution to technical issues during extended shift hours in a 24x7x365 environment.
Quickly develops a comprehensive understanding of the applications and infrastructure within the eCommerce environment and how they impact employees or members.
Stays informed of production changes to determine their impact on alerts, applications, or device functionality.
Coordinates across teams, collaborating closely with peers to ensure the appropriate focus and sense of urgency are applied to all issues.
Accurately troubleshoots, reproduces, and documents issues and other pertinent information in Incidents.
Contacts Costco locations to troubleshoot power outages and device or application issues; explains technical concepts to non-technical staff in terms they can understand.
Using technical writing skills, creates and maintains Knowledge Base articles.
Prepares and distributes turnover documents.
Handles incident queue and performs various tasks. Determines business impact according to ITIL Incident Management guidelines.
Handles ad hoc requests and completes frequent, required training. Adapts to changing procedures as required to support the business.
Regular and reliable workplace attendance at your assigned location.
Ability to operate vehicles, equipment or machinery: computer, phone, printer, copier, fax.

Non-Essential Functions

Assists in other areas of the department as necessary.
Assists in other areas of the company as necessary.
Ability to operate vehicles, equipment or machinery: computer, phone, printer, copier, fax.

Experience, Skills, Education & Licenses/Certifications

Ability to work a variety of different shifts, including days, nights, weekends, and holidays to support a 24x7x365 environment. Shifts may fluctuate to meet business and staffing needs.
Proficient in troubleshooting and analysis of network, applications, systems, and device issues.
Ability to work independently or collaborate with others in an intense and dynamic work environment.
1+ years’ experience working on an Operations-style team and/or Service Desk responsible for troubleshooting networking devices, server issues, and large-scale business-critical applications, including eCommerce and SAP environments.
Pays attention to detail and maintains high-quality work while handling a large volume of alerts; able to multitask in a fast-paced environment.
1+ years’ experience with ITIL processes, including Incident, Problem, Change, and Knowledge Management; must have experience creating and handling critical incidents.
Experienced in professional written and oral communications, including technical writing, phone etiquette, and customer service skills.
Process-oriented; understands the organizational benefits of processes and the need for compliance. Suggests or implements improvements to team processes and procedures.
Shows initiative and has a strong desire to share knowledge with others.
Experience mentoring and training coworkers on new or changed procedures.
When working on projects, identifies and tracks project issues and dependencies, and ensures follow-through so that appropriate actions are taken to complete the project on time.

Experience triaging issues with servers, web services, networking devices, SAP, Java-based applications, and large eCommerce platforms.
Remains calm in high-stress situations such as handling alert storms and business-critical issues.
Basic understanding of monitoring and alerting for application health, system availability, latency, and performance, as well as end-to-end monitoring.
Working knowledge of various application monitoring, network monitoring, and alerting software.
Degree in Computer Science (or a related technical field) or equivalent relevant work experience.
ITIL Foundations Certification preferred.
Basic knowledge of scripting and markup languages, including JavaScript, Google Apps Script, HTML, and Splunk Search Processing Language (SPL).
Successful internal candidates will have spent one year or more on their current team.

Other Conditions

Management will review the Job Analysis for this position prior to a job offer.

Required Documents

Cover letter