This position leads and manages a diverse team with varying skill sets through a transformation of both process and technology. It is a hands-on technical leadership role that contributes to the success of MISO by managing enterprise monitoring of critical and non-critical system layers through the implementation of various measuring and monitoring systems. The candidate must be extremely comfortable building a Tier IV organization that is undergoing deep transformation, and must possess the confidence to challenge the status quo and the leadership ability to drive and inspire technical and process changes. This position is a strong advocate and catalyst for continual process and technology improvements related to all aspects of monitoring, including network, system, and application performance, capacity, health, availability, etc. This position supports the enterprise IT monitoring capabilities by maintaining, managing, and improving various management tools and reports that seek to proactively identify and escalate issues and problems across team boundaries.
Leads ITOC Engineer Team and delivers results
Mentors direct reports. Develops and retains great talent.
Provides exceptional leadership to team and organization
Design, re-engineer, implement, manage, and develop monitoring tools, such as SolarWinds, Traverse, I3 Precise, TeamQuest, etc., that will be used to support business decisions for monitoring systems' heart rate and capability. SolarWinds expertise is given priority.
Transition from existing legacy monitoring (Traverse) to SolarWinds
Set up and use SolarWinds NPM, SDM, and Atlas to discover and monitor a large IT network for potential problems. Problems could include network performance, power, malware intrusion, server faults, bandwidth capacity, storage capacity, server disk utilization, middleware, and application performance, as well as memory and processor utilization.
Monitor the performance and capacity of network and computer systems using a variety of tools including SolarWinds, Traverse, TeamQuest, and other monitoring tools.
Work with the Network/Infrastructure/Monitoring teams to develop and advocate for standard procedures to respond to fault, power, capacity, or utilization alerts.
Ensure the monitoring systems operate efficiently and are kept at the most current stable version/release using vendor-supplied updates and patches. Perform research and testing to verify the impact of installing all updates. Coordinate vendor support and ensure positive relationships are maintained.
Develop robust performance analysis reporting from various performance reports for internal and external distribution.
Proactively identifies system deficiencies and assists in root cause analysis of system issues to minimize impact and future occurrence. Escalates issues as warranted.
Review performance and capacity data and perform trend analyses to detect present and potential problems.
Assists in designing and establishing standard SLAs and system/application thresholds
Understands system technical architecture and is able to identify the performance implications for different layers of the system based on design discussions or architecture documents.
Perform analysis and maintenance of system data and analysis of opportunities for technical and operational improvements.
Executes initiatives to reduce failures and defects and to improve overall performance.
Utilize industry resources to identify new and innovative techniques and best practices.
Serve as champion for new techniques as appropriate.
Contributes to technical presentations to educate teams on how to improve performance and capacity.
Provides capacity performance information to support technology refresh projects.
Ability to make timely recommendations to effectively solve problems, using independent judgment consistent with standards, practices, policies, procedures, regulations, and/or law.
Ability to work in a team/group setting and collaborate by providing transparency in performance results.
BA/BS in a technology field preferred, or 7+ years of equivalent relevant work experience required
ITIL Foundations Certification (preferred)
ITIL Operational Support & Analysis (OSA) Intermediate Certification or Service Operation (SO) Intermediate Certification (preferred)
Active CCIE Certification
Advanced experience with SolarWinds monitoring software
CCIE and Cisco IOS debugging heavily preferred
Advanced Network Performance Analysis and debugging in complex networks
Expert skill in building and developing application performance monitoring
Experience building network performance platforms and capabilities, to include process and procedure development
Expert Network and System documentation and mapping experience, using Visio
Knowledge of network protocols and routing, and of network, server, and host operating systems
Working knowledge of Networking, Microsoft and Linux Operating Systems, SNMP, WMI, MIBs, and OIDs
Expertise/breadth of knowledge across technologies and various technology layers (application software, SOA, infrastructure, network, etc.) needed to apply the full range of Performance Engineering expertise.
Active Data Guard software
Budgeting and Forecasting
Administration skills on a variety of computer platforms (UNIX, Windows, Linux)
Must be available for network emergencies or Major Incidents 24x7. Some evening and/or weekend work as necessary based upon workload
Understanding of CIP and SSAE 16 regulatory compliance requirements
Appropriate level will be determined based upon experience and knowledge