Site Reliability Engineer

Company: Yelp
Location: San Francisco, CA
Date Posted: June 2, 2015
Source: Yelp
Our Site Reliability Engineers are the primary interface between our developers and our production operations. No matter how many times we get searched, scraped, scanned, spammed, pinged, paged or queried, they gotta keep their cool - and keep the site running smoothly. You'll work in both the dev and systems worlds, instrumenting key parts of core architecture and supporting devs as they try to do the same. We're looking for a true hacker - you'll work as much in bash as Python, and you'll drop into some C now and then. You'll implement monitoring and alerting systems to support site stability and performance. You'll proactively scale our infrastructure to meet ever-increasing demand. You'll make sure that when something goes bump in the night, someone hears it. And you'll play a key role in keeping Yelp fast, available and growing.

Responsibilities

* Work closely with developers in supporting new features and services
* Monitor site stability and performance
* Scale infrastructure to meet demand
* Troubleshoot site issues
* Develop custom tools as necessary
* Document system design and procedures
* Participate in light on-call rotation

Requirements

* Mastery of Linux or Unix
* Command of your favorite modern programming language: Python, Ruby, Java, C++, etc.
* Solid understanding of fundamental technologies like TCP/IP and HTTP
* Knowledge of best practices related to security, performance, and disaster recovery
* Strong scripting skills in the presence of flying darts
* Experience with web server configuration, monitoring, trending, network design, and high availability
* Excellent communication skills
* A sense of humor!

Pluses

* MySQL experience (high availability, scale-out replication)
* Advanced knowledge of network design, management of Cisco network equipment, or BGP
* Experience at a large-scale consumer internet site
* Familiarity with CentOS and Ubuntu distributions

