Senior Site Reliability Engineer

San Francisco Metro Area, CA

Post Date: 07/18/2018 Job ID: JN -072018-3725
Our client, a fast-growing leader in the SaaS space, is in need of a Senior Site Reliability Engineer to oversee, troubleshoot, and support the company's production environment. This is a direct-hire, contract-to-hire, or long-term contract role for the right candidate.

  • Assess the automation tools on an ongoing basis and recommend process improvements when necessary
  • Monitor and instrument standard Internet services, such as DNS and HTTP
  • Lead technical support during an event (outage, etc.)
  • Manage and drive projects around monitoring, automation, and cloud provisioning
  • Effectively communicate issues or problems to other SMEs and engineers and hand off when necessary
  • Coordinate both technical and non-technical staff during outages
  • Participate in rotating on-call duty to support and maintain the performance of the site and the APIs used by third-party services
  • Work with other SREs to solve intricate problems
  • Build relationships with various teams and leaders and act as a leader for junior team members
  • Design and create tools to manage the site
  • Tune systems end to end, optimizing resource utilization as load patterns change
  • Assess scaling requirements and patterns
  • Ensure that backup/restore and disaster recovery capabilities are implemented, tested and maintained

  • 8+ years of industry experience in a similar environment
  • B.A./B.S. degree (required); M.S. degree or equivalent technical training
  • Understanding of shell scripting (such as Bash) and high-level programming languages such as Python and Ruby; Python is preferred
  • Understanding of REST APIs
  • Working knowledge of load balancing technologies, including L7 routing, DNS, and CDN
  • Experience with networking and TCP/IP
  • Proven ability to manage server hardware configuration
  • Experience with managing cloud computing patterns
  • Proven success with configuration management using tools such as Puppet, Chef, or Ansible
  • Experience with AWS services such as EC2, ELB, ElastiCache, DynamoDB, SQS, SNS, RDS, and S3
  • Understanding of container and container management technologies, such as Docker and Kubernetes
  • Experience with defining and documenting technical architecture
  • Familiarity with ITIL-based incident, problem, and change management

Nice to Have

Experience with complex SaaS or production, revenue-critical web service environments
Experience working with application build teams that use Java/J2EE and SQL

About Vivo

Having been in business since 2006, Vivo is a full-service recruiting and consulting company, specializing in mid- to senior-level technology resources. Our brand promise is simple: we get people. We get that our clients don't want to waste time and that our candidates and employees thrive when given honest feedback and an opportunity to grow.

Whether you're onsite at our Pleasanton headquarters or working for one of Vivo's clients (the best brand names out there), our promise to you is unwavering: we will treat you like you are our most important employee.

Do you think you get people, get what they really need, and get how to deliver? We're not perfect, but we're accountable. We're not in 32 countries, but we are in the heart of it all. So, if you are looking for a flexible, fun and high-energy work environment, along with the opportunity to work with some of the world's technology leaders, we can't wait to talk to you.

Vivo: We Get People!