Senior Site Reliability Engineer

We are looking for an experienced Senior Site Reliability Engineer to join our Engineering team. You will work closely with the team to develop software systems and automated solutions for the operational side of the organization. You will also be responsible for monitoring our systems and building alerts for the operational issues they can experience.

What you will do:

  • Design and implement an automation framework for all aspects of the application lifecycle, including build, test, and deployment.
  • Perform operations tasks such as monitoring application performance over time and troubleshooting issues with production applications.
  • Monitor network performance to ensure that the platform can handle increased traffic from new applications or services.
  • Support and maintain configuration management for various applications and systems.
  • Take part in the architecture and development lifecycle when implementing systems.
  • Support the recovery and resiliency strategy and architecture for various applications and systems.
  • Proactively support capacity planning, disaster recovery, and resiliency.
  • Govern support processes and resiliency and automation principles for the larger organization.
  • Create and develop direction and guidance.
  • Build large-scale messaging infrastructure, data replication, auto-scaling, and stream processing.

What we are looking for:

  • 4+ years of proven work experience as a Site Reliability Engineer or in a similar role.
  • Bachelor’s degree in Information Systems, Computer Science, or a related field, or equivalent practical experience.
  • Comfortable with large-scale production systems and technologies such as load balancing, monitoring, distributed systems, and configuration management.
  • Strong coding skills in at least one programming language, and a desire to pick up more.
  • Familiarity with and enthusiasm for software engineering best practices such as testing, continuous integration and continuous delivery.
  • A passion for solving problems using open source software.
  • The ability to thrive in a rapidly evolving, globally distributed environment.
  • Strong security mindset.
  • Detail-oriented, with the ability to catch minor errors that can result in major problems.
  • Strong analytical skills and great interpersonal and communication skills.
  • A growth mindset and a willingness to challenge the status quo to find new solutions and out-of-the-box ideas.