SRE Lead Engineer

  • Permanent
  • £70,000 - £75,000 (GBP)
  • London, England, United Kingdom
    and remote
  • 19/04/2021

Description

We are looking for an SRE Lead Engineer to join a hardworking Agile team of SRE and test automation engineers responsible for the end-to-end (E2E) infrastructure, support, and deployment of code into Lab/Production environments. The team is process-driven and strongly focused on uptime SLAs.

“SRE team is expanding along with knowledge pool and converging on a single entity to support mobile/broadband for different geographies” - Manager - Software Engineering

What you'll do: -

  • Develop a telco-grade PaaS capability for the business.
  • Design, document, and implement a PaaS solution to onboard and integrate vendor-provided or requested applications with customers' telecommunications infrastructure.
  • Be responsible for the engineering and support of production environments, including automation of patches, upgrades, and reliability and performance improvements.
  • Develop assurance, monitoring, and management capabilities for PaaS infrastructure using Zabbix, Prometheus, Grafana, and ELK stack.
  • Lead creation of automated reports for various services and PaaS infrastructure.
  • Own the operational playbook for the PaaS infrastructure and the services running within it.
  • Monitor and manage Linux VMs, containers, and applications.
  • Be on call 1 week in 5, moving to 1 week in 8 once a full team is in place.

 

What you'll bring: -

Essential: -

  • Linux system administration & configuration management, primarily with CentOS and Ubuntu.
  • Experience with automation/orchestration with tools such as Ansible and Terraform.
  • Experience of building and maintaining CI/CD pipelines.
  • Experience working with Git and performing code reviews.

 

Good to have: -

  • Working with Java apps, Rancher, Kubernetes, and Helm.
  • Python
  • Experience deploying and maintaining Hadoop, Airflow, Geode, and related components.
  • Experience building and managing Kafka, Zookeeper, Couchbase, PostgreSQL and Consul clusters.

Skills

DevOps Technical Skills
  • Containerization
  • Continuous Integration / Deployment (CI/CD)

IT Infrastructure Expertise
  • Monitoring Tools & Management

IT Infrastructure Products
  • Amazon AWS
  • CentOS
  • Kubernetes
  • Linux
  • Terraform
  • Ubuntu

Programming Languages & Frameworks
  • Ansible
  • Python

Software Development Tools
  • Bash

Industry Experience

Media & Broadcasting - TV, Music, Movies, Radio, Entertainment