Site Reliability Engineer - INSIDE IR35

  • Contract 180 days
  • £600 - £615 (GBP) / day
  • London, England, United Kingdom
    and remote
  • ASAP

This SRE will help upskill the existing operational team on AWS, Aura and Mapt notification engine platform. The role is based two days per week in the London office and includes taking part in the on-call rota.

Description

What you’ll do:

  • Develop a telco-grade PaaS capability.
  • Design, document, and implement a PaaS solution to onboard and integrate vendor-provided or requested applications with our telecommunications infrastructure.
  • Take part in an on-call rota to action symptoms before they become outages.
  • As a senior SRE, be responsible for the engineering and support of production environments, including automation of patches, upgrades, and reliability and performance improvements.
  • Own the lab facilities for Dev & Test activities of the PaaS.
  • Develop assurance, monitoring, and management capabilities for PaaS infrastructure using Zabbix, Prometheus, Grafana, and ELK stack.
  • Act as a technical escalation point for colleagues within the team.
  • Act as a day-to-day technical point of contact for engineers in other teams.
  • Lead creation of automated reports for various services and PaaS infrastructure.
  • Manage the operational playbook for the PaaS infrastructure and the services running within it.
  • Automate dashboards and reporting for the platform against SLOs, SLAs and KPIs.
  • Support managers with inputs on resourcing as needed.
  • Monitor and manage Linux VMs, containers and applications.
  • Support and manage the lifecycle of various applications and services, including patching, upgrades, updates and troubleshooting.
  • Plan and lead proactive disaster recovery testing.
  • Work with suppliers to onboard their VNFs and CNFs.

What you’ll bring:

  • Experience working with public cloud, OpenStack, VMs and Linux boxes.
  • Strong background in automating the configuration and management of large-scale platforms using Linux, Git and a scripting language such as Python, Go or Bash.
  • Experience in database deployment and management (SQL and NoSQL), e.g. Couchbase, PostgreSQL.
  • Linux system administration & configuration management, primarily with CentOS or Ubuntu.
  • Experience of building and maintaining CI/CD pipelines.
  • Experience of automation and orchestration using tools such as Ansible and Terraform.
  • Knowledge of web servers, e.g. nginx or Apache.

Required behaviours:
  • Act as a role model to set acceptable working standards, ethics and practices.
  • Mentor and develop colleagues in the team.
  • Lead by example for minimising toil and maximising automation.

Skills

  • Business Activities: Escalation Management, Stakeholder Communication
  • DevOps Technical Skills: Containerization, Continuous Integration / Deployment (CI/CD), Environments, PaaS
  • Financial Services Expertise: Monitoring
  • IT Infrastructure Expertise: Automation, Disaster Recovery, Infrastructure Design, Virtualisation
  • IT Infrastructure Products: Amazon AWS, CentOS, Couchbase, ELK Stack, Grafana, Linux, Nginx, OpenStack, PostgreSQL, Prometheus, Ubuntu, Zabbix
  • IT Infrastructure Technologies & Protocols: Technical POC
  • IT Security Expertise: Patch Management, Support
  • Management Consultancy Skills: Performance Improvement
  • Programming Languages & Frameworks: Ansible, Go, Python, SQL
  • Project Management Project Types: IT Upgrades
  • Software Development Tools: Apache, Bash, Terraform

Industry Experience

Media & Broadcasting company - TV, Music, Movies, Radio, Entertainment
IT company