Site Reliability Engineer

  • Permanent
  • £65,000 - £75,000 (GBP)
  • Fleet, England, United Kingdom
    and remote
  • ASAP

Description

My client is looking for a Site Reliability Engineer to join an existing, highly skilled team working in a DevOps environment. You will support production applications, back-office services, cloud services and platform improvements, and act as both advisor and coach to other team members. You will have experience of diagnosing and fault-finding incidents using data insights, and of liaising closely with the delivery teams: reporting progress and gathering data, intelligence and information as requested.

Key Responsibilities and Accountabilities:

  • Responsible for the performance and reliability of the company’s global online platforms. Working within the Technology Team, you will troubleshoot service issues raised through proactive and reactive monitoring, alerts and logging, and handle service requests communicated via Jira, email, Sprint meetings and Stand Ups.
  • Enhance the tech stacks and configurations of existing services to improve site performance and reduce issues through forensic analysis. You will be responsible for availability management, latency, efficiency, change management, monitoring, emergency response and capacity planning.
  • Record data and manage issues with a view to participating in reviews and blameless post-mortems.
  • Explore and deliver on opportunities to automate and script services, environments and toolsets.
  • Liaise closely with the application Developers, Sprint Teams and Development Managers, reporting progress and gathering data, readings and information as requested.
  • Design, implement, calibrate and validate in line with company procedures and processes, alongside routine service, emergency service and product updates as required.
  • Create a bridge between Development and Operations teams by applying an ‘as-a-service’ mindset to system administration, management and build topics. Gain exposure to systems in both staging and production, as well as to all technical teams. Take part in work with software development, support and IT operations, and in on-call duties.
  • Be an advocate for change with an innovative growth mindset, be an engaging, collaborative member of the Technology Team, and actively support your colleagues in Operations and the wider team.

Essential Skills and Experience: 

AWS
Linux – Debian, CentOS, Alpine and Amazon Linux
Terraform
Docker
Git
Jenkins
Nginx
MySQL
ELK/Grafana/Prometheus
Networking 
Security 

Skills

DevOps Technical Skills: Networking, Security
IT Infrastructure Products: AWS CloudFront, Docker, ELK Stack, Grafana, Linux, MySQL, Prometheus
Software Development Tools: Git, Jenkins, Terraform

Industry Experience

IT