My client is looking for a Site Reliability Engineer to join an existing, highly skilled team delivering in a DevOps environment. The role supports production applications, back-office services, cloud services and platform improvements, and involves acting as both advisor and coach to other team members. You will have experience of diagnosing and fault-finding incidents using data insights, and of liaising closely with the delivery teams to report progress and gather data, intelligence and information as requested.
Key Responsibilities and Accountabilities:
- Responsible for the performance and reliability of the company’s global online platforms. Working within the Technology Team, you will troubleshoot service issues raised through proactive and reactive monitoring, alerts and logging, and handle service requests communicated via Jira, email, Sprint meetings and Stand Ups.
- Enhance existing services’ tech stacks and configurations to improve site performance and reduce issues through forensic analysis. Be responsible for availability management, latency, efficiency, change management, monitoring, emergency response and capacity planning.
- Record data and manage issues with a view to participation in reviews and Blameless Post-Mortems.
- Explore and deliver on opportunities to automate and script services, environments and toolsets.
- Liaise closely with the application Developers, Sprint Teams and Development Managers, reporting progress and gathering data, readings and information as requested.
- Design, implement, calibrate and validate in line with company procedures and processes, alongside routine service, emergency service and product updates as required.
- Create a bridge between Development and Operations teams by applying an ‘as-a-service’ mindset to system administration, management and build topics. Gain exposure to systems in both staging and production, and to all technical teams. Take part in work with software development, support and IT operations, and in on-call duties.
- Be an advocate for change with an innovative Growth Mindset, be an engaging, collaborative member of the Technology Team, and actively support your colleagues in Operations and the wider team.
Essential Skills and Experience:
Linux – Debian, CentOS, Alpine and Amazon Linux