You are the operational owner for one or more customer-facing systems and coordinate all operational changes.
You are the expert in AWS cloud technologies and advise teams on available technologies, services and best practices.
You design your systems for maximum robustness and maximize their operational performance.
You facilitate a permanent knowledge exchange by sharing your experience, training others and learning continuously.
Your qualifications
As a Site Reliability Engineer, you aim to solve operational problems with software and have experience in the following areas:
We are looking for candidates with 7.5-10 years of work experience in operating customer-facing services in the AWS cloud environment, having an AWS Solutions Architect Professional certification or equivalent knowledge.
A reference project where you have optimized a system for scalability, performance and reliability.
Knowledge of Site Reliability Engineering concepts, e.g. an SRE Foundation certification.
Azure knowledge is a plus
Software development experience and programming skills in multiple scripting or high-level languages (e.g. Bash, Python).
Building infrastructure as code using Terraform.
Work experience in build and release management, deployment automation and CI/CD pipelines (e.g. Jenkins, Puppet, Ansible).
Experience in software testing and designing test strategies is a plus.
You have designed comprehensive monitoring with tools such as Icinga or Amazon CloudWatch to collect and analyze data for further system optimization.
Experience with distributed systems and complex network architectures: you are familiar with DNS, firewalls, routing and tunnels.
Experience in agile software environments, e.g. Scrum or SAFe.
Linux server administration
ITIL foundation certification is a plus
Language: Fluent English is a must; other languages are a plus.