Key Responsibilities:
· Design, implement, and/or refine Service Management processes (Monitoring, Incident, Problem, Capacity, Change and Release, and Service Level Management).
· Track system health, performance, and reliability through monitoring and observability platforms, and implement proactive alerting mechanisms to detect anomalies and respond swiftly to incidents.
· Act as a point of escalation for complex incidents, collaborating with senior engineers and management to ensure effective resolution.
· Establish and enforce change control and release management processes to ensure smooth and controlled deployment of system changes.
· Conduct post-incident analyses to identify root causes and implement actions to prevent recurrence and improve system resilience.
· Perform regular system testing to identify vulnerabilities and validate disaster recovery plans.
· Partner with development teams to improve services through rigorous testing and release procedures.
· Participate in system design consulting, platform management, and capacity planning.
· Integrate reliability practices into CI/CD pipelines to automate testing, quality assurance, and deployment processes.
· Foster a culture of collaboration between development and operations teams, promoting shared ownership and accountability for system reliability.
· Create sustainable systems and services through automation and uplifts.
· Balance feature development speed and reliability with well-defined service-level objectives.
· Continuously evaluate and enhance system reliability, scalability and performance. Identify areas for improvement and implement solutions to optimize processes and reduce manual toil.
· Define, track, and monitor SLAs/SLOs to measure and improve system reliability.
Required Skills and Qualifications:
· Bachelor’s degree (or equivalent) in computer science or related discipline
· Proven process definition and implementation experience, leveraging ITIL best practices
· ITIL v3 Intermediate or Expert certification at minimum (mandatory)
· Implementation experience with ITSM/ESM tools (e.g., ServiceNow, Remedy, JIRA)