Sr Site Reliability Engineer (2+)
geaerospace | 125 days ago | Bengaluru

Essential Responsibilities:

  • Understand business requirements and collaborate with Product & DevOps teams to implement highly available, scalable, resilient, cost-efficient solutions in Cloud environments.
  • Deploy observability tools (New Relic, Splunk, ELK, and other open-source O11y tools) in our Cloud infrastructure and applications via Terraform, and be the SME for these tools.
  • Create and configure alerts, dashboards, and reports that map to the Golden Signals: Latency, Errors, Traffic, and Saturation.
  • Pioneer the definitions of SLIs, SLOs, and Error Budgets for GE Aerospace Digital Workplace’s products and services, and champion their implementation for large-scale adoption.
  • Perform Root Cause Analysis (RCA) for SLO breaches, alerts, and incidents. Lead the troubleshooting and debugging sessions.
  • Solve problems relating to critical products, applications, and services, and create solutions (automations, runbooks, etc.) to prevent recurrence.
  • Lead the Incident Management and Postmortem processes, and collaborate with the Operations team to develop templates for comms, runbooks, and documents.
  • Consistently share best practices for reliability, resiliency, performance, and improve processes within and across teams.
  • Take a data-driven approach to decisions about capacity needs, Cloud cost optimization, and infrastructure stability.
  • Prioritize reducing MTTx (Mean Time to Recover/Resolve/Repair) for Production incidents to provide a better user experience.
  • Propose new designs and develop solutions to solve complex problems in application resiliency and availability.
  • Be a strong technical mentor for junior team members and help them realize their full potential.

 

Qualifications/Requirements:

  • Bachelor’s degree from a recognized university or college with a minimum of 4 years of professional experience, OR a Diploma with a minimum of 5 years of professional experience, OR a Higher Secondary Certificate with a minimum of 7 years of professional experience.
  • A minimum of 2 years of experience in Production Engineering or Site Reliability Engineering roles.
  • A minimum of 2 years of experience in Cloud environments (e.g., AWS, Azure) is required.
  • A minimum of 2 years of experience in the DevOps and Infrastructure domains.