Site Reliability Engineer (7+)
Equifax | 150 days ago | Trivandrum

What you’ll do

  • Proven experience as a Site Reliability Engineer or Software Engineer with a focus on operations and automation.

  • Expert-level proficiency in a scripting/programming language (e.g., Python, Go).

  • Demonstrated experience in designing and building automation frameworks for infrastructure and application management.

  • Strong understanding of AI/ML concepts and practical experience applying them to operational data (e.g., anomaly detection, predictive analytics).

  • Deep expertise in observability tools (e.g., Looker, Prometheus, Grafana) and using data to drive decisions.

  • Excellent leadership and communication skills, with the ability to mentor team members and collaborate effectively with other engineering teams.

  • Manage system uptime across cloud-native (AWS, GCP) and hybrid architectures.

  • Build infrastructure as code (IaC) patterns that meet security and engineering standards using one or more technologies (Terraform, scripting with cloud CLI, and programming with cloud SDK).

  • Build automated tooling that deploys service requests and pushes changes into production. Build comprehensive, detailed runbooks to detect, remediate, and restore services.

  • Triage and solve problems across complex distributed architecture service maps. Serve on call for high-severity application incidents and improve runbooks to reduce MTTR.

  • Lead blameless availability postmortems and own the action items that prevent recurrences.


What experience you need

  • BS degree in Computer Science or a related technical field involving coding (e.g., physics or mathematics), or equivalent job experience

  • 7-10 years of experience in software engineering, systems administration, database administration, and networking.

  • 4+ years of experience developing and/or administering software in the public cloud

  • Experience monitoring infrastructure and application uptime and availability to ensure functional and performance objectives are met

  • Experience in languages such as Python, Bash, Java, Go, JavaScript, and/or Node.js

  • Demonstrable cross-functional knowledge of systems, storage, networking, security, and databases

  • System administration skills, including automation and orchestration of Linux/Windows using Terraform, Chef, Ansible and/or containers (Docker, Kubernetes, etc.)

  • Proficiency with continuous integration and continuous delivery tooling and practices

  • Cloud certification strongly preferred

