Site Reliability Engineer (NM+)
Netradyne | 224 days ago | Bangalore

Role and Responsibilities:

  • Participate in an on-call rotation for incident response and implement proactive measures to prevent incidents.
  • Develop monitoring alerts and incident response processes to ensure high availability and reliability.
  • Document actions taken during incidents and create automated solutions to improve incident response.
  • Collaborate with the engineering team as an expert in reliability, performance, and efficiency to support ongoing projects.
  • Consistently deliver high-quality managed services, ensuring optimal uptime and scalability of infrastructure, applications, and cloud services.
  • Automate the detection and resolution of recurring issues to enhance system stability.
  • Build tools and automation frameworks to eliminate repetitive tasks and prevent incidents.
  • Continuously improve engineering and operational processes and team practices to enhance efficiency and productivity.
  • Demonstrate strong programming skills and a deep understanding of systems to support the reliability and scalability of services.
  • Foster a culture of continuous improvement by promoting process changes and best practices.
  • Engage in continuous learning to expand skills through experimentation or training.

Soft Skills:

  • Ability to work asynchronously and independently.
  • Strong collaboration skills and willingness to work as part of a team.
  • Excellent problem-solving skills with the ability to think clearly under pressure.
  • Strong analytical and management skills.
  • Effective communication and documentation skills.

Qualifications:

  • Bachelor's or graduate degree in Computer Engineering, Computer Science, Engineering, or Information Systems Management, or equivalent experience.
  • Experience with Monitoring/Observability/Log tools such as AWS CloudWatch, Datadog, Prometheus/Grafana, and ELK.
  • Proficiency with public cloud platforms, Linux/UNIX environments, and programming languages such as Java, Python, or Go.
  • Familiarity with Agile methodologies, SaaS environments, RDBMS, NoSQL databases, Cloud Architecture, and Frontend/Backend Systems and tools.
  • Comfortable with scripting and debugging production systems and services.
  • Strong collaboration skills with a mindset for continuous improvement.
  • Expertise in scalability and in conducting root cause analysis.