Senior Site Reliability Engineer (4+)
Okta | 171 days ago | Bengaluru

As a Senior Site Reliability Engineer at Spera, you will play a critical role in building, maintaining and scaling our platform. Your expertise and contributions will directly impact the success and effectiveness of our product. Specifically, your responsibilities will include:
 

  • Developing, operating, and maintaining critical infrastructure (EKS, ECS, Airflow, VPCs, Snowflake, MongoDB, etc.).
  • Development and full ownership of our IaC (infrastructure as code) framework.
  • Building, maintaining, and managing Docker images and repositories, including the CI/CD pipeline and deployment processes
  • Integration with 3rd-party tools and other infrastructure in the Okta WIC environment (e.g. observability)
  • Evangelizing security best practices, leading initiatives to strengthen our security posture for critical infrastructure, and managing security & compliance requirements.
  • Developing and maintaining technical documentation, runbooks, and procedures
  • Triaging and troubleshooting complex production issues to ensure reliability and performance
  • Identifying and automating manual processes
  • Promoting and applying best practices for building scalable and reliable services across engineering
  • Supporting a 24x7 online environment as part of an on-call rotation: managing production incidents and determining how to prevent them from recurring
     

What you’ll bring to the role
 

  • 4+ years of experience as a site reliability or platform engineer, preferably in a fast-scaling environment.
  • Proven hands-on experience with Docker and Kubernetes in production
  • Experience with the deployment of production workloads on public cloud infrastructure (AWS and GCP)
  • Strong expertise in configuration management using IaC (infrastructure as code) tools such as Terraform and Helm
  • Proficiency in ETL processes and the ability to run data pipelines efficiently and securely, including experience with orchestration tools such as Apache Airflow.
  • Experience in network engineering and security practices in AWS.
  • Experience managing CI/CD infrastructures, with a strong proficiency in platforms like GitHub Actions to streamline deployment pipelines and ensure efficient software delivery.
  • Knowledge of observability tools such as Grafana, Prometheus, and Splunk, as well as their implementation
  • Strong proficiency in Python for backend systems, with the ability to develop and maintain robust, scalable, and efficient software components that underpin the reliability and performance of the infrastructure.
  • Excellent problem-solving skills and a detail-oriented mindset.
  • Strong communication and collaboration abilities to work effectively within a team.
  • A passion for encouraging the development of engineering peers and leading by example.