Site Reliability Engineer (10+ years)
Gemraj Technologies Ltd | 184 days ago | India

Must-have experience:

  • Candidates should have moved from DevOps to MLOps.
  • Candidates working on GenAI with strong MLOps experience would also be a fit, but must have prior DevOps experience.
  • Candidates with experience in ETL data pipelines and MLOps would also fit the role.
  • Strong Python knowledge is a must for this role, and the candidate should be an individual contributor.

About the role:

Turing is looking for people to join us in building ML platforms for our Fortune 500 customers. You will be a key member of the Turing GenAI delivery organization, heading a team of Turing engineers with different skill sets.

Required skills

  • 10+ years of professional experience building applications with cloud services, including prior experience building machine learning platforms on cloud services.
  • Cloud expertise: Deep knowledge of cloud platforms like AWS, Google Cloud Platform, or Azure, including their machine learning and data services (Azure preferred).
  • DevOps skills: Experience with CI/CD pipelines, infrastructure as code, and containerization technologies like Docker and Kubernetes.
  • Machine learning knowledge: Understanding of ML workflows, model training, and deployment processes.
  • Data engineering: Familiarity with data pipelines, ETL processes, and data storage solutions.
  • Software engineering: Strong programming skills, particularly in languages commonly used in ML like Python.
  • System design: Ability to architect scalable, reliable systems that integrate various services.
  • Automation: Expertise in automating workflows and processes across the ML lifecycle.
  • Security and compliance: Knowledge of best practices for securing ML pipelines and ensuring regulatory compliance.
  • Monitoring and logging: Experience setting up monitoring and logging for ML systems.
  • Collaboration: Ability to work with data scientists, software engineers, and other stakeholders.

Roles & responsibilities

  • Evaluate and select appropriate cloud services for each stage of the ML lifecycle
  • Design and implement the overall architecture of the MLOps platform
  • Set up automated pipelines for data preparation, model training, and deployment
  • Implement version control for code, data, and models
  • Ensure the platform is scalable, secure, and compliant with relevant regulations
  • Provide tools and interfaces for data scientists to easily leverage the platform
  • Continuously optimize the platform for performance and cost-efficiency

This role is crucial in bridging the gap between data science and operations, enabling organizations to efficiently develop, deploy, and maintain machine learning models at scale.