Work you’ll do
- Ensure user-visible uptime and quality by providing operational and development expertise so that our systems fail rarely and are fast to fix when they do
- Participate in architecture and design reviews and recommend improvements to development teams that raise the reliability and performance of applications
- Minimize manual involvement by designing and implementing continuous improvements to the operating environment, including new tools, dynamic monitoring, alerting, and automated self-healing and recovery
- Identify and analyze problems relating to mission-critical services and implement automation to prevent their recurrence, with the goal of automating the response to all non-exceptional service conditions
- Engage in application performance analysis, system tuning, and capacity planning
- Perform root cause analysis to identify and implement continuous improvements
- Present analyses and recommendations to leadership, and discuss the technical merits of solutions with engineers and architects
- Participate in database architecture reviews, business continuity planning (BCP), and data center migration architecture and reviews, and build automations using PowerShell, Python, and T-SQL
- Build out and improve the reliability and performance of cloud applications and cloud infrastructure deployed on AWS/GCP/Azure
- Set up and improve cloud-native application performance monitoring and log monitoring with tools such as Splunk and Dynatrace
- Own the day-to-day health, uptime, monitoring, and reliability of services and server infrastructure
The team
The Cloud Engineering team will design and implement continuous improvements in the management, design, and functionality of our operational environments to achieve speed and reliability, enabling business agility and user satisfaction. You will be part of our technology organization and have a great opportunity to collaborate with various parts of Deloitte, including our development teams and other stakeholders, to drive reliability upstream in the application lifecycle and across our operational environments.
Qualifications
Required:
- B.E., B.Tech., M.C.A., or M.Sc. with 6-10 years of relevant work experience.
- Good working knowledge of Microsoft Windows Server operating systems, Red Hat Linux, and VMware
- Scripting skills in one or more of PowerShell, Bash/Shell, JavaScript, and Python
- Exposure to Azure SQL Database and Azure SQL Managed Instance, SQL administration, and Oracle DB administration
- Ability to understand DevOps CI/CD pipelines and troubleshoot database-related issues.
- Advanced cloud administration knowledge across Azure, AWS, and GCP
- Hands-on experience with Azure Storage, Azure Active Directory, and Azure Service Bus, including creating and managing Azure AD tenants and configuring application integration with Azure AD.
- Experience with infrastructure management tasks and project SDLC setups, including VIP creation, DNS record management, and firewall configuration.
- Familiarity with infrastructure-as-code tools (e.g., Chef, Terraform, CloudFormation) for automated provisioning and management.
- Good understanding of SRE principles and of DevOps principles and practices for continuous integration and deployment
- Knowledge of AI/ML technologies
- Experience designing and maintaining robust CI/CD pipelines, and with orchestration, automation, and configuration management tools such as Git and Ansible (or Puppet, Chef, Terraform, Helm, or related technology).
- Experience implementing scalable and robust cloud API solutions using PaaS services such as Azure APIM (mandatory), Application Gateway, Web Apps and Function Apps, Service Bus queues and topics, Key Vault, Azure Identity Management, Event Hubs, and others