You’ll make an impact by:
· You will play a crucial role in ensuring the availability, reliability, scalability, and security of our software applications and infrastructure.
· You will work closely with functional teams to streamline and automate infrastructure operational tasks and processes, applying your expertise in the AWS platform along with standard DevOps processes.
· Provide support and incident response as part of an Operations Center in support of a Cloud Operations team.
· Support incident management processes for cloud services.
· Configure alerts for, and monitor, Siemens cloud services and data centers.
· Configure and maintain the cloud-based alert systems, and document the handling procedures.
· Monitor and maintain system health, and fix issues as they arise, using monitoring tools such as Prometheus, Grafana, Zabbix, or AWS CloudWatch.
· Monitor AWS resources, including EC2 instances, S3 buckets, RDS databases, EKS clusters, Kubernetes pods, and more.
· Provide guidance on managing EKS clusters, including node groups, networking, and security.
· Fix Kubernetes-related issues, such as pod scheduling, networking, and configuration.
· Assist customers with deploying, scaling, and maintaining containerized applications.
Use your skills to move the world forward!
· Design, implement, and maintain CI/CD pipelines using tools such as AWS CodePipeline, Jenkins, and GitLab CI/CD to automate software builds, testing, and deployments.
· Perform routine maintenance tasks (tuning, backup and recovery, and monitoring).
· Develop and maintain scripts (e.g., Bash, Python, PowerShell) for automation and routine tasks.
· Implement backup and disaster recovery solutions for critical systems.
· Evaluate active tickets, prioritize workload, and monitor queue health.
· Handle complex severity incidents according to the Service Level Agreement (SLA).
· Resolve escalated technical application and infrastructure incidents end to end per SLA.
· Identify issues and incidents, and contact the respective partner points of contact in a timely manner for resolution.
· Leverage internal technical expertise and other internal tools to provide effective solutions to identified issues.
· Ensure services run effectively and in compliance with predefined parameters, mitigating risks and minimizing impact on business activities.
· Be flexible with schedule changes to meet business needs.