Site Reliability Engineer (SRE) (6+ years)
Siemens | 99 days ago | Bangalore

You’ll make a difference by:

You’d describe yourself as:

    • The SRE L2 Support role focuses on maintaining and improving the reliability, availability, and performance of AWS-based infrastructure and applications. It involves monitoring, incident management, troubleshooting, and proactive support of cloud services to ensure seamless operations and scalability.
    • Having experience in incident management and resolution by actively monitoring infrastructure and application health in AWS.
    • Handling and resolving L2 incidents related to AWS services (EC2, RDS, S3, Lambda, EKS, etc.) with root cause analysis.
    • Providing timely communication to customers during outages and SLA breaches.
    • Setting up and fine-tuning AWS monitoring and observability tools to detect issues early.
    • Creating alarms, dashboards, and reports in CloudWatch for compute, storage, and networking services.
    • Using AWS Health Dashboard to proactively identify service disruptions.
    • Managing and analyzing logs using tools like AWS CloudWatch Logs, CloudTrail, and third-party solutions (e.g., ELK Stack, Datadog, Splunk).
    • Identifying anomalies and trends to detect and prevent recurring issues.
    • Tackling and resolving issues related to EC2 instances, Auto Scaling groups, and load balancers (ELB/ALB/NLB).
    • Supervising server health, resource utilization, and performance bottlenecks.
    • Supporting containerized workloads running on Amazon ECS and EKS.
    • Debugging Kubernetes pods, clusters, and container runtime issues.
    • Resolving issues with Amazon S3, EBS, and EFS, ensuring data integrity and access permissions.
    • Monitoring RDS (PostgreSQL, Aurora) performance, replication, and scaling.
    • Debugging AWS Transit Gateway, VPN, and Direct Connect connectivity problems.
    • Ensuring proper IAM policies and roles for secure access management.
    • Supporting maintenance activities such as patching EC2 instances, upgrading container runtimes, and managing system updates.
    • Participating in the automation of repetitive tasks using scripts.
    • Contributing to incident recovery processes and post-mortems to prevent recurrence.
    • Providing support for failed deployments and ensuring quick recovery.
    • Monitoring AWS Backup jobs and ensuring regular backups for critical infrastructure.
    • Validating DR (Disaster Recovery) plans and participating in recovery testing exercises.
    • Creating and maintaining operational runbooks, SOPs (Standard Operating Procedures), and knowledge base articles for common AWS issues.
    • Experienced professional with 6 to 9 years of relevant experience in SRE, DevOps, or Cloud Infrastructure Support, with strong hands-on expertise in AWS services.
    • Having hands-on experience with monitoring tools (e.g., Prometheus, Datadog).
    • Possessing knowledge of Linux/Unix operating systems and basic scripting skills (Python, GitLab actions).
    • Having experience working with cloud platforms (AWS, Azure, or GCP).
    • Familiarity with container orchestration (Kubernetes, Docker, Helm charts) and CI/CD pipelines, including ArgoCD for implementing GitOps workflows and automated deployments for containerized applications.
    • Having experience with Datadog, AWS EC2, Lambda, ECS/EKS, RDS, VPC, Route 53, ELB, S3, EFS, Glacier.
    • Strong analytical skills to resolve production incidents effectively.
    • Basic understanding of networking concepts (DNS, Load Balancers, Firewalls).
    • Good communication and interpersonal skills for incident communication and addressing partner concerns.
    • Experience with alerting systems (e.g., PagerDuty) and incident tracking tools (JIRA, ServiceNow).
    • Being a proactive problem-solver with a sense of urgency.
    • Strong organizational skills to prioritize tasks efficiently.
    • Ability to work effectively in high-pressure environments.
    • Team player with the ability to collaborate across teams and shift ownership as required.