Observability Engineer (5+)

Standard Chartered | 122 days ago | Bangalore

Job Summary

• As an Observability Engineer specializing in the Grafana stack, you will play a critical role in making the internal state of the market's infrastructure and services visible to stakeholders for troubleshooting, performance analysis, capacity planning, and reporting, with a focus on telemetry solutions.
• You will develop platforms and tooling to enable developers and operators to efficiently trace performance problems to their source and map their application performance to business objectives using traces.
• You will assist teams in instrumenting their applications and systems to generate and utilize traces.
• You will engineer the standardization and adoption of observability tools for the Infrastructure departments including Platform, Database, Reliability, and Cloud Operations teams, as well as developer teams.

Key Responsibilities

• Design and build an observability infrastructure for all engineering teams to consume.
• Develop and improve instrumentation for monitoring and logging the health and availability of services.
• Design and develop tools for metric collection, analysis, and reporting.
• Educate and lead efforts to improve observability among all engineering teams.
• Work with teams to enable an effective and pleasant on-call experience.
• Identify and collect the appropriate measurements, and synthesize the correct queries, to show intuitive and insightful visualizations which characterize the behavior of complex systems.
• Build a metrics pipeline with end-to-end latency under 5 minutes.
• Integrate logs with time series data for event correlation.
• Help us unlock the power of distributed tracing.
• Proactively monitor systems, networks, and applications to provide input on improving the stability, security, efficiency, and scalability of systems.

Our ideal candidate would have:
• Familiarity with the Grafana tech stack: Loki, Grafana, Tempo, Mimir/Prometheus.
• In-depth experience designing at-scale monitoring and logging for corporate infrastructure services.
• 5+ years of experience working in Monitoring / Observability / SRE / DevOps / Performance Tuning.
• Experience working with cloud infrastructures, particularly Kubernetes and AWS.
• Experience with Git/version control solutions.
• Experience with programming languages, primarily Go, Rust, Java, and Python.
• Experience with CI/CD pipeline tools such as Azure Pipelines and Jenkins.
• Expert-level experience in monitoring and logging technologies, both open source
