Support reliability operations across platforms.
Monitor and maintain SLOs, SLIs, and error budgets.
Participate in incident response and post-incident reviews; contribute to root cause analysis and remediation.
Develop automation for operational tasks, incident response, and compliance.
Maintain and enhance CI/CD pipelines with integrated testing and deployment automation.
Implement observability dashboards and alerts using Datadog, OpenTelemetry, and BigPanda.
Contribute to infrastructure-as-code using Terraform and GitHub Actions.
Support integration and maintenance of the Kong API Gateway and the Snowflake data platform.
Service Management & Compliance
Follow ITIL practices for incident, problem, change, and service request management.
Use ServiceNow for ticketing, reporting, and workflow automation.
Ensure runbook accuracy and DR readiness.
Monitor system performance and cost efficiency.
Support compliance and audit readiness activities.
Collaboration & Knowledge Sharing
Work with engineering and product teams to embed reliability into delivery.
Share technical knowledge through documentation and enablement sessions.
Participate in global SRE initiatives and cross-regional collaboration.
Bachelor’s degree in Computer Science, Engineering, or a related technical field, or equivalent practical experience.