AWS Infrastructure Management:
- Design, implement, and manage complex AWS architectures using a wide range of AWS services.
- Optimize AWS resource utilization and implement cost-saving measures.
- Manage and maintain AWS accounts, IAM policies, and security groups.
Kubernetes Administration:
- Design, deploy, and manage Kubernetes clusters on AWS (EKS).
- Implement and maintain Kubernetes best practices for security, scalability, and performance.
- Troubleshoot complex issues within Kubernetes clusters and applications.
GitOps Implementation:
- Implement and maintain GitOps workflows for infrastructure and application deployments.
- Integrate GitOps practices with CI/CD pipelines and Kubernetes.
- Ensure version control and auditability of infrastructure and application configurations.
AWS Networking and VPN:
- Design and implement complex networking solutions in AWS, including VPCs, subnets, and route tables.
- Set up and manage VPN connections for secure access to AWS resources.
- Implement and maintain AWS Transit Gateway for network connectivity.
Monitoring and Observability:
- Design and implement a comprehensive monitoring stack using tools such as Prometheus, Grafana, and the ELK stack.
- Set up alerting and incident response systems.
- Implement logging solutions and log analysis tools.
Infrastructure as Code (IaC):
- Develop and maintain Infrastructure as Code using AWS CloudFormation and Terraform.
- Implement version control for infrastructure code and manage deployment pipelines.
Security and Compliance:
- Implement AWS and Kubernetes security best practices.
- Conduct regular security audits and implement necessary improvements.
- Ensure compliance with industry standards and regulations.
CI/CD Pipeline:
- Design and implement CI/CD pipelines using AWS services and other tools like Jenkins or GitLab CI.
- Integrate CI/CD pipelines with Kubernetes and GitOps workflows.
Performance Optimization:
- Analyze and optimize AWS resource and Kubernetes cluster performance.
- Implement caching strategies and content delivery optimizations.
Debugging and Troubleshooting:
- Apply strong debugging skills to resolve complex issues across the entire stack.
- Perform root cause analysis and implement long-term solutions.
Documentation and Knowledge Sharing:
- Maintain comprehensive documentation for infrastructure, processes, and best practices.
- Mentor junior team members and share knowledge across the organization.
What we’re looking for...
You will need to have:
- Bachelor's degree or four or more years of work experience.
- Experience in DevOps or Site Reliability Engineering roles, with a strong focus on AWS and Kubernetes.
- Proven development background with excellent debugging skills.
- AWS Certified Solutions Architect - Professional and AWS Certified DevOps Engineer - Professional certifications.
- Knowledge of core AWS services, including EC2, VPC, S3, RDS, DynamoDB, Lambda, EKS, and IAM.
- Experience with Kubernetes administration, including cluster management, security, and troubleshooting.
- Proficiency in GitOps practices and tools (e.g., Flux, ArgoCD, Fleet).
- Extensive experience with AWS networking, including VPC design, VPN setup, and Transit Gateway.
- Strong skills in implementing and managing monitoring solutions (e.g., Prometheus, Grafana, ELK stack).
- Proficiency in Infrastructure as Code tools, particularly Terraform and AWS CloudFormation.
- Strong scripting skills in languages such as Python, Go, or Bash.
- Experience with CI/CD tools and practices, particularly within the AWS ecosystem.
Even better if you have one or more of the following:
- Strong communication skills to collaborate with cross-functional teams and explain technical concepts.
- Experience with Agile methodologies and DevOps culture.
- Knowledge of security best practices in cloud and Kubernetes environments.
- Excellent problem-solving skills and ability to debug complex issues across the entire stack.