AI Infrastructure Engineer (8+ years)
Cisco | 144 days ago | Bangalore

As an AI Infrastructure Engineer at Cisco, you will play a pivotal role in shaping the AI systems that enable cutting-edge innovations. Your work will directly impact:

-       The performance and efficiency of AI workloads on the node.

-       The reliability and availability of AI systems for Cisco’s customers.

-       Advancements in AI and machine learning infrastructure, enabling better utilization and improving efficiency for applications across industries.

-       Collaboration across internal teams to bring system-level innovation to different Cisco products.

Your contributions will help Cisco maintain its leadership in AI infrastructure development and influence the broader AI and machine learning community.
 

Key Responsibilities

-       Design and develop node-level infrastructure components to support high-performance AI workloads.

-       Benchmark, analyze, and optimize the performance of AI infrastructure, including CUDA kernels and memory management for GPUs.

-       Minimize downtime through a seamless configuration and upgrade architecture for software components.

-       Manage the installation and deployment of AI infrastructure on Kubernetes clusters, including the use of CRDs and operators.

-       Develop and deploy efficient telemetry collection systems for nodes and hardware components without impacting workload performance.

-       Work with distributed system fundamentals to ensure scalability, resilience, and reliability.

-       Collaborate across teams and time zones to shape the overall direction of AI infrastructure development and achieve shared goals.
 

Minimum Qualifications:

-       Proficiency in programming languages and technologies such as C/C++, Go, Python, or eBPF.

-       Strong understanding of Linux operating systems, including user space and kernel-level components.

-       Experience with Linux user space development, including packaging, logging, telemetry, and process lifecycle management.

-       Strong understanding of Kubernetes (K8s) and related technologies, such as custom resource definitions (CRDs).

-       Strong debugging and problem-solving skills for complex system-level issues.

-       Bachelor’s degree or higher and 8-12 years of relevant engineering work experience.
 

Preferred Qualifications:

-       Linux kernel and device driver hands-on expertise is a plus.

-       Experience in GPU programming and optimization, including CUDA and UCX, is a plus.

-       Experience with high-speed data transfer technologies such as RDMA.

-       Experience with the NVIDIA GPU Operator, NVIDIA Container Toolkit, Nsight, and CUPTI.

-       Familiarity with NVIDIA MIG and MPS concepts for managing GPU consumption.
