NVIDIA Senior Infrastructure Engineer Salary

The base salary range for a Senior Infrastructure Engineer at NVIDIA is $176,000 - $333,500 per year.

Senior Infrastructure Engineer Salary in Santa Clara, CA, USA

The base salary range for a Senior Infrastructure Engineer in Santa Clara, CA, USA is $176,000 - $333,500 per year.

Technical Skills Required to Become a Senior Infrastructure Engineer

To work as a Senior Infrastructure Engineer, you need a degree in a relevant discipline: Artificial Intelligence (AI), Computer Science, Information Architecture, or Information Technology.

Looking for a Senior Infrastructure Engineer in Santa Clara, CA, USA?

NVIDIA is currently hiring a Senior Infrastructure Engineer in Santa Clara, CA, USA, and is looking for candidates with 8-10 years of relevant skills and work experience.

Is NVIDIA hiring Senior Infrastructure Engineers now?

NVIDIA seeks to hire a qualified Senior Infrastructure Engineer with 8-10 years of experience.

Posted on: 14 Feb 2024

Senior Infrastructure System Software Engineer

NVIDIA

Job Type: Full Time
Experience: 8-10 years
Salary: $176,000 - $333,500 / Year
Location: Santa Clara, CA, USA
Job Function: Senior Infrastructure Engineer
Industry: Information Technology

Qualification

Degree in Artificial Intelligence (AI)
Degree in Computer Science
Degree in Information Architecture
Degree in Information Technology

Key Skills

Artificial intelligence, Amazon Elastic Compute Cloud (EC2), AWS, C Programming, C++, Design, High-performance computing (HPC), Infrastructure as code, Kubernetes (K8s), Optimization, Python Programming

Job Description

We are seeking a Senior Infrastructure System Software Engineer with profound expertise in High-Performance Computing (HPC) and AI workload management, as well as Kubernetes-based infrastructure, to join our Omniverse Infrastructure team. The ideal candidate will have a strong understanding of system software design principles and extensive experience in deploying, managing, optimizing, and scaling sophisticated AI and cloud environments using workload management software systems such as SLURM, Flux, PBS Pro, and Kubernetes. Proficiency in integrating HPC/AI workload managers with Cloud resource provisioning APIs (e.g., AWS EC2 with topology-aware configurations) is desirable.

As a key member of the NVIDIA Omniverse™ Cloud team, you will be tasked with designing and developing advanced system software solutions within large AI clusters to efficiently manage and schedule resources for converged HPC/AI and cloud-native workloads. This role requires close collaboration with multi-functional teams to ensure our infrastructure meets the stringent demands of advanced AI workloads, including extreme scalability, elasticity, multi-tenancy, high availability, and the optimization of large-scale applications and workflows.

What you will be doing:

Architect and implement system software within a converged environment that combines HPC/AI workload managers like SLURM with services running on Kubernetes, enhancing resource/process management, scheduling, and resilience of large AI workloads.

Develop long-running system service solutions to accelerate the training of extensive AI models.

Work closely with Omniverse infrastructure teams and customers to fully understand and meet their compute and storage needs, ensuring seamless integration with AI/HPC workload managers and cloud APIs.

Tackle key system software challenges in compute, networking, and storage to enhance the overall performance, efficiency, and resilience of large language model training and other computational tasks.

What we need to see:

8+ years of experience in system software engineering with a focus on developing and enhancing AI/HPC workload management software such as SLURM and the Flux framework.

BS Degree or equivalent experience

Demonstrated ability in developing fault-tolerant, distributed services at scale.

Strong familiarity with HPC workload managers such as SLURM, Flux, and PBS Pro, and their integration with cloud APIs to create large AI/HPC cluster instances within major CSPs, for example, by using AWS EC2 provisioning, reservation, and topology-aware configuration APIs.

Proficiency in Python and C/C++, with a solid background in systems programming, including event-based programming, multi-threading, concurrency, and parallelism.

A deep understanding of cloud technologies, including cloud providers' managed services, distributed computing systems, and microservices architecture.

An advanced degree in Computer Science or a related field, or equivalent professional experience.

Excellent collaborative skills and the ability to work effectively across multi-functional teams and geographies.

Ways to stand out from the crowd:

In-depth, practical knowledge of significantly modifying HPC workload managers such as SLURM and Flux.

Demonstrated experience in maximizing advanced cloud services for AI optimization that use NVIDIA high-end GPUs.

Experience with low-level system provisioning tools such as Bright Cluster Manager (BCM).

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most hard-working and dedicated people in the world working for us. If you're creative and passionate about developing cloud services, we want to hear from you!

The base salary range is 176,000 USD - 333,500 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Company Info.

NVIDIA

NVIDIA’s invention of the GPU sparked the PC gaming market. The company’s pioneering work in accelerated computing—a supercharged form of computing at the intersection of computer graphics, high performance computing and AI—is reshaping trillion-dollar industries, such as transportation, healthcare and manufacturing, and fueling the growth of many others.

Industry: Cloud computing, Video games, Computer software, Semiconductors, Computer hardware, Consumer electronics, Artificial intelligence
No. of Employees: 22,473
Location: 2701 San Tomas Expressway, Santa Clara, CA 95050, USA
Website: https://www.nvidia.com/