Senior AI Networking Software Engineer

Hewlett Packard Enterprise
Job Description

High Performance Computing and Artificial Intelligence are two of the fastest-growing workloads in the industry today and are pushing the leading edge of networking technology forward at a rapid pace. Come join the Slingshot Fabric team, part of HPE's HPC and AI organization, and make an impact on the high-performance fabric business. We are looking for an experienced Networking Software Engineer to help grow HPE's High Performance Ethernet Fabric product across AI/ML use cases and the networking, systems, and applications communities. This includes working directly with open-source communities to optimize and support the latest communication libraries, frameworks, MPI distributions, acceleration middleware, and even applications that are used in the Artificial Intelligence, commercial HPC, and Cloud markets and run on the Slingshot Ethernet fabric. The successful candidate will own the optimization, integration, and support of AI infrastructure ecosystem software that advances the organization’s technical, business, and growth goals.

Responsibilities will include, but are not limited to:

  • Design, implement, and maintain system software that enables communication between CPUs, GPUs, and storage in scale-out AI and HPC systems.
  • Directly own the partner engagements for the leading communication libraries, middleware, and frameworks used in AI development today (NCCL, RCCL, UCX, oneCCL, PyTorch, etc.).
  • Help identify and drive community engagement and internal test and validation for leading AI infrastructure software. Drive and execute a software plan to formalize support for NCCL, RCCL, and other communication libraries. Expand support from libraries to other leading infrastructure software components, including HPE AI development tools, containers, and new frameworks.
  • Help enable multiple MPI distributions with Slingshot fabrics.
  • Develop a repeatable process for testing and maintenance of ecosystem software.
  • Understand the workloads and applications with the goal of improving software and infrastructure efficiency.
  • Work with cross-disciplinary teams to understand business requirements and align software direction to meet those needs. 

Qualifications should include: 

  • Bachelor’s, master’s, or Ph.D. degree in computer science, engineering, or a related field (or equivalent experience)
  • 8+ years of relevant experience
  • Background in networking and communications software
  • Background in deep learning and neural networks
  • Experience with software planning, development, and release processes. Ability to participate in and own pieces of the product release pipeline, up to and including package integration and support.
  • Deep understanding of computer architecture, and familiarity with the fundamentals of GPU architecture. Experience with NVIDIA and AMD GPU infrastructure and software stacks.
  • Experience with PyTorch, TensorFlow, or other AI frameworks
  • Programming and debugging skills in C, C++, and Python
  • Strong collaborative and team skills. A proven ability to operate and influence within a dynamic matrix environment.

Additional Skills:

Artificial Intelligence Technologies, Cross Domain Knowledge, Data Engineering, Data Science, Design Thinking, Development Fundamentals, Full Stack Development, IT Performance, Machine Learning Operations, Scalability Testing, Security-First Mindset

What We Can Offer You:

Health & Wellbeing

We strive to provide our team members and their loved ones with a comprehensive suite of benefits that supports their physical, financial and emotional wellbeing.

Personal & Professional Development

We also invest in your career because the better you are, the better we all are. We have specific programs catered to helping you reach any career goals you have — whether you want to become a knowledge expert in your field or apply your skills to another division.

Diversity, Inclusion & Belonging

We are unconditionally inclusive in the way we work and celebrate individual uniqueness. We know diverse backgrounds are valued and succeed here. We have the flexibility to manage our work and personal needs. We make bold moves, together, and are a force for good.

Let's Stay Connected:

Follow @HPECareers on Instagram to see the latest on people, culture and tech at HPE.

Job:

Engineering

Job Level:

Expert

States with Pay Range Requirement

The expected salary/wage range for a U.S.-based hire filling this position is provided below. Actual offer may vary from this range based upon geographic location, work experience, education/training, and/or skill level. If this is a sales role, then the listed salary range reflects combined base salary and target-level sales compensation pay. If this is a non-sales role, then the listed salary range reflects base salary only. Variable incentives may also be offered. Information about employee benefits offered can be found at https://myhperewards.com/main/new-hire-enrollment.html.

Annual Salary: $128,000.00 - $295,000.00

Company Info.

Hewlett Packard Enterprise

Hewlett Packard Enterprise (HPE) is a prominent American multinational IT company headquartered in Spring, Texas. Established on November 1, 2015, in Palo Alto, California, it emerged from the division of the larger Hewlett-Packard corporation. HPE is distinctly business-oriented, specializing in servers, storage, networking, containerization software, and providing consulting and support services.

  • Industry
    Information Technology
  • No. of Employees
    60,400
  • Location
    Spring, TX, USA

Hewlett Packard Enterprise is currently hiring for network engineer jobs in Bloomington, MN, USA, with a base salary range of $128,000 - $295,000 per year.