Machine Learning Stack Integration Engineer

Advanced Micro Devices, Inc.

Job Description

What you do at AMD changes everything

At AMD, we push the boundaries of what is possible. We believe in changing the world for the better by driving innovation in high-performance computing, graphics, and visualization technologies – building blocks for gaming, immersive platforms, and the data center.

Developing great technology takes more than talent: it takes amazing people who understand collaboration and respect, and who will go the “extra mile” to achieve unthinkable results. It takes people who have the passion and desire to disrupt the status quo, push boundaries, deliver innovation, and change the world. If you have this type of passion, we invite you to take a look at the opportunities available and come join our team.

THE ROLE:

You will work with the Solution Validation & Debug team within the Machine Learning Software Engineering group. As a team member, you will work closely on debugging and triaging machine learning and high-performance computing issues, and contribute to solution validation of the ROCm stack.

THE PERSON:

The ideal candidate brings broad experience dealing with complex software-level issues related to machine learning and high-performance computing.

KEY RESPONSIBILITIES:

  • Debug machine learning and high-performance computing issues on the Radeon Open Compute (ROCm) stack
  • Develop test content for complex machine learning algorithms running on distributed nodes
  • Port high-performance computing applications to ROCm
  • Reproduce field defects and develop appropriate tests to prevent future issues
  • Design, develop, and deploy the testing tools and automation libraries needed to perform testing
  • Lead the adoption of tooling and industry best practices through advocacy and outreach, helping our development communities level up
  • Other duties as assigned

PREFERRED EXPERIENCE:

  • Languages: Python, C, C++, Linux shell scripting
  • Frameworks/Libraries: TensorFlow, PyTorch, ONNXRT
  • Tools: Prior experience with Linux, Docker, and LLVM compilers
  • Desired Skills: Understanding of high-performance computing applications, machine learning, GPU programming, and MPI parallel programming

ACADEMIC CREDENTIALS:

  • Bachelor's Degree or higher in Computer Science or related quantitative field.

LOCATION:

Markham, Ontario, Canada

Company Info.

Advanced Micro Devices, Inc.

Advanced Micro Devices, Inc. (AMD) is an American multinational semiconductor company based in Santa Clara, California, that develops computer processors and related technologies for business and consumer markets. While it initially manufactured its own processors, the company later outsourced its manufacturing, a practice known as going fabless, after GlobalFoundries was spun off in 2009. AMD's main products include microprocessors, motherboard

  • Industry
    Artificial intelligence, Video games, Semiconductors, Computer hardware
  • No. of Employees
    15,500
  • Location
    Santa Clara, CA, USA

Advanced Micro Devices, Inc. is currently hiring for Machine Learning Engineer roles in Markham, ON, Canada, with an average base salary of Can$95,000 - Can$170,000 / year.