Job Description

You are a talented, hands-on engineer who thrives in a fast-paced environment, is self-directed, works well on a team, and knows how to get things done. You have deep knowledge of Python and a strong understanding of modern deep learning, natural language processing, and the inner workings of the transformer architecture. You can translate high-level goals into concrete research and implementation steps, set an approach, follow through, and present results. When it’s time to explain your ideas, you bring clarity to complex technical issues. You use these skills to create real-world benefits for researchers and other practitioners, and you are excited to help advance our effort to create the best-performing open large language model.

Your Next Challenge:

You will be part of the core team of research and machine learning engineers working on the infrastructure, architecture, modeling, and training of OLMo (Open Language Model). In this role you will own the design and implementation of the code that trains the OLMo models. You will be responsible for building scalable machine learning pipelines as we push the boundaries of large language modeling research. You will collaborate with colleagues inside and outside your own team, but you will own each feature or experiment from start to finish, from conception to implementation.

The essential functions include, but are not limited to the following:

  • Building infrastructure to facilitate the next generation of LLM research
  • Optimizing training and inference for language models
  • Prioritizing among candidate experiments and executing on the most impactful ones
  • Supporting and collaborating with an open-source community
  • Bridging the gap between cutting-edge research and a widely adopted product
  • Bringing software engineering best practices to a research environment
  • Releasing your contributions back to the broader community in the form of open source software, model releases, and additions to AI2’s public API and Open Research Corpus

What You’ll Need:

  • Expertise in building ML infrastructure, with 4+ years of industry experience building infrastructure that handles data preprocessing/transformation and model training, evaluation, and deployment
  • Deep experience in the complete model development cycle, including data set construction, training, tuning, evaluation, performance profiling, and monitoring
  • Knowledge of modern deep learning and natural language processing techniques
  • Strong software engineering skills, particularly around building performant systems and debugging
  • At home with hands-on programming – must have experience with Python and PyTorch/JAX/TensorFlow. We expect you to be the kind of engineer who can pick up a new programming language, library, or API as needed without it being a big deal.
  • Familiarity working with cloud compute resources (e.g. AWS) and containerization (e.g. Docker)
  • Strong collaboration and communication skills - our environment is small and collaborative, and we'd like you to thrive while working closely with others

Bonus qualifications:

  • Advanced degree in Data Science, CS, EE, Applied Mathematics, Statistics, ML, NLP, or a related field, and/or equivalent relevant engineering experience
  • Contributions to open-source ML or research libraries (e.g. spaCy, AllenNLP, transformers)
  • Experience successfully operating models at scale in a production setting
  • Experience in HPC settings

Education:

  • BS or MS in Computer Science, Statistics, Engineering, Applied Mathematics, or a related quantitative field

Physical Demands and Work Environment:

The physical demands described here are representative of those that must be met by a team member to successfully perform the essential functions of this position. Reasonable accommodations may be made to enable individuals with disabilities to perform the functions.

  • Must be able to remain in a stationary position for long periods of time.
  • Must be able to communicate information and ideas so others will understand, and to exchange accurate information with others.
  • The ability to observe details at close range.
  • Must be able to work under deadlines.

A Little More About AI2:

The Allen Institute for Artificial Intelligence is a non-profit research institute in Seattle founded by Paul Allen. The core mission of AI2 is to contribute to humanity through high-impact research in artificial intelligence.

In addition to AI2’s core mission, we also aim to contribute to humanity through our treatment of each member of the AI2 Team. Some highlights are:

Company Info.

The Allen Institute for Artificial Intelligence

The Allen Institute for AI (abbreviated AI2) is a research institute founded by the late Microsoft co-founder Paul Allen. The institute seeks to achieve scientific breakthroughs by constructing AI systems with reasoning, learning, and reading capabilities. Oren Etzioni was appointed by Paul Allen in September 2013 to direct research at the institute.

  • Industry: Information Technology
  • No. of Employees: 200
  • Location: Seattle, Washington, USA

The Allen Institute for Artificial Intelligence is currently hiring a Senior Research Engineer in Seattle, WA, USA, with an average base salary of $153,040 - $235,680 per year.
