Intern: RS – Foundation Models for Data Management & Lakehouses

IBM

Job Description

IBM Research Scientists are charting the future of Artificial Intelligence, creating breakthroughs in quantum computing, discovering how blockchain will reshape the enterprise, and much more. Join a team that is dedicated to applying science to some of today’s most complex challenges, whether it’s discovering a new way for doctors to help patients, teaming with environmentalists to clean up our waterways or enabling retailers to personalize customer service.

Your Role and Responsibilities

This is a 2024 summer internship running May – August, or June – September for students at quarter-system schools.

We are broadly interested in making foundation models (FMs) effective for a range of data management tasks, particularly those related to the management of structured data in enterprise data lakes and lakehouses.

Topics of interest include research on effective and efficient tuning techniques, knowledge-driven reasoning, and causality-driven alignment for better control and run-time performance of FMs and their use in enterprise data tasks. Tasks of interest include semantic enrichment of structured data, semantic data management with metadata and knowledge graphs, code generation for data retrieval with transformations, and various data wrangling tasks in the end-to-end data lifecycle in data lakes.

Tuning-related research spans full-space and parameter-efficient tuning techniques with supervised as well as reinforcement learning with reward functions that capture end-use performance. Grounding the generation of tuned models in domain-specific vocabulary, efficient techniques for human-in-the-loop adaptation at inference time, and retrieval augmentation techniques for data management tasks will be of interest.

For knowledge-driven reasoning, formulations and benchmarks that treat the database query-answering process as a knowledge-extraction task will be useful for experimenting with reasoning over database tables at different levels of complexity to improve and expand the reasoning skills of FMs.

For causal alignment, we’re interested in formulations that study and show the causal relationships behind the effectiveness of different prompt optimization methods, where a small set of prompt augmentation tokens improves FMs for issues like delusions, alignment, and transfer.

Required Technical and Professional Expertise

  • Applicants should be PhD or MS students pursuing graduate studies in computer science or a related field
  • At least one research publication, preferably at a top conference in AI or data management
  • Familiarity with the basics of data management and data lakes
  • Familiarity and working experience with large language models (LLMs) or other foundation models

Preferred Technical and Professional Expertise

Candidates should have basic knowledge of one or more of the following areas:

  • Familiarity with ontologies, knowledge graphs, and description logic
  • Familiarity with reinforcement learning, causal graphical models, and prompt optimization

Company Info.

IBM

IBM is a leading cloud platform and cognitive solutions company. Restlessly reinventing since 1911, we are the largest technology and consulting employer in the world, with more than 290,000 employees serving clients in 177 countries. IBM Research provides unparalleled insight into business, industry and society by leveraging advanced computing architectures and methodologies to solve some of the world’s most pressing challenges.

  • Industry
    Information Technology, Computer software, Computer hardware
  • No. of Employees
    292,500
  • Location
    New Orchard Road, Armonk, New York, NY 10504, USA

IBM is currently hiring for Research Scientist Intern positions in Yorktown Heights, NY, USA, with an average base salary of $104,400 - $143,550 per year.