Degree in Artificial Intelligence (AI)
Artificial Intelligence, Computer Vision (CV), Deep Learning, GPUs, Machine Learning techniques, Natural Language Processing (NLP), PyTorch, TensorFlow
Your Role and Responsibilities
Student Researcher in the challenging area of multimodal foundation models, working on tasks at the intersection of the vision, audio, and language modalities. This position will tackle real-world tasks in rich document understanding and in speech understanding and generation. The work focuses on multidisciplinary, multimodal foundation models, including training, adapting, and fine-tuning them for a wide variety of real-world tasks.
If you’re a student interested in machine learning, deep learning, and the intersection of computer vision, speech and audio analysis, and natural language processing, and you’re looking for a place where you will do research with academic and industrial impact, then this position is for you!
Our team develops technologies, models, algorithms, and software that make an impact on IBM products and on the world; we publish papers and issue patents based on the work we do.
Roles and Responsibilities:
The responsibilities involve solving real-world problems using cutting-edge deep learning and machine learning methods, with the aim of advancing the state of the art in document understanding and in speech analysis and generation.
Document understanding is the ability to read documents, understand their structure and content, and extract and act upon that content. This is a crucial technology, as business documents are key to the day-to-day operation of organizations.
Document understanding remains a research challenge that requires a multi-disciplinary perspective, spanning textual analysis, visual comprehension, layout understanding, knowledge representation, data mining and more.
Speech and audio technologies provide the ability to understand as well as generate audio and speech. In particular, speech recognition and synthesis are key components of natural spoken interaction, which is central to customer care in organizations. This also requires a multi-disciplinary perspective, spanning conversational and generative AI and modeling for speech, language, and audio.
The areas we are looking at also include multimodal and foundation models, image and audio understanding, data synthesis, expressive speech synthesis, and tokenization.
To achieve these goals, you will collaborate with fellow team members and have access to nearly limitless GPU compute power. The topics include novel self-supervised learning techniques, realistic data synthesis, multimodal research, and more.
During your time at IBM, you will have the opportunity to publish your work in top AI conferences and to develop a prototype demonstrating new AI functionality.
Succeeding in these tasks is expected to make an important impact on the research community in these exciting fields and lead to strong publications in leading AI venues (e.g., CVPR, ICLR, ICCV, ICASSP, Interspeech).
Location: (both are possible)
Haifa Research Lab (on the University of Haifa campus)
IBM site in Hashahar Tower, Givatayim (near the Tel Aviv Arlozorov train station)
Required Technical and Professional Expertise
IBM is a leading cloud platform and cognitive solutions company. Restlessly reinventing since 1911, we are the largest technology and consulting employer in the world, with more than 290,000 employees serving clients in 177 countries. IBM Research provides unparalleled insight into business, industry and society by leveraging advanced computing architectures and methodologies to solve some of the world’s most pressing challenges.