Data Science CV-NLP, Summer Intern

Ancestry
Job Description

Ancestry is looking for an exceptional, passionate, and highly motivated Data Science CV-NLP Intern to join our Data Science Computer Vision & Natural Language Processing team this summer. The team develops CV and NLP models that extract and organize text and image information from billions of historical and genealogical records, and combines them to help customers discover and connect with their family history. As an intern on the Data Science CV-NLP team, you will build and train models that promote product development, customer success, and content creation across our Family History business. You will also work closely with engineering teams to train, optimize, and deploy models.

  • Implement state-of-the-art computer vision methods for document layout analysis, classification, segmentation, object detection, and redaction across genealogical and historical collections such as newspapers, city directories, family history books, and birth, marriage, and death records.
  • Analyze model performance, refine labeling specifications, and iterate with labeling resources to curate training sets that improve performance.
  • Collaborate with ML Ops and Data Science Engineers to deploy datasets, truthsets, models, and training and inference code to a cloud-based model registry.
  • Effectively communicate and present deliverables and solutions to teams, stakeholders, and executives.

Who You Are: 

  • Candidate for an advanced degree (MS/PhD) in Computer Science, Statistics, Mathematics, Linguistics, Engineering, or another data-related quantitative field
  • Specialization in natural language processing, computer vision, deep learning, machine learning, or related software development
  • Experience understanding and implementing published models and methods for practical application and real-world problems
  • Strong proficiency in Python and related CV and NLP tools and libraries, and familiarity with deep learning frameworks such as PyTorch, TensorFlow, and Keras, as well as the SciPy stack and scikit-learn

Nice to Have: 

  • Experience with NLP techniques such as named entity recognition, relationship extraction, document classification, document summarization, topic modeling, machine translation, sentiment analysis, and dialogue systems
  • Experience in document image processing, e.g., image classification, object detection, segmentation, layout analysis, redaction, and handwriting recognition
  • Familiarity with NLP tools such as NLTK, spaCy, pandas, and NumPy, along with an understanding of pre-trained language models and architectures such as BERT, GPT, T5, XLNet, PL-Marker, TPLinker, OneRel, and Hugging Face and OpenAI models

Internship Program Details:

  • Students must be enrolled in an accredited U.S. educational institution with a graduation date after August 2023. 
  • Summer 2023 program dates are May 15 – September 8. (Please note that we will have three intern onboarding dates to choose from: May 15th, June 5th, and June 20th. Students may offboard on any Friday beginning August 11th. All internships must be wrapped up by September 8th.)
  • FULLY PAID temporary housing and travel to and from the internship
  • All summer internships will be in Lehi, Utah. You will work a combined hybrid and office-based schedule that allows you to choose which days you come into the office and which days you work from temporary housing/home (Utah students).
  • Interns have the opportunity to network and partner with other interns and industry-leading professionals
  • You will participate in engaging events including executive speaker sessions, professional development, and our annual Intern Days to showcase your project and work. 
  • Full-time schedule (40 hours/week) required; Monday-Friday
  • Company-issued laptop and equipment provided for the duration of the internship program
  • Our interns enjoy mentorship and challenging work while receiving a great compensation package, temporary housing, and a fun, captivating experience. We have it all. Oh, and did we mention the possibility of full-time employment once you graduate?

Company Info.

Ancestry

Ancestry.com LLC is an American genealogy company based in Lehi, Utah. The largest for-profit genealogy company in the world, it operates a network of genealogical, historical records, and related genetic genealogy websites.

Ancestry is currently hiring for Computer Vision Internship jobs in Lehi, UT, USA, with an average base salary of $88,330 - $121,330 per year.