Staff Data Engineer - Research/Machine Learning

Character.ai
Job Description

About the role

You would be a great fit for this role if you are an experienced engineer who wants to be instrumental in building the world's best LLMs by collecting and refining the training data that powers them. In pursuit of the best language models, your responsibility is twofold:

  • First, you will identify and collect data at the scale required to feed our largest models. This involves managing a diverse set of sources, including structured and unstructured content in text and multimedia formats. Your engineering expertise is crucial in building the infrastructure and tools needed to collect and manage petabytes of data efficiently.
  • Second, you will experiment with various methods of extracting a balanced and comprehensive training dataset from the raw data. You will leverage your expertise in data to build datasets reflecting a hypothesis, train models, and evaluate experimental results. Through this experimentation, you will create the training datasets for our largest models.

These are critical steps in the construction of AI. With petabytes of data and numerous design decisions, each step requires careful attention. Expertise in AI is not necessary, but enthusiasm for the space and a track record of adapting to new domains are important.

Who we’re looking for

Required Experience:

  • 5+ years of production software engineering experience
  • Experience building large-scale data processing pipelines, with tools like PySpark, Beam, or Flink
  • Familiarity with Machine Learning and NLP and willingness to learn more on the job
  • Track record of adapting to new domains and a desire to use data to improve products

Additional Desired Experience:

  • ML experience as an ML engineer, Data Scientist, or another similar role
  • Experience with cloud platforms like AWS or Azure, or tools such as Kubernetes and Terraform
  • Passion for conversational AI or large language models

You will be a good fit if you are proactive and have a “get things done” mindset. Given our current pace of growth and load on our systems, most people have had a significant impact during their first week at the company.

Join our visionary team, on-site 3-5 days a week, at the forefront of next-generation AGI.

About Character.AI

Founded in 2021 by AI pioneers Noam Shazeer and Daniel De Freitas, Character is a leading AI company offering personalized experiences through customizable AI 'Characters.' As one of the most widely used AI platforms worldwide, Character enables users to interact with AI tailored to their unique needs and preferences.

Noam co-invented core LLM tech and was recently honored as one of TIME's 100 Most Influential in AI. Daniel created LaMDA, the breakthrough conversational AI now powering Google's Bard.

In just two years, we achieved unicorn status and were named Google Play's AI App of the Year – a testament to our groundbreaking technology and vision.

Company Info.

Character.ai

Character.ai is a neural language model chatbot service that can generate human-like text responses and participate in contextual conversation. Built by Noam Shazeer and Daniel De Freitas, former developers of Google's LaMDA, the beta model was made available to the public in September 2022.

  • Industry
    Artificial intelligence, Computer software
  • No. of Employees
    100
  • Location
    Menlo Park, CA, USA

Character.ai is currently hiring for Staff Data Engineer roles in New York, NY, USA, with an average base salary of $150,000 - $350,000 per year.
