Machine Learning Data Engineer

Synthesia

Job Description

About the role

  • We are looking for an experienced Machine Learning Data Engineer who loves dealing with large quantities of text and audio data. The successful candidate will be proficient in using machine learning techniques to build data processing pipelines, preparing ready-to-train datasets for large models.
  • If you are excited about the intersection of AI, Machine Learning, and Large Data, this role provides a unique opportunity to make a high-impact contribution.
  • Our aim is to make video content creation available to everyone, not only to production studios!
  • You will be someone who loves to code and build working systems. You are used to working in a fast-paced start-up environment. You will have experience with the software development life cycle, from ideation through implementation, to testing and release.
  • You will join a group of more than 40 Engineers in the R&D department and will have the opportunity to collaborate with multiple research teams across diverse areas. Our research is guided by our co-founders, Prof. Lourdes Agapito and Prof. Matthias Niessner.
  • If you know and love Voicebox, Whisper, VALL-E, SPEAR-TTS and more, and you love machine learning and large data, then we would love to talk to you. We would also love to talk to you if working on this is what you dream of doing.

What will you be doing?

  • In this position, you'll join the team to help develop our LLM-based TTS system, which will provide our customers with voice clones that are indistinguishable from real voices. You will also help us create high-quality, production-ready code and take ownership of production pipelines. This includes:
  • Designing, developing, and maintaining data processing pipelines, utilising machine learning techniques to handle vast amounts of text and audio data, while ensuring data quality and accessibility.
  • Leveraging your understanding of machine learning algorithms and workflows to prepare data as effectively as possible for use in large-scale models.
  • Using Big Data tools and frameworks to process, analyse, and derive insights from structured and unstructured data.
  • Collaborating with other ML Engineers and Researchers to understand their data requirements and provide them with ready-to-train datasets.
  • Monitoring the performance of data pipelines and machine learning models, troubleshooting data-related issues, and performing root cause analysis to implement strategic solutions.
  • Staying up to date with emerging technologies and tools in machine learning and data engineering to continually improve our data infrastructure.
  • Documenting data pipeline architecture and workflows, presenting findings to relevant stakeholders, and providing training as needed.

Who are you?

  • You have a background in Computer Science, Engineering, or a related field with 3+ years of experience. Advanced degrees with a focus on Machine Learning are preferred.
  • You have proven experience as a Data Engineer or in a similar role, with a demonstrated history of designing and building scalable data pipelines using Machine Learning techniques.
  • Familiarity with audio data processing and voice technologies is highly desirable.
  • You have excellent coding skills in Python and you are very passionate about the software development side of things.
  • You have solid proficiency in Unix-like command line operations, including the creation and execution of both quick one-liners and complex bash scripts.
  • You put emphasis on documenting your work in a clear and concise manner.
  • You can work effectively in a fast-paced, agile environment.
  • And finally, you have excellent verbal and written communication skills and you are passionate about what you do!

Nice to have…

  • Transformers, Hugging Face, Whisper ASR
  • Multi-threaded Python
  • AWS framework

The good stuff...

  • You will be compensated well (salary + stock options + bonus)
  • You will work in a hybrid setting with an office in Amsterdam
  • You get 25 days of annual leave + public holidays
  • You will join an established company culture with regular socials and company retreats
  • You get 4 weeks paid sabbatical after 4 years at the company + $10,000!!
  • You can participate in a generous referral scheme
  • You will have huge opportunities for your career growth

Company Info.

Synthesia

Synthesia has been ranked as the top AI video creation platform, utilized by numerous companies to produce videos in 120 different languages. By utilizing Synthesia, businesses can save up to 80% of their time and budget.

  • Industry
    Artificial intelligence, Computer software
  • No. of Employees
    200
  • Location
    London, UK

Synthesia is currently hiring for Machine Learning Data Engineer roles in Amsterdam, Netherlands, with an average base salary of €73,600 - €129,500 per year.
