Senior Staff Software Engineer, Model Serving

Cohere

Job Description

Who are we?

We’re a team of engineers, thinkers, and champions whose aim is to give technology language. Every day our team breaks new ground as we build transformational AI technology and products for enterprises and developers who wish to harness the power of Large Language Models.

We're driven by ambition, as we firmly believe that our technology has the potential to revolutionise the way industries engage with natural language. Our strong technical foundation speaks for itself, with our team composed of world-class experts who have collectively accumulated hundreds of thousands of citations in academia.

The Cohere team is a collective of college dropouts, PhDs, alumni of big tech and scrappy start-ups, new grads and career pivots, who believe a diverse team is the key to a safer, more responsible technology. At Cohere, work isn’t the opposite of play, as we build the future of language AI with team members on almost every continent in the world, working from high rises, cabins, tour buses, and dog-friendly offices.

There’s no better time to herald the next step with us as we shape the future of Generative AI.

Why this role?

Are you energized by building high-performance, scalable, and reliable machine learning systems? Do you want to help define and build the next generation of AI platforms powering advanced NLP applications? We are looking for Senior Staff Software Engineers to join the Model Serving team at Cohere. The team is responsible for developing, deploying, and operating the AI platform that delivers Cohere's large language models through easy-to-use API endpoints. In this role, you will work closely with many teams to deploy optimized NLP models to production in low-latency, high-throughput, and high-availability environments. You will also have the opportunity to interface with customers and create customized deployments to meet their specific needs.

We are looking for candidates with a range of experiences for multiple roles, from senior to staff-level engineers.

Please Note: We have offices in Toronto, Palo Alto, and London but embrace being remote-first! There are no restrictions on where you can be located for this role.

You may be a good fit if you have:

  • Experience with serving ML models
  • Experience designing, implementing, and maintaining a production service at scale
  • Familiarity with the inference characteristics of deep learning models, specifically Transformer-based architectures
  • Familiarity with the computational characteristics of accelerators (GPUs, TPUs, and/or Inferentia), especially how they influence the latency and throughput of inference
  • Strong understanding of, or working experience with, distributed systems
  • Experience in performance benchmarking, profiling, and optimization
  • Experience with cloud infrastructure (e.g. AWS, GCP)
  • Experience in Golang (or other languages designed for high-performance, scalable servers)

If some of the above doesn’t line up perfectly with your experience, we still encourage you to apply! If you consider yourself a thoughtful worker, a lifelong learner, and a kind and playful team member, Cohere is the place for you.

We value and celebrate diversity and strive to create an inclusive work environment for all. We welcome applicants of all kinds and are committed to providing an equal opportunity process. Cohere provides accessibility accommodations during the recruitment process. Should you require any accommodation, please let us know and we will work with you to meet your needs.

Our Perks:

  • An open and inclusive culture and work environment
  • Work closely with a team on the cutting edge of AI research
  • Free daily lunch
  • Full health and dental benefits, including a separate budget to take care of your mental health
  • 100% Parental Leave top-up for 6 months for employees based in Canada, the US, and the UK
  • Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
  • Remote-flexible, with offices in Toronto, Palo Alto, San Francisco, and London, plus a co-working stipend
  • 6 weeks of vacation

Company Info.

Cohere

Cohere is a Canadian startup that provides natural language processing models that help companies improve human-machine interactions. Cohere was founded in 2019 by Aidan Gomez, Ivan Zhang, and Nick Frosst.

  • Industry
    Computer software, Natural Language Processing
  • No. of Employees
    50
  • Location
    Toronto, ON, Canada

Cohere is currently hiring for Senior Staff Software Engineer roles in San Francisco, CA, USA, with an average base salary of $126,000 - $246,300 / year.
