AI Inference Engineer - San Francisco

Perplexity AI

Job Description

We Are Hiring: Machine Learning Engineer – Inference

You will work on large-scale deployment of machine learning models for real-time inference.

Our current stack: Python, C++, TensorRT-LLM, Kubernetes.

Responsibilities

  • Develop APIs for AI inference used by both internal and external customers.
  • Benchmark and address bottlenecks throughout our inference stack.
  • Improve the reliability and observability of our systems and respond to system outages.
  • Explore novel research and implement LLM inference optimizations.

Qualifications

  • Experience with ML systems and deep learning frameworks (e.g., PyTorch, TensorFlow, ONNX).
  • Familiarity with common LLM architectures and inference optimization techniques (e.g., continuous batching, quantization).
  • Understanding of GPU architectures or experience with GPU kernel programming using CUDA.

Compensation & Benefits

Salary: $190,000 - $250,000 (final offer depends on experience and expertise).

Equity: Included as part of the total compensation package.

Benefits: Comprehensive health, dental, and vision insurance for you and your dependents, plus a 401(k) plan.

About Perplexity

Since launching the world’s first fully functional conversational answer engine in 2022, Perplexity has experienced tremendous growth. In 2024 alone, our daily queries increased from 2.5 million to around 20 million by December. We also offer Perplexity Enterprise Pro, serving leading companies such as NVIDIA, the Cleveland Cavaliers, Bridgewater, and Zoom.

To support our rapid expansion, we’ve raised substantial funding from top-tier investors, including IVP, NEA, Jeff Bezos, NVIDIA, Databricks, Bessemer Venture Partners, Elad Gil, Nat Friedman, Daniel Gross, Naval Ravikant, Tobi Lutke, and other industry leaders. Our team grew nearly 300% in 2024, and we’re just getting started.

Join us as we redefine search and knowledge discovery.

Company Info.

Perplexity AI

Perplexity AI is transforming how people search and interact with the internet. Our mission is to build an intelligent conversational interface powered by large language models, delivering precise and intuitive search results. Join us in shaping the future of AI-driven search technology.

  • Industry
    Artificial intelligence, Computer software
  • No. of Employees
    55
  • Location
    San Francisco, CA, USA


Perplexity AI is currently hiring for AI Engineer roles in San Francisco, CA, USA, with a base salary range of $190,000 - $250,000 per year.
