AI Research Scientist – Datadog AI Research (DAIR)

Datadog, Inc.

Job Description

As a research scientist on our team, you will partner with research engineers, working on fundamental research problems and collaborating with Datadog’s Product and Engineering teams to help translate research advances into tangible benefits for our customers.

Building on our proven track record of AI-powered solutions (e.g., Bits AI, Watchdog, and Toto), Datadog AI Research is tackling high-risk, high-reward projects grounded in real-world challenges in cloud observability and security.

We are currently focused on three key research areas:

Observability Foundation Models – Building state-of-the-art models for advanced forecasting, anomaly detection, and multi-modal telemetry analysis (logs, metrics, traces, etc.). These models will also provide the foundation for our agents (described below) to natively analyze telemetry data.

Site Reliability Engineering (SRE) Autonomous Agents – Creating AI agents to automatically detect, diagnose, and resolve incidents in production environments, pushing the boundaries of multi-step planning, reasoning, and domain-specific knowledge.

Production Code Repair Agents – Developing agents and models that leverage code, logs, runtime data, and other signals to identify, fix, and even preempt performance issues and security vulnerabilities in production code.

What You’ll Do:

  • Conduct cutting-edge research in Generative AI and Machine Learning, aiming to build specialized Foundation Models and AI Agents for observability, site reliability engineering, and code repair
  • Leverage large-scale distributed training infrastructure to pre-train and post-train state-of-the-art models on diverse, real-world telemetry data
  • Build simulated environments to facilitate on-policy agentic training and evaluation
  • Lead and contribute to research publications, present findings at top-tier conferences (e.g., NeurIPS, ICLR, ICML), and help open-source key model artifacts and benchmarks
  • Collaborate with cross-functional teams (e.g., Product, Engineering) to integrate advanced AI capabilities – like multi-modal analysis or automated incident resolution planning – into Datadog’s product ecosystem
  • Stay at the forefront of LLMs, Foundation Models, and Generative AI research and engage with the external research community
  • Foster a culture of scientific rigor, innovation, and practical impact, e.g., by actively participating in reading groups and mentoring interns

Who You Are:

  • You hold a PhD in Computer Science, Machine Learning, or a related field, with deep expertise in areas like generative modeling, AI agents, reinforcement learning, or natural language processing (or have equivalent experience)
  • You possess extensive experience in designing and implementing deep learning models and agents, and have a strong background in distributed training frameworks (e.g., DeepSpeed, Megatron-LM) and ML libraries (PyTorch, TensorFlow)
  • You have a proven track record of conducting impactful research in the field with publications at top-tier venues (e.g., NeurIPS, ICLR, ICML, TMLR)
  • You're familiar with efficient training, post-training, fine-tuning, and inference techniques for large foundation models
  • You excel at explaining complex models and research findings to both technical and non-technical audiences
  • You have a strong interest in open science and open-source contributions, including establishing rigorous benchmarks and sharing research with the community

Bonus Points (any of the following):

  • You have a demonstrated ability to bridge cutting-edge research and real-world product applications, ideally with an emphasis on large foundation models, generative AI agents, or domain-specific LLM deployments
  • You’re passionate about pushing the boundaries of AI while maintaining a strong focus on customer impact, scalability, and responsible deployment of new technologies
  • You have experience writing production data pipelines and applications
  • You have hands-on experience with GPU programming and optimization, including experience in CUDA

Datadog values people from all walks of life. We understand not everyone will meet all the above qualifications on day one. That's okay. If you’re passionate about AI Research and want to grow your skills, we encourage you to apply.

Benefits and Growth:

  • Competitive global benefits
  • New hire stock equity (RSUs) and employee stock purchase plan (ESPP)
  • Opportunity to collaborate closely with colleagues across the Datadog offices in New York City and Paris
  • Opportunity to attend and present at conferences and meetups
  • Intra-departmental mentor and buddy program for in-house networking
  • An inclusive company culture and the ability to join our Community Guilds (Datadog employee resource groups)

Benefits and Growth listed above may vary based on the country of your employment and the nature of your employment with Datadog.

Company Info.

Datadog, Inc.

Datadog is the essential monitoring platform for cloud applications. We bring together data from servers, containers, databases, and third-party services to make your stack entirely observable. These capabilities help DevOps teams avoid downtime, resolve performance issues, and ensure customers are getting the best user experience.

  • Industry
    Information Technology
  • No. of Employees
    3,400
  • Location
    New York, NY, USA

Datadog, Inc. is currently hiring for AI Research Scientist roles in Paris, France, with an average base salary of €77,600 - €127,500 per year.
