Valo Health, LLC
As a Staff Data Scientist, Machine Learning, you will be a core member of a team of data scientists and engineers building a powerful computational platform for advancing the research and development of new medicines. As part of the Translational Data Science team, you will help design, develop, and apply machine learning (ML) models, methods, and pipelines for scientific problems involving clinical and biomedical data. Successful candidates will work with a diverse set of data scientists, biological scientists, epidemiologists, and software engineers in ways that cut across traditional industry boundaries.

What You’ll Do...

  • Perform hands-on exploratory and defined analysis and modeling of high-dimensional longitudinal data to generate fit-for-purpose evidence for decision making for multiple projects.
  • Design and implement innovative ML approaches leveraging Valo’s proprietary platform (data assets and data science packages) as well as newly published methods.
  • Contribute to the design and implementation of robust, automated pipelines leveraging deep learning and classical ML on high-dimensional electronic health records and omics data to solve scientific problems.
  • Develop well-designed, tested, and documented software packages.
  • Contribute to planning, execution, interpretation, and communication of results.
  • Collaborate with cross-functional teams and key stakeholders to derive user requirements, maintain alignment, and ensure the relevance and impact of models, analyses, and pipelines.
  • Be an active team member in code, design, and analysis review.

What You Bring...

  • Degree in a quantitative field with the following years of post-degree experience or equivalent: BS: 7+; MS: 5+; PhD: 3+.
  • Broad experience in ML including random forest, logistic regression, dimensionality reduction, clustering, metrics, model selection, feature selection, and explainability (3+ years required).
  • Demonstrated experience implementing and applying deep learning approaches such as representation learning, sequence models, transformers, and self-supervised learning on high-dimensional or multimodal data (2+ years required).
  • Proficient in Python (3+ years required) and experience with ML, deep learning, and data science packages (e.g., scikit-learn, PyTorch, statsmodels, SciPy, MLlib).
  • Experience with collaborative software development using source control management (e.g., git, unit testing, code review, CI/CD) (2+ years required).
  • Experience with MLOps methodology such as workflow orchestration (e.g., Airflow, Prefect), experiment tracking (e.g., MLflow), containerization (e.g., Docker), and reproducible research (1+ years required).
  • Experience with ML on electronic health records (1+ years required).
  • Experience with large-scale data analytics engines (e.g., Spark or Dask) and working in cloud environments (e.g., AWS).
  • Experience with statistical methods such as hypothesis testing, longitudinal modeling, and time-to-event analysis.
  • Strong work ethic with a bias for execution and an ability to manage multiple priorities, ambiguity, and tight timelines.
  • Ability to work effectively in teams or independently.
  • Experience with omics data is a plus.
  • Familiarity with the drug discovery and development process is a plus.


  • New York Base Salary: $192,780 to $203,900
  • San Francisco Base Salary: $200,000 to $227,000

This range represents the low and high end of the anticipated annual base salary range for the New York City-based and San Francisco-based positions. The actual annual base salary will depend on numerous factors, such as experience, knowledge, skills, and whether the location of the job changes.

More on Valo

Valo Health, Inc (“Valo”) is a technology company built to transform the drug discovery and development process using human-centric data and artificial intelligence-driven computation. As a digitally native company, Valo aims to fully integrate human-centric data across the entire drug development life cycle into a single unified architecture, thereby accelerating the discovery and development of life-changing drugs while simultaneously reducing costs, time, and failure rates. The company’s Opal Computational Platform™ is an integrated set of capabilities designed to transform data into valuable insights that may accelerate discoveries and enable Valo to advance a robust pipeline of programs across cardiovascular metabolic renal, oncology, and neurodegenerative disease. Founded by Flagship Pioneering and headquartered in Boston, MA, Valo also has offices in Lexington, MA, and New York. To learn more, visit

