Senior Data Scientist - Computational Biology & Machine Learning

Spark Therapeutics, Inc.
Job Description

The Computational Biology and Machine Learning (CBML) group within the Genomics & Data Science Organization at Spark Therapeutics is seeking an engaged and passionate Senior Data Scientist with expertise in AI/ML and proficiency in the analysis of Next-Generation Sequencing (NGS) data. The candidate will collaborate on projects throughout the Research & Technology Organization that require machine learning expertise, including optimization, anomaly detection, classification, and cluster analysis.

He/she will be responsible for:

  • Developing and implementing AI/ML models for data analysis, including clustering, classification, and prediction.
  • Analyzing and integrating large-scale genomics and transcriptomics datasets generated from single cells.
  • Working closely with cross-functional teams to design and execute experiments to validate models and algorithms.
  • Engaging in cross-functional discussions, providing conceptual input in experimental and study design and serving as subject matter expert in Computational Biology and ML.
  • Summarizing, visualizing, and presenting analyses and findings to key stakeholders.
  • Supporting evaluation and writing of study reports, scientific presentations, and SOPs.
  • Staying up to date with the latest trends and advancements in genomics, transcriptomics, and machine learning.


Essential daily job functions and % of time spent on each:

  • 50% - AI/ML and NGS analysis of scientific data, including interpretation and communication of results and insights to stakeholders, both scientific and non-scientific audiences
  • 25% - Develop in-house computational tools and machine learning models to support analysis of high-throughput datasets
  • 20% - Generate technical reports, prepare presentation slides, and generate novel concepts
  • 5% - Training, lab meetings, and administrative work


  • Ph.D. in Computer Science, Computational Biology, Bioinformatics, or a related discipline preferred, with 3-5 years’ experience.
  • May also have a Bachelor’s or Master’s degree and significant related experience (preferred minimum of 10 years). Individual experience may vary based on skillset and expertise.
  • Demonstrable track record in the core competency areas: high-dimensional data analysis, applying AI/ML techniques to biological datasets, data visualization, etc.
  • Extensive experience in next-generation sequencing (NGS) analysis of DNA and RNA-seq (bulk and single-cell) data using short- and long-read sequencing technologies.


  • Deep understanding and expertise in the fields of machine learning and data science, as well as a strong foundation in statistical analysis methods and frameworks.
  • Proficiency in programming using one or more common data science languages such as Python, R, SPARQL, SQL.
  • Experience with machine learning libraries such as scikit-learn, TensorFlow, and Keras.
  • Extensive experience using bioinformatics workflow technologies such as WDL, CWL, Nextflow, and Docker.
  • Understanding of genomic sequencing technologies, including Illumina and 10x Genomics.
  • Familiarity with bioinformatics tools such as BWA, Samtools, BLAST, Cell Ranger, Seurat, Scanpy, etc.
  • Strong familiarity with core concepts in molecular biology and related lab technologies.
  • Track record of following best practices of coding, version control (Git), code documentation, and reproducible research.
  • Proven ability to work both independently and in a collaborative group setting.
  • Self-motivated to learn and develop new methodologies, manage multiple analysis pipelines simultaneously, keep accurate records, follow instructions, and comply with company policies.
  • Excellent communication skills (both oral and written).
  • Experience with AAV vectors and gene therapy is preferred, but not required.
  • Expert ability to critically analyze problems, develop potential solutions, and evaluate impact to Spark.
  • Demonstrated ability to independently provide effective oversight and management of external collaborations and vendors.
  • Experience with cloud computing platforms such as AWS, GCP, or Azure.

Company Info.

Spark Therapeutics, Inc.

Spark Therapeutics, Inc. is a developer of gene therapy treatments, which treat debilitating genetic diseases. It is a subsidiary of Hoffmann-La Roche.

  • Location
    Philadelphia, Pennsylvania, USA

Spark Therapeutics, Inc. is currently hiring for Machine Learning Scientist and Computational Biology jobs in Philadelphia, PA, USA, with an average base salary of $120,000 - $250,000 per year.
