AWS, Azure, Bioinformatics, Data science techniques, Docker, Git, Google Cloud Platform (GCP), Machine learning techniques, Python Programming, R Programming, Scikit-learn, Spark Core, SQL, TensorFlow
The Computational Biology and Machine Learning (CBML) group within the Genomics & Data Science Organization at Spark Therapeutics is seeking an engaged and passionate Senior Data Scientist with expertise in AI/ML and proficiency in the analysis of Next-Generation Sequencing data. The candidate will collaborate on projects throughout the Research & Technology Organization that require machine learning expertise, including optimization, anomaly detection, classification, and cluster analysis.
The candidate will be responsible for:
- Developing and implementing AI/ML models for data analysis, including clustering, classification, and prediction.
- Analyzing and integrating large-scale genomics and transcriptomics datasets generated from single cells.
- Working closely with cross-functional teams to design and execute experiments to validate models and algorithms.
- Engaging in cross-functional discussions, providing conceptual input in experimental and study design and serving as subject matter expert in Computational Biology and ML.
- Summarizing, visualizing, and presenting analyses and findings to key stakeholders.
- Supporting evaluation and writing of study reports, scientific presentations, and SOPs.
- Staying up to date with the latest trends and advancements in genomics, transcriptomics, and machine learning.
Essential daily job functions and percentage of time spent on each:
- 50% - AI/ML and NGS analysis of scientific data, including interpretation and communication of results and insights to stakeholders, both scientific and non-scientific audiences
- 25% - Develop in-house computational tools and machine learning models to support analysis of high-throughput datasets
- 20% - Generate technical reports, prepare presentation slides, generate novel concepts.
- 5% - Training, lab meetings, and administrative work
- Ph.D. in Computer Science, Computational Biology, Bioinformatics, or a related discipline preferred, with 3-5 years of experience.
- Alternatively, a Bachelor’s or Master’s degree with significant related experience (preferred minimum of 10 years). Individual experience may vary based on skillset and expertise.
- Demonstrable track record in the core competency areas: high-dimensional data analysis, applying AI/ML techniques to biological datasets, data visualization, etc.
- Extensive experience in next generation sequencing (NGS) analysis of DNA and RNA-seq (Bulk/single-cell) data using short and long read sequencing technologies.
- Deep understanding and expertise in the fields of machine learning and data science, as well as a strong foundation in statistical analysis methods and frameworks.
- Proficiency in programming using one or more common data science languages such as Python, R, SPARQL, SQL.
- Experience with machine learning libraries such as scikit-learn, TensorFlow, and Keras.
- Extensive experience using bioinformatics workflow technologies such as WDL, CWL, Nextflow, Docker.
- Understanding of genomic sequencing technologies, including Illumina and 10x Genomics.
- Familiarity with bioinformatics tools such as BWA, Samtools, BLAST, Cell Ranger, Seurat, Scanpy, etc.
- Strong familiarity with core concepts in molecular biology and related lab technologies.
- Track record of following best practices of coding, version control (Git), code documentation, and reproducible research.
- Proven ability to work both independently and in a collaborative group setting.
- Self-motivated to learn and develop new methodologies, manage multiple analysis pipelines simultaneously, keep accurate records, follow instructions, and comply with company policies.
- Excellent communication skills (both oral and written).
- Experience with AAV vectors and gene therapy is preferred, but not required.
- Expert ability to critically analyze problems, develop potential solutions, and evaluate impact to Spark.
- Demonstrated ability to independently provide effective oversight and management of external collaborations and vendors.
- Experience with cloud computing platforms such as AWS, GCP, or Azure.
Spark Therapeutics, Inc.
Spark Therapeutics, Inc. develops gene therapies for debilitating genetic diseases. It is a subsidiary of Hoffmann-La Roche.
Spark Therapeutics, Inc. is currently hiring for a Machine Learning Scientist, Computational Biology position in Philadelphia, PA, USA, with an average base salary of $120,000 - $250,000 per year.