Data Scientist, Machine Learning and Deep Learning

MD Anderson Cancer Center

Job Description

The primary purpose of this Data Scientist role is to help build the data infrastructure of our flagship platform, A3D3a: Adaptive, AI-augmented, Drug Discovery and Development. With expertise in data architecture, the Data Scientist will contribute directly to our mission to discover novel therapies for cancer patients. Led by Prof. Bissan Al-Lazikani, Director of Therapeutics Data Science, the intelligent and ever-learning A3D3a platform is part of the new initiative in Therapeutics Data Science and of our ambitious Institute for Data Science in Oncology at MD Anderson. Through the development of novel machine learning and AI technologies, A3D3a will accelerate the discovery and impact of novel cancer therapies by opening new opportunities for optimized therapies for patients, with a focus on rare and hard-to-treat cancers.

Central to this vision, the Data Scientist will build and maintain data infrastructure to enable the discovery of hidden therapeutic opportunities in integrated patient data and will work closely with data scientists, data engineers, bioinformaticians, and molecular modelers.

Key Functions:

  • Work with lead teammates to establish an architectural plan that encompasses local, hybrid, and/or cloud infrastructure
  • Utilize a variety of tools (e.g. Spark, KNIME, Airflow, SQL) to merge and extract data from multiple sources and environments
  • Create data pipelines to validate and enrich data for use in ML models
  • Generate and maintain metadata for all stages of the data pipeline
  • Work with a multidisciplinary team and stakeholders to define data requirements
  • Establish and maintain interfaces to the data (APIs)
  • Utilize industry standards for creating, storing, and documenting code

Skills and Qualifications:

  • Strong Python programming experience with demonstrated skills is required
  • Preference will be given to candidates with experience using Spark (PySpark)
  • Solid understanding of CI/CD practices
  • Experience building and querying both relational and graph databases
  • Familiarity with NoSQL databases
  • Solid knowledge of metadata creation and management
  • Experience with Airflow, Argo, or an equivalent workflow orchestration tool is required
  • Must have demonstrated experience working with APIs
  • Good understanding of container-based architectures (e.g. Docker/Kubernetes)
  • Must have demonstrated experience working on data engineering tasks using one of the major cloud vendors; preference will be given to those with Microsoft Azure experience
  • Preference will be given to candidates with demonstrated skills in building and deploying ML models

EDUCATION:

  • Required: Bachelor's degree in Biomedical Engineering, Electrical Engineering, Computer Engineering, Physics, Applied Mathematics, Science, Engineering, Computer Science, Statistics, Computational Biology, or related field.
  • Preferred: PhD in Biomedical Engineering, Electrical Engineering, Computer Engineering, Physics, Applied Mathematics, Science, Engineering, Computer Science, Statistics, Computational Biology, or related field.

EXPERIENCE:

  • Required: Three years of experience in scientific software development/analysis. With a Master's degree, one year of experience is required. With a PhD, no experience is required.

Additional Information

  • Requisition ID: 161059
  • Employment Status: Full-Time
  • Employee Status: Regular
  • Work Week: Day/Evening
  • Minimum Salary: US Dollar (USD) 84,500
  • Midpoint Salary: US Dollar (USD) 105,500
  • Maximum Salary: US Dollar (USD) 126,500
  • FLSA: exempt and not eligible for overtime pay
  • Fund Type: Soft
  • Work Location: Hybrid Onsite/Remote
  • Pivotal Position: Yes
  • Referral Bonus Available?: Yes
  • Relocation Assistance Available?: Yes
  • Science Jobs: Yes

Company Info.

MD Anderson Cancer Center

The University of Texas MD Anderson Cancer Center is one of the world's most respected centers devoted exclusively to cancer patient care, research, education and prevention. MD Anderson provides cancer care at several convenient locations throughout the Greater Houston Area and collaborates with community hospitals and health systems nationwide through MD Anderson Cancer Network.

MD Anderson Cancer Center is currently hiring for Data Scientist, Machine Learning roles in Houston, TX, USA, with a base salary range of $84,500 - $126,500 per year.