Data Engineer (To Support Cancer Researchers)

Dana–Farber Cancer Institute
Job Description

Do you think artificial intelligence should be applied to solving cancer instead of only to advertising and phone apps? We are looking for a Data Engineer who will prepare the data for the next generation of AI that will run on cancer data.

We are seeking an intelligent, hard-working, and dynamic individual to serve as a Data Engineer within the AI Operations and Data Science Services group, which serves some of the most prominent research and clinical programs at the Institute, from basic to translational research, to clinical deployment and operationalization. The group encompasses expertise in AI, data science, machine learning, NLP, computer vision, production deployment, cloud infrastructure, data engineering, project management standards, and data labeling. The group seeks to develop a highly interdisciplinary environment that helps our research, clinical, and operational staff advance the overall mission of DFCI, which is to provide expert, compassionate care to children and adults with cancer while advancing the understanding, diagnosis, treatment, cure, and prevention of cancer and related diseases.

As we widen our support of several crucial centers and programs at DFCI, we seek an energetic and motivated Data Engineer to help us scale up our data infrastructure to support the research objectives of our Investigators. This will involve being responsible for data management, building data pipelines, contributing to software tools developed internally and used by collaborators, and establishing data engineering and data management best practices. The primary focus will be on our Breast Oncology Division, but the candidate will be expected to contribute as needed to other cancer areas we serve. The successful candidate will have proven experience working on large, complex projects and meeting deadlines, and will combine excellent communication skills with a personable manner.

Responsibilities

The key responsibilities will be:

  • Managing data and building data pipelines for breast cancer research
  • Managing one research client that has numerous ongoing projects in AI applied to radiology, molecular, pathology, clinical, clinical trial, and text data. Data engineering will be the key enabler for taking these projects to the next level
  • Strong organizational skills with demonstrated capacity to track and manage data flows across projects and computational platforms
  • Meeting and consulting with scientists, and designing plans and solutions to support their data tooling needs
  • Delivering project results on time and on budget
  • Working as part of the broader team to identify long-term solutions that will improve the quality, speed and efficacy of our current projects and programs
  • Evaluating and benchmarking new software libraries
  • Prototyping and deploying data engineering pipelines
  • Designing and implementing data pipelines that focus on the data life cycle
  • Excellent communication and effective problem-solving skills, ideally with a track record of serving a variety of diverse customers and projects
  • Ability to quickly learn new software tools and provide feedback/recommendations
  • Ability to work independently, prioritize, and manage people if needed, within an environment with ever-changing priorities
  • Excellent soft skills, such as oral and written communication

Requirements:

  • Excellent data engineering and data management skills
  • Required: strong proficiency in Python and SQL
  • Preferred: a degree in a quantitative field such as informatics, computer science, applied mathematics, or software engineering, or equivalent experience with evidence of impact in data engineering applied to real-life problems (e.g., some quantitative master's courses and experience in a relevant internship)
  • Nice to have: experience in a research setting, ideally within a clinical or basic research environment
  • Familiarity with Jupyter Lab, Linux, and Git
  • Cloud computing experience (e.g., GCP, AWS, Azure)
  • Preferred: experience with the REDCap API or the OMOP Common Data Model

Qualifications:

  • Preferred: 1 to 5 years of experience after an MS or PhD
  • Experience with multiple large, heterogeneous, and sparse datasets is strongly preferred
  • Preferred: prior experience in client management
  • Domain knowledge of oncology and cancer biology is preferred but not required

Company Info.

Dana–Farber Cancer Institute

Dana–Farber Cancer Institute is a comprehensive cancer treatment and research institution in Boston, Massachusetts. Dana–Farber is the founding member of Dana–Farber/Harvard Cancer Center, Harvard's Comprehensive Cancer Center designated by the National Cancer Institute, and one of the 15 clinical affiliates and research institutes of Harvard Medical School.

Dana–Farber Cancer Institute is currently hiring for Data Engineer roles in Brookline, MA, USA, with an average base salary of $120,000 - $190,000 per year.
