Lead Data Scientist - Remote (NY, NJ, MA, IL, TX)

McKesson Corporation

Job Description

McKesson requires new employees to be fully vaccinated for COVID-19 as defined by the CDC, subject to applicable, verified accommodation requests.

Join the fight against cancer.

Ontada is a leading oncology real-world data and evidence, clinical education and provider technology business dedicated to transforming the fight against cancer. Part of McKesson Corporation, we support science through our data, technology and channels, which accelerate innovation for life science companies, support the education of community oncology providers and advance patient care. Together with our partners, we improve the lives of cancer patients.

Position Summary

We are looking for an experienced NLP Machine Learning Engineer/Data Scientist with a strong blend of business acumen and technical skills. The NLP Machine Learning Engineer/Data Scientist is involved in the full lifecycle of NLP data solutions, from data engineering and modeling through operations, presentation, maintenance, and benefit tracking. This is an ideal opportunity to join an innovative and energetic team that develops analytical tools that shape our products and make a difference in oncology care.

Responsibilities

Business Drivers

  • Collaborate as a strong team player with product management, product owners, and engineering departments to understand business needs and devise solutions
  • This is a highly collaborative role. The NLP Machine Learning Engineer/Data Scientist is expected to work closely with other analysts, data warehousing, and data engineering teams to create big data applications from structured and unstructured data, design and develop optimal data architecture, and experiment with new machine learning techniques.
  • The NLP Machine Learning Engineer/Data Scientist also works with senior management of Data and Analytics and serves as a reliable advisor in creating and delivering useful information for the business

Technical responsibilities

  • Design, test and maintain Natural Language Processing (NLP) applications using the latest in testing methodologies.
  • Develop and lead Optical Character Recognition (OCR) solutions, bringing insight to attached documents contained in the Electronic Medical Record (EMR) System
  • Participate in the full lifecycle of end-to-end NLP and OCR solutions, from planning and design through technical implementation, deployment, validation, support, and maintenance
  • Lead the development of machine learning NLP models from unstructured healthcare data such as provider notes, EMR attached documentation, etc.
  • Data engineering and data manipulation using Python, PySpark, and Pandas
  • Make use of state-of-the-art NLP model architectures such as BERT (and derivatives like BioBERT, RoBERTa, etc.), BiLSTM, and XLNet in production pipelines
  • Use John Snow Labs (JSL) technology and pre-built models
  • Design, implement, deploy, and maintain deep learning and machine learning models using cloud technologies (e.g., AWS, GCP, or Azure; AWS preferred)
  • Collaborate with other machine learning engineers/data scientists and provide technical direction
  • Perform code reviews to guarantee high quality products moving to production
  • Design and develop optimal data architecture for data warehousing of unstructured data and insights
  • Provide mentorship and guidance to junior team members in areas of technical and professional development

Drive innovation

  • Develop creative solutions for diverse problems in Genomics, Medical Imaging, NLP, and AI/Machine learning including information extraction from unstructured data, clinical ontology development, and risk prediction
  • Evaluate new technologies and tools prior to wider business adoption

Knowledge

  • The candidate is expected to maintain a deep understanding of the business’s marketplace dynamics. The Machine Learning Engineer/Data Scientist takes initiative in conducting exploratory data analyses and designing experiments that help the business better understand trends and behavior within these markets and choose the most suitable strategies to achieve its goals and targets.

This description is general in nature and is not intended to be an exhaustive list of all responsibilities. Other duties may be assigned as needed to meet company goals.

Typical Minimum Requirements

  • Typically requires 10+ years of data science experience
  • At least 4 years of experience in NLP machine learning (healthcare experience preferred)
  • Must be authorized to work in the US (sponsorship is not available)

Critical Skills & Experience

  • Proficient in big data technologies such as Hadoop, Spark, or Flink
  • Deep knowledge of NLP tools and frameworks such as Spark NLP, spaCy, HuggingFace, Flair, NLTK, etc.
  • Proficient in OCR technologies such as Tesseract
  • Experience using tools in at least one cloud platform such as Amazon Web Services, Microsoft Azure, or Google Cloud Platform
  • Deep understanding of various machine learning/deep learning algorithms, including supervised, unsupervised, and reinforcement learning methods

Specialized Skills & Knowledge

  • Strong analytical skills, business and product instincts
  • Excellent verbal communication, including the ability to articulate complex concepts to both technical and non-technical audiences
  • Ability to conduct research and convert the results into applications and features that drive business impact
  • Proficient with relational databases like Oracle PL/SQL, PostgreSQL, MySQL, etc.
  • Strong programming abilities in Python, R, Java, or Scala
  • Proficient with Unix systems, command line interfaces, and shell scripting
  • Knowledge of NoSQL databases such as MongoDB a plus, but not required
  • Experience with machine learning and statistical libraries, such as scikit-learn, TensorFlow, PyTorch, NumPy, etc.
  • Experience with BI visualization tools such as Tableau
  • Working knowledge of modern web development tools such as JavaScript and Node.js
  • Passion for teaching and mentoring

Education/Training

  • A degree in a quantitative field such as Statistics, Machine Learning, Mathematics, Computer Science, Economics, Epidemiology or any other related field is required
  • Graduate degree strongly preferred

Working Conditions

  • Remote work location – NY, NJ, MA, IL, TX preferred
  • Occasional travel (up to 10%)

No Agencies Please.

McKesson is an Equal Opportunity/Affirmative Action employer.

All qualified applicants will receive consideration for employment without regard to race, color, religion, creed, sex, sexual orientation, gender identity, national origin, disability, or protected Veteran status. Qualified applicants will not be disqualified from consideration for employment based upon criminal history.

Company Info.

McKesson Corporation

McKesson Corporation is an American company that distributes pharmaceuticals and provides health information technology, medical supplies, and care management tools. The company delivers a third of all pharmaceuticals used in North America and employs over 78,000 people.

  • Industry
    Information Technology
  • No. of Employees
    80,000
  • Location
    Irving, Texas, USA

McKesson Corporation is currently hiring for Lead Data Scientist roles in New York, NY, USA, with an average base salary of $120,000 - $190,000 per year.