Principal Data Engineer, Scientific Digital & Tech

GlaxoSmithKline

Job Description

The Scientific Digital and Tech organization, part of R&D Digital & Tech, is an integrated family that powers GSK R&D's discovery, manufacture, and supply of new medicines and vaccines to patients. It focuses on optimizing CMC (Chemistry, Manufacturing, and Controls), informing vaccine and medicine design, and delivering a competitive, modern laboratory experience.

Scientific Data Engineering is a new organization responsible for enterprise data and architecture, providing thought leadership and architecture services for all aspects of data across the Scientific area. This includes using and augmenting the two GSK data platforms on GCP and Azure, and designing and creating data pipelines that surface data as a “product” in a data mesh architecture across the two platforms.

Job Purpose:

The Principal Data Engineer contributes to the construction of the CMC data mesh and data strategy. This role will interact with architects, engineers, data modelers, and product owners, as well as other team members in Scientific Tech and R&D.

The Principal Data Engineer is a leading technical contributor who can consistently take a poorly defined business or technical problem, work it to a well-defined data problem/specification, and execute it at a high level. They have a strong focus on metrics, both for the impact of their work and for its inner workings/operations.

They are a model for the team on best practices for software development in general (and data engineering in particular), including code quality, documentation, DevOps practices, and testing, and consistently mentor junior members of the team. They ensure the robustness of our services and serve as an escalation point in the operation of existing services, pipelines, and workflows.

The Principal Data Engineer should demonstrate core engineering knowledge and experience of industry technologies, practices, and frameworks such as data mesh and scaling data platforms, containerization, cloud-based platforms, data analytics, and data streaming. Examples of technologies include Java/C#/Python, Denodo, Git, Azure DevOps, Databricks, Spark, Azure Data Factory, ADLS Gen2, Kafka, Selenium, JUnit/NUnit, SAFe, Kanban, Docker, and Azure cloud architecture, including networking principles and scaling applications.

Primary responsibilities include the following:

  • Use Azure cloud services and GSK data platform tools to ingest, egress, and transform data from multiple sources.
  • Confidently optimize the design and execution of complex data ingestion and data transformation solutions.
  • Produce well-engineered software, including appropriate automated test suites, technical documentation, and operational strategy.
  • Provide input into the roadmaps of upstream teams (e.g., Data Platforms, DataOps, DevOps) to help improve the overall program of work.
  • Apply platform capabilities consistently to ensure quality and consistency in logging and lineage.
  • Be fully versed in coding best practices and ways of working, participate in code reviews, and partner with others to improve the team’s standards.
  • Adhere to the QMS framework, Security & Regulatory Standards, and CI/CD best practices, and help guide improvements to them.
  • Provide leadership to team members to help others get the job done right.
  • Support engineering teams in the adoption and creation of data mesh best practices.
  • Maintain best practices for engineering and architecture on our Confluence site.
  • Proactively engage in experimentation and innovation to drive relentless improvement.
  • Provide leadership, technical direction, and GSK expertise to architecture and engineering teams composed of GSK FTEs, strategic partners, and software vendors.

Why you?

Basic Qualifications:

We are looking for professionals with these required skills to achieve our goals:

  • BS in Computer Science
  • Experience in all the following:
    • Data Engineering development, architecture design & technology platforms/frameworks
    • Azure Data Analytics services, e.g., ADLS, Azure Data Factory, Azure Databricks, Purview, Azure Synapse, etc.
    • Data Platforms and Domain-driven design
    • Agile, DevOps & Automation [of testing, build, deployment, CI/CD, etc.]
    • Data analytics & data quality/integrity
    • Testing strategies & frameworks
  • Python experience required

Preferred Qualifications:

If you have the following characteristics, it would be a plus:

  • MS in Computer Science
  • Experience with various open-source ecosystems including JavaScript, Big Data, Java, Scala, Python, etc.
  • Experience in agile software development and DevOps, relevant technology platforms [e.g., Kubernetes] and frameworks [e.g., Docker], including cloud technologies and data structures (i.e., information management), data models, or relational database design
  • Subject matter expertise in Pharma CMC and scientific domains.
  • Experience in applying data curation, virtualization, workflow, and advanced visualization techniques to enable decision support across multiple products and assets to drive results across R&D business operations.
  • Role requires:
    • Demonstrated skill in delivering high-quality engineered data products
    • Knowledge of industry standards and technology platforms
    • Excellent communication, negotiation, influencing, and stakeholder management skills
    • Customer focus and excellent problem-solving skills
    • Good understanding of various software paradigms: domain-driven, procedural, data-driven, object-oriented, functional
    • Demonstrable knowledge depth in more than one area of software engineering and technology

If you require an accommodation or other assistance to apply for a job at GSK, please contact the GSK Service Centre at 1-877-694-7547 (US Toll Free) or +1 801 567 5155 (outside US).

Company Info.

GlaxoSmithKline

A science-led global healthcare company with a special purpose: to help people do more, feel better, live longer. We have three global businesses that research, develop and manufacture innovative pharmaceutical medicines, vaccines and consumer healthcare products. We aim to bring differentiated, high-quality and needed healthcare products.

  • Industry
    Healthcare
  • No. of Employees
    104,875
  • Location
    Brentford, UK

GlaxoSmithKline is currently hiring for Principal Data Engineer roles in Collegeville, PA, USA, with an average base salary of $90,000 - $190,000 / year.
