Big Data / Data Lake Architect

Merck & Co.

Job Description

Our IT team operates as a business partner proposing ideas and innovative solutions that enable new organizational capabilities. We collaborate internationally to deliver the services and solutions that help everyone to be more productive and enable innovation.

The Opportunity:

This is a rare opportunity to join a start-up hub built within a major multinational, with the goal of building a world-class data platform / data mesh network for the division.

The Team:

The Data Platform team in the Animal Health Intelligence division of our company (MAHI) provides the development, support, and roadmap for the advanced analytics data platform. The team's responsibilities cover the capture, ingestion, transformation, storage, and architecture of data, the implementation of ML models, and end-user consumption endpoints such as self-service and APIs. The platform provides deep analytical insights for reporting and for developing products for the market that aid in animal traceability, monitoring, well-being, actions, and other features based on descriptive, predictive, and prescriptive analytics.

About the Role:

We seek a self-motivated, communicative, experienced, hands-on Data Architect to work at the cutting edge of the Animal Health Intelligence division of our company (MAHI). You will design, establish, and review architectures; design optimized data models for ingesting new and existing data sources; and design for both internal and external consumption via self-serve analytics, developed APIs, and consumption layers, including egress monitoring. You will also engineer enterprise-class, large-scale deployments and deliver cloud-based data solutions to our customers.

You will work closely with engineering, ML Ops, data science, technical and business stakeholders, cloud engineers, data engineers, enterprise architecture, cloud teams, and foundational teams to architect optimal, cost-efficient solutions for the Data Platform team. You will work across a mix of environments spanning private and public subnets as well as different infrastructures and regions.

You would bring your energy and skills to:

  • Data Architecture – Drive the architecture for the platform, including the best models for ingress and egress, ingestion, and microservices-based APIs. Identify tradeoffs and the best data store for the job.
  • Adaptation – Formulate the architectural strategy for ease of use for data consumers via data catalogs.
  • Optimizations of Models – Design and suggest the best models to use for both structured and unstructured data in the global data mesh architecture.
  • Communications – Manage timely and succinct communications to leadership and across teams to ensure accountability.
  • Agility – Lead with an agile-first mindset while balancing urgency and prioritization.
  • Data Modeling – Create the optimal data models to drive the data product.

We would love to know more about you because:

  • You are great at data modeling – You can easily take each use case and find the correct model for all sorts of data, whether batch or streaming, evaluate various data storage systems (including RDS, DynamoDB, S3, MongoDB, and others), and propose plans for optimization.
  • You really know your data stores – You know exactly when to make use of various structured, semi-structured, and unstructured types of data stores such as: relational, columnar, time series, in memory, graph, blob, key-value, wide column, document, geospatial, immutable ledger, and text search.
  • You really know your data warehouse schemas – You know exactly when it makes sense to implement star, snowflake, galaxy, and star cluster schemas.
  • You have experience with Databricks and Lakehouse – You have experience with lakehouse architecture and have used Databricks with Delta Lake.
  • You are super confident in your skills – You have designed and implemented modern data platforms with self-serve federated mesh concepts, preferably on Delta.
  • You can evangelize – You define and evangelize the adoption of Data Architecture capabilities and build architectural data processes that create repeatable and reusable capabilities to transform how the company collects, curates, and consumes data.
  • You see the vision – You understand both the big picture and the details.
  • You deal easily with change – You can demonstrate adaptability, resilience, and the ability to thrive amid change and ambiguity.
  • You really understand agile – You have a firm understanding of agile and have led true agile IT processes.
  • You are pragmatic – You know the right balance between progress and perfection.
  • You find your own work – You are strongly self-motivated and proactive, take initiative, and work independently with minimal direction.

What you need to have:

  • 7+ years of combined experience in big data architecture, design, and implementation of data-driven solutions, on-premises and in the cloud, with a minimum of 4 years in the cloud
  • 3+ years of hands-on experience building a data lake on AWS or, preferably, Databricks
  • Mastery of data modeling, various data store options, and warehouse schemas
  • Extensive hands-on experience implementing Lakehouse architecture using Databricks Data Engineering platform, SQL Analytics, Delta Lake, and Unity Catalog
  • Deep understanding of modern data architecture including processes for both batch and streaming (CDC + IoT) data as well as modern cloud-based data technology (e.g. Databricks / Spark, Snowflake)
  • Data Modeling experience and defining conceptual, logical, and physical data models
  • Ability to create great technical documentation and present it on demand
  • Deep understanding of structured and unstructured data architectures, data lakes, data warehousing, data orchestration as well as tools for data gathering and processing
  • Experience implementing and performance-tuning MPP databases (Snowflake, Redshift, EMR, Databricks)
  • Strong knowledge of, and experience architecting for, data governance, privacy, and regulations such as GDPR
  • Experience with orchestration tools like Airflow and data storage formats such as JSON, Parquet, etc.
  • Proficiency with AWS cloud services such as: EC2, EMR, RDS, Lambda, Glue, S3, Redshift
  • Programming languages: strong hands-on skills in Python, Shell, Scala, Go, or another language
  • Outstanding verbal and written communication
  • Outstanding presenting skills to both technical and executive audiences whether impromptu on a whiteboard or using presentations and demos.

Additional things that would make you a superstar:

  • 5 years of experience using Databricks
  • Experience reverse engineering Snowflake code to be implemented in Databricks
  • Experience with MDM architecture and implementation
  • Design and implementation of GDPR compliance on Databricks
  • Experience with Jira and Confluence
  • Experience ingesting from SAP
  • Experience architecting for Data Science and ML Ops use cases
  • In-depth understanding of Spark architecture, including Spark Core, Spark SQL, DataFrames, Spark Streaming, RDD caching, and Spark MLlib.

Education Minimum Requirements:

  • High School Diploma

Our Support Functions deliver services and make recommendations about ways to enhance our workplace and the culture of our organization. Our Support Functions include HR, Finance, Information Technology, Legal, Procurement, Administration, Facilities and Security.

Who we are …

We are known as Merck & Co., Inc., Rahway, New Jersey, USA in the United States and Canada and MSD everywhere else. For more than a century, we have been inventing for life, bringing forward medicines and vaccines for many of the world's most challenging diseases. Today, our company continues to be at the forefront of research to deliver innovative health solutions and advance the prevention and treatment of diseases that threaten people and animals around the world.

What we look for …

Imagine getting up in the morning for a job as important as helping to save and improve lives around the world. Here, you have that opportunity. You can put your empathy, creativity, digital mastery, or scientific genius to work in collaboration with a diverse group of colleagues who pursue and bring hope to countless people who are battling some of the most challenging diseases of our time. Our team is constantly evolving, so if you are among the intellectually curious, join us—and start making your impact today.

NOTICE FOR INTERNAL APPLICANTS

In accordance with Managers' Policy - Job Posting and Employee Placement, all employees subject to this policy are required to have a minimum of twelve (12) months of service in current position prior to applying for open positions.

If you have been offered a separation benefits package, but have not yet reached your separation date and are offered a position within the salary and geographical parameters as set forth in the Summary Plan Description (SPD) of your separation package, then you are no longer eligible for your separation benefits package. To discuss in more detail, please contact your HRBP or Talent Acquisition Advisor.

Expected salary range:

  • $130,960.00 - $206,200.00

Company Info.

Merck & Co.

Merck & Co., Inc. is a multinational pharmaceutical company headquartered in Kenilworth, New Jersey. It is named after the Merck family, which set up Merck Group in Germany in 1668. The company does business as MSD outside the United States and Canada.

  • Industry: Pharmaceuticals
  • No. of Employees: 66,400
  • Location: Kenilworth, NJ, USA


Merck & Co. is currently hiring for Senior Data Lake Architect jobs in Millsboro, DE, USA, with an average base salary of $130,960 - $206,200 / year.
