Senior Specialist, GSF DnA Data Engineer
Merck & Co.
This listing was originally posted on Merck & Co.'s careers page. Formulate is an equal opportunity job aggregator and is not involved in the hiring process. Where salary information is estimated, it is derived from BLS industry benchmarks and may differ from actual compensation.
Job Description
The Opportunity
Based in Hyderabad, join a global healthcare biopharma company and be part of a 130-year legacy of success backed by ethical integrity, forward momentum, and an inspiring mission to achieve new milestones in global healthcare.
Be part of an organisation driven by digital technology and data-backed approaches that support a diversified portfolio of prescription medicines, vaccines, and animal health products.
Drive innovation and execution excellence. Be a part of a team with passion for using data, analytics, and insights to drive decision-making, and which creates custom software, allowing us to tackle some of the world's greatest health threats.
Our Technology Centers focus on creating a space where teams can come together to deliver business solutions that save and improve lives. An integral part of our company’s IT operating model, Tech Centers are globally distributed locations where each IT division has employees to enable our digital transformation journey and drive business outcomes. These locations, in addition to the other sites, are essential to supporting our business and strategy.
A focused group of leaders in each Tech Center helps to ensure we can manage and improve each location, from investing in the growth, success, and well-being of our people, to making sure colleagues from each IT division feel a sense of belonging, to managing critical emergencies. Together, we must leverage the strength of our team to collaborate globally, optimize connections, and share best practices across the Tech Centers.
Role Overview:
We are looking for a Senior Data Engineer to design and build trusted, scalable, and cost‑efficient data platforms on AWS and Databricks. You will lead hands‑on development of batch and streaming data pipelines (ETL/ELT), implement robust data quality and observability practices, and deliver dimensional models that power analytics and reporting. In this role you will also act as a technical mentor, setting engineering standards, coaching other data engineers through design reviews and pair programming, and partnering with stakeholders to translate business needs into well-governed datasets. We are forward-looking and continue evolving our ecosystem with modern patterns such as lakehouse, data mesh, and data fabric.
What will you do in this role:
Design, develop, and operate end-to-end data pipelines (ETL/ELT) to ingest data from diverse sources into an AWS-based data lakehouse and data warehouse (batch and/or streaming).
Build curated datasets using strong data modeling practices, including dimensional modeling (star/snowflake), SCD patterns, and conformed dimensions, to support BI and self-service analytics.
Partner with product managers, analysts, and data scientists to understand requirements, define source-to-target mappings, and deliver datasets that are accurate, discoverable, and reusable.
Define and implement data quality controls (validation rules, reconciliations, anomaly checks), data contracts, and SLAs; partner with governance to maintain catalog, lineage, and business/technical metadata.
Implement orchestration, logging, monitoring, and alerting to ensure reliable operations (data observability, pipeline health, backfills, and incident triage).
Apply engineering best practices: automated unit/integration tests, code reviews, and CI/CD to promote changes safely across environments.
Develop transformations on Databricks using Python/PySpark and Spark SQL; optimize jobs for performance and cost (partitioning, file sizing, caching, tuning).
Write and optimize complex SQL for analysis, transformations, and warehouse/lakehouse consumption patterns.
Build on AWS using services such as S3, IAM, Glue, Lambda, Step Functions, EMR/ECS/Fargate, and CloudWatch; implement secure access patterns and least-privilege principles.
Use Infrastructure as Code (Terraform) to provision and manage cloud resources; promote reusable modules and automated deployments.
Package and run workloads using Docker where appropriate, ensuring repeatable environments for development and deployment.
Use GitHub for version control; follow branching strategies (e.g., trunk-based or GitFlow) and maintain high-quality pull requests.
Process large datasets using PySpark and lakehouse formats (e.g., Delta/Parquet), applying best practices for reliability and scalability.
Create reproducible prototypes and analyses using notebooks when appropriate, then productionize solutions with proper packaging, testing, and documentation.
Work in an Agile environment (Scrum/Kanban), contributing to sprint planning, estimation, demos, and continuous improvement.
Create and maintain technical documentation (data flows, runbooks, data dictionaries) and contribute to team standards and playbooks.
Coach and mentor other data engineers through onboarding, pairing, code/design reviews, and knowledge-sharing sessions; raise the team’s engineering bar.
Provide technical leadership by proposing patterns/standards (lakehouse, warehousing, data quality, CI/CD), influencing the roadmap, and communicating trade-offs clearly to stakeholders.
What you should have (Required):
Primary Skills: AWS, Databricks, Python, PySpark
Secondary Skills: CI/CD, SQL
8+ years of experience in Data Engineering, building production-grade data platforms and pipelines.
Strong AWS experience, including data services (S3, Glue, Lambda, Step Functions, EMR/ECS/Fargate) and operational tooling (CloudWatch).
Proficient in Python and PySpark (Spark SQL), with strong data engineering fundamentals.
Deep understanding of data warehousing and lakehouse concepts, including dimensional modeling (star/snowflake), SCDs, and performance patterns.
Hands-on with Databricks Lakehouse on AWS (e.g., Delta Lake) and at least one cloud data warehouse (e.g., Redshift or similar).
Experience working in Agile delivery teams, collaborating effectively across roles, and communicating clearly with stakeholders (requirements, trade-offs, and timelines).
Strong focus on data quality and operational excellence: testing, monitoring, alerting, root-cause analysis, and continuous improvement.
Strong SQL skills (complex joins, window functions, query tuning) and ability to translate business logic into reliable transformations.
Proficiency with GitHub, CI/CD pipelines, and DevOps fundamentals; familiarity with Docker and Infrastructure as Code (Terraform).
Demonstrated ability to mentor and coach engineers (guidance, feedback, best practices) while remaining hands-on in delivery.
Bachelor’s degree (or equivalent experience) in Computer Science, Engineering, or a related field.
Nice to have:
AWS certification (Developer, Solutions Architect, Data Analytics) and/or Databricks certification.
Experience with data transformation frameworks (e.g., dbt) and workflow orchestration tools (e.g., Airflow).
Experience building and operating data products (product thinking, domain-aligned datasets, documentation, SLAs, and adoption/usage metrics).
Experience with data governance and access control tools such as Collibra (catalog/lineage/metadata) and Immuta (policy-based data access).
Streaming data experience (e.g., Kafka, Kinesis) and building near real-time pipelines.
Exposure to modern data architecture patterns such as data mesh and data fabric, and experience defining reusable platform capabilities.
Our technology teams operate as business partners, proposing ideas and innovative solutions that enable new organizational capabilities. We collaborate internationally to deliver services and solutions that help everyone be more productive and enable innovation.
Who we are
We are known as Merck & Co., Inc., Rahway, New Jersey, USA in the United States and Canada and MSD everywhere else. For more than a century, we have been bringing forward medicines and vaccines for many of the world's most challenging diseases. Today, our company continues to be at the forefront of research to deliver innovative health solutions and advance the prevention and treatment of diseases that threaten people and animals around the world.
What we look for
Imagine getting up in the morning for a job as important as helping to save and improve lives around the world. Here, you have that opportunity. You can put your empathy, creativity, digital mastery, or scientific genius to work in collaboration with a diverse group of colleagues who pursue and bring hope to countless people who are battling some of the most challenging diseases of our time. Our team is constantly evolving, so if you are among the intellectually curious, join us—and start making your impact today.
#HYDIT2025
Required Skills:
Academic Quality Improvement Program (AQIP), Automated Testing, Branching Strategy, Business Intelligence (BI), Cloud Resource Management, Computer Science, Database Administration, Data Engineering, Data Governance, Data Management, Data Modeling, Data Visualization, Design Applications, GitHub, Information Management, Software Development, Software Development Life Cycle (SDLC), Spark SQL, Sprint Planning, System Designs, Technical Leadership, Technical Writing Documentation, Warehousing
Preferred Skills:
Current Employees apply HERE
Current Contingent Workers apply HERE
Search Firm Representatives Please Read Carefully
Merck & Co., Inc., Rahway, NJ, USA, also known as Merck Sharp & Dohme LLC, Rahway, NJ, USA, does not accept unsolicited assistance from search firms for employment opportunities. All CVs / resumes submitted by search firms to any employee at our company without a valid written search agreement in place for this position will be deemed the sole property of our company. No fee will be paid in the event a candidate is hired by our company as a result of an agency referral where no pre-existing agreement is in place. Where agency agreements are in place, introductions are position specific. Please, no phone calls or emails.
Employee Status: Regular
Relocation:
VISA Sponsorship:
Travel Requirements:
Flexible Work Arrangements: Hybrid
Shift:
Valid Driving License:
Hazardous Material(s):
Job Posting End Date: 05/26/2026
*A job posting is effective until 11:59:59 PM on the day BEFORE the listed job posting end date. Please ensure you apply to a job posting no later than the day BEFORE the job posting end date.