Staff Reliability Engineer
Merck & Co.
This listing was originally posted on Merck & Co.'s careers page. Formulate is an equal opportunity job aggregator and is not involved in the hiring process. Where salary information is estimated, it is derived from BLS industry benchmarks and may differ from actual compensation.
Job Description
Join our company as we transform and innovate. Our Digital Platforms & Services organization delivers reliable, scalable, and resilient digital solutions that support critical scientific and business outcomes across our global enterprise.
We are seeking a Staff Reliability Engineer with strong technical expertise in Site Reliability Engineering (SRE), Observability, and Resilience. In this role, you will partner with engineering teams to implement and operationalize reliability practices, ensuring systems are designed, built, and operated with reliability in mind.
You will contribute to the adoption of enterprise reliability standards, support the implementation of Service Level Objectives, and help improve system performance and availability. This role is hands-on, execution-focused, and plays a key part in advancing reliability maturity across the organization.
Partner with application and platform teams to embed reliability into system design, development, and operations
Support implementation and operationalization of Service Level Objectives and reliability indicators
Contribute to improving observability coverage across logs, metrics, traces, and events
Apply reliability patterns such as fault isolation, failover, and recovery mechanisms in collaboration with engineering teams
Participate in and support improvements to the incident lifecycle, including detection, response, root cause analysis, and follow-up actions
Assist in identifying reliability risks and performance bottlenecks and contribute to remediation efforts
Support continuous improvement initiatives focused on reducing incident volume and improving system stability
Apply established enterprise standards for observability, resilience engineering, and Service Level Objectives
Support adoption of reliability practices across teams through hands-on guidance and collaboration
Contribute feedback to help evolve reliability frameworks and tooling
Develop and enhance automation for incident response, monitoring, and operational workflows
Leverage existing platforms (e.g., observability tools, incident management systems) to improve efficiency and visibility
Utilize AI-enabled capabilities where appropriate to support diagnostics and operational workflows under defined governance
Work closely with product, platform, and ITSM teams to align on reliability improvements
Participate in cross-team initiatives focused on improving system resilience and operational maturity
Contribute to knowledge sharing within the reliability engineering community
Experience in one or more of the following: system integration, software development, system administration, or operations engineering
Familiarity with software development life cycle (SDLC) and production support models
Understanding of monitoring, observability, and performance optimization concepts
Experience supporting applications in cloud and/or on-premises environments
Working knowledge of CI/CD pipelines and deployment practices
Basic understanding of incident management and root cause analysis processes
Knowledge of system reliability principles, including availability and performance engineering
Strong problem-solving skills with a focus on continuous improvement
Ability to collaborate effectively across engineering and operations teams
Experience with observability platforms and reliability tooling ecosystems
Exposure to Service Level Objectives and reliability metrics frameworks
Familiarity with automation and scripting (e.g., Python, Bash, or similar)
Understanding of resilience patterns and distributed systems concepts
Awareness of AI-assisted operational tools and workflows
Required Skills:
Bash (Programming Language), Data Engineering, Data Visualization, Design Applications, Incident Management, Incident Response, Monitoring Control, Performance Optimizations, Production Support, Python (Programming Language), Reliability Engineering, Software Configurations, Software Development, Software Development Life Cycle (SDLC), Software Integration, Software Lifecycle Management (SLM), Solution Architecture, System Administration, System Designs, System Integration, Testing
Preferred Skills:
Search Firm Representatives Please Read Carefully
Merck & Co., Inc., Rahway, NJ, USA, also known as Merck Sharp & Dohme LLC, Rahway, NJ, USA, does not accept unsolicited assistance from search firms for employment opportunities. All CVs / resumes submitted by search firms to any employee at our company without a valid written search agreement in place for this position will be deemed the sole property of our company. No fee will be paid in the event a candidate is hired by our company as a result of an agency referral where no pre-existing agreement is in place. Where agency agreements are in place, introductions are position specific. Please, no phone calls or emails.
Employee Status: Regular
Relocation:
VISA Sponsorship:
Travel Requirements:
Flexible Work Arrangements: Hybrid
Shift:
Valid Driving License:
Hazardous Material(s):
Job Posting End Date: 05/30/2026
*A job posting is effective until 11:59:59 PM on the day BEFORE the listed job posting end date. Please ensure you apply to a job posting no later than the day BEFORE the job posting end date.