Job Description
Our Company is where we transform vision into reality. It's where ideas become technologies, and cutting-edge technologies become solutions for animal care and management.
We support farmers by providing real-time, actionable information to help them manage their herds. We provide pet owners with smart devices and data that give them a better understanding of their pets’ activity and health needs, enriching relationships. We help conservationists safeguard natural environments and wildlife.
Drawing on decades of technological research and development experience across many markets, technologies, and species, along with established development environments and quality assurance procedures, we are always inventing new ways to look after the health and well-being of animals. That experience keeps us ahead of the curve, with advanced technological solutions that range from enhancing the precious bond between people and their pets to advancing animal healthcare and wildlife preservation.
Job Description:
We are looking for a Site Reliability Engineer (SRE) to lead and establish the SRE domain within the organization.
You will be responsible for ensuring the reliability, availability, and performance of our systems and applications. You will collaborate closely with Software Engineers, DevOps teams, Security teams, and Program Managers to build and maintain scalable infrastructure, monitor critical systems, and automate repetitive tasks to improve efficiency and uptime. Your primary goal is to maintain an optimal balance between system stability, feature development, and fast delivery cycles.
Key Responsibilities:
Monitoring: Monitor AWS infrastructure (Azure experience is an advantage) with DataDog or an equivalent tool, based on defined KPIs.
Proven experience with defining efficient alerts, synthetic tests, analyzing logs (error detection), detecting issues using DataDog, managing SLIs and SLOs, leveraging NOC activity, and defining flows.
Architecture Understanding:
Infrastructure: In-depth understanding of designing distributed systems on cloud-based environments and microservices.
Business Logic: Understand complex cloud product architectures, including event-driven architecture, with a focus on how data flows and messages interact between services.
Continuous Improvement & Documentation:
Develop and maintain technical documentation for processes, procedures, and systems. Lead incident management and Root Cause Analysis (RCA) when issues arise, conduct post-incident reviews, and implement preventative measures.
Infrastructure & Cloud:
Proven experience with AWS services such as API Gateway, Lambda, SQS, SNS, S3, RDS, Redis Cache, Kinesis, Global Accelerator, CloudFront, and Route 53, along with an understanding of the most common cloud services used in production environments and of Infrastructure as Code (IaC) using Terraform.
Automation and CI/CD:
Experience with Azure DevOps, GitHub Actions, Argo, GitOps, and artifact management using Artifactory. Ability to review pipelines and Helm charts (or equivalent) and to understand automation processes. Familiarity with Crossplane.
Security (Preferred):
Experience reviewing Web Application Firewall (WAF) rules, applying rate limiting to services and infrastructure based on data analysis, and collaborating with DevSecOps.
Personal Requirements:
Bachelor’s degree in computer science or equivalent proven experience.
5+ years in a hands-on DevOps or SRE position.
Strong communication skills to align, document, and share knowledge across cross-functional teams are a must.
Ability to work under high load and to lead sensitive situations and investigations, especially when customer-facing services are impacted.
Strong motivation for continuous learning and adoption of new technologies.
Excellent problem-solving skills with a proactive approach.
Search Firm Representatives Please Read Carefully
Merck & Co., Inc., Rahway, NJ, USA, also known as Merck Sharp & Dohme LLC, Rahway, NJ, USA, does not accept unsolicited assistance from search firms for employment opportunities. All CVs / resumes submitted by search firms to any employee at our company without a valid written search agreement in place for this position will be deemed the sole property of our company. No fee will be paid in the event a candidate is hired by our company as a result of an agency referral where no pre-existing agreement is in place. Where agency agreements are in place, introductions are position specific. Please, no phone calls or emails.
Employee Status:
Regular
Relocation:
VISA Sponsorship:
Travel Requirements:
Flexible Work Arrangements:
Hybrid
Shift:
Valid Driving License:
Hazardous Material(s):
Required Skills:
Preferred Skills:
Capacity Management, Change Controls, Configuration Management (CM), Network Design, Release Management, Software Development, Software Development Life Cycle (SDLC), Solution Architecture, System Administration, Systems Integration
Job Posting End Date:
06/1/2025
*A job posting is effective until 11:59:59 PM on the day BEFORE the listed job posting end date. Please ensure you apply to a job posting no later than the day BEFORE the job posting end date.
Requisition ID: R317892