Sant Cugat del Vallès, Barcelona, Spain
19 days ago
Senior Incident Manager - Site Reliability Engineering (SRE)

Roche fosters diversity, equity and inclusion, representing the communities we serve. When dealing with healthcare on a global scale, diversity is an essential ingredient to success. We believe that inclusion is key to understanding people’s varied healthcare needs. Together, we embrace individuality and share a passion for exceptional care. Join Roche, where every voice matters.

The Position

Are you ready to take on a pivotal role that ensures the stability and reliability of our cutting-edge IT services and systems? As a Site Reliability Engineering (SRE) Incident Manager, you’ll be at the helm of incident management, orchestrating swift and effective responses to ensure our digital infrastructure remains available, performant, and secure.

Your expertise will be critical in maintaining exceptional standards of quality and performance for our customers. By working collaboratively with cross-functional teams, you will drive improvements in monitoring, alerting, and recovery processes. Your analytical skills will help identify patterns from incident data, enabling you to propose and implement effective strategies to prevent future incidents.

This position offers a unique opportunity to make a significant impact on the reliability and stability of our services, contributing to an exceptional experience for our customers and helping to shape the future of our IT infrastructure. If you are passionate about enhancing system resilience and thrive in a dynamic, collaborative environment, we would love to hear from you.

Who We Are 

At Roche, we are passionate about transforming patients’ lives, and we are bold in both decision and action - we believe that good business means a better world. That is why we come to work every single day. We commit ourselves to scientific rigor, unassailable ethics, and access to medical innovations for all. We do this today to build a better tomorrow. 

Roche is strongly committed to a diverse and inclusive workplace. We strive to build teams that represent a range of backgrounds, perspectives, and skills. Embracing diversity enables us to create a great place to work and to innovate for patients.

Roche is building a global Site Reliability Engineering (SRE) team that will support commercial and internal solutions. This team will have the mindset of building and creating engineering solutions to solve a broad spectrum of problems.

Your Core Responsibilities:

Be the Incident Maestro: Lead the lifecycle of incidents from initial detection to successful resolution and post-incident review.

Team Leadership: Coordinate and guide response teams to resolve incidents quickly and efficiently.

Process Implementation: Develop, implement, and refine incident management processes and procedures.

Root Cause Analysis: Conduct regular reviews to identify root causes and drive the implementation of corrective actions.

Flexible Scheduling: Work on-call outside normal working hours and on weekends, as scheduled, to ensure continuous support.

Integration Management: Oversee the onboarding of new products into our support ecosystem.

Clear Communication: Provide consistent and clear updates to technical teams, business users, and management.

Collaborative Efforts: Partner with engineering, cybersecurity, DevOps, product managers, test engineers, support teams, and administrators to enhance system reliability.

Continuous Improvement: Identify and implement opportunities to enhance incident management processes and tools.

Product Alignment: Align SRE activities with product release planning to streamline product adoption.

Team Builder: Actively contribute to the growth and development of the SRE team's capabilities, nurturing a stronger, more inclusive, and resilient team.

Who You Are:

Educational Background: Bachelor’s degree in Computer Science, Engineering, or a related field.

Certifications: Relevant certifications such as ITIL are a plus.

Experience: 5+ years in IT incident management or related fields.

Incident Management Mastery: Deep understanding of incident prioritization, escalation processes, and service level management (SLA/SLO/SLI).

Troubleshooting: Proficient troubleshooting skills, especially in cloud and distributed-system environments.

Stakeholder Management: Proven ability to engage, influence, and build relationships with stakeholders across various levels.

Technical Knowledge: Familiarity with cloud technologies (particularly AWS), APIs, microservices, DevOps, cybersecurity, and software release management.

Proactive Approach: Self-motivated to improve system reliability and operational efficiencies.

Technical Tools: Hands-on experience with JIRA and ServiceNow for incident tracking and documentation.

Blameless Postmortems: Experience driving the war room during P1 incidents and leading or participating in blameless postmortems.

Communication and Teamwork: Excellent communication, teamwork, and documentation skills.

Diversity and Inclusion: We value and encourage candidates from diverse backgrounds and experiences, believing that diverse perspectives drive innovation and success.

Why Join Us?

By joining our team, you will be part of a dynamic environment where your contributions will directly impact the resilience and reliability of our services. You will have opportunities for professional growth and the ability to collaborate with industry leaders. Let’s drive the future of IT stability together, ensuring an exceptional experience for our customers.

Ready to make a difference? Apply now to be our next SRE Incident Manager and help us build a more reliable future!

Who we are

At Roche, more than 100,000 people across 100 countries are pushing back the frontiers of healthcare. Working together, we’ve become one of the world’s leading research-focused healthcare groups. Our success is built on innovation, curiosity and diversity.

Roche is an Equal Opportunity Employer.
