Dublin, Ireland
45 days ago
Staff Engineer, Reliability Insights & Excellence
Who we are

About Stripe

Stripe is a financial infrastructure platform for businesses. Millions of companies - from the world’s largest enterprises to the most ambitious startups - use Stripe to accept payments, grow their revenue, and accelerate new business opportunities. Our mission is to increase the GDP of the internet, and we have a staggering amount of work ahead. That means you have an unprecedented opportunity to put the global economy within everyone's reach while doing the most important work of your career.

About the team

Stripe’s infrastructure team powers businesses all over the world. Our customers trust us with their businesses, and every request that Stripe handles is critical. We process billions of dollars every year for millions of users, from the largest enterprises to a startup making their first sale. That is why both world-class reliability and seamless infrastructure scale are table stakes for supporting massive economic transactions for our customers.

We are the team leading Stripe’s reliability and scalability efforts, with a focus on delivering world-class availability and certifying Stripe’s systems to handle unprecedented levels of traffic during big events like Black Friday and Cyber Monday, as well as our merchants' key events. You can learn more about our contributions at https://stripe.com/newsroom/news/bfcm2023. We own the core preventative reliability platforms and tools used by infrastructure and product teams across the company to build resiliency into their systems and scale them to handle the projected peak load.

What you’ll do

We’re looking for an experienced distributed systems engineer with outstanding technical and leadership skills, strong collaboration skills, and a huge passion for customers to help deliver the foundation of our reliability infrastructure and to work with teams across the entire stack on world-class reliability solutions. In this role, you’ll not only be in charge of designing, implementing, and testing infrastructure components, but you’ll also play an influential role in enabling engineering teams to make their services more reliable by identifying, creating, and deploying engineering practices, processes, and solutions.

You will:

- Design, build, and maintain distributed cloud infrastructure and platform services
- Debug production issues across services and various levels of the stack, and work on scaling, automation, reliability, and observability of infrastructure services
- Mentor other engineers in the organization and review code and design documentation
- Participate in roadmap planning and prioritization

Who you are

We're looking for someone who meets the minimum requirements to be considered for the role. If you meet these requirements, you are encouraged to apply. The preferred qualifications are a bonus, not a requirement.

Minimum requirements:

- 10+ years of engineering experience or equivalent combined work experience reflecting domain expertise
- Hands-on experience designing, building, and operating large-scale distributed systems, identifying shortcomings and optimization opportunities, and making data-driven cost-performance tradeoffs to influence design decisions
- Demonstrated experience leading initiatives spanning multiple teams and leveraging deep domain expertise to influence tech roadmap planning and execution
- Demonstrated ability to collaborate effectively across multiple teams and stakeholders to drive business outcomes
- Experience mentoring and investing in the development of engineers and peers

Preferred qualifications:

- Genuine interest and/or experience in debugging and troubleshooting complex distributed systems problems
- Experience in fault modeling and tolerance, chaos engineering, and load testing
- Familiarity with the common patterns and practices for building reliable software
- Experience with C, C++, Go, Ruby, and/or Java



