Stripe is a financial infrastructure platform for businesses. Millions of companies—from the world’s largest enterprises to the most ambitious startups—use Stripe to accept payments, grow their revenue, and accelerate new business opportunities. Our mission is to increase the GDP of the internet, and we have a staggering amount of work ahead. That means you have an unprecedented opportunity to put the global economy within everyone’s reach while doing the most important work of your career.
About the team
The Machine Learning Infrastructure group at Stripe aims to provide state-of-the-art infrastructure and support for building and operationalizing AI/ML models for all business verticals within the company. These include, but are not limited to, models that mitigate risks across Stripe’s products and services globally, and models that help our customers fight fraud by leveraging Stripe’s user-facing products like Radar and Identity. ML is a top priority for Stripe in the coming years. With the phenomenal developments happening in the field of AI, we are positioned to accelerate the adoption of AI/ML across all parts of the company by building highly scalable and reliable foundational infrastructure.
What you’ll do
You will work closely with machine learning engineers, data scientists, and platform infrastructure teams to build powerful, flexible, and user-friendly systems that substantially increase ML-Ops velocity across the company.
Responsibilities
Building powerful, flexible, and user-friendly infrastructure that powers all of ML at Stripe.
Designing and building fast, reliable services for ML model training and serving, and scaling that infrastructure across multiple regions.
Creating services and libraries that enable ML engineers at Stripe to seamlessly transition from experimentation to production across Stripe’s systems.
Pairing with product teams and ML engineers to develop easy-to-use infrastructure for production ML models.

Who you are
We’re looking for people with a strong background or interest in building successful products or systems. You’re passionate about solving business problems and making an impact, you’re comfortable dealing with lots of moving pieces, and you’re comfortable learning new technologies and systems. Many of our engineers work remotely from both the US and Canada, and we’d be happy to talk to you about the possibility of working remotely.
It’s not expected that any single candidate would have expertise across all of these areas. For instance, we have wonderful team members who are really focused on their customers’ needs and building amazing user experiences, but didn’t come in with as much systems knowledge.
Minimum requirements
3-5 years of experience building software applications in large-scale distributed systems.
A strong sense of curiosity and a desire to both learn and share knowledge with your peers. We like to work in a collaborative environment and hope you do too.
A solid engineering background and experience with infrastructure and/or distributed systems. You’ll work mostly in Python and Java, but we care more about your general engineering skills than your knowledge of a specific language.
Familiarity with the full life cycle of software development, from design and implementation to testing and deployment.
Experience building and maintaining high-availability, low-latency systems, especially with respect to reliability, testing, and observability.
A sense of pragmatism: you know when to aim for the ideal solution and when to adjust course.

Preferred qualifications
Over 2 years of experience supporting Machine Learning Infrastructure.
Experience optimizing the end-to-end performance of distributed systems.
Experience training and shipping machine learning models to production to solve critical business problems.