Lead Resiliency Engineer
Pearson
About the Role
We are looking for a Resiliency Engineering Lead to enhance system resilience by proactively identifying and addressing potential failures. This role demands expertise in resiliency engineering and performance engineering, along with proficiency in Chaos Engineering tools such as Harness, LitmusChaos, and other native or open-source solutions. You will work with containerized and distributed architectures across AWS, Azure, and GCP, designing and executing chaos experiments, integrating resiliency testing into CI/CD, and collaborating with SRE, DevOps, and Performance Engineering teams. Additionally, you will implement observability solutions, establish best practices, and promote a failure-driven learning culture to ensure high availability, fault tolerance, and self-healing capabilities for critical systems.
Key Responsibilities:
+ Develop strategies for building highly available and fault-tolerant systems by identifying single points of failure and addressing them.
+ Collaborate with cross-functional teams to design and execute experiments that simulate real-world failures, using tools such as Chaos Monkey, Gremlin, or Litmus.
+ Utilize SRE principles to enhance system reliability and performance.
+ Work with cloud platforms including AWS, Azure, and GCP to deploy and manage resilient applications.
+ Collaborate with cross-functional teams, including DevOps, SREs, and Development teams, to integrate resiliency best practices into the software development lifecycle.
+ Lead post-mortem analysis of major incidents to identify root causes and create action plans to mitigate future risks.
+ Provide mentorship and technical guidance to engineers in the team.
+ Use New Relic, AWS X-Ray, and logs to track system behavior and find issues early.
+ Build observability dashboards using the LGTM stack (Loki, Grafana, Tempo, Mimir), and implement distributed tracing and instrumentation.
+ Contribute actively to the Center of Excellence (CoE), continuous improvement, innovation, and research.
+ Optimize cloud and large-scale systems: improve the efficiency of cloud, microservices, and containerized environments.
+ Work effectively in a team and communicate clearly with all stakeholders.
Preferred Skills:
+ Proficiency with Chaos Engineering tools such as Chaos Monkey, Gremlin, Litmus, or equivalent.
+ Experience with AWS cloud platforms and technologies.
+ Exposure to Azure and GCP cloud platforms and technologies.
+ Proven experience with containerized and distributed architectures.
+ Experience with JMeter, New Relic, and AWS CloudWatch.
+ Exposure to GitHub, Grafana, and similar tools.
+ Experience with CI/CD pipelines and Infrastructure as Code (e.g., Terraform, Ansible).
+ Strong scripting and programming skills (e.g., Python, Go, or Java).
+ Excellent problem-solving skills and a proactive approach to identifying and mitigating risks.
+ Strong communication and collaboration skills to work effectively with stakeholders at all levels.
Qualifications:
+ Bachelor’s degree in engineering or a related field.
+ 10+ years of experience in performance and resiliency engineering.
+ Strong experience with performance and resiliency engineering tools and methodologies.
+ Experience with monitoring and tuning of complex systems.
+ Excellent analytical, problem-solving, and communication skills.
+ Ability to work effectively in a fast-paced, collaborative environment.
Why Join Us?
+ Work in a cutting-edge technology environment with a talented and passionate team.
+ Opportunities for professional growth and career advancement.
+ Comprehensive benefits package and competitive salary.
+ Be part of a company that values diversity, inclusion, and innovation.
If you are a dedicated performance engineering leader looking for an exciting opportunity to make a significant impact, we would love to hear from you. Apply now to join our team and help us drive performance & reliability excellence!
**Who we are:**
At Pearson, our purpose is simple: to help people realize the life they imagine through learning. We believe that every learning opportunity is a chance for a personal breakthrough. We are the world's lifelong learning company. For us, learning isn't just what we do. It's who we are. To learn more: We are Pearson.
Pearson is an Affirmative Action and Equal Opportunity Employer and a member of E-Verify. We want a team that represents a variety of backgrounds, perspectives and skills. The more inclusive we are, the better our work will be. All employment decisions are based on qualifications, merit and business need. All qualified applicants will receive consideration for employment without regard to race, ethnicity, color, religion, sex, sexual orientation, gender identity, gender expression, age, national origin, protected veteran status, disability status or any other group protected by law. We strive for a workforce that reflects the diversity of our communities.
If you are an individual with a disability and are unable or limited in your ability to use or access our career site as a result of your disability, you may request reasonable accommodations by emailing TalentExperienceGlobalTeam@grp.pearson.com.
Note that the information you provide will stay confidential and will be stored securely. It will not be seen by those involved in making decisions as part of the recruitment process.
**Job:** ENGINEERING
**Organization:** Corporate Strategy & Technology
**Schedule:** FULL\_TIME
**Workplace Type:** Hybrid
**Req ID:** 18742