Herndon, VA, US
9 hours ago
Sr Availability TPM, DCE Availability
AWS Infrastructure Services owns the design, planning, delivery, and operation of all AWS global infrastructure. In other words, we’re the people who keep the cloud running. We support all AWS data centers and all of the servers, storage, networking, power, and cooling equipment that ensure our customers have continual access to the innovation they rely on. We work on the most challenging problems, with thousands of variables impacting the supply chain — and we’re looking for talented people who want to help.

You’ll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers, and other vital roles. You’ll collaborate with people across AWS to help us deliver the highest standards for safety and security while providing seemingly infinite capacity at the lowest possible cost for our customers. And you’ll experience an inclusive culture that welcomes bold ideas and empowers you to own them to completion.

The Data Center Engineering Availability team is looking for an Availability TPM to serve as a technical resource and leader. The builder will be responsible for driving large-scale global projects and initiatives that directly impact capacity delivery and availability for our customers.
If you like to interact with customers and a diverse array of stakeholders, track and analyze performance data, and develop new processes and methods to drive improvements, we’d like to meet you. Your work will help ensure high availability for AWS customers.

The Firmware and Settings program objective is to establish an AWS-Approved Global Firmware & Settings Repository for critical infrastructure equipment, a companion As-Left repository, a comparator to identify non-compliant infrastructure equipment requiring remediation, and a tool to issue corrective actions.

The program comprises three main areas: data collection, tool development, and process development.

Data collection involves defining “as-expected” firmware and settings for critical infrastructure equipment, and auditing and remotely polling the “as-left” firmware and settings values on equipment.

Tool development involves building new tools and modifying existing ones to satisfy the requirements of the FSM program.

Process development focuses on customer-facing processes, internal mechanisms, and revising existing processes. Interim and permanent processes will define how Commissioning, Data Center Engineering Operations (DCEO), and various Data Center Engineering (DCE) teams access and update firmware and settings data. Internal mechanisms will define how DCSA and DCE monitor, triage, and issue corrective actions for non-compliant equipment. Finally, existing configuration management processes will be revised to ensure changes to approved design configurations are captured in the AWS-Approved Global Firmware & Settings Repository.
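To make the comparator idea concrete, here is a minimal Python sketch of how as-expected values might be checked against as-left values. It is an illustration only, not the program's actual tooling: the data shapes, the `Finding` record, the `find_noncompliant` function, and all equipment names and values are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Finding:
    """One mismatch between the approved baseline and the observed value (hypothetical record)."""
    equipment_id: str
    key: str                 # "firmware" or the name of a setting
    expected: str
    as_left: Optional[str]   # None means the value was not collected


def find_noncompliant(as_expected, as_left):
    """Return a Finding for every as-left value that differs from its baseline.

    Assumed shapes (not from the program itself):
      as_expected: {model: {key: approved_value}}
      as_left:     {equipment_id: (model, {key: observed_value})}
    """
    findings = []
    for equipment_id, (model, observed) in sorted(as_left.items()):
        baseline = as_expected.get(model, {})
        for key, expected in sorted(baseline.items()):
            actual = observed.get(key)
            if actual != expected:
                findings.append(Finding(equipment_id, key, expected, actual))
    return findings


if __name__ == "__main__":
    # Hypothetical example data.
    approved = {"ups-model-a": {"firmware": "2.4.1", "transfer_delay_s": "5"}}
    observed = {
        "ups-001": ("ups-model-a", {"firmware": "2.4.1", "transfer_delay_s": "5"}),
        "ups-002": ("ups-model-a", {"firmware": "2.3.0"}),
    }
    for finding in find_noncompliant(approved, observed):
        print(finding)
```

In this sketch a missing as-left value is treated as non-compliant, on the assumption that equipment that has not been audited or polled cannot be confirmed against the approved baseline. Each finding would then feed whatever mechanism issues corrective actions.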

Key job responsibilities
Our Availability TPMs are individuals who demonstrate initiative and proactively seek solutions to problems.
• Own and deliver large-scale and complex global engineering and operational programs and initiatives that directly impact capacity delivery and availability for our customers.
• Partner with and influence the direction of multiple engineering and operations teams within and outside of AWS to deliver complex, cross-functional projects.
• Mentor, train, and develop career progression for members of the organization.
• Obsess over team learning and development, covering both technical/functional skills and soft skills (critical thinking, emotional intelligence, and adaptability).
• Develop, improve, and share operational best practices across the region and with peers globally.


About the team
Why AWS
Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.

Diverse Experiences
AWS values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.

Work/Life Balance
We value work-life harmony. Achieving success at work should never require sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve.

Inclusive Team Culture
AWS values curiosity and connection. Our employee-led and company-sponsored affinity groups promote inclusion and empower our people to take pride in what makes us unique. Our inclusion events foster stronger, more collaborative teams. Our continual innovation is fueled by the bold ideas, fresh perspectives, and passionate voices our teams bring to everything we do.

Mentorship and Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship, and other career-advancing resources here to help you develop into a better-rounded professional.