Senior Technical Program Manager, GenAI/ML GPU Orchestration Service, Alloy Greenland
Amazon.com
GenAI is revolutionizing every industry in the world, yet we are still at the very beginning. As the appetite for GenAI continues to grow exponentially, the demand for GPU instances grows with it, creating one of the biggest problems of our generation: how do we get more GPU capacity? To solve it as an industry, we need to (1) scale up the production of GPUs, and (2) optimize the usage of existing scarce GPU resources to do more with less.
The second bucket is where our team comes into the picture. Large cloud providers are already spending billions on GPU resources, and those resources are distributed to teams in a siloed fashion, leaving teams unable to utilize them to their fullest extent, whether due to peak/off-peak seasonality or workloads completing sooner than expected during vacation. When we looked at the data, we saw 15-30% idle capacity across GPU allocations, which presented a huge opportunity for us to tackle.
Alloy Greenland is a new team, started at the beginning of the year and operating startup-style with true Day 1 spirit. We are on a mission to accelerate AI/ML innovation for all teams across Amazon and, by extension through our partnership with AWS SageMaker, for the rest of the world.
If you love working backwards from customers, building 0-1, and having exposure to senior leadership, and if ultimately making a dent in the world excites you, this is the right place for you!
The Alloy Greenland team is part of the Alloy organization, the central efficiency org that drives cost savings for all service teams within Amazon through efficient use of AWS resources as they build and operate their services. This team is special in 3 ways: (1) business impact - we have a proven record of cost savings on the order of a hundred million dollars annually, and we have earned trust and reputation from service teams, partner teams (business and technical), and senior leadership; (2) technical complexity - our system is not a single product but the whole of Amazon, and we create central efficiency solutions that save costs for thousands of internal services with minimal or zero effort from their engineers; (3) professional network with high leadership visibility - we work with important stakeholders across the company. You will be able to work with brilliant people while getting exposure to senior leadership across organizations.
In this role, you will:
* Work closely with a fast-paced startup engineering team building 0-1 products and services that delight our customers
* Work closely with AI/ML customers across Amazon to help establish and improve mechanisms and processes that maximize innovation and execution
* Solve performance and efficiency problems that manifest at scale.
* Establish reporting and page-0 metric strategies for the VP/SVP level
* Collaborate with service teams to identify inefficiencies, and design and implement mechanisms and processes.