Amazon's retail business is no longer constrained by a lack of access to GPUs.
According to Business Insider (BI), citing a "trove of Amazon documents" the publication obtained, Amazon struggled with a lack of compute capacity throughout 2024, even though its cloud computing unit, Amazon Web Services (AWS), is a major purchaser of GPUs.
This has since been resolved by the company through its "Project Greenland" initiative.
The exact number of GPUs operated by AWS is unknown. In November 2024, LessWrong estimated that Amazon had between 250,000 and 400,000 "Nvidia H100 equivalent GPUs," expected to increase to 1.3-1.6 million by the end of 2025. LessWrong also estimated Amazon would buy approximately 360,000 Nvidia GB200s in 2025.
Beyond GPUs purchased from the likes of Nvidia and AMD, AWS also develops its own Trainium AI chips. How many Trainium chips the company has is not known, but it announced in December 2024 that it was developing a cluster of Trainium2 UltraServers for Anthropic that would contain "hundreds of thousands of Trainium chips."
Despite the cloud division's massive stock of AI compute, the retail division previously struggled to access the GPUs it needed.
BI reports that in early 2024, some employees of Amazon's retail unit went months without securing GPUs, which prevented project launches across the division, including in e-commerce and logistics operations.
As a result, in July 2024 Amazon launched "Project Greenland" to better manage and allocate its GPU supply.
Project Greenland is described as a "centralized GPU orchestration platform to share GPU capacity across teams and maximize utilization" in one of the documents viewed by BI.
The platform tracks GPU usage per initiative, shares idle servers, and enables the company to "claw back" chips for more urgent projects.
"GPUs are too valuable to be given out on a first-come, first-served basis," one of the Amazon guidelines stated. "Instead, distribution should be determined based on ROI layered with common sense considerations, and provide for the long-term growth of the Company's free cash flow."
Amazon prioritizes and ranks GPU allocation requests on a variety of factors, including the completeness of the data provided and the financial benefit per GPU, BI says. AI projects must be approved for development and ready to go, with timelines for the expected benefits provided.
Should Amazon find that other initiatives have greater value, it will recall GPUs from projects even if they have already been approved. As of 2025, all GPU capacity requests within Amazon must go through Greenland.
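The internal documents describe the mechanism only at a high level. As a rough illustration, the "ROI layered with common sense" ranking plus clawback could be sketched as follows; the project names, dollar figures, and function names here are hypothetical, not drawn from Amazon's actual system:

```python
from dataclasses import dataclass

@dataclass
class Project:
    """A hypothetical GPU capacity request, ranked by benefit per GPU."""
    name: str
    gpus_requested: int
    benefit_usd: float  # projected financial benefit (illustrative)
    gpus_granted: int = 0

    @property
    def benefit_per_gpu(self) -> float:
        return self.benefit_usd / self.gpus_requested

def allocate(pool: int, projects: list[Project]) -> int:
    """Grant GPUs in descending benefit-per-GPU order; return leftover capacity."""
    for p in sorted(projects, key=lambda p: p.benefit_per_gpu, reverse=True):
        grant = min(p.gpus_requested, pool)
        p.gpus_granted = grant
        pool -= grant
    return pool

def claw_back(urgent: Project, projects: list[Project]) -> None:
    """Reclaim GPUs from the lowest-benefit grants to cover a more urgent need."""
    needed = urgent.gpus_requested - urgent.gpus_granted
    for p in sorted(projects, key=lambda p: p.benefit_per_gpu):
        if needed <= 0:
            break
        take = min(p.gpus_granted, needed)
        p.gpus_granted -= take
        urgent.gpus_granted += take
        needed -= take

# Illustrative numbers only.
projects = [
    Project("search-ranking", gpus_requested=100, benefit_usd=5_000_000),
    Project("ads-forecast", gpus_requested=50, benefit_usd=1_000_000),
    Project("log-mining", gpus_requested=80, benefit_usd=400_000),
]
leftover = allocate(120, projects)

urgent = Project("fraud-detection", gpus_requested=30, benefit_usd=3_000_000)
claw_back(urgent, projects)
```

In this sketch, the 120-GPU pool is exhausted by the two highest-benefit projects, and the urgent request is then filled by clawing capacity back from the lowest-ranked grants first, mirroring the recall behavior the documents describe.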
The retail unit was short of more than 1,000 P5 instances throughout the second half of 2024. Each P5 instance comprises eight Nvidia H100 GPUs, so the shortfall amounted to more than 8,000 GPUs. A document from December suggested that this was expected to improve "slightly" by early this year.
In addition, AWS's Trainium AI chips were expected to help meet demand by the end of 2025, but "not sooner."
Despite this, an Amazon spokesperson has now told BI that those estimates are outdated and that there is no longer a GPU shortage.
"Amazon has ample GPU capacity to continue innovating for our retail business and other customers across the company," the spokesperson told BI. "AWS recognized early on that generative AI innovations are fueling rapid adoption of cloud computing services for all our customers, including Amazon, and we quickly evaluated our customers' growing GPU needs and took steps to deliver the capacity they need to drive innovation."
Amazon's retail unit, according to the internal documents, has more than 160 AI-powered initiatives underway. The company estimated in 2024 that AI investments had "indirectly" contributed $2.5 billion in operating profits, and $670 million in variable cost savings.
The company's retail unit expects to spend around $5.7 billion on AWS this year, up from $4.5 billion in 2024.
During Amazon's Q4 2024 earnings call in February 2025, the company's CEO and president, Andy Jassy, noted the company had experienced some constraints on capacity, which "come in the form of chips from our third-party partners, power constraints, and supply chains like motherboards."
He added: "I predict those constraints really start to relax in the second half of 2025. And, as I said, I think we can be growing faster even though we're growing at a pretty good clip today."