TL;DR:
- TSMC predicts the shortage of AI chips, which affects Nvidia’s A100 and H100 GPUs, will continue until at least the end of 2024.
- The shortage is not due to chip production but rather a deficiency in advanced packaging capacity.
- CoWoS, the advanced packaging technology crucial for AI chips that integrate high-bandwidth memory (HBM), can currently satisfy only about 80% of demand.
- TSMC anticipates relief within 18 months as it expands its packaging facilities in Taiwan.
- AMD’s MI300-series accelerators are also affected by the packaging shortage.
- Samsung and Intel have alternative packaging solutions but face different market dynamics.
Main AI News:
In a development that has sent ripples through the tech industry, Taiwan Semiconductor Manufacturing Company (TSMC) has warned that the shortage of AI chips will be prolonged. For those eagerly awaiting Nvidia’s high-end GPUs, including the much-coveted A100 and H100 models, this news may come as a disappointment: according to TSMC, the situation is unlikely to improve until at least the end of 2024.
The crux of the problem, it appears, is not TSMC’s ability to manufacture chips, but a shortage of the advanced packaging capacity needed to assemble the silicon into finished products. TSMC chairman Mark Liu emphasized that the bottleneck lies not in wafer fabrication but in chip-on-wafer-on-substrate (CoWoS) packaging. This technology plays a pivotal role in building some of the most advanced chips available today, particularly those that pair compute dies with high-bandwidth memory (HBM), an essential feature for AI-intensive workloads.
Liu expressed optimism that this packaging capacity constraint is temporary, forecasting that additional CoWoS capacity will become available within the next eighteen months. To that end, TSMC recently unveiled plans for a significant expansion of its advanced packaging facilities in Taiwan, investing a substantial $3 billion in a new facility at the Tongluo Science Park in Miaoli County.
Until TSMC can augment its packaging capacity, the scarcity of CoWoS affects not only Nvidia but also AMD’s upcoming Instinct MI300-series accelerators, which rely heavily on the same packaging method. AMD’s MI300A APU is currently sampling with customers and is expected to power the El Capitan system at Lawrence Livermore National Laboratory, while the MI300X GPU is anticipated to reach customers’ hands in the third quarter of this year.
We have reached out to AMD to inquire whether this CoWoS packaging shortage might affect the availability of their chips and will provide updates if we receive a response.
It’s important to note that TSMC’s CoWoS is not the only advanced packaging technology in play. Samsung, rumored to be assisting with Nvidia GPU production, offers I-Cube and H-Cube for 2.5D packaging and X-Cube for 3D packaging. Intel, on the other hand, employs its own advanced packaging technologies: embedded multi-die interconnect bridge (EMIB) for 2.5D packaging and Foveros for vertically stacking chiplets, allowing dies from different fabs or process nodes to be integrated in a single package.
Conclusion:
TSMC’s warning that the AI chip shortage will persist into late 2024 points to continued pressure on the market for high-performance GPUs and AI accelerators. The shortage is rooted primarily in packaging limitations rather than chip production, and it weighs most heavily on Nvidia and AMD. TSMC’s planned expansion and competitors’ alternative packaging technologies may offer some relief, but supply is likely to remain constrained, potentially influencing pricing and product availability. Industry players should closely monitor developments in packaging capacity to strategize effectively in this challenging landscape.