A chip stacking strategy is emerging as China’s innovative response to US semiconductor restrictions, but can this approach truly close the performance gap with Nvidia’s advanced GPUs? As Washington tightens export controls on cutting-edge chipmaking technology, Chinese researchers are proposing a bold workaround: stack older, domestically producible chips together to match the performance of chips they can no longer access.
The core concept: building upward instead of forward
The chip stacking strategy centres on a deceptively simple premise—if you can’t make more advanced chips, make smarter systems with the chips you can produce. Wei Shaojun, vice-president of the China Semiconductor Industry Association and a professor at Tsinghua University, recently outlined to the South China Morning Post an architecture that combines 14-nanometre logic chips with 18-nanometre DRAM using three-dimensional hybrid bonding.
This matters because US export controls specifically target the production of logic chips at 14nm and below, and DRAM at 18nm and below. Wei’s proposal works precisely at these technological boundaries, using processes that remain accessible to Chinese manufacturers.
The technical approach involves what’s called “software-defined near-memory computing.” Instead of shuffling data back and forth between processors and memory—a major bottleneck in AI workloads—this chip stacking strategy places compute and memory in close physical proximity through vertical stacking.
The 3D hybrid bonding technique creates direct copper-to-copper connections at sub-10 micrometre pitches, essentially eliminating the physical distance that slows down conventional chip architectures.
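To see why co-locating memory and logic matters, consider a rough back-of-the-envelope model of a memory-bound AI kernel. The sketch below is illustrative only: the throughput and bandwidth numbers are assumptions chosen for the example, not parameters of Wei’s proposed design.

```python
# Roofline-style estimate of a memory-bound AI kernel.
# All hardware numbers here are illustrative assumptions, not figures from Wei's proposal.

def kernel_time_s(flops: float, bytes_moved: float, peak_flops: float, mem_bw: float) -> float:
    """The kernel is limited by whichever takes longer: arithmetic or data movement."""
    return max(flops / peak_flops, bytes_moved / mem_bw)

# A matrix-vector multiply typical of LLM inference: every weight is read once per token.
flops = 2 * 4096 * 4096          # multiply-accumulates over a 4096x4096 weight matrix
bytes_moved = 2 * 4096 * 4096    # the same matrix streamed from memory in FP16

# Hypothetical conventional part: ample compute, off-package DRAM bandwidth.
t_conventional = kernel_time_s(flops, bytes_moved, peak_flops=100e12, mem_bw=1e12)

# Hypothetical stacked part: identical compute, but far higher bandwidth because
# the DRAM is bonded directly on top of the logic die.
t_stacked = kernel_time_s(flops, bytes_moved, peak_flops=100e12, mem_bw=10e12)

print(f"conventional: {t_conventional * 1e6:.1f} us, stacked: {t_stacked * 1e6:.1f} us")
# The kernel is memory-bound in both cases, so the assumed tenfold bandwidth
# increase translates almost directly into a tenfold speed-up despite identical peak FLOPS.
```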
The performance claims and reality check
Wei claims this configuration could rival Nvidia’s 4nm GPUs while significantly reducing costs and power consumption. He’s cited performance figures of 2 TFLOPS per watt and a total of 120 TFLOPS. There’s just one problem: Nvidia’s A100 GPU, which Wei positions as the comparison point, actually delivers up to 312 TFLOPS—more than 2.5 times the claimed performance.
This discrepancy highlights a critical question about the chip stacking strategy’s feasibility. While the architectural innovation is real, the performance gaps remain substantial. Stacking older chips doesn’t magically erase the advantages of advanced process nodes, which deliver superior power efficiency, higher transistor density, and better thermal characteristics.
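A quick check of the arithmetic, using only the figures quoted above, shows the shape of the claim (the 312 TFLOPS figure is Nvidia’s published peak FP16 tensor throughput for the A100):

```python
# Sanity check using only the figures cited above.
claimed_tflops = 120          # total throughput cited for the stacked design
claimed_tflops_per_watt = 2   # efficiency figure cited by Wei
a100_peak_tflops = 312        # Nvidia's published peak FP16 tensor throughput for the A100

implied_power_w = claimed_tflops / claimed_tflops_per_watt   # 60 W implied for the stack
raw_gap = a100_peak_tflops / claimed_tflops                  # roughly 2.6x in the A100's favour

print(f"implied power: {implied_power_w:.0f} W, raw throughput gap: {raw_gap:.1f}x")
```

On these figures, the pitch is efficiency at modest power rather than raw throughput parity.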
Why China is betting on this approach
The strategic logic behind the chip stacking strategy extends beyond pure performance metrics. Huawei founder Ren Zhengfei has articulated a philosophy of achieving “state-of-the-art performance by stacking and clustering chips rather than competing node for node.” This represents a fundamental shift in how China approaches the semiconductor challenge.
Consider the alternatives. TSMC and Samsung are pushing toward 3nm and 2nm processes that remain completely out of reach for Chinese manufacturers. Rather than fighting an unwinnable battle for process node leadership, the chip stacking strategy proposes competing on system architecture and software optimisation instead.
There’s also the CUDA problem. Nvidia’s dominance in AI computing rests not just on hardware but on its CUDA software ecosystem. Wei describes this as a “triple dependence” spanning models, architectures, and ecosystems.
Chinese chip designers pursuing traditional GPU architectures would need to either replicate CUDA’s functionality or convince developers to abandon a mature, widely adopted platform. The chip stacking strategy, by proposing an entirely different computing paradigm, offers a path to sidestep this dependency.
The feasibility question
Can the chip stacking strategy actually work? The technical foundations are sound—3D chip stacking is already used in high-bandwidth memory and advanced packaging solutions worldwide. The innovation lies in applying these techniques to create entirely new computing architectures rather than simply improving existing designs.
However, several challenges loom large. First, thermal management becomes far more difficult when multiple active processing dies are stacked. Chips built on 14nm generate considerably more heat for the same work than those made on modern 4nm or 5nm processes, and stacking concentrates that heat.
Second, yield rates in 3D stacking are notoriously difficult to optimise—a defect in any layer can compromise the entire stack. Third, the software ecosystem required to efficiently utilise such architectures doesn’t exist yet and would take years to mature.
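The yield point compounds quickly. A minimal sketch, assuming that defects on each die are independent and that a single bad die scraps the whole stack (the 90 per cent per-die yield is purely illustrative):

```python
# Compound-yield estimate for a multi-die stack: if one bad die scraps the
# stack and per-die defects are independent, the per-die yields multiply.
def stack_yield(per_die_yield: float, dies_in_stack: int) -> float:
    return per_die_yield ** dies_in_stack

for dies in (1, 2, 4, 8):
    print(f"{dies} die(s) at 90% each -> {stack_yield(0.90, dies):.1%} good stacks")
# 90.0%, 81.0%, 65.6%, 43.0%
```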
The most realistic assessment is that the chip stacking strategy represents a valid approach for specific workloads where memory bandwidth matters more than raw computational speed. AI inference tasks, certain data analytics operations, and specialised applications could potentially benefit.
But matching Nvidia’s performance across the full spectrum of AI training and inference tasks remains a distant goal.
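One rough way to see which workloads fit is to compare a kernel’s arithmetic intensity (FLOPs performed per byte moved) with the machine’s balance point. The numbers below are illustrative assumptions, not specifications of any real or proposed part:

```python
# Which workloads are limited by memory bandwidth rather than raw compute?
# Compare arithmetic intensity (FLOPs per byte) with a hypothetical machine balance.

machine_balance = 100e12 / 2e12   # assumed 100 TFLOPS peak / 2 TB/s bandwidth = 50 FLOPs per byte

def arithmetic_intensity(flops: float, bytes_moved: float) -> float:
    return flops / bytes_moved

# Token-by-token LLM inference: a matrix-vector multiply, weights read once per token.
gemv = arithmetic_intensity(2 * 4096 * 4096, 2 * 4096 * 4096)

# Training-style batched matrix multiply: the same weights reused across a batch of 2048.
gemm = arithmetic_intensity(2 * 4096 * 4096 * 2048, 2 * (4096 * 4096 + 2 * 4096 * 2048))

for name, ai in (("inference GEMV", gemv), ("training GEMM", gemm)):
    bound = "memory-bound" if ai < machine_balance else "compute-bound"
    print(f"{name}: ~{ai:.0f} FLOPs/byte -> {bound} (balance: {machine_balance:.0f} FLOPs/byte)")
```

By this estimate, token-by-token inference sits far below the balance point and benefits from bandwidth, while large batched training matrices sit far above it and need raw compute, which is where advanced process nodes keep their edge.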
What this means for the AI chip wars
The emergence of the chip stacking strategy as a focal point for Chinese semiconductor development signals a strategic pivot. Rather than attempting to replicate Western chip designs with inferior process nodes, China is exploring architectural alternatives that play to available manufacturing strengths.
Whether this chip stacking strategy succeeds in closing the performance gap with Nvidia remains uncertain. What’s clear is that China’s semiconductor industry is adapting to restrictions by pursuing innovation in areas where export controls have less impact—system design, packaging technology, and software-hardware co-optimisation.
For the global AI industry, this means the competitive landscape is becoming more complex. Nvidia’s current dominance faces challenges not just from traditional competitors like AMD and Intel, but from entirely new architectural approaches that may redefine what an “AI chip” looks like.
The chip stacking strategy, whatever its current limitations, represents exactly this kind of architectural disruption—and that makes it worth watching closely.
See also: New Nvidia Blackwell chip for China may outpace H20 model


