NVIDIA Starts Selling the Method of Making Shovels

深潮TechFlow
Invited Columnist
2026-04-15 12:00
This article is about 3,720 words; reading it in full takes about 6 minutes.
The person you want to defeat is now renting you all the tools needed to defeat them. Rent is paid annually, and the contract price increases each year.
AI Summary
  • Core Viewpoint: NVIDIA is building a closed-loop ecosystem from design to manufacturing by deeply integrating AI into the chip design (EDA) toolchain and bundling it with its own GPU hardware. This strategy consolidates its industry dominance at the foundational tool level, trapping competitors in a paradox where catching up requires dependence on NVIDIA's ecosystem.
  • Key Elements:
    1. Efficiency Revolution: NVIDIA's internal AI tools (e.g., NB-Cell) can reduce the standard cell library migration work, which originally required 8 people over 10 months, to a single GPU running overnight. The results match or surpass human designs in key metrics.
    2. Ecosystem Lock-in: Through a $2 billion investment in EDA giant Synopsys and joint development, NVIDIA is embedding its accelerated computing stack into Synopsys's workflow. It is also driving vendors like Cadence to launch EDA platforms "exclusively based on NVIDIA Blackwell," making the fastest tools dependent on NVIDIA hardware.
    3. Full-Chain Penetration: NVIDIA is using AI to reshape key links in the chip industry chain, from front-end design (Chip Nemo), mid-end optimization to back-end manufacturing (cuLitho). All ultimately drive demand for its GPU computing power.
    4. Domestic Dilemma: Chinese GPU companies are conducting R&D under significant losses and heavily rely on regulated overseas EDA tools (e.g., Synopsys, Cadence) for advanced node design. These EDA providers are accelerating their integration into the NVIDIA ecosystem.
    5. Competition Paradox: Competitors (e.g., AMD), if they want to design chips to rival NVIDIA, will be forced to use the fastest EDA tools that run on NVIDIA GPUs, creating the awkward situation of "using the opponent's tools to catch up to the opponent."

Original Author: Ada, TechFlow

San Jose Convention Center, live at GTC.

NVIDIA's Chief Scientist Bill Dally sat on stage, facing Google's Jeff Dean. Midway through their conversation, Dally dropped a number: "Previously, porting a standard cell library containing about 2500 to 3000 cells required a team of 8 engineers taking about 10 months."

He paused.

"Now it only takes a single GPU card, running overnight."

There was no gasp from the audience, because those who understood the statement knew what it meant. The work of 8 engineers over 10 months was consumed overnight by a GPU of their own making. Furthermore, Dally added: the results matched or even exceeded human designs in the three key metrics of area, power consumption, and latency.

The next day, news outlets interpreted it as "NVIDIA uses AI to design GPUs."

But the truth of the matter is far more intriguing than any headline.

What is NVIDIA Running Internally?

What NVIDIA runs internally isn't a black box; it's a set of toolchains refined over several years.

NB-Cell is a reinforcement learning-based program dedicated to the most arduous task in the flow: standard cell library migration. Prefix RL tackles the long-standing research problem of designing parallel prefix circuits, the structures behind fast arithmetic such as carry-lookahead adders. Dally stated that the layouts generated by this system are "something a human would never think of," improving key metrics by about 20% to 30% compared to human designs.
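To make "parallel prefix circuit" concrete, here is a toy sketch (not NVIDIA's tool) of the design space Prefix RL explores. A prefix network computes every running result of an associative operator; in an adder, that operator combines carry generate/propagate signals. Integer addition is used below purely for readability:

```python
def ripple_prefix(xs):
    """Serial (ripple) prefix: n-1 operator stages, logic depth n-1."""
    out, acc = [], 0
    for x in xs:
        acc += x
        out.append(acc)
    return out

def kogge_stone_prefix(xs):
    """Kogge-Stone prefix network: depth ceil(log2 n), but more wiring/area.
    Each round combines element i with element i - d, doubling d every round."""
    vals = list(xs)
    d = 1
    while d < len(vals):
        vals = [vals[i] + vals[i - d] if i >= d else vals[i]
                for i in range(len(vals))]
        d *= 2
    return vals

xs = [3, 1, 4, 1, 5, 9, 2, 6]
# Same function, very different circuits: ripple is tiny but slow (depth 7
# for 8 inputs); Kogge-Stone is fast (depth 3) but larger.
assert ripple_prefix(xs) == kogge_stone_prefix(xs) == [3, 4, 8, 9, 14, 23, 25, 31]
```

The two implementations compute identical results from circuits with opposite area/delay trade-offs; searching that trade-off space automatically is the kind of problem Prefix RL is described as solving.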

Then there are the two internal LLMs, Chip Nemo and Bug Nemo. NVIDIA fed these large models with the RTL code, architecture documentation, and design specifications of every GPU in its history. According to Dally's description, this is equivalent to distilling NVIDIA's twenty years of muscle memory from G80 to Blackwell into an internal model, allowing new hires to directly interface with the expertise of a twenty-year veteran engineer.

So, can "AI design a GPU"?

Quite the opposite. Dally's exact words were: "I would love to one day say 'design me a new GPU,' but we're a long way from that."

NVIDIA hasn't used AI to design a GPU. But it's doing something else that will make the entire industry unable to function without it.

$2 Billion to Enter the EDA Heartland

On December 1, 2025, NVIDIA invested $2 billion for a stake in Synopsys, one of the three EDA giants. The two parties signed a joint development agreement to embed NVIDIA's accelerated computing stack into Synopsys's entire EDA workflow, with Blackwell and the next-generation Rubin GPU to be deeply integrated with Synopsys.ai.

Synopsys's position needs some explanation. Nearly every advanced-node chip globally—Apple's M-series, AMD's MI-series, Google's TPU—runs on Synopsys's or Cadence's toolchains during the design phase. These two, along with Siemens EDA, monopolize the foundational tools for chip design. You might not use Qualcomm's chips, you might not use TSMC's production lines, but you can't escape the software from these three companies.

Three months after the Synopsys investment, NVIDIA brought Cadence, Siemens, and Dassault into the fold, announcing that they are all developing AI-driven chip design tools based on NVIDIA GPUs.

The benchmark data NVIDIA released is quite startling: Synopsys PrimeSim is 30x faster on Blackwell, Proteus is 20x faster, Sentaurus achieves a 12x speedup on B200 compared to CPU. MediaTek used H100 to accelerate Cadence Spectre by 6x. Astera Labs used Synopsys + NVIDIA to accelerate chip verification by 3.5x.
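A hedged caveat on headline speedups: a 30x faster tool does not make the whole design flow 30x faster unless that tool is the whole flow. End-to-end gain follows Amdahl's law; the fractions below are assumptions for illustration, not figures from these benchmarks:

```python
def amdahl(speedup, f):
    """Overall speedup when only fraction f of total work is accelerated."""
    return 1.0 / ((1.0 - f) + f / speedup)

# Hypothetical shares of the flow spent inside the accelerated tool.
for f in (0.2, 0.5, 0.9):
    print(f"30x tool, {f:.0%} of flow -> {amdahl(30, f):.2f}x end-to-end")
```

Even so, the direction of the pressure is clear: the larger the accelerated share of the flow becomes, the closer the end-to-end gain gets to the headline number, and the more the fastest flow depends on NVIDIA hardware.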

One detail is worth highlighting separately: Cadence's Millennium M2000 platform is labeled as "built exclusively for the EDA market, exclusively based on NVIDIA Blackwell."

The word "exclusively" is the most telling. EDA tools previously ran on CPUs, a market where Intel and AMD could both compete; now, if you want the fastest EDA, you have to buy NVIDIA's cards.

The True Shape of the Flywheel

Most people understand NVIDIA's flywheel like this: sell GPUs to AI companies, AI companies train large models, large models prove GPUs are irreplaceable, more people buy GPUs.

That flywheel is terrifying enough. But there's another layer beneath it.

NVIDIA uses its own tools to design the next-generation GPU, creating a generational gap in design efficiency, while simultaneously tying the entire industry's EDA toolchain to its own hardware. Competitors want to catch up, but even the tools for catching up have to be rented from NVIDIA's ecosystem.

This second layer is precisely the anxiety behind the earnings report that sent AMD's stock plummeting. Even though NVIDIA and Synopsys publicly state that the "investment does not carry any obligation to purchase NVIDIA hardware," the market understands: accelerated EDA features debut first on NVIDIA hardware, leaving AMD and Intel reliant on a path optimized for their biggest competitor's platform.

Imagine an AMD engineer in the future wanting to design a chip to rival Blackwell. They open Synopsys's tool, which runs fastest on NVIDIA GPUs. They must either endure a design cycle twice as slow, or buy a bunch of NVIDIA cards to design the chip meant to defeat NVIDIA.

The shovels are still for sale. But the sales pitch has changed.

The Real Situation of Domestic GPUs

At this point, some sobering numbers are necessary.

In the same fiscal year 2025 that NVIDIA's net profit surpassed $70 billion, China's domestic GPU "Four Little Dragons"—Moore Threads, MetaX, Biren Technology, and Enflame—were queuing up before the IPO window.

Moore Threads' IPO prospectus shows that from 2022 to 2024, the company accumulated a net loss of 5 billion RMB over three years, with a further loss of 271 million RMB in the first half of 2025 and accumulated unrecovered losses of 1.478 billion RMB as of June 30. The company's own management estimates that consolidated profitability might not arrive until 2027 at the earliest. MetaX is slightly better off, with cumulative losses exceeding 3 billion RMB over three years. The worst is Biren Technology, with losses exceeding 6.3 billion RMB over three and a half years, and revenue in the first half of 2025 of only 58.9 million RMB, less than a tenth of Moore Threads' 702 million RMB over the same period.

Look at the intensity of R&D investment. Moore Threads' R&D expenses were 2,422.51% of revenue in 2022, and still as high as 309.88% in 2024: even then, a year's R&D spending was more than three times revenue. This isn't business operation; it's life support via IV drip, sustained by continuous capital infusion from the primary market and the recently opened STAR Market window.

The tooling layer presents an even tighter chokehold. Empyrean Technology's 2022 IPO prospectus indicated its tools only partially support 5nm advanced nodes. Primarius Technologies can cover 7nm/5nm/3nm nodes, but only offers point tools, far from a full-flow solution.

Empyrean's founder Liu Weiping was quite candid: "Domestic EDA still has obvious shortcomings in supporting advanced processes, especially current ones like 7nm, 5nm, 3nm. Currently, domestic EDA can achieve 14nm levels. While we have mastered 7nm process technology, the deep integration of 7nm with practical applications requires collaborative effort from the entire industry chain."

In other words, full-flow EDA for advanced nodes is essentially unavailable domestically, and domestic GPU companies still design their chips with Synopsys and Cadence tools. In 2025, Trump announced export controls covering all critical software; although these were never substantially implemented, EDA tools for advanced nodes below 7nm remain under strict control. Whether and when the license gets cut off rests in someone else's hands.

The capital market's reaction is surreal enough. On its listing day, MetaX's stock closed at 829.9 RMB, a single-day surge of 692.95%. After listing, Moore Threads' stock price once rose to become the third highest in the A-share market, behind only Kweichow Moutai and Cambricon. Some media calculated its total market cap at the time to be approximately 359.5 billion RMB.

The real business behind these numbers is this: a group of companies still burning cash, still reliant on a restricted foreign toolchain to continue designing chips, are being priced in the secondary market as the successors to the "domestic NVIDIA."

And the very toolset these companies use to design chips is becoming part of NVIDIA's ecosystem. NVIDIA's $2 billion tie-up with Synopsys, and Cadence Millennium M2000's label of "exclusively based on NVIDIA Blackwell," make the act of catching up itself a paradox.

A Complete Chain from Design to Manufacturing

Back to that GTC conversation.

Dally remained humble throughout. "AI is still far from being able to design chips on its own" is a line NVIDIA has repeated for four or five years, but the phrasing shifts each year. Four years ago it was "AI can assist design," three years ago "AI can automate certain stages," and this year "overnight completion of work that took 8 people 10 months." Each year pushes a step forward; each year leaves a statement that "the ultimate goal is still far away." Look back a few years and the earlier "still far away" has already been achieved, while the new "still far away" is planted at a point no competitor has yet reached.

What NVIDIA has done in the past twelve months is essentially one thing: apply AI to the most valuable, deepest-moat segments of the chip industry chain, and then sell these tools layer by layer to the entire industry.

The front-end of chip design is being taken over by internal LLMs like Chip Nemo; the mid-design tasks of standard cell library migration and layout optimization are being taken over by NB-Cell and Prefix RL; the entire EDA toolchain is being tied to its own GPUs through the $2 billion Synopsys deal and Cadence's "exclusively based on Blackwell"; the manufacturing-end computational lithography is being taken over by cuLitho, which TSMC is already using.

From design to manufacturing, NVIDIA has re-engineered every segment with AI. Every segment ultimately leads to the same endpoint: if you want the fastest tools, you have to buy NVIDIA's cards.

For all competitors who want to build a chip that can defeat Blackwell, the most awkward situation has already occurred. The EDA tools needed to design this chip have their fastest versions running on NVIDIA GPUs; the computational lithography needed to manufacture this chip uses the fastest algorithm library provided by NVIDIA; the computing power needed to train the design AI still comes from NVIDIA's cards.

The entity you want to defeat is now renting you all the tools needed to defeat it. Rent is paid annually, and the contract price increases every year.
