Unable to afford Nvidia's "China-special" chips, Chinese entrepreneurs prefer the 4090
2024-07-07
It has been a year and a half since the A100 and H100 were banned from sale in China, and NVIDIA is finding it harder and harder to sell even its cut-down versions there.
Recently, analysts at Jefferies said that the United States will conduct its annual review of semiconductor export controls in October, at which point it is "highly likely" that sales of NVIDIA's H20 to China will be prohibited. A ban could take one of three forms: a product-specific ban, a lower cap on computing power, or a restriction on memory capacity.
Rumors that NVIDIA plans to launch a new China-specific AI chip have circulated non-stop since Jensen Huang officially announced Blackwell in March. The prevailing report is that NVIDIA plans to release the B20, a "cut-down version" of the B200.
However, many small and medium-sized entrepreneurs in China understand that, given both the price and the difficulty of acquisition, they will most likely encounter NVIDIA's latest high-end AI chips only "in the cloud," at best.
AI entrepreneur Jason told the Alphabet List (ID: wujicaijing) that his company previously focused on the AI application layer. In addition to renting A100 and H100 computing power through cloud services, it deployed 50 NVIDIA V100 chips and NVIDIA 3090 graphics cards locally.
Such choices are made not only because startups pursue cost-effectiveness and keep costs down, but also because AI application-layer businesses do not actually require extremely high computing power. The V100 is a computing card that NVIDIA released in 2017; at that time, the official price of an eight-card V100 server was 1.02 million yuan. Jason's 50 V100s, however, were "picked up for 900 yuan each" second-hand, and the 3090 graphics cards were acquired for 5,000 yuan per card.
Since the end of 2022, the United States' tightened semiconductor export controls have blocked NVIDIA's most powerful chips from being exported to China: neither the A100 nor the H100, previously the most powerful, can be officially sold there. Since then, U.S. export controls on high-end chips have become increasingly strict, and NVIDIA has launched one "China-specific" chip after another, from the A800 and H800 to the H20, L20, and L2.
For most entrepreneurs, the China-specific version not only has far inferior performance compared to the "original version", but the price is also prohibitive. A seller told the Alphabet List that the price of an eight-card H20 server is around 1.3 million yuan. The IT Times has reported that the main demand for H20 still comes from internet giants such as Baidu, Alibaba, Tencent, and ByteDance.
Jason stated that the H20 is mainly used for inference, but "using H20 is not as good as using 4090" because the latter is sufficient. As a commercial chip, he said, the H20 "has a depreciation rate for commercial use, with a significant discount. Machine rooms are replaced every few years, and the hardware generally does not retain its value. The V100, for example, sold for tens of thousands at the time and now sells for only a few thousand, in just five or six years. If it were not for the AI trend, it would be worth 500 at most."
More than one industry insider told the Alphabet List that they prefer the NVIDIA RTX 4090. This flagship product, launched in October 2022, was originally introduced as a gaming graphics card, but it is also highly favored by the AI industry and has likewise been affected by U.S. chip export controls.
AI supercomputing supplier Zhejiang Huaxiyun Technology Co., Ltd. also stated, "Currently, the best to use is the 4090," but "it also depends on the configuration, networking, and graphics card." Many customers' needs can in fact be met by the 4090, which also offers a good cost-performance ratio. The staff member added that the company will be launching 100 units of the 4090 this month.
Regarding the news that NVIDIA will launch a new China-specific product, the B20 "cut-down version" of the B200, Jason is not excited, saying simply that "it depends on the cost-performance ratio." Huaxiyun likewise said it has not yet heard any related news in the industry.
Previously, Reuters reported that NVIDIA had partnered with China's Inspur Information on the B20, but Inspur responded that the report was not true.
On March 19th of this year, Jensen Huang took the stage at the SAP Center in San Jose, California, delivering a keynote with the grand title "Witnessing the Moment of AI Transformation."
Huang rarely boasts, yet at this event Nvidia officially announced the new Blackwell architecture, along with the B200 chip and the GB200 superchip. The outside world described Nvidia's new products as a "new nuclear bomb," and Jim Fan, who had just been promoted to Nvidia Research Manager, remarked at the time that Moore's Law could no longer keep up with the company.
However, an experiment at Apple has put a slight dent in Nvidia's progress. Jensen Huang may now be the person in the world who least looks forward to Apple Intelligence going live.
On July 30th Beijing time, Apple published a technical paper that included this piece of information: the two AI models that underpin Apple Intelligence were pre-trained on Google's cloud chips.
First, Google's self-developed chips are TPUs (tensor processing units), which were previously used internally at Google and not sold externally. This time, however, they have snatched a big client. Second, The Wall Street Journal reported in May that Apple was developing its own chips for data center servers; now, even though it is not using its own chips, it still did not choose Nvidia. This is quite an awkward situation for Nvidia.
Apple's "cold shoulder" is a microcosm of NVIDIA's current predicament. NVIDIA's stock price soared by 150% in the first half of this year, only to plummet in July, accounting for four of the eight largest drops in market value.
Moreover, NVIDIA has been plagued with recent bad news: there are rumors that the new chip B200 will be delayed by three months or more before delivery; the U.S. Department of Justice has launched two antitrust investigations against it.
In contrast, there are continuous reports of a "China-specific version" – according to multiple foreign media outlets, NVIDIA plans to introduce a crippled version of B200, the B20, to China. Additionally, NVIDIA may sell servers equipped with the latest chips to China, using the servers to compensate for the performance of the specialized chips. If this news is true, it would be the first time NVIDIA has specifically launched a server product for the Chinese market.
These measures also indicate that, despite facing numerous obstacles, NVIDIA has not given up and is making greater efforts for the Chinese market.
The AI wave stirred up by ChatGPT at the end of 2022 has been surging for nearly two years, and NVIDIA has transformed from a chip giant into one of the most valuable companies in the world. Now that the AI race is gradually returning to rationality and competitors are slowly closing in, China has become an increasingly indispensable market for NVIDIA.
For the many small and medium-sized startups in China's AI race, Nvidia's China-specific chips are not a main concern. And for the primary buyers of those chips, China's large internet companies, choosing Nvidia is merely the best solution for now.
In October 2023, Nvidia launched the HGX H20, L20 PCIe, and L2 PCIe, tailored for China. Of these, the H20 was the much-anticipated "general among the dwarfs," the strongest of the cut-down versions. A month later, however, news broke that Nvidia had postponed the H20 to the first quarter of the following year, and many enterprises turned to domestic chips.
At that time, China Fund News reported that Baidu had ordered 1,600 Ascend 910B chips from Huawei for 200 servers. Soon after, Zhou Hongyi also stated at the Wuzhen Summit that 360 had purchased more than 1,000 Huawei AI chips, even earlier than Baidu.
By the first half of this year, there were reports that enterprises were waiting and watching while domestic manufacturers such as Huawei competed, leading to weak sales of the Nvidia H20 in China. In May, news emerged of an H20 price cut, attributed to two factors: first, the H100 was no longer in short supply and had begun to fall in price, pulling the H20 down with it; second, the H20 faced competition from Huawei's Ascend 910B, which had a more advantageous initial selling price.
In the second half of the year, H20 sales in China appear to be turning around. Recently, the chip industry consulting firm SemiAnalysis predicted that the H20 would boost NVIDIA's performance in China during the current fiscal year, with the potential to deliver over one million H20 chips this year. At a selling price of $12,000 to $13,000 per chip, the H20 alone could contribute over $12 billion in revenue to NVIDIA, a figure that exceeds NVIDIA's overall revenue in China in the previous fiscal year. The IT Times reports that the uptick in H20 sales is primarily due to NVIDIA's ecosystem advantages and supply constraints on Huawei's Ascend 910B.
For NVIDIA, this can only be considered a fleeting joy, as a turnaround built on ecosystem advantages and competitors' supply shortages is not necessarily stable. The next generation of "China-specific" chips may well be on the horizon, but NVIDIA faces numerous challenges.
In addition to the rumored B20 "cut-down version" of the B200, The Information reports that NVIDIA also plans to pair a new China-specific chip with servers. NVIDIA has not made such a move before; the aim is to use the servers to get the most out of the special chip and compensate for the shortcomings of the "cut-down version."
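The revenue estimate above is simple multiplication, and a short Python sketch makes it easy to check. The unit count and price range are the figures quoted from SemiAnalysis; the function name and the choice to report billions of dollars are illustrative assumptions, not part of any report. Because the forecast is "over one million" chips, the result should be read as a lower bound.

```python
# Back-of-envelope check of the H20 revenue estimate quoted in the article.
# Figures come from the text; the helper name is a hypothetical illustration.

H20_UNITS = 1_000_000        # "over one million" chips, so this is a floor
PRICE_LOW_USD = 12_000       # low end of the quoted per-chip price
PRICE_HIGH_USD = 13_000      # high end of the quoted per-chip price


def revenue_range_billion(units: int, price_low: int, price_high: int) -> tuple[float, float]:
    """Return (low, high) revenue in billions of U.S. dollars."""
    return (units * price_low / 1e9, units * price_high / 1e9)


low_bn, high_bn = revenue_range_billion(H20_UNITS, PRICE_LOW_USD, PRICE_HIGH_USD)
print(f"Estimated H20 revenue: ${low_bn:.1f}B to ${high_bn:.1f}B")
```

At exactly one million units the range is $12 billion to $13 billion, which matches the "over $12 billion" figure in the text.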
If the news of offering servers as a "package" solution is true, it would be a new attempt by NVIDIA under the constraints of the sales ban.
The bad news is that the market has once again heard rumors of delays in the delivery of NVIDIA's latest chips. According to recent reports from The Information, NVIDIA has informed customers that the B200 will be delayed by three months or more, with mass shipments potentially postponed until the first quarter of next year (the original plan was to start mass production in October of this year).
The "customers" mentioned here include plenty of tech giants. Reports indicate that Meta has placed orders worth at least 10 billion U.S. dollars, and that Microsoft recently increased its order size by 20%, planning to have 55,000 to 65,000 GB200 chips ready for OpenAI before the first quarter of next year.
There is reason to suspect that the delayed delivery of the B200 will also affect the pace at which Nvidia launches a cut-down version for the Chinese market. The report cites "design flaws" discovered during production as the reason for the delay.
C
Although Nvidia has not yet confirmed the launch of the B20, hardly anyone would doubt that it will happen.
Compared with 2022, when Nvidia first faced export restrictions on AI chips, the chip giant now finds it even harder to let go of the Chinese market.
China's vast demand for chips is naturally the primary driving force. In fiscal years 2022 and 2023, mainland China and Hong Kong contributed $7.111 billion and $5.785 billion in revenue to NVIDIA respectively, accounting for 31.7% and 25.9% of its total revenue.
However, because of the U.S. chip ban, NVIDIA's business in China risks slowing down. In fiscal year 2024, NVIDIA's revenue from the Chinese market, including the mainland, Hong Kong, and Macao, stalled at the ten-billion-dollar mark, with its share of total revenue slipping to 16.9%.
In May of this year, NVIDIA released its financial report for the first quarter of fiscal year 2025 (ending April 28, 2024). In the data center business, the share of NVIDIA's revenue coming from Chinese customers had already fallen from 19% in fiscal year 2023 to a mid-single-digit percentage (5%) in fiscal year 2024.
Jensen Huang is also well aware of the competition from local Chinese chip manufacturers: "Our business in China has indeed declined significantly from past levels. Due to technological restrictions, competition in China is now more intense. These are facts." A few days later, Huang mentioned Chinese chip companies again, stating that there are many GPU startups in China and one should not underestimate China's ability to catch up in the chip field.
From a certain perspective, while the U.S. chip embargo has made this wave harder for Chinese AI companies, it has also created room for local Chinese chip manufacturers to develop. From Huang's perspective, this is undoubtedly dangerous; the clock is ticking, and NVIDIA does not have much time left to break out of its "cut-down" predicament.
This is not the only thing that makes NVIDIA more reluctant to let go of the Chinese market. At the end of 2022, when ChatGPT ignited a thousand-model war, NVIDIA's sales and stock price soared; now, NVIDIA faces an increasingly uncertain situation.
This year alone, there have been reports of OpenAI CEO Sam Altman's ambition for a $7 trillion chip network; of Microsoft developing an alternative to NVIDIA's ConnectX-7 network card to improve the performance of its self-developed Maia chip; and of Groq, the technology company founded by Google TPU creator Jonathan Ross, boasting that its new products can threaten NVIDIA.
Apple's embrace of Google takes NVIDIA's nightmare a step further: a financially strong tech giant that is also developing its own chips has joined the AI battle without choosing NVIDIA.
Beyond competition, NVIDIA also faces increasingly intense regulatory pressure. First, in July, the French Competition Authority confirmed that it is investigating NVIDIA for suspected violations of market competition. Then, in August, the US Department of Justice launched two antitrust investigations against NVIDIA.
NVIDIA's stock price rose by 150% in the first half of this year. However, under the combined influence of the Federal Reserve's delayed interest rate cuts and Wall Street's growing pressure on tech stocks, it "finally" plummeted in July, accounting for four of the eight largest drops in market value.
In this scenario, NVIDIA needs to, and must, preserve its influence in the Chinese market. Although the current AI wave is lively, the path ahead is shrouded in fog. Whether NVIDIA will hit a ceiling, or fall from grace under the curse of the "AI bubble theory," is uncertain.
Four years ago, in August 2020, NVIDIA announced its financial results for the second quarter of fiscal year 2021, in which data center revenue surpassed gaming revenue for the first time. Today, data centers have replaced gaming as NVIDIA's core business. The ground for this step, however, was laid by Jensen Huang over many years.
Today's NVIDIA also needs to plan for the future, maintaining confidence while also having a "plan B." An interesting piece of data is that, according to NVIDIA's financial report for the first quarter of fiscal year 2025, the automotive business accounted for only 1.2% of total revenue for that quarter, yet it is the only business that achieved sequential growth outside of the data center business. Among the automotive business partners officially announced by NVIDIA, Chinese car manufacturers and intelligent driving solution providers account for over 80%.
From this perspective, the significance of China-specific AI chips may lie not only in AI but also in the fact that NVIDIA needs to maintain a continuous and pivotal influence in China for the future, even if its chips face the fate of being cut down time and time again.