Prediction: DeepSeek Hardware Spending Could Exceed $500 Million

One of the most discussed topics in the tech world this week has been the China-based artificial intelligence company DeepSeek. In a recent paper on its latest AI model, DeepSeek put the total training cost at $5.576 million, an estimate based on rental prices for Nvidia GPUs. The company cautioned that this sum covers only the “official training” of the model and excludes the cost of prior research and ablation experiments on new architectures, algorithms, or data. Even so, a single figure, roughly $6 million, caught the attention of everyone from Wall Street to industry insiders and sparked a debate.
Earlier this week, DeepSeek’s “AI Assistant” became the most downloaded free app in the U.S. on Apple’s App Store, surpassing OpenAI’s ChatGPT. That triggered a sell-off in global tech stocks, with Nvidia and Broadcom losing a combined $800 billion in market value on Monday.
According to CNBC, a new report from SemiAnalysis, a firm focused on the semiconductor sector, sheds light on DeepSeek’s expenses. The report estimates that DeepSeek’s hardware spending over the company’s history is well above $500 million. It emphasizes the high cost of research and development (R&D) and the substantial computing power needed even to generate “synthetic data.” The report also notes that training Anthropic’s Claude 3.5 Sonnet model required “millions of dollars,” while Anthropic has received billions in investment from Amazon and Google. These findings underscore how many resources AI models, and the companies behind them, require.
SemiAnalysis attributes the high costs to the need for “experimenting with new structures, data collection and cleaning, paying employee salaries, and much more.” DeepSeek does not break down its spending on computing power in its paper, and the company has yet to respond to requests for comment. Of the R1 model, the SemiAnalysis report says that “DeepSeek’s attainment of this level of cost and capability sets it apart.” It calls R1 “very good,” citing how quickly the model reached a highly impressive level of reasoning ability.
Experts and analysts praised the model’s quality throughout the week. Although the U.S. has tightened restrictions on chip exports to China three times in the past three years, DeepSeek has drawn attention by emerging as a top competitor in an AI market expected to generate over $1 trillion in revenue. Bernstein analysts wrote in a Monday report that some of the “overblown comments” they saw over the weekend bordered on the “truly intriguing,” while others went as far as “the end of the current AI architecture.”
DeepSeek was founded in 2023 by Liang Wenfeng, a co-founder of the AI-based quantitative hedge fund High-Flyer. It began as an offshoot of High-Flyer’s AI research unit and became independent in April 2023 to focus on large language models and artificial general intelligence (AGI). AGI refers to AI that equals or surpasses human intelligence across a wide variety of tasks, a goal shared by many companies, including OpenAI. Analysts report that DeepSeek is wholly owned and funded by High-Flyer.
DeepSeek drew attention earlier this month when it unveiled its reasoning model, R1, a significant challenge to OpenAI’s “o1” model. R1 is open source, so any AI developer can use it. Like other Chinese chatbots, DeepSeek’s chatbot has certain restrictions.
For instance, when asked about Chinese leader Xi Jinping’s policies, the chatbot reportedly deflects the question.
OpenAI CEO Sam Altman has publicly praised DeepSeek’s model, but he has also raised concerns that DeepSeek used OpenAI data without authorization to develop its product. Speaking on Thursday at an OpenAI event in Washington, D.C., Altman described DeepSeek’s model as “absolutely fantastic.” He stressed how competitive the industry is and the importance of “democratic AI” winning, and he noted the strong interest in reasoning models and open source.