Summary by Bloomberg AI
- A Chinese AI startup called DeepSeek has released an open-source AI model called R1 that rivals or outperforms models from leading US developers, sparking concerns about the US lead in AI development.
- The tech industry is trying to understand how DeepSeek achieved this feat, with some suspecting that the company built its chatbot on the back of Western technology, sidestepping enormous development costs.
- The fallout from R1’s launch is spreading, with US tech companies reevaluating their AI investments and strategies, and lawmakers considering how to respond to China’s progress in AI development.
By Shirin Ghaffary and Rachel Metz
01/27/2025
(Bloomberg) — As the heads of the biggest artificial intelligence companies gathered in the Swiss Alps last week, all eyes were looking east. In panel discussions and private conversations on the sidelines of the World Economic Forum in Davos, tech executives stressed the need for the US and its allies to build more data centers and strike the right balance on regulations to stay ahead of China on AI development.
“We’re probably a year plus ahead in models,” Ruth Porat, president and chief investment officer at Alphabet Inc., told Bloomberg News at the event. But, she added, “it isn’t a foregone conclusion” the US holds on to its advantage.
Even that may have been overly optimistic. That same week, a little-known Chinese AI startup called DeepSeek released a new open-source AI model called R1 that can mimic the way humans reason. The company said R1 rivaled or outperformed leading US developers on a range of industry benchmarks, including for mathematical tasks and general knowledge — and was built for a small fraction of the cost. By the weekend, DeepSeek had climbed up the rankings on Chatbot Arena, a closely watched leaderboard for AI systems, and prominent figures in tech like Marc Andreessen were calling the product “AI’s Sputnik moment.”
Now, the fallout from the launch of R1 is quickly spreading through the US as the tech industry tries to understand how DeepSeek pulled off the feat and whether the upstart did so as cheaply as it claims. Already, there are suspicions that the Chinese company built its chatbot on the back of Western technology, sidestepping the enormous costs of developing large language models.
In San Francisco, AI executives and employees are urgently parsing DeepSeek’s technology. Some of OpenAI’s staff are trying to figure out exactly how DeepSeek was able to release such a model, according to people familiar with the matter who spoke on condition of anonymity to discuss private matters. Another person said there’s a sense at the company that OpenAI needs to take developments from Chinese companies very seriously, as they present an opportunity to innovate and improve its existing models. OpenAI Chief Executive Officer Sam Altman recently told employees that the release marks a major landscape shift for the startup, one of the people said.
Meta, which also focuses on open-source AI models, has set up an internal team to analyze DeepSeek and better understand how it was built and what it can do, according to people familiar with the matter. The company has put together similar task forces to assess other major competitors, such as OpenAI’s GPT-4 model and Google’s Gemini, the people said.
Almost overnight, DeepSeek has upended many of the assumptions inside Silicon Valley about the economics of building AI, as well as the best technical methods for developing the technology and the extent of the US lead over competitors in China. For much of the past two-plus years since ChatGPT kicked off the global AI frenzy, the industry has bet that the path to better artificial intelligence depends largely on spending heavily on more advanced chips from companies like Nvidia Corp. and increasingly massive data centers to house them.
The market fallout was staggering. Hype over DeepSeek’s feat drove a nearly $1 trillion rout in US and European technology stocks on Monday as investors questioned the spending plans of some of America’s biggest companies. The share plunge in AI chipmaker Nvidia alone erased roughly $589 billion in market value, the biggest wipeout in US stock-market history.
Meanwhile, in DC, lawmakers are left to figure out the best route to beat back China’s progress on a technology some see as crucial to its military and economy, given that the Biden administration’s chip export curbs were not enough. David Sacks, Trump’s crypto and AI czar, said DeepSeek shows the global AI race will be very competitive — while blaming the Biden administration for regulation that “hamstrung” AI development.
Further complicating matters, the renewed uncertainty over large AI investments comes just days after President Donald Trump championed a $100 billion joint venture from OpenAI, SoftBank Group Corp. and Oracle Corp. to boost US competitiveness by investing in data centers and other physical infrastructure. Now, there are new questions about the rationale for stratospheric AI budgets.
“It’s a paradigm shift,” said Ali Ghodsi, CEO of Databricks Inc. “These models that can reason are so much cheaper to produce that you will see it be democratized. You’ll see innovations from unexpected corners of the world.”
The rise of DeepSeek
For Liang Wenfeng, DeepSeek began as a side project. Liang, 40, created the company in 2023 as an offshoot of the AI division of his hedge fund, Zhejiang High-Flyer Asset Management.
Liang was able to tap into some local talent and, crucially, chips. He had begun stockpiling around 10,000 Nvidia A100 GPUs — an older version of a key technology for training AI systems — before the US imposed export restrictions. And most of his top researchers were fresh graduates from top Chinese universities, he has said, stressing the need for China to develop its own domestic ecosystem.
DeepSeek quickly released a number of open-source AI models, starting with DeepSeek LLM in late 2023. Two more advanced models — V2 and V3 — came out in mid- and late 2024, respectively. Yet it was DeepSeek’s R1 model, released in mid-January, that really struck a chord.
Like some of the latest models from OpenAI, Google and Anthropic, R1 is intended to parrot the ways humans sometimes ruminate over problems by spending time computing an answer before responding to user queries. DeepSeek’s version differs, however, in its efficiency. The team behind it came up with some simple but key innovations, such as finding ways to get more use from the computer chips they did have access to. Another breakthrough: leaning heavily on a technique known as reinforcement learning that rewards a system for correct answers and punishes it for those that are incorrect.
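For readers curious what that reinforcement-learning idea looks like in code, here is a toy, self-contained Python sketch of rewarding correct answers and penalizing incorrect ones. It is purely illustrative and is not DeepSeek’s method or code: the arithmetic questions, the table of logits standing in for a model, and the learning rate are all assumptions made for the example.

```python
# Toy illustration of the reward idea behind reinforcement learning on
# verifiable tasks: the "model" is just a table of logits over candidate
# answers for each arithmetic question, and a REINFORCE-style update pushes
# probability toward answers that earn a positive reward. Purely didactic;
# not DeepSeek's training code.
import math
import random

random.seed(0)

questions = {"2+3": 5, "4+4": 8, "1+6": 7}   # prompt -> correct answer
answers = list(range(10))                    # candidate outputs 0..9
logits = {q: [0.0] * len(answers) for q in questions}
lr = 0.5

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

for step in range(2000):
    q, truth = random.choice(list(questions.items()))
    probs = softmax(logits[q])
    # Sample an answer from the current policy.
    a = random.choices(answers, weights=probs, k=1)[0]
    # Verifiable reward: +1 for the correct answer, -1 otherwise.
    reward = 1.0 if a == truth else -1.0
    # REINFORCE update: the gradient of log pi(a) for a softmax policy is
    # (one_hot(a) - probs), scaled here by the reward.
    for i in range(len(answers)):
        grad = (1.0 if i == a else 0.0) - probs[i]
        logits[q][i] += lr * reward * grad

# After training, the policy should put most of its probability on the
# correct answer for each question.
for q, truth in questions.items():
    probs = softmax(logits[q])
    best = max(answers, key=lambda i: probs[i])
    print(q, "->", best, f"(p={probs[best]:.2f}, truth={truth})")
```

Real systems apply the same principle at vastly larger scale, updating a neural network rather than a lookup table and scoring long generated reasoning traces rather than single digits.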
DeepSeek’s app proved popular with US users, thanks in part to an affable, somewhat awkward-sounding chatbot that shows in great detail how it plans to respond to a person’s question before diving into the results. The approach includes far more detail than, say, OpenAI’s latest reasoning models. And unlike OpenAI, which charges as much as $200 a month for unlimited access to its most advanced reasoning models, among other features, DeepSeek is currently offering its service for free. But DeepSeek also censors topics that would be sensitive in China. Asking about the Chinese Cultural Revolution, for instance, may provoke the response: “Sorry, that’s beyond my current scope. Let’s talk about something else.”
Within an hour of R1’s release, Ghodsi said he received his first request from a Databricks customer inquiring about using it. Demand has only intensified since then. In particular, he said, companies want to know how to add DeepSeek’s reasoning-like capabilities on top of Databricks’ existing AI models — something DeepSeek’s efforts show can be done inexpensively.
“The pace and interest level is unprecedented for us,” Ghodsi said.
Mehdi Osman, CEO of software company OpenReplay, said his company traditionally used services from OpenAI, Anthropic and Mistral, and that DeepSeek’s reasoning skills seemed on par with OpenAI’s. “If OpenAI doesn’t reduce their prices, I think many developers will jump to DeepSeek in the coming months,” Osman said.
OpenAI declined to comment. DeepSeek did not respond to a request for comment.
“It’s sort of come out of left field,” Demis Hassabis, CEO of Google DeepMind, told Bloomberg News last week at Davos. “There’s no doubt it’s an impressive system.” But like others in the industry, Hassabis expressed uncertainty about how DeepSeek’s models work, including to what extent it has relied on other, Western models.
Altman, meanwhile, has told OpenAI employees that his startup is trying to understand whether, and to what extent, DeepSeek’s performance is the result of distilling OpenAI’s models — that is, using the outputs of that company’s AI to train a different model to have similar capabilities — or represents an independent research breakthrough, according to a person familiar with the matter.
“Even if that [distilling an OpenAI model] saved them a little bit of time and a little bit of money — which I’m not saying they did — there’s clearly a lot of genuine technical work here going on in the paper that people can look for themselves and judge,” said Miles Brundage, an independent AI policy researcher who recently left OpenAI.
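To make the distillation concept concrete, here is a minimal, hypothetical Python sketch of the data-collection step: prompting a teacher model and saving its outputs as supervised fine-tuning data for a smaller student. The call_teacher function is a placeholder, not a real API, and nothing here describes what DeepSeek or OpenAI actually did.

```python
# Minimal sketch of distillation-style data collection: gather a teacher
# model's outputs on a set of prompts and store them as supervised
# fine-tuning examples for a student model. call_teacher is a stand-in for
# whatever teacher model or API would be queried in practice.
import json

def call_teacher(prompt: str) -> str:
    # Placeholder: a real implementation would call the teacher model here.
    return f"[teacher's answer to: {prompt}]"

prompts = [
    "Explain why the sky is blue.",
    "Solve 12 * 13 step by step.",
    "Summarize the causes of World War I.",
]

with open("distillation_data.jsonl", "w") as f:
    for prompt in prompts:
        record = {"prompt": prompt, "completion": call_teacher(prompt)}
        f.write(json.dumps(record) + "\n")
```

A student model would then be fine-tuned on pairs like these with an ordinary next-token prediction objective, so it learns to imitate the teacher's answers, which is why the practice is contentious when the teacher belongs to a competitor.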
Some US tech founders and venture capitalists are also skeptical about the real price tag for DeepSeek’s technology. Many, including Brundage, questioned whether DeepSeek’s $5.6 million training estimate included the cost of prior research experiments as well as fixed costs, such as investments in graphics processing units and the data centers that house them.
Liang, for his part, has suggested that costs and fundraising are not his chief concern. Rather, the bottleneck for further advances, Liang said in an interview with Chinese outlet 36kr, is US restrictions on access to the best chips.
“More investment does not necessarily lead to more innovation,” Liang said. “Otherwise, large companies would take over all innovation.”
A new competitive landscape
In the weeks leading up to the DeepSeek frenzy, some of the large companies Liang may have been alluding to flexed their financial muscles even more.
Amazon projected spending about $75 billion in capital expenditures in 2024, and an increased amount this year, mostly on technology infrastructure like the chips and data centers that power artificial intelligence. Meta said it would invest as much as $65 billion on AI-related projects in 2025. And Microsoft said it would spend $80 billion on AI data centers this fiscal year.
Much of the spending by the largest cloud-computing companies is going towards Nvidia graphics processing units. Amazon, Google and Microsoft are also building custom chips designed for AI, work that could be less useful in the long term if developers are able to build and run models on less-specialized hardware, Stefan Slowinski, an analyst with BNP Paribas Exane, wrote in a research note on Monday.
The cloud giants are already grappling with questions from investors about returns from their sizable AI outlays. Microsoft, for one, has struggled to monetize the Copilot chatbots it has been baking into much of its product line. Amazon, meanwhile, has trailed its main rivals in developing its own large language models, even as it infuses chatbots and other AI tools into its retail and cloud-computing businesses.
[Photo caption: Amazon engineers have been working on artificial intelligence chips at Annapurna Labs in Austin.]
Still, the two companies’ huge investments may pay off down the road. Amazon is betting its status as the largest provider of rented computing power will help it prosper as other outfits train and run AI programs on Amazon Web Services’ servers. Microsoft is more focused on building data centers that run AI models rather than train them, according to Mark Moerdler, an analyst at Bernstein Societe Generale Group, who expects the company’s spending to decelerate as early as next year. “We believe they are building predominantly inferencing capacity and not training,” he said. “If that is correct, I don’t think DeepSeek is a problem for Microsoft.”
The big question is whether large US tech companies will adopt aspects of DeepSeek’s approach. Some AI developers say the Chinese upstart’s success could accelerate a move toward cheaper and more profitable AI — setting in motion a natural progression that has propelled almost every major technological development, from chips to smartphones.
“The future of LLMs belongs to those who focus on more efficient techniques, not more compute,” said Aidan Gomez, CEO of AI startup Cohere. “We’ve believed this for a long time, but it’s finally hitting home across the industry.”