點新聞-dotdotnews
Opinion | China's DeepSeek AI model: New contender in global AI landscape
China
2025.01.29 09:16

As the U.S. "takes a break," China's tech sector pushes forward, catching up with lower costs, faster speeds, and stronger capabilities. This sentiment was echoed by Alexandr Wang, founder of the American AI company Scale AI, in reference to DeepSeek, the groundbreaking domestic AI model that took the world by storm just before the Lunar New Year.

Recently, DeepSeek, headquartered in Hangzhou, China, unveiled its reasoning model R1. This model not only approaches the performance of OpenAI's leading model, o1, but does so at a fraction of the cost—approximately one-tenth for inference and only 5% of the training cost compared to GPT-4o. On Jan. 27, DeepSeek topped the Apple App Store's free app download chart in the U.S., surpassing ChatGPT and marking a significant milestone for AI development in China. Despite lacking access to NVIDIA's most powerful GPUs, China has managed to train an open-source AI model with exceptional reasoning capabilities at remarkably low cost using standard AI accelerators.

Experts suggest that U.S. sanctions have inadvertently spurred innovation, making open source the only viable option. DeepSeek's performance has sent shockwaves through Silicon Valley, where it has even been dubbed a "mysterious force from the East." The New York Times reported that DeepSeek achieved remarkable performance with an investment of less than US$6 million and only 2,000 chips, challenging the belief that only tech giants can develop cutting-edge AI.

DeepSeek was founded in May 2023 by Liang Wenfeng, a native of Guangdong, and its team has made significant strides in AI technology. The open-source model DeepSeek-V3, released on Dec. 27, abandoned the commonly used "supervised fine-tuning" training paradigm in favor of "reinforcement learning," demonstrating a shift in innovation driven by U.S. restrictions on chip exports to China.

On Jan. 20, DeepSeek officially released the R1 reasoning model, which matches the performance of OpenAI's o1 in tasks like mathematics, coding, and natural language reasoning, while supporting free commercial use and modification. Its pre-training cost was only US$5.576 million, compared to an estimated US$100 million for GPT-4o, achieved on a cluster of 2,048 lower-spec NVIDIA H800 GPUs over 50 days.

The R1 model stands out for its affordability, with a query cost of just US$0.14 per million tokens, dramatically lower than OpenAI's US$7.50, an astounding 98% reduction. Anjney Midha, a partner at the renowned Silicon Valley investment firm A16z, noted that R1 has quickly become the model of choice among top researchers at prestigious institutions like Stanford and MIT.
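A quick back-of-the-envelope check of the pricing figures quoted above (the per-million-token prices are the article's; the 50-million-token workload is a hypothetical example):

```python
# Per-million-token API prices as quoted in the article (US$).
deepseek_r1 = 0.14
openai_o1 = 7.50

# Percentage price reduction of R1 relative to o1.
reduction = (1 - deepseek_r1 / openai_o1) * 100
print(f"Price reduction: {reduction:.1f}%")  # → 98.1%

# Cost of a hypothetical 50-million-token workload at each price.
tokens_millions = 50
print(f"R1: ${deepseek_r1 * tokens_millions:.2f} vs o1: ${openai_o1 * tokens_millions:.2f}")
```

The computed 98.1% matches the roughly 98% reduction cited in the article.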

The emergence of DeepSeek has raised alarms within companies like Meta, where internal discussions reveal concern after DeepSeek-V3's impressive benchmark performances overshadowed Llama 4, which was trained on an unprecedented scale of 240,000 GPUs. AI policy researcher Ritwik Gupta from UC Berkeley commented that DeepSeek's advancements indicate that "AI capabilities have no moat," emphasizing that China's talent pool in systems engineering is far larger than that of the U.S. and adept at utilizing computational resources efficiently.

Recent initiatives by companies like OpenAI and SoftBank to invest US$500 billion over four years in U.S. AI development reflect the urgency felt in the industry. Tsinghua University professor Shen Yang stated that DeepSeek's success represents a significant victory for China in the tech arena, potentially altering the trajectory of global AI technology. He believes that the China-U.S. AI competition has entered a phase of strategic equilibrium, where the future of the AI industry is no longer solely about "computational power," but rather a new contest of "intelligence" and "self-reliance."

DeepSeek's model has demonstrated overwhelming testing performance, with experts confirming its capabilities in critical areas like mathematics, programming, and reasoning, rivaling OpenAI's o1 while reducing overall API usage costs by about 95%. DeepSeek's hardware requirements were only one-eighth of those of leading global companies: just 2,048 lower-tier GPUs, used for a fraction of the time its competitors required.

The financial implications of DeepSeek's advancements have sent ripples through the stock market, with predictions suggesting that China's breakthroughs could dismantle America's AI monopoly. NVIDIA's stock suffered a sharp decline, losing over 17.6% of its value, equating to a market cap drop of approximately US$614.3 billion. Stocks of related companies like Meitu, Kingsoft Cloud, and SenseTime surged significantly, reflecting a reassessment of their market value.

Despite U.S. attempts to limit China's AI development through sanctions, experts argue that these measures may have the opposite effect, driving innovation and self-sufficiency in China. The recent developments signal a pivotal moment in the ongoing technological rivalry, with many viewing the restrictions as a catalyst for Chinese advancement rather than a deterrent.

In summary, the rise of DeepSeek is a testament to how resource limitations can spark creativity and innovation, challenging established norms and reshaping the global AI landscape.

(Source: Wen Wei Po)

