
Deepline | Rakuten's 'homegrown' AI sparks controversy after community reveals DeepSeek V3 architecture

Deepline
2026.03.18 18:11

"The era has finally arrived where Japan uses Chinese AI to pass it off as domestically produced AI."

The controversy started yesterday (March 17) when Rakuten Group, the Japanese tech company, launched its 700-billion-parameter large model, Rakuten AI 3.0—billed as "Japan's largest and most powerful"—with support from the GENIAC project, a government-backed AI initiative run by Japan's Ministry of Economy, Trade and Industry (METI).

Shortly after the release, however, an open-source community uncovered that the model's underlying architecture actually comes from China's DeepSeek-V3, with Rakuten merely fine-tuning it on Japanese-language data.

On Hugging Face, the well-known AI open-source repository, Rakuten AI 3.0's configuration file openly states that its architecture is derived from DeepSeek V3.
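For anyone who wants to verify this kind of claim themselves, the configuration file is public and tiny. Below is a minimal sketch in Python; the repository id is a placeholder (the article does not give the actual Hugging Face repo name), and the field values in the comments are what a DeepSeek-V3-derived model typically reports.

```python
import json

from huggingface_hub import hf_hub_download

# Placeholder repository id -- substitute the model's real repo name.
REPO_ID = "Rakuten/RakutenAI-3.0"

# Download only the configuration file, not the (enormous) weights.
config_path = hf_hub_download(repo_id=REPO_ID, filename="config.json")

with open(config_path) as f:
    config = json.load(f)

# The giveaway fields: a DeepSeek-V3-derived model typically reports
# "model_type": "deepseek_v3" and an architecture class such as
# "DeepseekV3ForCausalLM".
print(config.get("model_type"))
print(config.get("architectures"))
```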

Yet the press release announcing Rakuten AI 3.0 made no mention of DeepSeek. It noted only, vaguely, that the model "incorporates the best of the open-source community," leading many online to believe it was a purely Japanese-developed model.

What's more intriguing is that Rakuten, in an attempt to cover this up, quietly deleted DeepSeek's MIT open-source license file when releasing the model. Only after the community presented evidence did they sheepishly re-add it under the filename "NOTICE."
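Such changes are also easy to catch, because a Hugging Face repository's file listing is public. A minimal sketch, again with a placeholder repo id, of scanning a repository for license artifacts:

```python
from huggingface_hub import list_repo_files

# Placeholder repo id -- substitute the real repository name.
files = list_repo_files("Rakuten/RakutenAI-3.0")

# After the community pushback, the MIT text reportedly reappeared
# under the filename "NOTICE" rather than the usual "LICENSE".
for name in files:
    if name.upper() in {"LICENSE", "LICENSE.MD", "NOTICE", "NOTICE.MD"}:
        print(name)
```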

Japanese netizens expressed strong disapproval: the company took subsidies from the Japanese government, yet merely fine-tuned China's DeepSeek. "Unacceptable," wrote one. Another commented, "Using DeepSeek is one thing, but trying to hide it is really, really pathetic."

Judging by Rakuten's press release alone, the model is indeed a significant milestone in Japan's LLM landscape.

It is a Mixture-of-Experts (MoE) model with approximately 700 billion parameters; the open-source community confirmed it has exactly DeepSeek V3's configuration of 671B total parameters with 37B activated. Rakuten's Chief AI Officer, Ting Cai, described it as "a remarkable combination of data, engineering, and innovative architecture at scale."

The name "Ting Cai" clearly doesn't sound like a local Japanese person. A Japanese netizen commented, "Using DeepSeek is bad enough; what's worse is that the big boss leading this model is an immigration hardliner."

Cai previously worked at Google and Apple in the US, spent over 15 years at Microsoft, and earned his undergraduate degree in Computer Science at Stony Brook University in the US. He has said in an interview that his first trip abroad, at 18, was to Japan.

Regarding the performance of Rakuten AI 3.0, official benchmarks show it achieving exceptionally high scores across various dimensions, including Japanese cultural knowledge, history, graduate-level reasoning, competitive mathematics, and instruction following, seemingly poised to dominate Japan's domestic model scene.

However, the models it was compared against include the now-discontinued GPT-4o, OpenAI's open-weight GPT-OSS with only 120 billion parameters, and ABEJA QwQ 32b, a Qwen-based model from ABEJA, another emerging Japanese AI company.

A 700-billion-parameter model pitted against rivals of at most 120 billion parameters: Rakuten AI 3.0 certainly won convincingly. Furthermore, as a key beneficiary of METI's GENIAC project, Rakuten received substantial support in computational resources.

GENIAC's original purpose was precisely to build Japan's domestic generative AI ecosystem and alleviate anxiety over reliance on overseas technology giants.

Its status as Japan's largest model, coupled with this "national team" endorsement, instantly cast Rakuten AI 3.0 as the "hope of the nation" upon its debut.

But that halo faded faster than expected.

For starters, the combination of roughly 700 billion parameters and an MoE architecture is highly distinctive in today's open-source model community. When open-source developers checked the configuration files on Hugging Face, there it was, stated outright: DeepSeek V3.
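The "671B total, 37B activated" split is the hallmark of a sparse Mixture-of-Experts design: a router sends each token to only a handful of the many expert networks, so most parameters sit idle on any given forward pass. The toy sketch below illustrates top-k expert routing; the dimensions are illustrative, not DeepSeek V3's actual layer code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    """Toy top-k Mixture-of-Experts layer (illustration only)."""

    def __init__(self, dim: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, num_experts)  # scores every expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, dim)
        # Pick the top-k experts for each token; only those experts run.
        weights, indices = self.router(x).topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

# Total parameters grow with num_experts, but each token activates only top_k.
layer = ToyMoELayer(dim=64)
tokens = torch.randn(10, 64)
print(layer(tokens).shape)  # torch.Size([10, 64])
```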

At its core, this is "Chinese architecture + Japanese fine-tuning." DeepSeek provided the globally validated, highly efficient underlying architecture and reasoning capabilities, while Rakuten leveraged its local advantage to fine-tune it with high-quality Japanese corpora, making it more attuned to Japanese culture.
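To make that division of labor concrete, here is a hedged sketch of what localized fine-tuning of an open-weight base typically looks like, using parameter-efficient LoRA. The base model, the two-line corpus, and the hyperparameters are stand-ins for illustration, not Rakuten's actual pipeline.

```python
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

# Placeholder base model for illustration; Rakuten's actual base was a
# DeepSeek-V3-scale checkpoint, far too large to load like this.
BASE = "gpt2"

tokenizer = AutoTokenizer.from_pretrained(BASE)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(BASE)

# LoRA trains small adapter matrices instead of all base weights --
# the usual budget-friendly way to localize an open model.
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM"))

# Two-sentence stand-in for a high-quality Japanese corpus.
corpus = Dataset.from_dict({"text": ["今日はいい天気ですね。", "東京は日本の首都です。"]})
tokenized = corpus.map(
    lambda ex: tokenizer(ex["text"], truncation=True),
    remove_columns=["text"],
)

Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="ja-finetune",
        num_train_epochs=1,
        per_device_train_batch_size=1,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```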

Objectively speaking, taking an open-source model and performing localized fine-tuning is an extremely common and reasonable practice in the tech world. Nikkei previously reported that six out of the top ten models developed by Japanese companies were secondary developments based on DeepSeek or Qwen.

If Rakuten had also been upfront and acknowledged using DeepSeek as the foundation, it might have been dismissed as an unoriginal "repackaged" release, perhaps even riding on DeepSeek's coattails for some attention.

Instead, they chose to conceal it.

The MIT license adopted by DeepSeek is arguably the most permissive in the open-source world. It lets users freely use the code commercially, modify it, and even close-source it for profit. Its only requirement: retain the original copyright notice and permission notice within the project.

Rakuten, however, not only omitted any mention of DeepSeek in its model launch blog post but also deleted the license file from the code repository outright, while loudly announcing it was open-sourcing the model under the Apache 2.0 license. Apache 2.0 is also commercially friendly, but it additionally carries an explicit patent grant and is often used by major companies to build their own open-source ecosystems and patent moats.

Rakuten's calculation was clear: erase DeepSeek's name, slap on its own Apache 2.0 license, and then package itself as Japan's AI savior, "generously open-sourcing a 700-billion-parameter large model."

For over a year, we've heard talk of a "European DeepSeek" or an "American DeepSeek," yet none has materialized. Rakuten, too, aspired to create a "Japanese DeepSeek." But under the pressure of computing and training costs, and amid the breakneck pace of global model development, having both the ultimate cost-effectiveness of Chinese technology and the facade of a "homegrown champion" is proving exceedingly difficult.

Maybe we should all just wait for DeepSeek V4.

(Source: APPSO, WeChat Public Platform)

Related News:

Deepline | Saving the Fox in the Snow: Darkly comedic AI short drama breaks internet

Deepline | When perfection loses its luster: Why we crave 'human touch' in AI world

Tags: Rakuten AI 3.0 · Rakuten Group · DeepSeek V3 · AI ecosystem · GENIAC project
