DeepSeek-V3 and R1 AI (Artificial Intelligence) assistant chatbot models from China: Why are they a breakthrough for all and a setback to the USA? (February 2025):

By Arvind Kumar February 02, 2025

DeepSeek: The Chinese AI (Artificial Intelligence) Startup Challenging US Big Tech:

DeepSeek's artificial intelligence assistant made big waves on January 27, 2025, becoming the top-rated app in the Apple App Store and sending tech stocks into a downward tumble. DeepSeek, a relatively unknown Chinese AI startup, has sent shockwaves through Silicon Valley with its recent release of cutting-edge AI models. This article therefore analyzes the DeepSeek AI assistant and its impact. The models launched by DeepSeek (V3 and R1) rival OpenAI's ChatGPT and Meta's Llama 3.1. DeepSeek is a private Chinese start-up founded in Hangzhou, China, in July 2023 by Liang Wenfeng, a graduate of Zhejiang University, one of China's top universities, who funded the startup via his hedge fund. Liang, who has about $8 billion in assets, is also the co-founder of the quantitative hedge fund High-Flyer. In China, the start-up is known for grabbing young and talented A.I. researchers from top universities, promising high salaries and an opportunity to work on cutting-edge research projects. DeepSeek's team primarily comprises young, talented graduates from top Chinese universities, fostering a culture of innovation and a deep understanding of the Chinese language and culture. Notably, the company's hiring practices prioritize technical abilities over traditional work experience, resulting in a team of highly skilled individuals with a fresh perspective on AI development. Liang's fund announced in March 2023 on its official WeChat account that it was "starting again", going beyond trading to concentrate resources on creating a "new and independent research group, to explore the essence of AGI" (Artificial General Intelligence). DeepSeek was created a few months later, in July 2023. It is unclear how much High-Flyer has invested in DeepSeek. However, High-Flyer has an office in the same building as DeepSeek and owns patents related to chip clusters used to train AI models.
This unique funding model has allowed DeepSeek to pursue ambitious AI projects without the pressure of external investors, enabling it to prioritize long-term research and development. Its goal is to build A.I. technologies along the lines of OpenAI’s ChatGPT chatbot or Google’s Gemini. By 2021, DeepSeek had acquired thousands of computer chips from the U.S. chipmaker Nvidia, which are a fundamental part of any effort to create powerful A.I. systems. The fund, by 2022, had amassed a cluster of 10,000 of California-based Nvidia’s high-performance A100 graphics processor chips that are used to build and run AI systems, according to a post on the Chinese social media platform WeChat. The U.S. soon after restricted sales of those chips to China. DeepSeek released its first large language model in 2023 and later released several more large language models, which are the technology behind chatbots like ChatGPT and Gemini. On January 10, 2025, it released its first free chatbot app, which was based on a new model called DeepSeek-V3. DeepSeek has said its recent models were built with Nvidia’s lower-performing H800 chips, which are not banned in China, sending a message that the fanciest hardware might not be needed for cutting-edge AI research. The company has attracted attention in global AI circles after writing in a paper in December 2024 that the training of DeepSeek-V3 required less than $6 million worth of computing power from Nvidia H800 chips. DeepSeek claimed that its new AI model is on par with similar models from US companies such as ChatGPT maker OpenAI, and was more cost-effective in its use of expensive Nvidia chips to train the system on troves of data. But that was not the end of the claims. DeepSeek published another research paper on January 20, 2025, the same day as President Donald Trump’s inauguration, and that paper set in motion the panic that followed.
That paper was about another DeepSeek AI model called R1 that showed advanced “reasoning” skills, such as the ability to rethink its approach to a math problem and was significantly cheaper than a similar model sold by OpenAI called o1. The startup says its AI models, DeepSeek-V3 and DeepSeek-R1, are at par or better than the most advanced models in the United States from OpenAI i.e. the company behind ChatGPT, and Facebook parent company Meta at a fraction of cost. Outages hit its website amid a spike in interest. DeepSeek's journey began with the release of DeepSeek Coder in November 2023, an open-source model designed for coding tasks. This was followed by DeepSeek LLM, a 67B parameter model aimed at competing with other large language models. DeepSeek-V2, launched in May 2024, gained significant attention for its strong performance and low cost, triggering a price war in the Chinese AI model market. This disruptive pricing strategy forced other major Chinese tech giants, such as ByteDance, Tencent, Baidu, and Alibaba, to lower their AI model prices to remain competitive. DeepSeek-V2 was succeeded by DeepSeek-Coder-V2, a more advanced model with 236 billion parameters. It is designed for complex coding challenges and features a high context length of up to 128K tokens. This model is available through a cost-effective API (Application Programming Interface), priced at $0.14 per million input tokens and $0.28 per million output tokens. The company's latest models, DeepSeek-V3 and DeepSeek-R1, have further solidified its position as a disruptive force. DeepSeek-V3, a 671B parameter model, boasts impressive performance on various benchmarks while requiring significantly fewer resources than its peers. DeepSeek-R1, released in January 2025, focuses on reasoning tasks and challenges OpenAI's o1 model with its advanced capabilities. 
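The API pricing quoted above for DeepSeek-Coder-V2 ($0.14 per million input tokens and $0.28 per million output tokens) can be turned into a rough per-workload estimate. The following is a minimal sketch, not an official billing formula: the function name and the sample workload are hypothetical, and real invoices may differ (for example, because of caching discounts or later price changes).

```python
# Prices quoted in the article for the DeepSeek-Coder-V2 API (USD per million tokens).
INPUT_PRICE_PER_MILLION = 0.14
OUTPUT_PRICE_PER_MILLION = 0.28


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a workload from its token counts (illustrative helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


# Hypothetical workload: 5 million input tokens and 1 million output tokens.
cost = estimate_cost_usd(5_000_000, 1_000_000)
print(f"Estimated cost: ${cost:.2f}")
```

At these prices, output tokens cost twice as much as input tokens, so workloads that generate long answers are the ones where the per-token price matters most.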
DeepSeek also offers a range of distilled models, known as DeepSeek-R1-Distill, which are based on popular open-weight models like Llama and Qwen, fine-tuned on synthetic data generated by R1. These distilled models provide varying levels of performance and efficiency, catering to different computational needs and hardware configurations. DeepSeek's Strategic Partnerships: DeepSeek's success is not solely due to its internal efforts. The company has also forged strategic partnerships to enhance its technological capabilities and market reach. One notable collaboration is with AMD, a leading provider of high-performance computing solutions. DeepSeek leverages AMD Instinct GPUs and ROCm software across key stages of its model development, particularly for DeepSeek-V3. This partnership provides DeepSeek with access to cutting-edge hardware and an open software stack, optimizing performance and scalability. On its Chinese site, DeepSeek blamed "large-scale malicious attacks" on its service, requiring it to temporarily limit new registrations. "Existing users can log in as usual," the company said in the post, which was dated shortly after midnight on Jan. 28 in China's local time. DeepSeek released two open-source AI assistant (chatbot) models in January 2025, as listed below:

Model No.	Release date
DeepSeek-V3	January 10, 2025
DeepSeek-R1 (Deepthink)	January 20, 2025
Liang has become the Sam Altman of China, an evangelist for AI technology and investment in new research. (Sam Altman is the CEO of OpenAI.) Like other AI startups, including Anthropic and Perplexity, DeepSeek released various competitive AI models over the past year that have captured some industry attention. Its V3 model raised some awareness about the company, although its content restrictions around sensitive topics about the Chinese government and its leadership sparked doubts about its viability as an industry competitor. But R1, launched on January 20, 2025, gained significant attention when the company revealed to the Wall Street Journal its shockingly low cost of operation. And it is open-source, which means other companies can test and build upon the model to improve it.

Impact of Deepseek on the Stock Market (January 27, 2025):
U.S. tech stocks tumbled on January 27, 2025, after a small Chinese artificial intelligence startup, DeepSeek, said it could compete with ChatGPT and other U.S.-based AI models at a fraction of the cost. The major falls in the stock market value of important companies and indexes following the release of the Chinese AI model DeepSeek-V3 are listed below:
Company or index	% fall in stock market value	Notes
Vistra	28%
GE Vernova	21%
Broadcom Inc	17.4%
Nvidia	17%	A record single-day loss; wiped out US$600 billion from the company’s market cap.
Oracle	14%	Has partnered with SoftBank and OpenAI on the Stargate AI project announced by President Donald Trump last week.
ASML	6%
Google parent Alphabet	4.2%
Tech-heavy NASDAQ composite index	3.1%
NASDAQ futures	3%
ChatGPT backer Microsoft	2.1%
S&P 500 index	1.5%
Why is it causing Nvidia and other stocks to slump? A Chinese artificial intelligence company called DeepSeek is grabbing America's attention and sending a shock wave through Wall Street due to its new tech, which some experts say rivals that of OpenAI's ChatGPT. DeepSeek is also catching investors off guard because of the low development costs for its AI app, which Wedbush Securities analyst Dan Ives pegged at only $6 million. By comparison, OpenAI, Google, and other major U.S. companies are on track to invest a total of roughly $1 trillion in AI over the coming years, according to Goldman Sachs. On January 27, 2025, DeepSeek's rollout roiled shares of AI stalwarts such as Nvidia, the high-flying manufacturer of advanced chips engineered for AI development, and Dutch company ASML, another chipmaker. The Chinese company's tech is raising questions about whether demand for Nvidia's chips could take a hit, as well as whether investors are overvaluing tech stocks that have been buoyed by the promise of AI, from Meta to Microsoft, experts said. "DeepSeek has taken the market by storm by doing more with less," said Giuseppe Sette, president of AI market research firm Reflexivity, in an email. "This shows that with AI, the surprises will keep on coming in the next few years." DeepSeek's latest app comes just days after Mr. Trump announced a new $500 billion venture with ChatGPT maker OpenAI, Softbank, and Oracle, dubbed Stargate, which he touted as ensuring "the future of technology" in the U.S. AI-related stocks hit on January 27, 2025 (Monday), with Nvidia shares tumbling 17%, shedding $600 billion in value and marking the single-biggest one-day loss for a company in stock market history, according to CNBC. ASML sank 6%, while Broadcom, another semiconductor stock, also slumped 17%. 
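The two Nvidia figures reported above (a 17% fall and roughly $600 billion in lost value) imply an approximate market capitalization before and after the drop. The sketch below works through that arithmetic. Both inputs are rounded figures from the article, so the results are only approximate, and the function name is illustrative.

```python
def implied_value_before_drop(loss_usd_bn: float, drop_fraction: float) -> float:
    """Back out the pre-drop value from an absolute loss and a fractional drop."""
    if not 0 < drop_fraction < 1:
        raise ValueError("drop_fraction must be between 0 and 1")
    return loss_usd_bn / drop_fraction


# Figures reported in the article: about $600 billion lost on a 17% fall.
LOSS_USD_BN = 600.0
DROP_FRACTION = 0.17

before = implied_value_before_drop(LOSS_USD_BN, DROP_FRACTION)
after = before - LOSS_USD_BN
print(f"Implied value before the drop: about ${before:,.0f} billion")
print(f"Implied value after the drop:  about ${after:,.0f} billion")
```

With these rounded inputs, the implied pre-drop value is roughly $3.5 trillion and the post-drop value roughly $2.9 trillion.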
Some energy-related stocks also plunged on January 27, 2025 (Monday) on investor worries that the new tech could require less energy to run, translating into lower demand from the tech sector. GE Vernova, which makes wind and gas turbines, plunged 21%, while electricity generator Vistra slumped 28%. The tech-heavy Nasdaq index slumped 3%, or 612 points, while the S&P 500 declined 1.5%. Wall Street is trying to assess the long-term impact of a low-cost AI tool from China that rivals ChatGPT and other so-called generative AI apps. It also raises questions about whether Silicon Valley is overspending on tech advancements in the AI sector. The fact that this technology is supposed to take less energy and is more cost-effective than U.S.-based models has U.S. technology investors very concerned. In the meantime, major tech companies, including Meta and Microsoft, were slated to report earnings around the same time, where investors will likely hear more from their executives about their AI plans and their thoughts on DeepSeek. Some Wall Street analysts think that the stock sell-off on January 27, 2025, was an overreaction, noting that the enormous demand for AI will continue lifting key players in the sector. Because of this AI development, Wall Street tech stocks, led by AI chip manufacturer Nvidia, slumped under massive pressure. It is viewed as a major threat to US tech dominance. The emergence of low-cost and powerful AI assistants (the DeepSeek-V3 and R1 models) has raised doubts about the reasoning behind some U.S. tech companies' decision to pledge billions of dollars in AI investment, and shares of several big tech players, including Nvidia, have been hit. What was the reaction of Nvidia to DeepSeek? In a statement, Nvidia offered praise for DeepSeek. "DeepSeek is an excellent AI advancement and a perfect example of test-time scaling," the company said in an email.
"DeepSeek's work illustrates how new models can be created using that technique, leveraging widely available models and compute that is fully export-control compliant." But, Nvidia added, AI inference, or using AI models to make decisions or predictions, "requires significant numbers of NVIDIA GPUs and high-performance networking. We now have three scaling laws: pre-training and post-training, which continue, and new test-time scaling."

Why is DeepSeek shaking up the tech world?

There are some solid reasons to shake up the tech world, as listed below:

Tech stocks tumbled. Giant companies like Meta and Nvidia faced a barrage of questions about their future.
Tech executives took to social media to proclaim their fears.
And it was all because of a little-known Chinese artificial intelligence start-up called DeepSeek.
DeepSeek caused waves all over the world on January 27, 2025, as one of its accomplishments, creating a very powerful AI model with far less money than many AI experts thought possible, raised a host of questions, including whether U.S. companies were even competitive in AI anymore.
DeepSeek is “AI’s Sputnik moment,” Marc Andreessen, a tech venture capitalist, posted on social media on January 26, 2025.
How could a company that few people had heard of have such an effect?

DeepSeek claims that it cost less than $6 million to train its DeepSeek-V3 model. OpenAI, in comparison, spent more than $100 million to train the latest version of ChatGPT, according to Wired.
Analysts say the technology is impressive, especially since DeepSeek says it used less-advanced chips to power its AI models.

It is worth mentioning that former President Joe Biden's administration had limited the export of certain advanced AI chips, as per the statement reproduced below:

"In the wrong hands, powerful AI systems have the potential to exacerbate significant national security risks, including by enabling the development of weapons of mass destruction, supporting powerful offensive cyber operations, and aiding human rights abuses, such as mass surveillance."

However, DeepSeek says the chip restrictions haven’t stopped it from releasing a model that is 20 to 50 times cheaper than the OpenAI o1 model, depending on the task.

It was expected that AI development would grow by leaps and bounds since the public launch of ChatGPT, but the U.S. was surprised when the latest leap came from China.

DeepSeek has rolled out a free assistant that uses lower-cost chips and less data.

Worries that the emergence of DeepSeek, a low-cost Chinese artificial intelligence model, would threaten the dominance of AI leaders like Nvidia led to a rout in tech stocks on Wall Street on January 27, 2025.

Technology shares around the world slid on January 27, 2025, as a surge in popularity of a Chinese discount artificial intelligence model shook investors' faith in the AI sector's voracious demand for high-tech chips.
Startup DeepSeek has rolled out a free assistant that uses lower-cost chips and less data, seemingly challenging a widespread bet in financial markets that AI will drive demand along a supply chain from chipmakers to data centers.
DeepSeek's recent models have raised market concerns regarding its potential impact on the competitiveness of US big tech and the broader AI capex momentum.
With the recent initial success of DeepSeek’s large language AI models, investors are grappling with concerns about potential AI price wars, Big 4’s AI capex intensity, and how to navigate investments across various layers, like enabling versus application layers.

The market will worry about demand growth in computing power. “We have been highlighting our concern about AI's ROI, as the massive investment in GPUs (e.g., just NVDA's 2024 GPU rev could be > US$200bn) has generated little return.”
We have seen model improvement (at a high cost) but no concrete examples of AI monetization that could justify the investments.
DeepSeek could prompt investors to ask hard questions about these computing power investments.
AI players’ management in the US could be under more pressure to justify further raising AI CapEx in 2026.

The release of OpenAI's ChatGPT in late 2022 caused a scramble among Chinese tech firms, who rushed to create their own chatbots powered by artificial intelligence.
After the release of the first Chinese ChatGPT equivalent, made by search engine giant Baidu (9888.HK), there was widespread disappointment in China at the gap in AI capabilities between U.S. and Chinese firms.
The quality and cost efficiency of DeepSeek's models have flipped this narrative on its head.
The two models that have been showered with praise by Silicon Valley executives and U.S. tech company engineers alike, DeepSeek-V3 and DeepSeek-R1, are on par with OpenAI and Meta's most advanced models, the Chinese startup has said.
They are also cheaper to use. The DeepSeek-R1, released last week, is 20 to 50 times cheaper to use than the OpenAI o1 model, depending on the task, according to a post on DeepSeek's official WeChat account.

The new AI model was developed by DeepSeek, a startup founded in 2023 that has somehow managed a breakthrough that famed tech investor Marc Andreessen has called “AI’s Sputnik moment”: R1 can nearly match the capabilities of its far more famous rivals, including OpenAI’s GPT-4, Meta’s Llama and Google’s Gemini, but at a fraction of the cost.
The company said it had spent just $5.6 million powering its base AI model, compared with the hundreds of millions, if not billions of dollars US companies spend on their AI technologies.
That’s even more shocking when considering that the United States has worked for years to restrict the supply of high-power AI chips to China, citing national security concerns.
That means DeepSeek was supposedly able to achieve its low-cost model on relatively underpowered AI chips.

AI is a power-hungry and cost-intensive technology. Therefore, America’s most powerful tech leaders are buying up nuclear power companies to provide the necessary electricity for their AI models.
Meta last week said it would spend upward of $65 billion this year on AI development.
Sam Altman, CEO of OpenAI, last year said the AI industry would need trillions of dollars in investment to support the development of high-in-demand chips needed to power the electricity-hungry data centers that run the sector’s complex models.
So the notion that similar capabilities as America’s most powerful AI models can be achieved for such a small fraction of the cost, and on less capable chips, represents a sea change in the industry’s understanding of how much investment is needed in AI.
The technology has many skeptics and opponents, but its advocates promise a bright future: AI will advance the global economy into a new era, they argue, making work more efficient and opening up new capabilities across multiple industries that will pave the way for new research and developments.
Andreessen, a Trump supporter and co-founder of Silicon Valley venture capital firm Andreessen Horowitz, called DeepSeek “one of the most amazing and impressive breakthroughs I’ve ever seen” in a post on X.
If that potentially world-changing power can be achieved at a significantly reduced cost, it opens up new possibilities and threats to the planet.

Some doubts about the claims of the DeepSeek-V3 model:

Some experts doubt DeepSeek’s claim that it used less-advanced chips and suspect that it used advanced Nvidia chips instead.
Total training costs seem higher than DeepSeek claims (US$6 million).

DeepSeek is the latest app with connections to China to hit the top of the Apple App Store charts.

Some have publicly expressed skepticism about DeepSeek's success story.
Scale AI CEO Alexandr Wang alleged, without providing evidence, that DeepSeek has 50,000 Nvidia H100 chips, which he claimed would not be disclosed because that would violate Washington's export controls that ban such advanced AI chips from being sold to Chinese companies.
DeepSeek did not immediately respond to a request for comment on the allegation.
Analysts said DeepSeek's total training costs for its V3 model were unknown but were likely much higher than the $5.58 million the startup said was used for computing power.
The analysts also said the training costs of the equally acclaimed R1 model were not disclosed.

The industry is taking the company at its word that the cost was so low.
No one is disputing it, but the market freak-out hinges on the truthfulness of a single and relatively unknown company.
The company notably didn’t say how much it cost in total to develop its model, leaving out potentially expensive research and development costs. (Still, it probably didn’t spend billions of dollars.)
It’s also far too early to count out American tech innovation and leadership. One achievement, albeit a gobsmacking one, may not be enough to counter years of progress in American AI leadership. A massive customer shift to a Chinese startup is unlikely.
“The DeepSeek model rollout is leading investors to question the lead that US companies have and how much is being spent and whether that spending will lead to profits (or overspending),” said Keith Lerner, analyst at Truist.
“Ultimately, our view is the required spend for data and such in AI will be significant, and US companies remain leaders.”
Although the cost-saving achievement may be significant, the R1 model is a ChatGPT competitor — a consumer-focused large-language model. It hasn’t yet proven it can handle some of the massively ambitious AI capabilities for industries that, for now, still require tremendous infrastructure investments.
“Thanks to its rich talent and capital base, the US remains the most promising ‘home turf’ from which we expect to see the emergence of the first self-improving AI,” said Giuseppe Sette, president of AI market research firm Reflexivity.
It's one thing to train a [large language] model for less money, but accommodating the huge demand for the consumption of all this AI technology is still going to require massive amounts of infrastructure.

Know about the DeepSeek app:

DeepSeek's app is powered by the DeepSeek-V3 and R1 models.
The startup describes its app as using "state-of-the-art" AI that "leads global standards and matches top-tier international models."
The app hit the top of Apple’s App Store “top free apps” chart on January 27, 2025.
On January 27, 2025, DeepSeek's AI Assistant overtook rival ChatGPT to become the top-rated free application available on Apple's App Store in the United States.

The DeepSeek app was downloaded around 2 million times on January 27, 2025, and surged up the App Store charts, surpassing ChatGPT.

In January 2025, Chinese startup DeepSeek launched free AI assistant apps (models V3 and R1) that use less data and run at a fraction of the cost of incumbent services.

DeepSeek website hit by cyberattack:

Chinese startup DeepSeek said on January 27, 2025, that it would temporarily limit registrations due to a cyberattack on its website after the company's AI assistant amassed sudden popularity.
The startup earlier in the day (January 27, 2025) was also hit by outages on its website after its AI assistant became the top-rated free application available on Apple's App Store in the United States.
The company resolved issues relating to its application programming interface and users' inability to log in to the website, according to its status page.
Registered users can log in normally, according to the company.
The outages on January 27, 2025, were the company's longest in around 90 days and coincided with its skyrocketing popularity.
It suggested that new users wait and try again.

Rising expectations and concerns due to the DeepSeek-V3 AI assistant:

TikTok competitor RedNote shot to the top of the social networking app rankings earlier in January 2025.
DeepSeek, more than TikTok and RedNote, is expected to raise security concerns.
"It seems likely that the AI arms race, as it's already being called, will have geopolitical implications that go beyond mere economic competition, which will in turn impact the future of these transformative technologies."

A frenzy over an artificial intelligence chatbot made by Chinese tech startup DeepSeek was upending stock markets on January 27, 2025, and fueling debates over the economic and geopolitical competition between the U.S. and China in developing AI technology.
DeepSeek’s AI assistant became the number 1 downloaded free app on Apple’s iPhone store on January 27, 2025, propelled by curiosity about the ChatGPT competitor.
Part of what’s worrying some U.S. tech industry observers is the idea that the Chinese startup has caught up with the American companies at the forefront of generative AI at a fraction of the cost.
That, if true, calls into question the huge amounts of money U.S. tech companies say they plan to spend on the data centers and computer chips needed to power further AI advancements.
But hype and misconceptions about DeepSeek’s technological advancements also sowed confusion.
“The models they built are fantastic, but they aren’t miracles either,” said Bernstein analyst Stacy Rasgon, who follows the semiconductor industry and was one of several stock analysts describing Wall Street’s reaction as overblown.
“They’re not using any innovations that are unknown or secret or anything like that,” Rasgon said. “These are things that everybody’s experimenting with.”

Does DeepSeek’s tech mean that China is now ahead of the United States in AI?

The answer is no.
OpenAI’s o3 model has not yet been publicly released, and its reported performance on standard benchmark tests was more impressive than anything else on the market.
However, experts are concerned that China is jumping ahead on open-source AI systems.

What are the reactions in the USA to DeepSeek AI assistant models?

The United States thought it could sanction its way to dominance in a key technology it believed would help bolster its national security.
Just a week before leaving office, former President Joe Biden doubled down on export restrictions on AI computer chips to prevent rivals like China from accessing the advanced technology.
But DeepSeek has called into question that notion and threatened the aura of invincibility surrounding America’s technology industry.
America may have bought itself time with restrictions on chip exports, but its AI lead just shrank dramatically despite those actions.
DeepSeek may show that turning off access to a key technology doesn’t necessarily mean the United States will win. That’s an important message to President Donald Trump as he pursues his isolationist “America First” policy.
Wall Street was alarmed by the development.
US stocks were set for a steep selloff Monday (January 27, 2025) morning.
Nvidia (NVDA), the leading supplier of AI chips, whose stock more than doubled in each of the past two years, fell 12% in premarket trading.
Meta (META) and Alphabet (GOOGL), Google’s parent company, were also down sharply, as were Marvell, Broadcom, Palantir, Oracle, and many other tech giants.
David Sacks, a venture capitalist named by Mr. Trump to help oversee AI and cryptocurrency policy, said on social media Monday that DeepSeek's app "shows that the AI race will be very competitive."
However, Wedbush Securities analyst Dan Ives said he's skeptical the service will gain ground with major U.S. businesses. "No U.S. Global 2000 is going to use a Chinese startup DeepSeek to launch their AI infrastructure and use cases," Ives wrote.
Ives added that, in the future, there will be only one chip company in the world launching autonomous, robotics, and broader AI use cases, and that is Nvidia.

The statement of US President Donald Trump on DeepSeek:

US President Donald Trump said on January 27, 2025, that Chinese startup DeepSeek's technology should act as a spur for American companies and said it was good that companies in China have come up with a cheaper, faster method of artificial intelligence.
"The release of DeepSeek, AI from a Chinese company should be a wakeup call for our industries that we need to be laser-focused on competing to win," Trump said in Florida.
"I've been reading about China and some of the companies in China, one in particular, coming up with a faster method of AI and a much less expensive method, and that's good because you don't have to spend as much money. I view that as a positive, as an asset," Trump said.

In the USA, the debate on DeepSeek’s technical capabilities is on the rise. Some of the views of the debate are listed below:

“Deepseek R1 is AI’s Sputnik moment,” said venture capitalist Marc Andreessen in a post on social platform X, referencing the 1957 satellite launch that set off a Cold War space exploration race between the Soviet Union and the U.S.

Andreessen, who has advised Trump on tech policy, has warned that overregulation of the AI industry by the U.S. government will hinder American companies and enable China to get ahead.
But the attention on DeepSeek also threatens to undermine a key strategy of U.S. foreign policy in recent years to restrict the sale of American-designed AI semiconductors to China.
Some experts on U.S.-China relations don’t think that is an accident.
“The technology innovation is real, but the timing of the release is political in nature,” said Gregory Allen, director of the Wadhwani AI Center at the Center for Strategic and International Studies. Allen compared DeepSeek’s announcement last week to U.S.-sanctioned Chinese company Huawei’s release of a new phone during diplomatic discussions over Biden administration export controls in 2023.
“Trying to show that the export controls are futile or counterproductive is a really important goal of Chinese foreign policy right now,” Allen said.
On January 27, 2025, Trump said DeepSeek’s breakthrough was “good because you don’t have to spend this much money.”
Speaking on January 27, 2025, to House Republicans in Miami, Trump called the DeepSeek news “positive” if it is accurate because “you won’t be spending as much and you’ll get the same result.” He called the development a “wake-up call for our industries that we need to be laser-focused on competing to win.”
Trump signed an order on his first day in office last week that said his administration would “identify and eliminate loopholes in existing export controls,” signaling that he is likely to continue and harden Biden’s approach.
DeepSeek’s progress on AI without the same amount of spending could undermine the potentially $500 billion AI investment by OpenAI, Oracle, and SoftBank that Trump touted at the White House.
Nvidia’s stock dropped 17% Monday, but the company in a statement commended DeepSeek’s work as “an excellent AI advancement” that leveraged “widely-available models and compute that is fully export control compliant.”

U.S. tech giants are building data centers with specialized A.I. chips. Does this still matter?

Yes, it still matters.
Large numbers of A.I. chips can still help companies in many ways.
With more chips, they can run more experiments as they explore new ways of building AI.
In other words, more chips can still give companies a technical and competitive advantage.
More chips will also be needed to operate the new breed of “reasoning” AI models.
These require more computing power when people and businesses use them.

Hasn’t the United States limited the number of Nvidia chips sold to China?

Yes. To maintain the U.S. lead in the global AI race, the Biden administration had put in place rules limiting the number of powerful chips that could be sold to China and other rivals.
However, the impressive performance of the DeepSeek model raised questions about the unintended consequences of the American government’s trade restrictions. The controls have forced researchers in China to get creative with a wide range of tools that are freely available on the internet.
Some experts continue to argue in favor of U.S. trade restrictions, saying that they were only recently put in place and that they will have a greater effect on China’s ability to create AI as the years pass.

What is the outlook of China and its AI market?

DeepSeek's success has already been noticed in China's top political circles.
On January 20, the day DeepSeek-R1 was released to the public, founder Liang attended a closed-door symposium for businessmen and experts hosted by Chinese Premier Li Qiang, according to state news agency Xinhua.
Liang's presence at the gathering is potentially a sign that DeepSeek's success could be important to Beijing's policy goal of overcoming Washington's export controls and achieving self-sufficiency in strategic industries like AI.
A similar symposium last year was attended by Baidu CEO Robin Li.
Assuming DeepSeek’s multi-head latent attention (MLA) and its mixture of experts (MoE) training techniques are indeed the way to go for the broader AI industry, investment bank UBS stated in a report that it sees a bright outlook for AI, as potentially lower costs would accelerate AI adoption, with artificial general intelligence (AGI) coming sooner than expected.
“While that means investors need to tilt their AI portfolios in favor of AI applications (our current allocation is 25-30% in the AI portfolio) and the intelligence layer (15-20%) over the enabling layer (50-60%),” the investment bank said.
DeepSeek’s engineers said they needed only about 2,000 Nvidia chips to train the startup’s AI system.

Will there be any implications for smartphones?

If smaller models work for DeepSeek, it will be potentially positive for smartphones. “We are bearish on AI smartphones as AI has gained no traction with consumers. More hardware upgrades are needed to run bigger models on the phone, which will raise costs,” Jefferies said in a note.

Know the technical difference between Chinese DeepSeek AI assistant models and other USA models:

One thing that distinguishes DeepSeek from competitors such as OpenAI is that its models are “open source” — meaning key components are free for anyone to access and modify, though the company hasn’t disclosed the data it used for training.
But what’s attracted the most admiration about DeepSeek’s R1 model is what Nvidia calls a “perfect example of Test Time Scaling” — or when AI models effectively show their train of thought and then use that for further training without having to feed them new sources of data. It may be called Rethink.
“It’s just thinking out loud, basically,” said Lennart Heim, a researcher at Rand Corp.
OpenAI’s reasoning models, starting with o1, do the same, and other U.S.-based competitors such as Anthropic and Google likely have similar capabilities that haven’t been released, Heim said.
But “it’s the first time that we see a Chinese company being that close within a relatively short time. I think that’s why a lot of people pay attention to it,” Heim said. “I used to believe OpenAI was the leader, the king of the hill, and that nobody could catch up. Turns out this is not completely the case.”

DeepSeek is an open-source large language model that relies on what is known as "inference-time computing," which in layman's terms means "they activate only the most relevant portions of their model for each query, and that saves money and computation power."

When DeepSeek introduced its DeepSeek-V3 model, it matched the abilities of the best chatbots from U.S. companies like OpenAI and Google. That alone would have been impressive.
But the team behind the new system also revealed a bigger step forward. In a research paper explaining how it built the technology, DeepSeek said it used only a fraction of the computer chips that leading A.I. companies relied on to train their systems.
The world’s top companies typically train their chatbots with supercomputers that use as many as 16,000 chips or more. DeepSeek’s engineers said they needed only about 2,000 Nvidia chips.

Since late 2022, when OpenAI set off the A.I. boom, the prevailing notion had been that the most powerful A.I. systems could not be built without investing billions of dollars in specialized A.I. chips.
That would mean that only the biggest tech companies — such as Microsoft, Google, and Meta, all of which are based in the United States — could afford to build the leading technologies.
But DeepSeek’s engineers said they needed only about $6 million in raw computing power to train their new system. That was roughly 10 times less than what Meta spent building its latest A.I. technology.
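The figures quoted above can be put side by side with a few lines of arithmetic. This is only a rough sketch that takes the article's rounded numbers at face value (about 2,000 versus 16,000 chips, about $6 million, and "roughly 10 times" for Meta); the variable names are illustrative.

```python
# Figures as quoted in the article; all are approximate.
deepseek_chips = 2_000
typical_chips = 16_000
deepseek_cost_usd = 6_000_000
meta_cost_multiple = 10  # "roughly 10 times", per the article

# How many times fewer chips DeepSeek reported using.
chip_ratio = typical_chips / deepseek_chips

# Meta's spend as implied by the "10 times" comparison.
implied_meta_cost_usd = deepseek_cost_usd * meta_cost_multiple

print(f"Chip ratio: {chip_ratio:.0f}x fewer chips")
print(f"Implied Meta spend: ${implied_meta_cost_usd / 1e6:.0f} million")
```

On these numbers, DeepSeek reported using about one-eighth of the chips, and the comparison implies a Meta spend of about $60 million on raw computing power.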

Top A.I. engineers in the United States say that DeepSeek’s research paper laid out clever and impressive ways of building A.I. technology with fewer chips.
In short, the startup’s engineers demonstrated a more efficient way of analyzing data using the chips.
Leading A.I. systems learn their skills by pinpointing patterns in huge amounts of data, including text, images, and sounds.
DeepSeek described a way of spreading this data analysis across several specialized A.I. models — what researchers call a “mixture of experts” method — while minimizing the time lost by moving data from place to place.
Others have used similar methods before, but moving information between the models tended to reduce efficiency. DeepSeek did this in a way that allowed it to use less computing power.
“It has become very clear that other companies, not just someone like OpenAI, can build these kinds of systems,” said Tim Dettmers, a researcher at the Allen Institute for Artificial Intelligence in Seattle and a professor of computer science at Carnegie Mellon University who specializes in building efficient A.I. systems.
DeepSeek-V3 can answer questions, solve logic problems, and write its own computer programs as effectively as anything already on the market, according to standard benchmark tests.
Just before DeepSeek released its technology, OpenAI had unveiled a new system, called OpenAI o3, which seemed more powerful than DeepSeek-V3. But OpenAI has not released this system to the wider public.
OpenAI o3 was designed to “reason” through problems involving math, science, and computer programming.
Many experts pointed out that DeepSeek had not built a reasoning model along these lines, which is seen as the future of AI.
Then on Jan. 20, DeepSeek released its reasoning model called DeepSeek R1, and it, too, impressed the experts. That eventually sent U.S. investors and others into a panic and they realized the importance of DeepSeek’s new technology.

Know more about open-source AI:

Like many other companies, DeepSeek has “open-sourced” its latest AI system, which means that it has shared the underlying computer code with other businesses and researchers.
This allows others to build and distribute their products using the same technologies.
This is part of the reason DeepSeek and others in China have been able to build competitive AI systems so quickly and inexpensively.
In the AI world, open-source first gathered steam in 2023 when Meta freely shared an AI system called Llama.
At the time, many assumed that the open-source ecosystem would flourish only if companies like Meta, giants with huge data centers filled with specialized chips, continued to open-source their technologies.
But DeepSeek and others have shown that this ecosystem can thrive in ways that extend beyond the American tech giants.
Many experts have argued that big U.S. companies should not open-source their technologies because they could be used to spread disinformation or cause other serious harm.
Some U.S. lawmakers have explored the possibility of preventing or throttling the practice.
However, other experts have argued that if regulators stifle the progress of open-source technology in the United States, China will gain a significant edge.
If the best open-source technologies come from China, these experts argue, U.S. researchers and companies will build their systems atop those technologies.

Know more about DeepSeek’s Innovative Techniques:

DeepSeek’s success can be attributed to several key innovations:

Reinforcement Learning:

Unlike traditional methods that rely heavily on supervised fine-tuning, DeepSeek employs pure reinforcement learning, allowing models to learn through trial and error and self-improve through algorithmic rewards.
This approach has been particularly effective in developing DeepSeek-R1’s reasoning capabilities.
In essence, DeepSeek’s models learn by interacting with their environment and receiving feedback on their actions, similar to how humans learn through experience. This allows them to develop more sophisticated reasoning abilities and adapt to new situations more effectively.
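To make the idea of "algorithmic rewards" concrete, here is a minimal sketch of a rule-based reward function. It is not DeepSeek's actual code. The `<think>` tag format, the reward values, and the function name `rule_based_reward` are assumptions chosen for illustration.

```python
import re

def rule_based_reward(response: str, expected_answer: str) -> float:
    """Score a response with fixed rules instead of human-labelled examples."""
    reward = 0.0
    # Format rule (assumed): reasoning should appear inside <think>...</think>.
    if re.search(r"<think>.+?</think>", response, flags=re.DOTALL):
        reward += 0.5
    # Accuracy rule: whatever remains after the reasoning must match exactly.
    final = re.sub(r"<think>.*?</think>", "", response, flags=re.DOTALL).strip()
    if final == expected_answer:
        reward += 1.0
    return reward

# Three candidate answers to "What is 2 + 2?"
candidates = [
    "<think>2 + 2 is 4</think>4",   # shows reasoning, correct
    "4",                            # correct, no reasoning shown
    "<think>guessing</think>5",     # shows reasoning, wrong
]
scores = [rule_based_reward(c, "4") for c in candidates]
best = candidates[scores.index(max(scores))]
print(scores, best)
```

In a real training loop, scores like these would be used to nudge the model toward the higher-scoring answers over many rounds of trial and error.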

Mixture-of-Experts Architecture:

DeepSeek’s models utilize a Mixture-of-Experts (MoE) architecture, activating only a small fraction of their parameters for any given task.
This selective activation significantly reduces computational costs and enhances efficiency.
Imagine a team of experts, each specializing in a different area. When faced with a task, only the relevant experts are called upon, ensuring efficient use of resources and expertise.
DeepSeek’s MoE (Mixture of Experts) architecture operates similarly, activating only the necessary parameters for each task, leading to significant cost savings and improved performance.
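The "team of experts" idea can be sketched in a few lines. This is a toy illustration of top-k routing, not DeepSeek's implementation. The experts are simple functions, and the router scores are hard-coded here, whereas a real model learns them from the input.

```python
import math

def softmax(xs):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, router_logits, experts, top_k=2):
    """Run only the top_k highest-scoring experts and blend their outputs."""
    probs = softmax(router_logits)
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    # Renormalise the weights over the chosen experts only.
    norm = sum(probs[i] for i in chosen)
    output = sum(probs[i] / norm * experts[i](x) for i in chosen)
    return output, chosen

# Four toy "experts"; only two of them run for any given input.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x ** 2, lambda x: -x]
router_logits = [0.1, 2.0, 1.5, -1.0]  # hypothetical router scores

y, used = moe_forward(3.0, router_logits, experts, top_k=2)
print(f"experts used: {used}, output: {y:.3f}")
```

Here, experts 1 and 2 handle the input while experts 0 and 3 stay idle, which is where the saving in computation comes from.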

Multi-Head Latent Attention:

DeepSeek-V3 incorporates multi-head latent attention, which improves the model’s ability to process data by identifying nuanced relationships and handling multiple input aspects simultaneously.
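Think of it as having multiple “attention heads” that can focus on different parts of the input data, allowing the model to capture a more comprehensive understanding of the information. The “latent” part refers to storing a compressed summary of what the heads have already read, rather than the full set of keys and values, which reduces the memory needed while generating text.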
This enhanced attention mechanism contributes to DeepSeek-V3’s impressive performance on various benchmarks.
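One practical benefit of the "latent" design is a smaller memory footprint during text generation, because a single compressed vector is cached per layer instead of separate keys and values for every head. The sketch below compares the two. All model dimensions are hypothetical values chosen for illustration, and the calculation ignores smaller extra terms a real implementation may cache.

```python
def kv_cache_values_per_token(n_layers, n_heads, head_dim):
    # Standard multi-head attention caches one key and one value vector per head.
    return n_layers * 2 * n_heads * head_dim

def latent_cache_values_per_token(n_layers, latent_dim):
    # Latent attention caches one compressed vector per layer instead.
    return n_layers * latent_dim

# Hypothetical dimensions, for illustration only.
n_layers, n_heads, head_dim, latent_dim = 60, 128, 128, 512

standard = kv_cache_values_per_token(n_layers, n_heads, head_dim)
latent = latent_cache_values_per_token(n_layers, latent_dim)
reduction = standard / latent

print(f"standard cache: {standard:,} values per token")
print(f"latent cache:   {latent:,} values per token")
print(f"reduction:      {reduction:.0f}x")
```

With these made-up dimensions, the cache shrinks by a factor of 64. The actual saving depends on the real model's settings.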

Distillation:

DeepSeek employs distillation techniques to transfer the knowledge and capabilities of larger models into smaller, more efficient ones.
This makes powerful AI accessible to a wider range of users and devices.
It’s like a teacher transferring their knowledge to a student, allowing the student to perform tasks with similar proficiency but with less experience or resources.
DeepSeek’s distillation process enables smaller models to inherit the advanced reasoning and language processing capabilities of their larger counterparts, making them more versatile and accessible.
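The teacher-student idea can be illustrated with the classic "soft label" distillation loss, in which the student is penalised for disagreeing with the teacher's softened output probabilities. This is a generic textbook sketch, and DeepSeek's own distillation procedure may differ (for example, by fine-tuning the smaller model on text generated by the larger one). The logits and the temperature are made-up values.

```python
import math

def softmax(logits, temperature=1.0):
    """Probabilities from logits; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    # Scaling by temperature squared is the usual convention for this loss.
    return kl * temperature ** 2

teacher = [4.0, 1.0, 0.5]        # teacher strongly prefers option 0
close_student = [3.8, 1.1, 0.4]  # student that nearly agrees
far_student = [0.5, 1.0, 4.0]    # student that prefers option 2

loss_close = distillation_loss(teacher, close_student)
loss_far = distillation_loss(teacher, far_student)
print(f"close student loss: {loss_close:.4f}")
print(f"far student loss:   {loss_far:.4f}")
```

The student that agrees with the teacher gets a much lower loss, so training on this signal pulls the smaller model toward the larger model's behaviour.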

DeepSeek achieved its efficiency in several ways, as below:

The model has 670 billion parameters, or variables it learns from during training, making it the largest open-source large language model yet.
But the model uses an architecture called “mixture of experts” so that only a relevant fraction of these parameters—tens of billions instead of hundreds of billions—are activated for any given query. This cuts down on computing costs.
The DeepSeek LLM also uses a method called multi-head latent attention to boost the efficiency of its inferences; and instead of predicting an answer word-by-word, it generates multiple words at once.
The model further differs from others like o1 in how it reinforces learning during training. While many LLMs have an external “critic” model that runs alongside them, correcting errors and nudging the LLM toward verified answers, DeepSeek-R1 uses a set of rules internal to the model to teach it which of the possible answers it generates is best. DeepSeek has streamlined that process.
Another important aspect of DeepSeek-R1 is that the company has made the code behind the product open-source, although the training data remains proprietary. This means that the company’s claims can be checked.
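The "tens of billions instead of hundreds of billions" point can be checked with quick arithmetic. The total below is the article's rounded figure. The activated-parameter count of about 37 billion is an assumption taken from DeepSeek's own published description of the model, not from this article.

```python
total_params = 670e9   # total parameters, as rounded in the article
active_params = 37e9   # assumption: parameters activated per token, per DeepSeek's published figures

# Share of the model that actually runs for a given query.
active_fraction = active_params / total_params
print(f"Active per query: {active_fraction:.1%} of all parameters")
```

On these figures, only about 5.5% of the model does any work for a given query.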

Bottom lines:

The Chinese start-up, DeepSeek, surprised the tech industry with a new model that rivals the abilities of OpenAI’s most recent model—with far less investment and using reduced-capacity chips.
The U.S. has already imposed bans on exports of state-of-the-art computer chips to China and allowed limited sales of chipmaking equipment.
DeepSeek, based in the eastern Chinese city of Hangzhou, reportedly had a stockpile of high-performance Nvidia A100 chips from times before the ban—so its engineers could have used those to develop the model.
But in a key breakthrough, the start-up says it instead used much lower-powered Nvidia H800 chips to train the new model, dubbed DeepSeek-R1.
We’ve seen up to now that the success of large tech companies working in AI was measured in how much money they raised, not necessarily in what the technology was.
DeepSeek has made legitimate breakthroughs as an AI tool, including better learning and more efficient use of memory, although some observers have expressed skepticism about the number of chips used.

DeepSeek (DS) is 100% owned by an AI-driven quant fund in China, High-Flyer, and it is an artificial intelligence lab.
DeepSeek built the model using reduced capability chips from Nvidia. This is impressive and thus has caused major agita for U.S. tech stocks with massive pressure on the Nasdaq composite index on January 27, 2025.

Developed with remarkable efficiency and offered as open-source resources, these models challenge the dominance of established players like OpenAI, Google, and Meta.
DeepSeek's innovative techniques, cost-efficient solutions, and optimization strategies have had an undeniable effect on the AI landscape.

While DeepSeek has achieved remarkable success in a short period, it's important to note that the company is primarily focused on research and has no detailed plans for widespread commercialization in the near future.

The company's AI app is available in Apple's App Store, as well as online at its website. The service is free, and as of Monday morning (January 27, 2025) it was the top download on Apple's store, although some people were having trouble signing up for the app.

DeepSeek’s research paper raised questions about whether big U.S. companies could maintain a significant lead in AI.
Many experts believe that AI technology will become a commodity, with many companies selling the same product.

In the long run, that could put China at the heart of AI research and development, which could further accelerate its effort to build a wide range of AI technologies, including autonomous weapons and other military systems.

On common AI tests in mathematics and coding, DeepSeek-R1 matched the scores of OpenAI’s o1 model, according to VentureBeat. But OpenAI CEO Sam Altman told an audience at MIT in 2023 that training GPT-4 cost over $100 million.
DeepSeek-R1 is free for users to download, while the comparable version of ChatGPT costs $200 a month.
DeepSeek’s $6 million number doesn’t necessarily reflect the cost of building an LLM from scratch; that cost may represent a fine-tuning of this latest version.
U.S. companies also don’t disclose the cost of training their Large Language Models (LLMs), the systems that undergird popular chatbots such as ChatGPT.
Nevertheless, the model’s improved energy efficiency would make AI more accessible to more people in more industries.
The increase in efficiency could be good news when it comes to AI’s environmental impact, as the computation cost of generating new data with an LLM is four to five times higher than a typical search engine query.
Because it requires less computational power, the cost of running DeepSeek-R1 is a tenth of the cost of similar competitors, says Hancheng Cao, an incoming assistant professor in Information Systems and Operations Management at Emory University. “For academic researchers or start-ups, this difference in the cost means a lot,” Cao says.
One of the big things has been this divide that has opened up between academia and industry because academia has been unable to work with these really large models or do research in any meaningful way. But something like this, it’s within the reach of academia now, because you have the code.
It's also unclear what type of pushback or reaction could come from the White House, given that Mr. Trump has raised the possibility of placing new tariffs on Chinese imports, although he also gave the Chinese-owned TikTok a reprieve by ordering the Justice Department not to enforce a looming ban.
If the model is as computationally efficient as DeepSeek claims, he says, it will probably open up new avenues for researchers who use AI in their work to do so more quickly and cheaply. It will also enable more research into the inner workings of LLMs themselves.
Innovative techniques, combined with DeepSeek’s focus on efficiency and open-source collaboration, have positioned the company as a disruptive force in the AI landscape.

Disclaimer:

The best efforts are made to provide updated and authentic information through this blog. However, the author does not take any responsibility (legal or otherwise) for its correctness and completeness. The data is collected from various open sources after analyzing the following websites, and the author is not responsible for any differences and/or discrepancies in data. This blog is not AI-generated and is typed manually, therefore, any typographical error is regretted.

https://www.usatoday.com

https://upstox.com/

https://www.reuters.com/

https://edition.cnn.com/

https://apnews.com/

https://www.nytimes.com

https://www.forbes.com

https://www.scientificamerican.com

https://www.cbsnews.com

Venture capitalist Marc Andreessen called DeepSeek R1 "one of the most amazing and impressive breakthroughs I've ever seen" and, as open source, "a profound gift to the world."

***The End***

Arvind Kumar

About author: Born on 05 December 1961. Bachelor's degree (Graduation) in Physics, Chemistry, and Mathematics. Master's degree (Post-Graduation) in Mathematics with Statistics. Attended several courses/trainings at the Graduation and Post-Graduation levels in Electronics and Communication Engineering. Ex. Under Secretary (Government of India). 41 years of rich experience in the telecommunication field. Prepared and delivered thousands of lectures to the Master (Post graduation) Level aspirants. Currently, approved by Indian Government body and authorized by HDFCLife insurer to sell and adviser for life insurance policies. Please contact on WhatsApp numbers 98994423601, or 9971797791 or apcsitbranju@gmail.com
