However, the true cost of developing DeepSeek's new models remains unknown, since one figure cited in a single research paper may not capture the full picture of its costs. "I don't believe it's $6 million, but even if it's $60 million, it's a game changer," said Padval. "It will put pressure on the profitability of companies that are focused on consumer AI."
Immediately after DeepSeek revealed details of its latest model, Ghodsi of Databricks said customers began asking whether they could use it, as well as DeepSeek's underlying techniques, to cut costs at their own organizations. He added that one approach used by DeepSeek's engineers, known as distillation, involves using the outputs of one large language model to train another, relatively cheap and simple model.
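To make the idea concrete, here is a minimal sketch of distillation in PyTorch: a small "student" is trained to match the output distribution of a larger "teacher." The models, dimensions, and hyperparameters are illustrative placeholders, not DeepSeek's actual setup.

```python
# Minimal distillation sketch: train a cheap "student" to match a larger
# "teacher" model's output distribution. All sizes here are toy placeholders.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # Soften both distributions, then minimize KL(teacher || student).
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    return F.kl_div(student_log_probs, soft_targets,
                    reduction="batchmean") * temperature**2

vocab_size = 1000
teacher = torch.nn.Linear(64, vocab_size)   # stand-in for a large LLM head
student = torch.nn.Linear(64, vocab_size)   # stand-in for a cheaper model
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

features = torch.randn(8, 64)               # one toy batch of hidden states
with torch.no_grad():
    teacher_logits = teacher(features)      # teacher outputs serve as targets
optimizer.zero_grad()
loss = distillation_loss(student(features), teacher_logits)
loss.backward()
optimizer.step()
```

In practice the teacher's targets would come from full model generations rather than a single linear layer, but the training signal, matching a stronger model's outputs, is the same.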
Padval said that the existence of models like DeepSeek's will ultimately benefit companies looking to spend less on AI, but he added that many firms may have reservations about relying on a Chinese model for sensitive tasks. So far, at least one prominent company, Perplexity, has publicly announced that it is using DeepSeek's R1 model, though it says the model is hosted completely independent of China.
Amjad Masad, CEO of Replit, a startup that provides AI coding tools, told WIRED that he thinks DeepSeek's latest models are impressive. While he still finds Anthropic's Sonnet model better at many computer engineering tasks, he has found that R1 is especially good at turning text commands into code that can be executed on a computer. "We are exploring using it specifically for agent reasoning," he added.
DeepSeek's two newest offerings, DeepSeek R1 and DeepSeek R1-Zero, appear capable of the same kind of simulated reasoning as the most advanced systems from OpenAI and Google. They all work by breaking problems into component parts in order to solve them more effectively, a process that requires a considerable amount of additional training to ensure that the model reliably arrives at the correct answer.
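As a rough illustration of what that looks like in practice, the hypothetical sketch below prompts a model to write out intermediate steps before committing to an answer. The generate() function is a stand-in for any LLM completion call, not a real DeepSeek or OpenAI API, and its canned output exists only for illustration.

```python
# Hypothetical sketch of simulated reasoning: the model emits its
# intermediate steps inside <think> tags before giving a final answer.
def generate(prompt: str) -> str:
    # Stand-in for a call to a reasoning model; returns a canned trace.
    return "<think>17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408</think>408"

def solve(question: str) -> str:
    prompt = (
        "Break the problem into steps inside <think> tags, "
        f"then give only the final answer.\nProblem: {question}"
    )
    completion = generate(prompt)
    # Keep only what follows the reasoning trace.
    return completion.split("</think>")[-1].strip()

print(solve("What is 17 * 24?"))  # -> 408
```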
A paper posted by DeepSeek researchers last week outlines the approach the company used to create its R1 models, which it claims perform on par with OpenAI's breakthrough reasoning model known as o1 on some benchmarks. The tactics DeepSeek used include a more automated method for learning how to solve problems correctly, as well as a strategy for transferring skills from larger models to smaller ones.
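One way to read "a more automated method for learning how to solve problems correctly" is reinforcement learning against a verifiable reward, where sampled answers are scored by a rule rather than by human labelers. The sketch below shows only the reward side of such a loop, under that assumption; it is a simplification for illustration, not DeepSeek's training code.

```python
# Hedged sketch of an automated, rule-based reward: a sampled completion
# scores 1.0 only if its final answer exactly matches a known reference.
# An RL fine-tuning loop would then reinforce the high-scoring samples.
def reward(completion: str, reference_answer: str) -> float:
    final = completion.split("</think>")[-1].strip()
    return 1.0 if final == reference_answer else 0.0

# Toy rollout: two sampled answers to the same math problem.
samples = [
    "<think>17 * 24 = 340 + 68 = 408</think>408",  # correct
    "<think>17 * 24 is about 400</think>400",       # wrong
]
print([reward(s, "408") for s in samples])  # -> [1.0, 0.0]
```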
One of the hottest topics of speculation about DeepSeek is the hardware it may have used. The question is especially notable because the US government has introduced a series of export controls and other trade restrictions over the past few years to limit China's ability to acquire and manufacture the advanced chips needed to build cutting-edge AI.
In a research paper from August 2024, DeepSeek indicated that it has access to a cluster of 10,000 Nvidia A100 chips, which were placed under US restrictions announced in October 2022. In a separate paper from June of that year, DeepSeek stated that an earlier model it created, called DeepSeek-V2, was developed using clusters of Nvidia H800 computer chips, a less capable component that Nvidia developed to comply with US export controls.
A source at one company that trains large AI models, who asked to remain anonymous to protect their professional relationships, estimates that DeepSeek likely used around 50,000 Nvidia chips to build its technology.
Nvidia declined to comment directly on which of its chips DeepSeek may have relied on. An Nvidia spokesperson said in a statement that the startup's reasoning approach requires a significant number of Nvidia GPUs and high-performance networking.
However DeepSeek's models were built, they appear to show that a less closed approach to developing AI is gaining momentum. In December, Clem Delangue, CEO of Hugging Face, a platform that hosts artificial intelligence models, predicted that a Chinese company would take the lead in AI because of the pace of innovation happening in open source models, which China has largely embraced. "This went faster than I thought," he said.