It has been just over a week since DeepSeek upended the AI world. The introduction of its open-weights model, reportedly trained on a fraction of the specialized computing chips that power industry leaders, set off shock waves inside OpenAI. Not only did staff claim to see hints that DeepSeek had improperly distilled OpenAI's models to create its own, but the startup's success had Wall Street questioning whether companies like OpenAI were wildly overspending on compute.
"DeepSeek R1 is AI's Sputnik moment," wrote Marc Andreessen, one of Silicon Valley's most influential and provocative investors, on X.
In response, OpenAI is preparing to launch a new model today, ahead of its originally planned schedule. The model, o3-mini, will debut in both the API and ChatGPT. Sources say it offers o1-level reasoning at 4o-level speed. In other words, it is fast, cheap, smart, and designed to crush DeepSeek.
The moment has galvanized OpenAI staff. Inside the company, there is a feeling that, particularly as DeepSeek dominates the conversation, OpenAI must become more efficient or risk falling behind its newest competitor.
Part of the problem stems from OpenAI's origins as a nonprofit research organization before it became a profit-seeking powerhouse. An ongoing power struggle between the research and product groups, employees claim, has led to a rift between the teams working on advanced reasoning and those working on chat.
Some inside OpenAI want the company to build a unified chat product: one model that can figure out whether a question requires advanced reasoning. So far, that hasn't happened. Instead, a drop-down menu in ChatGPT prompts users to decide whether they want to use GPT-4o ("great for most questions") or o1 ("uses advanced reasoning").
Some staffers claim that while chat brings in the lion's share of OpenAI's revenue, o1 gets more attention, and more computing resources, from leadership. "Leadership doesn't care about chat," says a former employee who worked on (you guessed it) chat. "Everyone wants to work on o1 because it's sexy, but the code base wasn't built for experimentation, so there's no momentum." The former employee asked to remain anonymous, citing a nondisclosure agreement.
OpenAI spent years experimenting with reinforcement learning to fine-tune the model that eventually became the advanced reasoning system called o1. "They benefited from knowing that reinforcement learning, applied to language models, works," said a former OpenAI researcher who is not authorized to speak publicly about the company.
"DeepSeek did the same thing we did at OpenAI," said another former OpenAI researcher, "but they did it with better data and a cleaner stack."
OpenAI employees say the research that went into o1 was done in a code base, called the "berry" stack, built for speed. "There were trade-offs in rigor," says a former employee with direct knowledge of the situation.
Those trade-offs made sense for o1, which was essentially an enormous experiment, code-base limitations notwithstanding. They made far less sense for chat, a product used by millions of users and built on a different, more reliable stack. When o1 launched and became a product, cracks began to appear in OpenAI's internal processes. The staffer explains: "It was like, 'why are we doing this in the experimental code base? Shouldn't we do this in the main product research code base?' There was major pushback to that."