September 19, 2024

OpenAI releases new o1 model technology products

bernieBlog

OpenAI recently released a new model, o1, in two versions: o1-preview and o1-mini. o1-preview features advanced reasoning and significantly improves performance on mathematics, programming, and science problems, approaching the level of doctoral students in chemistry; o1-mini is a smaller model optimized for code generation. The release is the outcome of the previously rumored advanced-reasoning project code-named “Strawberry,” and some analysts say o1 is short for the Orion large model.

OpenAI said that o1 represents a new level of AI capability, especially a breakthrough on complex reasoning tasks, so the version count was reset to 1 and the model given a new name distinct from the “GPT-4” series. This also heralds a new starting point for the AI era: the arrival of general-purpose complex-reasoning large models.

It should be noted that the current chat experience with o1 is still relatively basic. Unlike its predecessor GPT-4o, o1 currently cannot browse the web or analyze files. Image analysis is available but temporarily turned off pending further testing. In addition, o1 has message limits: currently 30 messages per week for o1-preview and 50 messages per week for o1-mini.

OpenAI CEO Sam Altman said, “o1 is our most powerful and consistent model family to date, and our best reasoning model to date. While o1 is still flawed and limited, it feels more impressive to use.” Specifically, OpenAI o1 can solve harder science, coding, and mathematics problems than models of the previous GPT era.

Jerry Tworek, head of research at OpenAI, revealed that the training behind the o1 model is fundamentally different from that of previous products. Whereas earlier GPT models were designed to mimic patterns in their training data, o1 was trained to solve problems independently. During reinforcement learning, reward and punishment signals are used to “teach” the AI to work through problems with a “chain of thought,” much as humans learn to break problems down and analyze them.
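To make the “chain of thought” idea concrete, here is a toy sketch, entirely hypothetical and unrelated to OpenAI’s actual training code: instead of emitting only a final answer, the solver records each intermediate step, the way the article describes breaking a problem down.

```python
# Toy illustration of a "chain of thought": solve a small word problem
# by recording intermediate steps rather than only the final answer.
# This is a hypothetical sketch of the idea, not OpenAI's method.

def solve_directly(a: int, b: int, c: int) -> int:
    """Direct answer with no visible reasoning: compute (a + b) * c."""
    return (a + b) * c

def solve_with_chain_of_thought(a: int, b: int, c: int) -> tuple[list[str], int]:
    """Return the intermediate reasoning steps along with the answer."""
    steps = []
    subtotal = a + b
    steps.append(f"Step 1: add {a} and {b} to get {subtotal}.")
    answer = subtotal * c
    steps.append(f"Step 2: multiply {subtotal} by {c} to get {answer}.")
    return steps, answer

steps, answer = solve_with_chain_of_thought(2, 3, 4)
print("\n".join(steps))
print("Answer:", answer)  # both paths agree: (2 + 3) * 4 = 20
```

In a reinforcement-learning setup like the one described, it is chains of steps such as these that a reward signal could score, reinforcing those that lead to correct answers.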

OpenAI found that o1’s performance keeps improving with more reinforcement learning (train-time compute) and more thinking time (test-time compute). The constraints on scaling this approach differ markedly from those of pre-training large models, and OpenAI continues to study them.

In addition to o1-preview, OpenAI has introduced o1-mini, which is faster, cheaper, and suited to scenarios that require reasoning but not extensive world knowledge. For OpenAI, o1 represents a step toward its broader goal of human-like artificial intelligence. It writes code and solves multi-step problems better than previous models, but it is also more expensive and slower to use than GPT-4o.

OpenAI plans to offer o1-mini access to all free ChatGPT users, but has not yet set a release date. Developer access to o1 is expensive: in the API, o1-preview costs $15 per million input tokens and $60 per million output tokens. By comparison, GPT-4o costs $5 per million input tokens and $15 per million output tokens.
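Those per-token prices can be compared with a quick back-of-the-envelope calculation. A minimal sketch using only the figures quoted above (the request sizes are illustrative):

```python
# Cost comparison based on the per-million-token prices quoted above.
PRICES = {
    "o1-preview": {"input": 15.0, "output": 60.0},  # USD per 1M tokens
    "gpt-4o":     {"input": 5.0,  "output": 15.0},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a single API request for the given model."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example: a request with 2,000 input tokens and 1,000 output tokens.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 2_000, 1_000):.4f}")
```

At these prices, the example request costs $0.09 on o1-preview versus $0.025 on GPT-4o, i.e. 3.6 times more.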

Tworek also noted that o1 was trained using a completely new optimization algorithm and a new training dataset tailored specifically for it.

Overall, the introduction of o1 marks an important milestone for AI: it lays the groundwork for general-purpose complex-reasoning large models and opens up new possibilities for AI applications across many fields.