Alibaba's New AI Model Outperforms Leading Competitors


Alibaba has unveiled its latest artificial intelligence model, Qwen 2.5-Max, which the company claims outperforms the current market leaders, including DeepSeek-V3, OpenAI's GPT-4o, and Meta's Llama-3.

Built on a Mixture-of-Experts (MoE) architecture, the model was pretrained on more than 20 trillion tokens and then refined with supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF). It posted strong benchmark results: 89.4 on Arena-Hard (versus DeepSeek-V3's 85.5), 62.2 on LiveBench (DeepSeek-V3: 60.5), and 38.7 on LiveCodeBench (DeepSeek-V3: 37.6).

Source: https://qwenlm.github.io/blog/qwen2.5-max/

Qwen 2.5-Max is now available via the Qwen Chat platform and, for developers, through Alibaba Cloud Model Studio, which exposes an OpenAI-compatible API. Alibaba plans further enhancements to the model's reasoning and cognitive abilities through scaled reinforcement learning.
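Because Model Studio follows the OpenAI chat-completions convention, any OpenAI-style client can talk to it. The sketch below uses only the Python standard library; the endpoint URL and model identifier are assumptions for illustration and should be checked against Alibaba Cloud's current documentation.

```python
# Minimal sketch of calling Qwen 2.5-Max through an OpenAI-compatible
# chat-completions endpoint. BASE_URL and the model name are assumptions.
import json
import urllib.request

BASE_URL = "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"  # assumed endpoint


def build_chat_request(model: str, user_message: str) -> dict:
    """Build an OpenAI-style chat-completions payload."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
    }


def chat(api_key: str, payload: dict) -> dict:
    """POST the payload to the endpoint (requires a valid API key)."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    payload = build_chat_request("qwen-max-2025-01-25", "Which is larger, 9.11 or 9.8?")
    # print(chat("YOUR_API_KEY", payload))  # uncomment with a real key
```

The point of the OpenAI-compatible layer is exactly this: existing tooling that already speaks the chat-completions format only needs a new base URL, model name, and key.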

Sources:

1. "Alibaba releases AI model it says surpasses DeepSeek." Chinese tech company Alibaba released a new version of its Qwen 2.5 artificial intelligence model that it claimed surpassed the highly acclaimed DeepSeek-V3.

2. "Qwen 2.5-Max: Features, DeepSeek V3 Comparison & More." Discover Alibaba's latest AI model, Qwen2.5-Max, designed to compete with top-tier models like GPT-4o and DeepSeek V3.

3. "Qwen2.5-Max: Exploring the Intelligence of Large-scale MoE Model." It is widely recognized that continuously scaling both data size and model size can lead to significant improvements in model intelligence. However, the research and industry community has limited experience in effectively scaling extremely large models, whether they are dense or Mixture-of-Experts (MoE) models. Many critical details regarding this scaling process were only disclosed with the recent release of DeepSeek V3.