Meituan Releases 560B Parameter LongCat-Flash Model, Outperforms DeepSeek in Inference Speed

In a significant development for China's AI landscape, Meituan has released and open-sourced LongCat-Flash, its first Mixture-of-Experts (MoE) model. The 560-billion-parameter model shows a clear inference-speed advantage and marks a milestone for China's open-source AI ecosystem.

Technical Breakthrough

LongCat-Flash introduces several features that distinguish it from competing large language models:

Zero-Computation Expert Mechanism

One of the key innovations in LongCat-Flash is its "zero-computation" expert mechanism. Alongside ordinary experts, the router can send a token to experts that perform no computation, so compute is allocated dynamically: complex tokens receive full processing while simpler tokens consume fewer resources. This improves efficiency and reduces inference cost.
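
The idea can be illustrated with a minimal sketch. It assumes that a zero-computation expert simply returns its input unchanged and that a softmax router picks the top-k experts per token. The expert counts, `TOP_K`, `ffn_expert`, and `route_token` are hypothetical names chosen for illustration and are not taken from Meituan's implementation.

```python
import math

NUM_FFN_EXPERTS = 4   # hypothetical number of ordinary (compute-heavy) experts
NUM_ZERO_EXPERTS = 2  # hypothetical number of zero-computation experts
TOP_K = 2             # hypothetical number of experts selected per token


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def ffn_expert(x, scale):
    """Stand-in for a real feed-forward expert (assumption: a scaled tanh)."""
    return [math.tanh(scale * v) for v in x]


def route_token(x, router_scores):
    """Return (output vector, number of compute-heavy experts that ran)."""
    probs = softmax(router_scores)
    # Select the TOP_K experts with the highest router probability.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:TOP_K]
    norm = sum(probs[i] for i in top)
    out = [0.0] * len(x)
    ffn_calls = 0
    for i in top:
        weight = probs[i] / norm
        if i < NUM_FFN_EXPERTS:
            y = ffn_expert(x, scale=i + 1)
            ffn_calls += 1
        else:
            y = x  # zero-computation expert: input passes through unchanged
        out = [o + weight * v for o, v in zip(out, y)]
    return out, ffn_calls


token = [0.5, -1.0, 0.25]
# Router scores favour two ordinary experts: a "complex" token.
_, calls_hard = route_token(token, [3.0, 2.5, 0.0, 0.0, 0.1, 0.1])
# Router scores favour both zero-computation experts: a "simple" token.
out_easy, calls_easy = route_token(token, [0.0, 0.0, 0.0, 0.0, 3.0, 2.5])
print(calls_hard, calls_easy)  # prints: 2 0
```

In this toy setup the "complex" token runs two feed-forward experts while the "simple" token runs none, which is the sense in which compute varies from token to token.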

Unprecedented Inference Speed

The model achieves inference speeds of over 100 tokens per second for a single user on H800 hardware. Its theoretical per-token output time is nearly 50% lower than that of DeepSeek-V3. In practical tests, LongCat-Flash delivers responses in under 2 seconds, which makes it suitable for time-sensitive applications.

Key Performance Metrics

• 100+ tokens/second inference speed on H800
• Under 2-second response time
• Nearly 50% lower theoretical per-token output time than DeepSeek-V3
• 0.01-second theoretical per-token output time
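
The relationship between these figures is simple arithmetic, sketched below. The 0.01-second per-token time comes from the article. The baseline numbers are only what a 50% time reduction would imply ("nearly 50%" is treated as exactly 50%, an assumption) and are not measured DeepSeek-V3 results. The function names are hypothetical.

```python
LONGCAT_SECONDS_PER_TOKEN = 0.01  # theoretical figure quoted in the article
CLAIMED_TIME_REDUCTION = 0.5      # "nearly 50%", treated as exactly 50% (assumption)


def tokens_per_second(seconds_per_token: float) -> float:
    """Throughput is the reciprocal of per-token output time."""
    return 1.0 / seconds_per_token


def implied_baseline_seconds(seconds_per_token: float, reduction: float) -> float:
    """Per-token time of a baseline, implied by a fractional time reduction."""
    return seconds_per_token / (1.0 - reduction)


longcat_tps = tokens_per_second(LONGCAT_SECONDS_PER_TOKEN)
baseline_spt = implied_baseline_seconds(LONGCAT_SECONDS_PER_TOKEN, CLAIMED_TIME_REDUCTION)
baseline_tps = tokens_per_second(baseline_spt)
speedup = longcat_tps / baseline_tps

print(f"LongCat-Flash: {longcat_tps:.0f} tokens/s")
print(f"Implied baseline: {baseline_spt:.3f} s/token, {baseline_tps:.0f} tokens/s")
print(f"Throughput ratio: {speedup:.1f}x")
```

Halving the per-token time doubles throughput, so a 50% reduction in time corresponds to roughly twice the tokens per second rather than a 50% increase.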

Benchmark Performance

LongCat-Flash has demonstrated exceptional performance across multiple benchmarks:

• ArenaHard-V2: Scored 86.50 points, ranking second among all evaluated models and surpassing DeepSeek-V3.1
• Programming Capability: Achieved 39.51 points on TerminalBench, second only to Claude Sonnet 4 (40.7 points)
• Instruction Following: Ranked first with 89.65 points, demonstrating a strong ability to understand and execute complex instructions

Technical Infrastructure

According to Meituan's technical report, LongCat-Flash was trained on a massive cluster containing tens of thousands of accelerators. While it's not yet confirmed whether the training was conducted using NVIDIA GPUs, there are indications that domestic chips may have been utilized, highlighting China's progress in AI hardware independence.

ScMoE Architecture

The Shortcut-connected Mixture-of-Experts (ScMoE) architecture enables a theoretical per-token output time of just 0.01 seconds, which corresponds to 100 tokens per second. This architectural innovation is a significant advance in large-scale model efficiency.

Product Features and Roadmap

LongCat-Flash has launched with web search capability, while its "deep thinking" feature is marked as "coming soon" on the interface. Meituan has also announced several strategic initiatives:

• LongCat Developer Program: Will provide computing power subsidies for high-quality projects
• Enterprise API Services: Priced 30% lower than market average to encourage adoption
• Open Source Strategy: Making the model accessible to the broader developer community

Meituan's AI Strategy

This release is part of Meituan's broader AI strategy, which CEO Wang Xing has outlined as having three layers: "AI at Work, AI in Products, Building LLM." The Building LLM layer involves tens of billions of dollars in investment for GPU procurement and proprietary foundation model development.

"AI will disrupt all industries, and our strategy is to take the initiative rather than passively defend." - Wang Xing, Meituan CEO

Financial Commitment

In June 2025, Wang Puzhong, CEO of Meituan's Core Local Commerce division, announced annual AI investments exceeding 10 billion yuan. The Q2 2025 financial report showed R&D investment reaching 6.3 billion yuan, a 17.2% year-over-year increase, primarily focused on AI and unmanned delivery technology.

Market Context

Meituan's entry into the large model space comes during a challenging period for the company. Recent financial reports show operating profit declining to 226 million yuan, a 98% decrease year-over-year. The Core Local Commerce division's operating profit was only 3.7 billion yuan, down 75.6% from the previous year.

The food delivery market is experiencing intense competition, with various platforms using large subsidies to attract users. Following regulatory intervention in mid-July, subsidy strategies have been adjusted. Meituan has identified AI as one of its key breakthrough areas for new business development.

Industry Impact

Throughout this year, Meituan has continuously released vertical applications including NoCode, Kangaroo Consultant, and Meituan Jibai, culminating in the open-source release of its self-developed large model. This demonstrates the company's sustained investment in the AI field.

However, following the open-source model announcement, Meituan's stock price showed no significant fluctuation, indicating that the capital market is taking a wait-and-see approach to the company's large model initiatives.

Future Outlook

Meituan's current moves make clear that the company views large models as a "necessary choice" rather than an optional investment. With substantial financial backing and a clear strategic vision, Meituan is positioning itself as a serious contender in China's AI landscape.

The success of LongCat-Flash could pave the way for more Chinese tech companies to invest in large-scale AI development, potentially reshaping the global AI competitive landscape and reducing dependence on foreign technology.

Looking Ahead

As Meituan continues to develop its AI capabilities, the industry will be watching closely to see how LongCat-Flash performs in real-world applications and whether it can maintain its competitive edge against established players like DeepSeek and international models.