DeepSeek Unveils DSpark Framework Boosting AI Response Speed by 85%
Chinese company DeepSeek introduces DSpark, a new framework that accelerates AI model response times by up to 85% without relying on the latest AI chips.

Chinese firm DeepSeek has revealed a new framework named DSpark, which it claims can increase the response speed of artificial intelligence models by as much as 85%. The gain does not depend on the latest AI chips, which could reduce the operating costs of large AI models.
This development emerges amid growing challenges faced by AI companies in securing sufficient computational power to run models, especially under U.S. restrictions limiting Chinese firms' access to advanced chips from manufacturers like Nvidia.
Predictive Decoding Technique
DeepSeek explained that the DSpark framework uses a method it calls predictive decoding, an approach that matches what is more widely known as speculative decoding. A smaller, faster model first proposes a stretch of the response, which the main model then reviews and verifies in a single batch, rather than generating every segment of the answer from scratch.
This mechanism allows the system to skip multiple steps when predictions are accurate, significantly reducing response times. All processing occurs on the graphics processing unit (GPU), with no tasks offloaded to the central processing unit (CPU).
The system also employs a technique that generates small text segments in batches instead of producing each text unit (token) separately, which further speeds up responses.
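The draft-then-verify loop described above can be illustrated with a small toy. The following is a minimal sketch under stated assumptions, not DSpark's actual implementation: the names `speculative_decode`, `greedy_decode`, `target_model` and `draft_model` are hypothetical, the "models" are simple deterministic functions over integer tokens, and acceptance uses exact greedy matching rather than the probabilistic acceptance rule that sampling-based systems typically use.

```python
from typing import Callable, List, Tuple

# A "model" here is just a function mapping a token context to the next token.
NextToken = Callable[[List[int]], int]


def greedy_decode(target: NextToken, prompt: List[int], max_new: int) -> List[int]:
    """Baseline: the main model produces one token per step."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(target(out))
    return out


def speculative_decode(
    target: NextToken,
    draft: NextToken,
    prompt: List[int],
    max_new: int,
    k: int = 4,
) -> Tuple[List[int], int]:
    """Draft k tokens cheaply, then let the main model verify them together.

    Returns the generated sequence and the number of verification passes.
    """
    out = list(prompt)
    target_passes = 0
    while len(out) - len(prompt) < max_new:
        # 1) The small model proposes k tokens, one after another.
        proposal: List[int] = []
        ctx = list(out)
        for _ in range(k):
            token = draft(ctx)
            proposal.append(token)
            ctx.append(token)

        # 2) The main model checks every proposed position. A real system
        #    would score all k + 1 positions in one batched forward pass;
        #    this toy simulates that with a loop and counts it as one pass.
        target_passes += 1
        checks = [target(out + proposal[:i]) for i in range(k + 1)]

        # 3) Keep the longest prefix on which both models agree.
        accepted = 0
        while accepted < k and proposal[accepted] == checks[accepted]:
            accepted += 1
        out.extend(proposal[:accepted])

        # 4) Add the main model's own token: a correction at the first
        #    mismatch, or a bonus token if every proposal was accepted.
        out.append(checks[accepted])

    return out[: len(prompt) + max_new], target_passes


def target_model(ctx: List[int]) -> int:
    """Hypothetical main model: counts upward modulo 10."""
    return (ctx[-1] + 1) % 10


def draft_model(ctx: List[int]) -> int:
    """Hypothetical draft model: agrees with the main model except after a 7."""
    return 0 if ctx[-1] == 7 else (ctx[-1] + 1) % 10
```

Calling `speculative_decode(target_model, draft_model, [0], max_new=12)` yields the same tokens as `greedy_decode(target_model, [0], 12)` while needing far fewer verification passes than the twelve sequential steps of the baseline. This reflects the point made in the article: the technique changes how quickly the answer is produced, not what the main model would have said.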
Open Source Collaboration and Testing
DeepSeek has released the DSpark research as an open-source project in partnership with Peking University, and it is available on GitHub and Hugging Face. The company emphasized that the technology does not enhance a model's inherent capabilities; instead, it improves operational efficiency and reduces the need for additional investment in computing infrastructure.
The company reported testing the new framework on several open-source models, including Google DeepMind's Gemma and Alibaba's Qwen, indicating the potential for broader application of the technology.
Industry Context and Cost Considerations
The announcement coincides with rising global spending on AI data centers. Companies such as Uber and Walmart have started restricting employee use of AI tools because of the high cost of the computing power those tools consume.
In April, DeepSeek launched an open-source V4 Preview version designed as a low-cost option for handling contexts of up to one million tokens. It also offers a V4-Pro version for high performance and a V4-Flash edition aimed at faster, more cost-effective responses.
DeepSeek is not alone in efforts to accelerate AI model responses. Recently, Xiaomi announced that its MiMo-V2.5-Pro-UltraSpeed model can generate over 1,000 tokens per second, placing it among the fastest in the industry.
