Daily Beirut
Edition·Independent — Beirut, Lebanon

DeepSeek Unveils DSpark Framework Boosting AI Response Speed by 85%

Chinese company DeepSeek introduces DSpark, a new framework that accelerates AI model response times by up to 85% without relying on the latest AI chips.

Chinese firm DeepSeek has revealed a new framework named DSpark, which it claims can increase the response speed of artificial intelligence models by as much as 85%. The speed-up does not depend on the latest AI chips, which could lower the cost of running large AI models.

The announcement comes as AI companies struggle to secure enough computing power to run their models, particularly under U.S. restrictions that limit Chinese firms' access to advanced chips from manufacturers such as Nvidia.

Predictive Decoding Technique

DeepSeek explained that the DSpark framework uses a method called predictive decoding, more commonly known in the field as speculative decoding. In this approach, a smaller, faster model first drafts a response, which the main model then reviews and verifies in a single batch, rather than generating every segment of the answer from scratch.

This mechanism allows the system to skip multiple steps when predictions are accurate, significantly reducing response times. All processing occurs on the graphics processing unit (GPU), with no tasks offloaded to the central processing unit (CPU).
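For readers who want to see the idea in code, the draft-then-verify loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not DeepSeek's implementation: `target_model` and `draft_model` are hypothetical toy functions standing in for the main and small models, tokens are plain integers, and the verification loop stands in for what a real system would do in one batched pass on the GPU.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-in for a language model: maps a non-empty token prefix
# to the next token. Real models return probabilities; this toy is greedy.
NextToken = Callable[[List[int]], int]


def target_model(tokens: List[int]) -> int:
    """Toy 'main' model (assumption): counts upward modulo 10."""
    return (tokens[-1] + 1) % 10


def draft_model(tokens: List[int]) -> int:
    """Toy 'small' model (assumption): agrees with the main model except after a 4."""
    return 0 if tokens[-1] == 4 else (tokens[-1] + 1) % 10


def speculative_step(prefix: List[int], draft: NextToken,
                     target: NextToken, k: int) -> List[int]:
    """Run one draft-then-verify round and return the tokens to append."""
    # 1. The small model proposes k tokens, one after another.
    proposal: List[int] = []
    for _ in range(k):
        proposal.append(draft(prefix + proposal))

    # 2. The main model checks every proposed position. A real system would
    #    score all positions in a single batched pass; here it is a loop.
    accepted: List[int] = []
    for i, token in enumerate(proposal):
        expected = target(prefix + proposal[:i])
        if expected != token:
            # First disagreement: keep the main model's token and stop.
            accepted.append(expected)
            return accepted
        accepted.append(token)

    # 3. Every proposal was accepted, so the main model adds one more token.
    accepted.append(target(prefix + proposal))
    return accepted


def generate(prompt: List[int], draft: NextToken, target: NextToken,
             k: int, max_new: int) -> Tuple[List[int], int]:
    """Generate max_new tokens; also return the number of verification passes."""
    out = list(prompt)
    passes = 0
    while len(out) - len(prompt) < max_new:
        out.extend(speculative_step(out, draft, target, k))
        passes += 1
    return out[: len(prompt) + max_new], passes


def greedy_baseline(prompt: List[int], target: NextToken, max_new: int) -> List[int]:
    """Plain decoding with the main model only: one pass per token."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(target(out))
    return out
```

With these toy models, `generate([0], draft_model, target_model, k=3, max_new=12)` returns the same tokens as `greedy_baseline([0], target_model, 12)` while needing 4 verification passes instead of 12. The output is unchanged because every token is either confirmed or replaced by the main model; only the number of main-model passes drops.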

The system also employs a technique that generates short runs of tokens in batches instead of producing each token separately, contributing further to faster responses.
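How much time such a scheme saves depends on how often the drafted tokens are accepted. A rough back-of-the-envelope estimate can be sketched as below, under the simplifying assumption that each drafted token is accepted independently with the same probability. The function name and parameters are illustrative, the figures it produces are not DeepSeek's measurements, and the cost of running the small model is ignored.

```python
def expected_tokens_per_pass(accept_rate: float, draft_len: int) -> float:
    """Expected tokens emitted per main-model verification pass.

    Assumes (simplification) that each drafted token is accepted
    independently with probability accept_rate. The pass always yields at
    least one token, so the expectation is 1 + a + a^2 + ... + a^draft_len.
    """
    if not 0.0 <= accept_rate <= 1.0:
        raise ValueError("accept_rate must be between 0 and 1")
    if draft_len < 0:
        raise ValueError("draft_len must be non-negative")
    if accept_rate == 1.0:
        # Every draft accepted: draft_len tokens plus one from the main model.
        return float(draft_len + 1)
    # Closed form of the geometric sum above.
    return (1.0 - accept_rate ** (draft_len + 1)) / (1.0 - accept_rate)
```

Under this assumption, a draft that is never accepted still yields one token per pass, matching ordinary decoding, while a draft that is always accepted yields `draft_len + 1` tokens per pass. Real gains would be lower once the small model's own running time is counted.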

Open Source Collaboration and Testing

In partnership with Peking University, DeepSeek has released the DSpark research as an open-source project on GitHub and Hugging Face. The company emphasized that the technology does not enhance a model's inherent capabilities; it improves operational efficiency and reduces the need for additional investment in computing infrastructure.

The company reported testing the new framework on several open-source models, including Google DeepMind's Gemma and Alibaba's Qwen, which suggests the technology could be applied more broadly.

Industry Context and Cost Considerations

The announcement coincides with rising global spending on AI data centers. Companies such as Uber and Walmart have started restricting employee use of AI tools because of the high cost of the processing capacity those tools consume.

In April, DeepSeek launched an open-source V4 Preview designed as a low-cost option for handling contexts of up to one million tokens. It also offers a V4-Pro version for high performance and a V4-Flash edition aimed at faster, more cost-effective responses.

DeepSeek is not alone in efforts to accelerate AI model responses. Recently, Xiaomi announced that its MiMo-V2.5-Pro-UltraSpeed model can generate over 1,000 tokens per second, placing it among the fastest in the industry.
