OpenAI Claims Its AI Models Were Used to Develop DeepSeek-R1: Report

OpenAI Accuses DeepSeek of Copying Its AI Models

OpenAI has reportedly accused DeepSeek of using its AI models to train its own, specifically the recently released DeepSeek-R1. According to a report by the Financial Times, OpenAI claims it has evidence that some users were leveraging its API to extract outputs, which were then used to train a competing model—suspected to be DeepSeek's.

The Chinese AI company, DeepSeek, made headlines last week when it open-sourced its DeepSeek-R1 model on platforms like GitHub and Hugging Face. Interestingly, the reasoning-focused model outperformed OpenAI’s o1 models on several benchmarks, raising suspicions about how it was trained.

OpenAI Investigates, Blocks Accounts

OpenAI and its cloud partner Microsoft conducted an investigation and reportedly identified multiple accounts engaging in what it believes was “distillation”—a process of using AI-generated outputs to train another model. As a result, OpenAI says it has blocked access for these accounts.

In a statement to the Financial Times, OpenAI said, “We know China-based companies — and others — are constantly trying to distill the models of leading US AI companies.” The company emphasized that it is actively working with the US government to safeguard its advanced AI models from competitors and potential threats.

This latest controversy highlights the growing competition in the AI space, as companies race to develop more powerful models while grappling with concerns over intellectual property and fair use.

Notably, AI model distillation is a technique for transferring knowledge from a large "teacher" model to a smaller, more efficient "student" model. The goal is to bring the smaller model on par with, or even ahead of, the larger model on target tasks while cutting computational requirements. For a sense of scale, OpenAI's GPT-4 is rumoured to have roughly 1.8 trillion parameters, while DeepSeek-R1's smallest distilled variant has just 1.5 billion—a gap that would fit the description.
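To make the idea concrete, here is a minimal sketch of the core distillation objective—training the student to match the teacher's softened output distribution. All names here are illustrative; this is not OpenAI's or DeepSeek's actual training code.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Convert logits to a probability distribution at a given temperature."""
    scaled = np.asarray(logits, dtype=float) / temperature
    exp = np.exp(scaled - scaled.max())  # subtract max for numerical stability
    return exp / exp.sum()

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between the softened teacher and student distributions.

    A temperature above 1 flattens the teacher's distribution, exposing
    the relative probabilities it assigns to "wrong" answers—information
    the student can learn from.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return float(np.sum(p_teacher * np.log(p_teacher / p_student)))

# A student that matches the teacher's logits incurs zero loss;
# a mismatched student incurs a positive loss it can minimise by training.
teacher = [4.0, 1.0, 0.5]
print(distillation_loss(teacher, teacher))               # 0.0
print(distillation_loss([0.1, 3.0, 0.2], teacher) > 0)   # True
```

In a real pipeline this loss would be minimised by gradient descent over the student's weights, usually mixed with an ordinary cross-entropy term on ground-truth labels.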

When a company distils its own models in-house, the knowledge transfer is straightforward: it uses data generated by the larger model to train the smaller one, with full access to the larger model's weights and datasets. For instance, Meta used the Llama 3 AI model to create several coding-focused Llama models.

However, this route is not open to a competitor that lacks access to a proprietary model's weights and datasets. If OpenAI's allegations are true, the distillation could instead have been done by sending large volumes of automated prompts to its API and harvesting the responses. Those prompt–response pairs would then be formatted into a training dataset and used to fine-tune a base model.
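The harvesting step described above amounts to packaging collected prompt–response pairs into a supervised fine-tuning dataset. The sketch below shows one common format (chat-style JSONL); the function name and sample data are hypothetical, not anyone's actual pipeline.

```python
import json

def build_finetune_dataset(pairs, path):
    """Write (prompt, response) pairs as chat-format JSONL,
    a widely used layout for supervised fine-tuning data."""
    with open(path, "w", encoding="utf-8") as f:
        for prompt, response in pairs:
            record = {
                "messages": [
                    {"role": "user", "content": prompt},
                    {"role": "assistant", "content": response},
                ]
            }
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return path

# Hypothetical pairs, as if collected from a teacher model's API.
pairs = [
    ("What is 2 + 2?", "2 + 2 = 4."),
    ("Name a prime number.", "7 is a prime number."),
]
build_finetune_dataset(pairs, "distill_data.jsonl")
```

Each line of the resulting file is one self-contained training example, which is why the harvested "natural language data" needs no conversion to code—it is simply fed to a fine-tuning run as-is.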
