
The RL data and fine-tuning market

April 13, 2026

Note: I will update more of this note soon. I may also move it to a blog post on my personal website instead, since this isn’t exactly a “research” topic.

This is a periodically updated note containing some thoughts, analysis, and information on the RL market.

While it is fun to think about research ideas in RL, the method itself has had, and will continue to have, enormous economic impact. As such, in this note, we’ll try to make sense of all of this.

At a high level, here is what is happening:

Consequences and implications

This has several consequences and implications. I note some of them here and dedicate a subsection to a few of them.

Proprietary fine-tuned model as the moat

Knowledge workers become AI teachers

Increase in RL environment makers, RL-as-a-service fine-tuning shops, and data marketplaces

Products as a vehicle for data collection

Tech companies turn from product to consulting

The only moats that exist are data and distribution

Inference becomes coupled with fine-tuning

To have a continuous, evolving loop in the real world, fine-tuners need to run their own inference solutions. For example, if a fine-tuned model is deployed in a real agentic system, customers will use it via inference and give the fine-tuners more data to train on. But in order for this loop to become autonomous, you need to

Business strategies

RL-as-a-Service

This industry deserves its own section because of the enormous impact it can have and its relevance in the modern AI landscape.

Target customers:

There are two ends of the spectrum:

Product based offerings

Examples:

These are traditional tech startup offerings: no forward-deployed or consultancy work, just a pure, scalable product. They make a product and sell it to end customers. Nothing fancy here.

Data marketplaces

These are firms that recruit experts to generate and label data for the labs. They act as a middleman between the experts and the AI research labs: the labs pay the firm, and the firm pays the experts.

Data brokers

These are firms that simply connect buyers and sellers. That is, they connect the startups trying to sell data with researchers at the labs.

Mapping the landscape

Here are a few players who operate in this space.

RL-as-a-Service

The last two appear to operate as products, whereas the rest appear to operate as consultancies.

Data collection marketplaces

Some questions I have