AI Street

JPMorgan Taught AI the Language of Markets

Researchers apply the architecture behind ChatGPT to create a model that simulates market behavior.

Matt Robinson · Mar 31, 2026 · Paid

Much of the AI conversation focuses on the latest capabilities of Anthropic’s Claude or OpenAI’s ChatGPT. Those deserve our attention, but they are a narrow view of the power of the transformer breakthrough.


The transformer breakthrough began in text, but researchers are adapting the architecture to other kinds of sequential data. Given enough of it, a transformer can learn that data’s underlying “language” and pick up patterns that traditional models missed.

For example, AlphaFold is a transformer-based system trained on protein data to predict, from an amino acid sequence, how a protein folds into its 3D shape, which in turn determines how it functions. It effectively solved the protein folding problem. (The work later contributed to a Nobel Prize in Chemistry awarded to its creators, including Demis Hassabis, who is not a chemist.)

As I’ve written before, no one knows exactly how these models work. They’re grown rather than built, as Anthropic’s CEO likes to say. We didn’t know how aspirin worked for roughly 70 years, but we knew it was effective.

This brings us to a new paper from JPMorgan researchers, who trained a transformer model on market data.

The market as a language

Every buy, sell, order submission, or cancellation leaves a trace: what happened, how much size was involved, how far from the market mid-price it was placed, and when it occurred. Multiply that across thousands of stocks and millions of events per day, and you get a massive stream of sequential data.

Instead of predicting the next word in a sentence, their model, called TradeFM, predicts the next event in a sequence: its timing, size, price depth, and direction.

TradeFM has 524 million parameters and was trained on 10.7 billion tokens drawn from more than 9,000 U.S. equities, covering 368 trading days from February 2024 to September 2025.
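To make that concrete, here is what a single raw event record might look like. The field names are mine for illustration, not the paper’s schema:

```python
from dataclasses import dataclass

@dataclass
class TradeEvent:
    """One order-book event. Illustrative field names, not the paper's schema."""
    inter_event_time: float  # seconds since the previous event
    action: str              # e.g. "submit", "cancel", "trade"
    side: str                # "buy" or "sell"
    size: float              # shares involved in the event
    depth: float             # dollar distance from the market mid-price
```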

Trading data is messy. Stocks trade at different prices. A $5 move on a $2 stock is massive. A $5 move on one that’s $500 isn’t news.

If you feed those raw numbers into a model, it can’t really compare one stock to another, so it struggles to learn general patterns.

So the researchers adjusted the data before training. They expressed price-related features in relative terms, compressed volumes so large and small trades are easier to compare, and measured time as the gap between events.

That puts different stocks on a common scale, so moves are comparable whether it’s a $2 stock or a $200 stock.
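In code, that preprocessing might look something like the sketch below. It is a minimal illustration of the three transforms described above; the log for volume compression is my assumption, not the paper’s published recipe:

```python
import numpy as np

def normalize_event(price, mid_price, size, t, t_prev):
    """Put one event's raw features on a stock-agnostic scale (illustrative)."""
    # Express the price as a fraction of the mid-price, so a $5 move on a
    # $2 stock and a $5 move on a $500 stock become comparable numbers.
    rel_depth = (price - mid_price) / mid_price

    # Compress volume with a log so 100-share and 100,000-share trades
    # sit on a similar scale.
    log_size = np.log1p(size)

    # Measure time as the gap since the previous event, not a timestamp.
    inter_event_time = t - t_prev

    return rel_depth, log_size, inter_event_time
```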

They then discretized each event’s features and combined timing, price depth, volume, side, and action type into a single composite token. The result was a vocabulary of 16,384 trade event tokens.
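The paper’s exact bin counts aren’t given here, but 16,384 is 2^14, so one illustrative factorization is 8 time bins × 16 depth bins × 16 volume bins × 2 sides × 4 action types. A sketch under that assumption:

```python
import numpy as np

# Illustrative bin counts only: 8 * 16 * 16 * 2 * 4 = 16,384.
# The paper's actual discretization may differ.
N_TIME, N_DEPTH, N_VOL, N_SIDE, N_ACTION = 8, 16, 16, 2, 4

def to_token(inter_event_time, rel_depth, log_size, side, action,
             time_edges, depth_edges, vol_edges):
    """Discretize one event and pack it into a single composite token ID."""
    # np.digitize maps a continuous value to a bin index given sorted bin
    # edges; with n_bins - 1 edges, indices run from 0 to n_bins - 1.
    t_bin = int(np.digitize(inter_event_time, time_edges))
    d_bin = int(np.digitize(rel_depth, depth_edges))
    v_bin = int(np.digitize(log_size, vol_edges))

    # Mixed-radix packing: each combination of bins maps to exactly one
    # integer in [0, 16384), the model's event vocabulary.
    token = t_bin
    token = token * N_DEPTH + d_bin
    token = token * N_VOL + v_bin
    token = token * N_SIDE + side      # 0 = buy, 1 = sell
    token = token * N_ACTION + action  # e.g. 0..3 for event types
    return token
```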


Related Research

HRT Trains AI Models on Trading Data (Jan 15)

Treating Trading Data As "Language" (Jan 2)

What they found

The researchers tested the model inside a simulated exchange, where it predicts trades in a continuous loop. The resulting data reproduces core patterns seen in real markets, including clustered volatility and large price swings. Across 9 stocks, 3 liquidity tiers, and 9 months of held-out data, it matched those patterns 2 to 3 times more closely than a standard baseline known as a Compound Hawkes process.
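Mechanically, that loop is an autoregressive rollout: sample the next event token, let the simulated exchange apply it to the order book, then feed the updated sequence back into the model. A sketch with hypothetical model and exchange interfaces, not the paper’s code:

```python
import torch

def simulate(model, exchange, context_tokens, n_events, temperature=1.0):
    """Roll a next-event model forward inside a simulated exchange.

    `model` and `exchange` are hypothetical stand-in interfaces.
    """
    tokens = list(context_tokens)
    for _ in range(n_events):
        # Predict a distribution over the 16,384 composite event tokens.
        x = torch.tensor(tokens).unsqueeze(0)       # shape (1, seq_len)
        logits = model(x)[0, -1] / temperature      # shape (16384,)
        probs = torch.softmax(logits, dim=-1)
        next_token = torch.multinomial(probs, num_samples=1).item()

        # Apply the sampled event to the order book; the exchange's new
        # state shapes everything the model generates next.
        exchange.apply(next_token)
        tokens.append(next_token)
    return tokens
```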

What’s most interesting is that TradeFM’s behavior extends beyond the U.S. data it was trained on. JPMorgan tested the model, without any adjustments, on trading data from China and Japan, where market structure differs meaningfully. Japan uses batch auctions at the open. China imposes 10% daily price limits. Spreads in both markets are several times wider than in the U.S. Despite those differences, the model’s performance degraded only moderately. It had never seen these markets, yet it still captured their core dynamics.

The model appears to be learning structure that carries across markets.

Arman Khaledian, PhD, a former quant at Millennium and now CEO of Zanista AI, said: “That’s not a toy result. It means the model is picking up something real about how markets work at a structural level.”
