Treating Trading Data As "Language"
On building frontier models for finance
RESEARCH
Treating Limit Order Books as a “Language”
LLMs are prediction machines. They excel at forecasting the next token, typically thought of as the next word.
But the same transformer architecture can be trained on weather readings, gene sequences, or financial transactions, given a large enough dataset.
And there’s growing research showing that, just as transformer-based models excel at language, they can excel in these other domains.
You just need a lot of computing power and terabytes of data. And financial markets are brimming with these data streams.
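To make the "order books as language" idea concrete, here is a toy sketch of serializing limit order book messages into a discrete token sequence that a transformer could be trained on with a next-token objective. The message schema, field names, and vocabulary here are invented for illustration; they are not LOBERT's actual tokenization.

```python
# Toy illustration: turning limit order book messages into a token
# sequence for next-token prediction. The message schema and
# vocabulary are invented for this sketch, not LOBERT's design.

# Each hypothetical message: (event type, side, price tick, size bucket)
messages = [
    ("ADD", "BID", 100, 2),
    ("ADD", "ASK", 101, 1),
    ("CANCEL", "BID", 100, 2),
    ("TRADE", "ASK", 101, 1),
]

def tokenize(msg):
    """Flatten one message into discrete string tokens."""
    event, side, tick, size = msg
    return [f"EV_{event}", f"SIDE_{side}", f"PX_{tick}", f"SZ_{size}"]

# Build the token stream and an integer vocabulary over it.
stream = [tok for msg in messages for tok in tokenize(msg)]
vocab = {tok: i for i, tok in enumerate(dict.fromkeys(stream))}
ids = [vocab[tok] for tok in stream]

# A transformer would be trained to predict ids[t+1] from ids[:t+1],
# exactly as a language model predicts the next word.
pairs = list(zip(ids[:-1], ids[1:]))
print(f"vocab size: {len(vocab)}, sequence length: {len(ids)}")
```

Once messages are tokens, the rest of the language-modeling machinery (embeddings, attention, a cross-entropy loss over the vocabulary) carries over unchanged.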
I recently spoke with Juho Kanniainen, a professor at Tampere University’s Data Science Research Centre, on how he and his colleagues trained a transformer-based model on limit order book messages with superior predictive results.
We talked about his new pre-print, “LOBERT: Generative AI Foundation Model for Limit Order Book Messages,” co-authored with Eljas Linna, Kestutis Baltakys, and Alexandros Iosifidis and presented at QuantMinds and the NeurIPS Workshop…