Wall Street’s Arms Race for GPUs
Introducing AI Street Chat
Hey, it’s Matt. Welcome back to AI Street. This week:
The AI Arms Race on Wall Street
Interview: Robi Krempus on Manulife’s AI Platform Strategy
Research: IBM’s Small Models Pass Audit Check + News
AI Street Subscriber Chat
Since I started AI Street, my goal has been to report how Wall Street is actually deploying AI. Today, I’m adding a more direct way to track and discuss those developments: Subscriber Chat.
This is AI Street's first paid offering. It’s for professionals who want to compare notes on how firms are deploying AI inside research, trading, and operations.
Inside Subscriber Chat, you’ll find:
News: Curated news items that don’t make it into the weekly newsletter
Research: New AI and finance papers I’m tracking
Assignment Desk: Suggest topics, companies, or tools you want covered
Peer Notes: Compare workflows with other subscribers inside financial firms
Sentiment Polls: Quick polls to see where peers stand on new trends
AI Street Chat is available for $12.50/month with an annual plan, or $17 month-to-month, and also includes the full AI Street archive: 18+ months of interviews, analysis, and reporting on AI in finance.
Only the most recent edition is available without a subscription.
ANALYSIS
Wall Street’s Arms Race for GPUs
Wall Street’s use of AI is splitting in two. One layer focuses on productivity and efficiency. The other is a quiet, capital-intensive effort to build models that learn markets directly.
The first layer is becoming ubiquitous. Hedge funds, banks, and private equity firms are deploying large language models across existing workflows, using them to accelerate research, summarize documents, surface ideas, and automate internal knowledge sharing. This layer is not perfect and is still evolving, but it’s clear that AI-for-productivity is here to stay on Wall Street.
The second use of AI looks very different. A smaller group of firms is training transformer-based models on proprietary financial data, with the goal of modeling markets directly, treating trading data as a language. I reported earlier this month that Hudson River Trading is training large-scale sequence models on decades of global market data, applying techniques similar to those used in frontier language models.
To recap: HRT’s researchers are working with more than one hundred terabytes of market microstructure data, amounting to trillions of tokens, and have found that predictive accuracy keeps improving as the model and dataset scale up. This mirrors the scale of datasets used to train frontier language models like GPT-4. The firm operates its own data center and has purchased enough GPUs to briefly strain supply. HRT generated an estimated $12.3 billion in net trading revenue in 2025, according to Bloomberg—record results for the firm. Of course, HRT hasn’t detailed to what extent it’s relying on transformer-based models.
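The "trading data as a language" framing can be made concrete with a toy sketch. To be clear, this is my own illustration, not HRT's pipeline: discretize price-to-price returns into a small vocabulary of tokens, then fit a next-token predictor. A simple bigram count model stands in here for the large transformer; the bin edges and prices are made up.

```python
from collections import Counter, defaultdict

def tokenize_returns(prices, edges=(-0.02, -0.005, 0.005, 0.02)):
    """Map each return into a discrete token (0..len(edges)).

    This is the 'market data as a language' idea in miniature:
    continuous returns become a small vocabulary that a sequence
    model can predict over. Edges are illustrative cutoffs.
    """
    tokens = []
    for prev, cur in zip(prices, prices[1:]):
        r = (cur - prev) / prev
        tokens.append(sum(r > e for e in edges))  # count of edges exceeded
    return tokens

def fit_bigram(tokens):
    """Count next-token frequencies; a stand-in for a trained transformer."""
    counts = defaultdict(Counter)
    for a, b in zip(tokens, tokens[1:]):
        counts[a][b] += 1
    return counts

def predict_next(counts, token):
    """Greedy next-token prediction: most frequent successor in training."""
    if token not in counts:
        return None
    return counts[token].most_common(1)[0][0]

# Hypothetical price path, purely for illustration
prices = [100, 101, 100.9, 100.95, 102.1, 102.0, 102.05, 103.2]
toks = tokenize_returns(prices)
model = fit_bigram(toks)
```

The point of the sketch is only the framing: once returns are tokens, "predict the next price move" and "predict the next token" are the same objective, which is why scaling results from language modeling transfer so directly.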
More reporting and job posts suggest HRT is not alone, though no firm is really going to spell out its plans publicly.
The costs of these two approaches are billions apart. Pre-training large models requires orders of magnitude more computing power than running them once trained.
As a result, access to compute is a real hurdle. Firms with capital to secure GPU clusters can train proprietary models continuously; those without are at a disadvantage. At least one large hedge fund has gone directly to hyperscalers to try to secure any unused GPU capacity, according to people familiar with the discussions.
XTX exemplifies this shift. The market maker is investing more than €1 billion to build a data‑center campus in Finland. The first 22.5‑MW facility is slated to come online in 2026, adding to the firm's existing research cluster that has more than 25,000 GPUs and 650 petabytes of storage. This amount of computing power rivals the dedicated clusters used by big tech to train frontier models.
Its chief technology officer says owning the infrastructure allows the firm to deploy increased computing power on its own terms.
“Our need for compute has outgrown available leasing options. We are building ahead of our needs to establish a backbone for future growth of the business,” said Joshua Leahy, XTX Markets CTO.
In this regime, capital and infrastructure matter as much as data and talent. The edge once won by hiring the best quants to engineer trading signals is increasingly shifting toward the ability to build and run models at scale.
As the FT's Alphaville noted this week, quant shops and AI labs run almost identical pipelines: data, models, constraints, execution. High-Flyer, the hedge fund behind DeepSeek, just swapped next-price prediction for next-token prediction.
This pattern is already familiar outside finance. Transformer-based models have displaced earlier approaches in areas like weather forecasting and other complex prediction problems, not because they were more elegant, but because end-to-end learning scaled better once sufficient data and compute were available.
These firms won't say whether their transformer models are generating alpha. But you don't spend €1 billion on a data center or buy enough GPUs to strain regional supply for an experiment. These are bets on a specific future where foundation models become core infrastructure for trading, or at minimum, a conviction that compute capacity will remain an edge.
INTERVIEW
Manulife’s Robi Krempus on Adopting AI Early
When generative AI began gaining traction on Wall Street, many firms responded cautiously, often firewalling off the technology from their employees. At Manulife Investment Management, the reaction was different. After years of investing in cloud, data, and machine learning infrastructure, the firm moved early to establish an AI framework across the organization, building governance, risk controls, and a process for prioritizing use cases.
I recently interviewed Robi Krempus, who leads AI for global wealth and asset management, which has $1.3 trillion in assets under management and administration. Earlier in his career, Krempus was a control systems engineer in the energy sector, working on nuclear power thermodynamics and other high-stakes modeling problems. That background now shapes his role overseeing Manulife’s AI platform. Rather than committing to a single vendor, his team has built a model-agnostic framework that allows the firm to move between providers such as OpenAI, Anthropic, and Google as the technology evolves.
In our chat below, Krempus explains why Manulife moved quickly, how his team co-designs tools with portfolio managers, and why the firm shifted from project-based experimentation to a platform strategy. He also discusses how Manulife evaluates large versus small language models, how it manages tech debt as models change, and where AI is already proving useful, particularly in extracting qualitative signals from standard financial disclosures.
This interview has been edited for length and clarity.
Matt: Manulife moved quickly when generative AI first emerged. At the time, many big firms were banning or firewalling it. What drove that decision?
Robi: We truly saw the opportunity. We had already established a strong data science and machine learning community at Manulife. When you build traditional machine learning models, it is often about forecasting or predicting a variable, which still requires a massive infrastructure. When generative AI came into the mix, we quickly understood that this is much broader and will impact everything—decision-making and how you think about intelligence. Organizationally, we saw the opportunity, and with the CTO, CIO, and Chief AI officers, we were certain this technology was not going to go away. Unlike emerging technologies like blockchain that take time to embed, it was quite apparent that this would be transformational.
Matt: How do you architect this technology? How do you organize it to get started?
Robi: In asset management, there were three ingredients where we believed this would really make a difference. One is strong leadership support. We had huge support from Colin Purdie, the Global Chief Investment Officer for Public Markets, and his leadership team. Secondly, we co-designed solutions with the investment professionals. We have CFAs on my team, but we are not managing the money; the investment professionals are. That co-design allowed us to tackle specific pain points together.
Thirdly, our mindset shifted from being project-based to a platform mindset. We wanted to establish a platform so that whenever we have an additional use case, we can give AI to the end user through that platform. We have seen adoption over 70%, and we hold weekly office hours where investment professionals can stay on top of new features.
Matt: Some firms use various models as an engine and build an application layer on top. Can you walk me through your thinking on building those applications?
Robi: From a Manulife perspective, we have a robust model risk management process in place. Before anything goes into production, it is vetted against hallucinations and quality. In working with investment professionals, quality matters a huge deal. If the LLMs do not produce an output that hits the investment context, it will not work. We architected our AI with feedback loops and tested various systems to increase output quality and reduce hallucinations. It is not a straight-through process to a reasoning model; spending time on the AI architecture to increase quality was really impactful.
Matt: Are you agnostic to the model? Can you swap different models in and out of your infrastructure?
Robi: Yes. That goes back to the ten years of investment we put into infrastructure and cloud. What is amazing now is the availability of all these models. Even when OpenAI released 3.5, we had access to it quite fast. The idea was to create a data framework that allowed us to productionalize models in a responsible way. We have a broad lineup available, whether it is OpenAI…
RESEARCH
A problem with using AI in professional work is that you can’t reliably get it to give you the same answer twice.
New research from IBM suggests this is less of an issue with smaller language models. Small here is relative. We are still talking about models with billions of parameters, not trillion-parameter systems.
As previously documented on AI Street, IBM researchers showed that 7–8 billion parameter models can produce identical outputs across repeated runs under controlled settings. New work extends this idea further, asking whether automated decisions can be reproduced months later.
This is the scenario supervisors care about. If a regulator knocks on your door and says: six months ago your system flagged this transaction, rejected this trade, or escalated this client. Can you show me, today, how the system reached that decision?
For an LLM agent, that means more than reproducing the final answer. It means reproducing the sequence of tool calls, the data retrieved, the intermediate calculations, and the evidence cited.
IBM created a framework that checks whether the AI’s output is consistent and whether its decisions are grounded in the underlying data.
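A minimal sketch of what such a check could look like (the names and trace format below are my own, not IBM's framework): log every step of an agent's run into a structured trace, hash it, and compare that fingerprint against a fresh re-run. Any divergence in tool calls, inputs, or intermediate values surfaces as a hash mismatch.

```python
import hashlib
import json

def run_agent(txn, tools):
    """A stub 'agent': a deterministic pipeline that calls tools and
    records every step. A real system would wrap LLM calls here; the
    trace format, not the decision logic, is the point."""
    trace = []
    risk = tools["risk_score"](txn)
    trace.append({"tool": "risk_score", "input": txn, "output": risk})
    decision = "escalate" if risk > 0.7 else "approve"
    trace.append({"step": "decision", "output": decision})
    return decision, trace

def trace_fingerprint(trace):
    """Canonical JSON + SHA-256: a compact record an auditor can
    compare against a re-run months later."""
    blob = json.dumps(trace, sort_keys=True)
    return hashlib.sha256(blob.encode()).hexdigest()

# Hypothetical tool and transaction for illustration
tools = {"risk_score": lambda txn: 0.9 if txn["amount"] > 10_000 else 0.1}
txn = {"id": "T-1", "amount": 25_000}

decision_then, trace_then = run_agent(txn, tools)  # "six months ago"
decision_now, trace_now = run_agent(txn, tools)    # the re-run today
```

With a nondeterministic model in the loop, the two fingerprints would differ, and that gap between runs is exactly what the reproducibility evaluations measure.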
The new paper evaluates 74 configurations across 12 models and several agent architectures. The result is consistent with the earlier work. In controlled settings, 7 to 20 billion parameter models reach near 100 percent reproducibility across runs. Frontier models continue to produce inconsistent outputs even under aggressive constraints.
Two points stand out.
First, how you build the agent matters as much as which model you use. Systems with tightly structured outputs are much more likely to behave consistently than general agents, even when they run on the same underlying model.
Second, inconsistent answers become expensive. When a system behaves differently across runs, you have to test it many more times just to satisfy validation and model risk. In the authors’ estimates, the least stable systems need about four times as much testing. For many banks, that is simply too slow and too expensive.
For now, the research suggests that smaller, more stable models paired with narrow agent designs are more likely to meet audit requirements today than their bigger counterparts.
ICYMI
NEWS
FCA To Review AI in Financial Services
The UK’s financial services watchdog has launched a review exploring the implications of advanced AI on consumers, retail financial markets, and regulators.
The Financial Conduct Authority (FCA) is seeking views on how AI could evolve, including the development of more autonomous and agentic systems, and how these developments could affect competition and market structure. Hogan Lovells
Only 12% of U.S. workers say they use AI daily
If you’re reading this newsletter, you’re probably an early AI adopter. You’ve tried Claude, ChatGPT, Gemini, etc. I use them constantly, which is what makes this AP story helpful: it shows how many people are still not in that category.
This seems too low to me, but maybe I’m too entrenched in the AI world:
Some 12% of employed adults say they use AI daily in their job, according to a Gallup Workforce survey conducted this fall of more than 22,000 U.S. workers.
Sequoia leads Rogo raise at $750M valuation
Rogo, a maker of AI agents for dealmakers and investors, raised $75 million in Series C funding at a $750 million valuation led by Sequoia Capital, according to Axios.
“Everyone at Two Sigma is expected to use LLMs to accelerate their work and consider workflow improvements for their teams.”
Jeff Wecker, CTO, in AI in Investment Management
ROUNDUP
What Else I’m Reading
H2O AM launches Global Macro Fund supported by AI tools Hedgeweek
Gerko’s XTX Notches 50% Jump in European Bilateral Stock Trading BBG
Why AI won’t wipe out white-collar jobs The Economist
Quant funds have had a tough start to the year Business Insider
Quant Jobs vs Research Jobs in an AI lab eFinancialCareers
Two Sigma hires top Goldman Sachs AI technologist eFinancialCareers
What to Expect From AI in 2026 Goldman’s Marco Argenti
GCM Grosvenor Taps Model ML for AI Investing Press Release
BofA struggled to deploy Nvidia AI software, emails show Business Insider
Citadel’s Umesh Subramanian is out as CTO after 7 years Business Insider
SPONSORSHIPS
Reach Wall Street’s AI Decision-Makers
Advertise on AI Street to reach a highly engaged audience of decision-makers at firms including JPMorgan, Citadel, BlackRock, Skadden, McKinsey, and more. Sponsorships are reserved for companies in AI, markets, and finance. Email me (Matt@ai-street.co) for more details.
CALENDAR
Upcoming AI + Finance Conferences
CDAO Financial Services – Feb. 18–19 • NYC
Data strategy and AI implementation in the financial sector.
Future Alpha – Mar. 31–Apr. 1 • NYC
Cross-asset investing summit focused on data-driven strategies, systematic investing, and tech stacks.
AI in Finance Summit NY – Apr. 15–16 • NYC
The latest developments and applications of AI in the financial industry.
Momentum AI New York – Apr. 27–28 • NYC
Senior-leader forum on AI implementation across financial services, from operating models to governance and execution.
AI in Financial Services – May 14, 2026 • Chicago
Practitioner-heavy conference on building, scaling, and governing AI in regulated financial institutions.
AI & RegTech for Financial Services & Insurance – May 20–21 • NYC
Covers AI, regulatory technology, and compliance in finance and insurance.
Interesting that the core bet here is treating market data as a 'language' to learn from scratch. What I'm curious about is the trade-off: pre-training from zero vs. continued pre-training on an open-source base, then fine-tuning for trading objectives. The compute costs are orders of magnitude apart, but perhaps I'm failing to consider that HFs can absorb expensive training runs so...
Great article.
Fascinating dive into the compute arms race. The parallel between quant shops and AI labs is esp apt: both are essentially running prediction-optimization pipelines at scale. That XTX €1B data center investment signals this isn't experimental anymore, it's infrastructural. The point about smaller firms lacking GPU access being at a real disadvantage makes sense when training becomes the edge itself.