GPT-5 Underwhelms Wall Street

Hey, it’s Matt. This week on AI Street:

📊 Wall Street Shrugs at GPT-5 Rollout

🏦 LLMs + OCR and the future of data extraction

💵 News Roundup: The AI demand for quants

Forwarded this? Subscribe here. Join readers from McKinsey, JPMorgan, BlackRock & more.

OPENAI

GPT-5 Underwhelms Wall Street

When OpenAI rolled out GPT-5 last week, some expected a repeat of earlier releases that set the bar for the entire market. The reality has been more ‘Meh.’

“This was an incremental model update, vs transformative change,” said Raj Bakhru, CEO of Blueflame AI, recently acquired by Datasite. While GPT-5 improves results for existing users, he said, “it’s far below the original hype around what it would be” and was “not definitively the best model out there” at launch.

What we often see with new foundational model releases is that, while they tout the best benchmark scores and improve meaningfully in a number of areas, they also often regress in a number of areas. We’re identifying those as we speak.

Maithra Raghu, CEO of Samaya AI, which builds customizable AI agents for financial analysis, said there was “no major change in quality, accuracy,” and flagged quirks like the new automatic “thinking mode.”

“We actually find the automatic routing to ‘thinking mode’ difficult to work with, as GPT-5 will unpredictably spend a long time thinking and still produce a poor answer.”

On benchmarking, Raghu warned that public leaderboards “quickly get overfit to, or ‘gamed.’” Samaya AI instead uses its own evaluation framework to test models on real-world tasks.

Some upgrades stood out to Jeremy Leung, a fintech and AI consultant: less hallucination, no need to pick a model, and a more “enterprise-friendly” approach. “It probably agrees less with you, which is great as an investor,” he said.

Every time OpenAI or Anthropic or DeepSeek launches a new model, I reach out to experts to find out if it’s actually better — because it’s almost impossible to tell from public benchmarks. This is not great for broader AI adoption.

Most tests can be gamed or overfit, and the “X% better than before” claims rarely translate cleanly to real workflows.

Wall Street has no AI standards body comparable to ISDA for credit default swaps or FASB for accounting rules.

I think eventually there will be one. It’s still early days, and everyone is still getting their arms around the tech.

One group trying to fill that role is FINOS, the Fintech Open Source Foundation. It recently launched an Open Financial LLM Leaderboard to test language models on finance-specific tasks such as information extraction, sentiment analysis, and risk assessment. (GPT-5 was not yet ranked as of Aug. 13.) The group is now developing a broader “AI Evaluation & Benchmarking Suite” for production-ready generative AI, with a workshop and hands-on TechSprint planned for September in London.

Takeaway: GPT-5 is an incremental improvement, and Wall Street is still far from having a standards-setting body for AI.

DATA

AI Makes Document Data Extraction Cheaper

Sometimes you see “according to public records” in a story, but you rarely see the work it takes to get them. Some records aren’t truly public unless you show up in person. Others hide behind clunky government portals — like trying to pull real estate files in Nevada.

I’m interested in how AI could make this data genuinely accessible: easier to find, work with, and analyze.

AI can take messy, inconsistent data and clean it up. For example, if you have 'J.P. Morgan,' 'JP Morgan,' 'jpmorgan,' and 'J P Morgan' scattered throughout your documents, the AI can recognize these are all the same company and standardize them to one consistent format like 'J.P. Morgan.'
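To make the standardization step concrete, here is a minimal sketch in Python. It is a rule-based stand-in, not an LLM and not any vendor's method: the alias table `CANONICAL` and the helper names `normalize_key` and `standardize` are hypothetical, chosen only for illustration.

```python
import re

# Hypothetical alias table: normalized key -> canonical display name.
CANONICAL = {"jpmorgan": "J.P. Morgan"}


def normalize_key(name: str) -> str:
    """Lowercase the name and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def standardize(name: str) -> str:
    """Return the canonical name if the key is known, else the input unchanged."""
    return CANONICAL.get(normalize_key(name), name)


variants = ["J.P. Morgan", "JP Morgan", "jpmorgan", "J P Morgan"]
standardized = [standardize(v) for v in variants]
```

All four variants collapse to the same key, so they map to one canonical spelling, while a name missing from the table passes through untouched. Rules like these only cover variants someone thought of in advance. The appeal of an LLM is that it can catch the ones no hand-written rule anticipated.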

But it can’t quite do this at scale.

Some companies are blending AI with traditional methods to solve this. Start with OCR: optical character recognition. It began in the mid-20th century with rigid template matching. By the 1990s, statistical models made it better — but OCR still struggled with bad scans, odd layouts, and new fonts.

Pairing LLMs with OCR improves data-extraction accuracy and cuts costs.

To understand how this is playing out in practice, I caught up with Alberto Gimeno, CEO of Invofox, which builds AI tools to extract data from dense, document-heavy workflows.

“When we started building in 2021, we had to train everything from scratch,” he told me. “That gave us a moat—until LLMs came along and changed the game completely.”

The shift isn’t just about cheaper models, he said. “The biggest change now is not exactly that what was expensive before is cheaper… but that problems that couldn’t be solved — at least not easily, without tons of super specific R&D — now can be addressed using these new building blocks that LLMs provide.”

What is changing across the board: inference costs. “The cost per token has dramatically decreased,” Gimeno said, pointing to recent research from Epoch and Heunify showing how LLM prices have collapsed.

That drop has made entirely new use cases economical. Invofox now works with fintechs, lenders, and accounting firms to process massive PDFs that were previously too costly to handle. “Loan servicing and origination has become a very relevant use case over the last 18 months,” Gimeno said. “That wasn’t viable just a few years ago—now it is.”

The workflows are complex: hundreds or thousands of pages that need to be split, classified, de-duplicated, parsed, structured, and validated. Before LLMs, firms had to rely on brittle, layout-specific templates that couldn’t handle rare or inconsistent formats. Now, models can manage a wider variety of layouts with better accuracy and lower error rates — even for niche documents in compliance or finance.
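As a rough picture of what those stages look like in code, here is a toy sketch in Python. Everything in it is an assumption made for illustration: the form-feed page delimiter, the keyword rules standing in for a model, and the labels "note" and "income". It is not Invofox's pipeline, and it skips the parsing and structuring stages entirely.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Page:
    number: int
    text: str
    doc_type: str = "unknown"


def split_pages(raw: str) -> list[Page]:
    """Split OCR output into pages on form-feed characters (an assumed delimiter)."""
    chunks = raw.split("\f")
    return [Page(i + 1, t.strip()) for i, t in enumerate(chunks) if t.strip()]


def classify(page: Page) -> Page:
    """Keyword stand-in for a model call; the labels are hypothetical."""
    lowered = page.text.lower()
    if "promissory note" in lowered:
        page.doc_type = "note"
    elif "pay stub" in lowered:
        page.doc_type = "income"
    return page


def deduplicate(pages: list[Page]) -> list[Page]:
    """Keep the first page seen for each case- and whitespace-normalized text."""
    seen: set[str] = set()
    unique: list[Page] = []
    for page in pages:
        normalized = " ".join(page.text.lower().split())
        digest = hashlib.sha256(normalized.encode()).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(page)
    return unique


def validate(pages: list[Page]) -> list[Page]:
    """Return pages that are still unclassified, for human review."""
    return [p for p in pages if p.doc_type == "unknown"]


raw = "PROMISSORY NOTE\nBorrower: A. Smith\fPay stub  March\fpay stub march\fillegible scan"
pages = deduplicate([classify(p) for p in split_pages(raw)])
needs_review = validate(pages)
```

In this toy input, the third page duplicates the second once case and spacing are normalized, so it is dropped, and the fourth page matches no rule and lands in `needs_review`. The brittle part is `classify`: swap the keyword checks for a model call and the rest of the plumbing stays the same.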

“The economics are changing fast,” Gimeno said. “If you’re launching a fintech product or trying to extract data at scale, what wasn’t possible a year ago might already be doable now.”

Takeaway: AI may not just boost productivity. It’s helping capture more data at lower cost — and could one day make public records truly usable.

Related: The article that piqued my interest in this topic: Why Document Processing Costs Are About To Collapse (Nomad Data). Also: Intelligent Document Processing Leaderboard (Nanonets).

NEWS ROUNDUP

Wall Street & AI Startups Fight Over Quants

At a rooftop bar on Manhattan’s Lower East Side, roughly 150 quant researchers met with employees at the artificial-intelligence startup Anthropic who implored them to consider a life away from Wall Street. (Bloomberg)

Trump’s AI Plan Keeps Rules Light

Bloomberg Intelligence on the state of AI regulation

  • Light-touch federal approach: Trump’s AI Action Plan prioritizes advancing AI over heavy regulation, discouraging but not fully preempting state laws.

  • Preemption fight ahead: The main legislative battle through 2026 will be whether Congress blocks state AI laws; a July effort failed but could return.

  • No new federal AI agency: Political support has collapsed for a dedicated AI regulator; lawmakers favor industry-led standards via bodies like NIST.

  • Existing laws will govern: Oversight will come from agencies like the FTC and sector-specific regulations, even if they don’t fully fit AI.

  • Higher legal exposure: AI companies lack Section 230-style immunity, making them more vulnerable to lawsuits over harms. (Bloomberg Intelligence)

AI Is Coming for (Some) Finance Jobs

Former Millennium PM Omar Sayed says AI tools like Claude and Gemini, combined with retrieval-augmented generation (RAG), can now handle about 75% of a traditional hedge fund analyst’s work, making him four times more productive at his new fund, Porchester Capital. (Financial Times)

CALENDAR

Upcoming AI + Finance Conferences

I’ve put together a calendar of upcoming AI and finance conferences. Let me know if I’ve missed any and I’ll add them. (Just reply to this email.) Thanks!

  • AI in Financial Services (Arena) – Sept 9–10, 2025 • London

    Focused on AI strategy, implementation, and ROI in finance.

  • Cornell Financial Engineering Manhattan 2025 Future of Finance & AI Conference – Sept 19, 2025 • New York

    A one-day forum on AI, quantitative finance, and hedge-fund strategies, attracting leading quants and industry practitioners.

  • Bloomberg-Columbia ML in Finance Conf – Sept 25, 2025 • New York

    Academic–industry event hosted by Columbia University and Bloomberg, focused on ML applications in finance including asset pricing, market forecasting, and LLM risk.

  • GAIIM Conference 2025 – Sept 30, 2025 • New York

    Forum on practical applications of AI in investing, featuring tools for research, valuation, and portfolio workflows.

  • AIFin Workshop at ECAI 2025 – October 26, 2025 • Bologna, Italy

    One-day academic workshop on AI/ML in finance, covering trading, risk, fraud, NLP, and regulation.

  • AI in Finance 2025 – October 27–30, 2025 • Montréal

    Academic event covering ML in empirical asset pricing and risk.

  • ACM ICAIF 2025 – November 15–18, 2025 • Singapore

    Top-tier academic/industry conference on AI in finance and trading.

  • AI for Finance – November 24–26, 2025 • Paris

    Artefact’s AI for Finance summit, focused on generative AI, future of finance, digital sovereignty, and regulation.

  • NeurIPS Workshop: Generative AI in Finance – Dec. 6/7 • San Diego

    One-day academic workshop at NeurIPS focused on generative AI applications in finance, organized by ML researchers.

WHAT ELSE I’M READING

  • Datasite Pursues $500M Roll-Up Strategy in Private Markets Software (It’s Pronounced Data)

  • Playing by the Rules Costs Wall Street an Extra 51 Million Hours a Year (Bloomberg)

  • Treasuries Go 24-7 as Repo Trade Hits Blockchain on a Saturday (Bloomberg)

  • Banks accelerate AI deployments as agentic tools gain traction (CIO Dive)

  • Austria Credits AI for €354 Million Extra Tax Revenue in 2024 (Bloomberg Law)

  • Systematic Investing to Grow in Private Markets: BlackRock’s Raffaele Savi (Markets Media)

  • Baupost CEO Seth Klarman on how his hedge fund is using AI (Yahoo)

  • Anthropic offers Claude AI to federal agencies for $1 (FedScoop)
