Spotting Accounting Shenanigans with AI
Hey, it’s Matt. You’re reading AI Stack, an interview series exploring how investors are adopting AI. In this edition:
🕵️‍♂️ The CEO of Transparently.ai, Hamish Macalister, PhD, on identifying red flags with AI.
INTERVIEW
I spent six years writing about white-collar crime for Bloomberg News. In that time, I learned that accounting fraud cases were among the longest for the SEC to investigate and the hardest to bring.
I was surprised. I naively thought, “Well, if the company is cooking the books, eventually folks will find out, right?” But that’s not always the case. Bad actors can use events outside their control—like COVID—to bury years of weak numbers.
It’s just hard to police accounting statements. And even if you latch on to what you think is a significant issue, sometimes it doesn’t matter because the company is massive. I once wrote about a company that stuffed six months of revenue into a quarter, and the market basically shrugged. Granted, that detail does not inspire confidence.
Ambani’s Mobile Startup Packs 6-Month Sales Into a Quarter
A review of Jio’s unaudited results for the last year shows that the wireless venture and its parent relied on a series of accounting decisions that wound up portraying Jio’s financial performance in the best possible light.
To dig deeper into these challenges, I spoke with Hamish Macalister, co-founder and CEO of Transparently.ai, which uses traditional AI and large language models to assess signs of accounting manipulation.
Transparently.ai rates the accounting health of 80,000+ public companies on an A-to-F scale, flagging early signs of manipulation and potential failure. Founded in 2021, the Singapore-based company counts among its clients two of the Big Four auditors and money managers overseeing $4 trillion in assets.
Macalister worked as a macro strategist at Citigroup, led quantitative strategy in Asia at Deutsche Bank, and later served as chief data scientist at Firth Investment Management. He also earned a PhD in finance, where his doctoral research on analyst forecasts laid the groundwork for Transparently.ai’s approach.
In this interview, you’ll learn:
Why accounting manipulation is more common than most investors think.
How avoiding high-risk companies based on these scores can generate meaningful alpha.
Why auditors and analysts miss red flags—and how AI can surface them.
This interview has been edited for clarity and length.
How does Transparently.ai help investors evaluate accounting nuances across industries?
This is a perfect problem for machine learning: it’s very complex and multidimensional, but there’s also a great deal of data, and that combination makes it well suited to the technology.
There may be relationships a person would struggle to identify, but a machine can. Another advantage is that the machine isn’t wedded to traditional ways of thinking. For example, it might pick up on signals that an activist short seller would look at, but from a different angle.
One of the red flags might be unusually high margins—possibly a sign the company is faking revenue or hiding costs. That’s a classic example of what an activist short seller might look for.
From a machine learning or AI standpoint, the system might learn something similar: unusually high margins can be a warning sign. Our system does flag that from time to time. But it can also flag unusually low margins if it detects that certain combinations of features—low margins alongside other factors—may indicate a company is doing something unusual.
Machine learning can identify very complicated patterns that may not be intuitively obvious. The one thing I’ll add to that is it cannot just be a black box—unless you’re a quant and all you care about is the black box. In that case, all you want is the numerical output: the risk, the number, the indicator.
But for most of the users we deal with, they want some sort of explanation behind this. So it’s critical to design the system not only to provide an indication that something unusual may be happening in a company’s accounts, but also to explain why and how. It should guide what you need to do next: what questions to ask management, what areas to investigate, and what procedures to implement if you’re an auditor, given the specific features of that company.
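To make the "flag plus explanation" idea concrete, here is a toy, rule-based sketch in Python. The feature names, thresholds, and suggested follow-ups are hypothetical illustrations, not Transparently.ai's actual model or output:

```python
# Illustrative sketch only: a toy, rule-based version of "flag plus explanation".
# Feature names and thresholds here are invented, not Transparently.ai's system.

def explainable_flags(features: dict) -> list[dict]:
    """Return a list of flags, each paired with a reason and a suggested next step."""
    flags = []
    if features.get("gross_margin", 0) > 0.80:
        flags.append({
            "signal": "unusually high gross margin",
            "reason": "margins far above peers can indicate faked revenue or hidden costs",
            "next_step": "ask management to reconcile margins against industry peers",
        })
    if features.get("gross_margin", 1) < 0.05 and features.get("receivables_growth", 0) > 0.50:
        flags.append({
            "signal": "low margin combined with fast-growing receivables",
            "reason": "this combination may indicate revenue booked before cash is collected",
            "next_step": "review revenue recognition policy and the aging of receivables",
        })
    return flags

company = {"gross_margin": 0.03, "receivables_growth": 0.65}
for flag in explainable_flags(company):
    print(flag["signal"], "->", flag["next_step"])
```

A real system learns such combinations rather than hard-coding them, but the output shape is the point: each numerical warning carries the "why" and the "what to do next" that non-quant users ask for.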
What was the a-ha moment for you to start this company?
The a-ha moment came when I was a quant fund manager marketing my fund, talking to private wealth advisors and others. I would casually mention accounting manipulation, since to me it was a very small part of the process.
But the reaction across the table was visceral—people would literally stop me mid-sentence: “Wait, stop there. How do you do that? I didn’t know that was possible.”
I kept hearing it again and again: “I didn’t know it was possible to quantify aspects of accounting manipulation, or to quantify the quality of the accounts.”
I heard it so many times that I started thinking: first, this is amazing, because clearly nobody seems to know about it. And second, while there’s actually quite a significant body of academic research in this space, very few people are aware of it—because very few people read accounting journal articles unless they have serious sleeping problems.
How big is this issue?
It’s a multi-trillion-dollar-a-year problem, and forget about what we say: there’s academic research that shows it. It’s a monster pain point for which, as far as we could tell, nobody else had come up with a solution.
Independent academic research finds that, in the U.S., on average 40%—four zero—of companies manipulate their accounts every year. That’s astonishing. Manipulation can range from something mild and permissible to outright fraud. At the extreme, the same research found that 10% of companies commit securities fraud annually. That’s mind-boggling.
Now, let’s just take that 10%. If you knew in advance which companies were doing this, you wouldn’t touch them with a barge pole. The numbers don’t matter if you can’t trust them. And if you can’t trust the numbers, your investment analysis breaks down.
But what we realized was that there wasn’t much understanding of just how widespread this problem is.
If you’re talking to, for example, the audit assurance team of a Big Four auditor, they know how big a problem it is because they see it firsthand. But for your typical investor or bank asset manager, while they recognize it as a pain point, there isn’t necessarily an appreciation for just how large it really is.
That’s why we started producing research showing, for example, the magnitude of return differentials between high-risk and low-risk companies in our system using true point-in-time data. This isn’t a backtest.
We generate our risk scores and ratings for companies, then track their performance over the next 1, 3, 6, 9, 12, 24, and 36 months. We looked across all these different periods and compared the performance of high-risk companies versus low-risk companies.
What we found was that the alpha was far larger than we expected. This system keeps surprising us—every time we look at it from a new perspective, the impact is even more dramatic.
Importantly, we didn’t design it to do that. We designed it to identify corporate collapse, or the likelihood of collapse, over a two-to-three-year lead time. But what we discovered is that there’s very significant alpha in our work even for one-month holding periods, which was really surprising.
Instead of comparing the best and worst companies, which mimics a long-short portfolio, the question is: what happens if a typical billion-dollar fund simply avoids the worst companies above a certain risk threshold? What difference does that make to return and risk performance? Once again, the numbers are ludicrous. And because it’s just a matter of not holding something, there’s no issue with trading costs or implementation.
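The avoidance screen described above can be sketched in a few lines. The tickers, letter grades, and returns below are invented for illustration; the point is only the mechanics of comparing the full universe against the same universe minus the worst-rated names:

```python
# Toy illustration of an avoidance screen. Tickers, grades, and returns are
# made up for this example; they are not Transparently.ai's data or output.

universe = [
    ("AAA Corp", "A", 0.12),
    ("BBB Inc",  "B", 0.09),
    ("CCC Ltd",  "C", 0.04),
    ("DDD Plc",  "D", -0.18),
    ("FFF Co",   "F", -0.35),
]

def mean_return(holdings):
    """Equal-weight average return across the holdings."""
    return sum(r for _, _, r in holdings) / len(holdings)

full = mean_return(universe)
screened = mean_return([h for h in universe if h[1] not in {"D", "F"}])

print(f"full universe:      {full:+.3f}")
print(f"avoiding D/F names: {screened:+.3f}")
```

Because the screen only means *not holding* certain names, there are no extra trades to execute, which is why the approach sidesteps trading costs and implementation friction.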
How are the ratings calculated? Are they deterministic? Traditional AI with a generative component?
It’s partly deterministic and partly non-deterministic. The predictive AI component definitely has an element of being non-deterministic, though not as much as the generative AI component.
I remember this Journal story. It highlighted academic work showing that the digit 4 rarely shows up just after the decimal in reported earnings, because if you can nudge it to 5, you round up.
This is a very good example, and it’s well known in the academic literature. If you think about the distribution of earnings surprises, imagine a bell curve. The peak is usually around a small positive surprise. Then you get long tails in both directions. That’s what you’d expect.
But in reality, the distribution looks like that with one exception: you don’t see small negatives. Instead, there’s a dive down and then a jump back up around small negative surprises. That goes to your point: companies know you don’t want to report a small negative surprise. If you’re going to miss, make it a big miss, because either way the market will react badly.
So you get these biases in earnings surprises that create very odd mathematical patterns—yet they make intuitive sense. Companies don’t want to disappoint, and if they have to, they’d rather do it in a big way.
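The distortion described above, a bell curve with a missing band of small negative surprises, can be simulated. In this sketch (synthetic data, for illustration only), any surprise that would land in the small-miss band is "managed" into a small beat, producing the dive just below zero and the pile-up just above it:

```python
import random

# Synthetic illustration of the earnings-surprise distribution described above.
# We simulate "earnings management": any surprise that would land in the small
# negative band is nudged into the small positive band instead.

random.seed(42)
BAND = 0.01  # width of the "small surprise" band around zero

def managed_surprise() -> float:
    s = random.gauss(0.005, 0.03)  # bell curve peaking at a small positive surprise
    if -BAND <= s < 0:             # a small miss...
        return -s                  # ...gets managed into a small beat
    return s                       # big misses and all beats are reported as-is

surprises = [managed_surprise() for _ in range(10_000)]

just_below = sum(1 for s in surprises if -BAND <= s < 0)
just_above = sum(1 for s in surprises if 0 <= s < BAND)
big_miss   = sum(1 for s in surprises if s < -BAND)

print(f"small misses: {just_below}")  # the missing band: dives toward zero
print(f"small beats:  {just_above}")  # inflated by the managed misses
print(f"big misses:   {big_miss}")    # the long negative tail survives
```

Counting observations per bin reproduces the pattern the research describes: the small-miss bin empties out, small beats are overrepresented, and the big-miss tail remains intact.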