Insider Quant Trading Is Here... Sorta [Code Included]
The hyper-speed arms race in Finance is a lot less competitive than you think.
By now, we’re all too familiar with the routine:
News Outlet A releases news on Company B at 3:30:00, then by 3:30:05, the stock price has moved significantly based on that news. It’s one of the fiercest and most competitive spaces in Finance.
However, it may surprise you that the game is a lot less played out than you think, with significant crumbs still up for grabs.
But before we build out a system that tries to join this arms race, we first need to understand how these systems work.
Before News Becomes News
Typically, most investors become aware of meaningful information through an aggregator such as Bloomberg or the Financial Times:
But before these stories reach the aggregators, they are first sent to one of a few central repositories. A central repository is the first point where important information goes from private to public. There are essentially 3 players in this game: PRNewsWire, BusinessWire, and DowJones NewsWire.
Most relevant stock information comes from these sources, from Index Additions/Deletions to Earnings Summaries (e.g., $4.2 EPS vs. $3 Estimate):
Because these 3 channels receive the releases directly from each company’s own internal staff before publication, they are highly coveted. So coveted that, in a scheme that came to light in 2015, hackers had spent years gaining root access to the newswires’ servers and made $30 million trading on the unpublished releases stored there:
Unfortunately, this specific area is where the competition is a bit too strong for us. The DowJones newswire specializes in delivering these feeds to algorithmic traders, already formatted, cleaned, and structured. Because this infrastructure already exists, even if we web scrape sites like PRNewsWire, we will still be a few seconds to a few minutes too slow.
So, we have to get smarter.
Aside: I’ve recently started reading “Fortune’s Formula”, an entertaining quant read about the origins of Information Theory in Gambling & Trading (e.g., Ed Thorp, Claude Shannon). There is an interesting chapter on the history of betting newswires which partially inspired this research.
The Information Honeypot
While press releases are a convenient and reliable way of mass-distributing information to investors, they aren’t always used. Companies are eager to disseminate positive news, but when it comes to things like share dilutions or new business risks due to geopolitics, they aren’t as eager to publicize them, nor are they legally required to issue a press release.
However, they are legally bound to submit all material information to the SEC. When a company submits a press release, they also submit an SEC filing, but it isn’t always the other way around. The types of changes submitted to the SEC include warning notices by the exchange to delist shares, plans to sell shares (dilute), new lawsuits and their outcomes, layoffs, and more.
The kicker is that while the SEC makes these filings accessible in mere milliseconds, the data is very noisy and unstructured. Because of this, financial media outlets are not as fast to format, summarize and disseminate the information (thousands of filings per day vs. a few major press releases).
This is where we have a shot.
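To make the idea concrete, here is a minimal sketch of watching for new filings through EDGAR's "latest filings" Atom feed. Treat the details as assumptions rather than the exact method used later: the helper names (parse_feed, fetch_latest) are made up for illustration, the title layout the regex expects ("8-K - Example Corp (0000000001) (Filer)") should be checked against the live feed, and the SEC expects automated clients to identify themselves in the User-Agent header.

```python
import re
import urllib.request
import xml.etree.ElementTree as ET

# EDGAR "latest filings" Atom feed; form type and count are query parameters.
FEED_URL = ("https://www.sec.gov/cgi-bin/browse-edgar"
            "?action=getcurrent&type={form}&count={count}&output=atom")
ATOM = {"atom": "http://www.w3.org/2005/Atom"}

# Assumed title layout: "8-K - Example Corp (0000000001) (Filer)"
TITLE_RE = re.compile(
    r"^(?P<form>.+?) - (?P<company>.+) \((?P<cik>\d+)\) \((?P<role>[^)]+)\)$"
)


def parse_feed(xml_data):
    """Turn the Atom XML (str or bytes) into a list of dicts, one per filing."""
    filings = []
    for entry in ET.fromstring(xml_data).findall("atom:entry", ATOM):
        title = entry.findtext("atom:title", default="", namespaces=ATOM)
        match = TITLE_RE.match(title.strip())
        if match is None:
            continue  # skip entries that don't follow the assumed layout
        link = entry.find("atom:link", ATOM)
        filings.append({
            **match.groupdict(),
            "updated": entry.findtext("atom:updated", default="", namespaces=ATOM),
            "url": link.get("href") if link is not None else None,
        })
    return filings


def fetch_latest(form="8-K", count=40,
                 user_agent="Your Name your.email@example.com"):
    """Download the feed once and parse it (hypothetical helper)."""
    request = urllib.request.Request(
        FEED_URL.format(form=form, count=count),
        headers={"User-Agent": user_agent},  # replace with your own contact info
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_feed(response.read())
```

Calling fetch_latest() on a timer and remembering which URLs you have already seen is enough for a first version. Keeping the parsing separate from the download also means the parser can be tested on saved XML without touching the network.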
You might immediately think that this source too is priced-in quickly, but let’s go over a real-world example.
On September 21st, 2023, VinFast Auto (VFS) submitted a prospectus to the SEC for a 75,000,000 share offering. When a company files a prospectus to dilute shares, the SEC first has to post a notice of effectiveness before the company can actually sell the shares on the open market.
Conventionally, one would assume that price and volume would adjust immediately to the filing, so that by the time the notice of effectiveness posts, there is no impact left.
Fast forward about a week and a half, and the share sale was approved on October 2nd. Let’s see what happened:
As demonstrated, on the date of the filing, there was not much volume. Only when the SEC formally approved the sale, making it official (when the news became “news”), did volume pick up and news coverage begin.
This demonstrates that even in this day and age, the market doesn’t immediately price in all available information. This delay was likely magnified by investors’ behavioral bias to pay more attention to the “important” names (S&P Components) as opposed to changes in obscure companies. Whenever a stock is strongly positive/negative for a day but you can’t seem to find any news explaining why, it’s probably because there’s a filing you haven’t read.
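One simple way to put a number on "not much volume" versus "volume picked up" is relative volume: each day's volume divided by the average of the preceding days. The sketch below is only an illustration. The function name, the window length, and the sample figures are all assumptions, and the numbers are made up rather than actual VFS data.

```python
from statistics import mean


def relative_volume(volumes, window=20):
    """Ratio of each day's volume to the mean of the previous `window` days.

    Days without a full trailing window get None.
    """
    ratios = []
    for i, volume in enumerate(volumes):
        if i < window:
            ratios.append(None)
            continue
        baseline = mean(volumes[i - window:i])
        ratios.append(volume / baseline if baseline else None)
    return ratios


# Hypothetical daily share volumes: a quiet stretch, a near-normal day, then a spike.
volumes = [1_000_000] * 5 + [1_100_000, 6_000_000]
ratios = relative_volume(volumes, window=5)
print(ratios[-2:])
```

A ratio close to 1 on the filing date followed by a ratio several times larger on the effectiveness date is the pattern described above: the information was public, but the market had not reacted to it yet.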
I found out about this filing almost a full week late, but was still able to generate a tremendous profit of ~3x:
So, with this precedent set, what if we quantitatively systematized the entire pipeline? This would allow us to:
immediately identify any new filings and their respective companies
quantitatively determine the relevancy/meaningfulness of the filing
establish whether the opportunity is already priced in
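As a rough illustration of how the last two steps could fit together, here is a minimal triage sketch. Everything in it is an assumption made for the example: the form-type weights, the cutoffs, and the function names are hypothetical placeholders, not the scoring used in the system described in this article.

```python
from dataclasses import dataclass

# Hypothetical relevance weights per SEC form type; tune these against your own data.
FORM_WEIGHTS = {"424B3": 1.0, "F-1": 0.9, "S-1": 0.9, "EFFECT": 0.8, "8-K": 0.5}


@dataclass
class Filing:
    ticker: str
    form: str


def relevance(filing):
    """Step 2: score the filing by form type (unknown forms score 0)."""
    return FORM_WEIGHTS.get(filing.form, 0.0)


def is_priced_in(rel_volume, return_since_filing, vol_cutoff=2.0, move_cutoff=0.05):
    """Step 3: call it priced in once volume or price has clearly reacted."""
    return rel_volume >= vol_cutoff or abs(return_since_filing) >= move_cutoff


def triage(filing, rel_volume, return_since_filing, min_relevance=0.6):
    """Combine steps 2 and 3 into a single label for a newly detected filing."""
    if relevance(filing) < min_relevance:
        return "ignore"
    if is_priced_in(rel_volume, return_since_filing):
        return "priced_in"
    return "candidate"


# Hypothetical inputs: a prospectus filing with normal volume and a flat price.
print(triage(Filing("XYZ", "424B3"), rel_volume=1.1, return_since_filing=0.01))
```

The point of the sketch is the ordering: cheap filters (form type) run first, and the market-data check only runs on filings that survive them.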
So, that’s exactly what we did.
The Quant Edge
Let’s first take a high-level overview of how this system should work: