Speed is relative.
Depending on the context, ‘fast’ can mean different things. Usain Bolt ran 100 meters in 9.58 seconds - high-frequency traders can trade billions in a fraction of a blink.
“You get used to really big numbers at really crazy speeds,” says Dave Lauer, a high-frequency trading veteran. A former quantitative research analyst at Citadel Investment Group, where he helped develop trading models, Lauer also worked for Allston Trading, then researched high-frequency trading and its market impact at Better Markets, before joining the IEX stock exchange.
High-frequency trading is a method of algorithmic trading in which large volumes of shares are bought and sold automatically, within a fraction of a second.
“These are speeds that your brain can’t really comprehend," Lauer says. "We had strategies that would be millions of dollars in milliseconds that would go in and out. And sometimes, it could go wrong, and you would have to dissect things to try and figure out, what caused us to lose millions in that moment?”
For many, stock exchanges bring to mind images of cramped rooms full of shouting and hand signals used to conduct trades. But these images are, at this point, archaic.
By the 1980s and 1990s, phone and electronic trading became the norm, and those traders were able to finally put their hands back down. Today, trading floors are somewhat symbolic. Stock markets are instead a network of data centers and databases, and trades are all made via the Internet.
With the phasing out of open outcry trading, and the introduction of new technology - notably when NASDAQ introduced a purely electronic form of trading in 1983 - high-frequency trading (HFT) gradually developed, and has continued to get quicker.
“HFT isn’t really a trading strategy, it's a technology and an approach,” explains Lauer. “It powers lots of trading strategies.
“HFT is using the fastest computers and networking equipment in a distributed manner to buy and sell anything really fast. It doesn't just have to be equity markets or futures markets like crypto. Generally speaking, it's a way to arbitrage markets,” he says.
Arbitrage is the simultaneous purchase and sale of the same asset in different markets in order to profit from small differences and fluctuations in the asset’s listed price.
An example Lauer offers is slow market arbitrage - which is a form of what he calls ‘structural trading.’
“A lot of exchanges have order types called midpoint peg orders. They fill the midpoint of the National Best Bid and Offer (NBBO). If you, as an HFT firm, know that the NBBO has changed before the exchange does, then you can pick off all the stale midpoint orders, and when the NBBO changes, you make money.”
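For illustration, the logic of that trade can be reduced to a few lines. The sketch below is purely hypothetical - the prices, the helper function, and the one-sided simplification are invented for the example, and real systems act on direct exchange feeds in microseconds or less.

```python
# Illustrative sketch of "slow market" (latency) arbitrage against a stale
# midpoint peg order. All prices are hypothetical - real systems consume
# direct feeds and act in microseconds.

def midpoint(bid: float, ask: float) -> float:
    """Midpoint of the National Best Bid and Offer (NBBO)."""
    return (bid + ask) / 2.0

# The fast trader's view of the NBBO, built from direct exchange feeds,
# updates before the slower exchange reprices its pegged orders.
fast_view_nbbo = {"bid": 10.02, "ask": 10.04}      # already reflects the move
slow_exchange_nbbo = {"bid": 10.00, "ask": 10.02}  # stale by a few microseconds

stale_mid = midpoint(**slow_exchange_nbbo)  # 10.01 - where the peg still rests
fresh_mid = midpoint(**fast_view_nbbo)      # 10.03 - where it is about to move

if fresh_mid > stale_mid:
    # Buy the resting midpoint-pegged order at the stale (lower) price;
    # once the exchange catches up, the position is worth the fresh midpoint.
    edge_per_share = fresh_mid - stale_mid
    print(f"Buy at {stale_mid:.2f}, expected edge {edge_per_share:.2f}/share")
```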
HFT firms aren’t just trying to be faster than the exchange, however. They are also in constant competition with one another to be first in line. It is this competition that has led to traders around the world constantly innovating their technology, scrounging to save a microsecond or nanosecond, at every level of their networking equipment, computing hardware, and software.
There are stock exchanges all over the world, but much of the trading that occurs is concentrated in what is known as the “New Jersey Triangle,” and Chicago.
“The New Jersey Triangle is referring to an area where several exchanges and their data centers are concentrated. You’ve got NYSE which is located in Mahwah, and then you’ve got the Carteret data center which houses NASDAQ, and then you’ve got Secaucus where all the other exchanges - anything that isn’t NYSE or NASDAQ - are located,” Cliff Maddox, director of business development at NovaSparks, a company that makes FPGAs (specialized, reprogrammable chips that can be optimized for HFT), tells DCD.
“Something like 60 percent of global trading volume happens in those three data centers (and Chicago), and it's gone up substantially over the last 15 years, from something like 33 percent to 60 percent.”
Trading firms colocate their servers in those data centers - at a premium - and the facilities are connected to one another via fiber cables or, in some cases, microwave or laser-based links.
High frequency trading: A look to the past
Throughout DCD’s investigation into HFT, one thing was consistently mentioned by all interviewees: Flash Boys. Flash Boys, written by Michael Lewis and published in 2014, in many ways exposed HFT and brought it into public consciousness. Its reception was, to say the least, mixed.
The book’s overarching message was that technological changes to trading, along with unethical trading practices, had led to the US stock exchange becoming a rigged market.
Naturally, many financial institutions - and those engaging in high-frequency trading - criticized the book and its claims.
Notably, Manoj Narang, CEO of high-frequency trading firm Tradeworx, argued that Lewis' book was more "fiction than fact." Lauer, discussing the book with DCD, offers a more measured verdict: “he got a lot right, and a lot wrong.”
“He made some basic factual errors. But those who are nitpicking what is and isn’t true are missing the point of the book. The book is about conflicts of interest and the fullness of this activity,” says Lauer. “You talk to people in HFT, and they'll tell you that book is pure fiction. But I lived that. The book is not a fiction like that.”
Towards the beginning of Flash Boys, Lewis writes about the indisputably real ‘Spread Networks’ line, a cable connecting the Chicago Mercantile Exchange with a data center in Carteret, New Jersey, that houses the NASDAQ exchange. It has since been extended to the Equinix NY4 IBX data center in Secaucus, and to another in Newark.
The line was created in near-total secrecy in 2010 (a story since fictionalized in the movie The Hummingbird Project) and was built on the premise that a straighter cable would be quicker than existing connections, giving firms that leased bandwidth on it an inherent advantage. That advantage was in the realm of a single millisecond.
The first cable cost around $300 million (at the time) to develop, and involved carving out a straight line between the two data centers, cutting through mountains and other complicated terrain. When it was developed, the first 200 stock market players who were willing to pay in advance for a five-year contract received a discount: $10.6 million, instead of $20 million.
All in all, this was around 10 times the price of standard telecoms routes. But with millions being traded in moments, and the threat that hundreds of traders might now be ahead of you in the line, that expense became worthwhile.
This was, however, 14 years ago, and in an industry where change happens at a far more rapid pace than typical technology upgrade cycles elsewhere.
With the entire sector chasing that millisecond or even nanosecond reduction, new solutions are always being sought out.
At some point, it was realized that fiber optics do not actually carry data at “the speed of light”: light travels roughly a third slower through glass than through air, and the signal bounces around inside the cable rather than taking a perfectly straight path. This led to the use of microwaves.
Usman Ahmad, chief data scientist and quantitative developer at Acuity Knowledge Partners, a research and business intelligence firm serving the financial sector, tells DCD that microwaves were really “phase two” of data transmission in the HFT industry.
“Microwaves travel through the air, and you can get a straight line directly from tower to tower. So you have a microwave emitter that transmits data, and a detector on another tower high up and it's a nice straight line and can transmit data faster than fiber optics,” explains Ahmad. “But they are unstable by themselves. Adverse weather conditions can affect them, for example.”
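The gap is easy to put rough numbers on. The sketch below is a back-of-the-envelope comparison assuming a made-up 1,200km route and a typical refractive index for silica fiber; real links add equipment and routing delays on top of raw propagation time.

```python
# Back-of-the-envelope propagation latency: fiber vs line-of-sight microwave.
# The 1,200 km route length is an arbitrary illustration, not a real path.

C = 299_792_458  # speed of light in a vacuum, m/s

def one_way_latency_ms(distance_km: float, refractive_index: float) -> float:
    """Time for a signal to cover the distance at c / refractive_index."""
    return distance_km * 1_000 / (C / refractive_index) * 1_000

route_km = 1_200
fiber_ms = one_way_latency_ms(route_km, 1.47)        # light in silica fiber: ~2/3 c
microwave_ms = one_way_latency_ms(route_km, 1.0003)  # microwaves in air: ~c

print(f"Fiber:     {fiber_ms:.2f} ms one way")     # ~5.9 ms
print(f"Microwave: {microwave_ms:.2f} ms one way") # ~4.0 ms
print(f"Saving:    {(fiber_ms - microwave_ms) * 1000:.0f} microseconds")
```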
Microwave radio services started being offered by McKay Brothers and Tradeworx in 2012.
It is highly impractical for traders to be knocked out of action simply because it is raining. Thus, while microwave links were phased in, fiber optics were retained for redundancy purposes.
Microwaves were phase two, but it is when we talk about phase three that Ahmad gets truly excited.
“What they are doing now - and I’m honestly really wowed by the technology - is they are using something a bit better and with a shorter frequency, mmWave or millimeter wave.”
MmWave refers to electromagnetic waves that have a wavelength between 10mm and 1mm and a frequency between 30GHz and 300GHz, lying between the super high-frequency band and the far infrared band.
But one of the biggest obstacles with the mmWave method is that, while it has a higher bandwidth, it travels shorter distances.
Joe Hilt, CCO at Anova Financial Networks, a major player in this newer method of transmitting trading data, explains the predicament to DCD: “Microwave technology goes a longer distance but has less bandwidth - say 550 megabits. Then, on the other hand, mmWave can give a customer, say, 1 gigabit of bandwidth, but it needs more ‘hops’ along the network.”
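The trade-off can be sketched with rough arithmetic. The hop spacings and per-relay delay below are hypothetical placeholders, not Anova's or any vendor's specifications - they are there only to show how extra relays add latency even as bandwidth goes up.

```python
# Illustrative trade-off between microwave (longer reach, fewer hops) and
# mmWave (more bandwidth, shorter reach, more hops). Hop spacing and per-hop
# delay here are hypothetical placeholders, not vendor specifications.

def relay_overhead_us(route_km: float, hop_spacing_km: float,
                      per_hop_delay_us: float) -> float:
    """Extra latency added by the relay towers along a route."""
    hops = int(route_km // hop_spacing_km)
    return hops * per_hop_delay_us

route_km = 100  # an arbitrary metro-scale path

# Assumed: microwave towers every 50 km, mmWave every 10 km, 1 us per relay.
microwave_overhead = relay_overhead_us(route_km, hop_spacing_km=50, per_hop_delay_us=1.0)
mmwave_overhead = relay_overhead_us(route_km, hop_spacing_km=10, per_hop_delay_us=1.0)

print(f"Microwave relays add ~{microwave_overhead:.0f} us (bandwidth: hundreds of Mbps)")
print(f"mmWave relays add    ~{mmwave_overhead:.0f} us (bandwidth: 1 Gbps or more)")
```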
“Anova is a little different in that we do use microwave, we do use millimeter wave, and then we do something called free space optics,” explains Hilt.
Anova began using the free space optics technology “about eight or nine years ago.” The solution has previously been used by the Department of Defense, for example for fighter jets to communicate with one another.
Free space optics - where data is transmitted by laser beam through ‘free space,’ as opposed to via a solid medium like optical fiber - is also being used by Google's parent company Alphabet in its Taara project. Taara is using the technology to bring high-speed Internet access to rural and remote areas including in India, Australia, Kenya, and Fiji, and can transmit information between terminals at speeds of up to 20 gigabits per second.
“It’s a laser technology, it's not different almost than what you use to turn your TV on and off, except we are pushing it to 10 kilometers,” he says.
Every 10 kilometers there is a ‘hop’ - a transmitter and receiver that takes the signal and passes it on to the next hop, until eventually the data reaches its destination.
Anova takes charge of every element of these networks. “We make the radios, we install them and we maintain them. Because of this, we’ve been able to get to a place where our radios can now do 10 gigabits of bandwidth instead of just one. This is major because we are in a market where customers previously would buy 10, 20, or 50 megabits, and they can now buy one gigabit of bandwidth which is almost as if they had built their own network.”
The three technologies are all used, and when it comes down to it, the measure of their advantage lies in how close they get to the data centers. HFTs are trying, according to Hilt, to use as much radio and as little fiber as possible, so those receivers need to be on the roof of a data center, or a pole next to it.
Conceptualizing these transmission methods is relatively simple. Deploying them is anything but.
Free space optic links need line of sight. In other words, while hops might be 10km apart, there cannot be any obstacle in the way between them.
Before setting up, Anova will look over the route and see what could get in the way, be it trees or a building. “If a building’s in the way, our first point of call would be to see if we can lease space on the roof,” says Hilt. “We’d put the mast right there and take the signal up over the site and down the other side. That’s the fastest way around it.”
As part of its maintenance operation, Anova will “walk” the route regularly and check if there are any new builds coming, anything from a real estate perspective that may mean they need to adjust.
HFT firms will typically lease bandwidth on Anova’s network, though on occasion the company is hired to build a private network for a client. According to Hilt, this only ever occurs on a “metro basis” - for example connecting NASDAQ to NYSE - not for long-haul routes.
Regardless of whether microwave, mmWave, or free space optic technology is used, once the data arrives at the data center, it travels through fiber optic cabling. At this point, there is an advantage in where the servers doing the trades are actually located in the data center - the shorter the cable, the lower the latency, after all. This led to HFT firms battling over the prime real estate that would bring their servers closer to the matching engine.
The flash crash
Following the 2010 Flash Crash, regulations that limited HFT latency-reducing techniques crept in.
HFT’s role in the Flash Crash is indisputable - though it was not the initial trigger.
May 6, 2010, was a clear and warm day in New York. For those not on the trading floor, it would have seemed like any other Thursday - the weekend was approaching and the weather was pleasant. By early afternoon, things had taken a turn for the worse.
The crash has to be remembered in context. May 6 began with some nervousness in the market. Traders would have awoken to troubling political and economic news from across the pond in Europe, leading to widespread negative market sentiment.
In addition to this, a mutual fund trader had initiated an unusually large, and flawed, program to sell 75,000 E-Mini S&P 500 futures contracts, valued then at $4.1 billion. The algorithm used to manage the sale accounted only for volume, not time or price. The sale, as a result, was executed in just twenty minutes, as opposed to the five or so hours that would normally be expected.
HFT algorithms, by their nature focused on being as fast as possible, were naturally the first buyers of the contracts, only to sell them on again almost immediately - passing them rapidly back and forth and creating what the Commodity Futures Trading Commission (CFTC) and the Securities and Exchange Commission (SEC) official report called the “hot-potato” effect.
Ultimately, this pushed prices down at a rapid rate, and buyers were then either unable or unwilling to provide buy-side liquidity.
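The dynamic the report describes - a sell program keyed to volume alone, feeding on the very churn it triggers - is easier to see sketched out. This is a deliberately simplified illustration, not the actual execution algorithm; the volumes and the participation rate are placeholders.

```python
# Simplified sketch of a volume-participation sell algorithm with no price or
# time constraint, of the kind described in the CFTC/SEC report. The numbers
# and the fixed participation rate are illustrative only.

def child_order_size(recent_volume: int, participation_rate: float) -> int:
    """Sell a fixed fraction of whatever traded in the last interval."""
    return int(recent_volume * participation_rate)

remaining = 75_000          # E-Mini contracts left to sell
participation_rate = 0.09   # fraction of recent volume targeted each interval

# When HFTs pass contracts back and forth ("hot potato"), measured volume
# explodes - and an algorithm keyed only to volume sells faster and faster.
for minute, recent_volume in enumerate([20_000, 60_000, 180_000, 400_000], start=1):
    size = min(child_order_size(recent_volume, participation_rate), remaining)
    remaining -= size
    print(f"Minute {minute}: volume {recent_volume:,} -> sell {size:,}, {remaining:,} left")
    if remaining == 0:
        break
```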
Other market participants began withdrawing from the market and pausing trading. Official market makers fell back on “stub quotes” - placeholder bids as low as $0.01 and asks as high as $99,999.99, prices set so far from the market that they amount to withdrawing from trading altogether.
Share values became entirely unpredictable, with some running down into practically nothing, and others skyrocketing only to spin back to earth without clear reason.
Stability was achieved by 3pm. Later, after the markets closed, the Financial Industry Regulatory Authority (FINRA) met with the exchanges, and they agreed to cancel the most extreme trades that had occurred during the Flash Crash - those executed at prices furthest from pre-crash levels.
While this can be seen as a “no harm, no foul” situation, what it demonstrated was the potential for mass instability in the market directly related to the use of technology for trading.
The process of regulating HFT has been slow, to say the least. There were calls for greater transparency immediately following the 2010 crash, but HFT firms are naturally unwilling to give away what is effectively intellectual property. The details of their strategies are the entire basis of their success, and to publicize them is to lose their edge.
In 2012, a new exchange was founded - the Investors Exchange (IEX) - which was designed to directly mitigate the negative effects of high-frequency trading.
The IEX matching engine is located across the Hudson River in Weehawken, New Jersey, while its initial point of presence is in a data center in Secaucus, New Jersey. Notably, the exchange has a 38-mile coil of optical fiber in front of its trading engine, which results in a 350-microsecond delay, designed to negate some of the speed advantages. The NYSE has subsequently adopted a similar “speed bump.”
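Those headline figures roughly check out on the back of an envelope. The calculation below assumes a typical refractive index for optical fiber, not IEX's actual cabling, and lands in the same region as the quoted 350-microsecond delay.

```python
# Back-of-the-envelope check on IEX's coil: 38 miles of fiber at roughly
# two-thirds the speed of light comes out in the region of the quoted
# 350-microsecond delay. The refractive index here is a typical value,
# not IEX's actual fiber specification.

C = 299_792_458          # speed of light in a vacuum, m/s
REFRACTIVE_INDEX = 1.47  # typical for silica optical fiber

coil_m = 38 * 1_609.34   # 38 miles in meters
speed_in_fiber = C / REFRACTIVE_INDEX
delay_us = coil_m / speed_in_fiber * 1e6

print(f"~{delay_us:.0f} microseconds one way through the coil")  # ~300 us
```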
Speaking on IEX, Lauer explains: “Everyone, regardless of where you're located, gets the same length of a cable, so now everyone's on an even playing field from a latency perspective within the data center.
“The idea with IEX, which is an important one, is that the exchange is always going to be faster than its fastest participants. And that's what really differentiates it from the other exchanges which say that HFT is going to be faster than they are.”
The current market structure is made up of stock exchanges that are self-regulatory organizations while also being for-profit, publicly traded companies. But the way exchanges make money has changed, explains Lauer.
“Exchanges don’t really make much money from trading anymore, most of it comes from data and connectivity, and from selling private feeds to HFT firms. Then there is a massive public subsidy that is paid to exchanges because of the SIP public data feed.”
According to Lauer, a big part of this shift was caused by exchanges paying rebates for passive orders and charging fees on aggressive orders, which resulted in significant spread compression (a narrowing of the gap between the best bid and the best offer).
IEX, as an exchange, takes a different approach, charging much more than other exchanges to trade, but close to nothing for data and connectivity - more or less just what those services cost to provide.
Regardless, with each exchange to an extent creating its own regulations, the real impact on “leveling the playing field” for HFT has been limited.
Lauer says of this: “[The exchanges] are beholden to their biggest customers because they have to make quarterly earnings, and they make choices that are not necessarily in the interests of fair and efficient markets, which is what their self-regulatory responsibility is supposed to be.”
Inside the data center
Regardless of the exchanges' stance on latency standardization, there are still gains to be made inside the data center, in both hardware and software.
In the cloud computing sector, major hyperscalers such as Amazon Web Services, Google, and Microsoft have increased their server life span to six years in order to reduce operational costs and reach Net Zero goals.
In comparison, Lauer tells DCD: “We were getting new servers every six to 12 months. That was the refresh cycle. Some of those would be repurposed as research clusters, but sometimes it would be like, ‘What are we going to do with all this?’ and we’d sell them. At one point, I had in my house six rack-mounted top-of-the-line servers where I was running a research cluster just for fun.”
The simple reason for these rapid refreshes is that should a piece of equipment come out that can trade faster than the one you are currently using, then it is time for an upgrade.
This constant refresh cycle and high initial capital investment can make it challenging for new entrants to get into HFT.
Christina Qi was an undergraduate student at the Massachusetts Institute of Technology when she began trading out of her dorm room.
“I was trading from my dorm room, doing quantitative trading which was using fundamental analytics to guess whether stocks would go up or down each day,” she says.
“We were using a mathematical basis, looking at the history of different prices or the relationship between different stocks, and whether they would go up or down, and that’s when we realized that the faster you are in that space, the more advantage you would have.”
Qi went on to found Domeyard LP, a high-frequency trading firm that, at its peak, traded billions in volume daily. The firm operated between 2012 and 2022 but, as Qi notes, increasingly found it hard to compete and scale.
“With HFT currently, the majority of that behavior and activity in the markets is from the top maybe 10 to 15 firms in the US that are doing that. They dominate the markets these days; you're not going to see a tiny startup come in. Back then, we were rare in that we ended up trading for many years, but it took us a long time to get started, and it was only after seven or eight years that we started trading billions.”
Eventually, Domeyard began to face constraints. The higher the frequency, the less capital the firm could actually deploy. “We ended up with a waitlist of investors that wanted to come in, and we couldn’t take any of them on. It was really frustrating. We reached a point where we would have had to branch out - for example doing longer-term trading, private equities, or real estate. But we had hyperfocused and put all our money into the basket of HFT.”
When the pandemic hit, Domeyard struggled, before eventually winding down its operations in 2022. When asked if the firm ever struggled in the face of regulations, Qi says “not really.”
“There were genuine regulations, but most of them were just reporting requirements, or sometimes it was fundraising requirements, like ‘do we need to have another disclaimer in our pitch deck for example.’ But there was nothing that would shut down high-frequency trading.”
Qi experienced a similar pace of hardware refresh and excess to that which Lauer described, noting that Domeyard would at times have FPGAs lying around the office - she jokingly adds that they would have been easy to steal, and are expensive pieces of hardware.
Servers, FPGAs, and more
The hardware primarily used by HFT firms is the Field Programmable Gate Array (FPGA) - an integrated circuit that customers can reconfigure to meet specific use case requirements.
Cliff Maddox, from NovaSparks, a major provider of FPGAs to the HFT sector, tells DCD that the benefit of an FPGA is that it can get “dramatically better latency than you can by writing it in software, and a better power footprint.”
An FPGA differs from a general-purpose chip like a CPU or GPU in that, once programmed, it is dedicated to a single specific purpose. Maddox explains this by comparing it to the Intel 8088 processor from the late 1970s.
“Fundamentally, you could give it [Intel 8088] any instruction you wanted. It might have been slower than those today, but it could do lots of different things. Computers, in general, continue to follow that instruction set - they’ve added new features, the clock speed has gone up massively, and there are more cores. But we are approaching a limit of what's possible with silicon,” says Maddox.
“Moore's Law hasn't changed much, and Moore's Law has kind of died. You're not getting double the power performance ratio every year that you used to, that's gone away. So now the only way to really expand your scale is going to specialized physical architecture.”
The FPGA is a chip that you can flash a design onto, meaning the chip does just the specific thing you want it to. It is optimized at a hardware level.
NovaSparks takes on the responsibility of finessing the FPGA for trading firms: “We sit there and optimize the hell out of it every year. We just keep making it faster, adding little tweaks to whatever it is people want and ultimately try and convince them to not build it themselves but get us to do it for them.”
Maddox similarly notes that there are around 10 firms with the resources to throw “everything” at optimizing this (citing Citadel as an example), but that it would cost them several times more to do it themselves than to simply contract NovaSparks to handle it.
The company offers 50 chips with different parameters - so HFT firms can select those that match their trading strategies - and makes regular tweaks to improve them. Major overhauls are less frequent.
“We did a physical upgrade back in 2019 to a chip that had been around for a year and a half at that point. We look at every chip that comes out, but there hasn't been one that is a significant improvement to make a full upgrade worthwhile,” explains Maddox.
Regardless, through those iterative tweaks and adjustments, NovaSparks has improved the latency of that chip by around 25 percent. Power consumption is also drastically reduced, because the FPGA is only ever doing the one thing it is optimized for.
The company is open to new solutions. “We would like to find a new chip. There are a lot of ways they are improving. For example, they’re improving the interfaces, so the 10-gig interface for Ethernet has gotten better. But when you add up all those adjustments and bring it to our scale, we haven't found something that is better enough overall to make that leap yet, but I think it's coming close.”
Maddox half-jokingly notes that NovaSparks isn’t too keen to overshare future plans - referencing the “Osborne effect.”
“The Osborne computer was something from the early 80s. When Osborne put out the computer, everybody thought it was great, and then Osborne started talking about the next computer, so no one bought the current Osborne because they were waiting for the next.”
Beyond the FPGA, there is an even more specialized version known as an ASIC - application-specific integrated circuit.
Maddox likens the FPGA to an Etch A Sketch chip, whereas an ASIC is made from scratch. “That could cost something like $30 million to spit out,” says Maddox. And while even more specialized, making an upgrade or adjustment would take months, something that isn’t necessarily in line with the HFT faster-is-better philosophy.
While the FPGA seems to dominate with those companies at the front of the HFT sprint, more general chips still have a role to play.
Blackcore Technologies specializes in servers optimized for HFT - their speed increased by “overclocking.”
“We take the relatively standard hardware, and push it beyond its standard manufacturing specifications,” James Lupton, CTO of Blackcore tells DCD.
“Overclocking is quite a long-standing thing in the IT industry. I think it's human nature to take something and try to push it faster. Ever since we first had computing, there were stories of people taking a 100 megahertz processor and trying to push it to 200 megahertz.”
To cater to HFT firms, as well as gamers and other use cases, some CPUs are released with overclocking enabled, and the chipmakers work in tandem with motherboard vendors. “It enables you to go in at a low level and adjust certain variables within a motherboard and CPU to increase the clock speed and also other technical things like the cache speed or memory speed.”
Blackcore has learned, over the eight years it has been in the market, how to walk the line between making the server faster and pushing it too far.
“A CPU that is sold is intended to work to a specific specification - the thermal and power and everything is designed around the specification that it shipped with. So when we start to play with them, you're putting more power in than they were maybe necessarily designed for.”
Lupton adds: “They're also going to be generating more heat than they were before. So we have to be very selective with making sure we've got ancillary components with a motherboard that can provide that power to the CPU safely, and that we've got the right cooling systems in place to be able to cool the extra heat that's generated from overclocking.”
As is standard with high-density systems, these servers are liquid-cooled. Blackcore offers a complete rackmount server with a self-contained liquid cooling loop inside the chassis itself, meaning customers need no additional data center infrastructure and the machines can be deployed in a standard air-cooled data center rack.
Blackcore is angling for customers who are relying on a software-based trading algorithm. Lupton acknowledges that FPGAs provide a lower latency solution, but notes the higher barrier to entry.
“It’s very, very fast, but it reduces the complexity you have - you can’t cover every single strategy in that one scenario, so a lot of clients end up needing the fast CPU as well for other strategies or for reprogramming the FPGA in real-time.”
ASICs, notably, cannot be changed once they are manufactured and are more expensive, even if they do offer still lower latency.
HFT: A valuable contribution to society, or a drain on resources?
The sheer quantity of money and brain power that is put into HFT and shaving off that last nanosecond is monumental, and those who stand to benefit from the technique will maintain that it has a positive impact on the economy and wider society.
They will also fight, tooth and nail, against any move that seems to damage their efforts with latency arbitrage.
In fact, the firms engaging in these practices go so far as to deny their existence. According to Lauer, following the publication of Flash Boys, many firms would argue that “latency arbitrage used to happen, but it's not happening anymore.” Yet when IEX attempted to introduce a new stock order type that would mitigate some of those latency-based advantages, Citadel sued the SEC, in 2020.
For context, it is worth noting that Citadel was fined $22.6 million in 2017 for “Misleading Clients About Pricing Trades” - charges that its business unit handling retail customer orders from other brokerage firms made misleading statements to them about the way it priced trades.
The order type in question was the “D-Limit.” In simplistic terms, the D-Limit uses the same math that underlies many high-frequency trading models to predict when a price change is imminent; when that signal is triggered, it prevents trades from occurring until the pegged orders have been repriced, removing the opportunity for HFTs to pick them off.
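IEX's real signal is a statistical model trained on quote updates across venues. The sketch below is a heavily simplified stand-in, with invented thresholds and data, meant only to illustrate the idea of detecting a ‘crumbling’ quote and repricing pegged orders before they can be picked off.

```python
# Heavily simplified illustration of a "crumbling quote" style check, in the
# spirit of (but not equivalent to) the model behind IEX's D-Limit order type.
# Thresholds and quote data are invented for the example.

def quote_is_crumbling(venue_bids: list[float], previous_bids: list[float],
                       min_venues_dropping: int = 2) -> bool:
    """Flag an impending down-tick when several venues have already lowered
    their best bid relative to the last snapshot."""
    dropping = sum(1 for now, before in zip(venue_bids, previous_bids) if now < before)
    return dropping >= min_venues_dropping

previous_bids = [10.02, 10.02, 10.02, 10.02]  # best bid on four venues, last tick
venue_bids    = [10.01, 10.02, 10.01, 10.01]  # three venues have already dropped

if quote_is_crumbling(venue_bids, previous_bids):
    # A D-Limit-style order would be repriced (or held) here instead of
    # resting at the old level where a faster trader could pick it off.
    print("Quote crumbling: reprice pegged orders before accepting executions")
```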
If latency arbitrage were truly ‘no longer used,’ the D-Limit would pose no problem. To HFT firms, evidently, it did - hence the Citadel lawsuit, in which the firm claimed the order type would harm retail investors by delaying trades.
It should be added that the SEC, and firms representing retail investors and their interests, were in favor of the proposal. Citadel lost the suit in 2022.
Citadel also led a suit seeking to have the Consolidated Audit Trail - a database that in May 2024 began collecting almost all US trading data, giving the SEC insight into activity across the market - struck down.
Citadel, while seemingly a leading force in the fight against regulating these financial markets, is not alone in arguing that these trading methods actually deliver some sort of benefit.
In an opinion piece, Max Dama, a quantitative researcher at HeadlandsTech, kicks off with: “There is a common misunderstanding, even among practitioners, that low-latency trading is a waste of human talent and resources that could instead go to advancing physics or curing cancer.”
Dama goes on to claim that HFT benefits markets by reducing bid-ask spreads, thus reducing the cost of trading, and by speeding up the handling of supply and demand negotiations to uncover “true” prices.
Dama notes the disillusionment within the industry itself, stating that those who feel that way typically fall into two groups: “People working at smaller, less-profitable firms that worry their lack of traction is due to latency being an ‘all or nothing game’; and second, people at big firms that feel their micro-optimization work maintains a moat for another year or two but will be replaced shortly and not have any lasting value to humanity.
“They both think, isn’t it wasteful to invest so much in ASICs (he goes on to note that few did opt for the ASIC approach) in a race to the bottom to shave off five nanoseconds to capture a winner-takes-all profit in a zero-sum game? Isn’t it a misallocation of resources for so many companies to build something, for only one to win, and the rest to have squandered resources and time?”
While Dama goes on to dispute these claims, well, aren’t these valid questions?
“It’s a set of really brilliant people that could be doing a lot of very different things in this world,” Lauer says. “It's a damn shame.
“These are geniuses - some of the smartest people in the world. And instead of solving real problems, this is what they're doing.
“They're trying to eke out a couple of nanoseconds so that they can make rich people a little richer. It’s a really poor use of our collective intellect.”