The latest threat against the safe operation of the internet is not malware, spy agencies or some other form of deliberate, hostile action.
Instead, it’s network operators using software to “optimise” their traffic routing.
This seems reasonable enough, but the effect of the “optimisation” is to upset the delicate balance that networks achieve through mutual routing agreements, using the Border Gateway Protocol (BGP) to pass traffic between each other.
At the end of June this year, a short message was posted onto the North American Network Operators’ Group (NANOG) mailing list asking if anyone was seeing issues with Cloudflare.
The issue was confirmed by other operators; it would be a major one that caused constipation for a large chunk of the internet.
A company in Pennsylvania had installed a BGP Optimiser, in order to split traffic within its own network into smaller segments, to steer it in more specific directions.
That’s not necessarily a recipe for disaster. But the Pennsylvania company’s upstream provider, Verizon, did not filter out the fake traffic split announcement and instead passed it on to other transit networks and Cloudflare, and that most certainly was.
Having been told that one part of Verizon and the Pennsylvania company was the best path to Cloudflare, AWS and other popular destinations, other networks dutifully sent traffic to them.
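Why a fake more-specific announcement pulls traffic away from the real route comes down to longest-prefix matching: routers forward a packet along the most specific prefix that covers the destination. The sketch below is a minimal illustration of that one rule, not a model of full BGP best-path selection. The function name best_route, the next-hop labels and the prefixes (taken from the documentation range 203.0.113.0/24) are all assumptions made for the example.

```python
import ipaddress

def best_route(routes, destination):
    """Return (network, next_hop) for the longest prefix covering destination.

    routes is a list of (prefix_string, next_hop_label) pairs.
    Returns None when no prefix covers the destination.
    """
    addr = ipaddress.ip_address(destination)
    matches = []
    for prefix, next_hop in routes:
        network = ipaddress.ip_network(prefix)
        if addr in network:
            matches.append((network, next_hop))
    if not matches:
        return None
    # The most specific (longest) prefix wins, whatever its origin.
    return max(matches, key=lambda match: match[0].prefixlen)

# Hypothetical routing table before the leak: one legitimate /24.
legit = [("203.0.113.0/24", "legitimate origin")]

# After the leak: the same /24 plus two fake /25 halves of it.
leaked = legit + [
    ("203.0.113.0/25", "leaked more-specific"),
    ("203.0.113.128/25", "leaked more-specific"),
]

if __name__ == "__main__":
    print(best_route(legit, "203.0.113.10"))
    print(best_route(leaked, "203.0.113.10"))
```

In this toy table the two /25 routes together cover every address in the /24, so once they are accepted no traffic follows the legitimate route any more.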
They were not equipped to handle the massive influx of data. Cloudflare said the route leak resulted in 15 percent of its global traffic being lost during the incident.
In essence, the “BGP optimisation” became a massive and costly denial of service attack.
Job Snijders, Internet Architect at NTT, made it clear two years ago that using BGP Optimisers that generate fake more-specific announcements is irresponsible and needs to stop.
“I strongly recommend to turn off those BGP optimisers, glue the ports shut, burn the hardware, and salt the grounds on which the BGP optimiser sales people walked,” Snijders wrote on NANOG.
Two years later at the 48th Asia Pacific Network Information Centre conference in Chiang Mai, Snijders is still having to repeat the message.
Why do operators use BGP Optimisers?
Snijders explained that operators are sold BGP optimising software as a very cheap fix for issues like not having full-time network engineers employed, not having a good choice of upstream providers, or having difficult-to-navigate multi-year traffic level commitments on data circuits.
“BGP Optimisers are marketed as a cost-saving measure basically,” Snijders told iTnews.
“If I were a C-level executive, BGP Optimisers would seem to tick a lot of boxes for my business.
“There’s no evidence that BGP Optimisers work that way but they come with a dashboard.”
Compounding the issue is that current traffic tools are lacking in smarts, and BGP Optimiser companies try to fill that gap with products that break the rules.
Snijders noted that BGP Optimiser vendors are not at the Internet Engineering Task Force (IETF) forum that seeks to hammer out open standards for everyone to follow.
The best defence
Unfortunately, BGP Optimisers remain on the internet today, so how can operators defend their networks against them?
At a conference talk, APNIC chief scientist Geoff Huston was pessimistic that the “screaming car wreck that is BGP and its phenomenal insecurity” is fixable.
Nevertheless, Huston and other network techies say there are steps that network operators can take to stop the worst route leak transgressions.
Among these, operators should deploy Resource Public Key Infrastructure (RPKI)-based BGP origin validation to protect their networks, Snijders advised.
To protect other networks, operators should create Route Origin Attestations (ROAs) of BGP announcements, cryptographically signed and verifiable with RPKI.
ROAs attest that the originating Autonomous System (AS) network domain really is authorised to announce route prefixes.
APNIC made a big push to have operators sign ROAs at the Chiang Mai meeting, and large networks such as AT&T now filter based on RPKI, dropping invalid prefixes.
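How a filtering network decides which announcements to drop can be sketched roughly as follows. This is a simplified take on the origin validation procedure described in RFC 6811, assuming the ROAs have already been cryptographically verified and reduced to (prefix, maximum length, origin AS) triples. The ROA class, the validate_origin function, the AS numbers and the prefixes are illustrative assumptions drawn from documentation ranges, not any vendor's implementation.

```python
import ipaddress
from typing import NamedTuple

class ROA(NamedTuple):
    """A validated ROA reduced to the three fields used for origin checks."""
    prefix: str
    max_length: int
    origin_as: int

def validate_origin(roas, announced_prefix, origin_as):
    """Classify an announcement as 'valid', 'invalid' or 'not-found'."""
    route = ipaddress.ip_network(announced_prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # A ROA covers the route if the route sits inside the ROA prefix.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        # A match also needs the right origin AS and a permitted length.
        if route.prefixlen <= roa.max_length and roa.origin_as == origin_as:
            return "valid"
    # Covered by some ROA but matched by none: invalid. No coverage: not-found.
    return "invalid" if covered else "not-found"

# Hypothetical ROA: AS64500 may originate 203.0.113.0/24, and nothing longer.
roas = [ROA("203.0.113.0/24", 24, 64500)]

if __name__ == "__main__":
    print(validate_origin(roas, "203.0.113.0/24", 64500))
    print(validate_origin(roas, "203.0.113.0/25", 64500))
    print(validate_origin(roas, "203.0.113.0/24", 64501))
    print(validate_origin(roas, "198.51.100.0/24", 64500))
```

In this sketch a fake more-specific announcement comes out as invalid even when it carries the correct origin AS, because it is longer than the ROA's maximum length. A network that drops invalids would discard it, while routes with no covering ROA remain not-found and are typically still accepted.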
A recent study pointed to a large increase in the number of AS domains implementing RPKI filtering, up from 50 in 2018 to 616 in September of this year.
A ghost from the past also needs to be finally laid to rest to improve internet routing security: the lack of direct peering between networks.
Telcos and providers directly peering, or connecting to each other, at internet exchanges or elsewhere not only improves performance and is more economical, but also shortens the AS path that traffic has to traverse.
Not having to route traffic between more networks than necessary, and when possible keeping it local, is a security feature, Snijders’ research shows.
Using BGP might indeed be similar to continuing to build houses that burn down, as Huston said, but RPKI and other measures make it a little harder to set them alight.
Juha Saarinen attended APNIC 48 in Chiang Mai as a guest of APNIC.