DDoS understandably makes for a lot of headlines and is the source of genuine concern on the modern Internet. For example, the recent memcached attacks were remarkable for the volume of traffic generated through amplification (1.7Tbps) and are the latest reminder that a well-designed network edge needs a solid DDoS mitigation solution in place.
Beyond the headlines, the data tells us that while DDoS is an increasing problem, the growth is in more complex attacks that target applications rather than infrastructure.
Before getting into the remediation, let's first look at some basic questions.
DDoS Traffic
A Cisco partner, Arbor Networks, publishes an annual report with a lot of great data on security threats. Arbor ATLAS, which monitors over one third of all Internet traffic, shows that the total volume of DDoS attacks has been relatively flat since May 2016, hovering around 0.5 Pbps (petabits per second) per month.
The next two charts show that the number of attacks has been (slightly) on the decline in the last year, but the average attack size has increased.
Let's put these impressive numbers into context.
According to Cisco VNI, internet traffic is currently growing at up to 30% CAGR (with many carriers planning for higher growth internally), driven predominantly by video, with up to 80% of the growth coming from content delivery networks and content providers. The VNI report also states that, by 2021, global IP traffic will reach 3.3 ZB per year, or 278 EB per month.
So, while individual DDoS attacks may be growing in size, the Arbor ATLAS and Cisco VNI data tell us that DDoS is growing more slowly than overall internet traffic. Additionally, the biggest source of traffic is video from content networks, which are not (usually) origins of DDoS attacks.
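To make the comparison concrete, here is a rough back-of-the-envelope sketch. It assumes that total DDoS volume stays flat (0% growth, as the ATLAS data suggests) while overall traffic grows at the upper-bound 30% CAGR from VNI. Both are simplifications, and the function name is ours, not something from either report.

```python
def relative_ddos_share(years: float,
                        traffic_cagr: float = 0.30,
                        ddos_cagr: float = 0.0) -> float:
    """DDoS share of total traffic after `years`, relative to today's share (1.0).

    Assumption: both quantities compound annually at a constant rate.
    """
    return ((1.0 + ddos_cagr) / (1.0 + traffic_cagr)) ** years


if __name__ == "__main__":
    for years in (1, 2, 3):
        share = relative_ddos_share(years)
        print(f"after {years} year(s): {share:.2f}x today's share")
```

Under these assumptions, the DDoS share of total traffic falls to less than half of its current value within three years.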
You can read more about evolving traffic patterns in a technical paper we published earlier this year.
This has important implications when it comes to designing a modern peering architecture because it clearly shows peering hardware should optimize for bandwidth growth, with fast reaction capabilities to throttle attacks. For the foreseeable future, peering routers should be as scalable and cost-effective as possible to satisfy these priorities.
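Clearly, attackers are using more advanced DDoS tools and launching harder-hitting attacks than before. At a high level, there are two types of attacks that our customers encounter in the wild: volumetric attacks that flood the infrastructure with traffic, and application-layer attacks that target application endpoints.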
The DDoS attacks of today are very different from the attacks seen a few years ago. As DDoS defenses have become more effective, attackers have turned to using more advanced attacks, often focusing on application vulnerabilities instead of flooding attacks. Volumetric attacks are sometimes used to distract security response teams while more sophisticated attacks, such as application and infrastructure penetrations, are attempted.
For carriers, DDoS solutions have traditionally focused on NetFlow exports collected at peering routers with intelligent analytics systems aggregating and analyzing these flow records (along with other data such as BGP NLRI). As large volumetric attacks occur without warning, it's important to quickly identify attack sources and traffic types in order to understand the nature of the attack and plan a response.
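As an illustration of the kind of aggregation such an analytics system performs, here is a minimal sketch. The `FlowRecord` type, its fields, and the sample data are hypothetical simplifications (real NetFlow exports carry many more fields), and the addresses come from the documentation ranges.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple, Tuple


class FlowRecord(NamedTuple):
    """A simplified, hypothetical flow record."""
    src_ip: str
    dst_ip: str
    protocol: int   # IP protocol number (6 = TCP, 17 = UDP)
    src_port: int
    dst_port: int
    octets: int     # bytes counted in sampled packets only


TrafficKey = Tuple[str, int, int]  # (dst_ip, protocol, src_port)


def top_traffic_types(records: Iterable[FlowRecord],
                      sampling_rate: int,
                      n: int = 3) -> List[Tuple[TrafficKey, int]]:
    """Estimate bytes per (victim, protocol, source port) and return the top n.

    Sampled byte counts are scaled by the 1-in-N sampling rate to
    approximate the real volume.
    """
    totals: Counter = Counter()
    for rec in records:
        key = (rec.dst_ip, rec.protocol, rec.src_port)
        totals[key] += rec.octets * sampling_rate
    return totals.most_common(n)


# Illustrative data only: two reflectors hitting one victim, plus normal HTTPS.
SAMPLE_RECORDS = [
    FlowRecord("198.51.100.7", "192.0.2.10", 17, 11211, 40000, 1_400_000),
    FlowRecord("198.51.100.9", "192.0.2.10", 17, 11211, 40001, 1_350_000),
    FlowRecord("203.0.113.5", "192.0.2.10", 6, 51000, 443, 20_000),
    FlowRecord("203.0.113.6", "192.0.2.20", 6, 51001, 443, 15_000),
]

if __name__ == "__main__":
    for key, est_bytes in top_traffic_types(SAMPLE_RECORDS, sampling_rate=1000):
        print(key, est_bytes)
```

Grouping by destination, protocol, and source port is just one possible key. It happens to suit reflection attacks, where many sources share the same source port.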
In almost all cases, it is possible to mitigate volumetric attacks at the edges of the network using the five-tuple parameters and packet length of the attack, thereby dropping the offending traffic at the peering router. This avoids having to transport the attack traffic to dedicated cleaning centers. In these scenarios, flow analysis tools can rapidly program the network edge to drop flows using tools such as BGP FlowSpec.
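The matching logic behind such a filter can be sketched in a few lines. This is a software model for illustration only, not how a router or BGP FlowSpec implements it. The `Packet` and `DropRule` types are hypothetical, and the example rule uses UDP source port 11211 (the memcached port) with an assumed packet-length range.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional, Tuple


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    protocol: int   # IP protocol number (6 = TCP, 17 = UDP)
    src_port: int
    dst_port: int
    length: int     # packet length in bytes


@dataclass(frozen=True)
class DropRule:
    """Five-tuple plus packet-length match; a field left as None is a wildcard."""
    dst_prefix: str
    src_prefix: Optional[str] = None
    protocol: Optional[int] = None
    src_port: Optional[int] = None
    dst_port: Optional[int] = None
    length_range: Optional[Tuple[int, int]] = None  # inclusive (min, max)

    def matches(self, pkt: Packet) -> bool:
        if ip_address(pkt.dst_ip) not in ip_network(self.dst_prefix):
            return False
        if (self.src_prefix is not None
                and ip_address(pkt.src_ip) not in ip_network(self.src_prefix)):
            return False
        for wanted, actual in ((self.protocol, pkt.protocol),
                               (self.src_port, pkt.src_port),
                               (self.dst_port, pkt.dst_port)):
            if wanted is not None and wanted != actual:
                return False
        if self.length_range is not None:
            low, high = self.length_range
            if not low <= pkt.length <= high:
                return False
        return True


# Illustrative rule: large UDP packets from source port 11211 toward one victim.
RULE = DropRule(dst_prefix="192.0.2.10/32", protocol=17, src_port=11211,
                length_range=(1000, 1500))

ATTACK = Packet("198.51.100.7", "192.0.2.10", 17, 11211, 40000, 1400)
LEGIT = Packet("203.0.113.5", "192.0.2.10", 6, 51000, 443, 600)

if __name__ == "__main__":
    print("attack dropped:", RULE.matches(ATTACK))
    print("legit dropped:", RULE.matches(LEGIT))
```

Keeping the rule as narrow as the attack allows (here, one victim address, one protocol, one source port, and a length range) limits the risk of dropping legitimate traffic.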
Of course, IOS XR supports both NetFlow and BGP FlowSpec. The application in peering design is documented in more detail here.
With increasing use of FlowSpec, these NetFlow analytics systems can now move past detection of attack patterns and directly signal dynamic ACLs to drop traffic at line-rate in ingress routers. The main advantage is speed and automation. This allows carriers to quickly determine the nature of an attack and trigger mitigation in the infrastructure from a single interface.
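The closed loop described here can be sketched as a simple reconciliation step: compare what the analytics system currently sees against the filters already installed, then announce or withdraw rules accordingly. Everything below is a hypothetical simplification. The function names, the single fixed threshold, and the traffic keys are assumptions, and the actual FlowSpec announcement by a BGP speaker is not shown.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple


def reconcile(observed_bps: Dict[Hashable, float],
              active: Set[Hashable],
              threshold_bps: float) -> Tuple[Set[Hashable], Set[Hashable]]:
    """Return (rules to announce, rules to withdraw) for one measurement interval."""
    wanted = {key for key, bps in observed_bps.items() if bps >= threshold_bps}
    return wanted - active, active - wanted


def run_loop(intervals: Iterable[Dict[Hashable, float]],
             threshold_bps: float) -> List[Tuple[list, list]]:
    """Apply reconcile() over successive intervals and log every change."""
    active: Set[Hashable] = set()
    log = []
    for observed in intervals:
        announce, withdraw = reconcile(observed, active, threshold_bps)
        # In a real deployment, this is where rules would be signaled to the routers.
        active = (active | announce) - withdraw
        log.append((sorted(announce), sorted(withdraw)))
    return log


# Illustrative key: (victim address, IP protocol, source port).
KEY = ("192.0.2.10", 17, 11211)
INTERVALS = [{KEY: 5e9}, {KEY: 8e10}, {KEY: 9e10}, {KEY: 1e9}]

if __name__ == "__main__":
    for step, (announce, withdraw) in enumerate(run_loop(INTERVALS, 1e10)):
        print(step, "announce:", announce, "withdraw:", withdraw)
```

A production system would likely add hysteresis or hold timers so that rules are not withdrawn the moment traffic dips below the threshold. That is left out here to keep the loop readable.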
The recent memcached attack is a case in point. It was, in fact, a basic volumetric attack and fairly straightforward to detect with NetFlow and block with traditional five-tuple packet filters deployed using BGP FlowSpec. If you want to know more, details of the memcached attack are available here and here, as well as discussed during a recent NANOG talk.
In fact, according to our data, NetFlow-based flow analysis has been successful in rapidly detecting all known volumetric-type DDoS attacks. Moreover, according to analysis done by Arbor and Cisco, of the approximately 713 TB of analyzed DDoS attacks in June 2018, all the volumetric attacks could have been handled with a closed loop of NetFlow analytics and BGP FlowSpec configuration, using automated and, exceptionally, expert-based analysis.
But attackers are motivated and well aware of the techniques used to mitigate their attacks. Because carrier-based DDoS controls have traditionally been manually driven, attackers know that the simplest way to avoid traditional defenses is to change the pattern of their attack faster than operators can detect it and install new filters. In some cases, this is as simple as rotating port numbers or other simple patterns in their attacks.
Attackers are also becoming more sophisticated and are now actively fixing their attack tools to avoid easy-to-spot patterns in the packet headers. Many of the signatures used to detect attacks in the past relied on such errors in manually crafted packets.
Today, DDoS mitigation requires a multi-layered approach: traffic awareness through flow analytics for controlling volumetric attacks that threaten infrastructures, and more sophisticated application layer protections (using protocol "scrubbers") to address the more complex "low and slow" attacks that now target application endpoints.
Customers sometimes ask us if it's possible to detect and respond to application-level attacks using the multi-terabit-class ASICs in our Service Provider router portfolio.
Volumetric attacks are usually easy to detect with sampled or hardware-native flow identification: the really large volumes can't "hide in the noise" the way that the much lower volume application layer attacks can.
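A quick calculation shows why sampling favors large flows. The sketch below assumes independent 1-in-N random packet sampling, which is a simplification of how routers sample. The packet counts and the 1-in-1000 rate are made-up illustrative values.

```python
import math


def expected_samples(packets: int, sampling_rate: int) -> float:
    """Expected number of sampled packets for a flow under 1-in-N sampling."""
    return packets / sampling_rate


def detection_probability(packets: int, sampling_rate: int) -> float:
    """Probability that at least one packet of the flow is sampled.

    Assumes each packet is sampled independently with probability 1/N,
    so P(seen) = 1 - (1 - 1/N) ** packets, computed in a numerically stable way.
    """
    if packets <= 0:
        return 0.0
    if sampling_rate <= 1:
        return 1.0
    return -math.expm1(packets * math.log1p(-1.0 / sampling_rate))


if __name__ == "__main__":
    rate = 1000  # hypothetical 1-in-1000 sampling
    for label, packets in (("volumetric flow", 10_000_000),
                           ("low-and-slow flow", 200)):
        print(f"{label}: expected samples = {expected_samples(packets, rate):g}, "
              f"P(seen at least once) = {detection_probability(packets, rate):.3f}")
```

With these example numbers, the volumetric flow yields thousands of samples per interval, while the low-volume flow is more likely than not to go unsampled altogether.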
Unfortunately, with application-level attacks, it's not so easy. Even if there is a recognizable "signature" to look for, detection would mean ASIC-level flow matching of arbitrary patterns found deep in packets. But the nature of very fast silicon means that these on-chip flow tables are limited in size to thousands or tens of thousands of flows. At peering points, the number of flows to look at is many orders of magnitude higher, rendering this sort of on-chip tracking impractical.
Equally important, high-speed packet hardware is optimized to work on network packets that have well-defined and relatively compact headers. Matching longer, more complex, or variable-length fields (which are extremely common in filters that look for URLs or DNS record strings) requires a completely different set of silicon/NPU choices, and those choices significantly reduce the total bandwidth capability of the device. That is not a good trade-off to make given the huge growth in traffic volumes discussed earlier.
The pragmatic way to mitigate complex DDoS attacks without sacrificing the bandwidth needed to keep up with future traffic growth is to sample packets and push the collection and analysis to external systems that offer a breadth of analytics and scale. Then, focus first on eliminating the single slowest part of remediation: manual configuration. Automating mitigations via FlowSpec, driven by intelligent analytics, is today's best practice for coping with large-scale DDoS attacks.
If you want to know more about Cisco's Intelligent Peering solution, read this great white paper or listen to either this podcast or this Cisco Knowledge Webinar.