Research Spotlight: Detecting Algorithmically Generated Domains


This post was authored by Mahdi Namazifar and Yuxi Pan

Once a piece of malware has been successfully installed on a vulnerable system, one of its first orders of business is to reach out to remote command-and-control (C&C) servers to receive further instructions and updates and/or to exfiltrate valuable user data. If the rendezvous points with the C&C servers are hardcoded in the malware, the communication can be cut off effectively by blacklisting, which limits the malware's further operation and the extent of its damage.

To avoid such static detection mechanisms, attackers have recently been taking advantage of various Domain Generation Algorithms (DGA) to choose and update the domain names of their C&C servers. A DGA embedded in the malware generates a large number of pseudo-random domain names within a given period, most of which are nonexistent. With the same random seed, e.g. the time of day or the most popular tweets of the day, the attackers can generate exactly the same list of domain names remotely, of which they register only a few. The malware contacts some or all of the domains generated by the DGA, which gives it an opportunity to connect to the C&C server. The sheer number of nonexistent domains produced by the DGA on a daily basis places a great burden on security specialists if blacklisting is still to be pursued.
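To make the shared-seed idea concrete, here is a minimal sketch of a date-seeded generator. It is not taken from any real malware family: the function name `toy_dga`, the use of SHA-256, and the 15-character label length are all assumptions chosen for illustration. The only point is that two parties holding the same seed derive the same list independently.

```python
import hashlib
from datetime import date


def toy_dga(seed_date, count=5, length=15):
    """Derive `count` pseudo-random labels from a date seed (illustrative only)."""
    labels = []
    for i in range(count):
        # Hash the seed together with a counter to get a fresh digest per label.
        digest = hashlib.sha256(f"{seed_date.isoformat()}-{i}".encode()).digest()
        # Map each of the first `length` digest bytes to a lowercase letter.
        labels.append("".join(chr(ord("a") + b % 26) for b in digest[:length]))
    return labels


# The "malware" and the "attacker" run the same code with the same seed.
on_victim = toy_dga(date(2013, 11, 5))
on_attacker = toy_dga(date(2013, 11, 5))
print(on_victim == on_attacker)  # True: both sides derive the same list
```

Because the list changes whenever the seed changes, a blacklist built from one day's names says nothing about the next day's.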

One of the counter-strategies against malware equipped with DGA is to reverse-engineer the algorithm and use it to automatically generate the target blacklist. Although this method is highly effective if the reverse-engineering is successful, it is a resource- and time-consuming task. New DGA detection methods have been proposed[1][2] which analyze the statistical and linguistic features of the domain names and the related network traffic to determine whether they are generated algorithmically. In this blog post we introduce the first component of our DGA detection system. This component utilizes a novel language-based technique for detecting strings that are generated by chaining random characters, with the assumption that most of the DGA generate domain names which can pass various randomness tests in order to avoid conflicts with existing legitimate domains.

The output of our component is a randomness score on which a decision can be made as to whether the input is an algorithmically generated domain name. To calculate this score, we started by building a large set of dictionaries encompassing various languages, e.g. English, French, Chinese, etc. For some languages, different versions or dialects were built as well. The set also includes constructed languages, e.g. Esperanto and Interlingua. In total, over 60 dictionaries were assembled to cover different languages; they are listed in Figure 1. In addition, English names from US 1990 census data, Scrabble words, Alexa 1000 domain names and texting acronyms have been added to the set. The goal of constructing these dictionaries is to identify non-random sequences of characters or words in the domain names, which are less likely to appear in a DGA-generated name.

Figure 1. The list of language dictionaries

For each domain name under investigation, we then enumerate all of its substrings whose initial or final character aligns with that of the original string, i.e. its prefixes and suffixes. Dashes and underscores in the domain names help with the tokenization as well. Features involved in computing the randomness score include:

1. the number of dictionary hits of the substrings
2. the length of substrings that appeared in a dictionary
3. the number of different languages needed to cover the substrings
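The three features above can be sketched as follows. This is a simplified illustration, not the production component: the two tiny dictionaries, the minimum substring length of 3, and the greedy approximation used for "languages needed" are all assumptions, and the helper names are hypothetical.

```python
import re

# Tiny stand-in dictionaries; the system described here uses over 60 far larger ones.
DICTIONARIES = {
    "english": {"rank", "net", "work", "network", "book", "face"},
    "french": {"livre", "net", "visage"},
}


def affix_substrings(token, min_len=3):
    """All prefixes and suffixes of `token` with at least `min_len` characters."""
    subs = set()
    for n in range(min_len, len(token) + 1):
        subs.add(token[:n])   # prefix of length n
        subs.add(token[-n:])  # suffix of length n
    return subs


def languages_needed(hit_langs):
    """Greedy estimate of how many languages are needed to cover every hit."""
    uncovered = set(hit_langs)
    chosen = 0
    while uncovered:
        # Pick the language that covers the most uncovered hits (ties: by name).
        best = max(sorted(DICTIONARIES),
                   key=lambda lang: sum(lang in hit_langs[s] for s in uncovered))
        uncovered = {s for s in uncovered if best not in hit_langs[s]}
        chosen += 1
    return chosen


def extract_features(label):
    """Compute the three features for one second-level domain label."""
    hit_langs = {}
    # Dashes and underscores act as token boundaries.
    for token in re.split(r"[-_]", label.lower()):
        for sub in affix_substrings(token):
            langs = {name for name, words in DICTIONARIES.items() if sub in words}
            if langs:
                hit_langs[sub] = langs
    return {
        "hits": len(hit_langs),
        "hit_length": sum(len(sub) for sub in hit_langs),
        "languages": languages_needed(hit_langs),
    }


print(extract_features("network"))          # {'hits': 3, 'hit_length': 14, 'languages': 1}
print(extract_features("uqxypfdjiwwrdvi"))  # {'hits': 0, 'hit_length': 0, 'languages': 0}
```

A readable label such as "network" produces several long hits within a single language, while a label built from random characters produces none.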

A linear model was built to calculate the randomness score. The weights on the feature values, as well as the thresholds involved in the decision, were carefully tuned against legitimate domain names in the Alexa dataset and malicious ones generated by a variety of reverse-engineered DGA. Figure 2 shows the false negative rate on a set of algorithmically generated domains produced by 9 different DGA, together with some sample domain names missed by our detection method.
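A linear scoring rule of this kind can be sketched in a few lines. The weights, bias and threshold below are invented placeholders, not the tuned values, and the sign choices (dictionary hits lower the score, needing more languages raises it) are assumptions about how such a model would plausibly behave.

```python
# Hypothetical parameters for illustration; the tuned values are not published here.
WEIGHTS = {"hits": -0.5, "hit_length": -0.1, "languages": 0.4}
BIAS = 1.0
THRESHOLD = 0.5


def randomness_score(features):
    """Weighted sum of the feature values plus a bias term."""
    return BIAS + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)


def looks_generated(features):
    """Flag the label as algorithmically generated when the score reaches the threshold."""
    return randomness_score(features) >= THRESHOLD


# Feature values for a label with no dictionary hits versus a readable one.
no_hits = {"hits": 0, "hit_length": 0, "languages": 0}
readable = {"hits": 3, "hit_length": 14, "languages": 1}
print(looks_generated(no_hits))   # True
print(looks_generated(readable))  # False
```

Tuning then amounts to choosing the weights and the threshold so that known DGA output lands above the threshold and known benign names land below it.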

Figure 2. False negative rate of our DGA detection component

Figure 3 shows the result of running our method on the Alexa 10000 domains dataset, which is believed to consist only of benign domains. Some of these domain names were classified as randomly generated by our detection method and therefore constitute the false positive sample.

Figure 3. False positive samples of our DGA detection component
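The false negative and false positive rates discussed above both depend on where the decision threshold is placed. The sketch below shows how the two rates could be computed for a given threshold. The scores are made-up toy values, not measurements from our datasets, and `error_rates` is a hypothetical helper.

```python
def error_rates(dga_scores, benign_scores, threshold):
    """Return (false_negative_rate, false_positive_rate) at `threshold`."""
    # A DGA domain scoring below the threshold is missed (false negative).
    missed = sum(score < threshold for score in dga_scores)
    # A benign domain scoring at or above the threshold is flagged (false positive).
    flagged = sum(score >= threshold for score in benign_scores)
    return missed / len(dga_scores), flagged / len(benign_scores)


# Made-up randomness scores for illustration only.
dga_scores = [0.9, 0.8, 0.7, 0.4]
benign_scores = [0.1, 0.2, 0.6, 0.3, 0.05]

for threshold in (0.3, 0.5, 0.75):
    fnr, fpr = error_rates(dga_scores, benign_scores, threshold)
    print(f"threshold={threshold:.2f}  FNR={fnr:.2f}  FPR={fpr:.2f}")
```

Raising the threshold trades false positives for false negatives, which is why the threshold has to be tuned against both the benign and the DGA datasets.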

As a real-world example, we looked into the domain names generated by an actual Cryptolocker malware[3]. Cryptolocker encrypts all the files on a victim system using random keys that it generates itself. It then encrypts these keys using RSA with a public key received from the C&C server. Users of the system are asked to pay the attacker if they want to unlock their files. For this reason Cryptolocker is classified as ransomware. Cryptolocker connects to its C&C server to receive the public key by contacting DGA-generated domains. Ref. [3] displays the result of a tcpdump capture taken on a system right after infection. The second-level domain labels include:

uqxypfdjiwwrdvi
uqxypfdjiwwrdvi
iqnueumtiyugvjt
iqnueumtiyugvjt
vmivkpqyunlqfpl
vmivkpqyunlqfpl
ptiautjthpnxdcw
ptiautjthpnxdcw
qednmophtxheusk
qednmophtxheusk
qpswpewjtgcwdqq
qpswpewjtgcwdqq
rankhydwgovdetm

Our method successfully identified these 13 domain names as DGA-generated domains.

The current DGA detection component has been operating on our private cluster, analyzing up to 200,000 newly registered domain names daily, of which 6,000 are detected as suspicious DGA domains. Like any machine learning algorithm, its precision and recall can be adjusted through a few parameters, depending on the DGA domains being targeted. Fighting cyber-criminals is like a chess match, and the DGAs adopted today are becoming more complicated, which calls for more sophisticated detection algorithms. Nevertheless, the simplicity of the algorithm and its effectiveness on the prevalent DGA domains make it an essential component in our threat detection portfolio.

[1]. Sandeep Yadav, Ashwath Kumar Krishna Reddy, A.L. Narasimha Reddy, and Supranamaya Ranjan. 2010. Detecting algorithmically generated malicious domain names. In Proceedings of the 10th ACM SIGCOMM conference on Internet measurement (IMC '10). ACM, New York, NY, USA, 48-61.
[2]. Leyla Bilge, Sevil Sen, Davide Balzarotti, Engin Kirda, and Christopher Kruegel. 2014. Exposure: A Passive DNS Analysis Service to Detect and Report Malicious Domains. ACM Trans. Inf. Syst. Secur. 16, 4, Article 14 (April 2014).
[3]. Getting prepared for the next Cryptolocker DGA. Frank Denis random thoughts. https://00f.net/2013/11/05/cryptolocker-dga/
