申请试用
HOT
登录
注册
 
Large-Scale Malicious Domain Detection with Spark AI

Large-Scale Malicious Domain Detection with Spark AI

Spark开源社区
/
发布于
/
8601
人观看
Malicious domains are one of the main resources used to mount attacks over the Internet. It is important to detect such activities by mining the large-scale network traffic data and identifying malicious URLs, domains or IPs. The attackers often take advantages of vulnerabilities in DNS and commit activities such as stealing private information, spamming, phishing, and DDoS attacks, and tend to by-pass botnet detection by generating domain clusters from Domain Generation Algorithms (DGA). We have billions of DNS records per day. Spark AI platform hence serves as an efficient distributed platform for the processing and mining of this huge amount of data. We work on the following two cybersecurity use cases. 1. Detect DGA, Porn, and Gambling domains Each malware-compromised host machine will have a large amount of DNS request in sequential order. The domain names are either generated by DGAs or preserve particular string patterns by design. We use spark to generate DNS request domain sequences and use Word2Vec to estimate the embedding of the domains. We then estimate the similarity and the most similar domains in the embedding space are discovered as the potential malicious domains. 2. Detect cryptocurrency mining pool domains The attackers are interested in accessing computing resources to mine cryptocurrency. The malware infected computers will be directed to attacker-controlled mining pool domains. This type of DNS request does not preserve sequential order and is relatively random. Since each mining pool domain cluster is visited by a wide range of different host machines, we used LSH to evaluate the similarity among sets of hosts. As a result, LSH generates domain-bucket bipartite graph and FastUnfolding algorithm is used to discover the domain clusters. We leveraged spark AI for large scale DNS data analysis and discovered hundreds of thousands of malicious domains each day at high precision.
0点赞
0收藏
1下载
确认
3秒后跳转登录页面
去登陆