This post is based on research conducted in collaboration with Google, to appear in CCS 2012. A PDF is available under my publications. Any views or opinions discussed herein are my own and not those of Google.
Driveby downloads — webpages that attempt to exploit a victim’s browser or plugins (e.g. Flash, Java) — have emerged as one of the dominant vectors for infecting hosts with malware. This revolution in the underground ecosystem has been fueled by the exploit-as-a-service marketplace, where exploit kits such as Blackhole and Incognito provide easily configurable tools that handle all of the “dirty work” of exploiting a victim’s browser in return for a fee. This business model follows in the footsteps of a dramatic evolution in the world of for-profit malware over the last five years, where host compromise is now decoupled from host monetization. Specifically, the means by which a host initially falls under an attacker’s control are now independent of the means by which an(other) attacker abuses the host in order to realize a profit, such as sending spam, information theft, or fake anti-virus.
In the case of exploit kits, attackers can funnel traffic from compromised sites or SEO-boosted content to exploit kits, taking control of a victim’s machine without any knowledge of the complexities surrounding browser and plugin vulnerabilities. These hosts can in turn be sold to the pay-per-install marketplace or directly monetized by the attacker. From the perspective of Google Chrome, driveby downloads outstrip social engineering as the most prominent threat, while Microsoft’s latest security intelligence report (SIRv12) highlights the growing threat of driveby downloads, shown below:
In order to understand the impact of the exploit-as-a-service paradigm on the malware ecosystem, we performed a detailed analysis of:
- The prevalence of exploit kits across malicious URLs
- The families of malware installed upon a successful browser exploit, compared to executables found in email spam, software torrents, the pay-per-install market, and live network traffic
- The traffic volume, lifetime, and popularity of malicious websites.
To carry out this study, we analyzed 77,000 malicious URLs provided to us by Google, along with a crowd-sourced feed of blacklisted URLs known to direct to exploit kits. These URLs led to over 10,000 distinct binaries, which we ran in a contained environment (i.e. no side-effects visible to the outside world) to determine the family of malware as well as its monetization approach. We also aggregated and executed over 50,000 distinct binaries pulled from email spam, software and warez torrents, pay-per-install distribution sites, and live network traffic containing malware from corporate settings.
Anatomy of a Driveby Download
From the time a victim accesses a malicious website up to the installation of malware on their system, there is a complex chain of events that underpins a successful driveby download. The infection chain for a real driveby that appeared in our study is shown below, where I obfuscate only the compromised website that launched the attack:
In this particular case, victims who visited a compromised website were funneled through a chain of redirects before finally being exposed to an exploit kit. Depending on the time the compromised site was visited, either Blackhole or an as-yet-unknown exploit kit would attempt to exploit the victim’s browser. If successful, different malware supplied by third parties, including SpyEye (information stealer), ZeroAccess (information stealer), and Rena (fake anti-virus), would be installed on the victim’s machine. This chain highlights the multiple actors involved in the exploit-as-a-service market: attackers purchasing installs, exploit kit developers, and miscreants compromising websites and redirecting traffic to exploit kits. Depending on an attacker’s preference, all three roles can be conducted by a single party or outsourced to the underground marketplace.
Popular Exploit Kits
Of the 77,000 URLs we received from Google’s Safe Browsing list, over 47% of initial domains tied to driveby downloads terminate at an exploit kit. A further 49% lead directly to executables without an exploit pack, and 4% could not be classified. The table below provides a detailed breakdown of the kits we identified:
Through passive DNS data collected from a number of ISPs (details available in the paper), we are able to determine which families are installed most frequently by driveby domains. This provides a more meaningful ranking than counting unique MD5 sums, which only measure polymorphism. We also check whether any of the families installed by drivebys appear in our other feeds: (D)roppers, (A)ttachments, (L)ive, and (T)orrents.
| Family | Monetization | Fraction of Installs | Other Feeds |
|---|---|---|---|
| Windows Custodian | Fake AV | 10.3% | – |
| Cluster A | Browser Hijacking | 5.1% | – |
| Cluster B | Fake AV | 2.2% | – |
| Perfect Keylogger | Information Stealer | 1.9% | D;L |
| Votwup | Denial of Service | 1.6% | – |
| Fake Rena | Fake AV | 1.5% | – |
| Cluster C | Information Stealer | 0.7% | – |
Variants including ZeroAccess and Emit rely on multiple infection vectors, while many of the other prominent variants are distributed solely through drivebys. Given that we identify 32 variants from drivebys and 19 from droppers, compared to only 6 from attachments and torrents, it is clear that the exploit-as-a-service and pay-per-install marketplaces dominate the underground economy as a source of installs.
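To make the difference between the two rankings concrete, here is a minimal sketch in Python. It is not the pipeline from the paper: the domain names, family names, lookup counts, and MD5 labels are all made up for illustration, and I assume each driveby domain has already been mapped to a single family. It weights each family by the passive DNS lookups its domains received, and contrasts that with a ranking by number of distinct MD5 sums.

```python
from collections import Counter

# Hypothetical inputs, not data from the study.
# Which family each driveby domain was observed installing.
domain_family = {
    "a.example": "FamilyX",
    "b.example": "FamilyX",
    "c.example": "FamilyY",
    "d.example": "FamilyZ",
}
# Passive DNS lookup counts per domain (a rough proxy for victim traffic).
dns_lookups = {"a.example": 10, "b.example": 5, "c.example": 120, "d.example": 25}
# Distinct binaries (by MD5) seen per family.
family_md5s = {
    "FamilyX": {"md5-1", "md5-2", "md5-3"},
    "FamilyY": {"md5-4"},
    "FamilyZ": {"md5-5"},
}


def rank_by_lookups(domain_family, dns_lookups):
    """Return (family, fraction of all lookups) pairs, most popular first."""
    totals = Counter()
    for domain, family in domain_family.items():
        totals[family] += dns_lookups.get(domain, 0)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    return [(family, count / grand_total) for family, count in totals.most_common()]


def rank_by_md5(family_md5s):
    """Return families ordered by number of distinct MD5 sums."""
    return sorted(family_md5s, key=lambda fam: len(family_md5s[fam]), reverse=True)


lookup_ranking = rank_by_lookups(domain_family, dns_lookups)
md5_ranking = rank_by_md5(family_md5s)
print(lookup_ranking)  # FamilyY leads: most traffic, but only one binary
print(md5_ranking)     # FamilyX leads: most repacked, but little traffic
```

In this toy data the heavily repacked family tops the MD5 ranking while receiving the least traffic, which is the kind of distortion the lookup-weighted ranking is meant to avoid.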
Catch Me If You Can
Using passive DNS data, we measure how long a domain used to host an exploit kit receives traffic. We find malicious domains survive for a median of 2.5 hours before going dark, with 43% of compromised pages that siphon traffic towards exploit kits linking to more than one final domain. As such, attempting to detect sites hosting exploit kits is a losing battle: domain registration far outstrips the pace of detection. Instead, detection should concentrate on identifying compromised sites. Such detection should also occur in-browser in order to circumvent the challenges associated with cloaking and with differences between time of crawl and time of use.
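As a rough illustration of the lifetime measurement, the sketch below computes a per-domain lifetime from timestamped lookups and takes the median. The records are invented, and I assume here that lifetime can be approximated as the gap between the first and last observed lookup for a domain; the paper describes the actual methodology.

```python
from datetime import datetime
from statistics import median

# Hypothetical passive DNS records: (domain, time a lookup was observed).
records = [
    ("a.example", datetime(2012, 6, 1, 0, 0)),
    ("a.example", datetime(2012, 6, 1, 1, 15)),
    ("a.example", datetime(2012, 6, 1, 2, 30)),
    ("b.example", datetime(2012, 6, 1, 1, 0)),
    ("b.example", datetime(2012, 6, 1, 2, 0)),
    ("c.example", datetime(2012, 6, 1, 0, 0)),
    ("c.example", datetime(2012, 6, 1, 10, 0)),
]


def domain_lifetimes_hours(records):
    """Map each domain to the hours between its first and last lookup."""
    first_seen, last_seen = {}, {}
    for domain, seen_at in records:
        if domain not in first_seen or seen_at < first_seen[domain]:
            first_seen[domain] = seen_at
        if domain not in last_seen or seen_at > last_seen[domain]:
            last_seen[domain] = seen_at
    return {
        domain: (last_seen[domain] - first_seen[domain]).total_seconds() / 3600
        for domain in first_seen
    }


lifetimes = domain_lifetimes_hours(records)
print(lifetimes)
print(median(lifetimes.values()))
```

A domain seen only once gets a lifetime of zero under this approximation, so a real measurement would need to decide how to treat single-lookup domains.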