Wednesday, December 10, 2014

Careful When You Fake Your Twitter Compromise: We Can Detect It

As those who've followed our research know well, over the last four years we have worked on systems that automatically detect malicious activity on online social networks. In particular, last year we presented COMPA, a system that learns the typical behaviour of social network accounts and raises an alert if an account posts a message that does not comply with that behaviour. We showed that COMPA is a reliable tool for detecting accounts that have been compromised, and that our behavioural modelling could have helped prevent high-profile social network compromises, such as the ones against the Skype and Associated Press Twitter accounts.

Flash forward a year or so: last week, Honda reported that its Twitter account had been hacked. The alleged culprit was the cartoon villain and Robot Chicken celebrity Skeletor.

Obviously, fictional characters do not hack Twitter accounts, and it soon became clear that this hack had been simulated for promotional purposes. It is not the first time such a thing has happened: Chipotle did the same a little over a year ago. Apparently, Twitter compromises have become so mainstream that faking one is an attractive marketing technique. The trick even works, or so it seems: on the day of its simulated compromise, Chipotle collected more than 4,000 followers, an order of magnitude more than it typically attracts.

The clever marketers at these companies did not take COMPA into account, though. Our system correctly assessed that nothing was anomalous about the "malicious" tweets sent by Honda, nor about the ones sent by Chipotle: both fit the accounts' usual behaviour. In other words, our tool is not only useful for detecting messages sent by attackers who gained access to social network accounts, but can also detect compromises that are merely simulated.

Next time you stage a Twitter compromise, make sure your messages look anomalous; otherwise, we can call your bluff.

Saturday, May 31, 2014

New Insights into Email Spam Operations

Our group has been studying spamming botnets for a while, and our efforts in developing mitigation techniques and taking down botnets have contributed to decreasing the amount of spam on the Internet. Spam volumes have dropped significantly over the last couple of years, but spam remains a heavy burden on the email infrastructure and on email users. Recently, we have been working on gaining a better understanding of spam operations and of the actors involved in this underground economy. We believe that shedding light on these topics can help researchers develop novel mitigation techniques and identify which of the existing techniques are particularly effective at crippling spam operations, and should therefore be widely deployed. Our efforts produced two papers.

The first paper, which will be presented at AsiaCCS next week, is a longitudinal study of the spam delivery pipeline. Previous research showed that to set up a spam operation, a spammer has to interact with multiple specialized actors: he has to purchase a list of email addresses to target with his spam emails, and he needs a botnet to send the actual spam. Both services are provided by specialized entities active on the underground market, which we call "email harvesters" and "botmasters" respectively. In this paper, we studied the relations between the different actors in the spam ecosystem. We wanted to understand how widely email lists are sold, and to how many spammers, as well as how many botnets each spammer rents to set up his operation.

To perform our study, we proceeded as follows. First, we disseminated fake email addresses under our control on the web. We consider any party accessing the web pages that host these email addresses a possible email harvester, and "fingerprint" it by logging its IP address and user agent. This way, every time we receive a spam email addressed to one of our addresses, we can track which email harvester collected that address. Similarly, we can fingerprint the botnet sending the spam email by using a technique that we presented at USENIX Security in 2012, called SMTP dialects. In a nutshell, this technique leverages the fact that each SMTP implementation used by spambots is slightly different, and that it is possible to identify the family a bot belongs to just by looking at the sequence of SMTP messages it exchanges with the email server. Finally, we assume that a single spammer is responsible for each spam campaign, and cluster similar emails together.
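
To make the tracking pipeline concrete, here is a minimal Python sketch of the first step: linking incoming spam back to the harvester that collected the target address. The class, the fingerprinting scheme, and the example values are our own simplification for illustration, not the measurement code used in the paper.

```python
# Hypothetical sketch: link spam arriving at honeypot addresses back to
# the harvester that crawled the page hosting those addresses.
import hashlib

class HarvesterTracker:
    def __init__(self):
        # Maps each honeypot address to the fingerprint of the
        # harvester that collected it.
        self.address_to_harvester = {}

    def fingerprint(self, ip, user_agent):
        # A harvester "fingerprint" here is simply a hash of the
        # (IP address, user agent) pair logged at access time.
        return hashlib.sha1(f"{ip}|{user_agent}".encode()).hexdigest()[:12]

    def log_page_access(self, ip, user_agent, addresses_on_page):
        # Called whenever someone fetches a honeypot page: whoever
        # accessed the page is a candidate email harvester.
        fp = self.fingerprint(ip, user_agent)
        for addr in addresses_on_page:
            self.address_to_harvester[addr] = fp

    def harvester_for_spam(self, recipient):
        # When spam arrives at a honeypot address, look up which
        # harvester originally collected that address.
        return self.address_to_harvester.get(recipient)

tracker = HarvesterTracker()
tracker.log_page_access("198.51.100.7", "crawler/1.0", ["trap1@example.com"])
print(tracker.harvester_for_spam("trap1@example.com"))  # harvester fingerprint
```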

After collecting this information, we can track a spam operation from beginning to end: we know which email list the spammer used, as well as which botnet he took advantage of. Our results show that spammers develop a sort of "brand loyalty" to both email harvesters and botmasters: each spammer that we observed used a single botnet over a period of six months, and kept using the same email list for a long period of time.

The second paper, which was presented at the International Workshop on Cyber Crime earlier this month, studies the elements that a spammer needs to tune to make his botnet perform well. We studied the statistics of 24 C&C servers belonging to the Cutwail botnet, looking at which elements differentiate successful spammers from failed ones. The first element is the number of bots that the spammer uses: having too many bots connect to the C&C server saturates its bandwidth and results in bad performance. Another element is the size of the email list used by the spammer. "Good" spammers trim non-existent email addresses from their lists, preventing their bots from wasting time sending emails that will never be delivered. A third element is having bots retry to send an email multiple times after receiving a server error: since many bots have poor Internet connections, this keeps the fraction of successfully sent emails high (see the simulation sketched below). The last, surprising finding is that the physical location of bots does not seem to influence the performance of a spam campaign. As a side effect, successful spammers typically purchase bots located in developing countries, which are typically cheaper.
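
As a back-of-the-envelope illustration of the retry finding, the following snippet simulates delivery over a flaky connection. The per-attempt success probability is made up; the point is simply that a couple of retries raise the delivered fraction dramatically.

```python
# Illustrative simulation (not Cutwail code) of why retrying on server
# errors matters for bots sitting on poor Internet connections.
import random

def deliver(success_prob, max_attempts):
    # True if at least one of max_attempts tries succeeds.
    return any(random.random() < success_prob for _ in range(max_attempts))

random.seed(0)
TRIALS = 100_000
p = 0.6  # assumed chance that a single attempt survives a flaky link

for attempts in (1, 3, 5):
    delivered = sum(deliver(p, attempts) for _ in range(TRIALS)) / TRIALS
    print(f"{attempts} attempt(s): ~{delivered:.0%} of emails delivered")
```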

The findings from this paper show which elements spammers tune to make their operations perform well. Fortunately, the research community has proposed a number of systems that target exactly these elements. We think that deploying these techniques widely could cripple spam operations, possibly to the point of making them no longer profitable. One example is B@BEL, a system that detects whether an email sender is reputable, and provides fake feedback on whether an email address exists whenever it identifies the sender as a bot. Fake feedback makes it impossible for spammers to clean non-existent addresses out of their lists, compromising the performance of their operations.
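
The sketch below illustrates the fake-feedback idea; it is our own toy rendition, not B@BEL's implementation, and it assumes a classifier that has already decided whether the sender is a bot.

```python
# Toy rendition of the fake-feedback idea: once a sender is classified
# as a bot, address-verification replies become random noise, so the
# spammer learns nothing useful for trimming his list.
import random

VALID_USERS = {"alice", "bob"}

def rcpt_to_response(localpart, sender_is_bot):
    if sender_is_bot:
        # Fake feedback: a coin flip, uncorrelated with reality.
        return "250 OK" if random.random() < 0.5 else "550 No such user"
    # Honest feedback for reputable senders.
    return "250 OK" if localpart in VALID_USERS else "550 No such user"

print(rcpt_to_response("alice", sender_is_bot=False))   # truthful: 250 OK
print(rcpt_to_response("mallory", sender_is_bot=True))  # random answer
```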

Similarly, Beverly et al. proposed a system that flags senders as bots if network errors are too common. Such a system is a direct countermeasure to spammers instructing their bots to keep trying to send emails after receiving errors. Finally, SNARE is a system that, among other features, looks at the geographical distance between sender and recipient to detect spam. Since spammers purchase bots in countries that are typically far away from their victims (who are mostly located in Western countries), this system could be very effective in fighting spam if widely deployed.
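
As an illustration of the distance feature, here is a small sketch; SNARE combines this signal with several others, and the coordinates and cutoff below are invented for the example.

```python
# Sketch of a SNARE-style geographic feature: mail from senders
# geolocated far from the receiving server scores as more suspicious.
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    # Great-circle distance between two points on Earth, in kilometres.
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 \
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

RECIPIENT = (48.86, 2.35)   # receiving mail server, e.g. Paris (assumed)
sender = (-6.2, 106.8)      # sender IP geolocated to, e.g., Jakarta (assumed)

distance = haversine_km(*sender, *RECIPIENT)
print(f"sender-recipient distance: {distance:.0f} km")
if distance > 8000:         # hypothetical cutoff for this single feature
    print("feature vote: likely spam")
```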

We hope that the insights from these two papers will give researchers new ideas for developing effective anti-spam techniques.

Tuesday, January 7, 2014

Detecting the Skype Twitter Compromise

On January 1, 2014, Microsoft-owned Skype added itself to the list of high-profile compromised Twitter accounts. The company's Twitter account posted the following tweet:

As can be seen, the tweet was even "signed" with the #SEA hashtag, which stands for Syrian Electronic Army. The SEA is a group of hackers supporting the Syrian regime, and it has been responsible for previous high-profile Twitter hacks. The tweet states "Don't use Microsoft emails(hotmail,outlook). They are monitoring your accounts and selling the data to the governments. More details soon #SEA". Basically, a classic case of political hacktivism.

The tweet looks anomalous at first sight, not only for the odd content coming from a Microsoft-owned account, but also for the hashtag attributing it to the Middle Eastern group. However, Twitter's automated defenses did not block the tweet as anomalous. Even worse, since the compromise happened on a holiday, it took hours before Microsoft took the tweet down.

It is to detect and prevent such incidents that we developed COMPA. COMPA learns the typical behavior of a Twitter account, and flags as anomalous any tweet that significantly deviates from the learned behavioral model. My colleague Manuel Egele and I checked the malicious tweet by the Syrian Electronic Army against the behavioral model built for the Skype Twitter account. The result was positive: had it been deployed on Twitter, COMPA would have detected and blocked the tweet, saving Microsoft some public relations embarrassment.

In more detail, the Skype Twitter account always posts from the Sprinklr Twitter client, while the malicious tweet was sent from the regular Twitter web interface; this fact alone is already very suspicious. Second, the Skype Twitter account had never used the #SEA hashtag before. In addition, the malicious tweet did not contain a URL, whereas including one is common practice in Skype's tweets. Interestingly, the time of day at which the tweet was sent matches the typical posting patterns of the Skype Twitter account; however, this was not enough to evade detection by our system.
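
To make this concrete, here is a toy rendition of the kind of behavioral check COMPA performs; the feature set, weights, and profile below are invented for illustration, while the real system builds statistical models from each account's message history.

```python
# Toy behavioral check in the spirit of COMPA (hypothetical features
# and weights; not the actual system).
SKYPE_PROFILE = {
    "clients": {"Sprinklr"},               # client the account normally posts from
    "hashtags": {"#Skype", "#Microsoft"},  # hashtags seen in past tweets (assumed)
    "usually_has_url": True,               # Skype's tweets typically include a link
}

def anomaly_score(tweet, profile):
    score = 0.0
    if tweet["client"] not in profile["clients"]:
        score += 0.5   # posted from an unknown client: strongest signal here
    if any(h not in profile["hashtags"] for h in tweet["hashtags"]):
        score += 0.3   # uses a hashtag the account never used before
    if profile["usually_has_url"] and not tweet["has_url"]:
        score += 0.2   # missing the URL the account usually includes
    # A time-of-day feature would add nothing here: the SEA tweet
    # matched the account's usual posting times.
    return score

sea_tweet = {"client": "Twitter Web Client", "hashtags": {"#SEA"}, "has_url": False}
print(anomaly_score(sea_tweet, SKYPE_PROFILE))  # -> 1.0, flagged as anomalous
```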

This result shows that COMPA is effective in detecting and blocking tweets sent from compromised Twitter accounts. We strongly advocate that Twitter and other social networks implement similar techniques to keep their accounts protected, and block malicious tweets before they are posted.