Carnegie Mellon Algorithm Uncovers Online Social Site Fraudsters
By Dian Schaffhauser
09/14/16
A research team at Carnegie  Mellon University has developed open source software that will allow social  media sites to identify fraudulent accounts, reviews and followers. In an  experiment using Twitter data, Fraudar,  as it's called, successfully detected numerous fake accounts with tweets  showing they used "follower-buying" services. These accounts had gone  undetected by the social networking service for the seven years since the data  was originally collected.
The basic idea of Fraudar is to "see through camouflage"  set up to make fake traffickers look legitimate, according to Christos  Faloutsos, a professor of machine learning and computer science and principal  investigator for the project.
Faloutsos has long been researching how to identify  fraudulent online activity, particularly involving online reviews. "They  influence our decisions over an extremely wide spectrum of daily and  professional activities: e.g., where to eat, where to stay, which products to  purchase, which doctors to see, which books to read, which universities to  attend and so on. However, the credibility and trustworthiness of online  reviews are at stake," explained an abstract submitted  to the National Science Foundation, which has funded the work.
A previous initiative led to development of "NetProbe," a  "fast and scalable system" to perform fraud detection in online auction  websites such as eBay.
As explained in  a recent article on the university website, Fraudar works by identifying a  "bipartite core." A bipartite graph is a way of diagramming paired  sets of data wherein no node from one set is connected to any other node in the  same set; the connections only go from a node in one set to a node in the other  set.
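To make the definition concrete, here is a minimal Python sketch of the bipartite property. The function name, the account names and the edge list are all hypothetical illustrations, not part of Fraudar: the check simply confirms that every connection runs from one set to the other and never stays inside a set.

```python
from typing import Iterable, Set, Tuple


def is_bipartite_edge_list(
    edges: Iterable[Tuple[str, str]], left: Set[str], right: Set[str]
) -> bool:
    """Return True if every edge joins a node in `left` to a node in `right`."""
    if left & right:
        # A node cannot belong to both sets.
        return False
    return all(
        (u in left and v in right) or (u in right and v in left)
        for u, v in edges
    )


# Hypothetical example: three followers and two followed accounts.
followers = {"alice", "bob", "carol"}
followees = {"shop_x", "shop_y"}
follows = [("alice", "shop_x"), ("bob", "shop_x"), ("carol", "shop_y")]

print(is_bipartite_edge_list(follows, followers, followees))
# An edge between two followers breaks the bipartite property.
print(is_bipartite_edge_list(follows + [("alice", "bob")], followers, followees))
```

The first call prints `True`; the second prints `False` because `alice` and `bob` sit in the same set.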
In the case of fraud detection, each node represents a user, and the transactions between users are shown as lines, or "edges." A bipartite core is a group of users who have transactions with members of a second group but no transactions with each other. The existence of the core "suggests a group of fraudsters, whose only purpose is to inflate the reputations of others by following them, by having fake interactions with them or by posting flattering or unflattering reviews of products and businesses," as the article noted. The fraudsters try to look normal by linking their fraudulent accounts to "popular sites or celebrities," or they exploit "legitimate user accounts they have hijacked."
Fraudar cuts away at the camouflage by first identifying and  eliminating the legitimate accounts — those that follow random people or post  only occasional reviews or show other signs of normal activity. What's left  more readily exposes the bipartite cores.
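The peeling idea can be illustrated with a short Python sketch. This is a simplified greedy procedure written for this article, not the released Fraudar code. The function name `greedy_dense_block`, the toy accounts, and the exact discount `1 / log(degree + c)` with `c = 5` are assumptions, loosely modeled on the idea of trusting edges to very popular accounts less. The sketch repeatedly removes the least suspicious account and remembers the densest group seen along the way.

```python
import math
from collections import defaultdict


def greedy_dense_block(edges, c=5.0):
    """Greedily peel low-suspiciousness nodes; return the densest block found.

    edges: iterable of (follower, followee) pairs.
    Returns (followers, followees, score) for the best block.
    """
    edges = set(edges)
    if not edges:
        return set(), set(), 0.0

    # Assumption: discount edges that point at popular followees.
    col_degree = defaultdict(int)
    for _, v in edges:
        col_degree[v] += 1

    adj = defaultdict(dict)
    for u, v in edges:
        w = 1.0 / math.log(col_degree[v] + c)
        adj[("L", u)][("R", v)] = w
        adj[("R", v)][("L", u)] = w

    weighted_deg = {n: sum(nbrs.values()) for n, nbrs in adj.items()}
    total = sum(weighted_deg.values()) / 2.0  # each edge was counted twice
    remaining = set(adj)
    best_score, best_set = total / len(remaining), set(remaining)

    while len(remaining) > 1:
        # Remove the node contributing the least weight to the current block.
        node = min(remaining, key=lambda n: (weighted_deg[n], n))
        remaining.remove(node)
        total -= weighted_deg[node]
        for nbr, w in adj[node].items():
            if nbr in remaining:
                weighted_deg[nbr] -= w
        score = total / len(remaining)
        if score > best_score:
            best_score, best_set = score, set(remaining)

    block_followers = {name for side, name in best_set if side == "L"}
    block_followees = {name for side, name in best_set if side == "R"}
    return block_followers, block_followees, best_score


# Hypothetical data: five fake followers boost four customers, two of the
# fakes follow a celebrity as camouflage, and six honest users behave normally.
toy_edges = [(f"f{i}", f"c{j}") for i in range(1, 6) for j in range(1, 5)]
toy_edges += [("f1", "celeb"), ("f2", "celeb")]
toy_edges += [(f"h{k}", "celeb") for k in range(1, 7)]
toy_edges += [("h1", "news"), ("h2", "news")]

block_followers, block_followees, block_score = greedy_dense_block(toy_edges)
print(sorted(block_followers), sorted(block_followees))
```

On this toy data the surviving block is the five fake followers and the four accounts they boost; the honest users, the news account and the celebrity used as camouflage are all peeled away. Real data is far messier, so the sketch shows only the shape of the approach.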
To test the Fraudar algorithm, the research team turned to a  Twitter database extracted in 2009 for research. The technology identified 4,000  accounts that appeared "highly suspicious."
Then the team randomly chose 125 followers and 125 "followees"  from the suspicious group as well as two control groups of 100 users who hadn't  been identified by the algorithm. Each user's account was examined for links  associated with malware or scams or clear "bot-like behavior," such  as replying to large numbers of tweets with identical messages. The researchers  found that 57 percent of the followers and 40 percent of the followees in the  suspicious group were labeled as fraudulent, compared to 12 percent and 25  percent in the control groups.
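A quick calculation puts those percentages side by side. The short Python sketch below assumes the control-group figures are listed in the same order as the suspicious-group figures (followers first, then followees); the article does not spell that out, and the variable names are illustrative only.

```python
# Percent of sampled accounts labeled fraudulent, as reported above.
# Assumption: 12 percent is the follower control and 25 percent the followee control.
suspicious = {"followers": 57, "followees": 40}
control = {"followers": 12, "followees": 25}

# How many times higher the fraud rate is in the flagged group.
lift = {group: suspicious[group] / control[group] for group in suspicious}

for group, ratio in lift.items():
    print(f"{group}: {ratio:.2f}x the control-group rate")
```

Under that reading, flagged followers were labeled fraudulent at 4.75 times the control rate and flagged followees at 1.60 times.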
"We're not identifying anything criminal here, but these sorts of frauds can undermine people's faith in online reviews and behaviors," Faloutsos said. He added that social media platforms do their best to "flush out such fakery." However, the highly scaled approach offered by Fraudar could be useful, he said, in keeping up with the latest practices of fraudsters. "We hope that by making this code available as open source, social media platforms can put it to good use."
The algorithm is available at andrew.cmu.edu. The  paper that describes the work, "Fraudar: Bounding Graph Fraud in the Face  of Camouflage," is available on the  Carnegie Mellon website.
About the Author
Dian Schaffhauser is a former senior contributing editor for 1105 Media's education publications THE Journal, Campus Technology and Spaces4Learning.