CMU-ISRI-06-112 Institute for Software Research
 School of Computer Science, Carnegie Mellon University
 
    
     
 
Learning to Detect Phishing Emails 
Ian Fette, Norman Sadeh, Anthony Tomasic 
June 2006  
Also appears as Carnegie Mellon Cyber Laboratory Technical Report CMU-CyLab-06-112
 
CMU-ISRI-06-112.pdf

Keywords: Phishing, email, filtering semantic attacks, learning

An increasing number of emails purport to be from a trusted entity and
attempt to deceive users into providing account or identity information;
these are commonly known as "phishing" emails. Traditional spam filters do
not adequately detect these undesirable emails, which causes problems for
both consumers and businesses wishing to do business online. From a
learning perspective, this is a challenging problem. At first glance, it
appears to be a simple text classification problem, but the classification
is confounded by the fact that "phishing" emails are often designed to look
exactly like real emails. We propose a new framework for detecting these
malicious emails, called PILFER. By incorporating features specifically
designed to highlight the deceptive methods used to fool users, we are able
to accurately classify over 92% of phishing emails while maintaining a
false positive rate on the order of 0.1%.
 
16 pages 