Phishing scams have grown in both frequency and sophistication, and in recent years scammers have frequently used email to launch criminal attacks. Phishing emails let scammers make money in a very short time and generally avoid prosecution: fraudulent schemes are typically easy and cheap to implement, yet it is normally hard for law enforcement to catch the perpetrators. Victims, on the other hand, often face severe property loss or losses due to identity theft. Research on detecting and preventing phishing attacks has thus become a hot topic in computer and network security, and a variety of tools have been developed to address aspects of this problem. However, little software currently exists that can detect and analyze phishing crimes efficiently. When investigating incidents of phishing and the related problem of identity theft, law enforcement investigators must spend a great deal of time and effort, yet they often obtain only a few clues or results. We have developed the Undercover Multipurpose Anti-Spoofing Kit (UnMASK) to help solve this problem.
This thesis presents the idea and design of the deconstruction and analysis of email messages, which UnMASK uses to help law enforcement investigate and prosecute email-based crimes. It addresses the following questions: How can we parse a raw email message and find the information relevant to an investigation? What kind of information can we gather from the Internet? Which UNIX tools can be used for the investigation? In contrast to other work in this area, this research comprehensively considers the exploits used in phishing emails and defines a well-equipped raw email parser for law enforcement investigations. We also design and implement a new protocol used in the UNIX tool system. UnMASK not only tries to identify suspicious emails, but also emphasizes gathering evidence of the crime. To the best of our knowledge, UnMASK is the first system that can automatically deconstruct email messages and present the related forensic information to law enforcement in a convenient format.
Test results show that the parser and the UNIX tool system of UnMASK are stable and useful. The system correctly extracts the information that law enforcement officers want to check in raw emails, and it correctly gathers related information from the Internet. It generally takes a couple of minutes for our system to complete the report for one raw email message. Compared with the hours investigators previously spent on the same work, our system greatly improves their efficiency.
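To make the deconstruction step concrete, the following is a minimal sketch of how a raw email message might be parsed into fields an investigator would check. It is not the UnMASK parser: it relies only on Python's standard email package, and the names deconstruct, SAMPLE_RAW, URL_RE, and IP_RE, the sample message, and the choice of extracted fields are all assumptions made for illustration.

```python
import re
from email import message_from_string
from email.utils import parseaddr

# Hypothetical sample message; all addresses and hosts are reserved test values.
SAMPLE_RAW = """\
Received: from mail.example.net (mail.example.net [192.0.2.10])
\tby mx.example.org; Mon, 1 Jan 2007 10:00:00 -0500
From: "Example Bank" <support@example-bank.test>
Reply-To: collector@example.invalid
Subject: Verify your account
MIME-Version: 1.0
Content-Type: text/html; charset="us-ascii"

<html><body>
<a href="http://192.0.2.55/login">http://www.example-bank.test/login</a>
</body></html>
"""

# Simplified patterns (assumptions): real-world URLs and relay lines vary widely.
URL_RE = re.compile(r"https?://[^\s<>'\"]+", re.IGNORECASE)
IP_RE = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def deconstruct(raw):
    """Split a raw message into header fields, relay IPs, and body URLs."""
    msg = message_from_string(raw)

    # Each Received header records one relay hop; collect bracketed IPv4 addresses.
    relay_ips = []
    for hop in msg.get_all("Received", []):
        relay_ips.extend(IP_RE.findall(hop))

    # Walk every MIME part and collect URLs from the textual ones, in order.
    urls = []
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        payload = part.get_payload(decode=True) or b""
        charset = part.get_content_charset() or "utf-8"
        try:
            text = payload.decode(charset, errors="replace")
        except LookupError:
            # Unknown charset label: fall back to a lossy UTF-8 decode.
            text = payload.decode("utf-8", errors="replace")
        for url in URL_RE.findall(text):
            if url not in urls:
                urls.append(url)

    return {
        "from": parseaddr(msg.get("From", ""))[1],
        "reply_to": parseaddr(msg.get("Reply-To", ""))[1],
        "subject": msg.get("Subject", ""),
        "relay_ips": relay_ips,
        "urls": urls,
    }


if __name__ == "__main__":
    for field, value in deconstruct(SAMPLE_RAW).items():
        print(f"{field}: {value}")
```

In this sketch, the extracted relay addresses and URLs would be the natural inputs to Internet lookups with UNIX tools. A Reply-To address that differs from the From address, or link text that differs from the link target, are examples of the kind of discrepancy an investigator might want surfaced in a report.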