CS Student Rajan Kharel Wins the First Place in AAS Paper Competition

  • March 25, 2013
Rajan Kharel, who joined the Knowledge Discovery and Data Mining Lab as an MS/PhD student in Fall 2012, won first place in the paper competition in the Engineering and Computer Science Section at the 90th annual meeting of the Alabama Academy of Science.

His paper title:  IMAGE MATCHING FOR BRANDING PHISHING KIT IMAGES

Abstract:
Phishing websites attempt to convince people to deliver their passwords, user IDs and other sensitive information by masquerading as legitimate websites such as banks, product vendors, or service providers. Taking appropriate action against phishing is easier if the imitated legitimate websites can be efficiently identified and branded. Using a phishing kit is a preferred way of creating phishing websites as it allows fast deployment of a phishing site. A phish or a kit may contain one or more images that are similar to those of a targeted brand, such as a bank logo or a product trademark. Visual matching is frequently adopted as a reliable way of determining the brand the phish or kit attempts to imitate. However, manual matching of such images does not scale, is error-prone, and does not accumulate knowledge. In this paper, we explore automatic image matching that can determine the association between an image in a phish/kit and the masqueraded brand (e.g. a bank) it is attempting to imitate. In particular, four image matching algorithms are developed and compared: 1) Global Color Histograms (GCH), 2) Local Color Histograms (LCH), 3) Local Color Histogram with dimensional scaling and background color removal (LCH+), and 4) LCH+ applied to the foreground object region only (LCH++). Each calculates image similarity in order to group phishing images by brand. The correct identification rate of brand images and non-brand images is used to evaluate the effectiveness of these methods. Experiments are performed on 10,130 images that are extracted from phish creation kits. The results indicate that all these techniques are reasonably effective at correctly identifying an image's brand, but they differ in how effectively they identify an image as a brand or non-brand image.
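
To illustrate the difference between the first two methods named in the abstract, here is a minimal Python sketch of global versus local color histogram matching. It is not the paper's implementation: the abstract does not give the bin count, the grid size, or the similarity measure, so the 4 bins per channel, the 2x2 grid, and histogram intersection are all assumptions chosen for illustration, and every function name is hypothetical. LCH+ and LCH++ are not sketched because the abstract does not describe their scaling, background removal, or foreground detection steps in enough detail.

```python
import numpy as np


def global_color_histogram(image, bins=4):
    """Normalized joint RGB histogram over a whole image (bins**3 entries).

    `image` is an H x W x 3 uint8 array. The bin count is an assumption.
    """
    pixels = image.reshape(-1, 3).astype(np.int64)
    # Map each 0..255 channel value to a bin index in 0..bins-1.
    idx = (pixels * bins) // 256
    # Combine the three per-channel indices into one flat bin index.
    flat = (idx[:, 0] * bins + idx[:, 1]) * bins + idx[:, 2]
    hist = np.bincount(flat, minlength=bins ** 3).astype(float)
    return hist / hist.sum()


def local_color_histograms(image, grid=2, bins=4):
    """One normalized histogram per cell of a grid x grid partition.

    Assumes the image is at least `grid` pixels in each dimension.
    """
    height, width, _ = image.shape
    row_groups = np.array_split(np.arange(height), grid)
    col_groups = np.array_split(np.arange(width), grid)
    cells = [
        global_color_histogram(image[r[0]:r[-1] + 1, c[0]:c[-1] + 1], bins)
        for r in row_groups
        for c in col_groups
    ]
    return np.stack(cells)


def gch_similarity(a, b, bins=4):
    """Histogram intersection of two global histograms, in [0, 1]."""
    ha = global_color_histogram(a, bins)
    hb = global_color_histogram(b, bins)
    return float(np.minimum(ha, hb).sum())


def lch_similarity(a, b, grid=2, bins=4):
    """Mean histogram intersection over corresponding grid cells, in [0, 1]."""
    la = local_color_histograms(a, grid, bins)
    lb = local_color_histograms(b, grid, bins)
    return float(np.minimum(la, lb).sum(axis=1).mean())


# Two toy 4x4 images with the same colors in mirrored positions.
red, blue = (255, 0, 0), (0, 0, 255)
left_red = np.zeros((4, 4, 3), dtype=np.uint8)
left_red[:, :2], left_red[:, 2:] = red, blue
left_blue = np.zeros((4, 4, 3), dtype=np.uint8)
left_blue[:, :2], left_blue[:, 2:] = blue, red

print(gch_similarity(left_red, left_blue))  # same overall color mix
print(lch_similarity(left_red, left_blue))  # different spatial layout
```

In this toy case the global histograms are identical because both images contain the same proportions of red and blue, while the local histograms share nothing because each grid cell holds a different color. That sensitivity to spatial layout is the general motivation for local histograms; how it played out on the paper's phishing kit images is reported only in the paper itself.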