Center for Business Analytics

Fraud Detection

Members of the Predictive Analytics Lab work on a range of fraud detection projects, including financial fraud detection and online medical fraud detection. Examples of published articles include:

MetaFraud: A Meta-Learning Framework for Detecting Financial Fraud

Financial fraud can have serious ramifications for the long-term sustainability of an organization, as well as adverse effects on its employees and investors, and on the economy as a whole. Several of the largest bankruptcies in U.S. history involved firms that engaged in major fraud. Accordingly, there has been considerable emphasis on the development of automated approaches for detecting financial fraud. However, most methods have yielded performance results that are less than ideal. Enhanced financial fraud detection tools represent an important application area for business intelligence technologies. In light of the need for more robust identification methods, we use a design science approach to develop MetaFraud, a novel meta-learning framework for enhanced financial fraud detection. To evaluate the proposed framework, a series of experiments are conducted on a test bed encompassing thousands of legitimate and fraudulent firms. The results reveal that each component of the framework significantly contributes to its overall effectiveness. Additional experiments demonstrate the effectiveness of the meta-learning framework over state-of-the-art financial fraud detection methods. Moreover, the MetaFraud framework generates confidence scores associated with each prediction that can facilitate unprecedented financial fraud detection performance and serve as a useful decision-making aid. The results have important implications for several stakeholder groups, including compliance officers, investors, audit firms, and regulators.
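
To give a rough sense of how several base classifiers can be combined into one prediction with an attached confidence score, here is a minimal sketch. It is not the MetaFraud architecture, which the abstract does not specify. It swaps in a simple log-odds weighted vote, and the function names fit_vote_weights and predict_with_confidence, along with the 0/1 vote encoding, are assumptions made for illustration only.

```python
import math

def fit_vote_weights(base_preds, y):
    """Learn one weight per base classifier from held-out predictions.

    base_preds: list of prediction lists (1 = fraud, 0 = legitimate),
                one list per hypothetical base classifier.
    y:          true labels for the same held-out firms.
    """
    weights = []
    for preds in base_preds:
        correct = sum(p == t for p, t in zip(preds, y))
        # Laplace-smoothed accuracy keeps the log-odds finite.
        acc = (correct + 1) / (len(y) + 2)
        weights.append(math.log(acc / (1 - acc)))
    return weights

def predict_with_confidence(votes, weights):
    """Combine one firm's base votes into a label and a confidence in [0.5, 1]."""
    # Each classifier pushes the score up (fraud) or down (legitimate)
    # in proportion to how reliable it was on the held-out data.
    score = sum(w if v == 1 else -w for v, w in zip(votes, weights))
    p_fraud = 1 / (1 + math.exp(-score))
    label = 1 if p_fraud >= 0.5 else 0
    confidence = max(p_fraud, 1 - p_fraud)
    return label, confidence

# Toy held-out data (invented for illustration, not from the study).
y_holdout = [1, 1, 0, 0, 1, 0]
base_holdout = [
    [1, 1, 0, 0, 1, 0],  # classifier A: always right
    [1, 0, 0, 1, 1, 0],  # classifier B: right on 4 of 6
    [0, 1, 1, 0, 0, 0],  # classifier C: right on 3 of 6
]
weights = fit_vote_weights(base_holdout, y_holdout)
print(predict_with_confidence([1, 0, 1], weights))
print(predict_with_confidence([0, 0, 0], weights))
```

In this sketch a unanimous vote yields a higher confidence than a split vote, which is the property that would let a reviewer prioritize the firms flagged most confidently.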


Detecting Fake Medical Websites using Recursive Trust Labeling

Fake medical websites have become increasingly prevalent. Consequently, much of the health-related information and advice available online is inaccurate and/or misleading. Scores of medical institution websites are for organizations that do not exist and more than 90% of online pharmacy websites are fraudulent. In addition to monetary losses exacted on unsuspecting users, these fake medical websites have severe public safety ramifications. According to a World Health Organization report, approximately half the drugs sold on the Web are counterfeit, resulting in thousands of deaths. In this study, we propose an adaptive learning algorithm called recursive trust labeling (RTL). RTL uses underlying content and graph-based classifiers, coupled with a recursive labeling mechanism, for enhanced detection of fake medical websites. The proposed method was evaluated on a test bed encompassing nearly 100 million links between 930,000 websites, including 1,000 known legitimate and fake medical sites. The experimental results revealed that RTL was able to significantly improve fake medical website detection performance over 19 comparison content and graph-based methods, various meta-learning techniques, and existing adaptive learning approaches, with an overall accuracy of over 94%. Moreover, RTL was able to attain high performance levels even when the training data set comprised as few as 30 websites. With the increased popularity of eHealth and Health 2.0, the results have important implications for online trust, security, and public safety.
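
The general idea of combining content-based and graph-based evidence with a recursive labeling step can be sketched as follows. This is a generic self-training loop over a link graph, not the published RTL algorithm, whose details the abstract does not give. The function name recursive_labeling, the equal weighting of content and link evidence, the 0.8 threshold, and the assumption that each site already has a content-based fake probability are all hypothetical choices made for illustration.

```python
def recursive_labeling(content_score, links, seeds, threshold=0.8, max_rounds=10):
    """Grow a small seed set of labels by repeatedly labeling confident sites.

    content_score: site -> assumed content-based probability of being fake.
    links:         site -> set of sites it links to.
    seeds:         site -> True (fake) or False (legitimate), the known labels.
    Returns a dict of labels; sites that never become confident stay unlabeled.
    """
    labels = dict(seeds)
    for _ in range(max_rounds):
        newly_labeled = {}
        for site, content_p in content_score.items():
            if site in labels:
                continue
            # Graph evidence: share of already-labeled neighbors that are fake.
            neighbor_labels = [labels[n] for n in links.get(site, ()) if n in labels]
            if neighbor_labels:
                graph_p = sum(neighbor_labels) / len(neighbor_labels)
                p_fake = 0.5 * content_p + 0.5 * graph_p
            else:
                p_fake = content_p
            # Only confident predictions are promoted to labels.
            if p_fake >= threshold:
                newly_labeled[site] = True
            elif p_fake <= 1 - threshold:
                newly_labeled[site] = False
        if not newly_labeled:
            break  # nothing new to learn from, so stop recursing
        labels.update(newly_labeled)
    return labels

# Toy data (invented for illustration, not from the study).
content_score = {"a": 0.9, "b": 0.7, "c": 0.1, "d": 0.5}
links = {"a": {"s_fake"}, "b": {"a"}, "c": {"s_legit"}, "d": {"c", "s_legit"}}
seeds = {"s_fake": True, "s_legit": False}
print(recursive_labeling(content_score, links, seeds))
```

In the toy data, site "b" is too uncertain to label in the first round but becomes confidently labeled in the second, once its neighbor "a" has been labeled. That is how a loop of this kind can extend a very small training set.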