Data Mining and Fraud Detection

Abstract: This blog post discusses how data mining and machine learning can improve fraud detection across industries. It groups the solutions into two main categories, supervised and unsupervised learning, each with its own approach to detecting fraud.

Fraud detection matters in many industries, including banking and the wider financial sector. Fraud attempts have increased drastically in recent years, making fraud detection more important than ever.

Generally, data mining (sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it into useful information. Data mining and statistics help to anticipate and quickly detect fraud and take immediate action to minimize costs. Through the use of sophisticated data mining tools, millions of transactions can be searched to spot patterns and detect fraudulent transactions.

Machine learning and artificial intelligence solutions fall into two categories: ‘supervised’ and ‘unsupervised’ learning.

In supervised learning, a random sub-sample of all records is taken and manually classified as either ‘fraudulent’ or ‘non-fraudulent’. Because fraud is a relatively rare event, fraudulent records may need to be oversampled so that the sample contains enough of them. These manually classified records are then used to train a supervised machine learning algorithm. Once a model has been built from this training data, it should be able to classify new records as either fraudulent or non-fraudulent.
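
The supervised workflow can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a production method: the ten labelled records, the two features (amount in thousands and a foreign-merchant flag), and the helper names oversample, train and classify are all hypothetical, and logistic regression stands in for whichever supervised algorithm is actually chosen.

```python
import math
import random

def oversample(records, labels, seed=0):
    """Duplicate randomly chosen minority-class records until both classes are the same size."""
    rng = random.Random(seed)
    fraud = [r for r, y in zip(records, labels) if y == 1]
    legit = [r for r, y in zip(records, labels) if y == 0]
    minority, label = (fraud, 1) if len(fraud) < len(legit) else (legit, 0)
    shortfall = abs(len(fraud) - len(legit))
    extra = [rng.choice(minority) for _ in range(shortfall)]
    return records + extra, labels + [label] * shortfall

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def score(model, record):
    """Estimated probability that a record is fraudulent."""
    weights, bias = model
    return sigmoid(sum(w * x for w, x in zip(weights, record)) + bias)

def train(records, labels, epochs=2000, rate=0.5):
    """Fit a logistic regression model by batch gradient descent."""
    model = ([0.0] * len(records[0]), 0.0)
    n = len(records)
    for _ in range(epochs):
        errors = [score(model, r) - y for r, y in zip(records, labels)]
        weights, bias = model
        weights = [w - rate * sum(e * r[j] for e, r in zip(errors, records)) / n
                   for j, w in enumerate(weights)]
        bias -= rate * sum(errors) / n
        model = (weights, bias)
    return model

def classify(model, record, threshold=0.5):
    return "fraudulent" if score(model, record) >= threshold else "non-fraudulent"

# Hypothetical hand-labelled sub-sample: [amount in thousands, foreign-merchant flag].
records = [[0.10, 0], [0.20, 0], [0.30, 0], [0.15, 0], [0.25, 1],
           [0.05, 0], [0.40, 0], [0.35, 0], [2.50, 1], [3.00, 1]]
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]  # 1 = fraudulent, 0 = non-fraudulent

# Oversample the rare fraud class, train, then classify unseen records.
balanced_records, balanced_labels = oversample(records, labels)
model = train(balanced_records, balanced_labels)
print(classify(model, [2.80, 1]))
print(classify(model, [0.20, 0]))
```

In practice the model should be evaluated on records that were held out before oversampling, since duplicated fraud records would otherwise appear in both the training and the test data.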

The use of unsupervised learning for fraud detection has not been explored as intensively as the use of supervised learning. Unsupervised methods need no labelled records; they flag observations that deviate from normal behavior. Bolton and Hand monitor behavior over time by means of Peer Group Analysis, which detects individual objects that begin to behave differently from objects to which they had previously been similar. Another tool Bolton and Hand developed for behavioral fraud detection is Break Point Analysis. Unlike Peer Group Analysis, Break Point Analysis operates at the account level: a break point is an observation at which anomalous behavior is detected for a particular account. Both tools have been applied to spending behavior in credit card accounts.
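
The two ideas can be illustrated with a minimal Python sketch. It is a simplification and not Bolton and Hand's exact procedure: both functions use a plain standardised difference as the anomaly score, the spending figures are made up, and the function names, the window of three recent transactions, and the choice of three peers are assumptions for illustration.

```python
from statistics import mean, stdev

def break_point_score(amounts, recent=3):
    """Compare one account's latest transactions with its own earlier history.

    Assumes the earlier history has at least two values and is not constant.
    """
    earlier, latest = amounts[:-recent], amounts[-recent:]
    return (mean(latest) - mean(earlier)) / stdev(earlier)

def peer_group_score(spend, target, n_peers=3):
    """Compare one account's latest period with accounts that used to behave like it.

    Assumes at least two peers whose latest values are not all identical.
    """
    def past_distance(other):
        # Squared distance between histories, excluding the latest period.
        return sum((a - b) ** 2
                   for a, b in zip(spend[target][:-1], spend[other][:-1]))

    others = [name for name in spend if name != target]
    peers = sorted(others, key=past_distance)[:n_peers]
    peer_latest = [spend[name][-1] for name in peers]
    return (spend[target][-1] - mean(peer_latest)) / stdev(peer_latest)

# Hypothetical transaction amounts for two single accounts.
steady = [20, 22, 19, 21, 20, 23, 21, 20]
changed = [20, 22, 19, 21, 20, 250, 310, 280]

# Hypothetical weekly spend per account; the last value is the current week.
weekly_spend = {
    "A": [100, 110, 95, 105, 400],
    "B": [102, 108, 97, 101, 105],
    "C": [98, 112, 94, 107, 98],
    "D": [101, 109, 96, 103, 102],
    "E": [500, 520, 480, 510, 505],
    "F": [490, 515, 470, 520, 498],
}

print(break_point_score(steady), break_point_score(changed))
print(peer_group_score(weekly_spend, "A"), peer_group_score(weekly_spend, "B"))
```

A score far from zero marks an account for review; where to set the cut-off is a judgment call that depends on how many alerts investigators can handle.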

Conclusion: Organizations deploy data mining and business intelligence tools to prevent and detect fraud. At the same time, fraud schemes are becoming more complicated and call for more sophisticated solutions. One of the main steps toward a more secure system is strengthening the technical infrastructure: systems must be built to handle larger databases so that more accurate patterns can be extracted from the data, and experts are needed to run a greater number of more complex queries.

References:
https://www.researchgate.net/publication/241153108_Data_Mining_for_Fraud_Detection_Toward_an_Improvement_on_Internal_Control_Systems
http://www.statsoft.com/Textbook/Fraud-Detection
http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/datamining.htm