00101
FINANCIAL FRAUD DETECTION USING COSINE SIMILARITY

Sunday, February 19, 2017
Exhibit Hall (Hynes Convention Center)
Akshay Gore, California State University Chico, Chico, CA
Fraud is a large-scale problem that affects entities across the public and private sectors, including government, for-profit, and non-profit organizations. Its exact scale is hard to estimate because most fraud goes undetected. Detecting financial fraud is important because it saves the company’s or the taxpayer’s money. In this research project, a data mining tool is designed that calculates the similarity between electronic transactions and uses that similarity to predict the possibility of fraud. The tool is developed using the cosine similarity algorithm and the RapidMiner tool. Invoice fraud depends not only on the company name or amount but also on transaction location, payment mode, transaction date, invoice number, transaction category, vendor, and digital footprint. Understanding these patterns is essential for predicting future fraud. The data mining model developed in this research will help organizations analyze their financial transactions and blow the whistle early on fraudsters. The model takes transaction data as input and converts it to lowercase. The processed data is then tokenized, and the inverse frequency of each token is calculated. A similarity score between transactions is obtained by applying the cosine similarity algorithm to the processed data. These results are then compared against a maintained list of known fraudulent transactions. This implementation achieves 97% accuracy in detecting fraud, which indicates that the method can predict the possibility of fraud accurately in most cases. The module is a simple and effective way to avoid such fraud and save the associated expenditures.
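The pipeline described in the abstract (lowercase, tokenize, weight tokens by inverse frequency, score with cosine similarity, compare against known fraud) can be illustrated with a minimal Python sketch. This is not the RapidMiner process used in the project. The function names, the whitespace tokenizer, the smoothed inverse-frequency formula, and the 0.8 threshold are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(record):
    """Lowercase a transaction record and split it on whitespace (assumed tokenizer)."""
    return record.lower().split()


def idf_weights(docs):
    """Smoothed inverse document frequency per token (one common variant, assumed here)."""
    n = len(docs)
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + df)) + 1.0 for tok, df in doc_freq.items()}


def tfidf(doc, idf):
    """Sparse vector mapping each token to term count times inverse frequency."""
    counts = Counter(doc)
    return {tok: counts[tok] * idf.get(tok, 0.0) for tok in counts}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(tok, 0.0) for tok, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def flag_transactions(new_records, fraud_records, threshold=0.8):
    """Score each new record against known fraud records.

    Returns (record, best_similarity, flagged) tuples. The threshold is a
    hypothetical cut-off, not a value taken from the project.
    """
    new_docs = [tokenize(r) for r in new_records]
    fraud_docs = [tokenize(r) for r in fraud_records]
    idf = idf_weights(new_docs + fraud_docs)
    fraud_vecs = [tfidf(doc, idf) for doc in fraud_docs]
    results = []
    for record, doc in zip(new_records, new_docs):
        vec = tfidf(doc, idf)
        best = max((cosine(vec, fv) for fv in fraud_vecs), default=0.0)
        results.append((record, best, best >= threshold))
    return results
```

Calling `flag_transactions(new_records, fraud_records)` with each transaction flattened into one string (vendor, payment mode, date, invoice number, and so on) yields a similarity score per transaction. In practice the threshold would need tuning against labeled data.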