What is fraud detection?

From Wikipedia, the free encyclopedia

Fraud represents a significant problem for governments and businesses, and specialized analysis techniques are required to discover it. These methods include knowledge discovery in databases (KDD), data mining, machine learning and statistics. They offer applicable and successful solutions in different areas of electronic fraud crimes.[1]


In general, the primary reason to use data analytics techniques is to tackle fraud, since many internal control systems have serious weaknesses. For example, the currently prevailing approach employed by many law enforcement agencies to detect companies involved in potential cases of fraud consists of receiving circumstantial evidence or complaints from whistleblowers.[2] As a result, a large number of fraud cases remain undetected and unprosecuted. In order to effectively test, detect, validate, correct errors in, and monitor control systems against fraudulent activities, business entities and organizations rely on specialized data analytics techniques such as data mining, data matching, the sounds-like function, regression analysis, clustering analysis, and gap analysis.[3] Techniques used for fraud detection fall into two primary classes: statistical techniques and artificial intelligence.[4]

Statistical techniques

Examples of statistical data analysis techniques are:

  • Data preprocessing techniques for detection, validation, error correction, and filling in of missing or incorrect data.
  • Calculation of various statistical parameters such as averages, quantiles, performance metrics, probability distributions, and so on. For example, the averages may include average length of call, average number of calls per month and average delays in bill payment.
  • Models and probability distributions of various business activities, expressed either in terms of various parameters or as probability distributions.
  • Computing user profiles.
  • Time-series analysis of time-dependent data.[5]
  • Clustering and classification to find patterns and associations among groups of data.[5]
  • Data matching is used to compare two sets of collected data. The process can be performed based on algorithms or programmed loops that try to match sets of data against each other or compare complex data types. Data matching is used to remove duplicate records and identify links between two data sets for marketing, security or other uses.[3]
  • The sounds-like function is used to find values that sound similar. Phonetic similarity is one way to locate possible duplicate values or inconsistent spelling in manually entered data. The ‘sounds like’ function converts the comparison strings to four-character American Soundex codes, which are based on the first letter, and the first three consonants after the first letter, in each string.[3]
  • Regression analysis examines the relationship between two or more variables of interest. It estimates relationships between independent variables and a dependent variable. This method can be used to help understand and identify relationships among variables and predict actual results.[3]
  • Gap analysis is used to determine whether business requirements are being met and, if not, what steps should be taken to meet them.
  • Matching algorithms to detect anomalies in the behavior of transactions or users as compared to previously known models and profiles. Techniques are also needed to eliminate false alarms, estimate risks, and predict the future behavior of current transactions or users.
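The sounds-like function in the list above can be illustrated with a short sketch of standard American Soundex. This is an illustration under stated assumptions rather than the implementation of any particular audit tool: the function names `soundex` and `group_by_sound` are hypothetical, and the standard algorithm is slightly more detailed than "first letter plus three consonants", because consonants with the same code that sit next to each other (or are separated only by h or w) are counted once.

```python
from collections import defaultdict

# Standard American Soundex digit groups; vowels, h, w and y carry no digit.
_CODES = {}
for _digit, _letters in enumerate(("bfpv", "cgjkqsxz", "dt", "l", "mn", "r"), start=1):
    for _ch in _letters:
        _CODES[_ch] = str(_digit)


def soundex(name: str) -> str:
    """Return the four-character American Soundex code of `name`."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        if ch not in "hw":  # h and w do not separate two equal codes
            prev = digit
    return (code + "000")[:4]  # pad with zeros or truncate to four characters


def group_by_sound(names):
    """Hypothetical helper: group names that share a Soundex code."""
    groups = defaultdict(list)
    for name in names:
        groups[soundex(name)].append(name)
    return {code: found for code, found in groups.items() if len(found) > 1}
```

Calling `group_by_sound(["Smith", "Smyth", "Jones"])` groups the first two names under one code, which is how inconsistent spellings in manually entered data could be surfaced for review.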

Some forensic accountants specialize in forensic analytics, which is the procurement and analysis of electronic data to reconstruct, detect, or otherwise support a claim of financial fraud. The main steps in forensic analytics are data collection, data preparation, data analysis, and reporting. For example, forensic analytics may be used to review an employee's purchasing card activity to assess whether any of the purchases were diverted or divertible for personal use.

Artificial intelligence

Fraud detection is a knowledge-intensive activity. The main AI techniques used for fraud detection include:

  • Data mining to classify, cluster, and segment the data and automatically find associations and rules in the data that may signify interesting patterns, including those related to fraud.
  • Expert systems to encode expertise for detecting fraud in the form of rules.
  • Pattern recognition to detect approximate classes, clusters, or patterns of suspicious behavior either automatically (unsupervised) or to match given inputs.
  • Machine learning techniques to automatically identify characteristics of fraud.
  • Neural nets to independently generate classification, clustering, generalization, and forecasting that can then be compared against conclusions raised in internal audits or formal financial documents such as 10-Q.[5]

Other techniques such as link analysis, Bayesian networks, decision theory, and sequence matching are also used for fraud detection.[4] A newer technique called the System properties approach has also been employed wherever rank data is available.[6]

Statistical analysis of research data is the most comprehensive method for determining if data fraud exists. Data fraud as defined by the Office of Research Integrity (ORI) includes fabrication, falsification and plagiarism.

Machine learning and data mining

Early data analysis techniques were oriented toward extracting quantitative and statistical data characteristics. These techniques facilitate useful data interpretations and can help to get better insights into the processes behind the data. Although the traditional data analysis techniques can indirectly lead us to knowledge, it is still created by human analysts.[7]

To go beyond this, a data analysis system has to be equipped with a substantial amount of background knowledge, and be able to perform reasoning tasks involving that knowledge and the data provided.[7] In an effort to meet this goal, researchers have turned to ideas from the machine learning field. This is a natural source of ideas, since the machine learning task can be described as turning background knowledge and examples (input) into knowledge (output).

If data mining results in discovering meaningful patterns, data turns into information. Information or patterns that are novel, valid and potentially useful are not merely information, but knowledge. One speaks of discovering knowledge that was previously hidden in the huge amount of data but is now revealed.

The machine learning and artificial intelligence solutions may be classified into two categories: 'supervised' and 'unsupervised' learning. These methods search for accounts, customers, suppliers, etc. that behave 'unusually' in order to output suspicion scores, rules or visual anomalies, depending on the method.[8]


Whether supervised or unsupervised methods are used, note that the output gives us only an indication of fraud likelihood. No standalone statistical analysis can assure that a particular object is fraudulent, but such analyses can identify fraudulent objects with very high degrees of accuracy. As a result, effective collaboration between machine learning models and human analysts is vital to the success of fraud detection applications.[9]

Supervised learning

In supervised learning, a random sub-sample of all records is taken and manually classified as either 'fraudulent' or 'non-fraudulent' (the task can be decomposed into more classes to meet algorithm requirements). Relatively rare events such as fraud may need to be oversampled to get a big enough sample size.[10] These manually classified records are then used to train a supervised machine learning algorithm. After building a model using this training data, the algorithm should be able to classify new records as either fraudulent or non-fraudulent.
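As a rough illustration of the oversampling step, the sketch below duplicates randomly chosen fraud records until they make up a chosen share of the training set. It shows plain random oversampling only, which is one of several possible strategies; the function name `oversample`, the `target_ratio` parameter and the 0/1 label encoding are assumptions made for this example.

```python
import random


def oversample(records, labels, target_ratio=0.5, seed=0):
    """Duplicate random fraud records (label 1) until they form roughly
    `target_ratio` of the returned training set. Illustrative only."""
    if not 0 < target_ratio < 1:
        raise ValueError("target_ratio must be strictly between 0 and 1")
    rng = random.Random(seed)
    fraud = [r for r, y in zip(records, labels) if y == 1]
    legit = [r for r, y in zip(records, labels) if y == 0]
    if not fraud:
        raise ValueError("no fraudulent records to oversample")

    # Number of fraud rows f such that f / (f + len(legit)) == target_ratio.
    needed = round(target_ratio * len(legit) / (1 - target_ratio))
    extra = [rng.choice(fraud) for _ in range(max(0, needed - len(fraud)))]

    data = [(r, 0) for r in legit] + [(r, 1) for r in fraud + extra]
    rng.shuffle(data)
    return [r for r, _ in data], [y for _, y in data]
```

Oversampling would normally be applied to the training portion only, so that evaluation still reflects the true rarity of fraud.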

Supervised neural networks, fuzzy neural nets, and combinations of neural nets and rules, have been extensively explored and used for detecting fraud in mobile phone networks and financial statement fraud.[11][12]

Bayesian learning neural networks have been implemented for credit card fraud detection, telecommunications fraud, auto claim fraud detection, and medical insurance fraud.[13]

Hybrid knowledge/statistical-based systems, where expert knowledge is integrated with statistical power, use a series of data mining techniques for the purpose of detecting cellular clone fraud. Specifically, a rule-learning program to uncover indicators of fraudulent behaviour from a large database of customer transactions is implemented.[14]

Cahill et al. (2000) design a fraud signature, based on data of fraudulent calls, to detect telecommunications fraud. To score a call for fraud, its probability under the account signature is compared to its probability under a fraud signature. The fraud signature is updated sequentially, enabling event-driven fraud detection.
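The comparison of the two probabilities can be sketched as a log-likelihood ratio. The code below is a simplified illustration, not the procedure of Cahill et al.: it assumes each signature is a dictionary mapping a discretized call type (a hypothetical feature such as "intl_night") to a probability, and it assumes an exponentially weighted rule for the sequential update. The names `score_call` and `update_signature` and the parameters `floor` and `weight` are chosen for this example.

```python
import math


def score_call(call_type, account_sig, fraud_sig, floor=1e-6):
    """Log-likelihood ratio of one call. Positive values mean the call
    looks more like the fraud signature than like the account's history."""
    p_fraud = max(fraud_sig.get(call_type, 0.0), floor)
    p_account = max(account_sig.get(call_type, 0.0), floor)
    return math.log(p_fraud / p_account)


def update_signature(sig, call_type, weight=0.1):
    """Assumed sequential update: shift probability mass toward the call
    type just observed, keeping the probabilities summing to one."""
    keys = set(sig) | {call_type}
    return {
        k: (1 - weight) * sig.get(k, 0.0) + weight * (1.0 if k == call_type else 0.0)
        for k in keys
    }
```

Because each call is scored and the signature updated as the call arrives, a sketch like this fits the event-driven setting described above; the signature values themselves would have to be estimated from real call data.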

Link analysis takes a different approach. It relates known fraudsters to other individuals, using record linkage and social network methods.[15][16]
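A minimal sketch of the idea, assuming that two individuals are linked whenever they share an identifier such as a phone number or address: a breadth-first search from the known fraudsters then reports how many links away each other person is. The function name `linked_to_fraudsters`, the record layout and the `max_hops` parameter are assumptions for illustration; real record linkage also has to handle inexact matches.

```python
from collections import defaultdict, deque


def linked_to_fraudsters(records, known_fraudsters, max_hops=2):
    """records maps each person to a set of identifiers (phone, address, ...).
    Returns {person: hops} for people reachable from a known fraudster."""
    # Index: identifier -> everyone who uses it.
    by_identifier = defaultdict(set)
    for person, identifiers in records.items():
        for ident in identifiers:
            by_identifier[ident].add(person)

    hops = {p: 0 for p in known_fraudsters if p in records}
    queue = deque(hops)
    while queue:
        person = queue.popleft()
        if hops[person] == max_hops:
            continue  # do not expand beyond the hop limit
        for ident in records[person]:
            for other in by_identifier[ident]:
                if other not in hops:
                    hops[other] = hops[person] + 1
                    queue.append(other)
    # Drop the seed fraudsters themselves from the result.
    return {p: h for p, h in hops.items() if h > 0}
```

The hop count could feed a suspicion score, with directly linked individuals reviewed first.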

This type of detection is only able to detect frauds similar to those which have occurred previously and been classified by a human. Detecting a novel type of fraud may require the use of an unsupervised machine learning algorithm.

Unsupervised learning

In contrast, unsupervised methods don't make use of labelled records.


Bolton and Hand use Peer Group Analysis and Break Point Analysis applied to spending behaviour in credit card accounts.[17] Peer Group Analysis detects individual objects that begin to behave in a way different from objects to which they had previously been similar. Another tool Bolton and Hand develop for behavioural fraud detection is Break Point Analysis.[17] Unlike Peer Group Analysis, Break Point Analysis operates on the account level. A break point is an observation where anomalous behaviour for a particular account is detected.
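The account-level idea behind Break Point Analysis can be sketched by comparing an account's most recent transactions with its own earlier ones. The code below is a simplified illustration and does not reproduce Bolton and Hand's procedure: the Welch-style t statistic, the window size `recent` and the function name `break_point_score` are assumptions made for this example.

```python
import math
from statistics import mean, variance


def break_point_score(amounts, recent=5):
    """t-like statistic comparing the last `recent` transaction amounts
    of one account with that account's earlier amounts."""
    old, new = amounts[:-recent], amounts[-recent:]
    if len(old) < 2 or len(new) < 2:
        raise ValueError("need at least two transactions in each window")
    diff = mean(new) - mean(old)
    std_err = math.sqrt(variance(old) / len(old) + variance(new) / len(new))
    if std_err == 0:
        # Both windows are constant: any difference is an extreme change.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / std_err
```

An account whose score exceeds some threshold (the value would be a tuning choice, not something given in the source) could be passed to a human analyst, since no labelled fraud cases are needed to compute it.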

A combination of unsupervised and supervised methods for credit card fraud detection is presented in Carcillo et al. (2019).[18]

Available datasets

A major limitation for the validation of existing fraud detection methods is the lack of public datasets.[19] One of the few examples is the Credit Card Fraud Detection dataset[20] made available by the ULB Machine Learning Group.[21]

References

  1. ^ Chuprina, Roman (13 April 2020). "The In-depth 2020 Guide to E-commerce Fraud Detection". Retrieved 2020-05-24.
  2. ^ Velasco, Rafael B.; Carpanese, Igor; Interian, Ruben; Paulo Neto, Octávio C. G.; Ribeiro, Celso C. (2020-05-28). "A decision support system for fraud detection in public procurement". International Transactions in Operational Research. 28: 27–47. doi:10.1111/itor.12811. ISSN 0969-6016.
  3. ^ a b c d Bolton, R. and Hand, D. (2002). Statistical fraud detection: A review. Statistical Science 17 (3), pp. 235-255
  4. ^ a b G. K. Palshikar, The Hidden Truth – Frauds and Their Control: A Critical Application for Business Intelligence, Intelligent Enterprise, vol. 5, no. 9, 28 May 2002, pp. 46–51.
  5. ^ a b c Al-Khatib, Adnan M. (2012). "Electronic Payment Fraud Detection Techniques". World of Computer Science and Information Technology Journal. 2. S2CID 214778396.
  6. ^ Vani, G. K. (February 2018). "How to detect data collection fraud using System properties approach". Multilogic in Science. VII (SPECIAL ISSUE ICAAASTSD-2018). ISSN 2277-7601. Retrieved February 2, 2019.
  7. ^ a b Michalski, R. S., I. Bratko, and M. Kubat (1998). Machine Learning and Data Mining – Methods and Applications. John Wiley & Sons Ltd.
  8. ^ Bolton, R. & Hand, D. (2002). Statistical Fraud Detection: A Review (With Discussion). Statistical Science 17(3): 235–255.
  9. ^ Tax, N. & de Vries, K.J. & de Jong, M. & Dosoula, N. & van den Akker, B. & Smith, J. & Thuong, O. & Bernardi, L. Machine Learning for Fraud Detection in E-Commerce: A Research Agenda. Proceedings of the KDD International Workshop on Deployable Machine Learning for Security Defense (MLHat). Springer, Cham, 2021.
  10. ^ Dal Pozzolo, A. & Caelen, O. & Le Borgne, Y. & Waterschoot, S. & Bontempi, G. (2014). Learned lessons in credit card fraud detection from a practitioner perspective. Expert systems with applications 41: 10 4915–4928.
  11. ^ Green, B. & Choi, J. (1997). Assessing the Risk of Management Fraud through Neural Network Technology. Auditing 16(1): 14–28.
  12. ^ Estevez, P., C. Held, and C. Perez (2006). Subscription fraud prevention in telecommunications using fuzzy rules and neural networks. Expert Systems with Applications 31, 337–344.
  13. ^ Bhowmik, Rekha. "35 Data Mining Techniques in Fraud Detection". Journal of Digital Forensics, Security and Law. University of Texas at Dallas.
  14. ^ Fawcett, T. (1997). AI Approaches to Fraud Detection and Risk Management: Papers from the 1997 AAAI Workshop. Technical Report WS-97-07. AAAI Press.
  15. ^ Phua, C.; Lee, V.; Smith-Miles, K.; Gayler, R. (2005). "A Comprehensive Survey of Data Mining-based Fraud Detection Research". arXiv:1009.6119. doi:10.1016/j.chb.2012.01.002. S2CID 50458504.
  16. ^ Cortes, C. & Pregibon, D. (2001). Signature-Based Methods for Data Streams. Data Mining and Knowledge Discovery 5: 167–182.
  17. ^ a b Bolton, R. & Hand, D. (2001). Unsupervised Profiling Methods for Fraud Detection. Credit Scoring and Credit Control VII.
  18. ^ Carcillo, Fabrizio; Le Borgne, Yann-Aël; Caelen, Olivier; Kessaci, Yacine; Oblé, Frédéric; Bontempi, Gianluca (16 May 2019). "Combining unsupervised and supervised learning in credit card fraud detection". Information Sciences. 557: 317–331. doi:10.1016/j.ins.2019.05.042. ISSN 0020-0255. S2CID 181839660.
  19. ^ Le Borgne, Yann-Aël; Bontempi, Gianluca (2021). "Machine Learning for Credit Card Fraud Detection - Practical Handbook". Retrieved 26 April 2021.
  20. ^ "Credit Card Fraud Detection".
  21. ^ "ULB Machine Learning Group".