Credit Cards Frauds and Cybersecurity Threats Machine Learning Detection Algorithms As Countermeasures
Volume 6 Issue 7, November-December 2022 Available Online: www.ijtsrd.com e-ISSN: 2456 – 6470
1. INTRODUCTION:
Cybersecurity is becoming increasingly significant in our daily lives. When addressing the security of digital life, the main issue is identifying anomalous behavior. When making purchases from online e-commerce stores or conducting business online, many people prefer to use credit cards or debit cards. Thanks to credit card credit limits, purchases can be made even when the cash is not on hand.
On the other hand, scammers and online attackers abuse these features. To solve this problem, there is a need for a system that can abort a transaction if it finds suspicious or anomalous patterns anywhere in the financial transaction. Such a system must monitor the patterns of all transactions and stop or terminate any transaction whose pattern is abnormal.
Credit card information should always be kept private and confidential, and its privacy should not be compromised. Phishing websites, stolen or lost credit cards, fake credit cards, the theft of card information and intercepted cards are some examples of the ways credit card information is stolen (Anderson, 2007). These exposures should be guarded against for security reasons. Online fraud simply requires the card information and takes place
@ IJTSRD | Unique Paper ID – IJTSRD52440 | Volume – 6 | Issue – 7 | November-December 2022 Page 940
International Journal of Trend in Scientific Research and Development @ www.ijtsrd.com eISSN: 2456-6470
remotely. At the time of purchase, neither a manual signature nor a PIN or card imprint is necessary. The legitimate cardholder is typically unaware that someone else has seen or stolen his or her card information.
The easiest way to spot this kind of fraud is to examine each card's purchasing habits and look for any deviations from the "normal" spending habits. The best strategy for lowering the number of successful credit card frauds is to detect fraud by examining the cardholder's existing purchase data. However, the outcomes are not made public and the data sets are often unavailable. Logged data and user behavior are two data types that can be used to identify fraud incidents. Currently, a variety of techniques, including data mining, statistics, and artificial intelligence, are used to detect fraud.
There are now many supervised machine learning approaches based on Artificial Intelligence (AI) that can classify odd or anomalous transactions. The only prerequisites are historical data and an algorithm that fits the data closely.
In this paper, decision tree and random forest machine learning (ML) algorithms are used to identify unusual and unexpected patterns in credit card transactions, in order to detect frauds and potential cyber risks. The performance of the two ML algorithms is evaluated experimentally using several performance evaluation criteria: accuracy, precision, recall and F1 score.
1.1. How Credit Card Fraud works
Credit cards are very important for making purchases online, especially when funds are not readily available. They are one of the most popular financial tools for online purchases and payments for commodities such as electronic gadgets (for example TVs), computer hardware and software, website domain registration and hosting, mobile phones, fuel, groceries, air ticket booking, books, hotels and other items. Credit cards are most valuable when used for a variety of purchases because they offer numerous rewards in the form of points. Making payments with credit cards is also easy and seamless. The challenge of using credit cards for online payments arises from the activities of hackers and scammers.
According to ProjectPro (2022), today's credit card fraud falls into a number of different categories:
Lost or stolen cards: Credit cards used for online shopping are stolen and used fraudulently in the owner's name. The process of canceling stolen credit cards and reissuing them is difficult for both customers and credit card providers. Several banking organizations limit the use of a credit card until it is certain that the card's legitimate owner has received it.
Card Abuse: The customer uses a credit card to make purchases but has no intention of repaying the money that the bank has charged for those purchases. When the due date for payment approaches, some customers stop returning calls. They occasionally even file for bankruptcy; every year, this kind of scam costs banks millions in losses.
Identity Theft: Customers submit false information while applying for credit cards, and they may even steal the personal information of a real client to do so. Even card blocking cannot prevent the credit card from getting into the wrong hands in such circumstances.
Merchant Abuse: Some online merchants display fictitious unlawful transactions to facilitate money laundering. The legal information of legitimate credit card customers is stolen to create counterfeit cards that are used for these illegal transactions.
Fig.1 shows a comprehensive list of credit card frauds that can compromise the cybersecurity of credit
cards and financial systems.
The dataset used for the experiments contains credit card transactions that occurred over two days, with 492 frauds out of 284,807 transactions. It was downloaded from Kaggle at https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud.
Fig. 2 shows the methodology adopted in the experiments.
The system architecture of the proposed machine learning method is shown in Fig.3.
[Flowchart; first step: import the credit card historical dataset into the Python testbed]
Fig. 2: Credit card fraud detection methodology using Machine Learning Algorithms
Algorithm steps for finding the best algorithm for credit card fraud detection:
Step 1: Import the credit card dataset into a Pandas data frame
Step 2: Convert the data into a data frame format suitable for machine learning modeling
Step 3: Do random sampling
Step 4: Split the dataset into a training set (70%) and a testing set (30%)
Step 5: Fit each machine learning algorithm to the training dataset
Step 6: Create the prediction models from the algorithms fitted to the training dataset
Step 7: Make credit card fraud predictions on the test dataset with each algorithm
Step 8: Compute the performance metrics for each machine learning algorithm
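The steps above can be sketched with scikit-learn as follows. This is a minimal sketch rather than the authors' code: synthetic imbalanced data from `make_classification` stands in for the Kaggle CSV so that the block runs on its own, and the sample size, class weights, `random_state` values and `n_estimators` setting are all assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Steps 1-3 (stand-in): synthetic, imbalanced data replaces the real dataset.
# Class 1 plays the role of "fraud"; the 97/3 class weights are an assumption.
X, y = make_classification(
    n_samples=5000, n_features=10, n_informative=5,
    weights=[0.97, 0.03], random_state=0,
)

# Step 4: 70% training set, 30% testing set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0
)

models = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)      # Steps 5-6: fit and obtain the prediction model
    y_pred = model.predict(X_test)   # Step 7: predict on the test set
    results[name] = {                # Step 8: performance metrics
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred, zero_division=0),
        "f1": f1_score(y_test, y_pred, zero_division=0),
    }

for name, scores in results.items():
    print(name, {metric: round(value, 5) for metric, value in scores.items()})
```

With the real dataset, the synthetic data would be replaced by a Pandas data frame loaded from the downloaded CSV file. The scores printed here come from synthetic data and are not comparable to the results reported in this paper.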
Random Forest ML algorithm:
Random forest, one of the most popular algorithms, is a supervised machine learning algorithm. It builds a “forest” from an ensemble of “decision trees”, which are normally trained using the “bagging” technique. The basic principle of bagging is that combining different learning models improves the prediction outcome. To get a more precise and reliable prediction, random forest builds several decision trees and merges their outputs. Fig. 4 shows the Random Forest algorithm’s anomaly detection technique using the majority class rule.
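The two ideas in this description, bagging and the majority class rule, can be illustrated with a small standard-library sketch. The function names `bootstrap_sample` and `majority_vote`, the seed, and the example votes are hypothetical and chosen only for illustration; this is not a full random forest.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement, as bagging does for each tree."""
    return rng.choices(rows, k=len(rows))


def majority_vote(tree_predictions):
    """Return the class label predicted by the largest number of trees."""
    label, _count = Counter(tree_predictions).most_common(1)[0]
    return label


rng = random.Random(0)
rows = list(range(10))                 # stand-in for ten training transactions
sample = bootstrap_sample(rows, rng)   # one tree's training sample
print(len(sample), set(sample) <= set(rows))

# Hypothetical votes from five trees for one transaction (1 = fraud, 0 = legitimate).
votes = [1, 0, 1, 1, 0]
print(majority_vote(votes))  # 1
```

An odd number of trees avoids ties in a two-class vote. Library implementations may combine trees differently, for example by averaging predicted class probabilities.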
2.1. Exploratory Data Analysis
Exploratory data analysis (EDA) was carried out in Python 3 using Jupyter Notebook and several statistical data analysis tools. Figs. 6 and 7 show the credit card data analysis carried out in a Pandas dataframe.
Fig.8: Data Analysis showing percentage of credit card incidences in the dataset
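The class percentages that a figure of this kind presumably summarizes follow directly from the counts stated for the dataset (492 frauds out of 284,807 transactions). A minimal sketch of that arithmetic, with variable names chosen here for illustration:

```python
# Counts as stated for the dataset in this paper.
total_transactions = 284_807
fraud_transactions = 492

legit_transactions = total_transactions - fraud_transactions
fraud_pct = 100 * fraud_transactions / total_transactions
legit_pct = 100 * legit_transactions / total_transactions

print(f"Fraudulent: {fraud_transactions} ({fraud_pct:.3f}%)")   # 0.173%
print(f"Legitimate: {legit_transactions} ({legit_pct:.3f}%)")   # 99.827%
```

Fraud makes up well under 1% of the records, which is why accuracy alone says little about how well fraud is detected and why precision, recall and F1-score are also reported.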
Fig. 9 shows the printout of the dataset split used for the experiments. The dataset was divided into two sets: the training set was allocated 70% and the testing and evaluation set was allocated 30%.
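As a rough illustration of a 70/30 split, the sketch below shuffles row indices and holds out 30% of them for testing. The function name `split_indices`, the seed, and the choice to round the test share up are assumptions made for this example, not details taken from the paper.

```python
import math
import random


def split_indices(n_samples, test_fraction=0.30, seed=0):
    """Shuffle row indices and return (train_indices, test_indices)."""
    n_test = math.ceil(test_fraction * n_samples)  # assumption: round the test share up
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return indices[n_test:], indices[:n_test]


# 284,807 is the number of transactions stated for the dataset.
train_idx, test_idx = split_indices(284_807)
print(len(train_idx), len(test_idx))  # 199364 85443
```

Every row lands in exactly one of the two sets, so the test set never overlaps the data the models were trained on.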
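Accuracy:
Accuracy is the proportion of all transactions that are classified correctly.
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (1)$$
where TP, TN, FP and FN denote True Positives, True Negatives, False Positives and False Negatives respectively, as used in binary classification machine learning tasks.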
Precision:
Precision is the proportion of instances classified as positive (fraudulent) that actually are positive.
$$\text{Precision} = \frac{TP}{TP + FP} \qquad (2)$$
Recall:
Recall measures the proportion of actual positive (fraudulent) instances that are correctly predicted as positive. Unlike precision, which only considers how many of the positive predictions are correct, recall indicates how many positive instances were missed. It is computed as the number of true positives divided by the sum of true positives and false negatives.
$$\text{Recall} = \frac{TP}{TP + FN} \qquad (3)$$
F1-Score:
The F1 score is the harmonic mean of Precision and Recall. It therefore takes both false positives and false negatives into account.
$$\text{F1-Score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \qquad (4)$$
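The four metrics can be computed directly from confusion-matrix counts, as in the sketch below. The counts are hypothetical: the paper does not report a confusion matrix, and these values were chosen only because they reproduce the Decision Tree scores in Table 1 to five decimal places. The function name `classification_metrics` is likewise an assumption.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall and F1-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts (not reported in the paper), picked to be consistent
# with the Decision Tree row values in Table 1.
scores = classification_metrics(tp=112, tn=85_273, fp=34, fn=24)
for name, value in scores.items():
    print(f"{name}: {value:.5f}")
# accuracy: 0.99932
# precision: 0.76712
# recall: 0.82353
# f1: 0.79433
```

With so few fraudulent cases, the true negatives dominate the accuracy figure, while precision, recall and F1-score depend only on the much smaller TP, FP and FN counts.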
Fig. 10: Modeling prediction results for credit card frauds by Decision Tree and Random Forest Algorithms
Fig. 10 shows that the Random Forest algorithm outperformed the Decision Tree algorithm on accuracy, scoring 99.96% against 99.93% for Decision Tree.
We also carried out another experiment to evaluate the performance of the Decision Tree and Random Forest algorithms comprehensively, using standard performance evaluation metrics: Accuracy, Precision, Recall and F1-score. Fig. 11 and Table 1 show the performance evaluation results obtained for the Decision Tree algorithm. Decision Tree performed well in credit card fraud pattern detection, with an Accuracy of 0.99932 (99.93%), a Precision of 0.76712, a Recall of 0.82353 and an F1-score of 0.79433.
Fig. 11: Modeling prediction performance evaluation for Decision Tree Algorithm
Table 1: Decision Tree credit card fraud detection results
Performance Metric Score
Accuracy 0.99932
Precision 0.76712
Recall 0.82353
F1-score 0.79433
Fig. 12 and Table 2 show the performance evaluation results obtained for the Random Forest algorithm. Random Forest performed very well in credit card fraud pattern detection, and better overall than the Decision Tree algorithm, with an Accuracy of 0.99961 (99.96%), a Precision of 0.94783, a Recall of 0.80147 and an F1-score of 0.86853.
Fig. 12: Modeling prediction performance evaluation for Random Forest Algorithm
Table 2: Random Forest credit card fraud detection results
Performance Metric Score
Accuracy 0.99961
Precision 0.94783
Recall 0.80147
F1-score 0.86853
Table 3 compares the prediction performance of the Decision Tree and Random Forest algorithms. Random Forest outperformed Decision Tree in credit card fraud detection on Accuracy, Precision and F1-score, although its Recall was slightly lower (0.80147 against 0.82353).
Table 3: Comparison of credit card frauds prediction performances between Decision Tree and Random Forest algorithms
Performance Metric    Score      ML Algorithm
Accuracy              0.99932    Decision Tree
Accuracy              0.99961    Random Forest
Precision             0.76712    Decision Tree
Precision             0.94783    Random Forest
Recall                0.82353    Decision Tree
Recall                0.80147    Random Forest
F1-score              0.79433    Decision Tree
F1-score              0.86853    Random Forest