A Comparative Study of Random Forest and XGBoost for Detecting Credit Card Fraud Transactions using Big Data | Mohamed Riham - CRP Final Presentation.pptx

MohamedRiham4 · 21 slides · Oct 30, 2025

About This Presentation

The presentation "Mohamed Riham - CRP Final Presentation.pptx" details "A Comparative Study of Random Forest and XGBoost for Detecting Credit Card Fraud Transactions using Big Data", a project aimed at finding the most efficient detection mechanism for credit card fraud. ...


Slide Content

A Comparative Study of Random Forest and XGBoost for Detecting Credit Card Fraud Transactions using Big Data. Learner Name: M.A. Mohamed Riham. Student ID: 1028401. HND in Computing, Batch 21. Assessor: A.L.F. Sajeetha. Supervisor: Mr. Mohamed Nismy.

Abstract (CRP, 4/21/2025). Credit card fraud is a critical issue requiring efficient detection mechanisms. Random Forest and XGBoost were compared using over 1 million transactions, with SMOTE applied to handle class imbalance. XGBoost outperformed Random Forest in all metrics. Keywords: Credit Card Fraud, Random Forest, XGBoost, SMOTE, Big Data.
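The deck does not show how SMOTE was applied (the study's tooling suggests imbalanced-learn), but the core idea is simple: synthesize new minority-class points by interpolating between a minority sample and one of its minority-class neighbors. A minimal sketch of that interpolation step, with hypothetical two-feature transactions:

```python
import random

def smote_sample(x, neighbor, rng=random):
    """Create one synthetic minority sample by interpolating between a
    minority point and a minority-class neighbor (the core SMOTE idea)."""
    lam = rng.random()  # interpolation factor in [0, 1)
    return [xi + lam * (ni - xi) for xi, ni in zip(x, neighbor)]

# Hypothetical minority-class transactions: (amount, merchant-category code)
a = [120.0, 3.0]
b = [140.0, 5.0]
synthetic = smote_sample(a, b)  # lies on the segment between a and b
```

A full SMOTE implementation would first find the k nearest minority neighbors of each minority point and repeat this step until the classes are balanced.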


Research Objectives: Evaluate Random Forest and XGBoost for fraud detection. Compare performance on accuracy, precision, recall, F1-score, and AUC-ROC. Identify the best-suited algorithm for real-world use.


Research Question: How do Random Forest and XGBoost compare in detecting fraudulent credit card transactions? Dependent variables: fraud detection accuracy, precision, recall, F1-score, and AUC-ROC. Independent variables: transaction features (amount, location, merchant category).


Methodology. Models: Random Forest, XGBoost. Tools: Python, Scikit-learn, TensorFlow, Pandas, NumPy, Matplotlib. Evaluation metrics: Accuracy, Precision, Recall, F1-score, AUC-ROC.
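The slides list the evaluation metrics but not their definitions. All four threshold metrics derive from the confusion-matrix counts; a self-contained sketch (the counts below are hypothetical, chosen only to illustrate the formulas):

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)          # of flagged transactions, how many were fraud
    recall = tp / (tp + fn)             # of actual fraud, how much was caught
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Hypothetical counts for a small imbalanced test set:
acc, prec, rec, f1 = classification_metrics(tp=90, fp=10, fn=10, tn=890)
```

On imbalanced fraud data, precision and recall matter more than accuracy: a model that flags nothing can still score high accuracy, which is why the study also reports AUC-ROC.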


Model Explanation. Random Forest: an ensemble of decision trees; reduces overfitting via random sampling of data and features. XGBoost: gradient boosting; trees are built sequentially, each correcting the errors of the previous ones.
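The "sequentially corrects previous errors" idea behind XGBoost can be shown in miniature. The toy sketch below (not the study's implementation) boosts one-split regression stumps under squared loss on a hypothetical 1-D dataset: each round fits a stump to the residuals left by the ensemble so far.

```python
def fit_stump(xs, ys):
    """Best single-split regression stump (squared loss) on 1-D data."""
    best = None
    for t in xs:  # try each data point as a split threshold
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((y - (lm if x <= t else rm)) ** 2 for x, y in zip(xs, ys))
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def boost(xs, ys, rounds=3, lr=1.0):
    """Gradient boosting for squared loss: each new stump is fit to the
    residuals (errors) of the current ensemble's predictions."""
    preds = [0.0] * len(xs)
    for _ in range(rounds):
        resid = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, resid)
        preds = [p + lr * stump(x) for p, x in zip(preds, xs)]
    return preds

preds = boost([1, 2, 3, 4], [1, 1, 3, 3])
```

Real XGBoost adds regularization, shrinkage, and second-order gradients on top of this residual-fitting loop, and Random Forest differs precisely in that its trees are grown independently rather than sequentially.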

Data Visualization (chart slides; the figures were not preserved in the text export).

Model Performance Metrics

Metric     | Random Forest | XGBoost
Accuracy   | 99.63%        | 99.74%
Precision  | 99.66%        | 99.71%
Recall     | 99.63%        | 99.74%
F1-Score   | 99.65%        | 99.72%
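Since F1 is the harmonic mean of precision and recall, the table's F1 column can be cross-checked from its other rows. Recomputing the XGBoost row from the reported (rounded) percentages:

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Reported XGBoost precision and recall from the table (percent):
f1_xgb = f1_score(99.71, 99.74)  # consistent with the reported 99.72%
```

The small residual differences (e.g. in the Random Forest row) are expected, since the table's inputs are themselves rounded to two decimals.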


Hypothesis Testing. Null hypothesis (H0): no performance difference between the models. Alternative hypothesis (H1): XGBoost outperforms Random Forest. P-value: 0.0003 → reject H0; XGBoost shows significantly better performance.
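The deck reports p = 0.0003 without naming the test used. For two classifiers evaluated on the same test set, one common choice is McNemar's exact test on the discordant predictions; a sketch under that assumption, with hypothetical discordant counts (not the study's data):

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar test on discordant pairs:
    b = cases only model A classified correctly,
    c = cases only model B classified correctly.
    Under H0, min(b, c) ~ one tail of Binomial(b + c, 0.5)."""
    n, k = b + c, min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical counts: Random Forest alone right on 12 cases,
# XGBoost alone right on 38 cases.
p = mcnemar_exact(b=12, c=38)  # small p → reject H0
```

Whatever the actual test, the logic is the same: a p-value of 0.0003 is far below the usual 0.05 threshold, so the observed gap is unlikely under H0.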


Findings. XGBoost consistently outperformed Random Forest, with particularly strong precision and recall, making it well suited to fraud detection on imbalanced datasets.


Limitations and Future Work. No real-time implementation was tested. Future work: incorporate customer behavior features, test in real-time settings, and explore hybrid/ensemble models.

Thank you