Credit card fraud detection through machine learning

17,936 views 20 slides Aug 19, 2021


Credit Card Fraud Detection using Machine Learning
Data Alcott Systems
Ph/ Whatsapp 9600095046
Check demo: www.finalsemprojects.com
Data Alcott Systems Contact 9600095046

Abstract
Due to the increase in fraud, which results in loss of money across the globe,
several methodologies and techniques have been developed for detecting fraud.
Fraud detection involves analysing the activities of users in order to understand
the malicious behaviour of users. Malicious behaviour is a broad term including
delinquency, fraud, intrusion, and account defaulting. This paper presents a survey
of current techniques used in credit card fraud detection and evaluates a new hybrid
approach to fraud detection. In the proposed work, we analyze credit card fraud
detection using machine learning algorithms, namely logistic regression and
decision tree. To make the learning process efficient, we used principal component
analysis for feature selection.

Algorithm Used
Principal component analysis: used for feature selection
Decision Tree and Logistic Regression: used for fraud detection

Introduction
With the rise of technology today, the dependency on e-commerce and online
payments has grown exponentially. The credit card provides convenience to users,
but the frauds that accompany these activities cause inconvenience. Credit card
information is confidential; banks and other financial enterprises do not want to
disclose information about their customers. Risk management is critical for
financial enterprises to survive in such a competitive industry. Provisional loss
arises from "bad" accounts: the bank lends money to customers who eventually do
not have the capability to pay it back. In risk management, the chance of false
negatives (false "good" accounts) could still be high. However, by leveraging
account performance data such as credit card utilization and payment information,
risks can be further managed to control provisional loss. In this paper, the focus
is on risk management as well as fraud detection.

Figure: Overview of Machine learning

Existing System
Advanced oversampling methods like SMOTE generate synthetic training instances from
the minority class by interpolation, instead of by sample replication.
The weighting strategy of AdaBoost is equivalent to resampling the data space [6],
which is applicable to most classification systems without changing their learning
methods.
Bahnsen et al. expanded the transaction aggregation strategy and proposed to create
a new set of features based on analyzing the periodic behavior of the time of a
transaction using the von Mises distribution. They used a real credit card fraud
dataset provided by a large European card processing company.
Halvaiee et al. addressed credit card fraud detection using Artificial Immune
Systems (AIS) and introduced a new model called the AIS-based Fraud Detection Model
(AFDM). The authors used an immune-system-inspired algorithm (AIRS) and improved it
for fraud detection.
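The interpolation idea behind SMOTE can be sketched in a few lines. This is a simplified illustration, not the reference SMOTE implementation: the function name `smote_like_samples`, the neighbour count `k`, and the four sample points are all assumptions made for the example.

```python
import random

def smote_like_samples(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each lying on the segment between a
    minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        lam = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + lam * (b - a) for a, b in zip(base, other)])
    return synthetic

# Hypothetical two-feature fraud (minority class) samples.
fraud = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote_like_samples(fraud, n_new=3)
print(new_points)
```

Because each new point is interpolated rather than copied, the synthetic samples differ from the originals while staying inside the region the minority class already occupies.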

Disadvantages
AIS optimized the data and the learning rate, but the accuracy achieved is low.
Different sets of features had an impact on the results.
A drawback of resampling methods such as SMOTE is that undersampling may lose some
potentially useful information, and oversampling may lead to overfitting.
AdaBoost is mostly used for classification, but its extra learning cost is a burden.

Proposed System
Credit card log mining is one of the most significant fields in the
area of data mining. There have been a large number of data mining
algorithms rooted in these fields to perform different data analysis
tasks.
Apply the principal component analysis algorithm to find the best features,
then apply a decision tree classifier for credit card fraud prediction.
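A minimal sketch of the two-step idea (PCA followed by a decision tree), assuming scikit-learn is available. The synthetic data from `make_classification`, the choice of 8 components, the tree depth, and the scaling step are illustrative assumptions, not settings taken from the slides.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a 16-attribute, imbalanced transaction dataset.
X, y = make_classification(
    n_samples=1000, n_features=16, n_informative=6,
    weights=[0.9, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale, project onto 8 principal components, then classify with a tree.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=8),
    DecisionTreeClassifier(max_depth=5, random_state=0),
)
model.fit(X_train, y_train)
predictions = model.predict(X_test)
print("test accuracy:", model.score(X_test, y_test))
```

Wrapping the steps in one pipeline means the PCA projection is fitted on the training data only and then reused unchanged on the test data.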

Advantages
Best accuracy for the study
PCA gets the best features

SYSTEM REQUIREMENTS
Software Requirements
1. Windows XP, Windows 7 (Ultimate, Enterprise)
2. Python 3.6 and related libraries
Hardware Components
1. Processor: i3
2. Hard Disk: 5 GB
3. Memory: 1 GB RAM

SYSTEM ARCHITECTURE

IMPLEMENTATION
The proposed work is implemented in Python 3.6.4 with the libraries
scikit-learn, pandas, matplotlib and other required libraries. We downloaded
the dataset from kaggle.com. The downloaded data contains separate train and
test sets with two classes, 0 and 1. The train dataset is used as the
training set and the test dataset as the test set. The machine learning
algorithms applied are logistic regression and decision tree (DT).
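The train-then-test procedure described here could look roughly like the following with scikit-learn. The Kaggle files are not reproduced in the slides, so a synthetic dataset stands in for them; the sample size, split ratio, and hyperparameters are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the separate train and test sets (classes 0 and 1).
X, y = make_classification(n_samples=1000, n_features=16, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "DecisionTree": DecisionTreeClassifier(random_state=0),
}

# Fit each model on the training set and score it on the held-out test set.
accuracy = {}
for name, clf in models.items():
    clf.fit(X_train, y_train)
    accuracy[name] = accuracy_score(y_test, clf.predict(X_test))
    print(f"{name}: {accuracy[name]:.4f}")
```

The numbers printed for this synthetic data are not comparable to the results reported in the deck; the sketch only shows the workflow.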

DATA PRE-PROCESSING
We have taken multiple attributes in our case study; 16 features/attributes
of the dataset are taken for study. Pre-processing of the dataset converts
the string attributes to numerals, and records with missing data are
dropped. The pre-processed data is stored in the "dataset.csv" file, which is
given as input to the machine learning models.
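A small pandas sketch of these pre-processing steps. The column names and values are hypothetical (the slides do not list the 16 attributes), and the file is written to a temporary directory here only so the example does not overwrite anything.

```python
import os
import tempfile

import pandas as pd

# Hypothetical raw records; the real dataset has 16 attributes.
raw = pd.DataFrame({
    "amount": [120.5, 33.0, None, 870.2],
    "merchant_type": ["online", "retail", "online", None],
    "card_present": ["no", "yes", "no", "no"],
    "label": [0, 0, 1, 1],
})

# Drop records with missing data.
clean = raw.dropna().copy()

# Convert every non-numeric (string) attribute to integer codes.
for col in clean.columns:
    if not pd.api.types.is_numeric_dtype(clean[col]):
        clean[col] = pd.factorize(clean[col])[0]

# Store the pre-processed data as dataset.csv for the models to read.
out_path = os.path.join(tempfile.gettempdir(), "dataset.csv")
clean.to_csv(out_path, index=False)
print(clean)
```

`pd.factorize` assigns codes in order of first appearance, which is adequate for tree models; for logistic regression, one-hot encoding is often a better fit for unordered categories.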

Principal component analysis feature selection
Principal Component Analysis, or PCA, is a dimensionality-reduction
method that is often used to reduce the dimensionality of large data sets
by transforming a large set of variables into a smaller one that still
contains most of the information in the large set.
Reducing the number of variables of a data set naturally comes at the
expense of accuracy, but the trick in dimensionality reduction is to trade a
little accuracy for simplicity. Smaller data sets are easier to explore
and visualize, and they make analyzing data much easier and faster for
machine learning algorithms, which have no extraneous variables to process.
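To make the "smaller set that still contains most of the information" idea concrete, here is a minimal NumPy sketch of PCA via the singular value decomposition. The helper name `pca` and the three-column toy data are assumptions for illustration. Note that PCA builds new features as linear combinations of the originals rather than picking a subset of existing columns.

```python
import numpy as np

def pca(X, n_components):
    """Project X onto its top principal components.

    Returns the projected data and the fraction of total variance
    captured by each kept component."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    explained_ratio = (S ** 2) / np.sum(S ** 2)
    components = Vt[:n_components]
    return X_centered @ components.T, explained_ratio[:n_components]

# Toy data: two strongly correlated columns plus a low-variance noise column.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([
    t,
    2 * t + 0.1 * rng.normal(size=200),
    0.1 * rng.normal(size=200),
])

Z, ratio = pca(X, n_components=1)
print(Z.shape, ratio)
```

Here three variables collapse to one component with little loss, because most of the variance lies along a single direction.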

EXPERIMENTAL RESULTS AND EVALUATIONS
Implementing two machine learning algorithms on the given dataset for
credit card fraud detection shows that the logistic regression model
outperforms the decision tree. The accuracy of logistic regression is
higher than that of the decision tree classifier.
Algorithm             Accuracy (%)
Decision Tree         71.41
Logistic Regression   79.91
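Accuracy is the fraction of test transactions classified correctly. On imbalanced fraud data it can hide poor fraud recall, so precision and recall are commonly reported alongside it. The sketch below computes all three from label lists; the function name `fraud_metrics` and the ten example labels are made up for illustration and are not the deck's data.

```python
def fraud_metrics(y_true, y_pred):
    """Return accuracy, precision and recall, treating class 1 as fraud."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fraud caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alerts
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fraud missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # legit passed
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]        # 2 frauds in 10 transactions
always_legit = [0] * 10                         # never flags anything
one_hit = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]        # 1 fraud caught, 1 false alert

print(fraud_metrics(y_true, always_legit))
# {'accuracy': 0.8, 'precision': 0.0, 'recall': 0.0}
print(fraud_metrics(y_true, one_hit))
# {'accuracy': 0.8, 'precision': 0.5, 'recall': 0.5}
```

Both example predictors reach the same accuracy, yet only the second one detects any fraud.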

Metrics

CONCLUSION
In this study, a new method for generating data for the minority class of an
imbalanced dataset was proposed to enhance credit card fraud detection by
machine learning, with the PCA algorithm as an oversampling strategy.
Although PCA algorithms have been applied in many areas, our
application domain aims to handle the imbalanced dataset issue by
generating new minority class instances to obtain new training sets.
Applying this algorithm in a bank credit card fraud detection system aims
to reduce fraudulent transactions and decrease the number of false alerts.

FUTURE ENHANCEMENTS
Further work is to implement this approach using the Python programming
language; this will allow us to validate our work and produce pertinent
experimental results.

REFERENCES
[1] H. Lei et al., "A deeply supervised residual network for HEp-2 cell classification via cross-modal transfer learning," Pattern Recognit.,
vol. 79, pp. 290-302, Jul. 2018.
[2] P. Wang, L. Li, Y. Jin, and G. Wang, "Detection of unwanted traffic congestion based on existing surveillance system using in freeway
via a CNN-architecture trafficNet," in Proc. 13th IEEE Conf. Ind. Electron. Appl., May/Jun. 2018, pp. 1134-1139.
[3] X. Zhu, Y. Wang, J. Dai, L. Yuan, and Y. Wei, "Flow-guided feature aggregation for video object detection," in Proc. ICCV, Mar. 2017,
pp. 408-417.
[4] Z. Zhao, W. Chen, X. Wu, P. C. Chen, and J. Liu, "LSTM network: A deep learning approach for short-term traffic forecast," IET Intell.
Transp. Syst., vol. 11, no. 2, pp. 68-75, Mar. 2017.
[5] P. Li, D. Wang, L. Wang, and H. Lu, "Deep visual tracking: Review and experimental comparison," Pattern Recognit., vol. 76,
pp. 323-338, Apr. 2018.
[6] P. Wang and J. Di, "Deep learning-based object classification through multimode fiber via a CNN-architecture SpeckleNet," Appl.
Opt., vol. 57, no. 28, pp. 8258-8263, 2018.
[7] J. Zhao, Z. Zhang, W. Yu, and T.-K. Truong, "A cascade coupled convolutional neural network guided visual attention method for ship
detection from SAR images," IEEE Access, vol. 6, pp. 50693-50708, 2018.
[8] T. Pamula, "Road traffic conditions classification based on multilevel filtering of image content using convolutional neural
networks," IEEE Intell. Transp. Syst. Mag., vol. 10, no. 3, pp. 11-21, Jun. 2018.
[9] M. Barth and K. Boriboonsomsin, "Environmentally beneficial intelligent transportation systems," IFAC Proc. Volumes, vol. 42,
no. 15, pp. 342-345, 2009.
[10] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proc. Adv.
Neural Inf. Process. Syst., 2012, pp. 1097-1105.

Thank You