Software Engineering presentation covering research work in the field of Software Engineering

NishaRaheja3 10 views 41 slides Aug 05, 2024

About This Presentation

Software Engineering


Slide Content

Ph.D. Comprehensive Seminar on
Prediction of Faulty Classes in Object Oriented System based on Software Metrics and Features
Presented By: Manpreet Singh (Registration No: 2K19/NIT/PHD/61900026)
Department of Computer Engineering, National Institute of Technology, Kurukshetra
Under Supervision of Prof. Jitender Kumar Chhabra

Table of Contents
Introduction to Software Fault Prediction
Structure of Fault Prediction Model
What is Faulty Class Prediction
Applications
Fault Prediction Taxonomy
Machine Learning Based FCP techniques
Search Based FCP techniques
Some other techniques
Research gaps and Objectives
Work Plan
References

Software Fault Prediction
Software Fault Prediction (SFP) aims to identify fault-prone software modules (functions, classes, or files).
SFP is applied before actual software testing begins.
SFP matters because it reduces testing time and tester effort, improves efficiency, and cuts cost by reducing the manpower required.
Properties of the software project, such as features and code metrics, are used for SFP.
The basic building blocks of SFP models fall into three classes:
    Fault prediction techniques
    Software metrics
    Performance evaluation metrics

SFP model building block

Faulty Class Prediction
For object-oriented applications, classes are the basic building blocks of the software.
SFP techniques can identify faulty classes at early stages based on software design metrics.
Models are trained on some initial versions of an object-oriented application.
After training, the models can be used for subsequent releases of the same application.
Cross-project faulty class prediction (FCP) is also possible, depending on how the models are trained and on the nature of the applications.

Applications
Finding complex classes with a high risk of bugs in early phases makes the developers' task easier.
Identifying faulty classes makes the testers' task easier: testers can focus on the classes with a high probability of faults.
It reduces testing effort and testing time.
It reduces the manpower needed in the testing team and so improves cost efficiency.
Automated prediction can be more accurate than a manual process.

Taxonomy of FCP Algorithm

Development of SFP techniques
Initially, buggy modules were identified using statistical methods and structural and schematic properties.
Some statistical methods are ANOVA, Naïve Bayesian, Variance Analysis, etc.
These methods were not very accurate.
In recent years the trend has shifted toward Machine Learning and Soft Computing, which are more efficient.
The next few slides discuss previous research on these advanced techniques.

Machine Learning Based Algorithm
Machine Learning is an application of artificial intelligence that gives systems the ability to learn and improve automatically from experience.
Machine Learning focuses on developing computer programs that can access data and use it to learn for themselves.
Machine Learning algorithms are broadly divided into two categories [1]:
    Supervised Machine Learning algorithms
    Unsupervised Machine Learning algorithms
Some Machine Learning algorithms useful for Faulty Class Prediction are discussed in the next few slides.
[1] A. Shanthini and R. M. Chandrasekaran, "Analyzing the effect of bagged ensemble approach for software fault prediction in class level and package level metrics", International Conference on Information Communication and Embedded Systems (ICICES), pp. 1-5, 2014.

Decision Tree Based Classifier
Decision Tree is a supervised Machine Learning algorithm developed by Quinlan. It recursively divides the input dataset into smaller groups until each group is small enough to be assigned a single label [2].
Bennin et al. (2017) developed a novel oversampling technique called MAHAKIL and combined it with the Decision Tree algorithm to get better results [3].
Kumar et al. (2017) used several Decision Tree variations, such as C4.5, the Tree Disc algorithm, and the SPRINT and SLIQ algorithms, for software fault prediction [4].
Choudhary et al. (2018) proposed new change metrics and applied the Decision Tree (C4.5) algorithm in their comparative study [5].
Bhandari et al. (2018) used the Decision Tree (C4.5) algorithm in combination with the feature selection techniques Principal Component Analysis (PCA) and Correlation-based Feature Selection (CFS) [6].
Singh (2017) used both CART and C4.5 in his experiments; the results show that C4.5 gives the best performance for within-project SFP [7].
[7] Singh, Pradeep. "Comprehensive model for software fault prediction." In 2017 International Conference on Inventive Computing and Informatics (ICICI), pp. 1103-1108. IEEE, 2017.
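The recursive division described above depends on choosing a split at each node. The following is a minimal sketch of that single step, assuming an entropy-based information gain criterion on one numeric metric (C4.5 itself uses a gain ratio and further refinements that are not shown here). The metric values and labels are hypothetical and are not taken from any of the cited studies.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, information_gain) of the best binary split on one metric."""
    base = entropy(labels)
    best = (None, 0.0)
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        gain = base - remainder
        if gain > best[1]:
            best = (t, gain)
    return best

# Hypothetical data: one coupling-style metric per class, label 1 = faulty, 0 = healthy.
coupling = [2, 3, 4, 9, 11, 12]
faulty = [0, 0, 0, 1, 1, 1]
threshold, gain = best_threshold(coupling, faulty)
print(threshold, gain)
```

A full tree would apply the same search to every metric, split on the best one, and recurse into each resulting group until the groups are pure or too small.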

Random Forest Based Classifier
Random Forest is an ensemble learning method developed by Breiman in 2001 [8].
Catal et al. (2009) applied Random Forest and compared it with other Machine Learning models; on the AUC and accuracy performance metrics it outperforms many models, such as Naïve Bayes and Decision Tree [9].
Abaei et al. (2014) explained Random Forest and its applications in software fault prediction in their survey [10].
Choudhary et al. (2018) applied Random Forest to their proposed change metrics and compared the results with Decision Tree (C4.5) and KNN [5].
Kaya et al. (2019) used the Random Forest classifier and many other machine learning algorithms for software vulnerability prediction to reduce security risks [11].
Many other researchers, such as Kaur et al. (2008), Kulkarni et al. (2012), and Malhotra (2015), explained and used the Random Forest classifier in their research [12][13][14].
[10] Abaei, Golnoush, and Ali Selamat. "A survey on software fault detection based on different prediction approaches." Vietnam Journal of Computer Science 1, no. 2 (2014): 79-95.

Support Vector Machine Based Classifier
Support Vector Machine (SVM) is primarily a classification learning model rather than a regression model [15].
Support Vector Machines can be divided into two types [16]:
    Linear
    Non-linear
Yogesh Singh et al. (2009) built an SVM model and tested it on the object-oriented metrics proposed by Chidamber and Kemerer [17].
Di Martino et al. (2011) configured SVM using GA for SFP; the results show an improvement in the Recall performance metric without affecting other performance metrics [18].
Chen et al. (2016) used one-class SVM for software fault prediction [19].
Mohapatra et al. (2018) developed a hybrid algorithm using kernel-based SVM and GSO-GA; the results show an improvement in performance [20].
Many other researchers used SVM for comparison purposes in their work [1][3][6].
[16] T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning. New York: Springer, 2009.
[20] Mohapatra, Yogomaya, and Mitrabinda Ray. "Software fault prediction based on GSO-GA optimization with kernel based SVM classification." International Journal of Intelligent Engineering & Systems 11, no. 5 (2018): 152.

Artificial Neural Network Based Classifier
An artificial neuron takes inputs, processes them, and gives a relevant output in a manner similar to a biological neuron.
Tong-Seng Quah et al. (2003) trained a neural network on object-oriented code metrics. They used a Ward neural network (WNN) [21].
Kanmani et al. (2007) trained a BPN using object-oriented code metrics and compared the results with statistical methods [22].
Al-Jamimi et al. (2011) trained PNN and SVM for SFP on NASA datasets downloaded from the PROMISE Repository [23].
Hui Xiao et al. (2020) explained the working of FANN, SRNN, and TCNN and applied SFP and SFC in an iterative manner [24].
Many other researchers, such as Boucher et al. (2017), Bhandari et al. (2018), and Bennin et al. (2018), used different variations of neural networks for comparison purposes [3][6][25].
[22] Kanmani, S., V. Rhymend Uthariaraj, V. Sankaranarayanan, and P. Thambidurai. "Object-oriented software fault prediction using neural networks." Information and Software Technology 49, no. 5 (2007): 483-492.
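The input, process, output behaviour of a single artificial neuron can be sketched as a weighted sum followed by an activation function. This is a minimal illustration only: the sigmoid activation, the weights, the bias, and the normalized metric values are all assumptions made for the example and do not come from the cited papers.

```python
import math

def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a sigmoid activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)

# Hypothetical normalized code metrics for one class, with illustrative weights.
metrics = [0.8, 0.1, 0.5]
weights = [1.5, -2.0, 0.7]
bias = -0.5
score = neuron_output(metrics, weights, bias)
print(round(score, 3))  # values above 0.5 could be read as "likely faulty"
```

A network such as a BPN stacks many of these units in layers and learns the weights from labelled data instead of fixing them by hand.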

Metaheuristic Algorithm
Evolutionary-based and swarm intelligence algorithms are referred to as "metaheuristic".
A metaheuristic is a higher-level procedure or heuristic designed to find or select a heuristic that may provide a sufficiently good solution to an optimization problem [26].
These techniques are used where the available information is incomplete or imperfect, or where computation resources are limited [26].
Metaheuristics sample a subset of a solution set that is otherwise too large to be completely enumerated.
Metaheuristics do not guarantee that a globally optimal solution can be found on some classes of problems [26].
[26] Bianchi, Leonora; Marco Dorigo; Luca Maria Gambardella; Walter J. Gutjahr (2009). "A survey on metaheuristics for stochastic combinatorial optimization". Natural Computing. 8 (2): 239–287.

Genetic Algorithm Based Classifier
The genetic algorithm (GA) was developed by John Holland and published in 1975 [27].
Sarro et al. (2012) used a genetic algorithm to configure SVM for faulty class detection. The genetic algorithm is used in the first phase for feature selection, and SVM is used to classify faulty classes [29].
Erturk et al. (2015) applied a genetic algorithm and compared the results with other soft computing methods [30].
Rathore et al. (2017) applied a genetic algorithm for rule base generation for Faulty Class Prediction and compared the results with 6 other models [31].
Alsghaier et al. (2020) combined GA and SVM for faulty class prediction. GA is used in the first phase for feature selection, and SVM is used for classification [32].
[27] Whitley, Darrell. "A genetic algorithm tutorial." Statistics and Computing 4, no. 2 (1994): 65-85.
[30] Erturk, Ezgi, and Ebru Akcapinar Sezer. "A comparison of some soft computing methods for software fault prediction." Expert Systems with Applications 42, no. 4 (2015): 1872-1879.
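The feature selection phase described above can be sketched as a GA that evolves bit masks, where bit i says whether metric i is kept. This is a minimal sketch under stated assumptions: the function names are hypothetical, the operators (tournament selection, one-point crossover, bit-flip mutation, two-individual elitism) are one common choice among many, and `toy_fitness` is only a stand-in for the score of a classifier such as SVM trained on the selected metrics.

```python
import random

def run_ga(fitness, n_features, pop_size=20, generations=30, mutation_rate=0.05, seed=0):
    """Evolve bit masks over features and return the fittest mask found."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        next_gen = ranked[:2]  # elitism: carry the two best masks over unchanged
        while len(next_gen) < pop_size:
            # Tournament selection: the best of three random individuals becomes a parent.
            p1 = max(rng.sample(population, 3), key=fitness)
            p2 = max(rng.sample(population, 3), key=fitness)
            cut = rng.randint(1, n_features - 1)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness)

# Stand-in fitness (assumption): reward three "useful" metrics, penalize the rest.
USEFUL = {0, 2, 5}

def toy_fitness(mask):
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in USEFUL)
    noise = sum(1 for i, bit in enumerate(mask) if bit and i not in USEFUL)
    return hits - 0.5 * noise

best_mask = run_ga(toy_fitness, n_features=8)
print(best_mask, toy_fitness(best_mask))
```

In a GA plus SVM pipeline, `toy_fitness` would be replaced by a function that trains the classifier on the masked metrics and returns a validation score, which is far more expensive per evaluation.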

PSO Based Classifier
Marini in [33] explained the basic PSO and its variations in detail.
In PSO each candidate solution is called a particle in a D-dimensional space, where D is the number of parameters [33].
Jin et al. (2010) applied adaptive dynamic and median PSO algorithms; the results show that adaptive dynamic PSO outperforms other statistical methods [34].
Kayarvizhy et al. (2013) combined PSO with ANN to improve the performance of fault prediction in object-oriented systems [35].
Jin et al. (2015) combined Quantum PSO with ANN to develop an efficient faulty class prediction model for object-oriented systems [36].
Turabieh et al. (2019) applied Binary PSO for feature selection to develop an iterative faulty class detection model and compared the results with Decision Tree, Random Forest, and Naïve Bayesian classifiers [37].
Alsghaier et al. (2020) combined PSO and SVM for faulty class prediction. PSO is used in the first phase for feature selection, and SVM is used for classification [32].
[33] Marini, Federico, and Beata Walczak. "Particle swarm optimization (PSO). A tutorial." Chemometrics and Intelligent Laboratory Systems 149 (2015): 153-165.
[36] Jin, Cong, and Shu-Wei Jin. "Prediction approach of software fault-proneness based on hybrid artificial neural network and quantum particle swarm optimization." Applied Soft Computing 35 (2015): 717-725.
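The particle idea can be sketched with the basic velocity and position update of PSO. This is a minimal sketch, not the adaptive, quantum, or binary variants used in the cited work: the inertia weight `w`, the acceleration coefficients `c1` and `c2`, the search range, and the sphere objective are all illustrative assumptions.

```python
import random

def pso(objective, dim, n_particles=15, iterations=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a dim-dimensional space with basic PSO."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # best position seen by each particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm
    initial_best = gbest_val
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val, initial_best

def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)

best_pos, best_val, initial_best = pso(sphere, dim=3)
print(best_val, initial_best)
```

When PSO is paired with ANN, each particle would hold a full set of network weights and the objective would be the training error; for feature selection, a binary variant maps each dimension to a keep or drop decision.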

ACO Based Classifier
ACO was developed by Dorigo in the early 1990s, inspired by the behaviour of ants. Ants find the shortest path to food with the help of the pheromones they deposit [38].
Vandecruys et al. (2008) used a variant of ACO called the ant miner algorithm to mine software repositories and develop software fault prediction models [39].
Singh et al. (2015) explained ACO and other search-based techniques in their survey as ways to improve Software Fault Prediction quality and reduce testing effort and time [40].
Sharma et al. (2018) combined ACO with various machine learning models such as ANN and SVM [41].
Turabieh et al. (2019) applied Binary PSO for feature selection to develop an iterative faulty class detection model and compared the results with Decision Tree, Random Forest, and Naïve Bayesian classifiers [37].
Singh et al. (2020) developed a comprehensive model based on ACO and compared the results with J48, Random Forest, and SVM [42].
[38] Dorigo, Marco, and Christian Blum. "Ant colony optimization theory: A survey." Theoretical Computer Science 344, no. 2-3 (2005): 243-278.
[41] Sharma, Deepak, and Pravin Chandra. "Software fault prediction using machine-learning techniques." In Smart Computing and Informatics, pp. 541-549. Springer, Singapore, 2018.

Other Algorithm
Besides Machine Learning algorithms and search-based algorithms, many other algorithms can be used for Faulty Class Identification. These algorithms are listed below:
    Fuzzy Rule Based
    Case Based
    Call Graph Based
    Hybrid Algorithms
        (ACO, PSO, GA) + ANN
        (ACO, PSO, GA) + SVM
The next few slides give an overview of papers that have used strategies based on these algorithms.

Naïve Bayesian Classifier
Naïve Bayesian classification is based on the theorem discovered by Reverend Thomas Bayes. It is the simplest method used for software fault prediction based on Bayes' theorem [43].
Turhan et al. (2009) showed detailed steps for software fault prediction based on the Naïve Bayesian classifier. They applied this method to many NASA software datasets and compared its performance with Linear Discriminant (LD) and Quadratic Discriminant (QD) [44].
Catal et al. (2011) implemented a Naïve Bayesian classifier to develop an Eclipse-based software fault prediction tool [45].
Wang (2015) used a Naïve Bayesian classifier for non-homogeneous Poisson process (NHPP) based fault detection and correction [46].
Dhanajayan et al. (2017) developed a Bayesian classification technique based on the spiral life cycle model and compared it with other sophisticated approaches such as HSOM, ANN, and Random Forest [47].
Many other researchers used this technique as a comparison baseline for newly developed models [6][25][37].
[43] Rish, Irina. "An empirical study of the naive Bayes classifier." In IJCAI 2001 Workshop on Empirical Methods in Artificial Intelligence, vol. 3, no. 22, pp. 41-46. 2001.
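How Bayes' theorem turns per-class statistics into a prediction can be sketched in a few lines. This sketch assumes the Gaussian variant of Naïve Bayes for numeric metrics, which the slides do not specify, and the function names and the tiny dataset (two metrics per class, label 1 = faulty) are invented for illustration.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(rows, labels):
    """Estimate a prior and per-feature (mean, variance) for every class label."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, items in grouped.items():
        prior = len(items) / len(rows)
        stats = []
        for column in zip(*items):
            mean = sum(column) / len(column)
            # A small constant keeps the variance strictly positive.
            var = sum((x - mean) ** 2 for x in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[label] = (prior, stats)
    return model

def predict(model, row):
    """Return the label with the highest log posterior, treating features as independent."""
    def log_posterior(label):
        prior, stats = model[label]
        total = math.log(prior)
        for x, (mean, var) in zip(row, stats):
            total += -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
        return total
    return max(model, key=log_posterior)

# Hypothetical training data: (lines of code / 10, number of methods) per class.
rows = [(10, 2), (12, 3), (11, 2), (40, 9), (45, 11), (50, 10)]
labels = [0, 0, 0, 1, 1, 1]
model = fit_gaussian_nb(rows, labels)
print(predict(model, (42, 10)), predict(model, (11, 3)))
```

The "naïve" part is the independence assumption inside `log_posterior`: the per-metric log likelihoods are simply added, ignoring any correlation between metrics.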

Fuzzy Rule Based Classifier
Fuzzy logic is an extension of the Boolean logic system [48].
A fuzzy set is a set without crisp, clear boundaries that admits the possibility of partial membership [48].
Yuan et al. (2000) applied a fuzzy clustering approach to differentiate faulty and healthy classes [49].
Singh et al. developed a Fuzzy Inference System for Faulty Class Prediction based on the K-means clustering algorithm, which is used for initial rule base generation [50].
Chatterjee et al. (2019) developed a fuzzy rule-based generation algorithm in an interval type-2 fuzzy logic system for fault prediction in the early phase of software development [51].
[48] L. A. Zadeh. "Knowledge representation in fuzzy logic." IEEE Transactions on Knowledge and Data Engineering, 1(1):89-100, Mar. 1989.
[51] Chatterjee, Subhashis, Bappa Maji, and Hoang Pham. "A fuzzy rule-based generation algorithm in interval type-2 fuzzy logic system for fault prediction in the early phase of software development." Journal of Experimental & Theoretical Artificial Intelligence 31, no. 3 (2019): 369-391.
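Partial membership can be made concrete with a membership function. The sketch below assumes triangular membership functions, one common choice that the slides do not prescribe, and the three fuzzy sets over a complexity metric use made-up boundaries.

```python
def triangular(x, a, b, c):
    """Membership degree of x in a triangular fuzzy set with feet a and c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def complexity_memberships(value):
    """Degrees to which a complexity value is low, medium, and high (hypothetical sets)."""
    return {
        "low": triangular(value, -1, 0, 10),
        "medium": triangular(value, 5, 15, 25),
        "high": triangular(value, 20, 30, 40),
    }

m = complexity_memberships(8)
print(m)  # a value of 8 is partly "low" and partly "medium" at the same time
```

A fuzzy rule base then combines such degrees, for example "IF complexity is high AND coupling is high THEN class is faulty", instead of testing crisp thresholds.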

Artificial Immune System Based Classifier
The Artificial Immune Recognition System (AIRS) uses a resource-limited approach [52] and clonal selection principles [53].
Watkins et al. (2001) showed the application of AIRS to classification as a supervised machine learning algorithm [54].
Catal et al. (2007) explained the basic steps of AIRS and applied it to Software Fault Prediction on object-oriented metrics [55].
The AIRS algorithm, inspired by an immunological metaphor, can be used for both classification and regression problems [56].
Haouari et al. (2020) applied AIRS to inter-release software fault prediction and showed that it outperforms many machine learning methods [57].
[57] Haouari, Ahmed Taha, Labiba Souici-Meslati, Fadila Atil, and Djamel Meslati. "Empirical comparison and evaluation of Artificial Immune Systems in inter-release software fault prediction." Applied Soft Computing 96 (2020): 106686.

Hybrid Algorithms
Erturk et al. (2015) applied ANFIS, a combination of Artificial Neural Network and Fuzzy Inference System, and compared it with some soft computing methods [30].
Alsghaier et al. (2020) combined PSO and SVM, and GA and SVM, for faulty class prediction. PSO and GA are used in the first phase for feature selection, and SVM is used for classification [32].
Rhmann et al. (2019) applied GGA-FS, GFS-AdaBoost-c, and GFS-LogitBoost-c and compared the results with Random Forest, J48, and Naïve Bayesian classifiers; the results show that GFS-AB and GFS-LB outperform all the other algorithms [58].
Banga et al. (2020) applied particle swarm optimization and a modified genetic algorithm (PSO-MGA) to Faulty Class Prediction and compared the model with statistical and Machine Learning methods [59].
[30] Erturk, Ezgi, and Ebru Akcapinar Sezer. "A comparison of some soft computing methods for software fault prediction." Expert Systems with Applications 42, no. 4 (2015): 1872-1879.
[59] Banga, Manu, and Abhay Bansal. "Proposed software faults detection using hybrid approach." Security and Privacy: e103.

Graph based and Case based techniques
Emam et al. (2001) developed a model based on case-based reasoning classifiers for predicting high-risk software components and compared it with other Machine Learning algorithms [60].
Shafi et al. (2008) explained the Conjunctive Rule Learner (ConjunctiveRule) algorithm, which predicts nominal class labels based on rules constructed earlier by the learner [61].
Turhan et al. (2008) developed Call Graph Based Ranking metrics and combined them with other static code metrics to develop a software fault prediction model [62].
Bhattacharya et al. (2012) used a graph-based technique to predict the fault severity and testing effort of classes in object-oriented systems [63].
Abandah et al. (2013) used call graph based metrics to evaluate the design quality of software classes [64].
Li et al. (2017) generated an Abstract Syntax Tree (AST) to extract code metrics and applied a CNN to classify faulty and healthy classes [65].
[62] Turhan, Burak, Gozde Kocak, and Ayse Bener. "Software defect prediction using call graph based ranking (CGBR) framework." In 2008 34th Euromicro Conference Software Engineering and Advanced Applications, pp. 191-198. IEEE, 2008.
[64] Abandah, Hesham, and Izzat M. Alsmadi. "Call graph based metrics to evaluate software design quality." (2013).

Software Metrics and Features
Many types of metrics are used as input to predict faulty classes. Well-known metrics are categorized as follows [66]:
    Object-oriented code metrics
    Procedural/static code metrics
    Aspect-oriented code metrics
    Feature-oriented code metrics
    Change metrics
[66] Nuñez-Varela, Alberto S., Héctor G. Pérez-Gonzalez, Francisco E. Martínez-Perez, and Carlos Soubervielle-Montalvo. "Source code metrics: A systematic mapping study." Journal of Systems and Software 128 (2017): 164-197.
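To give a feel for what extracting object-oriented code metrics involves, here is a minimal sketch that walks a Python source file with the standard `ast` module and counts, per class, the methods defined directly in the class body and the number of listed base classes. These two counts are simplified stand-ins chosen for illustration; they are not the exact definitions of any published metric suite, and the sample classes are hypothetical.

```python
import ast

def class_metrics(source):
    """Return {class name: {"methods": n, "bases": m}} for every class in the source."""
    tree = ast.parse(source)
    metrics = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            # Count only functions defined directly in the class body.
            methods = [n for n in node.body
                       if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))]
            metrics[node.name] = {"methods": len(methods), "bases": len(node.bases)}
    return metrics

SAMPLE = '''
class Account:
    def deposit(self, amount): ...
    def withdraw(self, amount): ...

class SavingsAccount(Account):
    def add_interest(self): ...
'''

result = class_metrics(SAMPLE)
print(result)
```

Each class then becomes one row of a dataset, with the metric values as features and a faulty or healthy label taken from the project's bug history.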

List of Primary Studies

Author: Vandecruys et al.
Journal (year): The Journal of Systems and Software (2008)
Overview: An ACO-based technique is developed for mining repositories to extract rules and metrics.
Strength: A new approach to extracting rules from a software repository.
Weakness: Only a metrics extraction model; it is not used for training to predict software faults.

Author: Catal et al.
Journal (year): Information Sciences (2009)
Overview: Naïve Bayesian, J48, CLONALG, AIRS, and Random Forest are compared on different static and object-oriented code metrics.
Strength: Very useful for finding the best combination of code metrics to improve the accuracy of existing algorithms.
Weakness: No new code metrics technique is developed; only different combinations of existing metrics are used for experimentation.

Author: Kayarvizhy et al.
Journal (year): International Journal of Computer Applications (2013)
Overview: An integrated approach of PSO and ANN is developed; PSO is used to update the weights iteratively.
Strength: Performs better than back propagation in accuracy and converges faster than gradient and hill climbing methods.
Weakness: Compared only with back propagation ANN; PSO could also be used to decide the number of hidden layer neurons.

Author: Abandah et al.
Journal (year): International Journal of Software Engineering and Its Applications (2013)
Overview: Two new call graph based metrics are proposed and combined with other cohesion and coupling metrics. SVM, J48, and LMT are trained and compared.
Strength: Two new call graph based metrics, based on page ranking on the internet, are designed.
Weakness: Only basic tree models are compared and no object-oriented code metric is used.

Author: Erturk et al.
Journal (year): Expert Systems with Applications (2015)
Overview: ANFIS (FIS+ANN) is compared with some other soft computing methods such as GA and PSO.
Strength: ANFIS, a very good soft computing method, is explained clearly and can be implemented easily with the help of this paper.
Weakness: No new technique is developed and only static code metrics are used.

List of Primary Studies

Author: Biray et al.
Journal (year): International Conference on Software Testing, Verification and Validation Workshops (2015)
Overview: Code metrics are extracted from the closed data of an IT company across different releases, a new labelling method for defective/healthy modules is developed, and the C4.5 algorithm is trained.
Strength: A new labelling method is proposed for the case where only an unlabelled dataset exists.
Weakness: Only a decision tree model is trained; no sophisticated approach is used for SFP.

Author: Jin et al.
Journal (year): Applied Soft Computing (2015)
Overview: An integrated approach based on QPSO and ANN is developed, and an AUC-ROC comparison is made with other models.
Strength: A new algorithm, QPSO, is integrated with ANN and the AUC-ROC curve is explained clearly. QPSO is used for dimensionality reduction.
Weakness: Compared with only 2 models (ANN and PSO+ANN); only 3 datasets are used, and the datasets contain only static code metrics.

Author: Singh et al.
Journal (year): IEEE Transactions on Systems, Man, and Cybernetics: Systems (2016)
Overview: A new approach based on AND and OR gates is developed and used to generate a fuzzy rule base in combination with the K-means clustering algorithm.
Strength: A novel approach based on digital gates is developed.
Weakness: Only 5 datasets are chosen for implementation and testing, and all are based on static code metrics.

Author: Phan et al.
Journal (year): International Conference on Tools with Artificial Intelligence (2017)
Overview: A control flow graph is generated from assembly code and a CNN is applied for SFP.
Strength: A new concept based on the flow chart of assembly code.
Weakness: Inbuilt tools are used; no new technique is developed.

Author: Li et al.
Journal (year): IEEE International Conference on Software Quality, Reliability and Security (2017)
Overview: An Abstract Syntax Tree is generated from software code and a CNN is trained for SFP.
Strength: A new concept based on AST-based metrics.
Weakness: Inbuilt tools are used for metrics extraction and no new technique is developed.

List of Primary Studies

Author: Choudhary et al.
Journal (year): Computers and Electrical Engineering (2018)
Overview: J48, KNN, and Random Forest algorithms are applied to newly designed and existing change metrics.
Strength: Cross-release experimentation is done and 25 new change metrics are proposed.
Weakness: Only three algorithms are compared; more sophisticated approaches are needed to achieve better accuracy.

Author: Arora et al.
Journal (year): Int. J. Intelligent Engineering Informatics (2018)
Overview: ANN is trained using ACO, PSO, GA, and Firefly weight updating algorithms, and Firefly is shown to outperform the other three approaches.
Strength: Clearly explains how to use optimization algorithms for iterative ANN weight updating. Iterative algorithms are faster than back propagation algorithms.
Weakness: Accuracies are not greater than 0.9 in any case; only 6 datasets are used, and all are based on static code metrics.

Author: Mohapatra et al.
Journal (year): International Journal of Intelligent Engineering and Systems (2018)
Overview: GA is used for feature selection, and a kernel-based SVM is designed and trained for classification.
Strength: A new kernel-based SVM is developed for classification.
Weakness: Only static code metrics are extracted, based on a flow chart; no object-oriented metric is used.

Author: Turabieh et al.
Journal (year): Expert Systems With Applications (2019)
Overview: BPSO, BACO, and BGA are integrated with LRNN, and an AUC-ROC curve comparison is made.
Strength: The method is applied to both static code metrics based datasets and object-oriented datasets.
Weakness: Only an AUC-ROC comparison is made; more performance metrics such as Recall, Error, Precision, and F1-Score need to be compared.

Author: Rhmann et al.
Journal (year): Journal of King Saud University – Computer and Information Sciences (2019)
Overview: Two hybrid algorithms, AdaBoost and LogitBoost, are applied to change metrics, and experimentation is done cross-release within a project.
Strength: Cross-release performance testing based on change metrics. Gives knowledge about change metrics that can be helpful for further research.
Weakness: Only the recall and precision performance metrics are compared, and no new technique is developed; only existing approaches are compared.

List of Primary Studies

Author: Banga et al.
Journal (year): Security and Privacy (2019)
Overview: A new integrated approach based on PSO and GA is developed, and its accuracy is compared with clustering and statistical methods.
Strength: A new fitness function based on chromosome accuracy is designed instead of using a classification algorithm such as SVM.
Weakness: No comparison with Machine Learning algorithms such as decision tree, Naïve Bayesian, and random forest. Accuracy is between 0.8 and 0.9 in each case.

Author: Alsghaier et al.
Journal (year): Software: Practice and Experience (2020)
Overview: Two integrated approaches are developed, GA+SVM and PSO+SVM; SVM works as the fitness function of GA and PSO.
Strength: A new integrated approach is developed, on which more integrated approaches can be based.
Weakness: Only GA+SVM and PSO+SVM are compared; comparison with other machine learning models is needed.

Author: Xiao et al.
Journal (year): Applied Soft Computing Journal (2020)
Overview: New metrics based on testing effort are developed, and CNN, RNN, and ANN algorithms are applied with dataset normalization using Min-Max, Log, Sigmoid, and Z-Score.
Strength: It not only classifies defective/healthy classes but also predicts the number of faults in each class.
Weakness: Log and Sigmoid normalized data give the best performance, but there is no explanation of why only these give the best solution, and overall accuracy is not very good.

Author: Haouari et al.
Journal (year): Applied Soft Computing Journal (2020)
Overview: The AIRS algorithm is trained and compared with Naïve Bayesian, J48, SVM, and Random Forest for cross-project validation.
Strength: Cross-project validation and testing is done.
Weakness: No new technique is developed; only the existing approaches AIRS and CLONALG are trained.

Author: Singh et al.
Journal (year): International Journal of Knowledge-based and Intelligent Engineering Systems (2020)
Overview: A rule base is generated using ACO for software fault prediction and compared with J48, Random Forest, and Naïve Bayesian.
Strength: A rule base generation technique using a search-based algorithm is explained.
Weakness: Tested only on static code metrics based datasets and compared with only basic models.

Research Gaps
Most researchers have used PROMISE repository datasets; new datasets need to be created by extracting code metrics from open source software code based on new ideas.
Many approaches have been developed for Faulty Class Prediction, but no approach provides good results across all the familiar metrics (AUC-ROC, F1-Score, Accuracy, Precision, Recall). When the result for one metric improves, the others lag.
In most cases results are verified using 10-fold validation, or by dividing the data in a 70-30% ratio and averaging over 10 runs; very few researchers have worked on inter-release verification.
Beyond intra-release and inter-release testing, cross-project verification of SFP models is another research area that remains largely unexplored.
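The tension between performance metrics noted above is easy to see on a small, imbalanced example. The sketch below computes Accuracy, Precision, Recall, and F1-Score from actual and predicted labels using their standard definitions; the function name and the label vectors are invented for illustration, with 1 marking a faulty class.

```python
def evaluation_metrics(actual, predicted, positive=1):
    """Compute accuracy, precision, recall, and F1-Score for one positive label."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical release: 3 faulty classes out of 10, and a cautious model that flags only one.
actual = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
report = evaluation_metrics(actual, predicted)
print(report)
```

Here accuracy and precision look strong while recall is low, because two of the three faulty classes are missed; this is why a single metric is not enough to judge a faulty class predictor.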

Research Objectives
To develop a new algorithm, or combine existing algorithms into a hybrid approach, that gives better results.
To find new object-oriented source code metrics that improve the faulty class prediction process.
To combine the newly created metrics with existing source code metrics to get the best results.
To apply the newly designed FCP technique to the metric suite and obtain better results than existing approaches.

Research Work Plan

References
[1] A. Shanthini and R. M. Chandrasekaran, "Analyzing the effect of bagged ensemble approach for software fault prediction in class level and package level metrics", International Conference on Information Communication and Embedded Systems (ICICES), pp. 1-5, 2014.
[2] Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, Vol. 1-1, pp. 81-106.
[3] Bennin, Kwabena Ebo, Jacky Keung, Passakorn Phannachitta, Akito Monden, and Solomon Mensah. "Mahakil: Diversity based oversampling approach to alleviate the class imbalance issue in software defect prediction." IEEE Transactions on Software Engineering 44, no. 6 (2017): 534-550.
[4] Kumar, T. Ravi, and T. Srinivasa Rao. "The Enriched Object Oriented Software processes for Software Fault Prediction." (2017).
[5] Choudhary, Garvit Rajesh, Sandeep Kumar, Kuldeep Kumar, Alok Mishra, and Cagatay Catal. "Empirical analysis of change metrics for software fault prediction." Computers & Electrical Engineering 67 (2018): 15-24.
[6] Bhandari, Guru Prasad, and Ratneshwer Gupta. "Machine learning based software fault prediction utilizing source code metrics." In 2018 IEEE 3rd International Conference on Computing, Communication and Security (ICCCS), pp. 40-45. IEEE, 2018.
[7] Singh, Pradeep. "Comprehensive model for software fault prediction." In 2017 International Conference on Inventive Computing and Informatics (ICICI), pp. 1103-1108. IEEE, 2017.

References
[8] Breiman, Leo. "Random forests." Machine Learning 45, no. 1 (2001): 5-32.
[9] Catal, Cagatay, and Banu Diri. "Investigating the effect of dataset size, metrics sets, and feature selection techniques on software fault prediction problem." Information Sciences 179, no. 8 (2009): 1040-1058.
[10] Abaei, Golnoush, and Ali Selamat. "A survey on software fault detection based on different prediction approaches." Vietnam Journal of Computer Science 1, no. 2 (2014): 79-95.
[11] Kaya, Aydin, Ali Seydi Keceli, Cagatay Catal, and Bedir Tekinerdogan. "The impact of feature types, classifiers, and data balancing techniques on software vulnerability prediction models." Journal of Software: Evolution and Process 31, no. 9 (2019): e2164.
[12] Kaur, Arvinder, and Ruchika Malhotra. "Application of random forest in predicting fault-prone classes." In 2008 International Conference on Advanced Computer Theory and Engineering, pp. 37-43. IEEE, 2008.
[13] Kulkarni, Vrushali Y., and Pradeep K. Sinha. "Pruning of random forest classifiers: A survey and future directions." In 2012 International Conference on Data Science & Engineering (ICDSE), pp. 64-68. IEEE, 2012.
[14] Malhotra, Ruchika. "A systematic review of machine learning techniques for software fault prediction." Applied Soft Computing 27 (2015): 504-518.
[15] M. A. Hearst, S. T. Dumais, E. Osuna, J. Platt, and B. Scholkopf. "Support vector machines." Intelligent Systems and their Applications, IEEE, 13(4), pp. 18–28, 1998.

References
[16] T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning. New York: Springer, 2009.
[17] Singh, Yogesh, Arvinder Kaur, and Ruchika Malhotra. "Software fault proneness prediction using support vector machines." In Proceedings of the World Congress on Engineering, vol. 1, pp. 1-3. 2009.
[18] Di Martino, Sergio, Filomena Ferrucci, Carmine Gravino, and Federica Sarro. "A genetic algorithm to configure support vector machines for predicting fault-prone components." In International Conference on Product Focused Software Process Improvement, pp. 247-261. Springer, Berlin, Heidelberg, 2011.
[19] Chen, Lin, Bin Fang, and Zhaowei Shang. "Software fault prediction based on one-class SVM." In 2016 International Conference on Machine Learning and Cybernetics (ICMLC), vol. 2, pp. 1003-1008. IEEE, 2016.
[20] Mohapatra, Yogomaya, and Mitrabinda Ray. "Software fault prediction based on GSO-GA optimization with kernel based SVM classification." International Journal of Intelligent Engineering & Systems 11, no. 5 (2018): 152.
[21] Quah, Tong-Seng, and Mie Mie Thet Thwin. "Application of neural networks for software quality prediction using object-oriented metrics." In International Conference on Software Maintenance, 2003. ICSM 2003. Proceedings., pp. 116-125. IEEE, 2003.
[22] Kanmani, S., V. Rhymend Uthariaraj, V. Sankaranarayanan, and P. Thambidurai. "Object-oriented software fault prediction using neural networks." Information and Software Technology 49, no. 5 (2007): 483-492.

References
[23] Al-Jamimi, Hamdi A., and Lahouari Ghouti. "Efficient prediction of software fault proneness modules using support vector machines and probabilistic neural networks." In 2011 Malaysian Conference in Software Engineering, pp. 251-256. IEEE, 2011.
[24] Xiao, Hui, Minhao Cao, and Rui Peng. "Artificial neural network based software fault detection and correction prediction models considering testing effort." Applied Soft Computing 94 (2020): 106491.
[25] Boucher, Alexandre, and Mourad Badri. "Predicting fault-prone classes in object-oriented software: an adaptation of an unsupervised hybrid SOM algorithm." In 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS), pp. 306-317. IEEE, 2017.
[26] Bianchi, Leonora, Marco Dorigo, Luca Maria Gambardella, and Walter J. Gutjahr. "A survey on metaheuristics for stochastic combinatorial optimization." Natural Computing 8, no. 2 (2009): 239-287.
[27] Whitley, Darrell. "A genetic algorithm tutorial." Statistics and Computing 4, no. 2 (1994): 65-85.
[28] Bianchi, Leonora, Marco Dorigo, Luca Maria Gambardella, and Walter J. Gutjahr. "A survey on metaheuristics for stochastic combinatorial optimization." Natural Computing 8, no. 2 (2009): 239-287.
[29] Sarro, Federica, Sergio Di Martino, Filomena Ferrucci, and Carmine Gravino. "A further analysis on the use of genetic algorithm to configure support vector machines for inter-release fault prediction." In Proceedings of the 27th Annual ACM Symposium on Applied Computing, pp. 1215-1220. 2012.

References
[30] Erturk, Ezgi, and Ebru Akcapinar Sezer. "A comparison of some soft computing methods for software fault prediction." Expert Systems with Applications 42, no. 4 (2015): 1872-1879.
[31] Rathore, Santosh S., and Sandeep Kumar. "An empirical study of some software fault prediction techniques for the number of faults prediction." Soft Computing 21, no. 24 (2017): 7417-7434.
[32] Alsghaier, Hiba, and Mohammed Akour. "Software fault prediction using particle swarm algorithm with genetic algorithm and support vector machine classifier." Software: Practice and Experience 50, no. 4 (2020): 407-427.
[33] Marini, Federico, and Beata Walczak. "Particle swarm optimization (PSO). A tutorial." Chemometrics and Intelligent Laboratory Systems 149 (2015): 153-165.
[34] Jin, Cong, En-Mei Dong, and Li-Na Qin. "Software fault prediction model based on adaptive dynamical and median particle swarm optimization." In 2010 Second International Conference on Multimedia and Information Technology, vol. 1, pp. 44-47. IEEE, 2010.
[35] Kayarvizhy, N., S. Kanmani, and Rhymend Uthariaraj. "Improving Fault prediction using ANN-PSO in object oriented systems." International Journal of Computer Applications 73, no. 3 (2013): 0975-8887.
[36] Jin, Cong, and Shu-Wei Jin. "Prediction approach of software fault-proneness based on hybrid artificial neural network and quantum particle swarm optimization." Applied Soft Computing 35 (2015): 717-725.

References
[37] Turabieh, Hamza, Majdi Mafarja, and Xiaodong Li. "Iterated feature selection algorithms with layered recurrent neural network for software fault prediction." Expert Systems with Applications 122 (2019): 27-42.
[38] Dorigo, Marco, and Christian Blum. "Ant colony optimization theory: A survey." Theoretical Computer Science 344, no. 2-3 (2005): 243-278.
[39] Vandecruys, Olivier, David Martens, Bart Baesens, Christophe Mues, Manu De Backer, and Raf Haesen. "Mining software repositories for comprehensible software fault prediction models." Journal of Systems and Software 81, no. 5 (2008): 823-839.
[40] Singh, Pradeep Kumar, R. Panda, and Om Prakash Sangwan. "A critical analysis on software fault prediction techniques." World Applied Sciences Journal 33, no. 3 (2015): 371-379.
[41] Sharma, Deepak, and Pravin Chandra. "Software fault prediction using machine-learning techniques." In Smart Computing and Informatics, pp. 541-549. Springer, Singapore, 2018.
[42] Singh, Pradeep, and Shrish Verma. "ACO based comprehensive model for software fault prediction." International Journal of Knowledge-based and Intelligent Engineering Systems 24, no. 1 (2020): 63-71.
[43] Rish, Irina. "An empirical study of the naive Bayes classifier." In IJCAI 2001 Workshop on Empirical Methods in Artificial Intelligence, vol. 3, no. 22, pp. 41-46. 2001.

References
[44] Turhan, Burak, and Ayse Bener. "Analysis of Naive Bayes' assumptions on software fault data: An empirical study." Data & Knowledge Engineering 68, no. 2 (2009): 278-290.
[45] Catal, Cagatay, Ugur Sevim, and Banu Diri. "Practical development of an Eclipse-based software fault prediction tool using Naive Bayes algorithm." Expert Systems with Applications 38, no. 3 (2011): 2347-2353.
[46] Wang, L. J., Q. P. Hu, and Min Xie. "Bayesian analysis for NHPP-based software fault detection and correction processes." In 2015 IEEE International Conference on Industrial Engineering and Engineering Management (IEEM), pp. 1046-1050. IEEE, 2015.
[47] Dhanajayan, Rajaganapathy Chinna Gounder, and Subramani Appavu Pillai. "SLMBC: spiral life cycle model-based Bayesian classification technique for efficient software fault prediction and classification." Soft Computing 21, no. 2 (2017): 403-415.
[48] Zadeh, L. A. "Knowledge representation in fuzzy logic." IEEE Transactions on Knowledge and Data Engineering 1, no. 1 (1989): 89-100.
[49] Yuan, Xiaohong, Taghi M. Khoshgoftaar, Edward B. Allen, and K. Ganesan. "An application of fuzzy clustering to software quality prediction." In Proceedings 3rd IEEE Symposium on Application-Specific Systems and Software Engineering Technology, pp. 85-90. IEEE, 2000.
[50] Singh, Pradeep, Nikhil R. Pal, Shrish Verma, and Om Prakash Vyas. "Fuzzy rule-based approach for software fault prediction." IEEE Transactions on Systems, Man, and Cybernetics: Systems 47, no. 5 (2016): 826-837.

References
[51] Chatterjee, Subhashis, Bappa Maji, and Hoang Pham. "A fuzzy rule-based generation algorithm in interval type-2 fuzzy logic system for fault prediction in the early phase of software development." Journal of Experimental & Theoretical Artificial Intelligence 31, no. 3 (2019): 369-391.
[52] Timmis, J., and M. Neal. "Investigating the Evolution and Stability of a Resource Limited Artificial Immune System." In Genetic and Evolutionary Computation Conference, Workshop Program, Nevada, USA, pp. 40-41. 2000.
[53] De Castro, L. N., and F. J. Von Zuben. "The Clonal Selection Algorithm with Engineering Applications." In Proceedings of Artificial Immune System Workshop, Genetic and Evolutionary Computation Conference, Workshop Program, Nevada, USA, pp. 36-37. 2000.
[54] Watkins, A. "AIRS: A Resource Limited Artificial Immune Classifier." Master's thesis, Mississippi State University, 2001.
[55] Catal, Cagatay, and Banu Diri. "Software fault prediction with object-oriented metrics based artificial immune recognition system." In International Conference on Product Focused Software Process Improvement, pp. 300-314. Springer, Berlin, Heidelberg, 2007.
[56] Catal, Cagatay, Banu Diri, and Bulent Ozumut. "An artificial immune system approach for fault prediction in object-oriented software." In 2nd International Conference on Dependability of Computer Systems (DepCoS-RELCOMEX'07), pp. 238-245. IEEE, 2007.
[57] Haouari, Ahmed Taha, Labiba Souici-Meslati, Fadila Atil, and Djamel Meslati. "Empirical comparison and evaluation of Artificial Immune Systems in inter-release software fault prediction." Applied Soft Computing 96 (2020): 106686.
[58] Rhmann, Wasiur, Babita Pandey, Gufran Ansari, and D. K. Pandey. "Software fault prediction based on change metrics using hybrid algorithms: An empirical study." Journal of King Saud University-Computer and Information Sciences 32, no. 4 (2020): 419-424.

References
[59] Banga, Manu, and Abhay Bansal. "Proposed software faults detection using hybrid approach." Security and Privacy: e103.
[60] El Emam, Khaled, Saida Benlarbi, Nishith Goel, and Shesh N. Rai. "Comparing case-based reasoning classifiers for predicting high risk software components." Journal of Systems and Software 55, no. 3 (2001): 301-320.
[61] Shafi, Sana, Syed Muhammad Hassan, Afsah Arshaq, Malik Jahan Khan, and Shafay Shamail. "Software quality prediction techniques: A comparative analysis." In 2008 4th International Conference on Emerging Technologies, pp. 242-246. IEEE, 2008.
[62] Turhan, Burak, Gozde Kocak, and Ayse Bener. "Software defect prediction using call graph based ranking (CGBR) framework." In 2008 34th Euromicro Conference Software Engineering and Advanced Applications, pp. 191-198. IEEE, 2008.
[63] Bhattacharya, Pamela, Marios Iliofotou, Iulian Neamtiu, and Michalis Faloutsos. "Graph-based analysis and prediction for software evolution." In 2012 34th International Conference on Software Engineering (ICSE), pp. 419-429. IEEE, 2012.
[64] Abandah, Hesham, and Izzat M. Alsmadi. "Call graph based metrics to evaluate software design quality." (2013).
[65] Li, Jian, Pinjia He, Jieming Zhu, and Michael R. Lyu. "Software defect prediction via convolutional neural network." In 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS), pp. 318-328. IEEE, 2017.
[66] Nuñez-Varela, Alberto S., Héctor G. Pérez-Gonzalez, Francisco E. Martínez-Perez, and Carlos Soubervielle-Montalvo. "Source code metrics: A systematic mapping study." Journal of Systems and Software 128 (2017): 164-197.

Thanks