Prediction Of Diabetes Using Machine Learning.pptx
BinodepDas
19 slides
Aug 27, 2025
About This Presentation
Machine learning
Size: 2.75 MB
Language: en
Slide Content
Diabetes Prediction using Machine Learning Techniques Guided by – Mr. Prasanta Baruah Department Of Computer Science and Engineering Central Institute Of Technology, Kokrajhar Presented by - Ganggadhar Chandra Ray(201902021029) Smitesh Dutta(201902021030) Satish Padi()
Contents: Abstract; Introduction; Objectives; Methods and Techniques; Description of the data set; Collected Images; Experiment and results; System Architecture; Model Evaluation; Conclusion; References
Abstract Diabetes mellitus is one of the most common diseases worldwide, and its prevalence keeps increasing every day due to changing lifestyles, unhealthy eating habits and excess weight. Earlier studies have addressed diabetes prediction through the physical and chemical tests that are available for diagnosing diabetes. Data science methods have the potential to benefit other scientific fields by shedding new light on common questions. This work proposes an efficient way of detecting diabetes through machine learning, using several classification algorithms.
Objective The objective of the study is to classify the dataset and predict diabetes using machine learning classification algorithms. We also aim to design an interactive platform in which users can enter their own information and get a result for their diabetic condition.
DATASET INFORMATION We collected the data required to analyze diabetes. Part of the collected dataset is shown in the figure. The dataset has 768 rows (records) and 9 columns (attributes). Fig: Data-set
Data pre-processing We replaced the zero values present in some attribute columns with the mean of the column. We converted the output into binary form, assigning 1 for diabetic and 0 for non-diabetic. Figure: Data Pre-processing
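The pre-processing step can be sketched roughly as follows. This is a minimal illustration, not the presenters' code: the column names (`Glucose`, `BMI`, `Outcome`) and the toy values are hypothetical, and computing the mean over the non-zero entries only is an assumption, since the slides do not say how the mean was taken.

```python
import pandas as pd


def preprocess(df, zero_as_missing, target="Outcome"):
    """Replace zeros in the listed columns with the column mean and
    make the target binary (1 = diabetic, 0 = non-diabetic)."""
    out = df.copy()
    for col in zero_as_missing:
        # Assumption: zeros are placeholders for missing readings, so the
        # mean is taken over the non-zero entries of the column only.
        nonzero_mean = out.loc[out[col] != 0, col].mean()
        out[col] = out[col].astype(float).replace(0, nonzero_mean)
    # Assumption: the raw target is numeric; any positive value means diabetic.
    out[target] = (out[target] > 0).astype(int)
    return out


# Hypothetical toy data, only to show the effect of the function.
toy = pd.DataFrame(
    {"Glucose": [120, 0, 100], "BMI": [30.0, 0.0, 20.0], "Outcome": [1, 0, 1]}
)
clean = preprocess(toy, zero_as_missing=["Glucose", "BMI"])
print(clean)
```

Which columns are treated this way is a modelling decision: a zero is only replaced where it cannot be a real measurement.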
SYSTEM ARCHITECTURE Figure: System Architecture
OVERVIEW OF THE DATASET Figure: Dataset Overview
ANALYSIS OF DATASET In the dataset, the output column depends on the independent attributes. We separate the dependent and independent variables by storing the outcome in the y variable and the attributes in the X variable. We then split X and y into training and testing sets.
MODEL EVALUATION We built models using the Logistic Regression, K-Nearest Neighbors, Support Vector Machine, Decision Tree, Random Forest and Gradient Boosting algorithms. Random Forest achieved the highest accuracy among all the algorithms implemented for our dataset, so we chose it to create our final model. Figure: Model Evaluation
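A minimal sketch of separating the variables and splitting the data, assuming scikit-learn's `train_test_split`. The synthetic data frame, the column names, the 80/20 ratio, the stratification and the random seed are all assumptions for illustration; the slides do not state the split that was used.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in with the same shape as the dataset described in the
# slides (768 rows, 8 attributes plus 1 outcome column); values are random.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.normal(size=(768, 8)), columns=[f"attr_{i}" for i in range(8)]
)
df["Outcome"] = rng.integers(0, 2, size=768)

# Independent attributes go into X, the dependent outcome into y.
X = df.drop(columns="Outcome")
y = df["Outcome"]

# Assumed 80/20 split; stratify keeps the class ratio similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
print(X_train.shape, X_test.shape)
```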
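The comparison of the six algorithms can be sketched as below, assuming scikit-learn. The data here is synthetic (`make_classification`), so the printed accuracies say nothing about the real dataset, and the hyperparameters and the scaling step are assumptions rather than the presenters' settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data: 768 samples, 8 features, binary target.
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Scale-sensitive models are wrapped in a pipeline with a scaler (assumption).
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "K-Nearest Neighbors": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "Support Vector Machine": make_pipeline(StandardScaler(), SVC()),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(random_state=42),
    "Gradient Boosting": GradientBoostingClassifier(random_state=42),
}

# Fit every model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```

A single train/test split gives a noisy ranking on a dataset of this size; cross-validation (for example `cross_val_score`) would be a more stable basis for choosing a model.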
Visual Comparison Figure: Graphical Comparison
Training the model with Random Forest As we have seen, Random Forest is the best-suited algorithm for our dataset. We therefore retrained our model using the Random Forest Classifier on the entire dataset and tested it with new data. We also saved the model using the Joblib library. The following results have been obtained:
Save We saved our model using the Joblib library for future use. The saved model can be loaded at any time, so retraining is not required on every use. Figure: Joblib save
CREATING A USER INTERFACE We created a Graphical User Interface (GUI) so that users can easily access our model. The GUI asks users to enter their data for the attributes used in the model. The model then shows users their result based on the data they entered.
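Saving and reloading a fitted model with Joblib could look like the sketch below. The file name `diabetes_rf.joblib`, the temporary directory and the synthetic training data are hypothetical; only `joblib.dump` and `joblib.load` correspond to the step described in the slides.

```python
import os
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data; the real model was trained on the full dataset.
X, y = make_classification(n_samples=768, n_features=8, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=42).fit(X, y)

# Hypothetical file location, chosen only for this example.
model_path = os.path.join(tempfile.mkdtemp(), "diabetes_rf.joblib")
joblib.dump(model, model_path)

# Later, possibly in another process: load the model without retraining.
restored = joblib.load(model_path)
same = bool((restored.predict(X[:5]) == model.predict(X[:5])).all())
print("Restored model gives identical predictions:", same)
```

A saved model should be loaded with a scikit-learn version compatible with the one that produced it, and Joblib files should only be loaded from trusted sources because loading can execute arbitrary code.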
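The slides do not say which GUI toolkit was used, so the sketch below shows only the prediction logic that a GUI callback might invoke. The function name `predict_condition`, the constant `N_ATTRIBUTES` and the stand-in model are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

N_ATTRIBUTES = 8  # assumption: 9 dataset columns minus the outcome column


def predict_condition(model, values):
    """Turn one user's attribute values into a readable result string."""
    if len(values) != N_ATTRIBUTES:
        raise ValueError(f"expected {N_ATTRIBUTES} values, got {len(values)}")
    # The model expects a 2-D array: one row per person.
    row = np.asarray(values, dtype=float).reshape(1, -1)
    label = int(model.predict(row)[0])
    return "Diabetic" if label == 1 else "Non-diabetic"


# Stand-in model trained on synthetic data, only to exercise the function.
X, y = make_classification(n_samples=768, n_features=N_ATTRIBUTES, random_state=0)
model = RandomForestClassifier(random_state=42).fit(X, y)

result = predict_condition(model, X[0].tolist())
print(result)
```

In a GUI, the values would come from the input fields, and the returned string would be shown in a label or message box.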
PREDICTIVE MODEL DEMONSTRATION
CONCLUSION The Random Forest algorithm is commonly used for diabetes prediction with machine learning. It can also work robustly and efficiently when the training data are small, and it generalizes well as a classifier. The behavior of the algorithm can be examined by testing it on different datasets. In this way, studies can be expanded and clearer results can be obtained.