project-ppt-on-breast-cancer-prediction-using-ml

1kn20cs050 194 views 13 slides May 25, 2024

About This Presentation

A final-year project in computer science and engineering that applies several machine learning algorithms to breast cancer prediction.


Slide Content

Prediction for Breast Cancer Using Natural Language Processing Algorithms
Project Batch Details:
LUCKY SHETTY [1KN20CS015]
NAVEENA C K [1KN20CS026]
VISHNU BABU B [1KN20CS050]
Project Guide: Prof. Kusum Rajput, Dept. of CSE

Abstract: Breast cancer has overtaken lung cancer as the most commonly diagnosed cancer worldwide, and it is the most common cancer among women. The combined sampling method SMOTE-ENN is used to address class imbalance in the samples, and the data are standardized to improve separability. The final results of each model are obtained using 10-fold cross-validation.
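The standardization and 10-fold evaluation steps named in the abstract can be sketched in standard-library Python. This is a minimal illustration, not the project's code: the function names `standardize` and `stratified_folds` are hypothetical, stratifying the folds by class is an assumption (the abstract only says "10-fold"), and the SMOTE-ENN resampling step is not implemented here. The sketch computes scaling statistics from the training rows only, so the held-out fold does not influence preprocessing.

```python
import random
from statistics import mean, pstdev

def standardize(train_rows, test_rows):
    """Z-score each column using statistics from the training rows only."""
    cols = list(zip(*train_rows))
    mus = [mean(c) for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    sigmas = [pstdev(c) or 1.0 for c in cols]

    def scale(rows):
        return [[(v - m) / s for v, m, s in zip(r, mus, sigmas)] for r in rows]

    return scale(train_rows), scale(test_rows)

def stratified_folds(labels, k=10, seed=0):
    """Return k lists of row indices, each with roughly the overall class ratio."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    folds = [[] for _ in range(k)]
    offset = 0
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Deal each class round-robin so every fold gets its share.
        for j, i in enumerate(idxs):
            folds[(offset + j) % k].append(i)
        offset += len(idxs)
    return folds

# Usage: each fold serves once as the held-out set.
labels = [0] * 70 + [1] * 30  # toy labels, not real data
for held_out in stratified_folds(labels, k=10):
    held = set(held_out)
    train_idx = [i for i in range(len(labels)) if i not in held]
    # Resampling (e.g. SMOTE-ENN) and model fitting would use train_idx only.
```

If resampling is added, applying it inside each training fold rather than before the split keeps synthetic samples out of the held-out data.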

Introduction: Breast cancer, one of the most common malignant tumors in women, has become a focus of public health attention worldwide. Machine learning, an important artificial intelligence technology, can extract features, discover patterns, and build predictive models from large amounts of medical data. In breast cancer diagnosis, machine learning has been widely applied and has achieved strong results.

Literature survey:
Logistic regression: A linear model used for binary classification; it passes a weighted sum of the features through a sigmoid to produce a probability. Suitable for predicting breast cancer risk based on multiple features.
Decision Trees: Non-linear model that uses a tree-like structure for classification. Can handle both categorical and continuous features.
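To make the logistic regression entry concrete, here is a minimal sketch in standard-library Python. It assumes plain batch gradient descent on the log loss with no regularization; the names `fit_logistic` and `predict_proba`, the learning rate, and the toy data are illustrative assumptions and are not taken from the project.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    """Probability of the positive class for one feature vector x."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def fit_logistic(X, y, lr=0.1, epochs=1000):
    """Fit weights w and bias b by batch gradient descent on the log loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi  # gradient of log loss w.r.t. the score
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Toy one-feature data: negative values are class 0, positive values are class 1.
w, b = fit_logistic([[-2.0], [-1.0], [1.0], [2.0]], [0, 0, 1, 1])
```

A predicted probability above 0.5 is then read as the positive class; in a clinical setting that threshold would normally be tuned rather than fixed.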

Random Forests: Ensemble learning method that combines multiple decision trees. Reduces overfitting relative to a single tree and typically improves accuracy.
Support Vector Machines: Use hyperplanes to separate data into different classes. Effective for high-dimensional feature spaces.
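The two prediction rules described here can be sketched briefly in Python. This shows only how a prediction is made, not how either model is trained: `majority_vote` stands in for the way a random forest combines its trees' outputs, and `hyperplane_side` for a linear SVM's decision rule. Both function names and the example values are hypothetical.

```python
from collections import Counter

def majority_vote(tree_predictions):
    """Combine per-tree labels into one label, as a forest does at prediction time."""
    return Counter(tree_predictions).most_common(1)[0][0]

def hyperplane_side(w, b, x):
    """Classify x by which side of the hyperplane w.x + b = 0 it falls on."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

# Three hypothetical trees disagree; the forest reports the majority label.
forest_label = majority_vote(["malignant", "benign", "malignant"])

# A hypothetical hyperplane x0 - x1 = 0 in two dimensions.
svm_label = hyperplane_side([1.0, -1.0], 0.0, [3.0, 1.0])
```

Averaging many trees is what dampens the variance of any single tree, while the SVM's training problem (not shown) is what chooses `w` and `b` to maximize the margin.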

Existing system:
SMOTE-ENN hybrid sampling strategy
XGBoost algorithm
Random Forest
Support Vector Machine
K-Nearest Neighbor (KNN)
Logistic Regression (LR)
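Of the methods listed, k-nearest neighbor is compact enough to sketch in full. The version below is a minimal standard-library illustration that assumes Euclidean distance and an unweighted vote; the function name `knn_predict`, the value `k=3`, and the toy points are assumptions, not details from the project.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Label x with the most common class among its k nearest training points."""
    # Pair each training point's Euclidean distance to x with its label, nearest first.
    by_distance = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Toy two-feature data forming two well-separated groups (not real measurements).
train_X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = ["benign"] * 3 + ["malignant"] * 3
label = knn_predict(train_X, train_y, [0.5, 0.5])
```

Because KNN relies on raw distances, it is sensitive to feature scale, which is one reason standardizing the data beforehand matters for this method.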

Drawbacks:
Limited Generalizability: A high accuracy rate on a specific training dataset does not guarantee similar performance on different datasets or in diverse clinical settings.
Lack of Contextual Understanding: NLP algorithms might struggle with understanding the contextual nuances of medical reports, including sarcasm, idiomatic expressions, or ambiguous language.
Inadequate Handling of Medical Jargon: Medical reports often contain complex terminology and abbreviations.

Limited Adaptability to Varied Data Sources: Healthcare data comes in diverse formats, including text, images, and numerical data.
Sensitivity to Preprocessing Techniques: The accuracy of NLP algorithms can heavily depend on the preprocessing techniques applied to the text data.
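The sensitivity to preprocessing noted in the drawbacks can be seen in a small Python sketch that tokenizes the same text two ways. The report sentence is invented for illustration and is not from any dataset; `naive_tokens` and `normalized_tokens` are hypothetical names for two common preprocessing choices.

```python
import re

def naive_tokens(text):
    """Split on whitespace only, keeping case and punctuation."""
    return text.split()

def normalized_tokens(text):
    """Lowercase the text and keep only runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())

report = "Pt. presents with IDC, ER+/PR+. No mets."  # invented example sentence

print(naive_tokens(report))
# ['Pt.', 'presents', 'with', 'IDC,', 'ER+/PR+.', 'No', 'mets.']
print(normalized_tokens(report))
# ['pt', 'presents', 'with', 'idc', 'er', 'pr', 'no', 'mets']
```

The normalized version yields a cleaner vocabulary but drops the "+" signs, so receptor status is lost, while the naive version keeps it but treats "IDC," and "IDC" as different tokens. Either choice changes what a downstream model can learn.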

Proposed system:

Hardware and software requirements:
Hardware requirements:
Requires a multi-core CPU, ideally with 16 GB RAM.
Utilizes a dedicated GPU, preferably NVIDIA GeForce RTX series or higher, for efficient deep learning model training.
Benefits from high-speed storage, such as SSDs, for quick data retrieval and model loading.

Software requirements:
Utilizes Python for coding and algorithm implementation.
Employs TensorFlow, PyTorch, NLTK, and spaCy for advanced natural language processing.
Utilizes Matplotlib and Seaborn for data visualization, and Git/GitHub for version control and collaboration.

Conclusion: The breast cancer prediction model demonstrates promising results in accurately predicting breast cancer.
Future Work: Further improve the model's performance by fine-tuning the parameters and optimizing the feature selection process.

Thank you