COVID-19 Detection using Deep Learning
Abhishek Raj, Vikas Jat, Abhay Tiwari, Raj Kumar
Abstract
Millions of people have been impacted by the COVID-19 pandemic, which has resulted in severe morbidity and mortality. Chest radiography is one of the quickest and most accessible diagnostic techniques for detecting COVID-19. This project aims to develop a reliable method for detecting COVID-19 in patients by analyzing chest X-rays. Three deep learning systems based on the LeNet, ResNet, and VGG-19 architectures are developed on a chest radiography dataset, and Grad-CAM is applied to the best model to localize COVID-19-affected areas in the lungs. Using these three popular deep learning techniques, we developed models that classify chest X-ray images into four groups: COVID-19, viral pneumonia, non-COVID lung infection (lung opacity), and normal health. These models can be utilized in the detection of respiratory illnesses, especially COVID-19. The project also provides a visualization technique that identifies the areas of a patient's lungs affected by the virus, supporting diagnosis and treatment planning.
Introduction
Problem Statement: The COVID-19 pandemic has led to significant morbidity and mortality worldwide, necessitating swift and accurate diagnostic methods to identify affected individuals. While chest radiography offers a quick and accessible means of detection, manual interpretation is subjective and time-consuming, leading to delays in diagnosis and treatment. As such, there is a pressing need for automated and reliable techniques to detect COVID-19 from chest X-rays, thereby improving diagnostic efficiency and facilitating timely intervention.
Solution to the Problem: To address the challenges associated with COVID-19 detection from chest X-rays, this project proposes the development of three deep learning systems based on the established LeNet, ResNet, and VGG-19 architectures. These models will be trained on a diverse dataset comprising images from COVID-19 cases, viral pneumonia, non-COVID lung infections, and normal health conditions. By harnessing the capabilities of deep learning, these models aim to accurately classify chest X-ray images into distinct categories, enabling rapid and precise identification of COVID-19 and other respiratory illnesses. Additionally, a visualization technique, Grad-CAM, will be applied to the best-performing model to identify areas affected by the virus in the lungs, aiding in treatment planning and monitoring.
Contd. Research Gap: Despite the growing interest in applying deep learning to medical imaging for disease diagnosis, there remains a research gap in the development of robust and interpretable models specifically tailored for COVID-19 detection from chest X-rays. Existing approaches often lack scalability, generalization, and interpretability, hindering their adoption in clinical settings. Moreover, the scarcity of publicly available and annotated datasets poses a significant challenge for training and validating deep learning models for COVID-19 detection. By addressing these gaps, this project aims to contribute novel insights and methodologies to the field of medical diagnostics, advancing our understanding of automated COVID-19 detection and paving the way for more effective diagnostic tools. Motivation: The motivation behind this project stems from the urgent need to enhance diagnostic capabilities in the face of the COVID-19 pandemic. The ability to rapidly and accurately identify individuals infected with the virus is crucial for implementing timely interventions, containing transmission, and mitigating the impact on public health. By leveraging deep learning techniques to develop automated diagnostic methods, this project seeks to empower healthcare professionals with tools that can facilitate early detection, improve patient outcomes, and alleviate the strain on healthcare systems. The potential societal impact of this research is immense, as it has the potential to save lives, reduce healthcare costs, and contribute to global efforts to combat the pandemic. Contributions: Through the development of deep learning-based models for COVID-19 detection from chest X-rays and the implementation of a visualization technique for identifying affected areas in the lungs, this project aims to make several contributions to the field of medical diagnostics. These contributions include the development of novel methodologies for automated disease detection, the creation of publicly available datasets for benchmarking and validation purposes, and the advancement of our understanding of COVID-19 pathology through interpretable model visualization. Additionally, by providing healthcare professionals with reliable and efficient diagnostic tools, this project seeks to improve patient care, enhance public health outcomes, and contribute to the global fight against COVID-19.
Literature review
Article: COVID-19 Detection Using Deep Learning Algorithm on Chest X-ray Images
The advancement of artificial intelligence, particularly deep learning, has shown promising results in detecting COVID-19 from chest X-ray images, a quick and accessible diagnostic method. Several studies have utilized deep learning techniques, including convolutional neural networks (CNNs), to develop models for COVID-19 detection. These studies have explored various CNN architectures, transfer learning methods, and dataset sizes to achieve accurate diagnosis. However, despite significant progress, challenges remain, particularly regarding the credibility of outcomes due to limited dataset sizes and varying model performance. While some studies have achieved high accuracy rates using small datasets, others with larger datasets have not consistently produced accuracies in the high nineties. Additionally, the compilation time of implemented models has not been thoroughly evaluated. Addressing these issues, the study proposes a modified MobileNetV2 model for COVID-19 detection, aiming for higher accuracy in shorter time. To ensure the credibility of the model, a large dataset of 52,000 augmented chest X-ray images is utilized for training and testing. Furthermore, a Wilcoxon signed-rank test is conducted to calculate p-values, providing statistical validation of the model's performance. By addressing dataset limitations and focusing on model efficiency, the study seeks to provide a reliable and efficient method for COVID-19 diagnosis using chest X-ray images.
Contd.
Article: Detection and analysis of COVID-19 in medical images using deep learning techniques
The COVID-19 pandemic has spurred the urgent need for efficient diagnostic tools, especially in identifying respiratory symptoms quickly. Medical imaging, particularly X-rays and CT scans, has emerged as a frontline approach in diagnosing COVID-19, given its simplicity and speed. However, the influx of patients has overwhelmed healthcare systems, necessitating faster and more accurate diagnostic methods. Deep learning, with its ability to learn intricate features from vast datasets, presents a promising solution to this challenge. While previous studies have shown promising results in pneumonia detection using deep learning, the co-occurrence of various respiratory illnesses, including COVID-19, poses a new challenge. Recent research has focused on leveraging deep learning techniques to detect COVID-19 pneumonia from X-ray and CT scan images, achieving high accuracy rates. This article aims to conduct a comprehensive comparative analysis of state-of-the-art methods for classifying COVID-19 X-ray and CT scan images, evaluating the effectiveness of various deep neural network structures. The study proposes a novel framework for evaluating algorithms using three publicly available datasets and applies transfer learning to address limited training data issues. With an extensive evaluation, the proposed framework demonstrates significant improvements in accuracy, F1-score, precision, and recall metrics compared to previous approaches. Notably, the VGG16 deep transfer learning model emerges as the most effective, achieving exceptional accuracy rates of up to 99% in binary and three-class classification tasks.
Data description
The models were developed using the COVID-19 Radiography dataset sourced from Kaggle, curated by researchers from Qatar University, Doha, Qatar, and the University of Dhaka, Bangladesh, in collaboration with partners from Pakistan and Malaysia. The dataset comprises chest X-ray images representing three pneumonia-related chest conditions alongside normal health cases. With over 42,000 files in total, the dataset is organized into two folders: one containing the chest X-ray images and the other containing their corresponding lung masks. Only the chest X-ray images were utilized for this project: 3,616 COVID-19-positive cases, 10,192 Normal cases, 6,012 Lung Opacity (non-COVID lung infection) cases, and 1,345 Viral Pneumonia cases, for a total of 21,165 images.
Model Implementation
Data Preprocessing
The dataset provided on Kaggle contains both chest X-ray images and their corresponding masks. Our implementation does not use the mask images, so they are ignored. The images are randomly divided into three subsets – training, testing, and validation – in an 80-10-10 ratio, used respectively to train the models, evaluate them, and tune parameters during training. Images in the training and validation sets are subjected to random horizontal flips for data augmentation; this increases the effective variety of training samples and helps the models learn to tolerate minor variations in the data. All images are then resized to the same 224x224 shape and converted to RGB color mode, which makes the models easier to train and helps prevent overfitting. These preprocessed images are used for training the models.
Developing Deep Learning Models
We trained three independent models – LeNet-5, ResNet-101, and VGG-19 – using Keras with a TensorFlow backend. These models were chosen for their well-known success on image classification problems. One additional model was also developed to check the efficiency of the DenseNet architecture on this problem.
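As an illustration, the preprocessing pipeline described above can be sketched with Keras' ImageDataGenerator. This is a minimal sketch, not the exact project code: the directory paths and batch size are hypothetical placeholders, and the 80-10-10 split into separate train/val/test folders is assumed to have been done beforehand.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

IMG_SIZE = (224, 224)   # all images resized to 224x224, as described above
BATCH_SIZE = 32         # hypothetical batch size

# Training/validation images get random horizontal flips for augmentation;
# pixel values are rescaled to [0, 1] in every subset.
train_gen = ImageDataGenerator(rescale=1.0 / 255, horizontal_flip=True)
val_gen = ImageDataGenerator(rescale=1.0 / 255, horizontal_flip=True)
test_gen = ImageDataGenerator(rescale=1.0 / 255)

# 'data/train' etc. are placeholder paths to an 80-10-10 split of the
# four-class dataset (COVID, Normal, Lung Opacity, Viral Pneumonia).
train_data = train_gen.flow_from_directory(
    "data/train", target_size=IMG_SIZE, color_mode="rgb",
    class_mode="categorical", batch_size=BATCH_SIZE)
val_data = val_gen.flow_from_directory(
    "data/val", target_size=IMG_SIZE, color_mode="rgb",
    class_mode="categorical", batch_size=BATCH_SIZE)
test_data = test_gen.flow_from_directory(
    "data/test", target_size=IMG_SIZE, color_mode="rgb",
    class_mode="categorical", batch_size=BATCH_SIZE, shuffle=False)
```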
LeNet-5
LeNet is a convolutional neural network (CNN) proposed by Yann LeCun in 1998 and was one of the first successful applications of CNNs. LeNet-5 is a simple seven-layer architecture: two convolutional layers, each followed by a subsampling (pooling) layer, then two fully connected layers concluding with an output layer. During training, the weights of the model are adjusted to minimize the difference between the predicted class and the actual class of the chest X-ray image, and are updated using the stochastic gradient descent (SGD) optimization algorithm.
Mathematical formulation
Convolutional layer: $C_i = \mathrm{ReLU}(W_i * C_{i-1} + b_i)$
Max-pooling layer: $P_i = \mathrm{MaxPool}(C_i)$
Fully connected layer (FC): $F_i = \mathrm{ReLU}(W_{\mathrm{FC},i} \cdot F_{i-1} + b_{\mathrm{FC},i})$
Output layer: $\mathrm{Output} = \mathrm{Softmax}(W_{\mathrm{out}} \cdot F_{\mathrm{last}} + b_{\mathrm{out}})$
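A minimal Keras sketch of the LeNet-5 topology, adapted here to this project's 224x224 RGB inputs and four output classes (the original network used 32x32 grayscale inputs); the filter counts follow the classic design, and ReLU with max pooling is used to match the formulation above (the 1998 original used tanh and average pooling).

```python
from tensorflow.keras import layers, models

# LeNet-5-style network: two conv+pool stages, then fully connected layers.
lenet = models.Sequential([
    layers.Conv2D(6, kernel_size=5, activation="relu",
                  input_shape=(224, 224, 3)),            # C1
    layers.MaxPooling2D(pool_size=2),                    # S2
    layers.Conv2D(16, kernel_size=5, activation="relu"), # C3
    layers.MaxPooling2D(pool_size=2),                    # S4
    layers.Flatten(),
    layers.Dense(120, activation="relu"),                # C5 / FC
    layers.Dense(84, activation="relu"),                 # F6
    layers.Dense(4, activation="softmax"),               # 4-class output
])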
The model is set to train for a maximum of 20 epochs with the categorical cross-entropy loss function, and an early stopping callback is applied to prevent overfitting. Training stopped at the 8th epoch, and the best model is saved.
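A sketch of this training setup, assuming the `lenet` model and the data generators from the earlier snippets; the SGD optimizer follows the LeNet description above, while the monitored quantity (validation loss) and the patience value are assumptions.

```python
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

lenet.compile(optimizer="sgd",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# Stop when validation loss stops improving and keep the best weights.
callbacks = [
    EarlyStopping(monitor="val_loss", patience=3,
                  restore_best_weights=True),   # patience is an assumption
    ModelCheckpoint("lenet_best.h5", monitor="val_loss",
                    save_best_only=True),       # saves the best model
]

history = lenet.fit(train_data, validation_data=val_data,
                    epochs=20, callbacks=callbacks)
```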
ResNet-101
The ResNet model, proposed in 2015 by Kaiming He et al., is a deep CNN architecture with residual connections that allow for the training of very deep networks. ResNet is made up of several residual blocks, each with two or three convolutional layers and a shortcut connection that bypasses them. The shortcut connection allows the input to be added to the residual block's output, thereby avoiding the vanishing gradient problem and allowing the model to learn identity mappings.
Contd.
We used a variant of ResNet that is 101 layers deep, including convolutional layers, batch normalization layers, and shortcut connections, based on the Residual Network (ResNet) architecture introduced in 2015. The model we used is pre-trained on the ImageNet dataset, which consists of over a million images in 1,000 categories, meaning it already knows how to detect important features in an image. We added two new fully connected layers to the model using the Keras functional API: the first consists of 256 neurons with a ReLU activation function, followed by a final output layer with a softmax activation that outputs probabilities for each of the categories. After adding the new layers, the pre-trained layers of the ResNet-101 model are frozen, meaning their weights are not updated during training; this prevents overfitting and preserves the features the pre-trained model has learned. The model is then compiled and trained with an early stopping callback on validation loss, the categorical cross-entropy loss function, and Adam as the optimizer. The model is defined to train for a maximum of 20 epochs; training stopped at the 12th epoch as shown in Fig. 3, and the best model is saved for further use.
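A minimal sketch of this transfer learning setup; the layer sizes follow the text (a 256-unit ReLU layer plus a 4-way softmax head), while the global average pooling step between the backbone and the new head is an assumption needed to flatten the convolutional features.

```python
from tensorflow.keras.applications import ResNet101
from tensorflow.keras import layers, models

# ImageNet-pretrained backbone without its classification head.
base = ResNet101(weights="imagenet", include_top=False,
                 input_shape=(224, 224, 3))
base.trainable = False  # freeze pretrained layers to preserve learned features

# New head built with the Keras functional API, as described above.
x = layers.GlobalAveragePooling2D()(base.output)  # pooling choice is an assumption
x = layers.Dense(256, activation="relu")(x)
outputs = layers.Dense(4, activation="softmax")(x)

resnet_model = models.Model(inputs=base.input, outputs=outputs)
resnet_model.compile(optimizer="adam",
                     loss="categorical_crossentropy",
                     metrics=["accuracy"])
```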
Mathematical formulation
Basic building block (residual block): $F(X) = \mathrm{ReLU}(W_2 * \mathrm{ReLU}(W_1 * X + b_1) + b_2) + X$
where $X$ is the input to the residual block, $W_1$ and $W_2$ are the weights of the convolutional layers, and $b_1$ and $b_2$ are the corresponding bias terms.
Fully connected layers (FC): the final fully connected layers are applied for classification.
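The residual block above can be written directly in Keras. This sketch follows the formula as stated, with the addition of $X$ as the last step; the filter count and kernel size are illustrative, not the exact ResNet-101 configuration (which also uses a final ReLU after the addition and batch normalization).

```python
from tensorflow.keras import layers

def residual_block(x, filters=64):
    """F(X) = ReLU(W2 * ReLU(W1 * X + b1) + b2) + X (identity shortcut)."""
    shortcut = x
    y = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)  # ReLU(W1*X + b1)
    y = layers.Conv2D(filters, 3, padding="same", activation="relu")(y)  # ReLU(W2*... + b2)
    return layers.Add()([y, shortcut])  # shortcut connection adds the input back
```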
VGG-19
VGG-19 is a convolutional neural network (CNN) architecture introduced in 2014 by researchers at the Visual Geometry Group (VGG) at the University of Oxford. It is a 19-layer deep learning model based on the VGG architecture, comprising 16 convolutional layers and 3 fully connected layers. As with ResNet, our VGG-19 model is initialized with weights pre-trained on the ImageNet dataset. The VGG architecture is based on the simple principle of stacking multiple convolutional layers with small (3x3) filters, which achieves large receptive fields with fewer parameters than a single layer of large filters.
Contd.
We used the transfer learning approach, taking the model's ImageNet-trained weights as the base for training. The pre-trained model is used without its top layer, which is replaced with a custom head consisting of a Flatten layer and two fully connected layers of 1024 and 4 neurons, the final one with a softmax activation function. The resulting model is created by specifying these input and output layers and compiled with the categorical cross-entropy loss and the Adam optimizer. The layers of the pre-trained VGG-19 model are then frozen to prevent retraining and overfitting on the new dataset. The model is trained on the new dataset for a maximum of 20 epochs with early stopping to prevent overfitting, which stopped the training at the 6th epoch, saving the best model.
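A minimal sketch of the VGG-19 transfer learning head described above; the layer sizes (Flatten, 1024 units, 4-way softmax) follow the text, while the ReLU activation on the 1024-unit layer is an assumption.

```python
from tensorflow.keras.applications import VGG19
from tensorflow.keras import layers, models

# ImageNet-pretrained VGG-19 without its top (classification) layers.
base = VGG19(weights="imagenet", include_top=False,
             input_shape=(224, 224, 3))
base.trainable = False  # freeze pretrained layers

x = layers.Flatten()(base.output)
x = layers.Dense(1024, activation="relu")(x)  # ReLU here is an assumption
outputs = layers.Dense(4, activation="softmax")(x)

vgg_model = models.Model(inputs=base.input, outputs=outputs)
vgg_model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
```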
Mathematical formulation
Convolutional layer: $C_i = \mathrm{ReLU}(W_i * C_{i-1} + b_i)$
Max-pooling layer: $P_i = \mathrm{MaxPool}(C_i)$
Fully connected layer (FC): $F_i = \mathrm{ReLU}(W_{\mathrm{FC},i} \cdot F_{i-1} + b_{\mathrm{FC},i})$
Output layer: $\mathrm{Output} = \mathrm{Softmax}(W_{\mathrm{out}} \cdot F_{\mathrm{last}} + b_{\mathrm{out}})$
Grad-CAM Visualization
Grad-CAM is a technique for visualizing the significant areas within an image that contribute to its classification. These highlighted regions offer valuable insight into the features the model uses during classification, aiding interpretation of its decisions. In the context of detecting COVID-19 from chest X-rays, Grad-CAM generates heatmaps highlighting the regions crucial for classification. These heatmaps, overlaid onto the original X-ray images, elucidate the areas of the lungs impacted by the virus, thereby aiding diagnosis. By applying Grad-CAM to the best-performing model, we can effectively visualize the areas pivotal for classification: the heatmap overlay on the chest X-ray illustrates the affected regions within the lungs, giving a clearer picture of the model's decision-making process.
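A compact sketch of how Grad-CAM can be implemented in TensorFlow/Keras for the trained models: the gradient of the predicted class score with respect to the last convolutional feature maps is spatially averaged to weight those maps, producing a heatmap. The layer name in the usage comment is the final convolutional layer of Keras' ResNet101 and should be adapted to whichever model is actually used.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import models

def grad_cam(model, image, last_conv_layer_name, class_index=None):
    """Grad-CAM heatmap for one preprocessed image of shape (224, 224, 3)."""
    # Model mapping the input to (last conv feature maps, predictions).
    grad_model = models.Model(
        model.inputs,
        [model.get_layer(last_conv_layer_name).output, model.output])

    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(image[np.newaxis, ...])
        if class_index is None:
            class_index = tf.argmax(preds[0])  # explain the top prediction
        class_score = preds[:, class_index]

    # Gradient of the class score w.r.t. the conv feature maps,
    # averaged over space to get one importance weight per channel.
    grads = tape.gradient(class_score, conv_out)
    weights = tf.reduce_mean(grads, axis=(0, 1, 2))

    # Weighted sum of feature maps, ReLU, and normalization to [0, 1].
    cam = tf.reduce_sum(conv_out[0] * weights, axis=-1)
    cam = tf.nn.relu(cam)
    cam = cam / (tf.reduce_max(cam) + 1e-8)
    return cam.numpy()  # upsample and overlay on the X-ray for display

# Example for the ResNet-101 model from the earlier sketch:
# heatmap = grad_cam(resnet_model, img_array, "conv5_block3_out")
```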
Results
To evaluate performance, each model is tested on the same test subset and the same system. We use classification reports and confusion matrices to visualize the class-wise performance of the models; the confusion matrices show the number of true positives for each class, which is a good criterion for checking how well the models performed. Accuracy scores of each model on the test set are as follows:
- LeNet: 78.37%
- VGG-19: 91.4%
- ResNet-101: 92.58%
The confusion matrices of the four models developed and the classification reports, summarizing per-class accuracy, precision, recall, and F1-score, are shown in the accompanying figures.
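An evaluation sketch using scikit-learn's classification report and confusion matrix, assuming the `resnet_model` and the `test_data` generator from the earlier snippets (built with `shuffle=False` so the true labels stay aligned with the predictions).

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

# Predicted class indices for the whole test set.
probs = resnet_model.predict(test_data)
y_pred = np.argmax(probs, axis=1)
y_true = test_data.classes  # true labels from the generator

class_names = list(test_data.class_indices.keys())
print(classification_report(y_true, y_pred, target_names=class_names))
print(confusion_matrix(y_true, y_pred))
```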
Conclusion
The findings indicate that ResNet-101 stands out as the most effective model for classifying chest X-rays and detecting COVID-19 in patients' lungs. Notably, F1-scores for Normal and Viral Pneumonia images are slightly higher than for the other classes, likely due to class imbalance. Attempts to address this imbalance through techniques like oversampling, undersampling, and class weighting resulted in model overfitting. Initially, when following the standard procedure outlined in the paper, ResNet showed signs of overfitting, so adjustments to hyperparameters and activation functions within the custom layers were necessary to develop a robust model. Amid concerns over ResNet's performance, exploration of an alternative approach led to training a DenseNet model, which also exhibited strong results with an 86% accuracy rate. It is worth noting that the best models were saved after a relatively low number of epochs, likely influenced by the deep architectures employed. Additionally, the relatively short training times (approximately 18 minutes for LeNet, 54 minutes for ResNet, and 31 minutes for VGG) were facilitated by the computing capabilities of the training machine; longer training times were observed on other systems.
Thank you :)
Under the guidance of: Jitendra Tembhurne