Crop recommendation system powerpoint.pptx

163 views 19 slides Nov 11, 2024

About This Presentation

Crop recommendation


Slide Content

Crop Recommendation System Using Neural Networks with Soil and Weather Data for Optimized Agricultural Decision-Making
IN13/00079/21 YUSUF KANTAI
IN13/00017/21 REBECCA WAMBUA
IN13/00040/21 TERRY LECHUTA

INTRODUCTION Agriculture plays a crucial role in ensuring food security and supporting the global economy. However, farmers often face challenges in deciding which crops to grow, especially with varying soil conditions, weather patterns, and climate change impacts (Stekhoven & Buhlmann, 2012). To address this challenge, we developed a Crop Recommendation System using Neural Networks that leverages soil and rainfall data for more effective decision-making. Integrating artificial intelligence into agriculture promotes sustainable farming by improving productivity, minimizing resource wastage, and boosting overall crop yield (Stekhoven & Buhlmann, 2012). This aligns with recent studies emphasizing the significance of data-driven models in solving agricultural challenges (Monteiro Thomas, 2022; Morales & Villalobos, 2023).

This system aims to assist farmers in selecting the most suitable crops for their specific conditions by analyzing data such as soil type, pH level, moisture, and local weather patterns like temperature and rainfall. Choosing the appropriate crop in relation to location-specific soil factors and climatic conditions is also vital for enhancing production (Wang, Shi, & Wen, 2023). Therefore, farmers must be equipped with tools that allow them to choose the best crop for the region's unique meteorological and soil conditions (Uddin, Matin, & Meyer, 2019). In developing nations, machine learning for agricultural planning has led to applications such as crop recommendation, crop disease diagnosis, and fertilizer management (Gow, 2019).

Farmers would benefit from a crop recommendation system that considers location-specific factors. The research described in this article aims to create a recommendation algorithm that suggests the most productive crops based on terrain and climate factors unique to a particular region (Sharma et al., 2020). In this paper, a Random Forest model was used to recommend crops depending on terrain and environmental factors (Shehadeh et al., 2021).

Tarek Z et al. (2023) predicted soil erosion status using a novel random forest model optimized by the random search method. This approach improves the accuracy of the model by fine-tuning its parameters, making it more effective at analyzing the complex factors that contribute to soil erosion. The optimized model provides a reliable tool for assessing soil erosion risk, which is crucial for sustainable land management and conservation efforts. The work by Bhadouria R, et al. (2019) examines the impact of climate change on agricultural ecosystems, focusing on the challenges and consequences that arise in this new era. It discusses how changes in temperature, precipitation patterns, and extreme weather events affect crop yields, soil health, and water availability, posing significant risks to agricultural productivity. Tailor Brown (2019) developed an Integrated Climatic Assessment Indicator (ICAI) specifically for assessing wheat production. The ICAI combines various climate-related factors, such as temperature, rainfall, and humidity, into a single comprehensive metric to evaluate their impact on wheat growth and yields. Paudel Jones (2019) used machine learning alongside agronomic principles from traditional crop modeling to create a reliable baseline for predicting crop yields on a large scale. By integrating data-driven machine learning methods with established agricultural knowledge, they improved the accuracy of yield predictions, considering factors like soil properties, weather conditions, and crop management practices.

CONTRIBUTION This paper makes several contributions to the field of agricultural technology by building on prior research in crop prediction, machine learning, and sustainable agriculture. Specifically, it advances the work of Wang et al. (2023) by addressing location-specific crop recommendation systems, focusing on optimizing agricultural yields based on environmental factors such as soil properties, moisture, and weather patterns. The study also complements the research by Uddin, Matin, and Meyer (2019) by emphasizing the development of climate-based decision-making tools for regions facing changing weather dynamics. Additionally, this paper aligns with the findings of Zhang et al. (2018) and Monteiro et al. (2022), who demonstrated the potential of AI and machine learning for crop forecasting and agricultural management. By integrating graph convolutional neural networks (GCNNs), the proposed system offers a novel approach to modeling spatial and environmental relationships, enhancing the predictive accuracy for crop selection. This extends the work of Sharma et al. (2020) by addressing regional variations in soil and climate more effectively. Furthermore, the system builds on ensemble learning methodologies, as highlighted by Sagi and Rokach (2018), while also refining terrain-based crop recommendations, following Shehadeh et al. (2021). By considering region-specific agricultural constraints and potentials, this research offers practical tools for sustainable farming practices, advancing the discussion initiated by Letey (2017) on the interplay between soil properties and crop productivity.

RELATED WORK Various applications of ML models in agriculture have been explored, such as crop yield prediction, weather forecasting, smart irrigation systems, crop disease prediction, and deciding minimum support prices (Young, L. J., 2016; Nandy and Singh, 2020; Sharma et al., 2020; Cravero and Sepulveda, 2021). Moreover, to achieve accurate predictions, researchers have used supervised ML algorithms for crop production prediction (Kaur, 2016; Shehadeh et al., 2021). In addition, many researchers proposed a methodology that uses Average Pearson Correlation (APC) and Coefficient of Variance (CV) to determine indications that reveal crop price fluctuation (Pereira et al., 2021). All these methods require the dataset to be extremely clearly described, which is difficult to generate in the context of Bangladesh. Van et al. (2020) conducted a comprehensive review, highlighting that soil composition, temperature, and rainfall are key features often used, with artificial neural networks (ANNs) being a popular algorithm in such models. Rashid et al. (2021) explored multiple machine learning (ML) algorithms, with a focus on predicting agricultural yields, particularly for palm oil. Kalimuthu et al. (2020) utilized the Naive Bayes algorithm, while Sharma et al. (2021) provided an extensive review of ML applications in agriculture, particularly in areas like livestock productivity through machine learning and computer vision for behavioral predictions. Cunha et al. (2018) developed a pre-season forecast model for soybean and maize, excluding NDVI data, integrating soil parameters from satellite data, climate forecasts, and rainfall information. Pande et al. (2021) built a practical ML-based system for crop yield prediction and fertilizer recommendations to boost yields. Reddy and Kumar (2021) proposed an ML-based approach to identify profitable crops and forecast yields using algorithms like SVM, ANN, RF, multivariate regression, and k-NN. Tahaseen and Moparthi (2021) demonstrated how various ML techniques can predict crop yields based on factors such as weather and temperature, with dataset availability influencing feature selection.

Sharma et al. (2021) examined methods for weed and pest detection, crop prediction, and leaf disease diagnosis, discussing the state of global agricultural yield forecasting. Ray et al. (2022) used distribution and correlation analysis to propose a model for 22 crop types, achieving an accuracy of 99.54%. Vashisht et al. (2022) applied extreme learning machines to predict rice yield based on geographical and seasonal factors. Gupta et al. (2022) emphasized the potential of ML to segment large datasets for yield prediction. Seireg et al. (2022) utilized cascading and stacking regression to predict blueberry yield with high accuracy, while Rasheed et al. (2021) tested a decision-making tool on historical agricultural data in Pakistan to predict net profits. Pant et al. (2021) defined the use of ML techniques to identify trends in data for crop prediction. Chandraprabha et al. (2021) utilized predictive analytics for soil nutrient forecasting, while Raja et al. (2022) demonstrated that ensemble techniques can enhance yield predictions over traditional classification methods. Cedric et al. (2022) presented a decision tree and k-NN-based ML model for forecasting crop yields in West Africa. Ali et al. (2022) employed remote sensing and statistical models to evaluate crop production. Pantazi et al. (2016) predicted wheat yield using unsupervised learning with satellite and soil data. Aghighi et al. (2018) predicted maize yield using time-series imagery from Landsat 8, introducing a modified feature selection method that outperformed others with 95% accuracy (Mariammal et al., 2021). Kumar et al. (2021) incorporated pre-processing, exploratory data analysis (EDA), and detection modules for plant disease prediction, achieving over 98% accuracy. Ziliani et al. (2022) combined the APSIM crop model with CubeSat images to produce high-resolution yield maps. Vlachopoulos et al. (2022) determined that random forests were the best for green area index (GAI) prediction with an RMSE of 10.86%. Goel and Mishra (2022) achieved 95.64% accuracy using deep learning for phenological data, while Elavarasan and Vincent (2020) found that Q-learning networks offered superior yield predictions. Haque et al. (2020) applied the ANN method to examine the impact of different factors on crop yield, using error rates to evaluate performance. Cunha and Silva (2020) developed a model that used weather forecasts and crop calendars to predict yields. Bose et al. (2016) employed spiking neural networks to analyze remote sensing data for crop yield prediction, achieving an accuracy of 95.64%.

Saeed and Lizhi (2019) developed a deep neural network (DNN) approach to enhance prediction accuracy, while Sun et al. (2020) integrated RNN and CNN for extracting spatial and temporal features from time-series data. Qiao et al. (2021) introduced a deep learning architecture combining RNNs and 3D CNNs for crop yield forecasting from multispectral images. Kalaiarasi and Anbarasi (2022) introduced a multiple kernel DNN to enhance learning capacity for medium-scale agricultural datasets. Abbaszadeh et al. (2022) combined deep learning networks like 3DCNN and ConvLSTM to predict soybean yield, with probabilistic outputs. Pang et al. (2020) used CNNs and hyperspectral imaging to model spectral data, comparing PCA and multidimensional scattering correction. Alebele et al. (2021) applied Gaussian kernel regression to rice yield prediction, outperforming other Bayesian methods. Martinez et al. (2021) utilized Gaussian processes to identify climate extremes affecting crop productivity, while Qiao et al. (2021) developed a 3D convolutional neural multi-kernel network to capture hierarchical features for yield prediction. Sivanantham et al. (2022) improved accuracy by using orthogonal basis functions and quantile regression. Li et al. (2022) focused on combining solar-induced fluorescence (SIF), satellite, and environmental data for crop yield prediction. Gupta et al. (2021) applied MapReduce architecture and K-means clustering for crop prediction based on soil and weather data. Liu et al. (2022) used MLR to predict plant diseases with 91% accuracy, while Udutalapally et al. (2021) trained CNNs to achieve 99.24% accuracy in disease prediction. Makkithaya and G. (2022) used deep residual networks for soybean prediction, while Mehta et al. (2021) compared CNN and LSTM models for crop yield forecasting. Mopidevi et al. (2022) employed deep learning to predict Ficus stem growth, while Swarnakantha et al. (2022) evaluated comparative studies on crop development. Bhansali et al. (2022) built a recommendation model using N-P-K and rainfall data to diagnose diseases and provide treatment suggestions, while Nancy et al. (2022) developed an image-based plant disease detection system using machine learning and deep learning.

METHODOLOGY Flow of the Proposed System

Dataset The dataset consists of the following parameters: nitrogen (N), phosphorus (P), potassium (K), soil pH, humidity, temperature, and rainfall. The dataset was obtained from the Kaggle website.

Feature        Description
N              Nitrogen content in the soil (kg/ha)
P              Phosphorus content in the soil (kg/ha)
K              Potassium content in the soil (kg/ha)
Temperature    Average temperature (°C)
Humidity       Average humidity (%)
pH             pH level of the soil
Rainfall       Average rainfall (mm)
Label          Categorical variable indicating the recommended crop

Data preprocessing Before building the model, the following preprocessing steps were applied to the dataset: Handling Missing Data The dataset was examined for any missing values using the crop.isnull().sum() function. This check revealed that there were no missing data points, ensuring the dataset was complete. The data types of each feature were inspected using crop.info() to confirm that the numerical and categorical data were appropriately categorized, preventing any issues during model training.
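The two checks described above can be sketched as follows. This is a minimal illustration, not the presenters' notebook: the three rows are made-up values, and the lowercase column names are an assumption based on the feature table rather than the actual Kaggle file.

```python
import pandas as pd

# Made-up sample rows; the real data would be read from the Kaggle CSV.
# Column names are assumed from the feature description.
crop = pd.DataFrame({
    "N": [90, 85, 60],
    "P": [42, 58, 55],
    "K": [43, 41, 44],
    "temperature": [20.9, 21.8, 23.0],
    "humidity": [82.0, 80.3, 82.3],
    "ph": [6.5, 7.0, 7.8],
    "rainfall": [202.9, 226.7, 263.9],
    "label": ["rice", "rice", "rice"],
})

# Count missing values per column, as in crop.isnull().sum().
missing_per_column = crop.isnull().sum()
print(missing_per_column)

# Print dtypes and non-null counts, as in crop.info().
crop.info()

total_missing = int(missing_per_column.sum())
```

A total of zero corresponds to the "no missing data points" result reported above.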

Handling of datasets

Normalization and Standardization The dataset underwent normalization and standardization. First, the features were scaled using MinMaxScaler(), which transformed all feature values into a range between 0 and 1. This step is important because it ensures that features with larger ranges do not dominate the training process. After that, StandardScaler() was applied to further standardize the data. This transformation shifts the data so that it has a mean of 0 and a standard deviation of 1. This standardization helps to ensure uniformity across the dataset, which is particularly beneficial when using algorithms that assume data is normally distributed or when features have varying units or scales. This preprocessing ultimately improves the model's performance and convergence speed.
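The two scaling steps can be sketched with scikit-learn as below. The feature matrix is random placeholder data, not the crop dataset. One property worth knowing: because min-max scaling is a per-feature linear rescaling, standardizing afterwards gives the same values as standardizing the raw data directly, which the last comparison illustrates.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Placeholder features on very different scales (not the real dataset).
rng = np.random.default_rng(0)
X = rng.uniform(low=[0, 3.5, 20], high=[140, 9.5, 300], size=(200, 3))

# Step 1: rescale every feature into the range [0, 1].
X_minmax = MinMaxScaler().fit_transform(X)

# Step 2: shift and scale to mean 0 and standard deviation 1.
X_both = StandardScaler().fit_transform(X_minmax)

# For comparison: standardize the raw features without step 1.
X_std_only = StandardScaler().fit_transform(X)

same_result = np.allclose(X_both, X_std_only)
print(same_result)
```

In a train/test setting the scalers would normally be fitted on the training split only and then applied to the test split.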

Feature Correlation A correlation matrix was generated to understand the relationships between features. The matrix showed that nitrogen (N) and phosphorus (P) had a weak negative correlation (-0.23), while phosphorus (P) and potassium (K) exhibited a strong positive correlation (0.74), indicating that these two elements often vary together in the dataset.
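A correlation matrix of this kind can be computed with pandas. The sketch below uses synthetic N, P, and K columns, with K deliberately constructed to track P, so the numbers it prints are not the values reported above.

```python
import numpy as np
import pandas as pd

# Synthetic nutrient columns; K is built to follow P so the pair
# shows a strong positive correlation.
rng = np.random.default_rng(1)
p = rng.normal(50, 20, size=300)
df = pd.DataFrame({
    "N": rng.normal(50, 30, size=300),
    "P": p,
    "K": 0.8 * p + rng.normal(0, 10, size=300),
})

# DataFrame.corr() computes pairwise Pearson correlation by default.
corr = df.corr()
print(corr.round(2))
```

Each entry lies between -1 and 1, and the diagonal is 1 because every feature correlates perfectly with itself.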

Feature Selection This step focuses on identifying and using the most relevant attributes from the dataset. Through this process, irrelevant and redundant information is removed before the classifiers are applied. The proposed system applied different machine learning algorithms: Decision Tree, Naïve Bayes (NB), Support Vector Machine (SVM), Logistic Regression (LR), Random Forest (RF), and XGBoost.
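A comparison across several of the listed classifiers might look like the sketch below. It uses a synthetic seven-feature dataset from scikit-learn rather than the crop data, default hyperparameters that are assumptions, and it leaves out XGBoost because that lives in a separate package.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the seven soil and weather features.
X, y = make_classification(
    n_samples=600, n_features=7, n_informative=5, n_redundant=0,
    n_classes=4, n_clusters_per_class=1, class_sep=2.0, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "SVM": SVC(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

for name, acc in scores.items():
    print(f"{name}: {acc:.3f}")
```

Holding out a test split keeps the comparison honest, since every model is scored on rows it never saw during training.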

Random Forest Random forest first builds new datasets from the original data by randomly selecting rows from it (bootstrapping). A decision tree is then trained on each of the bootstrapped datasets independently. The model randomly selects a subset of features for each tree and uses only them for training. Since this is a classification problem, the prediction is made by taking the majority vote of all the decision trees. This classification can also be expressed mathematically as shown:

$$H(x) = \arg\max_{y \in Y} \sum_{i=1}^{N} \mathbb{I}\big(h_i(x) = y\big)$$

Where: $H(x)$ is the final predicted class for input $x$. $h_i(x)$ is the prediction of the $i$-th decision tree. $Y$ represents the possible classes. $\mathbb{I}(\cdot)$ is an indicator function that equals 1 if the condition is true and 0 otherwise. $N$ is the total number of decision trees in the forest.
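The bootstrapping, per-tree feature subset, and majority vote described above can be sketched by hand as follows. The helper names `fit_forest` and `predict_majority`, the tree count, and the synthetic data are all assumptions for illustration. Library implementations differ in detail: scikit-learn's `RandomForestClassifier`, for example, samples candidate features at each split and averages the trees' class probabilities rather than counting hard votes.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

def fit_forest(X, y, n_trees=25, n_sub_features=3, seed=0):
    """Train n_trees trees, each on a bootstrap sample and a feature subset."""
    rng = np.random.default_rng(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap: draw row indices with replacement.
        rows = rng.integers(0, len(X), size=len(X))
        # Random feature subset used by this tree only.
        cols = rng.choice(X.shape[1], size=n_sub_features, replace=False)
        tree = DecisionTreeClassifier(random_state=0)
        tree.fit(X[rows][:, cols], y[rows])
        forest.append((tree, cols))
    return forest

def predict_majority(forest, X, classes):
    """Return, for each sample, the class with the most tree votes."""
    # votes has shape (n_trees, n_samples).
    votes = np.stack([tree.predict(X[:, cols]) for tree, cols in forest])
    # counts[c, j] = number of trees voting for classes[c] on sample j.
    counts = np.stack([(votes == c).sum(axis=0) for c in classes])
    return classes[counts.argmax(axis=0)]

# Synthetic three-class data standing in for the crop dataset.
X, y = make_classification(
    n_samples=400, n_features=7, n_informative=5, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, class_sep=2.0, random_state=0,
)
classes = np.unique(y)
forest = fit_forest(X[:300], y[:300])
y_pred = predict_majority(forest, X[300:], classes)
accuracy = float((y_pred == y[300:]).mean())
print(f"held-out accuracy: {accuracy:.3f}")
```

The `counts.argmax(axis=0)` step is the arg max over classes in the voting formula; ties go to the class that appears first in `classes`.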

Evaluation metrics The coefficient of determination (R²) is used as the evaluation metric for measuring the accuracy of all the models. R² is a statistical measure of how much of the variation in one variable can be explained by variation in another when predicting an outcome. The formula for R² can be expressed as:

$$R^2 = 1 - \frac{RSS}{TSS}$$

Where: $R^2$ is the coefficient of determination. RSS is the Residual Sum of Squares, which represents the total squared difference between the actual and predicted values. TSS is the Total Sum of Squares, which is the total squared difference between the actual values and their mean.
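The computation from RSS and TSS can be worked through directly. The function name `r_squared` and the four sample values are made up for illustration.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination computed as 1 - RSS / TSS."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rss = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    tss = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - rss / tss

# Here RSS = 0.10 and TSS = 5.0, so R^2 = 1 - 0.10 / 5.0 = 0.98.
score = r_squared([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8])
print(round(score, 2))  # 0.98
```

A perfect prediction gives RSS = 0 and therefore R² = 1, while predicting the mean for every sample gives R² = 0.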

Conclusion Using datasets, machine learning models can predict with reasonable accuracy whether a crop will be profitable or not. This study used four different machine learning algorithms to recommend crops according to weather conditions and soil nutrients. Random forest outperformed the rest of the algorithms in this study, with a testing R² of about 99%. Through this work, farmers can increase the productivity of their agriculture and prevent soil degradation on cultivated land. They can also reduce the use of chemicals in crop production and make better use of water resources. Further research can be conducted by considering more varieties of crops. The current research focuses on twenty-two crops due to the limited availability of data. In future studies, soil fertility could be assessed under more granular geographical conditions, based on micronutrient data such as sulfur, zinc, iron, and manganese. Also, a machine learning framework could be built to recommend optimum amounts of pesticides and fertilizers for a particular crop. By doing so, the production of quality crops and the profits of farmers can be increased.

References
Stekhoven, D. J., and Buhlmann, P. (2012). MissForest—nonparametric missing value imputation for mixed-type data.
Sujjaviriyasup, T., and Pitiruek, K. (2013). Agricultural product forecasting using a machine learning approach.
Tavares, O. C. H., Santos, L. A., Filho, D. F., Ferreira, L. M., Garcia, A. C., Castro, T. A. V. T., et al. (2021). Response surface modeling of humic acid stimulation of the rice (Oryza sativa L.) root system. Arch. Agron.
Uddin, K., Matin, M. A., and Meyer, F. J. (2019). Operational flood mapping using multitemporal Sentinel-1 SAR images: A case study from Bangladesh. Remote Sensing.
Van Ittersum, M. K., Cassman, K. G., Grassini, P., Wolf, J., Tittonell, P., and Hochman, Z. (2013). Yield gap analysis with local to global relevance—a review.
Van Klompenburg, T., Kassahun, A., and Catal, C. (2020). Crop yield prediction using machine learning.
Sharma, R., Kamble, S. S., Gunasekaran, A., Kumar, V., and Kumar, A. (2020). A systematic literature review on machine learning applications for sustainable agriculture supply chain performance.
Shehadeh, A., Alshboul, O., Al Mamlook, R. E., and Hamedat, O. (2021). Machine learning models for predicting the residual value of heavy construction equipment: An evaluation of modified decision tree, LightGBM, and XGBoost regression. Automation Construction.
Siddique, M. N. E. A., de Bruyn, L. A. L., Osanai, Y., and Guppy, C. N. (2022). Typology of rice-based cropping systems for improved soil carbon management.
P., Altman, D. G., and Sauerbrei, W. (2016). Dichotomizing continuous predictors in multiple regression: a bad idea.
Sagi, O., and Rokach, L. (2018). Ensemble learning: A survey. Wiley Interdiscip. Reviews: Data Min. Knowledge Discovery.
Sarker, M. A. R., Alam, K., and Gow, J. (2019). Performance of rain-fed Aman rice yield in Bangladesh in the presence of climate change. Renewable Agric. Food systems.