review main GURU SAI5446531251616502351645

gurusai9963 9 views 21 slides Jun 12, 2024

Slide Content

PREDICTING URBAN WATER QUALITY WITH UBIQUITOUS DATA - A DATA-DRIVEN APPROACH
Presented by: P. GURU SAI (MCA II Year), Reg. No: 22091F0014
Under the esteemed guidance of: MR. V. RAJA SEKHAR, MCA, M.TECH, Assistant Professor, Dept. of CSE
DEPARTMENT OF MASTER OF COMPUTER APPLICATIONS
RAJEEV GANDHI MEMORIAL COLLEGE OF ENGINEERING & TECHNOLOGY (AUTONOMOUS), NANDYAL-518501 (Estd-1995)

INDEX
Abstract
Introduction
Existing System
Proposed System
Software Design
Modules
Proposed Algorithms
SDLC
UML Diagrams
Software and Hardware Requirements

ABSTRACT Urban water quality is of great importance to our daily lives. Predicting urban water quality helps control water pollution and protect human health. It is a challenging task, since water quality varies non-linearly across urban spaces and depends on multiple factors, such as meteorology, water usage patterns, and land use. In this work, we forecast the water quality of a station over the next few hours from a data-driven perspective, using the water quality data and water hydraulic data reported by existing monitoring stations together with a variety of data sources observed in the city, such as meteorology, pipe networks, the structure of road networks, and points of interest (POIs). First, we identify the influential factors that affect urban water quality via extensive experiments. Second, we present a multi-task multi-view learning method to fuse these datasets from different domains into a unified learning model. We evaluate our method with real-world datasets; the experiments verify its advantages over other baselines and demonstrate the effectiveness of our approach.

INTRODUCTION Urban water is a vital resource that affects various aspects of human health and urban life. People living in major cities are increasingly concerned about urban water quality, calling for technology that can monitor and predict water quality in real time throughout the city. Urban water quality, which serves as “a powerful environmental determinant” and “a foundation for the prevention and control of waterborne diseases”, refers to the physical, chemical, and biological characteristics of a water body. Several chemical indexes (such as residual chlorine, turbidity, and pH) can be used as effective measurements of water quality in current urban water distribution systems.

EXISTING SYSTEM Several studies in environmental science have tried to analyze water quality problems via data-driven approaches. These studies cover a range of topics, from physical process analysis in river basins to the analysis of concurrent input and output time series. The approaches adopted in these studies include instance-based learning models (e.g., KNN) as well as neural network models (e.g., ANN). In general, data-driven approaches in environmental science fall into three major categories: Instance-based Learning models (IBL), Artificial Neural Network models (ANN), and Support Vector Machine models (SVM).

DISADVANTAGES The system is implemented only with Multi-task Multi-view Learning approaches. Instance-based learning models (IBL) are a family of learning algorithms that model a decision problem with instances or examples of training data that are deemed important to the model.
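As an illustration of what "modelling a decision problem with instances" means, here is a minimal k-nearest-neighbours sketch. It is not taken from the presentation: the function name, the two features (normalised residual chlorine and turbidity), the labels, and all data values are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Label a query point by majority vote among its k nearest training instances."""
    # Pair every stored instance's distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical instances: (residual chlorine, turbidity), both scaled to [0, 1].
train_points = [(0.8, 0.2), (0.7, 0.3), (0.9, 0.1),
                (0.1, 0.9), (0.2, 0.8), (0.15, 0.7)]
train_labels = ["good", "good", "good", "poor", "poor", "poor"]

print(knn_predict(train_points, train_labels, (0.75, 0.25)))  # good
```

No model is fitted here: the stored training instances are the model, which is the defining property of instance-based learning.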

PROPOSED SYSTEM
Data-driven Perspective: We present a novel data-driven approach to co-predict the future water quality among different stations with data from multiple domains. The approach is not restricted to urban water quality prediction; it can also be applied to other multi-location co-prediction problems in many other urban applications.
Influential Factor Identification: We identify spatially-related features (such as POIs, pipe networks, and road networks) and temporally-related features (e.g., time of day, meteorology, and water hydraulics), contributing not only to our application but also to the general problem of water quality prediction.
Unified Learning Model: We present a novel spatio-temporal multi-view multi-task learning framework (stMTMV) to integrate multiple sources of spatio-temporal urban data. It provides a general framework for combining heterogeneous spatio-temporal properties for prediction and can also be applied to other spatio-temporal applications.
Real evaluation: We evaluate our method by extensive experiments that use real-world datasets from Shenzhen, China. The results demonstrate the advantages of our method over other baselines, such as ARMA, Kalman filter, and ANN, and reveal interesting discoveries that can bring social good to urban life.

ADVANTAGES Improves the accuracy.
Water quality data: We collect water quality data every five minutes from 15 water quality monitoring stations in Shenzhen City. It comprises residual chlorine (RC), turbidity (TU), and pH. In this paper, we use only RC as the index for water quality, since RC is the most important and effective measurement of water quality in current urban water distribution systems.
Hydraulic data: Hydraulic data consists of flow and pressure, which are collected every five minutes from 13 flow sites and 14 pressure sites, respectively.

System Design Actual Problem A water quality dataset is normally incomplete and noisy; as a result, reading data through dataset linkage traditionally fails within the discipline of software engineering. First, the issue of data quality and standardization looms large: accuracy cannot be predicted reliably from noisy or incomplete data. The main problem is that existing work does not use a suitable framework or software engineering approach to predict water quality accurately.

Solution The data-driven approach uses five algorithms: Naïve Bayes, K-Nearest Neighbours (KNN), Random Forest, Logistic Regression classifiers, and Linear SVM. Of these, we mainly use three: Naïve Bayes, SVM, and Logistic Regression classifiers.

Modules
Service Provider: In this module, the Service Provider logs in with a valid user name and password. After a successful login, the Service Provider can perform operations such as Login, Train and Test Data Sets, View Trained and Tested Accuracy in Bar Chart, View Trained and Tested Accuracy Results, View Predicted Water Quality Type, Find Water Quality Prediction Ratio, Download Trained Data Sets, View Water Quality Prediction Ratio Results, and View All Remote Users.
View and Authorize Users: In this module, the admin can view the list of all registered users, including details such as user name, email, and address, and can authorize users.

Remote User: In this module, there are n users. A user must register before performing any operations. Once a user registers, their details are stored in the database. After successful registration, the user logs in with the authorized user name and password. Once login is successful, the user can perform operations such as REGISTER AND LOGIN, PREDICT WATER QUALITY TYPE, and VIEW YOUR PROFILE.

Proposed Algorithms
Naive Bayes: Naive Bayes is a simple yet powerful algorithm used in machine learning for classification tasks. At its core, Naive Bayes relies on Bayes' theorem, which calculates the probability of a hypothesis given the evidence.
Understanding Bayes' Theorem: Bayes' theorem is a fundamental concept in probability theory that calculates the probability of a hypothesis (H) given some evidence (E). It is represented as: P(H|E) = (P(E|H) * P(H)) / P(E), where:
P(H|E) is the probability of the hypothesis given the evidence.
P(E|H) is the probability of the evidence given the hypothesis.
P(H) is the prior probability of the hypothesis.
P(E) is the prior probability of the evidence.
Applying Naive Bayes for Classification: In the context of classification tasks, Naive Bayes makes the "naive" assumption that the features (or variables) are independent of each other given the class label. Despite this simplification, Naive Bayes often performs well in practice and is computationally efficient.
Making Predictions: Once the model is trained, it can be used to make predictions on new, unseen data. Naive Bayes calculates the probability of each class given the features of the input data using Bayes' theorem. The class with the highest probability is then assigned as the predicted class label for the input data.
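The prediction rule described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the project's code: the categorical features (chlorine level, turbidity level), the labels, and the data are hypothetical, and Laplace (add-one) smoothing is an added assumption so that unseen feature values do not produce a zero probability. Because P(E) is the same for every class, it is dropped when comparing classes.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    vocab = defaultdict(set)             # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            vocab[i].add(value)
    return class_counts, value_counts, vocab

def predict(model, row):
    """Return the class H that maximises P(H) * product of P(E_i | H)."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        log_score = math.log(count / total)  # log P(H)
        for i, value in enumerate(row):
            # Laplace-smoothed estimate of P(E_i | H) (assumption, see lead-in).
            numerator = value_counts[(i, label)][value] + 1
            denominator = count + len(vocab[i])
            log_score += math.log(numerator / denominator)
        scores[label] = log_score
    return max(scores, key=scores.get)

# Hypothetical training data: (chlorine level, turbidity level) -> quality label.
rows = [("high", "low"), ("high", "low"), ("high", "high"),
        ("low", "high"), ("low", "high"), ("low", "low")]
labels = ["safe", "safe", "safe", "unsafe", "unsafe", "unsafe"]

model = train_naive_bayes(rows, labels)
print(predict(model, ("high", "low")))  # safe
```

Working in log space avoids numerical underflow when many per-feature probabilities are multiplied together.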

SVM In classification tasks, a discriminant machine learning technique aims at finding, based on an independent and identically distributed (i.i.d.) training dataset, a discriminant function that can correctly predict labels for newly acquired instances. Unlike generative machine learning approaches, which require computations of conditional probability distributions, a discriminant classification function takes a data point x and assigns it to one of the different classes that are part of the classification task. Less powerful than generative approaches, which are mostly used when prediction involves outlier detection, discriminant approaches require fewer computational resources and less training data, especially for a multidimensional feature space and when only posterior probabilities are needed. From a geometric perspective, learning a classifier is equivalent to finding the equation for a multidimensional surface that best separates the different classes in the feature space. SVM is a discriminant technique, and, because it solves the convex optimization problem analytically, it always returns the same optimal hyperplane parameters, in contrast to genetic algorithms (GAs) or perceptrons, both of which are widely used for classification in machine learning. For perceptrons, solutions are highly dependent on the initialization and termination criteria. For a specific kernel that transforms the data from the input space to the feature space, training returns uniquely defined SVM model parameters for a given training set, whereas the perceptron and GA classifier models are different each time training is initialized. The aim of GAs and perceptrons is only to minimize error during training, which means that several hyperplanes can meet this requirement.
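The claim that perceptron solutions depend on initialization can be seen in a small experiment. The sketch below is not an SVM implementation and is not from the presentation. It trains a plain perceptron twice on the same hypothetical, linearly separable data from two random starting points: both runs separate the training set, but they end at different hyperplanes, whereas a maximum-margin SVM has a single optimum for a given training set and kernel.

```python
import random

def train_perceptron(points, labels, seed, epochs=1000):
    """Classic perceptron updates starting from seed-dependent random weights."""
    rng = random.Random(seed)
    w = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    b = rng.uniform(-1, 1)
    for _ in range(epochs):
        mistakes = 0
        for (x1, x2), y in zip(points, labels):
            # Update only when the point is on the wrong side (or on the boundary).
            if y * (w[0] * x1 + w[1] * x2 + b) <= 0:
                w[0] += y * x1
                w[1] += y * x2
                b += y
                mistakes += 1
        if mistakes == 0:  # termination criterion: no training errors left
            break
    return w, b

def classify(w, b, point):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if w[0] * point[0] + w[1] * point[1] + b > 0 else -1

# Hypothetical, linearly separable two-class data.
points = [(1, 1), (2, 1), (1, 2), (4, 4), (5, 4), (4, 5)]
labels = [-1, -1, -1, 1, 1, 1]

w_a, b_a = train_perceptron(points, labels, seed=0)
w_b, b_b = train_perceptron(points, labels, seed=1)
print("run A:", w_a, b_a)
print("run B:", w_b, b_b)
```

Both hyperplanes reach zero training error, which is all the perceptron criterion asks for; nothing in the procedure prefers one of them over the other.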

Logistic Regression classifiers Logistic Regression is a popular machine learning algorithm used primarily for binary classification tasks, where the goal is to predict the probability of an input belonging to one of two classes. Despite its name, logistic regression is used for classification rather than regression tasks. Understanding Binary Classification: In binary classification, there are two possible outcomes or classes: positive (1) and negative (0). Logistic Regression predicts the probability that an input belongs to the positive class. It models the relationship between the input features and that probability using the sigmoid function, and it learns the model parameters through training so that it can make predictions on new data.
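A minimal sketch of the idea, assuming a single input feature and plain gradient descent on the log loss; the learning rate, the epoch count, the feature (a residual chlorine reading), and the data values are all hypothetical and chosen only for illustration.

```python
import math

def sigmoid(z):
    """Map any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(xs, ys, lr=0.5, epochs=5000):
    """Fit weight w and bias b by gradient descent on the average log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the log loss: mean of (predicted probability - label) * input.
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(sigmoid(w * x + b) - y for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict_proba(w, b, x):
    """Probability that x belongs to the positive class (1)."""
    return sigmoid(w * x + b)

# Hypothetical data: residual chlorine reading -> 1 (acceptable) or 0 (not acceptable).
xs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = train_logistic(xs, ys)
print(predict_proba(w, b, 0.9) > 0.5)  # True
```

A class label is obtained by thresholding the predicted probability, typically at 0.5.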

Software Development Life Cycle

UML Diagrams Use Case Diagram

Class Diagram In software engineering, a class diagram in the Unified Modeling Language (UML) is a type of static structure diagram that describes the structure of a system by showing the system's classes, their attributes, operations (or methods), and the relationships among the classes. It shows which class contains which information.

Activity Diagram

SOFTWARE REQUIREMENTS
Operating System: Windows 10/11
Programming: Python
Front End: Python
HARDWARE REQUIREMENTS
Processor: Intel Core i5
RAM: 4 GB
Hard disk: 512 GB

Thank you