Predicting Power Consumption for a Greener Tomorrow: Machine Learning Project Presentation
jadavvineet73
326 views
18 slides
May 18, 2024
About This Presentation
Explore how our student team leveraged data science to forecast power consumption, empowering smarter energy management and sustainability initiatives. Visit for more: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
Size: 3.71 MB
Language: en
Added: May 18, 2024
Slides: 18 pages
Slide Content
Power Consumption Prediction
Develop a robust tool for optimizing energy usage and efficient production of power
Presented by: Narra Yugendra
Introduction
The power sector is an essential sector that is strongly influenced by technological advancements, changing consumer preferences, and a competitive market. Power consumption, the use of supplied electricity by consumers, poses unique challenges and opportunities. When power consumption changes, it can seriously affect the production and transmission of electricity. Machine learning, with its predictive capabilities, offers a transformative approach to understanding and mitigating the challenges posed by changes in power consumption.
Power Domain And Importance
The power sector is growing rapidly and presents its own set of distinct challenges and opportunities. I chose the power domain for my Capstone Project because:
Efficiency of supply: An efficient supply of power to consumers creates a stable system of operations, which enables the system to produce power based on consumer demand.
Cost matters: The cost of supplying, transmitting, and distributing power to consumers should be optimized so that providers don't incur losses.
Chain of operation: Different production units (hydro, thermal, wind) are used, and these units should be brought into operation according to demand.
Tech is always changing: New technology keeps emerging, especially in the power industry. Figuring out how to use these innovations to help consumers is part of the adventure.
Problem Statement & Explanation
Electricity Production from Various Units: Sub Station, Zone_1, Zone_2, Zone_3
Weather conditions and potential emission diffusion metrics: Temperature, Humidity, Wind Speed, Diffuse Flows, General Diffuse Flows
Dataset Information
Here are the key details about the dataset used in this project:
Number of records: The dataset consists of 52,416 records. Each record represents a unique entry.
Features/Columns: The dataset has 9 features/columns that form the basis of our predictive modeling. They describe climatic conditions, flows of water, and power consumption in various zones.
Source of the data: We have partnered with a leading Moroccan renewable energy company committed to providing efficient and sustainable energy solutions. They want to develop a robust tool for optimizing energy usage in Agadir, a critical region for their operations.
Columns/Features: Datetime, Temperature, Humidity, Wind Speed, General Diffuse Flows, Diffuse Flows, PowerConsumption_Zone1, PowerConsumption_Zone2, PowerConsumption_Zone3
Exploratory Data Analysis (EDA)
Exploring the data gave us a comprehensive overview of its structure, uncovered potential patterns, and helped us identify key trends and essential insights. Throughout the EDA process, we analyzed the distribution of individual features, investigated correlations, and explored relationships between variables. Visualizations also played a crucial role in providing a clear representation of the data, offering insights into consumption behavior and the factors that may influence power consumption.
Exploratory Data Analysis (EDA)
First, we checked the dataset for null values and duplicates. There weren't any, so the dataset was clean to begin with. Some columns don't contribute useful information to the predictions, so we drop the following columns during preprocessing: Datetime, PowerConsumption_Zone1, PowerConsumption_Zone2. The target variable, PowerConsumption_Zone3, is continuous. The independent variables Temperature, Humidity, Wind Speed, General Diffuse Flows, and Diffuse Flows are also continuous.
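The checks described on this slide can be sketched with pandas as follows. This is a minimal illustration, not the project's actual code: the column names are copied from the dataset slide, but the three rows of values are made up for the example, and the real file may spell its column headers differently.

```python
import pandas as pd

# Tiny made-up frame using the column names from the dataset slide.
# The values are illustrative only and are NOT taken from the real data.
df = pd.DataFrame({
    "Datetime": ["1/1/2017 0:00", "1/1/2017 0:10", "1/1/2017 0:20"],
    "Temperature": [6.5, 6.4, 6.3],
    "Humidity": [73.0, 74.0, 75.0],
    "Wind Speed": [0.08, 0.09, 0.10],
    "General Diffuse Flows": [0.05, 0.07, 0.06],
    "Diffuse Flows": [0.12, 0.09, 0.10],
    "PowerConsumption_Zone1": [34000.0, 29800.0, 29100.0],
    "PowerConsumption_Zone2": [16100.0, 19400.0, 19000.0],
    "PowerConsumption_Zone3": [20200.0, 20100.0, 19700.0],
})

# Total number of missing cells and of fully duplicated rows.
null_count = int(df.isnull().sum().sum())
duplicate_count = int(df.duplicated().sum())

# Drop the columns named on the slide; Zone 3 stays as the target.
model_df = df.drop(
    columns=["Datetime", "PowerConsumption_Zone1", "PowerConsumption_Zone2"]
)

print(null_count, duplicate_count)
print(list(model_df.columns))
```

On the real dataset, both counts would be expected to be zero, matching the slide's statement that the data was clean.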
Visualizations
Visualizations
The visualizations above plot the weather conditions and potential emission diffusion metrics against power consumption in Zone 3. They show that these are the independent variables with respect to power consumption.
Visualizations
The heatmap shows a strong positive correlation among the columns PowerConsumption_Zone1, PowerConsumption_Zone2, and PowerConsumption_Zone3. As a result, PowerConsumption_Zone1 and PowerConsumption_Zone2 will be dropped.
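The numbers behind such a heatmap come from a correlation matrix. The sketch below shows one way to compute it and to flag columns that are strongly correlated with the target. The data is synthetic and deliberately constructed so that the three zone columns move together, so it illustrates the mechanics only, not the project's finding. The 0.9 threshold is a hypothetical cut-off chosen for the example; the slides do not state one.

```python
import numpy as np
import pandas as pd

# Synthetic data: the zone columns share a common "base load" by construction.
rng = np.random.default_rng(42)
n = 500
base_load = rng.normal(30000, 5000, n)
df = pd.DataFrame({
    "Temperature": rng.normal(20, 5, n),
    "PowerConsumption_Zone1": base_load + rng.normal(0, 500, n),
    "PowerConsumption_Zone2": 0.7 * base_load + rng.normal(0, 500, n),
    "PowerConsumption_Zone3": 0.6 * base_load + rng.normal(0, 500, n),
})

# Pearson correlation matrix; this is what a heatmap would visualize.
corr = df.corr()

target = "PowerConsumption_Zone3"
threshold = 0.9  # hypothetical cut-off, not taken from the slides

# Columns (other than the target) that correlate strongly with the target.
redundant = [
    col for col in corr.columns
    if col != target and corr.loc[col, target] > threshold
]
print(redundant)
```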
Preprocessing
First, the "Datetime", "PowerConsumption_Zone1" and "PowerConsumption_Zone2" columns were dropped as they didn't provide any useful information for our predictions. We also confirmed that there were no null values or duplicates in the dataset, so it was clean to begin with.
Splitting the data into X and y
We partition the dataset into two components: X and y. The variable X holds all independent variables, the features that contribute to our predictions. The variable y holds the dependent (target) variable, the outcome we aim to predict.
Train-Test Split
We then split the dataset into training data and testing data. We use an 80:20 split, so the test size is set to 0.2. We set the random state to 42, which makes the split reproducible across different runs.
MinMax Scaler
We used MinMax Scaler to normalize the features of the dataset. This keeps the features on a consistent scale. MinMax Scaler scales the data so that it lies in the range [0, 1].
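A minimal scikit-learn sketch of the X/y partition, the 80:20 split, and the MinMax scaling described above. The data is randomly generated for illustration. Fitting the scaler on the training data only and then applying it to the test data is an assumption made here (a common practice); the slides do not say how the scaler was fitted.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Randomly generated stand-in data with the feature names from the slides.
rng = np.random.default_rng(0)
n = 100
feature_names = [
    "Temperature", "Humidity", "Wind Speed",
    "General Diffuse Flows", "Diffuse Flows",
]
df = pd.DataFrame(rng.uniform(0, 100, size=(n, 5)), columns=feature_names)
df["PowerConsumption_Zone3"] = rng.uniform(10000, 40000, size=n)

# X holds the independent variables, y the target.
X = df.drop(columns=["PowerConsumption_Zone3"])
y = df["PowerConsumption_Zone3"]

# 80:20 split with a fixed random state for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Assumption: learn the min/max on the training set, reuse them on the test set.
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.shape, X_test_scaled.shape)
```

With this choice, each training feature spans exactly [0, 1], while test values can fall slightly outside that range if they exceed the training minimum or maximum.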
Applying Machine Learning Algorithms
This power consumption prediction problem is a regression problem with a continuous target.
Models used:
Linear Regression: Linear regression is one of the simplest statistical regression methods used for predictive analysis in machine learning. It models a linear relationship between the independent (predictor) variables and the dependent (output) variable.
Decision Tree Regression: Decision tree regression observes the features of an object and trains a tree-structured model to predict future data as a meaningful continuous output. In the context of power consumption prediction, it learns from the features of the independent variables.
Random Forest Regression: Random forest regression is a supervised learning algorithm and bagging technique that uses an ensemble of decision trees for regression. The trees in a random forest are built in parallel, meaning there is no interaction between the trees while they are being built.
Gradient Boost Regression: Gradient boosting regression is an ensemble method built on decision trees. Each decision tree starts at the root, branches according to conditions, and heads toward the leaves; the leaf that is reached gives the prediction result.
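The four models listed above are all available in scikit-learn. The sketch below fits each of them on synthetic data and collects their test-set predictions. The data, the default hyperparameters, and `random_state=42` for the tree-based models are assumptions for illustration; the slides do not state the settings actually used.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data: 5 features already scaled to [0, 1].
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 5))
y = 20000 + 5000 * X[:, 0] - 3000 * X[:, 1] + rng.normal(0, 200, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Default hyperparameters are an assumption, not the project's settings.
models = {
    "Linear Regression": LinearRegression(),
    "Decision Tree Regression": DecisionTreeRegressor(random_state=42),
    "Random Forest Regression": RandomForestRegressor(random_state=42),
    "Gradient Boost Regression": GradientBoostingRegressor(random_state=42),
}

# Fit every model on the same training data and keep its test predictions.
predictions = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions[name] = model.predict(X_test)

for name, y_pred in predictions.items():
    print(name, y_pred.shape)
```

Keeping the models in a dictionary makes it straightforward to evaluate them all with the same metrics afterwards.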
Model Selection and Considerations
Random Forest Regression outperforms Linear Regression, Decision Tree Regression, and Gradient Boost Regression on all metrics, with a higher R2 score and lower MSE and RMSE. Based on these metrics, Random Forest Regression is the best-performing model overall, so we select it as the final model for power consumption prediction.
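The three metrics named on this slide can be computed as sketched below. The `regression_report` helper is a hypothetical name introduced for this example, and the true and predicted values are made up to keep the arithmetic easy to follow; they are not results from the project.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score


def regression_report(y_true, y_pred):
    """Return R2, MSE, and RMSE for one set of predictions (hypothetical helper)."""
    mse = mean_squared_error(y_true, y_pred)
    return {
        "r2": r2_score(y_true, y_pred),
        "mse": mse,
        "rmse": float(np.sqrt(mse)),  # RMSE is the square root of MSE
    }


# Made-up values, chosen only so the result can be checked by hand.
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.0, 7.5, 9.0])

report = regression_report(y_true, y_pred)
print(report)
```

For these values, the squared errors are 0.25, 0, 0.25, and 0, so the MSE is 0.125 and the R2 score is 1 - 0.5/20 = 0.975. A higher R2 and lower MSE and RMSE indicate a better fit, which is the basis of the comparison on this slide.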
Conclusion
Using several insights, patterns, and trends in our data, we applied machine learning to predict the power consumption in Zone 3. This project offers significant benefits to electricity providers:
By predicting power consumption, electricity providers can take proactive measures to produce power at the required rate. This involves appropriate electricity production, lower transmission losses, and reliable supply to consumers.
By focusing efforts on periods of high consumption, electricity providers can streamline operations, reduce production costs, and improve overall efficiency.
Understanding the factors that influence power consumption enables providers to supply power efficiently to meet individual needs. This level of personalization fosters stronger consumer relationships and makes the supply of electricity more efficient, with fewer losses.