AI and data engineering are poised to revolutionize how we address global challenges. This talk explores the synergistic relationship between these disciplines in optimizing energy usage and driving sustainability. By leveraging advanced data engineering techniques, we can create AI models capable of predicting energy consumption patterns, identifying inefficiencies, and recommending actionable solutions. From optimizing building energy management systems to improving renewable energy integration, we will showcase real-world applications where AI-driven data engineering is making a significant impact. This presentation will delve into the technical aspects of data acquisition, preprocessing, feature engineering, and model development while emphasizing the broader societal benefits of sustainable AI.
Size: 2.14 MB
Language: en
Added: Sep 16, 2024
Slides: 23 pages
Slide Content
Data Engineering for
Sustainable AI
Optimizing Energy Usage and Real-World Impact
How can we make
AI more efficient
and sustainable?
We’re going to focus on data engineering
There are different ways
●Optimizing training
●Optimizing scoring
●Improving architectures
I’ve dealt with or
already have data
engineers and it
didn’t…
Data Engineer: A
software engineer
who has specialized in
creating data systems
How good is a data engineer at
a data science task?
How good is a data scientist at
a data engineering task?
A data engineer is not the
data scientist with the best
programming skill or the data
warehouse engineer who
learned a little programming
Engineers and
scientists think
about problems
differently
Examples:
●Efficient data retrieval (databases)
●Efficient data movement (pub/subs)
●Right tool for the job
●Understanding nuances and tradeoffs
Optimizing Systems
Engineers excel at optimizing and architecting systems that are more efficient
Examples:
●Are the right systems being used?
●Is the code efficient?
●Is more time being spent in an unexpected
code path?
Is training using the right systems?
Optimizing Training
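One way to answer "is more time being spent in an unexpected code path?" is to profile the job. The sketch below uses Python's standard cProfile and pstats modules. The functions parse_record, slow_lookup and prepare_batch are hypothetical stand-ins for a training data preparation step, not code from the talk.

```python
import cProfile
import io
import pstats


def parse_record(line: str) -> list[float]:
    # Hypothetical preprocessing step: split a CSV line into floats.
    return [float(x) for x in line.split(",")]


def slow_lookup(value: float, table: list[float]) -> bool:
    # Deliberately inefficient: a linear scan of a list instead of a set lookup.
    return value in table


def prepare_batch(lines: list[str], table: list[float]) -> int:
    # Count how many parsed values appear in the lookup table.
    hits = 0
    for line in lines:
        for value in parse_record(line):
            if slow_lookup(value, table):
                hits += 1
    return hits


def profile_report(top: int = 5) -> str:
    # Run the batch under the profiler and return the top entries
    # sorted by cumulative time.
    lines = ["1.0,2.0,3.0"] * 500
    table = [float(i) for i in range(5000, 0, -1)]
    profiler = cProfile.Profile()
    profiler.enable()
    prepare_batch(lines, table)
    profiler.disable()
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(top)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_report())
```

In this made-up workload the report should show most of the time under slow_lookup, which is the kind of surprise the slide is asking about.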
Examples:
●Is the scoring system optimized for the use
case?
●Is external data accessed or cached efficiently?
●Can the process be sped up with concurrency?
Optimizing Scoring
Is the scoring system optimized?
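The caching and concurrency questions above can be sketched with the standard library alone. This is a minimal illustration, assuming a hypothetical external lookup called fetch_road_speed (simulated here with a short sleep) and one-kilometre road segments. None of these names or numbers come from the talk.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache


@lru_cache(maxsize=1024)
def fetch_road_speed(segment_id: int) -> float:
    # Hypothetical stand-in for a slow external call (for example a traffic API).
    # The sleep simulates network latency; the speed in km/h is made up.
    time.sleep(0.01)
    return 30.0 + (segment_id % 5) * 10.0


def score_route(segments: tuple[int, ...]) -> float:
    # Estimated travel time in hours, assuming every segment is 1 km long.
    return sum(1.0 / fetch_road_speed(s) for s in segments)


def score_routes(routes: list[tuple[int, ...]]) -> list[float]:
    # Score routes concurrently; threads suit this because the work is I/O bound.
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(score_route, routes))


if __name__ == "__main__":
    routes = [(1, 2, 3), (2, 3, 4), (1, 4, 5)]
    score_routes(routes)           # first pass fills the cache
    score_routes(routes)           # second pass is served from the cache
    print(fetch_road_speed.cache_info())
```

Whether threads, processes, or an async client fit best depends on the real scoring system. The point is only that repeated external lookups and serial waits are both worth checking.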
A lot of a little or a little of
a lot. A 0.1% improvement
on a trillion or 2 hours of
an eight hour job.
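The two cases can be put side by side with a little arithmetic. The slide does not say what the trillion counts, so the sketch below simply treats it as a trillion units of work. The helper name saved is an assumption for illustration.

```python
from fractions import Fraction


def saved(total: int, fraction_saved: Fraction) -> Fraction:
    # Amount of work avoided when a fraction of the total is saved.
    return Fraction(total) * fraction_saved


# A little of a lot: 0.1% of a trillion units of work.
a_little_of_a_lot = saved(10**12, Fraction(1, 1000))

# A lot of a little: 2 hours saved on one 8-hour job, as a share of that job.
a_lot_of_a_little = Fraction(2, 8)

if __name__ == "__main__":
    print(a_little_of_a_lot)         # 1000000000 units saved
    print(float(a_lot_of_a_little))  # 0.25, i.e. a quarter of the job
```

A tiny percentage at very large scale and a large percentage of one job can both be worth chasing.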
Let’s make a concrete example
Energy usage will be optimized:
●By the car driving less
●By using less power at the data
center for training and scoring
Optimizing Car
Directions Using AI
1. Identify the inefficient parts of the system
2. Estimate the difficulty and efficiency gains of optimizing the parts
3. Gradually improve the specific parts of the system
4. Repeat
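Steps 1 and 2 can be sketched as a simple ranking of candidate optimizations by estimated saving per day of effort. Every name and number below is hypothetical and only shows the shape of the estimate. Real figures would come from measuring the system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    share_of_energy: float     # fraction of total energy this part uses
    expected_reduction: float  # fraction of that share we expect to save
    effort_days: float         # rough estimate of the work involved

    @property
    def saving(self) -> float:
        # Saving as a fraction of the whole system's energy use.
        return self.share_of_energy * self.expected_reduction

    @property
    def saving_per_day(self) -> float:
        return self.saving / self.effort_days


def rank(candidates: list[Candidate]) -> list[Candidate]:
    # Best return on effort first.
    return sorted(candidates, key=lambda c: c.saving_per_day, reverse=True)


# Hypothetical candidates for the car directions example.
CANDIDATES = [
    Candidate("training data load", 0.40, 0.25, 10),
    Candidate("scoring cache", 0.30, 0.10, 2),
    Candidate("ingestion format", 0.10, 0.50, 4),
]

if __name__ == "__main__":
    for c in rank(CANDIDATES):
        print(f"{c.name}: saves {c.saving:.1%} of total, {c.effort_days} days")
```

After the top item is improved, the shares are measured again and the ranking is rebuilt, which is the "repeat" step.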
Three-Pronged Approach
Looking for efficiency through the eyes of a data engineer
Improving how we acquire data
Data Acquisition
●Sustainability systems often have
real-time ingestion requirements
●Real-time systems have different levels
of engineering rigor
●Dissemination of data is difficult
Getting the data ready
Preprocessing
●Use the right distributed system
for compute and storage
●Use binary formats
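To see why binary formats matter, the sketch below encodes the same readings as JSON text and as fixed-width binary records with the standard struct module. The record layout (timestamp, sensor id, kilowatts) is an assumption made for illustration. A production pipeline would more likely use an established format such as Avro, Parquet, or Protocol Buffers.

```python
import json
import struct

# Hypothetical reading: (unix timestamp, sensor id, kilowatts).
# "<IHf" = little-endian uint32 + uint16 + float32, 10 bytes per record.
RECORD = struct.Struct("<IHf")

Reading = tuple[int, int, float]


def encode_text(readings: list[Reading]) -> bytes:
    # Text encoding: field names are repeated in every record.
    rows = [{"ts": ts, "sensor": sid, "kw": kw} for ts, sid, kw in readings]
    return json.dumps(rows).encode("utf-8")


def encode_binary(readings: list[Reading]) -> bytes:
    # Binary encoding: fixed-width records with no field names.
    return b"".join(RECORD.pack(ts, sid, kw) for ts, sid, kw in readings)


def decode_binary(payload: bytes) -> list[Reading]:
    return list(RECORD.iter_unpack(payload))


if __name__ == "__main__":
    readings = [(1_700_000_000 + i, i % 100, 1.5) for i in range(1000)]
    print(len(encode_text(readings)), "bytes as JSON")
    print(len(encode_binary(readings)), "bytes as binary")
```

Less data to store and move means less I/O and less parsing work in each preprocessing run.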
Improving data preparation
Feature engineering
●Make sure queries and data preparation
are efficient
●Ensure queries and data paths are
understood
●Use best practices for running queries
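One way to make sure a query is understood is to ask the database how it plans to run it. The sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module. The trips table and its columns are hypothetical, loosely following the car directions example.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> str:
    # Each plan row is (id, parent, notused, detail); keep the detail text.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


def demo() -> tuple[str, str]:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE trips (id INTEGER PRIMARY KEY, city TEXT, fuel_l REAL)"
    )
    conn.executemany(
        "INSERT INTO trips (city, fuel_l) VALUES (?, ?)",
        [("Reno", 4.2), ("Austin", 3.9), ("Reno", 5.1)],
    )
    sql = "SELECT AVG(fuel_l) FROM trips WHERE city = ?"
    before = query_plan(conn, sql, ("Reno",))   # full table scan
    conn.execute("CREATE INDEX idx_trips_city ON trips (city)")
    after = query_plan(conn, sql, ("Reno",))    # index search
    conn.close()
    return before, after


if __name__ == "__main__":
    before, after = demo()
    print("before index:", before)
    print("after index: ", after)
```

The exact wording of the plan varies between SQLite versions, but the change from a scan to an index search is what to look for. Distributed query engines offer their own equivalents of this check.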
Distribute and optimize
Model Development
●Ensure you know about the latest
technologies in distributed systems
●Check for performance and memory
loading issues
●A little of a lot or a lot of a little
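A memory loading issue can be checked with the standard tracemalloc module. This minimal sketch compares reading a whole file into memory against streaming it line by line. The file of readings is generated for the example and the function names are assumptions.

```python
import os
import tempfile
import tracemalloc
from typing import Callable


def write_sample(path: str, n: int) -> None:
    # Write n identical hypothetical readings, one per line.
    with open(path, "w") as f:
        for _ in range(n):
            f.write("1.5\n")


def total_loaded(path: str) -> float:
    # Loads every line and every parsed value into memory before summing.
    with open(path) as f:
        values = [float(line) for line in f.readlines()]
    return sum(values)


def total_streamed(path: str) -> float:
    # Reads one line at a time; nothing is kept after it is added to the sum.
    with open(path) as f:
        return sum(float(line) for line in f)


def peak_bytes(func: Callable[[str], float], path: str) -> tuple[float, int]:
    # Return the function's result and the peak traced memory in bytes.
    tracemalloc.start()
    try:
        result = func(path)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "readings.txt")
        write_sample(path, 50_000)
        print("loaded:  ", peak_bytes(total_loaded, path))
        print("streamed:", peak_bytes(total_streamed, path))
```

Both versions return the same total, but the streamed one keeps its peak memory roughly constant as the file grows.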
The Right People
at the Right Ratios
We need data scientists, data
engineers, and operations engineers.
Each person is important with the
right ratio.
Who
Data Teams
All Three Teams Are Required For Success