Autonomous Systems for Optimization and Control

ivoandreev · 43 slides · Sep 14, 2024

About This Presentation

Autonomous systems can operate and make decisions driven by goals, without human interaction, and have a myriad of applications in the industrial landscape. They are a highly comprehensive mix of sensor technologies, data analytics, simulations, digital twins and predicti...


Slide Content

14 | SEP | 2024


•Solution Architect @
•Microsoft AI & IoT MVP
•External Expert Eurostars-Eureka, Horizon Europe
•External Expert InnoFund Denmark, RIF Cyprus
•Business Interests
oWeb Development, SOA, Integration
oIoT, Machine Learning
oSecurity & Performance Optimization
•Contact
[email protected]
www.linkedin.com/in/ivelin
www.slideshare.net/ivoandreev
SPEAKER BIO

AUTONOMOUS SYSTEMS
for Optimization and Control

Takeaways
•Azure AI Assistant API - Sample (Multimodal Multi-Agent Framework)
ohttps://github.com/Azure-Samples/azureai-samples/tree/main/scenarios/Assistants/multi-agent
•Azure Assistants - Tutorial
oAPI: https://learn.microsoft.com/azure/ai-services/openai/how-to/assistant
oPlayground: https://learn.microsoft.com/azure/ai-services/openai/assistants-quickstart
•PID Control - A Brief Introduction
ohttps://www.youtube.com/watch?v=UR0hOmjaHp0
•Gymnasium Reinforcement Learning Project
ohttps://gymnasium.farama.org/content/basic_usage/
•Open AI Gym
ohttps://github.com/openai/gym
•Project Bonsai - Sample Playbook (discontinued, but a good read)
ohttps://microsoft.github.io/bonsai-sample-playbook/
You MUST read the Bonsai playbook
and watch the PID intro video

Introduction to Autonomous Control

•Autonomy is moving ML from bits to atoms
•Levels of Autonomy
oL0 - Humans in complete control
oL1 - Assistance with subtasks
oL2 - Occasional autonomy, human specifies intent
oL3 - Limited autonomy, human fallback
oL4 - Full control, human supervises
oL5 - Full autonomy, human is absent
Automation vs Autonomy
•Automated Systems
oExecute a (complex) “script”
oApply long-term process changes (e.g. react to wear)
oApply short-term process changes (clean, diagnose)
oNo handling of uncoded decisions
•Autonomous Systems
oAim at objectives w/o human intervention
oMonitor and sense the environment
oSuits dynamic processes and changing conditions
•Disadvantages
oHuman trust and explainability (Consumer acceptance)
oWhat if the computer breaks (Technology and infrastructure)
oSecurity and Responsibility (Policy and legislation)

Use Case: Control a Robot from A to B
Lvl1: Naïve Open Loop
•Constant speed (x) for time (t), starting at A
Lvl2: Feedback Control
•Handles trajectory changes or faster movement
•Error - deviation from desired position
•Controller: Convert error to command
•Objective: Minimize error
Benefits
•Adaptive, easy to understand, build and test

Proportional-Integral-Derivative (PID) Control
Def: Control algorithm that regulates a process by adjusting the control variables
based on error feedback.
•Proportional: P = Kp × e(t)
oImmediate response; proportional gain Kp applied to the error e(t) at time t
•Integral: I = Ki × ∫₀ᵗ e(τ) dτ
oIntegral gain Ki accumulates past errors
•Derivative: D = Kd × de(t)/dt
oPredicts future error; Kd reduces overcorrection, aids stability
•PID Control Loop (up to ~1000 Hz): u(t) = P + I + D
oKp, Ki, Kd gains directly influence the control system (heat, throttle)
•Quadcopter: Pitch, Roll, Yaw and Altitude use 4 separate PID controllers
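The three terms above can be sketched in a few lines. This is a minimal, illustrative controller driving a toy first-order plant toward a setpoint; the gains, the plant model and the `PID` class itself are assumptions for demonstration, not the deck's implementation.

```python
# Minimal PID sketch: u(t) = Kp*e(t) + Ki*integral(e) + Kd*de/dt
class PID:
    def __init__(self, kp, ki, kd, setpoint, dt):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint
        self.dt = dt                  # loop interval (e.g. 1 ms at 1000 Hz)
        self.integral = 0.0           # accumulated past error (I term)
        self.prev_error = 0.0         # last error, for the derivative (D term)

    def update(self, measured):
        error = self.setpoint - measured
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Toy plant: a value that responds proportionally to the control signal.
pid = PID(kp=1.2, ki=0.5, kd=0.05, setpoint=25.0, dt=0.01)
temperature = 18.0
for _ in range(2000):
    u = pid.update(temperature)
    temperature += u * 0.01           # first-order response to the command
print(round(temperature, 2))          # settles near the 25.0 setpoint
```

Tuning Kp, Ki, Kd trades responsiveness against overshoot, which is why a quadcopter keeps a separate loop per axis.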

Autonomous Systems
Def: Intelligent systems that operate in highly dynamic PHYSICAL environments by sensing, planning and acting based on changing environment variables
Application Requirements
○Millions of simulations
○Assessment in real world required
○Simulation needs to be close to the physical world
Sense-Plan-Act loop:
•Sense - reacting starts by gathering information about the environment
•Plan - sensor information is interpreted and translated into actionable information based on reward
•Act - once the action plan is in progress, evaluate proximity to the objective

Reinforcement Learning (RL)
Def: A branch of AI in which intelligent agents learn to make a sequence of decisions that achieves a long-term objective
•Strategies (exploration-exploitation trade-off)
oGreedy - always choose the best reward
oBayesian - probabilistic function estimates reward
oIntrinsic Reward - stimulates experimentation and novelty
•Strengths
oActs in dynamic environments and unforeseen situations
oLearns and adapts by trial and error
•Weaknesses
oEfficiency – large number of interactions required to learn
oReward Design – experts design rewards to stimulate agent properly
oEthics – reward shall not stimulate unethical actions

Key Concepts in RL
Agent - Decision-making entity that learns from interactions with the environment (or simulator)
Environment - The external context of the agent; provides feedback with rewards or penalties
State - Describes the current environment configuration or situation
Action - The set of choices available to the agent, selected based on the state
Reward - A numerical score given by the environment after an action, guiding the agent to maximize cumulative reward
Policy - The strategy mapping states to actions, determining the agent's behavior
Value Function - Estimates the expected cumulative reward from a state
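All of these concepts fit in a tiny tabular Q-learning example. The chain environment, the action names and the hyperparameters below are invented for illustration; they are not from the deck.

```python
import random

# Toy environment: states 0..4 on a line; reaching state 4 ends the episode
# with reward 1. Actions: 0 = left, 1 = right.
def step(state, action):
    nxt = max(0, state - 1) if action == 0 else min(4, state + 1)
    return nxt, (1.0 if nxt == 4 else 0.0), nxt == 4

random.seed(0)
Q = {(s, a): 0.0 for s in range(5) for a in (0, 1)}   # value of each (state, action)
alpha, gamma, epsilon = 0.5, 0.9, 0.1                 # learning rate, discount, exploration

for _ in range(500):                                  # training episodes
    s, done = 0, False
    while not done:
        # Policy: epsilon-greedy, breaking ties between equal values at random.
        if random.random() < epsilon or Q[(s, 0)] == Q[(s, 1)]:
            a = random.choice((0, 1))
        else:
            a = max((0, 1), key=lambda act: Q[(s, act)])
        nxt, r, done = step(s, a)
        best_next = 0.0 if done else gamma * max(Q[(nxt, 0)], Q[(nxt, 1)])
        Q[(s, a)] += alpha * (r + best_next - Q[(s, a)])   # temporal-difference update
        s = nxt

# The learnt policy maps each state to its best action - here, always "right".
policy = {s: max((0, 1), key=lambda act: Q[(s, act)]) for s in range(4)}
print(policy)
```

The environment returns states and rewards, the Q table is the value function, and epsilon controls the exploration-exploitation trade-off from the previous slide.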

Autonomous Control Frameworks

Project Malmo (2016-2020)
•Research project by Microsoft to bridge simulation and real world
•AI experimentation platform built on top of Minecraft
•Available via an open-source license
ohttps://github.com/microsoft/malmo
•Components
oPlatform - APIs for interaction with AI agents in Minecraft
oMission – set of tasks for AI agents to complete
oInteraction – feedback, observations, state, rewards
•Use Cases
oNavigation – find route in complex environments
oResource Management – learn to plan ahead and resource allocation
oCooperative AI and multi-agent scenarios
•Discontinued (Lack of critical mass, high resource utilization)

Project Bonsai (2018-2023)
•Platform to build autonomous industrial control systems
•Industrial Metaverse (Oct 2022 - Feb 2023)
oDevelop interfaces to control powerplants, transportation networks and robots
•Components
oMachine Teaching Interface
oSimulation APIs – integrate various simulation engines
oTraining Engine – reinforcement learning algorithms
oTools - management, monitoring, deployment
•Use Cases
oIndustrial automation, Energy management, Supply chain optimization
•Challenges
oModel Complexity - accurate models require deep understanding of the systems
oData Availability - effective RL requires accurate simulations and reliable data

Can’t we Use Something Now?
• Gym (2016 - 2021)
o Open-source Python library for testing reinforcement learning (RL) algorithms
o Standard APIs for environment-algorithm communication
o De Facto standard: 43M installations, 54,800 GitHub projects
•Gymnasium (2021 – Now)
oMaintained by Farama Foundation (Non-Profit)
•Key Components
oEnvironment – description of the problem RL is trying to solve
•Types: Classic control; Box 2D Physics; MuJoCo (multi-joint); Atari video games
oSpaces
•Observation – current state information (position, velocity, sensor readings)
•Action – all possible actions an agent could perform (Discrete, MultiDiscrete, Box)
oReward – returned by the environment after an action
Now you
know why ☺

Gymnasium Installation (Windows)
•Install Python (supported v3.8 - 3.11)
oAdd Python to PATH; Windows works but is not officially supported
•Install Gymnasium
pip install gymnasium
•Install MuJoCo (Multi-Joint dynamics with Contact)
oDeveloped by Emo Todorov (University of Washington)
oAcquired by Google DeepMind; freely available since 2021
pip install mujoco
•Install dependencies for environment families
pip install "gymnasium[mujoco]"
•Test Gymnasium
oCreate an environment instance

Simplest Environment Code
1.make - create an environment instance by ID
2.render_mode - how to render the environment
1. human - real-time visualization
2. rgb_array - 2D RGB image in the form of an array
3. depth_array - image with depth information from the camera
3.reset - initializes each training episode
1. observation: starting state of the environment after resetting
2. info (optional): additional information about the environment's state
4.range - number of steps in the episode loop
5.Episode ends when
1. the environment is invalid/truncated
2. the objective is done
3. the step budget is used up

Reinforcement Learning with Gymnasium
DEMO

The Training Environment of the Autonomous Control “Brain”
Simulation is the Key

Why Simulation?
•From Bits not Atoms
oValid digital representation of a system
oDynamic environment for analysis of computer models
oSimulators can scale easily for ML training
•Key Benefits
o Safe, risk-free environment for what-if analysis
o Conduct experiments that would be impractical in real life (cost, time)
o Insights in hidden relations
o Visualization to build trust and help understanding
•Simulation Model vs Math Model
o Mathematical/Static = set of equations to represent system
o Dynamic = set of algorithms to mimic behaviour over time
•Easier said than done ☺

Professional Simulation Tools
•MATLAB + Simulink Add-on - €2,200 + €3,200 per user/year
oModeling and simulation of mechanical systems (multi-body dynamics, robotics, mechatronics)
oPID controllers, state-space controllers - to achieve stability
oSteep learning curve
•AnyLogic - €8,900 (Professional), €2,250/year (Cloud)
oFree for educational purposes
oTypical scenarios: transportation, logistics, manufacturing
oEasier to start with
•Gazebo Sim + ROS (Open Source)
oGazebo: precise physics, sensor and rendering models. Tutorials & Video here
•Custom (Python + Gym)
oSupports discrete events, system dynamics, real-world systems Sample code:
https://github.com/microsoft/microsoft-bonsai-api/tree/main/Python/samples/gym-highway

Let’s Start Simple: Simulate Linear Relations
Concept: Predictive simulator using regression model, trained on telemetry
•1 Dependent Variable
•N Independent Variables
Application: Predict how changes in input variables will affect environment
•Type: linear, polynomial, decision tree
•Excel: Solver
•Python:
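The Python example was blank in the extracted slide; a minimal sketch of the idea, assuming scikit-learn and synthetic telemetry with invented variable names (valve opening, fan speed, temperature):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic "telemetry": N=2 independent variables, 1 dependent variable.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 2))         # e.g. [valve opening, fan speed]
y = 3.0 * X[:, 0] - 1.5 * X[:, 1] + 20 + rng.normal(0, 0.1, 200)  # e.g. temperature

# Fit the linear relation from telemetry.
model = LinearRegression().fit(X, y)
print(model.coef_, model.intercept_)          # recovered coefficients

# Use the model as a static simulator: what if we change the inputs?
prediction = model.predict([[5.0, 2.0]])
```

Swapping `LinearRegression` for a polynomial pipeline or a decision tree covers the other model types the slide lists.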

Simulate Non-Linear Relations
Support Vector Machine (SVM)
•Supervised ML algorithm for classification
•Precondition
Number of samples (points) > Number of features
•Support Vectors – points in N-dimensional space
•Kernel Trick - determine point relations in a higher-dimensional space
oe.g. Polynomial Kernel: 2D [x1, x2] -> 3D [x1, x2, x1² + x2²], 5D [x1², x2², x1 × x2, x1, x2]
Support Vector Regression (SVR)
• Identify a regression hyperplane in a higher dimension
• Hyperplane with an ϵ-margin within which most points fit
• Python:
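The SVR code was blank in the extracted slide; a minimal sketch assuming scikit-learn, fitting an invented non-linear relation (a noisy sine) with an RBF kernel:

```python
import numpy as np
from sklearn.svm import SVR

# Non-linear relation: y = sin(x) plus a little noise, 1 feature, 200 samples.
rng = np.random.default_rng(1)
X = rng.uniform(0, 6, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.05, 200)

# The RBF kernel applies the kernel trick implicitly;
# epsilon is the margin within which most points must fit.
svr = SVR(kernel="rbf", C=10.0, epsilon=0.1).fit(X, y)

test_x = np.array([[np.pi / 2]])
print(svr.predict(test_x))   # close to sin(pi/2) = 1
```

Note the precondition from the slide holds here: 200 samples against 1 feature.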

Simulate Multiple Outcomes
Example: Predictors [diet, exercise, medication] -> Outcomes [blood pressure, cholesterol, BMI]
Option 1: Multiple SVR Models
•SVR model for each outcome
•Precondition: outcomes are independent
Option 2: Multivariate Multiple Regression
•Predictors (Xi): multiple independent input variables
•Outcomes (Yi): multiple dependent variables to be predicted
•Coefficients (β), Error (ϵi)
•Precondition: linear relations
•Python:
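Option 2 can be sketched with scikit-learn, whose `LinearRegression` handles multiple outcomes natively; the coefficient matrix and the health-themed variable names below are illustrative, matching the slide's hypothetical example:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# 3 predictors -> 3 outcomes, as in the slide's example.
rng = np.random.default_rng(2)
X = rng.uniform(0, 1, size=(300, 3))            # [diet, exercise, medication]
B = np.array([[-5.0, -8.0, -12.0],              # true coefficient matrix (beta)
              [-3.0, -1.0,  -6.0],
              [-2.0, -4.0,  -1.0]])
Y = 100 + X @ B + rng.normal(0, 0.2, size=(300, 3))  # [blood pressure, cholesterol, BMI]

# One model, multiple dependent variables:
# coef_ has shape (n_outcomes, n_predictors).
model = LinearRegression().fit(X, Y)
print(model.coef_.shape)
```

This matches the precondition on the slide: the relations are linear, so one joint fit recovers the full β matrix.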

Steps to Implement a Dynamic Simulator
•Data Collection
oIdentify Variables in the environment that influence outcomes
oCollect Telemetry and store data from sensors (Gateway, IoTHub, TimeSeries DB)
•Data Preprocessing
oClean Data (missing values, outliers, noise)
oSelect Features to pick the most relevant features (statistical tests, domain knowledge)
oNormalize Data to fit math model (consistent data ranges)
•Build
oTrain regression model, find the best fit hyperplane that describes input-output relations
oEvaluate model using test data (i.e. Mean Squared Error, R2)
•Integrate, Validate, Deploy
oFramework to use the model to simulate scenarios
oUX Design to allow end-user to configure and visualize environment
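The preprocessing and build steps above can be sketched end to end with NumPy alone. The telemetry values are invented, and the least-squares fit plus R² stand in for the "train and evaluate" step:

```python
import numpy as np

# Hypothetical telemetry: two sensor features and one target, one missing reading.
data = np.array([[1.0, 10.0, 31.2],
                 [2.0, np.nan, 33.9],   # missing value
                 [3.0, 35.0, 37.1],
                 [4.0, 20.0, 40.2],
                 [5.0, 55.0, 42.8]])

# 1. Clean: drop rows with missing values.
data = data[~np.isnan(data).any(axis=1)]
X, y = data[:, :2], data[:, 2]

# 2. Normalize: consistent ranges for the math model.
X_norm = (X - X.mean(axis=0)) / X.std(axis=0)

# 3. Train: least-squares fit of the input-output relation.
A = np.c_[X_norm, np.ones(len(X_norm))]          # add intercept column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# 4. Evaluate: R^2 on the training data (use held-out data in practice).
residuals = y - A @ coef
r2 = 1 - residuals.var() / y.var()
print(round(r2, 3))
```

In a real pipeline the cleaning and evaluation would run on IoT Hub/time-series data with a proper train/test split, as the slide describes.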

Networks of Specialized Agents can Solve Complex Tasks in Dynamic Environments
GPT is only the Beginning

Agentic Design Patterns
•!!!Large Language Models (LLMs) are under-used!!!
oZero-shot mode - generate a completion from a prompt
oIterative improvement
•Concept: autonomous LLM-based agents collaborate to solve complex objectives
•Workflow:
1.Agent receives an objective
2.Agent breaks the objective down into tasks and creates a prompt for each task
3.Prompts are fed iteratively to the LLM (in parallel or serially)
4.When tasks are completed, the agent creates new prompts incorporating the results
5.Agent actively reprioritizes tasks
6.Execution continues until the goal is met or becomes infeasible
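The workflow above can be sketched as a plain loop. Everything here is hypothetical: `call_llm` is a stub standing in for a real model call (e.g. an Azure OpenAI chat completion), and the objective, decomposition and prioritization rule are invented for illustration.

```python
# Stub standing in for a real LLM call; replace with an actual API client.
def call_llm(prompt):
    return f"result of: {prompt}"

def run_agent(objective, decompose, max_rounds=3):
    # 1-2. Receive the objective and break it into tasks, one prompt each.
    tasks = decompose(objective)
    results = []
    for _ in range(max_rounds):
        if not tasks:
            break                      # 6. goal met: nothing left to do
        tasks.sort(key=len)            # 5. reprioritize (shortest task first, as an example)
        task = tasks.pop(0)
        # 3-4. Feed the prompt to the LLM, incorporating earlier results.
        prompt = f"Objective: {objective}\nTask: {task}\nKnown: {results}"
        results.append(call_llm(prompt))
    return results

# Hypothetical objective and a trivial decomposition function.
steps = run_agent(
    "reduce plant energy cost",
    decompose=lambda obj: ["collect telemetry", "propose setpoints", "verify"],
)
print(len(steps))
```

A production agent would also judge task feasibility and spawn sub-agents, which is where the Assistant API features on the next slide come in.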

Azure OpenAI Assistant API
•Objectives
oCreate multi-agent systems
oCommunication between agents
oNo limit on context windows
oPersistence enabled
•Strategies
oReflection
oTool Use
oMulti-agency
oPlanning

Simulator: Sample Implementation
DEMO

Environment State & Control
•What are the independent variables that describe state and control the environment?
1.State variables
2.Control variables

Control Effect Baseline
•What would be the effect of applying control for a single episode step? (e.g. cost, energy, pressure)
1.Observation period
2.Step interval
3.Controls
4.Source data
5.Transformations
6.Weights
7.Effect preview

Simulation Target
•What dependent variable do we model with the simulator?

Simulator Model Training Settings
•How to learn the relations between dependent and independent variables?
1.Training period
2.Step interval
3.Data preparation
4.Type of model
5.Target transform
6.Control transform
7.State transform

Preview and Train
•Preview training data and train model
1.Data sample statistics
2.Command data preview
3.State data preview
4.Learnt regression coefficients

Evaluate Performance
•How does the model explain the data?
1.Coefficient of determination
2.Number of iterations
3.Feature importance

Autonomous Control: Sample Implementation
DEMO

Autonomous Control Settings
•How will autonomous control run?
oWhat environment simulator will it use?
oHow often will the agent act on the environment?
oDo we allow tolerance - when the target cannot be achieved within a single step?
1.Environment simulator
2.Step interval
3.Tolerance

Select Control and State Characteristics
•Which of the simulator characteristics will be state and which controllable?
1.Control characteristics
2.Possible control values
3.State characteristics

Process Constraints
•What is the range of valid values for environment state?
oConstraints on environment simulator target
oConstraints on aggregation of state characteristics (i.e. humidity < threshold)
1.Add constraint
2.Type of constraint (target/state)
3.Condition for constraint
4.Accept tolerance

Control Objectives
•What objective to achieve in the environment within the constraints?
oImplicit conditions - there are always hidden objectives that push the optimum toward the constraint boundaries
•Objective: Minimize energy cost
•Example constraint: Temperature < 25 + 1.5 (tolerance)
•Implicit condition: Temperature < 25 + 1.5 (tolerance)
1.Add objective
2.Optimization function
3.Objective target (simulator/sum)
4.Characteristics
5.Use simulator output baseline
6.Explicit conditions