IEEE CIS Webinar Sustainable futures.pdf

Claudio Gallicchio · 54 slides · Jun 14, 2024

About This Presentation

Sustainable and efficient computational practices in artificial intelligence (AI) and deep learning have become increasingly critical. This webinar focuses on the intersection of sustainability and AI, highlighting the significance of energy-efficient deep learning, innovative rando...


Slide Content

1
Sustainable Futures in AI
Exploring Efficiency and Emerging Paradigms
Claudio Gallicchio
University of Pisa, Italy
This Webinar is provided to you by
IEEE Computational Intelligence Society
https://cis.ieee.org

2
The current path of deep learning research is largely unsustainable.

3
Doubling time
DL training compute: ≈ 3 months
Moore's law: ≈ 2 years
Sevilla, Jaime, et al. "Compute Trends Across Three Eras of Machine Learning." arXiv preprint arXiv:2202.05924 (2022).

4
Strubell, Emma, Ananya Ganesh, and Andrew McCallum. "Energy and policy considerations for deep learning in NLP." arXiv preprint arXiv:1906.02243 (2019).
https://www.technologyreview.com/2019/06/06/239031/training-a-single-ai-model-can-emit-as-much-carbon-as-five-cars-in-their-lifetimes/
Energy: ≈ 54 US households in 1 year
CO₂: ≈ 125 round-trip flights on a Boeing 737

5
GPT-4 has 1.7 trillion trainable parameters.
Training costs
▶ 25k Nvidia A100 GPUs for 90+ days
▪ $100+ million
▶ 50k-60k MWh
▪ more than 5 years of electricity for 1,000 average US households
Inference costs
▶ as much electricity per month as 26k US households
It continues to get worse.

6
AI risks significantly accelerating the climate crisis.

7
Paris Agreement
Hold "the increase in the global average temperature to well below 2°C above pre-industrial levels" and pursue efforts "to limit the temperature increase to 1.5°C above pre-industrial levels."
https://unfccc.int/process-and-meetings/the-paris-agreement
European Green Deal
Make Europe climate neutral by 2050. To make this objective legally binding, the Commission proposed the European Climate Law, which also sets a new, more ambitious net greenhouse gas emissions reduction target of at least -55% by 2030, compared to 1990 levels.
https://commission.europa.eu/strategy-and-policy/priorities-2019-2024/european-green-deal/climate-action-and-green-deal_en

8
Large-scale AI computing
Compute: up to ≈10^18 FLOPS
Energy: in the MW to GW range
The human brain
Compute: ≈10^15 FLOPS
Energy: ≈20 W

9
Deep learning models have achieved tremendous success over the years.
This comes at a very high cost.
Do we really need all of this, all the time?

10
Example: embedded applications
Source: https://bitalino.com/en/freestyle-kit-bt
Source: https://www.eenewsembedded.com/news/raspberry-pi-3-now-compute-module-format

11
Deep neural networks: powerful representations built by applying multiple levels of non-linear transformation.
Deep Learning = Architectural biases + Learning algorithms

12
Desirable AI systems’ properties
Fast Learning: quickly adapts to new data and tasks with minimal retraining.

13
Desirable AI systems’ properties
Fast Learning: quickly adapts to new data and tasks with minimal retraining.
Low Training Cost: economical in terms of computational resources and energy consumption.

14
Desirable AI systems’ properties
Fast Learning: quickly adapts to new data and tasks with minimal retraining.
Low Training Cost: economical in terms of computational resources and energy consumption.
Efficient in Modeling Dynamic Information: handles time-series data and dynamic environments effectively.

15
Desirable AI systems’ properties
Fast Learning: quickly adapts to new data and tasks with minimal retraining.
Low Training Cost: economical in terms of computational resources and energy consumption.
Efficient in Modeling Dynamic Information: handles time-series data and dynamic environments effectively.
Ease of Hardware Implementation: suitable for deployment on a wide range of hardware platforms, including low-power devices.

16
Desirable AI systems’ properties
Fast Learning: quickly adapts to new data and tasks with minimal retraining.
Low Training Cost: economical in terms of computational resources and energy consumption.
Efficient in Modeling Dynamic Information: handles time-series data and dynamic environments effectively.
Ease of Hardware Implementation: suitable for deployment on a wide range of hardware platforms, including low-power devices.

17
"Reservoir Computing seems simple but is difficult, feels new but is old, opens horizons, and is brutally limiting"
- H. Jaeger
Nakajima, Kohei, and Ingo Fischer. Reservoir Computing. Springer Singapore, 2021.

18
RNN architecture
A recurrent layer maps the input u(t) into the state x(t); an output layer (readout) produces y(t).
state transition function: x(t) = tanh(W_in u(t) + W_h x(t-1) + b_h)
readout: y(t) = W_out x(t) + b_out
All weight matrices and biases are trainable parameters.
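As an illustration (not part of the original slides), here is a minimal NumPy sketch of this state transition; the dimensions and random initialization are arbitrary assumptions:

import numpy as np

n_in, n_h, n_out = 3, 100, 2                  # assumed sizes
rng = np.random.default_rng(0)
W_in = rng.uniform(-1, 1, (n_h, n_in))        # input weights
W_h = rng.uniform(-1, 1, (n_h, n_h))          # recurrent weights
W_out = rng.uniform(-1, 1, (n_out, n_h))      # readout weights
b_h, b_out = np.zeros(n_h), np.zeros(n_out)

def rnn_step(u_t, x_prev):
    # x(t) = tanh(W_in u(t) + W_h x(t-1) + b_h)
    x_t = np.tanh(W_in @ u_t + W_h @ x_prev + b_h)
    # y(t) = W_out x(t) + b_out
    return x_t, W_out @ x_t + b_out

In a standard RNN, all of these matrices are adjusted by gradient-based training.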

19
Propagation Issues: fading / exploding memory
[Figure: RNN unrolled over time steps u(1)...u(5), x(1)...x(5)]
Instabilities in the forward propagation of the input.

20
Propagation Issues: fading / exploding gradients
[Figure: RNN unrolled over time steps u(1)...u(5), x(1)...x(5)]
Instabilities in the backward propagation of the gradients.
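A small, hypothetical experiment (all values assumed) that makes these instabilities visible: rescale the recurrent matrix to different spectral radii and iterate the state update without input.

import numpy as np

rng = np.random.default_rng(0)

def final_state_norm(rho, T=200, n=100):
    W = rng.uniform(-1, 1, (n, n))
    W *= rho / max(abs(np.linalg.eigvals(W)))  # rescale to spectral radius rho
    x = rng.standard_normal(n)
    for _ in range(T):
        x = np.tanh(W @ x)                     # autonomous state update, no input
    return np.linalg.norm(x)

print(final_state_norm(0.5))  # contractive regime: the state (memory) fades toward 0
print(final_state_norm(2.0))  # unstable regime: the state stays large (tanh saturates)

The same spectral properties drive the backward pass: products of Jacobians along the unrolled network shrink or grow at a comparable rate, which is the fading/exploding gradient problem.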

21
The Philosophy
"Randomization is computationally cheaper than optimization"
Rahimi, A. and Recht, B., 2008. Weighted sums of random kitchen sinks: Replacing minimization with randomization in learning. Advances in Neural Information Processing Systems, 21, pp. 1313-1320.
Rahimi, A. and Recht, B., 2007. Random features for large-scale kernel machines. Advances in Neural Information Processing Systems, 20, pp. 1177-1184.

22
Basic Idea
Exploit as much as possible the intrinsic computational capabilities of RNNs, even prior to (or in the absence of) any training of the recurrent connections.

23
Echo State Network
The recurrent layer (the "reservoir") is randomly initialized and left untrained: the representation layer is fixed, and only the output layer (readout) is trained.
state transition function: x(t) = tanh(W_in u(t) + W_h x(t-1) + b_h)
readout (the only trainable parameters): y(t) = W_out x(t) + b_out
Jaeger, Herbert, and Harald Haas. Science 304.5667 (2004): 78-80.

24
Echo State Network
Random, ok... but stable.
state transition function: x(t) = tanh(W_in u(t) + W_h x(t-1) + b_h)
Control the asymptotic stability by constraining the eigenvalues of the recurrent matrix W_h (echo state property).
The representation layer is fixed; only the output layer is trained.
Jaeger, Herbert, and Harald Haas. Science 304.5667 (2004): 78-80.
Yildiz, Izzet B., Herbert Jaeger, and Stefan J. Kiebel. "Re-visiting the echo state property." Neural Networks 35 (2012): 1-9.
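A minimal end-to-end ESN sketch (hyperparameters, the delay task, and all names are assumptions for illustration): the reservoir is random and fixed, rescaled to spectral radius < 1 (a common rule of thumb associated with the echo state property), and only the linear readout is fit by ridge regression.

import numpy as np

rng = np.random.default_rng(42)
n_in, n_res = 1, 300
rho, washout, reg = 0.9, 100, 1e-6             # assumed hyperparameters

W_in = rng.uniform(-1, 1, (n_res, n_in))
W = rng.uniform(-1, 1, (n_res, n_res))
W *= rho / max(abs(np.linalg.eigvals(W)))      # rescale spectral radius to rho < 1

def reservoir_states(U):
    x, X = np.zeros(n_res), []
    for u in U:                                # U: (T, n_in) input sequence
        x = np.tanh(W_in @ u + W @ x)          # untrained state transition
        X.append(x)
    return np.array(X)

# Toy task: reproduce the input with a 5-step delay (wrap-around hidden by the washout)
U = rng.uniform(-1, 1, (1000, n_in))
Y = np.roll(U, 5, axis=0)

X = reservoir_states(U)
# Ridge regression on the readout, the only trained parameters
W_out = np.linalg.solve(X[washout:].T @ X[washout:] + reg * np.eye(n_res),
                        X[washout:].T @ Y[washout:])
pred = X @ W_out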

25
Echo State Property (ESP)
Gallicchio, Claudio. "Euler state networks: Non-dissipative Reservoir Computing." Neurocomputing (2024).

26
Advantages
Clean mathematical analysis
Efficiency
Hardware implementation

27
Advantages
Clean mathematical analysis
Efficiency
Hardware implementation
MotionSenseHAR: accuracies 0.93 / 0.97; 2000x faster; 1700x more efficient.
Gallicchio, Claudio. "Euler state networks: Non-dissipative Reservoir Computing." Neurocomputing (2024).

28
Advantages
Clean mathematical analysis
Efficiency
Hardware implementation
Neural learning in 8 Kb of memory + deployment over-the-air.
Dragone, Mauro, et al. "A cognitive robotic ecology approach to self-configuring and evolving AAL systems." Engineering Applications of Artificial Intelligence 45 (2015): 269-280.

29
Advantages
Clean mathematical analysis
Efficiency
Hardware implementation
The reservoir can be implemented by any (controllable) physical substrate providing:
● dynamics
● non-linearity
Yan, M., et al. "Emerging opportunities and challenges for the future of reservoir computing." Nat Commun 15, 2056 (2024).
Nakajima, Kohei, et al. "Information processing via physical soft body." Scientific Reports 5.1 (2015): 10487.


31
Advantages
Clean mathematical analysis
Efficiency
Hardware implementation
Hardware-software co-design of graph echo state nets.
Random resistive memory arrays: low-cost, nanoscale, and stackable resistors for efficient in-memory computing.
State-of-the-art performance + up to 40x improvements in energy efficiency.
Wang, Shaocong, et al. "Echo state graph neural networks with analogue random resistive memory arrays." Nature Machine Intelligence 5.2 (2023): 104-113.

32
Deep reservoirs
Reservoir = set of nested non-linear dynamical systems.
The first layer is driven by the external input; each deeper layer is driven by the state of the layer below:
x^(1)(t) = tanh(W_in u(t) + W^(1) x^(1)(t-1) + b^(1))
x^(l)(t) = tanh(W_il^(l) x^(l-1)(t) + W^(l) x^(l)(t-1) + b^(l)),  l > 1
Gallicchio, Claudio, Alessio Micheli, and Luca Pedrelli. "Deep reservoir computing: A critical experimental analysis." Neurocomputing 268 (2017): 87-99.
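A hypothetical sketch of this layered update (sizes assumed): layer 1 is driven by the external input, and each deeper layer is driven by the state of the layer below.

import numpy as np

rng = np.random.default_rng(0)
n_in, n_res, n_layers, rho = 1, 100, 3, 0.9    # assumed sizes

def random_matrix(shape, spectral_radius=None):
    M = rng.uniform(-1, 1, shape)
    if spectral_radius is not None:            # rescale square recurrent matrices
        M *= spectral_radius / max(abs(np.linalg.eigvals(M)))
    return M

W_in = [random_matrix((n_res, n_in))] + \
       [random_matrix((n_res, n_res)) for _ in range(n_layers - 1)]
W = [random_matrix((n_res, n_res), rho) for _ in range(n_layers)]

def deep_step(u, xs):
    new_states, drive = [], u
    for l in range(n_layers):
        x = np.tanh(W_in[l] @ drive + W[l] @ xs[l])  # layer l state update
        new_states.append(x)
        drive = x                              # this state drives layer l+1
    return new_states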

33
Deep reservoirs
• Multiple time-scales
• Multiple frequencies
• Develop richer dynamics even without training of the recurrent connections
Gallicchio, Claudio, Alessio Micheli, and Luca Pedrelli. "Deep reservoir computing: A critical experimental analysis." Neurocomputing 268 (2017): 87-99.
Gallicchio, C., Micheli, A. and Pedrelli, L., 2018. Design of deep echo state networks. Neural Networks, 108, pp. 33-47.

34
Euler State Networks (EuSN)
Non-dissipative reservoir computing
1. Impose an antisymmetric recurrent weight matrix to enforce critical dynamics of the underlying ODE:
   ẋ(t) = tanh(W_in u(t) + (W - W^T) x(t) + b)
2. Discretize the ODE with the forward Euler method:
   x(t) = x(t-1) + ε tanh(W_in u(t) + (W - W^T - γI) x(t-1) + b)
Here ε is the step size, γ a small diffusion term, and all weights are untrained; the resulting dynamics are arbitrarily close to the edge-of-stability.
Gallicchio, Claudio. "Euler state networks: Non-dissipative Reservoir Computing." Neurocomputing (2024).
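A minimal sketch of the EuSN update above (ε, γ, and the sizes are arbitrary assumed values):

import numpy as np

rng = np.random.default_rng(0)
n_in, n_res = 1, 100
eps, gamma = 0.1, 0.01                          # step size and diffusion (assumed)

W = rng.uniform(-1, 1, (n_res, n_res))
A = W - W.T - gamma * np.eye(n_res)             # antisymmetric matrix minus diffusion
W_in = rng.uniform(-1, 1, (n_res, n_in))
b = np.zeros(n_res)

def eusn_step(u, x):
    # x(t) = x(t-1) + eps * tanh(W_in u(t) + (W - W^T - gamma I) x(t-1) + b)
    return x + eps * np.tanh(W_in @ u + A @ x + b)

An antisymmetric matrix has purely imaginary eigenvalues, so the underlying ODE neither dissipates nor amplifies state information; the small diffusion term γ stabilizes the forward Euler discretization.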

35
Euler State Networks (EuSN)
Non-dissipative reservoir computing
High accuracy vs. SOTA fully trainable models & ESNs.
Gallicchio, Claudio. "Euler state networks: Non-dissipative Reservoir Computing." Neurocomputing (2024).

36
Euler State Networks (EuSN)
Non-dissipative reservoir computing
Dramatically more efficient (up to 1750x faster) than fully trainable models.
Gallicchio, Claudio. "Euler state networks: Non-dissipative Reservoir Computing." Neurocomputing (2024).

37
Is it possible to create a reservoir that is better than just a random one?

38
Intrinsic Plasticity
Apply a gain and a bias to each reservoir neuron's activation: y = f(a x + b).
• Adapt the gain a and bias b of the activation function
• Tune the probability density of reservoir neurons toward maximum entropy
The parameters of the target distribution are hyperparameters; learning proceeds by Kullback–Leibler divergence minimization.
Schrauwen, B., Wardermann, M., Verstraeten, D., Steil, J.J. and Stroobandt, D., 2008. Improving reservoirs using intrinsic plasticity. Neurocomputing, 71(7-9), pp. 1159-1171.
Morales, G.B., Mirasso, C., Soriano, M.C., 2021. Unveiling the role of plasticity rules in reservoir computing. Neurocomputing.
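A sketch of one IP step for tanh neurons with a Gaussian target distribution, following the update rule reported in Schrauwen et al. (2008); the learning rate and target moments below are assumed values, so verify the formula against the paper before reuse.

import numpy as np

eta, mu, sigma = 1e-4, 0.0, 0.2                 # learning rate, target mean/std (assumed)

def ip_step(a, b, net):
    # a, b: per-neuron gain and bias; net: pre-activation (reservoir net input)
    y = np.tanh(a * net + b)
    db = -eta * (-(mu / sigma**2)
                 + (y / sigma**2) * (2 * sigma**2 + 1 - y**2 + mu * y))
    da = eta / a + db * net
    return a + da, b + db                       # online KL-divergence minimization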

39
Federated and Continual scenarios in Pervasive AI
Centralized setting: all data is available in advance on a single machine.
Federated stationary scenario: each client has its own private dataset in advance.
Continual scenario: a single machine gets the data from clients in a streaming fashion.
Federated Continual scenario: each client has its own private data stream.
De Caro, Valerio, Claudio Gallicchio, and Davide Bacciu. "Continual adaptation of federated reservoirs in pervasive environments." Neurocomputing (2023).

40
(Exact) Federated Learning at the readout level
Each client k computes sufficient statistics of its private data from its reservoir states X_k and targets Y_k: A_k = X_k^T X_k and B_k = X_k^T Y_k. The server aggregates them and solves the ridge-regression readout:
W_out = (Σ_k A_k + λI)^(-1) Σ_k B_k
New local data simply updates the client statistics incrementally: A_k ← A_k + ΔA_k, B_k ← B_k + ΔB_k.
De Caro, Valerio, Claudio Gallicchio, and Davide Bacciu. "Continual adaptation of federated reservoirs in pervasive environments." Neurocomputing (2023).
Bacciu, Davide, et al. "Federated reservoir computing neural networks." 2021 International Joint Conference on Neural Networks (IJCNN). IEEE, 2021.
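A minimal sketch of this scheme (function and variable names are illustrative). Because the ridge-regression readout is linear in the statistics A_k and B_k, the aggregated solution is identical to training on the pooled data, which is what makes the federation exact:

import numpy as np

def client_statistics(X, Y):
    # Each client shares only sufficient statistics, never its raw private data
    return X.T @ X, X.T @ Y                     # A_k, B_k

def server_readout(stats, reg=1e-6):
    A = sum(a for a, _ in stats)
    B = sum(b for _, b in stats)
    # W_out = (sum_k A_k + reg*I)^(-1) sum_k B_k
    return np.linalg.solve(A + reg * np.eye(A.shape[0]), B)

New local data only updates A_k and B_k incrementally, which also covers the continual scenario.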

41
Continual & Federated Intrinsic Plasticity Learning at the reservoir level
Per-client IP parameters (a_k, b_k) are aggregated by a sample-weighted average across clients: a = Σ_k (n_k/N) a_k, b = Σ_k (n_k/N) b_k.
CPSoS applications with a human in the loop (teaching-h2020.eu):
• Stress Recognition (LM): EDA sensing → stress prediction
• Driving-style Personalization (LM): → driving profile
De Caro, Valerio, Claudio Gallicchio, and Davide Bacciu. "Continual adaptation of federated reservoirs in pervasive environments." Neurocomputing (2023).
De Caro, Valerio, et al. "AI-as-a-Service Toolkit for Human-Centered Intelligence in Autonomous Driving." arXiv preprint arXiv:2202.01645, PERCOM 2022.
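A possible aggregation sketch, assuming a FedAvg-style average of the per-client IP parameters weighted by local sample counts (the exact rule used in the paper may differ):

import numpy as np

def federated_average(params, sizes):
    # params: list of per-client gain (or bias) vectors; sizes: local sample counts
    weights = np.asarray(sizes) / np.sum(sizes)
    return sum(w * p for w, p in zip(weights, params))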

42
(Neural & Physical) Computing with Reservoirs
Archetype Computing System: an engine to run dynamical systems enriched with lifelong and evolutionary learning.
https://eic-emerge.eu

43
NEURONE: extremely efficient NEUromorphic Reservoir cOmputing in Nanowire network hardwarE
Advanced Reservoir Computing nets + nanowire networks (memristive HW) → ultra energy-efficient learning.
Milano, Gianluca, et al. "In materia reservoir computing with a fully memristive architecture based on self-organizing nanowire networks." Nature Materials 21.2 (2022): 195-202.

44
Reservoir Computing: principled sustainable AI, designing training-efficient sequence models.

45
Reservoir Computing: principled sustainable AI, designing training-efficient sequence models.
Dynamics constrained in a "smart" way to offer useful architectural biases.

46
Reservoir Computing: principled sustainable AI, designing training-efficient sequence models.
Dynamics constrained in a "smart" way to offer useful architectural biases.
Great potential in pervasive AI scenarios where reduced computational resources are available, and in neuromorphic HW.

47
Reservoir Computing: principled sustainable AI, designing training-efficient sequence models.
Dynamics constrained in a "smart" way to offer useful architectural biases.
Great potential in pervasive AI scenarios where reduced computational resources are available, and in neuromorphic HW.
Enable the development of smarter neural architectures that can be trained end-to-end.

48
Reservoir Computing: principled sustainable AI, designing training-efficient sequence models.
Dynamics constrained in a "smart" way to offer useful architectural biases.
Great potential in pervasive AI scenarios where reduced computational resources are available, and in neuromorphic HW.
Enable the development of smarter neural architectures that can be trained end-to-end.
A greener path for AI development without the need for extensive parameter training.

49
Resources

50
https://github.com/reservoirpy/reservoirpy
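A minimal usage sketch based on ReservoirPy's documented API (v0.3-style; node names and arguments may change between versions, so check the repository's README):

import numpy as np
from reservoirpy.nodes import Reservoir, Ridge

# Toy task: one-step-ahead prediction of a sine wave
X = np.sin(np.linspace(0, 20 * np.pi, 2000)).reshape(-1, 1)

reservoir = Reservoir(units=300, sr=0.9, lr=0.3)   # spectral radius, leaking rate
readout = Ridge(ridge=1e-6)                        # ridge-regression readout
esn = reservoir >> readout                         # compose reservoir and readout

esn.fit(X[:-1], X[1:], warmup=100)                 # train only the readout
y_pred = esn.run(X[:-1])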

51
AI-Toolkit
https://github.com/EU-TEACHING/teaching-ai-toolkit
Lomonaco, Vincenzo, et al. "AI-Toolkit: A Microservices Architecture for Low-Code Decentralized Machine Intelligence." ICASSPW 2023.
Deep Echo State Networks
https://github.com/gallicch/DeepRC-TF
Euler State Networks
https://github.com/gallicch/EulerStateNetworks

52
Reservoir Computing NNs
IJCNN 2021 Tutorial: Reservoir Computing: Randomized Recurrent Neural Networks
https://www.youtube.com/watch?v=XJg7VdN7g-0
WCCI IJCNN 2024 Special Session
https://dyn.web.nitech.ac.jp/en/wcci2024_ss_rc

53
IEEE Task Force on Reservoir Computing
Promote and stimulate the development of Reservoir Computing research under both theoretical and application perspectives.
https://sites.google.com/view/reservoir-computing-tf/
IEEE Task Force on Randomization-based Neural Networks and Learning Systems
Promote the research and applications of deep randomized neural networks and learning systems.
https://sites.google.com/view/randnn-tf/

54
Sustainable Futures in AI
Exploring Efficiency and Emerging Paradigms
Claudio Gallicchio
University of Pisa, Italy
This Webinar is provided to you by
IEEE Computational Intelligence Society
https://cis.ieee.org
[email protected]
https://www.linkedin.com/in/claudio-gallicchio-05a47038/
https://twitter.com/claudiogallicc1
collaborations welcome!