Federated and Transfer Learning (Roozbeh Razavi-Far, Boyu Wang)


Adaptation, Learning, and Optimization 27
Roozbeh Razavi-Far, Boyu Wang, Matthew E. Taylor, Qiang Yang (Editors)
Federated and Transfer Learning

Adaptation, Learning, and Optimization
Volume 27
Series Editors
Yew Soon Ong, Nanyang Technological University, Singapore, Singapore
Abhishek Gupta, Singapore Institute of Manufacturing Technology, Singapore,
Singapore
Maoguo Gong, Xidian University, Xi'an, Shaanxi, China

The roles of adaptation, learning, and optimization are becoming increasingly essential
and intertwined. The capability of a system to adapt either through modification of its
physiological structure or via some revalidation process of internal mechanisms that
directly dictate the response or behavior is crucial in many real world applications.
Optimization lies at the heart of most machine learning approaches while learning
and optimization are two primary means to effect adaptation in various forms. They
usually involve computational processes incorporated within the system that trigger
parametric updating and knowledge or model enhancement, giving rise to progressive
improvement. This book series serves as a channel to consolidate work related to
topics linked to adaptation, learning and optimization in systems and structures.
Topics covered under this series include:
• complex adaptive systems including evolutionary computation, memetic computing, swarm intelligence, neural networks, fuzzy systems, tabu search, simulated annealing, etc.
• machine learning, data mining & mathematical programming
• hybridization of techniques that span across artificial intelligence and computational intelligence for a synergistic alliance of strategies for problem-solving
• aspects of adaptation in robotics
• agent-based computing
• autonomic/pervasive computing
• dynamic optimization/learning in noisy and uncertain environments
• systemic alliance of stochastic and conventional search techniques
• all aspects of adaptation in man-machine systems.
This book series bridges the dichotomy of modern and conventional mathematical
and heuristic/meta-heuristics approaches to bring about effective adaptation, learning
and optimization. It propels the maxim that the old and the new can come together
and be combined synergistically to scale new heights in problem-solving. To reach
such a level, numerous research issues will emerge and researchers will find the book
series a convenient medium to track the progress made.
Indexed by SCOPUS, zbMATH, SCImago.

Roozbeh Razavi-Far · Boyu Wang · Matthew E. Taylor · Qiang Yang
Editors
Federated and Transfer Learning

Editors
Roozbeh Razavi-Far
Faculty of Computer Science
University of New Brunswick
Fredericton, NB, Canada
Department of Electrical and Computer
Engineering and School of Computer
Science
University of Windsor
Windsor, ON, Canada
Matthew E. Taylor
University of Alberta and the Alberta
Machine Intelligence Institute (Amii)
Edmonton, AB, Canada
Boyu Wang
Department of Computer Science
Western University
London, ON, Canada
Qiang Yang
Department of Computer Science
and Engineering
Hong Kong University of Science
and Technology
Kowloon, China
ISSN 1867-4534 ISSN 1867-4542 (electronic)
Adaptation, Learning, and Optimization
ISBN 978-3-031-11747-3 ISBN 978-3-031-11748-0 (eBook)
https://doi.org/10.1007/978-3-031-11748-0
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Switzerland AG 2023
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Preface
We live in a world with tremendous amounts of available data. However, not all data
is equally valuable, and oftentimes there is not enough high-quality data to achieve
desired results. Furthermore, it may not be feasible to centralize all relevant data due
to concerns over privacy, producing fast results at the edge, etc. The fields of transfer
learning and federated learning hold the promise of addressing these problems,
improving existing machine learning applications, and enabling completely novel
applications.
However, significant fundamental and applied research remains before transfer
and federated learning can reach their full potential in supervised, unsupervised,
and reinforcement learning settings. For instance, concept drift and concept shift are
common problems—how can data from a different (but related) task be best used
when a new task is slightly different? And what happens if the tasks change over
time? What is the best way to reuse past data or adapt it? When are tasks so different
that past data should simply be ignored?
This book aims to serve two related goals. First, the book provides high-level
background information that will allow students, researchers, and practitioners to
quickly get up to speed in these exciting areas, understanding what has been done,
how the algorithms work, how they are related, and what are some of the important
open problems. Second, the book showcases novel contributions over state of the
art, providing significant contributions to the field. We hope that these individual
contributions can not only be used directly, but also serve as starting points for
completely novel research.
After an introductory chapter, Chaps. 2–4 provide introductions to federated
learning. Chapters 5–8 then launch into novel federated learning methods. Chapters
9–15 focus on transfer learning, broken down into three chapters on novel transfer
learning research, one chapter devoted to a released transfer learning tool, one chapter
related to transfer in reinforcement learning, one chapter surveying transfer in
reinforcement learning settings, and one chapter on federated transfer reinforcement
learning.
One of the most exciting parts of transfer learning and federated learning is how
many questions remain unanswered and how much room for improvement remains,
even after years of study. We hope you will enjoy learning about these techniques as
much as we have!
The assistance provided by Springer-Verlag is gratefully acknowledged.
Fredericton, NB, Canada
London, ON, Canada
Edmonton, AB, Canada
Kowloon, China
Roozbeh Razavi-Far
Boyu Wang
Matthew E. Taylor
Qiang Yang

Contents
An Introduction to Federated and Transfer Learning................. 1
Roozbeh Razavi-Far, Boyu Wang, Matthew E. Taylor, and Qiang Yang
Federated Learning for Resource-Constrained IoT Devices:
Panoramas and State of the Art..................................... 7
Ahmed Imteaj, Khandaker Mamun Ahmed, Urmish Thakker,
Shiqiang Wang, Jian Li, and M. Hadi Amini
Federated and Transfer Learning: A Survey on Adversaries
and Defense Mechanisms........................................... 29
Ehsan Hallaji, Roozbeh Razavi-Far, and Mehrdad Saif
Cross-Silo Federated Neural Architecture Search
for Heterogeneous and Cooperative Systems......................... 57
Yang Liu, Xinle Liang, Jiahuan Luo, Yuanqin He, Tianjian Chen,
Quanming Yao, and Qiang Yang
A Unifying Framework for Federated Learning...................... 87
Saber Malekmohammadi, Kiarash Shaloudegi, Zeou Hu,
and Yaoliang Yu
A Contract Theory Based Incentive Mechanism for Federated
Learning......................................................... 117
Yuan Liu, Mengmeng Tian, Yuxin Chen, Zehui Xiong, Cyril Leung,
and Chunyan Miao
A Study of Blockchain-Based Federated Learning.................... 139
Samaneh Miri Rostami, Saeed Samet, and Ziad Kobti
Swarm Meta Learning............................................. 167
Xiao Tian, Yuzhang Jiang, and Hua Tianfield
Rethinking Importance Weighting for Transfer Learning............. 185
Nan Lu, Tianyi Zhang, Tongtong Fang, Takeshi Teshima,
and Masashi Sugiyama
Transfer Learning via Representation Learning...................... 233
Mohammad Rostami, Hangfeng He, Muhao Chen, and Dan Roth
Modeling Individual Humans via a Secondary Task Transfer
Learning Method.................................................. 259
Anmol Mahajan and Matthew Guzdial
From Theoretical to Practical Transfer Learning: The ADAPT
Library........................................................... 283
Antoine de Mathelin, Francois Deheeger, Mathilde Mougeot,
and Nicolas Vayatis
Lyapunov Robust Constrained-MDPs for Sim2Real Transfer
Learning......................................................... 307
Reazul Hasan Russel, Mouhacine Benosman, Jeroen van Baar,
and Radu Corcodel
A Study on Efficient Reinforcement Learning Through Knowledge
Transfer.......................................................... 329
Ruben Glatt, Felipe Leno da Silva,
Reinaldo Augusto da Costa Bianchi, and Anna Helena Reali Costa
Federated Transfer Reinforcement Learning for Autonomous
Driving........................................................... 357
Xinle Liang, Yang Liu, Tianjian Chen, Ming Liu, and Qiang Yang

An Introduction to Federated
and Transfer Learning
Roozbeh Razavi-Far, Boyu Wang, Matthew E. Taylor, and Qiang Yang
AbstractIn today’s world, we have access to a tremendous amount of data. How-
ever, there is not enough high-quality data to obtain the desired results. More impor-
tantly, many industries have separate databases, with restricted sharing policies. This
prevents the use of centralized storage of all relevant data in many applications for
various reasons, such as privacy concerns and the need for fast computational results
at the frontier. These problems can be addressed through transfer learning and feder-
ated learning. This book contains some chapters to provide background knowledge
of transfer learning and federated learning, as well as, novel contributions to improve
the performance of distributed learning systems.
R. Razavi-Far (B)
Faculty of Computer Science, University of New Brunswick, Fredericton, NB, Canada
e-mail:[email protected];[email protected]
Department of Electrical and Computer Engineering and School of Computer Science, University
of Windsor, 401 Sunset Avenue, Windsor, ON N9B 3P4, Canada
B. Wang
Department of Computer Science, Western University, London, ON, Canada
e-mail:[email protected]
M. E. Taylor
Department of Computing Science, University of Alberta, 9119-116 St NW, Edmonton, AB T6G
2E8, Canada
e-mail:[email protected]
Alberta Machine Intelligence Institute (AMII), 10065 Jasper Ave 1101, Edmonton, AB T5J 3B1,
Canada
Q. Yang
Department of Computer Science and Engineering, Hong Kong University of Science and
Technology (HKUST), Clearwater Bay, Kowloon, Hong Kong
e-mail:[email protected]
Introduction
The rapid advancement of computational power in recent years has resulted in the indus-
trialization of machine learning algorithms [5–8,24,30]. While these algorithms
achieve better predictive performance, two issues have become prime concerns
in their real-world applications.
Given that labeled data is often difficult and time-consuming to acquire, and many
companies and industrial sectors do not share their data due to privacy concerns, it
makes sense to reuse knowledge gained from related but distinct datasets. Never-
theless, it is challenging to construct an ideal model from a domain with limited
data samples even if training data is abundant in another domain. Through Transfer
Learning (TL), the model can be pre-trained on data from a specific domain and then
adapted to meet the needs of the given task [18].
TL is one of the branches of machine learning that has been at the forefront of
addressing this problem [16–18,25,35]. TL entails extracting information from one
domain, so-called source domain, and passing it on to another, called target domain
[12,31]. The second issue of concern, on the other hand, is related to learning
from decentralized data samples that are distributed across a wide network [26].
This learning condition is referred to as Federated Learning (FL), which is closely
related to TL. They are usually paired to learn the most from data. Federated and
Transfer Learning (FTL) clients may or may not use the same features for the sake
of training. This is very common in organizations that are similar in their nature, yet
not identical [21]. FTL is designed to handle such heterogeneous data distributed
over decentralized networks.
Transfer learning was initially proposed in 1995 as a way to repurpose previously
learned knowledge through machine learning algorithms. During the following years,
the idea began to attract more attention resulting in numerous research endeavors on
advancing this field [22,34]. Roughly ten years after the original idea, TL focused
more on the target tasks rather than the source tasks. To clarify, instead of initially
presented solutions such as multitask learning, the goal shifted from learning from
all source domains to extracting knowledge from one or more source domains and
transferring it to the target domain. Under this definition, the source domains and
the target domain no longer play symmetric roles. After that, the proposed solutions
aimed at completing the same task, though with varying approaches. As a newer
research area compared with TL, FL has a relatively short history. FL was first
introduced by Google in 2016; it tries to preserve privacy while constructing models
from distributed data [1,14,32].
TL enables learning from a limited number of data samples by transferring
extracted knowledge from one or more domains to a relevant target domain. TL
can also help in detecting concept drift in a data stream; however, it might fail
under certain conditions. The source and target domains must have some overlap for
TL to succeed, and if there is very little overlap, TL will fail. Additionally, domain
similarities can sometimes be deceptive. Despite FL's ability to deal with data scarcity
and decentralized learning, it does have a number of challenges. Because federated
networks include many client-side devices that may not have sufficient bandwidth,
it can be difficult to communicate among these devices [2,3]. Furthermore, the
specifications of these devices usually vary, resulting in different processing power
and transmission rates on the client side. Therefore, synchronizing these devices
becomes a complex task. In addition, since communications are the backbone of FL,
data protection and privacy are vitally important [4,10,11].
This book aims to help readers understand transfer learning in conjunction with
federated learning, then bridge the gap between TL and FL, and include a num-
ber of recent chapter contributions to frame this space of research. This book studies
the recent advancements and challenges in TL and FL and investigates the connection
between them. Topics discussed in this book include but are not limited to: TL and its
subfields including domain adaptation and confusion, zero-shot learning, one-shot
learning, few-shot learning, multitask learning, meta learning, self-taught learning,
domain generalization, continual/lifelong learning, TL-based recommenders, hori-
zontal FL, vertical FL, transfer FL, the connection of FL to supervised learning, the
connection of FL to reinforcement learning, FL security for privacy preserving, the
connection between FL and blockchain, and multi-agent systems [9,13,15,19,20,
23,25,27–29,33].
Book Outline
The book consists of 14 chapters in addition to the introductory chapter, each of
which has undergone a rigorous single-blind review process. These high-quality
contributed research chapters provide a well-balanced collection of recent works on
theory, algorithms, and applications of federated learning and transfer learning. We
provide a brief description of each chapter hereafter.
Chapter 2 provides a comprehensive survey on federated learning for resource-
constrained IoT devices that covers the challenges, potential solutions, open issues
of algorithms and hardware developments, and future research directions. It also
presents the relationship between FL and transfer learning considering resource-
constrained environments.
Chapter 3 performs a comprehensive survey on the intersection of federated and
transfer learning from a security point of view. It uncovers potential vulnerabilities
and defense mechanisms that might compromise the privacy and performance of
federated transfer learning systems. This chapter can serve as a concise and accessible
overview of this topic. It would greatly help our understanding of the privacy and
robustness attack and defense landscape in federated transfer learning.
Chapter 4 presents generalized formulations of neural architecture search tasks
in the vertical federated learning setting, which enables parties to simultaneously
optimize heterogeneous network architectures with uninspectable data and hardware
resource constraints.

Chapter 5 develops a unifying framework for federated learning and shows that
many of the existing algorithms are special cases of it. Based on this unification
scheme, it compares existing federated learning algorithms, refines previous conver-
gence results, and uncovers new algorithmic variants. Moreover, it also proposes an
efficient way to accelerate algorithms without incurring any communication over-
head.
Chapter 6 proposes a contract theory-based federated learning procedure. In this
work, a multi-dimensional contract model is designed by formalizing the two private
types of federated learning clients, i.e., data coverage and training willingness. The
optimal contract solution is theoretically analyzed and empirically evaluated.
Chapter 7 provides a systematic literature review of blockchain-based federated
learning, which has shown promising solutions for dealing with privacy and secu-
rity issues of distributed machine learning. It also identifies the advantages that
blockchain technology can bring to the federated learning framework and classifies
them into four categories: decentralization, management, security, and motivation.
Chapter 8 proposes a new decentralized collaborative learning architecture. This
work adopts private permissioned blockchain and smart contract to protect the pri-
vacy and security of data and knowledge. The experimental results validate the fea-
sibility and security of the proposed architecture.
The remainder of the book focuses on transfer learning.
Chapter 9 introduces the foundation of transfer learning based on importance
weighting (IW). Then, it reviews recent advances based on joint and dynamic
importance-predictor estimation. Moreover, it also presents a causal mechanism trans-
fer approach that incorporates causal structure in transfer learning.
Chapter 10 surveys recent works that benefit from representation learning to trans-
fer knowledge across machine learning tasks. It reviews the recently developed algo-
rithms that use this strategy to address several learning settings, including zero-shot
learning, few-shot learning, multitask learning, continual learning, and distributed
learning. Finally, it also discusses existing challenges and future potential research
directions.
Chapter 11 focuses on algorithmic aspects of transfer learning, presenting a
new low-data transfer learning approach that leverages conceptual expansion-Monte
Carlo tree search (CE-MCTS) to model an individual on an unseen target task based
on the behavior of others on that target task, and data of that individual on a secondary
task. Experimental results indicate that CE-MCTS outperforms standard transfer
learning approaches in two real-world applications.
Chapter 12 presents a Python library, ADAPT (Awesome Domain Adaptation
Python Toolbox), to facilitate access to transfer methods for a wide audience, includ-
ing industrial players. It also designs a new presentation of transfer learning needs
from a user's point of view. This library helps practitioners to compare the results of
many methods on their own problems.
Chapter 13 develops the robust constrained Markov decision processes (RCMDPs)
objective to simultaneously deal with safety constraints and model uncertainties in
reinforcement learning. Based on the theoretical analysis, it also proposes a Lyapunov
RCMDPs (L-RCMDPs) extension, based on the Lyapunov function, for Sim2Real
transfer learning.
Chapter 14 reviews recent works on knowledge transfer in deep reinforcement
learning for robust learning in single and multi-agent scenarios. It also summarizes
the strategies for knowledge reuse from parameter sharing to privacy preserving
federated learning. Finally, it identifies several challenges for future work.
Chapter 15 presents the first federated transfer reinforcement learning demonstra-
tion for alleviating the time-consuming offline model transfer process in autonomous
driving simulations, while allowing the heavy load of training data to stay local in the
autonomous edge vehicles.
The chapters presented in this book can update readers on recent advances in
federated learning and transfer learning. It is therefore expected to be useful for
academic researchers, industry practitioners, as well as people interested in advanced
machine learning topics. Last but not least, the editors would like to express their
gratitude to the authors of the contributed chapters, as well as the reviewers who
helped ensure the high quality of this book.
References
1. Aledhari M, Razzak R, Parizi RM, Saeed F (2020) Federated learning: a survey on enabling
technologies, protocols, and applications. IEEE Access 8:140699–140725
2. Ang F, Chen L, Zhao N, Chen Y, Wang W, Yu FR (2020) Robust federated learning with noisy
communication. IEEE Trans Commun 68(6):3452–3464
3. Bhagoji AN, Chakraborty S, Mittal P, Calo S (2019) Analyzing federated learning through an
adversarial lens. In: Proceedings of the 36th international conference on machine learning, vol
97, pp 634–643
4. Chen Y, Luo F, Li T, Xiang T, Liu Z, Li J (2020) A training-integrity privacy-preserving
federated learning scheme with trusted execution environment. Inf Sci 522:69–79
5. Fan L, Ng KW, Chan CS, Yang Q (2021) Deepip: deep neural network intellectual property
protection with passports. IEEE Trans Pattern Anal Mach Intell 1–1
6. Farajzadeh-Zanjani M, Hallaji E, Razavi-Far R, Saif M (2021) Generative-adversarial class-
imbalance learning for classifying cyber-attacks and faults - a cyber-physical power system.
IEEE Trans Dependable Secure Comput 1–1
7. Farajzadeh-Zanjani M, Hallaji E, Razavi-Far R, Saif M (2021) Generative adversarial dimen-
sionality reduction for diagnosing faults and attacks in cyber-physical systems. Neurocomput-
ing 440:101–110
8. Farajzadeh-Zanjani M, Hallaji E, Razavi-Far R, Saif M, Parvania M (2021) Adversarial semi-
supervised learning for diagnosing faults and attacks in power grids. IEEE Trans Smart Grid
12(4):3468–3478
9. Farajzadeh-Zanjani M, Razavi-Far R, Saif M, Palade V (2022) Generative adversarial net-
works: a survey on training, variants, and applications. In: Generative adversarial learning:
architectures and applications. Springer International Publishing, Cham, pp 7–29
10. Gao D, Liu Y, Huang A, Ju C, Yu H, Yang Q (2019) Privacy-preserving heterogeneous federated
transfer learning. In: IEEE international conference on big data, pp 2552–2559
11. Goryczka S, Xiong L (2017) A comprehensive comparison of multiparty secure additions with
differential privacy. IEEE Trans Dependable Secure Comput 14(5):463–477
12. Hernandez-Leal P, Kartal B, Taylor ME (2019) A survey and critique of multiagent deep
reinforcement learning. Auton Agent Multi-Agent Syst 33(6):750–797

13. Kantarcioglu M, Clifton C (2004) Privacy-preserving distributed mining of association rules
on horizontally partitioned data. IEEE Trans Knowl Data Eng 16(9):1026–1037
14. Li T, Sahu AK, Talwalkar A, Smith V (2020) Federated learning: challenges, methods, and
future directions. IEEE Signal Process Mag 37(3):50–60
15. Liu Y, Kang Y, Xing C, Chen T, Yang Q (2020) A secure federated transfer learning framework.
IEEE Intell Syst 35(4):70–82
16. Lu J, Behbood V, Hao P, Zuo H, Xue S, Zhang G (2015) Transfer learning using computational
intelligence: a survey. Knowl-Based Syst 80:14–23
17. Niu S, Liu Y, Wang J, Song H (2020) A decade survey of transfer learning (2010–2020). IEEE
Trans Artif Intell 1(2):151–166
18. Pan SJ, Yang Q (2010) A survey on transfer learning. IEEE Trans Knowl Data Eng 22(10):1345–
1359
19. Phong LT, Aono Y, Hayashi T, Wang L, Moriai S (2018) Privacy-preserving deep learning via
additively homomorphic encryption. IEEE Trans Inf Forensics Secur 13(5):1333–1345
20. Razavi-Far R, Ruiz-Garcia A, Palade V, Schmidhuber J (eds) (2022) Generative adversarial
learning: architectures and applications. Springer, Cham
21. Saha S, Ahmad T (2021) Federated transfer learning: concept and applications. Intelligenza
Artificiale 15:35–44
22. Shao L, Zhu F, Li X (2015) Transfer learning for visual categorization: a survey. IEEE Trans
Neural Netw Learn Syst 26(5):1019–1034
23. Smith V, Chiang CK, Sanjabi M, Talwalkar AS (2017) Federated multi-task learning. In:
Advances in Neural Information Processing Systems, vol 30
24. Tan AZ, Yu H, Cui L, Yang Q (2022) Towards personalized federated learning. IEEE Trans
Neural Netw Learn Syst 1–17
25. Taylor ME, Stone P (2009) Transfer learning for reinforcement learning domains: a survey. J
Mach Learn Res 10(56):1633–1685
26. Wahab OA, Mourad A, Otrok H, Taleb T (2021) Federated machine learning: survey, multi-
level classification, desirable criteria and future directions in communication and networking
systems. IEEE Commun Surv Tutor 23(2):1342–1397
27. Wang B, Mendez J, Cai M, Eaton E (2019) Transfer learning via minimizing the performance
gap between domains. In: Wallach H, Larochelle H, Beygelzimer A, d’ Alché-Buc F, Fox E,
Garnett R (eds) Advances in neural information processing systems, vol 32. Curran Associates,
Inc
28. Wang B, Pineau J (2015) Online boosting algorithms for anytime transfer and multitask learn-
ing. Proc AAAI Conf Artif Intell 29(1)
29. Wang S, Nepal S, Rudolph C, Grobler M, Chen S, Chen T (2020) Backdoor attacks against
transfer learning with pre-trained deep learning models. IEEE Trans Serv Comput 1–1
30. Xiao Y, Shi H, Wang B, Tao Y, Tan S, Song B (2022) Weighted conditional discriminant analysis
for unseen operating modes fault diagnosis in chemical processes. IEEE Trans Instrum Meas
71:1–14
31. Xu W, He J, Shu Y (2020) Transfer learning and deep domain adaptation. In: Aceves-Fernandez
MA (ed) Advances and applications in deep learning, chap 3. IntechOpen
32. Yang Q, Liu Y, Chen T, Tong Y (2019) Federated machine learning: concept and applications. ACM Trans Intell Syst Technol 10(2)
33. Yang Q, Liu Y, Chen T, Tong Y (2019) Federated machine learning: concept and applications.
ACM Trans Intell Syst Technol 10(2)
34. Yang Q, Zhang Y, Dai W, Pan SJ (eds) (2020) Transfer learning. Cambridge University Press,
Cambridge
35. Zhuang F, Qi Z, Duan K, Xi D, Zhu Y, Zhu H, Xiong H, He Q (2021) A comprehensive survey
on transfer learning. Proc IEEE 109(1):43–76

Federated Learning for
Resource-Constrained IoT Devices:
Panoramas and State of the Art
Ahmed Imteaj, Khandaker Mamun Ahmed, Urmish Thakker,
Shiqiang Wang, Jian Li, and M. Hadi Amini
Abstract Nowadays, devices are equipped with advanced sensors with higher pro-
cessing and computing capabilities. Besides, widespread Internet availability enables
communication among sensing devices, which results in the generation of vast amounts
of data on edge devices to drive Internet-of-Things (IoT), crowdsourcing, and other
emerging technologies. The extensive amount of collected data can be preprocessed,
scaled, classified, and finally, used for predicting future events with machine learn-
ing (ML) methods. In traditional ML approaches, data is sent to and processed in a
central server, which encounters communication overhead, processing delay, privacy
leakage, and security issues. To overcome these challenges, each client can be trained
locally based on its available data and by learning from the global model. This decen-
tralized learning approach is referred to as federated learning (FL). However, in large-
scale networks, there may be clients with varying computational resource capabili-
ties. This may lead to implementation and scalability challenges for FL techniques.
In this paper, we first introduce some recently implemented real-life applications of
FL, highlighting the applications that are suitable for FL-based resource-constrained
A. Imteaj·K. Mamun Ahmed·M. H. Amini ( B)
Knight Foundation School of Computing and Information Sciences, Sustainability, Optimization,
and Learning for InterDependent networks laboratory (SOLID Lab), Florida International
University, 11200 SW 8th St, ECS 354, Miami, FL 33199, USA
e-mail:moamini@fiu.edu
A. Imteaj
e-mail:aimte001@fiu.edu
K. Mamun Ahmed
e-mail:kahme011@fiu.edu
U. Thakker
Deep Learning Research, SambaNova Systems, Palo Alto, CA, USA
e-mail:[email protected]
S. Wang
IBM T. J. Watson Research Center, Yorktown Heights, NY, USA
e-mail:[email protected]
J. Li
Binghamton University, State University of New York, Albany, NY, USA
e-mail:[email protected]

IoT environments. We then emphasize the core challenges of implementing the FL
algorithms from the perspective of resource limitations (e.g., memory, bandwidth,
and energy budget) of client devices. We finally discuss open issues associated with
FL for resource-constrained environments and highlight future directions in the FL
domain concerning resource-constrained devices.
Keywords Federated learning · Internet-of-Things · Resource-constrained devices ·
On-device training · Global model · Convergence
1 Introduction
The Internet-of-Things (IoT) penetration rate has recently expanded prodigiously
due to the integration of billions of connected IoT devices. Such IoT edge devices
include different kinds of robots, drones, and smartphones, which have limited com-
putation and storage capabilities and can communicate with remote entities via a
wide-area network (WAN). The data generated from these devices at the network
edge is increasing exponentially. Due to bandwidth and privacy concerns, it is infea-
sible to send all locally collected data to the server. However, many IoT applications
require the prediction and classification of data, for which machine learning (ML)
models need to be trained using data collected by multiple devices. The question
is: how to train ML models from the decentralized data at resource-constrained IoT
devices?
To address the above problem, we need to devise an approach through which the
learning process can be accomplished without exchanging raw data between client
devices. Federated learning (FL) is a technique that fulfills this purpose. With FL,
each device can attain a global view and predict a situation that is seen in another
device. For example, we can consider a scenario where multiple drones are placed at
different locations, and each drone observes vehicles that are passing through its
respective area. If one drone observes a vehicle and gets trained on this observation,
the other drones can learn about that vehicle through the FL approach without
observing it themselves. Applications of
FL cover a wide spectrum of domains, e.g., word prediction by training models on
local edge devices and not sharing sensitive information to the central server [1],
adaptive keyword spotting using a locally-trained voice assistant [2].
Challenges in implementing FL in the presence of heterogeneous hardware in
a system are discussed in [3,4]. Due to various resource constraints (i.e., limited
capacity of communication, computation, storage, etc.) at different types of client
devices, the clients cannot be treated uniformly, and additional care needs to be
taken to handle such heterogeneity. A feasibility study to implement FL on a
low-processing unit (e.g., Raspberry Pi) is presented in [5], where the authors also
discussed the potential challenges of detecting emotion from sound. Besides, FL
on battery-powered devices is studied in [6], where a two-layered strategy is used to
train the models of battery-powered devices. In the first layer, the candidates who
have sufficient power to carry out the training process are selected, and in the second
layer, FL is applied to those selected battery-powered devices considering local
layer, FL is applied to those selected battery-powered devices considering local
energy optimization. Such an approach eliminates the problems related to straggler
clients and helps improve the overall performance of the training process. Moreover,
a lightweight FL framework is developed in [7], which is particularly suitable for
a resource-constrained heterogeneous environment. Their proposed framework can
handle system and statistical heterogeneity by performing model pruning, avoiding
weak and untrustworthy clients through resource and activity checking, and assigning
tasks to the FL clients through dynamic resource allocation.
Another resource-aware FL framework considering heterogeneous edge clients is
proposed in [8]. In that work, they eliminate the straggler clients using an optimiza-
tion technique named ‘soft-training’ that dynamically masks different neurons based
on model updates, and their proposed aggregation scheme speeds up the collabo-
rative convergence. Moreover, a control algorithm that adapts the communication
and computation trade-off is presented in [9]. They minimize the loss function for a
predefined resource budget based on analyzing the impact of different configurations
on the convergence rate. A blog on training models on edge devices [10] focuses on
approaches for model training and crucial aspects that matter while infusing deep
learning models on-device. The authors in [11] designed a scalable FL system that
highlighted significant components of the system, including a high-level view of
device scheduling. This study focused on approaches that we can take while training
models and the crucial aspects of training FL models at scale.
However, there is no comprehensive survey on FL challenges and issues from the
perspective of resource-constrained IoT clients. The main contribution of this paper
is that we analyze the core challenges, potential solutions, open questions, and future
pathways that need to be considered and addressed in the implementation of FL on
resource-constrained IoT devices in the network.
The rest of this paper is organized as follows. Section 2 introduces the background
of FL. In Sect. 3, we discuss some widespread applications of FL. Section 4 presents
the transition of FL towards federated deep learning (FDL) and discusses how FDL
can be applied in a resource-constrained environment. Section 5 examines the
relationship between FL and transfer learning from the perspective of resource-
constrained environments. In Sect. 6, we introduce the core challenges to implement
FL, particularly for resource-constrained devices, and outline some potential solutions
to address these challenges. We then highlight the open issues of FL algorithms and
hardware developments, list some future directions, and conclude the paper.
2 Background of Federated Learning
The FL algorithm aims to learn a single, global model from local data collected by
millions of distributed devices. It learns a model under various resource constraints
of devices, where the model is trained locally and intermediate model updates are
shared with the cloud (server) periodically. The overall goal is to minimize a training
objective (loss) function, which can be written as follows [12]:

\min_z F(z) := \sum_{i=1}^{n} P_i F_i(z),   (1)

where n is the number of devices and P_i ≥ 0 defines the relative impact of each local
device, satisfying \sum_i P_i = 1. Here, F_i is the objective function of the i-th local
device, which can be defined as F_i(z) = (1/s_i) \sum_{j=1}^{s_i} f_j(z; x_j, y_j), where s_i is
the number of locally available samples, P_i = 1/n or P_i = s_i/s, and the total number
of samples is s = \sum_i s_i.

Fig. 1 FL procedure considering N participants
The FL terminology was first introduced by McMahan et al. [12] with an algorithm
called federated averaging (FedAvg). FedAvg includes multiple rounds of communi-
cation between clients and the server which are interleaved by multiple local model
update steps at each client. The clients do not share raw data due to security and pri-
vacy considerations. In this way, FL preserves data privacy since only the statistical
summary or model weights are shared with the server.
FL generally includes three steps, as shown in Fig. 1:
Step 1 (Initialization of training task and global model): In the initial phase, the
central server decides the task requirement and target application. An initial global
model is generated by using hyper-parameters and maintaining a training procedure
(e.g., learning rate) specified by the server. Then the server broadcasts the initialized
global model W_G^0 to the selected local participants.
Step 2 (Local model update): Each participating client in the network has a collection
of data on which it performs the local model update. Upon receiving the global model
W_G^t, where t denotes the t-th iteration, each client i updates its model parameters
W_i^t with the goal of finding optimal parameters W_i^t that minimize the local loss
function F_i(W_i^t).
Step 3 (Global aggregation): The central server aggregates the local updates received
from all clients and generates an aggregated updated global model W_G^{t+1}. This
latest global model is then sent back to the clients who contributed to generating this
new model. The goal of the central server is to minimize the global loss function
F(W_G^t) as defined in (1).
Steps 2 and 3 are repeated until the central server attains the target training accuracy
or reaches convergence.
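To make the three steps concrete, the following is a minimal NumPy sketch of a FedAvg-style loop over toy least-squares clients. The local model, learning rate, client data, and round count are illustrative assumptions for this sketch only; they are not prescribed by the chapter or by [12].

```python
# Minimal FedAvg-style sketch (illustrative): linear least-squares clients,
# sample-size-weighted aggregation as in Eq. (1) with P_i = s_i / s.
import numpy as np

def local_update(w_global, X, y, lr=0.05, epochs=5):
    """Step 2: a client refines the broadcast global weights on its own data."""
    w = w_global.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)   # gradient of the squared loss
        w -= lr * grad
    return w

def fed_avg(clients, w0, rounds=30):
    """Steps 2-3 repeated: broadcast, local updates, weighted global aggregation."""
    w_global = w0
    s_total = sum(len(y) for _, y in clients)
    for _ in range(rounds):
        locals_ = [local_update(w_global, X, y) for X, y in clients]
        w_global = sum((len(y) / s_total) * w
                       for w, (_, y) in zip(locals_, clients))
    return w_global

# Toy example: three clients whose private data follows w_true = [2, -1].
rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0])
clients = []
for n in (30, 50, 20):
    X = rng.normal(size=(n, 2))
    clients.append((X, X @ w_true + 0.1 * rng.normal(size=n)))
print(fed_avg(clients, np.zeros(2)))   # should approach w_true = [2, -1]
```

The weighting P_i = s_i/s matches the sample-weighted form of Eq. (1); replacing it with 1/n gives the uniform variant.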
3 Federated Learning Applications
FL suits best in applications where data on the device is more significant than
data located on the server. Current applications related to FL are mostly based on
supervised learning, typically utilizing labels retrieved from user activities (e.g.,
button click, keyboard type, etc.). In this section, we briefly discuss some applications
based on FL.
• Smart Healthcare: Smart healthcare involves sensitive data, and it needs to train
models on-device. For instance, heart attack situations can be predicted locally for
end-users with wearable devices [13]. The authors in [14] suggested that if all medical
centers cooperate to form a large dataset by sharing their data with proper labeling,
the performance of the ML model would be remarkably improved. The combination of
FL and transfer learning is a preeminent way to achieve this goal.
• Recommendation System: It is a widely used method that depends on information
sharing among users, which may cause privacy leakage. To handle this, the authors in
[15] proposed a federated meta-learning-based approach, where local devices share
their algorithm instead of data or model with the central server. It eliminates the risk
of privacy leakage and leads to proper training of the model on the local devices.
• Next-word Prediction: An on-device, distributed framework for next-word pre-
diction for smartphones is proposed in [1]. They trained the local client devices using
the FedAvg algorithm and observed higher prediction recall compared with other
approaches that transferred some sensitive information to the server for learning.
• Keyword Spotting: An embedded speech model, i.e., a wake word detector, is
proposed in [2], where an experiment using the 'Hey Snips' keyword is conducted to
start an interaction with a voice assistant. For keeping users' speech private, they
applied the FL strategy by leveraging on-device model training without sharing data
with a central server.
• On-device Ranking: An on-device ranking of search results is implemented in
[11] without conducting expensive calls to the server. This avoids issues related to
constrained resources, and sensitive information remains on the device. The sys-
tem can label the user interaction with the ranking feature by observing the user's
preferred selected item from the ranked list during the interaction period.
• Improving Resilience of Critical Infrastructures: To mitigate losses due to
sudden interruption of critical infrastructures' operations caused by system failure
or natural disasters, the authors in [16] proposed an on-device training method for the
distributed clients to predict outage events and resource-sharing information. To that
end, they leverage an FL algorithm that can tackle the straggler issues that may be
observed in a resource-constrained environment.
• Relevant Content Suggestions for On-Device Keyboard: A content suggestion
approach for an on-device smartphone keyboard was presented in [17], where value is
added for the users by suggesting related content. For instance, while typing on a
keyboard, it can suggest related content, triggered based on the relevance value
learned through on-device training.
4 Federated Deep Learning for Resource-Constrained
Environments
Deep learning or Deep Neural Network (DNN) is a subfield of ML that imitates
the learning process of the human brain. While traditional ML algorithms work in a
linear fashion, DNNs with a stacked hierarchy of increased complexity and abstrac-
tion introduce non-linearity in the learning process that has shown great potential
recently. As a result, during the last few decades, we have seen a tremendous growth
of applications of DNNs in various fields, including healthcare, natural language
processing, computer vision, speech recognition, and bioinformatics [18–21].
Although DNN algorithms have shown great success, training DNNs is a challeng-
ing task: data accumulation and model training in the traditional centralized style
face difficulties including privacy leakage, security issues, and communication
delay. Moreover, in some situations, it is not even possible to share data due to legal
constraints and regulatory restrictions, such as the General Data Protection Regulation
(GDPR), or the protection of intellectual property [22]. FL addresses these difficulties and
trains ML models on the edge devices with their local computing resources and their
local datasets. Therefore, the implementation of DNNs in a federated environment
has unveiled a solution to a long-standing problem and opened up countless
possibilities for meaningful applications, e.g., medical purposes and vehicular net-
works. However, DNNs generally require high computational resources and larger
datasets to perform better. As a result, training DNNs in a resource-constrained
federated environment has become a challenging task. To address such issues of
training deep learning models in a federated environment, several recent works have
proposed methods.
The authors in [23] proposed a novel computational resource allocation model
that improves the energy efficiency of the edge devices. They first observe the learn-
ing speed of the edge devices in the previous iteration and then, based on a deep
reinforcement learning algorithm, decide how many computational resources need
to be allocated. ELFISH, a resource-aware federated learning framework, is proposed
in [8], where the authors mask neurons in different layers based on the DNN model's
computational cost profile in terms of time, memory usage, and computation workload.
By doing this, ELFISH speeds up training by up to 2x in various straggler settings and
also improves accuracy by 4%. To address resource-constrained edge devices and
scale up the convolutional neural network (CNN) size, a group knowledge transfer
(FedGKT) learning algorithm is given in [24]. FedGKT is a memory- and
computation-efficient algorithm that unifies the advantages of FedAvg [25] and SL [26]
and supports asynchronous training. In FedGKT, compact CNNs are trained at the
edge, and after local training the extracted features are utilized to train a large CNN
on the server. Considering the heterogeneous nature of client devices, whose local
computing resources vary significantly, the efficacy of generating a set of global
models is studied in [27]. Here, the authors trained three global models that showed
comparable performance, although the training time increased significantly. To
balance the trade-offs among model accuracy, privacy, and resource cost, an FL
framework named DP-PASGD is presented in [28], where the authors investigated
the optimal design of DP-PASGD and performed a rigorous convergence analysis.
In Table 1, we present the key ideas of some recent works in resource-constrained
FDL settings.
Table 1 Existing federated deep learning approaches

Approach | Reference | Key ideas
E-DRL | [23] | Presents an experience-driven deep reinforcement learning (E-DRL) framework that decides the computational resource allocation of each edge device
ELFISH | [8] | Develops a neuron-masking strategy based on the model's computational profile during training
FedGKT | [24] | Presents FedGKT, which is memory- and computation-efficient, supports asynchronous training, and exchanges hidden features instead of the entire model
DP-PASGD | [28] | Implements an FL framework that can train ML models efficiently on resource-constrained smart IoT devices
FLT | [27] | Introduces a federated transfer learning model based on clients' resource heterogeneity

5 Relationship Between Federated Learning and Transfer
Learning from the Perspectives of Resource-Constrained
Environments
In the fields of machine learning and deep learning, many innovative algorithms or
models have been successfully applied in diverse applications such as image classifi-
cation, healthcare systems, and cybersecurity, where patterns from past information
are extracted in order to predict future outcomes [29–31]. However, the performance
of ML depends on the availability of a massive amount of labeled data. For example,
AlphaGo was trained on 160,000 actual games containing 29.4 million moves [32].
Another example is the ImageNet dataset, which has more than 14 million images and
on which state-of-the-art models such as AlexNet, VGGNet, ResNet, and GoogLeNet
are trained [33]. However, in certain scenarios, obtaining training data is expensive
and difficult, especially when there is a need for the involvement of human intelligence.
As a result, the lack of available datasets may hamper the creation of high-performance
learners. Besides, the traditional FL method imposes a common constraint: the various
organizations that own training data must share a similar feature space. However, in
real-life scenarios, we cannot assume the same setting, especially in industries like
healthcare, banking, or the finance sector. To overcome the above-mentioned
shortcomings, transfer learning (TL) is considered a good alternative, in which
knowledge can be shared across domains. Particularly, TL is useful for transferring
knowledge across multiple domains that have few overlapping clients and features.
The authors in [34] explained that researchers have been developing TL techniques
since 1995 under various names, such as inductive transfer, knowledge transfer, and
multi-task learning. Various approaches have been proposed for TL, such as the
following (a minimal parameter-transfer sketch follows this list):
Instance Transfer: Re-weights labeled training data in one client and transfers the
knowledge for reuse in another client [35].
Parameter Transfer: Reveals shared parameters between the source client and the
target client [36].
Feature-representation Transfer: Discovers a feature representation that reduces
the dissimilarity between the source client and the target client [37].
Relational-knowledge Transfer: Constructs a relational mapping of knowledge
between the source and the target client [38].
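As a concrete illustration of the parameter-transfer idea above, the sketch below warm-starts a small target client with weights learned on a data-rich source client. The logistic-regression model, data sizes, and training schedule are illustrative assumptions for this sketch, not part of the chapter.

```python
# Parameter-transfer sketch (illustrative): reuse source-client weights to
# warm-start a target client that only has a handful of labelled samples.
import numpy as np

def train_logreg(X, y, w_init, lr=0.1, epochs=200):
    """Plain gradient descent on the logistic loss, starting from w_init."""
    w = w_init.copy()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))        # predicted probabilities
        w -= lr * X.T @ (p - y) / len(y)          # gradient step
    return w

rng = np.random.default_rng(1)

# Source client: plenty of labelled data from a related task.
Xs = rng.normal(size=(500, 5))
ys = (Xs @ np.array([1.0, -2.0, 0.5, 0.0, 1.5]) > 0).astype(float)
w_source = train_logreg(Xs, ys, np.zeros(5))

# Target client: only 20 samples from a slightly shifted task.
Xt = rng.normal(size=(20, 5))
yt = (Xt @ np.array([1.2, -1.8, 0.4, 0.1, 1.4]) > 0).astype(float)

w_scratch  = train_logreg(Xt, yt, np.zeros(5), epochs=20)   # no transfer
w_transfer = train_logreg(Xt, yt, w_source,    epochs=20)   # parameter transfer
```

With only a few local gradient steps, the warm-started weights typically stay much closer to a good solution than training from scratch, which is the saving in local computation discussed next.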
TL not only resolves the situation of a limited supply of training data but also
improves the convergence time by reusing knowledge from previous training
phases. As a result, TL has a high impact on a resource-constrained federated envi-
ronment, where clients have fewer training samples and limited computing resources;
thus, training a model from scratch would take relatively longer than with TL. TL
can be effective in a resource-constrained FL environment by performing knowledge
transition across distributed clients within the networks and enabling heterogeneous
tasks, distributions, and domains during the training and testing periods. A client
would have to accomplish fewer computational tasks by learning from other clients
and sharing information through parameters, features, mappings, or instances.
TL can be applied in various real-world applications, including autonomous driving,
IoT-based healthcare, robotics, image steganalysis, EEG signal classification, etc.
6 Core Challenges to Implement Federated Learning on
Resource-Constrained Devices
In real-world IoT environments, available clients may have heterogeneous hardware
(e.g., memory, computational ability) and varying resource assets (e.g., energy bud-
get). As a result, we cannot treat all available clients uniformly, as each client's
behavior depends on resource availability. Besides, heterogeneity in hardware can
preclude weak clients from participating in FL.
Extensive surveys have been conducted on the architecture and training process of
IoT edge networks and FL [4,17,39,40], but the challenges of FL systems due to
resource limitations were not comprehensively discussed. In this section, we discuss
the core challenges (see Fig. 2) associated with the implementation of FL for resource-
constrained IoT devices.
Fig. 2 Core challenges associated with resource-constrained IoT clients in the FL system

6.1 Limited Memory and Energy Budget
The clients participating in FL may have limited memory capacity, constrained com-
putational ability, and a bounded energy budget. While reduced computational capa-
bilities imply that it takes more time to process data, limited memory capacity makes
the device prone to memory overflow. These situations can lead to more expensive com-
munication (see Sect. 6.2 for more details) and curtail the overall performance of
the system. The authors in [5] analyzed hardware limitation challenges in the imple-
mentation of FL by considering Raspberry Pi clients. They studied the feasibility
of implementing FL on resource-constrained edge devices. Furthermore, the necessary
hardware requirements for implementing next-word prediction on the keyboard are
highlighted in [1]: the devices must have at least 2 gigabytes of available memory,
whereas some microcontrollers have very limited memory. Such memory limitations
of clients can be managed by storing only a limited amount of data, which helps the
resource-bounded clients process those data locally. After a certain period, the data
within a client can be aggregated and backed up to avoid unexpected memory overflow.
In this regard, a novel FL-based approach can be adopted through which shards of data
are distributed to the clients to obtain the target model quickly [41]. According to
resource availability, we can also choose proficient clients that have higher bandwidth,
better processing ability, greater memory size, and a higher energy budget to participate
in FL, while clients with a resource shortage do not participate.
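One simple way to realize such resource-aware participation is sketched below; the resource fields, weights, and thresholds are illustrative assumptions rather than values taken from the cited works:

```python
# Minimal sketch: rank candidate clients by a simple resource score and pick
# the top-k for a training round; field names, weights, and thresholds are illustrative.
def select_clients(clients, k, min_memory_mb=512):
    eligible = [c for c in clients if c["memory_mb"] >= min_memory_mb]
    score = lambda c: (0.4 * c["bandwidth_mbps"] + 0.3 * c["cpu_ghz"]
                       + 0.2 * c["memory_mb"] / 1024 + 0.1 * c["battery_pct"])
    return sorted(eligible, key=score, reverse=True)[:k]

clients = [
    {"id": 0, "bandwidth_mbps": 5, "cpu_ghz": 1.0, "memory_mb": 256, "battery_pct": 90},
    {"id": 1, "bandwidth_mbps": 20, "cpu_ghz": 1.8, "memory_mb": 2048, "battery_pct": 60},
    {"id": 2, "bandwidth_mbps": 12, "cpu_ghz": 1.4, "memory_mb": 1024, "battery_pct": 80},
]
print([c["id"] for c in select_clients(clients, k=2)])   # client 0 is excluded by memory
```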
6.2 Expensive Communication
The need to train a local model is often motivated by insufficient communication
bandwidth to transmit local data to a server for centralized computing. In FL, the
server interacts with clients to collect updates based on local model training, after
which the server disseminates an updated global model. When we consider resource-
constrained clients with limited bandwidth and transmission power, it is challenging
to utilize those resources prudently to reach convergence. Furthermore, FL
systems may comprise millions of devices with various degrees of resources, and
local computation within the devices can be faster than the network communication
[42]. Although frequent interaction between the server and the clients can help us
to attain a target model swiftly, it is costly to perform communication repeatedly.
We need to consider this trade-off while designing optimization algorithms to make
proper use of limited resources. The authors in [43] discussed the trade-offs between
communication expenses and optimal resource utilization, but they did not study the
complexity of the local problem’s solution. To reduce the communication expense
during real-time data collection in a distributed fashion, the authors in [44] proposed
a distributed sensing mechanism that can track any IoT device remotely and trigger
the devices to sense the environment and send back model updates with negligible
communication cost. We need to devise ways to achieve the target model by
sending compressed messages [45] and by carrying out a minimal number
of communication rounds between the server and local clients.
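A minimal sketch of one such compression step, top-k sparsification of a local update before upload, is given below (NumPy; the compression ratio is an illustrative assumption, not a recommendation from the cited works):

```python
# Minimal sketch: top-k sparsification of a local update before upload,
# trading a small accuracy hit for a much smaller message.
import numpy as np

def compress_topk(update, ratio=0.01):
    k = max(1, int(update.size * ratio))
    idx = np.argsort(np.abs(update))[-k:]          # keep the k largest-magnitude entries
    return idx.astype(np.int32), update[idx]       # indices + values to transmit

def decompress(idx, values, size):
    full = np.zeros(size, dtype=values.dtype)
    full[idx] = values
    return full

update = np.random.randn(100_000).astype(np.float32)
idx, vals = compress_topk(update, ratio=0.01)
recovered = decompress(idx, vals, update.size)     # server-side reconstruction
```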
6.3 Heterogeneous Hardware
In FL, training can run on multiple devices, each coming from different vendors or
belonging to a different generation of products. It creates a network of devices with
varying computing and memory capabilities and different battery lives. Therefore,
training efficiency may vary significantly across client devices, and treating all
clients on the same scale does not yield an optimal solution. The authors in [46]
discussed why FL should be aware of heterogeneous hardware configurations. We
need to select clients for training purposes based on system requirements. However,
due to strict cost and energy requirements, only a few clients might end up meeting
the required criteria. It is possible that most of the proficient clients leave the
network, while the remaining clients do not fulfill the system requirements. Hence,
dealing with resource-constrained heterogeneous devices is a challenge.
6.4 Statistical Heterogeneity
Considerable research on statistical heterogeneity in machine learning has already
been conducted using meta-learning [47] and multi-task learning [48], and those
ideas are now being adapted to the FL setting [49]. Statistical heterogeneity means
variance in the data distribution from device to device; this problem occurs when data
are not identically distributed among the devices. This variance influences the modeling
of the local data as well as the convergence behavior of the related training
process. Besides, data retrieved from different devices can have a distinct structure
or format. Further, if we consider resource-constrained devices within the learning
environment, then the data volume possessed by various clients could be highly
imbalanced. Handling such discrepancies in data volume and structure
is also a challenge.
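To study this challenge empirically, client datasets are often simulated with a label-skewed partition; the sketch below (NumPy) uses a Dirichlet split, a common simulation device, with an illustrative concentration parameter:

```python
# Minimal sketch: simulate statistical heterogeneity by partitioning labels
# across clients with a Dirichlet distribution (smaller alpha = more skew).
import numpy as np

def dirichlet_partition(labels, n_clients, alpha=0.3, seed=0):
    rng = np.random.default_rng(seed)
    client_idx = [[] for _ in range(n_clients)]
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        rng.shuffle(idx)
        shares = rng.dirichlet([alpha] * n_clients)            # per-client share of class c
        cuts = (np.cumsum(shares)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_idx[client].extend(part.tolist())
    return client_idx

labels = np.random.randint(0, 10, size=5000)
parts = dirichlet_partition(labels, n_clients=8, alpha=0.3)
print([len(p) for p in parts])   # highly imbalanced client dataset sizes
```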
6.5 Energy Efficient Training of DNNs
Deep neural networks (DNNs) are widely used, especially in implementing arti-
ficial intelligence-based applications. It is difficult to train DNNs on resource-
constrained clients, as the required processing capability and energy availability
must be ensured. The authors in [50] discussed enabling training on local devices. They
demonstrated a way to carry out both inference and training with comparatively low-
bitwidth integers so that back-propagation operates on integer numbers, which
reduces the hardware requirements of training. However, this adds significant bias
to the training and leads to a loss in accuracy. The authors in [4] showed one way to
generate a high-quality machine learning model based on on-device model param-
eters, output, and data aggregation, while the authors in [51] proposed a method
for efficient learning using pruned models. A promising approach to conduct effi-
cient FL training by taking advantage of the existence of critical learning periods in
DNNs is shown in [52]. Another exciting approach presented in [2,53–55] discussed
training higher capacity models with fewer parameters. Particularly, energy-efficient
learning on resource-constrained devices is essential if the size and number of fea-
tures of the training dataset are large. In such a case, generating a higher-quality
model by considering fewer parameters and performing on-device training could be
a challenge.
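As a minimal illustration of the pruning direction mentioned above (not the exact method of [51]), the sketch below applies global magnitude pruning to a set of weight matrices so that a smaller, cheaper model can be trained or fine-tuned on a constrained device:

```python
# Minimal sketch: global magnitude pruning of model weights; the sparsity
# level is an illustrative choice.
import numpy as np

def magnitude_prune(weights, sparsity=0.9):
    flat = np.concatenate([w.ravel() for w in weights])
    threshold = np.quantile(np.abs(flat), sparsity)        # global magnitude threshold
    return [w * (np.abs(w) >= threshold) for w in weights]

layers = [np.random.randn(128, 64), np.random.randn(64, 10)]
pruned = magnitude_prune(layers, sparsity=0.9)
kept = sum((w != 0).sum() for w in pruned) / sum(w.size for w in layers)
print(f"fraction of weights kept: {kept:.2f}")             # roughly 0.10
```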
6.6 Scheduling
In synchronous FL, all clients interact with the server at the same time, while in
asynchronous FL, the training period can differ. Hence, it is essential to deter-
mine the training period for all local participants, which we call scheduling. If we
consider resource-constrained clients, then it is not ideal to carry out
scheduling frequently. Rather, an optimized scheduling period would incur minimal
energy consumption and bandwidth. To achieve this, parallel training sessions
can be avoided due to high resource consumption, and a worker queue can be main-
tained for on-device multi-tenant systems. Moreover, clients should not perform
scheduling tasks when they possess only old data. It may happen that older data are
repeatedly used for training, while newer data are omitted [11]. Old data will not
contribute much variation to the model parameters, and resources will be wasted without
model improvement. In addition, any client may frequently use a particular app that
provides malicious data, and identification of such app usage is also a challenge.
Thus, it is necessary to execute scheduling after refining data for training.
6.7 Stragglers
In FL, one main reason for the performance bottleneck is the presence of straggler
clients. While each client is responsible for generating a model with their data and
sharing that with the server after a certain period, a straggler client may fail to share
its model with the server in a timely manner. Due to this, the central server
needs to wait until all the straggler clients share their models. Hence, the overall
training procedure of the clients is delayed. One solution to avoid straggler clients
is to select competent clients based on their resource availability. For example, a
general distributed solution to mitigate the effect of stragglers has been developed
in [56], which adaptively selects clients for training in each iteration while being
aware of the computational resources at each client. The authors in [5] proposed a
solution to track the computational power, i.e., the overall resource utilization, of the
clients after each local update. By observing each client's resource utilization, a
predictive model can be formed for adjusting the local computation of the clients.
Another option is to use asynchronous training, which, however, is challenging in
the presence of non-independent/identically distributed (non-IID) data among clients
(see Sect. 7.2). The authors in [57,58] proposed a trust and resource-aware FL model
for mobile robot clients that can identify malicious and straggler clients
by monitoring their previous activities and selects clients for the training round by
analyzing their available resources and trust scores.
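A very simple deadline-based variant of straggler handling is sketched below (NumPy, with synthetic values); it illustrates the general idea rather than the protocols of [56] or [57,58]:

```python
# Minimal sketch: the server aggregates only the updates that arrive before a
# round deadline, so stragglers do not block the round; all values are synthetic.
import numpy as np

def aggregate_before_deadline(updates, arrival_times, sizes, deadline):
    on_time = [i for i, t in enumerate(arrival_times) if t <= deadline]
    if not on_time:
        return None                                   # no usable update this round
    total = sum(sizes[i] for i in on_time)
    return sum(sizes[i] / total * updates[i] for i in on_time)   # FedAvg-style weighting

updates = [np.random.randn(4) for _ in range(5)]
arrival_times = [2.1, 9.7, 3.4, 25.0, 4.9]            # seconds; client 3 is a straggler
sizes = [100, 120, 80, 300, 90]                       # local sample counts
global_update = aggregate_before_deadline(updates, arrival_times, sizes, deadline=10.0)
```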
6.8 Labelling Unlabeled Data
Most existing FL techniques assume that data are labeled. However,
data collected by local devices can be unlabeled due to connection or communica-
tion issues, or even mislabeled. The authors in [59] designed a framework to
identify mislabeled data, while the authors in [3] proposed a solution to put a label
on unlabeled data using collaborative learning with neighboring devices. However, it
is challenging to label data in real-time for resource-constrained devices, as a client
may have power limitations or other issues.
7 Open Issues of Federated Learning Algorithms and
Hardware Developments
In the previous section, we discussed the core challenges for resource-constrained IoT
clients while deploying FL. There still exist some aspects that need to be addressed
(see Fig. 3) and can be considered as open research problems. In this section, we
highlight promising future trends.
7.1 Deploying Existing Algorithm to Reduce Communication
Overhead
Existing methods in the literature that are proposed to reduce communication over-
head can be categorized into decentralized training, compression, and local updating.
Integration of these methods can generate an optimized FL platform with fast con-
vergence time. For instance, infusing redundancy among the client datasets was
proposed in [41] to bring diversity and reach convergence in a shorter time. A
joint learning framework is presented in [60] that quantifies the impact of wireless
factors on clients in the FL environment. Still, further research is needed to attain
minimal communication overhead at scale.
Fig. 3 Open issues and future directions of federated learning theory and applications
7.2 Guarantee Convergence in Asynchronous Federated
Learning
Most existing studies and implementations consider synchronous FL, where the
progress of each training round depends on the slowest device within
the network. This means the slowest device directly affects the overall perfor-
mance of the system, although synchronous FL guarantees convergence. In asynchronous
FL, a participant can join a training round even in the middle of the training progress.
This improves scalability, although it does not guarantee convergence [61]. Therefore,
one of the core research issues is to formulate a method for ensuring convergence
during the asynchronous training of clients.
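One heuristic studied in the asynchronous setting is to down-weight stale updates; the sketch below shows such a staleness-aware mixing rule (a generic formulation for illustration, not a method from the cited works, and it carries no convergence guarantee):

```python
# Minimal sketch: a staleness-aware mixing rule for asynchronous FL, in which
# an update computed on an older global model contributes less to the new one.
import numpy as np

def async_update(global_w, client_w, client_round, server_round, base_mix=0.6):
    staleness = server_round - client_round
    alpha = base_mix / (1.0 + staleness)          # older updates get a smaller weight
    return (1 - alpha) * global_w + alpha * client_w

global_w = np.zeros(4)
client_w = np.array([0.5, -0.2, 0.1, 0.3])        # model trained on the round-3 global model
global_w = async_update(global_w, client_w, client_round=3, server_round=7)
```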
7.3 Quantification of Statistical Heterogeneity
In an IoT environment, the data collected by local devices are inherently non-IID;
thus, they may have a discrepancy in terms of the number of samples and dataset
structure. It is challenging to quantify the level of heterogeneity in the system before
the training begins. A local dissimilarity-based strategy was designed in [62] to
quantify heterogeneity, but it cannot do so before training starts. If
we have a mechanism for quantifying heterogeneity at initialization, the system can
be configured accordingly to allow more efficient training.
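As an illustration of what such a mechanism could measure at initialization, the sketch below scores heterogeneity from client label histograms (NumPy/SciPy; the Jensen-Shannon metric is an illustrative choice, not the measure used in [62]):

```python
# Minimal sketch: quantify statistical heterogeneity before training by
# comparing each client's label histogram with the average histogram.
import numpy as np
from scipy.spatial.distance import jensenshannon

def heterogeneity_score(client_labels, n_classes):
    hists = [np.bincount(y, minlength=n_classes) / len(y) for y in client_labels]
    global_hist = np.mean(hists, axis=0)
    return float(np.mean([jensenshannon(h, global_hist) for h in hists]))

iid = [np.random.randint(0, 10, 500) for _ in range(5)]                 # near-IID clients
skewed = [np.random.choice([c, (c + 1) % 10], 500) for c in range(5)]   # two classes each
print(heterogeneity_score(iid, 10), heterogeneity_score(skewed, 10))    # low vs. high
```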
7.4 Handling False Data Injection
In a distributed system, the local devices are responsible for generating their model
using the raw data they extracted. However, if a client constructs its model
using false data injected during data extraction, then the generated model
will contribute an erroneous update to the global model. This opens up a new research
direction to identify false data efficiently, particularly using resource-constrained
devices with the limitations discussed earlier.
7.5 On-Device Training
As discussed previously, devices in the IoT domain can have extremely limited capa-
bilities. As a result, it becomes crucial to perform training and inference on
the device and, thus, limit the interaction with servers via energy-inefficient commu-
nication methods. However, training on a device leads to multiple problems. First,
finding a model with enough capacity to capture the complexity of the data while
still fitting in a device's small memory can be hard. The authors
in [53,54,63] solved these problems for inference but did not discuss training the
models on the device. Second, training can require a much larger computational and
memory capacity than these simple clients can provide. Section 6.5 discussed some
techniques which either work on specialized neuromorphic or FPGA hardware or
do not meet the aggressive constraints found in the IoT domain. Solving this dual
problem is paramount. There have been promising directions in this regard that need
to be explored further to determine whether they can be used in FL. To reduce
the cost of running optimizers for a NN, the authors in [64] propose using an 8-bit Adam
optimizer. This reduces both the computational cost of training on the device
and the memory required to store the optimizer state. Work on low-precision training
[65] will further help reduce the cost and memory required to train on the device.
Another orthogonal approach is to use conditional execution [66,67], which exe-
cutes and trains only a part of the neural network. A lot of the work in this domain
has focused on inference; however, there is potential to adapt it for FL training so
that only a subset of parameters is trained based on the data collected (a minimal
sketch follows).
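A minimal sketch of this last idea, training only a masked subset of parameters on the device (PyTorch; the masking policy and model are illustrative assumptions), is shown below:

```python
# Minimal sketch (PyTorch): update only a small, masked subset of parameters
# on the device, cutting both gradient memory and the size of the update sent
# to the server; the masking policy here is a simple illustration.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
trainable = []
for name, p in model.named_parameters():
    if name.startswith("2."):        # keep only the last layer trainable
        trainable.append(p)
    else:
        p.requires_grad = False

optimizer = torch.optim.SGD(trainable, lr=0.05)
x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
optimizer.step()

n_trainable = sum(p.numel() for p in trainable)
n_total = sum(p.numel() for p in model.parameters())
print(f"trainable fraction: {n_trainable / n_total:.2f}")
```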

7.6 Managing Dropped Participants
In a resource-constrained FL system, clients might have heterogeneous resources,
i.e., variable bandwidth, limited battery life, and varying transmission power. As a
result, any client within the network can be disconnected during communication with
the server. Recent studies have widely assumed that all participants are available and
connected to the server throughout the process. But, in real life, any client can go
offline due to the unavailability of resources. Eventually, the disconnection of a
significant number of clients can degrade the convergence speed and model accuracy.
8 Future Directions
FL is a recently developed technique and an active ongoing research area. After ana-
lyzing the challenges of implementing FL on resource-constrained clients along with
potential solutions in Sect. 6 and discussing some open issues in Sect. 7, we identify
some future directions that deserve to be highlighted. In this section, we point out
these future directions.
• In the FL environment, some clients may generate more data (i.e., heavier use of an
application by a particular user) than other clients within the underlying network of
decision-making entities. This may lead to varying amounts of local data during the
training period, and we cannot treat any client dataset as the population distribution.
Handling such discrepancies in local training datasets requires further research.
• To ensure convergence in a non-IID scenario, particularly for asynchronous learn-
ing, loss functions of the non-convex problem need to be considered, and supportive
algorithms should be proposed.
• In the FL scenario, selection of suitable cluster heads and maintaining coordination
within the overall system need more investigation to ensure energy efficiency. It may
be possible that a cluster of clients agrees to generate an aggregated model using
their local data and disseminate it to the central server. The aggregated model can be
passed on by the leader client, and hence, it can reduce energy consumption. To reduce
energy consumption, we can consider a group of clients as a cluster in a distributed
structure, and select a proficient client that will act as the leader client. That client will be
responsible for interaction with the central server in an asynchronous fashion, while
the interaction among each cluster's clients is conducted in a synchronous fashion.
This reduces the energy consumption of the overall system by avoiding unnecessary
communication between all clients and the central server (a two-level aggregation
sketch is given after this list).
• A device-centric wake-up mechanism can be built through which the clients can
automatically determine when to interact with the server. This functionality will
help resource-constrained clients, particularly in asynchronous FL, where a client
needs to wait for other neighboring clients to send their models to the server. By building
such a mechanism, the energy consumption rate of the clients can be reduced by a
significant margin.

• The resource-constrained IoT clients may need to perform more interactions with the
server due to statistical heterogeneity in the system. So, it is necessary to devise an
efficient method for identifying the statistical heterogeneity even before the training
phase and to avoid idiosyncratic situations due to data sample variation.
• In terms of scalability, frequent drop-out of participants is a significant bottle-
neck. A new approach should be adopted to make the FL system more robust to
frequent drop-outs. One solution could be predicting or identifying the probability
of a participant's disconnection. Further, a dedicated connection (e.g., cellular con-
nection) can be provided as an incentive to avoid connection drop-out [3]. Moreover,
a structure can be maintained while designing a protocol so that dropped participants
can try to reconnect multiple times during the long-running model aggre-
gation. The issue of intermittent client availability at scale is not addressed
in prior works.
• Due to mobility of clients, new clients that are more competent than the existing
clients may join the network, and any client may leave the network during communication,
which can hamper the model training. Besides, because of mobility, there may be a
large number of clients in some areas while other areas may not have enough clients
to generate a feasible model. Handling such situations by considering both the mobility
and bandwidth capability of clients can be a research direction.
• The optimal communication degree in FL implementations is still an ongoing
research question, and there is not yet a deterministic algorithm to identify the globally-
optimum communication topology. Although divide-and-conquer and one-shot
communication methods are discussed in [68], these schemes are not well-suited
for heterogeneous resource-constrained clients. This specifically leads to challenges
related to FL implementation in heterogeneous IoT systems with time-varying com-
munication topology, such as distributed mobile sensor networks. Although one-shot
and few-shot FL approaches are proposed in [69], extensive practical evaluation is
needed to identify a solution to the optimal communication topology problem. In
the context of wireless networks, the distribution of fair resources has been stud-
ied extensively. While optimizing the utilities may give us higher throughput, unfair
resource allocations may cause inadequate service facilities. The global model can
be considered as a resource for providing service to clients. If we use asynchronous
FL, then any client may receive a pre-assigned fairness parameter to modify its objective func-
tion during the training period. We can handle trade-offs between fairness and other
metrics (e.g., average accuracy) by tuning this parameter. Still, more theoretical and
practical analysis needs to be conducted to optimize resource distribution, particu-
larly for resource-constrained devices.
• A blockchain paradigm can be constructed to make the FL clients' communication
more robust and secure. The model updates and exchanges of resource-constrained IoT
clients can be verified by the blockchain. The authors in [70,71] proposed a blockchain-
based on-device FL, but they did not consider resource-constrained clients. How the
resource-constrained IoT clients perform block traversal, select miners as well as the
leader client, ensure atomicity, and reach a consensus in the FL scenario is a future
direction.

• Designing an incentive mechanism for transparent participation is required in
FL. As participants may be resource-bounded or business competitors, it is essential
to develop an approach that effectively divides the earnings in order to enhance the
long-term participation of the clients. Furthermore, how to defend against adversarial
client data owners and optimize the participation of non-adversarial owners to ensure
security needs to be explored.
• The solitary computational model of FL can lead us to build a more refined trust-
based model. As it is challenging, in terms of security, to select a participant for
the training phase, a trust-based model can reduce extra communication overhead.
Typically, we assume that the server is operated by a non-adversarial entity. This
server can analyze the behavior of participants and leverage a trust model. According
to the trust score, an incentive mechanism can be designed, and that trust model can
assist us when the number of participants is large. This opens up a new research
direction to explore.
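The two-level, cluster-based aggregation mentioned above can be sketched as follows (NumPy; the FedAvg-style weighting and the synthetic values are illustrative assumptions):

```python
# Minimal sketch: two-level aggregation in which each cluster's leader client
# averages its members' models and only the leaders talk to the central server.
import numpy as np

def weighted_average(models, sizes):
    total = sum(sizes)
    return sum(s / total * m for m, s in zip(models, sizes))

# cluster -> list of (local model vector, number of local samples)
clusters = {
    "cluster_a": [(np.random.randn(4), 120), (np.random.randn(4), 80)],
    "cluster_b": [(np.random.randn(4), 200), (np.random.randn(4), 50), (np.random.randn(4), 150)],
}

leader_models, leader_sizes = [], []
for members in clusters.values():
    models, sizes = zip(*members)
    leader_models.append(weighted_average(models, sizes))     # done by the leader client
    leader_sizes.append(sum(sizes))

global_model = weighted_average(leader_models, leader_sizes)  # done by the central server
```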
9 Conclusion
In this paper, we conducted a comprehensive survey on FL, particularly for resource-
constrained IoT devices, and highlighted the ongoing research related to this area.
We began by discussing the importance of leveraging FL for resource-constrained
IoT clients and discussed some previous works that considered resource scarcity
during the implementation of an FL system. We highlighted the background, including
the working procedure of FL, and explored existing FL applications by discussing
their importance in conducting model training locally. Further, we focused on core
challenges of implementing FL, particularly for resource-constrained devices, by
considering hardware limitations, communication expenses, client behavior, statistical
data variation and labeling, and energy-efficient training. Finally, we emphasized
future directions for devising new FL algorithms in light of the currently
open issues and for designing new hardware that accounts for resource-constrained chal-
lenges, especially in an FL scenario.
References
1. Hard A, Rao K, Mathews R, Ramaswamy et al (2018) Federated learning for mobile keyboard
prediction. arXiv:1811.03604
2. Leroy D, Coucke A et al (2019) Federated learning for keyword spotting. In: IEEE ICASSP
3. Lim WYB, Luong NC et al (2019) Federated learning in mobile edge networks: a comprehen-
sive survey. arXiv:1909.11875
4. Park J, Wang S et al (2019) Distilling on-device intelligence at the network edge.
arXiv:1908.05895
5. Das A, Brunschwiler T (2019) Privacy is what we care about: experimental investigation of
federated learning on edge devices. In: AIChallengeIoT

6. Xu Z, Li L et al (2019) Exploring federated learning on battery-powered devices. In: ACM
TURC
7. Imteaj A, Amini MH (2021) Fedparl: client activity and resource-oriented lightweight federated
learning model for resource-constrained heterogeneous iot environment. Front Commun Netw
2:10
8. Xu Z, Yang Z et al (2019) Elfish: resource-aware federated learning on heterogeneous edge
devices. arXiv:1912.01684
9. Wang S, Tuor T et al (2019) Adaptive federated learning in resource constrained edge computing
systems. IEEE JSAC 37(6):1205–1221
10. What does it take to train deep learning models on-device? (2018)
11. Bonawitz K, Eichner et al (2019) Towards federated learning at scale: system design.
arXiv:1902.01046
12. McMahan HB, Moore E et al (2016) Communication-efficient learning of deep networks from
decentralized data. arXiv:1602.05629
13. Huang L, Yin Y et al (2018) Loadaboost: loss-based adaboost federated machine learning on
medical data. arXiv:1811.12629
14. Yang T, Andrew G et al (2018) Applied federated learning: improving google keyboard query
suggestions. arXiv:1812.02903
15. Chen F, Dong Z et al (2018) Federated meta-learning for recommendation. arXiv:1802.07876
16. Imteaj A, Khan I, Khazaei J, Amini MH (2021) Fedresilience: a federated learning application
to improve resilience of resource-constrained critical infrastructures. Electronics, 10(16)
17. Yang Q, Liu Y et al (2019) Federated machine learning: concept and applications. ACM Trans
TIST 10(2):12
18. LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444
19. Esteva A, Robicquet A, Ramsundar B, Kuleshov V, DePristo M, Chou K, Cui C, Corrado G,
Thrun S, Dean J (2019) A guide to deep learning in healthcare. Nat Med 25(1):24–29
20. Shen D, Guorong W, Suk H-I (2017) Deep learning in medical image analysis. Annu Rev
Biomed Eng 19:221–248
21. Young T, Hazarika D, Poria S, Cambria E (2018) Recent trends in deep learning based natural
language processing. IEEE Comput Intell Mag 13(3):55–75
22. General Data Protection Regulation (2018) General data protection regulation (gdpr). Intersoft
Consulting. Accessed in October, 24(1)
23. Zhan Y, Li P, Guo S (2020) Experience-driven computational resource allocation of federated
learning by deep reinforcement learning. In: 2020 IEEE international parallel and distributed
processing symposium (IPDPS). IEEE, pp 234–243
24. He C, Annavaram M, Avestimehr S (2020) Group knowledge transfer: federated learning of
large cnns at the edge. arXiv:2007.14513
25. McMahan B, Moore E, Ramage D, Hampson S, y Arcas BA (2017) Communication-efficient
learning of deep networks from decentralized data. In: Artificial intelligence and statistics.
PMLR, pp 1273–1282
26. Gupta O, Raskar R (2018) Distributed learning of deep neural network over multiple agents. J
Netw Comput Appl 116:1–8
27. Ahmed KM, Imteaj A, Amini MH (2021) Federated deep learning for heterogeneous edge
computing. In: 2021 20th IEEE international conference on machine learning and applications
(ICMLA). IEEE
28. Hu R, Guo Y, Ratazzi EP, Gong Y (2020) Differentially private federated learning for resource-
constrained internet of things. arXiv:2003.12705
29. Khan S, Yairi T (2018) A review on the application of deep learning in system health manage-
ment. Mech Syst Signal Process 107:241–265
30. Mahdavifar S, Ghorbani AA (2019) Application of deep learning to cybersecurity: a survey.
Neurocomputing 347:149–176
31. Ahmed KM, Eslami T, Saeed F, Amini MH (2021) Deepcovidnet: deep convolutional neural
network for covid-19 detection from chest radiographic images. In: 2021 IEEE international
conference on bioinformatics and biomedicine (BIBM). IEEE, pp 1703–1710

32. Silver D, Huang A, Maddison CJ, Guez A, Sifre L, Van Den Driessche G, Schrittwieser J,
Antonoglou I, Panneershelvam V, Lanctot M et al (2016) Mastering the game of go with deep
neural networks and tree search. Nature 529(7587):484–489
33. Russakovsky O, Deng J, Hao S, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A,
Bernstein M et al (2015) Imagenet large scale visual recognition challenge. Int J Comput Vis
115(3):211–252
34. Pan SJ, Yang Q (2009) A survey on transfer learning. IEEE Trans Knowl Data Eng 22(10):1345–
1359
35. Jiang J, Zhai CX (2007) Instance weighting for domain adaptation in nlp. ACL
36. Gao J, Fan W, Jiang J, Han J (2008) Knowledge transfer via multiple model local structure
mapping. In: Proceedings of the 14th ACM SIGKDD international conference on knowledge
discovery and data mining, pp 283–291
37. Argyriou A, Pontil M, Ying Y, Micchelli C (2007) A spectral regularization framework for
multi-task structure learning. Adv Neural Inf Proc Syst 20
38. Mihalkova L, Huynh T, Mooney RJ (2007) Mapping and revising markov logic networks for
transfer learning. Aaai 7:608–614
39. Li H, Ota K et al (2018) Learning iot in edge: deep learning for the internet of things with edge
computing. IEEE Netw 32(1):96–101
40. Cui L, Yang S et al (2018) A survey on application of machine learning for internet of things.
J M L Cybern 9(8):1399–1417
41. Haddadpour F, Kamani MM et al (2019) Trading redundancy for communication: speeding up
distributed sgd for non-convex optimization. In: ICML
42. Huang J, Qian F et al (2013) An in-depth study of lte: effect of network protocol and application
behavior on performance. ACM SIGCOMM CCR 43(4):363–374
43. Ma C, Konečný J et al (2017) Distributed optimization with arbitrary local solvers. Optim
Methods Softw 32(4):813–848
44. Imteaj A, Amini MH (2019) Distributed sensing using smart end-user devices: pathway to
federated learning for autonomous iot. In: 2019 international conference on computational
science and computational intelligence (CSCI). IEEE, pp 1156–1161
45. Konečný J, McMahan HB et al (2016) Federated learning: strategies for improving communi-
cation efficiency. arXiv:1610.05492
46. Li T, Sahu AK et al (2019) Federated learning: challenges, methods, and future directions.
arXiv:1908.07873
47. Thrun S et al (2012) Learning to learn. Springer Science & Business Media, Berlin
48. Caruana R (1997) Multitask learning. Mach Learn 28(1):41–75
49. Corinzia L et al (2019) Variational federated multi-task learning. arXiv:1906.06268
50. Wu S, Li G et al (2018) Training and inference with integers in deep neural networks.
arXiv:1802.04680
51. Jiang Y, Wang S et al (2019) Model pruning enables efficient federated learning on edge devices.
arXiv:1909.12326
52. Yan G, Wang H, Li J (2021) Critical learning periods in federated learning. arXiv:2109.05613
53. Thakker U, Beu J et al (2019) Compressing rnns for iot devices by 15-38x using kronecker
products. arXiv:1906.02876
54. Thakker U, Whatmough P, Liu Z, Mattina M, Beu J (2021) Doping: a technique for extreme
compression of lstm models using sparse structured additive matrices. In: Smola A, Dimakis
A, Stoica I (eds), Proceedings of machine learning and systems, vol 3, pp 533–549
55. Gope D, Beu J, Thakker U, Mattina M (2020) Ternary mobilenets via per-layer hybrid filter
banks. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition
(CVPR) workshops
56. Xiong G, Yan G, Li J (2021) Straggler-resilient distributed machine learning with dynamic
backup workers. arXiv:2102.06280
57. Imteaj A, Amini MH (2020) Fedar: activity and resource-aware federated learning model for
distributed mobile robots. In: 2020 19th IEEE international conference on machine learning
and applications (ICMLA). IEEE, pp 1153–1160

58. Imteaj A (2020) Distributed machine learning for collaborative mobile robots: Phd forum
abstract. In: Proceedings of the 18th conference on embedded networked sensor systems,
SenSys ’20, New York, NY, USA, 2020. Association for Computing Machinery, pp 798–799
59. Gu Z, Jamjoom H et al (2019) Reaching data confidentiality and model accountability on the
caltrain. In: IEEE DSN
60. Chen M, Yang Z et al (2019) A joint learning and communications framework for federated
learning over wireless networks. arXiv:1909.07972
61. Sprague MR, Jalalirad A et al (2018) Asynchronous federated learning for geospatial applica-
tions. In: ECML-PKDD
62. Eliazar II, Sokolov IM (2010) Measuring statistical heterogeneity: the pietra index. Physica A:
Stat Mech App 389(1):117–125
63. Kumar A, Goyal S et al (2017) Resource-efficient machine learning in 2 kb ram for the internet
of things. In: ICML
64. Dettmers T, Lewis M, Shleifer S, Zettlemoyer L (2021) 8-bit optimizers via block-wise quan-
tization
65. Anonymous (2022) Logarithmic unbiased quantization: practical 4-bit training in deep learning.
In: Submitted to the tenth international conference on learning representations. Under review
66. Raju R, Gope D, Thakker U, Beu J (2020) Understanding the impact of dynamic channel
pruning on conditionally parameterized convolutions. In: Proceedings of the 2nd international
workshop on challenges in artificial intelligence and machine learning for internet of things,
AIChallengeIoT ’20, New York, NY, USA, 2020. Association for Computing Machinery, pp
27–33
67. Huang X, Thakker U, Gope D, Beu J (2020) Pushing the envelope of dynamic spatial gating
technologies. AIChallengeIoT ’20, New York, NY, USA, 2020. Association for Computing
Machinery, pp 21–26
68. Zhang Y, Duchi J et al (2015) Divide and conquer kernel ridge regression: a distributed algorithm
with minimax optimal rates. JMLR 16(1):3299–3340
69. Guha N, Talwlkar A et al (2019) One-shot federated learning.arXiv:1902.11175
70. Kim H, Park J et al (2019) Blockchained on-device federated learning. IEEE Commun Lett
71. Xu R, Chen Y, Li J (2020) MicroFL: a lightweight, secure-by-design edge network fabric for
decentralized IoT systems. In: NDSS

Federated and Transfer Learning:
A Survey on Adversaries and Defense
Mechanisms
Ehsan Hallaji, Roozbeh Razavi-Far, and Mehrdad Saif
Abstract The advent of federated learning has facilitated large-scale data exchange
amongst machine learning models while maintaining privacy. Despite its brief his-
tory, federated learning is rapidly evolving to make its wider use more practical. One
of the most significant advancements in this domain is the incorporation of transfer
learning into federated learning, which overcomes fundamental constraints of pri-
mary federated learning, particularly in terms of security. This chapter performs a
comprehensive survey of the intersection of federated and transfer learning from a
security point of view. The main goal of this study is to uncover potential vulnerabil-
ities that might compromise the privacy and performance of systems that use federated
and transfer learning, along with the defense mechanisms that can mitigate them.
1 Introduction
Machine learning has exploded in popularity as the information era has matured.
As a sub-discipline of machine learning, deep learning (DL) is responsible for a
number of achievements that have helped popularise the area. The hierarchical feature
extraction within the DL models enables them to learn complex underlying patterns
in the observed input space. This makes DL models suitable for processing various
E. Hallaji (B)
Department of Electrical and Computer Engineering, University of Windsor, 401 Sunset Avenue,
Windsor, ON N9B 3P4, Canada
e-mail: [email protected]
R. Razavi-Far
Faculty of Computer Science, University of New Brunswick, Fredericton, NB, Canada
e-mail: roozbeh.razavi-[email protected]; [email protected]
Department of Electrical and Computer Engineering and School of Computer Science,
University of Windsor, 401 Sunset Avenue, Windsor, ON N9B 3P4, Canada
M. Saif
Department of Electrical and Computer Engineering, University of Windsor, 401 Sunset Avenue,
Windsor, ON N9B 3P4, Canada
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
R. Razavi-Far et al. (eds.), Federated and Transfer Learning, Adaptation, Learning,
and Optimization 27, https://doi.org/10.1007/978-3-031-11748-0_3
data types and facilitating different tasks such as prediction [47], detection [93],
imputation [43], and data reduction [46]. Although the success of DL-based projects
is contingent on several factors, one of the most important requirements is usually
to have access to abundant training samples.
With the advancement of technologies such as the internet of things and the increasing
number of intelligent devices, the diversity and volume of generated data are growing
at an astonishing pace [88]. This abundant data stream is dispersed and diverse in
character. When this data is evaluated as a whole, it may provide knowledge and dis-
coveries that might possibly accelerate technological and scientific advancements.
Nonetheless, the privacy hazards associated with data ownership are increasingly
becoming a major problem. The conflict between user privacy and high-quality ser-
vices is driving the need for new technologies and research to enable knowledge extrac-
tion from data without jeopardising data-holding parties' privacy.
Federated learning (FL) is perhaps the most recent approach presented to poten-
tially resolve this issue. FL allows for the collaborative training of a DL model across
a network of client devices. This multi-party collaboration is accomplished by com-
munication with a central server and decentralisation of training data [71]. In other
words, the FL design requires that the training data be retained on the client, or local
device, that generated or recorded it. FL’s data decentralisation addresses a significant
portion of data and user privacy concerns, since model training at the network’s edge
eliminates the need for direct data sharing. Nevertheless, in FL systems one should
strike a balance between data privacy and model performance. The appropriate bal-
ance is determined by a number of criteria, including the model architecture, data
type, and intended application. Beyond FL restrictions, ensuring the confidentiality
and security of the FL infrastructure is critical for establishing trust amongst diverse
clients of the federated network.
The conventional FL imposes a constraint by requiring customers' training data to
use similar attributes. However, most industries, such as banking and healthcare, do not
comply with this assumption. In centralized machine learning, this was addressed by
Transfer Learning (TL), which enables a model obtained from a specific domain to be
used in other domains with the same application [93]. Inspired by this, Federated
TL (FTL) emerged as a way to overcome this constraint [64]. FTL clients may
differ in the employed attributes, which is more practical for industrial applications.
TL effectiveness is heavily reliant on inter-domain interactions. It is worthwhile to
mention that organizations joining an FTL network are often comparable in the
service they provide and the data they use. As a result, including TL within the FL
architecture can be quite advantageous.
FTL is at the crossroads of two distinct and rapidly expanding research areas:
privacy-preserving and distributed machine learning. For this reason, it is crucial to
investigate these two topics further to make the best use of FTL. Hence, this chapter
studies different security and privacy aspects of FTL.
Information privacy and machine learning are distinct and rapidly expanding
research areas from which FTL derives, and, thus, going through their connections to FTL is
necessary. Understanding the interplay between TL and FL, as well as identifying
potential risks to FTL in real-world applications, is crucial. Knowing which
defense mechanisms are compatible with FTL is also vital for mitigating potential cyber-threats.
Hence, in this chapter, we present a comprehensive survey on possible threats to
FTL and the available defense mechanisms.
The rest of the chapter is organized as follows. The preliminaries of this survey are
explained in Sect. 2. Section 3 reviews known attack scenarios on FL and TL w.r.t.
performance and privacy. Section 4 presents tools and defense mechanisms that are
employed in the literature for mitigating threats to FL and TL. Section 5 explains the
future directions in defending FTL. Finally, Sect. 6 concludes the conducted survey.
2 Background
This section concisely reviews the preliminaries of federated and transfer learning to
facilitate the discussions in the following sections.
2.1 Federated Learning
The FL paradigm allows for collaborative model training across several participants (i.e.,
also referred to as clients or devices). This multi-party collaboration is accomplished
by communication with a central server and decentralization of training data [71].
Data decentralization in FL mitigates a major part of user privacy issues. Moreover,
the efficiency of FL reduces the communication overhead in the network. FL can be
categorized from different perspectives, as explained in the following.
2.1.1 Categories of Federated Learning
FL variations generally fall under three categories depending on the portion of feature
and sample space they share [114]:
1. Horizontal Federated Learning: Participants hold data with comparable
properties captured from various users [55]. For instance, clinical information
of various patients is recorded using the same features across several hospitals.
Therefore, similar deep learning structures can be trained on these datasets since
they all process the same number and types of features.
2. Vertical Federated Learning: The vertical variation is utilized in applications
where participant datasets have considerable overlaps in the sample space but
each has a separate set of attributes [104].
3. Federated Transfer Learning: FTL facilitates knowledge transfer between par-
ticipants when the overlap between sample and feature space is minimal [64].
FTL is discussed in further detail later in this section.

2.2 Transfer Learning
Most machine learning approaches work on the premise that the training and test
data are in the same feature space. In industry, data may be hard to collect in certain
applications, and, thus, there is a preference for using the available data shared
by large companies and organizations. The challenge, however, is the difference
between the data distributions despite the similarity of the applications. In other
words, the ideal model needs to learn from one domain with limited data resources,
while abundant training data is available in another domain. For instance, consider a
factory that trained a model to predict market demand in order to adjust its production rate
for different items. However, it may be time-consuming to obtain enough samples
for each produced item separately. Instead, TL can enable the model to be trained on
the data of other organizations that produce similar products, albeit samples may not
be recorded using the same features. The past decade has witnessed increasing
attention towards research on TL, which has resulted in different variations of TL
being proposed under different names [78].
2.3 Federated Transfer Learning
Similar to TL, clients of FTL may not use the same attributes in the training data.
This is mostly the case in organizations that are similar in nature but are not identical
[92]. Due to such differences, these organizations share only a small portion of each
other’s feature space. Therefore, under this condition, both samples and features are
different in the dataset. Note that the considered condition in FTL is in contrast to
the other variants of FL.
FTL takes a model that has been constructed on source data, and then aligns it to
be employed in a target domain. This allows the model to be utilized for uncorrelated
data points while exploiting the information gained from non-overlapping features in
the source domain. As a result, FTL transmits information from the source domain's
non-overlapping attributes to new samples within the target domain.
Existing literature on FTL mainly studies the customization of FTL for certain
applications [20,113]. From a learning standpoint, only a limited number of works
present distinct FTL protocols. A secure FTL framework is proposed in [64] that
uses secret sharing and homomorphic encryption to protect privacy without sacrific-
ing accuracy, which is a typical issue in privacy-preserving techniques. Other benefits
of this method include the simplicity of homomorphic encryption and the fact that
secret sharing ensures zero accuracy loss and quick processing time. On the other hand,
homomorphic encryption imposes significant computing overhead, and secret shar-
ing necessitates offline operations prior to online computation. A subsequent study
[98] tackles the computation overhead of previous protocols and extends the FTL
model beyond the semi-honest setting by taking malicious users into account as well.
The authors use secret sharing in the designed algorithm to enhance the security and
efficiency of multi-party communications in FTL. A scalable heterogeneous FTL
framework is also presented in [36], which uses secret sharing and homomorphic
encryption.
3 Threats to Federated Learning
Threats to FL and TL often compromise the functionality or privacy of the system.
Table 1 lists the sources of threats for common attacks on FL. FL enables a distributed
learning process without requiring data exchange, allowing members to freely join
and exit federations. Recent research has shown, however, that the resilience of FL
against the threats mentioned in Table 1 can be questionable.
Existing FL protocol designs are vulnerable to rogue servers and adversarial
parties. While both infer confidential information from participants' updates, the
former mainly tampers with the model training whereas the latter resorts to poisoning
attacks to derail the aggregation procedure.
During the training process, communicating model updates might divulge critical
information and lead to deep leakage. This can consequently jeopardize the privacy
of local data or lead to hijacking of the training data [118]. The robustness of FL
systems, on the other hand, can be degraded using poisoning attacks on the model to
corrupt the model or training data [8,13,107]. In turn, these attacks lead to planting
a backdoor into the global model or degrading its convergence.
Table 1 Identification of sources of attacks on FL systems
Attacks Source of attack
Data poisoning Malicious client
Model poisoning Malicious client
Backdoor attack Malicious client and malicious server
Evasion attack Malicious client and model deployment
Non-robust aggregation Aggregation algorithm
Training rule manipulation Malicious client
Inference attacks Malicious server and communication
GAN reconstruction Malicious server and communication
Free-riding attack Malicious client
Man-in-the-middle attack Communication

3.1 Threat Models
Attacks on FL can be launched in different fashions [68,69]. To have a better grasp
of the nature of FL attacks, we will first go through the most frequent threat models
in the following:
Outsider Adversaries include attacks by eavesdroppers on the line of communi-
cation between clients and the FL server, as well as attacks by clients of the FL
model once it is provided as a service.
Insider Adversaries involve attacks initiated from the server or the edge of the
network. Byzantine [15] and Sybil [35] attacks can be mentioned as two of the most
important insider attacks.
Semi-Honest Adversaries are non-aggressive adversaries that attempt to discover
the hidden states of other users while remaining honest to the FL protocol. Only the
received information, such as the global model's parameters, is visible to the attack-
ers.
Training Manipulation is the process of learning, affecting, or distorting the FL
model itself [14]. The attacker can damage the integrity of the learning process by
attacking the training data or the model during the training phase [13].
Inference Manipulation mainly consists of evasion or inference attacks [10].
They usually deceive the model into making incorrect decisions or gather infor-
mation regarding the model's properties. These adversaries' efficacy is determined
by the amount of knowledge given to the attacker, which classifies them into
white-box and black-box variations.
3.2 Attacks on Performance
Figure 1 shows the taxonomy of attacks on FL and TL. The common threats between
FL and TL that can jeopardize FTL are also specified within the red area.
Fig. 1 Taxonomy of the attacks on federated and transfer learning. The common threats between
FL and TL are specified in the red area
3.2.1 Data Poisoning
Poisoning the training data in FL often compromises the integrity of the training data,
degrading model performance by injecting a backdoor for particular triggers during
inference or by lowering the overall model accuracy. The most common types of data
poisoning attacks are as follows:
Denial-of-Service (DoS) attacks generally attempt to minimize the target's overall
performance, impacting the recognition rate of all classes. In an FL system, label
noise in the training data may be induced to create a poisoned model that cannot
accurately predict any of the classes. This is also referred to as label flipping [54]
in the literature (a minimal label-flipping sketch is given after this list). When the
parameters of this model are sent to the server, the performance of other devices
diminishes.
Backdoor attacks are designed to impose intentional and particular false predic-
tions for specific data patterns [54]. Backdoor attacks, unlike DoS attacks, only
influence the model's recognition ability for a specific group of samples or classes.
3.2.2 Model Poisoning
Poisoning a model refers to a wide range of techniques for tampering with the FL
training procedure. It is worth mentioning that in some literature data poisoning is
categorized as a type of model poisoning [52]. However, here, we mainly target
gradient and learning objective manipulation when referring to model poisoning.
Gradient Manipulation: Local model gradients may be manipulated by adver-
saries to degrade the overall performance of the central model, for example, by low-
ering detection accuracy [54]. For instance, this approach is used to inject hidden
global model backdoors [8].
Training Objective Manipulation: involves manipulating model training rules
[54]. Training rule manipulation, for example, is used to successfully carry out a
covert poisoning operation by appending a deviating term to the loss function to
penalize the difference between benign and malicious updates [13] (a schematic
sketch is given below).
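Schematically, such a stealthy objective can be written as in the sketch below (PyTorch; this is a generic formulation for illustration, not the exact loss used in [13]):

```python
# Schematic sketch (PyTorch): a poisoning objective with a "stealth" penalty
# that keeps the malicious update close to a typical benign update; the
# weighting terms are illustrative.
import torch

def stealthy_poison_loss(adv_loss, train_loss, w_malicious, w_benign, lam=1.0, rho=1.0):
    stealth = torch.norm(w_malicious - w_benign, p=2)   # distance to a benign update
    return adv_loss + lam * train_loss + rho * stealth

adv_loss = torch.tensor(0.8)          # loss on the attacker's target samples
train_loss = torch.tensor(0.3)        # loss on benign data, to keep accuracy plausible
w_mal, w_ben = torch.randn(100), torch.randn(100)
print(stealthy_poison_loss(adv_loss, train_loss, w_mal, w_ben))
```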

3.3 Attacks on Privacy
3.3.1 Model Inversion Attacks
It has been demonstrated that model inversion attacks can successfully uncover sen-
sitive characteristics of the classes and instances covered by the model [52,54].
Fredrikson et al. [32] state that using these attacks in a white-box setting on deci-
sion trees can reveal sensitive variables such as survey responses, which may
be identified with no false positives. Another study demonstrates that a hacker may
predict the genetic data of a person simply from their demographic information [54].
3.3.2 Membership Inference Attacks
Membership inference aims at disclosing the membership of a particular sample to
the training dataset or a certain class. Furthermore, this form of attack can work even
if the objective is unrelated to the basic characteristics of the class [72].
3.3.3 GAN Reconstruction Attacks
Model inversion is similar to GAN reconstruction; however, the latter is substantially
more potent and has been demonstrated to create artificial samples that statistically
resemble the training data [48]. Traditional model inversion approaches utterly fail
when attacking more sophisticated DL architectures, whereas GAN reconstruction
attacks may effectively create the desired outputs. It has been shown that even in the
presence of differential privacy, a GAN may be able to reach its objective. As a result,
an adversary may be able to persuade benign clients to mistakenly commit gradient
modifications that leak more confidential details than planned during collaborative
learning [48].
4 Threats to Transfer Learning
4.1 Backdoor Attacks
Many of the pre-trained Teacher models (T) employed for TL are openly available,
making them vulnerable to backdoor attacks [108,116]. In a white-box setup, the
intruder has access to T, as is prevalent in modern applications. The intruder intends
to cause erroneous decision-making for a Student model (S) that has been calibrated
via TL using a publicly available pre-trained T model.
The attacker may breach the publicly accessible pre-trained T model prior to the S
system deployment phase. Because the regulations of third-party platforms that store
diverse T models are often inadequate, the platforms contain multiple variations of
the same pre-trained neural networks. Since the weights of a neural network are not self-
explanatory, distinguishing damaging models from refined models is complicated,
if not impossible. In this situation, we suppose that the intruder is familiar with the
structure and parameters of T and has black-box access to S, but is unaware of
the specific T that trained this model and which layers were fixed when training S
[106].
Adversaries might also get around the fine-tuning technique by leveraging
openly accessible pre-trained T models to construct S models. The S models must
be optimized using particular T models, in which a portion of the T structure must be
incorporated and frequently retrained. In a white-box setting, we presume the intruder
knows the specific T that trained S and which layers were unchanged throughout the
S training. The adversary, in particular, has access to the architecture and weights of
the S model and may change them.
4.2 Adversarial Attacks
In contrast to conventional adversarial attacks, which optimize false data to be mis-
taken for benign samples, the central notion of adversarial attacks against TL is
to optimize a data matrix to imitate the intrinsic representation of the target data.
Models transferred by re-learning the last linear layer have recently been shown to
be sensitive to adversarial instances produced exclusively using a pre-trained model
[89]. It has been demonstrated that such an attack can fool models that have been
transferred with end-to-end fine-tuning [21]. This discovery raises questions about
the security of the extensively employed fine-tuning approach.
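The feature-mimicry idea can be sketched as follows (PyTorch, with a toy feature extractor; the architecture, perturbation bound, and iteration budget are illustrative assumptions):

```python
# Minimal sketch (PyTorch): craft an input whose internal representation under
# a public pre-trained feature extractor mimics that of a chosen target sample.
import torch
import torch.nn as nn

feature_extractor = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))
for p in feature_extractor.parameters():
    p.requires_grad = False                     # the attacker only uses T's fixed weights

source = torch.randn(1, 32)                     # benign-looking starting point
target = torch.randn(1, 32)                     # sample whose representation is imitated
perturbation = torch.zeros_like(source, requires_grad=True)
optimizer = torch.optim.Adam([perturbation], lr=0.05)

target_features = feature_extractor(target).detach()
for _ in range(200):
    optimizer.zero_grad()
    adv = source + perturbation.clamp(-0.3, 0.3)           # bounded perturbation
    loss = nn.functional.mse_loss(feature_extractor(adv), target_features)
    loss.backward()
    optimizer.step()
```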
4.3 Inference Attacks
An inference attack resorts to data analysis to gather unauthorized information about
a subject or database. If an attacker can confidently estimate the true value of a
subject's confidential information, that information can be considered leaked. The most frequent
variants of this approach are membership inference and attribute inference.
4.3.1 Membership Inference
The goal of membership inference in machine learning is to establish whether a
sample was employed to train the target model. Discovering the membership status
of a particular user's data might lead to serious information theft [119]. For instance,
revealing that a patient’s medical records were utilized to train a model linked with
an illness might disclose that the patient has the condition.

In contrast with conventional machine learning, there are two attack surfaces for
membership inference in the TL setting, that is, discovering the membership status of
samples for both the S and T models. Furthermore, depending on the abilities of certain
adversaries, access to either the T or S model may be possible. Given both attack
surfaces and the extent of attackers' access to the models, there are three possible
attack scenarios:
1. The attackers can observe T and aim at ascertaining the membership status of the T
dataset. This approach is analogous to the traditional membership inference
attack, in which the target model is trained from the ground up.
2. The S model is visible to the attackers, and they attempt to ascertain the membership
status of the T dataset. The target model is not directly trained from the
target dataset in this scenario.
3. The attackers observe the S model and try to deduce the S dataset's membership
status. In contrast to the first scenario, here the target model is transferred from
the T model.
4.3.2 Feature Inference
An adversary with partial prior knowledge of a target’s record can devise feature
inference to fill in the missing features by monitoring the model’s behaviour [32,33,
117]. For instance, a description of an attribute inference attack is given in [117], and it
has been demonstrated that by incorporating membership inference as a subroutine,
this attack may deduce missing attribute values. Based on the missing attributes, a
set of distinct feature vectors are generated and passed to the membership inference
adversary as input. The output of this process is attribute values that correspond to
the vector whose membership is confirmed via membership inference. Experimental
validations of the effectiveness of attribute inference attacks on regression models are
also available [117].
5 Defense Mechanisms
Various defense mechanisms have been proposed to fortify FL against privacy- and perfor-
mance-related threats. Figure 2 illustrates the taxonomy of the defense mechanisms
in TL and FL, and the common approaches between the two that can be used to
defend FTL.

Fig. 2 Taxonomy of the defense mechanisms for FTL. The common defense mechanisms between
FL and TL are specified in the red area
5.1 Privacy Preserving
Despite the wide diversity of previous efforts on safeguarding FL privacy, the suggested
methods typically fall into one of three categories: homomorphic encryption,
secure multiparty computation, and differential privacy. The following paragraphs
go through each of these groups.
5.1.1 Homomorphic Encryption
Homomorphic encryption is commonly used to secure the learning process by
operating on ciphertext. Clients can use homomorphic encryption to perform arithmetic
operations on encrypted data (i.e., ciphertext) without having to decrypt it. The most
prevalent homomorphic encryption techniques are explained in the following [76].
Fully homomorphic encryption is capable of performing arbitrary calculations on the
encrypted data [37], whereas partially homomorphic encryption can only execute
one type of operation (e.g., addition or multiplication), and somewhat homomorphic
encryption can perform several operations [23, 77, 91]. The latter, however, supports only a
restricted number of additions and multiplications. While fully homomorphic
encryption offers greater flexibility, it is inefficient compared to other forms
of homomorphic encryption [37].
Despite the benefits of homomorphic encryption, executing arithmetic on the
encrypted values increases memory and processing-time costs. For this reason,
one of the main problems in homomorphic encryption is finding a proper balance
between privacy and utility [6, 56]. In [82], for instance, additive homomorphic
encryption is used to secure distributed learning by protecting model updates
and maintaining gradient privacy. Another example is [45], which uses an additive
homomorphic architecture to defeat honest-but-curious adversaries using federated
logistic regression on encrypted vertical FL data. However, overburdening
the system with additional computational and communication costs is a typical
downside of such schemes.
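As a small illustration of additive homomorphic aggregation in the spirit of [82] and [45] (not their actual implementations), the sketch below assumes the third-party python-paillier package (phe); the aggregator sums ciphertexts without ever seeing an individual update.

    # Minimal sketch of additively homomorphic aggregation of client updates,
    # assuming the `phe` (python-paillier) package is installed.
    from phe import paillier

    public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)

    client_updates = [0.12, -0.05, 0.33]                 # plaintext gradient entries
    encrypted = [public_key.encrypt(u) for u in client_updates]

    # Ciphertexts can be added without decryption, so the aggregator never sees
    # individual contributions.
    encrypted_sum = sum(encrypted[1:], encrypted[0])

    # Only the private-key holder recovers the aggregate.
    print(private_key.decrypt(encrypted_sum))            # approximately 0.40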
5.1.2 Secure Multiparty Computation
Secure Multiparty Computation (SMC) [115] is a subfield of cryptography in which
multiple parties cooperate to compute a function over their inputs without compromising
privacy between participants. An example of SMC is proposed in [74], which
enables collaborative training without compromising privacy. Nevertheless, SMC
entails a considerable computational and communication burden, which may
deter parties from collaborating. This dramatic rise in communication and processing
costs makes SMC undesirable for large-scale FL.
For the safe aggregation of individual model updates, [16] suggested a protocol based
on SMC that is secure, communication-efficient, and failure-resistant. Their technique
makes the communicated information intelligible only once it is aggregated.
Thus, their protocol is secure in both honest-but-curious and malicious setups. In
other words, none of the participants learns anything beyond the aggregate of the
inputs of numerous honest users [16].
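A minimal sketch of the pairwise-masking idea behind such SMC-based secure aggregation is given below; it is a toy illustration rather than the protocol of [16], and it omits key agreement and dropout handling.

    # Toy sketch of secure aggregation via pairwise masks: each pair of clients
    # shares a random mask that one adds and the other subtracts, so individual
    # masked updates look random but their sum equals the true aggregate.
    import random

    def masked_updates(updates, seed=0):
        rng = random.Random(seed)
        masked = list(updates)
        n = len(updates)
        for i in range(n):
            for j in range(i + 1, n):
                m = rng.random()   # pairwise mask m_ij shared by clients i and j
                masked[i] += m
                masked[j] -= m
        return masked

    updates = [0.2, -0.1, 0.4]
    print(sum(masked_updates(updates)))   # equals sum(updates) up to float error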
Aside from the efficiency-related issues of SMC, another major problem for SMC-based
systems is the necessity for all participants to coordinate at the same time during
the training process. In practice, such multiparty interactions may not be ideal, especially
in FL contexts where the client-server design is typical. Moreover, while the
privacy of client data is preserved, malicious parties may still infer sensitive information
from the final output [40, 90]. As a result, SMC cannot guarantee protection against
information leakage, necessitating the incorporation of additional differential privacy
mechanisms within the multiparty protocol to overcome these issues [2, 85]. In addition,
all cryptography-based protocols preclude the auditing of updates, during which
a hacker can covertly inject backdoor features into the shared model [13].

5.1.3 Differential Privacy
The idea of Differential Privacy (DP) is to inject random noise into the generated
updates so that interpreting the data becomes infeasible for malicious entities. DP
is primarily used to safeguard DFL communications against privacy attacks (e.g.,
inference attacks); however, the literature also shows that DP is beneficial against
data poisoning, as these attacks are usually designed based on the communicated
gradients [1, 38, 70].
In contrast to homomorphic encryption and SMC, whose main disadvantage is
communication overhead, DP does not overburden the system in this sense. Instead,
DP comes at the cost of degrading model quality, since the injected noise can
add to the noise within the constructed model. Moreover, DP provides resistance
to poisoning attempts due to its group privacy trait; as a consequence of this trait,
however, the strength of this defense decreases significantly as the number of attackers increases.
DP can be centralized, local, or distributed. In centralized DP, the noise addition
is performed by a server, which makes it impractical in FDL. On the other hand,
local [24] and distributed DP [11, 25] both assume that the aggregator is not trusted,
which perfectly complies with the FDL paradigm. In the local variant, participants
inject noise into their estimated gradients before sharing them over the blockchain.
However, research on local DP indicates its inability to provide privacy guarantees
for large-scale and heterogeneous models with numerous parameters [79, 102]. In
FDL, the injected noise should be calibrated to ensure successful DP. Despite the
appealing security qualities of local DP, its practicality becomes questionable when
dealing with an immense number of users.
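A minimal sketch of the local variant is shown below: each participant clips its update and adds Gaussian noise before sharing it. The clipping norm and noise multiplier are illustrative placeholders; calibrating them to a formal (epsilon, delta) budget is the difficult part in practice.

    # Minimal sketch of local DP for shared updates: clip, then add Gaussian noise.
    import numpy as np

    def privatize_update(grad, clip_norm=1.0, noise_multiplier=1.1, rng=None):
        rng = rng or np.random.default_rng()
        norm = np.linalg.norm(grad)
        clipped = grad * min(1.0, clip_norm / (norm + 1e-12))   # bound sensitivity
        noise = rng.normal(0.0, noise_multiplier * clip_norm, size=grad.shape)
        return clipped + noise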
It is also possible to integrate TL into DP. For instance, private aggregation of
T ensembles [79, 80] initially trains an ensemble of T models on disjoint subsets of
private data, then perturbs the ensemble's information by introducing noise to the
aggregated T votes before transferring the information to an S model. The aggregated output
of the ensemble is then used to train the S model, which learns to precisely replicate
the ensemble. To meet the desired accuracy, this method requires a large number of
clients, each of which must have sufficient training records. On the other hand,
most industrial applications deal with imbalanced data [86], and similarly, FL data
is often imbalanced among parties, which does not comply with this assumption.
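The core of this approach can be sketched as a noisy vote aggregation: each T model votes on a public sample, noise is added to the vote histogram, and the noisy winner becomes the label used to train the S model. The sketch below is a generic illustration with an arbitrary Laplace noise scale, not the exact mechanism of [79, 80].

    # Minimal sketch of noisy-vote aggregation over an ensemble of T models.
    import numpy as np

    def noisy_aggregate_label(t_predictions, num_classes, noise_scale=1.0, rng=None):
        """t_predictions: iterable of predicted class indices, one per T model."""
        rng = rng or np.random.default_rng()
        votes = np.bincount(np.asarray(t_predictions), minlength=num_classes)
        noisy_votes = votes + rng.laplace(0.0, noise_scale, size=num_classes)
        return int(np.argmax(noisy_votes))   # label used to supervise the S model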
It has been established that the use of DP helps prevent inference attacks in TL
[1, 51], albeit at the cost of potential utility loss [7, 51]. By definition, DP seeks to
conceal the presence or absence of a record in a dataset, which works against the
objective of membership inference attacks. Li et al. [59] draw attention to the fact
that these two concepts seem to counteract each other and establish a link between
DP and membership inference attacks. This is often achieved by minimizing
the bias of the model towards any individual sample or feature through the inclusion of adequate
differential privacy noise. The existing connection between records and features is
elaborated in [117].

5.2 Model Robustness
Defenses are classified into two types: proactive and reactive. The former is an
inexpensive way of anticipating attacks and their associated consequences. A reactive
defense operates by detecting an intrusion and taking preventative steps; in the
production environment, it is often deployed as a patch. FL presents multiple
additional attack surfaces throughout training, resulting in complicated and unique
countermeasures. In this part, we look at some of the most common types of FL
defensive tactics and investigate their usefulness and limits.
5.2.1 Anomaly Detection
Anomaly detection methods actively identify and stop malicious updates from
affecting the system [9, 44]. These methods may also be used in FL systems to identify
potential threats [17]. One frequent technique for handling untargeted adversaries
is to calculate a specific test error rate on the updates and reject those that are disadvantageous
or neutral to the global model [9].
In [99], a protection mechanism is proposed that clusters participants based on
their submitted suggestive attributes to identify malicious updates. It produces groups
of benign and malicious users for each indicator attribute. Another detector monitors
drifts in updates using a distance measure across different participants [15]. Li et al.
[60] proposed producing low-dimensional surrogates of the model weights to recognise
anomalous updates from participants. An outlier detection-based paradigm presented
in [50] selects the subset of updates that work in favour of the objective function.
DL-based anomaly detection is often performed using autoencoders [84].
These neural network models represent data in a latent space in which anomalies
can be discriminated. Examples of anomaly detection in FL are given in [27, 61].
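As a rough illustration of autoencoder-based detection of anomalous updates (assuming PyTorch is available; dimensions and the threshold rule are placeholders), the sketch below flags client updates whose reconstruction error is large.

    # Minimal sketch: flag client updates with high autoencoder reconstruction error.
    import torch
    import torch.nn as nn

    class UpdateAutoencoder(nn.Module):
        def __init__(self, dim, latent=16):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(dim, latent), nn.ReLU())
            self.decoder = nn.Linear(latent, dim)

        def forward(self, x):
            return self.decoder(self.encoder(x))

    def flag_anomalies(autoencoder, updates, threshold):
        """updates: tensor of shape (num_clients, dim); returns a boolean mask."""
        with torch.no_grad():
            errors = ((autoencoder(updates) - updates) ** 2).mean(dim=1)
        return errors > threshold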
Backdoor attacks in TL may also be mitigated via anomaly detection. As an
example, [65] employs an anomaly detection technique to determine whether an
input is a possible Trojan trigger. If the input is identified as an anomaly, it is not
passed to the neural network. This approach employs support vector machines
and decision trees to find anomalies.
5.2.2 Robust Aggregation
The security of FL aggregation techniques is of paramount importance. Extensive
research endeavours have been dedicated to robust aggregation schemes that can
recognize and dismiss inaccurate or malicious updates during training [41, 83]. Furthermore,
robust aggregation approaches must be able to withstand communication
disturbances, client dropout, and incorrect model updates, on top of hostile participants
[5]. The existing constraints [54] of aggregation methods for integration with FL
have led to the emergence of more mature techniques such as adaptive aggregation.
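To make the idea concrete, the sketch below shows two generic robust aggregation rules, the coordinate-wise median and the trimmed mean, which can replace plain averaging; these are standard illustrations rather than the adaptive schemes referenced above.

    # Generic robust aggregation rules over a matrix of client updates.
    import numpy as np

    def coordinate_median(updates):
        """updates: array of shape (num_clients, dim)."""
        return np.median(updates, axis=0)

    def trimmed_mean(updates, trim_ratio=0.1):
        """Drop the k largest and k smallest values per coordinate, then average."""
        k = int(trim_ratio * updates.shape[0])
        sorted_updates = np.sort(updates, axis=0)
        kept = sorted_updates[k: updates.shape[0] - k] if k > 0 else sorted_updates
        return kept.mean(axis=0)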
