Advances In Natural Multimodal Dialogue Systems 1st Edition Niels Ole Bernsen

jneidlaawd 7 views 78 slides May 19, 2025

About This Presentation

Advances In Natural Multimodal Dialogue Systems 1st Edition Niels Ole Bernsen


Slide Content

Advances in Natural Multimodal Dialogue Systems

Text, Speech and Language Technology
Series Editors
Nancy Ide, Vassar College, New York
Jean Véronis, Université de Provence and CNRS, France
Editorial Board
Harald Baayen, Max Planck Institute for Psycholinguistics, The Netherlands
Kenneth W. Church, AT&T Bell Labs, New Jersey, USA
Judith Klavans, Columbia University, New York, USA
David T. Barnard, University of Regina, Canada
Dan Tufis, Romanian Academy of Sciences, Romania
Joaquim Llisterri, Universitat Autònoma de Barcelona, Spain
Stig Johansson, University of Oslo, Norway
Joseph Mariani, LIMSI-CNRS, France
The titles published in this series are listed at the end of this volume.
VOLUME 30

Advances in Natural Multimodal Dialogue Systems
Edited by
Jan C.J. van Kuppevelt
Waalre, The Netherlands
Laila Dybkjær
University of Southern Denmark, Odense, Denmark
and
Niels Ole Bernsen
University of Southern Denmark, Odense, Denmark

A C.I.P. Catalogue record for this book is available from the Library of Congress.
All Rights Reserved
No part of this work may be reproduced, stored in a retrieval system, or transmitted
in any form or by any means, electronic, mechanical, photocopying, microfilming, recording
or otherwise, without written permission from the Publisher, with the exception
of any material supplied specifically for the purpose of being entered
and executed on a computer system, for exclusive use by the purchaser of the work.
Printed in the Netherlands
Published by Springer, P.O. Box 17, 3300 AA Dordrecht, The Netherlands.
Printed on acid-free paper
ISBN-10 1-4020-3932-8 (HB)
ISBN-10 1-4020-3934-4 (PB)
ISBN-10 1-4020-3933-6 (e-book)
ISBN-13 978-1-4020-3934-8 (PB)
ISBN-13 978-1-4020-3932-4 (HB)
ISBN-13 978-1-4020-3933-1 (e-book)
www.springer.com
© 2005 Springer

Contents
Preface
1
Natural and Multimodal Interactivity Engineering - Directions and Needs
Niels Ole Bernsen and Laila Dybkjær
1. Introduction
2. Chapter Presentations
3. NMIE Contributions by the Included Chapters
4. Multimodality and Natural Interactivity
References
Part I Making Dialogues More Natural: Empirical Work and Applied Theory
2
Social Dialogue with Embodied Conversational Agents
Timothy Bickmore and Justine Cassell
1. Introduction
2. Embodied Conversational Agents
3. Social Dialogue
4. Related Work
5. Social Dialogue in REA
6. A Study Comparing ECA Social Dialogue with Audio-Only Social Dialogue
7. Conclusion
References
3
A First Experiment in Engagement for Human-Robot Interaction in Hosting Activities
Candace L. Sidner and Myroslava Dzikovska
1. Introduction
2. Hosting Activities
3. What is Engagement?
4. First Experiment in Hosting: A Pointing Robot
5. Making Progress on Hosting Behaviours
6. Engagement for Human-Human Interaction
7. Computational Modelling of Human-Human Hosting and Engagement
8. A Next Generation Mel
9. Summary
References
Part II Annotation and Analysis of Multimodal Data: Speech and Gesture
4
FORM
Craig H. Martell
1. Introduction
2. Structure of FORM
3. Annotation Graphs
4. Annotation Example
5. Preliminary Inter-Annotator Agreement Results
6. Conclusion: Applications to HLT and HCI?
Appendix: Other Tools, Schemes and Methods of Gesture Analysis
References
5
On the Relationships among Speech, Gestures, and Object Manipulation in Virtual Environments: Initial Evidence
Andrea Corradini and Philip R. Cohen
1. Introduction
2. Study
3. Data Analysis
4. Results
5. Discussion
6. Related Work
7. Future Work
8. Conclusions
Appendix: Questionnaire MYST III - EXILE
References
6
Analysing Multimodal Communication
Patrick G. T. Healey, Marcus Colman and Mike Thirlwell
1. Introduction
2. Breakdown and Repair
3. Analysing Communicative Co-ordination
4. Discussion
References
7
Do Oral Messages Help Visual Search?
Noëlle Carbonell and Suzanne Kieffer
1. Context and Motivation
2. Methodology and Experimental Set-Up
3. Results: Presentation and Discussion
4. Conclusion
References
8
Geometric and Statistical Approaches to Audiovisual Segmentation
Trevor Darrell, John W. Fisher III, Kevin W. Wilson, and Michael R. Siracusa
1. Introduction
2. Related Work
3. Multimodal Multisensor Domain
4. Results
5. Single Multimodal Sensor Domain
6. Integration
References
Part III Animated Talking Heads and Evaluation
9
The Psychology and Technology of Talking Heads: Applications in Language Learning
Dominic W. Massaro
1. Introduction
2. Facial Animation and Visible Speech Synthesis
3. Speech Science
4. Language Learning
5. Research on the Educational Impact of Animated Tutors
6. Summary
References
10
Effective Interaction with Talking Animated Agents in Dialogue Systems
Björn Granström and David House
1. Introduction
2. The KTH Talking Head
3. Effectiveness in Intelligibility and Information Presentation
4. Effectiveness in Interaction
5. Experimental Applications
6. The Effective Agent as a Language Tutor
7. Experiments and 3D Recordings for the Expressive Agent
References
11
Controlling the Gaze of Conversational Agents
Dirk Heylen, Ivo van Es, Anton Nijholt and Betsy van Dijk
1. Introduction
2. Functions of Gaze
3. The Experiment
4. Discussion
5. Conclusion
References
Part IV Architectures and Technologies for Advanced and Adaptive Multimodal Dialogue Systems
12
MIND: A Context-Based Multimodal Interpretation Framework in Conversational Systems
Joyce Y. Chai, Shimei Pan and Michelle X. Zhou
1. Introduction
2. Related Work
3. MIND Overview
4. Example Scenario
5. Semantics-Based Representation
6. Context-Based Multimodal Interpretation
7. Discussion
References
13
A General Purpose Architecture for Intelligent Tutoring Systems
Brady Clark, Oliver Lemon, Alexander Gruenstein, Elizabeth Owen Bratt,
John Fry, Stanley Peters, Heather Pon-Barry, Karl Schultz,
Zack Thomsen-Gray and Pucktada Treeratpituk
1. Introduction
2. An Intelligent Tutoring System for Damage Control
3. An Architecture for Multimodal Dialogue Systems
4. Activity Models
5. Dialogue Management Architecture
6. Benefits of ACI for Intelligent Tutoring Systems
7. Conclusion
References
14
MIAMM – A Multimodal Dialogue System using Haptics
Norbert Reithinger, Dirk Fedeler, Ashwani Kumar, Christoph Lauer,
Elsa Pecourt and Laurent Romary
1. Introduction
2. Haptic Interaction in a Multimodal Dialogue System
3. Visual Haptic Interaction – Concepts in MIAMM
4. Dialogue Management
5. The Multimodal Interface Language (MMIL)
6. Conclusion
References
15
Adaptive Human-Computer Dialogue
Sorin Dusan and James Flanagan
1. Introduction
2. Overview of Language Acquisition
3. Dialogue Systems
4. Language Knowledge Representation
5. Dialogue Adaptation
6. Experiments
7. Conclusion
References
16
Machine Learning Approaches to Human Dialogue Modelling
Yorick Wilks, Nick Webb, Andrea Setzer, Mark Hepple and Roberta Catizone
1. Introduction
2. Modality Independent Dialogue Management
3. Learning to Annotate Utterances
4. Future work: Data Driven Dialogue Discovery
5. Discussion
References
Index

Preface
The chapters in this book jointly contribute to what we shall call the field
of natural and multimodal interactive systems engineering. This is not yet a
well-established field of research and commercial development but, rather, an
emerging one in all respects. It brings together, in a process that, arguably, was
bound to happen, contributors from many different, and often far more estab-
lished, fields of research and industrial development. To mention but a few,
these include speech technology, computer graphics and computer vision. The
field’s rapid expansion seems driven by a shared vision of the potential of new
interactive modalities of information representation and exchange for radically
transforming the world of computer systems, networks, devices, applications,
etc. from the GUI (graphical user interface) paradigm into something which
will enable a far deeper and much more intuitive and natural integration of
computer systems into people’s work and lives.
Jointly, the chapters present a broad and detailed picture of where natural
and multimodal interactive systems engineering stands today. The book is
based on selected presentations made at the International Workshop on Natural,
Intelligent and Effective Interaction in Multimodal Dialogue Systems held in
Copenhagen, Denmark, in 2002 and sponsored by the European CLASS project.
CLASS was initiated at the request of the European Commission with the
purpose of supporting and stimulating collaboration among Human Language
Technology (HLT) projects as well as between HLT projects and relevant pro-
jects outside Europe. The purpose of the workshop was to bring together re-
searchers from academia and industry to discuss innovative approaches and
challenges in natural and multimodal interactive systems engineering.
The Copenhagen 2002 CLASS workshop was not just a very worthwhile
event in an emerging field due to the general quality of the papers presented.
It was also largely representative of the state of the art in the field. Given the
increasing interest in natural interactivity and multimodality and the excellent
quality of the work presented, it was felt to be timely to publish a book reflect-
ing recent developments. Sixteen high-quality papers from the workshop were
selected for publication. Content-wise, the chapters in this book illustrate most
aspects of natural and multimodal interactive systems engineering: applicable
theory, empirical work, data annotation and analysis, enabling technologies,
advanced systems, re-usability of components and tools, evaluation, and future
visions. The selected papers have all been reviewed, revised, extended, and
improved after the workshop.
We are convinced that people who work in natural interactive and multimodal
dialogue systems engineering, from graduate and Ph.D. students to experienced
researchers and developers, and no matter which community they originally
come from, may find this collection of papers interesting and useful to
their own work.
We would like to express our sincere gratitude to all those who helped us in
preparing this book. In particular, we would like to thank all reviewers for
their valuable and extensive comments and criticism, which have helped improve
the quality of the individual chapters as well as the entire book.
THE EDITORS

Chapter 1
NATURAL AND MULTIMODAL INTERACTIVITY
ENGINEERING - DIRECTIONS AND NEEDS
Niels Ole Bernsen and Laila Dybkjær
Natural Interactive Systems Laboratory
University of Southern Denmark
Campusvej 55, 5230 Odense M, Denmark
{nob, laila}@nis.sdu.dk
Abstract This introductory chapter discusses the field of natural and multimodal interac-
tivity engineering and presents the following 15 chapters in this context. A brief
presentation of each chapter is given, their contributions to specific natural and
multimodal interactivity engineering needs are discussed, and the concepts of
multimodality and natural interactivity are explained along with an overview of
the modalities investigated in the 15 chapters.
Keywords: Natural and multimodal interactivity engineering.
1. Introduction
Chapters 2 through 16 of this book present original contributions to the
emerging field of natural and multimodal interactivity engineering (hence-
forth NMIE). A prominent characteristic of NMIE is that the field is not yet
an established field of research and commercial development but, rather, an
emerging one in all respects, including applicable theory, experimental results,
platforms and development environments, standards (guidelines, de facto standards,
official standards), evaluation paradigms, coherence, ultimate scope, enabling
technologies for software engineering, general topology of the field itself,
"killer applications", etc.
The NMIE field is vast and brings together practitioners from very many
different, and often far more established, fields of research and industrial de-
velopment, such as signal processing, speech technology, computer graphics,
computer vision, human-computer interaction, virtual and augmented reality,
non-speech sound, haptic devices, telecommunications, computer games, etc.
© 2005 Springer. Printed in the Netherlands.
J.C.J. van Kuppevelt et al. (eds.), Advances in Natural Multimodal Dialogue Systems, 1–19.
Table 1.1. Needs for progress in natural and multimodal interactivity engineering (NMIE).
General | Specific
Understand issues, problems, solutions | Applicable theory: for any aspect of NMIE.
 | Empirical work and analysis: controlled experiments, behavioural studies, simulations, scenario studies, task analysis on roles of, and collaboration among, specific modalities to achieve various benefits.
 | Annotation and analysis: new quality data resources, coding schemes, coding tools, and standards.
 | Future visions: visions, roadmaps, etc., general and per sub-area.
Build systems | Enabling technologies: new basic technologies needed.
 | More advanced systems: new, more complex, versatile, and capable system aspects.
 | Make it easy: Re-usable platforms, components, toolkits, architectures, interface languages, standards, etc.
Evaluate | Evaluate all aspects: of components, systems, technologies, processes, etc.
It may be noted that the fact that a field of research has been established over
decades in its own right is fully compatible with many if not most of its practi-
tioners being novices in NMIE. It follows that NMIE community formation is
an ongoing challenge for all.
Broadly speaking, the emergence of a new systems field, such as NMIE,
takes understanding of issues, problems and solutions, knowledge and skills
for building (or developing) systems and enabling technologies, and evaluation
of any aspect of the process and its results. In the particular case of NMIE,
these goals or needs could be made more specific as shown in Table 1.1.
Below, we start with a brief presentation of each of the following 15 chapters
(Section 2). Taking Table 1.1 as the point of departure, Section 3 provides an
overview of, and discusses, the individual chapters’ contributions to specific
NMIE needs. Section 4 explains multimodality and natural interactivity and
discusses the modalities investigated in the chapters of this book.
2. Chapter Presentations
We have found it appropriate to structure the 15 chapters into four parts un-
der four headlines related to how the chapters contribute to the specific NMIE
needs listed in Table 1.1. Each chapter has a main emphasis on issues that
contribute to NMIE needs captured by the headline of the part of the book to
which it belongs. The division cannot, of course, be a sharp one. Several
chapters include discussions of issues that would make them fit into several
parts of the book.
Part one focuses on making dialogues more natural and has its main em-
phasis on experimental work and the application of theory. Part two concerns
annotation and analysis of multimodal data, in particular, the modalities of
speech and gesture. Part three addresses animated talking heads and related
evaluation issues. Part four covers issues in building advanced multimodal
dialogue systems, including architectures and technologies.
2.1 Making Dialogues More Natural: Empirical Work
and Applied Theory
Two chapters have been categorised under this headline. They both aim at
making interaction with a virtual or physical agent more natural. Experimental
work is central in both chapters and so is the application of theory.
The chapter by Bickmore and Cassell (Chapter 2) presents an empirical
study of the social dialogue of an embodied conversational real-estate agent
during interaction with users. The study is a comparative one. In one set-
ting, users could see the agent while interacting with it. In the second setting,
users could talk to the agent but not see it. The paper presents several interest-
ing findings on social dialogue and interaction with embodied conversational
agents.
Sidner and Dzikovska (Chapter 3) present empirical results from human-
robot interaction in hosting activities. Their focus is on engagement, i.e. on
the establishment and maintenance of a connection between interlocutors and
the ending of it when desired. The stationary robot can point, make beat ges-
tures, and move its eyes while conducting dialogue and tutoring on the use of
a gas turbine engine shown to the user on a screen. The authors also discuss
human-human hosting and engagement. Building on the theory of
human-human engagement and using input from experiments, they plan to
keep adding capabilities that make the robot better at showing
engagement.
2.2 Annotation and Analysis of Multimodal Data: Speech
and Gesture
The five chapters in this part have a common emphasis on data analysis.
While one chapter focuses on annotation of already collected data, the other
four chapters all describe experiments with data collection. In all cases, the
data are subsequently analysed in order to, e.g., provide new knowledge on
conversation or show whether a hypothesis was true or not.
Martell (Chapter 4) presents the FORM annotation scheme and illustrates
its use. FORM enables annotators to mark up the kinematic information of
gestures in videos. Although designed for gesture markup, FORM is also de-
signed to be extensible to markup of speech and other conversational informa-
tion. The goal is to establish an extensible corpus of annotated videos which
can be used for research in various aspects of conversational interaction. In
an appendix, Martell provides a brief overview of other tools, schemes and
methods for gesture analysis.
Corradini and Cohen (Chapter 5) report on a Wizard-of-Oz study that
investigated how people use gesture and speech during interaction with
a video game when they do not have access to standard input devices. The
subjects’ interaction with the game was recorded, transcribed, further coded,
and analysed. The primary analysis of the data concerns the users’ use of
speech-only, gesture-only and speech and gesture in combination. Moreover,
subjective data were collected by asking subjects after the experiment about
their modality preferences during interaction. Although subjects’ answers and
their actual behaviour in the experiment did not always match, the study indi-
cates a preference for multimodal interaction.
The chapter by Healey et al. (Chapter 6) addresses the analysis of human-
human interaction. The motivation is the lack of support for the design of
systems for human-human interaction. They discuss two psycholinguistic ap-
proaches and propose a third approach based on the detection and resolution
of communication problems. This approach is useful for measuring the effec-
tiveness of human-human interaction across tasks and modalities. A coding
protocol for identification of repair phenomena across different modalities is
presented followed by evaluation results from testing the protocol on a small
corpus of repair sequences. The presented approach has the potential to help
in judging the effectiveness of multimodal communication.
Carbonell and Kieffer (Chapter 7) report on an experimental study which
investigated whether oral messages facilitate visual search tasks on a crowded
display. Using the mouse, subjects were asked to search for and select visual
targets in complex scenes presented on the screen. Before the presentation of each
scene, the subject would either see the target alone, receive an oral description
of the target and spatial information about its position in the scene, or get a
combination of the visual and oral target descriptions. Analysis of the data
collected suggests that appropriate oral messages can improve search accuracy
as well as selection time. However, both objectively and subjectively, multi-
modal messages were most effective.
Darrell et al. (Chapter 8) discuss the problem of knowing who is speaking
during multi-speaker interaction with a computer. They present two methods
based on geometric and statistical source separation approaches, respectively.
These methods are used for audiovisual segmentation of multiple speakers and
have been used in experiments. One setup in a conference room used several
stereo cameras and a ceiling-mounted microphone array grid. In this case a
geometric method was used to identify the speaker. In a second setup, involv-
ing use of a handheld device or a kiosk, a single camera and a single omni-
directional microphone were used and a statistical method applied for speaker
identification. Data analysis showed that each approach was valuable in the
intended domain. However, the authors propose that a combination of the two
methods would be of benefit and initial integration efforts are discussed.
2.3 Animated Talking Heads and Evaluation
Three chapters address animated talking heads. These chapters all include
experimental data collection and data analysis, and present evaluations of var-
ious aspects of talking head technology and its usability.
The chapter by Massaro (Chapter 9) concerns computer-assisted speech and
language tutors for deaf, hard-of-hearing, and autistic children. The author
presents an animated talking head for visible speech synthesis. The skin of the
head can be made transparent so that one can see the tongue and the palate. The
technology has been used in a language training program in which the agent
guides the user through a number of exercises in vocabulary and grammar. The
aim is to improve speech articulation and develop linguistic and phonological
awareness in the users. The reported experiments show positive learning re-
sults.
Granström and House (Chapter 10) describe work on animated talking
heads, focusing on the increased intelligibility and efficiency provided by the
addition of the talking face which uses text-to-speech synthesis. Results from
various studies are presented. The studies include intelligibility tests and per-
ceptual evaluation experiments. Among other things, facial cues to convey,
e.g., feedback, turn-taking, and prosodic functions like prominence have been
investigated. A number of applications of the talking head are described, in-
cluding a language tutor.
Heylen et al. (Chapter 11) report on how different eye gaze behaviours of
a cartoon-like talking face affect the interaction with users. Three versions of
the talking face were included in an experiment in which users had to make
concert reservations by interacting with the talking face through typed input.
One version of the face was designed to closely approximate human-like
gaze behaviour. In a second version gaze shifts were kept minimal, and in a
third version gaze shifts were random. Evaluation of data from the experiment
clearly showed a better performance of, and preference for, the human-like
version.

2.4 Architectures and Technologies for Advanced and
Adaptive Multimodal Dialogue Systems
The last part of this book comprises five chapters which all have a strong fo-
cus on aspects of developing advanced multimodal dialogue systems. Several
of the chapters present architectures which may be reused across applications,
while others emphasise learning and adaptation.
The chapter by Chai et al. (Chapter 12) addresses the difficult task of in-
terpreting multimodal user input. The authors propose to use a fine-grained
semantic model that characterises the meaning of user input and the overall
conversation, and an integrated interpretation approach drawing on context
knowledge, such as conversation histories and domain knowledge. These two
approaches are discussed in detail and are included in the multimodal inter-
pretation framework presented. The framework is illustrated by a real-estate
application in which it has been integrated.
Clark et al. (Chapter 13) discuss the application of a general-purpose ar-
chitecture in support of multimodal interaction with complex devices and ap-
plications. The architecture includes speech recognition, natural language un-
derstanding, text-to-speech synthesis, an architecture for conversational intel-
ligence, and use of the Open Agent Architecture. The architecture takes ad-
vantage of reusability and has been deployed in a number of dialogue systems.
The authors report on its deployment in an intelligent tutoring system for ship-
board damage control. Details about the tutoring system and the architecture
are presented.
Reithinger et al. (Chapter 14) present a multimodal system for access to
multimedia databases on small handheld devices. Interaction in three lan-
guages is supported. Emphasis is on haptic interaction via active buttons com-
bined with spoken input and visual and acoustic output. The overall architec-
ture of the system is explained and so is the format for data exchange between
modules. Also, the dialogue manager is described, including its architecture
and multimodal fusion issues.
Dusan and Flanagan (Chapter 15) address the difficult issue of ensuring suf-
ficient vocabulary coverage in a spoken dialogue system. To overcome the
problem that there may always be a need for additional words or word forms,
the authors propose a method for adapting the vocabulary of a spoken dialogue
system at run-time. Adaptation is done by the user by adding new concepts to
existing pre-programmed concept classes and by providing semantic informa-
tion about the new concepts. Multiple input modalities are available for doing
the adaptation. Positive results from preliminary experiments with the method
are reported.
Wilks et al. (Chapter 16) discuss machine learning approaches to the modelling
of human-computer interaction. They first describe a dialogue manager
built for multimodal dialogue handling. The dialogue manager uses
stereotypical dialogue patterns, called Dialogue Action Frames, for
representation. The authors then describe an analysis module which learns to assign
dialogue acts and semantic contents from corpora. The idea is to enable
automatic derivation of Dialogue Action Frames, so that the dialogue manager will
be able to use Dialogue Action Frames that are automatically learned from
corpora.
3. NMIE Contributions by the Included Chapters
Using the right-hand column entries of Table 1.1, Table 1.2 indicates how the
15 chapters in this book contribute to the NMIE field.
A preliminary conclusion based on Table 1.2 is that, for an emerging field
which is only beginning to be exploited commercially, the NMIE research be-
ing done world-wide today is already pushing the frontiers in many of the
directions needed.
In the following, we discuss the chapter contributions to each of the left-hand
entries in Table 1.2.
3.1 Applicable Theory
It may be characteristic of the NMIE field at present that our sample of pa-
pers only includes a single contribution of a primarily theoretical nature, i.e.
Healey et al., which applies a psycholinguistic model of dialogue to help iden-
tify a subset of communication problems in order to judge the effectiveness of
multimodal communication. Human-machine communication problems, their
nature and identification by human or machine, have recently begun to attract
the attention of more than a few NMIE researchers, and it has become quite
clear that we need far better understanding of miscommunication in natural
and multimodal interaction than we have at present.
However, the relative absence of theoretical papers is not characteristic in
the sense that the field does not make use of, or even need, applicable theory.
On the contrary, a large number of chapters actually do apply existing theory
in some form, ranging from empirical generalisations to full-fledged theory
of many different kinds. For instance, Bickmore and Cassell test generalisa-
tions on the effects on communication of involving embodied conversational
agents; Carbonell and Kieffer apply modality theory; Chai et al. apply theories
of human-human dialogue to the development of a fine-grained, semantics-based
multimodal dialogue interpretation framework; Massaro applies theories
of human learning; and Sidner and Dzikovska draw on conversation and col-
laboration theory.
Table 1.2. NMIE needs addressed by the chapters in this book. Numbers refer to chapters.

| Specific to NMIE | Contributions |
|---|---|
| Applicable theory | No new theory except 6 but plenty of applied theory, e.g. 2, 3. |
| Empirical work and analysis | Effects on communication of animated conversational agents. 2, 10, 11 |
| | Spoken input in support of visual search. 7 |
| | Gesture and speech for video game playing. 5 |
| | Multi-speaker speech recognition. 8 |
| | Gaze behaviour for more likeable animated interface agents. 2, 11 |
| | Audio-visual speech output. 9, 10 |
| | Animated talking heads for language learning. 9, 10 |
| | Tutoring. 3, 9, 10, 13 |
| | Hosting robots. 3 |
| Annotation and analysis | Coding scheme for conversational interaction research. 4 |
| | Standard for internal representation of NMIE data codings. 4 |
| Future visions | Many papers with visions concerning new challenges in their research. |
| Enabling technologies | Interactive robotics: robots controlled multimodally, tutoring and hosting robots. 3 |
| | Multi-speaker speech recognition. 8 |
| | Audio-visual speech synthesis for talking heads. 9, 10 |
| | Machine learning of language and dialogue acts assignment. 15, 16 |
| More advanced systems | Multilinguality. 14 |
| | Ubiquitous (mobile) application. 14 |
| | On-line observation-based user modelling for adaptivity. 12, 14 |
| | Complex natural interactive dialogue management. 12, 13, 14, 16 |
| | Machine learning for more advanced dialogue systems. 15, 16 |
| Make it easy | Platform for natural interactivity. 6 |
| | Re-usable components (many papers). |
| | Architectures for multimodal systems and dialogue management. 3, 12, 13, 14, 16, 16 |
| | Multimodal interface language. 14 |
| | XML for data exchange. 10, 14 |
| Evaluate | Effects on communication of animated conversational agents. 2, 10, 11 |
| | Evaluations of talking heads. 9, 10, 11 |
| | Evaluation of audio-visual speech synthesis for learning. 9, 10 |
3.2 Empirical Work and Analysis
Novel theory tends to be preceded by empirical exploration and generalisa-
tion. The NMIE field is replete with empirical studies of human-human and
human-computer natural and multimodal interaction [Dehn and van Mulken,
2000]. By their nature, empirical studies are closer to the process of engineer-
ing than is theory development. We build NMIE research systems not only
from theory but, perhaps to a far greater extent, from hunches, contextual as-
sumptions, extrapolations from previous experience and untried transfer from
different application scenarios, user groups, environments, etc., or even Wiz-
ard of Oz studies, which are in themselves a form of empirical study, see, e.g.,
the chapter by Corradini and Cohen and [Bernsen et al., 1998]. Having built
a prototype system, we are keen to find out how far those hunches, etc. got
us. Since empirical testing, evaluation, and assessment are integral parts of
software and systems engineering, all we have to do is to include “assumptions
testing” in the empirical evaluation of the implemented system which we
would be doing anyway.
The drawback of empirical studies is that they usually do not generalise
much due to the multitude of independent variables involved. This point is
comprehensively argued and illustrated for the general case of multimodal and
natural interactive systems which include speech in [Bernsen, 2002]. Still,
as we tend to work on the basis of only slightly fortified hunches anyway,
the results could often serve to inspire fellow researchers to follow them up.
Thus, best-practice empirical studies are of major importance in guiding NMIE
progress.
The empirical chapters in this book illustrate well the points made above.
One cluster of findings demonstrates the potential of audio-visual speech output
by animated talking heads for child language learning (Massaro) and, more
generally, for improving intelligibility and efficiency of human-machine
communication, including the substitution of facial animation for the still-missing
prosody in current speech synthesis systems (Granström and House). In
counterpoint, so to speak, Darrell et al. convincingly demonstrate the advantage of
audio-visual input for tackling an important next step in speech technology,
i.e. the recognition of multi-speaker spoken input. Jointly, the three chapters
do a magnificent job of justifying the need for natural and multimodal (audio-
visual) interaction independently of any psychological or social-psychological
argument in favour of employing animated conversational agents.
A key question seems to be: for which purpose(s), other than harvesting the
benefits of using audio-visual speech input/output described above, do we need
to accompany spoken human-computer dialogue with more or less elaborate
animated conversational interface agents [Dehn and van Mulken, 2000]? By
contrast with spoken output, animated interface agents occupy valuable screen
real estate, do not necessarily add information of importance to the users of
large classes of applications, and may distract the user from the task at hand.
Whilst a concise and comprehensive answer to this question is still pending,
Bickmore and Cassell go a long way towards explaining that the introduction
of life-like animated interface agents into human-computer spoken dialogue is
a tough and demanding proposition. As soon as an agent appears on the dis-
play, users tend to switch expectations from talking to a machine to talking to
a human. By comparison, the finding of Heylen et al. that users tend to ap-
preciate an animated cartoon agent more if it shows a minimum of human-like
gaze behaviour might speak in favour of preferring cartoon-style agents over
life-like animated agents because the former do not run the risk of facing our
full expectations of human conversational behaviour. Sidner and Dzikovska do
not involve a virtual agent but, rather, a robot in the dialogue with the user, so
they do not have the problem of an agent occupying part of the screen. But
they still have the behavioural problems of the robot to look into just as,
by close analogy, do the people who work with virtual agents. The experi-
ments by Sidner and Dzikovska show that there is still a long way to go before
we fully understand and can model the subtle details of human behaviour in
dialogue.
On the multimodal side of the natural interactivity/multimodality semi-
divide, several papers address issues of modality collaboration, i.e., how the
use of modality combinations could facilitate, or even enable, human-computer
interaction tasks that could not be done easily, if at all, using unimodal inter-
action. Carbonell and Kieffer report on how combined speech and graphics
output can facilitate display search, and Corradini and Cohen show how the
optional use of different input modalities can improve interaction in a particu-
lar virtual environment.
3.3 Annotation and Analysis
It is perhaps not surprising that we are not very capable of predicting what
people will do, or how they will behave, when interacting with computer sys-
tems using new modality combinations and possibly also new interactive de-
vices. More surprising, however, is the fact that we are often just as ignorant
when trying to predict natural interactive behaviours which we have the op-
portunity to observe every day in ourselves and others, such as: which kinds
of gestures, if any, do people perform when they are listening to someone else
speaking? This example illustrates that, to understand the ways in which peo-
ple communicate with one another as well as the ways in which people com-
municate with the far more limited, current NMI systems, we need extensive
studies of behavioural data. The study of data on natural and multimodal inter-
action is becoming a major research area full of potential for new discoveries.
A number of chapters make use of, or refer to, data resources for NMIE,
but none of them take a more general view on data resource issues. One
chapter addresses NMIE needs for new coding schemes. Martell presents a
kinematically-based gesture annotation scheme for capturing the kinematic in-
formation in gestures from videos of speakers. Linking the urgent issue of
new, more powerful coding tools with the equally important issue of standard-
isation, Martell proposes a standard for the internal representation of NMIE
codings.
3.4 Future Visions
None of the chapters have a particular focus on future visions for the NMIE
field. However, many authors touch on future visions, e.g., in their descriptions of
future work and what they would like to achieve. This includes the important
driving role of re-usable platforms, tools, and components for making rapid
progress. Moreover, there are several hints at the future importance to NMIE of
two generic enabling technologies which are needed to extend spoken dialogue
systems to full natural interactive systems. These technologies are (i) computer
vision for processing camera input, and (ii) computer animation systems. It is
only recently that the computer vision community has begun to address issues
of natural interactive and multimodal human-system communication, and there
is a long way to go before computer vision can parallel speech recognition as
a major input medium for NMIE.
The chapters by Massaro and Granström and House illustrate current NMIE
efforts to extend natural and multimodal interaction beyond traditional
information systems to new major application areas. These include training and
education, which has been around for a while already, notably in the
US-dominated paradigm of tutoring systems using animated interface agents, but
also edutainment and entertainment. While the GUI, including the current WWW,
might be said to have the edutainment potential of a schoolbook or newspaper,
NMIE systems have the much more powerful edutainment potential of brilliant
teachers, comedians, and exciting human-human games.
3.5 Enabling Technologies
An enabling technology is often developed over a long time by some sepa-
rate community, such as by the speech recognition community from the 1950s
to the late 1980s. Having matured to the point at which practical applications
become possible, the technology transforms into an omnipresent tool for sys-
tem developers, as is the case with speech recognition technology today. NMIE
needs a large number of enabling technologies and these are currently at very
different stages of maturity. Several enabling technologies, some of which are
at an early stage and some of which are finding their way into applications, are
presented in this book in the context of their application to NMIE problems, in-
cluding robot interaction and agent technology, multi-speaker interaction and
recognition, machine learning, and talking face technology.
Sidner and Dzikovska focus on robot interaction in the general domain of
“hosting”, i.e., where a virtual or physical agent provides guidance, education,
or entertainment based on collaborative goals negotiation and subsequent ac-
tion. A great deal of work remains to be done before robot interaction becomes
natural in any approximate sense of the term. For instance, the robot’s spoken
dialogue capabilities must be strongly improved and so must its embodied ap-
pearance and global communicative behaviours. In fact, Sidner and Dzikovska
reach some of the same conclusions as Bickmore and Cassell, namely that
agents need to become far more human-like in all or most respects before they
are really appreciated by humans.
Darrell et al. address the problem in multi-speaker interaction of knowing
who is addressing the computer when. Their approach is to use a combination
of microphones and computer vision to find out who is talking.
Developers of spoken dialogue applications must cope with problems result-
ing from vocabulary and grammar limitations and from difficulties in enabling
much of the flexibility and functionality inherent in human-human communi-
cation. Despite having carried out systematic testing, the developer often finds
that words are missing when a new user addresses the application. Dusan and
Flanagan propose machine learning as a way to overcome part of this problem.
Using machine learning, the system can learn new words and grammars taught
to it by the user in a well-defined way. Wilks et al. address machine learning -
or transformation-based learning - in the context of assigning dialogue acts as
part of an approach to improved dialogue modelling. In another part of their
approach, Wilks et al. consider the use of dialogue action frames, i.e., a set
of stereotypical dialogue patterns which perhaps may be learned from corpus
data, as a means for flexibly switching back and forth between topics during
dialogue.
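One way to picture how frames could support switching back and forth between topics is a stack of open frames, where a new topic is pushed on top and the earlier topic resumes once the newer one is finished. The sketch below rests on that assumption; the stack discipline, the `Frame` and `FrameStack` names, and the example topics are all hypothetical and are not taken from the chapter.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    topic: str
    steps: list  # remaining steps of a stereotypical dialogue pattern


class FrameStack:
    """Keeps open frames so that an interrupted topic can be resumed."""

    def __init__(self):
        self._stack = []

    def push(self, frame: Frame) -> None:
        self._stack.append(frame)

    def next_step(self):
        """Return (topic, step) for the next step, or None when nothing is left."""
        # Drop frames whose pattern has been completed.
        while self._stack and not self._stack[-1].steps:
            self._stack.pop()
        if not self._stack:
            return None
        frame = self._stack[-1]
        return frame.topic, frame.steps.pop(0)


stack = FrameStack()
stack.push(Frame("book_flight", ["ask_destination", "ask_date", "confirm"]))
trace = [stack.next_step()]

# The user changes topic in the middle of the booking.
stack.push(Frame("weather", ["report_forecast"]))
trace.append(stack.next_step())

# Once the weather frame is exhausted, the booking frame resumes.
trace.append(stack.next_step())
trace.append(stack.next_step())
trace.append(stack.next_step())
```

In this sketch the interrupted booking pattern continues exactly where it left off, which is the kind of flexible topic switching the paragraph above refers to.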
Granström and House and Massaro describe the gain in intelligibility that
can be obtained by combining speech synthesis with a talking face. There is
still much work to do both on synthesis and face articulation. For most lan-
guages, speech synthesis is still not very natural to listen to and if one wants
to develop a particular voice to fit a certain animated character, this is not
immediately possible with today’s technology. With respect to face articula-
tion, faces need to become much more natural in terms of, e.g., gaze, eyebrow
movements, lip and mouth movements, and head movements, as this seems to
influence users’ perception of the interaction, cf. the chapters by Granström
and House and Heylen et al.
3.6 More Advanced Systems
Enabling technologies for NMIE are often component technologies, and
their description, including state of the art, current research challenges, and
unsolved problems, can normally be made in a relatively systematic and fo-
cused manner. It is far more difficult to systematically describe the complexity
of the constant push in research and industry towards exploring and exploiting
new NMIE application types and new application domains, addressing new
user populations, increasing the capabilities of systems in familiar domains of
application, exploring known technologies with new kinds of devices, etc. In
general, the picture is one of pushing present boundaries in all or most direc-
tions. During the past few years, a core trend in NMIE has been to combine
different modalities in order to build more complex, versatile and capable sys-
tems, getting closer to natural interactivity than is possible with only a single
modality. This trend is reflected in several chapters.
Part of the NMIE paradigm is that systems must be available whenever
and wherever convenient and useful, making ubiquitous computing an impor-
tant application domain. Mobile devices, such as mobile phones, PDAs, and
portable computers of any (portable) size have become popular and are rapidly
gaining functionality. However, the interface and interaction affordances of
small devices require careful consideration. Reithinger et al. present some of
those considerations in the context of providing access to large amounts of data
about music.
It can be difficult for users to know how to interact with new NMIE appli-
cations. Although not always very successful in practice, the classical GUI
system has the opportunity to present its affordances in static graphics (in-
cluding text) before the user chooses how to interact. A speech-only system,
by contrast, cannot do that because of the dynamic and transitory nature of
acoustic modalities. NMIE systems, in other words, pose radically new de-
mands on how to support the user prior to, and during, interaction. Addressing
this problem, several chapters mention user modelling or repositories of user
preferences built on the basis of interactions with a system, cf. the chapters by
Chai et al. and Reithinger et al.
Machine learning, although another example of less-than-expected pace of
development during the past 10 years, has great potential for increasing inter-
action support. In an advanced application of machine learning, Dusan and
Flanagan propose to increase the system’s vocabulary and grammar by letting
users teach the system new words and their meaning and use. Wilks et al. use
machine learning as part of an approach to more advanced dialogue modelling.
Increasingly advanced systems require increasingly complex dialogue man-
agement, cf. the chapters by Chai et al., Clark et al., and Wilks et al. Life-
likeness of animated interface agents and conversational dialogue are among
the key challenges in achieving the NMIE vision.
Multilinguality of systems is an important NMIE goal. Multilingual ap-
plications are addressed by Reithinger et al. In their case, the application is
running on a handheld device.
Multi-speaker input speech is mentioned by Darrell et al. For good reason,
recognition of multi-speaker input has become a lively research topic. We
need solutions in order to, e.g., build meeting minute-takers, separate the focal
speaker’s input from that of other speakers, exploit the huge potential of spoken
multi-user applications, etc.
3.7 Make It Easy
Due to the complexity of multimodal natural interaction, it is becoming critically
important to be able to build systems as easily as possible. It seems
likely that no single research lab or development team in industry, even includ-
ing giants such as Microsoft, is able to master all of the enabling technologies
required for broad-scale NMIE progress. To advance efficiently, everybody
needs access to those system components, and their built-in know-how, which
are not in development focus. This implies strong attention to issues, such
as re-usable platforms, components and architectures, development toolkits,
interface languages, data formats, and standardisation.
Clark et al. have used the Open Agent Architecture (OAA,
http://www.ai.sri.com/~oaa/), a framework for integrating heterogeneous software
agents in a distributed environment. What OAA and other architectural
frameworks, such as CORBA (http://www.corba.org/), aim to do is provide
a means for modularisation, synchronous and asynchronous communication,
well-defined inter-module communication via some interface language, such
as IDL (CORBA) or ICL (OAA), and the possibility of implementation in a
distributed environment.
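The goals listed above (modularisation plus synchronous and asynchronous inter-module communication) can be made concrete with a toy, in-process message hub. This is only a sketch under simplifying assumptions: it does not use or imitate the OAA or CORBA APIs, it is not distributed, and the `Hub` class and the message types `recognise` and `log` are hypothetical.

```python
from collections import deque


class Hub:
    """Toy message hub: modules only know the hub, never each other."""

    def __init__(self):
        self._handlers = {}    # message type -> handler callable
        self._queue = deque()  # pending asynchronous messages

    def register(self, msg_type, handler):
        self._handlers[msg_type] = handler

    def request(self, msg_type, payload):
        """Synchronous: call the handler and return its reply."""
        return self._handlers[msg_type](payload)

    def post(self, msg_type, payload):
        """Asynchronous: queue the message and return immediately."""
        self._queue.append((msg_type, payload))

    def dispatch(self):
        """Deliver all queued messages; return the number delivered."""
        delivered = 0
        while self._queue:
            msg_type, payload = self._queue.popleft()
            self._handlers[msg_type](payload)
            delivered += 1
        return delivered


hub = Hub()
log = []
hub.register("recognise", lambda audio: audio.upper())  # stand-in recogniser
hub.register("log", log.append)                         # stand-in logger

reply = hub.request("recognise", "hello")  # synchronous call
hub.post("log", reply)                     # asynchronous notification
log_before_dispatch = list(log)            # nothing delivered yet
delivered = hub.dispatch()
```

Because modules are addressed by message type rather than by reference, one module can be replaced without touching the others, which is the modularisation benefit such frameworks aim for.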
XML (Extensible Markup Language) is a simple, flexible text format de-
rived from SGML (ISO 8879) which has become popular as, among other
things, a message exchange format, cf. Reithinger et al. and Granström and
House. Using XML for wrapping inter-module messages is one way to over-
come the problem of different programming languages used for implementing
different modules.
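As a small illustration of wrapping inter-module messages in XML, the sketch below serialises a message on the sending side and parses it again on the receiving side using Python's standard `xml.etree.ElementTree` module. The element and attribute names (`message`, `from`, `to`, `utterance`, `confidence`) are assumptions for illustration and do not come from any of the systems described in the chapters.

```python
import xml.etree.ElementTree as ET


def wrap(sender, receiver, fields):
    """Serialise a message as an XML string that any module can parse."""
    msg = ET.Element("message", {"from": sender, "to": receiver})
    for name, value in fields.items():
        ET.SubElement(msg, name).text = str(value)
    return ET.tostring(msg, encoding="unicode")


def unwrap(xml_text):
    """Parse an XML message back into sender, receiver, and a field dict."""
    msg = ET.fromstring(xml_text)
    fields = {child.tag: child.text for child in msg}
    return msg.get("from"), msg.get("to"), fields


wire = wrap(
    "recogniser",
    "dialogue_manager",
    {"utterance": "play some jazz", "confidence": 0.87},
)
sender, receiver, fields = unwrap(wire)
```

Note that all field values arrive as text, so the receiving module has to convert them back to numbers or other types itself; this is part of the price of a language-neutral exchange format.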
Some chapters express a need for reusable components. Many of the appli-
cations described include off-the-shelf software, including components devel-
oped in other projects. This is particularly true for mature enabling technolo-
gies, such as speech recognition and synthesis components. As regards multi-
modal dialogue management, there is an expressed need for reuse in, e.g., the
chapter by Clark et al. who discuss a reusable dialogue management architec-
ture in support of multimodal interaction.
In conclusion, there are architectures, platforms, and software components
available which facilitate the building of new NMIE applications, and stan-
dards are underway for certain aspects. There is still much work to be done
on standardisation, new and better platforms, and improvement of component
software. In addition, we need, in particular, more and better toolkits in support
of system development and a better understanding of those components which
cannot be bought off-the-shelf and which are typically difficult to reuse, such
as dialogue managers. Advancements such as these are likely to require sig-
nificant corpus work. Corpora with tools and annotation schemes as described
by Martell are exactly what is needed in this context.
3.8 Evaluate
Software systems and components evaluation is a broad area, ranging from
technical evaluation over usability evaluation to customer evaluation. Cus-
tomer evaluation has never been a key issue in research but has, rather, tended
to be left to the marketing departments of companies. Technical evaluation
and usability evaluation, including evaluation of functionality from both per-
spectives, are, on the other hand, considered important research issues. The
chapters show a clear trend towards focusing on usability evaluation and com-
parative performance evaluation.
It is hardly surprising that performance evaluation and usability issues are
considered key topics today. We know little about what happens when we
move towards increasingly multimodal and natural interactive systems, both
as regards how these new systems will perform compared to alternative solu-
tions and how the systems will be received and perceived by their users. We
only know that a technically optimal system is not sufficient to guarantee user
satisfaction.
Comparative performance evaluation objectively compares users’ perfor-
mance on different systems with respect to, e.g., how well they understand
speech-only versus speech combined with a talking face or with an embodied
animated agent, cf. Granström and House, Massaro, and Bickmore and Cas-
sell. The usability issues evaluated all relate to users’ perception of a particular
system and include parameters, such as life-likeness, credibility, reliability, ef-
ficiency, personality, ease of use, and understanding quality, cf. Heylen et al.
and Bickmore and Cassell.
Two chapters address how the intelligibility of what is being said can be
increased through visual articulation, cf. Granström and House and Massaro.
Granström and House have used a talking head in several applications, in-
cluding tourist information, real estate (apartment) search, aid for the hearing
impaired, education, and infotainment. Evaluation shows a significant gain
in intelligibility for the hearing impaired. Eyebrow and head movement en-
hance perception of emphasis and syllable prominence. Over-articulation may
be useful as well when there are special needs for intelligibility. The findings
of Massaro support these promising conclusions. His focus is on applications
for the hard-of-hearing, children with autism, and child language learning more
generally. Granström and House also address the increase in efficiency of com-
munication/interaction produced by using an animated talking head. Probably,
naturalness is a key point here. This is suggested by Heylen et al. who made
controlled experiments on the effects of different eye gaze behaviours of a
cartoon-like talking face on the quality of human-agent dialogues. The most
human-like agent gaze behaviour led to higher appreciation of the agent and
more efficient task performance.
Bickmore and Cassell evaluate the effects on communication of an embod-
ied conversational real-estate agent versus an over-the-phone version of the
same system, cf. also [Cassell et al., 2000]. In each condition, two variations
of the system were available. One would be fully task-oriented while the second
version would include some small-talk options. In general, users liked the
system better in the phone condition. In the phone condition, subjects appre-
ciated the small-talk whereas, in the embodied condition, subjects wanted to
get down to business. The implication is that agent embodiment has strong
effects on the interlocutors. Users tend to compare their animated interlocu-
tors with humans rather than machines. To work with users, animated agents
need considerably more naturalness and personally attractive features commu-
nicated non-verbally. This imposes a tall research agenda on both speech and
non-verbal output, requiring conversational abilities both verbally and non-
verbally.
Jointly, the chapters on evaluation demonstrate a broad need for perfor-
mance evaluation, comparative as well as non-comparative, that can inform
us on the possible benefits and shortcomings of new natural interactive and
multimodal systems. The chapters show a similar need for usability evaluation
that can help us find out how users perceive these new systems, and a need for
finding ways in which usability and user satisfaction might be correlated with
technical aspects in order for the former to be derived from the latter.
4. Multimodality and Natural Interactivity
Conceptually, NMIE combines natural interactive and multimodal systems
and components engineering. While both concepts, natural interactivity and
multimodality, have a long history, it would seem that they continue to sit
somewhat uneasily side by side in the minds of most of us. Multimodality
is the idea of being able to choose any input/output modality or combination
of input/output modalities for optimising interaction with the application at
hand, such as speech input for many heads-up, hands-occupied applications,
speech and haptic input/output for applications for the blind, etc. A modality
is a particular way of representing input or output information in some physical
medium, such as something touchable, light, sound, or the chemistry for
producing olfaction and gustation [Bernsen, 2002], see also the chapter by
Carbonell and Kieffer. The physical medium of the speech modalities, for
instance, is sound or acoustics but this medium obviously enables the trans-
mission of information in many acoustic modalities other than speech, such
as earcons, music, etc. The term multimodality thus refers to any possible
combination of elementary or unimodal modalities.
Compared to multimodality, the notion of natural interactivity appears to
be the more focused of the two. This is because natural interactivity comes
with a focused vision of the future of interaction with computer systems as
well as a relatively well-defined set of modalities required for the vision to be-
come reality. The natural interactivity vision is that of humans communicating
with computer systems in the same ways in which humans communicate with
one another. Thus, natural interactivity specifically emphasises human-system
communication involving the following input/output modalities used in situ-
ated human-human communication: speech, gesture, gaze, facial expression,
head and body posture, and object manipulation as an integral part of the
communication (or dialogue). As the objects being manipulated may themselves
represent information, such as text and graphics input/output objects, natural
interaction subsumes the GUI paradigm. Technologically, the natural inter-
activity vision is being pursued vigorously by, among others, the emerging
research community in talking faces and embodied conversational agents as
illustrated in the chapters by Bickmore and Cassell, Granström and House,
Heylen et al., and Massaro. An embodied conversational agent may be either
virtual or a robot, cf. the chapter by Sidner and Dzikovska.
A weakness in our current understanding of natural interactivity is that it
is not quite clear where to draw the boundary between the natural interactiv-
ity modalities and all those other modalities and modality combinations which
could potentially be of benefit to human-system interaction. For instance, isn’t
pushing a button on the mouse or otherwise, although never used in human-
human communication for the simple reason that humans do not have com-
municative buttons on them, as natural as speaking? If it is, then, perhaps, all
or most research on useful multimodal input/output modality combinations is
also research into natural interactivity even if the modalities addressed are not
being used in human-human communication? In addition to illustrating the
need for more and better NMIE theory, the point just made may explain the
uneasy conceptual relationship between the two paradigms of natural interactivity
and multimodality. In any case, we have decided to combine the paradigms
and address them together as natural and multimodal interactivity engineering.
Finally, by NMI ‘engineering’ we primarily refer to software engineering. It
follows that the expression ‘natural and multimodal interactivity engineering’
primarily represents the idea of creating a specialised branch of software en-
gineering for the field addressed in this book. It is important to add, however,
that NMIE enabling technologies are being developed in fields whose practi-
tioners do not tend to regard themselves as doing software engineering, such
as signal processing. For instance, the recently launched European Network of
Excellence SIMILAR (http://www.similar.cc) addresses signal processing for
natural and multimodal interaction.
4.1 Modalities Investigated in This Book
We argued above that multimodality includes all possible modalities for the
representation and exchange of information among humans and between hu-
mans and computer systems, and that natural interactivity includes a rather
vaguely defined, large sub-set of those modalities. Within this wide space of
unimodal modalities and modality combinations, it may be useful to look at the
modalities actually addressed in the following chapters. These are summarised
by chapter in Table 1.3.
Table 1.3. Modalities addressed in the included chapters (listed by chapter number plus first listed author).

| Chapter | Input | Output |
|---|---|---|
| 2. Bickmore | speech, gesture (via camera) vs. speech-only | embodied conversational agent + images vs. speech-only + images |
| 3. Sidner | speech, mouse clicks (new version includes face and gesture input via camera) | robot pointing and beat gestures, speech, gaze |
| 4. Martell | gesture | N/A |
| 5. Corradini | speech, gesture, object manipulation/manipulative gesture | video game |
| 6. Healey | N/A | N/A |
| 7. Carbonell | gesture (mouse) | speech, graphics |
| 8. Darrell | speech, camera-based graphics | N/A |
| 9. Massaro | mouse, speech | audio-visual speech synthesis, talking head, images, text |
| 10. Granström | N/A | audio-visual speech synthesis, talking head |
| 11. Heylen | typed text | talking head, gaze |
| 12. Chai | speech, text, gesture (pointing) | speech, graphics |
| 13. Clark | speech, gesture (pointing) | speech, text, graphics |
| 14. Reithinger | speech, haptic buttons | music, speech, text, graphics, tactile rhythm |
| 15. Dusan | speech, keyboard, mouse, pen-based drawing and pointing, camera | speech, graphics, text display |
| 16. Wilks | speech and possibly other modalities | speech and possibly other modalities. Focus is on dialogue modelling so input/output modalities are not discussed in detail |
Combined speech input/output which, in fact, means spoken dialogue al-
most throughout, is addressed in about half of the chapters (Bickmore and
Cassell, Chai et al., Clark et al., Corradini and Cohen, Dusan and Flanagan,
Reithinger et al., and Sidner and Dzikovska). Almost two thirds of the chapters
address gesture input in some form (Bickmore and Cassell, Chai et al., Clark
et al., Corradini and Cohen, Darrell et al., Dusan and Flanagan, Martell, Rei-
thinger et al., and Sidner and Dzikovska). Five chapters address output modal-
ities involving talking heads, embodied animated agents, or robots (Bickmore
and Cassell, Granström and House, Heylen et al., Massaro, and Sidner and
Dzikovska). Three chapters (Darrell et al., Bickmore and Cassell, and Sidner
and Dzikovska) address computer vision. Dusan and Flanagan also mention
that their system has camera-based input. Facial expression of emotion is ad-
dressed by Granström and House. Despite its richness and key role in natural
interactivity, input or output speech prosody is hardly discussed. Granström
and House discuss graphical ways of replacing missing output speech prosody
by facial expression means.
In general, the input and output modalities and their combinations discussed
would appear representative of the state-of-the-art in NMIE. The authors make
quite clear how far we are from mastering the very large number of potentially
useful unimodal “compounds” theoretically, in input recognition, in output
generation, as well as in understanding and generation.
References
Bernsen, N. O. (2002). Multimodality in Language and Speech Systems - From
Theory to Design Support Tool. In Granström, B., House, D., and Karlsson,
I., editors, Multimodality in Language and Speech Systems, pages 93–148.
Dordrecht: Kluwer Academic Publishers.
Bernsen, N. O., Dybkjær, H., and Dybkjær, L. (1998). Designing Interactive
Speech Systems. From First Ideas to User Testing. Springer Verlag.
Cassell, J., Bickmore, T., Campbell, L., Vilhjálmsson, H., and Yan, H. (2000).
Human Conversation as a System Framework: Designing Embodied Conversational
Agents. In Embodied Conversational Agents, pages 29–63. Cambridge,
MA: MIT Press.
Dehn, D. and van Mulken, S. (2000). The Impact of Animated Interface Agents:
A Review of Empirical Research. International Journal of Human-Computer
Studies, 52:1–22.

PART I
MAKING DIALOGUES MORE NATURAL:
EMPIRICAL WORK AND APPLIED THEORY

Chapter 2
SOCIAL DIALOGUE WITH EMBODIED
CONVERSATIONAL AGENTS
Timothy Bickmore
Northeastern University, USA
Justine Cassell
Northwestern University, USA
Abstract: The functions of social dialogue between people in the context of performing a
task are discussed, as well as approaches to modelling such dialogue in embodied
conversational agents. A study of an agent’s use of social dialogue is presented,
comparing embodied interactions with similar interactions conducted over the
phone, assessing the impact these media have on a wide range of behavioural,
task and subjective measures. Results indicate that subjects’ perceptions of the
agent are sensitive to both interaction style (social vs. task-only dialogue) and
medium.
Keywords: Embodied conversational agent, social dialogue, trust.
1. Introduction
Human-human dialogue does not just comprise statements about the task at
hand, about the joint and separate goals of the interlocutors, and about their
plans. In human-human conversation participants often engage in talk that,
on the surface, does not seem to move the dialogue forward at all. However,
this talk – about the weather, current events, and many other topics without
significant overt relationship to the task at hand – may, in fact, be essential
to how humans obtain information about one another’s goals and plans and
decide whether collaborative work is worth engaging in at all. For example,
realtors use small talk to gather information to form stereotypes (a collection
of frequently co-occurring characteristics) of their clients – people who drive
minivans are more likely to have children, and therefore to be searching for
larger homes in neighbourhoods with good schools. Realtors – and salespeople
in general – also use small talk to increase intimacy with their clients, to
establish their own expertise, and to manage how and when they present
information to the client [Prus, 1989].
© 2005 Springer. Printed in the Netherlands.
J.C.J. van Kuppevelt et al. (eds.), Advances in Natural Multimodal Dialogue Systems, 23–54.
Nonverbal behaviour plays an especially important role in such social di-
alogue, as evidenced by the fact that most important business meetings are
still conducted face-to-face rather than on the phone. This intuition is backed
up by empirical research; several studies have found that the additional nonverbal
cues provided by video-mediated communication do not affect performance
in task-oriented interactions, but in interactions of a more social nature,
such as getting acquainted or negotiation, video is superior [Whittaker and
O’Conaill, 1997]. These studies have found that for social tasks, interactions
were more personalized, less argumentative and more polite when conducted
via video-mediated communication, that participants believed video-mediated
(and face-to-face) communication was superior, and that groups conversing us-
ing video-mediated communication tended to like each other more, compared
to audio-only interactions.
Together, these findings indicate that if we are to develop computer agents
capable of performing as well as humans on tasks such as real estate sales then,
in addition to task goals such as reliable and efficient information delivery, they
must have the appropriate social competencies designed into them. Further,
since these competencies include the use of nonverbal behaviour for convey-
ing communicative and social cues, then our agents must have the capability
of producing and recognizing nonverbal cues in simulations of face-to-face in-
teractions. We call agents with such capabilities “Embodied Conversational
Agents” or “ECAs.”
The current chapter extends previous work which demonstrated that social
dialogue can have a significant impact on a user’s trust of an ECA [Bickmore
and Cassell, 2001], by investigating whether these results hold in the absence
of nonverbal cues. We present the results of a study designed to determine
whether the psychological effects of social dialogue – namely to increase trust
and associated positive evaluations – vary when the nonverbal cues provided
by the embodied conversational agent are removed. In addition to varying
medium (voice only vs. embodied) and dialogue style (social dialogue vs. task-
only) we also assessed and examined effects due to the user’s personality along
the introversion/extroversion dimension, since extroversion is one indicator of
a person’s comfort level with face-to-face interaction.
2. Embodied Conversational Agents
Embodied conversational agents are animated anthropomorphic interface
agents that are able to engage a user in real-time, multimodal dialogue, using
speech, gesture, gaze, posture, intonation, and other verbal and nonverbal be-
haviours to emulate the experience of human face-to-face interaction [Cassell
et al., 2000c]. The nonverbal channels are important not only for convey-
ing information (redundantly or complementarily with respect to the speech
channel), but also for regulating the flow of the conversation. The nonverbal
channel is especially crucial for social dialogue, since it can be used to provide
such social cues as attentiveness, positive affect, and liking and attraction, and
to mark shifts into and out of social activities [Argyle, 1988].
2.1 Functions versus Behaviours
Embodiment provides the possibility for a wide range of behaviours that,
when executed in tight synchronization with language, carry out a commu-
nicative function. It is important to understand that particular behaviours, such
as the raising of the eyebrows, can be employed in a variety of circumstances
to produce different communicative effects, and that the same communica-
tive function may be realized through different sets of behaviours. It is there-
fore clear that any system dealing with conversational modelling has to han-
dle function separately from surface-form or run the risk of being inflexible
and insensitive to the natural phases of the conversation. Here we briefly de-
scribe some of the fundamental communication categories and their functional
sub-parts, along with examples of nonverbal behaviour that contribute to their
successful implementation. Table 2.1 shows examples of mappings from com-
municative function to particular behaviours and is based on previous research
on typical North American nonverbal displays, mainly [Chovil, 1991; Duncan,
1974; Kendon, 1980].
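The separation of function from surface form can be sketched as a simple lookup structure. The sketch below is only illustrative: it uses the rows of Table 2.1 as data, while the dictionary layout and the helper names (`behaviours_for`, `functions_using`) are assumptions made here, not part of any system described in this chapter.

```python
# Illustrative mapping from communicative function to behaviours,
# populated from the rows of Table 2.1. The structure and function
# names are assumptions for this sketch.
FUNCTION_TO_BEHAVIOURS = {
    "reacting": ["short glance"],
    "inviting contact": ["sustained glance", "smile"],
    "distance salutation": ["looking", "head toss/nod", "raise eyebrows", "wave", "smile"],
    "close salutation": ["looking", "head nod", "embrace or handshake", "smile"],
    "break away": ["glance around"],
    "farewell": ["looking", "head nod", "wave"],
    "give turn": ["looking", "raise eyebrows"],  # followed by silence
    "wanting turn": ["raise hands into gesture space"],
    "take turn": ["glance away", "start talking"],
    "request feedback": ["looking", "raise eyebrows"],
    "give feedback": ["looking", "head nod"],
}


def behaviours_for(function):
    """Return the behaviours that can realize one communicative function."""
    return FUNCTION_TO_BEHAVIOURS[function]


def functions_using(behaviour):
    """Return every function (sorted by name) that a behaviour can serve."""
    return sorted(
        function
        for function, behaviours in FUNCTION_TO_BEHAVIOURS.items()
        if behaviour in behaviours
    )


if __name__ == "__main__":
    # One behaviour, several functions: surface form alone is ambiguous.
    print(functions_using("raise eyebrows"))
    # One function, several behaviours.
    print(behaviours_for("farewell"))
```

Looking up a behaviour such as raised eyebrows returns several functions, which is the reason a system cannot interpret or generate behaviour from surface form alone.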
Conversation initiation and termination. Humans partake in an elaborate
ritual when engaging and disengaging in conversations [Kendon, 1980]. For
example, people will show their readiness to engage in a conversation by turn-
ing towards the potential interlocutors, gazing at them and then exchanging
signs of mutual recognition typically involving a smile, eyebrow movement
and tossing the head or waving of the arm. Following this initial synchroniza-
tion stage, or distance salutation, the two people approach each other, seal-
ing their commitment to the conversation through a close salutation such as a
handshake accompanied by a ritualistic verbal exchange. The greeting phase
ends when the two participants re-orient their bodies, moving away from a
face-on orientation to stand at an angle. Terminating a conversation similarly
moves through stages, starting with non-verbal cues, such as orientation shifts
or glances away, and culminating in the verbal exchange of farewells and the
breaking of mutual gaze.
Table 2.1. Some examples of conversational functions and their behaviour realization [Cassell et al., 2000b].
Communicative Functions | Communicative Behaviour
Initiation and termination
Reacting | Short Glance
Inviting Contact | Sustained Glance, Smile
Distance Salutation | Looking, Head Toss/Nod, Raise Eyebrows, Wave, Smile
Close Salutation | Looking, Head Nod, Embrace or Handshake, Smile
Break Away | Glance Around
Farewell | Looking, Head Nod, Wave
Turn-Taking
Give Turn | Looking, Raise Eyebrows (followed by silence)
Wanting Turn | Raise Hands into gesture space
Take Turn | Glance Away, Start talking
Feedback
Request Feedback | Looking, Raise Eyebrows
Give Feedback | Looking, Head Nod
Conversational turn-taking and interruption. Interlocutors do not normally
talk at the same time, thus imposing a turn-taking sequence on the conversation.
The protocols involved in floor management – determining whose turn
it is and when the turn should be given to the listener – involve many factors
including gaze and intonation [Duncan, 1974]. In addition, listeners can inter-
rupt a speaker not only with voice, but also by gesturing to indicate that they
want the turn.
Content elaboration and emphasis. Gestures can convey information about
the content of the conversation in ways for which the hands are uniquely suited.
For example, the two hands can better indicate simultaneity and spatial re-
lationships than the voice or other channels. Probably the most commonly
thought of use of the body in conversation is the pointing (deictic) gesture, pos-
sibly accounting for the fact that it is also the most commonly implemented for
the bodies of animated interface agents. In fact, however, most conversations
don’t involve many deictic gestures [McNeill, 1992] unless the interlocutors
are discussing a shared task that is currently present. Other conversational
gestures also convey semantic and pragmatic information. Beat gestures are
small, rhythmic baton-like movements of the hands that do not change in form
with the content of the accompanying speech. They serve a pragmatic function,
conveying information about what is “new” in the speaker’s discourse.
Iconic and metaphoric gestures convey some features of the action or event be-
ing described. They can be redundant or complementary relative to the speech
channel, and thus can convey additional information or provide robustness or
emphasis with respect to what is being said. Whereas iconics convey infor-
mation about spatial relationships or concepts, metaphorics represent concepts
which have no physical form, such as a sweeping gesture accompanying “the
property title is free and clear.”
Feedback and error correction. During conversation, speakers can nonverbally
request feedback from listeners through gaze and raised eyebrows and
listeners can provide feedback through head nods and paraverbals (“uh-huh”,
“mmm”, etc.) if the speaker is understood, or a confused facial expression or
lack of positive feedback if not. The listener can also ask clarifying questions
if they did not hear or understand something the speaker said.
2.2 Interactional versus Propositional Behaviours
The mapping from form (behaviour) to conversational function relies on a
fundamental division of conversational goals: contributions to a conversation
can be propositional and interactional. Propositional information corresponds
to the content of the conversation. This includes meaningful speech as well as
hand gestures and intonation used to complement or elaborate upon the speech
content (gestures that indicate the size in the sentence “it was this big” or ris-
ing intonation that indicates a question with the sentence “you went to the
store”). Interactional information consists of the cues that regulate conversa-
tional process and includes a range of nonverbal behaviours (quick head nods
to indicate that one is following) as well as regulatory speech (“huh?”, “Uh-
huh”). This theoretical stance allows us to examine the role of embodiment not
just in task- but also process-related behaviours such as social dialogue [Cas-
sell et al., 2000b].
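The propositional/interactional division can be illustrated by tagging individual contributions with the goal they serve. This is a minimal sketch using the examples given in the paragraph above; the class and field names (`Goal`, `Contribution`, `by_goal`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Goal(Enum):
    """The two kinds of conversational goal distinguished in the text."""
    PROPOSITIONAL = "propositional"   # content of the conversation
    INTERACTIONAL = "interactional"   # regulation of the conversational process


@dataclass(frozen=True)
class Contribution:
    """One verbal or nonverbal contribution (hypothetical structure)."""
    modality: str
    form: str
    goal: Goal


# Examples taken from the surrounding paragraph.
EXAMPLES = [
    Contribution("gesture", "hands showing size with 'it was this big'", Goal.PROPOSITIONAL),
    Contribution("intonation", "rising pitch on 'you went to the store'", Goal.PROPOSITIONAL),
    Contribution("head movement", "quick nod to show one is following", Goal.INTERACTIONAL),
    Contribution("speech", "'huh?'", Goal.INTERACTIONAL),
]


def by_goal(contributions, goal):
    """Select the contributions that serve the given goal."""
    return [c for c in contributions if c.goal is goal]


if __name__ == "__main__":
    for c in by_goal(EXAMPLES, Goal.INTERACTIONAL):
        print(c.modality, "->", c.form)
```

Note that the same modality can appear under either goal, which is why the tag is attached to the contribution and not to the channel.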
2.3 REA
Our platform for conducting research into embodied conversational agents
is the REA system, developed in the Gesture and Narrative Language Group at
the MIT Media Lab [Cassell et al., 2000a]. REA is an embodied, multi-modal
real-time conversational interface agent which implements the conversational
protocols described above in order to make interactions as natural as face-to-
face conversation with another person. In the current task domain, REA acts
as a real estate salesperson, answering user questions about properties in her
database and showing users around the virtual houses (Figure 2.1).
Figure 2.1. User interacting with REA.
REA has a fully articulated graphical body, can sense the user passively
through cameras and audio input, and is capable of speech with intonation, fa-
cial display, and gestural output. The system currently consists of a large pro-
jection screen on which REA is displayed and which the user stands in front
of. Two cameras mounted on top of the projection screen track the user’s head
and hand positions in space. Users wear a microphone for capturing speech
input. A single SGI Octane computer runs the graphics and conversation en-
gine of REA, while several other computers manage the speech recognition
and generation and image processing.
REA is able to conduct a conversation describing the features of the task
domain while also responding to the users’ verbal and non-verbal input. When
the user makes cues typically associated with turn-taking behaviour such as
gesturing, REA allows herself to be interrupted, and then takes the turn again
when she is able. She is able to initiate conversational error correction when
she misunderstands what the user says, and can generate combined voice, fa-
cial expression and gestural output. REA’s responses are generated by an incre-
mental natural language generation engine based on [Stone and Doran, 1997]
that has been extended to synthesize redundant and complementary gestures
synchronized with speech output [Cassell et al., 2000b]. A simple discourse
model is used for determining which speech acts users are engaging in, and
resolving and generating anaphoric references.
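The interruption behaviour described above (the agent yields the floor when the user gestures or speaks, and takes the turn again later) can be pictured as a small state machine. The sketch below is not REA's implementation; the event names (`user_gesture`, `user_speech`, `user_gives_turn`) and the two-state floor model are assumptions made purely for illustration.

```python
from enum import Enum, auto


class Floor(Enum):
    """Who currently holds the conversational floor."""
    AGENT = auto()
    USER = auto()


# (current floor holder, observed event) -> next floor holder.
# Event names are hypothetical labels for the cues mentioned in the text.
TRANSITIONS = {
    (Floor.AGENT, "user_gesture"): Floor.USER,    # agent allows the interruption
    (Floor.AGENT, "user_speech"): Floor.USER,
    (Floor.USER, "user_gives_turn"): Floor.AGENT,  # agent takes the turn again
}


def next_floor(floor, event):
    """Return the floor holder after an event; unknown events change nothing."""
    return TRANSITIONS.get((floor, event), floor)


def run(events, floor=Floor.AGENT):
    """Replay a sequence of events and return the floor holder after each step."""
    history = [floor]
    for event in events:
        floor = next_floor(floor, event)
        history.append(floor)
    return history


if __name__ == "__main__":
    for state in run(["user_gesture", "user_speech", "user_gives_turn"]):
        print(state.name)
```

A table of transitions keeps the floor-management policy separate from the recognition of the cues themselves, in line with the function/behaviour separation argued for in Section 2.1.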
3. Social Dialogue
Social dialogue is talk in which interpersonal goals are foregrounded and
task goals – if existent – are backgrounded. One of the most familiar contexts
in which social dialogue occurs is in human social encounters between individ-
uals who have never met or are unfamiliar with each other. In these situations
conversation is usually initiated by “small talk” in which “light” conversation
is made about neutral topics (e.g., weather, aspects of the interlocutor’s physi-
cal environment) or in which personal experiences, preferences, and opinions
are shared [Laver, 1981]. Even in business or sales meetings, it is customary
(at least in American culture) to begin with some amount of small talk before
“getting down to business”.
3.1 The Functions of Social Dialogue
The purpose of small talk is primarily to build rapport and trust among
the interlocutors, provide time for them to “size each other up”, establish an
interactional style, and to allow them to establish their reputations [Dunbar,
1996]. Although small talk is most noticeable at the margins of conversational
encounters, it can be used at various points in the interaction to continue to
build rapport and trust [Cheepen, 1988], and in real estate sales, a good agent
will continue to focus on building rapport throughout the relationship with a
buyer [Garros, 1999].
Small talk has received sporadic treatment in the linguistics literature, start-
ing with the seminal work of Malinowski who defined “phatic communion”
as “a type of speech in which ties of union are created by a mere exchange
of words”. Small talk is the language used in free, aimless social intercourse,
which occurs when people are relaxing or when they are accompanying “some
manual work by gossip quite unconnected with what they are doing” [Malinowski,
1923]. Jakobson also included a “phatic function” in his well-known
conduit model of communication, that function being focused on the regulation
of the conduit itself (as opposed to the message, sender, or receiver) [Jakob-
son, 1960]. More recent work has further characterized small talk by de-
scribing the contexts in which it occurs, topics typically used, and even gram-
mars which define its surface form in certain domains [Cheepen, 1988; Laver,
1975; Schneider, 1988]. In addition, degree of “phaticity” has been proposed
as a persistent goal which governs the degree of politeness in all utterances a
speaker makes, including task-oriented ones [Coupland et al., 1992].
3.2 The Relationship between Social Dialogue and Trust
Figure 2.2 outlines the relationship between small talk and trust. REA’s
dialogue planner represents the relationship between her and the user using
a multi-dimensional model of interpersonal relationship based on [Svennevig,
1999]:
familiarity describes the way in which relationships develop through the
reciprocal exchange of information, beginning with relatively non-
intimate topics and gradually progressing to more personal and private
topics. The growth of a relationship can be represented in both the
breadth (number of topics) and depth (public to private) of information
disclosed [Altman and Taylor, 1973].
solidarity is defined as “like-mindedness” or having similar behaviour dispositions
(e.g., similar political membership, family, religions, profession,
gender, etc.), and is very similar to the notion of social distance used by
Brown and Levinson in their theory of politeness [Brown and Levinson,
1978]. There is a correlation between frequency of contact and soli-
darity, but it is not necessarily a causal relation [Brown and Levinson,
1978; Brown and Gilman, 1972].
affect represents the degree of liking the interactants have for each other, and
there is evidence that this is an independent relational attribute from the
above three [Brown and Gilman, 1989].
The mechanisms by which small talk is hypothesized to affect trust include
facework, coordination, building common ground, and reciprocal appreciation.
Facework. The notion of “face” is “the positive social value a person effectively
claims for himself by the social role others assume he has taken during a
particular contact” [Goffman, 1967]. Interactants maintain face by having their
social role accepted and acknowledged. Events which are incompatible with
their line are “face threats” and are mitigated by various corrective measures if
they are not to lose face. Small talk avoids face threat (and therefore maintains
solidarity) by keeping conversation at a safe level of depth.
Coordination. The process of interacting with a user in a fluid and natural
manner may increase the user’s liking of the agent, and user’s positive affect,
since the simple act of coordination with another appears to be deeply gratify-
ing. “Friends are a major source of joy, partly because of the enjoyable things
they do together, and the reason that they are enjoyable is perhaps the coordination.”
[Argyle, 1990]. Small talk increases coordination between the two
participants by allowing them to synchronize short units of talk and nonverbal
acknowledgement (and therefore leads to increased liking and positive affect).
Figure 2.2. How small talk affects trust [Cassell and Bickmore, 2003].
Building common ground. Information which is known by all interactants
to be shared (mutual knowledge) is said to be in the “common ground” [Clark,
1996]. The principal way for information to move into the common ground is
via face-to-face communication, since all interactants can observe the recogni-
tion and acknowledgment that the information is in fact mutually shared. One
strategy for effecting changes to the familiarity dimension of the relationship
model is for speakers to disclose personal information about themselves – mov-
ing it into the common ground – and induce the listener to do the same. An-
other strategy is to talk about topics that are obviously in the common ground
– such as the weather, physical surroundings, and other topics available in the
immediate context of utterance. Small talk establishes common ground (and
therefore increases familiarity) by discussing topics that are clearly in the
context of utterance.
Reciprocal appreciation. In small talk, demonstrating appreciation for and
agreement with the contributions of one’s interlocutor is obligatory. Perform-
ing this aspect of the small talk ritual increases solidarity by showing mutual
agreement on the topics discussed.
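The hypothesized links between the four small talk mechanisms and the relational dimensions can be summarised in a short sketch. It is not the REA dialogue planner: the numeric scale from 0 to 1, the step size of 0.25 and all names (`Relationship`, `EFFECTS`, `apply_mechanism`) are assumptions introduced here for illustration. Only the pairing of mechanism and dimension is taken from the text above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Relationship:
    """Relational dimensions named in the text, on an assumed 0..1 scale."""
    familiarity: float = 0.0
    solidarity: float = 0.0
    affect: float = 0.0


# mechanism -> (dimension acted on, assumed step size).
# Facework is described as maintaining solidarity, so its step is zero.
EFFECTS = {
    "facework": ("solidarity", 0.0),
    "coordination": ("affect", 0.25),
    "common_ground": ("familiarity", 0.25),
    "reciprocal_appreciation": ("solidarity", 0.25),
}


def apply_mechanism(rel, mechanism):
    """Return a new Relationship after one small talk move of the given kind."""
    dimension, step = EFFECTS[mechanism]
    new_value = min(1.0, getattr(rel, dimension) + step)  # cap at the scale maximum
    return replace(rel, **{dimension: new_value})


if __name__ == "__main__":
    rel = Relationship()
    for move in ["facework", "coordination", "common_ground", "reciprocal_appreciation"]:
        rel = apply_mechanism(rel, move)
    print(rel)
```

Keeping the mechanism-to-dimension table separate from the update rule makes it easy to revise either the hypothesized links or the assumed magnitudes independently.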

3.3 Nonverbal Behaviour in Social Dialogue
According to Argyle, nonverbal behaviour is used to express emotions, to
communicate interpersonal attitudes, to accompany and support speech, for
self presentation, and to engage in rituals such as greetings [Argyle, 1988].
Of these, coverbal and emotional display behaviours have received the most
attention in the literature on embodied conversational agents and facial and
character animation in general, e.g. [Cassell et al., 2000c]. Next to these, the
most important use of nonverbal behaviour in social dialogue is the display
of interpersonal attitude [Argyle, 1988]. The display of positive or negative
attitude can greatly influence whether we approach someone or not and our
initial perceptions of them if we do.
The most consistent finding in this area is that the use of nonverbal “im-
mediacy behaviours” – close conversational distance, direct body and facial
orientation, forward lean, increased and direct gaze, smiling, pleasant facial
expressions and facial animation in general, nodding, frequent gesturing and
postural openness – projects liking for the other and engagement in the interac-
tion, and is correlated with increased solidarity [Argyle, 1988; Richmond and
McCroskey, 1995].
Other nonverbal aspects of “warmth” include kinesic behaviours such as
head tilts, bodily relaxation, lack of random movement, open body positions,
and postural mirroring, and vocalic behaviours such as more variation in pitch,
amplitude, duration and tempo, reinforcing interjections such as “uh-huh” and
“mm-hmmm”, greater fluency, warmth, pleasantness, expressiveness, and clar-
ity and smoother turn-taking [Andersen and Guerrero, 1998].
In summary, nonverbal behaviour plays an important role in all face-to-face
interaction – both conveying redundant and complementary propositional information
(with respect to speech) and regulating the structure of the interaction.
In social dialogue, however, it provides the additional, and crucial, function
of conveying attitudinal information about the nature of the relationship
between the interactants.
4. Related Work
4.1 Related Work on Embodied Conversational Agents
Work on the development of ECAs, as a distinct field of development, is best
summarized in [Cassell et al., 2000c]. The current study is based on the REA
ECA (see Figure 2.1), a simulated real estate agent, who uses vision-based
gesture recognition, speech recognition, discourse planning, sentence and ges-
ture planning, speech synthesis and animation of a 3D body [Cassell et al.,
1999]. Some of the other major systems developed to date are Steve [Rickel
and Johnson, 1998], the DFKI Persona [André et al., 1996], Olga [Beskow and
McGlashan, 1997], and pedagogical agents developed by Lester et al. [1999].
Sidner and Dzikovska [2005] report progress on a robotic ECA that performs
hosting activities, with a special emphasis on “engagement” – an interactional
behaviour whose purpose is to establish and maintain the connection between
interlocutors during a conversation. These systems vary in their linguistic gen-
erativity, input modalities, and task domains, but all aim to engage the user in
natural, embodied conversation.
Little work has been done on modelling social dialogue with ECAs. The Au-
gust system is an ECA kiosk designed to give information about local restau-
rants and other facilities. In an experiment to characterize the kinds of things
that people would say to such an agent, over 10,000 utterances from over 2,500
users were collected. It was found that most people tried to socialize with the
agent, with approximately 1/3 of all recorded utterances classified as social in
nature [Gustafson et al., 1999].
4.2 Related Studies on Embodied Conversational Agents
Koda and Maes [1996] and Takeuchi and Naito [1995] studied interfaces
with static or animated faces, and found that users rated them to be more en-
gaging and entertaining than functionally equivalent interfaces without a face.
Kiesler and Sproull [1997] found that users were more likely to be cooperative
with an interface agent when it had a human face (vs. a dog or cartoon dog).
André, Rist and Muller found that users rated their animated presentation
agent (“PPP Persona”) as more entertaining and helpful than an equivalent
interface without the agent [André et al., 1998]. However, there was no differ-
ence in actual performance (comprehension and recall of presented material)
in interfaces with the agent vs. interfaces without it. In another study involv-
ing this agent, van Mulken, André and Muller found that when the quality of
advice provided by an agent was high, subjects actually reported trusting a
text-based agent more than either their ECA or a video-based agent (when the
quality of advice was low there were no significant differences in trust ratings
between agents) [van Mulken et al., 1999].
In a user study of the Gandalf system [Cassell et al., 1999], users rated
the smoothness of the interaction and the agent’s language skills significantly
higher under test conditions in which Gandalf utilized limited conversational
behaviour (gaze, turn-taking and beat gesture) than when these behaviours
were disabled.
In terms of social behaviours, Sproull et al. [1997] showed that subjects
rated a female embodied interface significantly lower in sociability and gave it
a significantly more negative social evaluation compared to a text-only inter-
face. Subjects also reported being less relaxed and assured when interacting
with the embodied interface than when interacting with the text interface.
Finally, they gave themselves significantly higher scores on social desirability
scales, but disclosed less (wrote significantly less and skipped more questions
in response to queries by the interface) when interacting with an embodied
interface vs. a text-only interface. Men were found to disclose more in the
embodied condition and women disclosed more in the text-only condition.
Most of these evaluations have tried to address whether embodiment of a
system is useful at all, by including or not including an animated figure. In their
survey of user studies on embodied agents, Dehn and van Mulken conclude that
there is no “persona effect”, that is, a general advantage of an interface with an
animated agent over one without an animated agent [Dehn and van Mulken,
2000]. However, they believe that lack of evidence and inconsistencies in the
studies performed to date may be attributable to methodological shortcomings
and variations in the kinds of animations used, the kinds of comparisons made
(control conditions), the specific measures used for the dependent variables,
and the task and context of the interaction.
4.3 Related Studies on Mediated Communication
Several studies have shown that people speak differently to a computer than
to another person, even though there are typically no differences in task outcomes
in these evaluations. Hauptmann and Rudnicky [1988] performed one of the
first studies in this area. They asked subjects to carry out a simple information-
gathering task through a (simulated) natural language speech interface, and
compared this with speech to a co-present human in the same task. They found
that speech to the simulated computer system was telegraphic and formal, ap-
proximating a command language. In particular, when speaking to what they
believed to be a computer, subjects’ utterances used a small vocabulary, often
sounding like system commands, with very few task-unrelated utterances, and
fewer filled pauses and other disfluencies.
These results were extended in research conducted by Oviatt [Oviatt, 1995;
Oviatt and Adams, 2000; Oviatt, 1998], in which she found that speech to
a computer system was characterized by a low rate of disfluencies relative to
speech to a co-present human. She also noted that visual feedback has an effect
on disfluency: telephone calls have a higher rate of disfluency than co-present
dialogue. From these results, it seems that people speak more carefully and
less naturally when interacting with a computer.
Boyle et al. [1994] compared pairs of subjects working on a map-based task
who were visible to each other with pairs of subjects who were co-present but
could not see each other. Although no performance difference was found be-
tween the two conditions, when subjects could not see one another, they com-
pensated by giving more verbal feedback and using longer utterances. Their
conversation was found to be less smooth than that between mutually visible
partners, indicated by more interruptions, and less efficient, as more turns were
required to complete the task. The researchers concluded that visual feedback
improves the smoothness and efficiency of the interaction, but that we have
devices to compensate for this when visibility is restricted.
Daly-Jones et al. [1998] also failed to find any difference in performance
between video-mediated and audio-mediated conversations, although they did
find differences in the quality of the interactions (e.g., more explicit questions
in the audio-only condition).
Whittaker and O’Conaill [1997] surveyed the results of several studies which
compared video-mediated communication with audio-only communication
and concluded that the visual channel does not significantly impact
performance outcomes in task-oriented collaborations, although it does affect social
and affective dimensions of communication. Comparing video-mediated
communication to face-to-face and audio-only conversations, they also found that
speakers used more formal turn-taking techniques in the video condition even
though users reported that they perceived many benefits to video conferencing
relative to the audio-only mode.
In a series of studies on the effects of different media and activities on trust,
Zheng, Veinott et al. have demonstrated that social interaction, even if carried
out online, significantly increases people’s trust in each other [Zheng et al.,
2002]. Similarly, Bos et al. [2002] demonstrated that richer media – such as
face-to-face, video-, and audio-mediated communication – lead to higher trust
levels than media with lower bandwidth such as text chat.
Finally, a number of studies have been done comparing face-to-face
conversations with conversations on the phone [Rutter, 1987]. These studies find
that, in general, there is more cooperation and trust in face-to-face interaction.
One study found that audio-only communication encouraged negotiators to
behave impersonally, to ignore the subtleties of self-presentation, and to
concentrate primarily on pursuing victory for their side. Other studies found similar
gains in cooperation among subjects playing prisoner’s dilemma face-to-face
compared to playing it over the phone. Face-to-face interactions are also less
formal and more spontaneous than conversations on the phone. One study
found that face-to-face discussions were more protracted and wide-ranging
while subjects communicating via audio-only kept much more to the specific
issues on the agenda (the study also found that when the topics were more
wide-ranging, changes in attitude among the participants were more likely to
occur). Although several studies found increases in favourable impressions
of interactants in face-to-face conversation relative to audio-only, these effects
have not been consistently validated.
4.4 Trait-Based Variation in User Responses
Several studies have shown that users react differently to social agents based
on their own personality and other dispositional traits. For example, Reeves
and Nass have shown that users like agents that match their own personality
(on the introversion/extraversion dimension) more than those which do not,
regardless of whether the personality is portrayed through text or speech [Nass
and Gong, 2000; Reeves and Nass, 1996]. Resnick and Lammers showed that
in order to change user behaviour via corrective error messages, the messages
should have different degrees of “humanness” depending on whether the user
has high or low self-esteem (“computer-ese” messages should be used with
low self-esteem users, while “human-like” messages should be used with
high self-esteem users) [Resnick and Lammers, 1985]. Rickenberg and Reeves showed
that different types of animated agents affected the anxiety level of users
differentially as a function of whether users tended towards internal or external
locus of control [Rickenberg and Reeves, 2000].
In our earlier study on the effects of social dialogue on trust in ECA
interactions, we found that social dialogue significantly increased trust for extraverts,
while it made no significant difference for introverts [Cassell and Bickmore,
2003]. In light of the studies summarized here, the question that remains is
whether these effects continue to hold if the nonverbal cues provided by the
ECA are removed.
5. Social Dialogue in REA
For the purpose of trust elicitation and small talk, we have constructed a new
kind of discourse planner that can interleave small talk and task talk during
the initial buyer interview, based on the relational model outlined above. An
overview of the planner is provided here; details of its implementation can be
found in Cassell and Bickmore [2003].
5.1 Planning Model
Given that many of the goals in a relational conversational strategy are non-
discrete (e.g., minimize face threat), and that trade-offs among multiple goals
have to be achieved at any given time, we have moved away from static world
discourse planning, and are using an activation network approach based
on Maes’ “Do the Right Thing” architecture [Maes, 1989]. This architecture
provides the capability to transition smoothly from deliberative, planned
behaviour to opportunistic, reactive behaviour, and is able to pursue multiple,
non-discrete goals. In our implementation each node in the network represents
a conversational move that REA can make.
Thus, during task talk, REA may ask questions about users’ buying prefer-
ences, such as the number of bedrooms they need. During small talk, REA can
talk about the weather, events and objects in her shared physical context with
the user (e.g., the lab setting), or she can tell stories about the lab, herself, or
real estate.
REA’s conversational moves are planned in order to minimize the face threat
to the user, and maximize trust, while pursuing her task goals in the most
efficient manner possible. That is, REA attempts to determine the face threat
of her next conversational move, assesses the solidarity and familiarity which
she currently holds with the user, and judges which topics will seem most
relevant and least intrusive to users. As a function of these factors, REA chooses
whether or not to engage in small talk, and what kind of small talk to choose.
The selection of which move should be pursued by REA at any given time is
thus a non-discrete function of the following factors:
Closeness: REA continually assesses her “interpersonal” closeness with the
user, which is a composite representing depth of familiarity and solidarity,
modelled as a scalar quantity. Each conversational topic has a predefined,
prerequisite closeness that must be achieved before REA can
introduce the topic. Given this, the system can plan to perform small
talk in order to “grease the tracks” for task talk, especially about sensitive
topics like finance.
Topic: REA keeps track of the current and past conversational topics.
Conversational moves which stay within topic (maintain topic coherence) are
given preference over those which do not. In addition, REA can plan to
execute a sequence of moves which gradually transition the topic from
its current state to one that REA wants to talk about (e.g., from talk
about the weather, to talk about Boston weather, to talk about Boston
real estate).
Relevance: REA maintains a list of topics that she thinks the user knows about,
and the discourse planner prefers moves which involve topics in this list.
The list is initialized to things that anyone talking to REA would know
about – such as the weather outside, Cambridge, MIT, or the laboratory
that REA lives in.
Task goals: REA has a list of prioritized goals to find out about the user’s
housing needs in the initial interview. Conversational moves which directly
work towards satisfying these goals (such as asking interview questions)
are preferred.
Logical preconditions: Conversational moves have logical preconditions
(e.g., it makes no sense for REA to ask users what their major is
until she has established that they are students), and are not selected for
execution until all of their preconditions are satisfied.
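The gradual topic shift described under Topic can be pictured as a path search over a graph of related topics. The sketch below is only an illustration: it uses breadth-first search rather than the activation network the planner actually relies on, and the `links` table, the function name `topic_path`, and the topic labels are assumptions made for the example.

```python
from collections import deque


def topic_path(links, start, goal):
    """Return the shortest chain of topics from start to goal, or None.

    links maps a topic to the topics it can shift to in one move
    (a hypothetical table, not taken from the REA implementation).
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in links.get(path[-1], ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# Hypothetical topic links mirroring the example in the text.
links = {
    "weather": ["Boston weather"],
    "Boston weather": ["Boston real estate"],
}
print(topic_path(links, "weather", "Boston real estate"))
```

Each element of the returned list would correspond to one small talk move that nudges the topic one step closer to the one REA wants to reach.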
One advantage of the activation network approach is that by simply
adjusting a few gains we can make REA more or less coherent, more or less polite
(attentive to closeness constraints), more or less task-oriented, or more or less
deliberative (vs. reactive) in her linguistic behaviour.
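One way to read the factors and gains above is as a weighted score per candidate move. The following is a minimal sketch under stated assumptions, not the planner's implementation: the class names, the gain names, the linear form of the score, and the treatment of closeness as a soft penalty are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Move:
    name: str
    topic: str
    required_closeness: float          # closeness the topic presupposes
    task_priority: float = 0.0         # greater than 0 only for task questions
    preconditions: frozenset = frozenset()


@dataclass
class DialogueState:
    closeness: float = 0.0
    current_topic: str = ""
    known_topics: set = field(default_factory=set)  # topics the user is assumed to know
    facts: set = field(default_factory=set)         # facts established so far


# Hypothetical gains; changing one makes REA more or less attentive to that factor.
DEFAULT_GAINS = {"closeness": 1.0, "topic": 1.0, "relevance": 1.0, "task": 1.0}


def score(move, state, gains=DEFAULT_GAINS):
    """Return None if a logical precondition fails, else a weighted score."""
    if not move.preconditions <= state.facts:
        return None
    closeness_gap = max(0.0, move.required_closeness - state.closeness)
    return (
        -gains["closeness"] * closeness_gap
        + gains["topic"] * (1.0 if move.topic == state.current_topic else 0.0)
        + gains["relevance"] * (1.0 if move.topic in state.known_topics else 0.0)
        + gains["task"] * move.task_priority
    )


def best_move(moves, state, gains=DEFAULT_GAINS):
    """Pick the eligible move with the highest score, or None if none is eligible."""
    eligible = [(score(m, state, gains), m) for m in moves]
    eligible = [(s, m) for s, m in eligible if s is not None]
    if not eligible:
        return None
    return max(eligible, key=lambda pair: pair[0])[1]
```

With a state in which only the weather is a known topic, a weather small talk move outranks a finance question that presupposes more closeness; setting the closeness gain to zero lets the finance question win, which is the "less polite, more task-oriented" behaviour the text describes.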
In the current implementation, the dialogue is entirely REA-initiated, and
user responses are recognized via a speaker-independent, grammar-based,
continuous speech recognizer (currently IBM ViaVoice). The active grammar
fragment is specified by the current conversational move, and for responses to
many REA small talk moves the content of the user’s speech is ignored; the mere
fact that the person responded is enough to advance the dialogue.
At each step in the conversation in which REA has the floor (as tracked
by a conversational state machine in REA’s Reaction Module [Cassell et al.,
2000a]), the discourse planner is consulted for the next conversational move to
initiate. At this point, activation values are incrementally propagated through
the network (following [Maes, 1989]) until a move is selected whose
preconditions are satisfied and whose activation value is over a specified threshold.
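The selection step can be sketched as a loop that keeps adding activation until some executable move crosses the threshold. This is a deliberately simplified illustration: each move here just accumulates a fixed amount of energy per step, whereas a network in the style of [Maes, 1989] also spreads activation along links between nodes. The function name, parameters, and numbers are assumptions.

```python
def select_move(moves, input_energy, is_ready, threshold=1.0, max_steps=100):
    """Accumulate activation step by step and return the first eligible move.

    moves        : list of move names
    input_energy : dict mapping a move name to the energy it gains per step
    is_ready     : callable telling whether a move's preconditions hold
    Returns None if no ready move reaches the threshold within max_steps.
    """
    activation = {m: 0.0 for m in moves}
    for _ in range(max_steps):
        for m in moves:
            activation[m] += input_energy.get(m, 0.0)
        ready = [m for m in moves if is_ready(m) and activation[m] >= threshold]
        if ready:
            return max(ready, key=activation.get)
    return None


# Hypothetical example: the budget question gathers energy faster,
# but its preconditions are not yet satisfied, so small talk is chosen.
moves = ["small_talk_weather", "ask_budget"]
energy = {"small_talk_weather": 0.4, "ask_budget": 0.6}
print(select_move(moves, energy, lambda m: m != "ask_budget"))
```

The point of the sketch is the stopping condition: a move is selected only when both tests hold at once, so a highly activated move with unmet preconditions simply waits.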
Within this framework, REA decides to do small talk whenever closeness
with the user needs to be increased (e.g., before a task query can be asked),
or the topic needs to be moved little-by-little to a desired topic and small talk
contributions exist which can facilitate this. The activation energy from the
user relevance condition described above leads to REA starting small talk with
topics that are known to be in the shared environment with the user (e.g., talk
about the weather or the lab).
5.2 Interactional Behaviour during Social Dialogue
Shifts between small talk moves and task moves are marked by
conventional contextualization cues – discourse markers and beat gestures. Discourse
markers include “so” on the first small talk to task talk transition, “anyway”
on resumption of task talk from small talk, and “you know” on transition to
small talk from task talk [Clark, 1996]. Prior to producing lengthy utterances,
REA gazes away briefly before she starts her turn, partly as a turn-taking and
floor-holding move and partly to mask the processing delays in generating long
utterances. Finally, REA smiles as soon as she detects that users have started
their speaking turns (using audio thresholding of the user’s speech).
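The discourse marker rules just listed amount to a small lookup on the kind of transition. A minimal sketch follows, assuming moves are labelled "small" or "task" and that the caller tracks whether any task talk has occurred yet; the function name and labels are hypothetical.

```python
def discourse_marker(previous_kind, next_kind, task_talk_seen):
    """Return the marker that prefaces a move, or "" when no shift occurs.

    previous_kind, next_kind : "small" or "task"
    task_talk_seen           : True once REA has already done some task talk
    """
    if previous_kind == "small" and next_kind == "task":
        # "so" marks the first move into task talk, "anyway" a resumption.
        return "anyway" if task_talk_seen else "so"
    if previous_kind == "task" and next_kind == "small":
        return "you know"
    return ""
```

Beat gestures, gaze, and smiles are not modelled here; only the verbal markers are.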
5.3 Example Interaction
An interview between REA and a user typically proceeds as shown in the
following dialogue. (User responses are only shown in positions in which they
affect the selection of subsequent moves; REA contributions are her actual
utterances.)
1 That microphone is terrible, I hate using those things.
2 Sorry about my voice, this is some engineer’s idea of natural sounding.
3 Are you one of our sponsors? User: Yes
4 Were you at our last sponsor meetings?
5 I got so exhausted at the last sponsor meeting I think I was starting to
lose my voice by the end.
6 So, where would you like to live?
7 How many bedrooms do you need?
8 Do you need access to the subway?
9 Is one bath enough?
10 You know, Boston is certainly more expensive than it used to be.
11 Anyway, what can you afford?
12 What kind of down payment can you make?
13 Let me see what I have available.
Dialogue 1. “Small Talk” REA.
In this example, REA opens with small talk moves regarding things in her
shared physical environment with the user (1-2). She then proceeds to small
talk related to sponsors (after establishing that the user is a sponsor). After a
few turns, enough closeness has been established (simply by doing small talk)
that REA can move into task talk (6-9). However, before bringing up the topic
of finance – a topic that is potentially very face threatening for the user – REA
decides that additional closeness needs to be established, and moves back into
small talk (10). This small talk move not only increases closeness but shifts
the topic to finance, enabling REA to then bring up the issue of how much the
user is able to afford (11-12).
If REA’s adherence to closeness preconditions is reduced, by decreasing
the contributions of these preconditions to the activation of joint projects, this
results in her engaging in less small talk and being more task goal oriented. If
everything else is held constant (relative to the prior example) the following
dialogue is produced.
1 So, where would you like to live?
2 What can you afford?
3 What kind of down payment can you make?
4 How many bedrooms do you need?
5 Do you need access to the subway?
6 Is one bath enough?
7 Let me see what I have available.
Dialogue 2. “Task-only REA”.
In this example, REA does not perform any small talk and sequences the
task questions in strictly decreasing order of priority.
6. A Study Comparing ECA Social Dialogue with
Audio-Only Social Dialogue
The dialogue model presented above produces a reasonable facsimile of the
social dialogue observed in service encounters such as real estate sales. But
does small talk produced by an ECA in a sales encounter actually build trust
and solidarity with users? And does nonverbal behaviour play the same
critical role in human-ECA social dialogue as it appears to play in human-human
social interactions?
In order to answer these questions, we conducted an empirical study in
which subjects were interviewed by REA about their housing needs, shown
two “virtual” apartments, and then asked to submit a bid on one of them. For
the purpose of the experiment, REA was controlled by a human wizard and
followed scripts identical to the output of the planner (but faster, and not
dependent on automatic speech recognition or computational vision). Users
interacted with one of two versions of REA which were identical except that
one had only task-oriented dialogue (TASK condition) while the other also
included the social dialogue designed to avoid face threat and increase trust
(SOCIAL condition). A second manipulation involved varying whether
subjects interacted with the fully embodied REA – appearing in front of the virtual
apartments as a life-sized character (EMBODIED condition) – or viewed only
the virtual apartments while talking with REA over a telephone (PHONE
condition). Together these variables provided a 2x2 experimental design:
SOCIAL vs. TASK and EMBODIED vs. PHONE.
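The four cells of this design can be enumerated mechanically. In the sketch below the cell labels come from the text, while the round-robin allocation in `assign` is purely an assumption for illustration; the chapter does not say how subjects were allocated to conditions.

```python
from itertools import product

DIALOGUE = ("SOCIAL", "TASK")
MEDIUM = ("EMBODIED", "PHONE")


def design_cells():
    """Return the four (dialogue, medium) cells of the 2x2 design."""
    return list(product(DIALOGUE, MEDIUM))


def assign(subject_ids):
    """Give each subject exactly one cell (between-subjects).

    Round-robin allocation is a hypothetical choice, not the authors' procedure.
    """
    cells = design_cells()
    return {sid: cells[i % len(cells)] for i, sid in enumerate(subject_ids)}
```

Because the design is between-subjects, each subject appears in one cell only, which is what the dictionary returned by `assign` encodes.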
Our hypotheses follow from the literature on small talk and on trust among
humans. We expected subjects in the SOCIAL condition to trust REA more,
feel closer to REA, like her more, and feel that they understood each other more
than in the TASK condition. We also expected users to think the interaction was
more natural, lifelike, and comfortable in the SOCIAL condition. Finally, we
expected users to be willing to pay REA more for an apartment in the SOCIAL
condition, given the hypothesized increase in trust. We also expected all of
these SOCIAL effects to be amplified in the EMBODIED condition relative to
the PHONE-only condition.
6.1 Experimental Methods
This was a multivariate, multiple-factor, between-subjects experimental
design, involving 58 subjects (69% male and 31% female).
6.1.1 Apparatus. One wall of the experiment room was a rear-projection
screen. In the EMBODIED condition REA appeared life-sized on the
screen, in front of the 3D virtual apartments she showed, and her synthetic
voice was played through two speakers on the floor in front of the screen.
In the PHONE condition only the 3D virtual apartments were displayed and
subjects interacted with REA over an ordinary telephone placed on a table in
front of the screen.
For the purpose of this experiment, REA was controlled via a Wizard-of-Oz
setup on another computer positioned behind the projection screen. The
interaction script included verbal and nonverbal behaviour specifications for REA
(e.g., gesture and gaze commands as well as speech), and embedded
commands describing when different rooms in the virtual apartments should be
shown. Three pieces of information obtained from the user during the
interview were entered into the control system by the wizard: the city the subject
wanted to live in; the number of bedrooms s/he wanted; and how much s/he
was willing to spend. The first apartment shown was in the specified city, but
had twice as many bedrooms as the subject requested and cost twice as much
as s/he could afford (they were also told the price was “firm”). The second
apartment shown was in the specified city, had the exact number of bedrooms
requested, but cost 50% more than the subject could afford (but this time, the
subject was told that the price was “negotiable”).
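The way the two apartments were derived from the subject's answers is simple arithmetic, sketched below. The multipliers and the “firm” and “negotiable” labels come from the text; the `Offer` type, the function name, and the example figures are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    city: str
    bedrooms: int
    price: float
    price_terms: str  # "firm" or "negotiable"


def build_offers(city, bedrooms, budget):
    """Derive the two apartments shown from the three interview answers."""
    # First apartment: twice the bedrooms, twice the budget, firm price.
    first = Offer(city, bedrooms * 2, budget * 2.0, "firm")
    # Second apartment: requested bedrooms, 50% over budget, negotiable price.
    second = Offer(city, bedrooms, budget * 1.5, "negotiable")
    return first, second


# Hypothetical subject: two bedrooms in Boston with a budget of 1000.
first, second = build_offers("Boston", 2, 1000.0)
print(first)
print(second)
```

Both offers exceed the stated budget by construction, so the bid a subject submits can be compared across conditions.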
The scripts consisted of a linear sequence of utterances (statements
and questions) that would be made by REA in a given interaction: there was
no branching or variability in content beyond the three pieces of information
described above. This helped ensure that all subjects received the same
intervention regardless of what they said in response to any given question by
REA. Subject-initiated utterances were responded to with either backchannel
feedback (e.g., “Really?”) for statements or “I don’t know” for questions,
followed by an immediate return to the script.
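The fixed-script policy can be sketched as a loop over REA's lines with one canned reply for anything the subject initiates. In the experiment a human wizard made these judgments; detecting questions by a trailing question mark, and the function names, are assumptions made only so the sketch runs.

```python
def respond_to_user_initiative(utterance):
    """Canned reply: "I don't know" for questions, a backchannel otherwise."""
    if utterance.strip().endswith("?"):
        return "I don't know"
    return "Really?"


def run_script(script, user_turns):
    """Play a linear script, answering subject-initiated utterances in passing.

    script     : list of REA utterances, delivered in order
    user_turns : list of the same length; the subject-initiated utterance
                 following each REA line, or None if there was none
    """
    transcript = []
    for rea_line, user_line in zip(script, user_turns):
        transcript.append(("REA", rea_line))
        if user_line is not None:
            transcript.append(("USER", user_line))
            transcript.append(("REA", respond_to_user_initiative(user_line)))
        # In every case the next iteration returns to the script.
    return transcript
```

Since the script never branches, two subjects hear the same REA lines in the same order whatever they say, which is the property the design depends on.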
The scripts for the TASK and SOCIAL conditions were identical, except
that the SOCIAL script had additional small talk utterances added to it, as


All these considerations apply more particularly to the condition of the
eastern provinces of English America. In the western provinces, however, the
land being much more fertile and the colonists therefore enjoying greater
means, they must also have been freer to follow their own will, and less
bound by natural necessity to the will of others. Nor should one suppose that
this had softened or enervated their spirits; on the contrary, since they
lived continually in the fields, far from the luxury and allurements of the
cities, and were restrained and modest in all their desires, one must believe
that the greater abundance of the things necessary to human life gave their
bodies more vigour and made their spirits more impatient of any subjection.
In these provinces, moreover, the slavery of the Blacks, which was practised
there, strange as it may seem at first sight, inclined the white men to the
love of liberty. Having continually before their eyes the living example of
the miserable condition of man reduced to slavery, they must have better
understood and more highly valued the liberty they enjoyed; they regarded
this liberty not merely as a right but as a franchise and a privilege. And
since, where their own interest and passions are concerned, men judge roughly
and with the eyes of the mind dazzled, the colonists bore impatiently the
supremacy of the English government and its pretensions, as tending to bring
them to a state close or similar to that to which their own slaves were
reduced, detesting for themselves what they practised upon others.
The inhabitants of the colonies, especially the eastern ones, enjoyed not
merely the shadow but the very substance of English government, and in this
respect they fell little short of being wholly independent. They elected
their own magistrates; they paid them; everything pertaining to internal
administration belonged to them. The only evidence of their dependence on
the mother country was that they could not make laws or statutes contrary to
the letter or the intent of the English laws; that the King had the power of
veto over the resolutions of their assemblies; and that they submitted to
such regulations and restrictions of commerce as Parliament judged necessary
and conducive to the general good of the whole empire. For the rest, these
things were little more than empty words, for the King rarely exercised his
veto, and the colonists, for their part, deftly evaded those regulations and
restrictions by means of contraband trade. The provincial assemblies,
moreover, were very free, perhaps more so than the Parliament of England
itself, since there were no ministers there ready to bribe day after day,
and democratic ardour and zeal met with little or no restraint; for the
governors who attended on the King's behalf did not have sufficient
standing, drawing their salaries not from the Crown but from the province
itself, and in some provinces being even elected by the votes of the
inhabitants.
The intense religious zeal of the colonists, and above all of the
inhabitants of New England, preserved good morals among them; frugality,
temperance and chastity were common virtues among that people. There one did
not see ostentatious wives, roving husbands, or unruly children. The
ministers of a most austere religion were respected and venerated there,
because they themselves set the example of the virtues they preached to
others. There time was spent in the labours of the field, in domestic
gatherings, and in prayers and thanks addressed to that God who, opening to
them the bowels of a fertile land and making it fruitful through a
favourable climate, heaped upon them so many goods and treasures.
If to this we add that the inhabitants of New England, once the first
obstacles had been overcome, found themselves in a productive and healthy
region, it is no wonder that the population of the American colonies should,
within a century, have grown to such a degree that a few wretched men,
driven by adverse fortune to those foreign shores, became in so short a time
a great and powerful nation.
It should also be considered that American fathers were wholly free from the
anxiety which every day, every hour, and almost every moment pricks and
torments the minds of European fathers about the future subsistence and
establishment of their offspring. Hence the natural desire to procreate met,
under that sky, with no obstacle in the narrowness of family means; on the
contrary, the birth of a child was a happy event not only for paternal love
but also for the interest and advantage of the whole family. For in that
immensity of still uncultivated land there was no doubt that the new child,
once arrived at a suitable age and bringing yet another tract under
cultivation with his own hands, would procure a new means of support for
himself and his relatives; and so the more children there were, the more
instruments there were for the well-being and comfortable living of the
whole household. It is therefore clear that in those countries the climate,
nature, the civil and religious institutions, and the very interest of
families all conspired to this end: that robust fathers should produce, in
abundance, robust and spirited children.
And since industry, enterprise, and an extreme desire to turn everything to
profit are characteristic of those who find themselves separated from other
men and can expect their subsistence from themselves alone, and since the
colonists were moreover descended from a nation known to all for its
boldness and industry in matters of commerce, it is easy to believe that the
growth of commerce kept pace with that of the population. This can be
clearly inferred from the fact that in 1704 the total exports of England,
including the goods shipped to her colonies, amounted to 6,509,000 pounds
sterling; but from that year to 1772 the colonies grew so greatly in
population and prosperity that in the latter year they alone took from
England goods to the value of 6,022,132 pounds sterling. That is to say, in
1772 the colonies alone drew from the mother country almost as much
merchandise as they, together with all other parts of the world, had drawn
sixty-eight years before.
Such was the state of the English colonies in America, and such the opinions
and sentiments of their inhabitants, when the eighteenth century was already
more than half elapsed. Strong in numbers and in force, abounding in wealth
and in everything necessary to human life, already far advanced in the
useful arts and the liberal disciplines, and already trading everywhere with
all the nations of the world, it was impossible that they should not have
become conscious of themselves, and that, as national pride gradually grew,
they should not bear the yoke of English supremacy with impatience. Yet
these opportunities and inclinations towards change did not break out into
open conflagration, and without fresh fuel they would have remained within
the bounds that had contained them for so long. That fuel the British
government, managing the affairs of the colonies prudently for a century,
had avoided supplying. On the contrary, rearing and protecting them with
almost paternal care while they were still weak and, as it were, in a state
of infancy, and afterwards regulating by wise laws their commerce with the
mother country and with foreign nations, it had led them by degrees to their
present prosperity and made them flourish. For in the period soon after the
founding of the colonies, England, with her men and her ships, defended
them, just as a good mother defends her own children, against the attacks of
the neighbouring barbarous peoples and against the insults and abuses of
other nations. She granted immunities and privileges to those who wished to
move from Europe to those new lands. She supplied the colonists at very low
prices with cloths, woollens, felts, linens, and every kind of implement
necessary both for their defence against enemies and for the useful arts in
time of peace, and especially everything suitable for improving the land and
for the work of agriculture. Likewise the English merchants accommodated
them with their large capital, without which they could not have undertaken
works of great importance, such as building ships of large burden, draining
vast marshes, channelling rivers, clearing forests, establishing numerous
plantations, and other similar enterprises of great consequence.
In return for so many benefits, and rather as a necessary consequence of the
Act of Navigation than as a fiscal and particular restriction on commerce,
England required nothing of America except that America should supply her
with what she lacked and receive from her what she had in surplus at home
and the colonies were in want of. America was therefore obliged to carry to
England all the produce, provisions, and commodities of whatever kind that
her lands yielded in abundance and that England needed, as well as all the
raw materials that could serve for manufactures. In addition, the Americans
were forbidden to procure manufactured goods from any part of the world
other than England, and likewise to buy the produce of lands belonging to
certain European nations with which England was in jealousy and rivalry,
unless that produce had first been brought into English ports. This was the
constant aim, and such the substance, of a great many acts of Parliament
from 1660 to 1764, so that a true commercial monopoly was established at the
expense of the English colonies and in favour of England. The colonists,
however, did not consider themselves either wronged or burdened by this,
both because they received in compensation so much protection from the
government and so many conveniences from private individuals, and because,
much more, it seemed and was held that the burden they bore took the place
of the taxes and impositions to which the inhabitants of Great Britain were
subject by virtue of the laws enacted by Parliament. Throughout this period
parliamentary taxes formed no part of the system of colonial government.
Indeed, in all the laws concerning the colonies, the specific words that, in
the preambles of finance laws, signify the imposition of charges, duties, or
taxes in order to create a public revenue for the use of the government were
studiously avoided; only the terms gifts, grants, or aids rendered to the
Crown were used. And although Parliament had several times imposed duties on
various articles of commerce in the colonies, these were regarded as
regulations and restrictions of commerce rather than as sources of public
revenue. Thus until 1764 the question of taxes to be imposed by authority of
Parliament in order to create a public revenue was passed over in silence,
and England was content to exercise her supremacy solely by regulating the
general interests of the colonies and making them all contribute and
converge to the benefit of the whole kingdom. To this condition the
Americans submitted, if not without some sign of displeasure, at least with
filial obedience. This showed that, although they were not subject to
parliamentary taxes, they nevertheless responded well and served usefully
the prosperity of the whole English dominion.
This is not to say that ill humours did not arise from time to time between
the two peoples, owing to the attempts made on one side to maintain and even
extend supremacy, and on the other to advance towards independence. A year
after the Peace of Aix-la-Chapelle, a grant of six hundred thousand acres
(the acre is a land measure used in North America, five of which are roughly
equal to two hectares) of the best land near the Ohio River was made to
certain gentlemen engaged in trade, who, having associated, called
themselves the Ohio Company. The governor of the province of Canada, which
was then held by the French, got wind of this and became apprehensive that
the English intended to disturb French trade with the Indians called
Twightwees and to cut the communication between the two provinces of
Louisiana and Canada. He therefore wrote to the governors of New York and
Pennsylvania, declaring that the English merchants had set foot on French
territory by trading with the Indians, who ought to trade only with the
subjects of the Crown of France, and threatening to have them seized
wherever he might find them. The merchants nevertheless continued their
trade, whereupon at the beginning of 1751 some bands of French and Indians
laid hands on the English merchants. The Indians friendly to England,
greatly angered by the injury done to their allies, assembled and, after a
diligent search through the forests, seized the French merchants in a fury
and carried them to Pennsylvania. Not content with this, the Virginians sent
Major Washington, the same who afterwards commanded the American armies, to
Mr. Saint-Pierre, who commanded a fort on the Ohio River for the King of
France, instructing him to demand an account of these acts of hostility and
to require that he withdraw his men. Saint-Pierre replied that he could not
consent to the English demands; that the country belonged to the King of
France, his master; that the English had no right to trade on those rivers;
and that, in execution of the orders given him, he would therefore seize and
send as prisoners to Canada all Englishmen who attempted to trade on the
Ohio and its dependencies.
This conduct of the French greatly provoked the ministers of Great
Britain, who could not tolerate abuses against their friends and
allies. They took offense at once and wrote firmly to America that
French encroachments were to be resisted by force of arms. The
instructions reached Virginia very promptly. Hostilities followed,
and blood was shed on both sides.
The board which in England has charge of matters of trade and the
plantations, perceiving that the colonies, divided among themselves,
could resist only slowly and poorly the attempts of a bold and
venturesome people, seconded moreover by a good number of Indians,
recommended to each of them that they hold a general convention of
deputies, in order to form a general league among them all, and
between them and the Indians, under the name and protection of His
Britannic Majesty. It was arranged that the convention of the
governors and leading men of each colony should meet at Albany, a
town on the North River. After securing the good will of the Indians
of the six tribes with suitable gifts, they proceeded to examine the
best means of defending themselves and their property against enemy
attacks. On this they concluded that a general league among all the
colonies was absolutely necessary. The terms of the league were
accepted on 4 July 1754, and their substance was as follows: «That a
petition be made to obtain from Parliament an act establishing a
general government in America; that under this government each colony
keep its own internal constitution, except in those particulars in
which the same act should introduce some change; that the general
government be administered by a president general, appointed and paid
by the Crown, and by a Grand Council elected by the representatives
of the people of the colonies; that the president general have a veto
over the acts of the Grand Council, and that it be his duty to carry
them into effect; that he, with the advice of the Grand Council, have
authority to conclude and execute all treaties with the Indians in
which all the colonies had a common interest, and likewise to make
peace with or declare war on the Indian nations; that he also be
authorized to make regulations for all trade with them; that he might
purchase from the Indians, for the Crown, lands lying outside the
territory of the particular colonies; that he have power to found new
colonies on the lands so acquired, and to make laws to regulate and
govern these new colonies; that he might raise and pay soldiers,
build forts, and fit out vessels to guard the coasts and protect
trade; that for these purposes he have power to lay such general
duties, imposts, or taxes as he thought most suitable; that he
appoint a general treasurer, and also a particular one in each colony
where needed; that the president general have the power to appoint
the land and sea officers, and the Grand Council the power to
nominate the civil officers; and finally that the laws they made must
not only not be contrary to English law but must agree with it, and
be transmitted to the King for approval.» This was the scheme of
future government proposed by the colonies and sent to England for
approval. The Americans had great hope of obtaining it, because
matters were already turning toward open war with France; and they
declared that, if the league were approved, they felt able to defend
themselves against French arms without any further help from England.
Anyone can see how much such a constitution would have weakened the
authority of the English government and brought the colonists closer
to total independence; for by it they would have obtained, and had in
their own midst, a government which in fact would have exercised all
the authority and all the rights belonging to sovereignty, although
in name it would still have seemed to depend on the home government.
But this plan did not sit well with the English government, which had
grown unusually jealous lest the proposed league should afford the
opportunity, and a considerable foundation, for concerted designs in
America against its sovereignty. Therefore, despite the imminent
danger of a foreign war against an enemy powerful in men and arms,
the articles of confederation were not approved.
But the ministers of England did not let this occasion pass without
trying to enlarge, if they could, the authority of the government in
America, and above all the authority to impose taxes, a thing desired
more than any other on this side of the Ocean and detested on the
other. In place of the American scheme they therefore devised another
and sent it to the governors of the colonies to be laid before the
colonial assemblies: «that the governors of all the colonies,
accompanied by one or two members of the Councils, should meet to
agree among themselves on the measures necessary for the common
defense; to build forts; to raise soldiers, with power to draw on the
British treasury for the sums required; and that the treasury should
be reimbursed by a tax laid on the colonies by act of Parliament.»
The aim of this ministerial device is not hard to see, considering
that the governors and the members of the Council were for the most
part appointed by the King. The attempt accordingly failed in
America, and the reasons were ably set out in a letter from Dr.
Benjamin Franklin to Governor Shirley, who had sent him the
ministers' scheme. In that letter the seeds of the discord that arose
soon afterward first became visible [1].
The General Court of Massachusetts wrote to its agent in London to
oppose any measure aimed at laying imposts on the colonies for any
public use whatever, or for the support of government. The governors,
on the contrary, and Shirley in particular, kept writing home that
such a tax was just to demand, possible to carry out, and useful to
enforce.

These suspicions and this jealousy, which filled American minds and
sprang from the fear of a parliamentary tax, found ready soil there
in certain old grievances left by acts of Parliament which, although
they did not tend to impose taxes or duties, greatly restricted the
internal trade of the colonies, or hindered manufactures, or in some
way wounded the self-esteem of the Americans, as if they were not men
of equal worth with the English, or as if the English, by clipping
the wings of American talent, meant to keep them in an inferior and
less esteemed condition. Such was the act that forbade cutting pitch
and tar trees not within an enclosure; and that other which
prohibited carrying out of the colonies, or even from one colony into
another, hats made and woollens worked there, and forbade hatters to
keep more than two apprentices at a time. Such also was the act
passed to ease the recovery of debts in the colonies, which ordered
that houses, lands, Negroes, and other real effects should be liable
for the payment of debts. Such, finally, was the act passed in 1733
at the request of the inhabitants of the sugar colonies, which
forbade bringing rum, sugar, and molasses from the Dutch and French
colonies into the northern English colonies except on payment of a
heavy duty. To these must be added another act of Parliament, passed
in 1750, which ordered that from 24 June of that year certain kinds
of ironwork could not be carried on in the American colonies and that
steel could not be made there; and the act that regulated and
restricted the bills of credit of the New England governments and
declared that they could not be legal tender in payment of debts, so
that English creditors should not suffer by being obliged to accept,
in place of money, a depreciating paper. This act, although just, the
Americans received with ill will, as tending to discredit their
bills. From these sources arose the first resentments among the
Americans and the first suspicions among the English.

On the other side, it was argued in England that if the colonists, in
return for the commercial restrictions imposed by the government,
from which the common country gained greatly, claimed no more than to
be treated with great mildness and equity in the imposition of taxes,
the claim would be held just and reasonable; but that to excuse
themselves from every kind of further aid to the European mother
country could in no way be borne. England, in reserving to herself
the trade of her colonies, had done as all modern nations had long
done; she had followed the example of the Spaniards and the
Portuguese, but with a moderation unknown to the governments of those
nations.
In founding these distant colonies, England had made them sharers in
all the rights and privileges that English subjects themselves enjoy
at home, leaving them to govern themselves entirely and to enact such
laws as the wisdom and prudence of their own assemblies judged
necessary. In short, she had granted the colonies the fullest power
to provide for themselves and to pursue their own interests,
reserving to herself only the benefit of their trade and political
union under the same sovereign. The French and Dutch colonies, and
above all the Portuguese and Spanish, experienced nothing like such
indulgence. And indeed the English colonies, despite the restrictions
of which they complained, had an immense capital in trade and
property; for besides the rich cargoes of the produce of their lands
carried off by the English ships trading in those ports, the
colonists had ships of their own, which carried their produce and
goods in great quantity and at incredible profit not only to the
ports of the mother country but also (by her maternal indulgence and
toleration) to those of some other parts of the world, and brought
home European goods and conveniences. Hence the enormous prices at
which European goods are sold in the colonies of Spain and Portugal
were unusual, indeed unheard of, in the English colonies; in the
latter, on the contrary, many goods sold at the same price as in
England itself, and some even lower. Such things were not to be seen
in the Portuguese and Spanish colonies, and seldom in the French. The
restrictions laid by England on American trade aimed rather at a just
and prudent distribution of it among all parts of her vast dominions,
so that all might share in it equally, than at a real prohibition;
and if English subjects were free to trade in every part of the
world, the same liberty was granted to American subjects in many
respects, excepting only the northern parts of Europe and the East
Indies. In Portugal, in Spain, in Italy, throughout the
Mediterranean, on the coasts of Africa, and in the whole American
hemisphere, the ships of the English colonies could trade freely. The
English laws favoring this kind of trade were wise and well
considered, since they aimed to draw more goods from American ports
and to enable the colonists to clear and cultivate their lands
through the assured sale of a very great quantity of shipbuilding
timber, in which their forests abound. It was true that the colonists
could carry many articles nowhere but to the ports of England; but it
should be considered that the American lands, by their nature and
extent, must fully occupy both the minds and the bodies of the
inhabitants, without any need for them to seek gain elsewhere, as the
inhabitants of countries already carefully cultivated must do. And if
England reserved to herself the exclusive trade in certain goods,
what did that matter to the Americans, or how did it harm them? Since
these goods were mostly those belonging to the refinements of
civilized life, in what country or among what people could they
obtain them in greater perfection, or at so low a price, as in
England? The kindness and liberality of the English government toward
its colonies had gone so far that it not only refrained from laying
duties on its own manufactures shipped to their ports, but even
removed entirely the duties on foreign goods when these were sent on
from England to American ports; so that such goods fell so much in
price in some of the colonies that they sold there more cheaply than
in some countries of Europe. Nor should it be overlooked that the
fullest freedom of trade was allowed for the exchange of goods
between North America and the English West India islands, from which
the colonists drew very great advantage. In fact, despite the various
restrictions laid on the trade of the colonists, did not enough
remain to make that people rich, flourishing, and fortunate? Was not
their prosperity known to, and envied by, the whole world? Surely, if
men anywhere on earth lived a blessed and happy life, it was above
all and beyond doubt in English America. Was this not an irrefutable
proof, a living example, of England's paternal love for her colonies?
Let the Americans compare their condition with that of foreign
colonists, and acknowledge, not without gratitude to the common
mother, both their own happiness and the emptiness of their
complaints.
But all these and other arguments advanced on England's behalf had no
power to satisfy the Americans, and many grudges remained. The
French, given the inveterate jealousy between the French and British
nations, were not wanting to themselves and did not fail to seize the
occasion to deepen, by artful means, the wounds that the Americans
had received, or believed they had received, from their fellow
citizens of England. The French had long been unable to look with
indifference on the prosperous state of the English colonies. At
first they resolved to found colonies of their own in some part of
that vast continent, hoping to draw from them fruits as abundant as
the English drew from theirs, and thus to secure the same advantages
for themselves and to give the trade of America and Europe, at least
to some degree, a different direction.
They intended, whether by good laws or by arms, to make up for the
defects of soil and situation observed in the regions that had fallen
to their lot. But since the French government, as is its custom,
looked more to military affairs than to trade, and the French more
willingly become soldiers than merchants, they soon formed designs
suited to their nature; and since their ambition is moreover usually
boundless and never content with the present, they at once wished
both to fortify and to extend themselves. A bastion here, a rampart
there; an arsenal in one place, an armory in another; and they did
not rest until they had completed a continuous chain of forts from
one side of the continent to the other. But military apparatus cannot
supply population or trade, or the prosperity of either. Those forts,
those arms, those garrisons stood in poor and deserted regions. An
immense solitude stretched all around; endless forests covered the
earth and shut out the sky. The English proceeded very differently.
They advanced step by step, and instead of trying to embrace too much
and so grasp little or nothing, they gradually and carefully
cultivated what they possessed, and sought no more until the needs of
an increased population required it. Their progress was thus slow but
sure; they occupied no new lands until those already occupied had
been brought to excellent cultivation and furnished with a sufficient
population. So different a method could not fail to produce entirely
opposite effects; and indeed, a century after the English and French
colonies had been founded, the lands of the French were by comparison
poor, barren, and thinly inhabited, while those of the English were
fertile, rich, and filled with an industrious and thriving people.
The French therefore found that, whether through the harshness of the
air and soil of the regions they occupied, or through want of
industry, or through lack of suitable laws, they could not hope to
divert the trade of the English colonies to their own, or even to
match its benefits. Knowing, on the other hand, how useful these
colonies were and how much prosperity and power they added to the
rival nation, they resolved to turn to arms and to obtain by force
what they had not been able to obtain by industry. They hoped that
the ill will of the Americans would show itself and produce favorable
events, or at least that the Americans would be less ready for the
contest; for they knew very well how much this mattered, since the
whole sinew and substance of the war was to consist in American arms,
men, provisions, and money. Proceeding with their usual impatience,
without waiting until the provisions for war were ready, they kept
provoking the enemy, now complaining that he occupied lands belonging
to them, now occupying and disturbing his possessions. The British
government took grave offense, and war broke out between the two
nations in 1755. But the results did not answer such hopes. The
counsels of England were guided by William Pitt, afterward Earl of
Chatham, a man who for greatness of mind and integrity of character
was not merely rare but singular; the affairs of the English
prospered so well, and their arms so thoroughly overcame those of
their enemies by sea and land, that the French, weary and beaten and
having lost all hope of victory, agreed to the terms of the Peace of
Paris, concluded in 1763. By it England remained in possession of the
vast continent of North America from the banks of the Mississippi to
the shores of Greenland; and above all, a matter of the greatest
importance, France ceded to her the province of Canada. She also
acquired many rich islands in the West Indies; and in the East Indies
her power spread so far, and rested on such solid foundations, that
she gained a far greater superiority both in trade and in force of
arms. The Americans, for their part, showed themselves so ready to
second the efforts of the common country with their arms and their
wealth that they won much glory and were judged worthy to share in
the advantages that so much success had procured for England. In this
state of things the French, despairing of gaining anything by arms,
turned to intrigue; and men sent for the purpose went about the
American mainland saying to anyone who would listen: To what end, to
what profit, had the Americans shed so much blood, run so many
dangers, and spent so much money in this last war, if English
domination, so harsh and so detested, was to continue over them? In
reward for so much fidelity and constancy, had the English government
perhaps softened its prohibitions, or freed trade from the many
snares that bind and hamper it, to the great injury of American
interests? Had the odious and much lamented laws on manufactures
perhaps been repealed? Must the Americans sweat on their lands and
cross the vast seas only to fill the purses of English merchants? Had
the government of England given any sign of abandoning forever the
idea of parliamentary taxes? Was it not, on the contrary, more likely
that with her strength and power both her hunger for gold and her
tyrannical desires had also grown? Had not Pitt himself hinted as
much when he said that, once the war was over, he would know how to
find a way to draw a public revenue from the colonies and put an end
at last to American reluctance? Did not England now, as mistress of
Canada, a province lately French and therefore more submissive to
government, have the means of putting a bit in the mouths of the
Americans with a large body of troops? The Americans were no longer
an infant nation; they had grown robust and strong and had entered a
flourishing adolescence. They had shown this to the whole world, with
much glory to themselves and advantage to England, in the course of
the war just ended; and why should a distant island rule and govern
at its pleasure a great and populous continent?
And how long were English partiality and avarice to be endured? Were
there not arms here, and men, and daring, and courage, and industry,
and wealth, and a sky favorable to every honorable enterprise? Let
the Americans, then, seize the occasion with a firm spirit, now that
they had learned by experience that their weapons, too, could cut;
now that an enormous public debt burdened and oppressed England; now
that her name had become hateful to all. Hopes and help from abroad
would certainly not be wanting. What could be set against so generous
a resolution? Kinship? The English had hitherto treated them more as
subjects than as brothers. Gratitude? England had cut short its
course by her avarice and her mercantile spirit.
The general condition of Europe did in fact strongly favor these
designs; for there is no doubt that all the European powers at this
time agreed in thinking that the astonishing growth of British power,
by sea and land, was a constant and near threat to the liberties and
peace of Europe, since prosperity tends to make men unable to set
limits to their designs. Mistress of every sea, holding her colonies
of the western hemisphere in one hand and her East India possessions
in the other, she seemed to grasp the two ends of the globe and to
aim at the entire dominion of the Ocean. From the day the peace of
1763 was concluded, England was regarded just as France had been in
the time of King Louis the Fourteenth. The same jealousies and the
same suspicions attended her. Everyone wished to see her power
brought low; and the more formidable she had shown herself in the
late war, the more they longed to use the present peace to humble and
exhaust her. These wishes were most ardent among the maritime states,
and especially in Holland, which had lately suffered very great
injuries from England: English ships had interrupted, often with
singular extravagance and insolence, the trade the Dutch carried on
in taking munitions of war to France, and had not seldom committed
the same abuses against ships laden with goods that can only remotely
be regarded as serving the purposes of war. The kingdoms of the North
also bore English superiority with great reluctance and complained
openly that England harassed neutral trade in time of war. It was
plain that they were ready to seize the first occasion to curb her.
But France above all burned with this desire. A nation of high and
generous spirit in matters of war, she could not stomach her recent
defeats, her losses, and her tarnished dignity, and never ceased to
think of the means of repairing them; and no means seemed more
effective, no path more sure to gain her end, than to tear the
American colonies, so principal a part of English power, away from
England, and so rend the bosom of her adversary.
Such suggestions were much to the liking of the inhabitants of
English America; their minds were greatly stirred by them, and they
detested the grasping conduct of England all the more. Perhaps those
who most loved liberty, or most indulged ambition, even formed in the
most secret part of their minds the thought of throwing off the yoke
of English superiority at the first opportunity. The cession of the
neighboring province of Canada by France to England gave a still
stronger incentive to this. While Canada was under French rule, the
nearness of a restless people, powerful in arms, kept the colonists
generally in alarm, and they turned more eagerly and more often to
English aid, as the only source from which they could hope for
protection sufficient to keep that people within bounds and to check
its raids. But once the French were driven from Canada, the Americans
necessarily became more their own masters, relied more on their own
strength, and felt less need to look to others for their security.
Add to this that in the late war a good number of colonists, leaving
the arts of peace and taking up the sword in place of the hoe, had
learned the use of arms, inured their bodies to military toil,
hardened their spirits, and steeled them against the dangers of war;
laying aside the habits of farmers and merchants, they had put on
those of soldiers. And since the consciousness of one's own strength
multiplies it many times over, and he who thinks himself stronger
becomes less able to bear any kind of subjection, it is reasonable to
believe that the skill in war newly acquired and spread everywhere
among the Americans made them still more impatient of the English
yoke. They thought it a base and shameful thing that men who had
fought with such valor, and had often won victories over the soldiers
of a warlike, powerful, and glorious nation, should be ill treated by
some Minister three thousand miles away and by his agents. They
reflected that the present prosperity of England, an object of envy
to so many others, was in great part their own work. They argued that
with their blood and their substance they had repaid England for the
maternal care with which she had reared and raised them in their
infancy; that there was now more parity between the two nations; and
that they ought therefore to be treated on terms of greater equality.
So the Americans reasoned; and perhaps the less cautious among them
rose to greater hopes. The great body of the people, however, content
with the old terms of union with England, provided she gave up her
attempted and intended encroachments, abhorred total separation from
her; and if most had grown bolder in defending their rights and
privileges, they detested no less intensely the thought of casting
off every kind of dependence on their lawful sovereign. They
condemned it the more readily because such an attempt would have
required them not only to face by themselves the whole force of
England, which so many victories had made formidable to all the
world, but also to seek the aid of a nation so different from
themselves in language, customs, habits, and manners, against which,
following the banners of the common country, they had waged so long
and so fierce an enmity. Things might perhaps have continued in this
state for a long time yet, despite French suggestions on the one hand
and the new American boldness on the other, if, after the peace of
1763, England had not conceived unusual designs of new exactions, new
prohibitions, new duties, and new taxes.
English trade had reached the highest degree of prosperity toward the
end of the war with France, and it would not be easy to say how great
was the multitude of ships that brought the richest produce and goods
from every part of the world into the ports of Great Britain and
carried away the products, and especially the manufactures, of the
country, which were prized above all others by foreign nations. Since
the various goods imported or exported were taxed, some more and some
less, this trade had become the source of an abundant revenue to the
public treasury. But smuggling grew with it, to the very great injury
of that treasury. Wishing to counter so pernicious a plague, the
government made a regulation in 1764 ordering not only the commanders
of the armed cutters stationed on the coasts of England, but also
those of the other vessels sent to America, to act as customs
officers and to conform to the rules established for the customs. It
was indeed an unusual thing, and one of the worst effect, that those
brave officers who had fought the enemy with universal praise should
now become so many customs men, toll collectors, and market
inspectors. This regulation produced the most pernicious effects. In
the first place, the seamen, being little acquainted with customs
rules, seized and confiscated indiscriminately both ships that
carried prohibited goods and ships that did not; many abuses arose
from this, which, though promptly corrected in England, could not be
corrected as quickly in America, because of the distance and the
formalities to be observed. This raised a great outcry against the
law in the colonies. But it caused still greater harm. A trade had
long been carried on between the English and Spanish colonies that
was very profitable to both, and in the end to England as well. The
chief articles of this traffic were, on the side of the English
colonies, English manufactures, which the Americans had bought in
England with their own produce; and on the side of the Spaniards,
gold and silver in bars and in coin, cochineal and medicinal drugs,
and also livestock, especially mules, which the Americans carried to
the West India islands, where they were very highly valued. This
traffic gave the Americans an abundance of those metals, which
enabled them to make large purchases of English manufactures, and at
the same time supplied their country with a sufficient quantity of
gold and silver coin. Although not prohibited by the English laws of
trade, it was not expressly permitted either. The new customs
officers therefore believed it their duty to stop this traffic as if
it were contraband, and seized without distinction all ships, English
or foreign, that carried goods of this kind. It was thus soon
interrupted, to the serious injury of the mainland colonies and also
of the English islands themselves, especially Jamaica.
From these same causes another very important trade was ruined, the
one carried on between the English colonies of America on the one side
and the West Indies belonging to France on the other, which was of the
greatest utility to both. Its substance consisted of those provisions,
produce, or goods that were superfluous to the one and wanting to the
other. It is therefore no wonder that the colonists, as soon as they
received the news of so grave a loss, resolved to purchase in future
none of those English wares that are necessary or suitable for
clothing, and to use, as far as possible, no others than those made by
their own manufacturers, as well as to give every encouragement to
those manufactures that employed materials produced in abundance by
their own lands and animals. But in Boston especially, a rich and
populous city into which the luxury of English goods had greatly
entered, it cannot be said how much minds were stirred, nor with what
readiness people abandoned superfluities and joined in wishing to
return to the old modesty. A notable example of this was seen in
funeral ceremonies, which began to be held without mourning dress and
without English gloves. This new temperance spread so far in that city
that in the year 1764 more than ten thousand pounds sterling were
saved by it. Other towns followed the example, so that it became the
custom everywhere to set aside those superfluities that were the
products either of the manufactures or of the soil of England. Besides
this, and it was also a necessity because of the scarcity of money,
the merchants of the colonies, finding themselves debtors of large
sums to the English and unable to hope for new supplies from them
without new payments, which they were in no position to make, also
joined the general trend of saving, abstained from purchases, and gave
up their former delicacies and pomp, to the very grave damage of the
English manufacturers.
But the English government did not stop here, as if it were not
content with having generated ill will in America but wished moreover
to drive it to despair. In March 1764 a provision was passed in
Parliament by which, if on the one hand the traffic between the
American colonies and the French Antilles, and other islands belonging
to other European powers, was permitted, on the other the goods to be
brought from the latter into the former were burdened with such
exorbitant duties that, as usually happens, a very frequent contraband
arose in everything, with grave damage to commerce itself and equal
prejudice to mercantile morals and probity. To crown so much evil, the
same provision enacted that the money raised from these duties should
be paid in specie into the treasury of England. By this ordinance, if
some little money remained in the colonies, it was bound to draw all
of it away and carry it to England. The Americans grew still more
agitated on receiving news of so unusual a law, and went about saying
that these things contradicted each other; that this was to will the
end and at the same time take away the means of reaching it; because
on the one hand the government deprived them of every way of procuring
money, and on the other it wanted to draw it out of the country and
carry it three thousand miles away. But as though the Ministers feared
that the outburst of indignation roused by these new provisions might
calm too soon, they added yet another, which was passed in Parliament
fifteen days later. It ordered that the bills of credit that might in
future be issued by the various colonies in America could no longer
circulate as legal tender in payments; and that those already in
circulation likewise could not serve as legal payment beyond the term
fixed for their redemption and extinction. It is true, however, that
all the money to be raised from the aforesaid duties was, by other
articles of the provision, to be kept in reserve and employed only for
the expenses necessary for the protection of the colonies; and that at
the same time as the provision concerning bills of credit was passed,
some others were made to increase and regulate the mutual commerce
between the colonies and the common mother country, and that between
one colony and another. But these laws did not produce the effect
expected of them, because they were necessarily very slow in
operation, while those that restricted and enlarged the external
commerce of the colonies, or impeded their domestic traffic, were
bound to produce their effect at once. It is true also that some
affirmed that the greater part, not to say the whole, of the money
collected from these duties could not fail to return to the colonies
to pay the soldiers who were quartered there to defend and protect
them. But who assured the colonies that the troops would continue
