UNI DI NAPOLI FEDERICO II - The role of graphs in Hybrid Conversational AI

neo4j 82 views 36 slides May 14, 2024

About This Presentation

Prof. Antonio Origlia – Università degli Studi di Napoli Federico II
Dip. di Ingegneria Elettrica e Tecnologie dell’Informazione & Centro di Ricerca URBAN/ECO


Slide Content

The role of graphs in Hybrid Conversational AI

The team
Dr. Maria Di Maro: post-doc in computational linguistics, expertise in Common Ground management and Clarification Requests
Martina Di Bratto: PhD student in computational linguistics, expertise in argumentation-based dialogue
Sabrina Mennella: PhD student in computational linguistics, expertise in common sense representation
Roberto Basile Giannini: Master student in computer science, working on Influence Diagrams for Conversation Management
Danilo Esposito: Master student in computer science, working on instruction sequence generation based on common sense reasoning
Marco Galantino: Bachelor student in computer science, working on expressive speech synthesis
Yegor Napadystyy: Bachelor student in computer science, working on Nvidia Audio2Face REST control

Modelling language vs modelling communication
Large Language Models can capture conversation dynamics but not the motives behind them.
Being based on machine learning, they capture regularities and correlations but cannot model causation.
Predicting the most probable utterance, given what people usually say, and presenting it as the AI's position is wrong.
Choosing to talk (or not to talk) serves purposes: language is not an abstract sequence of symbols to be predicted.
Being able to consider, if not understand, the consequences of its own actions is a necessary step for machines to produce language with actual intents and avoid harmful content.
Machine Learning is only one of many powerful tools to understand language and human behaviour in general.

Modelling language vs modelling communication
01 Theoretical Model: theoretical framework built upon the observation of communicative behaviour patterns
02 Computational Model: formal representation of the theoretical model
03 Evaluation on the field: deployment of the computational model in human-centered tasks for evaluation purposes

It is a tale told by an idiot, full of sound and fury, signifying nothing. Macbeth

Dialogue system

Dialogue flow
Pipeline 1: ASR -> LLM -> TTS
Pipeline 2: ASR -> Common Ground Representation -> Decision Making -> LLM -> TTS
Generative AI is good for Natural Language Generation but not for Dialogue Management.

Illusion of intelligence
Simple wired vehicles can lead humans to attribute intelligent behaviour to machines even though only reactive capabilities are given.
Machine Learning is very good at cleaning data by recognising underlying patterns inside noisy channels.
Are Neural Networks very complex Braitenberg vehicles? They may appear to be intelligent without having actual intentionality.
"[…] when we look at these machines or vehicles as if they were animals in a natural environment ... we will be tempted, then, to use psychological language in describing their behavior. And yet we know very well that there is nothing in these vehicles that we have not put in ourselves."
Braitenberg, V. (1986). Vehicles: Experiments in synthetic psychology. MIT Press.

Austin, J. L. (1962). Speech acts.

Conflict Search Graph
Di Maro, M., Origlia, A., & Cutugno, F. (2021). Cutting melted butter? Common Ground inconsistencies management in dialogue systems using graph databases. IJCoL. Italian Journal of Computational Linguistics, 7(7-1, 2), 157-190.

Bastian

The ladder of causation
Understanding one's own role in the world gives meaning to producing linguistic acts.
Machine learning based agents are passive observers of the world and can only react to it, not act in it.
Modelling the effects of acting in the world, including speaking, and confronting the consequences makes it possible to generate language for its intended purpose.
Retrospective reasoning is a fundamental ability that relies on causal modelling capabilities.
Pearl, J., & Mackenzie, D. (2018). The book of why: the new science of cause and effect. Basic Books.

Doing
"To say something is to do something; or in which by saying or in saying something we are doing something." Austin (Language Philosophy, 1962)
"The first cognitive ability, seeing or observation, is the detection of regularities in our environment, and it is shared by many animals as well as early humans before the Cognitive Revolution. The second ability, doing, stands for predicting the effect(s) of deliberate alterations of the environment, and choosing among these alterations to produce a desired outcome. Usage of tools, provided they are designed for a purpose and not just picked up by accident or copied from one's ancestors, could be taken as a sign of reaching this second level." Pearl (Computer Science, 2018)

Language as a tool
"I have a dream"
"But it is not this day"
"When diplomacy ends, war begins"
"The first step in the development of taste is to be willing to credit your own opinion"
"Blessed are those who are persecuted because of righteousness"
"Does he look like a bitch?"

Machine learning
Decision making
Counterfactual reasoning

The role of graphs
Good to represent data in a human interpretable way
Backed by considerable math
Can be projected in numerical spaces for the machine to work with
A modern way to recover inference engines
Support grounding and explainability

The role of graphs
Network analysis algorithms help to study the problem from a mathematical point of view.
The Hyperlink-Induced Topic Search (HITS) algorithm scores nodes in terms of:
Authority: how important the node is in the network
Hub: how important the node's relationships are
Paglieri, F., & Castelfranchi, C. (2005). Revising beliefs through arguments: Bridging the gap between argumentation and belief revision in MAS. Argumentation in multi-agent systems: First international workshop.
Kleinberg, J. M. (1999). Authoritative sources in a hyperlinked environment. Journal of the ACM (JACM), 46(5), 604-632.
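To make the two HITS scores concrete, here is a minimal pure-Python sketch of the power iteration described by Kleinberg (1999). The small argument graph (arg1, arg2, arg3, claim, rebuttal) is a hypothetical example and is not taken from the slides; a graph database would normally run this through its own graph algorithm library.

```python
from math import sqrt


def hits(edges, iterations=50):
    """Score the nodes of a directed graph with HITS power iteration.

    edges is a list of (source, target) pairs.
    Returns two dicts: hub scores and authority scores.
    """
    nodes = sorted({node for edge in edges for node in edge})
    hub = {node: 1.0 for node in nodes}
    authority = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Authority: sum of the hub scores of the nodes pointing at this node.
        authority = {node: 0.0 for node in nodes}
        for source, target in edges:
            authority[target] += hub[source]
        # Hub: sum of the authority scores of the nodes this node points at.
        hub = {node: 0.0 for node in nodes}
        for source, target in edges:
            hub[source] += authority[target]
        # Rescale both score vectors to unit length so they stay bounded.
        for scores in (authority, hub):
            norm = sqrt(sum(value * value for value in scores.values())) or 1.0
            for node in scores:
                scores[node] /= norm
    return hub, authority


# Hypothetical graph: three arguments support one claim, one also supports a rebuttal.
edges = [
    ("arg1", "claim"),
    ("arg2", "claim"),
    ("arg3", "claim"),
    ("arg1", "rebuttal"),
]
hub, authority = hits(edges)
top_authority = max(authority, key=authority.get)
top_hub = max(hub, key=hub.get)
```

In this toy graph the node supported by the most arguments gets the highest authority, and the argument that supports the most nodes gets the highest hub score.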

The role of graphs
Text embeddings are numeric vectors that capture the general topics found in textual content (movie plots).
Node embeddings are numeric vectors that capture the role of the node in the network, in terms of neighbourhood.
Weighted embeddings can be computed using text similarities.
Chen, H., Sultan, S. F., Tian, Y., Chen, M., & Skiena, S. (2019). Fast and accurate network embeddings via very sparse random projection. In Proceedings of the 28th ACM international conference on information and knowledge management (pp. 399-408).
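The idea of a neighbourhood-based node embedding can be sketched as follows, in the spirit of the very sparse random projection approach of Chen et al. (2019). This is a simplified illustration, not a faithful reimplementation: it omits the degree-based normalisation of the paper, and the function name, parameters, and the tiny movie/genre graph are assumptions. Edge weights stand in for text similarities.

```python
import random
from math import sqrt


def sparse_projection_embeddings(adjacency, dim=16, step_weights=(1.0, 1.0), s=3, seed=0):
    """Simplified random-projection node embeddings (illustrative sketch).

    adjacency maps each node to a dict {neighbour: edge_weight}; every
    neighbour must also be a key of adjacency.
    """
    rng = random.Random(seed)
    scale = sqrt(s)

    def sparse_random_vector():
        # Entries are +sqrt(s) or -sqrt(s) with probability 1/(2s) each, else 0.
        vector = []
        for _ in range(dim):
            u = rng.random()
            vector.append(scale if u < 1 / (2 * s) else -scale if u < 1 / s else 0.0)
        return vector

    current = {node: sparse_random_vector() for node in adjacency}
    embedding = {node: [0.0] * dim for node in adjacency}
    for step_weight in step_weights:
        # Propagation step: each node takes the weighted average of its
        # neighbours' current vectors.
        propagated = {}
        for node, neighbours in adjacency.items():
            total = sum(neighbours.values())
            propagated[node] = [
                sum(w * current[m][d] for m, w in neighbours.items()) / total if total else 0.0
                for d in range(dim)
            ]
        current = propagated
        # Accumulate the weighted contribution of this neighbourhood depth.
        for node in adjacency:
            embedding[node] = [e + step_weight * c for e, c in zip(embedding[node], current[node])]
    return embedding


# Hypothetical graph: two movies share the same genres with the same weights.
graph = {
    "movie1": {"genre1": 1.0, "genre2": 0.5},
    "movie2": {"genre1": 1.0, "genre2": 0.5},
    "movie3": {"genre3": 1.0},
    "genre1": {"movie1": 1.0, "movie2": 1.0},
    "genre2": {"movie1": 0.5, "movie2": 0.5},
    "genre3": {"movie3": 1.0},
}
emb = sparse_projection_embeddings(graph)
```

Because the embedding of a node depends only on its weighted neighbourhood, movie1 and movie2 end up with the same vector here, which is what "capturing the role of the node in the network" means in this sketch.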

The role of graphs
Interpretability: a graph is uninterpretable if any of the graph patterns describing each of the foreseen communication problems is activated. In these cases, a clarification request is produced.
Completeness: an interpretable graph is incomplete if information needed to respond to the user intent is missing. In these cases, a request for information is produced.
Coherence: a complete graph is incoherent if logical conflicts are found in the belief graph. In these cases, the adequate disambiguation question is produced.
Stability: a coherent graph is unstable if there are open issues, like unanswered questions. In these cases, a question answering strategy is activated.
Desirability: a stable graph is undesirable if it does not exhibit the goal pattern. In these cases, the most useful dialogue move to create the goal pattern is produced, like an exploration or exploitation move.
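The five properties form an ordered cascade: each check only makes sense once the previous one has passed. A minimal sketch of that ordering follows. The GraphState class, its field names, and the move labels are hypothetical stand-ins; in the system described in the slides each check would be a pattern match against the graph rather than a list lookup.

```python
from dataclasses import dataclass, field


@dataclass
class GraphState:
    """Hypothetical summary of pattern matches found in the dialogue graph."""
    communication_problems: list = field(default_factory=list)
    missing_information: list = field(default_factory=list)
    conflicts: list = field(default_factory=list)
    open_issues: list = field(default_factory=list)
    goal_reached: bool = False


def next_dialogue_move(state):
    """Return the move for the first property, in order, that the graph fails."""
    if state.communication_problems:   # not interpretable
        return "clarification_request"
    if state.missing_information:      # interpretable but incomplete
        return "information_request"
    if state.conflicts:                # complete but incoherent
        return "disambiguation_question"
    if state.open_issues:              # coherent but unstable
        return "question_answering"
    if not state.goal_reached:         # stable but undesirable
        return "exploration_or_exploitation_move"
    return "no_move_needed"


# A conflict takes priority over an unanswered question.
move = next_dialogue_move(GraphState(conflicts=["c1"], open_issues=["q1"]))
```

The early returns encode the priority order of the slide: a graph with both a logical conflict and an open question is treated as incoherent first.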

Communication as a graph stabilisation game

Example

Decision making
Bayesian models take into account all the data that is needed to take a decision.
Can consider machine learning performance on test sets as priors
Separate priors from evidence
Consider the utility of graph configurations given the possibility to take decisions
Where do the models come from?
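A small worked example may help to see how priors, evidence, and utilities stay separate. In this sketch the prior is a classifier's test-set accuracy, the evidence is an observed user reaction, and the decision maximises expected utility. All numbers, hypothesis names, and action names are invented for illustration.

```python
def posterior(prior, likelihood, observation):
    """Bayes' rule: combine a prior over hypotheses with one observation."""
    unnormalised = {h: prior[h] * likelihood[h][observation] for h in prior}
    total = sum(unnormalised.values())
    return {h: p / total for h, p in unnormalised.items()}


def best_action(belief, utility):
    """Pick the action with the highest expected utility under the belief."""
    expected = {
        action: sum(belief[h] * payoff[h] for h in belief)
        for action, payoff in utility.items()
    }
    return max(expected, key=expected.get), expected


# Prior: assumed test-set accuracy of an intent classifier (80%).
prior = {"classifier_correct": 0.8, "classifier_wrong": 0.2}
# Evidence model: assumed probability of each user reaction under each hypothesis.
likelihood = {
    "classifier_correct": {"hesitates": 0.2, "proceeds": 0.8},
    "classifier_wrong": {"hesitates": 0.7, "proceeds": 0.3},
}
# Utilities: assumed payoff of each action under each hypothesis.
utility = {
    "act_on_prediction": {"classifier_correct": 1.0, "classifier_wrong": -2.0},
    "ask_confirmation": {"classifier_correct": 0.3, "classifier_wrong": 0.3},
}

action_before, _ = best_action(prior, utility)
belief = posterior(prior, likelihood, "hesitates")
action_after, expected_after = best_action(belief, utility)
```

With the prior alone, acting on the prediction has the higher expected utility (0.4 against 0.3). After observing hesitation the belief that the classifier was right drops to about 0.53, and asking for confirmation becomes the better move.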

Decision making Relevant subgraph extraction

Why (not) to speak?
Desire: The graph representing the situation I am part of is not what I want it to be. What is the linguistic act that maximises the probability that the desired graph configuration will appear?
Competence: I am not able to confidently predict what is going to happen given the current graph configuration. Can I ask a question to collect more information and improve my prediction capabilities?
Curiosity: The current graph configuration probably contains missing information. Can I ask a question to complete it?
Hypothesizing: Things went differently from what I expected. What would have been the graph configuration if I did something different?
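The Desire and Competence drivers can be read as a simple selection rule, sketched below. Using Shannon entropy as the measure of predictive confidence, the threshold value, and all act names and probabilities are assumptions made for this illustration.

```python
from math import log2


def entropy(distribution):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * log2(p) for p in distribution.values() if p > 0)


def choose_act(goal_probability, outcome_prediction, max_entropy_bits=0.9):
    """Competence first: ask when the outcome is too uncertain to predict.
    Otherwise Desire: pick the act most likely to produce the goal configuration.
    """
    if entropy(outcome_prediction) > max_entropy_bits:
        return "ask_question"
    return max(goal_probability, key=goal_probability.get)


# Hypothetical probability that each act leads to the desired graph configuration.
goal_probability = {"recommend": 0.7, "explain": 0.4, "stay_silent": 0.1}

uncertain = choose_act(goal_probability, {"accepts": 0.5, "rejects": 0.5})
confident = choose_act(goal_probability, {"accepts": 0.9, "rejects": 0.1})
```

Staying silent is listed as an act like any other, so not speaking is chosen whenever it is the most likely way to reach the desired configuration.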

Illocutionary forces
[Figure] Labels: Illocutionary strength, Problem complexity
Problem types: Interpretability, Incompleteness, Incoherence, Instability, Undesirability
Entries: Unfilled core FE, Perception, Syntactic, Lexical, Reference, Unfilled non-core FE, Information optionality, Belief strength, Answer complexity, Structured QA, Entropy, Deliberate, Information sharing, Explanation, RAG, Neutral conflict, CS conflict, -IAR conflict, +IAR conflict

Behaviour Trees

FANTASIA
Origlia, A., Cutugno, F., Rodà, A., Cosi, P., & Zmarich, C. (2019). FANTASIA: a framework for advanced natural tools and applications in social, interactive approaches. Multimedia Tools and Applications, 78(10), 13613-13648.
Origlia, A., Di Bratto, M., Di Maro, M., & Mennella, S. (2022). Developing Embodied Conversational Agents in the Unreal Engine: The FANTASIA Plugin. In Proceedings of the 30th ACM International Conference on Multimedia (pp. 6950-6951).

The Metafamily – Maya 2017 – 2023 Spatially aware presentations

The Metafamily – Bastian 2020 – 2023 Common ground inconsistencies

The Metafamily – Jason 2021 – 2023 User profiling and recommendation

Desideratum: Symbolic AI, Statistical AI, Markov Random Fields, Embedding, Logical Neural Networks
Neural nets as universal solvers: X X X
Allow specialised sub-units: X X X X
Meta-learning/Multi-task: X - X
Modular: X - - X
Can use prior knowledge: X X X
True reasoning: X - - X
Variables: X X X X
Symbol manipulation: X ---
Generic models: X X X X X
Causality: - - ---
Planning capabilities: X X -
Perception / reasoning blending: - - X
Perform true NLU with novel interpretation generation: -
Acquire knowledge through Natural Language: ---
Learn with less data: ---

Conclusions
Modern AI provides great tools to project physical observations into n-dimensional numerical spaces, which can represent the mind of a machine.
A machine can only communicate with intention if it is given the chance to explore and learn about the consequences of the act of speaking.
Psychological theories about motivation and emotion provide drivers for the machine to evaluate what speaking or not speaking implies.
Research is too focused on depleting the mine of every new technology that is presented: many papers show applications of new models, but little research addresses why these models work.
There is a commercial push towards brute-force approaches: only the big players have the strength to pursue them, so democratization is a problem.
Computer science is perhaps the most introspective of the hard sciences. Developing machines that talk can help model intelligence and build tools to stimulate thought through unbiased dialogues.