Detecting and translating language ambiguity with multilingual LLMs
Jun 29, 2024
About This Presentation
Language is one of the most important landmarks in human history. However, most languages can be ambiguous: the same text or speech can lead different readers or listeners to different actions. In this project we propose a method to detect the ambiguity of a sentence using translation by multilingual LLMs. In this context, we hypothesize that a good machine translator should preserve the ambiguity of sentences in all target languages. We therefore investigate whether ambiguity is encoded in the hidden representation of a translation model or whether only a single meaning is encoded. The potential applications of the proposed approach span i) detecting ambiguous sentences, ii) fine-tuning existing multilingual LLMs to preserve ambiguity, and iii) developing AI systems that can generate ambiguity-free language when needed.
Slide Content
Detecting and translating language ambiguity with multilingual LLMs
Behrang Mehrparvar ([email protected]), Sandro Pezzelle ([email protected])
ILLC, UvA. Summer 2024
Motivation!
What is Ambiguity?
Mariano Ceccato, Nadzeya Kiyavitskaya, Nicola Zeni, Luisa Mich, and Daniel M. Berry. 2004. Ambiguity identification and measurement in natural language texts. University of Trento.
Examples
“Give me the bat!” ~ Lexical
“The professor said on Monday he would give an exam” ~ Syntactic
“Jane saw the man with a telescope” ~ Semantic
“I like you too!” ~ Pragmatic
“The prof said she would give us all A’s.” ~ Vagueness
“Proposal” to “voorstel” and “aanzoek” ~ Translational
Apurwa Yadav, Aarshil Patel, and Manan Shah. 2021. A comprehensive review on resolving ambiguities in natural language processing. AI Open, 2:85–92.
Research questions
Question 1: Can a state-of-the-art Transformer-based MT model properly encode whether a sentence in the source language is (non-)ambiguous?
Question 2: Are both semantic validity and ambiguity preserved when the sentence is translated into a target language and then translated back?
Question 3: Can we predict the ambiguity of a sentence by translating it into another language and looking at the learned hidden representations?
Approach
Steps 1 & 2: Translation
Model: Facebook M2M100 418M, a many-to-many multilingual encoder-decoder (seq-to-seq) model.
Fan, A., Bhosale, S., Schwenk, H., Ma, Z., El-Kishky, A., Goyal, S., ... & Joulin, A. (2021). Beyond English-centric multilingual machine translation. Journal of Machine Learning Research, 22(107), 1-48.
Source language: English
Target languages (n=14): German, Greek, Persian, Spanish, French, Hindi, Italian, Korean, Dutch, Russian, Turkish, Croatian, Romanian, Chinese
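Steps 1 and 2 can be sketched with the Hugging Face transformers implementation of M2M100 (checkpoint "facebook/m2m100_418M"). The helper below is a hypothetical sketch, not the authors' code: it translates an English sentence into a target language and back, so the round-trip output can be compared against the source for preserved ambiguity.

```python
# Hypothetical sketch of steps 1 & 2 (not the authors' code): round-trip
# translation with M2M100 418M via Hugging Face transformers.
def back_translate(sentence, tgt_lang="nl", src_lang="en"):
    """Translate src_lang -> tgt_lang -> src_lang; return both outputs."""
    # Imported lazily so the sketch can be read/loaded without the checkpoint download.
    from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

    name = "facebook/m2m100_418M"
    tokenizer = M2M100Tokenizer.from_pretrained(name)
    model = M2M100ForConditionalGeneration.from_pretrained(name)

    def translate(text, src, tgt):
        tokenizer.src_lang = src
        encoded = tokenizer(text, return_tensors="pt")
        generated = model.generate(
            **encoded, forced_bos_token_id=tokenizer.get_lang_id(tgt)
        )
        return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]

    forward = translate(sentence, src_lang, tgt_lang)   # e.g. English -> Dutch
    backward = translate(forward, tgt_lang, src_lang)   # Dutch -> English
    return forward, backward

# Usage (downloads the checkpoint on first call):
# forward, backward = back_translate("Jane saw the man with a telescope", "nl")
```

Running this per sentence for each of the 14 target languages yields the forward and back translations the later steps operate on.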
Possible states
Step 3: Mapping function
Idea: “Can we map the source translation representation to the target translation representation?” “What are the properties of this mapping function?”
Step 3: Mapping function (cont.)
Idea: an autoencoder as the mapping function!
Normalized reconstruction error
Model complexity
Step 3: Mapping function (cont.)
Idea: “An autoencoder should behave differently when mapping ambiguous and unambiguous sentences.” “How the reconstruction error changes over model size and target language is predictive of sentence ambiguity.”
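The idea above can be illustrated with a toy numpy sketch (my own assumption of the setup, with random vectors standing in for the MT model's hidden representations). At its optimum, a linear autoencoder with bottleneck k attains the best rank-k reconstruction, which truncated SVD gives in closed form, so the normalized reconstruction error can be traced as a function of model size:

```python
# Toy sketch: normalized reconstruction error of a rank-k linear autoencoder
# (computed via truncated SVD, which is the linear-AE optimum) as the
# bottleneck size k grows. Random data stands in for MT hidden states.
import numpy as np

rng = np.random.default_rng(0)

def reconstruction_errors(X, sizes):
    """Normalized reconstruction error for each bottleneck size in `sizes`."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    errors = []
    for k in sizes:
        Xk = (U[:, :k] * S[:k]) @ Vt[:k]              # best rank-k reconstruction
        errors.append(np.linalg.norm(Xc - Xk) / np.linalg.norm(Xc))
    return errors

# 200 fake "sentence representations" of dimension 32.
X = rng.normal(size=(200, 32))
errs = reconstruction_errors(X, sizes=[2, 8, 16, 32])
print([round(e, 3) for e in errs])  # error shrinks as the bottleneck grows
```

How this error curve differs between ambiguous and unambiguous sentences, per target language, is exactly the signal the classification step feeds on.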
Sample mappings
Andrei picked up the chair or the bag and the telescope (ambiguous)
Andrei picked up the chair, or both the bag and the telescope (unambiguous)
Step 4: Engineering approach
Is the reconstruction error informative?
Step 4: Classification
Input: single language with all models; all languages with the best model; only along languages; whole mapping-function info
Input variable: reconstruction error; reconstruction error differences
Output: ambiguous vs. unambiguous; ambiguity type (4 classes)
Model: logistic regression; neural network
Cross-validation: 10-fold
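The classification setup can be sketched with scikit-learn. The feature matrix below is synthetic, my own stand-in for the real reconstruction-error features: each row mimics error differences across 14 target languages, with the "ambiguous" class shifted slightly so the pipeline has something to learn.

```python
# Hedged sketch of step 4: ambiguous vs. unambiguous classification from
# reconstruction-error features, logistic regression, 10-fold CV.
# The data is synthetic; only the pipeline mirrors the slides.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
n_per_class, n_langs = 100, 14

# Synthetic reconstruction-error differences for the two classes.
unambiguous = rng.normal(loc=0.0, scale=1.0, size=(n_per_class, n_langs))
ambiguous = rng.normal(loc=0.8, scale=1.0, size=(n_per_class, n_langs))
X = np.vstack([unambiguous, ambiguous])
y = np.array([0] * n_per_class + [1] * n_per_class)

clf = LogisticRegression(max_iter=1000)
scores = cross_val_score(clf, X, y, cv=10)   # 10-fold CV, as in the slides
print(f"mean accuracy: {scores.mean():.3f}")
```

Swapping `LogisticRegression` for a small neural network (e.g. `MLPClassifier`) gives the NN rows of the results table.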
Results

Input             | Input variable | Output         | Model    | Accuracy | F-measure
------------------|----------------|----------------|----------|----------|----------
Engineering appr. | Alpha          | Amb. vs unamb. | Logistic | 52.53%   | 0.525
Persian           | Differences    | Amb. vs unamb. | Logistic | 57.81%   | 0.578
Best AE           | Values         | Amb. vs unamb. | Logistic | 66.67%   | 0.667
Along languages   | Differences    | Amb. vs unamb. | Logistic | 85.87%   | 0.859
Whole             | Differences    | Amb. type      | Logistic | 92.83%   | 0.928
Whole             | Differences    | Amb. vs unamb. | Logistic | 88.19%   | 0.882
Best AE           | Values         | Amb. vs unamb. | NN       | 73.20%   | 0.732
Whole             | Values         | Amb. vs unamb. | NN       | 81.99%   | 0.820
Whole             | Values         | Amb. type      | NN       | 78.26%   | -
Whole             | Differences    | Amb. type      | NN       | 93.03%   | 0.925
Whole             | Differences    | Amb. vs unamb. | NN       | 94.94%   | 0.949
Findings
Single-language translation is not informative enough to predict ambiguity: 57.81% vs. 85.87% after adding more features.
A single best autoencoder is not informative enough to predict ambiguity: 66.67% vs. 88.19% after adding features about the gradual change over the size of the AE model.
Adding reconstruction-error differences between languages improves accuracy: 85.87% vs. 88.19% after adding features about the properties of the mapping-function mesh.
Findings (cont.)
Reconstruction-error differences are more informative than their raw values: 81.99% vs. 94.94%. The shape of the mapping function is informative, not its position.
A simple linear model can perform relatively close to a complex NN model: 88.19% vs. 94.94%, with the NN learning more complex features.
Predicting more detailed classes improves accuracy in linear models: 88.19% to 92.83% (LR), thanks to more detailed classes in the misclassified regions (see next slide); for the NN it drops slightly, 94.94% to 93.04%, since its boundaries are nonlinear anyway.
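The "Values" vs. "Differences" input variables can be illustrated in a few lines of numpy (toy numbers, my own): pairwise differences discard the absolute level of the reconstruction errors and keep only the shape of the error profile, matching the finding that the shape, not the position, is informative.

```python
# Toy illustration of the two input variables from the findings:
# "Values" = raw per-language reconstruction errors;
# "Differences" = their pairwise differences, invariant to a constant shift.
import numpy as np

errors = np.array([0.31, 0.28, 0.35, 0.30])      # toy per-language errors
values_feature = errors                           # the "Values" input variable
i, j = np.triu_indices(len(errors), k=1)
diffs_feature = errors[i] - errors[j]             # the "Differences" input variable
print(diffs_feature)
```

Shifting every error by the same constant changes `values_feature` but leaves `diffs_feature` untouched, which is why the difference features capture only the shape of the error curve.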
Data distribution (PCA)
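The PCA projection behind this slide can be reproduced in miniature with numpy's SVD (synthetic features here; the real input would be the reconstruction-error feature vectors):

```python
# Minimal PCA sketch: project a feature matrix onto its first two
# principal components, as presumably plotted on the slide.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 14))                  # toy feature matrix
Xc = X - X.mean(axis=0)                        # center the data
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
proj = Xc @ Vt[:2].T                           # first two principal components
print(proj.shape)                              # (50, 2)
```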
Source of misclassification
Misclassification examples
Italian (translation problem):
Source sentence: Andrei and Danny put down the yellow bag and chair (amb.)
Machine translation: Andrei e Danny mettono il sacchetto giallo e la sedia (unamb.) [“Andrei and Danny put the yellow bag and the chair”]
Machine translation back: Andrei and Danny put the yellow bag and the chair. (unamb.)
Human translation: Andrei e Danny mettono il sacchetto e la sedia color giallo (amb.) [“Andrei and Danny put the bag and the chair, yellow in color”]
Persian (language problem):
Source sentence: Andrei left the person with a green bag (amb.)
Machine translation: آندری مرد را با چمدان سبز ترک کرد (unamb.) [“Andrei left the man with a green suitcase”]
Machine translation back: Andrew left the man with a green bag. (amb.)
Research questions (answers)
Question 3 (predicting): Yes, we can predict language ambiguity using translation with LLMs with high accuracy.
Question 2 (preservation): It depends on the language. The problem can stem either from the translation or from the target language itself.
Question 1 (encoding): Yes; since ambiguity can be preserved in translation and predicted as well, the information cannot have been lost in encoding.
Contributions
Predicting ambiguity without direct use of semantics
Data complexity: can be trained with little training data
Generalizability: easily generalizes to new, unseen samples
Robustness: no need to update the classifier model when the input distribution changes
Future directions
Interpretability: detecting the source of ambiguity (which word?)
Extendibility: extending to source languages other than English
Analysis: source of misclassification in all languages via data annotation
Potential applications
Automatic translation of critical documents (e.g. legal, political, commercial): ask the user for an unambiguous sentence.
Fine-tuning existing multilingual LLMs to prevent ambiguity: classify the sentence, get the gradient w.r.t. the input, and use it as a loss-function term.
Developing AI systems for generating ambiguity-free languages (www.Synaptosearch.com).