Detecting and translating language ambiguity with multilingual LLMs

About This Presentation

Language is one of the most important landmarks in human history. However, most languages can be ambiguous, meaning that the same conveyed text or speech results in different actions by different readers or listeners. In this project we propose a method to detect the ambiguity of a sentence us...


Slide Content

Detecting and translating language ambiguity with multilingual LLMs. Behrang Mehrparvar ([email protected]), Sandro Pezzelle ([email protected]), ILLC, UvA. Summer 2024

Motivation!

What is Ambiguity? Mariano Ceccato, Nadzeya Kiyavitskaya, Nicola Zeni, Luisa Mich, and Daniel M. Berry. 2004. Ambiguity identification and measurement in natural language texts. University of Trento.

Examples: “Give me the bat!” ~ Lexical; “The professor said on Monday he would give an exam” ~ Syntactic; “Jane saw the man with a telescope” ~ Semantic; “I like you too!” ~ Pragmatic; “The prof said she would give us all A’s.” ~ Vagueness; “Proposal” to “voorstel” and “aanzoek” ~ Translational. Apurwa Yadav, Aarshil Patel, and Manan Shah. 2021. A comprehensive review on resolving ambiguities in natural language processing. AI Open, 2:85–92.

Research questions. Question 1: Can a state-of-the-art Transformer-based MT model properly encode whether a sentence in the source language is (non-)ambiguous? Question 2: Are both semantic validity and ambiguity preserved when the sentence is translated into a target language and then translated back? Question 3: Can we predict the ambiguity of a sentence by translating it into another language and looking at the learned hidden representations?

Approach

Steps 1 & 2: Translation. Model: Facebook M2M100 418M, a many-to-many multilingual encoder-decoder (seq-to-seq) model. Fan, A., Bhosale, S., Schwenk, H., Ma, Z., El-Kishky, A., Goyal, S., ... & Joulin, A. (2021). Beyond English-centric multilingual machine translation. Journal of Machine Learning Research, 22(107), 1-48. Source language: English. Target languages (n=14): {German, Greek, Persian, Spanish, French, Hindi, Italian, Korean, Dutch, Russian, Turkish, Croatian, Romanian, Chinese}.
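As an illustration of these two steps, the sketch below translates an English sentence into one of the target languages and back, assuming the Hugging Face transformers library and the facebook/m2m100_418M checkpoint; the example sentence is the semantic-ambiguity example from the Examples slide, and Dutch is picked arbitrarily from the target-language list.

```python
# Minimal sketch of steps 1 & 2: English -> target language -> English
# with M2M100 418M (requires: pip install transformers sentencepiece torch).
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")
tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")

def translate(text: str, src_lang: str, tgt_lang: str) -> str:
    """Translate text from src_lang to tgt_lang with M2M100."""
    tokenizer.src_lang = src_lang
    inputs = tokenizer(text, return_tensors="pt")
    generated = model.generate(
        **inputs, forced_bos_token_id=tokenizer.get_lang_id(tgt_lang)
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]

sentence = "Jane saw the man with a telescope"               # semantically ambiguous
forward = translate(sentence, src_lang="en", tgt_lang="nl")  # step 1: en -> nl
back = translate(forward, src_lang="nl", tgt_lang="en")      # step 2: nl -> en
print(forward)
print(back)
```

Comparing the back-translation with the source sentence is what step 2 probes: whether semantic validity and ambiguity survive the round trip.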

Possible states

Step 3: Mapping function. Idea: “Can we map the source translation representation to the target translation representation?” “What are the properties of this mapping function?”

Step 3: Mapping function (cont.). Idea: an autoencoder as the mapping function, characterized by its normalized reconstruction error and its model complexity.

Step 3: Mapping function (cont.). Idea: “An autoencoder should behave differently in mapping ambiguous and unambiguous sentences.” “How the reconstruction error changes over model size and target language is predictive of sentence ambiguity.”
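The slides do not specify the autoencoder's architecture or training procedure, so the following is only a minimal PyTorch sketch of the idea, with assumed dimensions, bottleneck widths, and hyperparameters: fit a bottlenecked mapper from source-sentence representations to target-sentence representations and record its normalized reconstruction error at several model sizes.

```python
# Illustrative sketch: a bottlenecked, autoencoder-style mapper from source
# representations to target representations; its normalized reconstruction
# error, tracked over the bottleneck width, is the feature used later.
import torch
import torch.nn as nn

def mapping_error(src_reps, tgt_reps, bottleneck, epochs=200, lr=1e-3):
    """Fit a small src -> tgt mapper and return its normalized MSE."""
    d_in, d_out = src_reps.shape[1], tgt_reps.shape[1]
    mapper = nn.Sequential(
        nn.Linear(d_in, bottleneck), nn.ReLU(), nn.Linear(bottleneck, d_out)
    )
    opt = torch.optim.Adam(mapper.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.mse_loss(mapper(src_reps), tgt_reps)
        loss.backward()
        opt.step()
    with torch.no_grad():
        err = nn.functional.mse_loss(mapper(src_reps), tgt_reps)
        return (err / tgt_reps.var()).item()   # normalize by target variance

# Toy stand-ins for pooled sentence representations from the MT model.
src_reps = torch.randn(32, 1024)   # e.g. English encoder states
tgt_reps = torch.randn(32, 1024)   # e.g. target-language states
errors = {k: mapping_error(src_reps, tgt_reps, bottleneck=k)
          for k in (8, 32, 128, 512)}          # the "model complexity" axis
print(errors)
```

Under the hypothesis above, the curve of these errors over bottleneck width (and over target languages) should look different for ambiguous and unambiguous sentences.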

Sample mappings: “Andrei picked up the chair or the bag and the telescope” (ambiguous); “Andrei picked up the chair, or both the bag and the telescope” (unambiguous).

Step 4: Engineering approach

Is the reconstruction error informative?

Step 4: Classification. Input: single language, all models; all languages, best model; only along languages; whole mapping-function information. Input variable: reconstruction error; reconstruction-error differences. Output: ambiguous vs. unambiguous; ambiguity type (4 classes). Model: logistic regression; neural network. Cross-validation: 10-fold.
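A minimal sketch of this classification setup with scikit-learn, assuming logistic regression and 10-fold cross-validation as on the slide; the feature matrix and labels below are random placeholders standing in for the reconstruction errors (or their differences) across target languages and autoencoder sizes.

```python
# Sketch of step 4: classify ambiguous vs. unambiguous sentences from
# reconstruction-error features with logistic regression and 10-fold CV.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_sentences, n_langs, n_sizes = 400, 14, 4
X = rng.random((n_sentences, n_langs * n_sizes))   # placeholder error features
y = rng.integers(0, 2, size=n_sentences)           # 0 = unambiguous, 1 = ambiguous

clf = LogisticRegression(max_iter=1000)
scores = cross_val_score(clf, X, y, cv=10, scoring="accuracy")  # 10-fold CV
print(f"mean accuracy: {scores.mean():.3f}")
```

Swapping LogisticRegression for a small neural network (e.g. scikit-learn's MLPClassifier) gives the NN rows of the results table.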

Results

| Input | Input variable | Output | Model | Accuracy | F-measure |
|---|---|---|---|---|---|
| Engineering appr. | Alpha | Amb. vs unamb. | Logistic | 52.53% | 0.525 |
| Persian | Differences | Amb. vs unamb. | Logistic | 57.81% | 0.578 |
| Best AE | Values | Amb. vs unamb. | Logistic | 66.67% | 0.667 |
| Along languages | Differences | Amb. vs unamb. | Logistic | 85.87% | 0.859 |
| Whole | Differences | Amb. type | Logistic | 92.83% | 0.928 |
| Whole | Differences | Amb. vs unamb. | Logistic | 88.19% | 0.882 |
| Best AE | Values | Amb. vs unamb. | NN | 73.20% | 0.732 |
| Whole | Values | Amb. vs unamb. | NN | 81.99% | 0.820 |
| Whole | Values | Amb. type | NN | 78.26% | - |
| Whole | Differences | Amb. type | NN | 93.03% | 0.925 |
| Whole | Differences | Amb. vs unamb. | NN | 94.94% | 0.949 |

Findings. Single-language translation is not informative enough for predicting ambiguity: 57.8059% → 85.865% after adding more features. The single best autoencoder is not informative enough for predicting ambiguity: 66.6667% → 88.1857% after adding more features about the gradual change over the size of the AE model. Adding reconstruction-error differences between languages improves accuracy: 85.865% → 88.1857% after adding more features about the properties of the mapping-function mesh.

Findings (cont.). Reconstruction-error differences are more informative than their values: 81.9876% → 94.9367%; the shape of the mapping function is informative, not its position. A simple linear model can perform relatively close to a complex NN model: 88.1857% vs. 94.9367%, with the NN learning more complex features. Predicting more detailed classes improves the accuracy of the linear model, 88.1857% → 92.827% (LR), with more detailed regions in the misclassified areas (see next slide), but not of the NN, 94.9367% → 93.038% (NN), which learns nonlinear boundaries anyway.

Data distribution (PCA)

Source of misclassification

Misclassification examples. Italian (translation problem): Source sentence: “Andrei and Danny put down the yellow bag and chair” (amb.). Machine translation: “Andrei e Danny mettono il sacchetto giallo e la sedia” (unamb.). Machine translation back: “Andrei and Danny put the yellow bag and the chair.” (unamb.). Human translation: “Andrei e Danny mettono il sacchetto e la sedia color giallo” (amb.). Persian (language problem): Source sentence: “Andrei left the person with a green bag” (amb.). Machine translation: “آندری مرد را با چمدان سبز ترک کرد” (unamb.). Machine translation back: “Andrew left the man with a green bag.” (amb.)

Research questions, answered. Question 3 (Predicting): yes, we can predict language ambiguity using translation with LLMs with high accuracy. Question 2 (Preservation): it depends on the language; the problem can lie either in the translation or in the target language itself. Question 1 (Encoding): yes; since ambiguity can be preserved in translation and predicted as well, the information should not have been lost in encoding.

Contributions. Predicting ambiguity without direct use of semantics. Data complexity: can be trained with little training data. Generalizability: easily generalizable to new, unseen samples. Robustness: no need to update the classifier model when the input distribution changes.

Future directions. Interpretability: detecting the source of ambiguity (which word?). Extensibility: extending to source languages other than English. Analysis: investigating the source of misclassification in all languages via data annotation.

Potential applications. Automatic translation of critical documents (e.g. legal, political, commercial): ask the user for an unambiguous sentence. Fine-tuning existing multilingual LLMs to prevent ambiguity. Developing AI systems for generating ambiguity-free languages (www.Synaptosearch.com). Classify the sentence, get the gradient w.r.t. the input, and use it as a loss-function term (see the sketch below).
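The gradient-as-loss idea is sketched below in PyTorch; the ambiguity classifier, the representation dimension, and the penalty weight are placeholders for illustration, not the trained models from this work.

```python
# Hedged sketch: score a sentence representation with an ambiguity classifier,
# inspect the gradient of that score w.r.t. the input, and add the score as a
# penalty term to the main training loss.
import torch
import torch.nn as nn

ambiguity_clf = nn.Sequential(nn.Linear(1024, 1), nn.Sigmoid())  # placeholder

sent_rep = torch.randn(1, 1024, requires_grad=True)  # sentence representation
p_ambiguous = ambiguity_clf(sent_rep)                # classify the sentence

# Gradient w.r.t. the input: which dimensions (or, upstream, which tokens)
# drive the ambiguity score.
input_grad = torch.autograd.grad(p_ambiguous.sum(), sent_rep,
                                 retain_graph=True)[0]
print(input_grad.norm())

task_loss = torch.tensor(0.0)                      # stand-in for the MT/LM loss
total_loss = task_loss + 0.1 * p_ambiguous.mean()  # use as a loss-function term
total_loss.backward()
```

In a real fine-tuning setup, sent_rep would come from the multilingual LLM itself, so the penalty's gradient would flow back into the model and discourage ambiguous outputs.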

Thank you!