A Framework for Automatic Question Answering in Indian Languages


About This Presentation

The distribution of research efforts in the field of Natural Language
Processing (NLP) has not been uniform across natural languages. A
significant gap has been observed between the development of NLP tools
for Indic languages (Indic-NLP) and for European languages. We aim t...


Slide Content

A Framework for Automatic Question
Answering in Indian Languages
Ritwik Mishra (PhD19xxx)
27th July 2022

Comprehensive Panel Members
●Dr Rajiv Ratn Shah, PhD advisor, IIIT-Delhi
●Prof Ponnurangam Kumaraguru, PhD advisor, IIIT-Hyderabad
●Prof Radhika Mamidi, External expert, IIIT-Hyderabad
●Dr Arun Balaji Buduru, Internal expert, IIIT-Delhi
●Dr Raghava Mutharaju, Internal expert, IIIT-Delhi

What is Automatic Question Answering (QnA)?
Human/user asks a question, and a computer produces an answer.

Outline
●Motivation
●Thesis Statement
●Contributions
○FAQ retrieval
○Open Information Extraction
○Better Contextual Embeddings
●Future Plans
○Pronoun Resolution
○Long-context Question Answering

Thesis Overview
[Diagram: Question Answering at the centre, connected to Open Information Extraction (OIE), Better Contextual Embeddings, and Pronoun Resolution; together these constitute our work.]

Why languages?
●Human intelligence is largely attributed to our ability to express
complex ideas through language.
●With the advent of the text modality (i.e., writing), our ability to
store and spread information increased dramatically.

Why Indian Languages?
●Despite being spoken by billions of people, Indic languages
are considered low-resource due to the lack of annotated
resources.
“Hindi is... an underdog. It has enough resources and everything looks
promising for Hindi.”
- Dr. Monojit Choudhury, MSR India, ML India Podcast, Aug 2020

Cases where Indic-QnA works well

Cases where it fails

Thesis Statement
●Explore the possibility of developing a framework for Automated
Question-Answering (QnA) in Indian languages with the help of multiple
supporting tasks, such as Open Information Extraction (OIE) and Pronoun
Resolution, and improved contextual embeddings.

Thesis Statement (visualized)
[Diagram: Question-Answering, with a Retrieval-based branch, supported by Open Information Extraction, Pronoun resolution, Chunking, and Improved Contextual Embeddings.]

Question-Answering
[Thesis map, with Question-Answering highlighted.]

Types of QnA
●Domain: Open-Domain vs Closed-Domain
●Context: Short / Long / No context
●Answer Type: Span-based (MRC) vs Sentence-based (AS2)
●Discourse: Conversational vs Memory-less

Methodologies in QnA
●Rule based approach
●Generative approach
●Retrieval based approach

Retrieval-based
[Thesis map, with the Retrieval-based branch highlighted.]

FAQ retrieval based QnA
Given a user query (q), retrieve the top-k Question-Answer pairs (Qi, Ai)
from the FAQ database such that each Qi is most similar to q.

FAQ retrieval based QnA (internal view)
The user query (q) and the FAQ database of (Q, A) rows are fed to a
Sentence Similarity Calculator (SSC), which returns the FAQ database with
each row carrying its similarity score with q, as sketched below.
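A minimal sketch of this retrieval loop, using cosine similarity over sentence embeddings (the COS-style SSC described later); the FAQ entries and the encoder model are assumptions chosen for illustration:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Hypothetical FAQ database of (Question, Answer) pairs; real entries
# would come from the curated healthcare FAQ.
faq = [
    ("What to do if a child's navel is swollen?", "Usually no cause for worry ..."),
    ("How often should a 5-month-old be fed?", "Feed on demand ..."),
]

# Model choice is an assumption; any multilingual sentence encoder works.
model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
Q_emb = model.encode([q for q, _ in faq])            # one vector per FAQ question

def top_k(user_query, k=3):
    q_emb = model.encode([user_query])[0]
    # Q-q cosine similarity: score every FAQ question against the user query.
    scores = Q_emb @ q_emb / (np.linalg.norm(Q_emb, axis=1) * np.linalg.norm(q_emb))
    order = np.argsort(-scores)[:k]
    return [(faq[i], float(scores[i])) for i in order]
```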

Example (translated from Hindi)
User query (q): What should be done to fix a child's swollen navel?
Chatbot suggested:
Q1) What should be done if a child's navel suddenly swells?
A1) A child's navel does not swell suddenly; it is usually seen when the child's stomach is slightly bloated ...
Q2) What should be done if a 5-month-old child's navel is swollen?
A2) In a 5-month-old child, a swollen navel is usually due to weak muscles; there is no cause for worry ...
Q3) A newborn's swollen navel eventually becomes fine; does the child then develop a gas problem as it grows?
A3) This is a misconception; a swollen navel in a child is not a danger sign ...

How to evaluate?
[Figure: for four different user queries (q1-q4), the user decides the relevancy of the top-k Question-Answer (QA) pairs suggested by the chatbot.]

Evaluation Metrics
1. Success Rate (SR)
2. Precision at k (P@k)
3. Mean Average Precision (mAP)
4. Mean Reciprocal Rank (MRR)
5. Normalized Discounted Cumulative Gain (nDCG)

Success Rate is the percentage of user queries for which the chatbot
suggested at least one relevant QA pair among its top-k suggestions.

Precision at k is the proportion of recommended items in the top-k set
that are relevant.

Mean Average Precision:
for each user query (q):
    for each relevant item:
        compute the precision up to that item's rank
    average these precisions (Average Precision)
average the Average Precisions over all queries (mAP)

Mean Reciprocal Rank averages, over all queries, the reciprocal rank of
the first relevant suggestion. Normalized Discounted Cumulative Gain
discounts the relevance of each suggestion by the logarithm of its rank
and normalizes by the score of the ideal ordering. All five metrics are
sketched in code below.
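A minimal sketch computing all five metrics from binary relevance judgments; the toy relevance lists are assumptions for illustration only:

```python
from math import log2

def average_precision(rels):
    """rels: 0/1 relevance of the top-k suggestions, in rank order."""
    hits, ap = 0, 0.0
    for i, r in enumerate(rels, start=1):
        if r:
            hits += 1
            ap += hits / i          # precision at each relevant rank
    return ap / hits if hits else 0.0

def ndcg(rels):
    dcg = sum(r / log2(i + 1) for i, r in enumerate(rels, start=1))
    ideal = sum(r / log2(i + 1)
                for i, r in enumerate(sorted(rels, reverse=True), start=1))
    return dcg / ideal if ideal else 0.0

# One relevance list per user query (top-3 suggestions each).
runs = [[1, 0, 1], [0, 0, 0], [0, 1, 0], [1, 1, 1]]
k = 3
sr   = sum(any(r) for r in runs) / len(runs)                      # Success Rate
p_at = sum(sum(r[:k]) / k for r in runs) / len(runs)              # mean P@k
mAP  = sum(average_precision(r) for r in runs) / len(runs)
mrr  = sum((1 / (r.index(1) + 1)) if 1 in r else 0 for r in runs) / len(runs)
ndcg_mean = sum(ndcg(r) for r in runs) / len(runs)
```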

Selected Literature
●Daniel, Jeanne E., et al. "Towards automating healthcare question answering in a noisy multilingual low-resource setting."
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019.
○Non-public healthcare dataset in African languages
○230K QA pairs → 150K Qs with 126 As
○Hence 126 clusters of Qs
○Predict a cluster for q
●Bhagat, Pranav, Sachin Kumar Prajapati, and Aaditeshwar Seth. "Initial Lessons from Building an IVR-based Automated
Question-Answering System." Proceedings of the 2020 International Conference on Information and Communication
Technologies and Development. 2020.
○Transcription based
○516 Qs with 90 As
○User had to specify the broad category of q
●Sakata, Wataru, et al. "FAQ retrieval using query-question similarity and BERT-based query-answer relevance." Proceedings of
the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval. 2019.
○Used the StackExchange dataset (719 QA pairs and 1K q) and the scraped localGovFAQ dataset (1.7K QA pairs and 900 q).
○Q-q similarity works better than QA-q or A-q similarity.

Research Gap: There is a need for automated solutions that provide healthcare-related information
to end-users in remote parts of India.

The proposed FAQ-chatbot
[Flowchart] Given a user query (q), the chatbot finds the top-k most
similar QA pairs in the FAQ database. For each suggested pair it asks the
user whether Q matches q: if yes, it responds with A; if no and count < k,
it moves to the next suggestion; once the k suggestions are exhausted, it
responds that q cannot be answered.

Different SSC techniques
The Sentence Similarity Calculator (SSC) was implemented in three ways:
●Dependency Tree Pruning (DTP) method: training NOT needed
●Cosine Similarity (COS) method: training NOT needed
●Sentence-pair Classifier (SPC) method: training needed (see the sketch below)
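A minimal sketch of an SPC-style scorer, assuming a cross-encoder built on xlm-roberta-base; the classification head here is randomly initialized, so until it is fine-tuned on annotated query-question pairs the scores are illustrative only:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical SPC: a multilingual transformer fine-tuned to classify
# whether a user query q and an FAQ question Q are paraphrases.
tok = AutoTokenizer.from_pretrained("xlm-roberta-base")
spc = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=2)

def spc_score(q, Q):
    enc = tok(q, Q, return_tensors="pt", truncation=True)  # sentence-pair input
    with torch.no_grad():
        logits = spc(**enc).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()      # P(similar)
```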

Evaluation data
●On-field testing was not feasible due to COVID.
●We collected 336 queries from healthcare workers.
●Each query was manually annotated: for each query (q) we found the
completely or partially matching questions (Qs) from the FAQ database.
●270 of these queries form the hold-out test set.

Results 1/2

        DTP   DTP(q-e)  SPC   SPC+A  SPC(q-e)  COS   COS(q-e)  ℇ
mAP     30.5  35.1      39.4  21.1   39.1      26.5  27.9      45.3
MRR     42.6  48.5      54.6  42.2   54.2      38.7  41.0      61.6
SR      27.1  59.6      66.2  49.6   64.4      47.7  51.1      70.3
nDCG    55.5  51.2      57.1  43.9   56.5      40.8  43.3      62.5
P@3     27.1  30.0      34.6  34.6   34.6      22.7  23.9      34.6

Table 1. Comparison of all three primary approaches on the hold-out test set for top-3 suggestions. The Ensemble
(ℇ) is obtained by taking the best-performing model from each of the primary approaches.

Results 2/2

        ℇ-COS  ℇ-DTP  ℇ-SPC  ℇ
mAP     40.9   40.8   30.3   45.3
MRR     56.2   56.5   43.8   61.6
SR      66.2   66.2   51.1   70.3
nDCG    58.4   58.2   45.5   62.5
P@3     34.6   34.6   23.9   34.6

Table 2. Results of the ablation study on the Ensemble method (ℇ). We observed that each
approach plays an important role in the performance of the developed chatbot.

Limitations
●Retrieval-based approaches are limited to extracting information from an
indexed database, which has to be curated manually.
●To retrieve information from a dump of raw sentences, a knowledge graph
has to be constructed from the unstructured text.
●OIE tools could be used to extract facts from raw sentences in different
domains.

Open Information Extraction
[Thesis map, with Open Information Extraction highlighted.]

Open Information Extraction (OIE)
●Extract facts from raw sentences.
○Use triples to represent the facts.
○The format of a triple is <head, relation, tail>.
●Example (represented in code below)
○PM Modi to visit UAE in Jan marking 50 yrs of diplomatic ties.
■One possible meaningful triple would be:
<PM Modi, to visit, UAE>
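A minimal sketch of how such triples can be represented and queried; the Triple type and the answer helper are hypothetical illustrations, not part of any released tool:

```python
from typing import NamedTuple

class Triple(NamedTuple):
    head: str
    relation: str
    tail: str

# The fact extracted from the example sentence above.
t = Triple("PM Modi", "to visit", "UAE")

# A tiny triple store that can back simple question answering.
store = [t]
def answer(head, relation):
    return [x.tail for x in store if x.head == head and x.relation == relation]

print(answer("PM Modi", "to visit"))   # ['UAE']
```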

Why is OIE important?
●It can extract triples from large amounts of text in an unsupervised
manner.
●It serves as an initial step in building or augmenting a knowledge graph
from unstructured text.
●OIE tools have been used for downstream applications like
Question-Answering, Text Summarization, and Entity Linking.

OIE in Indian languages needs special attention
●Take an English sentence:
○Ram ate an apple
○A sensible triple would be <Ram, ate, an apple>
○A nonsensical triple would be <an apple, ate, Ram>
●Now look at a Hindi sentence with the same meaning:
○राम ने सेब खाया
○Because Hindi marks the subject with the case marker ने rather than by
word order, both of these triples would be sensible:
■<राम ने, खाया, सेब>
■<सेब, खाया, राम ने>

How are generated triples evaluated?
●Manual annotations: each triple is judged on a cascade of criteria
(Valid, Well-formed, Reasonable, Concrete, True), with the triples shown
to annotators in both tabular and image representations.

How are generated triples evaluated?
●Manual Annotations
●Automatic Evaluations, against a golden set such as Hindi-BenchIE (next slide)

Hindi-BenchIE (English glosses of a golden-set entry)
sent_id:1 (He served as the first Prime Minister of Australia and became a founding justice of the High Court of Australia.)
------ Cluster 1 ------
He --> served --> as [the] [first] Prime Minister [of Australia]{a}
He --> became --> [founding] justice [of the High Court of Australia]{b}
{a} of Australia --> property --> as [first] Prime Minister |OR| He --> served as [the] [first] Prime Minister --> of Australia
{b} of High Court [of Australia]{c} --> property --> [founding] justice |OR| He --> became [founding] justice --> of High Court [of Australia]{c}
{c} of Australia --> property --> of High Court
------ Cluster 2 ------
He --> served as [the] [first] Prime Minister --> of Australia
He --> became [founding] justice --> of High Court [of Australia]{a}
{a} of Australia --> property --> of High Court

Selected Literature
●Faruqui, Manaal, and Shankar Kumar. "Multilingual Open Relation Extraction Using Cross-lingual Projection." Proceedings of
the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language
Technologies. 2015.
○Supported Hindi, but it was translation based
○Did not release code, but released triples
●Rao, Pattabhi RK, and Sobha Lalitha Devi. "EventXtract-IL: Event Extraction from Newswires and Social Media Text in Indian
Languages @ FIRE 2018 - An Overview." FIRE (Working Notes) (2018): 282-290.
○Event-specific and domain-specific IE
●Ro, Youngbin, Yukyung Lee, and Pilsung Kang. "Multi^2OIE: Multilingual Open Information Extraction Based on Multi-Head
Attention with BERT." Findings of the Association for Computational Linguistics: EMNLP 2020. 2020.
○Modelled triple extraction as a sequence-labelling problem over BERT embeddings (end-to-end)
○Identifies relations first, and then their head and tail

Research Gap:
(1) The field of Open Information Extraction (OIE) from unstructured text in Indic languages has not been
explored much. Moreover, the effectiveness of existing multilingual OIE techniques has to be evaluated on
Indic languages.
(2) There is a scarcity of annotated resources for the automatic evaluation of automatically generated triples
in Indic languages.
(3) The construction of knowledge graphs from triples extracted from unstructured Indian-language text also
needs to be explored.

IndIE: our Indic OIE tool
Pipeline (raw text in, results out):
1. Shallow parsing: off-the-shelf sentence segmentation and dependency
parsing (Stanford tools).
2. Sentences are passed to (a) an XLM-RoBERTa model fine-tuned on
chunk-annotated data, which produces chunk tags.
3. The chunk tags and the dependency tree are combined by (b) creating a
Merged-phrases Dependency Tree (MDT).
4. The MDT is passed to (c) triple extraction, a greedy algorithm based on
hand-crafted rules (a toy illustration follows below).
5. Result accumulation. Output: sentence-segmented text, triples for each
sentence, and execution time.
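A toy illustration of a single hand-crafted rule in the spirit of step (c); these are not IndIE's actual rules, and the NP-VP-NP pattern below is an assumption made purely for demonstration:

```python
def extract_triples(chunks):
    """chunks: list of (tag, phrase) pairs from the shallow parser."""
    triples = []
    # Emit a <head, relation, tail> triple for every NP-VP-NP window.
    for i in range(len(chunks) - 2):
        (t1, p1), (t2, p2), (t3, p3) = chunks[i:i + 3]
        if t1 == "NP" and t2 == "VP" and t3 == "NP":
            triples.append((p1, p2, p3))
    return triples

print(extract_triples([("NP", "PM Modi"), ("VP", "to visit"), ("NP", "UAE")]))
# [('PM Modi', 'to visit', 'UAE')]
```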

Chunker
[Thesis map, with Chunking highlighted.]

Chunker Implementation 1/2
Tokens → Transformer Tokenizer → Token IDs → Pretrained Transformer → Embeddings

Chunker Implementation 2/2 (our approach)
Tokens → Transformer Tokenizer → Token IDs → Pretrained Transformer →
(a) initial sub-word embeddings → (b) taking the average per word →
(c) final word-level embeddings → Feed-forward Layer → token-level
predictions, with the loss calculated against the ground truth (sketched
below).
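A minimal sketch of the averaging approach, assuming xlm-roberta-base; the tag-set size and the dummy gold labels are placeholders, not the actual chunk tag inventory:

```python
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("xlm-roberta-base")
encoder = AutoModel.from_pretrained("xlm-roberta-base")

words = ["राम", "ने", "सेब", "खाया"]          # a pre-tokenized Hindi sentence
enc = tok(words, is_split_into_words=True, return_tensors="pt")
hidden = encoder(**enc).last_hidden_state[0]   # (a) initial sub-word embeddings

# (b) average the sub-word embeddings belonging to each word.
word_ids = enc.word_ids()                      # maps each sub-word to its word index
word_vecs = torch.stack([
    hidden[[i for i, w in enumerate(word_ids) if w == j]].mean(dim=0)
    for j in range(len(words))
])                                             # (c) final word-level embeddings

# Feed-forward layer producing token-level chunk-tag predictions.
num_tags = 7                                   # size of the chunk tag set (assumption)
ff = torch.nn.Linear(encoder.config.hidden_size, num_tags)
logits = ff(word_vecs)
gold = torch.tensor([0, 1, 2, 3])              # dummy ground-truth tags
loss = torch.nn.functional.cross_entropy(logits, gold)
```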

Results: Chunker

Classification layers  Common approach in literature  A different approach  Our approach
1                      82±10 (50±20)                  89±0.5 (62±1.0)       91±0.0 (65±0.5)
2                      86±1.8 (51±6.2)                89±0.5 (54±7.4)       90±0.5 (54±4.5)
3                      79±14 (43±13)                  82±11 (41±12)         90±0.5 (48±2.2)

Table 3. A comparison of three approaches to handling sub-word token embeddings for the chunking task. Four different
random seeds were used to calculate the mean and standard deviation. All experiments were run on the combined TDIL and
UD data. Numbers outside round brackets are accuracy; numbers inside round brackets are the macro average.

Model  Hindi  English  Urdu  Nepali  Gujarati  Bengali
XLM    78%    60%      84%   65%     56%       66%
CRF    67%    56%      71%   58%     53%       53%

Table 4. A comparison of the (fine-tuned) XLM chunker and a CRF chunker on languages removed from the training set.
Numbers are the accuracy obtained by each model when sentences from the given language appear only in the test set.

Results: Triple Evaluation (manual)

Image options     ArgOE  M&K  Multi2OIE  PredPatt  IndIE
No information    17%    28%  71%        5%        4%
Most information  22%    44%  24%        29%       17%
All information   61%    28%  5%         66%       79%

Table 5. Percentage of sentences having no/most/all information in the image representation of their generated triples. The
method that generates the maximum number of triples with 'All information' is considered the best method.

#Triple      ArgOE  M&K  Multi2OIE  PredPatt  IndIE
Total        45     180  50         40        272
Valid        38     142  10         39        252
Well-formed  32     75   9          36        240
Reasonable   26     69   6          32        227
Concrete     21     53   4          25        158
True         19     51   4          25        152

Table 6. Number of triples extracted by each OIE method for 106 Hindi sentences.

Results: Triple Evaluation (automatic)

           ArgOE  M&K   Multi2OIE  PredPatt  IndIE
Precision  0.26   0.14  0.12       0.37      0.62
Recall     0.06   0.16  0.03       0.08      0.62
F1-score   0.10   0.15  0.05       0.14      0.62

Table 7. Performance of different OIE methods on the Hindi-BenchIE golden set consisting of 75
Hindi sentences. IndIE outperforms the other methods on all three metrics.

Limitations
●The small size of the evaluation datasets.
●Contextual embeddings (from transformers) are known to not take the
syntactic structure of a sentence into account.
●As a result, syntactic information is either absent from transformer
embeddings or not utilized while making predictions.
●Therefore, there is a need to explore incorporating syntactic
(dependency) information into transformer embeddings.

Improve Contextual Embeddings
[Thesis map, with Improve Contextual Embeddings highlighted.]

Dependency-aware Transformer (DaT) Embedding
Tokens → Transformer Tokenizer → Token IDs → Pretrained Transformer →
(a) initial embeddings → (b) taking the average → (c) final embeddings

DaT Embedding: Graph
Example graph (a tree over 5 nodes): edges 1-2, 1-3, 2-4, 2-5.

Adjacency Matrix A        Degree Matrix D
    1 2 3 4 5                 1 2 3 4 5
1   0 1 1 0 0             1   2 0 0 0 0
2   1 0 0 1 1             2   0 3 0 0 0
3   1 0 0 0 0             3   0 0 1 0 0
4   0 1 0 0 0             4   0 0 0 1 0
5   0 1 0 0 0             5   0 0 0 0 1

DaT Embedding: Graph Convolution Network (GCN) message passing
The adjacency and degree matrices above drive the GCN: in each layer,
every node updates its embedding by aggregating the embeddings of its
neighbours in the dependency graph, as sketched below.
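A minimal sketch of one message-passing step with Kipf-Welling normalization, using the 5-node graph above; the embedding width and the self-loop choice are assumptions:

```python
import torch

# The 5-node tree from the slide: edges 1-2, 1-3, 2-4, 2-5 (0-indexed below).
edges = [(0, 1), (0, 2), (1, 3), (1, 4)]
n = 5
A = torch.zeros(n, n)
for u, v in edges:
    A[u, v] = A[v, u] = 1.0                    # undirected adjacency matrix A

A_hat = A + torch.eye(n)                       # self-loops: a node keeps its own features
D_inv_sqrt = torch.diag(A_hat.sum(dim=1).pow(-0.5))   # D^{-1/2} from node degrees

H = torch.randn(n, 768)                        # contextual embeddings of the 5 tokens
W = torch.nn.Linear(768, 768, bias=False)

# One message-passing step: every token mixes its own embedding with those
# of its dependency-graph neighbours.
H_next = torch.relu(D_inv_sqrt @ A_hat @ D_inv_sqrt @ W(H))
```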

Selected Literature
●Jie, Zhanming, Aldrian Obaja Muis, and Wei Lu. "Efficient dependency-guided named entity recognition." Thirty-First AAAI
Conference on Artificial Intelligence. 2017.
●Marcheggiani, Diego, and Ivan Titov. "Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling."
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 2017.
●Zhang, Yuhao, Peng Qi, and Christopher D. Manning. "Graph Convolution over Pruned Dependency Trees Improves Relation
Extraction." Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018.

Research Gap: Incorporating the dependency structure of a sentence to generate dependency-aware
contextual embeddings is an under-explored area in Indic-NLP.

GCN variants
●v0: no GCN
●[Equations of variants v1, v2.1, v2.2, and v3.1]

Evaluation Dataset
●Small datasets
○Fearspeech | 4,782 samples | Labels: 'Fear speech': 1, 'Normal': 0
○Codalab | 5,728 samples | 'hate': 1, 'offensive': 1, 'defamation': 1, 'fake': 1, 'non-hostile': 0
○HASOC | 4,665 samples | 'non-hostile': 0, 'fake': 0, …, 'offensive': 1, 'hate': 1
●Big dataset
○IIITD abusive-language data | 700K samples (multilingual), 30K Hindi |
'not abusive': 0, 'abusive': 1

DaT Embedding Results

      Epoch 1               Epoch 2               Epoch 3               Epoch 4
v0    71.2±2.9 (67.7-74.6)  75.9±1.2 (74.1-77.5)  77.4±0.6 (76.8-78.1)  79.0±0.7 (78.2-79.8)
v1    72.4±1.9 (69.4-74.0)  76.6±1.2 (74.9-78.0)  78.8±1.0 (77.1-79.6)  80.3±1.1 (79.3-81.8)
v2.2  74.2±1.1 (72.5-75.3)  77.2±0.9 (75.9-78.3)  79.3±0.8 (78.4-80.2)  80.5±0.7 (79.5-81.2)

Table 8. Validation accuracy on the Small dataset

      Epoch 1               Epoch 2               Epoch 3               Epoch 4
v0    71.2±2.5 (67.3-73.9)  76.2±1.2 (74.6-77.4)  77.9±0.6 (77.2-78.8)  79.8±0.6 (78.8-80.3)
v1    72.6±1.6 (70.2-74.0)  76.5±1.0 (75.5-77.8)  78.5±1.1 (76.9-79.7)  79.9±1.1 (79.0-81.8)
v2.2  74.2±1.3 (72.1-75.2)  76.6±1.0 (75.5-77.8)  79.0±1.0 (77.9-80.3)  80.0±1.2 (78.9-82.1)

Table 9. Testing accuracy on the Small dataset

Limitation
[Figure: dependency trees (root, nsubj, obl, xcomp, punct) of two Hindi
sentences, राम अयोध्या लौटे। ("Ram returned to Ayodhya.") and वह राजा बने।
("He became king."), and the adjacency matrix over their concatenated
tokens. Edges stay within each sentence, so the matrix is block-diagonal,
and [PAD] tokens connect only to themselves.]

Future Plans
●End-to-End Pronoun Resolution in Hindi
●Indic-SpanBERT
●Long-context Question Answering

Pronoun Resolution
[Thesis map, with Pronoun resolution highlighted.]

Pronoun Resolution
Example sentences with referring expressions:
●Every speaker had to present his paper.
●Barack Obama visited India. Modiji came to receive Obama.
●He was brave. He was Ashoka the great.
●If you want to hire them, then all candidates must be treated nicely.
●Ram went to the forest with his wife.
●Krishna was an avatar of Vishnu.
●CEO of Reliance, Mukesh Ambani, inaugurated the SOM building.

Dataset
Mujadia, Vandan, Palash Gupta, and Dipti Misra Sharma. "Coreference Annotation Scheme and Relation Types for Hindi."
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16). 2016.
[Table 10. Corpus details | Table 11. Distribution of co-referential entities]

Coref chain
[Figure (from the source dataset): a coreference chain annotation in which
each mention carries a unique mention ID and a unique chain ID, tokens are
flagged 0 (intermediate token) or 1 (end token), and modifier tokens
attach to the mention they modify.]

Selected Literature
●Lee, Kenton, et al. "End-to-end Neural Coreference Resolution." Proceedings of the 2017 Conference on Empirical Methods in
Natural Language Processing. 2017.
●Joshi, Mandar, et al. "SpanBERT: Improving pre-training by representing and predicting spans." Transactions of the Association
for Computational Linguistics 8 (2020): 64-77.
●Dakwale, Praveen, Vandan Mujadia, and Dipti Misra Sharma. "A hybrid approach for anaphora resolution in Hindi." Proceedings
of the Sixth International Joint Conference on Natural Language Processing. 2013.
●Devi, Sobha Lalitha, Vijay Sundar Ram, and Pattabhi RK Rao. "A generic anaphora resolution engine for Indian languages."
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers. 2014.
●Sikdar, Utpal Kumar, Asif Ekbal, and Sriparna Saha. "A generalized framework for anaphora resolution in Indian languages."
Knowledge-Based Systems 109 (2016): 147-159.

Research Gap:
(1) A publicly available tool for coreference resolution in Indic languages needs to be built using
state-of-the-art coreference resolution techniques.
(2) Pretraining of transformer models with different subword tokenization methods and pretraining
tasks needs to be investigated extensively.

Publications
1. Mishra, Ritwik, Simranjeet Singh, Rajiv Ratn Shah, Ponnurangam Kumaraguru, and
Pushpak Bhattacharya. "IndIE: A Multilingual Open Information Extraction Tool For Indic
Languages". 2022. [Submitted to a special issue of TALLIP]
2. Mishra, Ritwik, Simranjeet Singh, Jasmeet Kaur, Rajiv Ratn Shah, and Pushpendra Singh.
"Exploring the Use of Chatbots for Supporting Maternal and Child Health in
Resource-constrained Environments". 2022. [Draft ready. Under internal review]

Miscellaneous
●Mishra, Ritwik, Ponnurangam Kumaraguru, Rajiv Ratn Shah, Aanshul Sadaria, Shashank
Srikanth, Kanay Gupta, Himanshu Bhatia, and Pratik Jain. "Analyzing traffic violations
through e-challan system in metropolitan cities (workshop paper)." In 2020 IEEE Sixth
International Conference on Multimedia Big Data (BigMM), pp. 485-493. IEEE, 2020.

Acknowledgements
●I would like to express my gratitude to the pillars group members for their valuable
guidance.
●I would like to thank Simranjeet, Samarth, Ajeet, and Jasmeet for being diligent
co-authors.
●I would also like to thank Prof Pushpak Bhattacharyya and the CFILT lab members for
providing great insights and hosting me at IIT Bombay under the Anveshan Setu
program.
●I would also like to thank the University Grants Commission (UGC) Junior Research
Fellowship (JRF) / Senior Research Fellowship (SRF) for funding my PhD program.

Thank You