Next Word Prediction


About This Presentation

How the next word in a sentence is predicted using NLP and an LSTM.


Slide Content

NEXT WORD PREDICTION
KHUSHI SOLANKI 91800151036
UMMEHANI VAGHELA 91800151019
UNDER GUIDANCE OF: Prof. KAUSHA MASANI

ABSTRACT
This presentation covers what the next word prediction model we build will do. The model will consider the last word of a particular sentence and predict the next possible word. We will use methods of natural language processing, language modelling, and deep learning. We will start by analysing the data, followed by pre-processing of the data. We will then tokenize this data and finally build the deep learning model. The deep learning model will be built using LSTMs.

INTRODUCTION
This project covers what the next word prediction model we build will do. The model will consider the last word of a particular sentence and predict the next possible word. We will use methods of natural language processing (NLP), language modelling, and deep learning. We will start by analysing the data, followed by pre-processing of the data. We will then tokenize this data and finally build the deep learning model. The deep learning model will be built using LSTMs.
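The slides do not show code, but a minimal sketch of this pipeline might look as follows, assuming TensorFlow/Keras. The toy corpus, layer sizes, and epoch count are illustrative assumptions, not part of the original project.

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense
from tensorflow.keras.preprocessing.text import Tokenizer

corpus = ["the white rabbit ran", "the white rabbit slept"]  # toy stand-in data

tokenizer = Tokenizer()
tokenizer.fit_on_texts(corpus)
vocab_size = len(tokenizer.word_index) + 1  # index 0 is reserved

# Build (previous word -> next word) training pairs.
X, y = [], []
for seq in tokenizer.texts_to_sequences(corpus):
    for i in range(1, len(seq)):
        X.append([seq[i - 1]])
        y.append(seq[i])
X, y = np.array(X), np.array(y)

model = Sequential([
    Embedding(vocab_size, 10),                # small illustrative sizes
    LSTM(32),
    Dense(vocab_size, activation="softmax"),  # distribution over the next word
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
model.fit(X, y, epochs=200, verbose=0)

# Predict the most likely word after "white".
context = np.array(tokenizer.texts_to_sequences(["white"]))
pred = int(model.predict(context, verbose=0).argmax())
print(tokenizer.index_word[pred])  # likely "rabbit"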

WORD PREDICTION
• What are likely completions of the following sentences?
  • “Oh, that must be a syntax …”
  • “I have to go to the …”
  • “I’d also like a Coke and some …”
• Word prediction (guessing what might come next) has many applications:
  • speech recognition
  • handwriting recognition
  • augmentative communication systems
  • spelling error detection/correction

N-GRAMS
• A simple model for word prediction is the n-gram model.
• An n-gram model “uses the previous N-1 words to predict the next one”.
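As a small illustration (not from the slides), a helper that extracts n-grams from a token list; the function name ngrams and the sample sentence are assumptions.

def ngrams(tokens, n):
    # Collect every run of n consecutive tokens.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "just then the white rabbit ran".split()
print(ngrams(tokens, 2))  # bigrams: ('just', 'then'), ('then', 'the'), ...
print(ngrams(tokens, 3))  # trigrams: ('just', 'then', 'the'), ...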

USING CORPORA IN WORD PREDICTION
• A corpus is a collection of text (or speech).
• In order to do word prediction using n-grams, we must compute conditional word probabilities.
• Conditional word probabilities are computed from their occurrences in a corpus or several corpora.

COUNTING TYPES AND TOKENS IN CORPORA
• Prior to computing conditional probabilities, counts are needed.
• Most counts are based on word form (i.e. cat and cats are two distinct word forms).
• We distinguish types from tokens:
  • “We use types to mean the number of distinct words in a corpus.”
  • We use “tokens to mean the total number of running words.” [pg. 195]
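A quick sketch of the distinction, with an assumed toy corpus:

text = "the white rabbit saw the white rabbit"
tokens = text.split()  # running words
types = set(tokens)    # distinct word forms

print(len(tokens))  # 7 tokens
print(len(types))   # 4 types: 'the', 'white', 'rabbit', 'saw'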

WORD (UNIGRAM) PROBABILITIES
• Simplest model: all words have equal probability. For example, assuming a lexicon of 100,000 words, we have P(the) = 0.00001 and P(rabbit) = 0.00001.
• Consider the difference between “the” and “rabbit” in the following contexts:
  • I bought …
  • I read …

UNIGRAM
• Somewhat better model: consider frequency of occurrence. For example, we might have P(the) = 0.07 and P(rabbit) = 0.00001.
• Consider the difference between:
  • Anyhow, …
  • Just then, the white …
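A sketch of the frequency-based unigram estimate; the toy corpus and the helper name p_unigram are illustrative assumptions.

from collections import Counter

tokens = "just then the white rabbit ran past the tree".split()
counts = Counter(tokens)

def p_unigram(word):
    # Relative frequency: count of the word over total tokens.
    return counts[word] / len(tokens)

print(p_unigram("the"))     # 2/9 ≈ 0.22
print(p_unigram("rabbit"))  # 1/9 ≈ 0.11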

BIGRAM
• Bigrams consider two-word sequences.
• P(rabbit | white) = …high…
• P(the | white) = …low…
• P(rabbit | anyhow) = …low…
• P(the | anyhow) = …high…
• The probability of occurrence of a word is dependent on the context of occurrence: the probability is conditional.

PROBABILITIES
• We want to know the probability of a word given a preceding context.
• We can compute the probability of a word sequence: P(w1 w2 … wn).
• How do we compute the probability that wn occurs after the sequence w1 w2 … wn-1?
• By the chain rule: P(w1 w2 … wn) = P(w1) P(w2 | w1) P(w3 | w1 w2) … P(wn | w1 w2 … wn-1)
• How do we find probabilities like P(wn | w1 w2 … wn-1)?
• We approximate! In a bigram model we approximate using only the previous word: P(wn | w1 w2 … wn-1) ≈ P(wn | wn-1).

COMPUTING CONDITIONAL PROBABILITIES
• How do we compute P(wn | wn-1)?
• P(wn | wn-1) = C(wn-1 wn) / Σ_w C(wn-1 w)
• Since the sum of all bigram counts that start with wn-1 must equal the unigram count for wn-1, we can simplify:
• P(wn | wn-1) = C(wn-1 wn) / C(wn-1)
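A minimal sketch of this maximum-likelihood estimate, and of scoring a sentence with the bigram approximation from the previous slide. The toy corpus and the helper name p_bigram are illustrative assumptions.

from collections import Counter

# Maximum-likelihood bigram estimate: P(wn | wn-1) = C(wn-1 wn) / C(wn-1).
tokens = "the white rabbit ran and the white rabbit slept".split()
bigram_counts = Counter(zip(tokens, tokens[1:]))
context_counts = Counter(tokens[:-1])  # C(wn-1): every token with a successor

def p_bigram(prev, word):
    return bigram_counts[(prev, word)] / context_counts[prev]

print(p_bigram("white", "rabbit"))  # 2/2 = 1.0 in this tiny corpus
print(p_bigram("rabbit", "ran"))    # 1/2 = 0.5

# Bigram approximation of the chain rule: P(w1 … wn) ≈ P(w1) · Π P(wi | wi-1)
sentence = "the white rabbit ran".split()
p = tokens.count(sentence[0]) / len(tokens)  # unigram estimate for the first word
for prev, word in zip(sentence, sentence[1:]):
    p *= p_bigram(prev, word)
print(p)  # 2/9 · 1.0 · 1.0 · 0.5 ≈ 0.11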

UML DIAGRAM
[Slide shows a UML diagram; the image is not reproduced in this text extract.]

THANK YOU!!