EXPLANATION
Speech recognition identifies the words spoken by a speaker. It is useful for hands-free operation. The main difficulty is that the sounds of words may be ambiguous.
EXAMPLE: “wreck a nice beach” sounds almost the same as “recognize speech”.
ISSUES
SEGMENTATION: Written sentences have spaces between words, but when we speak we usually do not pause between words, so the word boundaries are not marked.
COARTICULATION: The sound at the end of one word merges with the sound at the start of the next word.
HOMOPHONES: Words that have the same pronunciation but different spellings and meanings.
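MODELS
ACOUSTIC MODEL: Describes the sounds of words. Eg: “ceiling” and “sealing” sound the same, so the acoustic model alone cannot tell them apart.
LANGUAGE MODEL: Describes how likely a sequence of words is. Eg: “ceiling fan” is far more likely than “sealing fan”.
NOISY CHANNEL MODEL: The original message (the words) passes through a noisy channel, and the sound of the original message is what is received at the other end. The task is to recover the original message from the sound that was received.
APPROACHES: The noisy channel model is applied to speech recognition, machine translation, and spelling correction.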
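The way the acoustic model and the language model work together under the noisy channel view can be sketched in a few lines of Python. This is only an illustration: the probabilities are invented, and the names ACOUSTIC, LANGUAGE, and decode are hypothetical. A real recognizer would estimate these values from data.

```python
# Toy noisy-channel decoder: pick the word sequence that maximizes
# P(sound | words) * P(words). All numbers are invented for illustration.

# Acoustic model: how well each candidate matches the sound that was heard.
# "ceiling" and "sealing" are homophones, so both candidates score the same.
ACOUSTIC = {
    "ceiling fan": 0.5,
    "sealing fan": 0.5,
}

# Language model: how likely each word sequence is in the language.
LANGUAGE = {
    "ceiling fan": 0.009,
    "sealing fan": 0.00001,
}


def decode(candidates):
    """Return the candidate with the highest acoustic * language score."""
    return max(candidates, key=lambda words: ACOUSTIC[words] * LANGUAGE[words])


best = decode(["sealing fan", "ceiling fan"])
print(best)
```

Because the acoustic scores are tied, the language model decides the result, which is the role it plays in the ceiling/sealing example above.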
ACOUSTIC MODEL
SAMPLING RATE: The analog sound signal is converted to digital by measuring it at regular, discrete intervals. The sampling rate is how often these measurements are taken. Speech sounds lie mostly in the range of 100 Hz to 1000 Hz.
QUANTIZATION FACTOR: The precision with which each measurement is stored, that is, the number of bits kept per sample.
PHONE: The exact sound wave of a word is not needed. It is enough to identify its phones, which are defined from the sounds used in human languages. A phone corresponds to a single vowel or consonant. Eg: take “rat” and “rate”. The vowel “a” in “rat” is the phone ae (r ae t), while the vowel “a” in “rate” is the phone ey (r ey t), written in ARPAbet notation.
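PHONEME: The smallest unit of sound that distinguishes meaning for speakers of a particular language. Two sounds may differ slightly and still count as the same phoneme. Eg: the “t” in TICK and the “t” in STICK are pronounced slightly differently, but English speakers treat them as the same phoneme.
FRAME: The signal is divided into short time slices called frames, taken at regular intervals.
FEATURE: Each frame is summarized by a set of features that describe the signal within it.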
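The steps from sampling to features can be sketched as a small Python pipeline. This is a minimal sketch under stated assumptions, not how any particular recognizer works: the 8 kHz sampling rate, 8-bit quantization, and 10 ms frame length are assumed values, the input is a synthetic 200 Hz tone rather than real speech, and frame energy stands in for the richer features a real system would compute. All function names are hypothetical.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed value)
BITS = 8            # quantization: bits kept per sample (assumed value)
FRAME_MS = 10       # frame length in milliseconds (assumed value)


def sample_tone(freq_hz, duration_s, rate=SAMPLE_RATE):
    """Measure a sine wave at regular intervals, as an A/D converter would."""
    n = round(duration_s * rate)
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]


def quantize(samples, bits=BITS):
    """Map amplitudes in [-1, 1] to integers in [0, 2**bits - 1]."""
    levels = 2 ** bits - 1
    return [round((s + 1) / 2 * levels) for s in samples]


def split_frames(samples, rate=SAMPLE_RATE, frame_ms=FRAME_MS):
    """Cut the signal into consecutive, non-overlapping frames."""
    size = rate * frame_ms // 1000
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def energy(frame):
    """One simple feature: the mean squared amplitude of the frame."""
    return sum(s * s for s in frame) / len(frame)


signal = sample_tone(freq_hz=200, duration_s=0.05)
digital = quantize(signal)              # what would actually be stored
frames = split_frames(signal)
features = [energy(f) for f in frames]  # one number per frame

print(len(signal), "samples,", len(frames), "frames of", len(frames[0]), "samples")
print("quantized range:", min(digital), "to", max(digital))
```

Raising BITS gives finer amplitude steps at the cost of more storage per sample, and raising SAMPLE_RATE captures higher frequencies at the cost of more samples per second.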
LANGUAGE MODEL
Spoken language has different characteristics from written language, so it is better to train on a corpus (a large body of text) of transcripts of spoken language. The language model assigns a probability to each candidate word sequence, which lets the recognizer choose the most plausible wording for what was said.
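One simple way to build such a model is to count word pairs (bigrams) in a corpus. The sketch below assumes a bigram model with add-one smoothing, which is one common choice and not something these notes prescribe. The four-sentence corpus is invented, and the names train_bigrams and bigram_prob are hypothetical.

```python
from collections import Counter

# Tiny invented "transcript" corpus; a real system would use far more data.
CORPUS = [
    "turn on the ceiling fan",
    "the ceiling fan is too fast",
    "turn off the fan",
    "we are sealing the window",
]


def train_bigrams(sentences):
    """Count single words and adjacent word pairs in the corpus."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = sentence.split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams


def bigram_prob(prev, word, unigrams, bigrams, vocab_size):
    """Estimate P(word | prev) with add-one smoothing.

    Smoothing keeps unseen pairs from getting probability zero.
    """
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)


unigrams, bigrams = train_bigrams(CORPUS)
vocab_size = len(unigrams)

p_ceiling = bigram_prob("ceiling", "fan", unigrams, bigrams, vocab_size)
p_sealing = bigram_prob("sealing", "fan", unigrams, bigrams, vocab_size)
print(p_ceiling > p_sealing)
```

With this corpus, “fan” follows “ceiling” twice and never follows “sealing”, so the model prefers “ceiling fan” even though the two words sound alike.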