SPEECH TRANSLATION USING LLMS using chatgpt.pptx

AmitSinghYadav21, 12 slides, Aug 12, 2024


Slide Content

SPEECH TRANSLATION USING LLM
Name: Amit Singh Yadav
Course: PhD CSE
Roll No: 2301201001
Supervisor: Dr. Chandresh Kumar Maurya

Table of Contents: Introduction, Transformers, BERT, GPT, Audio-Conditioned LLM, Speech Translation, Conclusion, References

INTRODUCTION Large Language Models (LLMs) are a class of artificial intelligence models designed to understand and generate human-like text. These models are based on deep learning architectures: earlier systems used variants of recurrent neural networks (RNNs), while modern LLMs are built on the transformer architecture. Examples of LLMs include GPT-3, ChatGPT, RoBERTa, and XLNet.

TRANSFORMERS The core innovation of transformers is the self-attention mechanism. This mechanism allows the model to weigh the importance of different words in a sentence when encoding or decoding. It computes attention scores between all pairs of words in a sequence, capturing dependencies regardless of their position.
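The self-attention mechanism described above can be sketched in a few lines of NumPy. This is a minimal single-head version with no masking or multi-head projection, and the weight matrices are random placeholders rather than trained parameters:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a token sequence X (seq_len x d_model)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # attention scores between every pair of tokens
    weights = softmax(scores, axis=-1)  # each row is a probability distribution
    return weights @ V, weights

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))             # 4 tokens, model dimension 8
Wq = rng.normal(size=(8, 8))
Wk = rng.normal(size=(8, 8))
Wv = rng.normal(size=(8, 8))
out, w = self_attention(X, Wq, Wk, Wv)
```

Because the score matrix relates all token pairs, each output position can draw on any other position regardless of distance, which is the property the slide highlights.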

BERT BERT is based on the Transformer architecture, a neural network architecture designed for sequence processing tasks such as language translation and text generation; BERT itself uses only the Transformer's encoder stack.

GPT GPT is a generative model, meaning it can generate human-like text based on a given prompt or context. It's trained to predict the next word in a sequence given the preceding words, and this ability enables it to generate coherent and contextually relevant text.

HOW SPEECH TRANSLATION WORKS (two diagram slides; the figures were not captured in this text export)
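One common design, consistent with the pipeline these slides describe, is a cascade: a speech recognition model transcribes the source audio, and an LLM then translates the transcript. The sketch below shows only the control flow; the dictionaries are hypothetical stubs standing in for a real ASR model and a prompted LLM:

```python
def transcribe(audio_id):
    """Stub ASR: maps an audio clip id to a transcript (a real system runs a speech model)."""
    fake_asr = {"clip1": "good morning"}
    return fake_asr[audio_id]

def translate(text, target_lang="hi"):
    """Stub translation: a phrase lookup in place of a prompted LLM."""
    fake_llm = {("good morning", "hi"): "suprabhat"}
    return fake_llm[(text, target_lang)]

def speech_translate(audio_id, target_lang="hi"):
    transcript = transcribe(audio_id)          # step 1: speech -> source-language text
    return translate(transcript, target_lang)  # step 2: source text -> target-language text

print(speech_translate("clip1"))  # suprabhat
```

An alternative to the cascade is an audio-conditioned LLM (as in the table of contents), where audio embeddings are fed directly to the language model, avoiding error propagation from a separate ASR stage.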

CONCLUSION In conclusion, speech translation using LLMs offers a seamless and efficient solution for real-time multilingual communication, enabling individuals to overcome language barriers and interact effectively in diverse linguistic environments. This technology holds great potential for enhancing accessibility, fostering global collaboration, and facilitating cross-cultural communication in various domains.

REFERENCES
Graves, A., Mohamed, A.-R., & Hinton, G. (2013). Speech recognition with deep recurrent neural networks. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 6645-6649). IEEE.
Ba, J. L., Kiros, J. R., & Hinton, G. E. (2016). Layer normalization. arXiv preprint arXiv:1607.06450.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. In Advances in Neural Information Processing Systems (pp. 5998-6008).
Wang, Y., Skerry-Ryan, R., Stanton, D., Wu, Y., Jaitly, N., Yang, Z., ... & Shor, J. (2017). Tacotron: Towards end-to-end speech synthesis. arXiv preprint arXiv:1703.10135.

THANK YOU