EXPLORING NATURAL LANGUAGE PROCESSING
How Computers Understand Human Language
By: MANAS SINGH
What is NLP?
Natural Language Processing (NLP) is a field of artificial intelligence that focuses on the interaction between computers and human languages. It involves developing algorithms and models that enable computers to understand, interpret, and generate human language. NLP has applications in areas such as machine translation, sentiment analysis, text summarization, and speech recognition. ChatGPT, Sora, Siri, and Google Translate are all examples of systems built on natural language processing.
(Fig. 1)
Things NLP Can Do:
Sentiment Analysis (see the sketch below)
Machine Translation
Question Answering
Text Summarization
Language Generation

NLP Challenges:
Multilinguality
Multimodality
Handling Noisy Data
Ambiguity
Understanding Context
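To make sentiment analysis concrete, here is a minimal sketch using the Hugging Face transformers pipeline API; the default model it downloads and the exact score printed are illustrative and not part of the original slides.

```python
# Minimal sentiment-analysis sketch with the Hugging Face "transformers" library.
# The pipeline downloads a default pretrained classifier the first time it runs.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
print(classifier("I really enjoyed this presentation!"))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```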
How does NLP work in ChatGPT?

1. Tokenization
Algorithm/Tool: WordPiece Tokenization or Byte Pair Encoding (BPE)
Description: Splits the input text into smaller units called tokens, often based on subword units or characters (see the first sketch below).

2. Word Embeddings
Algorithm/Tool: Word2Vec, GloVe, or FastText
Description: Converts tokens into dense vector representations that capture semantic relationships between words (see the second sketch below).
(Fig. 2)
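A minimal sketch of subword tokenization, using the pretrained GPT-2 byte-pair-encoding tokenizer from the Hugging Face transformers library; the model name "gpt2" is just an openly available stand-in, not ChatGPT's actual tokenizer.

```python
# Subword tokenization sketch: split text into BPE tokens and map them to IDs.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
tokens = tok.tokenize("Natural language processing is fascinating")
ids = tok.convert_tokens_to_ids(tokens)
print(tokens)  # subword pieces, e.g. ['Natural', 'Ġlanguage', ...]
print(ids)     # the integer IDs the model actually consumes
```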
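And a small sketch of word embeddings, training a toy Word2Vec model with gensim; the three-sentence corpus and the hyperparameters are placeholders, so the resulting vectors and similarities are only illustrative.

```python
# Word-embedding sketch: train tiny Word2Vec vectors on a toy corpus.
from gensim.models import Word2Vec

sentences = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["dogs", "and", "cats", "are", "animals"],
]
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, epochs=50)
print(model.wv["king"][:5])                  # first 5 dimensions of the "king" vector
print(model.wv.similarity("king", "queen"))  # semantic similarity score
```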
3. Transformer Architecture
Algorithm/Tool: Transformer architecture (e.g., OpenAI GPT, BERT)
Description: Uses attention-based neural networks to capture dependencies between words and contextual information (see the attention sketch below).

4. Training
Algorithm/Tool: Backpropagation, Stochastic Gradient Descent (SGD)
Description: The model is trained on a large dataset of human conversations or other text so it learns language and context, much as a child learns from exposure to language (see the training-step sketch below).

5. Generation
Algorithm/Tool: Beam Search, Top-k Sampling, Top-p Sampling
Description: Generates a response by repeatedly predicting the most likely next word from learned patterns and context (see the top-k sampling sketch below).
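A toy sketch of the scaled dot-product self-attention at the core of the Transformer, written in plain NumPy; the matrix sizes and random weights are invented purely for illustration.

```python
# Self-attention sketch: each token attends to every other token in the sequence.
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv                # query/key/value projections
    scores = Q @ K.T / np.sqrt(K.shape[-1])         # pairwise token affinities
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ V                              # context-aware token vectors

rng = np.random.default_rng(0)
X = rng.random((4, 8))                              # 4 tokens, 8-dim embeddings
Wq, Wk, Wv = (rng.random((8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)          # (4, 8)
```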
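A minimal sketch of one training step with backpropagation and SGD in PyTorch; the tiny model, vocabulary size, and random data are placeholders, not ChatGPT's real training setup.

```python
# Training sketch: predict the next word from a 3-token context, then backpropagate.
import torch
import torch.nn as nn

vocab, dim = 100, 16
model = nn.Sequential(nn.Embedding(vocab, dim), nn.Flatten(), nn.Linear(dim * 3, vocab))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

context = torch.randint(0, vocab, (8, 3))  # batch of 3-token contexts
target = torch.randint(0, vocab, (8,))     # the "next word" each context should predict
loss = loss_fn(model(context), target)
optimizer.zero_grad()
loss.backward()                            # backpropagation
optimizer.step()                           # SGD parameter update
print(loss.item())
```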
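A small sketch of top-k sampling over a made-up next-word distribution; the vocabulary and probabilities are invented purely to show the selection logic.

```python
# Top-k sampling sketch: keep only the k most likely words, then sample among them.
import numpy as np

def top_k_sample(probs, k=3, rng=np.random.default_rng(0)):
    top = np.argsort(probs)[-k:]        # indices of the k most likely words
    p = probs[top] / probs[top].sum()   # renormalize over that shortlist
    return rng.choice(top, p=p)         # sample one of them

vocab = ["cat", "dog", "sat", "ran", "the"]
next_word_probs = np.array([0.05, 0.10, 0.40, 0.30, 0.15])
print(vocab[top_k_sample(next_word_probs)])  # usually "sat" or "ran"
```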
6. Decoding
Algorithm/Tool: Greedy Decoding, Beam Search, Sampling techniques
Description: Converts the sequence of predicted tokens into human-readable text, either selecting the most likely sequence or sampling more diverse ones depending on the decoding strategy (see the decoding sketch below).

7. Fine-Tuning
Algorithm/Tool: Transfer Learning, Fine-Tuning
Description: Retrains the model on a smaller dataset relevant to a specific task or domain, updating its parameters to improve performance in that context (see the fine-tuning sketch below).
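A minimal sketch of greedy decoding with an openly available GPT-2 model through the Hugging Face transformers library; GPT-2 stands in for ChatGPT here, and changing the generate flags (e.g. num_beams or do_sample) would switch to beam search or sampling instead.

```python
# Decoding sketch: turn predicted token IDs back into readable text, greedily.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tok("Natural language processing lets computers", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=20, do_sample=False)  # greedy decoding
print(tok.decode(out[0], skip_special_tokens=True))
```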
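A sketch of the transfer-learning idea behind fine-tuning: freeze most of a pretrained network and update only a small task-specific head; all the layer sizes and names here are hypothetical.

```python
# Fine-tuning sketch: reuse a pretrained "body", train only a new task head.
import torch.nn as nn

pretrained_body = nn.Sequential(nn.Embedding(100, 16), nn.Flatten(), nn.Linear(16 * 3, 32))
task_head = nn.Linear(32, 2)            # e.g. a 2-class sentiment classifier

for p in pretrained_body.parameters():
    p.requires_grad = False             # keep the general language knowledge frozen

model = nn.Sequential(pretrained_body, task_head)
trainable = [p for p in model.parameters() if p.requires_grad]
print(sum(p.numel() for p in trainable), "parameters will be updated during fine-tuning")
```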
Bibliography
Intro To NLP: an informative PowerPoint presentation by Sergei Nirenberg. https://www.seas.upenn.edu/~eeaton/teaching/cmsc471_fall07/slides/IntroToNLP.ppt
ChatGPT: tools used in NLP for ChatGPT and Sora. https://chat.openai.com/
Images: https://tezo.com/, https://commons.wikimedia.org/, https://www.vecteezy.com/