Mattingly "AI and Prompt Design: LLMs with NER"

BaltimoreNISO 698 views 32 slides May 14, 2024

About This Presentation

This presentation was provided by William Mattingly of the Smithsonian Institution during the sixth segment of the NISO training series "AI & Prompt Design." Session Six, "Text Classification with LLMs," was held on May 9, 2024.

Slide Content

Prompt Design: LLMs with NER

Goals: GliNER; Large Language Models; NER; Vector Databases; Semantic Searching; RAG

Zero-Shot NER, Machine Learning: GliNER is a transformer architecture that allows you to pass a text and your own labels to a model without any training. Example: https://huggingface.co/spaces/tomaarsen/gliner_medium-v2.1

Large Language Models

LLMs, Benefits: Contextual Understanding; Less Manual Effort; Adaptability; Improved Accuracy; Multilingual Capability

LLMs, Limitations: Resource Intensity (and Cost); Data Privacy Concerns; Black Box Models; Training Data Bias; Generalization Challenges; Latency Issues; Hallucinations; Consistency

LLMs, How to use LLMs: Thinking through your methodology for NER; Assisting in certain steps of NER (RegEx); Zero-Shot NER; Few-Shot NER
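The difference between zero-shot and few-shot NER can be sketched as a difference in the prompt alone. The following is a minimal illustration, not the presenter's method: the function name `build_ner_prompt`, the prompt wording, and the `entity -> label` answer format are all assumptions made for this example, and no LLM is actually called.

```python
def build_ner_prompt(text, labels, examples=()):
    """Assemble a plain-text NER prompt (hypothetical format).

    With no examples the prompt is zero-shot; each worked example
    that is supplied turns it into a few-shot prompt.
    """
    lines = [
        "Extract the named entities from the text.",
        "Allowed labels: " + ", ".join(labels),
        "Answer with one 'entity -> label' pair per line.",
        "",
    ]
    # Few-shot part: show the model solved examples before the real text.
    for example_text, example_entities in examples:
        lines.append("Text: " + example_text)
        for entity, label in example_entities:
            lines.append(f"{entity} -> {label}")
        lines.append("")
    lines.append("Text: " + text)
    return "\n".join(lines)


zero_shot = build_ner_prompt("Mrs. Kapitan is a lawyer.", ["PERSON"])
few_shot = build_ner_prompt(
    "Mrs. Kapitan is a lawyer.",
    ["PERSON"],
    examples=[("Miss. Smith will miss her train.", [("Miss. Smith", "PERSON")])],
)
print(few_shot)
```

The resulting string would then be sent to whichever LLM is in use. Only the presence of solved examples separates the two prompts.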

Mrs. Jessica Monica Kapitan works at the office. Mrs. Kapitan is a lawyer. She is also friends with Mrs. Thompson and Miss. Smith. Sometimes Miss. Smith will miss her train.

Exercise 1: Capture all examples of Miss. and Mrs. in the text with their corresponding names using an LLM to generate RegEx https://regex101.com/r/TLfbGE/1

Exercise 1: One Solution \b(Mrs\.|Miss\.)\s+([A-Z][a-z]*(?:\s+[A-Z][a-z]*)*)
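To check the pattern outside regex101, it can be run with Python's standard `re` module against the Exercise 1 text. This is only a small verification sketch. The variable names are chosen for this example.

```python
import re

# The Exercise 1 text from the slides.
TEXT = (
    "Mrs. Jessica Monica Kapitan works at the office. Mrs. Kapitan is a lawyer. "
    "She is also friends with Mrs. Thompson and Miss. Smith. "
    "Sometimes Miss. Smith will miss her train."
)

# Group 1: the honorific. Group 2: one or more capitalized words after it.
PATTERN = re.compile(r"\b(Mrs\.|Miss\.)\s+([A-Z][a-z]*(?:\s+[A-Z][a-z]*)*)")

matches = PATTERN.findall(TEXT)
for honorific, name in matches:
    print(honorific, name)
# Mrs. Jessica Monica Kapitan
# Mrs. Kapitan
# Mrs. Thompson
# Miss. Smith
# Miss. Smith
```

The lowercase verb "miss" in the last sentence is not captured, because the pattern is case-sensitive and requires the trailing period.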

Mr. Thomas and Dr. Jessica Davis went to the store. They met Mrs. Stevens who works at a nearby office. They are all friends with Colonel Jackson. Col. Jackson is known to her friends by her first name, Terry. They all know Mr. and Mrs. Kapitan.

Exercise 2: Capture all examples of [Honorific Entity] in the text with their corresponding names using an LLM to generate RegEx https://regex101.com/r/FYcO8C/1
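The slides give no solution for Exercise 2. One possible starting point, offered here as an assumption rather than the presenter's answer, is to widen the honorific alternation from the Exercise 1 pattern to cover the titles that appear in this text.

```python
import re

# The Exercise 2 text from the slides.
TEXT = (
    "Mr. Thomas and Dr. Jessica Davis went to the store. "
    "They met Mrs. Stevens who works at a nearby office. "
    "They are all friends with Colonel Jackson. "
    "Col. Jackson is known to her friends by her first name, Terry. "
    "They all know Mr. and Mrs. Kapitan."
)

# Assumed honorific list: only the titles seen in this text, plus Miss.
HONORIFICS = r"Mr\.|Mrs\.|Miss\.|Dr\.|Col\.|Colonel"
PATTERN = re.compile(
    r"\b(" + HONORIFICS + r")\s+([A-Z][a-z]*(?:\s+[A-Z][a-z]*)*)"
)

matches = PATTERN.findall(TEXT)
for honorific, name in matches:
    print(honorific, name)
# Mr. Thomas
# Dr. Jessica Davis
# Mrs. Stevens
# Colonel Jackson
# Col. Jackson
# Mrs. Kapitan
```

This sketch has a known gap: in "Mr. and Mrs. Kapitan" the "Mr." is followed by "and" rather than a capitalized name, so only "Mrs. Kapitan" is captured. Handling shared surnames would need an extra rule.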

Exercise 3: Use an LLM to identify the people in the following text. Think through an ethical way to use an LLM to identify potential women in these contexts.

Dr. Tracey Jordan works at the Smithsonian where he develops methods to identify named entities. Mrs. Alex Jackson leads the team. She was trained in machine learning at Stanford. While Tracey functions as the domain expert, Alex Jackson designs the experiments. They have another colleague, Leslie Peters.

Vector Databases

Representing Texts Digitally: Embeddings. Sentence: "The apple is in the tree." Vectors, one per word in order: 1-[0.01234, -0.23456, 0.87654, 0.45678, -0.56123, 0.65432, 0.12345, -0.77123, 0.08456, 0.34567, ...]; 2-different vector; 3-different vector; 4-different vector; 1-[0.01234, -0.23456, 0.87654, 0.45678, -0.56123, 0.65432, 0.12345, -0.77123, 0.08456, 0.34567, ...] (the same vector again for the repeated word "the"); 5-different vector

Vector Database: What is it? It stores vectors in a database. Similar vectors lie closer together.

Vector Database: How do we use a vector database? We populate a vector database by using a machine learning model to vectorize the data and then sending the resulting vectors to the database.

Vector Database: Why use a vector database? Vector databases let users store vector data so that they can query it and find matches based on vector-level similarity, rather than on explicit human-defined similarity.

Vector Database: What is it? A vector database holds numerous vectors or embeddings of data. Sometimes, the database will also store the original data alongside these vectors.
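The populate-then-query cycle described above can be sketched with a toy in-memory store. Everything here is an illustrative assumption: `embed` is a word-count stand-in for a real machine learning embedding model, `ToyVectorStore` is a hypothetical class rather than any library's API, and the stored sentences are sample texts from the exercises.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Keeps each vector alongside the original text it came from."""

    def __init__(self):
        self.rows = []  # list of (vector, original text)

    def add(self, text):
        self.rows.append((embed(text), text))

    def query(self, text, k=1):
        q = embed(text)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [original for _, original in ranked[:k]]


store = ToyVectorStore()
for sentence in [
    "The apple is in the tree.",
    "Mrs. Kapitan is a lawyer.",
    "Miss. Smith will miss her train.",
]:
    store.add(sentence)

best = store.query("Where is the apple?", k=1)
print(best)  # ['The apple is in the tree.']
```

Word counts only reward exact word overlap. A real embedding model would also place texts with related meanings near each other, which is the point of vector-level similarity.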

Vector Database Stacks

Vector Database Stacks: What is available to us? Python, Annoy, Streamlit: cheap, easy to deploy, and great for smaller datasets, but requires a little bit of knowledge to build from scratch; best for smaller databases (under 10,000 records). Python, txtAI: cheap and easy to use; more resource intensive but easy to deploy; allows for easy interpretability (via highlighting).

Multi-Modal How does it work?

Retrieval-Augmented Generation

How tall is Wookie?

RAG: What is it? RAG lets you combine the strengths of large language models (LLMs) with vector databases. It limits the chances for an LLM to hallucinate (generate fake information). It uses a vector database to find material relevant to a query.
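The retrieve-then-generate flow can be sketched in a few lines. This is a simplified illustration under stated assumptions: word-overlap (Jaccard) scoring stands in for a real vector database lookup, the two documents are made-up placeholders, the prompt wording is hypothetical, and the final LLM call is left out.

```python
import re


def tokens(text):
    """Lowercased set of words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(query, documents, k=1):
    """Return the k documents sharing the most vocabulary with the query.

    Jaccard overlap is a stand-in for vector similarity search.
    """
    q = tokens(query)

    def overlap(doc):
        d = tokens(doc)
        union = q | d
        return len(q & d) / len(union) if union else 0.0

    return sorted(documents, key=overlap, reverse=True)[:k]


def build_rag_prompt(query, documents, k=1):
    """Put the retrieved context in front of the question."""
    context = retrieve(query, documents, k)
    lines = [
        "Answer using only the context below.",
        "If the context does not contain the answer, say so.",
        "",
        "Context:",
    ]
    lines.extend("- " + doc for doc in context)
    lines.extend(["", "Question: " + query])
    return "\n".join(lines)


# Placeholder documents, invented for this sketch only.
DOCS = [
    "Placeholder note A: the Wookie entry records how tall the species is.",
    "Placeholder note B: the apple is in the tree.",
]

prompt = build_rag_prompt("How tall is Wookie?", DOCS, k=1)
print(prompt)
```

Because the model is told to answer from the retrieved context, it has less room to invent an answer, which is how RAG reduces hallucination.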
