Langid with ChatGPT benchmark comarision with fasttext.pdf

bazevgenii 14 views 15 slides Sep 19, 2024

Slide 1 of 15

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

About This Presentation

LangID with LLM

Size: 1.63 MB

Language: en

Added: Sep 19, 2024

Slides: 15 pages

Slide Content

ChatGPT: Jack of all trades,
master of none
Paper Discussion

03/2023
Evgeny BAZAROV, [email protected]

arXiv:2302.10724v1 [cs.CL] 21 Feb 2023

Let’s go to paper

Small After Party
Language identification with Chat-GPT

ChatGPT API (gpt-3.5-turbo)
BESEDO

Old Benchmark

1.Filter data (1 < n_chars < 15)
2.Sample 10k (stratified) - should be ok to be representative
3.Create prompt
4.Run API with gpt-3.5-turbo
5.Postprocess results
6.Benchmark fasttext vs chatgpt
Steps

Prompting
V1: "You are an API to a Language Identification service. I will give you text as
input, you will return to me the ISO 639-1 code of a language in which text was
written. Your response should be in json format."

Creativity?

Prompting
V2:

Result

Result

Result
5-7 points better than fasttext

Tags

langid

Categories

General

Download

Download Slideshow Get the original presentation file

Quick Actions

Statistics

Views 14
Slides 15
Age 437 days

Related Slideshows

View More in This Category