Responsible & Safe AI at GSFC Univ Vadodara

precogatIIITD 295 views 54 slides Aug 01, 2024

About This Presentation

Bias, Machine Unlearning, Interpretability, Guardrails, Jailbreaking, and many more

Size: 31.95 MB

Language: en

Added: Aug 01, 2024

Slides: 54 pages

Slide Content

Responsible & Safe AI, 1 Aug 2024, Inaugural Ceremony, GSFCU ACM Student Chapter, Vadodara, Gujarat. Ponnurangam Kumaraguru (“PK”), #ProfGiri, CS, IIIT Hyderabad; Vice President, ACM India; TEDx Speaker. https://precog.iiit.ac.in/ /in/ponguru @ponguru

Know the Audience: Students / Faculty / Others. Have any of you attended my talk(s) before? 4

What is AI? 5

What is Responsible & Safe? 6

What is Responsible & Safe AI? 7

11 Observations?

20 https://translate.google.co.in/

21 https://translate.google.co.in/

22 https://translate.google.co.in/

23 https://translate.google.co.in/

Activity: Prompt any of these or other platforms and get them to give you a biased response; do not use gender bias. HINT: Students have come up with very nice prompts in the past. 24

28 Guardrails

30 Jailbreak

What is an alignment problem? 31

What is an alignment problem? 32 https://youtu.be/yWDUzNiWPJA?si=wSDO4i_EMrHzHYDP

High-level instantiation: ‘RLHF’ pipeline. First step: instruction tuning! Second + third steps: maximize reward. 33 https://arxiv.org/pdf/2203.02155
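As a rough illustration of the reward part of this pipeline: the paper linked on the slide trains a reward model from human comparisons, using a pairwise loss of the form -log(sigmoid(r_chosen - r_rejected)). The sketch below computes that loss for a single comparison. The function name and the reward values are hypothetical and chosen only for illustration; this is not the full pipeline, and it leaves out the policy optimization step entirely.

```python
import math


def pairwise_preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Return -log(sigmoid(reward_chosen - reward_rejected)) for one comparison."""
    diff = reward_chosen - reward_rejected
    # Numerically stable form of log(1 + exp(-diff)).
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


# Hypothetical reward-model scores for two responses to the same prompt.
loss_agrees = pairwise_preference_loss(2.0, 0.0)     # model ranks the preferred response higher
loss_undecided = pairwise_preference_loss(0.0, 0.0)  # model cannot tell the two apart
loss_disagrees = pairwise_preference_loss(0.0, 2.0)  # model ranks the preferred response lower
```

The loss is smallest when the reward model agrees with the human preference and grows as it disagrees, which is what pushes the learned reward toward human judgments before the policy is tuned to maximize it.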

Rogue AIs: We risk losing control over AIs as they become more capable. Proxy gaming: YouTube / Insta – User engagement – Mental health 34

What is going on?  https://www.youtube.com/watch?v=lnyuIHSaso8&t=75s 35

Questions? 36

Forms of unlearning: Exact unlearning; Approximate unlearning; Unlearning via differential privacy; Empirical unlearning, where data to be unlearned are precisely known (training examples); Empirical unlearning, where data to be unlearned are underspecified (think “knowledge”) 37
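To make the first form concrete, here is a minimal sketch of exact unlearning, assuming a toy nearest-centroid model whose state is just per-class sums and counts. The class `CentroidModel` and the data are hypothetical and not from the talk. Because the model is fully described by these statistics, subtracting a training example yields the same model as retraining from scratch without it. Most real models do not have this property, which is why the approximate and empirical forms exist.

```python
class CentroidModel:
    """Toy model: one centroid per class, stored as running sums and counts."""

    def __init__(self, dim):
        self.dim = dim
        self.sums = {}
        self.counts = {}

    def learn(self, x, label):
        total = self.sums.setdefault(label, [0] * self.dim)
        for i, value in enumerate(x):
            total[i] += value
        self.counts[label] = self.counts.get(label, 0) + 1

    def unlearn(self, x, label):
        """Remove one previously learned example from the statistics."""
        total = self.sums[label]
        for i, value in enumerate(x):
            total[i] -= value
        self.counts[label] -= 1
        if self.counts[label] == 0:  # no examples left for this class
            del self.sums[label]
            del self.counts[label]

    def centroids(self):
        return {
            label: [value / self.counts[label] for value in total]
            for label, total in self.sums.items()
        }


# Hypothetical integer-valued training data: (features, label).
data = [((1, 2), "a"), ((3, 4), "a"), ((10, 10), "b"), ((12, 14), "b")]
forget = ((3, 4), "a")

# Train on everything, then unlearn the one example.
unlearned = CentroidModel(dim=2)
for x, label in data:
    unlearned.learn(x, label)
unlearned.unlearn(*forget)

# Reference: retrain from scratch on the retained data only.
retrained = CentroidModel(dim=2)
for x, label in data:
    if (x, label) != forget:
        retrained.learn(x, label)
```

Comparing `unlearned.centroids()` with `retrained.centroids()` is the check that defines exactness here: the two models are indistinguishable.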

Graph Unlearning What is it? 38

Graph Unlearning 39: Node feature unlearning; Node unlearning; Edge unlearning
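The three request types can be sketched as operations on a small graph. This is an assumption-laden illustration: the adjacency-dictionary representation and the function names are hypothetical, and the code only shows what each request removes from the data. A real graph unlearning method must additionally update or retrain the model so that it behaves as if the removed item had never been seen.

```python
def unlearn_node_features(features, node):
    """Node feature unlearning: the node and its edges stay, its features go."""
    updated = dict(features)
    updated[node] = None
    return updated


def unlearn_node(adjacency, features, node):
    """Node unlearning: remove the node, its features, and every incident edge."""
    new_adjacency = {
        u: {v for v in neighbors if v != node}
        for u, neighbors in adjacency.items()
        if u != node
    }
    new_features = {u: f for u, f in features.items() if u != node}
    return new_adjacency, new_features


def unlearn_edge(adjacency, u, v):
    """Edge unlearning: remove one undirected edge, keep both endpoints."""
    new_adjacency = {n: set(neighbors) for n, neighbors in adjacency.items()}
    new_adjacency[u].discard(v)
    new_adjacency[v].discard(u)
    return new_adjacency


# Hypothetical undirected triangle graph with one feature per node.
adjacency = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
features = {"a": [0.1], "b": [0.2], "c": [0.3]}

after_feature_request = unlearn_node_features(features, "a")
after_node_request = unlearn_node(adjacency, features, "c")
after_edge_request = unlearn_edge(adjacency, "a", "b")
```

Each function returns new structures instead of mutating its inputs, so the original graph stays available for comparing a model before and after the request.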

40 Interpretability

New techniques and paradigms for turning model weights and activations into concepts that humans can understand https://en.wikipedia.org/wiki/Neural_network https://en.wikipedia.org/wiki/Artificial_neural_network 41 Interpretability

Interpretability: Mechanistic. Reverse-engineer neural networks; explain neurons and connected circuits. 42

Bias in LLMs: Current systems like ChatGPT employ guardrails and do not respond to biased content. Users on the Web leave out key context, which makes LLMs think the content is biased. This negatively affects user engagement. LLMs must be able to explore and ask more questions. Our work aims to make LLMs bias-aware: context resolves confusion! 48

Other Directions Probing Robustness Jailbreaking …. 49 https://precog.iiit.ac.in/pages/publications.html

50 https://precog.iiit.ac.in/teaching/responsible-ai-nptel-f24/index.html

51 Search for: Ponnurangam Kumaraguru https://www.linkedin.com/in/ponguru/ https://twitter.com/ponguru

Interested in working with us? 52 Full-time Research Associates, PhD Students, Interns
