JanusCon 2024: Mom there are robots in my meeting

saghul, May 09, 2024

About This Presentation

Slides from my presentation at JanusCon 2024


Slide Content

Mom, there are robots
in my meeting!
Saúl Ibarra Corretgé


Once upon a time…

Set of Open Source projects
Easily deploy scalable and secure video
conferencing solutions.
APIs and mobile SDKs
Integrate Jitsi into existing products to add video
conferencing capabilities.
Community
Open Source enthusiasts from all around the world
contribute to Jitsi.

JaaS
Hosted Jitsi, by 8x8 — jaas.8x8.vc

2 API levels
•High level: iframe based, very
versatile and customisable
•Low level: your own UI

Product + Platform
•Work is spread across both
•Improvements in the in-meeting
experience
•Improvements in APIs for
embedders

The AI gold rush
AI + meetings = ?

Alright, let’s build a chatbot!
•It needs to join the meeting as
a participant
•Interact via text (built-in chat)
•Private messages for questions
•Hot word detection?
•Audio output?
•Video output?
How hard can it be?

Demos
Recorded, no sweat!

What was that?
•Node app running Chrome via Playwright
•Bot written with lib-jitsi-meet, our low level API
•Google for transcriptions
•ChatGPT for AI
•Zapier for emails

What was that?
•Node app running Chrome via Playwright
•Bot written with lib-jitsi-meet, our low level API
•Google for transcriptions
•ChatGPT for AI
•PlayHT for playing back audio
•WebGL for the moving robot

But we had other robots before…
•Our recorder is based on a Chrome instance running on a container
•PSTN access is accomplished with a server side WebRTC client
•Our robots are invisible though
•So we used Chrome because YOLO

Ian Malcolm, Jurassic Park
“Your scientists were so preoccupied with
whether or not they could, they didn't stop to
think if they should.”

How do we interact with chatbots?

Private text
conversations
(Mostly)

AI is personal

Let’s focus on the building blocks
•GenAI has text as its lingua
franca
•Whatever the input, it gets
converted to text
•Then some LLM operates on that
text
•Last, the result is converted to
the desired output
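
The building blocks above can be sketched as a three-stage pipeline. This is only an illustration under assumptions, not how Jitsi implements it: the `Pipeline` class and the `to_text`, `run_llm` and `render` stages are hypothetical names, and every stage here is a stub so the sketch runs without any external service.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Pipeline:
    """Input -> text -> LLM -> output, with text as the common format."""
    to_text: Callable[[bytes], str]   # e.g. a speech-to-text service (hypothetical)
    run_llm: Callable[[str], str]     # any LLM call (hypothetical)
    render: Callable[[str], bytes]    # e.g. text-to-speech, or plain chat text

    def handle(self, raw_input: bytes) -> bytes:
        text = self.to_text(raw_input)    # whatever the input, convert it to text
        answer = self.run_llm(text)       # the LLM only ever sees text
        return self.render(answer)        # convert the result to the desired output


# Stub stages standing in for real services.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(prompt: str) -> str:
    return f"You said: {prompt}"


def as_chat_message(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = Pipeline(fake_transcribe, fake_llm, as_chat_message)
print(pipeline.handle(b"hello robots").decode("utf-8"))
```

Swapping the chat output for audio would only mean replacing the `render` stage; the middle of the pipeline stays text-only.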

Text is King

Start by adding core
primitives
Transcriptions + Summaries

Real Time
Usable as subtitles

A10

How does it all fit together?

Getting to users
How would they know to turn transcriptions on?

Brave Talk
What are our customers doing?

How can you build this?
•Use transcription web hooks to
keep a full transcript
•Use transcription iframe events
to locally display the transcript
+ the full transcript to catch up
•Custom LLM (Brave doesn’t use
Skynet) for prompting
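
A minimal sketch of the first and last bullets, assuming the webhook handler receives one event per transcription chunk. The event shape (`participant`, `text`, `final`) and the `TranscriptStore` class are hypothetical, chosen for illustration; check the real webhook payload before relying on any field name.

```python
from dataclasses import dataclass, field


@dataclass
class TranscriptStore:
    """Accumulates final transcription chunks for one meeting."""
    lines: list = field(default_factory=list)

    def on_chunk(self, event: dict) -> None:
        # Hypothetical event shape: {"participant": str, "text": str, "final": bool}.
        # Interim results are skipped so each utterance is stored only once.
        if event.get("final") and event.get("text", "").strip():
            self.lines.append((event["participant"], event["text"].strip()))

    def full_transcript(self) -> str:
        return "\n".join(f"{who}: {what}" for who, what in self.lines)

    def build_prompt(self, question: str) -> str:
        # The whole transcript is passed as context for the LLM prompt.
        return (
            f"Meeting transcript:\n{self.full_transcript()}\n\n"
            f"Question: {question}"
        )


store = TranscriptStore()
store.on_chunk({"participant": "Ana", "text": "Let's ship on", "final": False})
store.on_chunk({"participant": "Ana", "text": "Let's ship on Friday.", "final": True})
store.on_chunk({"participant": "Bo", "text": "Works for me.", "final": True})
print(store.build_prompt("When do we ship?"))
```

The same stored transcript can serve a late joiner who needs to catch up, while live chunks are displayed locally as they arrive.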

What we learned
•Every time we tried to solve a problem with a robot there was a better way
to solve it without it
•In the context of a meeting
•Understanding that transcriptions are a type of recording makes sense
•The first demo was poisonous, complexity wise; start simple!
Hopefully this helps others

Try it out!
•JaaS: https://jaas.8x8.vc — Get started for free
•meet.jit.si — Free Jitsi Meet instance
•github.com/jitsi — host it yourself

Future with AI
•Current generation of AI is very powerful but still has a ways to go
•A good future would involve small models for dedicated tasks
•Skynet
•Keep iterating
•Image summaries with LLaVA?
•Will we do bots?

Thanks
Tudor Avram Răzvan Purdel

That’s all folks!
Questions?
github.com/jitsi

@jitsinews

@saghul

[email protected]