[DSC DACH 24] AI and XR - Ivan Voras

DataScienceConferenc1 72 views 47 slides Oct 08, 2024
Slide 1
Slide 1 of 47
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17
Slide 18
18
Slide 19
19
Slide 20
20
Slide 21
21
Slide 22
22
Slide 23
23
Slide 24
24
Slide 25
25
Slide 26
26
Slide 27
27
Slide 28
28
Slide 29
29
Slide 30
30
Slide 31
31
Slide 32
32
Slide 33
33
Slide 34
34
Slide 35
35
Slide 36
36
Slide 37
37
Slide 38
38
Slide 39
39
Slide 40
40
Slide 41
41
Slide 42
42
Slide 43
43
Slide 44
44
Slide 45
45
Slide 46
46
Slide 47
47

About This Presentation

Life imitates art again as technologies previously only seen in science fiction media are, one by one, becoming accessible in the real world. This talk will present ideas, implementations, and case studies of combining immersive technologies like Augmented and eXtended reality with generative AI.


Slide Content

DSC DACH 2024 Ivan Voras, PhD <[email protected]> AI XR &

OUR GOAL

Overview 01 ABOUT US Introduction, competencies 02 FROM ML TO LLM What are LLMs? Intelligence? Story telling? 03 USING LLMS Intro to integrating LLMs 04 CONTENT GENERATION Text, 2D and 3D content generation via AI 05 METAVERSES Putting it all together 06 QUESTIONS Q&A from the Audience

1 INTRO

About me EQUINOX VISION AR platform startup (slide examples) Highly customized, purpose-made BESPOKE SW Bleeding edge technologies and interfaces CONSULTANCY Experience with multiple startups CTO Ivan Voras, PhD <[email protected]>

Equinox Vision AR PLATFORM Easy deployment of AR content at scale Real-world interactions look like games GAMIFICATION AR is an interface to business processes BUSINESS LINKS Intelligent chat-bots, Interfaces to back-ends AI INTEGRATIONS

2 ML ⇣ LLM

"Classical" machine learning still exists. LLMs are a bit "brute force" approach to AI, trained on billions of data items and relying on "emergent" behaviour.

1 MODEL PREDICTION AKA generation or inference TRAINING Training creates a model from input data 02 03 Machine learning Model is the basis for prediction

GPT-3 CREATOR BENCHMARKS NOTABLE FOR # PARAMETERS OpenAI The first one to gather mindshare 175B (published) June 2020 MMLU 65 MATH 5 ARC(c) 51 https://paperswithcode.com/sota/multi-task-language-understanding-on-mmlu https://paperswithcode.com/sota/math-word-problem-solving-on-math https://paperswithcode.com/sota/common-sense-reasoning-on-arc-challenge

GPT-3.5-turbo CREATOR BENCHMARKS NOTABLE FOR # PARAMETERS OpenAI Really popular ChatGPT backend 20B (estimated) March 2022 MMLU 70 MATH 34 ARC(c) 85

GPT-4 CREATOR BENCHMARKS NOTABLE FOR # PARAMETERS OpenAI Latest best OpenAI model 1.8T (published) May 2024 MMLU 89 MATH 86 ARC(c) 96

Gemini 1 Ultra CREATOR BENCHMARKS NOTABLE FOR # PARAMETERS Google Google's best model 1T (estimated) Feb 2024 MMLU 84 MATH 53 ARC(c) N/A

Claude 3 Opus CREATOR BENCHMARKS NOTABLE FOR # PARAMETERS Anthropic GPT-4 challenger 2T (estimated) March 2024 MMLU 96 MATH 60 ARC(c) 96 https://klu.ai/glossary/anthropic-claude-3

Llama3.1-405B CREATOR BENCHMARKS NOTABLE FOR # PARAMETERS Meta Open source GPT-4 challenger 405B (open) July 2024 MMLU 87 MATH 74 ARC(c) 97

Llama3.1-8B CREATOR BENCHMARKS NOTABLE FOR # PARAMETERS Meta Open source capable small LLM 8 B (open) July 2024 MMLU 73 MATH 52 ARC(c) 83

Qwen2-7B CREATOR BENCHMARKS NOTABLE FOR # PARAMETERS Alibaba Open source multilingual small LLM 7 B (open) June 2024 MMLU 70 MATH 44 ARC(c) 60 https://qwenlm.github.io/blog/qwen2/

Progress 2024 AliBaba Qwen2-7b MMLU: 70 MATH: 44 ARC-C: 60 Free (Apache License), small, multiligual openAI 2022 GPT3.5-turbo-20b MMLU: 70 MATH: 34 ARC-C: 85 Closed, commercial

Literally data pieces in the LLM, of different types. Most importantly: embeddings and weights. Directly related to RAM and computational complexity. Understanding "parameters"

Lossy compression of parameters. Models usually published with 16-bit parameters. Useful quantizations: 8-bit, 4-bit. Understanding quantization

"Programming" the LLM - but with plain language. VERY sensitive to LLM type and hyperparameters. Understanding prompt engineering

3 USING LLMs

Cloud API EASY No issues with setup / drivers NO FLEXIBILITY Can't use custom / specialised models EXPENSIVE Pricing is usually per token in+out, expensive in volume e.g. OpenAI, Groq, Fireworks.AI, Google, AWS, HuggingFace

FLEXIBLE Also cheaper for experimentation Internal / local HW / CLOUD COSTS Can be costly if unused COMPLEXITY Server configuration, GPU drivers Internally hosted, Locally hosted, Partly-managed or unmanaged cloud

Hugging FACE Simple in-process Python API VLLM High performance, advanced OLLAMA Easy to run API server Not only a model

Ollama INSTALLATION https://www.ollama.com/ ollama run llama3.1 USAGE On Linux, installs as a service. On Windows, can be run manually. SERVER Large library of GGUF models, in many quantizations. MODELS Ollama.com

http://chat.ivoras.net/

4 CONTENT 🙋‍♂️💗🙋‍♀️ GENERATION

Generate story and visuals The task

ALMOST...

http://tavern.ivoras.net/ User: tavern Password: tavern.ivoras.net

https://modelslab.com/models

https://www.meshy.ai

https://www.3daistudio.com

AR Platform Connected to LLM Connected to business processes Our company www.equinox.vision

AR app on mobile phones is the client for our platform Our servers manage content and AI interfaces LLM prompts generate content

Self-generating content LLM RESPONSE LLM response is passed to text-to-speech INITIAL TOPIC The LLM also extracts new topics to continue the response TOPIC EXTRACTION Set by the admin on a particular AR object 1 2 3

5 META VERSE

Our vision is to enrich the real world with virtual content, not replace it. AR metaverse

what we do BUILD SCENARIOS Vision, storytelling DEPLOY EXPERIENCES Content, interaction, business interfaces

Services AI STRATEGY DEVELOPMENT AR GAMIFICATION Discover where it makes sense to use AI tools in your company, and use it economically. Implement marketing campaigns, information points and educational assistants in AR. [email protected]

6 Q&A
Tags