Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
555 views
54 slides
Jun 04, 2024
About This Presentation
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
Size: 6.51 MB
Language: en
Added: Jun 04, 2024
Slides: 54 pages
Slide Content
GENERATIVE AI:
FROM PROOF-OF-CONCEPT
TO PRODUCTION,
REAL-WORLD EXAMPLE
MAHER HANAFI
Vice President of Engineering
●VP of Engineering
●VP of Engineering
●Chief Technology Officer
●Ambassador
●Board Advisor
School of Management
Strategic Artificial Intelligence
●Entrepreneur
Introduction
MAHER HANAFI
linkedin.com/in/maher-hanafi
DIGNIUM GRAPHICS
Betterworks Company Confidential 2023
[Diagram: the Betterworks platform. Modules: Goals, 1:1s, Calibration, Check-ins & Perf. Reviews, Feedback & Recognition, Employee Engagement, Employee Development, Succession Planning, Gen AI. Platform qualities: Enterprise-Grade Platform, Delightful User Experience, In the flow of work, Actionable insights, Connected experience, Designed for Global.]
Generative AI
Feedback Assist &
Conversation Assist
Use of generative AI to refine feedback
and conversation responses to
enhance precision, clarity &
actionability, and reduce bias.
Gen AI
Summary
1. Generative AI
Potential applications and capabilities, with a focus on Large Language Models.
2. Where to Start
Identify the problem and start with the data.
3. Proof-of-Concept: Building a Strong Foundation
Choosing the right tools, focusing on value and iterating.
4. Bridging the Gaps to Production
Deployment and enterprise considerations for scalability, security and robustness.
5. Gen AI Maturity Framework
Increasing the sophistication of the solutions while increasing the ROI.
1. Generative AI: Potential Capabilities
Acknowledgment
"The world of Artificial Intelligence is such a vibrant ecosystem, driven by the
countless efforts of researchers and developers from around the globe, who
are relentlessly sharing their knowledge through research papers and
open-source contributions.
This accelerated the field's progress, empowering a wider community to
explore the potential of AI and most recently Generative AI and unlock new
possibilities across various creative and technological domains."
Is AI maturing or is it another “Hype”?
Generative AI: LLM Predicts the Next Word
Source: Alonso Silva, Next token prediction visualization (GPT-2, 124M)
https://alonsosilva-nexttokenprediction.hf.space
An LLM generates the next token given the sequence of tokens so far.
The model does this in a loop, appending each predicted token to the input sequence (such models are called auto-regressive).
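The loop described above can be sketched in a few lines. This is a minimal illustration, not how a real LLM is implemented: `TOY_MODEL` is a made-up lookup table standing in for the network, it conditions only on the last token, and decoding is greedy (always pick the most probable token).

```python
# Hypothetical next-token probabilities; a real LLM computes these with a neural network.
TOY_MODEL = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"ran": 1.0},
    "sat": {"<eos>": 1.0},
    "ran": {"<eos>": 1.0},
}

def predict_next(tokens):
    """Return the most probable next token (this toy looks only at the last token)."""
    dist = TOY_MODEL.get(tokens[-1], {"<eos>": 1.0})
    return max(dist, key=dist.get)

def generate(prompt_tokens, max_new_tokens=10):
    """Auto-regressive loop: predict, append, and feed the longer sequence back in."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        next_token = predict_next(tokens)
        if next_token == "<eos>":  # stop at the end-of-sequence marker
            break
        tokens.append(next_token)
    return tokens

print(generate(["the"]))  # ['the', 'cat', 'sat']
```

Sampling from the distribution instead of taking the maximum would make the output vary between runs; the loop itself stays the same.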
Generative AI: LLM Components
Tokenization
Text sequences are divided into smaller units called tokens, typically meaningful subword units, which lets the model cover a diverse vocabulary.
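To make the idea of subword units concrete, here is a minimal sketch of a greedy longest-match tokenizer. The vocabulary is hypothetical and hand-written; production tokenizers (for example BPE or WordPiece) learn their vocabularies from data and use different matching rules.

```python
VOCAB = {"token", "ization", "un", "break", "able", "s"}  # hypothetical subword vocabulary

def tokenize(word, vocab=VOCAB):
    """Split a word into the longest matching vocabulary pieces, left to right."""
    pieces, i = [], 0
    while i < len(word):
        # Try the longest remaining substring first, then shrink it.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            # No vocabulary entry matched: fall back to a single character.
            pieces.append(word[i])
            i += 1
    return pieces

print(tokenize("tokenization"))  # ['token', 'ization']
print(tokenize("unbreakable"))   # ['un', 'break', 'able']
```

The character fallback is what lets a small vocabulary still represent any input, at the cost of longer token sequences.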
Embedding
Continuous vector representations of tokens that capture semantic information and intricate relationships between tokens, making it possible for the model to understand subtle contextual nuances.
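One common way to read "semantic relationships" out of embeddings is cosine similarity between vectors. The sketch below uses made-up 3-dimensional vectors purely for illustration; real embeddings come from a trained model and have hundreds or thousands of dimensions.

```python
import math

# Hypothetical embeddings, invented for this example.
EMBEDDINGS = {
    "feedback": [0.9, 0.1, 0.3],
    "review": [0.8, 0.2, 0.4],
    "banana": [0.1, 0.9, 0.0],
}

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

related = cosine(EMBEDDINGS["feedback"], EMBEDDINGS["review"])
unrelated = cosine(EMBEDDINGS["feedback"], EMBEDDINGS["banana"])
print(related > unrelated)  # True
```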
Attention
Analyzes the relationships between all tokens in a sequence, which lets the model capture long-range dependencies. The mechanism is highly parallelizable, enabling efficiency at scale.
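As a rough sketch of the mechanism, the function below implements plain scaled dot-product attention in pure Python. It assumes the query, key and value vectors are already given; a real transformer derives them from learned projections, uses multiple heads, and runs the computation as batched matrix operations.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1 (shifted by the max for stability)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """For each query, return a weighted average of the values.

    Weights come from the scaled dot product between the query and every key,
    so every token can draw on every other token in the sequence.
    """
    dim = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[i] for w, v in zip(weights, values)) for i in range(len(values[0]))]
        )
    return outputs

# Two tokens with identical keys receive equal weight, so the output is the mean of the values.
out = attention(queries=[[1.0, 0.0]], keys=[[1.0, 0.0], [1.0, 0.0]], values=[[2.0, 0.0], [0.0, 2.0]])
```

Each query is handled independently of the others, which is why the computation parallelizes well.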
Training
During pre-training, models learn general linguistic patterns, world knowledge, and contextual understanding. They become repositories of language expertise and can then be fine-tuned for specific tasks.
Generation
LLMs can produce coherent and contextually relevant text. Their extensive exposure to language during training enables them to mimic human-like language use, making them versatile tools for many kinds of tasks.
Generative AI: LLM Capabilities
Source: Patrick Meyer, A Taxonomy of Large Language Models Capabilities
https://generativeai.pub/a-taxonomy-of-large-language-models-capabilities-b019d3d582b2
1. Text Generation
2. Information Extraction
3. Classification
4. Text Transformation
5. Text Comprehension
6. Problem-solving
7. Combination of the above
2. Where to Start?
Where to Start? The Flywheel
PLAN: Set up an AI task force and council, and focus on one use case
BUILD: Choose the model, architecture and techniques
RUN: Deployment, observability and performance
OPTIMIZE: Scalability, security and robustness
Plan: Framework
Focus
-On Customer Needs
-Define Clear Goals
-Start Small
-Personas
Expertise
-Invest in Training
-Continuous Learning
-Go Deeper
Data
-Readiness
-Quality
-Governance
-AI PP and ToS
Collaboration
-Cross-Functional Teams
-Customer Council
-Responsible AI
-Legal & Ethics
Build: Choosing a Model
Source: Jingfeng Yang et al.: Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond
https://arxiv.org/abs/2304.13712
Build: Things to Consider When Choosing
Source: https://pub.towardsai.net/which-open-source-llm-should-you-choose-in-2024-c3901ce02271
●Licensing / Cost
Proprietary vs Open-Source, Off-The-Shelf vs Self-Hosted
●Model Properties
Number of Parameters, Context Window, Floating-Point Precision
●Training Data Set
Transparency, Ethical considerations, Biases
●Technical performance
Latency, Inference Speed, Memory Footprint
●Task Performance
Architecture, Instruct, Task specificity, Benchmarks, Domain Knowledge
●Supported Languages
Human Languages and Programming Languages
Build: Proprietary or Open Source LLM?
Factor | Proprietary LLM | Open-Source LLM
Cost | Paid: licensing fees or subscription models | ✅ Free: no licensing fees
Transparency & Explainability | Black box: limited access to internals | ✅ Full access to the model architecture and weights, allowing for greater understanding of the mechanisms and training
Availability of Cutting-Edge Tech | ✅ Premium access: may offer access to the latest advancements | Might play catch-up: may have a delay in incorporating the latest research
Options | Fewer options, but mostly more efficient | ✅ Multiple options, but requires understanding what works best for the tasks
Build: Off-The-Shelf or Self-Hosted?
Factor | Off-The-Shelf LLM | Self-Hosted LLM
Development Time & Expertise | ✅ Fastest: quick to no setup, leverages pre-built functionality, lower technical expertise needed | Slower: model setup, updates, and integration take more time and require technical expertise
Control & Customization | Limited control, pre-defined functionality | ✅ High control, customizable to specific data
Cost | Pay-per-use: cheaper to start but can be expensive for high volume | ✅ Upfront investment, potentially cost-effective for high volume
Privacy & Security | Privacy concerns: especially in enterprise and B2B, since data is sent to the provider's servers | ✅ Greater control: data stays on-premise or in the private cloud
Scalability | Scalability depends on the provider's infrastructure | ✅ Scalability depends on your investment
Ownership and Scope | Depends on vendor: prone to service outages and external changes | ✅ Fully owned by the organization and easy to scale use cases to a bigger scope
Build: Model Properties
Number of Parameters: The larger the model, the more performant it tends to be, but also the bigger and more resource-heavy.
Context Window: The amount of text the LLM considers when predicting the next word or phrase.
Floating-Point Precision: Higher precision means higher accuracy, but also a bigger model.
Benchmarks*: Answering Accuracy, Summarization, Text Generation, Language Understanding. Human-in-the-Loop.
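The link between parameter count, precision and model size is simple arithmetic: size is roughly parameters times bytes per parameter. The sketch below assumes a hypothetical 7-billion-parameter model and counts weights only; activations, the KV cache and runtime overhead are not included.

```python
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1, "int4": 0.5}

def weight_memory_gb(num_params, precision):
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9

NUM_PARAMS = 7_000_000_000  # hypothetical model size, chosen for illustration
for precision in ("fp32", "fp16", "int8", "int4"):
    print(precision, weight_memory_gb(NUM_PARAMS, precision), "GB")
# fp32 28.0 GB, fp16 14.0 GB, int8 7.0 GB, int4 3.5 GB
```

Halving the precision halves the weight footprint, which is the trade-off the precision and size properties describe.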
3. Proof-of-Concept: Building a Strong Foundation
PoC: Identify the “Problem” with Feedback
Feedback Quality & Management:
●Lack of Precision, Vague and Unclear
●Informal or Unprofessional Language
●Actionability Issues
●Writer’s Block: Difficulty articulating
●Time Constraints
●Potential Bias
●Difficulty Finding Themes
●Limited Insights
●Ineffective Communication
●Reduced Motivation
●Missed Opportunities
●Wasted Time and Resources
PoC: Understand the “Solution”
Feedback Writing Assist
●Refined feedback
●Enhanced Precision and Clarity
●Professional Tone
●More Actionable
●Reduced Bias
●Drafting
●Overcome Writer’s Block by providing templates and rephrasing
●+ Explainability
●+ Human Oversight
The Right Tools: Minimum Viable Product (MVP)
[Architecture diagram spanning two zones, The Internet and On-Prem or VPC. Components: Client, API Gateway, Wrapper Service (Modular), Gen AI Service (Modular), Off-The-Shelf LLM, Controls, Observability.]
The AI Maturity Framework - Level 0
Dimensions: Sophistication, Security, Privacy, Responsible AI
Level | Serving | Data & Retrieval | Prompt Engineering
Level 0 | Off-The-Shelf LLM API | Data Pre-Processing | Simple Prompts + User Content
Source: Ali Arsanjani, https://dr-arsanjani.medium.com/the-genai-maturity-model-a1a42f6f390b
PoC: Identify the “Problem”
Feedback Feature
Feedback Quality:
●Lack of Precision
●Clarity Issues
●Actionability Issues
●Potential Bias
Feedback Management:
●Time-consuming Summarization
●Difficulty Finding Themes
●Limited Insights
●Ineffective Communication
●Reduced Motivation
●Missed Opportunities
●Wasted Time and Resources
PoC: Understand the “Solution” 2
Feedback Summary Assist
●Summary of past feedback
○Recurring themes grouped into strengths and areas for improvement
○Saved time
PoC: Data is your key “Asset”
Data Collection:
●Identify Relevant Sources
Relational, Document Store, Vector, Knowledge Graph
●Define Filters and Limits
●Define Controls
Data Preprocessing:
●Standardize, Normalize
●Reduce latency to collect and process the data
●Grouping and Caching
Betterworks Company Confidential 2023
The Right Tools: Minimum Viable Product (MVP)
[Architecture diagram. Zone label: The Internet. Components: Client, API Gateway, Wrapper Service (Modular) ×2, Gen AI Service (Modular), Off-The-Shelf LLM, Data, Feedback Service, Feature Store, Controls, Observability.]
The AI Maturity Framework - Level 1
Dimensions: Sophistication, Security, Privacy, Responsible AI
Level | Serving | Data & Retrieval | Prompt Engineering
Level 0 | Off-The-Shelf LLM API | Data Pre-Processing | Simple Prompts + User Content
Level 1 | Off-The-Shelf LLM API | Data retrieval with SQL | Basic Prompts + Context
4. Bridging the Gap to Production
The Gaps: Third-Party AI Vendor
[Architecture diagram. Zone label: The Internet. Components: Client, API Gateway, Wrapper Service (Modular) ×2, Gen AI Service (Modular), Off-The-Shelf LLM, Data, Feedback Service, DB, Controls, Observability. Labeled flow: data.]
The Gaps: How To Address Them?
●Prompt Engineering
●Self-Deploy Model
●Guardrails
Self-Deployed LLM
Data Risks:
●Data flowing outside of the boundaries
●Data Leaks
●Potential data retention and/or use for training by vendors
Privacy Risks:
●Accidental Exposure, Inferred Information
●Compliance and Governance
Indirect Risks:
●Vendor Lock-in, Throttling, Outages
●Performance Gap getting smaller
●Cost
Source: Valohai, https://valohai.com/blog/llm-for-the-rest-of-us/
Prompt Engineering: In-Context Learning
Source: Language Models are Few-Shot Learners https://arxiv.org/pdf/2005.14165
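In-context learning means the demonstrations live in the prompt rather than in the weights. The helper below is a minimal sketch of how such a prompt might be assembled for the feedback use case; the function name, the "Feedback:/Refined:" labels and the example texts are assumptions made for illustration, not the format used in the talk.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble an instruction, k demonstrations, and the new input into one prompt.

    With no examples this is a zero-shot prompt, with one a one-shot prompt,
    and with several a few-shot prompt.
    """
    parts = [instruction, ""]
    for raw, refined in examples:
        parts += [f"Feedback: {raw}", f"Refined: {refined}", ""]
    parts += [f"Feedback: {query}", "Refined:"]
    return "\n".join(parts)

# Hypothetical demonstrations.
EXAMPLES = [
    ("good job", "Your clear launch plan helped the team ship on schedule."),
    ("be faster", "Consider sharing drafts earlier so reviews can start sooner."),
]

print(build_few_shot_prompt(
    "Rewrite the feedback to be specific, professional and actionable.",
    EXAMPLES,
    "nice presentation",
))
```

The prompt ends with an open "Refined:" line so that the model's next-token prediction continues the pattern set by the demonstrations.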
Customizations: Guardrails
[Diagram: Client, Guard, LLM. Flow labels: Safe Input, Unsafe Input, Safe Output, Unsafe Output.]
*The guards can be libraries/toolkits like NVIDIA NeMo Guardrails or LLMs like Llama Guard
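The guard pattern can be sketched as a wrapper that checks both directions of the exchange. This is a deliberately naive illustration: `BLOCKED_TERMS` is a hypothetical keyword policy, and `llm` is any callable that maps a string to a string. It does not use the API of NeMo Guardrails or Llama Guard, which apply far richer checks.

```python
BLOCKED_TERMS = ("ssn", "password")  # hypothetical policy, for illustration only

def is_safe(text):
    """Naive check: text is unsafe if it mentions any blocked term."""
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)

def guarded_call(llm, user_input, refusal="Request blocked by guardrail."):
    """Screen the input before the LLM sees it and the output before the client sees it."""
    if not is_safe(user_input):
        return refusal  # unsafe input never reaches the LLM
    output = llm(user_input)
    if not is_safe(output):
        return refusal  # unsafe output never reaches the client
    return output

def echo_llm(prompt):
    """Stand-in for a real model call."""
    return "ok: " + prompt

print(guarded_call(echo_llm, "summarize my feedback"))  # ok: summarize my feedback
print(guarded_call(echo_llm, "what is my password"))    # Request blocked by guardrail.
```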
Improvements: Addressing Privacy Concerns
[Architecture diagram. Zone label: The Internet. Components: Client, API Gateway, Wrapper Service (Modular) ×2, Gen AI Service, Off-The-Shelf LLM, Self-Deployed LLM ×2, Guards, Data, ETL, Feedback Service, DB, Controls, Observability. Prompt label: Prompt + data (SQL) + Few-shots.]
The AI Maturity - Level 2
Dimension shown: Responsible AI
Level | Serving | Infrastructure | Data & Retrieval | Prompt Engineering | Guardrails
Level 0 | Off-The-Shelf LLM API | - | Data Pre-Processing | Simple Prompts + User Content | -
Level 1 | Off-The-Shelf LLM API | - | Data retrieval with SQL | Basic Prompts + Context | -
Level 2 | Self-Deployed LLM | Basic LLMOps (Serving, Inference, Observability) | Data retrieval with SQL | One Shot Prompts, In-Context Learning | Basic Guardrails
Run: Betterworks Use Case 3
Goal AI Assist
Goal Assist leverages employee performance data from different Betterworks modules, including previous goals, feedback, conversations, 1:1 meetings, and employee role/title, to suggest individual goals and key results.
The result: precise, measurable, challenging yet attainable goals that align with both team and company objectives.
To effectively address the technical debt in the admin
module, we aim to upgrade and modernize the company's
network infrastructure. This will provide a more stable and
efficient foundation for the admin module to operate on,
reducing potential bottlenecks and errors caused by
outdated technology.
To empower our workforce to be a strong line of defense
against cyber threats, we will implement a comprehensive
cybersecurity training program. This program will equip all
employees with the knowledge and skills needed to identify
and mitigate security risks, fostering a culture of
cybersecurity awareness throughout the organization.
Customizations: Naive RAG
Source: https://bea.stollnitz.com/blog/rag/
[Diagram. Search Service: Embedding Model, Vector Search, Search Index, Documents, DB. Other components: Client, LLM. Labels: Input, Query, Instructions & In-Context Learning, Output. Stages: Retrieval, Augmented, Generation.]
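The retrieve-then-augment steps of naive RAG can be sketched without any external service. In the sketch below, `embed` is a bag-of-words stand-in for a real embedding model, the in-memory list replaces a vector index, and the documents are invented; the final generation step (sending the prompt to an LLM) is left out.

```python
import math
from collections import Counter

def embed(text):
    """Stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, documents, k=2):
    """Retrieval: rank documents by similarity to the query and keep the top k."""
    q = embed(query)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query, documents, k=2):
    """Augmentation: place the retrieved passages in the prompt ahead of the question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"

# Hypothetical documents.
DOCS = [
    "goals are reviewed every quarter",
    "feedback can be requested from peers",
    "the cafeteria opens at noon",
]
print(build_prompt("how often are goals reviewed", DOCS, k=1))
```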
Customizations: Advanced RAG
Source: https://bea.stollnitz.com/blog/rag/
[Diagram. Search Service: Embedding Model, Vector Search, Search Index, Documents, DB. Other components: Client, LLM, Re-Ranking Model. Labels: Input, Query, Instructions & In-Context Learning, Output. Stages: Retrieval, Augmented, Generation.]
Pre-Retrieval: Query Rewriting, Query Expansion
Post-Retrieval: Reranking
Post-Retrieval: Summarization
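The pre-retrieval and post-retrieval stages can be sketched as wrappers around a first-pass search. Everything here is a simplified stand-in: `SYNONYMS` is a hypothetical expansion table (real systems often ask an LLM to rewrite or expand the query), and `rerank` uses a term-density heuristic where a real system would call a re-ranking model.

```python
SYNONYMS = {"goals": ["objectives"], "review": ["assessment"]}  # hypothetical expansion table

def expand_query(query):
    """Pre-retrieval: add synonyms so documents with different wording can match."""
    terms = query.lower().split()
    return terms + [syn for term in terms for syn in SYNONYMS.get(term, [])]

def first_pass(terms, documents, k=3):
    """Retrieval: recall-oriented score, the number of query terms found in the document."""
    scored = [(sum(t in doc.lower().split() for t in terms), doc) for doc in documents]
    hits = [pair for pair in scored if pair[0] > 0]
    return [doc for _, doc in sorted(hits, key=lambda pair: pair[0], reverse=True)[:k]]

def rerank(query, candidates):
    """Post-retrieval: reorder candidates by the share of words that match the original query."""
    terms = set(query.lower().split())

    def density(doc):
        words = doc.lower().split()
        return sum(word in terms for word in words) / len(words)

    return sorted(candidates, key=density, reverse=True)

# Hypothetical documents.
DOCS = ["quarterly objectives are set by each team", "goals", "lunch menu"]
candidates = first_pass(expand_query("goals"), DOCS)
print(rerank("goals", candidates))  # ['goals', 'quarterly objectives are set by each team']
```

Expansion widens recall (the "objectives" document is found), and reranking restores precision by putting the closest match first.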
Customizations: Quantization
Reduce the Model Size
Model quantization enhances the efficiency of LLMs by reducing the precision of their weights.
For example, 32-bit floating-point numbers can be quantized to 8-bit integers.
This reduces the model’s memory footprint without significantly impacting its performance.
Source: The case for 4-bit precision: k-bit Inference Scaling Laws https://arxiv.org/pdf/2212.09720
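A minimal sketch of the float-to-8-bit idea, using symmetric absmax quantization on a short list of weights. This is one simple scheme chosen for illustration and is not necessarily the method studied in the cited paper; production libraries quantize per block or per channel and handle outliers separately.

```python
def quantize_int8(weights):
    """Symmetric absmax quantization: map floats to integers in [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # fall back to 1.0 for all-zero input
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in quantized]

weights = [0.5, -1.27, 0.0, 1.0]  # made-up values
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
print(quantized)  # [50, -127, 0, 100]
print(max(abs(w - r) for w, r in zip(weights, restored)))  # rounding error, at most scale / 2
```

Each weight now needs one byte instead of four, and the only extra storage is the single `scale` value used to map the integers back.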
The AI Maturity - Level 3
Dimensions: Sophistication, Security, Privacy, Responsible AI
Level | Serving | Infrastructure | Data & Retrieval | Prompt Engineering | Weights Manipulation | Guardrails
Level 0 | Off-The-Shelf LLM API | - | Data Pre-Processing | Simple Prompts + User Content | - | -
Level 1 | Off-The-Shelf LLM API | - | Data retrieval with SQL | Basic Prompts + Context | - | -
Level 2 | Self-Deployed LLM | Basic LLMOps (Serving, Inference, Observability) | Data retrieval with SQL | One Shot Prompts, In-Context Learning | - | Basic Guardrails
Level 3 | Self-Deployed Multi-Model | Basic+ LLMOps + Automation | Basic RAG + Vector DB | Few Shot Prompts, In-Context Learning | Quantization | Intermediate Guardrails
Customizations: Fine-Tuning
[Diagram: Pre-Trained LLM + Small Target Dataset → Fine-tuning → Fine-Tuned LLM]
Customizations: RAG or Fine-Tuning?
-When to RAG
  -Grounding with real-time data / dynamic knowledge
  -You don't have a lot of data for training
  -When you use the same LLM for multiple tasks
  -Cost is a constraint
  -Transparency is key (source information)
-When to Fine-Tune
  -When you have static data
  -You have a set of high-quality structured data
  -When you use the LLM for one or very few tasks and want the model to act the same way every time (ex: specific tasks, input/output constraints, language training)
  -Memorization-heavy tasks
The AI Maturity - Level 4
Dimensions: Sophistication, Security, Privacy, Responsible AI
Level | Serving | Infrastructure | Data & Retrieval | Prompt Engineering | Weights Manipulation | Guardrails
Level 0 | Off-The-Shelf LLM API | - | Data Pre-Processing | Simple Prompts + User Content | - | -
Level 1 | Off-The-Shelf LLM API | - | Data retrieval with SQL | Basic Prompts + Context | - | -
Level 2 | Self-Deployed LLM | Basic LLMOps (Serving, Inference, Observability) | Data retrieval with SQL | One Shot Prompts, In-Context Learning | - | Basic Guardrails
Level 3 | Self-Deployed Multi-Model | Basic+ LLMOps + Automation | Basic RAG + Vector DB | Few Shot Prompts, In-Context Learning | Quantization | Intermediate Guardrails
Level 4 | Self-Deployed Multi-Model | Intermediate LLMOps (Basic + Fine-Tuning) | Basic RAG + Vector DB | Few Shot Prompts, In-Context Learning | Fine-Tuning | Intermediate Guardrails
5. AI Maturity Framework
The AI Maturity Framework
Dimensions: Sophistication, Security, Privacy, Responsible AI
Level | Serving | Infrastructure | Data & Retrieval | Prompt Engineering | Weights Manipulation | Guardrails
Level 0 | Off-The-Shelf LLM API | - | Data Pre-Processing | Simple Prompts + User Content | - | -
Level 1 | Off-The-Shelf LLM API | - | Data retrieval with SQL | Basic Prompts + Context | - | -
Level 2 | Self-Deployed LLM | Basic LLMOps (Serving, Inference, Observability) | Data retrieval with SQL | Few Shot Prompts, In-Context Learning | - | Basic Guardrails
Level 3 | Self-Deployed Multi-Model | Basic+ LLMOps + Automation | Basic RAG + Vector DB | Few Shot Prompts, In-Context Learning | Quantization | Intermediate Guardrails
Level 4 | Self-Deployed Multi-Model | Intermediate LLMOps (Basic + Fine-Tuning) | Basic RAG + Vector DB | Few Shot Prompts, In-Context Learning | Fine-Tuning | Intermediate Guardrails
Level 5 | Self-Deployed Multi-Model | Intermediate LLMOps (Basic + Fine-Tuning) | Intermediate RAG + Reranker + HyDE… | Few Shot Prompts, In-Context Learning | Fine-Tuning | Intermediate Guardrails
Level 6 | Multi-Model + Multi-Agent + Orchestration | Advanced LLMOps (Basic + Fine-Tuning) | Advanced RAG + Orchestration | Tree-of-Thoughts / Chain-of-Thoughts | Advanced Fine-Tuning | Advanced Guardrails & Safeguards
Source: Ali Arsanjani, https://dr-arsanjani.medium.com/the-genai-maturity-model-a1a42f6f390b
Future
[Diagram: Agentic Workflows (Agent A (LLM), Agent B (LLM), LLM, SLM), Small Language Models (SLM), and SLM On-The-Edge.]
Resources
Betterworks AI
●AI for HR by Betterworks, https://www.betterworks.com/ai-for-hr
●The Evolution of Performance Management in an AI-Powered World https://tinyurl.com/3eny7vu2
●3 Steps to Thoughtfully Implementing Generative AI in Performance Management
https://tinyurl.com/2cmyxx9b
●3 Differences Between GenAI and Analytics — and Why It Matters for Performance Management
https://tinyurl.com/46sufcff
Learning Resources
●3Blue1Brown - Neural Network https://tinyurl.com/4zr5dcfb
●MLOps Community, https://mlops.community
●AI Makerspace, https://aimakerspace.io
●Deep Learning AI, https://www.deeplearning.ai
GENERATIVE AI:
FROM PROOF-OF-CONCEPT TO
PRODUCTION
ENGINEERING
TECH TALK
5/30/2014