Building successful and secure products with AI and ML
About This Presentation
Advancing the development of Artificial Intelligence (AI) and Machine Learning (ML) driven products requires a balance of cutting-edge theoretical knowledge and practical considerations about platform architecture, tool selection, and security. To navigate these complexities, this presentation will explore three essential topics:
- Successfully applying AI and ML
- Going from experiments to production with MLOps
- Fortifying AI and ML systems against attacks
Size: 1.6 MB
Language: en
Added: Sep 04, 2024
Slides: 25 pages
Slide Content
Building successful and secure
products with AI and ML
Oslo MLOps Community Meetup,
September 4th 2024
Simon Lia-Jonassen
Stefan Mandaric
Agenda
⬢Successfully applying AI and ML
⬢Going from experiments to production
⬢Fortifying AI and ML systems against attacks
Challenges of applying AI and ML
[Chart: incremental value delivered over time across three stages of AI adoption: TACTICAL (simple use cases for AI), STRATEGIC (several ML systems), TRANSFORMATIONAL (AI plays a key role)]
Phase 1: Defining problems and success criteria
Key points:
⬢Define specific problems to be solved.
⬢Set measurable success criteria.
⬢Connect to business objectives.
“A computer program is said to learn from experience E with
respect to some task T and some performance measure P if its
performance on T, as measured by P, improves with experience E.”
Good examples:
⬢Improve click-rate on recommendations.
⬢Reduce user effort performing a task.
⬢Automate a specific task.
https://developers.google.com/machine-learning/guides/rules-of-ml
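The T/E/P definition above maps directly onto the first "good example". As an illustrative sketch (the numbers and function are hypothetical, not from the slides): T is recommending items, E is the logged impressions and clicks, and P is the click-through rate.

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """Performance measure P: fraction of impressions that were clicked."""
    return clicks / impressions if impressions else 0.0

# A system "learns" in Mitchell's sense if P improves with experience E:
ctr_before = click_through_rate(clicks=120, impressions=10_000)  # 0.012
ctr_after = click_through_rate(clicks=180, impressions=10_000)   # 0.018
assert ctr_after > ctr_before
```

Framing success this way gives a measurable criterion that connects model work to a business objective.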
Phase 2: Applying scientific methods in product delivery
Key points:
⬢Test what works and what does not.
⬢Apply latest research and methods.
⬢Iteratively refine and improve solutions.
Good examples:
⬢Retrieval augmented generation.
⬢Vulnerability prioritization.
⬢News recommendations.
Phase 3: Acquiring and processing data at scale with AI
Key points:
⬢Start with off-the-shelf solutions.
⬢Deliver value first, optimize later.
⬢Evolve frameworks over time.
Good examples:
⬢NLP, CNN, LLM tools and models.
⬢Data stores and pipelines.
⬢Media annotation tools.
Phase 4: Improving solutions and experiences with ML
Key points:
⬢Ensure the right problems at the right time.
⬢Ensure a reliable data inventory and data quality.
⬢Streamline deployment, retraining, and ops.
Good examples:
⬢Search and recommendations.
⬢Vulnerability prioritization.
⬢Anomaly detection.
https://ml-ops.org/
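One of the examples above, anomaly detection, can be illustrated with a minimal baseline (a naive sketch, not the method used in the presented systems): flag points more than a few standard deviations from the mean.

```python
import statistics

def zscore_anomalies(values: list[float], threshold: float = 3.0) -> list[int]:
    """Return indices of values deviating more than `threshold` population
    standard deviations from the mean, a classic anomaly-detection baseline."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mean) / std > threshold]

metrics = [10.1, 9.8, 10.0, 10.2, 9.9, 42.0]  # last point is an outlier
print(zscore_anomalies(metrics, threshold=2.0))  # -> [5]
```

Production systems typically use learned models rather than fixed thresholds, but the deployment and retraining concerns listed above apply either way.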
Phase 5: Making data-driven product decisions
Key points:
⬢Lead the shift towards data-driven culture.
⬢Democratize tooling and data across the org.
⬢Decouple - simplify, automate, say no.
Good examples:
⬢Controlled online experiments.
⬢Offline experiments.
⬢Hypothesis validation with data.
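A controlled online experiment ultimately comes down to a statistical test. As a hedged, stdlib-only sketch (the counts are made up; real platforms also handle sequential testing, power, and multiple comparisons), a two-proportion z-test compares conversion rates between control and treatment:

```python
from math import sqrt, erf

def two_proportion_ztest(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Two-sided z-test for a difference in conversion rate between
    control (A) and treatment (B) in an A/B experiment."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Normal CDF via erf: Phi(x) = 0.5 * (1 + erf(x / sqrt(2)))
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

z, p = two_proportion_ztest(conv_a=120, n_a=10_000, conv_b=180, n_b=10_000)
print(f"z={z:.2f}, p={p:.4f}")  # significant at the 0.05 level
```

Offline experiments and hypothesis validation follow the same pattern: state the hypothesis, pick the metric, then let the data decide.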
Phase 6: Innovating and disrupting
Key points:
⬢Combine a comprehensive understanding of
business, problems, technology and data.
⬢Identify and follow through on opportunities.
Good examples:
⬢Search in 1990s.
⬢Recommendations in 2000s.
⬢Deep learning in 2010s.
⬢LLMs in 2020s.
Agenda
⬢Successfully applying AI and ML
⬢Going from experiments to production
⬢Fortifying AI and ML systems against attacks
MLOps bridges the gap between
Experimentation and Operations
Experimentation:
⬢Problem understanding
⬢Flexible exploration
⬢Short cycles
⬢Rapid feedback
Operations:
⬢Robust and repeatable
⬢Controlled environment
⬢Scalable
⬢Automated
[Diagram: MLOps ties together Data Science Code, Specification, and Orchestration plus Ops Code; training data enters through a schema & profile contract, and training runs produce artifacts with an associated schema]
Data for MLOps
●Data versioning forms the foundation for reproducibility
●Schema and data profile are the contact between data
source and ML Project
●Data processing steps that may change during experimentation
should be kept in the ML repo
●Structure transformation scripts into DAGs and manage
code according to best practices
●Make versioned data sets available in project folder
structure but don’t commit large data sets to git
●Produce samples for quick iterations
●Handle source data changes using dataset versioning
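The data-versioning point above can be sketched with a content hash (a minimal illustration, not a replacement for a tool such as DVC; the function name is hypothetical): any change to the source files yields a new version id that can be recorded with the experiment.

```python
import hashlib
from pathlib import Path

def dataset_fingerprint(data_dir: str) -> str:
    """Hash relative file paths and contents into a short, stable
    dataset version id for reproducibility and change detection."""
    digest = hashlib.sha256()
    for path in sorted(Path(data_dir).rglob("*")):
        if path.is_file():
            digest.update(path.relative_to(data_dir).as_posix().encode())
            digest.update(path.read_bytes())
    return digest.hexdigest()[:12]
```

A versioned folder in the project structure could then be named e.g. `prepared-<fingerprint>` without committing the data itself to git.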
# Prepare data for model training
python -m src.digit_recognition.test_train_split \
  --data_path "./data/raw" \
  --output_path "./data/prepared" \
  --test_size 0.2 \
  --random_state 42
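The core of a split step like the one invoked above might look as follows (a hypothetical stdlib-only sketch, not the actual `src.digit_recognition` module; in practice one might use `sklearn.model_selection.train_test_split`):

```python
import random

def train_test_split_items(items, test_size: float = 0.2, random_state: int = 42):
    """Shuffle deterministically with a seeded RNG, then split into
    train and test portions so the split is reproducible."""
    items = list(items)
    random.Random(random_state).shuffle(items)
    n_test = int(len(items) * test_size)
    return items[n_test:], items[:n_test]  # train, test

files = [f"img_{i}.png" for i in range(100)]
train, test = train_test_split_items(files, test_size=0.2, random_state=42)
print(len(train), len(test))  # 80 20
```

Seeding the split via `--random_state` is what makes the prepared data reproducible across runs.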
Data Science Code
●Focus on solving core DS problem in this layer
●Use notebooks with care
●Develop scripts that receive data path and
parameters as CLI arguments
●Local execution for quick feedback loops
●Maintain code quality through CI practices
●Separate approaches into modules - gather
common code in libraries
●DS code must be independent of specification and
orchestration logic
# Train model using CNN approach
python -m src.digit_recognition.cnn_classifier.train \
  --epochs 8 \
  --batch_size 32 \
  --data_path "./data/prepared" \
  --artifacts_path "./data/artifacts"
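The "data path and parameters as CLI arguments" principle might translate into a script skeleton like this (a hypothetical sketch of such a train module, with the training logic elided):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """All inputs arrive as CLI arguments, so the same script runs
    locally, in CI, and under any orchestrator unchanged."""
    parser = argparse.ArgumentParser(description="Train digit classifier")
    parser.add_argument("--epochs", type=int, default=8)
    parser.add_argument("--batch_size", type=int, default=32)
    parser.add_argument("--data_path", default="./data/prepared")
    parser.add_argument("--artifacts_path", default="./data/artifacts")
    return parser

def main(argv=None):
    args = build_parser().parse_args(argv)
    # ... load data from args.data_path, train the model,
    #     write model and metrics to args.artifacts_path ...
    return args

if __name__ == "__main__":
    main()
```

Because the script knows nothing about the orchestrator, the same entry point can be referenced from a local shell, an experiment specification, or a pipeline.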
Experiment specification
●All information required to reproduce an
experiment
●At minimum you should have a
specification for a local experiment
triggered from command line
●Associate artifacts with specification for
full reproducibility and traceability
# Experiment configuration example for AzureML
# Name for tracking
experiment_name: digit_recognition
# What to run
command: >-
  python -m src.digit_recognition.cnn_classifier.train
  --epochs ${{inputs.epochs}}
  --batch_size ${{inputs.batch_size}}
  --data_path ${{inputs.data_path}}
  --artifacts_path "./outputs"
Orchestration
●Defines logic for triggering jobs based
on changes in specification
●Repo will contain multiple
orchestration pipelines and
environments that depend on the
same specification
●Controls runtime parameters or
conditional logic
# Example orchestration pipeline for Azure DevOps
schedules:
- cron: "0 3 * * Mon"
  displayName: Monday 3:00 AM (UTC) weekly retraining
  branches:
    include:
    - main
  always: true

steps:
- task: AzureCLI@2
  displayName: Run training
  inputs:
    azureSubscription: $(SERVICE_CONNECTION)
    scriptType: bash
    scriptLocation: inlineScript
    inlineScript: |
      az ml job create -f $(scriptRoot)/azure-ml-job.yaml \
        --resource-group $(RESOURCE_GROUP) \
        --workspace-name $(WORKSPACE)
MLOps Takeaways
●Well structured code makes it easier to transition from
experimentation to operation and back
●Never lose focus on quick experiment feedback
●Reproducible experiments require control of specification
●Multiple orchestrators based on changes in specification
●Start in experimentation mode then build out operations
layer by layer
●Develop templates for reuse across projects
Agenda
⬢Successfully applying AI and ML
⬢Going from experiments to production
⬢Fortifying AI and ML systems against attacks
Examples of attacks on AI and ML applications
⬢Traditional AI and ML models:
⬡Data leakage.
⬡Adversarial attacks.
⬢GenAI and LLMs:
⬡Jailbreaks.
⬡Exfiltration.
https://www.microsoft.com/en-us/security/blog/2024/02/14/staying-ahead-of-threat-actors-in-the-age-of-ai/
Fortifying AI and ML models
https://learn.microsoft.com/en-us/legal/cognitive-services/openai/overview
RAI and Security Guardrails
●https://ai.google.dev/responsible
●https://www.tensorflow.org/responsible_ai
●https://github.com/microsoft/responsible-ai-toolbox
●https://github.com/guardrails-ai/guardrails
●https://github.com/NVIDIA/NeMo-Guardrails
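As a deliberately naive illustration of the input-guardrail idea (a sketch only; the frameworks listed above use trained classifiers and declarative policies rather than keyword lists, and a blocklist alone will not stop real jailbreaks):

```python
# Hypothetical phrases associated with prompt-injection attempts.
BLOCKLIST = (
    "ignore previous instructions",
    "disregard your system prompt",
    "reveal your system prompt",
)

def input_guardrail(user_prompt: str) -> bool:
    """Return True if the prompt passes a naive prompt-injection screen,
    False if it should be blocked or escalated for review."""
    lowered = user_prompt.lower()
    return not any(phrase in lowered for phrase in BLOCKLIST)

print(input_guardrail("What is the capital of Norway?"))        # True
print(input_guardrail("Ignore previous instructions and ..."))  # False
```

Layering such checks on both input and output, alongside responsible-AI tooling, is the core pattern the frameworks above implement properly.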