Conf42 - AI Augmented Platform Engineering.pdf

cfaa0001 36 views 26 slides Sep 25, 2024

About This Presentation

AI Augmented Platform Engineering looks at how to apply GenAI to various aspects of platform engineering, with the explicit purpose of improving the experience of your developers. The presentation takes a RAG-based approach and shows, through pseudocode, how to build the RAGs step by step.


Slide Content

AI AUGMENTED
PLATFORM ENGINEERING
AJAY CHANKRAMATH
SEP 5, 2024
CHIEF TECHNOLOGY OFFICER / MANAGING DIRECTOR
PLATFORMS & PRODUCTS
BRILLIO
BRILLIO.COM | CHANKRAMATH.COM
CONF42 PLATFORM ENGINEERING

Improving the way platform engineering works
Predictive & Generative AI
Use of LLMs and why EPOCHs matter
Applying RAGs - A Case Study in BFSI domain
How to get started on the AI journey in PE?
INTRODUCTION
BRILLIO / AJAY C
EMPOWERING INDUSTRY-SPECIFIC PLATFORM ENGINEERING
THROUGH AI AND LLMS FOR UNPARALLELED EFFICIENCY,
RELEVANCE, AND SCALABILITY
BRILLIO.COM

WHY?
Code Generation
Developer Productivity
Streamlining CI/CD
Improve Collaboration
Optimize Infra management
Optimize Path-to-production
Better ephemeral envs
Rapid Prototyping
Automate/Reduce repetitive tasks
Improve Utilization of resources
Reducing human error
Enhanced error detection
Streamline documentation
Reduced cognitive load
METRICS THEMSELVES DO NOT CHANGE
LLMs enhance each of them!
TIME TO MARKET | COST REDUCTION | DEV PRODUCTIVITY

PLATFORM ENGINEERING?
PLATFORM
ENGINEERING
APPLICATION RUNTIME
CONTAINERS, K8S..
DATABASE
DATA INFRASTRUCTURE
SERVICE CATALOGS
PIPELINES
COMPLIANCE & GOVERNANCE
OBSERVABILITY
SERVICE MESH
DEVEX REPORTING & DASHBOARDING
IDP ORCHESTRATORS & ECOSYSTEM

PE
DEVEX
SRE
DEVOPS
REQUIRE
SUPPORTS
ENABLES
IN CONTEXT

PE EVOLUTION
Data Center / DevOps
Config Mgmt/Infra / DevOps -> PE
Cloud / Platform Eng
Hybrid Cloud / AI Led and AI Infused Platforms

DOMAIN SPECIFIC
General AI Capabilities in PE
Applying LLMs for Platforms
Industry Abstraction for Platforms
Organizational DNA

AI HYPE?
PREDICTIVE
GENERATIVE
Forecasting
Trend Analysis
Incident Management
Predictive Observability
Code Generation
Test Generation
Documentation
Automated Decisions
LLMS
OpenAI GPT
Codex

LARGE LANGUAGE MODELS
GPT-4: Code Generation, Documentation, DevOps Assistant
Codex: Autocompletion, IaC, API Integration
BERT: Log Analysis / Error Detection, Search & Doc Retrieval, Knowledge Management
T5: Data Transformation, Config Management, Continuous Integration
RoBERTa: Semantic Search of Repos, Knowledge Extraction, Chatbots
LLaMA: Chatbots, Code Reviews, AI-Powered DevEx Tools

UNTRAINED
UNDERFIT
OVERFIT
CONVERGE
TRAINING MODELS & EPOCHS
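
One way to read the four states above is through the validation loss recorded after each epoch: it falls while the model is underfit, flattens as it converges, and rises again once it overfits. The following is a minimal sketch under that assumption; the function name `pick_stopping_epoch`, the `patience` parameter, and the loss values are illustrative and not taken from the talk.

```python
from typing import Sequence


def pick_stopping_epoch(val_losses: Sequence[float], patience: int = 2) -> int:
    """Return the 0-based epoch with the best validation loss seen before
    a simple patience rule would stop training."""
    if not val_losses:
        raise ValueError("need at least one validation loss")
    best_epoch = 0
    best_loss = val_losses[0]
    for epoch in range(1, len(val_losses)):
        loss = val_losses[epoch]
        if loss < best_loss:
            best_epoch, best_loss = epoch, loss
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: likely overfitting
    return best_epoch


# Hypothetical losses: falling (underfit), flat (converging), rising (overfit).
losses = [2.0, 1.2, 0.8, 0.7, 0.75, 0.9, 1.1]
stop_at = pick_stopping_epoch(losses)
```

Here `stop_at` is the epoch whose checkpoint would be kept; too few epochs leaves the model underfit, too many lets it drift past the best point.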

TRAINING LLMS USING EPOCHS
initialize_model_and_tokenizer()
set_hyperparameters(epochs, batch_size, learning_rate, max_length)
create_dataset_and_dataloader(texts, tokenizer, max_length, batch_size)
initialize_optimizer(model_parameters, learning_rate)

# One epoch is one full pass over the training data
for epoch in range(epochs):
    for batch in dataloader:
        clear_gradients()
        inputs, attention_masks = prepare_batch(batch)
        # Language modeling: the inputs also serve as the labels
        outputs = model_forward_pass(inputs, attention_masks, labels=inputs)
        loss = calculate_loss(outputs)
        backpropagate_loss(loss)
        optimizer_step()
        log_training_progress(epoch, batch, loss)

save_trained_model_and_tokenizer()
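
The same epoch and batch structure can be run end to end on a toy problem. This is a sketch only: it fits a single weight with hand-written gradient descent instead of training an LLM, and the `train` function, its defaults, and the synthetic data are assumptions made for illustration.

```python
import random


def train(data, epochs=20, batch_size=4, learning_rate=0.01, seed=0):
    """Fit y = w * x with mini-batch gradient descent.

    Returns the learned weight and the mean loss recorded for each epoch.
    """
    rng = random.Random(seed)
    samples = list(data)  # copy so the caller's list is not reordered
    w = 0.0               # stands in for initialize_model
    history = []
    for epoch in range(epochs):
        rng.shuffle(samples)  # new batch order every epoch
        epoch_loss = 0.0
        for start in range(0, len(samples), batch_size):
            batch = samples[start:start + batch_size]
            # Forward pass and mean squared error on this batch
            errors = [w * x - y for x, y in batch]
            loss = sum(e * e for e in errors) / len(batch)
            # Gradient of the loss with respect to w, derived by hand
            grad = sum(2 * e * x for e, (x, _) in zip(errors, batch)) / len(batch)
            w -= learning_rate * grad  # optimizer step
            epoch_loss += loss * len(batch)
        history.append(epoch_loss / len(samples))
    return w, history


# Synthetic data with a true slope of 3.0
points = [(float(x), 3.0 * x) for x in range(1, 9)]
weight, losses = train(points)
```

Plotting `losses` against the epoch index gives the kind of curve used to judge whether training has converged.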

RETRIEVAL AUGMENTED GENERATION
RAG BRIDGES THE TRAINING DATA GAPS BY INTEGRATING EXTERNAL KNOWLEDGE BASES, ALLOWING LLMS TO DELIVER MORE RELEVANT, ACCURATE, AND UP-TO-DATE CAPABILITIES IN YOUR PLATFORM
WITHOUT RAG: QUERY -> LLM -> OUTPUT
WITH RAG: QUERY + ADDITIONAL DATA SOURCES (DOMAIN AND ORG SPECIFIC DATA) -> AUGMENTED QUERY -> LLM -> OUTPUT
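
The augmented-query step can be sketched in a few lines. This is a minimal illustration, assuming keyword overlap as a stand-in for the vector search a real system would likely use; the function names, the prompt layout, and the sample knowledge base are hypothetical. The call to the LLM itself is left out.

```python
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: List[str], top_k: int = 2) -> List[str]:
    """Rank documents by the number of words they share with the query."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(doc)), doc) for doc in documents]
    scored = [item for item in scored if item[0] > 0]  # drop unrelated documents
    scored.sort(key=lambda item: item[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def augment_query(query: str, documents: List[str]) -> str:
    """Prepend the retrieved context to the user's question."""
    context = retrieve(query, documents)
    context_block = "\n".join(f"- {doc}" for doc in context)
    return f"Context:\n{context_block}\n\nQuestion: {query}"


# Hypothetical organization-specific knowledge
knowledge_base = [
    "Payment APIs require an idempotency key on every request.",
    "Staging clusters are recycled every night.",
    "Loan services must log every decision for audit.",
]
prompt = augment_query("Which key do payment requests require?", knowledge_base)
```

The resulting `prompt` is what would be sent to the LLM in place of the bare query.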

RAG CASE STUDY FOR BFSI DOMAIN
SIMILAR DOMAIN SPECIFIC IMPROVEMENTS CAN BE APPLIED TO ANY BUSINESS DOMAIN
PROBLEM STATEMENT
Enhance the developer experience for BFSI domain developers by integrating traditional platform engineering activities with RAGs

APPROACH
1. Identify critical BFSI APIs
2. Develop domain-specific RAGs >> APIs
3. Create a Platform Abstraction Layer
4. Create a self-service portal for developers
5. Standardize development environments
6. Set up e2e Observability
7. Automate CI/CD Pipelines
8. Automate Security & Compliance

1. Identify domain-specific APIs
AccountCreation / Balance / Transactions
PaymentProcessing
Loan Management
Investment Management
Reporting and Analytics
Trading
Customer Management

2. Develop Domain Specific RAGs
Internal Docs
Regulatory Guidelines
Ontologies/Taxonomies
Open Banking APIs
Market Data
Research Publications
Vendor APIs
Financial Data Lakes
Community Forums/SDK
Models & Knowledge Graphs
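
Before sources like these can be retrieved from, they are usually split into chunks that keep a pointer back to where they came from. The sketch below assumes a simple word-window strategy with overlap; the `Chunk` type, the `chunk_document` function, and the source label are hypothetical choices, not something the slides prescribe.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    source: str    # which knowledge source the text came from
    position: int  # order of the chunk within its document
    text: str


def chunk_document(source: str, text: str, size: int = 8, overlap: int = 2) -> List[Chunk]:
    """Split text into windows of `size` words.

    Consecutive windows share `overlap` words so that a sentence cut at a
    boundary still appears whole in at least one chunk.
    """
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for position, start in enumerate(range(0, len(words), step)):
        chunks.append(Chunk(source, position, " ".join(words[start:start + size])))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


sample = "one two three four five six seven eight nine ten eleven twelve"
chunks = chunk_document("internal-docs", sample, size=8, overlap=2)
```

Keeping `source` on every chunk lets a retrieved passage be traced back to, say, a regulatory guideline rather than a forum post.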

3. Platform Abstraction Layer
function createAbstractionLayer():
    // Design APIs that abstract the complexity of BFSI services and RAG integration
    // e.g., customer data retrieval, transaction processing
    bfsAPIs = defineAPIsForCommonTasks()
    // e.g., document retrieval, augmented query response
    ragIntegrationAPIs = defineAPIsForRAGUsage()
    exposeAPIs(bfsAPIs, ragIntegrationAPIs)
AccountCreation / Balance / Transactions
PaymentProcessing
Loan Management
Investment Management
Reporting and Analytics
Trading
Customer Management
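
One possible shape for such a layer is a small registry that exposes domain operations and RAG operations behind a single call interface. The class, the operation names, and the stub handlers below are all hypothetical; real handlers would delegate to the actual BFSI services and retrieval backend.

```python
from typing import Callable, Dict, List


class PlatformAbstractionLayer:
    """Registry that exposes domain and RAG operations behind one interface."""

    def __init__(self) -> None:
        self._operations: Dict[str, Callable[..., object]] = {}

    def expose(self, name: str, handler: Callable[..., object]) -> None:
        """Register a handler under a stable, developer-facing name."""
        if name in self._operations:
            raise ValueError(f"operation already exposed: {name}")
        self._operations[name] = handler

    def call(self, name: str, **kwargs) -> object:
        """Invoke an exposed operation by name."""
        if name not in self._operations:
            raise KeyError(f"unknown operation: {name}")
        return self._operations[name](**kwargs)

    def catalog(self) -> List[str]:
        """List the exposed operation names, e.g. for a service catalog."""
        return sorted(self._operations)


# Stub handlers standing in for real BFSI and RAG backends
def get_balance(account_id: str) -> dict:
    return {"account_id": account_id, "balance": 100.0}


def retrieve_documents(query: str) -> list:
    return [f"stub result for: {query}"]


layer = PlatformAbstractionLayer()
layer.expose("accounts.balance", get_balance)
layer.expose("rag.retrieve", retrieve_documents)
result = layer.call("accounts.balance", account_id="A-1")
```

Developers depend only on names such as `accounts.balance`, so the backing service can change without touching their code.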

4. Set Up a Self-Service Portal
function buildSelfServicePortal():
    // Set up a portal with integrated development tools and resources
    portal = initializePortalFramework()
    addDocumentation(portal, bfsAPIs, ragIntegrationAPIs)
    addSandboxEnvironment(portal, bfsAPIs, ragIntegrationAPIs)
    deployPortal(portal)

5. Ephemeral Environments
function setupEphemeralDevEnvironments():
    // Create container images with pre-installed BFSI tools and RAG integrations
    devContainer = createContainerImage(bfsAPIs, ragIntegrationAPIs, ciCDTools)
    publishDevContainer(devContainer)
    provideInstructionsForLocalSetup(devContainer)

6. Enable E2E Observability
function implementObservability():
    // Integrate logging, metrics, and tracing for BFSI and RAG-related activities
    loggingService = setupCentralizedLogging(bfsAPIs, ragIntegrationAPIs)
    monitoringService = setupMonitoringDashboards(bfsAPIs, ragIntegrationAPIs)
    tracingService = implementDistributedTracing(bfsAPIs, ragIntegrationAPIs)
    exposeObservabilityToolsToDevelopers(loggingService, monitoringService, tracingService)
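
At the level of a single API, the metrics part of this step can be as small as a decorator that counts calls, errors, and time spent. The sketch below keeps everything in an in-memory dictionary; the `observed` decorator, the `METRICS` store, and the sample function are assumptions for illustration, and a real platform would export these values to its monitoring stack instead.

```python
import functools
import time
from collections import defaultdict

# In-memory metric store, keyed by operation name
METRICS = defaultdict(lambda: {"calls": 0, "errors": 0, "total_seconds": 0.0})


def observed(name):
    """Record call count, error count, and cumulative latency for a function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            started = time.perf_counter()
            try:
                return func(*args, **kwargs)
            except Exception:
                METRICS[name]["errors"] += 1
                raise  # the caller still sees the original error
            finally:
                METRICS[name]["calls"] += 1
                METRICS[name]["total_seconds"] += time.perf_counter() - started
        return wrapper
    return decorator


@observed("rag.retrieve")
def retrieve(query):
    if not query:
        raise ValueError("empty query")
    return [query.upper()]


retrieve("kyc rules")
try:
    retrieve("")
except ValueError:
    pass  # the failure is still counted in METRICS
```

After these two calls, `METRICS["rag.retrieve"]` holds the counts a dashboard would chart.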

7. Set Up Pipelines
function automateCICDPipelines():
    // Set up CI/CD pipelines with BFSI-specific stages (e.g., security checks, compliance testing)
    pipeline = createCICDPipeline()
    addStagesForSecurityCompliance(pipeline)
    addAutomatedTestingStages(pipeline, bfsAPIs, ragIntegrationAPIs)
    integratePipelineWithVersionControl(pipeline)
    // Developers build the pipelines from the published template
    publishPipelineTemplatesToDevelopers(pipeline)
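
A published template can be thought of as an ordered list of named stages, where a failed stage blocks everything after it. The sketch below models that behavior only; the stage names, the context keys, and the `run_pipeline` function are hypothetical placeholders for real CI/CD tooling.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[dict], bool]]


def run_pipeline(stages: List[Stage], context: dict) -> List[Tuple[str, str]]:
    """Run stages in order; after the first failure, mark the rest as skipped."""
    results = []
    failed = False
    for name, check in stages:
        if failed:
            results.append((name, "skipped"))
            continue
        passed = check(context)
        results.append((name, "passed" if passed else "failed"))
        failed = not passed
    return results


# Hypothetical template with BFSI-specific gates before deployment
template: List[Stage] = [
    ("build", lambda ctx: True),
    ("security-scan", lambda ctx: ctx["open_vulnerabilities"] == 0),
    ("compliance-tests", lambda ctx: ctx["audit_logging_enabled"]),
    ("deploy", lambda ctx: True),
]

report = run_pipeline(template, {"open_vulnerabilities": 2, "audit_logging_enabled": True})
```

Because the security and compliance stages sit before `deploy` in the template, every team that adopts it inherits the same gates.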

8. Security & Compliance
function automateSecurityCompliance():
    // Automate security and compliance checks within the platform
    securityScanner = integrateVulnerabilityScanner(bfsAPIs, ragIntegrationAPIs)
    complianceChecker = automateComplianceChecks(securityStandards, bfsAPIs)
    integrateSecurityComplianceIntoPipeline(securityScanner, complianceChecker)
    alertDevelopersOnSecurityComplianceIssues(securityScanner, complianceChecker)
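
Automated compliance checks often reduce to rules evaluated against a service's configuration, with one alert per violation. The following sketch assumes that model; the rule names, thresholds, and configuration keys are invented for illustration and do not come from any specific security standard.

```python
from typing import Callable, Dict, List

# Hypothetical rules: each maps a rule name to a predicate over a service config
RULES: Dict[str, Callable[[dict], bool]] = {
    "encryption-at-rest": lambda cfg: cfg.get("storage_encrypted") is True,
    "tls-minimum-1.2": lambda cfg: cfg.get("tls_version", 0.0) >= 1.2,
    "audit-log-retention": lambda cfg: cfg.get("audit_retention_days", 0) >= 365,
}


def check_compliance(service: str, config: dict) -> List[str]:
    """Return one alert message per violated rule, ordered by rule name."""
    return [
        f"{service}: violates {rule}"
        for rule, is_satisfied in sorted(RULES.items())
        if not is_satisfied(config)
    ]


alerts = check_compliance(
    "payment-processing",
    {"storage_encrypted": True, "tls_version": 1.1, "audit_retention_days": 400},
)
```

A function like this can run as a pipeline stage, with the returned messages routed to the owning team.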

AI POWERED DEVEX
COGNITIVE OVERLOAD
FINDING INFORMATION
DX FRICTION
FEEDBACK LOOPS
OPERATING MODEL
OVERHEAD/WASTE : VALUE-ADD DELIVERY
70:30
40:60
10:90 ???

TESTING
SOLUTION ARCHITECTURE
CODING & DEBUG
USER RESEARCH
REQUIREMENTS ANALYSIS
LEVERAGE?
MONITORING/OBSERVABILITY

AI is changing platform engineering and pushing the boundaries
Training LLMs is still challenging and costly. Focus on epochs
RAGs are starting to make a difference
Metrics that drive the success SHOULD NOT change
CONCLUSION

CONNECT WITH ME?