Privacy-first in-browser Generative AI web apps: offline-ready, future-proof, standards-based

webmaxru 10 views 25 slides Oct 20, 2025


About This Presentation

Powerful generative AI features are quickly becoming a baseline in modern development. Potential blockers include privacy concerns, the need for a stable connection, and the costs associated with using or hosting models. However, we can now leverage generative AI directly in the browser on the user's...


Slide Content

Privacy-first in-browser
Generative AI web apps:
•offline-ready,
•future-proof,
•standards-based
Maxim Salnikov
AI Developer Tools Solution Engineer at
Microsoft

I’m Maxim Salnikov
•Building on the web platform since the 90s
•Organizing developer communities and technical conferences
•Speaking, training, blogging: Webdev, Cloud, Generative AI, Prompt Engineering
•Member of the Web Machine Learning Community Group
Helping developers succeed with Dev Tools, Cloud & AI at Microsoft
Making machine learning a first-class web citizen by incubating Web APIs for machine learning inference in the browser and in products using modern web engines

Native AI in the browser. Standardized.
We use web (61% of PC time)
We use AI (> 1B people)
We [will] have AI-capable devices
We want performance, privacy, offline-ready. All FREE!
@Dev: unified codebase
@Dev: handy abstractions

Web Neural Network API (WebNN)
Near-native execution characteristics: both speed and power efficiency
Heterogeneous hardware execution: CPU, GPU, NPU
Unified abstraction: W3C API standard
Model-agnostic: a general computational graph lets you bring your own model (BYOM)
Compatible with existing ML frameworks
https://www.w3.org/TR/webnn/

It all starts with the use cases
https://www.w3.org/TR/webnn/#usecases
•Person Detection
•Semantic Segmentation
•Skeleton Detection
•Face Recognition
•Facial Landmark Detection
•Style Transfer
•Super Resolution
•Image Captioning
•Text-to-image
•Machine Translation
•Emotion Analysis
•Video Summarization
•Noise Suppression
•Speech Recognition
•Text Generation
•Detecting fake video

Edge AI ecosystem
[Diagram: layered stack, top to bottom]
Use cases: Noise Suppression, Image Classification, Background Segmentation, Natural Language, Object Detection
Frameworks: TensorFlow.js, ONNX Runtime Web, MediaPipe Web, OpenCV.js
Web API: WebAssembly, WebGPU, WebNN
Web Engines: Web Browser (e.g., Chrome/Edge), JavaScript Runtime (e.g., Electron/Node.js)
Native ML APIs: CoreML, DirectML, TFLite, Other ML OS APIs
Hardware: CPU, GPU, NPU
Other labels in the diagram: Windows Studio Effects, API extensions

Which hardware to choose for AI workloads
CPU: Provides the broadest compatibility and usability across all client devices with varying degrees of performance.
GPU: Provides the broadest range of achievable performance across graphics hardware platforms from consumer devices to professional workstations.
NPU: Provides power efficiency for sustained workloads across hardware platforms with purpose-built accelerators.

WebNN for the users
Low Latency
In-browser inference enables novel use cases with local media sources
Privacy Preserving
User data stays on-device, which preserves user privacy
High Availability
No reliance on the network after initial asset caching, so the app keeps working offline
Low Cost
Computing on client devices means no server farms needed.
https://webmachinelearning.github.io/webnn-intro/

WebNN for the developers
Get capabilities from the underlying hardware innovations
Take advantage of the native OS services for machine learning
Benefit web applications and frameworks, including ONNX Runtime Web and TensorFlow.js
Implement consistent, efficient, and reliable AI experiences on the web
https://webmachinelearning.github.io/webnn-intro/

Pre-requisites
https://microsoft.github.io/webnn-developer-preview/install.html
about://flags#web-machine-learning-neural-network
Canary or Dev versions of Edge or Chrome
Enabling NPU: latest drivers for Intel | ARM
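
Once the flag is enabled, a quick check can confirm that the browser exposes the relevant entry points. The sketch below is an illustration only: `detectAiFeatures` is a hypothetical helper name, and it tests nothing more than the presence of `navigator.ml` (WebNN) and `navigator.gpu` (WebGPU). Presence of a property does not guarantee that a context can be created on a particular device, and NPU availability in particular cannot be read from these properties alone.

```javascript
// Hypothetical helper (not part of any library): reports which in-browser
// AI entry points a navigator-like object exposes.
function detectAiFeatures(nav) {
  return {
    webnn: Boolean(nav && 'ml' in nav),   // navigator.ml is the WebNN entry point
    webgpu: Boolean(nav && 'gpu' in nav), // navigator.gpu is the WebGPU entry point
  };
}

// In a browser this inspects the real navigator; outside a browser the
// properties are simply reported as missing.
console.log(detectAiFeatures(globalThis.navigator));
```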

Context, operand, graph…
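
To make the three terms concrete, here is a minimal sketch that adds two 2x2 matrices with WebNN. It is written against the API shape of the W3C draft as I understand it at the time of writing (`shape` in operand descriptors, `createTensor` / `dispatch` / `readTensor`); the specification is still moving, and older browser builds used different names (for example `dimensions` and `compute()`), so treat every call as an assumption to verify against your browser version. The function name `addWithWebNN` is hypothetical.

```javascript
// Sketch: builds and runs the graph c = a + b on 2x2 float32 tensors.
// Returns null when WebNN is not exposed (flag disabled, or not a browser).
async function addWithWebNN(aValues, bValues) {
  const ml = globalThis.navigator?.ml;
  if (!ml || typeof globalThis.MLGraphBuilder !== 'function') return null;

  // Context: the connection to the underlying hardware and OS ML stack.
  const context = await ml.createContext();

  // Operands: typed, shaped placeholders and intermediate values.
  const builder = new globalThis.MLGraphBuilder(context);
  const desc = { dataType: 'float32', shape: [2, 2] };
  const a = builder.input('a', desc);
  const b = builder.input('b', desc);
  const c = builder.add(a, b);

  // Graph: the compiled computation with named outputs.
  const graph = await builder.build({ c });

  // Tensors hold the actual data for inputs and outputs.
  const tensorA = await context.createTensor({ ...desc, writable: true });
  const tensorB = await context.createTensor({ ...desc, writable: true });
  const tensorC = await context.createTensor({ ...desc, readable: true });
  context.writeTensor(tensorA, new Float32Array(aValues));
  context.writeTensor(tensorB, new Float32Array(bValues));

  context.dispatch(graph, { a: tensorA, b: tensorB }, { c: tensorC });
  return new Float32Array(await context.readTensor(tensorC));
}
```

In a WebNN-enabled browser, `await addWithWebNN([1, 2, 3, 4], [10, 20, 30, 40])` should resolve to the element-wise sum; elsewhere it resolves to `null`.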

Let’s build an app
[Diagram: layered stack, top to bottom]
AI use cases
Web frontend app
Transformers.js: high-level*, operates task-based pipelines, handles model fetching & caching
ONNX Runtime Web: mid-level, operates inference sessions, defines model format (ONNX)
WebNN API: low-level, operates on the execution graph
Platform AI capabilities
* Level distribution is relative

What is ONNX?
https://onnxruntime.ai/
https://onnx.ai/
ONNX is an open format built to represent machine learning models. ONNX defines a common set of operators (the building blocks of machine learning and deep learning models) and a common file format to enable AI developers to use models with a variety of frameworks, tools, runtimes, and compilers.
ONNX Runtime is a production-grade AI engine to speed up training and inferencing in your existing technology stack.

ONNX Runtime Web
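// `ort` is the onnxruntime-web module (import or script tag omitted here).
// Session options: which backend (execution provider) to use and how.
const options = {
  executionProviders: [
    {
      name: 'webnn', // wasm | webgpu | webnn | webgl
      deviceType: 'npu', // cpu | gpu | npu
      powerPreference: 'low-power', // default | low-power | high-performance
    },
  ],
};
...
// Pass the options when creating the session; otherwise they have no effect.
const session = await ort.InferenceSession.create('./model.onnx', options);
// dataA and dataB are Float32Array buffers with 12 elements each.
const tensorA = new ort.Tensor('float32', dataA, [3, 4]);
const tensorB = new ort.Tensor('float32', dataB, [4, 3]);
// The keys must match the input names defined in the model.
const feeds = { a: tensorA, b: tensorB };
const results = await session.run(feeds);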
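
Because WebNN is not available everywhere, the `executionProviders` list is a natural place to express a fallback order. The sketch below is one possible approach, not the deck's own code: `buildExecutionProviders` is a hypothetical helper, the feature flags are expected to come from your own detection logic, and the `deviceType` / `powerPreference` values are just examples taken from the options listed on the slide. My understanding is that ONNX Runtime Web tries the listed providers in order, but verify that against the version you use.

```javascript
// Hypothetical helper: builds an ordered executionProviders list from
// detected browser features, always ending with the WebAssembly (CPU) backend.
function buildExecutionProviders({ webnn = false, webgpu = false } = {}) {
  const providers = [];
  if (webnn) {
    // Object form lets WebNN-specific options travel with the provider name.
    providers.push({ name: 'webnn', deviceType: 'gpu', powerPreference: 'default' });
  }
  if (webgpu) providers.push('webgpu');
  providers.push('wasm'); // broadest compatibility, used as the last resort
  return providers;
}

console.log(buildExecutionProviders({ webgpu: true })); // [ 'webgpu', 'wasm' ]
```

The result can be passed as `{ executionProviders: buildExecutionProviders(features) }` in the second argument of `ort.InferenceSession.create`.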

What is Transformers.js?
https://github.com/huggingface/transformers.js
Natural Language Processing: text classification, named entity recognition, question answering, language modelling, summarization, translation, multiple choice, and text generation
Computer Vision: image classification, object detection, segmentation, and depth estimation
Audio: automatic speech recognition, audio classification, and text-to-speech
Multimodal: embeddings, zero-shot audio classification, zero-shot image classification, and zero-shot object detection
Directly in the browser, with task-based APIs

Plus:
https://github.com/huggingface/transformers.js
2.3K+ hosted pretrained models (subset of Hugging Face catalog)
https://huggingface.co/models?library=transformers.js
Seamless caching of the models (with the Cache Storage API)
Serving your own models (converted to the ONNX format)

Task-based API
https://huggingface.co/docs/transformers.js/en/pipelines
import { pipeline } from '@huggingface/transformers';
const classifier = await pipeline('sentiment-analysis');
const result = await classifier('I love AI!');
// [{'label': 'POSITIVE', 'score': 0.9998}]

Text Vision Audio …
https://huggingface.co/docs/transformers.js/en/pipelines
Text: sentence-similarity, summarization, text-generation, translation, question-answering, fill-mask, ...
Vision: image-classification, image-segmentation, image-to-image, mask-generation, object-detection, ...
Audio: audio-classification, automatic-speech-recognition, text-to-speech, text-to-audio, ...

Summary and call to action:
•A web standard for running AI/ML tasks natively in the browser is here
•It’s the only way to leverage all on-device AI capabilities
•There are still some moving parts in the specification (device selection)
•Choose your own comfortable abstraction level using higher-level frameworks
•The same frameworks can provide fallback mechanisms when an API or device is unavailable
•User experience first! Offline-readiness, web workers, providing choices

Demo repos:
React + Next.js: https://github.com/webmaxru/nextjs-webnn
Angular: https://github.com/webmaxru/ng-ai
•AI-capable: Transformers.js (under the hood: ONNX Runtime Web, WebNN)
•WebGPU, WebNN, NPU feature detection
•Smooth UX: AI computation runs in a web worker
•Offline-ready: Workbox
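
The web worker point deserves a small illustration. Loading a model is expensive, so a worker typically creates the pipeline once and reuses it for every message. The sketch below shows that pattern in isolation; it is not taken from the demo repos. `createInferenceHandler` is a hypothetical helper, and the loader is injected so the caching logic can be read (and tried) without downloading a model.

```javascript
// Hypothetical helper: wraps a loader so the expensive pipeline is created
// once and shared by all requests, including ones that arrive while the
// first load is still in progress.
function createInferenceHandler(loadPipeline) {
  let pipelinePromise = null;
  return async function handle(input) {
    pipelinePromise ??= loadPipeline(); // start loading on first use only
    const run = await pipelinePromise;
    return run(input);
  };
}

// Possible wiring inside a module worker (assumes a bundler resolves the import):
//   import { pipeline } from '@huggingface/transformers';
//   const handle = createInferenceHandler(() => pipeline('sentiment-analysis'));
//   self.onmessage = async ({ data }) => self.postMessage(await handle(data));
```

Caching the promise rather than the resolved pipeline is a deliberate choice: two messages arriving back to back then share a single load instead of starting two.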

NEW! Prompt and Writing Assistance APIs
•New APIs for Web Developers: Prompt API, Writing Assistance APIs (Summarizer, Writer, Rewriter), and Translator API are now available in Canary/Dev of Edge and Chrome via Origin Trial
•Built-in Local Model: Uses small, efficient language models integrated into the browsers, with no cloud calls or token costs.
•Easy-to-Use JavaScript Interfaces: Just a few lines of client-side code enable powerful AI features like text generation, summarization, and rewriting.
•Improved Privacy & Performance: No external model hosting, optimized for real-time, low-latency AI on supported devices.

Show me the code!
// Summarize an article with added context and desired style and length.
const summarizer = await Summarizer.create({
  sharedContext: "An article from the Daily Economic News magazine",
  type: "tl;dr",
  length: "short"
});
const summary = await summarizer.summarize(articleEl.textContent, {
  context: "This article was written 2024-08-07 and is in the World Markets section."
});
// Write a blurb based on the provided prompt and tone.
const writer = await Writer.create({
  tone: "formal"
});
const result = await writer.write(
  "A draft for an inquiry to my bank about how to enable wire transfers on my account"
);
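
These built-in APIs are not present in every browser, and even where they are, the model may be unavailable on a given device. A defensive wrapper is therefore worth having. The following is a sketch under assumptions: `summarizeLocally` is a hypothetical helper, the `availability()` call and the `'unavailable'` state reflect my reading of the current Writing Assistance APIs explainer, and the option values are examples. The API object is passed in as a parameter (defaulting to the global) so the logic can be exercised with a stand-in.

```javascript
// Hypothetical helper: summarizes text with the built-in Summarizer API when
// it is exposed and usable, and returns null otherwise so the caller can
// fall back to another strategy.
async function summarizeLocally(text, SummarizerApi = globalThis.Summarizer) {
  if (!SummarizerApi) return null; // API not exposed in this environment

  const availability = await SummarizerApi.availability();
  if (availability === 'unavailable') return null; // device or browser cannot run it

  // Note: creation may trigger a model download on first use.
  const summarizer = await SummarizerApi.create({ type: 'key-points', length: 'short' });
  try {
    return await summarizer.summarize(text);
  } finally {
    summarizer.destroy(); // release the model resources
  }
}
```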

How to get started
Read the explainers and try the playground
Identify real-life use cases and build:
•AI-powered search
•Personalized news feeds and custom content filters
•Calendar event creation
•Seamless contact extraction
Share feedback on GitHub to shape future development
https://github.com/webmachinelearning/prompt-api
https://github.com/webmachinelearning/writing-assistance-apis
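
As an illustration of the contact extraction idea, here is a sketch using the Prompt API. Everything specific in it is an assumption to check against the explainer linked above: the `LanguageModel` global, `availability()`, `initialPrompts`, and the `responseConstraint` option reflect my understanding of the current proposal, and `extractContact`, the system prompt, and the schema are hypothetical. The API object is injectable so the flow can be tried with a stand-in.

```javascript
// Hypothetical helper: asks the built-in language model to pull contact
// details out of free text and returns them as an object, or null when the
// Prompt API is not usable.
async function extractContact(text, LanguageModelApi = globalThis.LanguageModel) {
  if (!LanguageModelApi) return null;
  if ((await LanguageModelApi.availability()) === 'unavailable') return null;

  const session = await LanguageModelApi.create({
    initialPrompts: [
      { role: 'system', content: 'Extract contact details from the text. Reply with JSON only.' },
    ],
  });
  try {
    // JSON Schema describing the expected reply (illustrative fields).
    const schema = {
      type: 'object',
      properties: { name: { type: 'string' }, email: { type: 'string' } },
      required: ['name'],
    };
    const reply = await session.prompt(text, { responseConstraint: schema });
    return JSON.parse(reply);
  } finally {
    session.destroy(); // free the session when done
  }
}
```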

Thank you! Connect with me on LinkedIn to:
•Get this deck immediately
•Get support with WebNN and other AI technologies
•Follow the latest Gen AI updates for developers
