Unstructured Data Processing from Cloud to Edge Webinar

bunkertor 257 views 53 slides Aug 01, 2024

About This Presentation

https://zilliz.com/event/unstructured-data-processing-from-cloud-to-edge

In this talk, Tim explains why you should add a cloud-native vector database to your data and AI platform. He will also cover a quick introduction to Mi...

Size: 14.53 MB

Language: en

Added: Aug 01, 2024

Slides: 53 pages

Slide Content

1 | © Copyright 2024 Zilliz
Unstructured Data Processing From Cloud to Edge
Tim Spann @ Zilliz

2 | © Copyright 10/22/23 Zilliz
Speaker
Tim Spann
Principal Developer Advocate, Zilliz
https://www.linkedin.com/in/timothyspann/
https://x.com/PaaSDev

3 | © Copyright 2024 Zilliz
CONTENTS
01 Introduction
02 Edge AI Use Cases
03 Edge Devices

5 | © Copyright Zilliz
01 Introduction

6 | © Copyright 2024 Zilliz
● Introduction to Unstructured Data Processing
● Introduction to Milvus
● Adding Milvus to Your Infrastructure
● AI Use Case Improvements
● Edge devices working with AI and vector databases

7 | © Copyright Zilliz
Unstructured Data?
- Unstructured data is commonly estimated to make up about 80% of all data.
- Vector databases are purpose-built to store and search unstructured data once it has been converted to vector embeddings.
- Examples of unstructured data include text, images, videos, and audio.
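
To make the idea of searching unstructured data by vector concrete, here is a minimal sketch in plain Python. It is not Milvus code. The bag-of-words `embed` function, the `VOCAB` list, and the sample sentences are all invented for illustration; a real pipeline would use a trained embedding model and a vector database.

```python
import math

# Hypothetical vocabulary for a toy bag-of-words "embedding".
VOCAB = ["camera", "image", "audio", "sensor", "robot"]

def embed(text):
    """Turn text into a vector by counting each vocabulary word."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Invented sample documents standing in for unstructured text.
documents = [
    "robot camera captures image",
    "microphone records audio",
    "sensor feeds robot",
]
vectors = [embed(doc) for doc in documents]

# Rank documents by similarity to the query vector.
query = embed("image from camera")
ranked = sorted(
    range(len(documents)),
    key=lambda i: cosine_similarity(query, vectors[i]),
    reverse=True,
)
best = documents[ranked[0]]
print(best)
```

The same pattern (embed, store, compare) applies to images, audio, and video once a model turns them into vectors.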

8 | © Copyright Zilliz
Why Even Use a Vector DB?
Beyond High-Performance Search
• CRUD Operations: Just like traditional databases, vector databases allow you to Create, Read, Update, and Delete data.
• Data Freshness: Vector databases ensure your data remains up-to-date, reflecting the latest information for accurate searches.
• Persistence: Your data is securely stored and persists even if the system restarts.
• Availability: Your data is readily accessible for search and retrieval operations.
• Scalability: Vector databases can handle growing data volumes efficiently.
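
The CRUD and data freshness points can be sketched with a toy in-memory store. This is an illustration only, not the Milvus API: the `TinyVectorStore` class, its method names, and the sample rows are hypothetical, and it offers none of the persistence, availability, or scalability a real vector database provides.

```python
import math

class TinyVectorStore:
    """In-memory stand-in for a vector database (illustration only)."""

    def __init__(self):
        self.rows = {}  # id -> {"vector": [...], "text": str}

    def upsert(self, row_id, vector, text):
        """Create a row, or update it if the id already exists."""
        self.rows[row_id] = {"vector": vector, "text": text}

    def get(self, row_id):
        """Read a row by id; returns None if it does not exist."""
        return self.rows.get(row_id)

    def delete(self, row_id):
        """Delete a row; returns True if something was removed."""
        return self.rows.pop(row_id, None) is not None

    def search(self, query, limit=1):
        """Return the ids of the rows closest to the query (Euclidean distance)."""
        def distance(row_id):
            return math.dist(query, self.rows[row_id]["vector"])
        return sorted(self.rows, key=distance)[:limit]

store = TinyVectorStore()
store.upsert(1, [0.0, 0.0], "door closed")
store.upsert(2, [1.0, 1.0], "door open")
before = store.search([0.9, 0.9])

# Updating a row changes search results immediately (data freshness).
store.upsert(2, [5.0, 5.0], "door open, relabeled")
after_update = store.search([0.9, 0.9])

store.delete(1)
after_delete = store.search([0.9, 0.9])
print(before, after_update, after_delete)
```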

9 | © Copyright Zilliz
Why Even Use a Vector DB?
Complete Data Management
• Data Management: Vector databases provide tools to manage your data effectively, including data ingestion, indexing, and querying.
• Backup and Migration: Create backups of your data for disaster recovery and easily migrate your data between different systems.

10 | © Copyright Zilliz
Why Even Use a Vector DB?
Operational Ease
• Cloud or On-Premise Deployment: Vector databases can be deployed easily on various platforms, including cloud and on-premise environments.
• Observability: Monitor the health and performance of your vector database to ensure optimal operation.
• Multi-tenancy: Support multiple users or applications accessing the same database instance securely.

11 | © Copyright Zilliz
Milvus Features
• Multi-Tenancy
• Hardware-Accelerated Compute Support
• Python, Java, Golang, NodeJS
• Milvus Lite, K8s, Zilliz Cloud, Docker
• Scalable and Elastic Architecture
• Diverse Index Support
• Versatile Search Capabilities
• Tunable Consistency

12 | © Copyright Zilliz
Technologies for Various Types of Use Cases

Compute Types
Designed for various compute powers, such as AVX512, Neon for SIMD, quantization, cache-aware optimization and GPU.
Leverage strengths of each hardware type, ensuring high-speed processing and cost-effective scalability for different application needs.

Search Types
Support multiple types such as top-K ANN, Range ANN, sparse & dense, multi-vector, grouping, and metadata filtering.
Enable query flexibility and accuracy, allowing developers to tailor their information retrieval needs.

Multi-tenancy
Enable multi-tenancy through collection and partition management.
Allow for efficient resource utilization and customizable data segregation, ensuring secure and isolated data handling for each tenant.

Index Types
Offer support for 15 index types, including popular ones like Hierarchical Navigable Small Worlds (HNSW), PQ, Binary, Sparse, DiskANN and GPU index.
Empower developers with tailored search optimizations, catering to performance, accuracy and cost needs.
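
The difference between top-K search, range search, and metadata filtering can be sketched as follows. This is a brute-force, exact illustration in plain Python, not approximate nearest neighbor (ANN) search and not the Milvus API. The `records` list, the `device` field, and the function names are invented for the example.

```python
import math

# Invented sample records: a vector plus one metadata field.
records = [
    {"id": 1, "vector": [0.0, 0.0], "device": "jetson"},
    {"id": 2, "vector": [1.0, 0.0], "device": "rpi"},
    {"id": 3, "vector": [0.0, 2.0], "device": "jetson"},
    {"id": 4, "vector": [3.0, 4.0], "device": "rpi"},
]

def top_k(query, k, predicate=lambda record: True):
    """Return the k closest records that pass the metadata predicate."""
    candidates = [r for r in records if predicate(r)]
    return sorted(candidates, key=lambda r: math.dist(query, r["vector"]))[:k]

def range_search(query, radius, predicate=lambda record: True):
    """Return every record within the given distance of the query."""
    return [
        r for r in records
        if predicate(r) and math.dist(query, r["vector"]) <= radius
    ]

query = [0.0, 0.0]
top_two = [r["id"] for r in top_k(query, 2)]
nearby = [r["id"] for r in range_search(query, 2.0)]
jetson_only = [r["id"] for r in top_k(query, 2, lambda r: r["device"] == "jetson")]
print(top_two, nearby, jetson_only)
```

Top-K fixes the number of results, range search fixes the distance cutoff, and the predicate narrows either one by metadata. An ANN index trades a little accuracy for speed on the same operations.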

13 | © Copyright Zilliz
02 Edge AI Use Cases

14 | © Copyright Zilliz
Edge AI Use Cases
• Robots
• Smart Cities
• Smart Factories
• Autonomous Cars
• Automated Retail
• Smart Home

15 | © Copyright Zilliz
Local Search on Edge Devices
• Proprietary Document Search
• On-Device Object Detection
• Milvus Lite on Device

16 | © Copyright Zilliz
Edge Hyper Enablers
• 5G and Everywhere Networks
• IoT with Cheap Plentiful Sensors
• Edge Computing Power: CPU, GPU, RAM
• Edge Neural Networks and Gen AI
• Unstructured Data Processing and Vector DB

17 | © Copyright Zilliz
Edge Unstructured Data
• Vision to Images and Videos
• Audio from Cameras and Microphones
• Raw Text
• Edge Neural Networks and Gen AI
• Unstructured Data Processing and Vector DB

21 | © Copyright Zilliz
Edge Hardware Examples
• NVIDIA Jetson Xavier NX
• NVIDIA Jetson AGX Orin
• Smart AI Cameras
• Raspberry Pi 5 with Hailo AI Acceleration module

22 | © Copyright Zilliz
Why Even Use a Vector DB on the Edge?
• Cloud, Docker, Standalone or On-Premise Deployment: Can send vectors and other fields to local, remote or Cloud Milvus.
• Instant Local Search: Access local unstructured data for fast search and local applications.
• Secure Local Data
• No Network Necessary: Especially for autonomous robots and vehicles. Make instant local decisions.
• Local RAG and Supercharged Edge AI: Enhance local image, audio, video and text data with local LLMs, for example Ollama on a Raspberry Pi, for generative AI.
• Local Live Video

27 | © Copyright Zilliz
Some Other SDKs and Interfaces
Node.JS
Java
Golang
RESTful API
.NET C#
Apache Spark
Apache Kafka
Ruby

28 | © Copyright Zilliz
Edge AI  Edge Vector Database

Retrieval Augmented Generation (RAG)
Run a local LLM with a tool like Ollama.

Image Similarity Search
Capture and search images at the edge for no-network, local robotics, remote and secure use.

Video Similarity Search
Search for similar videos, scenes, or objects from local videos.

Audio Similarity Search
Find similar audio clips in local audio for tasks like genre classification or speech recognition for robotics and sensing.

Anomaly Detection
Detect data points, events, audio, images and observations that deviate significantly from the usual pattern at the edge.

Facial Recognition
For security applications.

Customization
Robots

Benefits
• Lower latency
• Offline
• Security
• Localized storage
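
As one possible reading of the anomaly detection use case, the sketch below flags vectors that sit far from the centroid of known-normal vectors. The centroid-distance approach, the `threshold` value, and the sample data are assumptions chosen for illustration; the slides do not specify a method.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

def find_anomalies(baseline, readings, threshold):
    """Return indexes of readings farther than threshold from the baseline centroid."""
    center = centroid(baseline)
    return [
        index for index, vector in enumerate(readings)
        if math.dist(vector, center) > threshold
    ]

# Invented embeddings of "normal" observations captured at the edge.
baseline = [[1.0, 1.0], [1.2, 0.8], [0.8, 1.2]]

# New observations; the second one is far from the usual pattern.
readings = [[1.1, 0.9], [5.0, 5.0], [0.9, 1.0]]

anomalies = find_anomalies(baseline, readings, threshold=1.0)
print(anomalies)
```

Because the check only needs local vectors, it can run on the device with no network connection.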

29 | © Copyright Zilliz
Edge Vectors to the Cloud
[Diagram layers: Framework, Hardware Infrastructure, Embedding Models, LLMs, Software Infrastructure, Vector Database]

30 | © Copyright Zilliz
Milvus-Lite to the Cloud
● Milvus-Lite Dump/Export to Cloud Import
● Dual Ingest
● Switch to Cloud Only
● Kafka / Pulsar / MQTT
● Unstructured Data to MinIO, S3 or Cloud Object Storage
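
The dual ingest option can be sketched as writing every record locally first and forwarding it to the cloud when a connection is available. This is a minimal illustration under stated assumptions: the `DualIngest` class, the `fake_cloud_insert` function, and the use of `ConnectionError` to signal an offline state are all hypothetical, and a real deployment might use Kafka, Pulsar, or MQTT as the buffer instead of an in-process queue.

```python
from collections import deque

class DualIngest:
    """Write each record locally, then forward it to the cloud when possible."""

    def __init__(self, local_store, send_to_cloud):
        self.local_store = local_store      # stand-in for a local collection
        self.send_to_cloud = send_to_cloud  # callable; raises ConnectionError when offline
        self.pending = deque()              # records not yet accepted by the cloud

    def ingest(self, record):
        self.local_store.append(record)     # the local write always happens first
        self.pending.append(record)
        self.flush()

    def flush(self):
        """Send buffered records in order; return how many are still waiting."""
        while self.pending:
            try:
                self.send_to_cloud(self.pending[0])
            except ConnectionError:
                return len(self.pending)    # still offline, keep the backlog
            self.pending.popleft()
        return 0

# Simulated cloud endpoint with a switchable network state.
cloud = []
network = {"up": False}

def fake_cloud_insert(record):
    if not network["up"]:
        raise ConnectionError("no network")
    cloud.append(record)

local = []
pipeline = DualIngest(local, fake_cloud_insert)
pipeline.ingest({"id": 1, "vector": [0.1, 0.2]})
pipeline.ingest({"id": 2, "vector": [0.3, 0.4]})
backlog_offline = len(pipeline.pending)

network["up"] = True
backlog_online = pipeline.flush()
print(backlog_offline, backlog_online, [r["id"] for r in cloud])
```

Local search keeps working while offline, and the cloud copy catches up in order once the network returns.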

31 | © Copyright Zilliz
Milvus-Lite Export to JSON
milvus-lite dump -d XavierEdgeAI.db -p /home/nvidia/nvme/AIMXavierEdgeAI/backup/ -c XavierEdgeAI
Dump collection XavierEdgeAI's data: 100%|████████████████| 33/33 [00:00<00:00, 188.54it/s]
Dump collection XavierEdgeAI success
Dump collection XavierEdgeAI's data: 100%|████████████████| 33/33 [00:00<00:00, 127.16it/s]

https://github.com/milvus-io/milvus-lite

https://medium.com/@tspann/unstructured-data-processing-with-a-raspberry-pi-ai-kit-c959dd7fff47
Raspberry Pi AI Kit Hailo
Edge AI

https://medium.com/@tspann/edgeai-edge-vector-database-6a9b5238bffb
https://github.com/tspannhw/AIM-XavierEdgeAI

38 | © Copyright Zilliz
Vector Database Resources
Give Milvus a Star!
https://github.com/milvus-io/milvus
Chat with me on Discord!

Extracting Value from Unstructured Data
Example
• A company has hundreds of thousands of pages of proprietary documentation to enable their staff to service customers.
Problem
• Searching can be slow, inefficient, or lack context.
Solution
• Create an internal chatbot with ChatGPT and a vector database enriched with company documentation to provide direction and support to employees and customers.
https://osschat.io/chat
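
The retrieval step behind such a chatbot can be sketched as: score documentation chunks against the question, keep the best ones, and place them in the prompt sent to the LLM. The sketch below uses word counts as a stand-in for real embeddings and stops before the LLM call. The function names, the prompt wording, and the sample documentation are invented for illustration.

```python
from collections import Counter

def embed(text):
    """Toy sparse embedding: lowercase word counts with basic punctuation removed."""
    cleaned = text.lower().replace("?", "").replace(".", "")
    return Counter(cleaned.split())

def score(a, b):
    """Dot product of two sparse word-count vectors."""
    return sum(a[word] * b[word] for word in a if word in b)

def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    question_vector = embed(question)
    return sorted(
        chunks,
        key=lambda chunk: score(question_vector, embed(chunk)),
        reverse=True,
    )[:k]

def build_prompt(question, chunks, k=2):
    """Assemble a prompt that grounds the answer in retrieved context."""
    context = "\n".join(f"- {chunk}" for chunk in retrieve(question, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Invented stand-ins for proprietary documentation chunks.
docs = [
    "Reset the router by holding the power button for ten seconds.",
    "Invoices are emailed on the first day of each month.",
    "The router warranty covers hardware faults for two years.",
]

prompt = build_prompt("How do I reset the router?", docs, k=1)
print(prompt)
```

In a production setup the `retrieve` step would be a vector database search over model-generated embeddings, and the assembled prompt would be passed to the LLM.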

We provide deployment flexibility for different operational, security and compliance requirements

Zilliz BYOC (BRING YOUR OWN CLOUD)
Enterprise-ready Milvus for Private VPCs
Deploy in your virtual private cloud

Zilliz Cloud (FULLY MANAGED SERVICE)
Milvus Re-engineered for the Cloud
Available on the leading public clouds

Milvus (SELF MANAGED SOFTWARE)
Most widely-adopted open source vector database
Self hosted on any machine with community support
Local, Docker, K8s

Coming Soon! Coming Soon!

42 | © Copyright Zilliz
Well-connected in LLM infrastructure to enable RAG use cases
[Diagram layers: Framework, Hardware Infrastructure, Embedding Models, LLMs, Software Infrastructure, Vector Database]

43 | © Copyright 10/22/23 Zilliz
Connect with me! Thank you!
milvus.io
github.com/milvus-io/
@milvusio
@paasDev
/in/timothyspann

45 | © Copyright 10/22/23 Zilliz
Getting Started with Vector Databases

Milvus
Open Source Self-Managed
github.com/milvus-io/milvus

Zilliz Cloud
SaaS Fully-Managed
zilliz.com/cloud

46
Unstructured Data Meetup

https://www.meetup.com/unstructured-data-meetup-new-york/

This meetup is for people working with unstructured data. Speakers present on related topics such as vector databases, LLMs, and managing data at scale. The intended audience of this group includes roles like machine learning engineers, data scientists, data engineers, software engineers, and PMs.
This meetup was formerly Milvus Meetup, and is sponsored by Zilliz, maintainers of Milvus.

47 | © Copyright 10/22/23 Zilliz
AIM Weekly by Tim Spann
This week in Milvus, Towhee, Attu, GPTCache, Gen AI, LLM, Apache NiFi, Apache Flink, Apache Kafka, ML, AI, Apache Spark, Apache Iceberg, Python, Java, Vector DB and Open Source friends.
https://bit.ly/32dAJft
https://github.com/milvus-io/milvus

https://medium.com/@tspann/unstructured-street-data-in-new-york-8d3cde0a1e5b

https://medium.com/@tspann/not-every-field-is-just-text-numbers-or-vectors-976231e90e4d

https://medium.com/@tspann/shining-some-light-on-the-new-milvus-lite-5a0565eb5dd9
