Unstructured Data Processing from Cloud to Edge Webinar

bunkertor 257 views 53 slides Aug 01, 2024

About This Presentation

https://zilliz.com/event/unstructured-data-processing-from-cloud-to-edge

In this talk, Tim will explain why you should add a Cloud Native Vector Database to your Data and AI platform. He will also cover a quick introduction to Mi...


Slide Content

1 | © Copyright 2024 Zilliz1
Unstructured Data Processing From
Cloud to Edge
Tim Spann @ Zilliz

2 | © Copyright 10/22/23 Zilliz
Speaker
Tim Spann
Principal Developer Advocate, Zilliz
https://www.linkedin.com/in/timothyspann/
https://x.com/PaaSDev

3 | © Copyright 2024 Zilliz3
CONTENTS
01 Introduction
02 Edge AI Use Cases
03 Edge Devices

4 | © Copyright 2024 Zilliz4

5 | © Copyright Zilliz5
01
Introduction

6 | © Copyright 2024 Zilliz6
●Introduction to Unstructured Data Processing
●Introduction to Milvus
●Adding Milvus to Your Infrastructure
●AI Use Case Improvements
●Edge devices working with AI and vector databases

7 | © Copyright Zilliz7
Unstructured Data?
-Unstructured Data is 80% of data
-Vector Databases are the only type of database that can work with unstructured data
-Examples of Unstructured Data include text, images, videos, audio, etc.

8 | © Copyright Zilliz8
Why Even Use a Vector DB?
Beyond High-Performance Search
•CRUD Operations: Just like traditional databases, vector
databases allow you to Create, Read, Update, and Delete data.
•Data Freshness: Vector databases ensure your data remains
up-to-date, reflecting the latest information for accurate searches.
•Persistence: Your data is securely stored and persists even if the
system restarts.
•Availability: Your data is readily accessible for search and retrieval
operations.
•Scalability: Vector databases can handle growing data volumes
efficiently.

9 | © Copyright Zilliz9
Why Even Use a Vector DB?
Complete Data Management
•Data Management: Vector databases provide tools to manage your data effectively, including data ingestion, indexing, and querying.
•Backup and Migration: Create backups of your data for disaster recovery and easily migrate your data between different systems.

10 | © Copyright Zilliz10
Why Even Use a Vector DB?
Operational Ease
•Cloud or On-Premise Deployment: Vector databases can be deployed easily on various platforms, including cloud and on-premise environments.
•Observability: Monitor the health and performance of your vector database to ensure optimal operation.
•Multi-tenancy: Support multiple users or applications accessing the same database instance securely.

11 | © Copyright Zilliz11
Milvus Features
•Multi-Tenancy
•Hardware-Accelerated Compute Support
•Python, Java, Golang, NodeJS
•Milvus Lite, K8s, Zilliz Cloud, Docker
•Scalable and Elastic Architecture
•Diverse Index Support
•Versatile Search Capabilities
•Tunable Consistency

12 | © Copyright Zilliz12
Technologies for Various Types of Use Cases

Compute Types
•Designed for various compute powers, such as AVX512, Neon for SIMD, quantization, cache-aware optimization and GPU
•Leverage strengths of each hardware type, ensuring high-speed processing and cost-effective scalability for different application needs

Search Types
•Support multiple types such as top-K ANN, Range ANN, sparse & dense, multi-vector, grouping, and metadata filtering
•Enable query flexibility and accuracy, allowing developers to tailor their information retrieval needs

Multi-tenancy
•Enable multi-tenancy through collection and partition management
•Allow for efficient resource utilization and customizable data segregation, ensuring secure and isolated data handling for each tenant

Index Types
•Offer support for a wide range of 15 index types, including popular ones like Hierarchical Navigable Small Worlds (HNSW), PQ, Binary, Sparse, DiskANN and GPU index
•Empower developers with tailored search optimizations, catering to performance, accuracy and cost needs
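
The search types named on this slide (top-K, range, and metadata filtering) can be illustrated with a brute-force sketch in plain Python. This is not how Milvus executes a search; Milvus relies on ANN indexes such as HNSW. The helper names `cosine`, `top_k`, and `range_search`, and the toy rows, are hypothetical and exist only for this illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query, rows, k, where=None):
    """Return the k most similar rows, optionally pre-filtered on metadata."""
    candidates = [r for r in rows if where is None or where(r)]
    ranked = sorted(candidates, key=lambda r: cosine(query, r["vector"]), reverse=True)
    return ranked[:k]

def range_search(query, rows, min_score):
    """Return every row whose similarity to the query is at least min_score."""
    return [r for r in rows if cosine(query, r["vector"]) >= min_score]

# Toy data: each row has a vector plus one metadata field (assumed for the example).
rows = [
    {"id": 1, "vector": [1.0, 0.0], "device": "jetson"},
    {"id": 2, "vector": [0.9, 0.1], "device": "rpi"},
    {"id": 3, "vector": [0.0, 1.0], "device": "rpi"},
]
query = [1.0, 0.0]

nearest_two = top_k(query, rows, k=2)
nearest_rpi = top_k(query, rows, k=1, where=lambda r: r["device"] == "rpi")
similar_enough = range_search(query, rows, min_score=0.5)
```

Top-K fixes the number of results, range search fixes the similarity cutoff, and the `where` predicate plays the role of a metadata filter applied before ranking.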

13 | © Copyright Zilliz13
02
Edge AI Use Cases

14 | © Copyright Zilliz14
Edge AI Use Cases
•Robots
•Smart Cities
•Smart Factories
•Autonomous Cars
•Automated Retail
•Smart Home

15 | © Copyright Zilliz15
Local Search on Edge Devices
•Proprietary Document Search
•On-Device Object Detection
•Milvus Lite on Device

16 | © Copyright Zilliz16
Edge Hyper Enablers
•5G and Everywhere Networks
•IoT with Cheap Plentiful Sensors
•Edge Computing Power CPU, GPU, RAM
•Edge Neural Networks and Gen AI
•Unstructured Data Processing and Vector DB

17 | © Copyright Zilliz17
Edge Unstructured Data
•Vision to Images and Videos
•Audio from Cameras and Microphones
•Raw Text
•Edge Neural Networks and Gen AI
•Unstructured Data Processing and Vector DB

18 | © Copyright Zilliz18
03
Edge Devices

19 | © Copyright Zilliz19
Demos

20 | © Copyright Zilliz20
Demos

21 | © Copyright Zilliz21
Edge Hardware Examples
•NVIDIA Jetson Xavier NX
•NVIDIA Jetson AGX Orin
•Smart AI Cameras
•Raspberry Pi 5 with Hailo AI Acceleration module

22 | © Copyright Zilliz22

Why Even Use a Vector DB on the Edge?
•Cloud, Docker, Standalone or On-Premise Deployment: Can send vectors and other fields to local, remote or Cloud Milvus.
•Instant Local Search: Access local unstructured data for fast search and local applications.
•Secure Local Data
•No Network Necessary: Especially for autonomous robots and vehicles. Make instant local decisions.
•Local RAG and Supercharged Edge AI: Enhance local image, audio, video and text data with local LLMs, such as Ollama on a Raspberry Pi, for generative AI.
•Local Live Video

23 | © Copyright Zilliz23
Milvus Lite Locally
pip install pymilvus

24 | © Copyright Zilliz24
Milvus Docker Locally or Edge Server
pip install pymilvus

25 | © Copyright Zilliz25
Milvus Zilliz Cloud
pip install pymilvus

26 | © Copyright Zilliz26
Python SDK Connect…
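
The code shown on the three deployment slides did not survive extraction, so what follows is a minimal sketch of how the same `pymilvus` install can target each deployment. It assumes the `MilvusClient` class from `pymilvus`; the helper names `client_settings` and `connect`, the file name `./edge_demo.db`, and the default Docker address are illustrative assumptions, and the cloud endpoint and token must come from your own Zilliz Cloud account.

```python
def client_settings(target, cloud_uri=None, token=None):
    """Build keyword arguments for pymilvus.MilvusClient for one deployment target."""
    if target == "lite":
        # A local file path selects embedded Milvus Lite (file name is an assumption).
        return {"uri": "./edge_demo.db"}
    if target == "docker":
        # Default port of a standalone Milvus container on this machine (assumed).
        return {"uri": "http://localhost:19530"}
    if target == "cloud":
        if not cloud_uri or not token:
            raise ValueError("cloud target needs cloud_uri and token")
        return {"uri": cloud_uri, "token": token}
    raise ValueError(f"unknown target: {target}")

def connect(target, **kwargs):
    """Open a client for the chosen target; requires `pip install pymilvus`."""
    # Imported lazily so client_settings() works without pymilvus installed.
    from pymilvus import MilvusClient
    return MilvusClient(**client_settings(target, **kwargs))
```

For example, `connect("lite")` would open a local database file on the edge device, and the same application code could later call `connect("cloud", cloud_uri=..., token=...)` without other changes.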

27 | © Copyright Zilliz27
Some Other SDKs and Interfaces
Node.JS

Java

Golang

RESTful API
.NET C#

Apache Spark

Apache Kafka

Ruby

28 | © Copyright Zilliz28
Edge AI  Edge Vector Database
•Retrieval Augmented Generation (RAG): Run local LLM like OLLAMA
•Image Similarity Search: Capture and search images at the edge for no network, local robotics, remote and secure.
•Video Similarity Search: Search for similar videos, scenes, or objects from local videos.
•Audio Similarity Search: Find similar audios in local audio for tasks like genre classification or speech recognition for robotics and sensing
•Anomaly Detection: Detect data points, events, audio, images and observations that deviate significantly from the usual pattern at the edge
•Facial Recognition: For security applications
•Customization: Robots
•Benefits: Lower latency, Offline, Security, Localized storage
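
One way to picture the anomaly detection use case on this slide is a nearest-neighbour distance check: a new embedding is flagged when nothing already stored lies close to it. The sketch below is a plain-Python illustration under that assumption, not a Milvus feature; the names `nearest_distance` and `is_anomaly`, the sample vectors, and the threshold are hypothetical.

```python
import math

def nearest_distance(vector, known):
    """Euclidean distance from vector to its closest neighbour in known."""
    return min(math.dist(vector, k) for k in known)

def is_anomaly(vector, known, threshold):
    """Flag vector when no stored embedding lies within threshold of it."""
    return nearest_distance(vector, known) > threshold

# Embeddings of "usual" observations already stored on the device (toy values).
known = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]]

usual = is_anomaly([0.05, 0.05], known, threshold=0.5)
unusual = is_anomaly([3.0, 4.0], known, threshold=0.5)
```

On a real device the linear scan would be replaced by a vector search against the local collection, with the threshold tuned on observed data.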

29 | © Copyright Zilliz29
Edge Vectors to the Cloud
[Diagram labels: Framework, Hardware Infrastructure, Embedding Models, LLMs, Software Infrastructure, Vector Database]

30 | © Copyright Zilliz30
Milvus-Lite to the Cloud
●Milvus-Lite Dump/Export to Cloud Import

●Dual Ingest

●Switch to Cloud Only

●Kafka / Pulsar / MQTT

●Unstructured Data to MinIO, S3 or Cloud Object Storage
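
The "Dual Ingest" option can be sketched as a small wrapper that writes every batch to the local store first and forwards it to the cloud when the network allows. This is a hedged illustration, not code from the talk: `DualIngest` and `RecordingClient` are hypothetical names, the clients only need an `insert(collection_name=..., data=...)` method like the one `MilvusClient` offers, and catching `ConnectionError` is an assumption (a real client may raise a different exception type).

```python
class RecordingClient:
    """Stand-in for a Milvus client; it only records what it is asked to insert."""

    def __init__(self, online=True):
        self.online = online
        self.rows = []

    def insert(self, collection_name, data):
        if not self.online:
            raise ConnectionError("no network")
        self.rows.extend(data)

class DualIngest:
    """Write each batch locally, then forward it to the cloud when reachable."""

    def __init__(self, local, cloud):
        self.local = local
        self.cloud = cloud
        self.pending = []  # batches that still need to reach the cloud

    def insert(self, collection_name, data):
        # Local write comes first so edge search keeps working offline.
        self.local.insert(collection_name=collection_name, data=data)
        self.pending.append((collection_name, data))
        self.flush()

    def flush(self):
        """Deliver pending batches in order; stop and keep them if the cloud is down."""
        while self.pending:
            name, data = self.pending[0]
            try:
                self.cloud.insert(collection_name=name, data=data)
            except ConnectionError:
                return False
            self.pending.pop(0)
        return True
```

Queuing undelivered batches keeps the edge device usable without a network and lets a later `flush()` catch the cloud up once connectivity returns.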

31 | © Copyright Zilliz31
Milvus-Lite Export to JSON
milvus-lite dump -d XavierEdgeAI.db -p /home/nvidia/nvme/AIMXavierEdgeAI/backup/ -c XavierEdgeAI
Dump collection XavierEdgeAI's data: 100%|████████████████| 33/33 [00:00<00:00, 188.54it/s]
Dump collection XavierEdgeAI success
Dump collection XavierEdgeAI's data: 100%|████████████████| 33/33 [00:00<00:00, 127.16it/s]

https://github.com/milvus-io/milvus-lite

https://medium.com/@tspann/unstructured-data-processing-with-a-raspberry-pi-ai-kit-c959dd7fff47
Raspberry Pi AI Kit Hailo
Edge AI

https://medium.com/@tspann/edgeai-edge-vector-database-6a9b5238bffb
https://github.com/tspannhw/AIM-XavierEdgeAI

34 | © Copyright Zilliz
RESOURCES

35 | © Copyright Zilliz35
Source Code
https://github.com/tspannhw/AIM-RPIAIKit-PoseEstimation

36 | © Copyright Zilliz36
More Source Code
https://github.com/tspannhw/AIM-RPIAIKit

37 | © Copyright Zilliz37
Even More Source Code
https://github.com/tspannhw/AIM-XavierEdgeAI

38 | © Copyright Zilliz38
Vector Database Resources
Give Milvus a Star!
https://github.com/milvus-io/milvus
Chat with me on Discord!

39 | © Copyright Zilliz39
https://zilliz.com/learn/generative-ai

Extracting Value from Unstructured Data
Example
•A company has hundreds of thousands of pages of proprietary documentation to enable its staff to service customers.
Problem
•Searching can be slow, inefficient, or lack context.
Solution
•Create an internal chatbot with ChatGPT and a vector database enriched with company documentation to provide direction and support to employees and customers.
https://osschat.io/chat
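
The chatbot solution on this slide follows the usual RAG pattern: retrieve relevant passages from the vector database, then hand them to the LLM together with the question. The sketch below covers only the prompt-assembly step and assumes the passages have already been retrieved, most relevant first. The function name `build_prompt`, the instruction wording, and the `max_chars` budget are illustrative assumptions.

```python
def build_prompt(question, passages, max_chars=1000):
    """Assemble an LLM prompt from retrieved passages, most relevant first."""
    context, used = [], 0
    for i, text in enumerate(passages, start=1):
        # Stop once the next passage would exceed the context budget.
        if used + len(text) > max_chars:
            break
        context.append(f"[{i}] {text}")
        used += len(text)
    return (
        "Answer using only the context below.\n\n"
        + "\n".join(context)
        + f"\n\nQuestion: {question}\nAnswer:"
    )

prompt = build_prompt(
    "How do I reset the device?",
    ["Hold the reset button for ten seconds.", "x" * 2000],
)
```

Numbering the passages lets the model cite its sources, and the character budget is a crude stand-in for the token limit of whichever LLM is used.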

We provide deployment flexibility for different operational, security and compliance requirements

BRING YOUR OWN CLOUD
Zilliz BYOC
Enterprise-ready Milvus for Private VPCs
Deploy in your virtual private cloud

FULLY MANAGED SERVICE
Zilliz Cloud
Milvus Re-engineered for the Cloud
Available on the leading public clouds

SELF MANAGED SOFTWARE
Milvus
Most widely-adopted open source vector database
Self hosted on any machine with community support
Local, Docker, K8s

Coming Soon! Coming Soon!

42 | © Copyright Zilliz42
Well-connected in LLM infrastructure to enable RAG use cases
[Diagram labels: Framework, Hardware Infrastructure, Embedding Models, LLMs, Software Infrastructure, Vector Database]

43 | © Copyright 10/22/23 Zilliz
Connect with me! Thank you!
milvus.io
github.com/milvus-io/
@milvusio
@paasDev
/in/timothyspann

44 | © Copyright 10/22/23 Zilliz
Join the
Milvus
Discord!

45 | © Copyright 10/22/23 Zilliz
Getting Started with Vector Databases
Milvus: Open Source Self-Managed
github.com/milvus-io/milvus
Zilliz Cloud: SaaS Fully-Managed
zilliz.com/cloud

46
Unstructured Data Meetup


https://www.meetup.com/unstructured-data-meetup-new-york/

This meetup is for people working in unstructured data. Speakers will present on related topics such as vector databases, LLMs, and managing data at scale. The intended audience of this group includes roles like machine learning engineers, data scientists, data engineers, software engineers, and PMs.
This meetup was formerly Milvus Meetup, and is sponsored by Zilliz, maintainers of Milvus.

47 | © Copyright 10/22/23 Zilliz
AIM Weekly by Tim Spann
This week in Milvus, Towhee, Attu, GPT Cache, Gen AI, LLM, Apache NiFi, Apache Flink, Apache Kafka, ML, AI, Apache Spark, Apache Iceberg, Python, Java, Vector DB and Open Source friends.
https://bit.ly/32dAJft
https://github.com/milvus-io/milvus

https://medium.com/@tspann/unstructured-street-data-in-new-york-8d3cde0a1e5b

https://medium.com/@tspann/not-every-field-is-just-text-numbers-or-vectors-976231e90e4d

https://medium.com/@tspann/shining-some-light-on-the-new-milvus-lite-5a0565eb5dd9

53 | © Copyright Zilliz53
THANK YOU