Session 4 - Specialized AI Associate Series: UiPath Document Understanding Overview

DianaGray10 0 views 36 slides Oct 15, 2025

About This Presentation

🚀 Welcome to Session 4/ UiPath Upskill to Agentic Automation - Specialized AI Series 2025!

📕 Agenda:

Introductions
The stages of the Document Understanding framework
The main Document Understanding components
The Document Understanding project lifecycle
Document Understanding licensing
For l...


Slide Content

The UiPath word mark, logos, and robots are registered trademarks owned by UiPath, Inc. and its affiliates. UiPath (R) is a registered trademark in the United States and several
countries across the globe. See TMEP 906. ©2025 UiPath. All rights reserved.
By UiPath Community
AI Associate
Developer Series

2
Agenda
01 Document Understanding Overview
02 UiPath Platform for Document Understanding
03 Document Understanding – A Closer Look
04 Document Understanding – Methodologies

UiPath Document
Understanding
Overview

4
Get your documents processed intelligently
Teach your robots to understand documents using AI-enhanced skills for data extraction and interpretation.
Drag and drop these capabilities directly into your automation workflows to embed AI

5
What is document understanding?
Document understanding is the ability to extract and interpret information and meaning from a wide range of documents.
It emerges at the intersection of document processing, AI, and RPA.
[Diagram: Document Understanding at the intersection of Artificial Intelligence (AI), Document Processing, and Robotic Process Automation (RPA)]
Not OCR
Not Computer Vision

6
© Copyright UiPath 2022. All Rights Reserved.
Document Understanding Features
Document Understanding offers end-to-end capabilities to use a combination of rule-based and model-based
approaches to process documents.
• Composable framework for flexibility and best technological choices
• End-to-end solution for extracting and interpreting information
• Support for various types of documents
• Support for processing different file formats
• Recognition of various document objects
• Usage of templates/rules and Machine Learning (ML) models to understand data
• AI-powered capabilities
• Model retraining capabilities
• Availability in cloud and on-premise

7
Teach robots how to process your documents using intelligent drag-and-drop skills for data extraction and interpretation
INTELLIGENT, FLEXIBLE, ACCURATE, EFFICIENT
• AI understands documents, takes actions, and learns from the data
• Getting rid of the “noise” caused by unrelated, rotated, or skewed documents
• Saving time and costs with seamless end-to-end automation
• Processing a wide range of documents and layouts, handwriting, checkboxes
• Machine learning (ML) skills improve over time based on the custom data
• Mix of template and template-less approaches for most accurate results

8
Which documents can be handled by Document Understanding?
Structured documents
• Like forms, passports, licenses, time sheets
• Fixed in format and can contain handwriting, signatures, checkboxes
Semi-structured documents
• Like invoices, receipts, purchase orders, medical bills, utility bills
• Containing fixed and variable parts like tables
Unstructured documents
• Like contracts, agreements, emails, scripts, drug prescriptions, news
• No fixed format, free-form sentences/paragraphs

UiPath Platform for
Document
Understanding

10
Document Understanding and the UiPath Business Automation Platform
• Studio, Robots, Orchestrator: core RPA tools
• Action Center: human validation
• AI Center: pre-trained models available out of the box; bring your own model (custom or third-party); retrain the models
• Integration Service

11
End-to-end Intelligent Document Processing solution
1 Receive
Document Types
• Structured
• Semi-structured
• Unstructured
Document Variety
• Multiple languages
• Various formats
• Varying templates
• Handwriting
• Signatures
• Skewed docs
• Checkboxes
• Low quality scans
2 Understand
DIGITIZE, CLASSIFY, EXTRACT, VALIDATE, RETRAIN, ANALYZE
3 Act
Streamline end-to-end processes, improve business outcomes, and reduce manual effort
Built for: RPA and Citizen Developers (no data science skills required)
Built-in ML: Pre-trained ML models, data labeling, retraining

13
Document Understanding Typical Workflow
1. Load taxonomy: defines document types and fields for processing.
2. Digitize documents using Optical Character Recognition (OCR) to make them machine-readable.
3. Classify and split the files into document types.
4. Validate classification results (human review).
5. Train classifiers based on the validated data.
6. Extract information from the documents.
7. Validate extraction results (human review).
8. Train extractors based on the validated data.
9. Export the extracted data for further usage.
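The workflow stages above can be pictured as functions applied in a fixed order. The sketch below is only an illustration in plain Python, not UiPath code: the `Document` class, the stage functions, and their stub logic are all hypothetical, and the human validation and training stages are left out to keep it short.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    """Hypothetical container passed from stage to stage."""
    path: str
    text: str = ""
    doc_type: str = ""
    fields: Dict[str, str] = field(default_factory=dict)
    trace: List[str] = field(default_factory=list)


def digitize(doc: Document) -> Document:
    # Placeholder for OCR: a real engine would read doc.path here.
    doc.text = "INVOICE Invoice No: 456200-TZE1"
    return doc


def classify(doc: Document) -> Document:
    # Placeholder classifier: a single keyword rule.
    doc.doc_type = "Invoice" if "INVOICE" in doc.text.upper() else "Unknown"
    return doc


def extract(doc: Document) -> Document:
    # Placeholder extractor: take whatever follows the label.
    marker = "Invoice No:"
    if marker in doc.text:
        doc.fields["invoice_no"] = doc.text.split(marker, 1)[1].strip()
    return doc


def export(doc: Document) -> Document:
    # Placeholder export: a real workflow would hand the data to another system.
    return doc


STAGES: List[Callable[[Document], Document]] = [digitize, classify, extract, export]


def run_pipeline(path: str) -> Document:
    """Run every stage in order and record which stages were applied."""
    doc = Document(path=path)
    for stage in STAGES:
        doc = stage(doc)
        doc.trace.append(stage.__name__)
    return doc
```

Calling `run_pipeline("scan.pdf")` returns a `Document` whose `trace` lists the stages in the order they ran, which makes it easy to see where a validation or training step would slot in.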

Document Understanding –
A Closer Look

15
Load taxonomy
Taxonomy Manager is used once at the start to define the collection of documents that you want to process, as well as business rules. Additionally, you can describe what data you would like to extract.
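To make the idea of a taxonomy concrete, here is a minimal sketch of document types and the fields to extract from each, written as plain Python data. It is not the format UiPath stores or the Taxonomy Manager produces; the class names, the group "Finance", and the field names are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FieldDef:
    name: str
    data_type: str  # illustrative labels such as "text", "date", "number"


@dataclass(frozen=True)
class DocumentType:
    group: str
    name: str
    fields: List[FieldDef]


# Hypothetical taxonomy: which documents to process and what to extract.
TAXONOMY = [
    DocumentType("Finance", "Invoice", [
        FieldDef("invoice_no", "text"),
        FieldDef("invoice_date", "date"),
        FieldDef("total", "number"),
    ]),
    DocumentType("Finance", "Receipt", [
        FieldDef("total", "number"),
    ]),
]


def fields_for(doc_type_name: str) -> List[str]:
    """Return the field names defined for one document type."""
    for doc_type in TAXONOMY:
        if doc_type.name == doc_type_name:
            return [field_def.name for field_def in doc_type.fields]
    raise KeyError(doc_type_name)
```

Later stages can then ask `fields_for("Invoice")` to learn which fields they are expected to fill.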

16
Digitize text in the documents using OCR
[Sample invoice shown on the slide]
Move-For-You Co.
We move so you don’t need to move
PO: NP74006735
1 February 2020
PAYABLE WITHIN 15 DAYS OF RECEIPT
20800 ALMADEN AVE, SUITE 404
SAN JOSE, CA 95120-0520
T: +1 425 555 9876
F: +1 425 555 3456
E: [email protected]
www.moveforyou-co.com
Bill To:
Tony Tzeng
12345 Mango Lane
Seattle, WA 98108
INVOICE DETAILS                      FEE
Packing services                     $1,282.00
Storage fees (1 month)               $1,884.00
House move (white-glove service)     $5,320.00
Vehicle storage and transport        $5,186.00
Sales tax 10%                        $1,367.20
Total Fee including Tax              $15,039.20
Methods of payment
Personal Check: Move-For-You LLC
Wire Transfer: BigBankCo., Account 123456789-0987ABC
Invoice No: 456200-TZE1

17
Classify and split the documents
Multiple documents scanned into one file are not a problem: thanks to classifiers, the robot can identify the document types and split the file to process the documents accordingly.
Document Understanding offers different classification capabilities, ranging from keyword-based to ML-based classification.
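A keyword-based classifier with file splitting can be sketched in a few lines. This is a simplified illustration, not the UiPath classifier: the keyword lists and function names are assumptions, and it treats a change of page type as the document boundary, so two consecutive documents of the same type would be merged.

```python
from itertools import groupby
from typing import Dict, List, Tuple

# Hypothetical keyword lists per document type.
KEYWORDS: Dict[str, List[str]] = {
    "Invoice": ["invoice", "bill to"],
    "Receipt": ["receipt", "cash tendered"],
}


def classify_page(text: str) -> str:
    """Pick the document type whose keywords occur most often on the page."""
    lowered = text.lower()
    scores = {
        doc_type: sum(lowered.count(word) for word in words)
        for doc_type, words in KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "Unknown"


def split_file(pages: List[str]) -> List[Tuple[str, List[int]]]:
    """Group consecutive pages of the same type into one document."""
    labelled = [(classify_page(text), index) for index, text in enumerate(pages)]
    return [
        (doc_type, [index for _, index in group])
        for doc_type, group in groupby(labelled, key=lambda item: item[0])
    ]
```

Pages that match no keyword come back as "Unknown", which is the kind of result a human reviewer would be asked to confirm or correct.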

18
Validate classification of the
documents
Classification Station is
used to check, correct, and
confirm the results of
document classification and
splitting.

19
Extract data from the documents
You can easily configure data extraction to choose the most suitable extractor for each field.
Use a combination of rule-based and model-based approaches to ensure smooth and accurate processing of different documents.
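Choosing an extractor per field amounts to a lookup table from field name to extraction function. The sketch below assumes two hypothetical extractors: a rule-based one built on a regular expression and a stand-in for a model-based one (here just a stub, since no real model is available in a short example).

```python
import re
from typing import Callable, Dict, Optional

Extractor = Callable[[str], Optional[str]]


def regex_invoice_no(text: str) -> Optional[str]:
    # Rule-based: the value follows a fixed label.
    match = re.search(r"Invoice No:\s*(\S+)", text)
    return match.group(1) if match else None


def stub_model_total(text: str) -> Optional[str]:
    # Stand-in for a model call: simply returns the last dollar amount found.
    amounts = re.findall(r"\$[\d,]+\.\d{2}", text)
    return amounts[-1] if amounts else None


# Hypothetical per-field configuration: which extractor handles which field.
FIELD_EXTRACTORS: Dict[str, Extractor] = {
    "invoice_no": regex_invoice_no,
    "total": stub_model_total,
}


def extract_fields(text: str) -> Dict[str, Optional[str]]:
    """Run the configured extractor for every field."""
    return {name: extractor(text) for name, extractor in FIELD_EXTRACTORS.items()}
```

Swapping the extractor for one field is then a one-line change in `FIELD_EXTRACTORS`, which is the flexibility the per-field configuration is meant to give.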

20
Validate Extraction of Documents
▪ Validate the extracted information and handle exceptions using Validation Station.
▪ Then retrain ML models using the data confirmed or corrected in Validation Station.
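The slides do not say how a workflow decides that a document needs human attention. One common option, assumed here purely for illustration, is a business-rule check on the extracted values. The sketch below uses the figures from the sample invoice shown earlier and a hypothetical `find_inconsistencies` helper; a non-empty result would be a reason to route the document to a reviewer.

```python
from decimal import Decimal
from typing import List


def parse_amount(raw: str) -> Decimal:
    """Turn a string such as "$1,282.00" into an exact decimal value."""
    return Decimal(raw.replace("$", "").replace(",", ""))


def find_inconsistencies(line_items: List[str], tax: str, total: str,
                         tax_rate: Decimal = Decimal("0.10")) -> List[str]:
    """Return human-readable problems; an empty list means the sums agree."""
    problems = []
    subtotal = sum((parse_amount(item) for item in line_items), Decimal("0"))
    expected_tax = (subtotal * tax_rate).quantize(Decimal("0.01"))
    if parse_amount(tax) != expected_tax:
        problems.append("tax does not match subtotal")
    if parse_amount(total) != subtotal + parse_amount(tax):
        problems.append("total does not match subtotal plus tax")
    return problems
```

For the sample invoice, the four line items add up to $13,672.00, 10% tax is $1,367.20, and the total is $15,039.20, so the check returns no problems.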

21
Train Classifiers and Extractors
Let the classifiers and
extractors learn from the
data corrected and
validated in Classification
Station and Validation
Station, respectively.

22
Export the Extracted Data
End-to-end intelligent document processing
Start and continue the document processing workflows with other automation components.
Export the data for further usage or automation, for example to an Excel spreadsheet, to an SAP system, or as an email.
[Diagram: Start → Document Understanding → Decisions → Action / Action / Action → End]
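As a small illustration of the export step, the sketch below writes extracted records to CSV text that a spreadsheet can open. It uses only the Python standard library; the `to_csv` helper and the column names are assumptions, not part of any UiPath activity.

```python
import csv
import io
from typing import Dict, Iterable, List


def to_csv(records: Iterable[Dict[str, str]], columns: List[str]) -> str:
    """Serialize extracted records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for record in records:
        # Missing fields become empty cells instead of raising an error.
        writer.writerow({column: record.get(column, "") for column in columns})
    return buffer.getvalue()
```

Values that contain a comma, such as "$15,039.20", are quoted automatically by the `csv` module, so amounts survive the round trip intact.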

Document Understanding –
Methodologies

24
Document Types
Structured
▪ Required information found in the same place
▪ Fixed in format
▪ Examples: Forms, passports, licenses, and time sheets containing handwritten text, signatures, checkboxes
Semi-structured
▪ Repetitive information each time
▪ Found in fixed and variable document parts such as tables
▪ Examples: Invoices, receipts, purchase orders, medical bills, bank statements, utility bills
Unstructured
▪ No fixed format
▪ Examples: Contracts, agreements, emails, disease descriptions, drug prescriptions, news, voice scripts

25
Document Processing Methodologies
Based on the document type, there are two common data extraction methodologies: rule-based and model-based.
▪ Rule-based approaches require users to create rules/templates that can best extract information from their documents.
▪ Model-based approaches rely on ML and statistical techniques.
Both approaches are powerful tools, but each is sometimes limited in its ability to optimally process the range of documents a company manages.
The Document Understanding framework overcomes the limitations of each individual approach by implementing a hybrid approach.
[Diagram: Hybrid = Rule-based + Model-based]

26
Document Processing Methodologies (Cont’d)
Rule-based
• RegEx Based Extractor: structured fields, mostly used for structured documents
• Form Extractor: mostly structured documents, tables, checkboxes, handwriting, signatures
Model-based
• Forms AI: most structured documents (forms)
• Machine Learning Extractor: mostly semi-structured documents
Hybrid
• A combination of both rule-based and model-based extractors: mostly documents combining both structured and less structured formats

27
Document Processing Methodologies (Cont’d)
RegEx Based Extractor (rule-based): Enables users to create and use a customized regular expression (RegEx) to extract information from a document.
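To show what regular-expression extraction looks like in practice, here is a minimal Python sketch run against a few lines of the sample invoice. The patterns and field names are assumptions written for this one layout; they are not taken from the UiPath RegEx Based Extractor.

```python
import re
from typing import Dict, Optional

# Hypothetical patterns, one per field, each with a single capture group.
PATTERNS = {
    "po_number": re.compile(r"PO:\s*([A-Z]{2}\d+)"),
    "invoice_no": re.compile(r"Invoice No:\s*([\w-]+)"),
    "invoice_date": re.compile(r"\b(\d{1,2} [A-Z][a-z]+ \d{4})\b"),
}


def extract_with_regex(text: str) -> Dict[str, Optional[str]]:
    """Apply every pattern and keep the first match, or None if absent."""
    results = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        results[name] = match.group(1) if match else None
    return results
```

Patterns like these work well when labels and formats are stable, and return `None` as soon as the wording changes, which is why this approach is usually reserved for structured fields.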

28
Document Processing Methodologies (Cont’d)
Form Extractor (rule-based): Enables users to create templates to extract, match, and report information by taking into consideration the words' position inside the document.
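Position-based template extraction can be sketched as follows: each field is assigned a rectangle on the page, and the words whose coordinates fall inside it form the value. This is a simplified illustration of the idea, not the Form Extractor itself; the `Word` class, the fractional coordinate system, and the region values are all invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Word:
    text: str
    x: float  # left edge, as a fraction of page width
    y: float  # top edge, as a fraction of page height


# Region = (left, top, right, bottom) in the same fractional units.
Region = Tuple[float, float, float, float]

# Hypothetical template: where each field sits on a known layout.
TEMPLATE: Dict[str, Region] = {
    "po_number": (0.60, 0.05, 1.00, 0.10),
    "bill_to_name": (0.05, 0.30, 0.50, 0.34),
}


def extract_by_position(words: List[Word], template: Dict[str, Region]) -> Dict[str, str]:
    """Collect, for each field, the words located inside its region."""
    values = {}
    for field_name, (left, top, right, bottom) in template.items():
        inside = [w for w in words if left <= w.x < right and top <= w.y < bottom]
        # Read in natural order: top to bottom, then left to right.
        inside.sort(key=lambda w: (w.y, w.x))
        values[field_name] = " ".join(w.text for w in inside)
    return values
```

Because the regions are fixed, this only works when every document shares the layout, which matches the slide's point that template approaches suit documents that are fixed in format.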

29
Document Processing Methodologies (Cont’d)
Forms AI (model-based): Processes forms and documents that have similar, fixed formats with low diversity in layouts, and provides a point-and-click usage experience.

30
Document Processing Methodologies (Cont’d)
Machine Learning Extractor (model-based): Enables users to extract similar data points from semi-structured or unstructured documents using ML models, without templates.

31
Rule-based or template-based approach
• Relies on rules (like regular expressions) and templates (including anchors)
• Processes structured data in a fixed format
• Ensures high accuracy for already known documents

32
Machine learning (ML) models as a template-less approach
Pre-trained models
• Invoices
• Receipts
• Purchase Orders
• Utility Bills
• Passports
• ID Cards*
• Legal Contracts
• W-2 Forms
• W-9 Forms
• Delivery Notes
• Remittance Advices
• ACORD 125
• I9 Forms
• 990 Forms
• 4506T Forms
• FM1003 Forms
• Pay slips & personal earnings statements
• Certificates of origin
• EU declarations of conformity
• Children’s product certificates
• Certificates of incorporation
• Shipping invoices
• CMS1500
Custom models
• No-code lightweight models in Forms AI
• Custom ML models in AI Center
• Third-party models
Model retraining
• Retraining via AI Center
• Continuous learning loop based on human-validated data
Learn about sharing data for model retraining here

33
Pre-trained ML models
Make use of pre-trained ML models to process invoices, receipts, utility bills, ID cards, and many more document types.
Retrain the models to optimize them for your custom documents and improve the model accuracy over time!
Bring your own model or third-party models and incorporate them in your automations.

34
ML model training via AI Center
You can use Document
Manager to train your custom
ML models or retrain the
existing models in AI Center.
This would help robots
understand the specificities of
your documents better. The
more you work with the model,
the more effective it becomes.
Thus, the accuracy of the
extracted data improves over
time.
Learn about sharing data for model retraining here.

35
Example scenario:

Document Understanding Demo
37
UiPath Document Understanding Workflow -
Animated Flowchart