Drake Pocsatko: We have HOW many documents? Architecting Modern Document Processing
awschicago
Jun 24, 2024
AWS Community Day Midwest 2024
MIDWEST | OHIO
We have HOW many documents?
Architecting Modern Document Processing
Drake Pocsatko
Sr Consultant Cloud Enablement
Slalom Ohio
Agenda
•About me
•What are OCR, NLP, and IDP?
•Why automate document processing?
•The Tools for the Job
•Sample Architectures
•Extraction Demo
About Me
•B.S. Computer Science & Engineering – The Ohio State University
•B.A. Physics – Washington & Jefferson College
•Pittsburgh, PA born and raised
•8 years in Columbus, OH
•~3 years with Slalom
•Slalom is a next-generation professional services company creating value at the intersection of business, technology, and humanity, with markets all over the world and local connections in each.
•Wife, Lauren, of 6 years. 2 dogs & a cat
What are OCR, NLP, and IDP?
•Optical Character Recognition (OCR)
•The ability for software to recognize characters in an image and to convert those characters to text. [1]
•Natural Language Processing (NLP)
•A machine learning technology that gives computers the ability to interpret, manipulate, and comprehend human language. [2]
•Intelligent Document Processing (IDP)
•Automating the process of manual data entry from paper-based documents or document images to integrate with other digital business processes. [3]
[1] Getting started with optical character recognition – AWS
[2] What is Natural Language Processing (NLP)? – AWS
[3] What is Intelligent Document Processing? – AWS
Why Automate Document Processing?
The Problem with Document Processing
•Client “Z” – national healthcare payor
•Z still relies on paper documents: billions of pieces of paper per year, the digital equivalent of petabytes of data.
•Z annually hires numerous contingent workers to process and adjudicate these documents.
•High variance across documents AND within document types.
The Benefits of Intelligent Document Processing
•Manual processing for contracts
•1 contract processed by 1 human per day, with an average rework rate of ~20-25%
•Intelligent processing for contracts
•50K contracts processed by IDP every 8 hours, with an average of ~7% needing human-in-the-loop (HIL) review.
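
A rough back-of-the-envelope comparison of the two bullet groups above, as a small Python sketch. The throughput, rework, and HIL figures come from the slide; the function names are hypothetical, and the linear scaling of manual effort (one person-day per contract, no batching or parallelism effects) is an assumption made only for illustration.

```python
import math

# Figures quoted on the slide.
MANUAL_CONTRACTS_PER_PERSON_DAY = 1       # 1 contract per human per day
MANUAL_REWORK_RATE_RANGE = (0.20, 0.25)   # ~20-25% rework
IDP_CONTRACTS_PER_BATCH = 50_000          # 50K contracts every 8 hours
IDP_HIL_RATE = 0.07                       # ~7% need human-in-the-loop review


def hil_reviews_per_batch(batch_size=IDP_CONTRACTS_PER_BATCH,
                          hil_rate=IDP_HIL_RATE):
    """Contracts in one IDP batch that still need a human reviewer."""
    return round(batch_size * hil_rate)


def manual_person_days(contracts,
                       per_person_day=MANUAL_CONTRACTS_PER_PERSON_DAY):
    """Person-days to process `contracts` by hand (assumes linear scaling)."""
    return math.ceil(contracts / per_person_day)


def manual_rework_range(contracts, rate_range=MANUAL_REWORK_RATE_RANGE):
    """Low and high estimate of manually processed contracts needing rework."""
    low, high = rate_range
    return round(contracts * low), round(contracts * high)


if __name__ == "__main__":
    batch = IDP_CONTRACTS_PER_BATCH
    reviews = hil_reviews_per_batch()
    print("HIL reviews per 8-hour IDP batch:", reviews)
    print("Handled without human review:", batch - reviews)
    print("Person-days to match one batch manually:", manual_person_days(batch))
    print("Manual rework range for the same volume:", manual_rework_range(batch))
```

Under these assumptions, one 8-hour IDP batch leaves about 3,500 contracts for human review, while processing the same 50K contracts by hand would take about 50,000 person-days, with 10,000 to 12,500 of them needing rework.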