Faster document review and production

Lexbe_Webinars 750 views 37 slides Mar 15, 2016

About This Presentation

Increasing volumes of electronically stored information (ESI) in litigation have created the need for faster and more effective review procedures, software, and systems. Larger cases mean that 'eyes on' linear review of all documents just isn't possible sometimes. Document-intensive matt...


Slide Content


A Lawyer's Guide to Faster Document Review & Production
Best Practices for Leveraging Attorney & Staff Time with
Computer-Assisted Search, Document Clustering and Predictive Coding

March 16th, 2016


eDiscovery Webinar Series

Info & Future

o Webinars take place monthly and cover a variety of relevant eDiscovery
topics.

o If you have technical issues or questions, please email
[email protected].

What attendees are saying:

o "Excellent presentation! One of the best webinars I have attended!"
o "Time well spent."
o "Great in terms of content and presentation. Thanks!"
o "Excellent and informative piece!"

A Lawyer's Guide to Faster Document Review & Production | March 16, 2016 | lexbe

eDiscovery Webinar Series

Stu Van Dusen Bio

o eDiscovery Solutions Consultant at Lexbe LC, a
provider of cloud-based litigation processing,
review and document management software and
eDiscovery services

o Specializes in working with firms that lack a full in-house
eDiscovery department and are involved in the type of complex
litigation that requires a high level of precision and eDiscovery
expertise to gain the advantage in the discovery
phase of trial.

Stu Van Dusen
800-401-7809 x55
[email protected]


Faster Document Review & Production

Agenda

o Increasingly Document-Intensive Cases & Linear Reviews
o What are Technology Enhanced Reviews?
o When Should Technology Enhanced Reviews be Considered?
o Modern High-Speed Keyword Search
o Grouping Similar Documents for Grouped Review
o Uses and Applications of Predictive Coding
o Summary


Faster Document Review & Production

[Chart: Growth of the digital universe in zettabytes*, 2005 to 2020, annotated with contributing data sources: iPhones, online storage, digital cameras, Facebook, LinkedIn, DropBox, backup devices, elastic storage, SaaS, Google Streets, personal blogs, Skype, world satellite images, personal scanners, customer service recordings, public webcams, Google Drive, netbooks, cloud instance servers, PaaS.]

Source: IDC Digital Universe Study

*1 zettabyte = 1 trillion gigabytes

Faster Document Review & Production

[Diagram: Case stages (Planning, Culling, Depos, Trial), with document volume decreasing and document relevance increasing as the case progresses.]

CASE STAGE             SOURCE
Collection     8%      Internal            4%
Processing    19%      eDisc Providers    26%
Review        73%      Outside Counsel    70%
Total        100%      Total             100%

Best opportunities for further cost savings will be technologies and
process improvements that increase attorney review efficiencies.

Source: N. Pace and L. Zakaras, Where the Money Goes: Understanding Litigant
Expenditures for Producing Electronic Discovery (RAND Institute for Civil Justice 2012)

Faster Document Review & Production

What is a Technology Enhanced Review?

o Technology enhanced reviews are those in which additional
applications, algorithms, or indexes are applied to a document set in
order to support the logical grouping of documents or automatic
coding of documents based on some degree of human input.

o Litigators should consider applying these technologies to their review
workflows and methodologies when some resource (time, money, or
people) is critically constrained on a case.


Faster Document Review & Production

Modern Keyword Search

o Early Stage Culling - Reduce the amount of ESI to be reviewed by using
keywords to cull document collections.

o Keyword-Based Responsive & Privilege Review - Construct search
queries to return documents that are likely to be responsive or confidential.
Search by name and email of counsel, and by privilege, work-product,
confidential and related keywords.

o ID Documents for Depo Prep - Find and assign key documents related to
specific case participants to prepare for depositions. Search by email
addresses used, names and nicknames used, and important issues associated
with the deponent.

o ID of Key Docs for Trial - Find and mark key case documents. Code
documents that will be needed for trial.
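
The keyword-based privilege screen described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the term list, the counsel domain (`examplelawfirm.com`), and the document fields (`id`, `text`, `participants`) are hypothetical and are not taken from any Lexbe product.

```python
import re

# Hypothetical privilege terms and counsel email domain; a real list would be
# developed and tested for the specific matter.
PRIVILEGE_TERMS = ["privileged", "work product", "attorney-client", "confidential"]
COUNSEL_DOMAINS = ["examplelawfirm.com"]

def build_pattern(terms):
    """Compile one case-insensitive, whole-word pattern covering all terms."""
    escaped = [re.escape(t) for t in terms]
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)

def flag_potentially_privileged(docs):
    """Return IDs of documents that hit a privilege term or a counsel domain."""
    pattern = build_pattern(PRIVILEGE_TERMS)
    flagged = []
    for doc in docs:
        participants = " ".join(doc.get("participants", [])).lower()
        term_hit = pattern.search(doc["text"]) is not None
        counsel_hit = any(domain in participants for domain in COUNSEL_DOMAINS)
        if term_hit or counsel_hit:
            flagged.append(doc["id"])
    return flagged

docs = [
    {"id": 1, "text": "Lunch on Friday?", "participants": ["a@corp.example"]},
    {"id": 2, "text": "This memo is attorney work product.", "participants": ["b@corp.example"]},
    {"id": 3, "text": "See attached draft.", "participants": ["partner@examplelawfirm.com"]},
]
print(flag_potentially_privileged(docs))  # prints [2, 3]
```

Flagged documents would still go to a human reviewer; the screen only narrows where to look first.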


Faster Document Review & Production

Modern Keyword Search Benefits

o Fast - Keyword search is very fast compared with other document search
methodologies.

o Inexpensive - Good results can be obtained at little cost compared with
manual review or other computer-assisted methodologies.

o Quality - Search can deliver high-quality results, particularly if keyword
terms are carefully developed and tested.

o Avoids Manual Review Errors/Inconsistencies - Search results are
computer generated, and so avoid known human review errors that can
result from fatigue, inadequate training, lack of focus, etc.


Faster Document Review & Production

Multi-Index Based Keyword Search

o Keyword search is best supported by two indexes: one created from text
extracted from native files (email, attachments, spreadsheets, etc.) and
one created from a paginated file converted from the native file into PDF or TIFF and OCRed.

o This is the most comprehensive approach and minimizes the potential for lost data.

Benefits of Multi-Index Approach

Index Method         Captures Embedded Text   Captures Text Excluded From Print   Captures Hidden Text
Imaged/OCR           Yes                      No                                  No
Native Extraction    No                       Yes                                 Yes
Lexbe Multi-Index    Yes                      Yes                                 Yes
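
To make the multi-index idea concrete, here is a minimal Python sketch in which a search is run against both a native-extraction index and an OCR index and the hits are combined. The file names, sample text, and helper functions are hypothetical; the slides do not describe how Lexbe implements its indexes.

```python
def tokenize(text):
    """Lowercase the text and strip simple punctuation from each word."""
    return {w.strip(".,;:!?()[]").lower() for w in text.split()} - {""}

def build_index(texts_by_doc):
    """Build a simple inverted index: term -> set of document IDs."""
    index = {}
    for doc_id, text in texts_by_doc.items():
        for term in tokenize(text):
            index.setdefault(term, set()).add(doc_id)
    return index

def multi_index_search(term, *indexes):
    """Return documents that match the term in ANY of the given indexes."""
    hits = set()
    for index in indexes:
        hits |= index.get(term.lower(), set())
    return hits

# Hypothetical extracted text: the native index sees hidden speaker notes,
# while the OCR index sees text that only exists inside an image.
native_index = build_index({"deck.pptx": "Balance sheet. Speaker notes: critical evidence."})
ocr_index = build_index({"deck.pptx": "Balance sheet.", "chart.png": "Critical figures shown in image."})

print(multi_index_search("critical", native_index))             # native only
print(multi_index_search("critical", ocr_index))                # OCR only
print(multi_index_search("critical", native_index, ocr_index))  # combined
```

Each single index misses one of the two documents; only the combined search returns both.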


Faster Document Review & Production

o Native extraction will not index embedded body content

[title] September, 2000 Pro Forma Balance Sheet [title]

[body text] [body text]
[body embedded content]

[si] 2/21/2014
[Comment] Critical Enron Evidence [Comment]

[body embedded content]

[Speakers Notes] Critical Enron Evidence [Speakers Notes]

Faster Document Review & Production

Multi-Index Based Keyword Search

o Image/OCR will not index embedded Speaker's Notes

[title] September, 2000 Pro Forma Balance Sheet [title]

[body text] [body text]
[body embedded content]

[body embedded content]

Faster Document Review & Production

Multi-Index Based Keyword Search
o Multi-Index Approach Captures Everything

[title] September, 2000 Pro Forma Balance Sheet [title]

[body text] Critical Enron Evidence [body text]
[body embedded content]

[si] 2/21/2014
[Comment] Critical Enron Evidence [Comment]

[body embedded content]
[Speakers Notes] Critical Enron Evidence [Speakers Notes]

Faster Document Review & Production

Near Duplicate Detection

o NearDup technology automatically recognizes similar documents
within an e-discovery document collection

o The algorithm analyzes, evaluates and compares the actual text
content of the documents to each other
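
The slides do not say which algorithm NearDup uses. As one common textbook way to compare the text content of documents, the sketch below uses word shingles and Jaccard similarity, with a simple greedy grouping. The 50% threshold, the shingle size, and the sample documents are assumptions chosen for illustration.

```python
def shingles(text, k=3):
    """Return the set of overlapping k-word sequences in the text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def group_near_duplicates(docs, threshold=0.5):
    """Greedy grouping: a document joins the first group whose first member it resembles."""
    groups = []  # list of (representative_shingles, [doc_ids])
    for doc_id, text in docs.items():
        doc_shingles = shingles(text)
        for representative, members in groups:
            if jaccard(doc_shingles, representative) >= threshold:
                members.append(doc_id)
                break
        else:
            groups.append((doc_shingles, [doc_id]))
    return [members for _, members in groups]

docs = {
    "A": "please review the attached succession plan before the board meeting",
    "B": "please review the attached succession plan before the board meeting today",
    "C": "lunch menu for the cafeteria this week",
}
print(group_near_duplicates(docs))  # prints [['A', 'B'], ['C']]
```

Pairwise comparison like this does not scale to large collections; production systems typically use hashing techniques to avoid comparing every pair.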

[Diagram: Unstructured documents sorted into NearDup groupings.]

Faster Document Review & Production

Near Duplicate Detection

There are 4 main applications of NearDup analysis:

1) Grouping similar documents:
o Bunch highly similar documents together for more efficient
coding and review

2) Finding hidden 'key' or 'hot' docs:
o Retrieve and mark unseen documents that have content highly
related to existing 'hot' or 'key' documents

3) Preventing the inadvertent release of privileged information:
o Be automatically alerted to files containing similar content to
documents that have already been coded as privileged

4) Enabling email threading:
o Maintain relationships between email conversations


Faster Document Review & Production

NearDup Groupings - Faster Responsive Review

Case Name: Enron Demo Case
Total # Groups: 3,425

[Table: NearDup groups ordered by size, with columns Order, Doc # and Group No. The four largest groups contain 980, 515, 339 and 241 documents.]

[Chart: Documents in NearDup groups for the largest groups and in total (top 50 groups: 4,513 documents; total: 19,694 documents).]

Benefits

o Accelerate document review by batch coding (using multidoc edit) larger groups

o Increase coding consistency of batched documents

o Reduce privilege errors

Faster Document Review & Production

NearDup Groupings - Email Threading

[Screenshot: A threaded list of "Succession Plan" and "Re: Succession Plan" emails among Enron participants, including Kenneth Lay and David Haug.]

Benefits

o View email chains with similar text in date & time order

o Avoid confusion from emails that are only tangentially related (<50% text
overlap)

o Consistently code email chains for responsiveness, privilege,
attorney-eyes only, etc.

Faster Document Review & Production

NearDup Groupings - Preventing Privilege Waiver

Privilege Secure+

Case Name: Enron
Total # Groups: 310

NearDup Groups    Privilege    Work Product
Consistent        294          301
Inconsistent      16           9
Total             310          310

Benefits

o Reduce privilege errors

o Avoid sole reliance on human coding consistency

o Establish safeguards to help maintain privilege
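
A consistency check of the kind summarized above can be sketched as follows: within each near-duplicate group, every document should carry the same privilege coding, and any group with mixed coding is flagged for a second look. The field names and sample records are hypothetical and do not reflect how Privilege Secure+ is implemented.

```python
from collections import defaultdict

def find_inconsistent_groups(coded_docs):
    """Return {group ID: [doc IDs]} for groups whose members disagree on privilege."""
    codes_by_group = defaultdict(set)
    docs_by_group = defaultdict(list)
    for doc in coded_docs:
        codes_by_group[doc["group"]].add(doc["privileged"])
        docs_by_group[doc["group"]].append(doc["id"])
    return {
        group: docs_by_group[group]
        for group, codes in codes_by_group.items()
        if len(codes) > 1  # both True and False appear in the same group
    }

# Hypothetical coding decisions for two near-duplicate groups.
coded_docs = [
    {"id": "D1", "group": 15, "privileged": True},
    {"id": "D2", "group": 15, "privileged": True},
    {"id": "D3", "group": 23, "privileged": True},
    {"id": "D4", "group": 23, "privileged": False},
]
print(find_inconsistent_groups(coded_docs))  # prints {23: ['D3', 'D4']}
```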


Faster Document Review & Production

What is TAR/Predictive Coding?

Technology Assisted Review
1. Train: Review Seed Set
2. Check Results: Review Control Sets
3. Apply: Computer-assisted coding of remaining case documents

o Predictive coding allows a skilled reviewer to train a
computer algorithm to identify responsive and non-responsive
documents in a litigation document collection.

o As an alternative to manual linear review, predictive coding
can drastically reduce the amount of time needed to review
increasingly large ESI volumes.

Faster Document Review & Production
Coding?

[Chart: Case stages, including Collection and Processing.]

o Best opportunities for further cost savings will be reducing
review costs.

o Technologies and process improvements, like TAR, reduce costs
by increasing attorney review efficiencies

Faster Document Review & Production

Coding?

Increase Review Speed: TAR is designed to complete review of large
ESI collections faster than human reviewers. Applying TAR in a scalable
environment maximizes the speed advantage of predictive coding.

Decrease Review Costs: Whether paying per document or per hour,
TAR is significantly less expensive than exhaustive manual review.

Increase Review Quality: Many studies conclude that manual review does
not have the quality advantage presumed of a 'gold standard'.
TAR can support defensible, high-quality review outcomes.

Faster Document Review & Production

How Does TAR/Predictive Coding Work?

Technology Assisted Review
1. Train: Review Seed Set
2. Check Results: Review Control Sets
3. Apply: Computer-assisted coding of remaining case documents

o A randomized sample of ~2,400 documents, a seed set,
is selected from the collection.

o A skilled document review professional reviews and
codes the seed set.

o The coding decisions made in reviewing the seed set
train the predictive coding algorithm to identify
responsive content in the remaining documents.
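
Drawing the seed set amounts to a simple random sample without replacement, sketched below in Python. The slides do not explain where the figure of 2,400 comes from; as an assumption only, the standard sample-size formula for a proportion at 95% confidence and a 2% margin of error gives a number very close to it. The document IDs and function names are hypothetical.

```python
import math
import random

def sample_size(margin=0.02, z=1.96, p=0.5):
    """Sample size for estimating a proportion (large-population approximation)."""
    return math.ceil(z * z * p * (1 - p) / (margin * margin))

def draw_seed_set(doc_ids, size, seed=None):
    """Draw a simple random sample of document IDs without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(doc_ids, min(size, len(doc_ids)))

# Hypothetical collection of 100,000 document IDs.
collection = [f"DOC-{i:06d}" for i in range(100_000)]
seed_set = draw_seed_set(collection, size=2400, seed=42)

print(len(seed_set))   # number of documents the reviewer will code
print(sample_size())   # close to 2,400 under the stated assumptions
```

Recording the random seed is one way to make the selection repeatable if the sampling method is later questioned.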


Faster Document Review & Production

How Does TAR/Predictive Coding Work?

Technology Assisted Review
1. Train: Review Seed Set
2. Check Results: Review Control Sets
3. Apply: Computer-assisted coding of remaining case documents

o Iterative samples of 25 computer-reviewed documents,
control sets, are inspected for coding algorithm
accuracy.

o The responsiveness designation assigned to the
document by the computer is either confirmed or
overturned.

o An F-score - derived from precision and recall measures
- indicates the stability of the TAR results.
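
The control-set check can be illustrated with a short Python sketch that tallies confirmed and overturned decisions and derives precision, recall, and the F-score from them. The sample of 25 decisions is invented for illustration, and the function is a generic calculation rather than the vendor's method.

```python
def control_set_metrics(decisions):
    """decisions: list of (computer_says_responsive, reviewer_says_responsive) pairs."""
    tp = sum(1 for pred, truth in decisions if pred and truth)
    fp = sum(1 for pred, truth in decisions if pred and not truth)
    fn = sum(1 for pred, truth in decisions if not pred and truth)
    overturned = fp + fn  # the reviewer disagreed with the computer
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "confirmed": len(decisions) - overturned,
        "overturned": overturned,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Hypothetical control set of 25 documents:
# 8 correctly marked responsive, 2 wrongly marked responsive,
# 2 responsive documents missed, 13 correctly marked non-responsive.
sample = ([(True, True)] * 8 + [(True, False)] * 2
          + [(False, True)] * 2 + [(False, False)] * 13)

metrics = control_set_metrics(sample)
print({name: round(value, 3) for name, value in metrics.items()})
```

With only 25 documents per control set, a single overturned decision moves the scores noticeably, which is one reason the check is repeated over several samples.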

Faster Document Review & Production

How Does TAR/Predictive Coding Work?

Technology Assisted Review
1. Train: Review Seed Set
2. Check Results: Review Control Sets
3. Apply: Computer-assisted coding of remaining case documents

o The TAR algorithm reviews the document collection based on
how it was trained during seed set coding and control set
review.

o Remaining documents are tagged as responsive/non-responsive.

o The speed at which the document collection is reviewed by the
TAR algorithm is largely based on the computing resources
applied to the task.

Faster Document Review & Production
Understanding TAR/Predictive Coding Results

TAR/Predictive Coding results (F-scores) indicate:

o What proportion of the responsive documents were found by
the algorithm, within a particular margin of error (recall)

o What percentage of documents marked responsive are
actually responsive, within a particular margin of error
(precision)
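
The slides do not state how the margin of error is computed. As one standard possibility, the sketch below applies the normal-approximation (Wald) formula to a proportion estimated from a sample; the sample counts are invented for illustration.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Normal-approximation margin of error for a sample proportion (z=1.96 ~ 95%)."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# Hypothetical validation sample: of 200 documents a reviewer judged responsive,
# the algorithm had marked 160 as responsive.
found, sampled = 160, 200
recall_hat = found / sampled
moe = margin_of_error(recall_hat, sampled)

print(f"estimated recall: {recall_hat:.2f} +/- {moe:.3f}")
```

A larger validation sample narrows the margin, at the cost of more reviewer time.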


Faster Document Review & Production

Understanding Results: Precision & Recall

Precision: A measure of how often the algorithm accurately predicts a
document to be responsive; the percentage of produced documents
that are actually responsive.

Recall: A measure of what percentage of the responsive documents in a
data set have been classified correctly by the algorithm.
F-Score: Harmonic mean of precision and recall.

**Note: F1 scores should not be interpreted as a measure of
review quality but rather as an indication of 1) how well the case
lends itself to TAR and 2) the quality of the seed set training.
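
Written out, the definitions above take the following standard form, where TP is the number of responsive documents marked responsive, FP the number of non-responsive documents marked responsive, and FN the number of responsive documents missed. These symbols are introduced here for illustration and do not appear in the slides.

```latex
\[
\text{Precision} = \frac{TP}{TP + FP}, \qquad
\text{Recall} = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
\]

% Worked example with assumed values: Precision = 0.9, Recall = 0.6
\[
F_1 = \frac{2 \times 0.9 \times 0.6}{0.9 + 0.6} = \frac{1.08}{1.5} = 0.72
\]
```

Because the harmonic mean is pulled toward the lower of the two values, the result (0.72) falls below the simple average (0.75), so a large gap between precision and recall lowers the score.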


Faster Document Review & Production

Understanding Results: Precision & Recall
High Recall, High Precision: All of the responsive documents in the
collection were appropriately coded by the algorithm (high recall). All of the
documents produced are actually responsive (high precision). Best possible
outcome.

[Figure: Actual versus predicted coding for each document; legend: Non-Responsive, Responsive.]

Faster Document Review & Production

Understanding Results: Precision & Recall
Low Recall, High Precision: Many of the responsive documents in the
collection were not appropriately coded by the algorithm (low recall).
However, a high percentage of the documents produced are responsive (high
precision). Increased risk of under-producing.

[Figure: Actual versus predicted coding for each document, with misses marked; legend: Non-Responsive, Responsive.]

Faster Document Review & Production

Understanding Results: Precision & Recall
High Recall, Low Precision: All of the responsive documents in the
collection have been appropriately tagged by the algorithm (high recall).
However, many documents were incorrectly marked responsive
(low precision).

[Figure: Actual versus predicted coding for each document, with errors marked; legend: Non-Responsive, Responsive.]

Faster Document Review & Production

Comparing Outcomes: TAR v. Manual Review

From the Sedona Conference Best Practices Commentary on the Use of
Search and Information Retrieval Methods in E-Discovery:

“[T]here appears to be a myth that manual review by humans of large amounts
of information is as accurate and complete as possible ... Even assuming that
the profession had the time and resources to continue to conduct manual
review of massive sets of electronic data sets (which it does not), the relative
efficacy of that approach versus utilizing newly developed automated methods
of review remains very much open to debate.” (2007)

From the TREC (Text Retrieval Conference) Legal Track:

“Overall, the myth that exhaustive manual review is the most effective - and
therefore, the most defensible - approach to document review is strongly
refuted. Technology-assisted review can (and does) yield more accurate results
than exhaustive manual review, with much lower effort... Future work may
address which technology-assisted review process(es) will improve most on
manual review, not whether technology assisted review can improve on manual
review.” (2009)


Faster Document Review & Production

The Importance of Transparency

[Diagram: Collected ESI]

Defensibility: Without understanding how a particular TAR/predictive
coding methodology works, it becomes difficult to explain why the
algorithm made certain coding decisions.

TAR is No Panacea: TAR is not meant to be used in any and all review
situations. Without understanding how a particular TAR/predictive coding
methodology works, it is impossible to determine if it is appropriate for
your case.

Faster Document Review & Production

Review?

Assisted Review+
1. Train: Human Review of Seed Set (2,400 documents)
2. Check Results: Review Control Sets

o In TAR, Bayesian probability models the likelihood of something being
true about a document, e.g. that it is responsive, based on the millions of data
connections created while training on the seed set.

o A Naive Bayesian Classifier, used in Assisted Review+, is a probability
model with assumptions that allow for pattern recognition among
multiple independent variables.
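
The slides do not describe the internals of Assisted Review+. To show the general idea of a Naive Bayesian classifier, here is a minimal textbook sketch in Python (multinomial Naive Bayes with add-one smoothing), trained on a tiny invented seed set. Class names, training text, and labels are all hypothetical.

```python
import math
from collections import Counter

class NaiveBayes:
    """Textbook multinomial Naive Bayes over whitespace-separated words."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.total_docs = len(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(docs, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.classes for w in self.word_counts[c]}
        return self

    def log_scores(self, text):
        """Log prior plus summed log likelihood of each known word, per class."""
        scores = {}
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            score = math.log(self.doc_counts[c] / self.total_docs)
            for w in text.lower().split():
                if w in self.vocab:  # words never seen in training are ignored
                    # Add-one smoothing avoids zero probabilities.
                    score += math.log((self.word_counts[c][w] + 1)
                                      / (total_words + len(self.vocab)))
            scores[c] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)

# Hypothetical seed set coded by a reviewer.
seed_docs = [
    "pro forma balance sheet for the partnership",
    "revised balance sheet and partnership valuation",
    "lunch order for friday",
    "fantasy football league standings",
]
seed_labels = ["responsive", "responsive", "non-responsive", "non-responsive"]

model = NaiveBayes().fit(seed_docs, seed_labels)
print(model.predict("updated partnership balance sheet"))  # prints responsive
```

The "naive" assumption is that each word contributes independently to the score, which is what lets the per-word probabilities simply be multiplied (added, in log form).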


Faster Document Review & Production

The Importance of Scalability

[Diagram: An incoming TAR project is distributed across servers and emerges as reviewed documents.]

o Applying more server resources to a TAR/predictive
coding task will increase throughput.

o A scalable environment maximizes the value of this
benefit.

Faster Document Review & Production

Summary

o TAR/Predictive Coding allows a skilled reviewer to train a computer
algorithm to identify responsive and non-responsive documents.

o You can use TAR/Predictive Coding to increase review speed,
decrease review costs, and improve the quality of review results.

o TAR works by training on a seed set, testing the algorithm against
control sets, and applying the improved algorithm to the remainder of
the collection.

o Predictive coding performance results are communicated in the form
of precision and recall scores.

o It is important to know the underlying logic of the TAR algorithm to
interpret, explain, and defend your results.

o Scalable, transparent predictive coding workflows maximize the
intended benefits of technology assisted review.

Thank You For Attending

We'll be making the following available to webinar attendees:

o A recorded streaming version
o MP3 podcast
o Webinar slide deck

Please let us know if you have any questions or comments about this webinar or
suggestions for future topics. This webinar is part of the Lexbe eDiscovery
Webinar Series. For notices of future live and on-demand webinars in this
series, please email us at [email protected] or follow us on LinkedIn.

Please contact us with any questions:

Speaker             Moderator
Stu Van Dusen       Gene Albert
800-401-7809 x55    512-686-3460
[email protected]    [email protected]

Lexbe eDiscovery Platform

Ask Us More About

o The Lexbe eDiscovery Platform, our cloud-based processing, review and production
tool. Attorney/staff DIY, no user fees or case fees.

o Our high-speed/high-capacity eDiscovery services, and expert professional services.

o Consultations, price quotes, demos and free trials available.

"Cost-effective eDiscovery"

"Secure, easy-to-use and a great review tool for consideration"

"A powerful litigation document management service"

"Lexbe cost advantages, SaaS convenience and search capabilities appeal to many small firms"

"Because of the Lexbe software, the entire playing field has been leveled for my firm."

"Lexbe is the easiest eDiscovery software I have ever used"

Lexbe Sales
(800) 401-7809 x22