8 Things You Can’t Afford to
Ignore About eDiscovery
Brought to you by: John Wang, CCP
Product Manager and eDiscovery Specialist
February 25, 2010
AIIM 8 Things Series
About ZL Technologies
• Experts in Total Information Governance
– Unstructured Content Archiving
– eDiscovery
– Compliance
– Secure Email
– Scalability & Low TCO via Private Clouds
• Select Customers
About John Wang
• Experience / Roles
– 15+ years in Technology
– Product Manager, Solutions Architect, Developer
• Degrees
– M&T, MBA, Computer Science, Finance
• Industry Participation
– EDRM, AIIM, LexisNexis
– Project Leadership
– Search Guide Co-author
– Research proposal, execution, and presentation
– Certified Concordance Professional
Agenda
1. Early Case Assessment
2. Data Mapping
3. Investigative eDiscovery
4. Concept Search
5. Non-Linear Review
6. Parallel Search
7. End-to-End eDiscovery
8. Cloud Computing
Overview
Did you know?
5 Year Enterprise Data Growth Estimate
85% will be Unstructured!
Sources: Gartner
Overview
• ESI is discoverable
• ESI volume is growing at 55+% annually*
• Litigation is increasing
– 42% US organizations expecting more litigation (from 34%)**
– 83% US organizations have been litigated against in 2008**
• Timelines have been shortened
• How do we handle this in an affordable way?
• Can we move from a reactive, bottom-up approach to a strategic, top-down approach?
• This presentation shows us 8 technologies to do just that!
Sources:
* ESG
** Fulbright & Jaworski
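To make the growth figure above concrete, here is a minimal sketch of compound growth at the 55% annual rate quoted from ESG. The 100 TB starting volume and the function name are illustrative assumptions, not figures from the presentation.

```python
def projected_volume(current_tb, annual_growth, years):
    """Compound growth: volume multiplies by (1 + rate) each year."""
    return current_tb * (1 + annual_growth) ** years


# Hypothetical starting point of 100 TB, growing 55% per year for 5 years.
start_tb = 100
after_5_years = projected_volume(start_tb, 0.55, 5)
print(f"{start_tb} TB grows to about {after_5_years:.0f} TB in 5 years")
```

At that rate the volume is roughly nine times larger after five years, which is why a reactive approach becomes harder to afford each year.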
Early Case Assessment
Did you know?
In-house eDiscovery
Payback Period
Sources: Gartner, Merrill Lynch
Early Case Assessment
3 Questions
– Does the complaint have merit?
– How much will this cost us?
– What has the org learned?
Overview
– Estimate risk to prosecute or defend a case
– Formulate resolution in first 90-120 days
– Examine key facts, allegations, applicable laws and venues
– Analyze and assess potential trial themes for both sides
– Pursue the best course
Early Case Assessment Results
Item                 Achievement
Payback Period       3-6 months, or 1 large IP case
Litigation Success   76%**
Cost Reduction       50%**
[Chart: Cost of E-Discovery and Litigation Success Rate, Without ECA vs. With ECA, on a 0% to 100% scale]
Sources:
** Cogent Research
Early Case Assessment
Traditional Post-Collection ECA
Assess ESI after Collection, Preservation, Processing and Analysis
[Diagram: Electronic Discovery Reference Model / © 2009 / v2.0 / edrm.net. Stages: Information Management, Identification, Preservation, Collection, Processing, Review, Analysis, Production, Presentation. Axes: VOLUME, RELEVANCE.]
Early Case Assessment
ECA “Now”
Compress timeline and assess before collection, reducing processing, analysis and review
[Diagram: Electronic Discovery Reference Model / © 2009 / v2.0 / edrm.net. Axes: VOLUME, RELEVANCE.]
Early Case Assessment
Deployment
– In-house eDiscovery
– Allows faster and iterative searching, “going back to the well”
Process
– Analysis
– Visualization
How does it affect you?
– Resolve cases faster
– Resolve cases more favorably
– Reduce costs
Action Plan
– Evaluate solutions
– Try solutions on known cases and case data
– Evaluate results
Data Mapping
Did you know?
Fortune 1000 Data per Firm
In potentially 100s of Repositories!
Sources: Industry Sources
Required by Rule 26(a)(1)(B)
• “… a copy of, or a description by category and location of, all documents, electronically stored information, and tangible things”
• Requirements
– Repositories
– Types of ESI per repository
– Custodians
– Retention policy
– Preservation & disposition
– Legal hold enforcement
– Collection method
– Accessibility
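The requirements listed above can be pictured as one record per repository. The following is a minimal sketch under that assumption; the `Repository` class, its field names, and the sample repositories are all hypothetical and do not describe any particular product.

```python
from dataclasses import dataclass


@dataclass
class Repository:
    """One hypothetical data map entry, with one field per listed requirement."""
    name: str
    esi_types: list          # types of ESI held in this repository
    custodians: set          # people whose data lives here
    retention_days: int      # retention policy
    collection_method: str   # how data would be collected
    accessible: bool = True  # reasonably accessible or not
    on_legal_hold: bool = False


def place_hold(data_map, custodian):
    """Flag every repository holding the custodian's data; return their names."""
    held = [r for r in data_map if custodian in r.custodians]
    for repo in held:
        repo.on_legal_hold = True
    return [repo.name for repo in held]


data_map = [
    Repository("mail-archive", ["email"], {"alice", "bob"}, 2555, "export"),
    Repository("file-share", ["documents"], {"bob"}, 1095, "copy"),
    Repository("backup-tapes", ["email"], {"carol"}, 365, "restore", accessible=False),
]

print(place_hold(data_map, "bob"))
```

With the map in place, a legal hold for one custodian becomes a lookup rather than a survey of every system.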
Take Advantage of Rule 37(f)
• Provides defense against sanctions for “routine, good-faith operation of an electronic information system.”
Data Mapping
The Three Ss of eDiscovery
Spoliation, “I’m Sorry”, Sanctions
Data Mapping
How does it affect you?
– Reduce sanction risk
– Reduce overhead from 10 hrs to 30 min / week
– Reduce costs
– Automate collections and legal holds
– Work with BCP/DR and InfoSec/DLP
Action Plan
– Evaluate current solution and available solutions
– Analyze options if there is a gap
Data Mapping
[Diagram: Integrated Data Mapping, connecting Legal Hold Notification, Culling, Collection, and Legal Hold]
Exclusionary ED
• Approach
– Cull by Custodian
– Cull by Date
– Cull by File type
• Limitations
– Blunt tool
– De-selects on secondary characteristics
– Find relevance late in process
– May need to go back to the source late in the process
– More false negatives as the collection grows
Investigative ED
• Approach
– Cull by Matter
– Roots in Forensics
• Benefits
– Finds highly relevant information early in the process
– Finds information not necessarily tied to custodians, e.g. file server data
– Supports ECA
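The contrast between the two approaches can be sketched in a few lines. This is an illustrative toy, assuming a tiny in-memory collection; the `Doc` class, the sample documents, and the matter term are hypothetical, and a real system would search an index rather than scan text.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Doc:
    doc_id: str
    custodian: str   # empty string for data with no owner, e.g. a file share
    created: date
    file_type: str
    text: str


def exclusionary_cull(docs, custodians, start, end, file_types):
    """Keep documents that pass the custodian, date, and file-type filters."""
    return [
        d for d in docs
        if d.custodian in custodians
        and start <= d.created <= end
        and d.file_type in file_types
    ]


def matter_cull(docs, matter_terms):
    """Keep documents whose text mentions a matter term, whoever owns them."""
    terms = [t.lower() for t in matter_terms]
    return [d for d in docs if any(t in d.text.lower() for t in terms)]


docs = [
    Doc("d1", "alice", date(2009, 3, 1), "msg", "Pricing update for the Acme contract"),
    Doc("d2", "bob", date(2009, 4, 2), "msg", "Lunch on Friday?"),
    Doc("d3", "", date(2009, 3, 15), "xls", "Acme contract margin model"),
]

by_filters = exclusionary_cull(
    docs, {"alice", "bob"}, date(2009, 1, 1), date(2009, 12, 31), {"msg"}
)
by_matter = matter_cull(docs, ["acme contract"])

print([d.doc_id for d in by_filters])  # keeps the irrelevant d2, misses d3
print([d.doc_id for d in by_matter])   # finds the ownerless spreadsheet d3
```

The filter-based cull selects on secondary characteristics, so it passes an irrelevant email and drops a relevant file-server spreadsheet. The matter-based cull selects on content.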
Investigative eDiscovery
[Diagram: exclusionary path (Cull by Custodian, Cull by Date, Cull by File type, then Review) compared with investigative path (Cull by Matter, then Review)]
Investigative eDiscovery
How does it affect you?
– Higher Success Rates
– Lower Information Risk via Wider Safe Harbor
– Better results
– Successful ECA
Action Plan
– Evaluate past performance with respect to initially missed relevant email
– Calculate cost
– Investigate options
Key Technologies
– Billion document search engines
– Index in-place
– Cloud / GRID scalability
Investigative eDiscovery is based on the science of forensics, an older and more complete approach than traditional eDiscovery.
New technologies make Investigative eDiscovery a reality again.
Concept Search
Did you know?
Keyword Search
Missed Relevant Documents
Sources: Blair & Maron
Concept Search
• Attorneys and paralegals are not familiar with the terms in use
– Many words can be used to mean the same thing
– Organizations often create special “code words”
Example: Subway Accident
– Subway Company: “unfortunate incident”
– Victims: “Disaster”
– Other terms in use: “event,” “incident,” “situation,” “problem,” “difficulty”
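The subway example can be sketched as a search that expands one concept into the words people actually used. This is a deliberately simple stand-in: the hand-built `CONCEPTS` map and the sample documents are hypothetical, and real concept search engines learn such relationships statistically rather than from a fixed list.

```python
# Hypothetical concept map: one concept name to the words used for it.
CONCEPTS = {
    "accident": {
        "accident", "disaster", "event", "incident",
        "situation", "problem", "difficulty",
    },
}


def tokenize(text):
    """Lowercase words with surrounding punctuation removed."""
    return {w.strip('.,;:!?"“”').lower() for w in text.split()}


def keyword_search(docs, term):
    """Match only documents containing the literal term."""
    return [doc_id for doc_id, text in docs.items() if term.lower() in tokenize(text)]


def concept_search(docs, concept):
    """Match documents containing any word mapped to the concept."""
    terms = CONCEPTS.get(concept, {concept})
    return [doc_id for doc_id, text in docs.items() if tokenize(text) & terms]


docs = {
    "memo-1": "Status report on the unfortunate incident at the station.",
    "letter-2": "My family was hurt in the disaster.",
    "note-3": "Schedule for track maintenance.",
    "report-4": "Accident investigation summary.",
}

print(keyword_search(docs, "accident"))
print(concept_search(docs, "accident"))
```

The keyword search finds only the document that uses the reviewer's own word, while the concept search also finds the company memo and the victim's letter.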
Concept Search
How does it affect you?
– Find more relevant documents
– Discover case facts faster
– Recommended by courts and the Sedona Conference
Action Plan
– Evaluate test cases
– Get review teams involved for real world analysis
Actively Researched and Developed Technology
Year   Technique
1763   Bayes’ Theorem (Bayesian Inference)
1948   Shannon Entropy (Shannon Information Theory)
1951   K-Nearest Neighbors
1988   Latent Semantic Indexing (LSI)
1999   Probabilistic LSI
2003   Latent Dirichlet Allocation
Non-Linear Review
Did you know?
Legal Review Productivity
Increased Productivity from Non-Linear Review
Sources: Deloitte, Industry Sources
Non-Linear Review
Traditional Linear eDiscovery
– Grouped by source, custodian, date, etc.
– Like documents are scattered
– 10,000s of docs / case
Non-Linear Review
– Grouped by concept, near-duplication
– Easy navigation via visualization
– Less context switching
– Better sampling
– 1,000,000s of docs / case
Technologies
– Clustering
– Auto-Classification
– Concept Search
– Visualization
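Grouping by near-duplication can be sketched with word shingles and Jaccard similarity. This is one common technique chosen for illustration, not necessarily what any review platform uses; the 0.5 threshold, the greedy grouping, and the sample documents are assumptions.

```python
def shingles(text, k=3):
    """Set of overlapping k-word sequences from the text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Shared shingles divided by total distinct shingles."""
    return len(a & b) / len(a | b)


def group_near_duplicates(docs, threshold=0.5):
    """Greedy grouping: join the first group whose seed document is similar enough."""
    groups = []  # list of (seed_shingles, [doc_ids])
    for doc_id, text in docs.items():
        sh = shingles(text)
        for seed, members in groups:
            if jaccard(sh, seed) >= threshold:
                members.append(doc_id)
                break
        else:
            groups.append((sh, [doc_id]))
    return [members for _, members in groups]


docs = {
    "a": "please review the attached draft of the supply agreement today",
    "b": "please review the attached draft of the supply agreement tomorrow",
    "c": "the quarterly sales forecast is attached for your comments",
    "d": "fwd: please review the attached draft of the supply agreement today",
}

print(group_near_duplicates(docs))
```

A reviewer can then code the three variants of the same message together instead of meeting them scattered through a custodian-ordered queue.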
[Chart: eDiscovery Review Productivity, on a scale of 0 to 15,000]
Non-Linear Review
How does it affect you?
• Faster review drives
– Lower costs
– Faster results
– Better results
– Successful ECA
Action Plan
– Evaluate current process and costs
– Justify investigation
– Review options
Key Statistics
• 72% of attorneys say review is the most expensive part of ED
• Review is up to 80% of ED costs
• Can save $187,500 on a 1.5M-document case
[Chart: Traditional Linear Review vs. Non-Linear Review]
Parallel Search
Did you know?
Keyword Search is still advancing?
Term searches – in seconds to minutes
Source: Gartner
Parallel Search
• Keywords
• User names
• Email addresses
• Patent numbers
• SSNs
• etc…
How does it affect you?
– Take the guesswork out of choosing keywords
– Run queries as simulations
– Supports wildcard search, proximity search, etc.
Action Plan
– Review complex searches
– See if parallel search can provide new insights that could not be economically performed before.
Search 100,000 terms across billions of documents in seconds to minutes…
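Running a whole term list as one batch and reading the hit counts is what makes "queries as simulations" possible. The sketch below assumes a small in-memory inverted index; the function names and sample messages are hypothetical, and it does not show the grid-scale distribution a production engine would need.

```python
from collections import defaultdict


def build_index(docs):
    """Inverted index: token to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token.strip('.,;:!?()"')].add(doc_id)
    return index


def parallel_search(index, terms):
    """Look up every term in one batch and return the hits per term."""
    return {t: sorted(index.get(t.lower(), set())) for t in terms}


docs = {
    "m1": "Forward the US1234567 filing to bob@example.com.",
    "m2": "Bob approved the rebate schedule.",
    "m3": "Rebate terms discussed with bob@example.com and legal.",
}

index = build_index(docs)
hits = parallel_search(index, ["rebate", "bob@example.com", "US1234567", "kickback"])
for term, doc_ids in hits.items():
    print(f"{term}: {len(doc_ids)} hit(s) {doc_ids}")
```

Seeing at a glance that one term returns nothing and another returns most of the collection is what takes the guesswork out of choosing keywords.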
End-to-End eDiscovery
Did you know?
eDiscovery Vendors
Offering Products and Services
Sources: Socha-Gelbmann 2009 E-Discovery Survey
[Diagram: Electronic Discovery Reference Model / © 2009 / v2.0 / edrm.net. Axes: VOLUME, RELEVANCE.]
End-to-End eDiscovery
[Diagram: separate tools (Typical Archive, Initial Search, Review tool, Case Analytics) with delays between them: 3.5 days to index 30TB, 3 days to index 1.1TB, 4 days to export 2M docs]
• 25% of vendors (150+) will disappear by 2011
• More vendors are entering eDiscovery than leaving
Single / Multi-Vendor
End-to-End eDiscovery
• No data transfer between initial collection, review, and production
• No incompatibilities or inter-stage processing time delays
[Diagram: Electronic Discovery Reference Model / © 2009 / v2.0 / edrm.net. Axes: VOLUME, RELEVANCE.]
Single Platform
End-to-End eDiscovery
• True End-to-End eDiscovery is:
– Single platform
• Benefits
– Integrated Data Map & Legal Hold
– Single Collection
– Enterprise-wide search in review platform
– No intermediate productions
• Bottom Line
– Cost and Time Savings
How does it affect you?
– Faster
– More Reliable
– Lower Cost
– Institutional Memory
Action Plan
– Evaluate current process and costs
– Justify investigation
– Review options
Cloud Computing
Did you know?
Cloud Computing
Market Forecast by 2011 & 2013!
Sources: Gartner, Merrill Lynch
Cloud Computing
Industry hype?
• Today:
– $56 billion
– 3% of enterprises using cloud
• By 2013:
– $150 billion market?
– 50+% of email archiving in the cloud?
Sources: Gartner, Forrester
The Good, The Bad, and The Solution …
The Good
1. Lower Cost
– Only pay for what you use
2. Scalability
– GRID / MapReduce
3. Increased Storage
– Virtualized file system
4. Flexibility
– Deploy new capability quickly
5. Automation
– Less manpower requirement
6. More mobility
– Inside and outside counsel
Cloud Computing
The Bad
1. Guaranteed service levels
– Some have no guarantees
– Data not under your control
2. Security & shared tenancy
– Provider capabilities vary
– Also may have no guarantees
3. Chain of custody
– Forensic examination?
4. Lock-in and pricing
– Ability to get data out?
5. Current adoption
– Only 3% of business users!
Cloud Computing
The Solution
Private Cloud Computing
• What is it?
– Cloud infrastructure deployed in-house
• Added Benefits
– Secure
– QoS / SLA
Cloud Computing
How does it affect you?
• Faster review drives
– Lower costs
– Better resource utilization
– Scales for one time projects
Action Plan
– Check internal cloud strategy
– Run savings figures
IT Organizations Will Spend More Money on Private Cloud Computing Investments Than on Offerings From Public Cloud Providers Through 2012
Gartner
8 Things You Can’t Afford to Ignore About eDiscovery
1. Early Case Assessment
2. Data Mapping
3. Investigative eDiscovery
4. Concept Search
5. Non-Linear Review
6. Parallel Search
7. End-to-End eDiscovery
8. Cloud Computing
More Information
• http://aiim.typepad.com/
• http://www.zlti.com/