E-Discovery Revisited: A Broader Perspective for IR Researchers
Jack G. Conrad, Thomson R&D
ICAIL07 / DESI Workshop
June 4, 2007
Thomson Research & Development, June 4, 2007
EDD Outline
• EDD ― The Big Picture
• Motivations
• Background
• EDD interactions: the “dance” of the litigants
• The complete EDD pipeline
• Alternative view of the enabling technologies
EDD ― The Big Picture
• Electronic Data Discovery
– Context: Practical Research & TREC
– Motivations:
  (1) Recent characterization of the state of the art in EDD
  (2) Informational materials available for participants in forums like TREC
EDD ― The Big Picture
• Electronic Data Discovery
– At present, 300-500 companies offer some form of EDD software or services.
– Several offer complete services across the E-Discovery spectrum
  – Kroll Ontrack: recently acquired Engenium (Symetric), the concept search engine company
  – LN: acquired Applied Discovery in the recent past, and also offers a full spectrum of EDD services
– The EDD performance bar is constantly being raised
– There is an essential need to share diverse perspectives in the field with next-generation researchers
• What is the “dance of the litigants”? … the complete EDD pipeline? … possible interactions of the enabling technologies?
Source of EDD Survey Responses
• The Socha-Gelbmann Report, 2005
– In total, 240 consumers/providers of EDD software / services were contacted
  – 139 expressed interest in participating
  – 72 of those were surveyed via spreadsheet or phone interview
  – 3 of the final spreadsheets did not contain enough info to be used
– Conducted among 69 E-Discovery consumers & providers
  – 24 consumers; 45 providers
– Consumers: a cross-section of Am Law 200 law firms + large U.S. companies
– Providers: a broad-based collection of software & service providers who market their offerings as E-Discovery tools or services
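The survey figures above fit together arithmetically (72 surveyed minus 3 unusable spreadsheets gives the 69 respondents, split 24 / 45). A minimal sketch that checks this and derives a few ratios; the variable names are illustrative, and the percentages are computed here rather than quoted from the report:

```python
# Survey funnel figures as stated on the slide (Socha-Gelbmann Report, 2005).
contacted = 240
interested = 139
surveyed = 72
unusable = 3
consumers, providers = 24, 45

# Responses that contained enough information to be used.
usable = surveyed - unusable

# The slide's "69 consumers & providers" should match both derivations.
assert usable == consumers + providers == 69

# Derived ratios (not stated on the slide; computed for orientation only).
print(f"Interested / contacted: {interested / contacted:.1%}")
print(f"Usable / contacted:     {usable / contacted:.1%}")
print(f"Consumer share:         {consumers / usable:.1%}")
print(f"Provider share:         {providers / usable:.1%}")
```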
E-Discovery ― Areas of Industry Strength
Consumer Views of Strengths
• ED Tools
• User Friendly Tools
• Ability to Streamline ED Process
• Hosting / Processing (Raw Bandwidth)
• Electronic "Harvesting"
• Education
• Concept Search Tools
• Forensics
• Review
• Electronic Coding
• Marketing (case law on Web sites)
• Adaptability
• Innovation
Provider Views of Strengths
• Available Technology
• Innovation
• Processing
• Ability to Streamline ED Process
• Competition
• Growth Potential
• Adaptability
• ED Tools
• Consolidation
• Financial Backing
• Playing "Fear Factor" Marketing
• Culling / Filtering
• Hosting / Processing (Raw Bandwidth)
• Forensics
• Educational Materials
• Information Sharing
• Project Management
• Marketing (case law on Web sites)
• Move from IT to Business Model
E-Discovery ― Areas of Industry Weakness
Consumer Views of Weaknesses
• Project Management
• Quality Control
• Reliability
• Culling / Filtering
• Promised Search Results
• Maturity of Tools
• Promised Deliverables (Mktg Honesty)
• Coding
• Clear Communication by Providers
• Knowledge of Litigation Process
• Consistent Pricing Models
• Consulting for Collection / Preservation
• Limitations / Mobility of Labor Pool
• Preparation for Discovery / Retention Policies
• Litigation Solutions Compatible w/ Bus. Solutions
• "Fear Factor" Marketing
Provider Views of Weaknesses
• Lack of Standards
• Knowledge of the Litigation Process
• Promised Deliverables (Mktg Honesty)
• Lack of Best Practices
• Maturity of Tools
• Clear Communication by Providers
• Consistent Pricing Models
• Attorneys' Knowledge of Technology
• Project Management
• Reliability
• Preservation
• Processing
• Efficiency of Processing
• Interoperability
• Consulting for Collection / Preservation
• Forensics
• Native File Review
• Size, Locality of Providers
• Vendor Innovation
• Understanding Advanced Technology
EDD Scenarios — “the dance of the litigants”
• Party A vs. Company B (David vs. Goliath): Employment Discrimination
• Gov’t vs. Company C: Securities Fraud
• Company D vs. Company E: Intellectual Property
(Diagram: the litigants in each scenario are annotated with their “EDD resources”.)
The EDD Work Flow Model
E-Discovery Pipeline
• Identification (relevant content and its scope)
– Breadth and depth of discoverable materials established
• Data Gathering: Preservation & Collection
– Electronically stored info. is preserved from multiple sources
• Data Entry & Scanning
– Hard copy media converted (e.g., OCR) or audio records transcribed
• Media Restoration (data trans. to a std. media)
– Data transferred from original or intermediate media to uniform media for analysis
• Data Processing (filtering, format conversion)
– Vetting performed to reduce volume of data (incl. filtering, deduping, clustering, etc.)
• Online Review: Hosting & Searching
– Primary review stage. Data transferred to dedicated repository
– Searching based upon sources, dates, orig. file types, key words, etc.
• Production & Delivery
– Delivery of reports to clients, systems, in diff. formats & media
• E-Discovery Consulting (throughout process)
– Advice to clients on strategies & procedures for conducting E-Discovery processing
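The Data Processing and Online Review stages of the pipeline can be illustrated with a small sketch. This is only one possible reading of "deduping" and "searching based upon sources, dates, orig. file types, key words": the `Doc` record, the `dedupe` and `search` helpers, and the sample data are all hypothetical and do not come from the slides. Exact duplicates are detected by hashing normalized text, and the search is a simple conjunctive metadata and keyword filter.

```python
from dataclasses import dataclass
from datetime import date
import hashlib


@dataclass(frozen=True)
class Doc:
    """Hypothetical record for one collected item (not from the slides)."""
    source: str
    created: date
    file_type: str
    text: str


def dedupe(docs):
    """Drop exact duplicates by hashing whitespace-normalized, lowercased text."""
    seen, unique = set(), []
    for doc in docs:
        normalized = " ".join(doc.text.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique


def search(docs, *, sources=None, start=None, end=None, file_types=None, keywords=()):
    """Keep docs matching every given criterion: source, date range, file type, key words."""
    hits = []
    for doc in docs:
        if sources is not None and doc.source not in sources:
            continue
        if start is not None and doc.created < start:
            continue
        if end is not None and doc.created > end:
            continue
        if file_types is not None and doc.file_type not in file_types:
            continue
        lowered = doc.text.lower()
        if not all(kw.lower() in lowered for kw in keywords):
            continue
        hits.append(doc)
    return hits


# Made-up sample collection; the second item duplicates the first after normalization.
collected = [
    Doc("mail-server", date(2006, 3, 1), "msg", "Q1 forecast revised downward"),
    Doc("laptop-17", date(2006, 3, 1), "msg", "Q1  forecast revised DOWNWARD"),
    Doc("file-share", date(2005, 11, 20), "doc", "Holiday party schedule"),
    Doc("file-share", date(2006, 4, 2), "xls", "Forecast model, revised"),
]

processed = dedupe(collected)  # Data Processing: volume reduction
review_set = search(           # Online Review: searching
    processed,
    start=date(2006, 1, 1),
    keywords=("forecast", "revised"),
)
for doc in review_set:
    print(doc.source, doc.created, doc.file_type)
```

Hash-based deduplication only catches identical content; near-duplicates and clustering, which the slide also mentions, would need a similarity measure instead.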
The EDD Work Flow Model
E-Discovery Pipeline
• Identification (scope, depth of information)
• Data Gathering: Preservation & Collection
• Data Entry & Scanning
• Media Restoration (data trans. to a std. media)
• Data Processing (filtering, format conversion)
• Online Review: Hosting & Searching
• Production & Delivery
• E-Discovery Consulting (throughout process)
Proposed extended scope of text ‘retrieval’ task (i.e., including filtering, organizing & report generation)
E-Discovery Technology Pyramid
I. Foundation: collecting (identification, conversion, migration)
II. Second Tier: vetting (filtering, de-duping, handling similar doc-objects)
III. Third Tier: organizing (classifying or clustering; tagging & linking)
IV. Fourth Tier: analyzing (consolidating & summarizing; production)
Also labeled on the figure: Indexing; Navigating; Searching & Reporting; Hosting
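As an illustration of the second tier's "handling similar doc-objects", the sketch below groups near-duplicate texts using word-shingle Jaccard similarity with a greedy grouping pass. The technique, the function names, the threshold of 0.5, and the sample texts are assumptions chosen for illustration; the slides do not specify how similar documents are handled.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles in a text (lowercased)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |A & B| / |A | B|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_similar(texts, threshold=0.5, k=3):
    """Greedy grouping: a text joins the first group whose representative is similar enough."""
    groups = []  # each entry: (representative shingle set, [member indices])
    for idx, text in enumerate(texts):
        sh = shingles(text, k)
        for rep, members in groups:
            if jaccard(sh, rep) >= threshold:
                members.append(idx)
                break
        else:
            # No existing group was close enough, so start a new one.
            groups.append((sh, [idx]))
    return [members for _, members in groups]


# Made-up sample texts: the first two differ by a single trailing word.
docs = [
    "please retain all records related to the merger until further notice",
    "please retain all records related to the merger until further notice thanks",
    "lunch menu for the week of june fourth",
]
print(group_similar(docs))  # [[0, 1], [2]]
```

Comparing every text against every group representative is quadratic in the worst case; at the collection sizes typical of litigation, an approximate method such as MinHash would be the more likely choice.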
Additional E-Discovery Challenges
• Workflow Support
• Process Efficiencies
– Per Step
– Overall
• Tool Integration
• Ease of Use
– For Customers
– For Support
• High Value to Cost Ratio
– Added value through advanced technologies
• A TREC-like forum has much potential to contribute here
– Both within and beyond the context of IR