
E-Discovery Revisited: A Broader Perspective for IR Researchers Jack G. Conrad, Thomson R&D ICAIL07 / DESI Workshop June 4, 2007

E-Discovery Revisited: A Broader Perspective for IR Researchers

Jack G. Conrad, Thomson R&D
ICAIL07 / DESI Workshop

June 4, 2007

Thomson Research & Development, June 4, 2007

EDD Outline

• EDD ― The Big Picture

• Motivations

• Background

• EDD interactions: the “dance” of the litigants

• The complete EDD pipeline

• Alternative view of the enabling technologies


EDD ― The Big Picture

• Electronic Data Discovery (EDD)
  – Context: Practical Research & TREC
  – Motivations:
    (1) Recent characterization of the state of the art in EDD
    (2) Informational materials available for participants in forums like TREC


EDD ― The Big Picture

• Electronic Data Discovery (EDD)
  – At present, 300-500 companies offer some form of EDD software or services.
  – Several offer complete services across the E-Discovery spectrum:
    – Kroll On-Track: recently acquired Engenium (Symetric), the concept search engine company
    – LN (LexisNexis): acquired Applied Discovery in the recent past, and also offers a full spectrum of EDD services
  – The EDD performance bar is constantly being raised.
  – There is an essential need to share diverse perspectives in the field with next-generation researchers:
    What is the “dance of the litigants”? … the complete EDD pipeline? … possible interactions of the enabling technologies?


Source of EDD Survey Responses

• The Socha-Gelbmann Report, 2005
  – In total, 240 consumers/providers of EDD software/services were contacted
    – 139 expressed interest in participating
    – 72 of those were surveyed via spreadsheet or phone interview
    – 3 of the final spreadsheets did not contain enough info to be used
  – Conducted among 69 E-Discovery consumers & providers
    – 24 consumers; 45 providers
  – Consumers: a cross-section of Am Law 200 law firms + large U.S. companies
  – Providers: a broad-based collection of software & service providers who market their offerings as E-Discovery tools or services
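The survey funnel on this slide reduces to simple arithmetic. A minimal sketch follows, using only the counts quoted above; the variable names and the `share` helper are illustrative, and the percentages are derived here rather than taken from the report.

```python
# Counts quoted on this slide from the Socha-Gelbmann 2005 survey description.
contacted = 240
interested = 139
surveyed = 72
unusable = 3
consumers, providers = 24, 45

# Responses that carried enough information to be used.
usable = surveyed - unusable
assert usable == consumers + providers == 69


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


print(f"usable responses: {usable}")
print(f"interested / contacted: {share(interested, contacted)}%")
print(f"providers among usable: {share(providers, usable)}%")
```

The check confirms that the 69 respondents are the 72 surveyed minus the 3 unusable spreadsheets, and that they split into 24 consumers and 45 providers.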


E-Discovery ― Areas of Industry Strength

Consumer Views of Strengths (chart categories)

• ED Tools
• User Friendly Tools
• Ability to Streamline ED Process
• Hosting / Processing (Raw Bandwidth)
• Electronic "Harvesting"
• Education
• Concept Search Tools
• Electronic Coding
• Marketing (case law on Web sites)
• Adaptability

Provider Views of Strengths (chart categories)

• Available Technology
• Ability to Streamline ED Process
• Competition
• Growth Potential
• ED Tools
• Financial Backing
• Playing "Fear Factor" Marketing
• Culling / Filtering
• Hosting / Processing (Raw Bandwidth)
• Forensics
• Educational Materials
• Information Sharing
• Project Management
• Marketing (case law on Web sites)
• Move from IT to Business Model


E-Discovery ― Areas of Industry Weakness

Consumer Views of Weaknesses (chart categories)

• Project Management
• Quality Control
• Culling / Filtering
• Promised Search Results
• Maturity of Tools
• Promised Deliverables (Mktg Honesty)
• Coding
• Clear Communication by Providers
• Knowledge of Litigation Process
• Consistent Pricing Models
• Consulting for Collection Preservation
• Limitations / Mobility of Labor Pool
• Preparation for Discovery / Retention Policies
• Litigation Solutions Compatible w/ Bus. Solutions
• "Fear Factor" Marketing

Provider Views of Weaknesses (chart categories)

• Lack of Standards
• Knowledge of the Litigation Process
• Promised Deliverables (Mktg Honesty)
• Lack of Best Practices
• Maturity of Tools
• Clear Communication by Providers
• Consistent Pricing Models
• Attorneys' Knowledge of Technology
• Project Management
• Efficiency of Processing
• Consulting for Collection Preservation
• Forensics
• Native File Review
• Size, Locality of Providers
• Vendor Innovation
• Understanding Advanced Technology


EDD Scenarios — “the dance of the litigants”

• Party A vs. Company B (David vs. Goliath): Employment Discrimination
• Gov’t vs. Company C: Securities Fraud
• Company D vs. Company E: Intellectual Property

(Diagram label repeated beside the litigants: EDD resources)


The EDD Work Flow Model

E-Discovery Pipeline

• Data Gathering: Preservation & Collection
  – Electronically stored info. is preserved from multiple sources
• Media Restoration (data trans. to a std. media)
  – Data transferred from original or intermediate media to uniform media for analysis
• Data Processing (filtering, format conversion)
  – Vetting performed to reduce volume of data (incl. filtering, deduping, clustering, etc.)
• Online Review: Hosting & Searching
  – Primary review stage. Data transferred to dedicated repository
  – Searching based upon sources, dates, orig. file types, key words, etc.
• Production & Delivery
  – Delivery of reports to clients, systems, in diff. formats & media
• Data Entry & Scanning
  – Hard copy media converted (e.g., OCR) or audio records transcribed
• E-Discovery Consulting (throughout process)
  – Advice to clients on strategies & procedures for conducting E-Discovery processing
• Identification (relevant content and its scope)
  – Breadth and depth of discoverable materials
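Two of the steps named on this slide, deduping during data processing and searching by sources, dates, original file types and key words during online review, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a description of any vendor's tooling: the `Doc` record, the hash-based exact-duplicate key and the `search` signature are all hypothetical.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Doc:
    doc_id: str
    source: str      # custodian or system the item was collected from
    created: date
    file_type: str   # original file type, e.g. "msg" or "doc"
    text: str


def content_hash(doc: Doc) -> str:
    """Exact-duplicate key: SHA-256 of lower-cased, whitespace-normalised text."""
    normalised = " ".join(doc.text.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def dedupe(docs):
    """Keep the first document seen for each content hash."""
    seen, kept = set(), []
    for doc in docs:
        key = content_hash(doc)
        if key not in seen:
            seen.add(key)
            kept.append(doc)
    return kept


def search(docs, *, sources=None, start=None, end=None, file_types=None, keywords=()):
    """Filter by source, date range, original file type and key words (all optional)."""
    hits = []
    for doc in docs:
        if sources is not None and doc.source not in sources:
            continue
        if start is not None and doc.created < start:
            continue
        if end is not None and doc.created > end:
            continue
        if file_types is not None and doc.file_type not in file_types:
            continue
        lowered = doc.text.lower()
        if not all(word.lower() in lowered for word in keywords):
            continue
        hits.append(doc)
    return hits


# Made-up sample data for illustration only.
corpus = [
    Doc("d1", "alice", date(2006, 3, 1), "msg", "Quarterly forecast attached."),
    Doc("d2", "bob", date(2006, 3, 2), "msg", "quarterly  forecast attached."),
    Doc("d3", "alice", date(2006, 5, 9), "doc", "Draft retention policy for review."),
]
unique = dedupe(corpus)
hits = search(unique, sources={"alice"}, start=date(2006, 1, 1), keywords=["forecast"])
print([d.doc_id for d in unique], [d.doc_id for d in hits])
```

Hashing normalised text only catches exact duplicates; near-duplicates and clustering, also mentioned on the slide, need a similarity measure instead.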


The EDD Work Flow Model

E-Discovery Pipeline

• Data Gathering: Preservation & Collection
• Media Restoration (data trans. to a std. media)
• Data Processing (filtering, format conversion)
• Online Review: Hosting & Searching
• Production & Delivery
• Data Entry & Scanning
• E-Discovery Consulting (throughout process)
• Identification (scope, depth of information)

Proposed extended scope of text ‘retrieval’ task (i.e., including filtering, organizing & report generation)
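The extended ‘retrieval’ task proposed here chains three steps: filtering, organizing and report generation. A minimal sketch of that chain follows. The record layout, function names and sample data are hypothetical, chosen only to make the three steps concrete.

```python
from collections import defaultdict


def filter_docs(docs, keyword):
    """Filtering: keep documents whose text mentions the keyword."""
    needle = keyword.lower()
    return [d for d in docs if needle in d["text"].lower()]


def organize(docs, key):
    """Organizing: group document ids by a metadata field such as custodian."""
    groups = defaultdict(list)
    for d in docs:
        groups[d[key]].append(d["id"])
    return dict(groups)


def report(groups):
    """Report generation: one summary line per group, sorted by group name."""
    return [
        f"{name}: {len(ids)} doc(s) [{', '.join(ids)}]"
        for name, ids in sorted(groups.items())
    ]


# Made-up sample records for illustration only.
docs = [
    {"id": "d1", "custodian": "alice", "text": "Stock option grant dates"},
    {"id": "d2", "custodian": "bob", "text": "Re: option backdating question"},
    {"id": "d3", "custodian": "alice", "text": "Holiday schedule"},
    {"id": "d4", "custodian": "alice", "text": "Option pricing memo"},
]
lines = report(organize(filter_docs(docs, "option"), "custodian"))
print("\n".join(lines))
```

Keeping the three steps as separate functions mirrors the slide's point that each is a distinct part of the task, and each could be swapped for a stronger technique independently.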






E-Discovery Technology Pyramid

Foundation: collecting (identification, conversion, migration)

Second Tier: vetting (filtering, de-duping, handling similar doc-objects)

Third Tier: organizing (classifying or clustering; tagging & linking)

Fourth Tier: analyzing (consolidating & summarizing; production)

Searching & Navigating
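The second tier's "handling similar doc-objects" goes beyond exact de-duplication. One common way to flag near-duplicates is Jaccard similarity over word shingles, sketched below. The slide does not name a technique, so this choice is an assumption, as are the shingle size `k`, the `threshold` value and the sample texts.

```python
def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word shingles (as tuples) in lower-cased text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |a & b| / |a | b|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def near_duplicates(docs: dict, threshold: float = 0.5):
    """Return (id, id, score) for every pair whose similarity meets the threshold."""
    ids = sorted(docs)
    sets = {i: shingles(docs[i]) for i in ids}
    pairs = []
    for n, left in enumerate(ids):
        for right in ids[n + 1:]:
            score = jaccard(sets[left], sets[right])
            if score >= threshold:
                pairs.append((left, right, round(score, 2)))
    return pairs


# Made-up sample texts for illustration only.
docs = {
    "a": "please review the attached retention policy before friday",
    "b": "please review the attached retention policy before monday",
    "c": "lunch menu for the cafeteria this week",
}
pairs = near_duplicates(docs)
print(pairs)
```

The pairwise comparison is quadratic in the number of documents, so it only illustrates the idea; collections at litigation scale would need an approximate scheme in front of it.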




Additional E-Discovery Challenges

• Workflow Support

• Process Efficiencies
  – Per Step
  – Overall

• Tool Integration

• Ease of Use
  – For Customers
  – For Support

• High Value to Cost Ratio
  – Added value through advanced technologies

• A TREC-like forum has much potential to contribute here
  – Both within and beyond the context of IR
