The Bar Association of the City of Richmond
Continuing Legal Education Seminars 2009 – 2010
“E-Discovery for Small and Mid-Sized Firms”
Defining a Legal Strategy ….
The Value in Early Case Assessment
Presented by
Aubrey L. Owens, Jr.
Superior Document Services
October 14, 2009
Table of Contents
Overview: What is Early Case Assessment?
Identification of Evidence
  Custodian Interviews
  Data Mapping and eScoping
Preservation and Collection of Evidence
  Forensic Preservation and Collection
  Active Preservation and Collection
  Network Appliance Preservation and Collection
Search and Analysis of Evidence
  De-duplication
  Proximity Filters
  Keyword Generation and Analysis
Review of Evidence
  Relevance Review
  Substantive Review
Production of Evidence
  Native Production
  Image Production
  Metadata
Conclusion
All Rights Reserved. 2009 Superior Document Services Richmond, Virginia Page | 1
Overview: What is Early Case Assessment?
According to a 2005 Gartner report, 75 percent of global companies will be involved in a legal
or regulatory action that requires a systematic approach to legal discovery in the next year.
Through 2010, companies without formal e-discovery processes will spend nearly twice as
much on gathering and producing documents as they will on legal services. 1
Early case assessment includes the ability to identify the facts of the case through analysis of
the custodians, concept and thread analysis of emails, and sampling of other electronic data
to determine the best case strategy. This should be done to better prepare the client for the
expenses that will arise should litigation proceed. What is the benefit of pursuing litigation
when the case is worth $1M and it requires spending $150K just to process data for review
(not including attorney fees)? 2
The December 2006 Amended Federal Rules of Civil Procedure provide the framework for
litigants to meet and confer with their clients and opposing counsel in an effort to clearly
identify the scope of the litigation. Digital (electronic) record keeping is commonplace in the
majority of companies, a result of the explosion of email communications, as well as the ease
of creating, sharing and storing pertinent business documents in an electronic format. The task
of the litigant is not to produce everything, but to make a defensible production by narrowing
the scope of what is relevant.
1 Kroll On Track Discovery – (2006) PRACTICE POINTS: TEN STEPS FOR EARLY E-DISCOVERY CASE ASSESSMENT
2 Managing the Litigation Lifecycle – Owens, A. (2008) Early Case Assessment
Identification of Electronic Evidence
“... Understand your client’s technology. A solid case
assessment involves thoroughly understanding your client’s
information as it relates to the scope of the opposing party’s
discovery requests. Assessing your client’s data will involve
taking an inventory of key custodians and determining your
client’s relevant hardware, document types (e.g., e-mail,
database files, word processing documents, etc.), operating
system and software packages (current and archived),
employee computer use policies, and data storage/retention
policies”3
Custodian Interviews
Early in the process, the Legal and IT teams gather information to identify the potential
custodians and sources of information potentially responsive to the scope of discovery. The
custodian interview process is the first step to uncovering “who, where, when and how data
lives” throughout the enterprise. Legal counsel will interview each custodian using a series of
questions designed to support a defensible disclosure under FRCP Rule 26(a)(1)(B), which
concerns the duty of disclosure. Information gathered during each interview will be pertinent
to developing the case strategy while mitigating risk.
3 Kroll On Track Discovery – (2006) PRACTICE POINTS: TEN STEPS FOR EARLY E-DISCOVERY CASE ASSESSMENT
Data Mapping & eScoping
For corporate legal departments, the struggle to stay on top of litigation in 2009 will get worse
before it gets better. It's enough to make many wish they had a field guide to find their way
through the forest of lawsuits. Fortunately, there are ways that in-house counsel can
proactively prepare for litigation and regulatory and compliance issues, easing the burden of
discovery, while increasing the defensibility of their processes and procedures. Developing a
data map of an organization's information flow is one important step. 4
A well-constructed data map tells where information is stored throughout the enterprise, as
well as the technology used in the normal course of business. When the data map is used
properly, the legal and IT teams can target specific areas of information that are well known
to be responsive to the scope of the discovery request. This process is known as eScoping.
eScoping early in the process shows the legal team which specific areas of information are
obviously responsive, privileged, or irrelevant to the discovery request. In some instances,
eScoping will reduce the amount of information required for preservation, depending on the
collection method. Additionally, eScoping provides a foundation for identifying keywords and
other potential custodians.
4 Law.com (Corporate Counsel) – Tarr, B. (2009) - Finding Your Way Through Discovery by Data Mapping
Preservation and Collection of Evidence
“The reality of electronic discovery is it starts off as the responsibility of those who don’t
understand the technology and ends up as the responsibility of those who don’t understand the
law.”
Forensic Preservation and Collection
Computer forensics is the name of the science of resurrecting deleted data. Because operating
systems turn a blind eye to deleted data (or at least that which has gone beyond the realm of
the Recycle Bin), a copy of a drive made by ordinary processes won’t retrieve the sources of
deleted data. Computer forensic scientists use specialized tools and techniques to copy every
sector on a drive, including those holding deleted information. When the stream of data
containing each bit on the media (the “bitstream”) is duplicated to another drive, the resulting
forensically qualified duplicate is called a “clone.” When the bitstream is stored in files, it is
called a “drive image.” Computer forensic tools analyze and extract data from both clones and
images.
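The bitstream idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "drive" is an in-memory byte stream, the 512-byte block size and the function name `image_bitstream` are hypothetical, and a real forensic acquisition relies on write-blocking hardware and validated tools such as those listed below.

```python
import hashlib
import io

BLOCK_SIZE = 512  # assumed sector size, for illustration only


def image_bitstream(source, destination, block_size=BLOCK_SIZE):
    """Copy every block from source to destination.

    Returns (bytes_copied, sha1_hex) so the image can be verified later.
    """
    digest = hashlib.sha1()
    copied = 0
    while True:
        block = source.read(block_size)
        if not block:
            break
        destination.write(block)   # every block is copied, readable or not
        digest.update(block)       # fingerprint of the whole bitstream
        copied += len(block)
    return copied, digest.hexdigest()


if __name__ == "__main__":
    # Stand-in for a drive: active data, empty space, and leftover deleted data.
    drive = io.BytesIO(b"active file data" + b"\x00" * 100 + b"deleted remnants")
    image = io.BytesIO()
    size, fingerprint = image_bitstream(drive, image)
    verified = hashlib.sha1(image.getvalue()).hexdigest() == fingerprint
    print(size, verified)
```

Because the copy works on raw blocks rather than on files, leftover data that the operating system no longer lists is carried into the image along with everything else.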
Guidance EnCase® www.guidancesoftware.com
AccessData FTK® www.accessdata.com
Paraben® P2 Commander www.paraben.com
Active Preservation and Collection
In an active collection, targeted directories and files are captured using a non-forensic
method. Active collections are the fastest way to get a snapshot of files deemed definitely
responsive and/or privileged with regard to the scope of discovery. Most active collections are
not defensible unless the parties have previously agreed to the method and had it approved by
the court.
Active collections carry several risk factors, including spoliation of metadata and challenges
to the authenticity of the collected files.
AccessData FTK® Imager Lite www.accessdata.com
Pinpoint Laboratories SafeCopy® www.pinpointlabs.com
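One way to reduce the risks noted above is to record a fingerprint and the key file system dates for each file at the moment it is copied. The Python sketch below is a hypothetical illustration, not how the listed tools work internally; the function name `collect_files` and the manifest fields are assumptions made for the example.

```python
import hashlib
import os
import shutil


def sha1_of_file(path):
    """Return the SHA-1 hash of a file's contents."""
    with open(path, "rb") as handle:
        return hashlib.sha1(handle.read()).hexdigest()


def collect_files(source_paths, dest_dir):
    """Copy files into dest_dir and return a manifest describing each copy."""
    os.makedirs(dest_dir, exist_ok=True)
    manifest = []
    for index, src in enumerate(source_paths):
        info = os.stat(src)              # capture dates before touching content
        source_hash = sha1_of_file(src)
        # A counter prefix keeps same-named files from different folders apart.
        dest = os.path.join(dest_dir, f"{index:04d}_{os.path.basename(src)}")
        shutil.copy2(src, dest)          # copy2 attempts to preserve timestamps
        manifest.append({
            "source": src,
            "copy": dest,
            "size": info.st_size,
            "modified": info.st_mtime,
            "sha1": source_hash,
            "verified": sha1_of_file(dest) == source_hash,
        })
    return manifest
```

Even with a manifest, simply opening a file can update its last-accessed date on some systems, which is one reason the method should be agreed upon in advance.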
Network Appliance Preservation and Collection
The ability to conduct investigations across domains provides a unique advantage in seeking
out potentially relevant electronically stored information. Larger corporations benefit from
implementing this form of data crawling, which assists in day-forward preservation of
information in response to legal investigations. Data-crawling network appliances are installed
within the corporate firewall and are administered jointly by the internal IT department and
in-house counsel. Based on specific terms and other information provided, the appliance
constantly seeks information across the entire enterprise, identifying any electronically stored
information that meets the specified criteria and making a preserved copy of the data in a
“legal hold” folder for review by the legal team. The appliances have features for sending legal
hold notifications, managing discovery calendars, searching the information for relevance, and
preparing productions for outside legal counsel. All activities are tracked for defensibility, and
a continuous chain of custody report can be accessed on demand.
Clearwell® www.clearwellsystems.com
Kazeon® www.kazeon.com
Relativity® www.kcura.com
Venio FPR® www.veniosystems.com
RedFile® www.redfile.com
Search and Analysis of Evidence
“… quickly discover the most important facts about the case, whom you need to interview to
gain additional information, and additional materials you might need to collect (e.g.,
additional custodians)… develop a process … that works with your tool set to obtain as much
early information quickly as possible…” 5
De-duplication
In excess of 90% of all business communications are maintained in electronic form, with the
exception of legacy documents and documents requiring original signatures. Thus, the
redundancy of shared documents within corporate departments has an adverse effect on
preparing a response to a legal discovery request. Within a business, a given document
typically exists in at least 5-7 exact copies and another 10 near-duplicate iterations, and these
electronic iterations live across several custodians within the enterprise, including servers,
archive folders and personal storage devices.
De-duplication refers to the process of electronically reading the file header information and
the text of each document to establish a unique fingerprint. This fingerprint is referred to as a
hash value. Hash values are usually generated by one of two algorithms: MD5 or SHA-1. These
algorithms allow software to compare all collected electronically stored information (“ESI”) to
identify exact copies of a document, treat one instance as the original, and report where
duplicate instances are located within the enterprise or collected data set. De-duplication will
reduce the amount of ESI for review by 50% in most matters.
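The hash comparison described above can be sketched briefly in Python. The sketch makes a simplifying assumption: it hashes the full content of each document with SHA-1, whereas commercial tools may hash selected header fields and extracted text. The function names and sample locations are hypothetical.

```python
import hashlib
from collections import defaultdict


def fingerprint(content: bytes) -> str:
    """Return the SHA-1 hash value used as the document's fingerprint."""
    return hashlib.sha1(content).hexdigest()


def deduplicate(documents):
    """documents: iterable of (location, content) pairs.

    Returns (unique, duplicates):
      unique     maps hash -> first location seen (treated as the original)
      duplicates maps hash -> list of other locations holding the same content
    """
    unique = {}
    duplicates = defaultdict(list)
    for location, content in documents:
        digest = fingerprint(content)
        if digest in unique:
            duplicates[digest].append(location)
        else:
            unique[digest] = location
    return unique, dict(duplicates)


if __name__ == "__main__":
    collected = [
        ("server/finance/forecast.doc", b"Q3 forecast"),
        ("laptop/jsmith/forecast.doc", b"Q3 forecast"),
        ("archive/forecast_v2.doc", b"Q3 forecast v2"),
    ]
    originals, copies = deduplicate(collected)
    print(len(originals), copies)
```

Note that a hash only matches exact copies. The near iterations mentioned above differ by at least one byte, so they produce different hash values and need other techniques to be grouped.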
Proximity Filters
In addition to de-duplication, proximity filters continue to narrow the scope of review. These
filters target date ranges, directories, custodians, and use of keyword terms. Application of
proximity filters will narrow the scope by approximately 70% after de-duplication.
5 Findlaw.com Analysis Initial Case Assessment
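The filters and the combined effect of the two reduction figures can be illustrated with a short Python sketch. It assumes each document is a simple record with a date, a custodian and text (a directory filter would follow the same pattern), and it reads the 70% figure as applying to what remains after de-duplication. The function names and the 100,000-document starting volume are hypothetical.

```python
from datetime import date


def apply_filters(documents, start, end, custodians, keywords):
    """Keep documents inside the date range, held by a listed custodian,
    and containing at least one keyword (case-insensitive)."""
    kept = []
    for doc in documents:
        in_range = start <= doc["date"] <= end
        by_custodian = doc["custodian"] in custodians
        text = doc["text"].lower()
        has_keyword = any(term.lower() in text for term in keywords)
        if in_range and by_custodian and has_keyword:
            kept.append(doc)
    return kept


def remaining_after(total, reductions):
    """Apply successive fractional reductions, each to what is left."""
    remaining = float(total)
    for fraction in reductions:
        remaining *= (1.0 - fraction)
    return round(remaining)


if __name__ == "__main__":
    # 50% removed by de-duplication, then 70% of the remainder by filtering.
    print(remaining_after(100000, [0.5, 0.7]))
```

Under these assumptions, a collection of 100,000 documents would shrink to 50,000 after de-duplication and to about 15,000 after filtering, or roughly 15% of the original volume.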
Keyword Generation and Analysis
The ability to intuitively understand a case and its documents can rely heavily on the use of
keyword terms. The effective use of keywords will identify “hot docs” within the collected data
set. These terms are identified during the meet and confer, the custodian interviews, and
eScoping. They generally relate to “code words” that are directly associated with the matter.
Litigators apply the terms to the entire document population to quickly find and categorize
documents based on their relevancy.
Lately, keyword term searches have come under serious scrutiny by the courts for their
unreliability and inconsistency among different parties. Counsel are bound by the FRCP to
share and agree upon the keyword terms being used for the purpose of discovery. The
challenge for counsel is to generate a comprehensive set of suggested terms that will produce
the most relevant results. Technology today provides intuitive analytics to accommodate the
increasing demand of the court for the defensibility of keyword searches.
Trident Pro® www.discoverthewave.com
Casemine® www.casemine.com
LAW® Pre-Discovery http://law.lexisnexis.com/law-prediscovery
CaseLogistix® http://www.anacomp.com/clx/
Discover-e® http://discover-e-legal.com/
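A simple hit report is one way to test a proposed term list before sharing it with opposing counsel. The Python sketch below is a minimal illustration and does not reflect how the products listed above work; the function name `keyword_hit_report` and the sample documents are hypothetical. It matches whole words only, ignoring case.

```python
import re


def keyword_hit_report(documents, terms):
    """documents: dict of doc_id -> text.

    Returns a dict of term -> sorted list of doc_ids containing that term
    as a whole word, ignoring case.
    """
    report = {}
    for term in terms:
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        report[term] = sorted(
            doc_id for doc_id, text in documents.items() if pattern.search(text)
        )
    return report


if __name__ == "__main__":
    sample = {
        "D1": "Falcon pricing memo",
        "D2": "falconry club newsletter",
        "D3": "Re: pricing for Falcon",
    }
    for term, hits in keyword_hit_report(sample, ["falcon", "rebate"]).items():
        print(term, len(hits), hits)
```

Terms that hit nothing, or that hit nearly everything, are candidates for revision before the list is finalized.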
Review of Evidence
Relevance Review
The first pass review, or relevance review, by the trial team quickly identifies potentially
responsive ESI. The following types of search analyses are applied to the relevance review
process:
Latent Semantic
“… Latent Semantic Analysis (LSA) is a theory and method for extracting and representing the
contextual-usage meaning of words by statistical computations applied to a large corpus of
text (Landauer and Dumais, 1997). The underlying idea is that the aggregate of all the word
contexts in which a given word does and does not appear provides a set of mutual constraints
that largely determines the similarity of meaning of words and sets of words to each other ...” 6
Latent Semantic Analysis allows variations of common names to be grouped together to
maximize accuracy in search string results. For example, “William Jefferson” and “Bill Jefferson”
could be joined as one keyword term. Semantic analysis can also group misspelled words with
their correctly spelled forms in a single search string.
6 University of Colorado at Boulder – Landauer, T. K., Foltz, P. W., & Laham, D. (1998). An Introduction to Latent Semantic Analysis
Categorization
The use of relevance tags to group documents by category supports a more in-depth
substantive review or production. Common tags include Responsive, Privileged, Attorney-Client,
Hot Docs and Non-responsive. Grouping items by date also provides additional criteria when
evaluating ESI for relevancy. The tags are then used to query documents in preparation for
depositions and productions.
Conceptual Search and Clustering
“… a search engine that retrieves documents based on a combination of keyword and conceptual matching.
Documents are automatically classified to determine the concepts to which they belong. Query concepts are
determined automatically from a small description of the query or explicitly entered by the user…”7
The ability to link documents that are closely related using “code words” is made more
efficient by using conceptual search analytics. Conceptual searching decreases the effort
required by the legal team to quickly identify the “smoking gun” and related supporting
documents.
Today, two basic approaches to grouping (sometimes referred to as "clustering") similar
documents exist: rules-based and example-based. In a rules-based model, the review team
establishes criteria that help determine relevancy rates in the overall document collection. A
rules-based approach is similar to keyword searching, but often provides more relevant results
compared to keyword searching, as the search engine may use proximity, word patterns, co-
occurrence of key concepts, and/or thesauri to determine search "hits" and relevancy.
In an example-based approach, documents programmatically describe themselves based on
the concepts that are identified within each document. The system then groups documents
that are contextually similar for review. An additional benefit of most example-based
approaches is that reviewers can employ both discovered material and their knowledge of the
matter to narrowly group potentially relevant content. For example, when relevant documents
are identified during review, an example-based system can regroup the collection of
documents based on the reviewer-provided examples of the relevant document(s). 8
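The example-based idea can be illustrated with a deliberately simple Python sketch that ranks documents by how closely their wording matches a reviewer-provided example. It uses plain word counts and cosine similarity, which is a much cruder stand-in for the concept extraction performed by commercial clustering engines. The function names and sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Count lowercase word occurrences in the text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Return a score from 0.0 (no shared words) to 1.0 (same word mix)."""
    shared = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_example(example_text, documents):
    """Return (doc_id, score) pairs, most similar to the example first."""
    example = term_vector(example_text)
    scored = [
        (doc_id, cosine_similarity(example, term_vector(text)))
        for doc_id, text in documents.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    collection = {
        "D1": "pricing agreement with distributor",
        "D2": "holiday party schedule",
        "D3": "distributor pricing agreement draft terms",
    }
    for doc_id, score in rank_by_example("distributor pricing agreement", collection):
        print(doc_id, round(score, 3))
```

As reviewers mark more documents relevant, the same ranking can be rerun with the new examples, which mirrors the regrouping step described above.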
7 University of Kansas - EECS Department Madrid, J. M. & Gauch, S. (2002). Incorporating Conceptual Matching in Search
8 Findlaw.com – Review of Emerging Technologies: Auto-Coding or Clustering
Substantive Review
Substantive review is a linear review in which in-house or contract attorney reviewers identify
important documents while eliminating irrelevant files, using document management
databases. There are two distinct platforms for the substantive review:
In-House Relational Database
Reviewers access a database stored within the enterprise environment to identify and
categorize potentially relevant documents. The database gives the review team the ability to
search (Boolean and fuzzy) across the entire data set and tag documents based on relevancy.
The in-house platform is typically limited by the system configuration of the end-user
computer and by licensing restrictions on multiple-user access.
Concordance®
Summation®
Microsoft Excel®
Microsoft Access®
Hosted Document Repository
“… a hosted litigation support solution is a database hosted online by a third party for a law
firm or multiple firms. This could be a firm with a very large case or multi-party litigation. Each
party can have its own secure log in passwords and not have access to their opponents’ work
product…”9
These solutions improve the review process and offer
quite a few benefits:
• efficient and advanced searching,
• limitless secure user access,
• continuous availability to documents via any web
browser, and
• enhanced functionality to identify and record
document relevance.
9 Bow Tie Law Blog - Court Orders For Hosted Review Solutions: When the Judge Wants to See the Discovery Too (2009)
Production of Evidence
Native Production
“…the responding party provides the requesting party with the
documents “as is.” For example, if the producing party used
Microsoft Word internally for generating and storing documents,
they would provide the requesting party with unmodified (native)
Word (.doc) files…”10
A common misconception is that native production allows an opponent to alter the production
without being discovered. Alterations can be detected by using electronic file signatures (hash
values), which will identify even the smallest data alteration. A few pros and cons should be
considered before choosing a native production:
• PROS: Native data provides easy access to all data, including metadata. Native
files retain the original look and feel without translation. There is the potential to
support forensic analysis of the evidence.
• CONS: Originals can be easily modified – with or without intent. Bates numbering
becomes inefficient, and requires modifying the original document. There is no
easy means for creating redactions with native documents.
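The hash-value safeguard mentioned above can be sketched in Python as a manifest recorded at production time and checked later. This is an illustrative sketch only; the function names, the use of SHA-1, and the sample file names are assumptions made for the example.

```python
import hashlib


def build_manifest(files):
    """files: dict of file name -> content bytes at the time of production.

    Returns a dict of file name -> SHA-1 hash value.
    """
    return {name: hashlib.sha1(content).hexdigest() for name, content in files.items()}


def find_altered(manifest, files):
    """Return the sorted names of files that are missing or whose current
    content no longer matches the hash recorded in the manifest."""
    altered = []
    for name, expected in manifest.items():
        content = files.get(name)
        if content is None or hashlib.sha1(content).hexdigest() != expected:
            altered.append(name)
    return sorted(altered)


if __name__ == "__main__":
    produced = {"contract.doc": b"Payment due in 30 days", "memo.doc": b"FYI"}
    manifest = build_manifest(produced)
    returned = dict(produced)
    returned["contract.doc"] = b"Payment due in 90 days"  # a one-character change
    print(find_altered(manifest, returned))
```

Exchanging the manifest along with the production gives both sides a common reference for authenticating any native file that is later offered as evidence.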
Image Production
Electronic conversion of native files to .TIF, .PDF and/or .JPG format
for producing to opposing parties instead of printing to paper is very
common. In addition to providing electronic images, the associated
metadata and extracted text (OCR where applicable) are provided.
This method of production is the standard.
Documents produced in this form are easily accessible, searchable and quickly identified when
document identifiers (Bates numbers) are embedded into the electronic image. These
documents are not easily altered, but redactions can be performed when producing
proprietary (or responsive but privileged) content.
10 Fulcrum Inquiry® LLP - How To Select The Right Form Of Electronic Discovery (2007)
Metadata
The who, what, when and how of an electronic document is referred to as metadata. This
data provides information that can aid in settling a dispute regarding the authenticity and
timeline of a given file and/or production. Metadata can help lawyers in a variety of ways: 11
1. Reviewing deleted text, and who added text, can be helpful
in understanding the path of negotiations and what was
intended by the parties.
2. Metadata can help resolve issues regarding (i) the
authenticity of a file, or (ii) the timing of documents and the
events they describe.
3. Knowing who obtained a file can assist in determining whether
legal privileges have been waived.
4. Metadata may indicate that a file was based on another
older file or was authored by another person, thus calling
into question how much additional work was involved in the
new document or author.
5. Parties sharing drafts through word processing files could easily be embarrassed (or
worse) by the history of what was created and deleted in the exchanged draft. (When
issues of document spoliation do not exist, programs, including an option in Microsoft's
2004 versions, can eliminate the metadata and drafting history.)
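As a small illustration of the "when" portion of metadata, the Python sketch below reads the dates and size that the file system keeps for a file. It is limited by assumption to file system metadata; embedded metadata such as author names or drafting history lives inside the document and requires format-specific tools. The function name and field names are hypothetical.

```python
import os
from datetime import datetime, timezone


def _utc_iso(timestamp):
    """Format a file system timestamp as an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()


def file_system_metadata(path):
    """Return the basic metadata the file system records for a file."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),
        "size_bytes": info.st_size,
        "modified_utc": _utc_iso(info.st_mtime),
        "accessed_utc": _utc_iso(info.st_atime),
    }
```

These values change easily when files are opened or copied, which is why the collection method matters when the timing of a document is in dispute.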
11 Fulcrum Inquiry LLP - Metadata Primer For Lawyers (2005)
Conclusion
The landscape of legal discovery continues to evolve at a rapid pace as the dynamics of
collaboration between people are constantly being recorded. The days of holding a business
conversation over the phone with a “note taker” have quickly faded with technology such as
voice over Internet Protocol (VoIP), where conversations can be recorded with or without the
knowledge of both parties. Creating a business communication on a typewriter seems like a
myth from the days of the Pony Express, while today nearly all business communications are
created on a keyboard and saved to a chip of physical memory somewhere. The United States
Postal Service continues to see a year-over-year drop in letter-sized mail, while statisticians
report that over 10 billion emails are sent daily worldwide.
One might ask, “What did people do before computers, cell phones, email, etc.?” The
technological advances of the last three decades have created a paradigm shift in human
interaction. Litigators have also evolved in the manner in which responses to legal discovery
requests are prepared. The courts have recognized these ever-evolving developments by
amending and monitoring the Federal Rules of Civil Procedure. This includes setting standards
for the discovery, handling and production of electronically stored information throughout the
litigation lifecycle.
Implementing Early Case Assessment techniques during the discovery phase of litigation
helps identify a case strategy quickly. Litigators benefit because they are able to identify the
level of technology, retention policies, and volume of information involved. The result is a
roadmap for mitigating risk and expense.
“The reality of electronic discovery is it starts off as the responsibility of those who don’t
understand the technology and ends up as the responsibility of those who don’t understand the
law.”