De-Duplication


Page 1: De-Duplication

leaders in digital evidence

Contact e.law: [email protected]

near de-duplication, email threading & text compare technology

de-duplication technology

email threading technology

The Exact Duplicate Problem

Research shows that anywhere from 30% to 50% or more of electronic document collections are exact duplicates. Duplicate documents significantly increase e.discovery processing costs and legal review time if they are not identified and removed.

The Near Duplicate Solution

Near duplicate detection technology (NDD) can detect files that have the same content but are in different formats, e.g. MS Word and PDF versions of the same document. NDD can also identify files that have the same content but different formatting. Near de-duplication creates order from chaos by grouping documents with similar content together and highlighting these groups to the user. Whilst exact de-duplication can result in the removal of up to 50% of duplicates in potentially discoverable electronic file repositories, near de-duplication can result in finding up to a further 50%. This means faster review and thereby greater time and cost savings.
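How NDD measures similarity is not described here. One common approach, sketched below as an assumption rather than as the product's actual method, compares overlapping word sequences (shingles) using Jaccard similarity and places a document in a group when it is similar enough to the first document of that group. The function names, the shingle size, the 0.5 threshold and the sample texts are all illustrative, and the sketch assumes text has already been extracted from each file.

```python
import re

def shingles(text, k=3):
    """Return the set of k-word sequences in the text (lower-cased)."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity: shared shingles divided by all shingles."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def group_near_duplicates(docs, threshold=0.5):
    """Greedy grouping: a document joins the first group whose first
    member is at least `threshold` similar, else it starts a new group."""
    groups = []
    for name, text in docs.items():
        doc_shingles = shingles(text)
        for group in groups:
            if jaccard(doc_shingles, group["first_shingles"]) >= threshold:
                group["members"].append(name)
                break
        else:
            groups.append({"first_shingles": doc_shingles, "members": [name]})
    return [group["members"] for group in groups]

# Hypothetical sample texts: two contract versions and an unrelated memo.
docs = {
    "contract_v1.txt": "The supplier shall deliver the goods within thirty days of the order date",
    "contract_v2.txt": "The supplier shall deliver the goods within sixty days of the order date",
    "memo.txt": "Lunch is at noon on Friday in the main office",
}

similarity = jaccard(shingles(docs["contract_v1.txt"]), shingles(docs["contract_v2.txt"]))
print(round(similarity, 2))  # 0.57
near_duplicate_groups = group_near_duplicates(docs)
print(near_duplicate_groups)
```

A greedy pass like this is easy to follow but depends on document order; large collections would normally need a more scalable technique.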

The Email Thread Problem

It is estimated that over 250 billion emails are sent and received each day worldwide. A large portion of those are replies to, or forwards of, other emails, creating email threads. An email thread can contain multiple copies of earlier emails; however, the emails in a thread are not exact duplicates of one another, as each is unique in its own right.

The Email Thread Solution

In the context of litigation, being able to follow an email thread is very important. Email threading technology captures and reconstructs email conversations. By identifying the unique emails in a collection, the tool drastically reduces the number of emails that need to be reviewed. Email threading simplifies the review of emails, while allowing the review of the email within its original context.
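The brochure does not explain how conversations are reconstructed. A minimal sketch, assuming each email carries a message identifier and the identifier of the email it replies to (as the standard Message-ID and In-Reply-To headers do), might rebuild threads as shown below. It also assumes that an email nobody replied to contains the full quoted history of its branch, which is not always true in practice. The field names and sample data are hypothetical.

```python
from collections import defaultdict

def build_threads(emails):
    """Group email ids by the root email of their conversation."""
    parent = {e["id"]: e.get("in_reply_to") for e in emails}

    def root_of(message_id):
        seen = set()
        # Walk up the reply chain until the parent is missing or unknown.
        while parent.get(message_id) in parent and message_id not in seen:
            seen.add(message_id)
            message_id = parent[message_id]
        return message_id

    threads = defaultdict(list)
    for e in emails:
        threads[root_of(e["id"])].append(e["id"])
    return dict(threads)

def last_in_branch(emails):
    """Emails that nobody replied to: the end of each branch."""
    replied_to = {e.get("in_reply_to") for e in emails}
    return [e["id"] for e in emails if e["id"] not in replied_to]

# Hypothetical collection: one thread that splits in two, plus a lone email.
emails = [
    {"id": "a", "in_reply_to": None},
    {"id": "b", "in_reply_to": "a"},
    {"id": "c", "in_reply_to": "b"},
    {"id": "d", "in_reply_to": "a"},
    {"id": "x", "in_reply_to": None},
]

threads = build_threads(emails)
to_review = last_in_branch(emails)
print(threads)    # {'a': ['a', 'b', 'c', 'd'], 'x': ['x']}
print(to_review)  # ['c', 'd', 'x']
```

In this example a reviewer would read three emails instead of five, under the quoting assumption stated above.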

Less cost, less time and less risk!

The Exact Duplicate Solution

Electronic documents have their own DNA and their own fingerprints. We can use this information to identify electronic documents that are exact duplicates of each other, significantly reducing the volume of documents that must be processed and reviewed.
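The brochure does not say how a document's fingerprint is computed. As a minimal sketch, assuming the fingerprint is a SHA-256 hash of the file's bytes, exact duplicates could be grouped as follows. The function names and sample documents are hypothetical and for illustration only.

```python
import hashlib
from collections import defaultdict

def fingerprint(data: bytes) -> str:
    """Assumed fingerprint: the SHA-256 digest of the raw bytes."""
    return hashlib.sha256(data).hexdigest()

def group_exact_duplicates(documents):
    """Return groups of document names whose bytes are identical."""
    groups = defaultdict(list)
    for name, data in documents.items():
        groups[fingerprint(data)].append(name)
    # Only groups with more than one member contain duplicates.
    return [sorted(names) for names in groups.values() if len(names) > 1]

# Hypothetical sample collection.
documents = {
    "contract_a.txt": b"This agreement is made on 1 July.",
    "contract_copy.txt": b"This agreement is made on 1 July.",
    "memo.txt": b"Please review the attached contract.",
}

duplicate_groups = group_exact_duplicates(documents)
print(duplicate_groups)  # [['contract_a.txt', 'contract_copy.txt']]
```

Only one document from each group would then need to be processed and reviewed. A byte-level hash treats a Word file and its PDF export as different documents, which is why exact de-duplication alone does not catch them.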

The Near Duplicate Problem

It is estimated that in enterprise environments, 20% to 50% of all electronic information is near-duplicate. Near-duplicate files are documents with minor differences, for example versions of a contract that differ by a few words.

Visit elaw.com.au for more information

WE CREATE ORDER FROM CHAOS

Quality ISO 9001

Page 2: De-Duplication

text compare technology

Equivio>Compare is a software application that highlights the textual differences between two documents and has the ability to compare documents of any two file types. e.law has integrated the Equivio>Compare technology into our near duplicate and email threading solutions for a unique document review experience.

Using Text Compare with Near Duplicate Sets

Text compare is used when reviewing documents that have been grouped into near-duplicate sets. For example, suppose a set of near-duplicates comprises 10 versions of a 50-page contract. The pivot document, which is the most representative document of the near-duplicate set, is identified. The lawyer starts by reading the pivot document. Having read the pivot, the lawyer can decide whether it is necessary to continue reviewing the remaining nine versions of the contract in the near-duplicate set. If the lawyer decides that the other nine versions do require review, it is not necessary to read the 50-page contract another nine times. Using Equivio>Compare, the lawyer can simply review the differences between each document and the pivot document.
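To illustrate the idea of reviewing only the differences against the pivot, here is a minimal sketch using Python's standard difflib module. It is not Equivio>Compare and makes no claim about how that product works; the function name and the sample sentences are hypothetical.

```python
import difflib

def word_differences(pivot_text, other_text):
    """List the word-level changes needed to turn the pivot into the other text."""
    pivot_words = pivot_text.split()
    other_words = other_text.split()
    matcher = difflib.SequenceMatcher(a=pivot_words, b=other_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # tag is 'replace', 'delete' or 'insert'
            changes.append((tag, " ".join(pivot_words[i1:i2]), " ".join(other_words[j1:j2])))
    return changes

# Hypothetical pivot and one other version from the same near-duplicate set.
pivot = "Payment is due within thirty days of invoice."
version_2 = "Payment is due within sixty days of invoice."

changes = word_differences(pivot, version_2)
print(changes)  # [('replace', 'thirty', 'sixty')]
```

Running the same comparison for each remaining version against the pivot gives the reviewer a short list of changes per document instead of the full text.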

Using Text Compare with Email Threads

Equivio>Compare also compares emails. The compare function is useful for highlighting the differences between two inclusives within an email thread. The two inclusives typically share a common ancestry; that is, both emails originate from the same original email thread, which at some point split into two sub-threads. Equivio>Compare identifies the common part of the chain, and the unique elements in both inclusives.
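The split between the common part of the chain and the unique elements can be pictured with a small sketch. It assumes, purely for illustration, that each inclusive is represented as an ordered list of message identifiers from the original email to the latest one; the representation, function name and identifiers are hypothetical and do not describe Equivio>Compare's internals.

```python
def split_common_ancestry(chain_a, chain_b):
    """Return (common part, unique tail of A, unique tail of B)."""
    common = []
    for message_a, message_b in zip(chain_a, chain_b):
        if message_a != message_b:
            break  # the thread split into two sub-threads here
        common.append(message_a)
    split_at = len(common)
    return common, chain_a[split_at:], chain_b[split_at:]

# Hypothetical inclusives: both start from m1 and m2, then diverge.
inclusive_a = ["m1", "m2", "m3", "m4"]
inclusive_b = ["m1", "m2", "m5"]

common, only_a, only_b = split_common_ancestry(inclusive_a, inclusive_b)
print(common)  # ['m1', 'm2']
print(only_a)  # ['m3', 'm4']
print(only_b)  # ['m5']
```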

Chaotic email collection

Re-built email thread

READ LESS, THINK MORE, WIN BIG

Focus on “inclusives”

Equivio™, Equivio>NearDuplicates™, Equivio>EmailThreads™, Equivio>Compare™ and Read less, Think more, Win big™ are trademarks of Equivio. Other product names mentioned in this document may be trademarks or registered trademarks of their respective owners. All specifications in this document are subject to change without prior notice.