20161024 mira talk_machine_learning_pratice_law

Machines Learning the Practice of Law

http://www.complexdiscovery.com/info/2014/01/16/13-years-of-ediscovery-a-quick-merger-acquisition-and-investment-update/

Terminology

Expression: Definition

False positives: Documents retrieved as responsive that are in fact irrelevant

False negatives: Relevant documents that were not retrieved as responsive

Synonymy: Different words have the same meaning

Polysemy: The same word has different meanings

Precision: Proportion of the retrieved documents that are responsive

Recall: Proportion of the responsive documents that have been retrieved
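The precision and recall definitions above can be illustrated with a small calculation. This is only a sketch: the function name `precision_recall` and the document identifiers are hypothetical, and "responsive" here stands for whatever a reviewer has judged responsive.

```python
def precision_recall(retrieved, responsive):
    """Compute precision and recall from two collections of document ids.

    retrieved:  ids returned by the search or categorization tool
    responsive: ids a human reviewer judged responsive (assumed ground truth)
    """
    retrieved = set(retrieved)
    responsive = set(responsive)

    true_positives = retrieved & responsive    # retrieved and responsive
    false_positives = retrieved - responsive   # retrieved but irrelevant
    false_negatives = responsive - retrieved   # responsive but missed

    # Guard against empty sets to avoid division by zero.
    precision = len(true_positives) / len(retrieved) if retrieved else 0.0
    recall = len(true_positives) / len(responsive) if responsive else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "false_positives": false_positives,
        "false_negatives": false_negatives,
    }


# Hypothetical example: 4 documents retrieved, 3 actually responsive.
result = precision_recall(
    retrieved={"doc1", "doc2", "doc3", "doc4"},
    responsive={"doc2", "doc3", "doc5"},
)
print(result["precision"])  # 0.5
print(result["false_negatives"])  # {'doc5'}
```

In this made-up case half of what was retrieved is responsive (precision 0.5), and two of the three responsive documents were found (recall of about 0.67).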

eDiscovery Institute, Comparison of Auto-Categorization with Human Review (http://www.electronicdiscoveryinstitute.com/pubs/ComparisionAutoCategorization.pdf)

OMG “Oh My God”

‘97

Principle 7

Technologies

• “Various e-discovery solutions are available including software solutions such as predictive coding and auditing procedures such as sampling. It is naive to expect complete procedural agreement in an adversarial system but there should be a mutual interest in identifying critical documentary evidence while preserving legitimate claims for privilege.[7] Suffice to say traditional approaches to production motions cannot be used for production on this scale.”

• L’Abbé v. Allen-Vanguard, 2011 ONSC 7575

• Excessive rates for paralegals for document review

• “[43] Mr. Anello’s response on cross-examination was that while he has used contract lawyers on some cases to perform first level review, he did not choose to do so in this case. He testified, “Right, I think we used predicative algorithm coding software, which is generally less expensive. And we used various other methods to limit the document review that were less expensive than human beings.” He went on to say predictive coding was the first level review since it goes “a little deeper” than just a word search.

• [44] Given the use of predictive coding for the first level review of massive document disclosure, I do not find it unreasonable for the lawyer to then use paralegals to conduct the next level or levels of review. I make no adjustment on this account.”

• Bennett v Bennett, 2016 ONSC 503

Categorization

[Figure: categorization workflow. Labels: Category with “Example documents”; Similar documents identified; Review Sample (create sample of identified documents); Good Matches (add to category); Bad Matches (mark not responsive); Teaching Set.]
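The categorization workflow above (example documents, similar documents identified, sample reviewed, good matches added, bad matches marked not responsive) can be sketched in a few lines. The slides do not say how similarity is computed, so the word-count cosine similarity, the 0.3 threshold, and the function names `find_similar` and `apply_review` are all assumptions chosen for illustration, not the method of any particular review platform.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn a document into a bag-of-words count vector (assumed representation)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def find_similar(examples, corpus, threshold=0.3):
    """Return ids of corpus documents similar to any example document."""
    example_vectors = [vectorize(text) for text in examples]
    similar = []
    for doc_id, text in corpus.items():
        vector = vectorize(text)
        best = max((cosine(vector, e) for e in example_vectors), default=0.0)
        if best >= threshold:
            similar.append(doc_id)
    return similar


def apply_review(examples, corpus, decisions):
    """Apply reviewer decisions: good matches join the category examples,
    bad matches are returned as not responsive."""
    not_responsive = []
    for doc_id, is_good_match in decisions.items():
        if is_good_match:
            examples.append(corpus[doc_id])
        else:
            not_responsive.append(doc_id)
    return not_responsive


# Hypothetical category with one example document and a three-document corpus.
examples = ["termination of the supply contract"]
corpus = {
    "a": "notice of termination of the supply contract with the vendor",
    "b": "lunch menu for the office party",
    "c": "contract termination letter draft",
}

candidates = find_similar(examples, corpus)
print(candidates)  # ['a', 'c']

# The reviewer accepts "a" and rejects "c".
rejected = apply_review(examples, corpus, {"a": True, "c": False})
print(rejected)  # ['c']
```

Each accepted document enlarges the set of examples, so a later pass of `find_similar` works from a richer teaching set.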

Predictive Coding

[Figure: document review workflow. Labels: Predictive Coding; Keywords & Concepts; 100,000 documents; Sample; Responsive Documents; Report.]

Predictive Coding

[Figure: predictive coding worked example on 500,000 documents.]

Teaching Set: 500 documents

1st round of Predictive Coding: Responsive 100,000; Non-Responsive 300,000; Uncategorized 100,000

Statistical Sample QC: Responsive 762; Non-Responsive 767; Uncategorized 250

2nd round of Predictive Coding: Responsive 130,000; Non-Responsive 330,000; Uncategorized 40,000

Statistical Sample QC: Responsive 765; Non-Responsive 767; Uncategorized 250

Result: 4,061 coded manually; 495,939 categorized by Analytics
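The totals in the predictive coding example above can be checked with simple arithmetic. The sketch below assumes that the manually coded documents are the 500-document teaching set plus the two statistical QC samples, and that the sample figures group into two QC rounds as listed. That grouping is an inference from the slide, not something it states. The variable and function names are hypothetical.

```python
# Figures taken from the slide; grouping them into two QC rounds is an assumption.
TOTAL_DOCUMENTS = 500_000
TEACHING_SET = 500
QC_SAMPLES = [
    {"responsive": 762, "non_responsive": 767, "uncategorized": 250},
    {"responsive": 765, "non_responsive": 767, "uncategorized": 250},
]


def manual_review_count(teaching_set, qc_samples):
    """Documents coded by people: the teaching set plus every QC sample."""
    return teaching_set + sum(sum(sample.values()) for sample in qc_samples)


coded_manually = manual_review_count(TEACHING_SET, QC_SAMPLES)
categorized_by_analytics = TOTAL_DOCUMENTS - coded_manually
manual_share = coded_manually / TOTAL_DOCUMENTS

print(coded_manually)            # 4061
print(categorized_by_analytics)  # 495939
print(f"{manual_share:.2%}")     # 0.81%
```

Under these assumptions the numbers reproduce the slide's totals: fewer than 1% of the 500,000 documents are coded by hand.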

Online Dispute Resolution

Dominic [email protected] @dominicjaar