Still on Stage: Boolean Search. Your Speakers Speaker: Richard Cheng Richard Cheng, CISSP, CISA,...

Still on Stage: Boolean Search



Still on Stage: Boolean Search


Your Speakers


Speaker: Richard Cheng

Richard Cheng, CISSP, CISA, directs digital forensics and e-discovery cases and consults on IT audits, governance and compliance. His experience includes the collection and processing of unique and/or proprietary ESI (Apple devices, mobile devices, collaboration sites, and the cloud). Richard has provided testimony as a neutral expert and technology authority. He has two M.S. degrees from the University of New Haven and a B.S. from MIT.


Speaker: Megan Bell

Megan Bell directs data analysis projects. She is experienced in the analysis of complex data sets, search and reporting technology and the automation of workflows that increase efficiency and deliver better outcomes. Her case experience includes data/security breach, IP theft, insurance, and employment matters. She also has extensive experience in the development and launch of new product technologies. She has a degree in Chemical Engineering from WPI.


Speaker: Shawnna Childress, P.I.


Overview: Boolean Search

Early eDiscovery famous moments:
Martha Stewart voicemail
Lehman Brothers’ bankruptcy
Merrill Lynch analyst emails on “junk” investments

It’s not just e-discovery.


Universe of Search

Types of Data

Sources: Databases, Email, Files, SharePoint

Locations: Local computer, server, backup, mobile device

Search Technologies:
dtSearch
Lucene
Grep
SQL
Automated “predictive” methods / neural nets


Why Boolean?

Boolean search: Character-based searching. Toolbox of relationship connectors and limiters to broaden or narrow search.

Benefits:
Identify important words/phrases and how they are used
Research “written” language context and relationship
Easily vary breadth and scope of search
Customizable search


Overview of Boolean Search Construction

Boolean connectors: AND, OR, NOT
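
As a rough illustration of the three connectors, each term can be treated as the set of documents that contain it, so AND, OR and NOT become set intersection, union and difference. The sample documents and the hits() helper below are hypothetical and stand in for a real search index.

```python
# Toy corpus: document id -> text (hypothetical sample data).
DOCS = {
    1: "product release planned for China",
    2: "product rollout delayed in Japan",
    3: "quarterly budget review",
}


def hits(term):
    """Return the ids of documents that contain the term as a whole word."""
    term = term.lower()
    return {doc_id for doc_id, text in DOCS.items() if term in text.lower().split()}


and_result = hits("product") & hits("china")   # product AND china
or_result = hits("china") | hits("japan")      # china OR japan
not_result = hits("product") - hits("japan")   # product AND NOT japan

print(and_result, or_result, not_result)
```

AND narrows the result, OR broadens it, and NOT removes whole documents rather than individual occurrences of a word.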


Overview of Boolean Search Construction

Other Boolean elements:
Proximity, Stemming, Fuzzy Searching
Parentheses
Wildcards
Numeric terms and ranges
Fields (e.g., email address)

Differences in Boolean connectors:
AND versus Proximity
Stemming versus Wildcard use
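
The difference between AND and proximity can be sketched as follows. AND only asks whether both words appear anywhere in the document; a proximity connector such as w/N also asks how far apart they are. The functions below are illustrative only and assume an unordered "within N words" rule; actual engines differ in how they tokenize and count distance.

```python
import re


def tokens(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_and(text, a, b):
    """True if both words occur anywhere in the text."""
    words = tokens(text)
    return a in words and b in words


def matches_within(text, a, b, n):
    """True if a and b occur within n words of each other, in either order."""
    words = tokens(text)
    pos_a = [i for i, w in enumerate(words) if w == a]
    pos_b = [i for i, w in enumerate(words) if w == b]
    return any(abs(i - j) <= n for i in pos_a for j in pos_b)


near = "electronic discovery vendors were retained"
far = "electronic records were kept for years before discovery began"

print(matches_and(far, "electronic", "discovery"))        # both words present
print(matches_within(far, "electronic", "discovery", 3))  # but not close together
print(matches_within(near, "electronic", "discovery", 1))
```

Both sample sentences satisfy AND, but only the first satisfies the proximity test, which is why proximity is usually the narrower choice.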


Overview of Boolean Search Construction


Overview of Boolean Search Construction for Foreign Languages


Foreign Languages

How will you handle the multiple foreign languages?

Example: Chinese Dialects
Gan - 赣语 / 贛語 - 31 million
Guan (Mandarin) - 官话 / 官話 - 836 million
Hui - 徽語 - 3.2 million
Jin - 晋语 / 晉語 - 45 million
Kejia (Hakka) - 客家話 - 34 million
Min - 閩語 / 闽语 - 60 million
Wu - 吴语 / 吳語 - 77 million
Xiang - 湘语 / 湘語 / 湖南话 / 湖南話 - 36 million
Yue - 粵語 / 粤语 - 71 million
Unclassified - not determined


Optimizing Boolean Search Statement Construction

1. Invest time in identifying relevant search terms and phrases.

2. Determine which search terms to search in combination.

3. Use the most appropriate Boolean logic.

4. Adjust Boolean search statements to account for variations in search term wording, spellings and abbreviations.

5. Modify Boolean search statements when special characters are present.


Examples


1. Capturing the Variation for a Word

Example: eDiscovery

Boolean: “e-Discovery” OR eDiscovery OR “electronic discovery” OR electronic w/1 discovery
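
When a term has many spellings, the OR statement can be assembled from a list rather than typed by hand. The or_query() helper below is a hypothetical sketch: it quotes any variant that contains a space or hyphen and joins the variants with OR. It does not generate proximity clauses such as w/1, and the quoting rule is an assumption that may not match a given search tool.

```python
def or_query(variants):
    """Join spelling variants with OR, quoting multi-word or hyphenated ones."""
    parts = []
    for variant in variants:
        if " " in variant or "-" in variant:
            parts.append(f'"{variant}"')
        else:
            parts.append(variant)
    return " OR ".join(parts)


query = or_query(["e-Discovery", "eDiscovery", "electronic discovery"])
print(query)
```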


2. Searching for Unique Phrases

Example: Search for the ratio 1:1

Boolean: 1?1 AND (NOT (101 OR 111 OR 121 OR 131 OR 141 OR 151 OR 161 OR 171 OR 181 OR 191))
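
A small sketch of why the exclusions are needed: the single-character wildcard ? matches any character, digits included, so 1?1 also picks up 101, 111 and so on. The example uses Python's fnmatch module as a stand-in for a search engine's wildcard matching and assumes the engine keeps "1:1" as a single searchable token, which depends on how punctuation is indexed.

```python
from fnmatch import fnmatchcase

# Hypothetical tokens as they might appear in an index.
tokens = ["1:1", "101", "1-1", "121", "1/1", "11", "1:10"]

# Everything the wildcard pattern 1?1 matches.
wildcard_hits = [t for t in tokens if fnmatchcase(t, "1?1")]

# The ten all-digit matches (101, 111, ..., 191) that the NOT clause removes.
excluded = {f"1{d}1" for d in range(10)}
ratio_like = [t for t in wildcard_hits if t not in excluded]

print(wildcard_hits)
print(ratio_like)
```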


3. Simplifying Complex Compound Phrases

Example: (“product rollout” OR “product release”) AND (China OR Japan OR Korea OR Asia OR ASEAN OR Taiwan OR Hong Kong)

Boolean:
(“product release”) AND (China OR Japan OR Korea OR Asia OR ASEAN OR Taiwan OR Hong Kong)
(“product rollout”) AND (China OR Japan OR Korea OR Asia OR ASEAN OR Taiwan OR Hong Kong)
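
Splitting the compound statement does not change the overall result: the union of the two simpler searches equals the original search, and each half can be counted and reviewed on its own. The sketch below uses hypothetical sample documents and crude substring matching purely to show that equivalence.

```python
REGIONS = ["china", "japan", "korea", "asia", "asean", "taiwan", "hong kong"]

# Hypothetical sample documents.
DOCS = {
    1: "Product release schedule for Japan and Korea",
    2: "Product rollout plan: Hong Kong first",
    3: "Product release notes, internal only",
    4: "Product rollout and product release dates for China",
}


def phrase_and_region(phrase):
    """Ids of documents containing the phrase AND at least one region name."""
    result = set()
    for doc_id, text in DOCS.items():
        lowered = text.lower()
        if phrase in lowered and any(region in lowered for region in REGIONS):
            result.add(doc_id)
    return result


release_hits = phrase_and_region("product release")
rollout_hits = phrase_and_region("product rollout")
combined_hits = release_hits | rollout_hits  # same set as the single compound search

print(len(release_hits), len(rollout_hits), len(combined_hits))
```

Per-phrase counts like these make it easier to see which half of the statement is driving the hit volume.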


4. When Dates are Search Terms

Example: 1/6/11

Boolean: “1?6?11” OR “!1?6?2011”

Others?
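
One way to approach the "Others?" question is to enumerate the written forms a date can take and decide which ones the search statement should cover. The date_variants() helper below is a hypothetical sketch that assumes U.S. month/day order and three common separators; it is not an exhaustive list of formats.

```python
from datetime import date

MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
]


def date_variants(d):
    """List common written forms of a date, assuming month/day/year order."""
    variants = []
    for sep in ("/", "-", "."):
        for month in (str(d.month), f"{d.month:02d}"):
            for day in (str(d.day), f"{d.day:02d}"):
                for year in (f"{d.year % 100:02d}", str(d.year)):
                    variants.append(f"{month}{sep}{day}{sep}{year}")
    name = MONTH_NAMES[d.month - 1]
    variants.append(f"{name} {d.day}, {d.year}")      # long month name
    variants.append(f"{name[:3]} {d.day}, {d.year}")  # three-letter abbreviation
    # Remove duplicates (possible for two-digit months or days), keeping order.
    return list(dict.fromkeys(variants))


variants = date_variants(date(2011, 1, 6))
print(len(variants))
print(variants[:4])
```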


5. Compound Words

Example: Watch-out

Boolean: Watchout OR Watch?out

“watch out”?


6. Noise Filter Issues

Example: The The

Boolean: “The The”
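
The noise filter problem can be sketched as follows: if the index drops common words such as "the", a phrase made up only of noise words has nothing left to match. The noise list and helper functions are hypothetical, and real engines handle noise words in different ways.

```python
NOISE_WORDS = {"the", "a", "an", "of", "and"}  # hypothetical noise word list


def index_terms(text, noise_words):
    """Lowercase and split the text, dropping any noise words."""
    return [w for w in text.lower().split() if w not in noise_words]


def phrase_in_index(phrase, text, noise_words):
    """True if the phrase survives noise filtering and appears in the text."""
    needle = index_terms(phrase, noise_words)
    haystack = index_terms(text, noise_words)
    if not needle:
        return False  # every word of the phrase was filtered out as noise
    n = len(needle)
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))


text = "Tickets for The The concert"

with_filter = phrase_in_index("The The", text, NOISE_WORDS)
without_filter = phrase_in_index("The The", text, set())

print(with_filter, without_filter)
```

With the noise list active the phrase cannot be found; with an empty noise list it can, which suggests checking the tool's noise word settings before running such a search.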


7. Improving Search Results for an Overused and Important Word

Example: When “confidential” is important as a search term and overused

Boolean:
confidential AND NOT (“communication is confidential” OR “confidentiality notice” OR “confidential personal”)
confidential AND NOT (confidential w/3 communication)
confidential AND NOT (confidential w/3 notice)
confidential AND NOT (confidential w/3 personal)
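
A minimal sketch of the phrase-exclusion version of this search, using hypothetical sample documents. It assumes NOT is applied to the whole document, so a document that contains both a boilerplate notice and a substantive use of "confidential" is excluded as well.

```python
BOILERPLATE = (
    "communication is confidential",
    "confidentiality notice",
    "confidential personal",
)

# Hypothetical sample documents.
DOCS = {
    1: "This communication is confidential and intended only for the addressee.",
    2: "Please keep the pricing model confidential until launch.",
    3: "Lunch menu for Friday.",
    4: "Keep the merger terms confidential. CONFIDENTIALITY NOTICE: do not forward.",
}


def confidential_not_boilerplate(text):
    """True if the text mentions 'confidential' and none of the boilerplate phrases."""
    lowered = text.lower()
    return "confidential" in lowered and not any(p in lowered for p in BOILERPLATE)


kept = [doc_id for doc_id, text in DOCS.items() if confidential_not_boilerplate(text)]

# Document 4 uses the word substantively but also carries a notice, so it is excluded.
print(kept)
```

Cases like document 4 are one reason to review a sample of what an exclusion removes.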


Statistical Sampling

Recent court opinions suggest that sampling as used in Assisted Review is not only useful but may be required in certain cases. Several decisions in the past few years have penalized lawyers for not sampling documents before they were produced (waiver of privilege) and for not sampling the documents that were not produced (omission of responsive data). In two landmark decisions, U.S. Magistrate Judges John M. Facciola and Paul W. Grimm issued key rulings discussing sampling. Specifically, they criticized counsel who hoped to be excused for inadvertent waiver of privilege because they did not sample the documents produced after keyword searches.

United States v. O’Keefe, 537 F. Supp. 2d 14 (D.D.C. 2008) (Judge Facciola)

Victor Stanley, Inc. v. Creative Pipe, Inc., 250 F.R.D. 251 (D. Md. 2008) (Judge Grimm)
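
As a rough illustration of what sampling involves, the sketch below draws a random sample of document ids and estimates, with a normal-approximation 95% interval, the share of the full set that a reviewer would flag. The function names, the sample size and the review counts are hypothetical, and the cited opinions do not prescribe this particular calculation.

```python
import math
import random


def sample_ids(doc_ids, sample_size, seed=0):
    """Draw a simple random sample of document ids without replacement."""
    rng = random.Random(seed)
    return rng.sample(list(doc_ids), sample_size)


def proportion_interval(flagged, n, z=1.96):
    """Normal-approximation interval for a proportion (z=1.96 is about 95%)."""
    p = flagged / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - margin), min(1.0, p + margin)


# Hypothetical review: 400 sampled documents, 12 flagged by the reviewer.
sample = sample_ids(range(10_000), 400)
low, high = proportion_interval(12, 400)

print(len(sample), round(low, 3), round(high, 3))
```

The normal approximation is coarse when very few documents are flagged, so an exact method may be preferable in that situation.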


Smoking Gun

Even more recently, another court found waiver of privilege in a “smoking gun” attorney-client communication because counsel failed to sample.

Mt. Hawley Ins. Co. v. Felman Prod., Inc., 2010 WL 1990555 (S.D. W. Va. May 18, 2010)


Q&A