Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Transcript of Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Page 1: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Data and Knowledge Representation

Lecture 1

Qing Zeng, Ph.D.

Page 2: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Introduction

Instructor, Harvard Medical School
Research Associate, Brigham and Women’s Hospital

Page 3: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

My Research

Semantic Knowledge-based System
Information retrieval
Information integration/presentation
Consumer Information Retrieval
Flow Cytometry-based Proteomics
Share Pathology Information Network

Page 4: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Main Textbook

Knowledge Representation: Logical, Philosophical, and Computational Foundations
by John F. Sowa

$74 from Amazon.com

Page 5: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Motivation

Representing data and knowledge for computing
Develop
Maintain
Share

Page 6: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Medical Data and Knowledge

Large variety of data and knowledge
Many possible representations
Implication of representation on computing

Page 7: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Example of Medical Data

This is a 51-year-old female admitted through the emergency room with syncopal episode with chest pain and also noted to have epigastric discomfort. The patient was admitted and started on Lovenox and nitroglycerin paste. The patient had serial cardiac enzymes and ruled out for myocardial infarction. The patient underwent a dual isotope stress test. There was no evidence of reversible ischemia on the Cardiolite scan. The patient has been ambulated. The patient had a Holter monitor placed but the report is not available at this time. The patient has remained hemodynamically stable. Will discharge.

Page 8: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Examples of Medical Knowledge

Nitrates are a safe and effective treatment that can be used in patients with angina and left ventricular systolic dysfunction.

On the basis of currently published evidence, amlodipine is the calcium channel antagonist that it is safest to use in patients with heart failure and left ventricular systolic dysfunction.

Coronary artery bypass grafting may be indicated, in some, for relief of angina.

All patients with heart failure and angina should be referred for specialist assessment.

Patients with angina and mild to moderately symptomatically severe heart failure that is well controlled, and who have no other contraindications to major surgery, should be considered for coronary artery bypass grafting on prognostic (as well as symptomatic) grounds.

Page 9: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Challenge

Philosophical difference
Domain difference
Application difference
Developer difference
Liability
Cost

Page 10: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Formalism and Conceptualization

Natural language is the most expressive form of formalism and conceptualization

Conceptualization is an abstract and simplified view of the world

Such simplification allows computers and humans alike to communicate in an unambiguous fashion (e.g. “and” vs. “&”)

Page 11: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Logic

A tool for reasoning
Provides basic concepts used in many computer science fields (AI, IR, DB, etc.)
Used in many medical applications

Page 12: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Propositional Logic

Proposition
Basic operators
Language
Truth table
Boolean Algebra

Page 13: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Proposition

A proposition is a symbolic variable whose value must be either True or False, and which stands for a natural language statement which could be either true or false

Examples:
A = Smith has chest pain
B = Smith is depressed
C = It is raining

Page 14: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Operators

Logical And
Inclusive Or
Exclusive Or
Logical Not
Logical Implication
Logical Equivalence

Page 15: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Logical And (∧)

A      B      A ∧ B
False  False  False
False  True   False
True   False  False
True   True   True

Page 16: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Inclusive Logical Or (∨)

A      B      A ∨ B
False  False  False
False  True   True
True   False  True
True   True   True

Page 17: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Exclusive Logical Or (⊕)

A      B      A ⊕ B
False  False  False
False  True   True
True   False  True
True   True   False
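
The inclusive and exclusive tables differ only in the row where both inputs are true. A minimal Python sketch makes that visible; the function names `inclusive_or` and `exclusive_or` are illustrative and do not come from the lecture.

```python
def inclusive_or(a: bool, b: bool) -> bool:
    # True when at least one operand is true.
    return a or b


def exclusive_or(a: bool, b: bool) -> bool:
    # True when exactly one operand is true.
    return a != b


# Print both operators side by side for every combination of inputs.
for a in (False, True):
    for b in (False, True):
        print(a, b, inclusive_or(a, b), exclusive_or(a, b))
```

Only the last printed row (both inputs true) shows different values in the two result columns.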

Page 18: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Inclusive vs. Exclusive

Natural language “Or” can mean either
Either discharge the patient, or admit him
I will take the medication, or the fever will get worse
Take 2 or 3 pills per day

Exclusive or is not often used (except in circuit design)

Page 19: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Medical Example

“Heart AND Lung disease”: do patients have to have both? Or either?

“Foot AND mouth disease”: what does “AND” mean in this case?

Further reading: Mendonca EA, Cimino JJ, Campbell KE, Spackman KA. Evaluation of a proposed method for representing drug terminology. Proc AMIA Symp. 1999;:47-51.

Page 20: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Logical Not (¬)

A      ¬A
False  True
True   False

Page 21: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Logical Implication (→)

A      B      A → B
False  False  True
False  True   True
True   False  False
True   True   True

Page 22: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Understanding “→”

This is an operator. Although we call it “imply” or “implication”, do not try to understand its semantics from the name. We could have called it “I” and still defined its semantics the same way.

A → B “means” A is sufficient, but not necessary, to make B true.

E.g. let A be “having a cold” and B be “drinking water”. A → B can be interpreted as “should drink water” when “having a cold”. However, you can drink water even when you don’t have a cold. Thus A → B is still true when A is not true.
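
Python has no built-in implication operator, but A → B has the same truth table as (¬A) ∨ B, so it can be sketched in one line. The helper name `implies` and the variables `has_cold` and `drinks_water` are made up for this illustration.

```python
def implies(a: bool, b: bool) -> bool:
    # A -> B is false only when A is true and B is false.
    return (not a) or b


# The cold/water example: drinking water without a cold does not break the rule.
has_cold = False
drinks_water = True
print(implies(has_cold, drinks_water))

# The only violating case: having a cold and not drinking water.
print(implies(True, False))
```

Whenever the first argument is False the result is True, which is the sense in which A is sufficient but not necessary for B.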

Page 23: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Logical Equivalence (↔)

A      B      A ↔ B
False  False  True
False  True   False
True   False  False
True   True   True

Page 24: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Understanding “→”

A → B is different from A = B.
A: a person is pregnant. B: a person is a woman.
In this case, A → B is true, but A = B is not.

Use formal logic to represent knowledge of the real world, not the other way around.
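
The pregnancy example can be checked against a small set of cases. The three records below are hypothetical, chosen only to cover the possible combinations; the sketch tests whether the implication and the equality hold for every record.

```python
# Hypothetical records: each is one possible combination of the two propositions.
people = [
    {"pregnant": True, "woman": True},
    {"pregnant": False, "woman": True},
    {"pregnant": False, "woman": False},
]

# A -> B: every pregnant person is a woman.
implication_holds = all((not p["pregnant"]) or p["woman"] for p in people)

# A = B: being pregnant and being a woman always have the same truth value.
equivalence_holds = all(p["pregnant"] == p["woman"] for p in people)

print(implication_holds, equivalence_holds)
```

The second record (a woman who is not pregnant) satisfies the implication but breaks the equality.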

Page 25: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Well-Formed Formulas

Formula
A term (string) in propositional logic

Well-formed formula (WFF)
A term that is constructed correctly according to propositional logic syntax rules

Page 26: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

WFF

Constants: False, True
Variables: P, Q, R
If a is a WFF, ¬a is a WFF
If a and b are WFFs, a ∧ b is a WFF
If a and b are WFFs, a ∨ b is a WFF
If a and b are WFFs, a → b is a WFF
If a and b are WFFs, a ↔ b is a WFF
Any formula that cannot be constructed using these rules is not a WFF
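
Because the WFF rules are recursive, a checker can follow them one rule per branch. The sketch below assumes a representation that is not part of the lecture: constants are Python booleans, variables are strings, and compound formulas are tuples such as `("and", a, b)`. The operator names are likewise illustrative.

```python
BINARY_OPS = {"and", "or", "implies", "iff"}


def is_wff(f) -> bool:
    # Constants: False, True
    if isinstance(f, bool):
        return True
    # Variables: any identifier that is not an operator name
    if isinstance(f, str):
        return f.isidentifier() and f != "not" and f not in BINARY_OPS
    if isinstance(f, tuple):
        # If a is a WFF, ("not", a) is a WFF
        if len(f) == 2 and f[0] == "not":
            return is_wff(f[1])
        # If a and b are WFFs, (op, a, b) is a WFF for each binary operator
        if len(f) == 3 and f[0] in BINARY_OPS:
            return is_wff(f[1]) and is_wff(f[2])
    # Anything that cannot be built by the rules above is not a WFF
    return False


print(is_wff(("and", "P", ("not", "Q"))))  # P and (not Q)
print(is_wff(("and", "P")))                # missing an operand
```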

Page 27: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Precedence of Logical Operators

From highest to lowest:
¬
∧
∨
→
↔
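
Python's `not`, `and`, and `or` bind in the same relative order as ¬, ∧, and ∨ (Python has no operators for → or ↔). The sketch below checks this over all eight assignments; the function names are illustrative.

```python
from itertools import product


def unparenthesized(a: bool, b: bool, c: bool) -> bool:
    # Relies on precedence alone.
    return not a or b and c


def explicit(a: bool, b: bool, c: bool) -> bool:
    # The grouping that precedence implies: (not a) or (b and c).
    return (not a) or (b and c)


def other_grouping(a: bool, b: bool, c: bool) -> bool:
    # A different grouping, shown only for contrast.
    return not ((a or b) and c)


assignments = list(product([False, True], repeat=3))
same = all(unparenthesized(*v) == explicit(*v) for v in assignments)
print(same)
print(explicit(True, False, False), other_grouping(True, False, False))
```

The last line prints two different values for the same inputs, which is why explicit parentheses are worth adding whenever the intended grouping is in doubt.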

Page 28: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Let’s Try an Example

Order Test A for all males over 70, smokers with a family history of cancer, and women with chronic cough and a family history of cancer. Otherwise, do not order it.

Male: a person being male
Old: a person being over 70
Smoker: a person being a smoker
Cough: a person having chronic cough
FHC: a person having family history of cancer
OrderA: Order Test A

(Male ∧ Old) ∨ (Smoker ∧ FHC) ∨ (¬Male ∧ Cough ∧ FHC) ↔ OrderA
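
Since the formula uses ↔, OrderA is true exactly when the left-hand side is true, so the rule can be sketched as a Boolean function. The function name and parameter names below are illustrative, and the sample patients are invented.

```python
def should_order_test_a(male: bool, old: bool, smoker: bool,
                        cough: bool, fhc: bool) -> bool:
    # (Male and Old) or (Smoker and FHC) or (not Male and Cough and FHC)
    return (male and old) or (smoker and fhc) or ((not male) and cough and fhc)


# Hypothetical patients:
# a man over 70 with no other risk factors
print(should_order_test_a(male=True, old=True, smoker=False, cough=False, fhc=False))
# a woman with chronic cough and family history of cancer
print(should_order_test_a(male=False, old=False, smoker=False, cough=True, fhc=True))
# a younger man who smokes but has no family history of cancer
print(should_order_test_a(male=True, old=False, smoker=True, cough=False, fhc=False))
```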

Page 29: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Examples

Smokers are those who are currently smoking or had quit smoking for less than 6 months

A document is completed only after signed by both the chief resident and the attending physician.

Smith is depressed whenever it rains

Page 30: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

A Few Comments

Use parentheses if precedence is not clear

Very similar to programming language operators’ precedence

Precedence in natural language depends more on context
E.g. “no heart and lung disease”
E.g. “no family history and healthy life style”.

Page 31: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Truth Table

An easy way to evaluate propositions

A  B  A ∨ B  ¬B  (A ∨ B) ∧ ¬B
0  0  0      1   0
0  1  1      0   0
1  0  1      1   1
1  1  1      0   0
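
A truth table like this one can be generated mechanically by enumerating every assignment. The sketch below uses `itertools.product` from the Python standard library; the helper name `formula` is illustrative.

```python
from itertools import product


def formula(a: bool, b: bool) -> bool:
    # (A or B) and (not B)
    return (a or b) and (not b)


# One row per assignment: A, B, A or B, not B, and the full formula.
rows = [
    (a, b, a or b, not b, formula(a, b))
    for a, b in product([False, True], repeat=2)
]

# Print each row using 0/1, in the same column order as the table.
for row in rows:
    print(" ".join(str(int(value)) for value in row))
```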

Page 32: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Let’s Try an Example

Order Test A for all males over 70, smokers with a family history of cancer, and women with chronic cough and a family history of cancer. Otherwise, do not order it.

(Male ∧ ¬Young) ∨ (Smoker ∧ FHC) ∨ (¬Male ∧ Cough ∧ FHC) ↔ OrderA

Male  Young (<=70)  Smoker  FHC  Cough  Order Test A
T     T             T       T    T      T
T     T             T       T    F      T
T     T             T       F    T      F
T     T             T       F    F      F
T     T             F       T    T      F
…
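
With five variables the full table has 2^5 = 32 rows, so it is easier to enumerate than to write by hand. A sketch under the same column order as the slide (Male, Young, Smoker, FHC, Cough); the names `order_a` and `table` are illustrative.

```python
from itertools import product


def order_a(male: bool, young: bool, smoker: bool, fhc: bool, cough: bool) -> bool:
    # (Male and not Young) or (Smoker and FHC) or (not Male and Cough and FHC)
    return (male and (not young)) or (smoker and fhc) or ((not male) and cough and fhc)


# Enumerate True before False so the rows appear in the slide's order.
table = [(*values, order_a(*values)) for values in product([True, False], repeat=5)]

# Show the first five rows using T/F.
for row in table[:5]:
    print(" ".join("T" if value else "F" for value in row))
```

The first five enumerated rows correspond to the five rows shown on the slide; the remaining 27 follow the same pattern.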

Page 33: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Tautology and Contradiction

Male ∨ ¬Male
Tautology: proposition that is always true

Healthy ∧ ¬Healthy
Contradiction: proposition that is always false
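
For a small number of variables, both properties can be tested by brute force over every assignment. The helpers `is_tautology` and `is_contradiction` below are illustrative names, not part of the lecture.

```python
from itertools import product


def is_tautology(formula, n_vars: int) -> bool:
    # True for every possible assignment of the variables.
    return all(formula(*values) for values in product([False, True], repeat=n_vars))


def is_contradiction(formula, n_vars: int) -> bool:
    # False for every possible assignment of the variables.
    return not any(formula(*values) for values in product([False, True], repeat=n_vars))


print(is_tautology(lambda male: male or not male, 1))
print(is_contradiction(lambda healthy: healthy and not healthy, 1))
print(is_tautology(lambda a, b: a or b, 2))
```

This approach checks 2^n assignments for n variables, so it is practical only for small formulas.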

Page 34: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Extra Reading

Aho’s book, chapter 12
Sowa’s book, p. 1-39

Page 35: Data and Knowledge Representation Lecture 1 Qing Zeng, Ph.D.

Homework