Applying Natural Language Generation to Electronic Health Records in an e-Science context Donia...


Page 1: Applying Natural Language Generation to Electronic Health Records in an e-Science context Donia Scott Centre for Research in Computing The Open University.

Applying Natural Language Generation to Electronic Health Records in an e-Science context

Donia Scott
Centre for Research in Computing
The Open University

Page 2

Outline

• Background: the CLEF project
• Patient records as data-encoded patient histories
• Role of NLG in CLEF
• Intuitive querying with natural language
• Generating tailored reports from CLEF data

Page 3

Background: the CLEF project

CLEF (Clinical E-Science Framework) is an MRC-funded project that aims to provide a repository of well-organised, data-encoded clinical histories

Aim: to provide the framework for a new type of medical research: in silico experiments

Partners:
• NLP: OU, Sheffield
• Medical informatics: Manchester
• Electronic Health Records: Royal Marsden Hospital, UCL
• Privacy/confidentiality: Cambridge

Page 4

• Collect clinical information from multiple sites
• Analyse, structure and integrate it
• Make it available, using GRID tools
• To authorised clinicians and e-Health scientists
• In a secure and ethical collaborative framework

[Figure label: GRID]

Page 5

The CLEF repository

[Figure labels: Chronicle; Repository]

Organised data on individual patients

Data from:
• Referral letters
• Review notes
• Lab results
• Nurse notes
• Hospital admission notes
• Hospital discharge notes
• Treatment notes
• Surgery reports

Page 6

The CLEF Chronicle

Representing the story of a patient over time

Page 7

The story of an illness

[Figure: a patient's history as a graph of events laid out along a time axis. Nodes: Human:1382, Mass:1666, Pain:5735, Radio:1812, Chemo:6502, Ulcer:1945, Cancer:1914, Breast:1492, Clinic:4096, Biopsy:1066, Clinic:1024, Clinic:2010. Edge labels: locus, plans, treats, target, attends, finding, reason.]

Page 8

A typical cancer patient

Problems (~200)
Columns: NodesInvolved | NodesCounted | TumourMarker | Histology | Grade | mmSize | Genotype | ClinicalCourse | ExistenceStatus | Name | EventEndDate | EventStartDate | ID | SimID
000 | abnormality | 457 | 457 | 2342512002 | 3320133511
050 | metastatic lymphnode count | 449 | 449 | 2342511996 | 3320133511
00 | oestrogen receptor +ve | invasive tubular adeno | 00 | BRCA1 +ve | 1 -1 | cancer | 449 | 449 | 2342511993 | 3320133511
00 | oestrogen receptor +ve | invasive tubular adeno | 00 | BRCA1 +ve | -1 | stage1 cancer | 449 | 449 | 2342511989 | 3320133511
000 | abnormality | 446 | 446 | 2342511984 | 3320133511
000 | enlargement | 446 | 446 | 2342511982 | 3320133511
000 | enlargement | 446 | 446 | 2342511980 | 3320133511
000 | lymphadenopathy | 446 | 446 | 2342511979 | 3320133511
000 | recurrent cancer | 446 | 446 | 2342511978 | 3320133511
000 | abnormality | 446 | 446 | 2342511959 | 3320133511
000 | abnormality | 446 | 446 | 2342511955 | 3320133511
00 | oestrogen receptor +ve | invasive tubular adeno | 00 | BRCA1 +ve | -1 | cancer | 443 | 443 | 2342511948 | 3320133511
000 | cancer | 287 | 287 | 2342511944 | 3320133511
000 | cancer | 131 | 131 | 2342511940 | 3320133511
00 | oestrogen receptor +ve | invasive tubular adeno | 05.8 | BRCA1 +ve | -1 | primary cancer | 1 | 2342511936 | 3320133511

Interventions (15)
Columns: Outcome | Status | Name | EventEndDate | EventStartDate | ID | SimID
unsuccessful | completed | relapse treatment package | 347 | 298 | 2342512475 | 3320133512
- | completed | chemotherapy cycle | 225 | 225 | 2342512383 | 3320133512
- | completed | chemotherapy cycle | 224 | 224 | 2342512382 | 3320133512
- | completed | packed red cell transfusion | 222 | 222 | 2342512381 | 3320133512
- | deferred | chemotherapy cycle | 222 | 222 | 2342512379 | 3320133512
- | completed | chemotherapy cycle | 221 | 221 | 2342512378 | 3320133512
- | completed | packed red cell transfusion | 219 | 219 | 2342512377 | 3320133512
- | deferred | chemotherapy cycle | 219 | 219 | 2342512375 | 3320133512
- | completed | chemotherapy cycle | 217 | 217 | 2342512373 | 3320133512
- | completed | chemotherapy cycle | 216 | 216 | 2342512350 | 3320133512
- | completed | radiotherapy cycle | 214 | 214 | 2342512349 | 3320133512
- | completed | radiotherapy cycle | 213 | 213 | 2342512348 | 3320133512
- | completed | chemotherapy course | 225 | 218 | 2342512317 | 3320133512
- | completed | radiotherapy course | 215 | 211 | 2342512316 | 3320133512
incomplete excision | completed | radical mastectomy | 197 | 197 | 2342512290 | 3320133512
successful | completed | primary treatment package | 205 | 197 | 2342512287 | 3320133512
- | started | hormone antagonist therapy | 0 | 449 | 2342511997 | 3320133512
complete excision | completed | lumpectomy | 449 | 449 | 2342511990 | 3320133512
successful | completed | primary treatment package | 457 | 449 | 2342511987 | 3320133512

Investigations (~100)
Columns: Status | Name | EventEndDate | EventStartDate | ID | SimID
completed | examination | 465 | 465 | 2342512031 | 3320133511
completed | testing | 465 | 465 | 2342512022 | 3320133511
completed | Xray | 465 | 465 | 2342512021 | 3320133511
completed | examination | 457 | 457 | 2342512013 | 3320133511
completed | examination | 457 | 457 | 2342512012 | 3320133511
completed | examination | 457 | 457 | 2342512010 | 3320133511
completed | testing | 457 | 457 | 2342512001 | 3320133511
completed | Xray | 457 | 457 | 2342512000 | 3320133511
completed | excision biopsy | 449 | 449 | 2342511994 | 3320133511
completed | histopathology | 449 | 449 | 2342511992 | 3320133511
completed | excision biopsy | 449 | 449 | 2342511991 | 3320133511
completed | cancer staging | 449 | 449 | 2342511988 | 3320133511
completed | examination | 446 | 446 | 2342511976 | 3320133511
completed | examination | 446 | 446 | 2342511974 | 3320133511
completed | examination | 446 | 446 | 2342511973 | 3320133511
completed | examination | 446 | 446 | 2342511971 | 3320133511
completed | testing | 446 | 446 | 2342511953 | 3320133511
completed | Xray | 446 | 446 | 2342511951 | 3320133511
completed | Xray | 443 | 443 | 2342511947 | 3320133511
completed | Xray | 287 | 287 | 2342511943 | 3320133511
completed | Xray | 131 | 131 | 2342511939 | 3320133511

Drugs (~5)
Columns: Regime | Name | ID | SimID
daily | epirubicin | 2342512818 | 3320133512
daily | doxorubicin | 2342512479 | 3320133512
daily | 5-fluorouracil | 2342512320 | 3320133512
daily | cyclophosphamide | 2342512319 | 3320133512
daily | epirubicin | 2342512318 | 3320133512

Consults (~10)
Columns: Location | Type | Status | EventEndDate | EventStartDate | ID | SimID
clinic | mammography screening | scheduled | 0 | 0 | 2342512229 | 3320133511
clinic | mammography screening | completed | 841 | 841 | 2342512222 | 3320133511
clinic | follow up | completed | 737 | 737 | 2342512196 | 3320133511
clinic | follow up | completed | 633 | 633 | 2342512152 | 3320133511
clinic | follow up | completed | 545 | 545 | 2342512108 | 3320133511
clinic | follow up | completed | 489 | 489 | 2342512064 | 3320133511
clinic | follow up | completed | 465 | 465 | 2342512020 | 3320133511
clinic | initial treatment planning | completed | 449 | 449 | 2342511986 | 3320133511
clinic | mammography screening | completed | 443 | 443 | 2342511946 | 3320133511
clinic | mammography screening | completed | 131 | 131 | 2342511938 | 3320133511

Loci (~20)
Columns: Laterality | Name | ID | SimID
- | bone metabolism | 2347911414 | 3322572593
L | brain | 2347911319 | 3322572593
L | lung | 2347911294 | 3322572593
R | lung | 2347911292 | 3322572593
- | brain | 2347911268 | 3322572593
R | axilla | 2347911090 | 3322572593
- | spleen | 2347911072 | 3322572593
- | liver | 2347911070 | 3322572593
- | abdomen | 2347911065 | 3322572593
R | axillary lymphnodes | 2347911062 | 3322572593
- | ESR concentration | 2347911060 | 3322572593
- | Creatinine concentration | 2347911058 | 3322572593
- | Alkaline Phosphatase concentration | 2347911056 | 3322572593
- | Bilirubin concentration | 2347911054 | 3322572593
- | GGT concentration | 2347911052 | 3322572593
- | platelet count | 2347911050 | 3322572593
- | leucocyte count | 2347911048 | 3322572593
- | haemoglobin concentration | 2347911046 | 3322572593
- | blood | 2347911044 | 3322572593
- | chest | 2347911042 | 3322572593
R | breast | 2347911036 | 3322572593

Relations (~600)
Columns: Item2ID | Item2Type | Relation | Item1ID | Item1Type | SimID
2342511955 | PROBLEM | HAS_FINDING | 2342511953 | INVESTIGATION | 3320133511
2342511954 | LOCUS | HAS_TARGET | 2342511953 | INVESTIGATION | 3320133511
2342511950 | CONSULT | RECOMMENDED_BY | 2342511953 | INVESTIGATION | 3320133511
2342511936 | PROBLEM | INDICATED_BY | 2342511953 | INVESTIGATION | 3320133511
3320133511 | PATIENT | HAS_LOCUS | 2342511952 | LOCUS | 3320133511
2342511950 | CONSULT | RECOMMENDED_BY | 2342511951 | INVESTIGATION | 3320133511
2342511936 | PROBLEM | INDICATED_BY | 2342511951 | INVESTIGATION | 3320133511
2342511985 | CONSULT | ARRANGED | 2342511950 | CONSULT | 3320133511
2342511937 | LOCUS | HAS_LOCUS | 2342511948 | PROBLEM | 3320133511
2342511948 | PROBLEM | HAS_FINDING | 2342511947 | INVESTIGATION | 3320133511
2342511949 | CONSULT | ARRANGED | 2342511946 | CONSULT | 3320133511
2342511937 | LOCUS | HAS_LOCUS | 2342511944 | PROBLEM | 3320133511
2342511944 | PROBLEM | HAS_FINDING | 2342511943 | INVESTIGATION | 3320133511
2342511937 | LOCUS | HAS_TARGET | 2342511943 | INVESTIGATION | 3320133511
2342511945 | CONSULT | ARRANGED | 2342511942 | CONSULT | 3320133511
2342511937 | LOCUS | HAS_LOCUS | 2342511940 | PROBLEM | 3320133511
2342511940 | PROBLEM | HAS_FINDING | 2342511939 | INVESTIGATION | 3320133511
2342511937 | LOCUS | HAS_TARGET | 2342511939 | INVESTIGATION | 3320133511
2342511941 | CONSULT | ARRANGED | 2342511938 | CONSULT | 3320133511
3320133511 | PATIENT | HAS_LOCUS | 2342511937 | LOCUS | 3320133511
2342511937 | LOCUS | HAS_LOCUS | 2342511936 | PROBLEM | 3320133511

Page 9

The role of NLG

• An intuitive query interface to provide efficient access to aggregated data-encoded patient histories for:
  • Assisting in diagnosis and treatment
  • Identifying patterns in treatment
  • Selecting subjects for clinical trials
• Generating reports from the data-encoded histories, for clinicians to use at the point of care

Page 10

Intuitive querying of the CLEF repository

Page 11

What does the CLEF database provide

• Evidence from about 20,000 patient records, comprising 3.5 million record components (about 5GB of data). These are all in the area of cancer.
• 162 queriable fields
• Various text-only records (non-queriable)
• Two types of data:
  • Structured
  • Extracted from narratives by IE
• Queriable data is encoded according to various medical terminologies (SNOMED, ICD, UMLS)
• There are approximately 19,500 different medical codes currently used in the database (a relatively small subset of SNOMED and ICD)

Page 12

Queriable data

Structured data:
• Demographics: age, gender, postal district, ethnic group, occupation
• Laboratory findings:
  • 32 types of haematology findings
  • 51 types of chemistry findings
  • Cytology reports
  • Histopathology reports
• Imaging studies: radiology procedure, site, diagnosis, morphology, topography, report, indication, department
• Treatments:
  • Prescription drugs
  • Chemotherapy protocol
  • IV chemotherapy
  • Radiotherapy
  • Surgical procedures
• Diagnoses:
  • Clinical diagnosis
  • Cause(s) of death

Data extracted from narratives

Page 13

Query interface requirements

Designed for:
• Casual and moderate users, who are familiar with the semantic domain of the repository but not with its technical implementation
• Typically clinicians or medical researchers

Should be able to:
• Allow the construction of complex queries with nested structures and temporal expressions
• Minimise the risk of ambiguities
• Offer good coverage of the data types in the CLEF database

Should be used with:
• Minimal training
• No prior knowledge of medical terminologies, formal querying languages or databases

Page 14

Typical queries

• “How many patients with AML have had a normal count after two cycles of treatment?”
• “How many patients with primary breast cancer have relapsed in the last five years?”
• “What is the median time between first drug treatment for metastatic breast cancer and death?”
• “In breast cancer patients, what is the incidence of lymphoedema of the arm that persists more than two years after primary surgical treatment?”
• “What is the average number of x-rays for patients with prostate cancer?”
• “What is the average time between first treatment for cervical cancer and death for patients aged less than 60 at death compared with those aged over 60?”
• “How many patients between the ages of 40 and 60 when they were first diagnosed with lung cancer had a platelet count higher than 300 but a white cell count lower than 3 before the 4th cycle of any course of chemotherapy they received during treatment?”

Page 15

Querying alternatives

SQL:
• Not appropriate for the typical CLEF user
• Requires deep knowledge of the database structure and content, and of the medical terminologies used in the database

Graphical interfaces:
• Have to cope with a large number of parameters
• Nested structures and temporal restrictions are difficult to express

Natural Language interfaces:
• More natural and more expressive than formal querying languages, but…
• Sensitive to errors in composition, spelling, vocabulary
• Normally understand only a subset of natural language
• Complex queries are difficult to process
• It is difficult to trace the source of errors in the result

Page 16

The CLEF approach

• Similar to Natural Language interfaces; however, the user edits the conceptual meaning of a query instead of its surface text
• Allows users to easily construct non-ambiguous queries
• Guides users towards constructing only correct queries (queries compatible with the content of the database)
• It is semi-database independent but very domain specific
• Based on the Conceptual Authoring (aka WYSIWYM) technique (Power and Scott, 1998)
• The query is presented to the user as an interactive text, and it is edited by making selections on various components of the query
• Each selection triggers a text re-generation process, which results in a new feedback text containing the selection the user made

Page 17

Query editing

Page 18

Modelling queries

There are 4 distinct sections of a query:
• A description of the subjects (in terms of demographics information and basic diagnosis)
• A description of treatments that the subjects received
• A description of laboratory findings
• An outcome section (what do we want from the group of patients we have just described)

Each query element can be expressed as a conjunction or disjunction of same-type query elements, e.g.:
• Cancer of the breast and of the lung
• Patients who received chemotherapy and radiotherapy

Some query elements can be temporally related to each other, e.g.:
• Patients who received chemotherapy within 5 months of surgery
• Patients alive 5 years after the diagnosis
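The four query sections and the conjunction/disjunction of same-type elements could be modelled roughly as below. This is a minimal sketch only: the names `Element`, `Group`, `Query` and `render` are illustrative assumptions, not the CLEF data model, and temporal relations are left out.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Element:
    """A single query element, e.g. a diagnosis or a treatment."""
    kind: str  # hypothetical type tag: "diagnosis", "treatment", "finding"
    text: str


@dataclass
class Group:
    """A conjunction or disjunction of same-type query elements."""
    operator: str  # "and" or "or"
    members: List[Element]

    def __post_init__(self) -> None:
        if self.operator not in ("and", "or"):
            raise ValueError("operator must be 'and' or 'or'")
        if len({m.kind for m in self.members}) > 1:
            raise ValueError("a group may only combine same-type elements")


@dataclass
class Query:
    """The four sections of a query."""
    subjects: List[Union[Element, Group]] = field(default_factory=list)
    treatments: List[Union[Element, Group]] = field(default_factory=list)
    findings: List[Union[Element, Group]] = field(default_factory=list)
    outcome: str = ""


def render(item: Union[Element, Group]) -> str:
    """Produce a plain-text rendering of one element or group."""
    if isinstance(item, Element):
        return item.text
    return f" {item.operator} ".join(m.text for m in item.members)


treatments = Group("and", [Element("treatment", "chemotherapy"),
                           Element("treatment", "radiotherapy")])
query = Query(treatments=[treatments], outcome="number of patients")
print(render(treatments))  # chemotherapy and radiotherapy
```

Rejecting mixed-type groups at construction time mirrors the "same-type" restriction stated on the slide.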

Page 19

Constraining user choices

At each step, users are only given correct choices

Choices are context dependent:
• Patients diagnosed with [some cancer] in [some body part]
• User selects [some cancer] => “squamous cell carcinoma”
• The interface restricts the choices available for [some body part] to those sites where squamous cell carcinoma can develop
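The context-dependent restriction above can be sketched as a lookup from the selected cancer type to its compatible sites. The table `SITES_BY_CANCER` and the function `choices_for_body_part` are hypothetical; the entries are illustrative only, not taken from the CLEF database and not a clinical reference.

```python
from typing import Dict, List, Optional

# Illustrative compatibility table (an assumption for this sketch).
SITES_BY_CANCER: Dict[str, List[str]] = {
    "squamous cell carcinoma": ["skin", "lung", "oesophagus", "cervix"],
    "adenocarcinoma": ["breast", "lung", "colon", "prostate"],
}
ALL_SITES = sorted({s for sites in SITES_BY_CANCER.values() for s in sites})


def choices_for_body_part(selected_cancer: Optional[str]) -> List[str]:
    """Return the options offered for the [some body part] slot."""
    if selected_cancer is None:
        # Nothing selected yet: every known site is still a valid choice.
        return ALL_SITES
    return sorted(SITES_BY_CANCER.get(selected_cancer, []))


print(choices_for_body_part("squamous cell carcinoma"))
# ['cervix', 'lung', 'oesophagus', 'skin']
```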

Page 20

Dealing with ambiguities

Once a query is constructed, there is only one way it can be interpreted; there is no disambiguation task to be performed

… but users may be misled into constructing a different query from the one they intended

Page 21

Answer generation

• The answer set consists of an age/gender breakdown of the patients that fulfil the query requirements
• Each additional clinical feature is combined with the age/gender breakdown to provide more detailed information
• 3 types of rendering:
  • Text
  • Charts
  • Table
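An age/gender breakdown of the kind described above could be computed as follows. This is a sketch under assumptions: the function name `age_gender_breakdown`, the ten-year age bands and the sample patients are all invented for illustration.

```python
from collections import Counter
from typing import Iterable, Tuple


def age_gender_breakdown(patients: Iterable[Tuple[int, str]],
                         band_width: int = 10) -> Counter:
    """Count patients per (age band, gender) pair."""
    counts: Counter = Counter()
    for age, gender in patients:
        low = (age // band_width) * band_width
        counts[(f"{low}-{low + band_width - 1}", gender)] += 1
    return counts


# Made-up (age, gender) pairs standing in for the patients matching a query.
matches = [(43, "F"), (47, "F"), (52, "M"), (58, "F")]
table = age_gender_breakdown(matches)
for (band, gender), n in sorted(table.items()):
    print(band, gender, n)
```

The same counts could then feed any of the three renderings (text, chart or table).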

Page 22

Evaluation

Research questions:
• Can the WYSIWYM query formulation method be easily learned by users of CLEF?
• Is it easier to formulate CLEF queries in SQL or with the WYSIWYM query formulation method?
• Are the interactive feedback texts ambiguous?

Page 23

Evaluation results show that…

• The CLEF Conceptual Authoring query interface works!
• The method is easily acquired.
• Investigation shows that it is much easier to use than current alternatives (viz. SQL).
• The feedback texts tend to be easily understood.
• It is a viable solution to querying the CLEF repository.

However…

Page 24

Unresolved issues

• Are the queries we currently support really the ones users will want to ask?
• Does the query interface provide sufficient data coverage?

Page 25

Generating reports from the CLEF repository

Page 26

The context

• We aim to generate reports from the data-encoded Electronic Patient Records
• Our reports are aimed at clinicians for use at the point of care
• Various types of report work on the same input (roughly the same content) but express information from different viewpoints
• We address the problem of conceptual restatement in generating summarised reports

Page 27

Typical input

The same patient data tables as on Page 8 (A typical cancer patient):
• Problems (~200)
• Interventions (15)
• Investigations (~100)
• Drugs (~5)
• Consults (~10)
• Loci (~20)
• Relations (~600)

Page 28

Why are textual reports needed?

• Clinicians and other health professionals use patient health summaries at the point of care, where time is a critical resource
• Reports provide quick access to an overview of a patient’s medical history
• Typically, an electronic patient record contains around 1000 messages
  • Even when structured, this volume of data is very large
  • Access to relevant information about particular patients is difficult

Textual reports:
• are easy to read and understand
• can be customised to the type of information needed
• provide a quick way of identifying errors in the patient record
• alleviate the need to know in detail the structure of the underlying database

Page 29

Why are paraphrases needed?

Alternative views of the patient record, i.e., reports from various viewpoints:
• Full chronological reports
• Summaries of investigations, interventions, treatments
• Same content, different textual representation

Potted summaries are also important (30-second overview of patient’s history)

Page 30

Content selection

• Two notions:
  • Spine events: the main concepts in the summary (depending on user-defined type of summary)
  • Skeleton events: linked to the spine by various relations
• Basic procedure:
  • Step 1: Group linked events into clusters and remove small clusters
    • Typically, a small number of very large clusters and a small number of small clusters
    • Small clusters are assumed not to be related to the main topic of the summary
  • Step 2: Identify spine events according to the type of summary (Longitudinal, Investigations, Interventions, Problems)
  • Step 3: Identify the skeleton events
    • If (“problem is spine event” and “investigation has_indication problem”) then select investigation (unless already selected)
    • Repeat step 2 a certain number of times (given by a threshold parameter)
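The three steps above can be sketched as connected-component clustering followed by a bounded expansion from the spine. This is a simplified illustration rather than the CLEF implementation: the names `clusters` and `select_content` and the parameters `min_cluster_size` and `depth` are assumptions, any relation is followed when collecting skeleton events (the slide's rule is relation-specific), and the threshold-controlled repetition is modelled as repeating the expansion step, which is one possible reading.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Event = str                     # event identifier
Relation = Tuple[Event, Event]  # a link between two events


def clusters(events: List[Event], relations: List[Relation]) -> List[Set[Event]]:
    """Step 1a: group linked events into connected components."""
    neighbours: Dict[Event, Set[Event]] = defaultdict(set)
    for a, b in relations:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen: Set[Event] = set()
    result: List[Set[Event]] = []
    for start in events:
        if start in seen:
            continue
        component: Set[Event] = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        result.append(component)
    return result


def select_content(event_types: Dict[Event, str], relations: List[Relation],
                   spine_type: str, min_cluster_size: int = 3,
                   depth: int = 1) -> Set[Event]:
    # Step 1b: drop small clusters.
    kept: Set[Event] = set()
    for cluster in clusters(list(event_types), relations):
        if len(cluster) >= min_cluster_size:
            kept |= cluster
    # Step 2: spine events are the kept events of the requested type.
    selected = {e for e in kept if event_types[e] == spine_type}
    # Step 3: add events linked to already selected ones, `depth` times.
    for _ in range(depth):
        frontier: Set[Event] = set()
        for a, b in relations:
            if a in selected and b in kept:
                frontier.add(b)
            if b in selected and a in kept:
                frontier.add(a)
        selected |= frontier
    return selected


# Made-up example record.
event_types = {
    "pain": "problem", "lump": "problem", "cancer": "problem",
    "mammogram": "investigation", "biopsy": "investigation",
    "radiotherapy": "intervention",
    "headache": "problem", "paracetamol": "drug",
}
relations = [("mammogram", "pain"), ("mammogram", "lump"), ("biopsy", "lump"),
             ("biopsy", "cancer"), ("radiotherapy", "cancer"),
             ("paracetamol", "headache")]
print(sorted(select_content(event_types, relations, "intervention")))
# ['cancer', 'radiotherapy']
```

In this example the headache/paracetamol pair forms a small cluster and is dropped before any spine is chosen.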

Page 31

Spine of Problem events

Page 32

[Figure: event graph with nodes pain, cancer, breast, radiotherapy cycle, hyperbaric oxygenation, radiotherapy, lump, mammogram, biopsy, ulcer]

Problem

The patient identifies pain in the left breast. A lump in the breast is found through a mammogram.

A biopsy performed on the breast reveals cancer in the left breast. The patient receives radiotherapy to treat the cancer. Skin ulceration develops in the left breast as a result of radiotherapy, which is treated with hyperbaric oxygenation.

Page 33

[Figure: event graph with nodes pain, breast, radiotherapy cycle, hyperbaric oxygenation, radiotherapy, lump, mammogram, biopsy, cancer, ulcer]

Interventions

Radiotherapy on the breast is initiated to treat cancer in the breast. A first radiotherapy cycle is performed.

The radiotherapy causes skin ulceration. The patient receives hyperbaric oxygenation to treat the ulcer.

Page 34

[Figure: event graph with nodes pain, breast, radiotherapy cycle, hyperbaric oxygenation, radiotherapy, lump, mammogram, biopsy, cancer, ulcer]

Investigations

A mammogram is performed because of pain in the left breast, which identifies a lump in the breast. A biopsy of the lump identifies cancer in the left breast.

Page 35

[Figure: the event graph from the previous three slides (nodes pain, cancer, breast, radiotherapy cycle, hyperbaric oxygenation, radiotherapy, lump, mammogram, biopsy, ulcer) shown three times, labelled Problem, Interventions and Investigations]

Page 36: Applying Natural Language Generation to Electronic Health Records in an e-Science context Donia Scott Centre for Research in Computing The Open University.

Discourse structuring

Mostly given by relations in the EPR

19 different types of relations, which can be:

Attributive: Problem has_locus Locus

Rhetorical: Problem caused_by Intervention

Attributive relations do not contribute to the discourse structure

In a first step, events linked through attributive relations are combined:

Message_Problem+Message_Locus =>Message_Problem_Locus

Messages are grouped according to type of summary:

Longitudinal: events occurring in the same week should be grouped together and further grouped into years

Logical: arrange chronologically and then group similar events (e.g., liver panels, screening consults)
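The first two steps on this slide (merging events linked by an attributive relation, then grouping messages for a longitudinal summary) can be sketched roughly as follows. This is an illustration only, not the CLEF implementation: the `Message` class, its field names, the helper functions and the 52-week year are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

WEEKS_PER_YEAR = 52  # assumption: the slides do not say how weeks map to years


@dataclass(frozen=True)
class Message:
    """Hypothetical message type: one event taken from the patient record."""
    week: int
    kind: str                    # e.g. "Problem", "Investigation"
    name: str
    locus: Optional[str] = None


def attach_locus(message: Message, locus: str) -> Message:
    """Message_Problem + Message_Locus => Message_Problem_Locus."""
    return Message(message.week, message.kind, message.name, locus)


def group_longitudinal(messages):
    """Group messages by week, then group the weeks by (0-based) year index."""
    years = defaultdict(lambda: defaultdict(list))
    for m in sorted(messages, key=lambda m: m.week):
        years[m.week // WEEKS_PER_YEAR][m.week].append(m)
    return {year: dict(weeks) for year, weeks in years.items()}


messages = [
    attach_locus(Message(443, "Problem", "cancer"), "right breast"),
    Message(446, "Investigation", "examination"),
    Message(0, "Investigation", "mammography screening"),
]
grouped = group_longitudinal(messages)
```

Here `grouped` maps a year index to a dictionary of weeks, each holding the messages for that week. A "logical" summary would instead sort chronologically and then group by event type.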


Discourse structuring

Within each group:

link messages by discourse relations inferred from EPR relations: Cause, Result, Sequence

assume a List relation if no relation specified

Between groups:

If all events in one group are linked to events in another group by some EPR relation, link groups through the corresponding discourse relation

Otherwise, assume a List relation
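The linking rules above can be sketched as two small functions. This is a hedged illustration, not the CLEF code: the EPR relation names in `EPR_TO_DISCOURSE`, the pair-keyed dictionary of relations, and the choice to fall back to List when a group pair is linked by more than one kind of relation are all assumptions.

```python
# Hypothetical mapping from EPR relation names to discourse relations.
EPR_TO_DISCOURSE = {
    "caused_by": "Cause",
    "has_result": "Result",
    "followed_by": "Sequence",
}


def link_within_group(events, epr_relations):
    """Return (event_a, event_b, discourse_relation) for adjacent events.

    epr_relations maps an (event_a, event_b) pair to an EPR relation name.
    Pairs with no recorded relation fall back to List.
    """
    links = []
    for a, b in zip(events, events[1:]):
        epr = epr_relations.get((a, b))
        links.append((a, b, EPR_TO_DISCOURSE.get(epr, "List")))
    return links


def link_groups(group_a, group_b, epr_relations):
    """Link two groups if every event in group_a is related to some event
    in group_b by one and the same EPR relation; otherwise assume List."""
    found = set()
    for a in group_a:
        rels = {epr_relations[(a, b)] for b in group_b if (a, b) in epr_relations}
        if not rels:
            return "List"
        found |= rels
    if len(found) == 1:
        return EPR_TO_DISCOURSE.get(found.pop(), "List")
    return "List"


events = ["radiotherapy", "ulcer", "hyperbaric oxygenation"]
relations = {("radiotherapy", "ulcer"): "has_result"}
links = link_within_group(events, relations)
```

In this example the first pair is linked by Result, and the second pair, which has no recorded relation, defaults to List.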


Text structuring

Aggregation

Problems:

Problem_1:name HAS_LOCUS Locus_1

Problem_2:name HAS_LOCUS Locus_2

=> Problem_3 HAS_LOCUS {Locus_1, Locus_2}

Enlargement of the liver + Enlargement of the spleen => Enlargement of the liver and/but not of the spleen

Investigations:

Investigation_1:name HAS_INDICATION Problem_1 HAS_LOCUS Locus_1

Investigation_2:name HAS_INDICATION Problem_2 HAS_LOCUS Locus_2

=> Investigation_3 HAS_INDICATION {Problem_1, Problem_2}

Examination of the abdomen revealed no enlargement of the liver + Examination of the lymphnodes revealed no lymphadenopathy => Examination revealed no enlargement of the liver and no lymphadenopathy
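A minimal sketch of the Problem aggregation rule, assuming facts arrive as simple (name, locus) pairs; the function names and this data shape are illustrative and not taken from CLEF. It covers only the "and" case, not the mixed-polarity "but not" variant.

```python
def aggregate_by_name(facts):
    """Merge (name, locus) facts that share a name into (name, [loci])."""
    merged = {}
    for name, locus in facts:
        merged.setdefault(name, []).append(locus)
    return list(merged.items())


def realise(name, loci):
    """Render one aggregated fact, coordinating the loci with 'and'."""
    phrases = ["of the " + locus for locus in loci]
    if len(phrases) == 1:
        joined = phrases[0]
    else:
        joined = ", ".join(phrases[:-1]) + " and " + phrases[-1]
    return f"{name} {joined}"


facts = [("Enlargement", "liver"), ("Enlargement", "spleen")]
aggregated = aggregate_by_name(facts)
sentence = realise(*aggregated[0])
```

The two input facts collapse into one fact with a set of loci, which is then realised as a single coordinated phrase.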


Text structuring

Aggregation

Interventions

Intervention_1 PART_OF Intervention_0

Intervention_2 PART_OF Intervention_0

=> {count} Intervention_1

[ID01]Chemotherapy cycle PART_OF [ID0]Chemotherapy

[ID02]Chemotherapy cycle PART_OF [ID0]Chemotherapy

[ID03]Chemotherapy cycle PART_OF [ID0]Chemotherapy

=> 3 chemotherapy cycles

Ellipsis

Examination of the left breast revealed no recurrent cancer in the left breast => Examination of the left breast revealed no recurrent cancer
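The count aggregation and the ellipsis rule can be sketched as below. Both functions are hypothetical simplifications: the (part name, parent id) pairs, the naive plural formed by appending "s", and the string-based locus check are assumptions made for illustration.

```python
from collections import Counter


def aggregate_parts(part_of):
    """Collapse repeated parts of the same parent into '{count} {name}s'.

    part_of is a list of (part_name, parent_id) pairs.
    """
    counts = Counter(part_of)
    phrases = []
    for (name, _parent), n in counts.items():
        phrases.append(name if n == 1 else f"{n} {name}s")
    return phrases


def elide_repeated_locus(sentence, locus):
    """Drop a trailing mention of `locus` if it already occurs earlier."""
    for prep in (" in ", " of "):
        tail = prep + locus
        if sentence.endswith(tail) and locus in sentence[: -len(tail)]:
            return sentence[: -len(tail)]
    return sentence


cycles = [("chemotherapy cycle", "ID0")] * 3
phrases = aggregate_parts(cycles)

shortened = elide_repeated_locus(
    "Examination of the left breast revealed no recurrent cancer in the left breast",
    "the left breast",
)
```

A real generator would operate on the semantic representation rather than on surface strings; the string version is used here only to keep the example short.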


Text structuring

Events can be compacted according to domain-specific rules:

Clinical examination is: examination of the liver, examination of the spleen, examination of the abdomen

Clinical examination was normal

Clinical examination was normal apart from an enlargement of the spleen

Clinical examination revealed enlargement of the spleen

Liver panel is: bilirubin concentration, ESR concentration, GGT concentration

The liver panel was in the normal range (apart from a very high level of GGT)
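A domain-specific compaction rule of this kind might look like the sketch below. The set of component examinations comes from the slide; everything else is assumed for illustration, in particular the `findings` dictionary and the rule for choosing between "normal apart from" and "revealed", which the slide does not specify.

```python
# Components of a clinical examination, as listed on the slide.
CLINICAL_EXAMINATION = {"liver", "spleen", "abdomen"}


def compact_examination(findings):
    """Summarise per-locus findings of a clinical examination.

    findings maps a locus to None (normal) or to a description of the
    abnormality, e.g. {"spleen": "an enlargement of the spleen"}.
    Returns None when the findings do not cover the whole examination,
    in which case the events would be reported individually.
    """
    if set(findings) != CLINICAL_EXAMINATION:
        return None
    abnormal = [d for d in findings.values() if d is not None]
    if not abnormal:
        return "Clinical examination was normal"
    exceptions = " and ".join(abnormal)
    # Assumption: use "apart from" while at least one component is normal.
    if len(abnormal) < len(findings):
        return f"Clinical examination was normal apart from {exceptions}"
    return f"Clinical examination revealed {exceptions}"


summary = compact_examination(
    {"liver": None, "spleen": "an enlargement of the spleen", "abdomen": None}
)
```

The same pattern would apply to the liver panel, with the component tests and the "normal range" wording swapped in.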


Maintaining the thread of discourse

Textual representation should reflect the relative importance of events

At discourse level: spine concepts are preferably realised in nuclear units and skeleton events in satellite units

At sentence level: spine events are assigned salient syntactic roles

Whether an event is on the spine or on the skeleton determines its realisation as a sentence, a main or subordinate clause, or a phrase


Typical output of the NL generator

Long chronological report

YEAR 1

Week 0 A mammography screening was scheduled at the clinic.

Week 1 Primary cancer of the right breast; histopathology: invasive tubular adenocarcinoma.

YEAR 2

Week 131 Xray revealed no cancer of the right breast.

YEAR 5

Week 287 Xray revealed no cancer of the right breast.

YEAR 8

Week 443 Xray revealed cancer of the right breast.

Week 446 Examination (indicated by primary cancer of the right breast) revealed no enlargement of the liver or of the spleen, no recurrent cancer of the right breast and no lymphadenopathy of the right axillary lymphnodes. Testing (indicated by primary cancer of the right breast) revealed no abnormality of the haemoglobin concentration and no abnormality of the leucocyte count. An Xray (indicated by primary cancer of the right breast) was performed. Very high level of the ESR concentration. Very high level of the Creatinine concentration. Very high level of the Alkaline Phosphatase concentration. Very high level of the Bilirubin concentration. Very high level of the GGT concentration. No abnormality of the platelet count.

Week 449 An initial treatment planning was completed at the clinic. Excision biopsy revealed no metastatic lymphnode count of the right axilla. Histopathology revealed primary cancer of the right breast. Cancer staging revealed stage1 cancer. Hormone antagonist therapy was started to treat primary cancer of the right breast. Lumpectomy was performed on the breast to treat primary cancer of the right breast. Primary treatment package was started to treat primary cancer of the right breast.

………………….

YEAR 17

Week 893 Xray revealed no cancer of the right breast.


Typical output of the NL generator

Focus on Problems

In week 0, the patient is diagnosed with primary cancer of the right breast, histopathology: invasive tubular adenocarcinoma.

In weeks 131 and 287 Xray revealed no cancer of the right breast.

In week 446, there was no enlargement of the liver or of the spleen, no recurrent cancer of the right breast and no lymphadenopathy of the right axillary lymphnodes revealed by examination. There was no abnormality of the haemoglobin concentration or of the leucocyte count, no abnormality of the platelet count, very high level of the GGT concentration, of the Bilirubin concentration, of the Alkaline Phosphatase concentration, of the Creatinine concentration or of the ESR concentration.

In week 449, excision biopsy revealed no metastatic lymphnode count of the right axilla. Histopathology revealed primary cancer of the right breast. Lumpectomy was performed on the right breast. Hormone antagonist therapy was initiated to treat primary cancer of the right breast.

In weeks 457 to 737, there was no enlargement of the liver or of the spleen, no recurrent cancer of the right breast and no lymphadenopathy of the right axillary lymphnodes. There was no abnormality of the haemoglobin concentration or of the leucocyte count, no abnormality of the platelet count, very high level of the GGT concentration, of the Bilirubin concentration, of the Alkaline Phosphatase concentration, of the Creatinine concentration and of the ESR concentration.

In weeks 457 to 893, Xray revealed no cancer of the right breast.

Compact reports

Focus on Interventions

In week 0, the patient is diagnosed with primary cancer of the right breast, histopathology: invasive tubular adenocarcinoma.

In week 449, excision biopsy revealed no metastatic lymphnode count of the right axilla. Histopathology revealed primary cancer of the right breast. Lumpectomy was performed on the right breast. Hormone antagonist therapy was started to treat primary cancer of the right breast.

Focus on Investigations

In week 0, the patient is diagnosed with primary cancer of the right breast, histopathology: invasive tubular adenocarcinoma.

In weeks 131 and 287 Xray revealed no cancer of the right breast.

In week 446, examinations revealed no enlargement of the liver or of the spleen, no recurrent cancer of the right breast and no lymphadenopathy of the right axillary lymphnodes. Testing revealed no abnormality of the haemoglobin concentration or of the leucocyte count, no abnormality of the platelet count, very high level of the GGT concentration, of the Bilirubin concentration, of the Alkaline Phosphatase concentration, of the Creatinine concentration or of the ESR concentration.

In week 449, excision biopsy revealed no metastatic lymphnode count of the right axilla. Histopathology revealed primary cancer of the right breast.

In weeks 457 to 737, examinations revealed no enlargement of the liver or of the spleen, no recurrent cancer of the right breast and no lymphadenopathy of the right axillary lymphnodes. Testing revealed no abnormality of the haemoglobin concentration or of the leucocyte count, no abnormality of the platelet count, very high level of the GGT concentration, of the Bilirubin concentration, of the Alkaline Phosphatase concentration, of the Creatinine concentration and of the ESR concentration.

In weeks 457 to 893, Xray revealed no cancer of the right breast.


Ongoing work on report generation

Add domain-specific knowledge to improve content selection

Some events become important depending on context

Change the (sub-)domain

Test if the generation method is easily portable

Link NLG to IR to improve IR

Produce reports for patients


Summary and Conclusions

CLEF is now entering the integration phase, moving towards testing and deployment

Major emphases at this point are on privacy and security

Informing patients is a major thread for future work

Integrating IE and NLG


Thank You!

Collaborators:

Catalina Hallett

Richard Power


Evaluation procedure

Subjects:

We tested the performance of 15 subjects.

Subjects had a range of expertise in the CLEF domain, from expert (oncologist) to novice (computer scientist), but most subjects had some medical training.

Subjects had no previous experience with the CLEF WYSIWYM query interface, but most were aware of its fundamental principles.

Methodology:

Subjects were given a set of four fixed queries to formulate using the CLEF WYSIWYM query interface.

The queries were expressed in language as different as possible from the language in the query interface.

Each subject received the queries in a different order.


Evaluation – data analysis

We recorded:

the time taken to compose each query

the number of operations used for constructing a query, which we compared with the optimal number of operations (pre-computed)

We analysed whether performance, as indicated by speed and efficiency, improves with training (experience).


Evaluation results

Time to completion

Subjects’ performance improved dramatically with experience.

After their first experience of composing a query, subjects’ completion time halved and then stayed at that level.

[Figure: Time to completion. x-axis: order of query (1 to 4); y-axis: time (mins), 0 to 7]


Evaluation results

Performance over time: performance normalised over complexity

After just one go with the CLEF interface, subjects are highly proficient in their ability to compose complex queries.

By the time they get to their fourth query, subjects’ performance is almost perfect.

[Figure: Operations. x-axis: order of query (1 to 4); y-axis: (total - optimal) / optimal, 0 to 0.5]

Mean: 0.18

Optimal operation = min # of operations needed to compose the query perfectly.

This is a measure of the complexity of the query.
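The measure plotted on this slide, (total - optimal) / optimal, can be computed as in the sketch below. The function name and the operation counts are invented for illustration and are not data from the study.

```python
def excess_operations(total_ops: int, optimal_ops: int) -> float:
    """(total - optimal) / optimal: 0.0 means the query was composed perfectly."""
    if optimal_ops <= 0:
        raise ValueError("optimal_ops must be positive")
    return (total_ops - optimal_ops) / optimal_ops


# Hypothetical (total, optimal) operation counts for one subject's four queries.
attempts = [(14, 10), (9, 8), (13, 12), (10, 10)]
scores = [excess_operations(total, optimal) for total, optimal in attempts]
mean_score = sum(scores) / len(scores)
```

Dividing by the optimal count normalises for query complexity, so a long query built with a few surplus operations is not penalised more than a short one built with the same proportion of surplus operations.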


Evaluation – comparison with SQL

Very small scale experiment

Two subjects:

with expert knowledge of the structure, organisation and content of the CLEF database

highly skilled users of SQL

with minimal experience with WYSIWYM

were given access to the SNOMED and ICD codes required to build the SQL

Each subject composed a query first in the CLEF WYSIWYM Interface and then in SQL


Evaluation – comparison with SQL

[Figure: bar chart comparing WYSIWYM and SQL for Subject 1 and Subject 2; axis from 0 to 12]

Subject 1 – Query 1

WYSIWYM: 2.3 mins

SQL: 8.5 mins (incomplete)

Subject 2 – Query 2

WYSIWYM: 4.5 mins

SQL: 12 mins (incomplete)

Even with a slowly reacting interface, the subjects were much faster composing queries in WYSIWYM than in SQL
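As a rough reading of these timings: both SQL attempts were left incomplete, so the SQL times are lower bounds and the ratios below understate the difference. The variable names are illustrative.

```python
# Timings in minutes, as reported on the slide.
timings = {
    "Subject 1": {"wysiwym": 2.3, "sql": 8.5},
    "Subject 2": {"wysiwym": 4.5, "sql": 12.0},
}

# Each SQL query was abandoned incomplete, so every ratio is a lower
# bound on how much longer SQL took than WYSIWYM.
ratios = {s: t["sql"] / t["wysiwym"] for s, t in timings.items()}

for subject, ratio in ratios.items():
    print(f"{subject}: SQL took at least {ratio:.1f}x as long as WYSIWYM")
```

With only two subjects and one query each, these figures are indicative rather than statistically meaningful.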


Are the feedback texts ambiguous to the users?

Identified 6 types of ambiguity

4 examples of each, with forced-choice judgements by 15 subjects

Random judgements would give a score of 33%

Results show 84% correct judgements


For clinicians and medical researchers: summary patient records, repository summarisation

For patients: summary patient records, presented as linear text, hypertext or animated dialogue


Sample report for Clinicians

In the weeks 195 to 196, self examination revealed lump of the right breast.

In week 197, self examination revealed lump of the right breast. Excision biopsy revealed metastatic lymphnode count of the right axilla. Histopathology revealed cancer of the right breast. Cancer staging revealed stage2 cancer. Radical mastectomy was performed on the breast to treat the primary cancer. The patient was diagnosed with metastatic lymphnode count of the right axilla; 19 nodes involved out of 24. The patient was diagnosed with metastatic cancer of the right axilla; histopathology: invasive undifferentiated adenocarcinoma. The patient was diagnosed with cancer of the right breast; histopathology: invasive undifferentiated adenocarcinoma. The patient was diagnosed with stage2 cancer; histopathology: invasive undifferentiated adenocarcinoma. Primary treatment package was initiated to treat primary cancer of the right breast.



Sample report for Patients

You had a consultation with your doctor on September 20th 1993.

On September 27th you did a self examination and you found that you had a lump in your right breast. A self examination is an examination of the breasts by running your hand over each breast and up under your arms and checking for changes to their size, shape or feel.

On October 4th you did another self examination and you found that you still had a lump in your right breast.

On October 11th you had a radical mastectomy to treat cancer in your right breast. A radical mastectomy is an operation to remove the breast, along with the lymph glands under the arm and the muscles of the chest wall. Cancer is a tumour that tends to spread, both locally and to other parts of the body.


Presenting patient records in hypertext: dividing the text into related units

Text units:

You had a consultation with your doctor on September 20th 1993.

On September 27th you did a self examination.

A self examination is an examination of the breasts by running your hand over each breast and up under your arms and checking for changes to their size, shape or feel.

On October 4th you did another self examination.

you found that you had a lump in your right breast.

On October 11th you had a radical mastectomy.

to treat cancer in your right breast.

A radical mastectomy is an operation to remove the breast, along with the lymph glands under the arm and the muscles of the chest wall.

Cancer is a tumour that tends to spread, both locally and to other parts of the body.

Relations between units: SEQUENCE (three times), HAS-FINDING, MOTIVATION, EXPLANATION (three times)


Presenting patient records in hypertext: giving graphical attributes to the text units

[Figure: the text units of the patient report laid out with graphical attributes. "you found that you had a lump in your right breast." is labelled HAS-FINDING and "to treat cancer in your right breast." is labelled MOTIVATION; SEQUENCE and EXPLANATION labels are also shown]


Presenting patient records in hypertext: using animation to represent discourse patterns dynamically

You had a consultation with your doctor on September 20th 1993.

On September 27th you did a self examination.

you found that you had a lump in your right breast.

A self examination is an examination of the breasts by running your hand over each breast and up under your arms and checking for changes to their size, shape or feel.

On October 4th you did another self examination.

On October 11th you had a radical mastectomy.

A radical mastectomy is an operation to remove the breast, along with the lymph glands under the arm and the muscles of the chest wall.

The radical mastectomy was done to treat cancer in your right breast.

Cancer is a tumour that tends to spread, both locally and to other parts of the body.

[Animation frames: the text units appear one at a time, in this order: the consultation, the first self examination, the finding of the lump, the explanation of self examination, the second self examination, the radical mastectomy, the explanation of radical mastectomy, and the reason for the mastectomy]

Monologues/Dialogues

Monologue

Autonomous agent reads the generated report

Aims: accessibility, education (not translation)

Dialogue

Report is generated as a script that 2 agents act out

Aims: accessibility, vicarious learning

Example (video clip)