Evidentiality and Epistemicity in a Corpus of Scientific Biomedical Papers from the British Medical Journal. A focus on “evidence” and “cause/s”

*I. Riccioni, *R. Bongelli, *C. Canestrari, *C. Buldorini, **R. Pietrobon, *Andrzej Zuczkowski

* University of Macerata (Italy)

** Duke University, Durham, North Carolina (USA)

ECitS Conference, 5-7 September 2012, University of Kent, Canterbury, UK


INTRODUCTION

As researchers, we analyse linguistic communication, mostly through a qualitative and quantitative analysis of the syntactic, semantic, and pragmatic levels. Our theoretical and methodological background integrates aspects from:

Conversational Analysis (interruptions, overlaps, negotiation, politeness, etc.);

Discourse Analysis (speech acts, for example giving advice in trouble talk contexts, etc.);

Text Theory (in particular, J. S. Petoefi’s structural model of communication).

We have been working for several years on different types of oral (recorded and transcribed) corpora (naturally occurring conversations, political discourses, humorous interactions, doctor-patient dialogues, psychotherapeutic sessions, etc.).

We have also been working on the communication of certainty and uncertainty in different types of written texts (academic, biomedical, literary, and so on).

In 2009 we got involved in the project titled A Corpus of Scientific Biomedical Texts Spanning over 168 years annotated for Uncertainty, together with an American colleague from Duke University, North Carolina, Professor Ricardo Pietrobon, a surgeon interested in “research on research” and “scientific writing” (http://goo.gl/zTBPI, https://sites.google.com/site/biouncertainty/).

The communication of Uncertainty in a corpus of scientific biomedical texts spanning over 168 years

Aims:

◦ identify lexical and morphosyntactic markers of uncertainty and their linguistic scope in a corpus of 80 papers randomly selected from the BMJ from 1840 to 2007, and

◦ detect their trends over time.

LITERATURE BACKGROUND

The topic of certainty/uncertainty in communication is related, more or less directly, to what the linguistic literature calls epistemicity and evidentiality (and to related concepts such as subjectivity, modality, and hedging or mitigation).

This area of study has attracted a great deal of interest over the past three decades or so, inevitably resulting in a multitude of terms and conflicting definitions (see Dendale and Tasmowski 2001).

EPISTEMICITY

Epistemicity refers to those linguistic markers that, according to different authors, reveal the speaker’s/writer’s:

◦ attitude regarding the reliability of the information (e.g. Dendale and Tasmowski 2001, González 2005)

◦ judgment of the likelihood of the proposition (e.g. Nuyts 2001b, Plungian 2001, Cappelli 2007, Cornillie 2007)

◦ commitment to the truth of the message (e.g. Sanders and Spooren 1996, De Haan 1999, González 2005)

A piece of information is communicated as certain when the speaker’s/writer’s commitment to its truth is at the maximum or a high level, as in examples (1) and (2):

(1) “These workers showed that there is an inverse correlation between the height of the hyperbilirubinaemia and the amount of bile excreted in the faeces” (Aetiology of physiological jaundice of the newborn, 1951)

(2) “All the ill effects of ruptured perineum and prolapsus uteri are relieved with certainty by a simple plastic operation” (Vesico-vaginal and rectovaginal fistula, 1861)

A piece of information is communicated as uncertain when the speaker’s/writer’s commitment to its truth is at the minimum or a low level, as in examples (3) and (4):

(3) “the evidence suggests that it is not likely to have been wrong in more than a small proportion” (Lung Cancer, 1956)

(4) “Perhaps, however, the strongest proof of the importance of local rest is furnished by those cases in which a pleural effusion has occurred on the affected side.” (On the importance of rest in the treatment of acute phthisis, )

EVIDENTIALITY

With the term evidentiality, scholars generally refer to the coding of

◦ sources of information and

◦ modes of knowing

(Chafe 1986, Nuyts 2001a, 2001b, Plungian 2001, Cornillie 2007, Papafragou et al. 2007),

i.e. the linguistic markers that reveal how speakers/writers gain access to the piece of information they communicate (Willett 1988).


If a doctor says

(5) “I see a cyst”,

he explicitly communicates the information source;

although the sentence contains no epistemic marker, the verb “I see” is enough to communicate Certainty implicitly.

EVIDENTIALITY & EPISTEMICITY

Evidentiality and epistemicity seem to be two sides of the same coin, in that:

◦ When a piece of information is communicated as (if it were) certain (epistemicity) by writers, at the same time it is also communicated as (if it were) known (evidentiality) to them (and vice versa).

◦ When a piece of information is communicated as (if it were) uncertain, at the same time it is also communicated as (if it were) believed by them (and vice versa).

KUB THEORY

The multitude of evidential and epistemic markers (lexical and morphosyntactic) can be traced back and reduced to three main macro-markers:

I know

I do not know

I do not know whether (believe)

These reflect the three basic evidential and epistemic territories of information, adapting Kamio’s terminology (1991, 1994): the Known, the Unknown, and the Believed (KUB).


The Known is all that writers say they know (perceive, remember etc.) in a broad sense. From an epistemic viewpoint such markers communicate Certainty.

The Believed is all that writers say they do not know if/whether (impressions, opinions, suppositions etc.). From an epistemic viewpoint such markers communicate Uncertainty.

The Unknown is when writers communicate what they do not know, i.e. when the information is unknown to them.

Known (certain)

Lexical markers: verbs (I remember…); adverbs (surely…); verbal expressions (I have no doubt…)

Morphosyntactic markers: declarative sentences in the present, past and future indicative with no lexical evidential or epistemic marker

Unknown

Lexical markers: negative form of the verbs of the Known (I don’t remember…); adjectives (unknown…)

Morphosyntactic markers: literal questions

Believed (uncertain)

Lexical markers: verbs (I suppose…); verbal expressions (It is possible…); adverbs (perhaps…); adjectives (likely…); modal verbs

Morphosyntactic markers: modal verbs in conditional and subjunctive moods; if clauses; epistemic future
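The marker inventory above can be illustrated with a small rule-based tagger. This is only a minimal sketch under stated assumptions, not the annotation procedure used in the project: the lexicon contains just the example markers named on the slide, the function name `classify_kub` is hypothetical, and morphosyntactic markers other than literal questions (if clauses, conditional and subjunctive moods, epistemic future) are not handled.

```python
import re

# Illustrative lexicon: only the example markers named on the slide.
# A real annotation scheme would need a much larger, validated list.
KUB_LEXICON = {
    "Unknown": ["i don't remember", "unknown"],
    "Believed": ["i suppose", "it is possible", "perhaps", "likely"],
    "Known": ["i remember", "surely", "i have no doubt"],
}


def classify_kub(sentence: str) -> str:
    """Assign a sentence to the Known, Unknown or Believed (hypothetical helper)."""
    # Normalise case and typographic apostrophes before matching.
    text = sentence.lower().replace("’", "'").strip()
    # Literal questions are listed as a morphosyntactic marker of the Unknown.
    if text.endswith("?"):
        return "Unknown"
    # Check Unknown and Believed before Known, so that an explicit
    # uncertainty marker wins over the default reading.
    for category in ("Unknown", "Believed", "Known"):
        for marker in KUB_LEXICON[category]:
            if re.search(r"\b" + re.escape(marker) + r"\b", text):
                return category
    # A declarative sentence with no lexical marker counts as Known.
    return "Known"


print(classify_kub("Perhaps the strongest proof is furnished by those cases."))
print(classify_kub("The cause of this symptom is unknown."))
print(classify_kub("Absolute rest is the best treatment."))
```

The word-boundary anchors keep "likely" from matching inside "unlikely"; the check order is a design choice of this sketch, not something the slides specify.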

THE PRESENT STUDY

For this conference we carried out the present pilot study on how evidence, causality and their relationships are communicated in BMJ papers (i.e. whether they are communicated as certain or uncertain, in declarative or hypothetical structures, etc.). In particular, we focused on the terms

• Evidence

• Cause/causes

• Their relations

The method combined a qualitative analysis with a quantitative one, the latter performed with WordSmith Tools version 5 (Scott 2008).

EVIDENCE

Out of the 80 papers we extracted and analysed all 102 fragments where the term “evidence” occurred in a sentence. Our analysis criteria included:

• types of sentence (affirmative, negative, interrogative);

• whether the sentence is communicated as Certain-Known, Uncertain-Believed, or Unknown;

• types of evidence.

Affirmative - Certain - Direct observation

(6) “Auscultation of the chest revealed evidence of increased activity in the right upper lobe.” (The treatment of pulmonary tuberculosis by nitrogen compression, 1914)

Negative - Uncertain - Medical practice

(7) “This is simply a conjecture, however, which though possible does not seem probable, and has as yet, so far as my experience goes, no evidence to support it.” (The treatment of ringworm of the scalp by the x rays, 1905)

Types of evidence:

1. direct observation: 27 (23.5%);

2. lab analysis, clinical exams, histological analysis: 25 (21.7%);

3. statistical analysis: 14 (12.2%);

4. literature review: 13 (11.3%);

5. experimental data: 11 (9.6%);

6. medical instruments: 6 (5.2%);

7. others: 5 (4.3%);

8. unqualified: 5 (4.3%);

9. epidemiological data: 3 (2.6%);

10. reasoning, inference: 3 (2.6%);

11. medical practice: 3 (2.6%).
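The percentages in this list can be checked with a few lines of arithmetic. The sketch below is only a verification aid under stated assumptions: the counts are copied from the list, and the variable names are illustrative. It shows that the counts add up to 115 rather than 102, and that the reported percentages match shares of that total of 115. One possible reading is that some fragments were assigned more than one type of evidence, but the slides do not say so.

```python
# Counts copied from the list of evidence types above.
evidence_counts = {
    "direct observation": 27,
    "lab analysis, clinical exams, histological analysis": 25,
    "statistical analysis": 14,
    "literature review": 13,
    "experimental data": 11,
    "medical instruments": 6,
    "others": 5,
    "unqualified": 5,
    "epidemiological data": 3,
    "reasoning, inference": 3,
    "medical practice": 3,
}

# Total number of type assignments (not the number of fragments).
total = sum(evidence_counts.values())

# Share of each type, rounded to one decimal place as on the slide.
percentages = {
    name: round(100 * count / total, 1)
    for name, count in evidence_counts.items()
}

print(total)
for name, pct in percentages.items():
    print(f"{name}: {evidence_counts[name]} ({pct}%)")
```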

CAUSE

Out of the 80 papers we extracted and analysed all 103 fragments where the term “cause/s” occurred in a sentence.

The analysis criteria included:

• Types of sentence (affirmative, negative, hypothetical, interrogative);

• The causal relations communicated as Certain-Known, Uncertain-Believed, or Unknown.

Affirmative - Certain

(8) “Koch has thus added to our conviction that the bacillus is the cause of the symptoms, seeing that, as he remarks, it is impossible to suppose that an organism can develop in such enormous numbers at the expense of the vital fluid, without exerting a serious influence upon the system.” (Remarks on micro-organisms, 1880)

Affirmative - Uncertain

(9) “When confronted with a case of this kind, we must avoid the administration of any drug likely to cause either undue contraction or relaxation of the organ. Absolute rest is the best treatment.” (The determinant of abortion and how to combat them, 1907)

Affirmative - Unknown

(10) “…the cause of this symptom is unknown, but sleep is an important factor” (Do asthmatics suffer bronchoconstriction during rapid eye movement sleep?, 1986)

EVIDENCE AND CAUSALITY

Out of the 80 papers we extracted 42 fragments where a relation between evidence and causality is explicit. We found 7 different types of relations:

Type 1. evidence is insufficient to establish a causal link: 14 (33%);

(11) “…Recently there has been much experimental data to show the causative relation of adrenalin to these degenerative changes, but it has not been definitely settled whether this is a direct effect or due to increased tension.” (An address on the treatment of chronic degenerative lesions of the heart and aorta, 1909)


Type 2. evidence establishes a causal link: 9 (21.4%);

(12) “…This was pretty conclusive evidence that the organism was the cause of the disease, and that it constituted the true infective element; because any other material that might be supposed to accompany it in the blood of the diseased animal must have been got rid of by the successive cultivations in chicken-broth.” (Remarks on micro-organisms, 1880)

Type 3. there is no evidence of a causal link: 9 (21.4%);

Type 4. evidence denies a causal link: 3 (7.1%);

Type 5. evidence suggests the existence of a causal link: 3 (7.1%);

Type 6. evidence shows a weak causal link: 3 (7.1%);

Type 7. evidence suggests the non-existence of a causal link: 1 (2.4%).

MAIN RESULTS

Analysis of the terms “evidence” and “cause/s” shows that in the BMJ corpus they are mainly communicated

- as Certain-Known, and

- in affirmative sentences.

Out of the 11 different types of evidence we identified, the most common are:

◦ direct observation: 27 (23.5%);

◦ lab analysis, clinical exams, histological analysis: 25 (21.7%);

◦ statistical analysis: 14 (12.2%).

Out of the 7 different relations between evidence and causality we identified, the most common are:

◦ evidence is insufficient to establish a causal link: 14 (33%);

◦ evidence establishes a causal link: 9 (21.4%);

◦ there is no evidence of a causal link: 9 (21.4%).

CONCLUSION & FUTURE STEPS

By the end of the project we started in 2009, we expect to have significantly improved our knowledge of the historical evolution of the communication of certainty, uncertainty, evidence, causality and their relationships in the writing of scientific papers over a 168-year span.

We now plan on:

◦ performing a qualitative analysis of the other terms related to evidence and causality (so far, we have only identified their numerical occurrences using WordSmith Tools);

◦ verifying the significance of a trend observed in the distribution of the terms related to evidence and causality during the period we consider.


Thanks for your attention!

words                               occurrences
effect/s/ed                         330
find/found                          251
cause/s/ed/ing                      250
due to                              122
affect/s/ed/ion/ing                 119
relat/es/ed/ion/ing                 109
evidence                            102
produce/s/ing; production           100
associat/s/ed/ion                   91
observation/s                       86
influence/s/ed/ing                  81
indicate/s; indication/s            52
prove/s/ed                          49
depend/s/ing/ed                     33
relationship/s                      29
connection/s/ed; connextion         22
proof/s                             17
responsible                         13
join/ed                             5
link/ed                             4
causative                           3
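The occurrence counts above were obtained with WordSmith Tools. As a rough illustration of what counting a word family involves, here is a minimal sketch using only the Python standard library. It is not WordSmith's method or API; the patterns, the `count_families` helper and the sample text (stitched together from examples quoted earlier in the slides) are assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical regular expressions for three of the word families in the table.
WORD_FAMILIES = {
    "evidence": r"\bevidence\b",
    "cause/s/ed/ing": r"\bcaus(?:e|es|ed|ing)\b",
    "due to": r"\bdue to\b",
}


def count_families(text: str) -> Counter:
    """Count case-insensitive matches of each word family in a text."""
    counts = Counter()
    for label, pattern in WORD_FAMILIES.items():
        counts[label] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts


sample = (
    "This was pretty conclusive evidence that the organism was the cause "
    "of the disease, because any other material must have been got rid of. "
    "The cause of this symptom is unknown."
)

counts = count_families(sample)
print(counts)
```

The word-boundary anchor keeps "because" from being counted as an occurrence of "cause", which is the kind of decision any such count has to make explicit.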


Results:

• Preliminary results on the corpus data show no significant difference in the use of the different uncertainty markers over the years.

• The results of the NLP experiments show that most of the Uncertainty markers can be recognized with good accuracy (Bongelli et al. 2012a; Bongelli et al. 2012b).

At the moment, we are working on these results and on the identification of the scope of the Uncertainty markers. In their grammar, Quirk et al. (1985) define scope as

“…the general term that we shall use to describe the semantic ‘influence’ which such words have on neighbouring parts of a sentence. It deserves attention because of its close connection with the ordering of elements.”