An Evidence Perspective on Topical Relevance Types & Its Implications for Task-based Retrieval

Xiaoli Huang, Dagobert Soergel

College of Information Studies, University of Maryland

Outline

• Introduction

• Method

• Findings: A fine-grained classification of topical relevance types

• Applications: Task-based retrieval

• Outlook

Introduction

Relevance

• User-defined relevance: novelty, accessibility, currency, reliability, accuracy, etc. (Barry & Schamber)

• Topical relevance: focus of this study

Inadequate Understanding of Topical Relevance

• A widely held unspoken assumption (Green, 1995): topical relevance = direct matching between the query topic and the document topic

• Topical relevance is usually treated as an atomic notion and remains vague and unexplicated

What is topical relevance?

Relevance theory in Communication (Wilson & Sperber): An input (a sight, a sound, an utterance, a memory) is relevant to an individual when it connects with background information to yield conclusions that matter to him

Logical relevance (Cooper): A piece of information is logically relevant if it is in a minimal premise set that logically entails the conclusion statement through deductive reasoning

Evidential relevance (Patrick Wilson): A piece of information is evidentially relevant if it either increases or decreases the confirmation of a conclusion through deductive or inductive reasoning

Topical relevance relationships (Green & Bean): Topical relevance goes beyond topic matching and involves many relationship types, for example by inference or by analogy

Understanding topical relevance in a broader context

• The essence of topical relevance is reasoning from evidence to a conclusion of concern (an answer to the user’s question): an evidence perspective

• The subject of Evidence has been a sustained focus of multi-disciplinary attention (next slide)

• The evidence perspective brings the discussion of topical relevance into the broader context of thinking, reasoning, drawing conclusions, building arguments, and, most generally, building understanding and deriving meaning.

Disciplines dealing with evidence

• Logic and mathematics

• Study of thinking, problem solving, cognition

• Communication (Relevance Theory of Communication)

• Witness Psychology

• Forensic Science

• Intelligence Services

• History

• Law

• Evidence-Based Policy (Twining, 2003)

• Evidence-Based Medicine (EBM)

Four types of topical relevance

Based on thinking about the evidentiary connection between a piece of information and a user's question, topic, or task.

Example topic: “Food in Auschwitz”

1 Direct relevance: Explicitly gives an answer to a user’s question. Example: A survivor talks about food available to Auschwitz inmates

2 Indirect relevance: Lets the user infer an answer. Example: Talks about seeing emaciated people in Auschwitz

3 Context relevance: Provides peripheral or background information surrounding an answer. Example: Talks about physical labor of Auschwitz inmates

4 Comparison relevance: Provides a basis for interpretation or inspires some answer through perceived similarity to the question. Example: Talks about food in the Warsaw ghetto
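The four types can be written down as a small data structure. The following is only a minimal sketch for illustration: the names `RelevanceType`, `Evidence`, and `EXAMPLES` are assumptions introduced here and do not come from the study; the topic and example summaries are the ones listed above.

```python
from dataclasses import dataclass
from enum import Enum


class RelevanceType(Enum):
    """The four top-level topical relevance types (hypothetical encoding)."""
    DIRECT = "explicitly gives an answer to the user's question"
    INDIRECT = "lets the user infer an answer"
    CONTEXT = "provides peripheral or background information surrounding an answer"
    COMPARISON = "provides a basis for interpretation through perceived similarity"


@dataclass(frozen=True)
class Evidence:
    """One piece of evidence judged against one topic."""
    topic: str
    summary: str
    relevance: RelevanceType


TOPIC = "Food in Auschwitz"

# The four example judgments from the slide, one per relevance type.
EXAMPLES = [
    Evidence(TOPIC, "Food available to Auschwitz inmates", RelevanceType.DIRECT),
    Evidence(TOPIC, "Seeing emaciated people in Auschwitz", RelevanceType.INDIRECT),
    Evidence(TOPIC, "Physical labor of Auschwitz inmates", RelevanceType.CONTEXT),
    Evidence(TOPIC, "Food in the Warsaw ghetto", RelevanceType.COMPARISON),
]

for e in EXAMPLES:
    print(f"{e.relevance.name:<10} {e.summary}")
```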

Method

Study context: MALACH relevance assessment

• MALACH: Multilingual Access to Large Spoken ArCHives

• Goal: To improve access to oral history archives

• Collection: 52,000 Holocaust survivor interviews, USC Shoah Foundation Institute for Visual History and Education (SFI)

• Speech Retrieval Test Collection designed for comparative evaluation of retrieval systems

  • 20,000 segments from 400 interviews

  • 105 test topics: real user requests received by SFI

  • Relevance assessors: Graduate history students

  • Relevance assessments: 37,000 topical relevance assessments between topics and interview segments, using the four relevance types, scale 0 – 4

Relevance Assessment Interface

Data collection and analysis

• Data:

  • Assessors’ Topic Notes (interpretations, examples)

  • Assessors’ Justifications for relevance assessments

• Method: combines grounded theory & maximum comparison to allow specific relevance types to emerge from the data

• Results:

  • a fine-grained classification of topical relevance types

  • examples for each type

Findings

A fine-grained classification of topical relevance types

1 Direct Relevance

A direct answer to a question of interest: exactly, specifically, explicitly on topic, with minimal, if any, inferential reasoning involved

Topic: Strengthening Faith by Holocaust Experience

Evidence: A survivor talks about how an elderly Salonikan Jew helped strengthen their religious faith during their incarceration; “we called him grandfather. He always said to us ‘you must say Kaddish every night.’ I was forced to dispose of corpses in the camp at the time. One day I came back from work and said to him ‘Are you crazy?’ He said: ‘No, something good will happen one day after this. We have to pay a very dear price but we're gonna have our own state of Israel.’ And it happened. I survived with my faith and went to Israel.”

2 Indirect Relevance

• Not the direct answer, but can be used to infer the answer, one or more inferential steps away from the answer. Circumstantial evidence

• Can contribute as much to understanding a topic as direct evidence after “joining the dots”

(2.1) Generic indirect relevance

(2.2) Backward inference (abduction)

(2.3) Forward inference (deduction)

(2.4) Inference from cases (induction)

2.1 Generic Indirect Relevance

Missing only a specific piece of information but strongly points at a fact that is right on topic

Topic: Stories of Varian Fry and the Emergency Rescue Committee who saved thousands in Marseille

Evidence: The survivor mentions obtaining a false name and being rescued from France but does not specifically mention Fry.

Reasoning: Varian Fry created an underground operation to smuggle over 2000 Jews out of France from 1940-1941. Using a false name and being in France constitute strong hints for smuggling associated with Fry.

2.2 Backward Inference (abduction)

• Backward & forward inference are both forms of causal reasoning

• Backward inference is “tracing back” or “backward chaining”, reasoning from effect to cause

(2.2.1) Inferring an event (phenomenon) from its consequence

(2.2.2) Inferring an event (phenomenon) from events (phenomena) that happen later

(2.2.3) Inferring an action (phenomenon) from reaction to it

2.2.1 Inferring an event (phenomenon) from its consequence

The consequences lay out substantial clues for us to trace back to the event (or phenomenon)

Topic: Did Bulgaria save its Jews from Nazism?

Evidence: A survivor comments about the quality of life being better in Bulgaria.

Reasoning: The comment does not explicitly address the Bulgarian government’s policy toward its Jews, but a better quality of life in Bulgaria is one important effect of the government's leniency.

2.2.2 Inferring an event (phenomenon) from events (phenomena) that happen later

Topic: Nazi theft and expropriation of Jewish property

Evidence: Segments describe forced labor of sorting clothes, jewels, and Jewish ritual objects.

Reasoning: The intensity of the sorting labor and the details of the sorting process indirectly demonstrate the severity of the earlier seizure of property and valuables by the Nazis.

2.2.3 Inferring an action (phenomenon) from reaction to it

The target event is not mentioned or may not have happened at all, but reaction, perception, feeling, attitude, or attempt is a good mirror to reflect what has gone on before

Topic: Nazi theft and expropriation of Jewish property

Evidence: Segments discuss Jewish efforts to hide property.

2.3 Forward Inference (deduction)

• Forward inference is “looking ahead” or “forward chaining”, reasoning from cause to effect

• Essentially making predictions; can infer only with a low or medium level of certainty

(2.3.1) Inferring an event (phenomenon) from its cause

(2.3.2) Inferring an event (phenomenon) from events (phenomena) that happened earlier

(2.3.3) Inferring reaction (feeling) from action (phenomenon)

2.3.2 Inferring an event (phenomenon) from events (phenomena) that happened earlier

If the probability of the association between an early event A and a later event B is high, and if early event A is known to have occurred, then the later event B, the one of interest, probably occurred as well

Topic: Stories of children hidden without their parents and of their rescuers

Evidence: A survivor tells of his sister’s absence on the day of the roundup and explains that she had been delivering food to extended family members already in city X.

Reasoning: The sister's absence from the roundup may have led to her later going into hiding, although the connection is faint.

3 Context Relevance

• Not specifically on a topic, but surrounding the topic

• Helps to develop the big picture where the target event fits in

• Can be used to back up an argument, but not to base an argument on

• Physical setting or environment, the factors allowing or hindering an event, something happening behind the scene

(3.1) Context by scope (environmental setting, social/political/cultural background)

(3.2) Context by causal sequence

(3.3) Context by time sequence

(3.4) Context by place

3.1 Context by Scope

• By extending the focus, we start to see the background and gain a broader view of the target event, which enriches our understanding of what is going on in the foreground

• Sets up a big picture at the physical, political, social, and cultural levels.

(3.1.1) Context as environmental setting

(3.1.2) Context as social/political/cultural background

(3.1.3) Context as other supplemental information

3.1.2 Context as social / political / cultural background

Topic: Stories of Varian Fry and the Emergency Rescue Committee who saved thousands in Marseille

Evidence: A survivor details the political situation in France in 1940-1941 regarding refugees and the changes in emigration regulations that made fleeing France difficult.

3.2 Context by Causal Sequence

• Situates our understanding of a target event into a causal network

• Both helping and hindering factors behind an event or phenomenon

• These factors affect a target event but are not sufficient to cause it to happen or not to happen

Topic: Stories of children hidden without their parents

Evidence: Mentions of factors hindering hiding, such as “the authorities often raided the convent”; some children were hidden in convents during the war.

3.3 Context by Time Sequence

• Surrounding in the sense of being close by on the timeline

• Events that happened immediately before / after a target event

• Unlike forward- / backward-inferential evidence, its relation to a target event is certain and explicitly stated

(3.3.1) Context as preceding experience/event

(3.3.2) Context as following experience/event

Topic: Descriptions of Nazi medical experiments

Evidence: The prisoner selections conducted by Dr. Mengele in concentration camps that are related to medical experiments.

4 Comparison Relevance

• Based on analogical reasoning, identifying both analogous and contrasting persons, places, events, or phenomena

• Recognizing similarity is at the heart of thinking & reasoning:

  • by looking at similar cases, develop a comprehensive view of the same sort of events, obtain supplemental details, and recognize unique features of the target at hand

  • by looking at contrasting cases, see the other side of the coin

• Weak evidential value in establishing a fact (even weaker than contextual evidence), but essential to justify a judicial decision by identifying comparable precedents

• Particularly useful to induce some arguments / perspective where little material on the exact event is available

• Identify both similar and contrasting cases

• Comparative evidence shares characteristics of the topic but differs from the topic in one or more aspects

• A typical MALACH topic can be described by three major aspects (facets):

  • external factors: time and place

  • participants

  • the event / experience / phenomenon itself

• Varying the external factors and/or the participants, we get the same type of event / experience / phenomenon happening in a different place, at a different time, in a different situation, or with a different person

• Varying the event / experience / phenomenon, we get a contrasting event / experience / phenomenon happening in the same time-space or involving the same participant(s)

(4.1) Comparison: Varying External Factor(s): Time/Place

(4.2) Comparison: Varying the Participant(s)

(4.3) Comparison: Varying the Act / Experience
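The facet view suggests a simple way to name the comparison sub-type: check which facet of the evidence differs from the topic. The sketch below is hypothetical and heavily simplified. It assumes that each facet has already been reduced to a plain string, that string inequality stands in for "varying" a facet, and that exactly one facet varies (the slides also allow several to vary at once). The `Facets` class and `comparison_subtype` function are illustrative names, and the facet breakdown of the example topic is an assumption made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Facets:
    """The three facets of a topic or of a piece of evidence (simplified to strings)."""
    time_place: str    # external factors
    participants: str
    event: str         # the event / experience / phenomenon itself


def comparison_subtype(topic: Facets, evidence: Facets) -> Optional[str]:
    """Return the comparison sub-type when exactly one facet varies, else None."""
    differs = {
        "time_place": topic.time_place != evidence.time_place,
        "participants": topic.participants != evidence.participants,
        "event": topic.event != evidence.event,
    }
    changed = [name for name, is_different in differs.items() if is_different]
    if changed == ["time_place"]:
        return "4.1 Varying external factor(s): time/place"
    if changed == ["participants"]:
        return "4.2 Varying the participant(s)"
    if changed == ["event"]:
        return "4.3 Varying the act / experience"
    # No facet varies (on topic) or several vary: left unclassified in this sketch.
    return None


# Assumed facet analysis of the postwar-reception topic and its Netherlands evidence.
topic = Facets("United States, 1945-1954", "Holocaust survivors", "postwar reception")
evidence = Facets("Netherlands, 1945-1954", "Holocaust survivors", "postwar reception")
print(comparison_subtype(topic, evidence))
```

A real system would need a much richer notion of facet similarity than string equality; the point here is only that the sub-type follows from which facet is varied.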

4.1 Comparison: Varying External Factor(s): Time/Place

(4.1.1) Comparison: Happening at a different place

(4.1.2) Comparison: Happening at a different time

Topic: The Postwar Reception of Holocaust Survivors by the American Jewish Community 1945-1954

Evidence: Mention of reception in the Netherlands, 1945-1954.

4.2 Comparison: Varying the Participant(s)

(4.2.1) Comparison: A different actor

(4.2.2) Comparison: A different subject being acted upon

Topic: Nazi theft and expropriation of family property and assets

Evidence: Seizure of property by Axis governments other than the Nazis, e.g., Hungary pre-1944.

Topic: Treatment of the disabled during the Holocaust

Evidence: Nazi experimentation on and extermination of gypsies, homosexuals, twins, and the elderly. These non-disabled groups were subjected to the same type of cruelty by the Nazis during the Holocaust.

4.3 Comparison: Varying the Act / Experience

• Mostly provides contrasting evidence for a topic

• Enriches our thinking on a topic by seeing:

• how similar situations engender different actions (events)

• how similar participants make different decisions or go through different experience

(4.3.1) Comparison: A different act, focusing on a similar actor

(4.3.2) Comparison: A different experience, focusing on a similar subject

Topic: Strengthening faith by Holocaust experience

Evidence: A survivor speaks of his loss of faith in Auschwitz. He stopped believing in formal religion, be it Jewish, Catholic, Protestant, or other. He saw a religious fellow teach the young people that the Jews of Poland and Czech and Romania sinned but the Jews of Florida and of New York did not sin. He cannot believe there is a God who could see what was happening in Auschwitz and permit it.

Summary

• Direct evidence explicitly gives an answer to a user’s question

• Indirect evidence lets the user infer an answer

• Contextual evidence provides peripheral or background information surrounding an answer

• Comparative evidence provides a basis for interpretation or inspires some answer through perceived similarity to the question

The common interpretation of topical relevance as a matching relationship is too limited

Why should we care about types of relevance other than direct?

The large variety of information needs, user situations, & tasks:

• In many situations, direct evidence is simply not available:

  • In court cases

  • In the history domain

• For many tasks direct evidence alone is not sufficient:

  • Judicial decisions in law are made by identifying comparable cases

  • Evidence-based Medicine (EBM) puts a focus on bringing contextual and comparative evidence into clinical problem solving

Applications: Task-based retrieval

Supporting users analyzing a topic and building an argument

• The user wants to search from different angles and to collect direct, indirect, contextual, and comparative information for her task, for example, analyzing the events of 9/11

• The ideal system should automatically search from different angles and provide a list of results organized by type of relevance: direct, indirect, context, comparison, possibly divided by sub-types
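One way to picture the result list described above is a grouping step applied to results that already carry a relevance type. This is a minimal sketch under that assumption: how the type gets assigned is exactly the open problem, and the segment identifiers, scores, and the function name `organize_by_relevance` are hypothetical.

```python
from collections import defaultdict

# Display order of the four relevance types in the result list.
ORDER = ["direct", "indirect", "context", "comparison"]


def organize_by_relevance(results):
    """Group (segment_id, relevance_type, score) tuples by type, best score first."""
    groups = defaultdict(list)
    for segment_id, relevance_type, score in results:
        groups[relevance_type].append((segment_id, score))
    return [
        (rtype, sorted(groups[rtype], key=lambda pair: pair[1], reverse=True))
        for rtype in ORDER
        if rtype in groups
    ]


# Hypothetical typed results for a single topic.
results = [
    ("seg-03", "context", 2),
    ("seg-01", "direct", 4),
    ("seg-07", "comparison", 1),
    ("seg-02", "direct", 3),
    ("seg-05", "indirect", 3),
]

organized = organize_by_relevance(results)
for rtype, segments in organized:
    print(rtype, segments)
```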

In other words, IR systems should be designed in a way that:

• allows the user to stay at the center of her task

• supports the user in thinking about her topic/task more comprehensively, radiating out from the center of the topic

Present IR systems support only search for direct relevance

• The present system (based on direct matching):

  • Given a direct question, such as “the events of 9/11”, searches mostly for direct evidence

  • The user must do separate searches for indirect, contextual, and comparative information

  • In each of these non-direct searches, the system restricts the user to asking her questions in a direct way

Consider a search example

A search example

Contextual information (e.g. cultural conflicts) pre-9/11

• The user’s query: “context pre-9/11”

• The system’s solution: to start the search, the user must recall a particular event or person she knows to be relevant

• Limitations of direct-oriented or matching-based IR systems:

  • they stop users who are curious but know little about the background

  • even for knowledgeable users, they require extra mental effort to transform a non-direct question into a direct one

Improve IR system design

• A starting point: to indicate topical relevance relationships in indexing:

  • to respond to non-direct search requests

  • to organize results by types of relevance

• A more advanced and general solution: to equip the system with reasoning power so that it can automatically detect what information in the collection is relevant directly, indirectly, in context, or by comparison
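The starting point above, indicating relevance relationships in indexing, could look roughly like the following. This is a hypothetical sketch, not a description of any existing system: `TypedIndex` and its methods are names invented here, and the index entries are assumed to have been assigned by human indexers. The sample entries paraphrase the Varian Fry examples given earlier.

```python
from collections import defaultdict


class TypedIndex:
    """Toy index keyed by (topic, relevance type) rather than by topic alone."""

    def __init__(self):
        self._postings = defaultdict(list)

    def add(self, topic, relevance_type, segment):
        """Record that a segment relates to a topic in the given way."""
        self._postings[(topic, relevance_type)].append(segment)

    def search(self, topic, relevance_type="direct"):
        """Answer a request such as 'context for topic X' without a reformulated query."""
        return list(self._postings.get((topic, relevance_type), []))


index = TypedIndex()
fry = "Varian Fry and the Emergency Rescue Committee"
index.add(fry, "indirect", "Survivor obtains a false name and is rescued from France")
index.add(fry, "context", "Political situation in France, 1940-1941, regarding refugees")

# A non-direct request: the user asks for context without naming any specific event.
print(index.search(fry, "context"))
```

The more advanced solution, a system that detects these relationships by reasoning, would replace the manual `add` calls; the lookup side could stay the same.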

Example system: Evidence-Based Medicine (EBM)

• PICO structure implies types of topical relevance:

• Patient background → Context

• major Intervention → Direct

• Comparative intervention(s) → Comparison

• Outcome

• Using the PICO frame for clinical question-answering systems

• Identify different types of evidence required by clinical questions

• Identify different types of evidence in clinical source texts

• Match and then structure the answer by type of evidence
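The last step, structuring an answer by type of evidence, can be sketched once statements have been tagged with a PICO element. The mapping below follows the slide; Outcome is left unmapped because the slide assigns it no relevance type. The function name, the tag names, and the placeholder statements are assumptions made for illustration, and the tagging step itself is assumed to have happened elsewhere.

```python
# PICO element -> topical relevance type, as given on the slide.
PICO_TO_RELEVANCE = {
    "patient": "context",
    "intervention": "direct",
    "comparison": "comparison",
    "outcome": None,  # the slide gives no relevance type for Outcome
}


def structure_answer(tagged_statements):
    """Group (pico_element, text) pairs under the relevance type each element implies."""
    answer = {}
    for element, text in tagged_statements:
        relevance_type = PICO_TO_RELEVANCE[element]
        key = relevance_type if relevance_type is not None else "unmapped"
        answer.setdefault(key, []).append(text)
    return answer


# Placeholder statements standing in for sentences extracted from a clinical source text.
tagged = [
    ("patient", "statement about the patient background"),
    ("intervention", "statement about the major intervention"),
    ("comparison", "statement about a comparative intervention"),
    ("outcome", "statement about the outcome"),
]

structured = structure_answer(tagged)
for relevance_type, statements in structured.items():
    print(relevance_type, statements)
```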

Take-home message: Retrieval beyond direct match

Incorporating an enriched concept of topical relevance into retrieval system design, extending the system beyond direct topic matching, can produce systems that support users in looking at a problem from several angles and developing well-structured arguments

Outlook

Outlook: understanding topical relevance through the structure of the argument

• The evidential connection between a piece of information and a conclusion (or an answer) is indicated by the role the piece of information plays in the overall structure of an argument

• Rhetorical structure theory (Mann & Thompson)

  • RST explains the coherence of texts by identifying the role of every part of the discourse

  • Evidence/Justify, Background/Circumstance, Comparison/Contrast are among the most frequent roles occurring in discourse

• PICO structure of clinical texts or answers to clinical questions:

  • Patient background → Context

  • major Intervention → Direct

  • Comparative intervention(s) → Comparison

  • Outcome

These structures inspire us to enrich and verify topical relevance relationships

Outlook: a domain-specific emphasis of understanding topical relevance

• The different types of evidence take on specific meanings and may be expanded in the context of a particular domain / discipline

• There will be many question- and domain-specific ways in which a piece of information relates to a task/ question, and thus many nuanced types of relevance

• Further research on topical relevance types, defined as evidential relationships, should look for commonalities and differences across domains and types of tasks/questions

• Such analyses can be supported by looking at methods from different perspectives: philosophy, psychology (human thinking), decision making, and research methods

Thank you!

Questions?

Xiaoli Huang

www.wam.umd.edu/~xiaoli

This work is supported in part by NSF grant IIS-0122466

Fig. 2. A research framework for IR: Contextual extensions and relevance criteria. (From Ingwersen & Järvelin, 2005, p. 322)