Extracting stories from heterogeneous information sources
V.S. Subrahmanian, M. Fayzullin (University of Maryland)
M. Albanese, C. Cesarano, A. Picariello (Univ. of Napoli, Italy)
10/20/2004 KF Workshop 2
Talk Outline
  Motivating examples
  Story Architecture
  The Model
  Conclusions
STORY Participants
Joint research project
University of Maryland, College Park, USA
  V.S. Subrahmanian
  M. Fayzullin
  Amelia Sagoff
Università di Napoli, Federico II
  Antonio Picariello
  Massimiliano Albanese
  Carmine Cesarano
Motivating example: Pakistani Nuclear Scientists
Nuclear proliferation is the issue of the day.
Complex web of:
  Nuclear scientists
  Personnel at weapons locations
  Arms dealers
  Customs officials
  Shipping companies
  Front companies
  Manufacturers
  …
Nuclear monitors may want the “story” on any person, place, or event to decide if further investigation is warranted.
Only the relevant data should be presented to the analyst.
Motivating example: soldier in Baghdad
Soldier in Baghdad sees a car pulling up towards a checkpoint. Wants the quick story on:
  Owner of the car
  Associates of the car’s owner
  Estimated threat
Soldier is driving a truck. Wants the quick story on his route:
  Are certain intersections dangerous?
  Are the residents sympathetic to US troops?
  Are there nearby friendly units?
  Any recent reports of gunfire?
  Any suspicious change in activity levels?
Only the relevant data should be presented to the soldier.
Motivating example: US Immigration
Customs official sees a traveller. Wants the quick story on him:
  Where does he work?
  Who does he work for?
  What is his area of expertise?
  Any warrants?
  Is he on a watch list?
  Who are his associates – anyone suspicious?
Just the right data should be presented to him.
A motivating example: Pompeii
Pompeii is a spectacular archaeological site.
Visitor experience can be greatly improved by:
  Automatically notifying visitors of interesting phenomena without posting extra signs.
  Allowing visitors to explore the stories of various monuments, paintings, sculptures, etc. in Pompeii.
  Allowing visitors to explore the stories of the characters, events and places depicted in these monuments, paintings, sculptures, etc.
Visitors' interests vary, so information about exhibits must adapt in real time to their interests to enhance the experience of the visitor.
3 Applications
  [75% done] Pompeii
  [Preliminary demo available, about 50% done] Pakistani Nuclear scientists
  [Just initiated – demo expected in Jan 2004] Tribes and tribal leaders in the Pakistan/Afghanistan Borderlands
Pompeii Visitors
Visitor arrives at ticket counter and buys ticket.
ANALOG: Soldier in Baghdad sets out on a mission.
Pompeii Visitors
Ticket agent asks if they would like to use the story facility and if they would like to use their cell phone and/or PDA to get stories of interest to them.
ANALOG: Soldier in Baghdad chooses to receive stories on his radio or PDA.
Pompeii Visitors
As visitor walks through Pompeii, STORY identifies where he is and predicts where he might go in the future (probabilistically). Ex. if he is at location L, it might predict that he will go to the House of the Vetti.
ANALOG: As soldier drives through Baghdad, STORY identifies where he is and correlates where he will go with his route plan.
Pompeii Visitors
Based on this prediction of where he might go in the future, it identifies potential stories he might be interested in and downloads parts of these stories to his PDA/cell phone. E.g. it might download stories about Pentheus.
See items
You are here (Triclinium in the House of the Vetti)
ANALOG: STORY finds stories satisfying the soldier’s conditions of interest and downloads them to his PDA or to the nearest radio broadcast location.
Pompeii Visitors
The visitor chooses which story he is interested in. STORY dynamically generates the story and delivers it to the user’s PDA/cell phone, e.g. user might choose story of Pentheus.
ANALOG: STORY delivers the story to the soldier. He can then further interact with the story if needed using voice and cursor prompts.
Pompeii Visitors
The user can choose to explore the story in greater detail (e.g. if he is seeing the story of Pentheus, he can also explore the story of Agave).
The system
STORY: Spatio-Temporal Object RepositorY
Story
A story is a narrative, true or presumed to be true, relating to important events and celebrated persons of a more or less remote past; a historical relation or anecdote. (Oxford English Dictionary).
We adopt the view that narratives in the context of computing are really interactive multimedia presentations.
Such a view allows a straight piece of text to be a special case of a narrative, or a straight piece of speech to be a narrative.
Considerations about stories
The concept of story is dramatically different for the examples mentioned earlier.
  A visitor to Pompeii cares about mythological, historical, and artistic facts.
  A soldier in Baghdad cares about security and mission-related facts: who are the people around me, not who is depicted on the walls.
  A nuclear analyst cares about the nuclear networks – who is selling what to whom? Who is moving the money? What front companies are involved?
What goes into a story depends not only on basic facts about the entity of interest but also on the application domain and specific items of interest to the user.
STORY System
STORY is a system for
  extracting story content from multiple distributed data sources (databases, web pages, digitized historical documents, maps, etc.),
  creating a succinct story based on the above content that adapts to user preferences and interests in real time, and
  delivering these stories to users across wireless, wired, and cellular networks and multiple output devices.
Story Architecture
Main Components
STORY application developer component
  Specifies what data sources should be accessed in order to produce stories, and what criteria define a good story.
  Includes specifications of the contexts in which stories should be generated.
STORY end user component
  Specifies what hardware she would like her stories to be rendered on (e.g. PDA, laptop, cell phone) and what constitutes a “good” story.
  Provides methods to analyze collections of stories and render judgements about them.
The “Death of Pentheus” painting
Who was Pentheus?
Who punished him?
Why was he punished?
What do we know about his family?
Was this event depicted by other artists in the same period, in earlier periods, or in later periods, in the same or a different geographical region?
What is the story behind the Vetti?
Entities
Entity: describes an “object” of interest.
  All the known people depicted via images and sculptures
  People related in some way
  Places
In the case of the soldiers in Baghdad: terror groups, front companies, etc.
There is no need to enumerate this set of entities. They are dynamically created in STORY.
Attributes
We assume the existence of some set A whose elements are called attributes.
An attribute A in A has a domain dom(A). The set of ordinary attributes is
associated with the set of entities E iff E Adom(A)
…. Each entity can be characterized by the values of an ordinary attribute!
Example:
  Attribute: mother, Value: Agave
  Attribute: cartag, Value: AMD 124
  Attribute: employers, Value: {ibm, hp}
Temporal attributes
Time Varying Attribute (TVA) = (A, dom(A))
Timevalue for a TVA = a set of triples (vi, Li, Ui)
  vi are values; Li, Ui are integers or UNKNOWN
  Must satisfy the requirement that an attribute does not have two distinct values at the same time.
Example:
  Attribute: job
  Value = {(cardinal, 1500, 1509), (pope, 1510, 1545)}
Example:
  Attribute: worked-for
  Value = {(ibm, 1990, 1998), (hp, 1999, 2004)}
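The "no two distinct values at the same time" requirement can be checked mechanically. The sketch below is illustrative only and not from the talk: it assumes closed integer intervals, models UNKNOWN as Python's None, and (as a further assumption) treats a triple with an UNKNOWN bound as never provably conflicting. The names has_conflict and is_valid_timevalue are hypothetical.

```python
UNKNOWN = None  # stand-in for the UNKNOWN bound (assumption)

def has_conflict(t1, t2):
    """True if two (value, L, U) triples assign distinct values to overlapping periods."""
    (v1, l1, u1), (v2, l2, u2) = t1, t2
    if v1 == v2:
        return False  # same value twice is not a conflict
    if UNKNOWN in (l1, u1, l2, u2):
        return False  # assumption: overlap cannot be established with unknown bounds
    return l1 <= u2 and l2 <= u1  # closed intervals [l1, u1] and [l2, u2] intersect

def is_valid_timevalue(triples):
    """Check every unordered pair of triples for a conflict."""
    items = list(triples)
    return not any(
        has_conflict(a, b)
        for i, a in enumerate(items)
        for b in items[i + 1:]
    )

job = {("cardinal", 1500, 1509), ("pope", 1510, 1545)}
overlapping = {("ibm", 1990, 1998), ("hp", 1995, 2004)}
```

Here is_valid_timevalue(job) holds, while the deliberately overlapping employment history is rejected.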
Story Schema
A story schema is a pair (E, A), where E is the set of entities and A is the set of attributes of interest.
Examples
Set of entities in Pompeii:
  Set of all objects in Pompeii
  Set of all objects and events depicted
  Any entities related to the previous categories
Set of all people/organizations associated with Iraqi cars:
  Set of all car ids
  Set of owners of such cars
  Set of people associated with such owners via one or many links
Story Instance
An instance w.r.t. story schema (E, A) is a partial mapping.
  Input: an entity of E and an attribute of A
  Output: a value v in dom(A) if A is an ordinary attribute, or a timevalue if A is a TVA
Example
Pentheus was a Greek king who was an enemy of the god Bacchus. Angered by this, the Maenads (who were priestesses worshipping Bacchus) transformed Pentheus into an animal and had his mother, Agave, kill him.
A story schema (together with associated values) for this could be the following:
Occupation: is a time-varying attribute specifying Pentheus' occupation.
The value of this attribute could be king which says that he was king at an unknown time.
Enemy: is a time-varying attribute specifying who were enemies of Pentheus.
The value of this attribute could be Bacchus, Maenads. Notice that Bacchus and the Maenads are other entities.
Punishment: is a time-varying attribute specifying the punishments of Pentheus.
The value of this attribute could be “transformed into an animal”, “killed”.
Mother: is an ordinary attribute having the value “Agave”.
Example: US Immigration
Entity: a visitor to the US
Attributes:
  Name
  Citizenship
  Passport-number
  Photo
  Biometric attributes
  Purpose of visit
  Countries travelled to (TVA)
  Area of technical interests
  Known suspicious affiliations
Pentheus Story
Entity     Attribute    Value
Pentheus   Occupation   {(king, UNKNOWN, UNKNOWN)}
           Enemy        {({bacchus, maenads}, UNKNOWN, UNKNOWN)}
           Punishment   {({“transformed into an animal”, “killed”}, UNKNOWN, UNKNOWN)}
           Mother       Agave
Bacchus    Occupation   God
           Enemy        {(Pentheus, UNKNOWN, UNKNOWN)}
           Friends      {(Maenads, UNKNOWN, UNKNOWN)}
Maenads    Occupation   {(priestess, UNKNOWN, UNKNOWN)}
           Friends      {(Bacchus, UNKNOWN, UNKNOWN)}
UNKNOWN bounds mark an irrelevant time value.
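A story instance, being a partial mapping from (entity, attribute) pairs to values or timevalues, maps naturally onto a dictionary. This is a minimal sketch rather than the STORY implementation; the dictionary layout, the use of None for an unknown bound, and the helper name lookup are assumptions made for illustration.

```python
UNKNOWN = None  # stand-in for an unknown/irrelevant time bound (assumption)

# Keys are (entity, attribute); values are either an ordinary value
# or a timevalue, i.e. a set of (value, lower, upper) triples.
instance = {
    ("Pentheus", "Occupation"): {("king", UNKNOWN, UNKNOWN)},
    ("Pentheus", "Enemy"): {(frozenset({"Bacchus", "Maenads"}), UNKNOWN, UNKNOWN)},
    ("Pentheus", "Mother"): "Agave",
    ("Bacchus", "Occupation"): "God",
    ("Bacchus", "Friends"): {("Maenads", UNKNOWN, UNKNOWN)},
    ("Maenads", "Occupation"): {("priestess", UNKNOWN, UNKNOWN)},
}

def lookup(inst, entity, attribute):
    """Partial mapping: returns None when the instance says nothing about the pair."""
    return inst.get((entity, attribute))

mother = lookup(instance, "Pentheus", "Mother")
missing = lookup(instance, "Maenads", "Mother")
```

The mapping is partial: asking for the mother of the Maenads yields None rather than an error.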
How the system works
The story application developer first specifies a set of data sources that are to be accessed:
  WWW
  A relational database
  An object-oriented database
  A database of web documents
  Flat files
  A set of URLs
  Some combination of the above
How the system works (2)
The story application developer then specifies a set of properties (not their values) of a place or a person or an artifact or an event that an end-user might be interested in.
The properties of interest may be things like father, mother, occupation, collaborators and so on.
The developer also associates priorities with the properties; these depend on his application needs.
Attribute Extractor
Uses the mediator as well as WordNet to ask queries to appropriate data sources.
It extracts information about the values of the attributes involved.
  For example, in our Pentheus application, the attribute extractor accesses HTML pages and extracts from those pages the names of all entities involved; for each such entity, it tries to check whether a given attribute has a value.
We have also defined algorithms to extract information from relational, flat-file, and XML sources.
Attribute Extractor (2)
Results returned by the attribute extractor:
  a set of (entity, attribute, value) triples, or
  a set of such triples with an associated time stamp
Results can be stored in an RDF database, a relational DBMS, or an XML DBMS.
We have also implemented a web spider that can crawl over a set of data sources and populate the attribute database.
Source Access Table (SAT)
We assume that our data sources have an associated application program interface (API).
The SAT describes how to extract an attribute's value using a source's API.
A SAT-tuple is (A, s, f_{A,s}), where f_{A,s} is a partial function (body of software code) that maps objects to values or timevalues.
A SAT is a finite set of SAT-tuples.
Basically, the SAT specifies what code (f_{A,s}) to use to extract values of attribute A w.r.t. source s.
The size of the SAT is at most O(m*n), where m is the number of sources and n is the number of attributes.
Methods to process such f's have been previously developed in many systems, e.g.
  TSIMMIS from Stanford
  HERMES, IMPACT from UMD
  etc.
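A source access table can be pictured as a list of (attribute, source, function) entries, one per SAT-tuple. The sketch below only illustrates that shape: the two in-memory "sources", their contents, and the helper collect_facts are hypothetical, and a real f_{A,s} would call the source's API instead.

```python
# Two toy sources with different native layouts (hypothetical data).
ENCYCLOPEDIA = {"Pentheus": {"mother": "Agave"}}
MYTH_DB = [("Pentheus", "occupation", "king"), ("Pentheus", "mother", "Agave")]

def f_mother_encyclopedia(entity):
    """f_{mother, encyclopedia}: partial, returns None when the source is silent."""
    return ENCYCLOPEDIA.get(entity, {}).get("mother")

def f_from_myth_db(attribute):
    """Build f_{attribute, myth_db} by scanning the row-oriented source."""
    def f(entity):
        for e, a, v in MYTH_DB:
            if e == entity and a == attribute:
                return v
        return None
    return f

# The SAT itself: one (A, s, f_{A,s}) entry per attribute/source pair.
SAT = [
    ("mother", "encyclopedia", f_mother_encyclopedia),
    ("mother", "myth_db", f_from_myth_db("mother")),
    ("occupation", "myth_db", f_from_myth_db("occupation")),
]

def collect_facts(sat, entities):
    """Apply every extraction function to every entity; keep what is defined."""
    facts = []
    for attribute, source, f in sat:
        for entity in entities:
            value = f(entity)
            if value is not None:
                facts.append((entity, attribute, value, source))
    return facts

facts = collect_facts(SAT, ["Pentheus", "Bacchus"])
```

With m sources and n attributes the table holds at most m*n entries, which matches the bound stated above.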
Valid and Full instance
Intuitively, an instance w.r.t. some source access table is
  valid if every fact (i.e. every assignment of a value to an attribute for an entity) is supported by at least one source;
  full when it accumulates all the facts reported by the various sources.
NOT ENOUGH.
  Generalization needed
  Conflict management needed
Extraction of attribute values
Web sources
1. The web is searched for pages related to the entity of interest (a person, a place, or an event) in a specific domain (Greek Mythology, Roman History, …) using a search engine such as Google.
2. An HTML parser analyzes the pages returned by the search engine and extracts significant pieces of text, taking into account the structure of the page.
3. A lexical analysis is performed using WordNet. The result of this step is a tagged version of the original text, in which each word is labeled with its corresponding part of speech.
Extraction of attribute values
4. An entity detection algorithm recognizes, based on some heuristics we have developed, the names of people, organizations, places, etc. occurring in the text.
5. This algorithm can be trained on large data corpora to acquire a knowledge base that improves its performance.
  The algorithm is also capable of recognizing different representations of the same name (e.g. Dr. H.J. Smith, H.J. Smith, Hanan J. Smith) and classifying the names (e.g. Dr. H.J. Smith is a person while Glass Inc. is a company).
Extraction of attribute values
6. Some minor tasks
  Pronoun resolution: the issue of mapping a pronoun into an entity named somewhere
  Word sense disambiguation: each word may represent different parts of speech and may have several meanings depending on the context
7. The result of executing these algorithms is a rewritten and unambiguous version of the original text.
Extraction of attribute values
8. A semantic parser applies a set of rules that, based on the structure of sentences, permit us to deduce the entity-attribute-value triples.
  Semantic rules are of the form Tail → Head
  Tail is a condition to be evaluated on a sentence of words from the text.
  If this condition is satisfied, the head says how to extract one or more entity-attribute-value triples from the sentence.
  Our system contains over 300 rules. We plan to increase this to around 1000 in the next 3 months.
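To make the tail/head idea concrete, here is a toy rule set in which the tail is a regular expression over a sentence and the head is a function producing triples. This is an assumption-laden illustration: the talk does not say how rules are encoded, and the two patterns and the name extract_triples are invented for this sketch.

```python
import re

# Each rule is (tail, head): the tail is a condition on the sentence,
# the head turns a successful match into entity-attribute-value triples.
RULES = [
    (re.compile(r"^(?P<x>[A-Z]\w+) was the mother of (?P<y>[A-Z]\w+)\.?$"),
     lambda m: [(m["y"], "mother", m["x"])]),
    (re.compile(r"^(?P<x>[A-Z]\w+) was (?:the )?king of (?P<place>[A-Z]\w+)\.?$"),
     lambda m: [(m["x"], "occupation", "king")]),
]

def extract_triples(sentence):
    """Fire every rule whose tail matches and collect the heads' triples."""
    triples = []
    for tail, head in RULES:
        match = tail.match(sentence)
        if match:
            triples.extend(head(match))
    return triples

t1 = extract_triples("Agave was the mother of Pentheus.")
t2 = extract_triples("Pentheus was king of Thebes.")
t3 = extract_triples("The weather was fine.")
```

Real rules would operate on the tagged, disambiguated text produced by the earlier steps rather than on raw strings.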
User can cut and paste a sentence and specify the entity, attribute, and value in it. STORY learns a more general rule from it.
XML sources
Consider an XML node N = <name, value, {c1, …, cn}>, where {c1, …, cn} are children nodes.
Assuming that N is a root node in an XML document, and nodes may act both as entities and the attributes….
  e is an entity
  A is an attribute
<person>
  <name> John Doe </name>
  <height> 170 </height>
  <eyes> black </eyes>
  …
</person>
GetXMLAttr(N,e,A)
begin
  Result := ∅
  if N.value = e or N.name = e then
    for each child c of N such that c.name = A do
      Result := Result ∪ {c.value}
    end for
  else
    for each child c of N do
      Result := Result ∪ GetXMLAttr(c,e,A)
    end for
  end if
  return Result
end
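The recursion above translates almost line for line into Python with the standard library's xml.etree.ElementTree. This is a sketch under two assumptions: a node's "name" is taken to be its tag and its "value" its stripped text content. The function name get_xml_attr and the wrapping <people> element are illustrative.

```python
import xml.etree.ElementTree as ET

def get_xml_attr(node, entity, attribute):
    """Collect values of `attribute` children under every node that matches `entity`."""
    result = set()
    value = (node.text or "").strip()
    if value == entity or node.tag == entity:
        # The node is the entity: read its matching children.
        for child in node:
            if child.tag == attribute:
                result.add((child.text or "").strip())
    else:
        # Otherwise keep searching deeper in the tree.
        for child in node:
            result |= get_xml_attr(child, entity, attribute)
    return result

doc = ET.fromstring(
    "<people><person><name> John Doe </name>"
    "<height> 170 </height><eyes> black </eyes></person></people>"
)
heights = get_xml_attr(doc, "person", "height")
```

As in the pseudocode, once a node matches the entity the search does not descend further below it.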
CPR
There are good stories and bad stories.
The STORY architecture supports the goals of succinctness and exploration and creates stories with respect to three important parameters:
  the priority of the story content,
  the continuity of the story,
  the non-repetition of facts covered by the story.
We want to deliver the most important facts to the intended audience.
So far, we have focused primarily on priority and non-repetition, worrying less about continuity.
CPR examples
In the story of Pentheus, it makes more sense to first say that his parents were Cadmus and Agave, then say he reigned as King of Thebes, and then explain why he was killed.
  This rendering of the story is in chronological order, ensuring a kind of temporal continuity.
  Other measures of continuity are also possible within the STORY framework.
A repetition function may evaluate how much repetition there is in a given story.
  For example, in the case of Pentheus, we may extract the fact that Agave is a parent of Pentheus, and that Agave is the mother of Pentheus. Including both these facts in a story is repetitive as the latter fact subsumes the former.
Story evaluation function
$eval(s) = \alpha \cdot \pi(s) + \beta \cdot \kappa(s) - \gamma \cdot \rho(s)$
$\pi$, $\kappa$, $\rho$ are arbitrary functions from the set of all possible stories S about some entities to [0,1].
$\pi$ describes whether high priority facts are included in the story.
  For example, the fact that Pentheus' mother was Agave is more important than the length of Pentheus' big toe.
$\kappa$ describes how continuous the story is.
  This means that a story should not jump wildly from one fact to another.
$\rho$ describes repetition.
  Clearly, stories that repeat the same or similar facts over and over again leave much to be desired.
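A small numeric illustration of such an evaluation function: a weighted priority term plus a weighted continuity term, minus a weighted repetition term. Everything concrete here is assumed for the example, including the weight names a, b, c, the priority table, and the three toy scoring functions; the talk allows arbitrary functions into [0,1].

```python
# Hypothetical per-attribute priorities in [0, 1].
PRIORITY = {"mother": 1.0, "occupation": 0.8, "toe_length": 0.1}

def priority(story):
    """Mean priority of the facts (a fact is an (entity, attribute, value) triple)."""
    if not story:
        return 0.0
    return sum(PRIORITY.get(a, 0.0) for _, a, _ in story) / len(story)

def continuity(story):
    """Fraction of adjacent facts that stay on the same entity."""
    if len(story) < 2:
        return 1.0
    same = sum(1 for x, y in zip(story, story[1:]) if x[0] == y[0])
    return same / (len(story) - 1)

def repetition(story):
    """Fraction of facts that are exact duplicates of another fact."""
    if not story:
        return 0.0
    return 1.0 - len(set(story)) / len(story)

def evaluate(story, a=1.0, b=1.0, c=1.0):
    """Reward priority and continuity, penalize repetition."""
    return a * priority(story) + b * continuity(story) - c * repetition(story)

good = [("Pentheus", "mother", "Agave"), ("Pentheus", "occupation", "king")]
bad = [("Pentheus", "toe_length", "3cm"), ("Pentheus", "toe_length", "3cm")]
```

The informative, non-repeating story scores higher than the one that repeats a low-priority fact.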
CPR functions
There are many ways of defining how continuous a story is, how repetitive a story is, etc.
Our story creation algorithms can work with any continuity, priority and repetition functions whatsoever.
  We have defined small sets of different continuity and repetition functions.
  User context can be used to learn priority functions.
Attribute Hierarchy
The attributes of interest are arranged in an attribute hierarchy where attributes can be labeled with priorities.
  The story application developer can browse and edit this hierarchy (for example if he wishes to add new attributes).
  He can add priorities to selected items in the hierarchy (all sub-elements of a given element in the hierarchy will inherit the priority value of the parent unless otherwise stated).
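Priority inheritance can be sketched as a walk up the hierarchy until an explicitly assigned priority is found. The hierarchy, the priority values, the default of 0.0, and the name effective_priority are all hypothetical.

```python
# child -> parent links of a small attribute hierarchy (hypothetical).
PARENT = {"mother": "parent", "father": "parent", "parent": "family", "family": None}

# Priorities set explicitly by the developer; everything else inherits.
EXPLICIT = {"family": 0.5, "mother": 0.9}

def effective_priority(attribute, parents, explicit, default=0.0):
    """Nearest explicit priority on the path from the attribute to the root."""
    node = attribute
    while node is not None:
        if node in explicit:
            return explicit[node]
        node = parents.get(node)
    return default

p_father = effective_priority("father", PARENT, EXPLICIT)
p_mother = effective_priority("mother", PARENT, EXPLICIT)
```

Here "father" inherits 0.5 from "family" through "parent", while "mother" keeps its own explicit 0.9.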
Conflict Management
As multiple data sources may be used to extract attributes, conflicts might occur.
  For example, one source may say that Pentheus' mother is Agave, while another may say it is Hera.
STORY allows conflict resolution with an application specific method.
Conflicts do not always need to be resolved. Sometimes, you just report the existence of a conflict, and specify what should be reported.
Example Conflict Management Policies
Temporal conflict resolution.
  Suppose different data sources provide different values v1, …, vn. Suppose value vi was inserted into the data source at time ti. In this case, we pick the value vi such that ti = max{t1, t2, …, tn}. If multiple exist, one is selected randomly.
Source based conflict resolution.
  The developer of a story may assign a credibility ci to each source si that provides a value vi for attribute A of entity e. This strategy picks value vi such that ci = max{c1, …, cn}. If multiple exist, one is selected randomly.
Voting based conflict resolution.
  Each value vi returned by at least one data source has a vote that represents the number of sources that return value vi. In this case, this conflict resolution strategy returns the value with the highest vote. If multiple vi's have the same highest vote, one is picked randomly and returned.
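The three policies are short enough to sketch directly. The report format (a dict with value, source and time keys), the credibility table, and the function names are assumptions for illustration; ties are broken with random.choice, as the policies prescribe.

```python
import random
from collections import Counter

def temporal(reports, rng=random):
    """Pick the most recently inserted value; random among ties."""
    latest = max(r["time"] for r in reports)
    return rng.choice([r["value"] for r in reports if r["time"] == latest])

def source_based(reports, credibility, rng=random):
    """Pick the value from the most credible source; random among ties."""
    best = max(credibility[r["source"]] for r in reports)
    return rng.choice(
        [r["value"] for r in reports if credibility[r["source"]] == best]
    )

def voting(reports, rng=random):
    """Pick the value returned by the most sources; random among ties."""
    votes = Counter(r["value"] for r in reports)
    top = max(votes.values())
    return rng.choice([v for v, n in votes.items() if n == top])

# Hypothetical reports about Pentheus' mother.
reports = [
    {"value": "Agave", "source": "s1", "time": 1},
    {"value": "Hera", "source": "s2", "time": 3},
    {"value": "Agave", "source": "s3", "time": 2},
]
credibility = {"s1": 0.9, "s2": 0.2, "s3": 0.5}
```

On this data the temporal policy returns Hera (latest insertion), while the source-based and voting policies both return Agave.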
Generalization Module
Goal: to generalize multiple RDF triples into one.
  For example, if we know that Pentheus's father is Cadmus, and his mother is Agave, we may want to generalize this to say that Pentheus's parents are Cadmus and Agave.
  If Pentheus was king of one town for some period, king of another town for another period of time, and so on, we may merely want to say that Pentheus was king of many places.
The Generalization Module looks at the RDF triples stored in the RDF database and augments it with triples that include generalization attributes … that succinctly summarize a set of less general (i.e. more specific) attributes.
Generalized Story Schema
A generalized story schema consists of
  a regular story schema,
  a function that associates an equivalence relation with each attribute domain, and
  a function that associates a generalization function with each attribute domain.
An equivalence relation on dom(A) specifies when certain values in the domain are considered equivalent.
  For example, we may consider string values “king” and “monarch” to be equivalent in dom(occupation).
  For a time varying attribute we may consider (“king”, L, U) and (“monarch”, L', U') to be equivalent independently of whether L = L' and U = U' is true or not.
Our system uses WordNet and specialized heuristics we have developed to infer equivalence relationships between terms.
Generalization is currently being plugged into the system.
STORY creation
Construct a story of length k or less from the RDF database.
  This is done by examining all triples for the entity of interest, including triples extracted from the data sources by the attribute extractor as well as triples created by the generalization module.
It then finds the k triples that optimize any objective function satisfying the following conditions:
  monotonic in the priority of the triples,
  monotonic w.r.t. the continuity function selected by the STORY application developer, and
  anti-monotonic in the amount of repetition between triples.
Closed Instance: handles generalizations
Consider the full instance associated with our source access table.
Now split this instance into equivalence classes using the selected equivalence relation.
Suppose the equivalence classes thus generated are X1, …, Xn.
For each equivalence class Xi we compute the generalization vi using the generalization function associated with attribute A. We insert the tuple (e, A, vi) into the full instance.
This process is repeated for all entities e and all attributes A.
The closed instance is obtained after adding all such triples to the full instance.
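The closure step can be sketched as follows, using the "king of many places" generalization as the running case. The representation is assumed: the full instance is a dict from (entity, attribute) to a set of values, the equivalence relation is given by a canonicalizing function, and the town names are placeholders. None of the names below come from the STORY implementation.

```python
def equivalence_classes(values, canon):
    """Group values whose canonical form (per the equivalence relation) coincides."""
    classes = {}
    for v in values:
        classes.setdefault(canon(v), []).append(v)
    return list(classes.values())

def close_instance(full, canon_for, generalize_for):
    """Add one generalized value per equivalence class to a copy of the full instance."""
    closed = {key: set(vals) for key, vals in full.items()}
    for (entity, attribute), values in full.items():
        canon = canon_for.get(attribute)
        generalize = generalize_for.get(attribute)
        if canon is None or generalize is None:
            continue  # no equivalence/generalization declared for this attribute
        for cls in equivalence_classes(sorted(values), canon):
            closed[(entity, attribute)].add(generalize(cls))
    return closed

# Hypothetical equivalence: same role regardless of the town.
def role(v):
    return v.split(" of ")[0]

# Hypothetical generalization: several towns collapse to "many places".
def summarize(cls):
    return cls[0] if len(cls) == 1 else role(cls[0]) + " of many places"

full = {("Pentheus", "occupation"): {"king of town A", "king of town B"}}
closed = close_instance(full, {"occupation": role}, {"occupation": summarize})
```

The original facts are kept; the closed instance only adds the summarizing triple.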
Story Computation Problem
Given a closed instance I, a positive integer k, and an entity e as input,
find a story of size k that maximizes the value of a given evaluation function eval.
The story returned is called an optimal story.
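For intuition, the problem can be solved by brute force: enumerate every k-subset of the entity's facts and keep the best-scoring one. This is only a naive baseline written for illustration, not the algorithms of the talk; the fact set, the toy evaluation function, and the name optimal_story are assumptions. Its cost grows with the number of k-subsets, which is why faster heuristics matter.

```python
from itertools import combinations

def optimal_story(instance, entity, k, evaluate):
    """Exhaustively search all size-k stories about `entity` (exponential cost)."""
    facts = sorted(f for f in instance if f[0] == entity)
    best, best_score = (), float("-inf")
    for story in combinations(facts, min(k, len(facts))):
        score = evaluate(story)
        if score > best_score:
            best, best_score = story, score
    return list(best)

# Hypothetical priorities and a toy evaluation: reward priority,
# penalize stories that mention the same value more than once.
PRIORITY = {"mother": 1.0, "occupation": 0.8, "parent": 0.6, "toe_length": 0.1}

def toy_eval(story):
    gain = sum(PRIORITY[a] for _, a, _ in story)
    values = [v for _, _, v in story]
    return gain - (len(values) - len(set(values)))

instance = {
    ("Pentheus", "mother", "Agave"),
    ("Pentheus", "parent", "Agave"),
    ("Pentheus", "occupation", "king"),
    ("Pentheus", "toe_length", "short"),
    ("Bacchus", "occupation", "god"),
}
story = optimal_story(instance, "Pentheus", 2, toy_eval)
```

With k = 2 the search prefers "mother" plus "occupation" over "mother" plus "parent", since the latter pair repeats Agave.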
Story Algorithms
OptSTORY algorithm: finds the story that optimizes the objective function.
  This algorithm has the disadvantage of being very slow.
DynSTORY(S) uses a dynamic programming approach.
GenSTORY(S) is based on genetic programming.
DynSTORY and GenSTORY find suboptimal stories, but do so very fast.
GPS Support Subsystem: Current implementation
Outdoor positioning at Pompeii implemented using DGPS
Mobile devices are equipped with IEEE 802.11b wireless Ethernet to allow internet connection
GIS Support Subsystem: Outdoor and indoor positioning
Outdoor positioning
  GPS has been successfully adopted in a lot of applications
Indoor positioning
  GPS receivers are blind in indoor spaces
  Different kinds of positioning systems will be used:
    Infrared or ultrasound sensors
    Radio Frequency sensors
    WLAN-based positioning
We have methods to optimally position a set of sensors to monitor the site, but the system is not yet implemented.
STORY presentation
Our STORY architecture applies to several different hardware options
  Our current implementation works for both PDAs and laptops.
Multiple languages
  We currently support English, Spanish and Italian.
Multiple output rendering
  Via a graphical user interface or via speech
Methods to merge multiple such sentences into one are being implemented.
User Preferences
A specific tourist interested in the (mythological) Greek cuisine may add attributes relevant to this, together with appropriate priorities for them.
In the same vein, he can change the priorities set by the STORY application developer.
A learning component learns the user's preferences over time and automatically adjusts his priorities.
  We are currently adding these capabilities to the STORY implementation.
Recommendations
Recommendations for current users are based on the behavior of past users
Behavior is represented through the usage patterns of the users
A Usage pattern p of length k is defined as
Useful for pre-fetching!
Paths can also be time-stamped to gauge user interests.
Comparison of usage patterns
Some distances (e.g. Levenshtein) have been defined to evaluate the distance between sequences of symbols from a given alphabet.
  Only the alignment of the symbols is taken into account.
Our approach
  Evaluate the similarity between patterns based on the similarity between objects.
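One way to picture the difference: the classic Levenshtein distance charges a fixed cost of 1 for substituting any symbol, whereas a similarity-aware variant can charge less when the two visited objects are related. The sketch below is an assumption about how such a distance could look, not the measure used in STORY; the region table and cost values are invented.

```python
def pattern_distance(p, q, sub_cost):
    """Edit distance between two sequences with a pluggable substitution cost."""
    prev = list(range(len(q) + 1))  # distance from empty prefix of p
    for i, a in enumerate(p, 1):
        cur = [i]
        for j, b in enumerate(q, 1):
            cur.append(min(
                prev[j] + 1,                   # delete a
                cur[j - 1] + 1,                # insert b
                prev[j - 1] + sub_cost(a, b),  # substitute a by b
            ))
        prev = cur
    return prev[-1]

def levenshtein_cost(a, b):
    """Classic cost: symbols either match or they do not."""
    return 0 if a == b else 1

# Hypothetical object similarity: objects in the same region are "close".
REGION = {
    "triclinium": "House of the Vetti",
    "atrium": "House of the Vetti",
    "forum": "Forum",
}

def similarity_cost(a, b):
    if a == b:
        return 0
    return 0.5 if REGION.get(a) == REGION.get(b) else 1

p = ["atrium", "forum"]
q = ["triclinium", "forum"]
d_plain = pattern_distance(p, q, levenshtein_cost)
d_similar = pattern_distance(p, q, similarity_cost)
```

The two visit patterns are at distance 1 under the plain cost but only 0.5 once the shared region is taken into account.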
Analysis Tools
A historian in Pompeii (rather than the casual tourist) may want to know how perceptions of some of the prominent families in Pompeii (such as the Vetti) changed over time by analyzing historical records from different periods of time.
For the intelligence community, we may be interested in knowing how opinions about events may change over time and space.
EX: How have perceptions of Abu Ghraib changed over the past 3 months in different countries in the Middle East?
We have developed over 5 algorithms to gauge opinion and perform a spatio-temporal analysis of opinion, and are running tests on them.
STORY Experiments
Parameters to be evaluated
  Value of the facts included into the stories
  Quality of the prose (does it read nicely)
Experiments plan
  61 students enrolled as reviewers
    51 non-experts (no a priori knowledge about the subjects of the stories)
    10 experts (a priori knowledge)
  Facts and prose evaluated for
    Different algorithms
    Different rendering techniques
    Different CPR parameter settings
    Different lengths of the stories
Value of the facts vs. length of the story: Trends
Value of the facts vs. length of the story: Considerations
Highest priorities:
  GenSTORY (version 1: using original sentences from sources if available instead of only using templates) wins.
  Runner-up is DynSTORY (version 1).
Even if we ignore how the stories are rendered, GenSTORY still wins.
Including the original sentences in the story adds more information content than rendering the same fact through a template.
Quality of the prose vs. length of the story: Trends
Quality of the prose vs. length of the story: Considerations
The quality of the prose is high and seems independent of the algorithm used
Quality of prose decreases as the story length increases (not surprising).
Including sentences from text sources into stories improves story quality.
Value of the facts and quality of the prose: Summary
Value of the facts vs. CPR parameters: Trends
Value of the facts vs. CPR parameters: Considerations
Best “value of facts” is obtained when the priority is set to a high value.
  Users are more interested in priority than in continuity and repetition.
Repetition should be avoided when the story is very short.
  For low values of L, the best results are obtained when R is set to a high value.
Contact Information
V.S. Subrahmanian
  Department of Computer Science, University of Maryland at College Park, USA
  email: [email protected]
Antonio Picariello
  Dipartimento di Informatica e Sistemistica, Università di Napoli “Federico II”, Italy
  email: [email protected]
Acknowledgment
SOPRINTENDENZA ARCHEOLOGICA DI POMPEI
  Prof. Gian Pietro Guzzo
  Dott. Anna Maria Sodo
US Army
  DANA ULERY (ARL)
Industry
  JOE LEWTHWAITE (General Dynamics)