The Role of CNL and AMR in Scalable Abstractive Summarization for Multilingual Media Monitoring
Normunds Grūzītis and Guntis Bārzdiņš
University of Latvia, IMCS / National Information Agency LETA
5th Workshop on Controlled Natural Language, 25–26 July 2016, Aberdeen, Scotland
Large-scale media monitoring
BBC Monitoring journalists translate from 30 languages into English and follow 400 social media accounts every day.
A monitoring journalist typically monitors 4 TV channels and several online sources simultaneously. This is about the maximum that any person can cope with mentally and physically. The required human effort thus scales linearly with the number of monitored sources.
Monitoring journalists constantly need to be on the lookout for more sources and follow important stories—but as it is, they are tied down with mundane, routine monitoring tasks.
Monitoring 250 video channels results in a daily buffer of 2.5 TB, a weekly buffer of 19 TB, and an annual buffer of 1 PB.
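The buffer sizes follow from the daily rate; a quick back-of-envelope check (the quoted weekly figure of 19 TB presumably includes some overhead beyond the raw 2.5 TB/day):

```python
# Sanity check on the monitoring buffer arithmetic:
# 250 channels producing ~2.5 TB of video per day.
daily_tb = 2.5
weekly_tb = daily_tb * 7            # ~17.5 TB raw; quoted as ~19 TB
annual_pb = daily_tb * 365 / 1000   # ~0.9 PB, i.e. on the order of 1 PB

print(f"weekly: {weekly_tb:.1f} TB, annual: {annual_pb:.2f} PB")
```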
SUMMA – Scalable Understanding of Multilingual MediA
Identify people, places, events of interest
Discover trends, emerging events, crucial new stories
H2020 grant No. 688139
Timeline / Storyline
Event-based multi-document summarization: storyline highlights across a set of related stories
Input: unrestricted text; summaries: sort of CNL? (templates)
• Extractive summarization selects representative sentences from the input documents
• Abstractive summarization builds a semantic representation from which a summary is generated
• What semantic representation?
Sentence A: I saw Joe’s dog, which was running in the garden.
Sentence B: The dog was chasing a cat.
Summary: Joe’s dog was chasing a cat in the garden.
Liu F., Flanigan J., Thomson S., Sadeh N., Smith N.A. Toward Abstractive Summarization Using Semantic Representations. NAACL 2015
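The fusion above falls out naturally once both sentences are mapped to semantic graphs that share nodes. A toy sketch of the merging idea (hypothetical triple encoding; Liu et al. 2015 actually merge full AMR graphs and select a summary subgraph with structured prediction):

```python
# Toy sketch: sentence meanings as (source, relation, target) concept triples.
sent_a = {("see-01", "ARG0", "i"), ("see-01", "ARG1", "dog"),
          ("dog", "poss", "joe"), ("run-02", "ARG0", "dog"),
          ("run-02", "location", "garden")}
sent_b = {("chase-01", "ARG0", "dog"), ("chase-01", "ARG1", "cat")}

# Once coreferent concepts are unified ("dog" in A = "dog" in B),
# merging is set union; the shared node links the chasing event
# to Joe and the garden, enabling the fused summary sentence.
merged = sent_a | sent_b
assert ("chase-01", "ARG0", "dog") in merged and ("dog", "poss", "joe") in merged
```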
Abstractive summarization
AMR – Abstract Meaning Representation
• A semantic representation aimed at large-scale human annotation
• A practical, replicable amount of abstraction
• Captures many aspects of meaning in a single simple data structure
• Aims to abstract away from (English) syntax
• Rooted, labeled graphs
• Makes heavy use of PropBank framesets
• An actual sembank of nearly 50K sentences
• Sentences paired with their whole-sentence, logical meanings
AMR – Abstract Meaning Representation
• A form of AMR has been around for a long time (Langkilde and Knight, 1998)
• It has changed a lot since then: PropBank, DBpedia, etc.
• Banarescu et al. (2013) – the fundamentals of the current AMR annotation scheme
• Uses the PENMAN notation (Bateman, 1990)
• A way of representing a directed labeled graph in a simple tree-like form
• Easy to read and write (for a human), and to traverse (for a program)
• From semantic role labelling (SRL) to whole-sentence representation
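The "easy to traverse (for a program)" point can be made concrete with a minimal reader for the simplified PENMAN used on these slides (a toy sketch only; the third-party `penman` Python package provides a full, robust implementation):

```python
import re

def parse_penman(s):
    """Parse a simplified PENMAN string into a (var, concept, edges) tuple."""
    tokens = re.findall(r'\(|\)|/|:[A-Za-z0-9-]+|"[^"]*"|[^\s()/:]+', s)
    pos = 0

    def node():
        nonlocal pos
        assert tokens[pos] == "("; pos += 1
        var = tokens[pos]; pos += 1          # variable, e.g. "e"
        assert tokens[pos] == "/"; pos += 1
        concept = tokens[pos]; pos += 1      # concept, e.g. "eat-01"
        edges = []
        while tokens[pos] != ")":
            role = tokens[pos]; pos += 1     # relation, e.g. ":ARG0"
            if tokens[pos] == "(":
                target = node()              # nested node
            else:
                target = tokens[pos]         # constant or reentrant variable
                pos += 1
            edges.append((role, target))
        pos += 1                             # consume ")"
        return (var, concept, edges)

    return node()

amr = parse_penman("(e / eat-01 :ARG0 (d / dog) :ARG1 (b / bone))")
```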
AMR – Abstract Meaning Representation
• Nodes are variables labelled by concepts
• Entities, events, states, properties
• d / dog: d is an instance of dog
• Edges are semantic relations
• E.g. “The dog is eating bones.”
(e / eat-01
   :ARG0 (d / dog)
   :ARG1 (b / bone))
eat.01: consume (VN class: eat-39.1, FN frame: Ingestion)
  ARG0-PAG: consumer, eater (VN role: agent)
  ARG1-PPT: meal (VN role: patient)
[Graph: node e / eat-01 with an ARG0 edge to d / dog and an ARG1 edge to b / bone]
AMR – Abstract Meaning Representation
“Bob ate four cakes that he bought.”
(x2 / eat-01
   :ARG0 (x1 / person
      :name (n / name
         :op1 "Bob")
      :wiki "Bob_X")
   :ARG1 (x4 / cake
      :quant 4
      :ARG1-of (x7 / buy-01
         :ARG0 x1)))
[Graph: x2 / eat-01 with ARG0 → x1 / person (name → "Bob", wiki → "Bob_X") and ARG1 → x4 / cake (quant → 4, ARG1-of → x7 / buy-01, whose ARG0 points back to x1)]
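The reentrancy in this example (x1 fills the eater role of eat-01 and the buyer role of buy-01) is what makes AMRs graphs rather than trees; flattening to triples makes that visible. A self-contained sketch with a hand-built nested-tuple encoding of the graph:

```python
# "Bob ate four cakes that he bought" as nested (var, concept, edges) tuples;
# the bare string "x1" under buy-01 encodes the reentrant node.
bob = ("x2", "eat-01", [
    (":ARG0", ("x1", "person", [(":name", ("n", "name", [(":op1", '"Bob"')])),
                                (":wiki", '"Bob_X"')])),
    (":ARG1", ("x4", "cake", [(":quant", "4"),
                              (":ARG1-of", ("x7", "buy-01",
                                            [(":ARG0", "x1")]))]))])

def triples(node, acc=None):
    """Flatten a nested AMR tuple into (source_var, role, target) triples."""
    acc = [] if acc is None else acc
    var, concept, edges = node
    acc.append((var, ":instance", concept))
    for role, tgt in edges:
        if isinstance(tgt, tuple):
            acc.append((var, role, tgt[0]))
            triples(tgt, acc)
        else:
            acc.append((var, role, tgt))
    return acc

ts = triples(bob)
# Both eat-01 and buy-01 point at the same variable x1:
assert ("x2", ":ARG0", "x1") in ts and ("x7", ":ARG0", "x1") in ts
```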
AMR – Abstract Meaning Representation
Schneider N., Flanigan J., O’Gorman T. AMR Tutorial at NAACL 2015. https://github.com/nschneid/amr-tutorial/
• AMR is still biased towards English or other source languages
• Not an Interlingua, but close: comparison of English AMRs to Chinese and Czech
  Xue N., Bojar O., Hajič J., Palmer M., Uresova Z., Zhang X. LREC 2014
• Meanwhile, AMR is agnostic about how to derive meanings from strings, and vice versa
Natural Language Understanding
• While it has recently been shown that the CNL approach can be scaled up…
• Embedded CNLs allowing for CNL-based domain-specific information extraction
• CNL as an efficient and user-friendly interface for Big Data end-point querying
• CNL for bootstrapping robust NL interfaces
• High-level CNL for legal sources
• …use cases like media monitoring are not limited to a particular domain; the input sources vary from newswire texts to TV and radio transcripts to user-generated content in social networks
• In the era of Big Data, there is a dominating view that Deep Learning is the only way to cope with robust and scalable NLU
• Supposedly, NLU cannot be approached by CNLs, or by grammars in general (?)
SemEval 2016 Task 8 on AMR parsing
1. Riga (University of Latvia / LETA): 0.6196
2. CAMR (Brandeis University / Boulder Learning Inc. / Rensselaer Polytechnic Institute): 0.6195
3. ICL-HD (Ruprecht-Karls-Universität Heidelberg): 0.6005
4. UCL+Sheffield (University College London / University of Sheffield): 0.5983
5. M2L (Kyoto University): 0.5952
6. CMU (Carnegie Mellon University / University of Washington): 0.5636
7. CU-NLP (OK Robot Go Ltd. / University of Colorado): 0.5566
8. UofR (University of Rochester): 0.4985
9. MeaningFactory (University of Groningen): 0.4702*
10. CLIP@UMD (University of Maryland): 0.4370
11. DynamicPower (National Institute for Japanese Language and Linguistics): 0.3706*
* Did not use AMR training data
NLG from AMR
• The potential of grammar-based and CNL approaches becomes obvious in the opposite direction
• e.g. in the generation of story highlights from summarized (pruned) AMR graphs
• Text generation from AMR is still recognized as a future task
• An unexplored niche for grammars and CNLs
• GF, for instance, as an excellent framework for implementing multilingual AMR verbalizers
• Issue: AMR to AST mapping
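Generating highlights presupposes first pruning the summarized AMR graph down to its salient core. A sketch of one simple pruning heuristic over nested (var, concept, edges) tuples (illustrative only, with invented concepts; not the SUMMA pipeline):

```python
# Sketch: prune an AMR tuple for highlight generation by keeping only
# edges whose subtree mentions a salient concept. (Illustrative heuristic.)

def mentions(node, salient):
    """True if the subtree rooted at node contains a salient concept."""
    var, concept, edges = node
    return concept in salient or any(
        isinstance(t, tuple) and mentions(t, salient) for _, t in edges)

def prune(node, salient):
    """Drop sub-nodes that contain no salient concept."""
    var, concept, edges = node
    kept = [(r, prune(t, salient) if isinstance(t, tuple) else t)
            for r, t in edges
            if not isinstance(t, tuple) or mentions(t, salient)]
    return (var, concept, kept)

# A made-up newswire sentence graph: "A spokesman said yesterday that
# there was an attack in the city."
sent = ("e", "say-01", [
    (":ARG0", ("s", "spokesman", [])),
    (":ARG1", ("a", "attack-01", [(":location", ("c", "city", []))])),
    (":time", ("y", "yesterday", []))])

highlight = prune(sent, {"attack-01", "city"})
```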
Pourdamghani N., Gao Y., Hermjakob U., Knight K. Aligning English Strings with Abstract Meaning Representation Graphs. EMNLP 2014
Butler A. Deterministic natural language generation from meaning representations for machine translation. NAACL 2016 Workshop on Semantics-Driven Machine Translation
Pourdamghani N., Knight K., Hermjakob U. Generating English from Abstract Meaning Representations. INLG 2016 (to appear)
Flanigan J., Dyer C., Smith N.A., Carbonell J. Generation from Abstract Meaning Representation using Tree Transducers. NAACL 2016
NLG from AMR
• Butler A. 2016. Deterministic natural language generation from meaning representations for machine translation. NAACL Workshop on Semantics-Driven Machine Translation
• Converts PENMAN-style representations to Penn-style trees
• Uses the Tregex and Tsurgeon utilities, which are part of the Stanford NLP library
• Covers a wide range of constructions
• A simple example: “Girls see a boy.”
AMR to GF conversion: first experiment
“Girls see a boy.”
(x2 (see-01 (:ARG0 (x1 girl)) (:ARG1 (x4 boy))))
mkCl : NP ⟶ VP ⟶ Cl
mkVP : V2 ⟶ NP ⟶ VP
mkNP : Quant ⟶ Num ⟶ CN ⟶ NP
mkCN : N ⟶ CN
(mkCl (mkNP a_Quant singularNum (mkCN girl_N))
      (mkVP see_V2 (mkNP a_Quant singularNum (mkCN boy_N))))
adjoin (Cl (VP @)) with PB-frame
move ARG0 under Cl
move ARG1 under VP
adjoin (NP a_Quant singularNum (CN @)) with ARG0/1
excise var
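The effect of these transformation rules on the "Girls see a boy" example can be sketched procedurally: the PropBank frame supplies the clause skeleton, and ARG0/ARG1 are wrapped as NPs. A hand-rolled illustration only (the slides' actual rules are tree-rewriting operations, and the toy `LEXICON` mapping is an assumption):

```python
# Toy AMR-to-GF AST mapping for the first experiment.
# Hypothetical concept-to-lexicon table:
LEXICON = {"girl": "girl_N", "boy": "boy_N", "see-01": "see_V2"}

def np(concept):
    # adjoin (NP a_Quant singularNum (CN @)) around a noun concept
    return f"(mkNP a_Quant singularNum (mkCN {LEXICON[concept]}))"

def clause(frame, arg0, arg1):
    # adjoin (Cl (VP @)) with the PB frame;
    # move ARG0 under Cl, move ARG1 under VP
    return f"(mkCl {np(arg0)} (mkVP {LEXICON[frame]} {np(arg1)}))"

ast = clause("see-01", "girl", "boy")
print(ast)
```

Running this reproduces the GF abstract syntax tree shown on the slide.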
AMR to GF conversion: first experiment
“The boy sees the two pretty girls.”
(x3 (see-01 (:ARG0 (x2 boy)) (:ARG1 (x7 (girl (:quant 2) (:mod (x6 pretty)))))))
mkCN : A ⟶ N ⟶ CN
mkNum : Digits ⟶ Num
mkDigits : Str ⟶ Digits
(mkCl (mkNP a_Quant singularNum (mkCN boy_N))
      (mkVP see_V2 (mkNP a_Quant (mkNum (mkDigits "2")) (mkCN pretty_A girl_N))))
move mod under CN
replace Num with quant
adjoin (Num (Digits @)) with quant
Story headlines: Templates? Application grammar? CNL?
Multilingual Headlines Generator (a GF toy example by Jose P. Moreno)
http://grammaticalframework.org/demos/multilingual_headlines.html
Conclusion
• There is a potential for cooperating with the DL folks in both NLU and NLG
• Especially in NLG, which is recognized as one of the next problems to “solve” by DL
• Especially in domain specific use cases that can be approached by CNL
• AMR to text issues to be addressed: number, time, co-references, articles, concepts and WSD (for multilingual NLG), named entities, reification; the management of transformation rules