
Application of Word Embedding in Multi-Document Summarization by Combining Sentences

Felicia Christie
School of Electrical Engineering and Informatics
Institut Teknologi Bandung
Bandung, Indonesia
[email protected]

Masayu Leylia Khodra
School of Electrical Engineering and Informatics
Institut Teknologi Bandung
Bandung, Indonesia
[email protected]

Abstract—We attempt to incorporate semantic information as an improvement over a previously lexical-based summarization method. Building on our previous research, we integrate word-vector and document-vector calculations into the system. We evaluate the system with ROUGE-1 and ROUGE-2 on the same document set used by other Indonesian systems, which rely on more NLP resources, to allow standardized comparison. Summary quality is evaluated by questionnaire, comparing our output against gold-standard summaries for grammaticality and informativity.

Keywords—automatic summarization, sentence fusion, word embedding, word graph

I. INTRODUCTION

In the current era of massive information flow, both data and information are publicly available to anyone with a decent internet connection. Accumulating relevant information remains a problem as online data grows every day. Processing relevant information can be done by human labor or by machine. Our research centers on automatic summarization of news articles to reduce the amount of processing needed, focusing on Indonesian news articles, and we compare our system against other Indonesian summarizers. The system accepts multiple news documents on the same topic, then generates a summary that represents the input documents.

There are two categories of summarization, depending on the method [1]. Extractive summarization extracts components of the input (e.g., sentences), then combines them to produce the final summary [2]. Abstractive summarization paraphrases the input texts to produce a rephrased final summary; the result may even contain logical inferences that surface implicitly stated information. Several abstractive methods are available, including frame-based approaches and full sentence generation. Abstractive summarization has been called the most ideal form of summarization [5], as it is closest to how a human would write a summary, and it has the potential to produce results most similar to a human summary [4].

Sentence fusion and extraction have previously been described as semi-extractive summarization methods [5], although several works list them as abstractive [6][7][8], since both involve paraphrasing while using parts contained in the input documents. A fully abstractive approach can create sentences by paraphrasing with an internal dictionary, and is thus not limited to words contained in the input documents. Sentence fusion itself is defined as a method for generating new sentences: given a group of similar sentences, it outputs a sentence containing the information common to all of them [8]. This definition was later expanded to also include complementary information [9]. Sentence fusion requires more than one input sentence, since a joining process is needed, while sentence compression can either shorten a single sentence or join several sentences into one (multi-sentence compression).

Our main idea is to implement sentence fusion on Indonesian news articles to form a lightweight multi-document summarization system. It has minimal dependency on NLP resources, since for Indonesian such resources tend to bottleneck performance. For linguistic processing, we use a CRF to build a POS tagger on a universal dependencies dataset. Our processing steps are: (1) tokenizing word-wise and sentence-wise; (2) POS tagging; (3) clustering the separated sentences by similarity and removing non-members and one-element clusters; (4) building a word graph for each cluster; (5) extracting sentences from the word graphs; (6) finding the word in each document nearest to the average of the document's word vectors; and (7) using integer linear programming to select the sentences for the final summary. The process is adapted from [7], which discussed a multi-document summarization system with low dependency on NLP resources, and from [29], our previous research applying [7] to an Indonesian dataset. We focus on integrating word embeddings into the previous system and adjust its variables as needed; the variables we consider include the clustering threshold and the term weighting. In our implementation, we use numpy and sklearn to support preprocessing, and nltk in Python to build the CRF-based POS tagger. The word graph is implemented with the takahe module [12][13], while for ILP we use PuLP with the GLPK solver to solve the integer linear program.

Indonesian news datasets [14][29] are used to evaluate the system. Grammatical accuracy and informativity are evaluated manually by questionnaire respondents. ROUGE (Recall-Oriented Understudy for Gisting Evaluation) is also used as a more standardized benchmark of summarization performance. We compare our system with the baseline [29] and with two more recent systems that use heavier NLP dependencies [27][28].

II. RELATED WORKS

A. Automatic Summarization

Depending on the number of input documents, we differentiate the summarization problem into single-document and multi-document summarization. The two settings pose different problems to handle [16], including: (1) differing levels of redundancy in the input; (2) placement of information; (3) compression ratio, which is generally higher in multi-document summarization since repeated content is inserted only once; and (4) co-reference across sentences, which may occur when different sources are used together.

Automatic summarization consists of three phases [1][17]: analysis, transformation, and synthesis. First, the input documents are analyzed to select features relevant to the goal, and the contents of those features are examined. Next, the features are transformed into the system's internal representation of the information. Finally, the system renders its internal representation as human-readable text.

Previously, we mentioned extractive and abstractive summarization. Extractive summarization can be cast as a classification problem, assigning each sentence a label of included or not included. This method often involves weighting the input sentences, selecting those with the highest scores, and putting them together to create the final summary. Abstractive summarization is a way of summarizing in which at least one unit of the final summary is not obtainable from the input documents alone. It is divided into two approaches [6]: structure-based and semantics-based. Structure-based approaches present important information using cognitive schemas, e.g., templates, rule-based information extraction, or common data structures such as trees and graphs to represent relations. Semantics-based approaches build a representation of the meaning of the input documents inside the system, and may involve natural language generation to create a summary based on what the system 'comprehended'.

Generation of new sentences has been described as a fully abstractive approach to automatic summarization. In this approach, sentences are represented abstractly inside the system [18], then rephrased in the system's own words, which is considered similar to a human's method of summarizing; however, generating completely new sentences is considerably difficult because of the computational resources needed to represent and extract information. Other methods that require fewer resources include sentence fusion and compression, which have been described as both semi-extractive [5] and abstractive [6][7][8].

Several methods of sentence fusion have been proposed, based on dependency parse trees [9] and on word graphs [12]. In a word graph, each node consists of a word and its POS tag, and each edge represents the sequence of words in a sentence. For graph construction, a priority list defines the order in which words are mapped onto nodes [12]: first, non-stopwords for which no node exists in the graph yet; next, non-stopwords that already appear in the graph; and last, stopwords. Fig. 1 shows an example of a completed word graph.

Fig. 1. Example of Word Graph [29]
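To make this ordering concrete, the following is a minimal Python sketch of the three-tier insertion priority described above; it is our illustration, not code from [12] or [29], and graph_nodes and stopwords are assumed inputs (the (word, POS) pairs already in the graph, and a stopword list).

    def insertion_priority(word, tag, graph_nodes, stopwords):
        """Tier in which a token is mapped onto the word graph [12]:
        1 = non-stopword with no (word, POS) node in the graph yet,
        2 = non-stopword whose (word, POS) node already exists,
        3 = stopword (mapped last)."""
        if word.lower() in stopwords:
            return 3
        return 2 if (word.lower(), tag) in graph_nodes else 1

    # Tokens of a tagged sentence are attached to the graph tier by tier.
    tagged = [("gempa", "NOUN"), ("terjadi", "VERB"), ("di", "ADP")]
    ordered = sorted(tagged, key=lambda wt: insertion_priority(
        wt[0], wt[1], graph_nodes=set(), stopwords={"di"}))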

B. Sentence Fusion in Multi-document Summarization

Summarization can be cast as a classification problem [18], and picking the final sentences can be treated as a knapsack problem. Advances in abstractive summarization have produced semi-abstractive systems that minimize dependency on natural language resources. Reference [7] is one such minimalist system, using only a POS tagger and an optional list of stopwords. The summarizer has four major components: (1) preprocessor, (2) sentence clusterer, (3) sentence fusion, and (4) sentence selection.

The preprocessing module handles tokenization, POS tagging, and stopword removal. Its output is then clustered by lexical cosine similarity with a complete-linkage strategy, resulting in several clusters of lexically similar sentences that may or may not contain similar information, depending on the similarity threshold. Each cluster is turned into a word graph, from which sentences are extracted according to commonly used paths in the graph and a compression rate. This implementation uses an improved version [19] of the word graph that integrates a key-phrase extraction module to re-rank the generated sentences, as the earlier version tended to create sentences containing no significant information. The final sentences of the summary are singled out from all generated sentences using Integer Linear Programming (ILP) to represent the selection process as a constrained optimization problem.

Another work that uses word graphs for sentence fusion [8] differs slightly in its clustering process and is more dependent on NLP tools for extracting sentences from the word graphs. Clusters are formed after selecting the single most informative article, with each sentence of that article acting as the base of a cluster; sentences from the other articles are then assigned to the available clusters. Overall, its differences from [7] are its larger natural-language dependency and its starting point for declaring clusters. Both informativity and linguistic quality are considered when choosing generated sentences from the word graphs: informativity is calculated using previously available modules such as TextRank, while linguistic quality is scored with a trigram language model.

C. Researches on Indonesian Multi-document Summarizer

Sentence fusion using word graphs has been applied in Indonesian research [20] to summarize tweets about trending topics on Twitter. That work encountered difficulty in formalizing the posts, as social media content tends not to follow grammatical rules and even has its own lingo, which also makes POS tagging difficult. Since no Indonesian formalizer with good accuracy existed at the time, the word graphs in that implementation ignore the POS tag and match nodes by word only. This shows that word graphs can be applied to Indonesian datasets.

Reference [14] discussed a summarizer that uses information extraction to collect the answers to the questions a news article should address, namely the article's 5W1H. Arrangement and selection of the extracted parts are done with Maximal Marginal Relevance (MMR). Information extraction of each 5W1H part is done with machine learning, by providing manually annotated spans answering the 5W1H of each article. The features of each word token are then identified, and the SMO and IBk classifiers are used to choose the best features. The resulting summary is created with the help of a skeleton that defines the arrangement of each 5W1H answer from the original input documents. The skeleton of a sentence is:

<who, what> di <where> pada <when> karena <why>. <how>

The words "di" (in), "pada" (at), and "karena" (because) are Indonesian prepositions inserted into the skeleton to give the resulting summary better flow. This research made use of the Indonesian NLP toolkit INA-NLP for processing [11].

Another summarizer focusing on Indonesian news articles [27] uses sentence structure to create more grammatically sound summaries. The typical sentence structure in Indonesian consists of subject, predicate, object, and adverbs (in Indonesian, subyek-predikat-obyek-keterangan, or SPOK). That work uses the dependency-tree parser from SyntaxNet and splits sentences by structural component, yielding collections of subjects, predicates, objects, and so on. These collections are then combined with a graph-based approach, and MMR (maximal marginal relevance) is used to select sentences. The method proved able to preserve the relevance carried by the original sentences, achieving an average ROUGE-2 score of 0.276 and an F1-measure of 0.274.

A more recent project [28] creates an extractive summarization system with a neural network, using a graph convolutional network topology to model sentence relations. For 100-word summaries, the best performance was obtained with a Personalized Discourse Graph and greedy sentence selection, while for 200-word summaries MMR generated the best summaries. The system, however, requires a considerable amount of processing and NLP resources, including a new collection of news articles with human-generated summaries of 100 and 200 words. On average, it obtains ROUGE-2 scores of 0.37 for 100 words and 0.378 for 200 words.

III. PROPOSED SOLUTION

Fig. 2 illustrates the architecture of the system, which consists of five major modules: the preprocessor, clusterer, key-phrase extractor, sentence fuser, and sentence selector.

A. Preprocessor

The preprocessor accepts several news articles stored in our MySQL database and tokenizes each article into an array of sentences. Afterwards, we POS-tag every word of every sentence and adjust the default POS tags of the sentence fusion and word graph module takahe [13] to match the Indonesian tags. For tokenization we use sent_tokenize for sentences and the Moses tokenizer for words; for POS tagging, we use a CRF model built on an available Indonesian dataset [30].
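A minimal sketch of this stage, assuming nltk's CRFTagger (a python-crfsuite wrapper), the Moses tokenizer from the sacremoses package, and a hypothetical model file indonesian_pos.crf trained on the corpus of [30]:

    from nltk.tokenize import sent_tokenize
    from nltk.tag import CRFTagger        # CRF wrapper shipped with nltk
    from sacremoses import MosesTokenizer

    word_tok = MosesTokenizer(lang="id")
    tagger = CRFTagger()
    tagger.set_model_file("indonesian_pos.crf")  # hypothetical model path

    def preprocess(article_text):
        """Split an article into POS-tagged sentences:
        [[(word, tag), ...], ...]."""
        sentences = [word_tok.tokenize(s) for s in sent_tokenize(article_text)]
        return tagger.tag_sents(sentences)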

B. Clusterer

Here we group sentences containing similar information by both lexical and semantic content, using cosine similarity on the preprocessed sentences. Lexical weight is determined by raw TF and raw TF-IDF, while semantics are represented by using word embedding values to represent sentences. Clustering is done with DBSCAN; we also tried hierarchical agglomerative clustering, but it consistently performed worse than DBSCAN, so that idea was scrapped. This module is a crucial part, as clustering places similar information together to reduce redundancy.
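As a minimal sketch of the semantic variant, assuming the word embeddings are loaded as a gensim KeyedVectors object kv: each sentence is embedded as the average of its word vectors, DBSCAN's eps plays the role of the clustering threshold in Table I, and noise points (label -1) are the discarded one-element clusters.

    import numpy as np
    from sklearn.cluster import DBSCAN

    def sentence_vector(tokens, kv):
        """Average of the word vectors of a sentence's tokens."""
        vecs = [kv[t] for t in tokens if t in kv]
        return np.mean(vecs, axis=0) if vecs else np.zeros(kv.vector_size)

    def cluster_sentences(tokenized_sentences, kv, threshold=0.05):
        X = np.array([sentence_vector(t, kv) for t in tokenized_sentences])
        labels = DBSCAN(eps=threshold, min_samples=2,
                        metric="cosine").fit_predict(X)
        clusters = {}
        for idx, label in enumerate(labels):
            if label != -1:               # drop noise / singleton sentences
                clusters.setdefault(label, []).append(idx)
        return list(clusters.values())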

Fig. 2. Architecture of Implemented System

C. Key-phrase Extractor

Here we make use of the word embedding models: we calculate the average vector of a document over the words it consists of, then iterate through all of its words to find the word closest to that average. First, we sum the vectors of all tokens in a document, including stopwords and punctuation marks, then divide by the number of tokens. From each document, the word closest to this average vector is taken.
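A sketch of this extractor under the same KeyedVectors assumption; note that every in-vocabulary token of the document enters the average, and the candidate words are restricted to the document itself.

    import numpy as np

    def document_keyword(doc_tokens, kv):
        """Return the in-document word nearest to the document centroid."""
        vecs = [kv[t] for t in doc_tokens if t in kv]
        centroid = np.mean(vecs, axis=0)
        best_word, best_sim = None, -1.0
        for word in set(doc_tokens):
            if word not in kv:
                continue
            v = kv[word]
            sim = np.dot(v, centroid) / (
                np.linalg.norm(v) * np.linalg.norm(centroid))
            if sim > best_sim:
                best_word, best_sim = word, sim
        return best_word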

D. Sentence Fuser

We use the word graph [12], briefly explained above, to combine similar sentences. The inputs of this module are clusters of POS-tagged sentences and a list of stopwords considered less important. Through the word graph, duplicated and redundant words give more weight to the information they carry. We set the parameters according to [7][29], requiring a minimum of 8 tokens per sentence and generating 200 sentences per graph. In our implementation we reuse the existing word graph module takahe [13], whose source code is publicly available on GitHub.
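Usage follows takahe's public README; below is a sketch assuming the POS tags have been remapped as in Section III.A and that an Indonesian stopword list has been installed in takahe's resources (the two sentences are illustrative only).

    import takahe  # https://github.com/boudinfl/takahe

    # One cluster of similar sentences, in takahe's "word/TAG" format.
    cluster = [
        "gempa/NOUN terjadi/VERB di/ADP dataran/NOUN tinggi/ADJ dieng/PROPN ./PUNCT",
        "gempa/NOUN bumi/NOUN terjadi/VERB di/ADP dieng/PROPN kemarin/NOUN ./PUNCT",
    ]
    fuser = takahe.word_graph(cluster, nb_words=8, lang="en",
                              punct_tag="PUNCT")
    candidates = fuser.get_compression(200)  # up to 200 fusions per graph

    for score, path in candidates[:3]:
        # takahe yields (cumulative edge weight, [(word, tag), ...]) pairs.
        print(score / len(path), " ".join(w for w, _ in path))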

E. Sentence Selection

Sentences for the final summary are chosen by solving an Integer Linear Program (ILP) [21][22], as ILP had previously been shown to outperform MMR for sentence selection [24]. Sentence selection is implemented using PuLP as the Python interface and GLPK as the underlying solver. Based on previous research [7][22], the following objective and constraints were declared for sentence selection.


Maximize $\sum_{i} w_i c_i$  (1)

subject to

$\sum_{j} l_j s_j \leq L$  (2)

$s_j n_{ij} \leq c_i \quad \forall i, j$  (3)

$\sum_{j} s_j n_{ij} \geq c_i \quad \forall i$  (4)

In the formulation above, w_i is the weight of concept i and c_i is a boolean variable describing the inclusion of concept i; l_j is the number of words in sentence j, while s_j is a boolean variable describing the inclusion of sentence j in the final summary. Constraint (2) enforces the word limit of the summary, where L is the maximal number of words in the final summary. The variable n_ij represents the number of appearances of concept i in sentence j. Constraints (3) and (4) tie the concept weights to the generated sentences. A concept is defined as a bigram appearing in more than a third of the input documents.

Based on [7] and [29], we also adopt the earlier rule of selecting at most one sentence per cluster, eliminating the possibility of duplicate sentences. We also append the list of key-phrases obtained in module C to the list of concepts used in the ILP, to maximize informativity.
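A minimal PuLP sketch of the resulting program, with hypothetical inputs: weights[i] is w_i for concept i (bigrams plus the key-phrases from module C), occurs[i] the indices of sentences containing concept i (n_ij > 0), lengths[j] the word count l_j, clusters the groups of sentences generated from the same word graph, and L the word budget.

    import pulp

    def select_sentences(weights, occurs, lengths, clusters, L):
        n_c, n_s = len(weights), len(lengths)
        prob = pulp.LpProblem("summary_selection", pulp.LpMaximize)
        c = pulp.LpVariable.dicts("c", range(n_c), cat="Binary")
        s = pulp.LpVariable.dicts("s", range(n_s), cat="Binary")
        prob += pulp.lpSum(weights[i] * c[i] for i in range(n_c))       # (1)
        prob += pulp.lpSum(lengths[j] * s[j] for j in range(n_s)) <= L  # (2)
        for i in range(n_c):
            for j in occurs[i]:
                prob += s[j] <= c[i]                                    # (3)
            prob += c[i] <= pulp.lpSum(s[j] for j in occurs[i])         # (4)
        for members in clusters:        # at most one sentence per cluster
            prob += pulp.lpSum(s[j] for j in members) <= 1
        prob.solve(pulp.GLPK_CMD(msg=0))  # or prob.solve() for default CBC
        return [j for j in range(n_s) if s[j].value() == 1]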

IV. EXPERIMENT RESULTS

The summarizer is evaluated on ROUGE-1 and ROUGE-2 with the publicly available Java module ROUGE 2.0 [31]; the configuration values tested are shown in Table I.

TABLE I. Experimental Variables

Variable              Value
Term Weighting        {TF, TF-IDF [29], Word2Vec model, FastText model [28], Wikipedia FastText model [32]}
Summary Word Count    {100, 200}
Clustering Threshold  TF, TF-IDF: {0.3, 0.4, 0.5, 0.6}
                      Word2Vec, FastText, Wikipedia FastText: {0.01, 0.05, 0.1, 0.15, 0.2}

The word embedding models come from previous research [28], plus an additional lightweight Indonesian FastText model from the official GitHub repository [32]. We try to keep the footprint minimal, keeping all models used under 1 GB. From our initial dataset, we use the news set g-gempa-dieng-b to tune the parameters, obtaining the best-performing configurations reported below. Since the number of clusters depends on the similarity threshold, and each cluster may contribute at most one sentence to the final summary, the best threshold can differ between 100-word and 200-word summaries. To accommodate this difference, Table II and Table III list the best-performing configurations for each length.

TABLE II. Best Performance on 100-Word Summaries

Config       Avg Recall  Max Recall  Min Recall  F-Measure
TF-0.5       0.178       0.302       0.096       0.174
TFIDF-0.6    0.132       0.273       0.038       0.139
WIKIFT-0.01  0.160       0.341       0.008       0.163
FT-0.15      0.129       0.079       0.162       0.112
W2V-0.05     0.358       0.571       0.162       0.365

TABLE III. Best Performance on 200-Word Summaries

Config       Avg Recall  Max Recall  Min Recall  F-Measure
WIKIFT-0.05  0.339       0.558       0.162       0.327
TFIDF-0.5    0.348       0.610       0.120       0.337
TF-0.3       0.339       0.558       0.162       0.327
FT-0.1       0.277       0.442       0.179       0.295
W2V-0.05     0.358       0.571       0.162       0.365

We also compare the ROUGE-2 performance against other Indonesian news summarizers, listed in Table IV and Table V.

TABLE IV. Performance Comparison on 100-Word Summaries (ROUGE-2)

Config                                         Avg Recall  Min Recall  Max Recall  F-Measure
FT-0.15                                        0.129       0.162       0.079       0.112
TFIDF-0.6                                      0.132       0.038       0.272       0.138
WIKIFT-0.01                                    0.160       0.008       0.340       0.162
TF-0.5                                         0.178       0.096       0.301       0.173
W2V-0.05                                       0.213       0.112       0.324       0.213
Baseline (Christie & Khodra, 2016)             0.231       0.047       0.655       -
MMR SO-0.8-0.5 (Reztaputra & Khodra, 2017)     0.28        0.094       0.515       0.288
Sarkar adaptation (Reztaputra & Khodra, 2017)  0.297       0.663       0.071       0.299
PDG-Greedy (Garmastewira & Khodra, 2018)       0.37        0.218       0.703       0.363


TABLE V. Performance Comparison on 200-Word Summaries (ROUGE-2)

Config                                         Avg Recall  Min Recall  Max Recall  F-Measure
FT-0.1                                         0.276       0.179       0.441       0.294
MMR SO-0.8-0.5 (Reztaputra & Khodra, 2017)     0.288       0.129       0.529       0.272
Sarkar adaptation (Reztaputra & Khodra, 2017)  0.292       0.1         0.487       0.277
Baseline (Christie & Khodra, 2016)             0.319       0.076       0.293       -
WIKIFT-0.05                                    0.339       0.162       0.558       0.327
TF-0.3                                         0.339       0.162       0.558       0.327
TFIDF-0.5                                      0.347       0.119       0.610       0.337
W2V-0.05                                       0.358       0.162       0.571       0.364
PDG-MMR (Garmastewira & Khodra, 2018)          0.378       0.109       0.617       0.344

While our system was not the best-performing system by ROUGE standards, using word embeddings yields a clear improvement over our baseline. It should also be noted that the best-performing summarization models require more training data, including annotated news topics to build the graph convolutional network, and are thus not as lightweight as ours.

Qualitative evaluation results are shown in Table VI: we selected the worst and best resulting summaries for each word count, then asked respondents to compare our results with human-written gold-standard summaries.

TABLE VI. Qualitative Evaluation of Resulting Summaries

Word Count / Topic          Grammatical Quality (1-3)  Informativity (1-5)
100 / Dieng (A) / best      2.41                       3.72
200 / Dieng (A) / best      2.25                       3.84
100 / Bunuh Diri / worst    2.22                       3.50
200 / Gedung Roboh / worst  1.47                       2.44

V. CONCLUSIONS AND FUTURE WORK

Our conclusion is that word embedding in the clustering process helps improve the performance of summarization by sentence fusion: our baseline reaches 0.319 recall while our system reaches 0.358 for 200-word summaries. While average performance on 100-word summaries drops from 0.231 to 0.218, the minimal and maximal recall values become more stable, as the minimal recall increases quite significantly. Quality evaluation by questionnaire yields 2.33/3 for grammaticality and 3.78/5 for informativity for our best-performing configuration.

Our ideas for future work to improve the score include testing with a larger word embedding model, and tuning the parameters on more than one dataset, as the current method of tuning on a single topic may bias the tuning toward how the articles of that particular topic are written. Another consideration for the clustering algorithm is HDBSCAN, which would require less parameter tuning.

ACKNOWLEDGMENT

The authors convey their gratitude to Garmastewira [28] and Reztaputra [27] for summary baselines and development ideas, to the survey respondents for their input, and to our source of gold-standard summaries.

REFERENCES

[1] I. Mani and U. Hahn, "The challenges of automatic summarization," IEEE, 2000.

[2] I. Mani and M. T. Maybury, "Advances in Automatic Text Summarization," vol. 293, Cambridge, MA, 1999.

[3] M. White, T. Korelsky, C. Cardie, V. Ng, D. Pierce, and K. Wagstaff, "Multidocument summarization via information extraction," Proceedings of the First International Conference on Human Language Technology Research, pp. 1-7, 2001.

[4] P.-E. Genest and G. Lapalme, "Fully abstractive approach to guided summarization," Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, pp. 354-358, 2012.

[5] P.-E. Genest and G. Lapalme, "Absum: a knowledge-based abstract summarizer," Génération de résumés par abstraction, 2013.

[6] A. Khan and N. Salim, "A review on abstractive summarization methods," Journal of Theoretical and Applied Information Technology, Skudai, Johor, 2014.

[7] R. Bois, "Multi-document summarization through sentence fusion," 2014.

[8] S. Banerjee, P. Mitra, and K. Sugiyama, "Multi-document abstractive summarization using ILP-based multi-sentence compression," Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015.

[9] R. Barzilay and K. R. McKeown, "Sentence fusion for multidocument news summarization," Computational Linguistics, vol. 31, no. 3, pp. 297-328, September 2005.

[10] E. Marsi and E. Krahmer, "Explorations in sentence fusion," Proceedings of the European Workshop on Natural Language Generation, pp. 109-117, 2005.

[11] A. F. Wicaksono and A. Purwarianti, "HMM based part-of-speech tagger for Bahasa Indonesia," Proceedings of the 4th International MALINDO Workshop, 2010.

[12] K. Filippova, "Multi-sentence compression: finding shortest paths in word graphs," Proceedings of the 23rd International Conference on Computational Linguistics, pp. 322-330, August 2010.

[13] F. Boudin, "takahe," 25 March 2014. [Online]. Available: https://github.com/boudinfl/takahe. [Accessed 15 June 2016].

[14] R. Ilyas, "Peringkas otomatis dengan ekstraksi informasi untuk kumpulan berita online" (Master's thesis), Institut Teknologi Bandung, Bandung, 2015.

[15] J. Goldstein, V. Mittal, J. Carbonell, and M. Kantrowitz, "Multi-document summarization by sentence extraction," Proceedings of the 2000 NAACL-ANLP Workshop on Automatic Summarization, vol. 4, pp. 40-48, 2000.

[16] N. Arackal and D. P M, "A survey on existing extractive text summarization techniques," 2014.

[17] P.-E. Genest and G. Lapalme, "Framework for abstractive summarization using text-to-text generation," Proceedings of the Workshop on Monolingual Text-To-Text Generation, pp. 64-73, 2011.

[18] S. Teufel and M. Moens, "Sentence extraction as a classification task," Proceedings of ACL, vol. 97, pp. 58-65, 1997.

[19] F. Boudin and E. Morin, "Keyphrase extraction for n-best reranking in multi-sentence compression," Proceedings of NAACL-HLT 2013, pp. 298-305, 2013.

[20] Y. A. Winatmoko and M. L. Khodra, "Automatic summarization of tweets in providing Indonesian trending topic explanation," The 4th International Conference on Electrical Engineering and Informatics, pp. 1027-1033, 2013.

[21] R. S. Garfinkel and G. L. Nemhauser, "Integer Programming," vol. 4, 1972.

[22] D. Gillick and B. Favre, "A scalable global model for summarization," Proceedings of the Workshop on Integer Linear Programming for Natural Language Processing, pp. 10-18, 2009.

[23] F. Jin, M. Huang, and X. Zhu, "A comparative study on ranking and selection strategies for multi-document summarization," Coling 2010, pp. 525-533, 2010.

[24] P.-E. Genest, G. Lapalme, and M. Yousfi-Monod, "HexTac: the creation of a manual extractive run," Génération de résumés par abstraction, p. 7, 2013.

[25] W. Setiawan and M. L. Khodra, "Pengembangan sistem penggabung kalimat otomatis," Jurnal Sarjana ITB bidang Teknik Elektro dan Informatika, vol. 1, no. 2, 2012.

[26] J. C. Cheung, "Comparing abstractive and extractive summarization of evaluative text: controversiality and content selection," University of British Columbia, 2008.

[27] R. Reztaputra and M. L. Khodra, "Peringkasan otomatis kumpulan artikel berita online berbasiskan struktur kalimat," Jurnal STEI, 2017.

[28] Garmastewira and M. L. Khodra, "Peringkasan artikel berita online otomatis dengan graph convolutional network," Jurnal STEI, 2018.

[29] F. Christie and M. L. Khodra, "Multi-document summarization using sentence fusion for Indonesian news articles," 2016 International Conference on Advanced Informatics: Concepts, Theory and Application (ICAICTA), pp. 1-6, IEEE, 2016.

[30] A. Dinakaramani et al., "Designing an Indonesian part of speech tagset and manually tagged Indonesian corpus," 2014 International Conference on Asian Language Processing (IALP), IEEE, 2014.

[31] K. Ganesan, "ROUGE 2.0: Updated and improved measures for evaluation of summarization tasks," CoRR, 2015.

[32] P. Bojanowski, E. Grave, A. Joulin, and T. Mikolov, "Enriching word vectors with subword information," Transactions of the Association for Computational Linguistics, vol. 5, pp. 135-146, 2017.
