Arabic question answering
![Page 1: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/1.jpg)
Imam University, College of Computer and Information Systems
Computer Sciences Department
Arabic Question Answering
By: Asma Ahmad Asma alharbi
nadia AL-Mutiri
Supervised by: Dr. Amal Al seef
Second semester: 1434-1435 (2013)
![Page 2: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/2.jpg)
Arabic Question Answering
Overview:
O The implementation of Arabic Question-Answering system components.
O QASAL & QARAB system components.
O Yes/No Arabic Question Answering.
![Page 3: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/3.jpg)
ARABIQA GENERIC ARCHITECTURE
![Page 4: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/4.jpg)
Named Entity Recognizer
O An NER system identifies proper names and temporal and numeric expressions.
O This Arabic NER system is based on the maximum entropy (ME) approach.
O For proper name recognition:
O For temporal and numeric expressions: recognition is based entirely on patterns and a small dictionary containing the names of days and months in Arabic and numbers written in letters.
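The pattern-plus-dictionary idea for temporal expressions can be sketched as follows. This is a minimal illustration, not the system's actual grammar: the month list, the regular expression, and the function name `find_dates` are all assumptions.

```python
import re

# Small dictionary of month names (an illustrative assumption).
MONTHS = ["يناير", "فبراير", "مارس", "أبريل", "مايو", "يونيو",
          "يوليو", "أغسطس", "سبتمبر", "أكتوبر", "نوفمبر", "ديسمبر"]

# Optional day number, a month name as a whole word, optional 4-digit year.
DATE_PATTERN = re.compile(
    r"(?:(\d{1,2})\s+)?(?<!\w)(" + "|".join(MONTHS) + r")(?!\w)(?:\s+(\d{4}))?"
)

def find_dates(text):
    """Return (day, month, year) tuples; missing parts are empty strings."""
    return [m.groups(default="") for m in DATE_PATTERN.finditer(text)]

dates = find_dates("استقلت تونس في 20 مارس 1956")
```

A real system would also need the day names and the numbers written in letters that the slide mentions; they are left out here to keep the sketch short.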
![Page 5: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/5.jpg)
The implementation of the Arabic Question-Answering system
O NooJ is a linguistic environment that includes large-coverage dictionaries and grammars.
O A spell-checker that corrects the most frequent errors.
O A named entity recognition tool, which is a set of rules described in local grammars.
![Page 6: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/6.jpg)
QASAL System components
![Page 7: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/7.jpg)
Question analysis: this step applies the set of linguistic resources to the input question. The example shows NooJ's text annotation structure, which gives the linguistic analysis of each word form in our sample question.
![Page 8: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/8.jpg)
Passage retrieval: the first task of this step is the selection of one or more passages from which to automatically extract the answer to the input question.
![Page 9: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/9.jpg)
Answer extraction: this last step uses the displayed concordance table to automatically extract the answer to the input question.
Example 1: Answer extraction for the factoid question: متى استقلّت تونس؟ (When did Tunisia become independent?)
![Page 10: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/10.jpg)
Example 2:
![Page 11: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/11.jpg)
QARAB System
[Figure: QARAB architecture. Labeled components: Question, Question analyzer, NLP tool, full IR system, Al-Raya newspaper documents, IR ranked documents, Passage selection, Answer generation, Hypothesized answer.]
![Page 12: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/12.jpg)
Information Retrieval system
O Searches the document collection to select documents containing information relevant to the user's query.
O Lundquist et al. [1999] proposed an IR system that can be constructed using a relational database management system (RDBMS).
O In this paper, it contains the following database relations:
1. ROOT_TABLE.
2. STEM_TABLE.
3. POSTING_TABLE.
4. DOCUMENT_TABLE.
5. PARAGRAPH_TABLE.
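The five relations could be laid out roughly as follows. This is only a sketch using Python's built-in `sqlite3`: the slides name the tables but not their columns, so every column name and the helper `paragraphs_for_root` are assumptions made for illustration.

```python
import sqlite3

# Hypothetical columns: the source names only the five tables.
SCHEMA = """
CREATE TABLE DOCUMENT_TABLE  (doc_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE PARAGRAPH_TABLE (par_id INTEGER PRIMARY KEY,
                              doc_id INTEGER REFERENCES DOCUMENT_TABLE(doc_id),
                              body TEXT);
CREATE TABLE ROOT_TABLE      (root_id INTEGER PRIMARY KEY, root TEXT UNIQUE);
CREATE TABLE STEM_TABLE      (stem_id INTEGER PRIMARY KEY,
                              root_id INTEGER REFERENCES ROOT_TABLE(root_id),
                              stem TEXT UNIQUE);
CREATE TABLE POSTING_TABLE   (stem_id INTEGER REFERENCES STEM_TABLE(stem_id),
                              par_id INTEGER REFERENCES PARAGRAPH_TABLE(par_id),
                              frequency INTEGER);
"""

def build_index():
    """Create an empty in-memory index with the five relations."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def paragraphs_for_root(conn, root):
    """Return (par_id, total frequency) for paragraphs containing any stem of root."""
    rows = conn.execute(
        """SELECT p.par_id, SUM(p.frequency) AS freq
           FROM ROOT_TABLE r
           JOIN STEM_TABLE s ON s.root_id = r.root_id
           JOIN POSTING_TABLE p ON p.stem_id = s.stem_id
           WHERE r.root = ?
           GROUP BY p.par_id
           ORDER BY freq DESC, p.par_id""",
        (root,),
    )
    return rows.fetchall()
```

Keeping roots and stems in separate relations lets a query term match every surface form that shares its root, which is presumably why the design distinguishes the two.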
![Page 13: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/13.jpg)
The NLP system
The NLP modules are:
1. Tokenizer.
2. Type finder.
3. Feature finder.
4. Proper noun phrase parser.
![Page 14: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/14.jpg)
How to extract the Answer
Assume the user posed the following question to QARAB:
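من هو محافظ البنك المركزي الكويتي والذي قال بأن بلاده ليس لديها نيه لخفض قيمه الدينار للحد من عجز الميزانية؟
(Who is the governor of the Central Bank of Kuwait who said that his country has no intention of devaluing the dinar to reduce the budget deficit?)
The IR system returns this passage. How?
قال محافظ البنك المركزي الكويتي الشيخ سالم عبد العزيز الصباح أمس ان بلاده ليس لديها النية لخفض قيمة الدينار الكويتي للحد من العجز المتزايد في الميزانية. وقال بأن خفض قيمة الدينار سيضر باقتصاد الكويت ومصداقيتها في الأسواق المالية الدولية.
(The governor of the Central Bank of Kuwait, Sheikh Salem Abdul-Aziz Al-Sabah, said yesterday that his country has no intention of devaluing the Kuwaiti dinar to reduce the growing budget deficit. He said that devaluing the dinar would harm Kuwait's economy and its credibility in international financial markets.)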
![Page 15: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/15.jpg)
Step 1:
O Tokenize the question and remove its stop words, then tag each word with its part of speech (POS).
![Page 16: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/16.jpg)
Step 2:
O QARAB constructs the query as a “bag of words” and passes it to the IR system.
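Steps 1 and 2 can be sketched as follows. This is a simplified illustration under stated assumptions: the stop-word list is a tiny hypothetical one (QARAB's real list is not given in the slides), the function names are invented for the example, and POS tagging is omitted.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption).
STOP_WORDS = {"من", "هو", "والذي", "بأن", "ليس", "لديها", "في"}

def tokenize(text):
    """Split on whitespace after replacing Arabic and Latin punctuation."""
    cleaned = re.sub(r"[؟?.,،!:؛]", " ", text)
    return cleaned.split()

def bag_of_words(question):
    """Drop stop words and count the remaining tokens (word order is ignored)."""
    return Counter(t for t in tokenize(question) if t not in STOP_WORDS)

question = "من هو محافظ البنك المركزي الكويتي؟"
query = bag_of_words(question)
```

The resulting `query` holds only the content words, which is what would be handed to the IR system.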
![Page 17: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/17.jpg)
Example:
قال محافظ البنك المركزي الكويتي الشيخ سالم عبد العزيز الصباح امس ان بلاده ليس لديها النية لخفض قيمة الدينار الكويتي للحد من العجز المتزايد في الميزانية. وقال بأن خفض قيمة الدينار سيضر باقتصاد الكويت ومصداقيتها في الاسواق المالية الدولية.
Step 3: Determine the expected type of the answer: من (Who?) >>> person name.
Step 4: Generating the answer.
قال محافظ البنك المركزي الكويتي الشيخ سالم عبد العزيز الصباح امس ان بلاده ليس لديها النية لخفض قيمة الدينار الكويتي للحد من العجز المتزايد في الميزانية. وقال بأن خفض قيمة الدينار سيضر باقتصاد الكويت ومصداقيتها في الاسواق المالية الدولية.
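Determining the expected answer type (Step 3) can be sketched as a lookup on the question word. Only the mapping from من to a person name comes from the slides; the other entries, the type labels, and the function name are assumptions added for illustration.

```python
# Only the first entry ("من" -> person name) comes from the slides;
# the others are illustrative assumptions.
ANSWER_TYPES = {
    "من": "PERSON",
    "متى": "DATE",
    "أين": "LOCATION",
    "كم": "NUMBER",
}

def expected_answer_type(question):
    """Look up the answer type from the first word of the question."""
    words = question.split()
    first = words[0] if words else ""
    return ANSWER_TYPES.get(first, "UNKNOWN")
```

For the sample question, which starts with من, the lookup yields the person type, so the answer generator would look for a personal name in the retrieved passage.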
![Page 18: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/18.jpg)
Yes/No Arabic Question Answering
![Page 19: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/19.jpg)
SYSTEM ARCHITECTURE:
Question Analysis module
Text retrieval module
Answer Selection module
![Page 20: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/20.jpg)
Question Analysis
O Removing the question mark.
O Removing the interrogative particle.
O Tokenizing: the tokenizer divides the user question into its separate words and normalizes the letter Alef.
O Removing the stop words.
O Removing the negation particles (if they exist) and setting the negation property of the question representation.
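The preprocessing steps above can be sketched as one function. This is a simplification under stated assumptions: the particle and stop-word lists are short illustrative ones, the function names are invented, and the sketch tokenizes before dropping the interrogative particle rather than after.

```python
import re

# Illustrative lists (assumptions; the slides do not enumerate them).
INTERROGATIVES = {"هل"}
NEGATIONS = {"لم", "لن", "لا", "ليس"}
STOP_WORDS = {"في", "على", "الى"}

def normalize_alef(word):
    """Map the hamza and madda forms of Alef (أ إ آ) to bare Alef (ا)."""
    return re.sub("[أإآ]", "ا", word)

def analyze(question):
    """Return (content_words, negated) for a yes/no question."""
    text = question.replace("؟", " ").replace("?", " ")   # remove the question mark
    tokens = [normalize_alef(t) for t in text.split()]     # tokenize and normalize
    if tokens and tokens[0] in INTERROGATIVES:             # drop the particle
        tokens = tokens[1:]
    negated = any(t in NEGATIONS for t in tokens)          # negation property
    words = [t for t in tokens
             if t not in STOP_WORDS and t not in NEGATIONS]
    return words, negated
```

The returned `negated` flag is what the later answer-generation step compares against the polarity of the selected answer sentence.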
![Page 21: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/21.jpg)
Question Analysis
O Tagging: to determine the type of each word (verb or noun) and obtain its root.
O Parsing: recall that the Arabic sentence after the interrogative particle is either nominal or verbal.
![Page 22: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/22.jpg)
Question Analysis
In a nominal sentence, we are interested in the beginning noun, the “topic” (مبتدأ), which is the first noun after the interrogative particle (هل), and in the comment noun (خبر), which we can mark as the last noun without the article (ال).
In a verbal sentence, we are interested in the verb of the sentence, which occurs immediately after the interrogative particle (هل), and in the subject that follows the verb.
![Page 23: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/23.jpg)
Question Analysis
Logical Representation (With Nominal Sentences)
Affirmative questions:
O N (Topic, root (Comment), root ({remaining words}))
O N (Topic, root (Comment Synonyms), root ({remaining words}))
O ~N (Topic, root (Comment Antonyms), root ({remaining words}))
![Page 24: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/24.jpg)
Question Analysis
Logical Representation (With Nominal Sentences)
Negated questions:
O ~N (Topic, root (Comment), root ({remaining words}))
O ~N (Topic, root (Comment Synonyms), root ({remaining words}))
O N (Topic, root (Comment Antonyms), root ({remaining words}))
![Page 25: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/25.jpg)
Question Analysis
O Example
هل سميره كسرت النافذه؟ (Did Samira break the window?)
مبتدأ (topic): سميره
خبر (comment): كسرت -----> حطمت (synonym)
O N(سميره, root(كسرت), root(النافذه))
O N(سميره, root(حطمت), root(النافذه))
![Page 26: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/26.jpg)
Question Analysis
Logical Representation (With Verbal Sentences)
Affirmative questions:
O V (Subject noun, root (verb), root ({remaining words}))
O V (Subject noun, root (verb Synonyms), root ({remaining words}))
O ~V (Subject noun, root (verb Antonyms), root ({remaining words}))
![Page 27: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/27.jpg)
Question Analysis
Logical Representation (With Verbal Sentences)
Negated questions:
O ~V (Subject noun, root (verb), root ({remaining words}))
O ~V (Subject noun, root (verb Synonyms), root ({remaining words}))
O V (Subject noun, root (verb Antonyms), root ({remaining words}))
![Page 28: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/28.jpg)
Question Analysis
Example
هل فتح محمد الباب؟ (Did Mohammed open the door?)
فعل (verb): فتح ---> اغلق (antonym)
فاعل (subject): محمد
O V(محمد, root(فتح), root(الباب))
O ~V(محمد, root(اغلق), root(الباب))
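The nominal (N) and verbal (V) logical representations follow one rule: synonyms keep the polarity of the question and antonyms flip it. The sketch below illustrates that rule only. The `Clause` type and `representations` function are hypothetical names, and root extraction and synonym/antonym lookup are assumed to have been done already by the caller.

```python
from typing import NamedTuple

class Clause(NamedTuple):
    kind: str        # "N" for nominal, "V" for verbal
    head: str        # topic (nominal) or subject (verbal), in original form
    root: str        # root of the comment or verb (or of a synonym/antonym)
    rest: tuple      # roots of the remaining words
    negated: bool    # True renders as ~N(...) or ~V(...)

    def __str__(self):
        sign = "~" if self.negated else ""
        return f"{sign}{self.kind}({self.head}, {self.root}, {', '.join(self.rest)})"

def representations(kind, head, root, rest, synonyms=(), antonyms=(), negated=False):
    """Expand one question into its equivalent logical forms.

    Synonyms keep the polarity of the question; antonyms flip it.
    """
    rest = tuple(rest)
    forms = [Clause(kind, head, root, rest, negated)]
    forms += [Clause(kind, head, s, rest, negated) for s in synonyms]
    forms += [Clause(kind, head, a, rest, not negated) for a in antonyms]
    return forms

# The verbal example above, with the antonym اغلق supplied by the caller.
forms = representations("V", "محمد", "فتح", ["الباب"], antonyms=["اغلق"])
```

Passing `negated=True` produces the negated-question forms from the same function, so the four slides of rules reduce to one expansion step.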
![Page 29: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/29.jpg)
Text Processing & Retrieval
There are 20 documents in the corpus. This module uses two techniques to retrieve the top 5 candidate paragraphs (of variable length) that are most relevant to the user question:
O Paragraphs technique: split the documents into their built-in paragraphs and retrieve the top 5 paragraphs, regardless of which document they come from, according to some indexing scheme.
O Document technique: retrieve the top 5 documents after they are ranked, then use the first indexing scheme to retrieve the top 5 paragraphs.
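The difference between the two techniques can be sketched as follows. The slides do not specify the indexing scheme, so `score` below is a placeholder (a plain count of query-term occurrences) and all function names are assumptions. A document is modeled as a list of paragraph strings.

```python
def score(text, query_terms):
    """Placeholder relevance score: number of query-term occurrences in the text."""
    words = text.split()
    return sum(words.count(t) for t in query_terms)

def top_k(items, query_terms, k, key=lambda x: x):
    """Rank items by score (highest first) and keep the first k."""
    return sorted(items, key=lambda it: score(key(it), query_terms), reverse=True)[:k]

def paragraphs_technique(docs, query_terms, k=5):
    """Rank all paragraphs of all documents together."""
    paragraphs = [p for doc in docs for p in doc]
    return top_k(paragraphs, query_terms, k)

def document_technique(docs, query_terms, k=5):
    """Rank whole documents first, then rank the paragraphs of the top-k documents."""
    best_docs = top_k(docs, query_terms, k, key=" ".join)
    return paragraphs_technique(best_docs, query_terms, k)
```

With a corpus of 20 documents and k = 5, the document technique discards 15 documents before any paragraph is scored, while the paragraphs technique can pick a strong paragraph from an otherwise weak document.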
![Page 30: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/30.jpg)
Answer Selection & generation
After the 5 paragraphs are selected using the document technique or the paragraphs technique, we need to select the best sentence to represent the answer and, accordingly, generate yes or no.
![Page 31: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/31.jpg)
Answer Selection & generation
O Split the paragraphs into their sentences.
O In nominal sentences, we are interested in the exact topic (مبتدأ), not its root, so we omit each sentence that does not contain it in its original form.
O In verbal sentences, we are interested in the exact subject (فاعل), not its root, so we omit each sentence that does not contain it in its original form.
![Page 32: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/32.jpg)
Answer Selection & generation
O In the resulting sentences, we look for the remaining terms (in root form) derived from the question's logical representation (except the subject or the topic). If they exist, we assign them indexes according to their position in the sentence. Each sentence then has its own rank, as follows:
Rank = last occurrence - first occurrence
O Look for negation particles (ادوات النفي) in the selected answer (if any exist).
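The rank formula can be sketched as below. The function name and the choice to return `None` when no term matches are assumptions, and the sentence is assumed to be already reduced to a list of word roots. The slides do not say whether a smaller or larger rank is preferred, so the sketch only computes the value.

```python
def sentence_rank(sentence_roots, query_roots):
    """Span between the first and last matched query root, or None if no match.

    sentence_roots: roots of the sentence's words, in order.
    query_roots: roots taken from the question's logical representation.
    """
    positions = [i for i, r in enumerate(sentence_roots) if r in query_roots]
    if not positions:
        return None
    return positions[-1] - positions[0]
```

For example, if the query roots appear at word positions 1 and 3 of a sentence, the rank is 3 - 1 = 2.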
![Page 33: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/33.jpg)
Answer Selection & generation
O Use the selected answer and the logical representation of the question to generate yes or no, as follows:
1. Yes, if:
The question and the answer are both affirmative.
The question and the answer are both negated.
2. No, if:
The question is affirmative and the answer is negated.
The question is negated and the answer is affirmative.
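The decision rule amounts to comparing two polarities. A minimal sketch, assuming a short illustrative list of negation particles and hypothetical function names:

```python
NEGATION_PARTICLES = {"لم", "لن", "لا", "ليس"}  # illustrative subset (assumption)

def is_negated(sentence):
    """True if the sentence contains any of the negation particles."""
    return any(word in NEGATION_PARTICLES for word in sentence.split())

def answer(question_negated, answer_sentence):
    """Yes when question and answer share polarity, No otherwise."""
    return "Yes" if question_negated == is_negated(answer_sentence) else "No"
```

Here `question_negated` stands for the polarity of the logical form that matched the sentence, so a match on an antonym form (which flips the sign) is handled by the same comparison.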
![Page 34: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/34.jpg)
EXPERIMENTS AND RESULTS
69% Arabic QA system
97.3% Arabic Q-A uses QARAB
83.3% PR system
![Page 35: Arabic question answering](https://reader038.fdocuments.in/reader038/viewer/2022102815/557954b0d8b42ab6648b4956/html5/thumbnails/35.jpg)
Conclusion
O We have described the generic architecture for Arabic question answering.
O We compared different systems.
O We showed how they process the question and give the answers.