Khirulnizam malay proverb detection - mobilecase 19 sept 2012 - copy

Malay Proverb Detection; Implementation on Mobile Environment. Khirulnizam Abd Rahman Faculty of Science and Technology Universiti Sains Islam Malaysia [email protected] http://khirulnizam.com


 



Presentation Structure

1. Introduction
2. Previous Works
3. The Design
4. Discussions
5. Conclusions

1. Introduction - Proverb

“A proverb is a short, generally known sentence of the folk which contains wisdom, truth, morals, and traditional views in a metaphorical, fixed and memorisable form and which is handed down from generation to generation” (Mieder 1993, p. 5 and 24f.).

1. Introduction – Malay Proverb

Proverbs (peribahasa) in the Malay language are elegant devices for delivering advice, Malay teachings, moral values and comparisons through metaphoric phrases.

There are four categories of Malay proverbs (as described by Abdullah & Ainon, 2011): simpulan bahasa, perumpamaan, bidalan and pepatah.

Sample of Proverbs

Malay proverb: Kalau kamu tunggu sehingga kucing bertanduk sekali pun, Aminah tidak akan hidup kembali.
Meaning in English: Even if you wait forever, Aminah will not come back to life.

Malay proverb: Anak di rumah kelaparan kera di hutan disusukan.
Meaning in English: Helping others is treated as more important than helping your own next of kin.

1. Introduction

Proverb Detection: the app was developed to experiment with detecting proverbs in a sentence or paragraph using the pattern matching approach.

The mobile platform is just one of the environments tested.

There is also a web-based version.

2. Previous Works

Multiword Malay Indexing (Rais et al., 2011)
Malay Proverb Dictionary (Supyan et al., 2004)
Hindi to Urdu Proverbs Translation (Brahmaleen et al., 2010)
German-English Idioms Treatment (Dmitra, 2010)

2.1 Malay MWE Info Retrieval

Rais et al. (2011): multiword Malay indexing using a combination of a query translation approach and weighting schemes.

A dictionary is crucial in multiword detection.

2.2 Malay Proverb Dictionary, ATMA, UKM

Malay Proverb Dictionary, using structured query (Supyan et al., 2004).

Provides a searchable database of Malay proverbs and idioms.

The application receives a complete or partial proverb/idiom, uses pattern matching to search the list of proverbs/idioms, and outputs all the proverbs/idioms that are similar to the user’s request.

Contains almost 24,000 entries.
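A partial-phrase lookup of this kind can be sketched with a structured query. The example below is only an illustration, not the ATMA dictionary's actual implementation: the table name `proverbs`, its columns, and the function `search_proverbs` are assumptions, and the two entries are taken from examples elsewhere in these slides.

```python
import sqlite3

# Hypothetical in-memory table standing in for the proverb dictionary.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE proverbs (phrase TEXT, meaning TEXT)")
conn.executemany(
    "INSERT INTO proverbs VALUES (?, ?)",
    [
        ("kembang sayap", "spread your wings"),
        ("berpijak di bumi nyata", "do not daydream"),
    ],
)


def search_proverbs(conn, fragment):
    """Return every (phrase, meaning) whose phrase contains the fragment."""
    pattern = f"%{fragment}%"  # LIKE wildcard on both sides = substring match
    rows = conn.execute(
        "SELECT phrase, meaning FROM proverbs WHERE phrase LIKE ? ORDER BY phrase",
        (pattern,),
    )
    return rows.fetchall()


print(search_proverbs(conn, "bumi"))
```

Note that `%` and `_` inside the user's fragment would themselves act as wildcards in this sketch; a real application would need to escape them.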

2.3 Hindi to Urdu Proverbs Translation

Brahmaleen (2010): pattern matching and structured query for Hindi proverbs.

2.4 German to English Idioms Translation

Dmitra (2010): German-English idioms treatment.

Method: syntactic matching rules, METIS II.

3. App Design

3.1 Simpulan Bahasa Detection

While not-end-of-sentence
    words-combined = w[i] w[i+1]
    Search words-combined in proverb-database
    If found in proverb-database
        Put words-combined in the proverb-list-output
        i = i + 2
    Else
        i++
Wend
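The two-word loop for simpulan bahasa can be sketched in Python as follows. This is a minimal illustration rather than the app's actual code: the function name `detect_two_word_proverbs` and the tiny `PROVERB_DB` set are assumptions, and tokenisation is a plain whitespace split.

```python
def detect_two_word_proverbs(sentence, proverb_db):
    """Slide a two-word window over the sentence and collect database hits."""
    words = sentence.lower().split()
    found = []
    i = 0
    while i < len(words) - 1:          # stop when no full pair remains
        pair = f"{words[i]} {words[i + 1]}"
        if pair in proverb_db:
            found.append(pair)
            i += 2                     # skip past the matched proverb
        else:
            i += 1                     # otherwise advance one word
    return found


# Hypothetical stand-in for the proverb database.
PROVERB_DB = {"kucing bertanduk", "kembang sayap"}

sentence = "Kalau kamu tunggu sehingga kucing bertanduk sekali pun"
print(detect_two_word_proverbs(sentence, PROVERB_DB))  # ['kucing bertanduk']
```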

3.2 Peribahasa Detection

While not-end-of-sentence
    words-combined = w[i] w[i+1] w[i+2]
    Search words-combined in proverb-database
    If found in proverb-database
        Put words-combined in the proverb-list-output
        i = i + number-of-words-in-proverb-detected
    Else
        i++
Wend
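The pseudocode for peribahasa uses a three-word window but advances by the number of words in the detected proverb. One way to read this is as a longest-match search over several window sizes. The sketch below takes that reading, which is an assumption rather than the app's confirmed behaviour; `detect_proverbs`, `min_len` and `max_len` are hypothetical names. Setting `min_len=3, max_len=3` reproduces the fixed three-word window.

```python
def detect_proverbs(sentence, proverb_db, min_len=3, max_len=8):
    """Find proverbs of min_len..max_len words, preferring the longest match."""
    words = sentence.lower().split()
    found = []
    i = 0
    while i < len(words):
        match_len = 0
        longest = min(max_len, len(words) - i)   # window cannot pass the end
        for n in range(longest, min_len - 1, -1):
            candidate = " ".join(words[i:i + n])
            if candidate in proverb_db:
                found.append(candidate)
                match_len = n
                break
        # Jump over a detected proverb, otherwise move one word forward.
        i += match_len if match_len else 1
    return found


# Hypothetical database entry and input sentence for illustration.
db = {"berpijak di bumi nyata"}
print(detect_proverbs("Kita mesti berpijak di bumi nyata", db))
```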

3.3 UI

Available at: http://bit.ly/pbahasa

4. Discussions

Challenges in Malay proverbs

Words with affixes. Example: “Kembang sayap” = “Mengembangkan sayap” (spread your wings).

Another word in between (stopword). Example: “berpijak di bumi nyata”, or sometimes “berpijak di bumi yang nyata”, which means “do not daydream”.

4. Discussions

The researchers found that simple pattern matching failed to detect proverbs affected by the aforementioned problems.

Another challenge is determining the right meaning for an ambiguous proverb, since the same proverb may have more than one meaning. In this experiment, however, all meanings are listed for a proverb with ambiguous meaning.
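Listing every meaning of an ambiguous proverb amounts to a one-to-many mapping from proverb to meanings. The sketch below shows one possible structure; the names `meanings`, `add_meaning` and `lookup` are assumptions, and the proverb and its meanings are placeholders rather than real dictionary entries.

```python
from collections import defaultdict

# Each proverb maps to a list of meanings, kept in insertion order.
meanings = defaultdict(list)


def add_meaning(proverb, meaning):
    """Record a meaning for a proverb, ignoring exact duplicates."""
    if meaning not in meanings[proverb]:
        meanings[proverb].append(meaning)


def lookup(proverb):
    """Return all recorded meanings, or an empty list if the proverb is unknown."""
    return list(meanings.get(proverb, []))


# Placeholder data only: not a real proverb or real meanings.
add_meaning("peribahasa contoh", "first meaning (placeholder)")
add_meaning("peribahasa contoh", "second meaning (placeholder)")
print(lookup("peribahasa contoh"))
```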

4. Discussions

Next steps: include stemming and stop-word removal prior to detection.
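The planned stemming and stop-word removal step might look like the sketch below, which normalises both the input text and the database entries before matching. It is a toy illustration under stated assumptions: `STOPWORDS`, `ROOTS`, `PREFIXES` and `SUFFIXES` are tiny hand-picked lists covering only the examples from these slides, and the dictionary-checked affix stripping stands in for a proper Malay stemmer.

```python
# Illustrative resources only; a real system needs full lists and a real stemmer.
STOPWORDS = {"di", "yang"}
ROOTS = {"kembang", "sayap", "pijak", "bumi", "nyata"}
PREFIXES = ("meng", "ber")
SUFFIXES = ("kan",)


def stem(word):
    """Return a root from ROOTS if simple affix stripping finds one, else the word."""
    if word in ROOTS:
        return word
    candidates = [word]
    for prefix in PREFIXES:
        if word.startswith(prefix):
            rest = word[len(prefix):]
            candidates.append(rest)
            if prefix == "meng":
                # meng- drops a root-initial k (kembang -> mengembang), so restore it.
                candidates.append("k" + rest)
    for cand in list(candidates):      # also try each candidate without a suffix
        for suffix in SUFFIXES:
            if cand.endswith(suffix):
                candidates.append(cand[: -len(suffix)])
    for cand in candidates:
        if cand in ROOTS:
            return cand
    return word


def normalise(text):
    """Lowercase, drop stop-words, and stem the remaining tokens."""
    tokens = text.lower().split()
    return [stem(t) for t in tokens if t not in STOPWORDS]


print(normalise("Mengembangkan sayap"))
print(normalise("berpijak di bumi yang nyata"))
```

With this normalisation, “Mengembangkan sayap” and “Kembang sayap” reduce to the same token list, as do “berpijak di bumi nyata” and “berpijak di bumi yang nyata”, so an exact match on the normalised forms would catch both variants.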

5. Conclusions

Brief testing and observation of the application's results have been carried out. The researchers concluded that there is more to improve, namely by implementing another approach.

One of the approaches to be studied is the syntactic matching rules proposed by Dmitra [4] for detecting German idioms.

This Malay proverb detector is a prototype for experimenting with the pattern matching approach on a mobile platform.

Though it is still at the experimentation stage, the researchers hope that it can contribute to the public by helping new learners of the Malay language.

Thank you

Q & A
