Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Ismail Fahmi, PhD
Consultant to the National Library of Indonesia (Perpusnas RI), Initiator of Indonesia OneSearch
[email protected]

KPDI 8 Workshop: Indonesia OneSearch Workshop, 8th Indonesian Digital Library Conference (Konferensi Perpustakaan Digital Indonesia 8)
Bogor, 3 November 2015


Page 2: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Introduction

Ismail Fahmi

• 2004 – 2009: PhD, Information Science, University of Groningen, the Netherlands
• 2003 – 2004: Master's degree, Information Science, University of Groningen, the Netherlands
• 1992 – 1997: Bachelor's degree, Electrical Engineering, ITB
• 2009 – present: Engineer at Weborama, a provider of an advertising platform based on big-data audiences (Paris/Amsterdam)
• 2012 – present: Co-Founder of Awesometrics, a media monitoring and analytics company
• 2014 – present: Founder of PT. Media Kernels Indonesia, a natural language processing-based company
• 2015 – present: Consultant to the National Library, Initiator of Indonesia OneSearch
• 2000 – 2003: Initiator of IndonesiaDLN (the first digital library network in Indonesia); developed the Ganesha Digital Library (GDL); founded the Knowledge Management Research Group (KMRG) at ITB; built the ITB Digital Library

Page 3: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Table of Contents

1. Roadmap
2. Interoperability
3. Registration
4. Harvesting & Indexing
5. Searching
6. Virtual Community
7. Conclusion

Page 4: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Why Is a New Portal Still Needed?

• No existing portal indexes every type of collection (journals, ILS, grey literature / digital libraries).
– PortalGaruda and ISJD cover journals only.
– Garuda.dikti.go.id is no longer active.

• No existing portal is both very easy to use and rich in information features.
– PortalGaruda has the best user interface, but the information features it presents are incomplete.
– A portal is needed that makes it very easy for users to find the information they need, and that can even increase serendipity.

• No existing portal is mobile friendly, even though mobile devices are increasingly used.
– PortalGaruda and ISJD were designed for desktop browsers.
– To improve dissemination and usability, the portal must become increasingly user-oriented.

Page 5: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Requirements for the New Portal

• Repositories and collections
– Covers all types of repositories and collections (journals, ILS, digital libraries)

• Interoperability and updates
– OAI-PMH (Harvesting), the only harvesting method
– OAI-PMP (Posting), the offline posting method
– Automatic updates

• User interface and features
– Simple, easy to use, with a powerful search engine enriched by faceted search and complete information.

• Mobile
– Supports mobile devices (smartphones, tablets)

• Authority and reports
– Information about authors, a citation index, and statistics that are important and interesting to contributors.

• Sustainable
– Backed by a system that allows the portal to keep developing over the long term.

Page 6: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Indonesia OneSearch


Any platforms Any collections

Page 7: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Road Map

Phase 1 (2015): OneSearch Basic

Input
• Bibliography
• Perpusnas thematic databases

Process
• Harvesting
• Search & Facet
• Bibliography indexing

Output
• Search bibliography
• Search Summon
• OAI Manager

Phase 2 (2016): OneSearch Advanced

Input
• PDF full text

Process
• PDF crawling
• PDF to text and image conversion
• Full-text indexing
• Linguistic processing
• Semantic indexing
• Content analysis
• Statistics: collection, usage

Output
• View full text online (eReader)
• Content analysis and research
• Citation index
• View collection and usage statistics

Phase 3 (2017): NoPlagiarism

Input
• PDF full text
• Wikipedia (Bahasa Indonesia)
• Online news (Bahasa Indonesia)

Process
• Wikipedia crawling
• Online news crawling
• Document fingerprint indexing
• Similarity analyzer
• Similarity report builder

Output
• Document upload
• Document similarity detection report
• Admin

Page 8: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Phase 1 (Starting 2015)

• OneSearch Basic

• Targets
– Technical:
  • Automatic metadata harvesting, with no manual additions.
  • Interoperability via OAI-PMH
  • Metadata prefixes: MARCXML and OAI_DC
  • Prototype server
  • Repository registration database and form
  • Reporting and analytics
  • Integration of several software packages:
    – ILS: INLIS Lite, SLIMs, KOHA
    – Journal: OJS
    – Digital library/repository: Dspace, etc.
    – Summon
– Non-technical:
  • Management and sustainability strategy for Indonesia OneSearch
  • Input from the community
  • Team formation
  • Cooperation and outreach

Page 9: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Phase 2 (Starting 2016)

• OneSearch Advanced

• Targets:
– Technical:
  • Full-text harvesting
  • Application of NLP (Natural Language Processing) technology
  • Content analysis of full text
  • Searching and research become much easier for users.
  • Authority index
  • Citation parsing and indexing
– Non-technical:
  • Campaign on full-text sharing
  • Demonstrations of the benefits of content analysis for users

Page 10: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Perl ParsCit

• Parsing and indexing citations.

• Using the open-source ParsCit software.
– https://github.com/knmnyn/ParsCit
– http://wing.comp.nus.edu.sg/parsCit/
– ParsCit is used by CiteSeerX to parse documents for citations.

Page 11: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

CiteSeerX


Page 12: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

NLP Technology in OneSearch

• Text mining, content analysis:
– Terminology extraction
– Named entity extraction: person, organization, location, event, time
– Quote extraction
– Co-occurrence analysis
– Relationship extraction: S-P-O (subject-predicate-object) relations, entity relations.
– Clustering, topic mapping
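Co-occurrence analysis, one of the techniques listed on this slide, can be illustrated with a minimal sketch. It assumes that terms have already been extracted per document (the term lists are invented for illustration) and simply counts how often two terms appear in the same document. This is not the OneSearch implementation, only one common way to compute such counts.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count how often each unordered pair of terms appears in the same document."""
    counts = Counter()
    for terms in documents:
        # A sorted set counts each pair once per document and makes
        # (a, b) and (b, a) map to the same key.
        for pair in combinations(sorted(set(terms)), 2):
            counts[pair] += 1
    return counts


# Hypothetical, already-extracted terms for three documents.
docs = [
    ["digital library", "metadata", "harvesting"],
    ["metadata", "harvesting", "indexing"],
    ["digital library", "indexing"],
]

pairs = cooccurrence_counts(docs)
print(pairs[("harvesting", "metadata")])  # 2: both terms occur in the first two documents
```

Pairs with high counts are candidates for the relationship and topic maps mentioned in the list.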

Page 13: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Example of S-P-O Relationship Extraction

Page 14: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Example of Relationship Mapping

Page 15: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Phase 3 (Starting 2017)

• NoPlagiarism

• Targets:
– Technical:
  • Build a plagiarism detection system.
  • Index news, Wikipedia, blogs, and all full text in Indonesia OneSearch for the plagiarism detector, making it the most complete index of Indonesian-language text.
  • Provide server and data center infrastructure for the plagiarism detector.
  • Build a Turnitin-like interface for users at universities, research institutions, and individuals.
– Non-technical:
  • Trial and outreach of NoPlagiarism to lecturers and students at universities.
  • Anti-plagiarism campaign.
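The roadmap names "document fingerprint indexing" and a "similarity analyzer" for this phase without specifying the algorithm. As a hedged illustration only, the sketch below assumes one common approach: hashing overlapping k-word shingles into a fingerprint set and comparing two sets with Jaccard similarity. The function names and the choice of k = 3 are assumptions, not the NoPlagiarism design.

```python
import hashlib


def fingerprints(text, k=3):
    """Return the set of hashed k-word shingles for a text."""
    words = text.lower().split()
    shingles = (" ".join(words[i:i + k]) for i in range(len(words) - k + 1))
    return {hashlib.sha1(s.encode("utf-8")).hexdigest()[:16] for s in shingles}


def similarity(a, b, k=3):
    """Jaccard similarity between the fingerprint sets of two texts (0.0 to 1.0)."""
    fa, fb = fingerprints(a, k), fingerprints(b, k)
    if not fa or not fb:
        # Texts shorter than k words have no shingles to compare.
        return 0.0
    return len(fa & fb) / len(fa | fb)


# Two of the three shingles in each text match, out of four distinct shingles.
print(similarity("a b c d e", "a b c d f"))  # 0.5
```

In a real detector the fingerprints of every indexed document would be stored in an index, so that an uploaded document is compared only against documents sharing at least one fingerprint.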

Page 16: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Turnitin

The ability to detect plagiarism depends heavily on the database of indexed articles. Turnitin mostly indexes English-language articles, but not Indonesian-language ones, and its database is not shared.

Page 17: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Indonesia OneSearch + NoPlagiarism


Page 18: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Complete Configuration of Indonesia OneSearch

Page 19: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

A National Library Program

Indonesia OneSearch is a program by the National Library of Indonesia


Page 21: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Interoperability

[Diagram: multi-platform library information system]

Sources:
• Digital library servers (Eprints, Dspace, etc.)
• Automation/digital library servers (SLIMs)
• Library automation servers (INLIS)
• Other repositories (Omeka, etc.)
• E-journals (OJS)

Channels:
• OAI PMH: harvesting (online)
• OAI PMP: posting (offline)
• Full-text files (PDF): download

Page 22: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Interoperability Scenario

[Diagram labels: slims-ucs.onesearch.id; UCS A, UCS B; UCS Upload; OAI-PMH; Open Journal System]

Page 23: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

OAI-PMH


Page 24: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

OAI-PMH Structure Model


Page 25: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Interoperability Scenario

Page 26: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Example

http://www.jurnal.unsyiah.ac.id/AIJST/oai
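An OAI-PMH request is the base URL plus a `verb` and its arguments in the query string. The minimal sketch below builds such URLs for the example endpoint above; the helper name `oai_request_url` is hypothetical, and the code only constructs the URLs without contacting the server.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.jurnal.unsyiah.ac.id/AIJST/oai"


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL for the given verb and arguments."""
    params = {"verb": verb}
    params.update(arguments)
    return base_url + "?" + urlencode(params)


identify_url = oai_request_url(BASE_URL, "Identify")
records_url = oai_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

print(identify_url)  # http://www.jurnal.unsyiah.ac.id/AIJST/oai?verb=Identify
print(records_url)   # http://www.jurnal.unsyiah.ac.id/AIJST/oai?verb=ListRecords&metadataPrefix=oai_dc
```

The two verbs shown here correspond to the Identify and ListRecords slides that follow.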

Page 27: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Identify


Page 28: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

ListRecords (oai_dc)


Record Header

Record Metadata
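Each record in a `ListRecords` response has the two parts labeled above: a header (identifier, datestamp, set) and a metadata block. The sketch below parses both from an `oai_dc` response using only the standard library. The sample XML is invented for illustration and is not taken from any real repository.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented sample response containing a single record.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:article/1</identifier>
        <datestamp>2015-11-03</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An Example Article</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def parse_records(xml_text):
    """Extract header fields and Dublin Core titles/creators from each record."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        header = rec.find("oai:header", NS)
        # Deleted records carry a header but no metadata element.
        dc = rec.find("oai:metadata/oai_dc:dc", NS)
        records.append({
            "identifier": header.findtext("oai:identifier", namespaces=NS),
            "datestamp": header.findtext("oai:datestamp", namespaces=NS),
            "titles": [e.text for e in dc.findall("dc:title", NS)] if dc is not None else [],
            "creators": [e.text for e in dc.findall("dc:creator", NS)] if dc is not None else [],
        })
    return records


records = parse_records(SAMPLE)
print(records[0]["identifier"], records[0]["titles"])
```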

Page 30: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

ListRecords (marcxml)


http://dev.pnri.go.id/oaiinlis/oai2.aspx?verb=ListRecords&metadataPrefix=marcxml

Record Header

Record Metadata
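With `metadataPrefix=marcxml`, the metadata part of each record is a MARC 21 record in XML, where fields are addressed by tag and subfield code. As a small sketch, the helper below (a hypothetical name) reads subfields such as 245$a (title) and 100$a (main author) from an invented sample record.

```python
import xml.etree.ElementTree as ET

MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Invented sample record, for illustration only.
SAMPLE_MARCXML = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Doe, Jane</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">An example title</subfield>
    <subfield code="c">Jane Doe</subfield>
  </datafield>
</record>"""


def marc_subfields(record, tag, code):
    """Return the text of every subfield `code` in datafields with the given tag."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    return [sf.text for sf in record.findall(path, MARC_NS)]


marc_record = ET.fromstring(SAMPLE_MARCXML)
print(marc_subfields(marc_record, "245", "a"))  # ['An example title']
print(marc_subfields(marc_record, "100", "a"))  # ['Doe, Jane']
```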

Page 31: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

ResumptionToken (marcxml)
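Large result sets are returned in pages: each response ends with a `resumptionToken`, and the harvester repeats the request with that token until the token comes back empty or absent. The sketch below shows that loop. It assumes a caller-supplied `fetch` function (hypothetical) that takes the request arguments and returns the response XML, and it is exercised here with canned pages instead of a live server.

```python
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"


def harvest_identifiers(fetch, metadata_prefix="marcxml"):
    """Yield record identifiers, following resumptionTokens until exhausted."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(params))
        for ident in root.iter(OAI + "identifier"):
            yield ident.text
        token = root.find(f".//{OAI}resumptionToken")
        if token is None or not (token.text or "").strip():
            break  # an empty or missing token marks the last page
        # resumptionToken is an exclusive argument: send it with the verb only.
        params = {"verb": "ListRecords", "resumptionToken": token.text.strip()}


# Canned two-page response standing in for a real repository.
PAGES = {
    None: '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>'
          '<record><header><identifier>oai:example:1</identifier></header></record>'
          '<resumptionToken>page-2</resumptionToken></ListRecords></OAI-PMH>',
    "page-2": '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>'
              '<record><header><identifier>oai:example:2</identifier></header></record>'
              '<resumptionToken/></ListRecords></OAI-PMH>',
}


def fake_fetch(params):
    return PAGES[params.get("resumptionToken")]


harvested = list(harvest_identifiers(fake_fetch))
print(harvested)  # ['oai:example:1', 'oai:example:2']
```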


Page 33: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Example of a Library OAI-PMH (SLIMs)


Page 35: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Joining Indonesia OneSearch

• Legal aspects:
– Your institution will cooperate with the National Library of Indonesia (Perpustakaan Nasional RI).
– An MOU and a cooperation agreement (if needed) can be arranged with the National Library of Indonesia.

• Technical aspects:
– Online registration via the OneSearch.id site
– Consultation/support from the Indonesia OneSearch technical team

Page 36: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Repository Types

• Journal
• Integrated Library System (ILS)
• Digital Repository/Library

Page 37: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Journal Registration

http://onesearch.id/Repositories/AddJournal

Page 38: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Journal Registration (cont.)

Library Type

Software Platform

Metadata Prefix

Page 39: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Journal Registration (cont.)

Subject areas are adopted from ISJD PDII LIPI

Page 40: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Digital Repository


Page 41: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

ILS


Page 42: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Browse OneSearch Repositories



Page 44: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Harvesting & Indexing

• Performed automatically and periodically by the Indonesia OneSearch server.

• Always make sure your OAI-PMH endpoint is active and accessible to the IOS server.
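A repository owner can check that an endpoint is active by requesting `verb=Identify` and confirming that the reply is a well-formed Identify response. The sketch below validates only the response text; the function name `check_identify` and the sample responses are assumptions for illustration. Against a live endpoint, the XML could first be fetched with `urllib.request.urlopen`.

```python
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"


def check_identify(xml_text):
    """Return the repository name if xml_text is a valid Identify response, else None."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return None  # not XML at all, e.g. an HTML error page
    if root.find(OAI + "error") is not None:
        return None  # the server answered with an OAI-PMH error
    name = root.find(f"{OAI}Identify/{OAI}repositoryName")
    return name.text if name is not None else None


# Invented responses, for illustration only.
GOOD = ('<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><Identify>'
        '<repositoryName>Example Library</repositoryName></Identify></OAI-PMH>')
BAD = ('<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">'
       '<error code="badVerb">Illegal verb</error></OAI-PMH>')

print(check_identify(GOOD))  # Example Library
print(check_identify(BAD))   # None
```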


Page 46: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Search & Browse IOS


Page 47: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Integration with Summon® Service

Page 48: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Your Repository's Address in IOS


Page 50: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Example: KINK (Katalog Induk Nasional Kesehatan, the National Health Union Catalog)

Page 51: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Data Flow between OneSearch and KINK

[Diagram labels: Onesearch.kink.kemkes.go.id; Koha, Dspace, SLIMs, etc. (online); Indonesia OneSearch; UCS SLIMs; OAI PMH; Filtering; SLIMs (offline); UCS upload]

Page 52: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Phase 1: online repositories

1. Pusat Komunikasi Publik
2. Sekretariat Badan Litbang Kesehatan
3. Pusdiklat Aparatur
4. Poltekkes Jakarta II
5. Poltekkes Jakarta III
6. Poltekkes Malang
7. Poltekkes Surabaya
8. Poltekkes Semarang
9. Poltekkes Yogyakarta
10. Poltekkes Padang
11. Poltekkes Bandung
12. Poltekkes Tanjung Karang
13. Poltekkes Denpasar

Page 53: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Phase 2: offline repositories

1. Ditjen P2PL
2. Sekretariat Badan PPSDM
3. Poltekkes Jakarta I
4. Poltekkes Bengkulu
5. Poltekkes Aceh
6. Poltekkes Palembang
7. Poltekkes Tasikmalaya
8. Poltekkes Pontianak
9. Poltekkes Banjarmasin

Page 54: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Virtual Hosting URLs


Page 56: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Conclusion

• Requirements for joining Indonesia OneSearch:
– Prepare the technical requirements:
  • Have a library automation system (ILS) that uses software such as SLIMs, Koha, etc.
  • Or have a digital library system that uses software such as DSPACE, Eprints, etc.
  • Requirements: http://wiki.onesearch.id/doku.php?id=syarat-bergabung
  • Make sure the automation/digital library system already supports the OAI-PMH protocol.
  • For a SLIMs example, see http://wiki.onesearch.id/doku.php?id=oai-slims.

• Registering with Indonesia OneSearch:
– Contact Indonesia OneSearch (Ismail Fahmi, [email protected])
– Fill in the 'suggestion' form that matches your repository type: Journal, Digital Repository/Library, or ILS.

• Harvesting, Indexing, Launching
– Indonesia OneSearch carries out the next steps: harvesting and indexing the data from your library's repository.
– Once harvesting and indexing are complete, your library's collection will be accessible from Indonesia OneSearch.

Page 57: Indonesia Onesearch: Registration, Harvesting, Indexing, Searching, and Community Virtual Hosting

Thank You

Ismail Fahmi
Initiator, Indonesia OneSearch
Mobile: 0812 8908 3894
Email: [email protected]