Drupal.org Search Evaluation

Posted on 22-Nov-2014


description

An Evaluation of Drupal.org Search System.


Enterprise Search Engine Survey

Isriya Paireepairit

Drupal.org Case

Drupal

• Software

• Content Management System

• Web-based

• PHP

Drupal.org

• Home of Drupal the CMS

• For Drupal users, downloaders, developers

• Naturally runs on Drupal as its CMS

• And uses Drupal's built-in search function

Drupal.org Content Types

• Projects

• Modules

• Themes

• Translations

• Forums (Support, Discussion, Chit-chat)

• Documents (Manual, Howto)

• Issues (Bugs, Feature Requests)

• API Documents (for Developers)

• User page

• News/Announcement

Drupal.org Content Size

• As of mid-April 2008

• Content: 250,000 nodes

• Registered users: 280,000

• Page visits: ~1M/day (Compete.com)

Drupal Search Function

• Indexing

• Minimum word length is configurable

• CJK Handling
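The two indexing options above can be illustrated with a small tokenizer. This is a minimal sketch, not Drupal's actual code: the function names, the Unicode ranges chosen for CJK, and the idea of splitting CJK runs into overlapping n-grams of the minimum word length are assumptions made for illustration.

```python
import re

# Assumed, simplified CJK ranges: Hiragana/Katakana, CJK ideographs, Hangul.
CJK_CHARS = "\u3040-\u30ff\u4e00-\u9fff\uac00-\ud7af"
CJK_SPLIT = re.compile(f"([{CJK_CHARS}]+)")
CJK_ONLY = re.compile(f"[{CJK_CHARS}]+")
WORD = re.compile(r"[^\W_]+")  # runs of letters and digits


def cjk_ngrams(run, n):
    """Split an unspaced CJK run into overlapping n-character terms."""
    if len(run) <= n:
        return [run]
    return [run[i:i + n] for i in range(len(run) - n + 1)]


def index_terms(text, min_word_length=3, cjk_handling=True):
    """Return the terms that would be indexed for a piece of text."""
    terms = []
    for token in WORD.findall(text.lower()):
        # The capturing group keeps the CJK runs in the split result.
        for piece in CJK_SPLIT.split(token):
            if not piece:
                continue
            if cjk_handling and CJK_ONLY.fullmatch(piece):
                terms.extend(cjk_ngrams(piece, min_word_length))
            elif len(piece) >= min_word_length:
                terms.append(piece)  # words below the minimum are dropped
    return terms
```

With a minimum word length of 3, `index_terms("How to install a CCK module")` drops "to" and "a" and keeps the remaining four words.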

Drupal Search Function

• Search result ranking

• Weightable

• 3 default factors

• Keyword relevance

• Recency

• Number of comments

Drupal.org Implementation

• Keyword relevance: 10

• Recency: 5

• Number of comments: 1

Source: http://www.civicactions.com/blog/search/part_1
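To make the weighting concrete, the sketch below combines the three factors using the Drupal.org weights listed above (10, 5, 1). It assumes each factor has already been normalized to the range 0 to 1; how Drupal actually normalizes them is not stated in the slides, and the `Node` type and function names are hypothetical.

```python
from dataclasses import dataclass

# Weights taken from the slide; the formula around them is an assumption.
WEIGHTS = {"relevance": 10, "recency": 5, "comments": 1}


@dataclass
class Node:
    title: str
    relevance: float  # keyword relevance, assumed normalized to 0..1
    recency: float    # newer content is closer to 1
    comments: float   # more comments is closer to 1


def score(node, weights=WEIGHTS):
    """Weighted average of the three ranking factors."""
    total = sum(weights.values())
    weighted = (
        weights["relevance"] * node.relevance
        + weights["recency"] * node.recency
        + weights["comments"] * node.comments
    )
    return weighted / total


def rank(nodes, weights=WEIGHTS):
    """Return nodes ordered from highest to lowest score."""
    return sorted(nodes, key=lambda n: score(n, weights), reverse=True)
```

Under these weights, a node that matches the keywords perfectly but is old and uncommented (score 10/16) still outranks a fresh, heavily commented node with no keyword match (score 6/16).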

Demo

http://drupal.org/search

Good

• Simplicity

• Advanced Search

• (Some) Specific content type search

• Detailed result

Simplicity

Advanced Search

(Some) Specific Content Types

Detailed Result

Some Problems

[Slide images: problems 1, 2, and 3]

Improvement Ideas

• Give higher priority to some content types

• Projects > Documents > Forums

• Add sorting option

• By type

• Also by date, number of comments
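The two ideas above, a per-type priority and a user-selectable sort order, could be sketched as follows. The boost values, the result fields, and the sample data are all hypothetical; only the ordering Projects > Documents > Forums comes from the slide.

```python
from datetime import date

# Hypothetical multipliers that respect Projects > Documents > Forums.
TYPE_PRIORITY = {"project": 3.0, "document": 2.0, "forum": 1.0}


def boosted_score(result):
    """Scale the base relevance score by the content type's priority."""
    return result["score"] * TYPE_PRIORITY.get(result["type"], 1.0)


SORT_KEYS = {
    "relevance": boosted_score,
    "type": lambda r: (TYPE_PRIORITY.get(r["type"], 1.0), boosted_score(r)),
    "date": lambda r: r["created"],
    "comments": lambda r: r["comments"],
}


def sort_results(results, by="relevance"):
    """Sort results, highest or newest first, by the chosen option."""
    return sorted(results, key=SORT_KEYS[by], reverse=True)


results = [
    {"title": "Views forum thread", "type": "forum", "score": 0.9,
     "created": date(2008, 4, 1), "comments": 40},
    {"title": "Views module", "type": "project", "score": 0.5,
     "created": date(2007, 1, 10), "comments": 3},
    {"title": "Views handbook", "type": "document", "score": 0.6,
     "created": date(2008, 2, 5), "comments": 0},
]
```

In this sample, the forum thread has the best raw score, but `sort_results(results)` puts the project page first because of its type priority.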

More Ideas

• Weight by

• Number of incoming links (like PageRank)

• Tag/Category/Taxonomy

• Misspelling Handler

• Synonym Handler

• e.g. “Category” = “Taxonomy”
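A misspelling handler and a synonym handler can both run on the query before it reaches the index. The sketch below is one possible approach, assuming a small known vocabulary and a hand-maintained synonym table (both hypothetical); it uses Python's standard `difflib` for fuzzy matching rather than anything Drupal provides.

```python
import difflib

# Hypothetical data: in practice these might come from the search index
# and from site-maintained term lists.
VOCABULARY = ["taxonomy", "category", "module", "theme", "views"]
SYNONYMS = {"category": {"taxonomy"}, "taxonomy": {"category"}}


def correct(term, vocabulary=VOCABULARY):
    """Replace a likely misspelling with the closest known term."""
    if term in vocabulary:
        return term
    matches = difflib.get_close_matches(term, vocabulary, n=1, cutoff=0.8)
    return matches[0] if matches else term


def expand_query(query):
    """Return one group of alternative terms per query word."""
    groups = []
    for term in query.lower().split():
        term = correct(term)
        groups.append([term] + sorted(SYNONYMS.get(term, ())))
    return groups
```

Each inner list is a set of alternatives to be OR-ed together, so a search for "taxonmy module" would also match pages that only say "category".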

More Experimental Ideas

• Faceted Search
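Faceted search lets users narrow a result set by attributes such as content type, with a count shown next to each choice. A minimal in-memory sketch, assuming results are plain dictionaries with hypothetical `type` and `version` fields:

```python
from collections import Counter


def facet_counts(results, field):
    """Count how many results fall under each value of a field."""
    return Counter(r[field] for r in results if field in r)


def apply_facets(results, **selected):
    """Keep only the results that match every selected facet value."""
    return [
        r for r in results
        if all(r.get(field) == value for field, value in selected.items())
    ]


hits = [
    {"title": "Views", "type": "module", "version": "6.x"},
    {"title": "CCK", "type": "module", "version": "5.x"},
    {"title": "Zen", "type": "theme", "version": "6.x"},
]
```

Here `facet_counts(hits, "type")` gives the numbers to display next to "module" and "theme", and `apply_facets(hits, type="module", version="6.x")` is what a user would see after clicking both facets.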

Further Issues

• Overall site performance

• Indexing and searching are resource-intensive

• Solution

• “Outsource” search function to dedicated search software?

• Google Box

• Apache Solr
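If search were handed off to Apache Solr, the site would send queries over HTTP instead of querying its own database. The sketch below only builds a request URL for Solr's select handler using its standard `q`, `fq`, `rows`, `wt`, and `facet` parameters. The base URL, the core name `drupal`, and the `type` field are assumptions for illustration, not a description of how Drupal.org is set up.

```python
from urllib.parse import urlencode


def solr_select_url(base_url, keywords, content_type=None,
                    facet_fields=(), rows=10):
    """Build a Solr select URL; base_url is e.g. http://host:8983/solr/core."""
    params = [("q", keywords), ("rows", str(rows)), ("wt", "json")]
    if content_type:
        # Filter query: restricts results without affecting relevance scores.
        params.append(("fq", f"type:{content_type}"))
    if facet_fields:
        params.append(("facet", "true"))
        params.extend(("facet.field", field) for field in facet_fields)
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical local Solr instance with a core named "drupal".
url = solr_select_url(
    "http://localhost:8983/solr/drupal",
    "views module",
    content_type="project",
    facet_fields=["type"],
)
```

Because indexing and query execution then happen in a separate process, the resource cost noted above moves off the web servers.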