Making your domain objects searchable with Hibernate Search

Gustavo Fernandes

Sunday, 23 May 2010

description

Presentation about Hibernate Search given at Apache Lucene EuroCon in Prague, Czech Republic, on 20 May 2010.

Transcript of Making your domain objects searchable with Hibernate Search

Page 1: Making your domain objects searchable with Hibernate Search

Making Your Domain Objects Searchable with Hibernate Search

Gustavo Fernandes

Sunday, 23 May 2010

Page 2: Making your domain objects searchable with Hibernate Search

Apache Lucene EuroCon, 20 May 2010

Agenda

Motivations and Goals

Indexing

Retrieval

Scalability

Page 3: Making your domain objects searchable with Hibernate Search

Hibernate in a nutshell

Page 4: Making your domain objects searchable with Hibernate Search

Hibernate in a nutshell

@Entity
public class Author {
    @Id @GeneratedValue
    private Integer id;
    private String name;
    @OneToMany
    private Set<Book> books;
}

@Entity
public class Book {
    @Id @GeneratedValue
    private Integer id;
    private String title;
}

Page 5: Making your domain objects searchable with Hibernate Search

Hibernate in a nutshell

@Entity
public class Author {
    @Id @GeneratedValue
    private Integer id;
    private String name;
    @OneToMany
    private Set<Book> books;
}

@Entity
public class Book {
    @Id @GeneratedValue
    private Integer id;
    private String title;
}

Author author = new Author("Stephen King");
Book aBook = new Book("Blaze");
HashSet<Book> books = new HashSet<Book>();
books.add(aBook);
author.setBooks(books);

Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();
session.save(author);
tx.commit();

Page 6: Making your domain objects searchable with Hibernate Search

Hibernate in a nutshell

@Entity
public class Author {
    @Id @GeneratedValue
    private Integer id;
    private String name;
    @OneToMany
    private Set<Book> books;
}

@Entity
public class Book {
    @Id @GeneratedValue
    private Integer id;
    private String title;
}

Author author = new Author("Stephen King");
Book aBook = new Book("Blaze");
HashSet<Book> books = new HashSet<Book>();
books.add(aBook);
author.setBooks(books);

Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();
session.save(author);
tx.commit();

Select * from Author;
+----+--------------+
| id | name         |
+----+--------------+
|  1 | Stephen King |
+----+--------------+

Select * from Book;
+----+-------+
| id | title |
+----+-------+
|  1 | Blaze |
+----+-------+

Select * from Book_Author;
+---------+------------+
| Book_id | authors_id |
+---------+------------+
|       1 |          1 |
+---------+------------+

Page 7: Making your domain objects searchable with Hibernate Search

Meet Hibernate Search

Hibernate extension which uses Lucene internally

Brings full-text search capabilities to Hibernate

Object-Document mapping

Takes care of the plumbing

Keeps database and index in sync

Convention over configuration

Flexible

Page 8: Making your domain objects searchable with Hibernate Search

Meet Hibernate Search

Current version: 3.2.0.Final (May 2010)

LGPL License

Lucene version supported: 2.9.2

Solr version supported: 1.4

Page 9: Making your domain objects searchable with Hibernate Search

Meet Hibernate Search

Dependencies:

<dependency>
    <groupId>org.hibernate</groupId>
    <artifactId>hibernate-search</artifactId>
    <version>3.2.0.Final</version>
</dependency>

Page 10: Making your domain objects searchable with Hibernate Search

Indexing

Mapping Objects <-> Documents

Support for types

Analyzers/Boost

Transparent/Manual Indexing

Page 11: Making your domain objects searchable with Hibernate Search

Indexing - Mapping Entities

@Entity
public class Author {
    @Id @GeneratedValue
    private Integer id;

    private String name;

    @OneToMany
    private Set<Book> books;
}

Page 12: Making your domain objects searchable with Hibernate Search

Indexing - Mapping Entities

@Indexed
@Entity
public class Author {
    @Id @GeneratedValue
    private Integer id;

    private String name;

    @OneToMany
    private Set<Book> books;
}

Page 14: Making your domain objects searchable with Hibernate Search

Indexing - Mapping Entities

@Indexed(index = "Author_Index")
@Entity
public class Author {
    @Id @GeneratedValue
    private Integer id;

    private String name;

    @OneToMany
    private Set<Book> books;
}

Page 15: Making your domain objects searchable with Hibernate Search

Indexing - Mapping Entities

@Indexed(index = "Author_Index")
@Entity
public class Author {
    @Id @GeneratedValue @DocumentId
    private Integer id;

    private String name;

    @OneToMany
    private Set<Book> books;
}

Page 16: Making your domain objects searchable with Hibernate Search

Indexing - Mapping Fields

@Indexed(index = "Author_Index")
@Entity
public class Author {
    @Id @GeneratedValue @DocumentId
    private Integer id;

    @Field
    private String name;

    @OneToMany
    private Set<Book> books;
}

Page 17: Making your domain objects searchable with Hibernate Search

Indexing - Mapping Fields

@Indexed(index = "Author_Index")
@Entity
public class Author {
    @Id @GeneratedValue @DocumentId
    private Integer id;

    // The field name is a String, so it must be quoted
    @Field(name = "name_field", store = Store.YES, index = Index.TOKENIZED)
    private String name;

    @OneToMany
    private Set<Book> books;
}

Page 18: Making your domain objects searchable with Hibernate Search

Indexing - Mapping Fields

@Indexed(index = "Author_Index")
@Entity
public class Author {
    @Id @GeneratedValue @DocumentId
    private Integer id;

    @Fields({
        @Field(index = Index.TOKENIZED),
        @Field(name = "nameForSort", index = Index.UN_TOKENIZED)
    })
    private String name;

    @OneToMany
    private Set<Book> books;
}

Page 19: Making your domain objects searchable with Hibernate Search

Indexing - Mapping Relationships

@Indexed(index = "Author_Index")
@Entity
public class Author {
    @Id @GeneratedValue @DocumentId
    private Integer id;

    @Field(index = Index.TOKENIZED)
    private String name;

    @OneToMany
    @IndexEmbedded
    private Set<Book> books;
}

Page 20: Making your domain objects searchable with Hibernate Search

Indexing - Types

@Indexed(index = "Author_Index")
@Entity
public class Author {
    @Id @GeneratedValue @DocumentId
    private Integer id;

    @Field(index = Index.TOKENIZED)
    private String name;

    @OneToMany
    @IndexEmbedded
    private Set<Book> books;

    // A custom bridge converts the Address object into indexable text
    @Field(bridge = @FieldBridge(impl = AddressBridge.class))
    private Address address;
}

Page 21: Making your domain objects searchable with Hibernate Search

Indexing - Boost

@Indexed(index = "Author_Index")
@Entity
public class Author {
    @Id @GeneratedValue @DocumentId
    private Integer id;

    @Field(index = Index.TOKENIZED)
    @Boost(1.5f)
    private String name;

    @OneToMany
    @IndexEmbedded
    private Set<Book> books;

    @Field(bridge = @FieldBridge(impl = AddressBridge.class))
    @Boost(0.75f)
    private Address address;
}

Page 22: Making your domain objects searchable with Hibernate Search

Indexing - Analyzers

@Entity
@Indexed
public class Author {
    @Id @GeneratedValue @DocumentId
    private Integer id;

    private String bio;
    ...
}

Page 23: Making your domain objects searchable with Hibernate Search

Indexing - Analyzers

@Entity
@Indexed
@AnalyzerDef(name = "combinedAnalyzers",
    tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class))
public class Author {
    @Id @GeneratedValue @DocumentId
    private Integer id;

    private String bio;
    ...
}

Page 24: Making your domain objects searchable with Hibernate Search

Indexing - Analyzers

@Entity
@Indexed
@AnalyzerDef(name = "combinedAnalyzers",
    tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
    filters = {
        @TokenFilterDef(factory = LowerCaseFilterFactory.class)
    })
public class Author {
    @Id @GeneratedValue @DocumentId
    private Integer id;

    private String bio;
    ...
}

Page 25: Making your domain objects searchable with Hibernate Search

Indexing - Analyzers

@Entity
@Indexed
@AnalyzerDef(name = "combinedAnalyzers",
    charFilters = {
        @CharFilterDef(factory = MappingCharFilterFactory.class)
    },
    tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
    filters = {
        @TokenFilterDef(factory = LowerCaseFilterFactory.class)
    })
public class Author {
    @Id @GeneratedValue @DocumentId
    private Integer id;

    private String bio;
    ...
}

Page 26: Making your domain objects searchable with Hibernate Search

Indexing - Analyzers

@Entity
@Indexed
@AnalyzerDef(name = "combinedAnalyzers",
    charFilters = {
        @CharFilterDef(factory = MappingCharFilterFactory.class)
    },
    tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
    filters = {
        @TokenFilterDef(factory = LowerCaseFilterFactory.class)
    })
public class Author {
    @Id @GeneratedValue @DocumentId
    private Integer id;

    // @Analyzer only takes effect on a property that is also indexed with @Field
    @Field
    @Analyzer(definition = "combinedAnalyzers")
    private String bio;
    ...
}
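The analyzer definition above chains three stages: a char filter, a tokenizer and a token filter. As a rough illustration of what such a chain does to a piece of text, here is a plain-Java sketch. It does not use the Lucene or Solr factories named on the slide; the class name, the method names and the two-entry character mapping are assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy stand-in for an analyzer chain: char filter -> tokenizer -> token filter.
// This is NOT Lucene/Solr code; every name here is hypothetical.
final class AnalyzerChainSketch {

    // Char filter stage: rewrite characters before tokenizing (assumed mapping table).
    static String mapChars(String text, Map<String, String> mapping) {
        String result = text;
        for (Map.Entry<String, String> e : mapping.entrySet()) {
            result = result.replace(e.getKey(), e.getValue());
        }
        return result;
    }

    // Tokenizer stage: split on every run of characters that are not letters or digits.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // Token filter stage: lower-case every token.
    static List<String> lowerCase(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            out.add(token.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    // Runs the three stages in the same order as the annotation lists them.
    static List<String> analyze(String text) {
        Map<String, String> mapping = new LinkedHashMap<>();
        mapping.put("\u00e9", "e"); // e with acute accent -> e
        mapping.put("\u00fc", "u"); // u with diaeresis -> u
        return lowerCase(tokenize(mapChars(text, mapping)));
    }

    public static void main(String[] args) {
        System.out.println(analyze("Stephen KING, n\u00e9 in Portland"));
    }
}
```

The same chain has to run at query time as well, otherwise a search for "King" would never match the lower-cased token stored in the index.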

Page 27: Making your domain objects searchable with Hibernate Search

Indexing - Fluent API

SearchMapping mapping = new SearchMapping();

mapping
    .analyzerDef("customAnalyzer", StandardTokenizerFactory.class)
        .filter(LowerCaseFilterFactory.class)
        .filter(SnowballPorterFilterFactory.class)
            .param("language", "English")
    .entity(Author.class)
        .indexed()
        .property("id", ElementType.FIELD).documentId()
        .property("address", ElementType.FIELD)
            .field().bridge(AddressBridge.class).store(Store.YES)
        .property("books", ElementType.FIELD).indexEmbedded()
        .property("name", ElementType.METHOD).field().store(Store.YES)
    .entity(Book.class)
        .indexed()
        .property("id", ElementType.METHOD).documentId()
        .property("title", ElementType.METHOD)
            .field().analyzer("customAnalyzer");

Page 28: Making your domain objects searchable with Hibernate Search

Indexing - Backend

Source: Hibernate Search in Action

Page 29: Making your domain objects searchable with Hibernate Search

Indexing - Backend

hibernate.search.worker.execution = async

hibernate.search.worker.thread_pool.size = 10

Source: Hibernate Search in Action
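With asynchronous execution, the thread that commits the transaction does not wait for the index update; the work is handed to a pool of worker threads. The sketch below imitates that hand-off with a standard `ExecutorService`. It is only an analogy in plain Java, not Hibernate Search code: `AsyncIndexingSketch`, `submitIndexWork` and the in-memory list standing in for the index are all hypothetical.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Analogy for async index work: callers enqueue work and return immediately.
// Hypothetical names throughout; the list stands in for the Lucene index.
final class AsyncIndexingSketch {
    private final ExecutorService pool;
    private final List<String> index = new CopyOnWriteArrayList<>();

    AsyncIndexingSketch(int threadPoolSize) {
        this.pool = Executors.newFixedThreadPool(threadPoolSize);
    }

    // Called once the database transaction has committed; does not block on indexing.
    void submitIndexWork(String document) {
        pool.execute(() -> {
            index.add(document);
        });
    }

    // Stops accepting work and waits for queued work; returns false on timeout or interrupt.
    boolean shutdownAndWait(long seconds) {
        pool.shutdown();
        try {
            return pool.awaitTermination(seconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    List<String> indexedDocuments() {
        return index;
    }

    public static void main(String[] args) {
        AsyncIndexingSketch backend = new AsyncIndexingSketch(10);
        backend.submitIndexWork("Author#1");
        backend.submitIndexWork("Book#1");
        backend.shutdownAndWait(5);
        System.out.println(backend.indexedDocuments().size() + " documents indexed");
    }
}
```

The trade-off is the usual one for asynchronous work: the request thread returns sooner, but a query issued right after the commit may not see the new document yet.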

Page 30: Making your domain objects searchable with Hibernate Search

Indexing - JMS backend

hibernate.search.worker.backend = jms

hibernate.search.worker.jms.connection_factory = /ConnectionFactory

hibernate.search.worker.jms.queue = queue/hsearch

Source: Hibernate Search in Action

Page 31: Making your domain objects searchable with Hibernate Search

Manual Indexing

Use case

• Non-exclusive database

Manual indexing types

• Single entity

• Mass indexer

Page 32: Making your domain objects searchable with Hibernate Search

Manual Indexing - Single Entity

FullTextSession fullTextSession = Search.getFullTextSession(session);
Transaction tx = fullTextSession.beginTransaction();
Object author = fullTextSession.load(Author.class, 1);
fullTextSession.index(author);
tx.commit();

Page 33: Making your domain objects searchable with Hibernate Search

Mass Indexing

// Blocks until the whole index has been rebuilt
fullTextSession.createIndexer().startAndWait();

// Returns immediately and rebuilds the index in the background
fullTextSession.createIndexer().start();

Page 34: Making your domain objects searchable with Hibernate Search

Retrieval - Lucene Queries + Hibernate API

// Wraps the Hibernate Session object
org.hibernate.search.FullTextSession fullTextSession =
    org.hibernate.search.Search.getFullTextSession(session);

// Lucene query
Version v = Version.LUCENE_29;
org.apache.lucene.queryParser.QueryParser queryParser =
    new org.apache.lucene.queryParser.QueryParser(v, "name", new StandardAnalyzer(v));
org.apache.lucene.search.Query query = queryParser.parse("+King");

// Hibernate Search query
org.hibernate.Query textQuery = fullTextSession.createFullTextQuery(query, Author.class);

// list() returns a List of matches, not a single Author
List<Author> loadedAuthors = textQuery.list();

Page 35: Making your domain objects searchable with Hibernate Search

Retrieval - Hibernate Search

1. Executes the Lucene query and gets the results

2. Retrieves the document ids from the index

3. Loads the objects from the database

4. Returns the domain objects
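The four retrieval steps can be imitated with two in-memory maps, one playing the role of the index and one the role of the database. This is a simplified plain-Java sketch of the flow only; `RetrievalFlowSketch` and its methods are hypothetical and none of it is Hibernate Search or Lucene API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Index lookup followed by a database load, using maps as stand-ins.
// Hypothetical names; not Hibernate Search or Lucene API.
final class RetrievalFlowSketch {
    // Stand-in for the Lucene index: term -> ids of the documents containing it.
    private final Map<String, List<Integer>> index = new HashMap<>();
    // Stand-in for the database table: id -> author name.
    private final Map<Integer, String> database = new HashMap<>();

    // Stores the row and indexes each whitespace-separated, lower-cased term.
    void save(int id, String name) {
        database.put(id, name);
        for (String term : name.toLowerCase(Locale.ROOT).split("\\s+")) {
            index.computeIfAbsent(term, k -> new ArrayList<>()).add(id);
        }
    }

    List<String> search(String term) {
        // Steps 1 and 2: run the query against the index and collect the matching ids.
        List<Integer> ids = index.getOrDefault(
                term.toLowerCase(Locale.ROOT), Collections.<Integer>emptyList());
        // Step 3: load the matching rows from the database by id.
        List<String> results = new ArrayList<>();
        for (Integer id : ids) {
            String name = database.get(id);
            if (name != null) {
                results.add(name);
            }
        }
        // Step 4: hand the loaded objects back to the caller.
        return results;
    }

    public static void main(String[] args) {
        RetrievalFlowSketch store = new RetrievalFlowSketch();
        store.save(1, "Stephen King");
        store.save(2, "Stephen Hawking");
        System.out.println(store.search("King"));
    }
}
```

Because only ids come from the index, the returned objects always reflect the current database state, at the cost of one extra database round trip per search.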

Page 36: Making your domain objects searchable with Hibernate Search

Retrieval - Results Manipulation

Pagination

Type restriction

Projection

Result mapping

Page 37: Making your domain objects searchable with Hibernate Search

Retrieval - IndexReader

shared strategy: shared IndexReader (default)
    hibernate.search.reader.strategy = shared

not-shared strategy: opens an IndexReader for every query
    hibernate.search.reader.strategy = not-shared

Extensible through the ReaderProvider interface
    hibernate.search.reader.strategy = com.mycompany.CoolReaderProvider

Page 38: Making your domain objects searchable with Hibernate Search

Scalability

Sharding

Clustering

Page 39: Making your domain objects searchable with Hibernate Search

Scalability - Sharding

• Default: one index per entity type

• Shard: two or more indexes per entity type

• Use cases

  • Performance

  • Maintenance

[Figure: without sharding, the application queries and indexes a single index covering A - Z; with sharding, the index is split into Shard A (A - H), Shard B (I - N) and Shard C (O - Z).]

Page 40: Making your domain objects searchable with Hibernate Search

Scalability - Sharding

Indexes separated physically

Virtual Index

[Figure: the application queries and indexes through a virtual index that sits in front of Shard A (A - H), Shard B (I - N) and Shard C (O - Z).]

Page 41: Making your domain objects searchable with Hibernate Search

Scalability - Sharding

Configuration

hibernate.search.com.sourcesense.Author.sharding_strategy.nbr_of_shards = 2

Page 42: Making your domain objects searchable with Hibernate Search

Scalability - Shard Strategy

Default algorithm: ID Hash

f(x) = x % N

[Figure: documents with ids 1 to 5 are distributed across Shard 1 and Shard 2 according to f(x).]
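The formula f(x) = x % N can be written out directly. The sketch below follows the slide's simplified form for integer ids and is not the library's own implementation; the class and method names are hypothetical, and shards are numbered from 0 here whereas the figure labels them from 1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Simplified id-modulo shard selection, following f(x) = x % N from the slide.
// Hypothetical names; shard numbers are zero-based in this sketch.
final class IdHashShardingSketch {

    // floorMod keeps the result in the range 0..N-1 even if an id is negative.
    static int shardFor(int id, int numberOfShards) {
        if (numberOfShards <= 0) {
            throw new IllegalArgumentException("numberOfShards must be positive");
        }
        return Math.floorMod(id, numberOfShards);
    }

    // Groups the ids by the shard that would receive them.
    static Map<Integer, List<Integer>> distribute(int[] ids, int numberOfShards) {
        Map<Integer, List<Integer>> shards = new TreeMap<>();
        for (int id : ids) {
            shards.computeIfAbsent(shardFor(id, numberOfShards), k -> new ArrayList<>()).add(id);
        }
        return shards;
    }

    public static void main(String[] args) {
        System.out.println(distribute(new int[] {1, 2, 3, 4, 5}, 2));
    }
}
```

With two shards, odd ids land in one shard and even ids in the other, which spreads sequentially generated ids evenly.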

Page 43: Making your domain objects searchable with Hibernate Search

Custom Sharding Strategy

Implement IndexShardingStrategy

hibernate.search.com.sourcesense.Author.sharding_strategy = BookTitleStrategy
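The slides do not show what BookTitleStrategy contains. As one possible illustration, the selection logic of a title-based strategy could route documents by first letter, using the A - H, I - N and O - Z ranges from the earlier sharding figure. The sketch below shows only that routing decision in plain Java; it does not implement the real IndexShardingStrategy interface, and `TitleRangeShardingSketch` and `shardForTitle` are hypothetical names.

```java
// Range-based shard selection by the first letter of a title.
// Hypothetical sketch of the routing decision only; it does not implement
// the Hibernate Search IndexShardingStrategy interface.
final class TitleRangeShardingSketch {

    // Returns 0 for A-H, 1 for I-N, 2 for O-Z; anything else falls back to shard 0.
    static int shardForTitle(String title) {
        if (title == null || title.isEmpty()) {
            return 0;
        }
        char first = Character.toUpperCase(title.charAt(0));
        if (first >= 'A' && first <= 'H') {
            return 0;
        }
        if (first >= 'I' && first <= 'N') {
            return 1;
        }
        if (first >= 'O' && first <= 'Z') {
            return 2;
        }
        return 0; // digits, punctuation and non-Latin letters
    }

    public static void main(String[] args) {
        for (String title : new String[] {"Blaze", "It", "The Shining"}) {
            System.out.println(title + " -> shard " + shardForTitle(title));
        }
    }
}
```

A range-based split keeps related documents together, which can help maintenance, but unlike the id hash it gives no guarantee that the shards stay evenly sized.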

Page 44: Making your domain objects searchable with Hibernate Search

Synchronous Clustering

Every node can read and write to the index

Pessimistic locking prevents corruption

Single index shared among every node

Choose your flavour: NFS, database, distributed caches
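The idea of several writers sharing one index under an exclusive lock can be shown inside a single JVM. In the sketch below, threads play the role of cluster nodes and a `ReentrantLock` plays the role of the index write lock. It is an in-process analogy only: a real cluster needs a lock that works across machines, and `SharedIndexSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// In-process analogy: threads stand in for nodes, a ReentrantLock for the index lock.
// Hypothetical names; a real cluster needs a lock that spans JVMs.
final class SharedIndexSketch {
    private final ReentrantLock writeLock = new ReentrantLock();
    private final List<String> documents = new ArrayList<>();

    // Any node may write, but only while holding the exclusive lock.
    void write(String nodeName, String document) {
        writeLock.lock();
        try {
            documents.add(nodeName + ":" + document);
        } finally {
            writeLock.unlock();
        }
    }

    int size() {
        writeLock.lock();
        try {
            return documents.size();
        } finally {
            writeLock.unlock();
        }
    }

    // Starts the given number of nodes concurrently and returns the final document count.
    static int simulate(int nodes, int docsPerNode) {
        SharedIndexSketch index = new SharedIndexSketch();
        List<Thread> threads = new ArrayList<>();
        for (int n = 1; n <= nodes; n++) {
            String name = "node" + n;
            Thread t = new Thread(() -> {
                for (int d = 0; d < docsPerNode; d++) {
                    index.write(name, "doc" + d);
                }
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return index.size();
    }

    public static void main(String[] args) {
        System.out.println(simulate(4, 1000) + " documents written");
    }
}
```

No write is lost because the lock serializes access to the list, and that serialization is also why writers in this model can end up waiting on each other.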

Page 45: Making your domain objects searchable with Hibernate Search

Clustering

Read-Write Synchronous cluster

[Figure: Node 1, Node 2, Node 3 and Node 4 each hold an IndexWriter and all write to a single shared Index.]

Page 46: Making your domain objects searchable with Hibernate Search

Asynchronous Clustering

Page 47: Making your domain objects searchable with Hibernate Search

Asynchronous Cluster

Advantages

• Only the master writes

• No indexing in slaves -> no waiting for locks

Downside

• Data is not immediately visible to the slaves
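The visibility delay named as the downside can be seen in a very small model: the master owns the only writable index, and a slave searches a copy that is refreshed from time to time. This is a plain-Java sketch under that assumption; `MasterSlaveSketch`, `indexOnMaster` and `refreshSlave` are hypothetical names, and how the copy actually reaches the slaves is not modelled.

```java
import java.util.ArrayList;
import java.util.List;

// Master writes, slave reads a periodically refreshed copy.
// Hypothetical names; the replication transport is not modelled.
final class MasterSlaveSketch {
    private final List<String> masterIndex = new ArrayList<>();
    private List<String> slaveCopy = new ArrayList<>();

    // Only the master applies index changes.
    void indexOnMaster(String document) {
        masterIndex.add(document);
    }

    // Periodic replication: the slave receives a snapshot of the master index.
    void refreshSlave() {
        slaveCopy = new ArrayList<>(masterIndex);
    }

    // Slaves answer queries from their local copy only.
    boolean slaveSees(String document) {
        return slaveCopy.contains(document);
    }

    public static void main(String[] args) {
        MasterSlaveSketch cluster = new MasterSlaveSketch();
        cluster.indexOnMaster("Blaze");
        System.out.println("before refresh: " + cluster.slaveSees("Blaze"));
        cluster.refreshSlave();
        System.out.println("after refresh: " + cluster.slaveSees("Blaze"));
    }
}
```

The refresh interval is the knob here: a shorter interval narrows the window in which slaves return stale results, at the price of copying the index more often.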

Page 48: Making your domain objects searchable with Hibernate Search

To learn more...

hibernate.org/subprojects/search.html

anonsvn.jboss.org/repos/hibernate/search/

Page 49: Making your domain objects searchable with Hibernate Search

Thank you

[email protected]

twitter: @gustavonalle