Spcua 2013 Alexey Kozhemiakin Enterprise Search

May 22nd 2013, Kiev. Enterprise search portals, SharePoint 2013. Alexey Kozhemiakin

description

English version of my slides from SPCUA 2013

Transcript of Spcua 2013 Alexey Kozhemiakin Enterprise Search

Page 1: Spcua 2013 Alexey Kozhemiakin Enterprise Search

May 22nd 2013, Kiev

Enterprise search portals SharePoint 2013

Alexey Kozhemiakin

Page 2: Spcua 2013 Alexey Kozhemiakin Enterprise Search

or “How to make a cool search”

Page 3: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Who’s speaking to you?

• Solution Architect @epam

• Focusing on search
  • SharePoint Search FAST/2010/2013
  • Apache Lucene, Solr, elasticsearch, Oracle Endeca…

• http://powersearching.wordpress.com

Page 4: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Agenda

• Enterprise Search Portal
• Insight into SP2013 Search
• Key changes from SP2010
• A bit of magic: relevancy calculation
• Search governance, useful hints & tips

Page 5: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Key search patterns

• I know what I’m searching for and where to find it

• I know what I’m searching for but don’t know where to find it

• I don’t know what I’m searching for

http://aghy.hu/AghyBlog_EN/Lists/Posts/Post.aspx?ID=199

Page 6: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Enterprise Search Portal

• Demand:
  • Fast-growing enterprises
  • Zoo of internal systems
• Solution:
  • “Google” inside the enterprise
• Quick wins for business:
  • Single point of smart search and information retrieval
  • Less time spent searching by employees
  • Better internal communication and simplified reuse of content

Page 7: Spcua 2013 Alexey Kozhemiakin Enterprise Search

But after deployment…

• «… Search sucks»
• Out-of-the-box search knows nothing about you
• Typical «but…» responses:
  • «… Microsoft takes care of a decent search algorithm»
  • «… we’re not sure we can do better»
  • «… we don’t need search, everybody knows where the content is»
  • «… make our search like in Facebook/Google/Bing» (instead of requirements)

Page 8: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Why it’s hard

• Ambiguous short queries
• Unstructured, unoptimized content
• Content users and content creators have different active vocabularies
• Limited resources ($), while internet search has:
  • Automated and manual testing of search quality (assessors)
  • Continuous improvement

Page 9: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Search architecture in SP2013

Page 10: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Search is a two-phase process

• Matching: all docs containing the keywords
  • Linguistics: stemming, phonetics
  • Synonyms
• Ranking
  • «Features»
    • TF-IDF, BM25
    • Field weights
    • File type
    • Modification date
    • Popularity
    • …
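
A minimal sketch of the two phases, assuming a toy in-memory corpus. The documents, the synonym table, and the function names are made up for illustration, and the scoring is the Lucene-style variant of Okapi BM25 rather than anything SharePoint-specific.

```python
import math
from collections import Counter

# Toy corpus; ids and texts are hypothetical.
DOCS = {
    "d1": "sharepoint search ranking model",
    "d2": "fast search server for sharepoint",
    "d3": "holiday schedule for the kiev office",
}

# Hypothetical synonym table used to expand the query before matching.
SYNONYMS = {"sp": ["sharepoint"]}


def tokenize(text):
    return text.lower().split()


def expand(terms):
    """Add synonyms to the query terms (expansion only)."""
    expanded = list(terms)
    for term in terms:
        expanded.extend(SYNONYMS.get(term, []))
    return expanded


def match(terms):
    """Phase 1: every document that contains at least one query term."""
    wanted = set(terms)
    return [doc_id for doc_id, text in DOCS.items() if wanted & set(tokenize(text))]


def bm25(terms, doc_id, k1=1.2, b=0.75):
    """Phase 2: BM25 score of one matched document."""
    tokens = tokenize(DOCS[doc_id])
    tf = Counter(tokens)
    avg_len = sum(len(tokenize(t)) for t in DOCS.values()) / len(DOCS)
    score = 0.0
    for term in set(terms):
        df = sum(1 for t in DOCS.values() if term in tokenize(t))
        if df == 0 or tf[term] == 0:
            continue
        idf = math.log(1 + (len(DOCS) - df + 0.5) / (df + 0.5))
        norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
        score += idf * tf[term] * (k1 + 1) / norm
    return score


def search(query):
    terms = expand(tokenize(query))
    candidates = match(terms)  # phase 1: matching
    return sorted(candidates, key=lambda d: bm25(terms, d), reverse=True)  # phase 2: ranking
```

Calling `search("sp search")` expands "sp" to "sharepoint", keeps only the two documents that contain a query term, and then orders them by score.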

Page 11: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Ranking in FAST

• Linear combination of features
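
As an illustration of a linear combination of features, here is a small sketch. The feature names and weights are hypothetical and are not taken from any actual FAST rank profile.

```python
# Hypothetical feature weights; a real rank profile defines its own.
WEIGHTS = {
    "term_match": 0.5,
    "proximity": 0.2,
    "freshness": 0.2,
    "static_rank": 0.1,
}


def contributions(features, weights=WEIGHTS):
    """Per-feature share of the score: weight * feature value."""
    return {name: weights[name] * features.get(name, 0.0) for name in weights}


def linear_rank(features, weights=WEIGHTS):
    """Rank score as the weighted sum of all feature values."""
    return sum(contributions(features, weights).values())


# Made-up feature values for one document.
doc_features = {"term_match": 4000, "proximity": 1000, "freshness": 2000, "static_rank": 500}
score = linear_rank(doc_features)
```

Because the score is a plain sum, `contributions` shows directly how much each component adds to the final rank of a document.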

Page 12: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Ranking in FAST

• Impact of each component on the final rank

[Chart: contribution of term:fast, term:search, freshness, static rank and proximity to the rank score of the 1st through 4th results; score axis from 0 to 8000]

Page 13: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Migration FAST->SP2013

Page 14: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Ranking in SP2013

Page 15: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Ranking in SP2013

• Default Relevancy Model
  • Two neural networks
  • Freshness is not included in ranking
  • Features

Type            Instance
BM25            BM25
Static          UrlDepth
BucketedStatic  InternalFileType
BucketedStatic  Language
Static          ClickDistance
Static          QueryLogClicks
Static          QueryLogSkips
Static          LastClicks
Static          EventRate
MinSpan - soft  Title
MinSpan - soft  Title
MinSpan - soft  Title
MinSpan - soft  Content

Page 16: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Ranking in SP2013

• Default relevancy model

Page 17: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Explain rank

• /_layouts/15/explainrank.aspx
• rankdetail property
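
A small helper for building a link to the explain rank page, as a sketch. It assumes the page takes the query in a `q` parameter and the document URL in a `d` parameter; the site and document URLs are hypothetical.

```python
from urllib.parse import urlencode


def explain_rank_url(site_url, query, document_url):
    """Build a link to the explain rank page for one query/document pair.

    Assumption: the page reads the query from `q` and the document from `d`.
    """
    params = urlencode({"q": query, "d": document_url})
    return f"{site_url.rstrip('/')}/_layouts/15/ExplainRank.aspx?{params}"


# Hypothetical site and document.
link = explain_rank_url("http://intranet/", "fast search", "http://intranet/docs/a.docx")
```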

Page 18: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Explain rank

• Manual validation in Excel

Page 19: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Page 20: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Search Governance

1. Search analytics
2. Fine tuning and adaptation
3. Regular testing
4. Security assessment
5. Promotion within company
6. Content optimization and basic SEO

Page 21: Spcua 2013 Alexey Kozhemiakin Enterprise Search

1. Search analytics

• Search analytics
• Search analytics
• Search analytics

• Obey! Use Search analytics

Page 22: Spcua 2013 Alexey Kozhemiakin Enterprise Search

1. Search analytics

• OOTB in SP2013
  • Most popular queries
  • «No results/abandoned» queries
• 3rd-party tools (Google Analytics, Omniture, WebTrends)
  • Measure search quality (!)
    • % of clicks on results
    • Which results are clicked
    • Returns after clicks
• Session analysis
• Query segmentation
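
A minimal sketch of how such quality figures could be computed from a query log. The `QueryEvent` record and its fields are hypothetical; a real log from SharePoint or a 3rd-party tool will have its own shape.

```python
from dataclasses import dataclass, field


@dataclass
class QueryEvent:
    """One logged search; the field names are assumptions for this sketch."""
    query: str
    result_count: int
    clicked_ranks: list = field(default_factory=list)  # 1-based ranks of clicked results


def quality_report(events):
    total = len(events)
    clicked = [e for e in events if e.clicked_ranks]
    no_results = {e.query for e in events if e.result_count == 0}
    abandoned = {e.query for e in events if e.result_count > 0 and not e.clicked_ranks}
    first_clicks = [min(e.clicked_ranks) for e in clicked]
    return {
        # share of queries followed by at least one click
        "click_through_rate": len(clicked) / total if total else 0.0,
        "no_result_queries": sorted(no_results),
        "abandoned_queries": sorted(abandoned),
        # which results get clicked: average rank of the first click
        "mean_first_click_rank": sum(first_clicks) / len(first_clicks) if first_clicks else None,
    }


# Made-up log entries.
sample_log = [
    QueryEvent("vacation policy", 10, [1]),
    QueryEvent("vpn", 0),
    QueryEvent("expense report", 5),
    QueryEvent("org chart", 8, [3, 5]),
]
report = quality_report(sample_log)
```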

Page 23: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Query segmentation

• Analyze and improve not only top N queries, but classes of queries
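
One possible way to work with classes of queries is to bucket the query log by pattern and compare a quality figure per bucket. This is only a sketch: the class names and regular expressions are hypothetical and would have to come from your own query log.

```python
import re
from collections import Counter

# Hypothetical query classes; the first matching pattern wins.
SEGMENTS = [
    ("how-to", re.compile(r"^how (to|do)\b")),
    ("download", re.compile(r"\b(download|install|software)\b")),
    ("document", re.compile(r"\b(policy|report|presentation|template)\b")),
]


def segment(query):
    """Assign a query to the first class whose pattern matches."""
    normalized = query.lower().strip()
    for name, pattern in SEGMENTS:
        if pattern.search(normalized):
            return name
    return "other"


def click_rate_by_segment(query_log):
    """query_log: iterable of (query, clicked) pairs; returns click rate per class."""
    totals, clicks = Counter(), Counter()
    for query, clicked in query_log:
        name = segment(query)
        totals[name] += 1
        clicks[name] += int(clicked)
    return {name: clicks[name] / totals[name] for name in totals}
```

A class with a low click rate points at a whole family of queries to improve, which a top-N list of individual queries would not show.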

Page 24: Spcua 2013 Alexey Kozhemiakin Enterprise Search

2. Fine tuning

• Authoritative Pages
  • Quick win: content source priority
• Query Rules
  • Smart search for users
• Synonyms
  • Separate mapping file
  • Expansion only
  • Term set synonyms are NOT working
• Relevancy models
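
For the synonym mapping file mentioned above, a sketch of a generator. It assumes the thesaurus is a CSV file with a `Key,Synonym,Language` header; since synonyms only expand the query in one direction, the helper can optionally emit the reverse pair as well. The function name and the sample pair are made up.

```python
import csv
import io


def thesaurus_csv(pairs, two_way=True):
    """Render (key, synonym, language) triples as thesaurus CSV text.

    Assumption: the expected header is Key,Synonym,Language.
    With two_way=True the reverse mapping is written too, because
    expansion only goes from key to synonym.
    """
    rows = []
    for key, synonym, language in pairs:
        rows.append((key, synonym, language))
        if two_way:
            rows.append((synonym, key, language))
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Key", "Synonym", "Language"])
    writer.writerows(rows)
    return buffer.getvalue()


text = thesaurus_csv([("IE", "Internet Explorer", "en")])
```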

Page 25: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Authoritative Pages

• Impacts ClickDistance
• ClickDistance and UrlDepth have a high impact on the total score (see explain rank)
• Configured in CA or via CSOM

Page 26: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Query Rules (Rule + Action)

• The tool to make search smarter
• Interactive feedback to user queries
• Post-processing of queries
• Leverage navigational queries
• …

Page 27: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Conditions for Query Rules

• Query Matches Keyword Exactly
• Advanced Query Text Match
• Query Matches Dictionary Exactly
• Query Contains Action Term
• Query More Common in Source
• Result Type Commonly Clicked

Page 28: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Actions for Query Rules

• Create and display a result block
• Change ranked search results
• Best Bets
• XRANK
  • Adds to the total rank
  • Not explained in rankdetail
  • How to choose the correct value?
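
A sketch of what an XRANK boost looks like and why its value matters. `xrank_query` builds a KQL string with a constant boost (`cb`); `apply_constant_boost` is only a toy model of the additive effect on made-up scores, not SharePoint's ranking code.

```python
def xrank_query(match_expr, rank_expr, constant_boost):
    """KQL: items matching match_expr, with cb added for those also matching rank_expr."""
    return f"({match_expr}) XRANK(cb={constant_boost}) ({rank_expr})"


def apply_constant_boost(scored_results, is_boosted, constant_boost):
    """Toy model: add constant_boost to the score of boosted items and re-sort."""
    boosted = [
        (url, score + constant_boost if is_boosted(url) else score)
        for url, score in scored_results
    ]
    return sorted(boosted, key=lambda item: item[1], reverse=True)


kql = xrank_query("sharepoint search", "filetype:pptx", 100)

# Made-up scores, as they might be read from the explain rank page.
results = [("a.docx", 12.0), ("b.pptx", 10.5)]


def is_pptx(url):
    return url.endswith(".pptx")


too_small = apply_constant_boost(results, is_pptx, 1.0)   # order unchanged
big_enough = apply_constant_boost(results, is_pptx, 2.0)  # b.pptx moves to the top
```

Since the boost is simply added, a value smaller than the score gap between results changes nothing, so one way to pick it is to look at the typical score differences first.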

Page 29: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Templates for Query Rules

• Typical navigational keywords from our portal
  • Software, soft, download, install
  • How to
  • Policy, Blog
  • Portal
  • Music, Video
  • Presentation, Documents, Report
  • Training, tutorial
  • Book, ebook
• You will have different ones!
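
A sketch of how such keywords can be treated as action terms: if the query starts or ends with one, the rest of the query is the subject and the keyword picks the kind of content to show. The mapping from keyword to content group is hypothetical.

```python
# Hypothetical mapping from navigational keyword to a content group.
ACTION_TERMS = {
    "download": "software", "install": "software", "software": "software",
    "policy": "policies", "blog": "blogs",
    "music": "media", "video": "media",
    "presentation": "documents", "report": "documents",
    "training": "training", "tutorial": "training",
    "book": "books", "ebook": "books",
}


def split_action_term(query):
    """Return (content group, subject terms), or (None, query) if no action term.

    Only the first and the last word are checked, and a query that consists
    of the action term alone is left untouched.
    """
    words = query.lower().split()
    if len(words) < 2:
        return None, query
    if words[0] in ACTION_TERMS:
        return ACTION_TERMS[words[0]], " ".join(words[1:])
    if words[-1] in ACTION_TERMS:
        return ACTION_TERMS[words[-1]], " ".join(words[:-1])
    return None, query
```

For example, `split_action_term("download visual studio")` separates the intent ("software") from the subject ("visual studio"), which a rule could then use to show a dedicated result block.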

Page 30: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Custom Rank Models

• Collect query judgments
• Tune neural network coefficients using machine learning
  • Gradient descent, LambdaRank
• Microsoft.Office.Server.Search.RankerTuning

Page 31: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Custom Rank Models

• Manually modify a new or very simple model (not the default one!)
• A/B testing of weights
• Measure, measure: Precision, NDCG
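
The two measures can be sketched as follows. The input is a list of relevance grades in result order (0 = not relevant), the gain is the common 2^grade - 1 form, and the ideal ordering is computed from the judged results in the list only, which is a simplifying assumption.

```python
import math


def precision_at_k(relevances, k):
    """Share of the top k results judged relevant (grade > 0)."""
    if k <= 0:
        return 0.0
    return sum(1 for grade in relevances[:k] if grade > 0) / k


def dcg(relevances, k):
    """Discounted cumulative gain of the top k results."""
    return sum((2 ** grade - 1) / math.log2(position + 2)
               for position, grade in enumerate(relevances[:k]))


def ndcg(relevances, k):
    """DCG divided by the DCG of the best possible ordering of the same grades."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal else 0.0


def mean_ndcg(judged_runs, k):
    """Average NDCG over several queries, e.g. to compare model A with model B."""
    return sum(ndcg(run, k) for run in judged_runs) / len(judged_runs)
```

For A/B testing of weights, the same judged queries are run against both models and the model with the higher `mean_ndcg` wins.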

Page 32: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Custom Rank Models

• Example of simple model – people search

Page 33: Spcua 2013 Alexey Kozhemiakin Enterprise Search

3. Search quality testing

• Why is it needed? It’s your compass.
• «Unit testing»
• Periodic manual testing
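
«Unit testing» of search can be as simple as a list of queries with a result that must show up near the top. A sketch, with a stand-in search function and hypothetical URLs; in practice `search` would call the real search API.

```python
# Hypothetical expectations: (query, URL that must appear, within the top N).
EXPECTATIONS = [
    ("vacation policy", "http://portal.example/hr/vacation-policy", 3),
    ("expense report", "http://portal.example/finance/expense-template", 3),
]


def run_quality_checks(search, expectations):
    """search(query) must return an ordered list of result URLs."""
    failures = []
    for query, expected_url, top_n in expectations:
        top_results = search(query)[:top_n]
        if expected_url not in top_results:
            failures.append((query, expected_url, top_results))
    return failures


def fake_search(query):
    """Stand-in for the real search call, used only for this demonstration."""
    index = {
        "vacation policy": ["http://portal.example/hr/vacation-policy"],
        "expense report": ["http://portal.example/blog/old-post"],
    }
    return index.get(query, [])


failures = run_quality_checks(fake_search, EXPECTATIONS)
```

Running the checks after every change to synonyms, query rules or rank models shows quickly whether a tweak broke a query that used to work.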

Page 34: Spcua 2013 Alexey Kozhemiakin Enterprise Search

4. Security «audit»

• Search reveals breaches in security
  • Security by obscurity
• Examples of queries:
  • «confidential»
  • Salaries, performance reviews
• Solution: automatic monitoring of sensitive queries
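
One reading of automatic monitoring, sketched below: run a list of sensitive queries on a schedule under low-privilege test accounts and report anything that comes back. The query list, the account names and the `search_as` callable are all assumptions of this sketch.

```python
# Hypothetical list of sensitive queries.
SENSITIVE_QUERIES = ["confidential", "salary", "performance review"]


def audit(search_as, accounts, queries=SENSITIVE_QUERIES):
    """search_as(account, query) must return the result URLs visible to that account."""
    findings = []
    for account in accounts:
        for query in queries:
            hits = search_as(account, query)
            if hits:
                findings.append({"account": account, "query": query, "hits": hits})
    return findings


def fake_search_as(account, query):
    """Stand-in for a real, security-trimmed search call."""
    visible = {("intern", "salary"): ["http://portal.example/hr/salaries-2013.xlsx"]}
    return visible.get((account, query), [])


findings = audit(fake_search_as, ["intern", "contractor"])
```

Every finding is a document that an account with minimal rights can reach through search, which usually means its permissions need fixing at the source.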

Page 35: Spcua 2013 Alexey Kozhemiakin Enterprise Search

5. Adoption of content

• Work with departments
  • Get their help with monitoring their search queries
• Guidelines for formatting content
  • Basic SEO
  • Titles
  • Friendly URLs
  • Custom meta tags <meta name=…
    • Title, description
    • Custom tags automatically appear in crawled properties
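
A sketch of a page check that content owners could run against the guidelines above: it reads the title and the named meta tags from a page and reports what is missing. The sample page and the `department` tag are made up.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collects the <title> text and all <meta name=... content=...> tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attributes.get("name"):
            self.meta[attributes["name"].lower()] = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def seo_check(html):
    """Return (list of issues, dict of meta tags found) for one page."""
    parser = MetaTagParser()
    parser.feed(html)
    issues = []
    if not parser.title.strip():
        issues.append("missing title")
    if not parser.meta.get("description"):
        issues.append("missing description")
    return issues, parser.meta


# Made-up page with one custom meta tag.
page = ('<html><head><title>Vacation policy</title>'
        '<meta name="description" content="HR rules">'
        '<meta name="department" content="HR"></head></html>')
issues, meta = seo_check(page)
```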

Page 36: Spcua 2013 Alexey Kozhemiakin Enterprise Search

6. Promotion within company

• Image: «you will find everything here»
• Integrate with other portals
• Propose Search as a service
• Widget «Global search»
• Badges, gamification

Page 37: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Promotion

• Social Best-bets

Page 38: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Semantic search

• Cannot be solved in general
  • Analytics + fine tuning
  • See practices above
• NLP: question answering
  • Rocket science
  • English only
  • Part-of-speech tagging, dependency parsing
    • Stanford NLP, OpenNLP, IR

Page 39: Spcua 2013 Alexey Kozhemiakin Enterprise Search

«References»

• Patents - http://goo.gl/20sbR

• Explain Rank page - http://goo.gl/o3ZmN

• How SP2013 relevancy models work - http://goo.gl/arf0P

• MS Enterprise Search approach - http://goo.gl/x8SDO

• Customizing ranking models in SP 2013 - http://goo.gl/lBJAp

Page 40: Spcua 2013 Alexey Kozhemiakin Enterprise Search

Thanks

Skype: Alexey_Kozhemiakin
Email: [email protected]
http://powersearching.wordpress.com