Spcua 2013 Alexey Kozhemiakin Enterprise Search
Transcript of Spcua 2013 Alexey Kozhemiakin Enterprise Search
May 22nd 2013, Kiev
Enterprise search portals SharePoint 2013
Alexey Kozhemiakin
or “How to make a cool search”
3
Who’s speaking to you?
• Solution Architect @epam
• Focusing on search
  • SharePoint Search FAST/2010/2013
  • Apache Lucene, Solr, Elasticsearch, Oracle Endeca…
• http://powersearching.wordpress.com
4
Agenda
• Enterprise Search Portal
• Insight into SP2013 Search
• Key changes from SP2010
• A bit of magic – relevancy calculation
• Search governance, useful hints & tips
5
Key search patterns
• I know what I’m searching for and where to find it
• I know what I’m searching for but don’t know where to find it
• I don’t know what I’m searching for
http://aghy.hu/AghyBlog_EN/Lists/Posts/Post.aspx?ID=199
6
Enterprise Search Portal
• Demand:
  • Fast-growing enterprises
  • Zoo of internal systems
• Solution:
  • “Google” inside the enterprise
• Quick wins for business:
  • Single point of smart search and information retrieval
  • Reduced search time per employee
  • Better internal communication and simplified reuse of content
7
But after deployment…
• «…Search sucks»
• Out-of-the-box search knows nothing about you
• Typical «But…» objections:
  • «…Microsoft takes care of a decent search algorithm»
  • «…we’re not sure we can do better»
  • «…we don’t need search, everybody knows where the content is»
  • «…make our search like Facebook/Google/Bing» (instead of requirements)
8
Why it’s hard
• Ambiguous short queries
• Unstructured, unoptimized content
• Content users and content creators have different active vocabularies
• Limited resources ($), while internet search has:
  • Automated and manual testing of search quality (assessors)
  • Continuous improvement
9
Search architecture in SP2013
10
Search is a two-phase process
• Matching – all docs with keywords
  • Linguistics: stemming, phonetics
  • Synonyms
• Ranking
  • «Features»
    • TF-IDF, BM25
    • Field weights
    • File type
    • Modification date
    • Popularity
    • …
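The two phases above can be sketched in a few lines of Python. This is only a toy illustration, not how SharePoint or FAST implement it: the tokenizer is a plain whitespace split (no stemming, phonetics or synonyms), matching requires every query term to be present, and the BM25 parameters `k1` and `b` are common textbook defaults chosen here as assumptions.

```python
import math
from collections import Counter


def tokenize(text):
    # Toy tokenizer: lowercase and split on whitespace (no stemming or synonyms).
    return text.lower().split()


def bm25_search(query, docs, k1=1.2, b=0.75):
    """Return (doc_index, score) pairs, best first."""
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    avg_len = sum(len(t) for t in tokenized) / len(tokenized)
    terms = tokenize(query)

    # Document frequency of every term in the collection.
    df = Counter()
    for toks in tokenized:
        for term in set(toks):
            df[term] += 1

    n = len(docs)
    results = []
    for idx, toks in enumerate(tokenized):
        tf = Counter(toks)
        # Phase 1, matching: keep only documents containing all query terms.
        if not all(t in tf for t in terms):
            continue
        # Phase 2, ranking: sum the BM25 contribution of each query term.
        score = 0.0
        for t in terms:
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            length_norm = 1 - b + b * len(toks) / avg_len
            score += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * length_norm)
        results.append((idx, score))
    return sorted(results, key=lambda r: r[1], reverse=True)
```

Field weights, file type, modification date and popularity would enter as additional features in the ranking phase; they are left out to keep the sketch short.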
11
Ranking in FAST
• Linear combination of features
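A linear combination of features can be written down directly. The sketch below is an illustration only: the feature names and weights are made-up values, not FAST's actual rank profile settings.

```python
def linear_rank(features, weights):
    """Return the total rank and the contribution of each feature.

    features: feature name -> raw feature value for one document
    weights:  feature name -> weight (hypothetical values, tuned per deployment)
    """
    contributions = {
        name: weights.get(name, 0.0) * value for name, value in features.items()
    }
    return sum(contributions.values()), contributions


# Hypothetical example values for a single document.
example_features = {"term": 2.0, "freshness": 0.5, "proximity": 1.0}
example_weights = {"term": 100.0, "freshness": 10.0, "proximity": 20.0}
total, parts = linear_rank(example_features, example_weights)
```

Keeping the per-feature contributions next to the total makes it easy to see which component dominates the final rank of a given result.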
12
Ranking in FAST
• Impact of each component to final rank
[Chart: contribution of each component (term:fast, term:search, freshness, static rank, proximity) to the final rank of the 1st to 4th results; vertical axis from 0 to 8000]
13
Migration FAST->SP2013
14
Ranking in SP2013
15
Ranking in SP2013
• Default Relevancy Model
• Two neural networks
• Freshness is not included in ranking
• Features:

Type             Instance
BM25             BM25
Static           UrlDepth
BucketedStatic   InternalFileType
BucketedStatic   Language
Static           ClickDistance
Static           QueryLogClicks
Static           QueryLogSkips
Static           LastClicks
Static           EventRate
MinSpan - soft   Title
MinSpan - soft   Title
MinSpan - soft   Title
MinSpan - soft   Content
16
Ranking in SP2013
• Default relevancy model
17
Explain rank
• /_layouts/15/explainrank.aspx
• rankdetail property
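One way to look at the rankdetail property outside the explain rank page is to request it as a selected property through the search REST endpoint. The helper below only builds the URL; it assumes the standard `/_api/search/query` endpoint with `querytext` and `selectproperties` parameters, and that the calling account is allowed to see rankdetail (it is typically returned only to search administrators and for small result sets). The function name and the portal address are hypothetical.

```python
from urllib.parse import quote


def rank_detail_query_url(site_url, query_text):
    # Build a search REST URL that asks for rankdetail next to Title and Path.
    base = site_url.rstrip("/") + "/_api/search/query"
    params = "querytext='{}'&selectproperties='Title,Path,rankdetail'".format(
        quote(query_text)
    )
    return base + "?" + params


# Hypothetical portal address, for illustration only.
url = rank_detail_query_url("http://portal.example/", "vacation policy")
```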
18
Explain rank
• Manual validation in Excel
19
20
Search Governance
1. Search analytics
2. Fine tuning and adaptation
3. Regular testing
4. Security assessment
5. Promotion within the company
6. Content optimization and basic SEO
21
1. Search analytics
• Search analytics
• Search analytics
• Search analytics
• Obey! Use Search analytics
22
1. Search analytics
• OOTB in SP2013
  • Most popular queries
  • «No results/abandoned» queries
• 3rd-party tools (Google Analytics, Omniture, WebTrends)
• Measure search quality (!)
  • % of clicks on results
  • Which results are clicked
  • Returns after clicks
  • Session analysis
  • Query segmentation
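The quality measures above can be computed from any query log that records result counts and clicks. The sketch below assumes a hypothetical `SearchEvent` record; the real field names depend on the analytics tool in use.

```python
from dataclasses import dataclass, field


@dataclass
class SearchEvent:
    # Hypothetical log record: one executed query and what the user did next.
    query: str
    result_count: int
    clicked_positions: list = field(default_factory=list)


def quality_metrics(events):
    """Share of queries with no results, abandoned queries, and clicked queries."""
    total = len(events)
    if total == 0:
        return {"no_results_rate": 0.0, "abandon_rate": 0.0, "click_rate": 0.0}
    no_results = sum(1 for e in events if e.result_count == 0)
    # Abandoned: results were shown but nothing was clicked.
    abandoned = sum(
        1 for e in events if e.result_count > 0 and not e.clicked_positions
    )
    clicked = sum(1 for e in events if e.clicked_positions)
    return {
        "no_results_rate": no_results / total,
        "abandon_rate": abandoned / total,
        "click_rate": clicked / total,
    }
```

Tracking these three numbers over time gives a simple baseline before moving on to session analysis.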
23
Query segmentation
• Analyze and improve not only top N queries, but classes of queries
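A minimal sketch of query segmentation, assuming simple substring rules. The class names and marker phrases in `SEGMENT_RULES` are hypothetical; real classes should come from your own query log.

```python
from collections import Counter

# Hypothetical classes and marker phrases; the first matching rule wins.
SEGMENT_RULES = [
    ("how-to", ("how to", "how do")),
    ("download", ("download", "install")),
    ("policy", ("policy",)),
]


def segment(query):
    # Assign a query to the first class whose marker phrase it contains.
    q = query.lower()
    for name, markers in SEGMENT_RULES:
        if any(marker in q for marker in markers):
            return name
    return "other"


def segment_counts(queries):
    # Count how many logged queries fall into each class.
    return Counter(segment(q) for q in queries)
```

Once queries are grouped, metrics such as click rate can be compared per class instead of per individual query.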
24
2. Fine tuning
• Authoritative Pages
  • Quick win – content source priority
• Query Rules
  • Smart search for users
• Synonyms
  • Separate mapping file
  • Expansion only
  • Term set synonyms are NOT working
• Relevancy models
25
Authoritative Pages
• Impacts ClickDistance
• ClickDistance and UrlDepth have a high impact on the total score (see explain rank)
• Configured in Central Administration (CA) or via CSOM
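The idea behind click distance can be illustrated with a breadth-first search over the link graph, starting from the authoritative pages. This is a conceptual sketch, not SharePoint's actual computation; the link graph and page names are made up.

```python
from collections import deque


def click_distances(links, authoritative):
    """Minimum number of clicks from any authoritative page to each reachable page.

    links: page -> list of pages it links to (hypothetical crawl data)
    """
    dist = {page: 0 for page in authoritative}
    queue = deque(authoritative)
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in dist:
                dist[target] = dist[page] + 1
                queue.append(target)
    return dist


# Hypothetical intranet link graph.
example_links = {"home": ["hr", "it"], "hr": ["policy"], "policy": ["form"]}
example_distances = click_distances(example_links, ["home"])
```

Adding a deep but important page to the authoritative list resets its distance to 0, which is why this setting is a quick win.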
26
Query Rules (Rule + Action)
• The tool to make search smarter
• Interactive feedback to user queries
• Post-processing of queries
• Leverage navigational queries
• …
27
Condition for Query Rules
• Query Matches Keyword Exactly
• Advanced Query Text Match
• Query Matches Dictionary Exactly
• Query Contains Action Term
• Query More Common in Source
• Result Type Commonly Clicked
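To make the «Query Contains Action Term» condition concrete, here is a rough sketch of the idea: an action term at the start or end of the query is split off from the rest of the query. The term list and the start-or-end matching are assumptions for illustration, not the product's exact behaviour.

```python
# Hypothetical action terms; use the navigational keywords of your own portal.
ACTION_TERMS = ("download", "install", "policy")


def match_action_term(query, action_terms=ACTION_TERMS):
    """Return (action_term, remaining_query) or None if no action term matches."""
    words = query.lower().split()
    if len(words) < 2:
        return None
    if words[0] in action_terms:
        return words[0], " ".join(words[1:])
    if words[-1] in action_terms:
        return words[-1], " ".join(words[:-1])
    return None
```

A rule's action can then use the remaining query, for example to search only the software catalogue for "visio" when the user typed "download visio".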
28
Actions for Query Rules
• Create and display a result block
• Change ranked search results
  • Best Bets
  • XRANK
    • Adds to the total rank
    • Not explained in rankdetail
    • How to choose the correct value?
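An XRANK boost is expressed in KQL as `<match expression> XRANK(cb=<value>) <rank expression>`, where `cb` is a constant added to the rank of items that also match the rank expression. The helper below just assembles such a string; the function name, the boost value and the example expressions are illustrative assumptions.

```python
def xrank_query(user_query, boost_expression, constant_boost):
    # Items matching user_query that also match boost_expression get constant_boost added.
    return "({}) XRANK(cb={}) {}".format(user_query, constant_boost, boost_expression)


# Hypothetical rule: prefer Word documents for policy queries.
boosted = xrank_query("vacation policy", "filetype:docx", 100)
```

Because the boost is additive, a sensible value depends on the typical spread of rank scores for the query, which is one reason to inspect scores with explain rank first.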
29
Templates for Query Rules
• Typical navigational keywords from our portal
  • Software, soft, download, install
  • How to
  • Policy, Blog
  • Portal
  • Music, Video
  • Presentation, Documents, Report
  • Training, tutorial
  • Book, ebook
• You will have different ones!
30
Custom Rank Models
• Collect query judgments
• Tune neural network coefficients using machine learning
  • Gradient Descent, LambdaRank
• Microsoft.Office.Server.Search.RankerTuning
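To show what tuning coefficients from judgments means, here is a small sketch that fits linear weights with stochastic gradient descent on a pairwise logistic loss (a RankNet-style objective). It is a stand-in for illustration only: it is not LambdaRank, it does not train a neural network, and it does not use the Microsoft.Office.Server.Search.RankerTuning tooling. The judgment pairs are made up.

```python
import math


def train_pairwise(pairs, n_features, lr=0.1, epochs=200):
    """Fit weights so that judged-better documents score above judged-worse ones.

    pairs: list of (better_features, worse_features) from query judgments
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        for better, worse in pairs:
            diff = [b - c for b, c in zip(better, worse)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            # Loss is log(1 + exp(-margin)); its gradient w.r.t. w is -g * diff.
            g = 1.0 / (1.0 + math.exp(margin))
            w = [wi + lr * g * di for wi, di in zip(w, diff)]
    return w


def score(w, features):
    return sum(wi * fi for wi, fi in zip(w, features))


# Hypothetical judgments over two features, e.g. (title match, url depth).
example_pairs = [([1.0, 0.0], [0.0, 1.0]), ([0.8, 0.2], [0.1, 0.1])]
example_weights = train_pairwise(example_pairs, n_features=2)
```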
31
Custom Rank Models
• Manually modify a new or very simple model (not the default one!)
• A/B testing of weights
• Measure, measure: Precision, NDCG
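Both measures are easy to compute once results have relevance judgments. The sketch below assumes graded relevance labels (0 = not relevant) listed in the order the results were returned, and uses the common `2^rel - 1` gain with a `log2` position discount for NDCG.

```python
import math


def precision_at_k(relevances, k):
    # Share of the top k results that are relevant (label > 0).
    return sum(1 for r in relevances[:k] if r > 0) / k


def dcg_at_k(relevances, k):
    # Discounted cumulative gain: high labels near the top count the most.
    return sum(
        (2 ** r - 1) / math.log2(i + 2) for i, r in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances, k):
    # Normalize by the DCG of the ideal (best possible) ordering.
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

When A/B testing two sets of weights, compute these per query for both variants and compare the averages.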
32
Custom Rank Models
• Example of simple model – people search
33
3. Search quality testing
• Why is it needed? It’s your compass.
• «Unit testing»
• Periodic manual testing
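A «unit test» for search can be as simple as a list of golden queries with the page each one must return near the top. In the sketch below, `search` is any callable returning an ordered list of URLs; the stub and the golden set are hypothetical stand-ins for a call to the real search service.

```python
def check_golden_queries(search, golden, top_n=3):
    """Return the queries whose expected URL is missing from the top_n results."""
    failures = []
    for query, expected_url in golden.items():
        if expected_url not in search(query)[:top_n]:
            failures.append(query)
    return failures


def stub_search(query):
    # Stand-in for the real search call; returns canned results.
    canned = {
        "vacation policy": ["/hr/vacation", "/blog/1"],
        "vpn": ["/blog/2"],
    }
    return canned.get(query, [])


golden_set = {"vacation policy": "/hr/vacation", "vpn": "/it/vpn"}
failed = check_golden_queries(stub_search, golden_set)
```

Running such a check after every change to query rules or rank models shows quickly whether a fix for one query broke another.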
34
4. Security «audit»
• Search reveals breaches in security
  • Security by obscurity
• Examples of queries:
  • «confidential»
  • Salaries, performance reviews
• Solution – automatic monitoring of sensitive queries
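Automatic monitoring can start as a simple scan of the query log for sensitive terms. The term list and the `(user, query)` log format below are assumptions for illustration; plain substring matching will produce some false positives.

```python
# Hypothetical watch list; extend it with terms that matter in your company.
SENSITIVE_TERMS = ("confidential", "salary", "salaries", "performance review")


def flag_sensitive(query_log, terms=SENSITIVE_TERMS):
    """Return the (user, query) entries whose query contains a sensitive term."""
    flagged = []
    for user, query in query_log:
        q = query.lower()
        if any(term in q for term in terms):
            flagged.append((user, query))
    return flagged
```

Flagged queries that return results the user should not see point to documents with overly broad permissions.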
35
5. Adoption of content
• Work with departments
  • Get help with search by monitoring their queries
• Guidelines for formatting content
  • Basic SEO
    • Titles
    • Friendly URLs
    • Custom meta tags <meta name=…
      • Title, description
      • Custom tags automatically appear in crawled properties
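To check which meta tags a page actually exposes before the crawler sees it, a small script with Python's standard `html.parser` is enough. The sample page and the `department` tag are made up; this only inspects the HTML and says nothing about how SharePoint names the resulting crawled properties.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect <meta name="..." content="..."> pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if "name" in attributes and "content" in attributes:
            self.meta[attributes["name"]] = attributes["content"]


def extract_meta(html):
    collector = MetaTagCollector()
    collector.feed(html)
    return collector.meta


# Hypothetical page with a standard and a custom meta tag.
sample_page = (
    '<html><head><title>Leave policy</title>'
    '<meta name="description" content="How to request leave">'
    '<meta name="department" content="HR" />'
    '</head><body></body></html>'
)
sample_meta = extract_meta(sample_page)
```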
36
6. Promotion within company
• Image – «you will find everything here»
• Integrate with other portals
  • Offer search as a service
  • «Global search» widget
• Badges, gamification
37
Promotion
• Social Best-bets
38
Semantic search
• Cannot be solved in general
  • Analytics + fine tuning
  • See practices above
• NLP – question answering
  • Rocket science
  • English only
  • Part-of-speech tagging, dependency parsing
  • Stanford NLP, OpenNLP, IR
39
«References»
• Patents - http://goo.gl/20sbR
• Explain Rank page - http://goo.gl/o3ZmN
• How SP2013 relevancy models works - http://goo.gl/arf0P
• MS Enterprise Search approach - http://goo.gl/x8SDO
• Customizing ranking models in SP 2013 - http://goo.gl/lBJAp
Thanks
Skype: Alexey_Kozhemiakin
Email: [email protected]
http://powersearching.wordpress.com
40