Smarter SharePoint KC User Group FAST Presentation March 2015

FAST Search for SharePoint 2010 March 2015 Kyle Bodenstab MCITP SharePoint 2010 Database Administrator Jack Henry & Associates


FAST Search for SharePoint 2010

March 2015

Kyle Bodenstab MCITP SharePoint 2010 Database Administrator Jack Henry & Associates

SharePoint 2010 Search Products

• SharePoint Foundation

• SharePoint Server 2010

• FAST Search Server 2010 for SharePoint

SharePoint 2010 Search Products

• SharePoint Foundation

− SharePoint site search within a single farm

SharePoint 2010 Search Products

• SharePoint Server 2010

− All the features of Foundation, plus

− Shallow refinement

− Taxonomy tags

− Crawl external farms, Windows File Shares, Exchange Public Folders, LOB apps, structured content in DBs, etc.

− 100 million item index capability

SharePoint 2010 Search Products

• FAST Search Server 2010 for SharePoint

− All the features of Foundation and Server, plus

− Contextual search

− Deep refinement

− Thumbnails, Previews, and Visual Search

− Advanced linguistics

− Social tags and people search

− Document promotion/demotion

− Search driven applications

− 500 million to 1 billion item index capable

What does this mean to users?

• FAST can:

− Deliver results that are relevant

− Search in the language of the business

− Tune results to improve accuracy

− Provide a single platform for indexing and presenting all content in the enterprise, not just SharePoint content

FAST Terms

• Metadata

• (Content) Processing

• (Content) Extraction

FAST Terms

• Metadata

− Is essential to the success of SharePoint search whether it be FAST or Server

− Manual metadata is unreliable and costly

− Poor metadata leads to poor findability

FAST Terms

• FAST Content Processing

− Is designed as a pipeline that performs:

− Format conversion

− Language encoding and detection

− Tokenization

− Lemmatization

− Property extraction

− Vectorization

− Date/Time Normalization

− Custom processing

− Property mapping
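The pipeline idea above can be sketched as a chain of small stages, each taking a document and handing it to the next. This is only an illustration in Python, not FAST's actual implementation: the stage functions, the tiny LEMMAS dictionary, the two date formats, and the "managed" property names are all assumptions made up for the example.

```python
import re
from datetime import datetime

# Hypothetical toy lemma dictionary; a real pipeline uses full linguistic resources.
LEMMAS = {"mice": "mouse", "documents": "document", "queries": "query"}

def tokenize(doc):
    # Lowercase the text and keep runs of letters/digits (hyphenated words stay whole).
    doc["tokens"] = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", doc["text"].lower())
    return doc

def lemmatize(doc):
    # Replace each token with its canonical form when the dictionary knows one.
    doc["lemmas"] = [LEMMAS.get(t, t) for t in doc["tokens"]]
    return doc

def normalize_dates(doc):
    # Try a few assumed input formats and store the date as ISO 8601 (YYYY-MM-DD).
    raw = doc.get("modified")
    if raw is None:
        return doc
    for fmt in ("%d-%b-%y", "%B %d, %Y"):
        try:
            doc["modified"] = datetime.strptime(raw, fmt).date().isoformat()
            break
        except ValueError:
            continue
    return doc

def map_properties(doc):
    # Copy pipeline output into (hypothetical) managed properties for the index.
    doc["managed"] = {"body": " ".join(doc["lemmas"]), "modified": doc.get("modified")}
    return doc

# Stages run in order; a custom stage could be inserted anywhere in this list.
PIPELINE = [tokenize, lemmatize, normalize_dates, map_properties]

def process(doc):
    for stage in PIPELINE:
        doc = stage(doc)
    return doc
```

Calling process({"text": "Mice ate the documents.", "modified": "24-Mar-11"}) runs every stage in turn; adding custom processing amounts to adding another function to the list.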

FAST Terms

• FAST Content Extraction

− Recognize and deliver entities from unstructured content such as:

− People, companies, locations (shallow refiners)

− Modified date, result type, language (deep refiners)

− Dictionaries (custom deep refiners):

− Business and industry specific concepts

− Customer names, competitor names

− Employee titles and expertise

− Product names

− Project names
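Dictionary-based extraction, as described above, can be pictured as matching known terms against a document's text and recording each hit under a property that could later act as a refiner. The sketch below is a simplified Python illustration; the DICTIONARIES contents and the extract_entities function are hypothetical and do not reflect how FAST stores or applies its dictionaries.

```python
import re

# Hypothetical custom dictionaries: property name -> known terms.
DICTIONARIES = {
    "product": ["Widget Pro", "Gadget"],
    "project": ["Project Falcon"],
}

def extract_entities(text):
    """Return {property: [matched terms]} for dictionary terms found in text."""
    found = {}
    for prop, terms in DICTIONARIES.items():
        for term in terms:
            # Whole-word, case-insensitive match of the dictionary term.
            pattern = r"\b" + re.escape(term) + r"\b"
            if re.search(pattern, text, re.IGNORECASE):
                found.setdefault(prop, []).append(term)
    return found
```

Each key in the returned mapping corresponds to one extracted property, so a document mentioning "project falcon" would surface under a project refiner regardless of how the author capitalized it.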

FAST Terms

• FAST Content Processing term definitions

− Language encoding and detection – looks at the language of the content so appropriate dictionaries can be applied downstream

− Tokenization – breaks text into tokens according to rules for punctuation, diacritics, accents, compound words, phrases, etc.

− Lemmatization – applies linguistic normalization to content so user queries match documents that contain words and phrases in either canonical or inflected forms (singular/plural, masculine/feminine); e.g., a query for "mice" would also find "mouse".

− Property extraction – recognizes entities such as companies, people, and locations within content

− Vectorization – creates document vectors by weighting phrases and terms according to their frequency of occurrence; this supports "find documents similar to this one" on a result

− Date/Time Normalization – converts dates and times to a standard representation, e.g., 24-Mar-11 is treated the same as March 24, 2011

− Custom processing – extends content processing with custom dictionaries

− Property mapping – maps the metadata discovered in the pipeline to managed properties in the index
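To make the vectorization definition concrete, here is a minimal Python sketch of "find documents similar to this one": each document becomes a term-frequency vector and candidates are compared by cosine similarity. Plain term counts and cosine similarity are assumptions chosen for the illustration; the source does not say which weighting or similarity measure FAST uses, and the function names are hypothetical.

```python
import math
import re
from collections import Counter

def vectorize(text):
    # Document vector: term -> number of occurrences.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0

def most_similar(target, others):
    # Return the candidate whose vector is closest to the target's vector.
    target_vec = vectorize(target)
    return max(others, key=lambda d: cosine(target_vec, vectorize(d)))
```

Documents that share many frequent terms score close to 1, while documents with no terms in common score 0.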

SharePoint Search

• Default Ranking

− URL Depth – Higher ranking based on shorter URL.

− Doc Rank – Higher ranking based on the number and relative importance of links pointing to an item.

− Site Rank – Higher ranking based on the number and relative importance of links pointing to the items on a site.

− HW Boost – Placeholder used for generic usage of static rank points
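Of the factors above, URL depth is the simplest to illustrate. The Python sketch below counts path segments and turns the count into a score that falls as the URL gets deeper. The 1 / (1 + depth) formula and the function names are assumptions for illustration only; the source does not give the actual weighting SharePoint applies.

```python
from urllib.parse import urlparse

def url_depth(url):
    # Number of non-empty path segments, e.g. /sites/hr/docs/policy.docx -> 4.
    path = urlparse(url).path
    return len([segment for segment in path.split("/") if segment])

def url_depth_score(url):
    # Illustrative only: shallower URLs get a higher score.
    return 1.0 / (1 + url_depth(url))
```

Under this sketch a site's home page outranks a document buried several folders down, all else being equal, which is one reason shallow site and library structures help findability.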

Search Results

• Dynamic Ranking

− Freshness – Higher ranking based on age of content. Content just added is given more points than content that is older.

− Context – Higher ranking based on the search word hits in the content.

− Proximity – Higher ranking based on a short distance between query terms in the content.

− Managed Property – Higher ranking based on content of a specific item type defined by a managed property.

− Authority – Higher ranking when the query terms are included in the link text.

− Query Authority (Click-through) – Higher ranking when query terms are associated with previous query results and clicked search results.
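The proximity factor above rewards content in which the query terms appear close together. A minimal Python sketch of the underlying measurement, the smallest distance in tokens between two query terms, is shown below. The function name and the single-pass approach are assumptions for illustration; how the ranking model converts this distance into points is not covered in the source.

```python
def min_distance(tokens, term_a, term_b):
    """Smallest number of token positions between term_a and term_b, or None."""
    last_a = last_b = None
    best = None
    for i, tok in enumerate(tokens):
        # Remember the most recent position of each term.
        if tok == term_a:
            last_a = i
        if tok == term_b:
            last_b = i
        # Once both have been seen, the latest pair is a candidate minimum.
        if last_a is not None and last_b is not None:
            distance = abs(last_a - last_b)
            if best is None or distance < best:
                best = distance
    return best
```

A smaller result means the terms sit closer together; None means at least one term is missing from the content.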

How Do Users Find Content?

• Site Structure

• Library Structure

• SharePoint Search

Demo

Site Structure

• Plan your site collection and sub-sites

• Consider splitting off projects to their own sites

• Keep things clean!

Library Structure

• Plan your libraries

• Consider using multiple shallow libraries vs a single deep library

• Plan and use metadata tagging

• Keep things clean!

SharePoint 2013 Enterprise Search

• New search capabilities in SP2013:

− Single search result center

− Search user interface improvements

− Hover preview of document results

− Results grouped by type – document, people, sites, etc.

− Results block of similar content

− Accurate query suggestions

− Relevance improvements

− New ranking models

− Query rules

− Changes in crawling

− Continuous crawls

− Results removal from crawl logs

SharePoint 2013 Enterprise Search

• New search capabilities in SP2013 (continued):

− Discovering structure and entities in unstructured content

− Configure the crawler to look for entities such as product names within the body or title of content.

− Create custom dictionaries as an entity

− Removal of redundant information – menu, headers, boilerplate content

− More flexible search schema

− Refinable and sortable managed properties

− Multiple search schemas

− Search health reports

SharePoint 2013 Enterprise Search

• New search capabilities in SP2013 (continued):

− New search architecture

Questions

Contact information:

Kyle Bodenstab, MCITP [email protected]

LinkedIn - www.linkedin.com/in/KyleBodenstab

Twitter - @jackson_curve

For the lighter side of life – jacksoncurve.blogspot.com