Connect with life Nauzad Kapadia Quartz Systems [email protected].
Connect with life Nauzad Kapadia Quartz Systems [email protected].
-
Upload
connor-boldon -
Category
Documents
-
view
226 -
download
3
Transcript of Connect with life Nauzad Kapadia Quartz Systems [email protected].
Connect with life
www.connectwithlife.co.in
Integrated Full Text Search (iFTS) with SQL Server 2008
Nauzad KapadiaQuartz [email protected]
Session Objectives And Takeaways
Session Objective(s): Discover what we have learned from using the new Integrated Full Text SearchFind out what works and what doesn’t work
iFTS is faster than SQL 2005 full textA great base for future improvements
BackgroundBased on MSSearchLack of Integration limited the performance of queriesLimits the ability to integrate with high availability and scalability functionality
New Architecture
Demo – Using Full Text Search
StopLists• New STOPLIST support
Simplified noise words utilization and manageability.DB object associated with the FT index.
CREATE FULLTEXT STOPLIST stoplist_name
[ FROM {[database_name.] source_stoplist_name} | SYSTEM STOPLIST]
[AUTHORIZATION owner_name]
ALTER FULLTEXT STOPLIST stoplist_name{ | ADD <keyword> LANGUAGE language_term | DROP
{| <keyword> LANGUAGE language_term| ALL LANGUAGE language_term| ALL}
Demo – Stop Lists and Creating FullText Indexes
Thesaurus
• Thesaurus improvements
• Stored in internal tables (in tempdb) in XML form instead of being parsed from external files
• Instance level thesaurus
sys.sp_fulltext_load_thesaurus_file (lcid)
Loads all the data specified in the Thesaurus XML corresponding to the language with specified lcid.
Demo - Thesaurus
SQL Server 2008: IFTS
• New family of Word-Breakers (WB): WBs are components responsible of parse the textual data in a given language and pass the tokenized result to the Full-Text Index.
• 51 languages/WBs out of the box
• Improved quality in many already existing word-breakers
SQL Server 2008: IFTS
EnglishEnglish UKSimplified ChineseTraditional ChineseChinese (Hong Kong)Chinese (Macau)Chinese (Singapore)ThaiKoreanFrenchGermanJapaneseItalianSpanishBengaliBulgarianCatalanCroatian
NeutralPunjabiRomanianSerbian CyrillicSerbian LatinSlovakSlovenianTamilTeluguUkrainianUrdu LithuanianMalayIcelandicIndonesianHindiGujarati
VietnameseArabicNorwegianPortuguese BrazilianRussianDutchMalayalamMarathiHebrewCanadaLatvianSwedishDanishPolishTurkish
• WBs available in SQL Server 2008:
Languages present but disabled by default
New languages supported in SQL Server 2008
Existing in SQL Server 2005, and being replaced by new WBs in SQL Server 2008
Unchanged language/WB from SQL Server 2005
• The indexing performance has improved in most scenarios
2005 Crawl 2005 Total IFTS
Crawl IFTS Total
20M rows 1k text data
02:06 02:25 01:22 01:28
5M rows 8k text data
02:10 02:41 02:22 02:32
20M rows 1k nvarchar data
01:37 01:55 01:20 01:26
Measured on 4 processor AMD64 2793 MHz, 8G RAM. Numbers are in HH:MM format. Total time is combining time to crawl and time of merge into index
Indexing Performance
For some HW configuration and data types, specific best practices are recommended to improve indexing performance (i.e: capping SQL Server’s memory)
• To see the word frequency• Sys.dm_fts_index_keywords()• Sys.dm_fts_index_keywords_by_document()
• Get number and size of fragments• Sys.fulltext_index_fragments
• Understanding Query Behavior• Sys.dm_fts_parser(““This is test” AND “This is also a
test”, 1033,0)
Helpful Commands
Demo – Understanding Indexes
- Due to new architecture, we have now new Full-Text Indexes. Former ones are not compatible in SQL Server 2008.
Solution: Full-Text Catalog Upgrade Option
- Import: (default) Faster method although performance and semantic implications are possible.
- Rebuild: Slower method although ideal final state of new FTCatalogs guaranteed.
- Reset: Faster Upgrade method although your Search app will not have the FTCatalogs available afterwards. You need to rebuild them when possible.
- Possible Upgrade methods:
1. In place Upgrade: User will be prompted for what Upgrade Option to choose for existing FTCatalogs.
2. Restore/Attach : Instance level setting will be applied to former Full-Text Catalogs brought up with the former DB.
Upgrading
• Put full text index on separate file group to avoid fragmenting main data files
• Use varchar(max) instead of text/image
• If you see excessive blocking by FTGATHER• turn off auto change tracking • Schedule manual job to do updates• Watch number of fragments to determine how often to
schedule
• Don’t run full text master merges with other index rebuilds or reorgs at the same time
• If you have large documents (>2MB) may need to reduce SQL memory a bit so FDHost daemon has memory to run.
Best Practices
Related ContentWebcasts
MSDN Webcast: Using Full-Text Search in SQL Server Expresshttp://msevents.microsoft.com/CUI/EventDetail.aspx?EventID=1032294979&Culture=en-US
White Paper on SQL Server FTS 2005http://msdn2.microsoft.com/en-us/library/ms345119.aspx
MSDN http://msdn2.microsoft.com/en-us/library/ms142571.aspx
Technical Case Studyhttp://www.microsoft.com/technet/itshowcase/content/intdocmgmtsql2005.mspx
FTS 2008 (iFTS) White Paperhttp://msdn.microsoft.com/en-us/library/cc721269(SQL.100).aspx
Program Manager on SQL Server [email protected]
© 2008 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after
the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.