BP-8 Global Federation and Search
![Page 1: BP-8 Global Federation and Search](https://reader034.fdocuments.in/reader034/viewer/2022052623/559925ed1a28ab2f5e8b47d2/html5/thumbnails/1.jpg)
Global Federation and Search
Robin Bramley, Ixxus
![Page 2: BP-8 Global Federation and Search](https://reader034.fdocuments.in/reader034/viewer/2022052623/559925ed1a28ab2f5e8b47d2/html5/thumbnails/2.jpg)
Agenda
• Who I am
• Setting the scene
• The business challenge
• Alfresco
• Solr
• Big Content
• Global considerations
• Scaling strategies
• Alfresco 4
• Federation approaches
• 'Intelligent' storage
• Challenges
![Page 3: BP-8 Global Federation and Search](https://reader034.fdocuments.in/reader034/viewer/2022052623/559925ed1a28ab2f5e8b47d2/html5/thumbnails/3.jpg)
My Background
• Senior Architect @ Ixxus
  • The UK Alfresco Platinum Partner
  • Lucid Imagination partner
• Worked at consultancies for 13 years
  • Developing solutions with Alfresco since 0.6
  • First UK Alfresco Gold partner
• Around the edges I also write
  • GroovyMag author – inc. 4 hands-on Grails articles
  • DZone Most Valuable Blogger
    • Re-published posts include Event Driven indexing with Solr
• Open source contributions include
  • OpenID support for Acegi / Spring Security
  • Codenarc support for Hudson / Jenkins CI Violations plugin
![Page 4: BP-8 Global Federation and Search](https://reader034.fdocuments.in/reader034/viewer/2022052623/559925ed1a28ab2f5e8b47d2/html5/thumbnails/4.jpg)
The challenge
• Many global organisations face similar challenges around sharing information between regions in a timely fashion.
• For publishers this is often exacerbated by the size of some of their assets, such as print-quality images or video.
![Page 5: BP-8 Global Federation and Search](https://reader034.fdocuments.in/reader034/viewer/2022052623/559925ed1a28ab2f5e8b47d2/html5/thumbnails/5.jpg)
Alfresco
Hopefully this needs little introduction.
• Clue: it's an ECM
![Page 6: BP-8 Global Federation and Search](https://reader034.fdocuments.in/reader034/viewer/2022052623/559925ed1a28ab2f5e8b47d2/html5/thumbnails/6.jpg)
Apache Solr
RESTful Search Service
• POST it documents
• GET query results
• Built on top of Lucene
• Originated from CNET (created by Yonik Seeley)
• Features
  • Schema
  • Request handlers
  • Query types
  • Response Writers
  • Admin pages
  • Replication
  • Sharding
• Professional support available from Lucid Imagination
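The POST/GET pattern above can be sketched without a running server. The snippet below only builds the XML update message and the query URL; the base URL and the field names are assumptions for illustration, not taken from the slides.

```python
from urllib.parse import urlencode
from xml.sax.saxutils import escape, quoteattr

# Assumed base URL; the slides do not name a host or core.
SOLR_BASE = "http://localhost:8983/solr"


def build_update_body(doc):
    """Return the XML <add> message that would be POSTed to /update."""
    fields = "".join(
        "<field name={}>{}</field>".format(quoteattr(name), escape(str(value)))
        for name, value in doc.items()
    )
    return "<add><doc>{}</doc></add>".format(fields)


def build_query_url(query, rows=10):
    """Return the URL that would be fetched with GET to run a query."""
    params = urlencode({"q": query, "rows": rows})
    return "{}/select?{}".format(SOLR_BASE, params)


# Hypothetical document and query, for illustration only.
print(build_update_body({"id": "doc-1", "title": "Big Content"}))
print(build_query_url("title:content"))
```

Sending the body with an HTTP POST (content type text/xml) followed by a commit, then fetching the query URL with GET, would complete the round trip.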
![Page 7: BP-8 Global Federation and Search](https://reader034.fdocuments.in/reader034/viewer/2022052623/559925ed1a28ab2f5e8b47d2/html5/thumbnails/7.jpg)
Big Content
![Page 8: BP-8 Global Federation and Search](https://reader034.fdocuments.in/reader034/viewer/2022052623/559925ed1a28ab2f5e8b47d2/html5/thumbnails/8.jpg)
Going global
![Page 9: BP-8 Global Federation and Search](https://reader034.fdocuments.in/reader034/viewer/2022052623/559925ed1a28ab2f5e8b47d2/html5/thumbnails/9.jpg)
Going global
Global systems can pose additional challenges
• Infrastructure
  • Network
    • Bandwidth
    • Latency
    • Reliability
• Languages
• Timezones
• Collaboration
• Workflow
• Security permissions
![Page 10: BP-8 Global Federation and Search](https://reader034.fdocuments.in/reader034/viewer/2022052623/559925ed1a28ab2f5e8b47d2/html5/thumbnails/10.jpg)
Scaling strategies
You can scale / divide & conquer systems in a number of ways:
• Scale up (vertical)
![Page 11: BP-8 Global Federation and Search](https://reader034.fdocuments.in/reader034/viewer/2022052623/559925ed1a28ab2f5e8b47d2/html5/thumbnails/11.jpg)
Scaling strategies
• Scale out (horizontal)
  • Typically clustering
  • But could also be
    • Replication
    • Separation of responsibilities
![Page 12: BP-8 Global Federation and Search](https://reader034.fdocuments.in/reader034/viewer/2022052623/559925ed1a28ab2f5e8b47d2/html5/thumbnails/12.jpg)
Scaling strategies
• Partitioning
  • Data Sharding
  • Silos
    • Divisional / departmental
    • Regional
![Page 13: BP-8 Global Federation and Search](https://reader034.fdocuments.in/reader034/viewer/2022052623/559925ed1a28ab2f5e8b47d2/html5/thumbnails/13.jpg)
Alfresco 4
What's new in Alfresco 4.0?
• Won't repeat the full press release here…
• 'Cloud-scale performance'
  • Alfresco Index Server based on Apache Solr
  • Enhanced clustering
![Page 14: BP-8 Global Federation and Search](https://reader034.fdocuments.in/reader034/viewer/2022052623/559925ed1a28ab2f5e8b47d2/html5/thumbnails/14.jpg)
Alfresco 4 Solr
• Based on Solr 1.4.1
• Uses a custom alfrescoDataType fieldType
• Leverages dynamic schema fields heavily
  • Only statically defined field is 'id'
  • Everything else (*) is a multi-valued dynamic field
  • Though it uses the Alfresco model dictionary under the hood
• Analysis chain (same for index/query)
  • Whitespace tokenized
  • Word delimited
    • Breaks up camelCase etc.
  • Converted to lower case
• Adds a cmis request handler
• Uses SSL client certificate authentication
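To make the analysis chain concrete, here is a rough approximation in plain Python: split on whitespace, break each chunk on delimiters and case changes, then lower-case. It is a simplification that assumes ASCII input and does not reproduce the actual Solr filters or their options.

```python
import re

# One sub-word: a capitalised or lower-case word, an acronym, or a digit run.
SUBWORD = re.compile(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|[0-9]+")


def analyse(text):
    """Approximate the chain: whitespace split, word-delimiter split, lower case."""
    tokens = []
    for chunk in text.split():              # whitespace tokenizer
        for piece in SUBWORD.findall(chunk):  # breaks up camelCase, hyphens, etc.
            tokens.append(piece.lower())      # lower-case filter
    return tokens


print(analyse("camelCase Big-Content"))
```

Because the same chain runs at index and query time, a query for "camelcase" or "CamelCase" is reduced to the same tokens as the indexed text.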
![Page 15: BP-8 Global Federation and Search](https://reader034.fdocuments.in/reader034/viewer/2022052623/559925ed1a28ab2f5e8b47d2/html5/thumbnails/15.jpg)
Federating
![Page 16: BP-8 Global Federation and Search](https://reader034.fdocuments.in/reader034/viewer/2022052623/559925ed1a28ab2f5e8b47d2/html5/thumbnails/16.jpg)
Federation Approaches
Build an index with a crawler
Pros
• Can index many different data sources
  • File systems
  • Databases
Cons
• Timeliness
  • Pull model not suitable for all scenarios
• Additional storage requirements
• Indexing can be inefficient in a global scenario
• Permissions
![Page 17: BP-8 Global Federation and Search](https://reader034.fdocuments.in/reader034/viewer/2022052623/559925ed1a28ab2f5e8b47d2/html5/thumbnails/17.jpg)
Federation Approaches
Federated Search using OpenSearch
• A collection of simple formats for sharing search results
  • Can use an Atom response format
  • Elements such as totalResults used in CMIS Atom binding
• Was a big deal in Alfresco 2.0 (2007)
  • Alfresco Explorer has an OpenSearch client
  • Alfresco has an OpenSearch server
    • Provided keyword search
    • Wiki stated: 'Note: Advanced Web Client Search and Query Language searches will be OpenSearch enabled some time in the future, probably in line with up-and-coming CM standards.'
  • Client not in Share
• CMIS a better bet for complex queries
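As an illustration of the Atom response format, the sketch below reads totalResults and the entry titles from an OpenSearch-style feed. The sample feed is invented for the example; only the two namespace URIs come from the Atom and OpenSearch 1.1 specifications.

```python
import xml.etree.ElementTree as ET

NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "os": "http://a9.com/-/spec/opensearch/1.1/",
}

# Hypothetical response; titles and counts are made up for illustration.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
  <title>Search results</title>
  <opensearch:totalResults>2</opensearch:totalResults>
  <entry><title>Annual report</title></entry>
  <entry><title>Cover image</title></entry>
</feed>"""


def parse_results(atom_xml):
    """Return (total result count, list of entry titles) from an Atom feed."""
    feed = ET.fromstring(atom_xml)
    total = int(feed.findtext("os:totalResults", namespaces=NS))
    titles = [
        entry.findtext("atom:title", namespaces=NS)
        for entry in feed.findall("atom:entry", NS)
    ]
    return total, titles


print(parse_results(SAMPLE))
```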
![Page 18: BP-8 Global Federation and Search](https://reader034.fdocuments.in/reader034/viewer/2022052623/559925ed1a28ab2f5e8b47d2/html5/thumbnails/18.jpg)
Federation Approaches
Build a meta-search service
Pros
• Can work across heterogeneous search engines
• Can implement asynchronous results
Cons
• Rebuilding the wheel?
• Authentication is a challenge (without SAML or OAuth)
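One way a meta-search service might deliver asynchronous results is to fan the query out to every backend and hand back each result set as soon as it arrives. The sketch below uses a thread pool for this; the two backend functions are hypothetical stand-ins for real search engines, and authentication is left out entirely.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def meta_search(query, backends):
    """Yield (backend_name, hits) as each backend finishes, fastest first."""
    with ThreadPoolExecutor(max_workers=len(backends)) as pool:
        futures = {pool.submit(fn, query): name for name, fn in backends.items()}
        for future in as_completed(futures):
            yield futures[future], future.result()


# Hypothetical backends; a real one would call a regional search engine.
def search_london(query):
    return ["london: " + query]


def search_new_york(query):
    return ["new york: " + query]


results = dict(
    meta_search("report", {"london": search_london, "new_york": search_new_york})
)
print(results)
```

A user interface built on this would render the first result set immediately and append the others as they complete, which is the behaviour users need to be prepared for.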
![Page 19: BP-8 Global Federation and Search](https://reader034.fdocuments.in/reader034/viewer/2022052623/559925ed1a28ab2f5e8b47d2/html5/thumbnails/19.jpg)
Federation Approaches
Solr shards
• Treat the Solr cores of separate Alfresco repositories as separate shards
Pros
• Distributed queries are a standard Solr feature
Cons
• The repositories need to be backed by a single authentication source
  • E.g. LDAP
• Asynchronous results aren't supported OOTB
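Solr's distributed search is driven by a `shards` request parameter listing each shard as host:port/path. The sketch below only builds such a URL; the host names are hypothetical, and the SSL client certificate setup that Alfresco's Solr requires is not shown.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_sharded_query_url(coordinator, shards, query):
    """Return a /select URL that asks one Solr node to query all shards."""
    params = {"q": query, "shards": ",".join(shards)}
    return "http://{}/select?{}".format(coordinator, urlencode(params))


# Hypothetical regional Solr endpoints, one per Alfresco repository.
SHARDS = ["solr-eu.example.com:8983/solr", "solr-us.example.com:8983/solr"]

url = build_sharded_query_url(SHARDS[0], SHARDS, "title:report")
print(url)
print(parse_qs(urlparse(url).query)["shards"])
```

The node that receives the request merges the per-shard results, so the caller waits for the slowest region before anything is returned.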
![Page 20: BP-8 Global Federation and Search](https://reader034.fdocuments.in/reader034/viewer/2022052623/559925ed1a28ab2f5e8b47d2/html5/thumbnails/20.jpg)
'Intelligent' storage
Storage Cloud Technology
• Underpinning for the repository is a storage cloud technology
  • Uses a Content Store Selector
• Base layer built on commodity hardware
  • Keeps multiple replicas of the content
• Management layer
  • Cost-based routing
  • Knows where content resides
  • On-demand content migration between repositories
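The slides do not describe how the management layer computes costs, so the following is only a guess at the idea of cost-based routing: given the sites that hold a replica and a table of link costs, serve the request from the cheapest one. The site names and cost figures are invented.

```python
def nearest_replica(requesting_site, replica_sites, link_cost):
    """Pick the replica site that is cheapest to reach from the requester."""
    if not replica_sites:
        raise ValueError("content has no known replicas")
    if requesting_site in replica_sites:
        return requesting_site  # a local copy always wins
    return min(replica_sites, key=lambda site: link_cost[(requesting_site, site)])


# Hypothetical link costs (for example, relative latency) between sites.
LINK_COST = {
    ("sydney", "london"): 90,
    ("sydney", "new_york"): 60,
    ("london", "new_york"): 40,
}

print(nearest_replica("sydney", {"london", "new_york"}, LINK_COST))
print(nearest_replica("london", {"london", "new_york"}, LINK_COST))
```

On-demand migration could then copy the content to the requesting site so that later requests hit the local branch.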
![Page 21: BP-8 Global Federation and Search](https://reader034.fdocuments.in/reader034/viewer/2022052623/559925ed1a28ab2f5e8b47d2/html5/thumbnails/21.jpg)
Challenges
• Large file size
  • Has to work with streaming
  • Beware of anything that attempts to buffer a full file into memory
    • E.g. to POST it
  • Watch out for processes that need to copy a file
• User expectations
  • Need training on asynchronous behaviour
  • Search results and their appearance
    • Grouping / sort
    • Pagination (of distinct result sets)
• Time to migrate large content
  • Can be lengthy if there isn't a 'near' copy
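The streaming point can be illustrated with a chunked copy that never holds more than one chunk in memory, in contrast to reading the whole file with a single `read()`. The chunk size is an arbitrary choice for the example, and in-memory streams stand in for real files or HTTP bodies.

```python
import io

CHUNK_SIZE = 1024 * 1024  # 1 MiB; an arbitrary illustrative value


def stream_copy(source, target, chunk_size=CHUNK_SIZE):
    """Copy a binary stream chunk by chunk and return the bytes copied."""
    copied = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty read means end of stream
            break
        target.write(chunk)
        copied += len(chunk)
    return copied


# In-memory streams stand in for a large file and its destination.
src = io.BytesIO(b"x" * 10)
dst = io.BytesIO()
print(stream_copy(src, dst, chunk_size=4))
```

The standard library's `shutil.copyfileobj` performs the same kind of loop and is usually the simpler choice when no byte count is needed.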
![Page 22: BP-8 Global Federation and Search](https://reader034.fdocuments.in/reader034/viewer/2022052623/559925ed1a28ab2f5e8b47d2/html5/thumbnails/22.jpg)
Twitter: @rbramley
Blog: http://leanjavaengineering.com/
Web: http://www.ixxus.com