A search engine for Blackboard Learn: the impossible made possible
Overview
- Intro
- Why
- How
- What
- Future thoughts
- Q&A
Introduction
- KU Leuven Association
Introduction
- Toledo ~ Blackboard++
- 32,000 active courses and organizations
- 135,000 active users
- 40,000-50,000 active users on an average day
- 85,000 active users on top days
- Self-hosted since 2001
- Learn April 2014 release
Introduction: content
- 3,529,176 content items in total
- 2,765,371 documents (attachments) in total
- 10.6 TB CMS storage
- 500-1,000 changes/day
- Start of semester: 10,000-20,000 changes/day
Introduction
- Team (7+4+1 FTE)
- Strong focus on technology and integration
- Centrally supported ICT tools for education
  - Blackboard
  - Exam setup
  - Streaming video
  - E-portfolio
  - E-bulletin board
  - MOOCs
  - Event subscription
Why
- Students/staff
  - Find stuff
  - That's the way the internet, smartphones, computers (real life?) work
- Policy makers: find what you search for
  - Knowledge workers spend 15% to 35% of their time searching for information
  - Enterprise search solution
    - Search over all environments (Toledo, SharePoint, website, wikis, blogs)
    - Improve/introduce search functionality in every environment
- Team
  - Any self-respecting site has a search
  - Because we can!
How: Hybrid Solution
- Enterprise search: hybrid solution
  - Microsoft SharePoint Search for Microsoft repositories
  - ElasticSearch for other repositories (Blackboard, SAP, Plone, ...)
- ElasticSearch
  - Scalable: adding nodes automatically redistributes the data over the available nodes
  - Highly available: automatically detects new or failed nodes, and reorganizes and rebalances data
  - Developer-friendly RESTful API
  - Document-oriented: structured JSON documents
  - Real-time data and analytics
  - Built on top of Apache Lucene, a high-performance, full-featured information retrieval library
  - Apache 2 open source license
CON
- Access control requires setting up a connector framework (Google Connector Manager or ManifoldCF)
- Overarching search requires a link between SharePoint and the Lucene solution (with a focus on shielding the results)
- 2 search environments to manage
PRO
- Full integration with Microsoft Enterprise content
- Other environments (Toledo, Plone, wikis and vertical search) can be opened up more easily through existing connectors and with the expertise available in the respective teams
- The search experience within an environment is at its best (both for Microsoft and for the other environments)
- Flexible
How: Elastic Search
[Figure: search architecture. Labels: query, repository, content; environments with their repository (Blackboard, SAP, Plone WCM); ElasticSearch.]
Notes: querying and supplying content from the various environments; the frontend and the connectors.
How: complexity
How: evolution
- Indexing/crawling without filtering who can access what
  - Difficult to find what you search for: irrelevant search results
  - Not user-friendly: a search result may not be accessible
  - Not secure: some context of a search result is displayed even where the user has no access
- Periodic indexing/crawling: does not reflect the actual authorization in the repository
  - Authorization = show search results depending on
    - Availability of the course/content item
    - Course/community membership
- Near real-time indexing/crawling reflects changes in
  - Content
  - Roles: access to content items, by adding membership and availability to each content item
  - An additional search index which reflects the memberships of each user
- Document level security enforced by the frontend
  - Only search results the user has access to
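The last step of this evolution (membership and availability stored on each content item, plus a separate index of each user's memberships) might look roughly like the sketch below. The field names `roles` and `available`, the role strings and the helper `user_may_see` are assumptions for illustration, not the actual Toledo schema.

```python
# Illustrative only: field names and role strings are assumptions.

# A content item as it might be stored in the content index.
content_item = {
    "id": "item-1",
    "title": "Lecture notes week 1",
    "available": True,
    "roles": ["course-42:student", "course-42:instructor"],
}

# Entries of the additional index that reflects each user's memberships.
membership_index = {
    "u0001": ["course-42:student"],
    "u0002": ["course-7:student"],
}


def user_may_see(uid, item, memberships):
    """Return True if the item is available and shares a role with the user."""
    user_roles = set(memberships.get(uid, []))
    return bool(item["available"]) and bool(user_roles & set(item["roles"]))
```

With this data, `u0001` would see the item and `u0002` would not, because only `u0001` holds one of the roles stored on the item.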
How: Current search architecture
[Figure: current search architecture. Labels: query, repository, content, roles; environments with their repository (Blackboard, SAP, Plone WCM); connector; frontend; ElasticSearch.]
Notes: querying and supplying content from the various environments; the frontend and the connectors.
How: Architecture for indexing data in ElasticSearch
[Figure: indexing architecture. Labels: environments (Blackboard, Plone, SAP); specific connectors (BbSync, PloneSync, ODataSync); reusable components (RabbitMQ queues "file" and "index", queueSync); filesystem; ElasticSearch.]
Notes: 140 person-days in 2015; demo at the Blackboard user conference in April 2016.
How: Architecture for indexing data in ElasticSearch
- bbSync: pushes changes in Blackboard to the relevant queue (index, file)
- queueSync:
  - takes JSON documents off the index queue and indexes them in ElasticSearch
  - takes the CMS link out of a file queue document, parses the CMS file from the filesystem and indexes the result in ElasticSearch
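The two jobs of queueSync can be sketched as below. This is a minimal illustration, not the actual component: `queue.Queue` stands in for the RabbitMQ queues, a plain dict stands in for the ElasticSearch index, and the message fields `id` and `cms_link` as well as the helper `extract_text` are hypothetical names. Storing the parsed file text on the same document as the content item is also an assumption.

```python
import json
import queue

# Stand-ins (assumptions): queue.Queue replaces a RabbitMQ queue and a dict
# replaces an ElasticSearch index, so the sketch runs without either service.
index_queue = queue.Queue()
file_queue = queue.Queue()
search_index = {}


def extract_text(path, filesystem):
    """Hypothetical parser: here it only reads text from an in-memory 'filesystem'."""
    return filesystem[path]


def drain_index_queue():
    """Index every JSON document waiting on the index queue."""
    while not index_queue.empty():
        doc = json.loads(index_queue.get())
        search_index[doc["id"]] = doc


def drain_file_queue(filesystem):
    """Resolve the CMS link of each file message, parse the file, index the text."""
    while not file_queue.empty():
        msg = json.loads(file_queue.get())
        doc = search_index.setdefault(msg["id"], {"id": msg["id"]})
        doc["file_text"] = extract_text(msg["cms_link"], filesystem)
```

In this picture bbSync would be the producer: it puts one JSON message on `index_queue` for a changed content item and one on `file_queue` for a changed attachment.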
How: Architecture for indexing data in ElasticSearch
- Every developed component and every RabbitMQ queue runs as a separate Docker container
- Scalable
  - Multiple instances of every component
  - Possibility to add one or more instances of a component
- Highly available
  - Queues
  - If an instance of a component stops working, another instance processes the queue entry
- Metrics for monitoring/alerting/graphs
Architecture for querying ElasticSearch
[Figure: query architecture. Labels: query, repository, content, roles; connector; frontend; ElasticSearch.]
- Authentication
  - Basic Authentication for service accounts
  - OAuth integration possible
- Document level security
  - Show only search results the user has access to
Notes: non-JSON documents (e.g. PDF, Word, ...) are analyzed or parsed via fileSync.
Architecture for querying ElasticSearch
- The search UI connects to the ElasticSearch frontend via a service account with Basic Authentication
- It hands over the user's search query along with the uid of the user
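A call from the search UI to the frontend could be built along these lines. This is a sketch under assumptions: the `/search` path, the JSON body with `uid` and `query`, and the function name `build_search_request` are all hypothetical, since the slides do not show the frontend's API. Only the Basic Authentication header follows the standard scheme (base64 of `user:password`).

```python
import base64
import json
import urllib.request


def build_search_request(base_url, service_user, service_password, uid, query):
    """Build (but do not send) a search request for the frontend."""
    credentials = f"{service_user}:{service_password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    # Hypothetical payload: the user's query plus the uid of the user.
    body = json.dumps({"uid": uid, "query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/search",  # hypothetical endpoint
        data=body,
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The request is authenticated as the service account, not as the end user; the end user is identified only by the `uid` in the body, which is why the frontend has to enforce document level security itself.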
Architecture for querying ElasticSearch
- The ElasticSearch frontend first looks up the roles of the user with that uid
Architecture for querying ElasticSearch
- The ElasticSearch frontend expands the user's search query with the user's roles
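One way to expand a query with roles is to wrap it in an ElasticSearch `bool` query whose `filter` clause restricts hits to documents carrying at least one of the user's roles. The sketch below only builds the request body; the field name `roles` and the function name are assumptions, and the slides do not say which query type the frontend actually uses.

```python
def expand_query_with_roles(user_query, user_roles):
    """Wrap the user's query in a bool query that filters on the user's roles."""
    return {
        "query": {
            "bool": {
                # Full-text part: what the user typed.
                "must": [{"query_string": {"query": user_query}}],
                # Security part: keep only documents that carry one of the
                # user's roles ("roles" is an assumed field name).
                "filter": [{"terms": {"roles": list(user_roles)}}],
            }
        }
    }
```

Putting the role restriction in `filter` rather than `must` keeps it out of relevance scoring, so membership only decides whether a document is visible, not how high it ranks.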
Architecture for querying ElasticSearch
- ElasticSearch returns the search results that match the query, restricted to documents that have the right roles
How: the people
- Project manager Enterprise Search: 80 MD
  - Project management
  - Technical overview and key decisions
  - Management reporting & advice
  - Contact with software vendors
  - Inter-team facilitator
- 2 Java developers: 250 MD
  - Senior profiles
  - Extremely skilled (gurus)
  - DevOps
- Sysadmin team: 60 MD
  - Bleeding edge in several technologies
    - Docker
    - Elastic Search
    - Agile deployment with Puppet
  - DevOps
What
What
- Random search
- Speed
- Amount specific data
What
- The Beer course
What
- Search as student vs. staff, respecting availability
What
- Full-text search (inside files)
Future thoughts
- Search UI improvements
  - Filters/facets
  - Highlighting (context)
  - Implement operators
  - Phrase matching
- User testing
- Relevance ranking
- Vertical search applications
  - Specific view on the search index (a certain subset with specific filters/facets and ...)
  - Search announcements, courses, ...
- Invite a big LMS vendor to incorporate it
Q&A