II-SDV 2015, 20 - 21 April, in Nice



Page 1: II-SDV 2015, 20 - 21 April, in Nice

1

Search Technologies: Who We Are

The leading independent IT services firm specializing in the design, implementation, and management of enterprise search and big data search solutions.

Page 2: II-SDV 2015, 20 - 21 April, in Nice

2

Search Technologies: Background

San Diego

London UK

San Jose, CR

Cincinnati

San Francisco

Washington (HQ)

Frankfurt DE

• Founded 2005

• 150+ employees

• 600+ customers worldwide

• Deep enterprise search expertise

• Consistent revenue growth

• Consistent profitability

Page 3: II-SDV 2015, 20 - 21 April, in Nice

3

Search Engine and Big Data Expertise

Our Technology and Integration Partners

Page 5: II-SDV 2015, 20 - 21 April, in Nice

5

Search Technologies: What We Do

• All aspects of search application implementation

– Content access and processing, search system architecture, configuration, deployment

– Accuracy analysis, metrics, engine scoring, relevancy ranking, query enhancement

– User interface, analytics, visualization

• Technology assets to support implementation

– Aspire high performance content processing

– Content Connectors (Document, Jive, SharePoint, Salesforce, Box.com, etc.)

• Engagement models

– Most projects start with an “assessment”

– Fully project-managed solutions, designed, delivered, and supported

– Experts for hire, supporting in-house teams or as a subcontractor

Page 6: II-SDV 2015, 20 - 21 April, in Nice

6

Search Technologies: Expertise by Role

Role | Responsibilities
Project Manager | Ensures project is on time and within budget.
Architect | Designs overall solution architecture.
Requirements Analyst | Documents requirements and solution goals.
Engineer | Hands-on software development and configuration.
Lead Engineer | Lead developer – understands application from end to end.
Search Quality Analyst | Analyzes search metrics, improves quality of results.
Data Analyst | Analyzes data; defines content processing needs.
Support Engineer | Provides 8x5 or 24x7 support for software and managed services.

Page 7: II-SDV 2015, 20 - 21 April, in Nice

7

Q&A

Page 8: II-SDV 2015, 20 - 21 April, in Nice

8

Microsoft Search Expertise

• Over 150 SharePoint/FAST customers

• 50 Engineers trained on Microsoft search technologies

• Projects with Elsevier, CPA, Florida Power & Light, Library of Congress, GPO, Norton Rose, Daily Mail Group, Accenture, Unilever

• Working with all versions and combinations of SharePoint and FAST (ESP, FSIS, FSIA, etc.)

• FAST’s Worldwide Partner of the Year back in 2006

– 30,000+ days of implementation experience since…

• Already up-to-speed with SharePoint 2013

Page 9: II-SDV 2015, 20 - 21 April, in Nice

9

Google Search Appliance Expertise

• 30 Engineers trained on GSA

• Over 100 Google Search Appliance Customers

• Projects with AARP, Isilon, Petco, SAIC, ESRI, General Electric, Vistage, Ning, NVIDIA, Hershey, Mayo Clinic and others

• Google recommends us for their most challenging GSA integration projects

• Focus on integration and implementation issues that GSA does not handle out-of-the-box

• 3rd Party Connector Development

Page 10: II-SDV 2015, 20 - 21 April, in Nice

10

Open Source Expertise

• 40 Engineers Trained on Solr/Lucene or ElasticSearch stack (ELK)

• Projects with Comcast, BBC, U.S. House, MemoryLane, Bloomberg, Citibank, BusinessLink, Genentech, Qualcomm, YP.com

• Focus on extending document processing and query parsing frameworks to enable open source to function in complex enterprise scenarios

• Focus on extreme scale and performance scenarios – YP.com, Comcast

Page 11: II-SDV 2015, 20 - 21 April, in Nice

11

Big Data Expertise

• Expertise with Big Data technologies

– NoSQL – Hadoop, Cloudera

– Data Mining and Machine Learning – Mahout

– Distributed databases – Cassandra / Datastax

• Projects in Data Warehouse Search, Fraud Detection, Automated Candidate Matching

Page 12: II-SDV 2015, 20 - 21 April, in Nice

12

Assessment & Delivery

Page 13: II-SDV 2015, 20 - 21 April, in Nice

13

Primary Engagement Models

• Provide complete search & big data solutions, from requirements analysis and design through to implementation and ongoing support

• Provide expert services to support in-house customer projects or larger system integrators

Page 14: II-SDV 2015, 20 - 21 April, in Nice

14

Delivery Model

Assessment

• Deep dive to evaluate technical situation and business objectives

• Document Analysis

• Develop detailed project plan and schedule

Implementation

• Focus on technical execution and quality

• Tightly manage objectives as per Assessment

• Ensure completion to timeframe and budget

Support

• Knowledge transfer from Implementation team ensures smooth hand-off to Support

– 8x5 or 24x7

– Managed Services

– Hosting

Assessment → Statement of Work → Implementation → Completion → Support

Page 15: II-SDV 2015, 20 - 21 April, in Nice

15

Assessment Process

An Assessment typically involves the following steps, but is always customized to the requirements of the customer.

• Define Business and Technical Objectives

• Review Existing Applications

• Review Data, Environment and User Requirements

• Review Performance Requirements

• Define Future Architecture

• Generate Assessment Report

Phases: DEFINE → REVIEW & ANALYZE → RECOMMEND → REPORT

Page 16: II-SDV 2015, 20 - 21 April, in Nice

16

Assessment Document

• Executive Summary

• High Level Requirements

• System Overview

• Detailed Requirements

• New Initiatives

• Proposed Project Plan

• Conclusions and Next Steps

Page 17: II-SDV 2015, 20 - 21 April, in Nice

17

Assessment Benefits

• Deep focus on Technical and Business Objectives

• Detailed, documented options and recommendations

• Better visibility into the Project Scope

• Better communications and Project Management

• Opportunity to leverage expertise on your team

Page 18: II-SDV 2015, 20 - 21 April, in Nice

18

Project Execution Model

• Projects organized around 3 key personnel

• Agile development methodologies (Sprints and Scrums)

• JIRA and Greenhopper used for issue tracking and project management

Technical Lead

Project Manager

Architect

Page 19: II-SDV 2015, 20 - 21 April, in Nice

19

Support & Managed Services

Page 20: II-SDV 2015, 20 - 21 April, in Nice

20

Technical Support & Managed Services

• Standard and Premium Support available worldwide

• Application Managed Services available worldwide

• Communication Channels

– Support Online Portal (http://support.searchtechnologies.com)

– Support Phone Line (619 564 4351 option 1)

– Support Email ([email protected])

• Support Time Frames

– 8x5 or 24x7

Severity | Regular Support | Premium Support
Critical | 4 business hours after logging the issue | 2 hours after logging the issue and Call Support
Major | 1 business day after logging the issue | 4 business hours after logging the issue
Minor | 2 business days after logging the issue | 1 business day after logging the issue
Trivial | 1 business week after logging the issue | 2 business days after logging the issue

Page 21: II-SDV 2015, 20 - 21 April, in Nice

21

Support Online Portal

http://support.searchtechnologies.com

Page 22: II-SDV 2015, 20 - 21 April, in Nice

22

Other Online Resources (Wiki)

http://wiki.searchtechnologies.com

Page 23: II-SDV 2015, 20 - 21 April, in Nice

23

Hosting Services

• 10+ hosted customer applications

• 24x7 Technical Support

• Cloud Hosted Services

Page 24: II-SDV 2015, 20 - 21 April, in Nice

24

Organization

Page 25: II-SDV 2015, 20 - 21 April, in Nice

25

Executive Team

Executive | Enterprise Search Industry Experience (Years in the Search / IT Industry)
Kamran Khan, President & CEO | 19 years: International Sales, VP Sales, Executive
John Steinhauer, VP Technology | 16 years: Development Management, Project Management, Executive
Pat Booth, Director of Finance | 17 years: Finance, Operations, Executive
Paul Nelson, Chief Architect | 25 years: Development, Innovation, Architecting, Dev. Management
John Back, VP Sales - US | 15 years: Sales, VP Sales
Graham Charlesworth, VP Sales - Europe | 17 years: Business Development, VP Sales, Executive
Dennis Tran, Vice President | 21 years: International Sales, VP Sales
Graham Gillen, VP Marketing | 15 years: VP Marketing, Product Marketing, Analyst & Partner Relations
Iain Fletcher, Director Marketing Europe | 17 years: International Sales, Product Management & Marketing

Page 26: II-SDV 2015, 20 - 21 April, in Nice

26

Organization Chart

Kamran Khan, CEO

Pat Booth, Director of Finance

Joni Morgan, Sr. Bus Analyst

Nathalie Rodriguez, Corp. Accountant

Karen Pramis, Corp. Accountant

Graham Gillen, VP Marketing

Stacy Brooks, Marketing Mgr

Iain Fletcher, Dir. Marketing Europe

Telemarketing Associates

Graham Charlesworth, VP Sales Europe

Graham Jackson, Account Mgr

Bernd Rahmig, DE Acct Mgr

Linda Berry, EU Finance & Admin

John Steinhauer, VP Technology

Phil Lewis, UK Tech Dir (16 Engineers)

Maynor Alvarado, CR Tech Dir (59 Engineers)

Joan Schaech, East PS Mgr (31 Engineers)

Matt Lumsden, West PS Mgr (9 Engineers)

John-Henry Gross, Product Mgr

John Back, VP Sales NA

Mary Jo Houghton, Account Mgr - NE

Jerry Junker, Account Mgr - MW

Joe Abrams, Account Mgr - W

Dennis Tran, Google Accounts

Mimy Indra, Account Mgr - Federal

Jan Seaton, Director HR

Paula Small, Recruiter

Amanda Bolanos, Sr. Admin Asst

Kristin Andrews, Receptionist/AA

Paul Nelson, VP, Chief Architect

Page 27: II-SDV 2015, 20 - 21 April, in Nice

27

Engineering Team

• Project Engineering

– Frontline technical consultants working on customer projects

• Project Management

– Global organization to manage customer projects

• Core Engineering

– Building assets and tools used by project teams and customers

• Technical Support and Managed Services

– Supporting software and Applications

• Sales Engineering

– Technical expertise to drive sales

Page 28: II-SDV 2015, 20 - 21 April, in Nice

28

Aspire Content Processing, Connectors and QPL

Technology Assets

Page 29: II-SDV 2015, 20 - 21 April, in Nice

29

Technology Assets

1. Aspire Framework

– High Performance Content Processing

– Ingests and processes content and publishes to a variety of indexes for commercial and open source search engines

2. Aspire Data Connectors

– API level access to content repositories

3. Query Processing Language (QPL)

– Advanced query processing

Complements to commercial and open source search technologies

Diagram components: Content sources, Connectors, Staging Repository, Aspire Content Processing Pipelines, Publishers, Indexes, Search Engine, QPL, Web Browser

Page 30: II-SDV 2015, 20 - 21 April, in Nice

30

Aspire Content Processing

Page 31: II-SDV 2015, 20 - 21 April, in Nice

31

Importance of Content Processing

• Inconsistent and sparse content, especially metadata, is a leading cause of user dissatisfaction and underperformance in search applications

• Meticulous preprocessing prior to indexing is a critical, yet often neglected, aspect of search systems

• The original format and structure of the content is typically optimized for human consumption; content processing optimizes it for indexing and search

Page 32: II-SDV 2015, 20 - 21 April, in Nice

32

Content Processing Supports

Diagram components: Content sources, Connectors, Staging Repository, Aspire Content Processing Pipelines, Publishers, Indexes, Search Engine, QPL, Web Browser

• Optimum Relevancy & Recall

• Search Navigators

• Content Grouping

• Secure Content Hub For Enterprise Content

• Support for Advanced Analytics

Page 33: II-SDV 2015, 20 - 21 April, in Nice

33

Content Processing Stages

Diagram components: Content sources, Connectors, Staging Repository, Aspire Content Processing Pipelines, Publishers, Indexes, Search Engine, QPL, Web Browser

• Connectors – Secure access to content

• Staging Repository – Fast & secure re-indexing

• Pipelines

– Cleansing, enriching and normalizing prior to indexing

• Publishers – Output to search engine
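The four stages above can be sketched in a few lines of Python. This is an illustration only, not Aspire code: the `Document` class, the `cleanse` and `enrich` stages, and `run_pipeline` are hypothetical names, and plain lists stand in for the staging repository and the search index.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class Document:
    doc_id: str
    content: str
    metadata: Dict[str, str] = field(default_factory=dict)


Stage = Callable[[Document], Document]


def cleanse(doc: Document) -> Document:
    # Collapse runs of whitespace left over from extraction.
    doc.content = " ".join(doc.content.split())
    return doc


def enrich(doc: Document) -> Document:
    # Derive a simple metadata field from the content.
    doc.metadata["word_count"] = str(len(doc.content.split()))
    return doc


def run_pipeline(docs: Iterable[Document], stages: List[Stage],
                 publish: Callable[[Document], None]) -> None:
    """Pass each staged document through every stage, then publish it."""
    for doc in docs:
        for stage in stages:
            doc = stage(doc)
        publish(doc)


# One list stands in for the staging repository, another for the index.
staging = [Document("1", "  Hello   search \n world ")]
index: List[Document] = []
run_pipeline(staging, [cleanse, enrich], index.append)
```

Keeping the staged documents separate from the pipeline is what makes re-indexing cheap: the stages can be changed and re-run without going back to the content source.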

Page 34: II-SDV 2015, 20 - 21 April, in Nice

34

What is Aspire?

• A vendor neutral framework to support high-volume, high-performance content processing

• A toolkit to create custom components needed to implement high quality search implementations

• A highly effective and low cost way to prepare data for indexing by extracting and normalizing metadata, cleansing and enriching data

• A framework that enables Search Technologies to create outstanding search experiences for customers

Page 35: II-SDV 2015, 20 - 21 April, in Nice

35

Content Processing Examples

• Normalization

– Names, dates, synonyms, spelling

• Entity identification and resolution

• Derive additional metadata from content

• Discover hierarchy metadata

• Categorization

• Document Matching

• Document segmentation and concatenation

• Link analysis

• Duplicate detection

• Security analysis

Diagram labels: Index; security, category, metadata
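Two of the examples above, date normalization and duplicate detection, are easy to illustrate. The sketch below is a minimal Python version under stated assumptions: the accepted date formats are invented for the example, and duplicates are detected by exact match on a hash of case- and whitespace-normalized text, which is only one of several possible techniques. All function names are hypothetical.

```python
import hashlib
from datetime import datetime
from typing import Dict, List, Optional, Tuple

DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%B %d, %Y")  # assumed input formats


def normalize_date(raw: str) -> Optional[str]:
    """Return the date as ISO 8601 (YYYY-MM-DD), or None if unrecognized."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def fingerprint(text: str) -> str:
    """Hash case- and whitespace-normalized text to detect exact duplicates."""
    canonical = " ".join(text.lower().split())
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_duplicates(docs: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (duplicate_id, original_id) pairs for equal fingerprints."""
    seen: Dict[str, str] = {}
    duplicates: List[Tuple[str, str]] = []
    for doc_id, text in docs.items():
        key = fingerprint(text)
        if key in seen:
            duplicates.append((doc_id, seen[key]))
        else:
            seen[key] = doc_id
    return duplicates
```

Near-duplicate detection (for example, documents that differ only in a footer) would need a fuzzier technique such as shingling; the hash above only catches documents whose normalized text is identical.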

Page 36: II-SDV 2015, 20 - 21 April, in Nice

36

Aspire Reference Architecture with Big Data Scaling for Big Data Solutions

Diagram components: Content sources, Connectors, Staging Repository, Aspire Content Processing Pipelines, Publishers, Indexes, Search Engine, QPL, Web Browser; Big Data Framework with multiple Aspire instances; Big Data Array; Indexes, Semantics, Text Mining, Quality Metrics

Page 37: II-SDV 2015, 20 - 21 April, in Nice

37

Aspire Benefits

• Vendor neutral framework “future proofs” solutions

• Mature toolkit provides full set of components to create solutions faster, economically and reliably

• Improved index quality enabled by content processing

• Java based solution supports a wide array of computing platforms and is scalable

• Workflow and scripting support enables more flexible and maintainable solutions

Page 38: II-SDV 2015, 20 - 21 April, in Nice

38

Customers Using Aspire

• Search Technologies

• ACS / Xerox

• Adecco

• ASCO

• Aspermont

• BASF

• Bayer (POC)

• Blackberry

• Bloomberg/BNA

• Boehringer Ingelheim

• Carson-Dellosa

• CBBB

• Celera Systems

• Chick-Fil-A

• CPA Global

• EMC Corporation

• Evonik (Germany)

• Florida Power & Light

• GE Research

• GFR Media

• Haymarket (PistonHeads)

• Haymarket (HIFI)

• Hershey

• JobSite

• Just Eat

• Labour

• LOC

• Mitre

• NARA

• Deloitte

• Nectar

• NetDocuments

• New York Housing

• OLRC

• OSD/CAPE

• Penske Truck leasing

• Reed Business International

• Rolls Royce

• SCIE

• Seagate

• Shire

• Sony Media

• Sprint

• Thoughtworks

• United Nations

Page 39: II-SDV 2015, 20 - 21 April, in Nice

39

Aspire Fundamentals

• An OSGi framework + plug-in components architecture

• Vendor independent

• Intuitive Admin UI

• Rich library of component bundles and components

– Connectors to content sources

– Document processing components

• Parsing, extracting, splitting, joining, metadata mapping, etc.

• Scripting support using Groovy

– Publishers to leading search engines

• Integration with Hadoop

Page 40: II-SDV 2015, 20 - 21 April, in Nice

40

Intuitive Modern Administration UI

Page 41: II-SDV 2015, 20 - 21 April, in Nice

41

Aspire Community

Licensing & Maintenance

• Free to download and use

• Registration Required

• License Agreement Required

• Maintenance & Support is not available

Packaging

• Framework and Core Components

• Publishers: Solr, CloudSearch and GSA

• Connectors for File system and RDB

• No security

• Javadoc for Programming New Components

• Administration Tool

• Archetypes for quickly creating new components and distributions

• Access to Aspire Wiki

• Access to the Maven repository

– But for a limited set of components

Aspire Enterprise

Licensing & Maintenance

• Priced per server per month

• Maintenance and Technical Support included

Packaging

• Aspire Community, plus

– All currently available publishers*

– Corporate Site Map

– Enterprise Security

– Distributed Processing

– Connectors: CIFS, Heritrix, Enhanced RDB

– Dynamic Crawler Controls

• Access to Wiki

• Access to the Aspire Maven Repository

– Includes access to all released pipeline components

• Technical Support (via support portal, telephone, and e-mail); 8x5 or 24x7 support available (additional cost)

* Except FAST Content API

Page 42: II-SDV 2015, 20 - 21 April, in Nice

42

Connectors

Page 43: II-SDV 2015, 20 - 21 April, in Nice

43

Connectors Provide

• API level access to repositories

• Retrieval of:

– Content and metadata

– ACLs for repositories that support security

– Hierarchy information

• Full and incremental crawling

• Multiple modes for crawl scheduling

• Search engine independence

• Ease of install and configuration from a common Admin UI

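The difference between full and incremental crawling can be sketched as a comparison of two repository snapshots. This is a simplified illustration, not how any particular connector is implemented: the `Snapshot` type (item id mapped to a last-modified timestamp) and both function names are assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical repository snapshot: item id -> last-modified timestamp.
Snapshot = Dict[str, float]


def full_crawl(repository: Snapshot) -> List[str]:
    """A full crawl returns every item in the repository."""
    return sorted(repository)


def incremental_crawl(previous: Snapshot,
                      current: Snapshot) -> Tuple[List[str], List[str], List[str]]:
    """Compare two snapshots and return (added, updated, deleted) item ids."""
    added = sorted(k for k in current if k not in previous)
    updated = sorted(k for k in current
                     if k in previous and current[k] > previous[k])
    deleted = sorted(k for k in previous if k not in current)
    return added, updated, deleted
```

An incremental crawl only sends the three change lists downstream, so the index stays current without reprocessing unchanged content.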

Page 44: II-SDV 2015, 20 - 21 April, in Nice

44

Connectors

• Aspire Enterprise Connectors

– File (CIFS)

– RDB

– Heritrix

• Premium Connectors

• SharePoint 2010

• SharePoint 2013

• Lotus Notes

• Amazon S3

• Confluence

• Documentum

• EMC eRoom

• Socialcast

• IBM Connections

• Salesforce.com

• TeamForge

• Oracle RightNow

• Jive


Page 45: II-SDV 2015, 20 - 21 April, in Nice

45

QPL – Query Processing Language

Page 46: II-SDV 2015, 20 - 21 April, in Nice

46

We Expect Help With Queries

Page 47: II-SDV 2015, 20 - 21 April, in Nice

47

What is Query Processing?

• Analyze the content of a query, determine the user's intent, and optimize the query for the search engine

• Examples:

– Term consolidation: red wine → “red wine”

– Term expansion: FSA → FSA OR “Financial Services Authority”

– Semantic expansion: Gun → Gun OR Rifle OR Pistol OR Firearm

– Geographic: Near Buffalo NY → &q=*:*&fq={!geofilt pt=45.15,-93.85 sfield=store d=5}

– Normalization: Bill Smith → William Smith
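The consolidation, expansion and normalization examples above can be approximated with a small rule-based rewriter. The sketch below is illustrative Python, not QPL: the lookup tables and the `rewrite_query` function are hypothetical, and a real system would load curated dictionaries rather than hard-code them.

```python
import re

# Hypothetical lookup tables, hard-coded here for illustration only.
PHRASES = {"red wine"}
ACRONYMS = {"fsa": "Financial Services Authority"}
SYNONYMS = {"gun": ["rifle", "pistol", "firearm"]}
NICKNAMES = {"bill": "william"}


def rewrite_query(query: str) -> str:
    """Apply consolidation, expansion and normalization to a raw query."""
    text = query.strip().lower()

    # Term consolidation: a known multi-word phrase becomes a quoted phrase.
    if text in PHRASES:
        return f'"{text}"'

    rewritten = []
    for token in re.findall(r"\w+", text):
        if token in ACRONYMS:
            # Term expansion: acronym OR its spelled-out form.
            rewritten.append(f'({token.upper()} OR "{ACRONYMS[token]}")')
        elif token in SYNONYMS:
            # Semantic expansion: the term OR any of its synonyms.
            alternatives = " OR ".join([token] + SYNONYMS[token])
            rewritten.append(f"({alternatives})")
        else:
            # Normalization: map nicknames to a canonical form.
            rewritten.append(NICKNAMES.get(token, token))
    return " ".join(rewritten)
```

For instance, `rewrite_query("FSA")` yields `(FSA OR "Financial Services Authority")`. Because the rules live in data tables, they can be maintained without touching the user interface code.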

Page 48: II-SDV 2015, 20 - 21 April, in Nice

48

Benefit of Query Processing

• Improved Precision and Recall

– Users want to type just a few terms

– Search engines want users to speak advanced Boolean

• Improved User Experience

– Query processing acts like a skilled interpreter

• Remove the extraneous

• Fill in the details to bridge the gap between human and machine

Page 49: II-SDV 2015, 20 - 21 April, in Nice

49

Query Processing Language - QPL

• Search Engine Independent Server to Process Queries

– Scripting rule-based approach

– Supports maintainability of business logic

– Search engine independence reduces TCO

– Gives search engineers control, where it belongs

• UI engineers should not be controlling queries

– Search Technologies expert services to implement and tune

QPL

Page 50: II-SDV 2015, 20 - 21 April, in Nice

50

DPMS and Aspire - EXAMPLES

Page 51: II-SDV 2015, 20 - 21 April, in Nice

51

DPMS Example #1 – Federal Register

Page 52: II-SDV 2015, 20 - 21 April, in Nice

52

DPMS Example #1 – Federal Register

Page 53: II-SDV 2015, 20 - 21 April, in Nice

53

DPMS Example #1 – Federal Register

Page 54: II-SDV 2015, 20 - 21 April, in Nice

54

DPMS Example #2 – World’s Patent Data

• Consolidation of 80 million XML encoded patents from 95 patent offices into a single, searchable application.

• A long and rich history since 1790 with numerous linguistic, normalization, cleansing, enrichment and data linking challenges

• Forward and backward references

• Assignee, inventor, corporate hierarchies for which normalization is required

• Multiple classification hierarchies which change over time

• Hierarchical claims structure

• Whole document comparison features (similarity search)

• KEY ISSUES: Controlling complexity and handling scale

Page 55: II-SDV 2015, 20 - 21 April, in Nice

55

DPMS Example #2 – World’s Patent Data

Page 56: II-SDV 2015, 20 - 21 April, in Nice

56

DPMS Example #2 – World’s Patent Data

Page 57: II-SDV 2015, 20 - 21 April, in Nice

57

DPMS Example #2 – World’s Patent Data

Page 58: II-SDV 2015, 20 - 21 April, in Nice

58

Document Processing Methodology for Search

• The Philosophy

– Understand the Document Model

– Understand the User Model

• Includes business-level requirements

– Create the Search Engine Model

• Search = the pivot point between User and Data

– Document everything

Page 59: II-SDV 2015, 20 - 21 April, in Nice

59

DPMS – The Methodology

1. Assessment (Search Technologies Architect and Business Analyst)

– Assessment Report: expert assessment and recommendations

2. Detailed Analysis

– DPMS Analysis (Knowledge Engineer, Business Analyst, etc.)

– DMDs

– Review (Architect, Domain Experts, Peers)

3. Execution

– Implementation (Developer)

– Validate DMDs

– Validation

– Aspire

– Search Engine

Page 60: II-SDV 2015, 20 - 21 April, in Nice

60

Business Process Overview

Submission

Ingest Process

Congressional Submission

Workflow (folder)

Migration

Application

Bulk Submission

Process

Preservation

Archival Processing

Workflow

Archival Updating

Workflow

Access

Public User

Access & Delivery

Application

Authorized User

Access & Delivery

Application

Processing

Package Updating

Workflow

Access Processing

Workflow

Publishing Process

ILS Integration

Application

Submission

Process

Congressional Submission Workflow (interactive)

DMD Defines How Data Flows Through System

• What renditions are available?

• How will metadata be extracted and merged?

• What manual edits may be required?

• How are PDF files processed?

• How will the HTML rendition be created?

• How will parser data and input files be validated?

• What’s on the search form?

• How will the content and metadata be indexed?

• What are the navigators?

• How will the MODS be created?

• How are search results formatted?

• What do content URLs look like?

Page 61: II-SDV 2015, 20 - 21 April, in Nice

61

Google Additional Slides

Page 62: II-SDV 2015, 20 - 21 April, in Nice

62

What Search Technologies Provides

• GSA Search Assessment Analysis

• Search application development

• Corporate Wide Search Solution

• SharePoint GSA search integration

• Custom Connectors, such as RightNow, Lotus Connections, Confluence, etc.

• System architecture and design

• Security integration

• Performance analysis and optimization

• Managed Service and 24x7 Support

Page 63: II-SDV 2015, 20 - 21 April, in Nice

63

GSA Assessment Services

• Search Application Assessment

– Requirements gathering and planning

• Entity Recognition Assessment

– Entity identification and implementation planning

• Sensitive Data Assessment

– Data security above and beyond document-level ACL compliance

Page 64: II-SDV 2015, 20 - 21 April, in Nice

64

Customer Examples

• EMC – Storage Platform

– Corporate Wide Search Platform for internal users and partners

– Aspire connectors: SharePoint, File system, Database, eRoom, JIVE, Teamforge, Socialcast

• Isilon Systems – Storage Platform

– Customer Support – RightNow Connector

– Sales – Salesforce.com

• Amirsys – Medical Diagnosis

– Decision Support Portal

– Used by 40,000 physicians in 50 countries

• Savvis – Service and Web Hosting Company

– Command Center application

– SharePoint Connector

Page 65: II-SDV 2015, 20 - 21 April, in Nice

65

Case Study Slides

Page 66: II-SDV 2015, 20 - 21 April, in Nice

66

Example Customers

Corporate Wide Search

• EMC

• Norton Rose (FAST ESP) – Application Management, Technical Support

• PTC (FAST ESP) – Tier 3 Technical Support, Hosting

• BNA (Solr/Lucene) – Application Management, Consulting, Hosting

• Unilever (Verity K2, RetrievalWare, FAST ESP) – Application Management, Consulting

• NXT Customers (NXT) – 40 Hosted NXT Applications

• Chick-Fil-A (FAST ESP) – Application Management, Consulting

• Seagate - GSA-based CWS. 3 connectors + Aspire Enterprise framework to normalize

Data Warehouse (Big Data)

• State Compensation Insurance Fund

E-Commerce

• Nordstrom

• Apple (anonymous)

• Samsung (anonymous)

Search & Match

• Adecco (anonymous) and/or Jobsite

Media & Publishing

• Reed Elsevier (Reconstruction Data etc.)

• CPA (FAST ESP) – Application Management, Consulting, Hosting

• Haymarket

• Gartner (?)

Government

• GPO (FAST ESP) – Application Management, Consulting

• Library of Congress (FAST ESP) – Application Management, Consulting, Hosting

• NARA – National Archives – Application and Infrastructure Architecture and Development, Consulting

• OLRC

Need more examples in different solution areas. Maybe not so many on CWS.

Page 67: II-SDV 2015, 20 - 21 April, in Nice

67

Corporate Wide Search / Enterprise Search

Page 68: II-SDV 2015, 20 - 21 April, in Nice

68

Comcast

Page 69: II-SDV 2015, 20 - 21 April, in Nice

69

Comcast

Background

• Built on Solr/Lucene

• Largest cable and home internet provider in the US

• Search Technologies provides expert architecture, design and development services to in-house team.

• Replacement of a home-grown system with new Solr / Hadoop application used to service set-top box requests and browsing of TV listings by subscribers

Page 70: II-SDV 2015, 20 - 21 April, in Nice

70

Comcast

Key Details

• Very fast indexer - 500 records per second

• Recommendations engine processes 2.8 billion records in 8 hours (down from 24 hours).

• Vote-counting recommendations algorithm calculates recommended movies and TV shows for a million movies and shows in Comcast’s library.

• Millisecond search response - using Solrj

• Integration with and improvement of existing ranking/grouping/boosting rules
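The slide does not describe how the vote-counting algorithm works. One common reading of the term is co-occurrence counting, where every viewing history that contains two titles casts a vote linking them. The Python sketch below illustrates that reading only; it is an assumption, not the Comcast implementation, and `build_votes` and `recommend` are hypothetical names.

```python
from collections import Counter, defaultdict
from itertools import permutations
from typing import Dict, Iterable, List, Set


def build_votes(histories: Iterable[Set[str]]) -> Dict[str, Counter]:
    """Each viewing history casts one vote for every ordered pair of titles."""
    votes: Dict[str, Counter] = defaultdict(Counter)
    for titles in histories:
        for source, target in permutations(sorted(titles), 2):
            votes[source][target] += 1
    return votes


def recommend(votes: Dict[str, Counter], title: str, k: int = 3) -> List[str]:
    """Return up to k titles with the most votes, ties broken alphabetically."""
    ranked = sorted(votes.get(title, Counter()).items(),
                    key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:k]]
```

At the scale quoted above, the pair counting would be distributed (for example as a batch job over partitioned histories) rather than held in one in-memory dictionary.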

Page 71: II-SDV 2015, 20 - 21 April, in Nice

71

Capital Group

Page 72: II-SDV 2015, 20 - 21 April, in Nice

72

Capital Group

Background

• Built on FAST ESP

• Global investment and financial management firm

• Search Technologies built the complete solution

• Intranet search portal serving multiple applications and departments covering every aspect of the business

• Searching prior customer communications, presentations, and legal documents

• Used by every aspect of the business

Page 73: II-SDV 2015, 20 - 21 April, in Nice

73

Capital Group

Key Details

• New, highly customised search user interface

• Migration from legacy RetrievalWare system

• Core technologies: Java Server Faces, Weblogic 9.2, Apache Web Services API, Apache commons, Embedded Java DB

• Support for Chinese and Japanese

• Customised feeding and document processing

• Data resides in Documentum, Lotus Notes, Oracle

• Full Windows AD-based security

Page 74: II-SDV 2015, 20 - 21 April, in Nice

74

SAIC

Page 75: II-SDV 2015, 20 - 21 April, in Nice

75

SAIC

• Background

• Built on Google GSA

• Large Government-focused systems integrator

• Search Technologies provides expert services

• Intranet application

Page 76: II-SDV 2015, 20 - 21 April, in Nice

76

SAIC

• Key Details

• Indexing SharePoint Cluster of 50 Site collections

• Hundreds of User-Managed Sub-Sites

• Document-level security and NTLM authentication

• XSLT customization to display fields according to document type

• Massive expansion planned

Page 77: II-SDV 2015, 20 - 21 April, in Nice

77

Media & Publishing

Page 78: II-SDV 2015, 20 - 21 April, in Nice

78

Yellowpages.com

Page 79: II-SDV 2015, 20 - 21 April, in Nice

79

Yellowpages.com

Background

• Originally Built on FAST ESP. Recently migrated to Solr/Lucene

• World’s Leading Internet Yellowpages Site

• Owned by AT&T

• Search Technologies involved since 2005 providing expert services on both FAST ESP and Solr/Lucene

Page 80: II-SDV 2015, 20 - 21 April, in Nice

80

Yellowpages.com

Key Details

• Business Listings available for all 50 states

• Massively scalable search clusters in 2 data centers

• ATG based JSP GUI

• Oracle content updated daily

• Handling over 2000 queries per second

• Linguistic work (spellings, synonyms)

Page 81: II-SDV 2015, 20 - 21 April, in Nice

81

GPO.gov

Page 82: II-SDV 2015, 20 - 21 April, in Nice

82

GPO.gov

Background

• Built on FAST ESP & Documentum

• The publishing arm of the Federal Government

• Search Technologies is the main contractor for search, including architecture, design, development and implementation

• The Federal Digital System www.gpo.gov/fdsys provides public access to information provided by Congress and other Federal agencies

“The GPO and the Office of the Federal Register accomplished a minor miracle in warp speed time” - Ray Mosley, Director of the Federal Register

Page 83: II-SDV 2015, 20 - 21 April, in Nice

83

GPO.gov

Key Details

• 50+ data sources, each with its own legacy, format & purpose, including US Laws, Congressional Reports, Daily Congressional Records, Economic Indicators, Reports to the President and the Budget of the US Government

• Developed a document processing infrastructure to prepare incoming data sets for indexing

Page 84: II-SDV 2015, 20 - 21 April, in Nice

84

Computer Patent Annuities (CPA)

Page 85: II-SDV 2015, 20 - 21 April, in Nice

85

Computer Patent Annuities (CPA)

Background

• Built on FAST ESP

• Leading legal/intellectual property services provider

• Search Technologies is providing the complete solution

• A major new patent search application involving 90MM patents from 100+ authorities around the world

Page 86: II-SDV 2015, 20 - 21 April, in Nice

86

Computer Patent Annuities (CPA)

Key Details

• Data cleansing, normalization & enrichment

• Establishing new relationships between patents

• Fast “similarity searching” requiring highly optimized indexes

• Collaborative tools for patent research teams

• Search-driven BI features in SharePoint