Document Mining Analysis Study for 2Q2015 - Data...
Document Mining Analysis
Study for 2Q2015
Study Scope and Deliverables
Kick-off Meeting
Date: 5-01-2015
Rembrandt Group LLC
Agenda
• Study
• Scope
• Assumptions
• Deliverables
• Architecture
• Time Line
• Project Plan
• Outstanding Issues
Document Mining Analysis
Kickoff Meeting & Discussion
Document Mining Analysis
Project Objectives
1. Document an initial understanding of the current data requirements of a proposed system that uses KMF to search three types of files (MS PowerPoint, MS Word and PDF) for information specified by the user. The information is to be formatted into a template, such as an Excel spreadsheet, for further processing by the user (the Client)
2. Specify and define a systems architecture to support the above by adding another level of detail to the analytics that the Client currently uses
3. Determine the data and business definitions
4. Define the computations and summation of details gathered by KMF
5. Define the scope of the project (sources, levels of computation and target reports)
6. Review the information templates provided by the Client for sources and target end-user reports or summations of KMF search findings
7. Create a Cutover Plan for turning over the applications to the Client
Document Mining Analysis
Project Objectives (cont’d)
KMF is to be used to build a pilot application for the Client to search various document types (MS PowerPoint, MS Word and PDF) in sources located on a hard drive or server (to be determined)
Project Name: Document Mining Analysis
Basic High level System Functionality:
1. The system will take input from the user and populate a template (assumed to be Excel)
2. The specification for selecting a word, string or series of statements as input needs to be finalized
3. Sources are PowerPoint (approx. 500 files), Word (approx. 200 files) and PDF (approx. 100 files)
4. The user will be able to make and execute selections from the sources via KMF according to search criteria specifications
5. According to the requirement, KMF will access the selected sources, combine sources, execute any logic and present the results
6. Upon user approval, the results will be downloaded to an Excel spreadsheet in a format to be defined
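As an illustration of the source-selection step only, the following is a minimal sketch in plain Python, not KMF itself, whose interface is not described in this deck. The function name `discover_sources`, the `SOURCE_TYPES` mapping and the file extensions are assumptions made for the example; the deck names only the three document types.

```python
from collections import defaultdict
from pathlib import Path

# Assumed mapping from file extension to document type (not taken from the deck).
SOURCE_TYPES = {
    ".ppt": "PowerPoint",
    ".pptx": "PowerPoint",
    ".doc": "Word",
    ".docx": "Word",
    ".pdf": "PDF",
}


def discover_sources(root):
    """Walk `root` recursively and group supported files by document type."""
    found = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        doc_type = SOURCE_TYPES.get(path.suffix.lower())
        if doc_type and path.is_file():
            found[doc_type].append(path)
    return dict(found)
```

Calling `discover_sources` on the agreed directory would give a per-type file inventory that could be checked against the approximate counts listed above before any search is run.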
Document Mining Analysis: Solution Architecture
[Diagram: solution architecture. Flow recovered from the diagram labels:]
1. User supplies input to spreadsheets*; the input is captured and sent to KMF
2. KMF finds and connects the sources in Server Directory A (C:\sub\sub\): PowerPoint (500 files), Word (200 files) and PDF (100 files) (Note: Server & Directories TBD)
3. KMF extracts values from the files, combines sources, looks for strings or values, performs logic and computations, formats the results and returns the results to users
4. Output goes to Excel spreadsheets*; the output is captured and sent to the user's PC
Note: Logic and computation formulas are defined by the Client Analytics (AA)
*Note: Spreadsheet requirements are defined by AA
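The search-and-format portion of this flow can be sketched in plain Python. This is a hedged illustration under stated assumptions, not the KMF engine: it assumes the text of each file has already been extracted into a dictionary keyed by file name (extraction from PowerPoint, Word and PDF would need additional tooling), and the names `find_matches` and `write_csv` are hypothetical. CSV is used as a stand-in for the Excel output, whose format is still to be defined.

```python
import csv


def find_matches(texts, terms):
    """Return one row per (source, term) pair with a case-insensitive hit count.

    `texts` maps a source file name to its already-extracted text (assumed input).
    """
    rows = []
    for source, text in texts.items():
        lowered = text.lower()
        for term in terms:
            if not term:
                continue  # skip empty search terms
            count = lowered.count(term.lower())
            if count:
                rows.append({"source": source, "term": term, "count": count})
    return rows


def write_csv(rows, handle):
    """Write result rows to an open text handle as CSV, which Excel can open."""
    writer = csv.DictWriter(handle, fieldnames=["source", "term", "count"])
    writer.writeheader()
    writer.writerows(rows)
```

Any business logic or computations defined by the Client would sit between the two functions, operating on the rows before they are written out.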
Document Mining Analysis
Project Assumptions - Phase I
The Client Analytics (AA) to provide or approve:
• Input (variables) to launch spreadsheet (define timeframes between x and x, all, etc.)
• Any business rules for selecting sources & timeframe
• 500 PowerPoint, 200 Word and 100 PDF documents on a disk or server for the Rembrandt team to use for analysis
• Definition & finalization of sources to be searched
• Processing logic
• Passwords or IDs to access sources
• Output format (flat file, database or Excel spreadsheet)
Rembrandt to provide:
• KMF engine to execute system
• Resources to perform analysis, design, implementation, testing & deployment
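To make the inputs in the first list concrete, the following is a minimal sketch of how a search request might be captured once the Client has defined them. Everything here is an assumption for illustration: the class name `SearchRequest`, its fields and the format labels are hypothetical, and an open-ended timeframe is modelled by leaving `start` or `end` unset.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple

# Output options named in the assumptions list; the labels are illustrative.
OUTPUT_FORMATS = {"flat_file", "database", "excel"}


@dataclass(frozen=True)
class SearchRequest:
    terms: Tuple[str, ...]
    start: Optional[date] = None  # None means no lower bound ("all")
    end: Optional[date] = None    # None means no upper bound ("all")
    output_format: str = "excel"

    def __post_init__(self):
        if not self.terms:
            raise ValueError("at least one search term is required")
        if self.output_format not in OUTPUT_FORMATS:
            raise ValueError(f"unsupported output format: {self.output_format}")
        if self.start and self.end and self.start > self.end:
            raise ValueError("start must not be after end")

    def covers(self, day):
        """Return True if `day` falls inside the requested timeframe."""
        if self.start and day < self.start:
            return False
        if self.end and day > self.end:
            return False
        return True
```

Validating the request up front keeps undefined timeframes or output formats from reaching the search step.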
Document Mining Analysis
Project Objectives: Phases 1 & 2
1. Phase 1 (@ 3 months): Design a Cutover Plan for turnover of Phase 1
Option 1: Customer takes ownership of the system & system is transferred to customer site
Option 2: Customer keeps system offsite & agrees to Phase 2 before transferring internally
2. Phase 2: Design, Build, Test & Deploy Phase 2
Define sources: internal & external
Develop tasks for harmonizing data definitions, models & templates
Develop tasks for selecting & finalizing technical architecture/infrastructure components
Develop actionable tasks for developing the detailed design for this project
Develop a resource plan & appropriate IT structure to support system
Develop a time line for executing the plan & projects
Document Mining Analysis: Approach/Timing
Workplan
Time Line and High Level Project Plan
to be determined after finalization of requirements
Document Mining Analysis: Pricing
Pricing to be determined after finalization of requirements:
1. Number of Sources
2. Processing, Logic, Computations
3. Data to be returned from search and in what desired format
4. Process Flow, Screen Shots, etc.
5. Output Format (Excel, flat file, database)
6. Acceptance Criteria
Supplemental Slides
(Used to Enhance Discussion)
Bring in Knowledge From Anywhere With KMF
KMF Makes it Simple
KMF
KMF Today
History: Artificial Intelligence specialization; Fortune 500 clients; granted 4 US patents for Artificial Intelligence
Product: KMF Platform
Market: All industries and sectors needing complex data processing
Partial list of companies using KMF: Navy, Army, L.A., Louisiana
Market Research Competitive Analytics with KMF Enabled
[Diagram: data sources feed the KMF Engine, which serves reports, portals and analytics to users. Labels recovered from the diagram:]
Data sources (labelled Internal Data and External Data in the diagram):
• Flat Files Data
• Multiple Systems
• Planning & Sales SD-AP
• Financial Data (SAP Product MSTR)
• Reports to Field Forces
• Libraries
• Emails, Web monitoring
• External Libraries
• Internet News
• Social Media Sites
• Internet Sites
• Rentrack
• Nielsen
• IRI
• Social Media
• DW, DM, CRM, SFA
• Paid DBs
• BrainJuicer
KMF Engine
• Control of Information
• No Coding
• Fast and accurate
• 24x7x365 updates
• Not Millions like DW
Reports Users:
• Home Office
• Sales (RD DMs)
• Reps
Marketing
• VPs
• Analysts
• Ad Hocs
• Primary Research
• Secondary Research
• Reports
• Graphics
Competitive Analytics
• Category
• Brand/SKU
• Product
Strategic Planning
KMF Sales/MRE Mgmt Portal
Competitive Information to
• Reps
• Product
• Channels
• Distribution
• Stores
Sales Market Research Portal
• Charts
• Graphics
Benefits with KMF
• Better Data
• No Coding
• Faster Access to results
• No Manual Effort
• Reduces Total Effort
• Reduces research Costs
• Faster Products to Market