Kon Leong, CEO
www.ZLTI.com
Turning The RIM Tide: How To Channel The Data Tsunami Into A Data Lake
May 18th, 2015, 2:30 PM

Turning the RIM Tide: How To Channel The Data Tsunami Into A “Data Lake”
● The 3 Steps To The Data Lake
  1. Gather All The Data
  2. Enable All The Functions
  3. All Together Now – The Data Lake
● Rethinking RIM … Again
● Back To The Future: From IG To Information Management
● Case Studies

Unified Collection of Unstructured Big Data
● All Data Types: E-mail (Exchange, Lotus), SharePoint, Quickr, File System, XML, BlackBerry, Bloomberg, IM, PST/NSF, MIME, MSG, OCS, ECM, Fax, Social Media
● All Collection Modes
  - Proactive Archiving (by policy)
  - Index In-Place (without storing a copy)
  - Reactive Archiving (on demand)
● All Data Locations
  - Data-in-the-Cloud: Office 365, Gmail, Salesforce, etc.
  - Data-On-Premise
UNIFIED Collection

Governance Functions
(User Need → Solution)
● Reduce Storage Overload → Storage Management*
  - Dedupe/Offload Large Email/Files
  - Avoid Storage Quota / Admin O’H
  - End-User Access, Productivity
● Stay In Control, Minimize R.O.I. → Corporate / Gov’t Compliance
  - 100% Capture, Index, Store, Search
  - Tamper-Proofing
  - Monitoring & Surveillance
● Support Litigation, Slash Costs → Litigation Support, eDiscovery*
  - Search, Hold, Attorney-Client
  - Save Discovery Costs
  - Case Management
● Manage Electronic Corporate Records → Records Retention Management*
  - Enable Electronic Records Retention
  - Enforce Doc Level Granular Policy
● “Big Data” Analytics for Competitive Advantage → Corporate eMemory Mgmt*
  - Enable Enterprise Data Mining
  - “Classified” Secure Access, Audits
(Slide labels: 10X; circa 2000, 2002, 2005, 2008, 2012)
“Silos” = duplicates; inconsistent search; disjointed retention; differing views; loss of data control
Solution: Unified Archive: One Copy, One Policy Control Point, One Search, One View
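To make the "One Copy" idea concrete, here is a minimal sketch of single-instance storage, assuming content is deduplicated by a SHA-256 digest of its bytes. The class name `SingleInstanceStore`, the source labels, and the sample payloads are hypothetical and chosen only for illustration; the slides do not describe how the product actually implements deduplication.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class SingleInstanceStore:
    """Toy content-addressed store: identical payloads are kept only once."""
    blobs: dict = field(default_factory=dict)  # digest -> payload bytes
    refs: dict = field(default_factory=dict)   # digest -> set of (source, item_id)

    def ingest(self, source: str, item_id: str, payload: bytes) -> str:
        """Store the payload if it is new and record where it came from."""
        digest = hashlib.sha256(payload).hexdigest()
        self.blobs.setdefault(digest, payload)  # a second copy is never written
        self.refs.setdefault(digest, set()).add((source, item_id))
        return digest

    def stored_bytes(self) -> int:
        """Bytes physically held after deduplication."""
        return sum(len(p) for p in self.blobs.values())


store = SingleInstanceStore()
attachment = b"Q3 forecast spreadsheet contents"
other = b"an unrelated message"

# The same attachment arrives from three hypothetical silos.
store.ingest("exchange", "msg-001", attachment)
store.ingest("file-share", "forecast.xlsx", attachment)
store.ingest("sharepoint", "doc-17", attachment)
store.ingest("exchange", "msg-002", other)

logical_bytes = 3 * len(attachment) + len(other)
print(len(store.blobs), "unique blobs;",
      store.stored_bytes(), "of", logical_bytes, "logical bytes stored")
```

Because every silo's reference points at the same digest, a search or policy decision made against that one stored copy applies to all three sources at once.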

Getting It All Together
● UNIFIED Applications: e-Discovery, Compliance, Records, Storage, Analytics
  Benefits: One central console; Single data copy; No data moved between apps
● UNIFIED Architecture: One Data Copy, One Policy Control Point, One Search, One Data Schema
  Benefits: Much Lower Admin O’H; Much Lower Storage Costs; Much Faster Performance
● UNIFIED Data Types: Domino, Exchange, Files, ECM, ERP, Twitter, IM, Soc. Media, Salesforce
  Benefits: Ingest all data types; All stored together, no silos; Notes/Exchange – no migration
DATA CONTROL
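One way to picture "One Policy Control Point" is a single retention function that is applied to every item regardless of data type. The sketch below is an illustration under stated assumptions: the `Item` fields, the category labels, and the retention periods in `RETENTION_DAYS` are all hypothetical and do not reflect any regulatory schedule or the vendor's actual policy engine.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Item:
    item_id: str
    data_type: str   # e.g. "email", "file", "im"
    category: str    # hypothetical classification label
    created: date
    on_legal_hold: bool = False


# Hypothetical retention schedule in days, keyed by classification.
RETENTION_DAYS = {
    "financial-record": 7 * 365,
    "correspondence": 3 * 365,
    "non-record": 90,
}


def disposition(item: Item, today: date) -> str:
    """Return 'hold', 'retain', or 'dispose' for an item of any data type."""
    if item.on_legal_hold:
        return "hold"      # a legal hold overrides the schedule
    days = RETENTION_DAYS.get(item.category)
    if days is None:
        return "retain"    # unclassified items are kept until classified
    expires = item.created + timedelta(days=days)
    return "dispose" if today >= expires else "retain"


today = date(2015, 5, 18)
items = [
    Item("m1", "email", "non-record", date(2015, 1, 5)),
    Item("f1", "file", "correspondence", date(2014, 3, 1)),
    Item("i1", "im", "non-record", date(2014, 1, 1), on_legal_hold=True),
]
for it in items:
    print(it.item_id, it.data_type, disposition(it, today))
```

The point of the design is that email, files, and IM all pass through the same `disposition` call, so a schedule change is made in one place rather than once per silo.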

The Architecture At 30,000 Feet
● External Data: Hadoop, eDiscovery & Other Ecosystems
● Applications: Analytics*, Storage, Records*, Compliance, E-Discovery*
● Cloud Options; Hadoop-compatible Infrastructure
● Data Management Layer, aka “Data Lake”: Unified Archive
  - Repository must scale to BILLIONS of items
  - ONE Copy/Policy/Search/Data Schema/System
  - Capture: Single Instance │ Copy │ In-Place │ Crawl │ Stub │ Classify │ OCR │ Encrypt │ Compress │ Restore
● Data Creation Layer
  - STRUCTURED: Event Logs, Sys/Net Logs; Structured Data (ERP, BI)
  - UNSTRUCTURED: Email (Exchange, Domino), IM, Shared Files, SharePoint, PST, NSF, ECM, Social Media, Salesforce, Gmail
(Other slide labels: farm; RIM, e-D, Storage, Analytics; “lake”)

CASE STUDY: A Top Wealth Management Firm
Files, Email, IM, Bloomberg, Social Media
Compliance, Records, e-Discovery
Challenge
As one of the top broker-dealers in the U.S., this firm manages over 30,000 broker-dealers and employees. The company needed to manage files and email for SEC compliance, e-discovery, records management and storage optimization. Previous solutions were unstable and did not meet minimum requirements.
Solution
Deployed Unified Data Management Platform to provide compliance, e-discovery and records management for files, email and many other file types. Currently evaluating early use cases for Analytics.
Value
● Satisfied SEC compliance requirements on document retention, while cutting costs
● Provided highly available access
● Reduced server counts by 75%, while significantly increasing throughput
● Centralized data flow from multiple silos
● Solved multiple data coherency issues caused by silos under one unified system
● Expected boost in marketing and management performance through analytics

The Future Of RIM
eTrash or eTreasure?
● Records vs. Non-Records
● “Big Data Analytics”, “Data Lake”, “Unified Repository”
● Retention? Access Privileges? Classification? Audit?
● Need for RIM: 0% to 100%
● Scope of New Records Management

Who Will “Own” It?
The Power of Unstructured BIG DATA
Analytics, Legal, Compliance, IT, Records
● Acquisition: Committee
● Ownership: GC, CRO, CCO, IT
● Operation: IT, RIM, GC, CCO
● Analytics: CDO, CDS, 3rd Party

SUMMARY
Turning the RIM Tide: How To Channel The Data Tsunami Into A “Data Lake”
● Avoid Data Silos
● Take A Unified Approach
  1. Add Info Governance
  2. Add Analytics
● Expand Scope Of RIM
● Prepare For Life Beyond IG

Analytics for Strategic Advantage with Corporate eMemory®
There are already 4 ROIs to cost-justify the archive: COMPLIANCE, E-DISCOVERY, STORAGE, RECORDS
“Unified Archive”: Unstructured / Structured Big Data
Analytics Use Cases
● Reduce Costs and Risks (Defensive)
● Make Money, Maximize Value (Offensive)*
● Use cases shown: IP – Prior Art, KM, Governance, BoD, HR*, SCM, Investigations*, R&D, Sentiment, Trending Topics, Retiree*, Expert Network, Rev Rec, Effectiveness*, Sales Analysis*

Employee Effectiveness
Powerful Technology, Competitive Advantage
● Data Sources: Phone Calls, Documents, Web Logs, Calendars
● Inferences: Influence, Expertise, Leadership, Efficiency, Effectiveness
Privacy vs. Control: Dial For Privacy

People Analytics – Early Sample Use Cases
HR/Sales
● Who knows whom?
● Who is likely to quit?
● Analyze transactions, networks, roles*
● When were our last visits to our top twenty clients?
● Competitive intelligence on demand
● Avg. 1,100 connections per salesperson. Map them.
● Reduce the probation period from 12 months to 1 month
HR/Expert Network
● Who knows what? Domain expert database
● Who fixed what when? Replay events for crisis mgmt., product recalls, product liability
● Retain knowledge of retired/moved/former employees
● Identify experts, offer contracts
HR/Human Networking
● Who knows whom (internal ↔ external)
● Strength rankings of client/employee relationships
HR/Productivity/Sentiments
● Who are the most productive employees?
● Workforce sentiment
● Top Trending Topics
HR/Leadership
● Who are the real “go-to” people, the most respected?
● Who should be promoted?
Compliance/SarbOx
● Revenue recognition detection. Identify side-letters
● Detect bribery, collusion, FCPA violations
Security/Threat Prevention
● 80% of threats come from internal vs. external sources
● Prevention – Detect anomalies vs. benchmark
● Information Vectors – Detect Ed Snowdens in progress
Security/Investigations
● Ad hoc investigations, organization-wide, in seconds
● Replay events, instantaneously
Transparency/Org. Politics
● Instantaneous transparency, identify political friction
Post-Merger Integration
● Track progress, view bottlenecks, expose politics
R&D Resource Pooling
● Coordinate global R&D, leverage past research, eliminate duplication, shorten timeframe
● R&D patent / prior art research
Legal
● De-dupe legal advice (reduce outside counsel fees)
● Reuse legal language, check variances
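The "Who knows whom" and "strength rankings" use cases above can be sketched as a count of messages exchanged between each pair of addresses. This is only an illustration under simple assumptions: the message tuples, the `corp.example` domain, and the function names are hypothetical, and raw message count is a stand-in for whatever strength measure a real system would use.

```python
from collections import Counter

# Hypothetical message metadata: (sender, list of recipients).
messages = [
    ("alice@corp.example", ["bob@client.example"]),
    ("bob@client.example", ["alice@corp.example"]),
    ("alice@corp.example", ["bob@client.example", "carol@corp.example"]),
    ("dave@corp.example", ["bob@client.example"]),
    ("carol@corp.example", ["alice@corp.example"]),
]

INTERNAL_DOMAIN = "corp.example"  # assumed company domain


def is_internal(addr: str) -> bool:
    return addr.endswith("@" + INTERNAL_DOMAIN)


def relationship_strength(msgs):
    """Count messages exchanged per unordered pair of addresses."""
    strength = Counter()
    for sender, recipients in msgs:
        for rcpt in recipients:
            if rcpt != sender:
                strength[frozenset((sender, rcpt))] += 1
    return strength


def external_contacts(strength):
    """Rank internal-to-external pairs by message count, strongest first."""
    ranked = []
    for pair, count in strength.items():
        a, b = sorted(pair)
        if is_internal(a) != is_internal(b):
            employee, contact = (a, b) if is_internal(a) else (b, a)
            ranked.append((employee, contact, count))
    return sorted(ranked, key=lambda r: (-r[2], r[0], r[1]))


strength = relationship_strength(messages)
ranked = external_contacts(strength)
for employee, contact, count in ranked:
    print(employee, "<->", contact, count)
```

Counting unordered pairs treats a reply the same as an original message, which is a deliberate simplification; weighting by recency or direction would be a natural refinement.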

Gartner Says Beware of the Data Lake Fallacy
Stamford, Conn., July 28, 2014
Information Leaders Must Understand the Gaps in Data Lake Concept and Take Necessary Precautions
The growing hype surrounding data lakes is causing substantial confusion in the information management space, according to Gartner, Inc. Several vendors are marketing data lakes as an essential component to capitalize on Big Data opportunities, but there is little alignment between vendors about what comprises a data lake, or how to get value from it.
"In broad terms, data lakes are marketed as enterprise-wide data management platforms for analyzing disparate sources of data in its native format," said Nick Heudecker, research director at Gartner. "The idea is simple: instead of placing data in a purpose-built data store, you move it into a data lake in its original format. This eliminates the upfront costs of data ingestion, like transformation. Once data is placed into the lake, it's available for analysis by everyone in the organization."
However, while the marketing hype suggests audiences throughout an enterprise will leverage data lakes, this positioning assumes that all those audiences are highly skilled at data manipulation and analysis, as data lakes lack semantic consistency and governed metadata …
You have to manage your Data Lake – the fallacy of technology being magic
Steve Jones, Capgemini August 5, 2014 (excerpts)
Gartner published a report calling Data Lakes a fallacy in which they point out many of the issues with an unmanaged Hadoop environment. It’s a great headline but actually the paper itself raises exactly the points that we made in Capgemini back in 2011 about what companies should be doing in this space. Back then we published a paper on Mastering Big Data which talked about how data governance was a core requirement to get value out of Big Data.
Gartner raised a very valid point, basically that Hadoop isn’t magic. Just dumping data into a single repository doesn’t mean that it’s now magically easy to use. But dismissing data lakes altogether is “throwing the baby out with the bathwater.”
It’s great that Gartner is highlighting the need for governance and the need for considering both structured and unstructured information in a unified manner.
At Capgemini, we’ve been championing governance in a Big Data world and it’s great to have Gartner agreeing with us.

Analytics – Today vs. Tomorrow
Today: Working on Dead Grass
● Lot of labor / time to collect
● Data may be old, incomplete
● Adds data silos, more risk
● Reactive E-Discovery, Reactive Analytics
Tomorrow: Working on Living Grass
● Tap into data, instantaneously
● Data is up to date, complete
● Data is under governance, no silos
● Proactive E-Discovery, Proactive Analytics