Big Data Visualization: Turning Big Data Into Big Insights – White ...
Big Data Visualization
Transcript of Big Data Visualization
![Page 1: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/1.jpg)
Raffael Marty, CEO
Big Data Visualization
London, February 2015
![Page 2: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/2.jpg)
Security. Analytics. Insight.
Overview
• Visualization
• Design Principles
• Dashboards
• SOC Dashboard
• Data Discovery and Exploration
• Data Requirements for Visualization
• Big Data Lake
![Page 3: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/3.jpg)
I am Raffy - I do Viz!
IBM Research
![Page 4: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/4.jpg)
Visualization
![Page 5: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/5.jpg)
Why Visualization?
the stats ...
http://en.wikipedia.org/wiki/Anscombe%27s_quartet
the data...
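The point of the quartet can be checked numerically. The following is a minimal sketch, assuming the standard published Anscombe values; the helper names (`mean`, `pearson`, `summarize`) are illustrative and not from the slides.

```python
from math import sqrt

# Anscombe's quartet, using the standard published values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
X4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (X4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def mean(values):
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def summarize(quartet):
    """Return (mean x, mean y, correlation) per dataset, rounded to 2 digits."""
    return {
        name: (round(mean(xs), 2), round(mean(ys), 2), round(pearson(xs, ys), 2))
        for name, (xs, ys) in quartet.items()
    }


SUMMARY = summarize(QUARTET)
for name, stats in SUMMARY.items():
    print(name, stats)
```

All four datasets yield the same rounded triple, even though their scatter plots look nothing alike; only plotting the data reveals the difference.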
![Page 6: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/6.jpg)
Why Visualization?
http://en.wikipedia.org/wiki/Anscombe%27s_quartet
Human analyst:
• pattern detection
• remembers context
• fantastic intuition
• can predict
![Page 7: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/7.jpg)
Visualization To …
• Present / Communicate
• Discover / Explore
![Page 8: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/8.jpg)
Design Principles
![Page 9: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/9.jpg)
Choosing Visualizations
• Objective
• Audience
• Data
![Page 10: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/10.jpg)
For Example - Lateral Movement
• Objective: Find attackers moving laterally in the network
• Defines the data needed (netflow, sflow, …)
• Maybe restrict to a network segment
• Audience: security analyst, risk team, …
• Informs how to visualize / present the data
Recon → Weaponize → Deliver → Exploit → Install → C2 → Act
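The objective above translates directly into a data filter. This is a minimal sketch, assuming flow records have already been parsed into (source, destination, port) tuples; the sample records, the `SEGMENT` value, and the function names are hypothetical.

```python
import ipaddress

# Hypothetical, already-parsed flow records: (source IP, destination IP, destination port).
FLOWS = [
    ("10.8.24.80", "10.8.50.85", 445),
    ("10.8.24.80", "8.8.8.8", 53),
    ("10.9.79.6", "10.8.48.128", 3389),
    ("192.168.1.5", "10.8.24.80", 22),
]

# Assumed network segment under investigation.
SEGMENT = ipaddress.ip_network("10.8.0.0/16")


def is_internal(ip):
    """Treat private (RFC 1918 style) addresses as internal hosts."""
    return ipaddress.ip_address(ip).is_private


def lateral_candidates(flows, segment):
    """Keep internal-to-internal flows whose destination lies in the segment."""
    result = []
    for src, dst, port in flows:
        if not (is_internal(src) and is_internal(dst)):
            continue  # traffic to or from the outside is not lateral movement
        if ipaddress.ip_address(dst) in segment:
            result.append((src, dst, port))
    return result


CANDIDATES = lateral_candidates(FLOWS, SEGMENT)
for flow in CANDIDATES:
    print(flow)
```

The filter only narrows the data; deciding which of the remaining flows are suspicious is what the visualization is for.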
![Page 11: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/11.jpg)
Principles of Analytic Design
by Edward Tufte
• Show comparisons, contrasts, differences
• Show causality, mechanism, explanation, systematic structure
• Show multivariate data; that is, show more than 1 or 2 variables
![Page 12: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/12.jpg)
Show Context
42
![Page 13: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/13.jpg)
Show Context
42 is just a number and means nothing without context
![Page 14: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/14.jpg)
![Page 15: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/15.jpg)
Use Numbers To Highlight Most Important Parts of Data
• Numbers
• Summaries
![Page 16: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/16.jpg)
Add Context
Additional information about objects, such as:
• machine: roles, criticality, location, owner, …
• user: roles, office location, …
[Diagram: source and destination annotated with machine and user context (machine role, user role)]
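Adding context to a flow record can be sketched as a lookup against context tables. The tables, field names, and sample values below are hypothetical; in practice they might come from an asset inventory and a directory service.

```python
# Hypothetical context tables (assumptions for illustration only).
MACHINE_CONTEXT = {
    "10.8.24.80": {"role": "workstation", "criticality": "low", "owner": "alice"},
    "192.168.148.193": {"role": "database server", "criticality": "high", "owner": "dba-team"},
}
USER_CONTEXT = {
    "alice": {"role": "engineer", "office": "London"},
}


def enrich(flow):
    """Return a copy of the flow with machine and user roles for both endpoints."""
    enriched = dict(flow)
    for side in ("source", "destination"):
        machine = MACHINE_CONTEXT.get(flow[side], {})
        enriched[f"{side}_machine_role"] = machine.get("role", "unknown")
        owner = machine.get("owner")
        # Fall back to "unknown" when no user record exists for the owner.
        enriched[f"{side}_user_role"] = USER_CONTEXT.get(owner, {}).get("role", "unknown")
    return enriched


ENRICHED = enrich({"source": "10.8.24.80", "destination": "192.168.148.193"})
print(ENRICHED)
```

Unknown machines and users are kept rather than dropped, since a missing context entry can itself be worth showing.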
![Page 17: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/17.jpg)
Traffic Flow Analysis With Context
![Page 18: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/18.jpg)
Aesthetics Matter
• Black background
• Blue or green colors
• Glow
http://www.scifiinterfaces.com/
![Page 19: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/19.jpg)
B O R I N G
![Page 20: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/20.jpg)
Sexier
![Page 21: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/21.jpg)
Dashboard Design Principles
• Audience, audience, audience!
• Comprehensive information (enough context)
• Highlight important data
• Use graphics when appropriate
• Good choice of graphics and design
• Aesthetically pleasing
• Enough information to decide if action is necessary
• No scrolling
• Real-time vs. batch? (Refresh rates)
• Clear organization
![Page 22: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/22.jpg)
SOC Dashboards
![Page 23: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/23.jpg)
Mostly Blank
![Page 24: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/24.jpg)
Dashboards For Discovery
• Disappears too quickly
• Analysts' focus is on their own screens
• SOC dashboard just distracts
• Detailed information not legible
• Put the detailed dashboards on the analysts' screens!
![Page 25: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/25.jpg)
Use SOC Dashboard For Context
• Provide analyst with context
• “What else is going on in the environment right now?”
• Bring Into Focus
• Turn something benign into something interesting
• Disprove
• Turn something interesting into something benign
Environment informs detection policies
![Page 26: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/26.jpg)
Show Comparisons
[Chart: current measure compared with the week prior]
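A comparison against the week prior can be computed with very little code. This is a minimal sketch, assuming one count per day ordered oldest to newest; the function name and sample numbers are hypothetical.

```python
def compare_to_week_prior(daily_counts):
    """Compare the latest daily value with the value seven days earlier.

    daily_counts: one measurement per day, oldest first, at least 8 entries.
    Returns (current, prior, percent change); percent change is None when
    the prior value is zero.
    """
    if len(daily_counts) < 8:
        raise ValueError("need at least 8 daily values")
    current = daily_counts[-1]
    prior = daily_counts[-8]
    change = current - prior
    pct = None if prior == 0 else 100.0 * change / prior
    return current, prior, pct


# Hypothetical counts of firewall blocks per day.
RESULT = compare_to_week_prior([200, 180, 190, 210, 205, 90, 80, 250])
print(RESULT)
```

Comparing with the same weekday avoids flagging the normal weekday/weekend swing as a change.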
![Page 27: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/27.jpg)
What To Put on Screens
• News feed summary (FS-ISAC feeds, mailing lists, threat feeds)
• Monitoring Twitter or IRC for certain activity / keywords
• Volumes or metrics (e.g., #firewall blocks, #IDS alerts, #failed transactions)
• Top N metrics:
• Top 10 suspicious users
• Top 10 servers connecting outbound
Provide context to individual security alerts
http://raffy.ch/blog/2015/01/15/dashboards-in-the-security-opartions-center-soc/
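A Top N metric such as "servers connecting outbound" is a simple count over connection records. The sketch below assumes (source, destination) pairs and treats private addresses as internal; the sample data and function name are hypothetical.

```python
import ipaddress
from collections import Counter


def top_outbound(connections, n=10):
    """Count connections from internal sources to external destinations."""
    counts = Counter()
    for src, dst in connections:
        src_internal = ipaddress.ip_address(src).is_private
        dst_internal = ipaddress.ip_address(dst).is_private
        if src_internal and not dst_internal:
            counts[src] += 1
    return counts.most_common(n)


# Hypothetical connection records: (source IP, destination IP).
CONNECTIONS = [
    ("10.8.24.80", "8.8.8.8"),
    ("10.8.24.80", "1.1.1.1"),
    ("10.9.79.6", "8.8.8.8"),
    ("10.9.79.6", "10.8.24.80"),  # internal to internal, not counted
]
TOP = top_outbound(CONNECTIONS, n=10)
print(TOP)
```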
![Page 28: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/28.jpg)
Data Discovery & Exploration
![Page 29: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/29.jpg)
Visualize Me Lots (>1TB) of Data
![Page 30: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/30.jpg)
Information Visualization Mantra
Overview → Zoom / Filter → Details on Demand
Principle by Ben Shneiderman
• summary / aggregation
• data mining
• signal detection (IDS, behavioral, etc.)
![Page 31: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/31.jpg)
Visualization Challenges
• Access to data
• Parsed data and data context
• Data architecture for central data access and fast queries
• Application of data mining (how?, what?, scalable, …)
• Visualization tools that support
• Complex visual types (parallel coordinates, treemaps, heat maps, link graphs)
• Linked views
• Data mining (clustering, …)
• Collaboration, information sharing
• Visual analytics workflow
![Page 32: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/32.jpg)
Big Data Lake
![Page 33: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/33.jpg)
The Big Data Lake
• One central location to store all cyber security data
• “Data collected only once and third party software leveraging it”
• Scalability and interoperability
• More than deploying an off-the-shelf product from a vendor
• Data use influences both data formats and technologies to store the data
• search, analytics, relationships, and distributed processing
• correlation and statistical summarization
• What to do with context? Enrich or join?
• Hard problems:
• Parsing: can you re-parse? Common naming scheme!
• Data store capabilities (search, analytics, distributed processing, etc.)
• Access to data: SQL (even in Hadoop context); how can products access the data?
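The "common naming scheme" problem can be illustrated with a small normalization step. This is a sketch under assumptions: the source types, the per-source field names, and the target names in `FIELD_MAP` are all hypothetical, not a published schema.

```python
# Hypothetical mapping from per-source field names to one common scheme.
FIELD_MAP = {
    "firewall": {"src": "source_address", "dst": "destination_address", "dpt": "destination_port"},
    "ids": {"src_ip": "source_address", "dest_ip": "destination_address", "dest_port": "destination_port"},
}


def normalize(source_type, record):
    """Rename known fields to the common scheme; keep unknown fields unchanged."""
    mapping = FIELD_MAP[source_type]
    return {mapping.get(key, key): value for key, value in record.items()}


FW_EVENT = normalize("firewall", {"src": "10.8.24.80", "dst": "192.168.148.193", "dpt": 80})
IDS_EVENT = normalize("ids", {"src_ip": "10.8.24.80", "dest_ip": "192.168.148.193", "dest_port": 80, "sig": "scan"})
print(FW_EVENT)
print(IDS_EVENT)
```

Once both sources share field names, one query or one visualization can cover them; keeping the raw logs is what makes re-parsing possible when the mapping changes.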
![Page 34: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/34.jpg)
Federated Data Access
[Diagram labels: SIEM, dispatcher, SIEM connector, SIEM console, Prod A, Prod B, AD / LDAP, HR, IDS, FW, DBs, SNMP, Data Lake, …]
Caveats:
• Dispatcher?
• Standard access to dispatcher / products enabled
• Data lake technology?
![Page 35: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/35.jpg)
Multiple Data Stores
[Diagram labels: raw logs, (un)-structured data, context, queue, real-time processing, distributed processing, storage, key-value, structured, index, graph, stats, SQL, access]
Caveat:
• Need multiple types of data stores
![Page 36: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/36.jpg)
Technologies (Example)
[Diagram labels: raw logs, (un)-structured data, context, queue (Kafka), real-time processing (Spark), distributed processing (Spark), HDFS, key-value (Cassandra), columnar (Parquet), index (ES), graph (GraphX), aggregates, SQL (Impala, SparkSQL), access]
Caveat:
• No out-of-the-box solution available
![Page 37: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/37.jpg)
SIEM Integration - Log Management First
[Diagram labels: raw logs, SIEM connector, SIEM, SIEM console, processing / filtering, HDFS, processing (e.g., PIG parsing), columnar or search engine or log management, SQL or search interface]
![Page 38: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/38.jpg)
Simple SIEM Integration
[Diagram labels: log data, SIEM connector, SIEM, SIEM console, Flume (raw, csv, json), HDFS, SQL (Impala, with SerDe), SQL interface (Tableau, etc.)]
Requirement:
• SIEM connector to forward text-based data to Flume.
![Page 39: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/39.jpg)
SIEM Integration - Advanced
[Diagram labels: syslog data, other data sources, raw logs, queue (Kafka), processing, HDFS, columnar (Parquet), index (ES), SQL (Impala, SparkSQL), access, SQL and search interface (Tableau, Kibana, etc.), SIEM connector, SIEM, SIEM console]
Requires parsing and formatting in a SIEM-readable format (e.g., CEF)
![Page 40: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/40.jpg)
What I am Working On
• Big data backend
• Own visualization engine (Web-based)
• Visualization workflows
[Screenshot labels: Data Stores, Analytics, Forensics, Models, Admin; “Hunt”, Explain, Visual Search; Anomalies; Decomposition (Data, Seasonal, Trend); Anomaly Details; sample flows such as 10.9.79.109 --> 3.16.204.150 and 10.8.24.80 --> 192.168.148.193]
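The anomaly view names a decomposition into data, seasonal, and trend components. The slides do not say how it is computed, so the following is only a generic sketch of an additive decomposition (centered moving average for the trend, per-phase median for the seasonal part), with a hypothetical synthetic series and threshold.

```python
from statistics import mean, median


def decompose(series, period):
    """Additive decomposition: series = trend + seasonal + residual.

    period must be odd so the moving-average window is centered.
    The first and last period // 2 points have no trend (None).
    """
    half = period // 2
    n = len(series)
    trend = [None] * n
    for i in range(half, n - half):
        trend[i] = mean(series[i - half:i + half + 1])
    # Group detrended values by their position within the cycle.
    by_phase = {phase: [] for phase in range(period)}
    for i in range(n):
        if trend[i] is not None:
            by_phase[i % period].append(series[i] - trend[i])
    # The median keeps a single spike from distorting the seasonal estimate.
    phase_median = {phase: median(values) for phase, values in by_phase.items()}
    seasonal = [phase_median[i % period] for i in range(n)]
    residual = [
        None if trend[i] is None else series[i] - trend[i] - seasonal[i]
        for i in range(n)
    ]
    return trend, seasonal, residual


def find_anomalies(residual, threshold):
    """Indices whose residual magnitude exceeds the threshold."""
    return [i for i, r in enumerate(residual) if r is not None and abs(r) > threshold]


# Synthetic daily series: linear trend plus weekly pattern plus one injected spike.
PATTERN = [0, 5, 10, 5, 0, -10, -10]
SERIES = [100 + day + PATTERN[day % 7] for day in range(42)]
SERIES[20] += 60  # injected anomaly

TREND, SEASONAL, RESIDUAL = decompose(SERIES, period=7)
ANOMALIES = find_anomalies(RESIDUAL, threshold=20)
print(ANOMALIES)
```

The threshold of 20 is arbitrary for this synthetic data; a real deployment would derive it from the spread of the residuals.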
![Page 41: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/41.jpg)
BlackHat Workshop
Visual Analytics - Delivering Actionable Security Intelligence
August 1-6 2015, Las Vegas, USA
big data | analytics | visualization
![Page 42: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/42.jpg)
Security Visualization Community
Share, discuss, challenge, and learn about security visualization.
http://secviz.org
List: secviz.org/mailinglist
Twitter: @secviz
![Page 43: Big Data Visualization](https://reader031.fdocuments.in/reader031/viewer/2022022412/58f9b379760da3da068bd67c/html5/thumbnails/43.jpg)
Further resources:
http://slideshare.net/zrlram
http://secviz.org and @secviz