Tin Kam Ho Computing Sciences Research Center Bell Labs, Lucent Technologies
In collaboration with
David Wittman, J. Anthony Tyson of UC Davis
Samuel Carliles, William O’Mullane, Alex Szalay of JHU
Interactive Pattern Discovery with
Large Imaging Databases
What Is the Story in this Image?
1. Describe each symbol shape with a numerical vector [23 12 17 28 11 …]
2. Find clusters of symbol shapes
3. Interpret each cluster using context
Solving the Puzzle with a 3-step Approach
10.10.10 51.37.50.54.41.35.37 39.47.33.44 13.13 33.52.6.52 83.65.73.68 73.84 72.65.83 83.69.84 65 71.79.65.76 79.70 82.69.83.84.79.82.73.78.71 83.69.82.86.73.67.69 79.78 73.84.83 76.79.78.71.13.68.73.83.84.65.78.67.69 78.69.84.87.79.82.75 70.65.83.84.69.82 84.72.65.78 84.72.69 83.89.83.84.69.77 73.84.83.69.76.70 68.73.83.67.79.78.78.69.67.84.83 67.65.76.76.83 65.70.84.69.82 65 67.65.66.76.69 66.82.69.65.75.14
*** SERVICE GOAL -- AT&T said it has set a goal of restoring service on its long-distance network faster than the system itself disconnects calls after a cable break.
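The last step of the puzzle, turning a sequence of cluster IDs back into text, can be sketched in a few lines. This is only an illustration of the final lookup: it assumes that, for the portion of the string beginning "83.65.73.68", each cluster ID happens to equal the decimal ASCII code of an uppercase letter. That is an observation about this particular sample, not something the slides state, and the function name `decode_cluster_ids` is hypothetical.

```python
def decode_cluster_ids(encoded: str) -> str:
    """Translate dot-separated cluster IDs into text.

    Assumption for this sketch: each cluster ID is the decimal ASCII code
    of the character it represents. Space-separated groups are words.
    """
    words = []
    for group in encoded.split():
        # Each token inside a group is one symbol's cluster ID.
        words.append("".join(chr(int(token)) for token in group.split(".")))
    return " ".join(words)


# A short excerpt of the cluster-ID string shown on the slide.
sample = "83.65.73.68 73.84 72.65.83 83.69.84 65 71.79.65.76"
decoded = decode_cluster_ids(sample)
print(decoded)
```

In the real task the mapping from cluster to character is unknown and has to be inferred from context (letter frequencies, word patterns), so the fixed ASCII lookup stands in for the result of that interpretation step.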
Tracking Intensive Rain Cells in Radar Images
http://dls.physics.ucdavis.edu/
BVRz to 26 mag over 28 sq. degree
The Deep Lens Survey (Tyson, Wittman, …)
Weak Gravitational Lensing
Uses distortion of background galaxies to map foreground mass concentrations
J.A. Tyson, DLS 2002
Catalog of Extracted Objects
Stars or Galaxies?
J.A. Tyson, DLS 2002
• Discrimination task depends on tiny differences in color and shape
• Survey reaches an unprecedented depth: most objects have never been observed before and nobody knows their true classification
• How does one build confidence in the results of the classifier?
• Need to correlate several perspectives: object characteristics in the color space, shape parameters, the brightness statistics
• Visualization can help verify correctness of preprocessing steps, clean up undesirable artifacts, choose relevant samples, spot explicit patterns, select useful features, and suggest algorithms and models
The Virtual Observatory
http://www.us-vo.org/
http://www.ivoa.net/
Essential Steps in Automatic Pattern Recognition
Feature Extraction
Classifier Training
Classification
Clustering
Cluster Validation
Cluster Interpretation
Samples
features
classifier
features
class membership
Supervised learning
Unsupervised learning
feature 1
feature 2
Feature Set A, Set B
Unknown Relationship
Clustering
Data Mining
Parameters Responses Feature Computation
Filtering, Clustering
Simulation Analysis
Data Relationships Across Multiple Feature Sets
Key Algorithms
• Clustering:
find natural groups in data, construct index
structures to facilitate proximity queries
• Dimensionality reduction:
embed high-dimensional data in 2D displays
• Navigation:
traverse index structures in systematic ways
Clustering Methods
• Model-based Clustering
identification of finite mixtures
• Partitional Clustering
divides data set into N mutually exclusive subsets
• Hierarchical Clustering
top-down procedures: tree splitting
bottom-up, agglomerative procedures: merge similar clusters successively
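The bottom-up procedure named above can be illustrated with a minimal sketch. It assumes Euclidean distance and single linkage (distance between the closest pair of members), neither of which the slides specify, and it uses a brute-force search that is only suitable for small data sets. The function name `agglomerative` and the sample points are made up for the example.

```python
from itertools import combinations
from math import dist


def agglomerative(points, n_clusters):
    """Bottom-up clustering: start with singletons, merge the closest pair
    of clusters until n_clusters remain. Returns lists of point indices."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Single linkage: distance between the closest pair of members.
            d = min(dist(points[i], points[j])
                    for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        # Merge cluster b into cluster a (a < b, so deleting b is safe).
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9), (9.0, 0.0)]
result = sorted(sorted(c) for c in agglomerative(points, 3))
print(result)
```

Recording the order of merges instead of stopping at a fixed count would give the tree (dendrogram) that a navigation tool can traverse.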
Similarity / Clustering of Objects from Different Perspectives
• Objects can be described by many types of attributes:
position, weight, shape, spectrum, time variability, …
• Meaningful similarity metric exists only for the same type of attributes
• Clusters found from one perspective need to be correlated to those from others
e.g. Are the objects similar in color also similar in shape?
Shape clusters
Color clusters
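One way to quantify the question "are the objects similar in color also similar in shape?" is to compare the two cluster labelings directly. The sketch below builds a contingency table and computes the Rand index (the fraction of object pairs on which the two clusterings agree). The choice of the Rand index is an assumption made for illustration; the slides do not name a measure, and the labels are invented.

```python
from collections import Counter
from itertools import combinations


def contingency(labels_a, labels_b):
    """Count how many objects fall into each (cluster_a, cluster_b) cell."""
    return Counter(zip(labels_a, labels_b))


def rand_index(labels_a, labels_b):
    """Fraction of object pairs that both clusterings treat the same way
    (together in both, or apart in both)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same objects")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two objects")
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical cluster labels for six objects from two perspectives.
color = [0, 0, 0, 1, 1, 1]
shape = [0, 0, 1, 1, 2, 2]

table = contingency(color, shape)
ri = rand_index(color, shape)
print(dict(table), ri)
```

A value near 1 suggests the two perspectives group the objects similarly; the contingency table shows which clusters overlap.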
Exploratory Tools Needed
To bring in domain expertise, interpretation context
To visualize data or classifier geometry
To track point/class correlations
To test tentative classifications
To compare groupings from different perspectives
To relate numerical data to other data types
To facilitate systematic, repeatable explorations
Mirage for Interactive Pattern Recognition
Data Display in Linked Views
• Show patterns in histograms, scatter plots, parallel coordinates, tables, and images
Selection and Tracking
• Select points in any view, broadcast to all others
Traversal of Data Structures
• Walk in histograms, cluster graphs or trees, echoed in all other views
Graphical Utilities
• Open multiple-page plots with arbitrary configuration
Command Scripts
• Run prepared groups of operations as an animation
Intuitive Graphical Tool for Exploratory Data Analysis
Visualization of Clusters and Classes
Correlation of Proximity Structures
Manual or Automatic Classification
http://www.cs.bell-labs.com/who/tkh/mirage
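The "select in any view, broadcast to all others" behaviour is essentially an observer pattern. The sketch below shows the idea in Python; Mirage itself is a Java Swing application, and the class and method names here (`SelectionHub`, `View`, `user_select`, `on_selection`) are hypothetical, not Mirage's API.

```python
class SelectionHub:
    """Central registry that forwards a selection to every other view."""

    def __init__(self):
        self._views = []

    def register(self, view):
        self._views.append(view)

    def select(self, source, indices):
        selected = frozenset(indices)
        for view in self._views:
            if view is not source:  # the source already shows its selection
                view.on_selection(selected)


class View:
    """Stand-in for a plot or table; it only remembers what is highlighted."""

    def __init__(self, name, hub):
        self.name = name
        self.hub = hub
        self.highlighted = frozenset()
        hub.register(self)

    def user_select(self, indices):
        # Called when the user picks points in this view.
        self.highlighted = frozenset(indices)
        self.hub.select(self, indices)

    def on_selection(self, indices):
        # Called when another view broadcasts a selection.
        self.highlighted = indices


hub = SelectionHub()
scatter = View("scatter plot", hub)
histogram = View("histogram", hub)
table = View("table", hub)

scatter.user_select([3, 7, 11])
print(histogram.highlighted == table.highlighted == scatter.highlighted)
```

Because views share only data-point indices, a new view type can join the linked display by implementing a single callback.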
Software Features
• Based on Java Swing library
• Intuitive, easy-to-use graphical operations
• Multiple-page, arbitrary plot configurations
• Online or offline cluster analysis
• GUI or Script driven command execution
• Database interface via JDBC
• Ready to be adapted for on-line monitoring
• Ready to be integrated with database access and decision support systems
Design Motivated by the Needs
Interactive plays, intuitive operations: to bring domain experts into the loop
Multiple types of plots, extensible for more: to visualize data or classifier geometry
Linked views, traversal actions: to track point/class correlations
Highlights, colors: to test tentative classifications
Projection to arbitrary subspaces: to compare groupings in different perspectives
Linking data with images: to relate numerical data to other types
Command scripting: to facilitate systematic, repeatable explorations
Challenges for the Analysis Tool
• Separate treatment of non-comparable groups of variables
• Versatile visualization utilities allowing many perspectives
• Support for exploratory discovery across diverse data types
• Integrate manual & automatic pattern recognition methods
Also, a good tool should
-- leverage existing visualization and analysis methods
-- enable continued growth: new visualization, analysis tools
-- support interface with existing databases
-- be scalable in data volume and processing speed
Mirage Core
Data Access Clients
Data Analysis Methods
Custom Data Views
Data Exchange Pipes
VO Data Archives External Rendering Code
Web Services Other Analysis Platforms
Cone Search, CAS
Extinction Calculator Python? Matlab?
FITS viewer, …
Towards Extensibility
VO Enabled Mirage (with Samuel Carliles, William O’Mullane, and Alex Szalay)
VO Enabled Mirage
• http://skyservice.pha.jhu.edu/develop/vo/mirage/
• Load VOTable data and perform VO Cone/SIAP and SDSS CAS searches using IVOA Client Package
• Astronomical imaging module loads FITS images using JSky classes, supporting image operations:
Select data points and broadcast selection to other views. Cut levels. Colormap. SAO DS9-style brightness/contrast enhance. Zoom.
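A cone search, mentioned in the list above, returns the catalog objects that lie within a given angular radius of a sky position. The sketch below applies that filter locally to an in-memory list; it does not call any VO service or the IVOA Client Package, and the catalog rows and function names are invented for illustration.

```python
from math import asin, cos, degrees, radians, sin, sqrt


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula).
    All inputs are in decimal degrees."""
    ra1, dec1, ra2, dec2 = map(radians, (ra1, dec1, ra2, dec2))
    h = (sin((dec2 - dec1) / 2) ** 2
         + cos(dec1) * cos(dec2) * sin((ra2 - ra1) / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the valid range.
    return degrees(2 * asin(min(1.0, sqrt(h))))


def cone_search(catalog, ra, dec, radius_deg):
    """Keep the rows whose position lies within radius_deg of (ra, dec)."""
    return [row for row in catalog
            if angular_separation_deg(ra, dec, row["ra"], row["dec"]) <= radius_deg]


# Hypothetical catalog rows; positions in decimal degrees.
catalog = [
    {"id": "a", "ra": 10.00, "dec": 20.00},
    {"id": "b", "ra": 10.05, "dec": 20.02},
    {"id": "c", "ra": 12.00, "dec": 20.00},
]
hits = [row["id"] for row in cone_search(catalog, 10.0, 20.0, 0.1)]
print(hits)
```

A remote cone search service performs the same selection on the server side and returns the matching rows as a VOTable.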
Extinction Web Service (with Chris Miller, Simon Krughoff)
Using DIRBE/IRAS Dust Maps by Schlegel et al.
Mirage Core
Object selection
Extracts RA, DEC, [mag] from Mirage data set
SOAP client calls Extinction server
Merges results with Mirage data set
Extinction Service
Positions, mags
Positions, mags, filterIDs
E(B-V), dered_mags
Enhanced data set
Result stream
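The extract, call, and merge flow in the diagram above can be sketched as follows. The remote SOAP call is replaced by a local stand-in that returns a made-up E(B-V), since the service interface is not given in the slides. The per-filter ratios A/E(B-V) are the values commonly quoted for SDSS filters from Schlegel et al. (1998) and should be treated as assumptions here, as should all function and field names.

```python
# Extinction per unit E(B-V) for each filter ID (assumed values, see lead-in).
R_BAND = {"u": 5.155, "g": 3.793, "r": 2.751, "i": 2.086, "z": 1.479}


def fake_extinction_service(positions):
    """Stand-in for the remote service: one made-up E(B-V) per position."""
    return [0.05 for _ in positions]


def deredden(rows, ebv_lookup):
    """Extract positions, look up E(B-V), and merge the results into
    copies of the rows, leaving the original data set untouched."""
    positions = [(row["ra"], row["dec"]) for row in rows]
    ebv_values = ebv_lookup(positions)
    enhanced = []
    for row, ebv in zip(rows, ebv_values):
        new_row = dict(row)
        new_row["ebv"] = ebv
        # Dereddened magnitude: observed magnitude minus extinction A = R * E(B-V).
        new_row["dered_mag"] = row["mag"] - R_BAND[row["filter"]] * ebv
        enhanced.append(new_row)
    return enhanced


# Hypothetical selected object with position, magnitude, and filter ID.
rows = [{"ra": 150.1, "dec": 2.2, "mag": 21.30, "filter": "r"}]
enhanced = deredden(rows, fake_extinction_service)
print(enhanced[0]["dered_mag"])
```

Passing the lookup as a function keeps the merge step independent of how the extinction values are obtained, whether from a web service or a local dust map.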
205th Meeting of the American Astronomical Society
9-13 January 2005 San Diego, CA
Wednesday, 12 January
Astronomical Research with the Virtual Observatory
More at
NVO Public Release 1.0
Analysis of Simulations of Control Dynamics in Optical Transport Systems
(with the FROG collaboration)
Head End Terminal
Repeater
Fiber link
Repeater
Repeater
Repeater
Gain Equalizer
Tail End Terminal
Signal Spectrum with noise floor
Monitoring Network Traffic
Software tool for online monitoring and analysis of QoS in IP networks
• continuously monitors traffic statistics at edge and core devices
• synthesizes statistics in real time to obtain network-wide QoS status and general network element health indicators
• Mirage refreshes displays on alerts of database updates via Java Messaging Service
SEQUIN
SNMP polling
SLA verification
Billing
Provisioning
MPLS IP Core (QoS-guaranteed paths)
DiffServ Edge (aggregation and classification)
(With Marina Thottan, Ken Swanson)