Tin Kam Ho Computing Sciences Research Center Bell Labs, Lucent Technologies


Page 1:

Interactive Pattern Discovery with Large Imaging Databases

Tin Kam Ho

Computing Sciences Research Center

Bell Labs, Lucent Technologies

In collaboration with

David Wittman, J. Anthony Tyson of UC Davis

Samuel Carliles, William O’Mullane, Alex Szalay of JHU

Page 2:

What Is the Story in this Image?

Page 3:

Solving the Puzzle with a 3-step Approach

1. Describe each symbol shape with a numerical vector [23 12 17 28 11 …]

2. Find clusters of symbol shapes

3. Interpret each cluster using context
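The three steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the shape vectors are made up, the "leader" clustering rule and its radius are stand-ins for whatever clustering the talk actually used, and the cluster-to-letter table in step 3 is a hypothetical interpretation supplied by hand.

```python
from math import dist

def leader_cluster(vectors, radius):
    """Assign each vector to the first cluster whose leader lies within `radius`.

    A vector that is close to no existing leader starts a new cluster.
    """
    leaders = []
    labels = []
    for v in vectors:
        for cluster_id, leader in enumerate(leaders):
            if dist(v, leader) <= radius:
                labels.append(cluster_id)
                break
        else:
            leaders.append(v)
            labels.append(len(leaders) - 1)
    return labels, leaders

# Step 1: each symbol shape as a numerical vector (made-up values).
shapes = [(23, 12, 17), (5, 30, 9), (24, 11, 18), (5, 29, 10), (40, 2, 3)]

# Step 2: find clusters of symbol shapes.
labels, leaders = leader_cluster(shapes, radius=3.0)

# Step 3: interpret each cluster using context (hypothetical mapping).
meaning = {0: "e", 1: "t", 2: "a"}
decoded = "".join(meaning[c] for c in labels)
```

After step 2 the document is just a sequence of cluster IDs; step 3 is where outside knowledge, such as letter frequencies or a dictionary, enters.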

Page 4:

10.10.10 51.37.50.54.41.35.37 39.47.33.44 13.13 33.52.6.52 83.65.73.68 73.84 72.65.83 83.69.84 65 71.79.65.76 79.70 82.69.83.84.79.82.73.78.71 83.69.82.86.73.67.69 79.78 73.84.83 76.79.78.71.13.68.73.83.84.65.78.67.69 78.69.84.87.79.82.75 70.65.83.84.69.82 84.72.65.78 84.72.69 83.89.83.84.69.77 73.84.83.69.76.70 68.73.83.67.79.78.78.69.67.84.83 67.65.76.76.83 65.70.84.69.82 65 67.65.66.76.69 66.82.69.65.75.14

*** SERVICE GOAL -- AT&T said it has set a goal of restoring service on its long-distance network faster than the system itself disconnects calls after a cable break.
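The numbers on this slide can be checked against the text: each code appears to equal the character's ASCII value minus 32, with dots separating symbols and spaces separating words. The sketch below decodes the opening of the sequence on that assumption. The offset is an observation about the transcribed numbers, not a statement of how the talk assigned cluster IDs.

```python
# First six words of the symbol sequence from the slide.
codes = "10.10.10 51.37.50.54.41.35.37 39.47.33.44 13.13 33.52.6.52 83.65.73.68"

def decode(symbol_line, offset=32):
    """Map each dot-separated code to a character, assuming code = ASCII - offset."""
    words = []
    for word in symbol_line.split():
        words.append("".join(chr(int(code) + offset) for code in word.split(".")))
    return " ".join(words)

decoded_text = decode(codes)
```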

Page 5:

Tracking Intensive Rain Cells in Radar Images

Page 6:

The Deep Lens Survey (Tyson, Wittman, …)

BVRz to 26 mag over 28 sq. degrees

http://dls.physics.ucdavis.edu/

Page 7:

Weak Gravitational Lensing

Uses distortion of background galaxies to map foreground mass concentrations

J.A. Tyson, DLS 2002

Page 8:

Catalog of Extracted Objects

Page 9:

Stars or Galaxies?

J.A. Tyson, DLS 2002

Page 10:

• Discrimination task depends on tiny differences in color and shape

• Survey reaches an unprecedented depth: most objects have never been observed before and nobody knows their true classification

• How does one build confidence in the results of the classifier?

• Need to correlate several perspectives: object characteristics in the color space, shape parameters, the brightness statistics

• Visualization can help verify correctness of preprocessing steps, clean up undesirable artifacts, choose relevant samples, spot explicit patterns, select useful features, and suggest algorithms and models

Page 11:

The Virtual Observatory

http://www.us-vo.org/

http://www.ivoa.net/

Page 12:

Essential Steps in Automatic Pattern Recognition

Supervised learning: Feature Extraction, Classifier Training, Classification

Unsupervised learning: Clustering, Cluster Validation, Cluster Interpretation

[Flow diagram labels: samples, features, classifier, features, class membership. Scatter plot axes: feature 1, feature 2.]
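The supervised branch of these steps (feature extraction, classifier training, classification) can be sketched with a nearest-centroid classifier. Everything here is an assumption for illustration: the two toy features (mean and range of a sample), the training samples, and the class names are invented, and nearest-centroid is chosen only because it is short, not because the talk used it.

```python
from math import dist

def extract_features(sample):
    """Feature extraction: reduce a raw sample to (mean, range). Toy features."""
    return (sum(sample) / len(sample), max(sample) - min(sample))

def train(features, labels):
    """Classifier training: compute one centroid per class."""
    sums = {}
    for f, y in zip(features, labels):
        total, n = sums.get(y, ((0.0,) * len(f), 0))
        sums[y] = (tuple(a + b for a, b in zip(total, f)), n + 1)
    return {y: tuple(a / n for a in total) for y, (total, n) in sums.items()}

def classify(centroids, f):
    """Classification: return the class whose centroid is nearest to f."""
    return min(centroids, key=lambda y: dist(centroids[y], f))

samples = [[1, 2, 3], [2, 3, 4], [10, 20, 30], [12, 22, 32]]
labels = ["star", "star", "galaxy", "galaxy"]

centroids = train([extract_features(s) for s in samples], labels)
predicted = classify(centroids, extract_features([11, 21, 31]))
```

The unsupervised branch replaces `train` and `classify` with a clustering step, after which the clusters still have to be validated and interpreted by someone who knows the domain.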

Page 13:

Data Relationships Across Multiple Feature Sets

Data Mining: Feature Set A, Set B, Unknown Relationship; Clustering

Simulation Analysis: Parameters, Responses; Feature Computation; Filtering, Clustering

Page 14:

Key Algorithms

• Clustering: find natural groups in data, construct index structures to facilitate proximity queries

• Dimensionality reduction: embed high-dimensional data in 2D displays

• Navigation: traverse index structures in systematic ways
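As one concrete instance of embedding high-dimensional data in a 2D display, the sketch below projects points onto their first two principal components, found by power iteration with deflation. PCA is a swapped-in example technique; the slides do not say which embedding method is used, and the data is synthetic.

```python
def pca_2d(rows, iters=100):
    """Project rows onto their top two principal axes (power iteration + deflation)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Population covariance matrix of the centered data.
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(d)]
           for i in range(d)]
    axes = []
    for _axis in range(2):
        v = [d ** -0.5] * d  # unit-length start vector
        for _step in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = sum(x * x for x in w) ** 0.5
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue; subtract that component.
        lam = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)]
               for i in range(d)]
        axes.append(v)
    return [[sum(c[j] * a[j] for j in range(d)) for a in axes] for c in centered]

# Synthetic 3D points lying close to the line x = y, with small jitter in z.
rows = [(t, t, 0.1 * (-1) ** t) for t in range(10)]
points_2d = pca_2d(rows)
```

The first output coordinate captures the spread along the line and the second captures the jitter, so the pairs can be handed directly to a scatter plot.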

Page 15:

Clustering Methods

• Model-based Clustering: identification of finite mixtures

• Partitional Clustering: divides data set into N mutually exclusive subsets

• Hierarchical Clustering: top-down procedures (tree splitting); bottom-up, agglomerative procedures (merge similar clusters successively)
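The bottom-up procedure can be illustrated with a minimal single-linkage sketch: start with one cluster per point and repeatedly merge the two closest clusters. The linkage rule, the stopping criterion (a target cluster count), and the sample points are assumptions made for the example.

```python
from math import dist

def agglomerate(points, n_clusters):
    """Single-linkage agglomerative clustering; returns lists of point indices."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > max(1, n_clusters):
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]
    return [sorted(c) for c in clusters]

points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
three = agglomerate(points, 3)
two = agglomerate(points, 2)
```

Recording the sequence of merges instead of stopping at a fixed count yields the cluster tree that a hierarchical view can display and traverse.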

Page 16:

Similarity / Clustering of Objects from Different Perspectives

• Objects can be described by many types of attributes:

position, weight, shape, spectrum, time variability, …

• Meaningful similarity metric exists only for the same type of attributes

• Clusters found from one perspective need to be correlated to those from others

e.g. Are the objects similar in color also similar in shape?

[Figure panels: Shape clusters, Color clusters]
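One simple way to ask whether objects that are similar in color are also similar in shape is to cross-tabulate the two sets of cluster labels. The sketch below is an assumption-laden illustration: the labels are invented, and the purity score is just one of several possible agreement measures.

```python
from collections import Counter

def cross_tabulate(labels_a, labels_b):
    """Count how many objects fall in each (cluster in A, cluster in B) pair."""
    return Counter(zip(labels_a, labels_b))

def purity(labels_a, labels_b):
    """Fraction of objects that land in the dominant B-cluster of their A-cluster."""
    table = cross_tabulate(labels_a, labels_b)
    best = {}
    for (a, _b), n in table.items():
        best[a] = max(best.get(a, 0), n)
    return sum(best.values()) / len(labels_a)

# Hypothetical cluster labels for six objects from two perspectives.
color_clusters = [0, 0, 0, 1, 1, 1]
shape_clusters = [0, 0, 1, 1, 1, 1]

table = cross_tabulate(color_clusters, shape_clusters)
agreement = purity(color_clusters, shape_clusters)
```

A purity near 1 means each color cluster maps almost entirely into a single shape cluster; a low value means the two perspectives group the objects differently.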

Page 17:

Exploratory Tools Needed

To bring in domain expertise, interpretation context

To visualize data or classifier geometry

To track point/class correlations

To test tentative classifications

To compare groupings from different perspectives

To relate numerical data to other data types

To facilitate systematic, repeatable explorations

Page 18:

Mirage for Interactive Pattern Recognition

Data Display in Linked Views

• Show patterns in histograms, scatter plots, parallel coordinates, tables, and images

Selection and Tracking

• Select points in any view, broadcast to all others

Traversal of Data Structures

• Walk in histograms, cluster graphs or trees, echoed in all other views

Graphical Utilities

• Open multiple-page plots with arbitrary configuration

Command Scripts

• Run prepared groups of operations as an animation

Intuitive Graphical Tool for Exploratory Data Analysis

Visualization of Clusters and Classes

Correlation of Proximity Structures

Manual or Automatic Classification

http://www.cs.bell-labs.com/who/tkh/mirage
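The "select in any view, broadcast to all others" behavior can be pictured as a small publish/subscribe arrangement. The sketch below is not Mirage's code or API (Mirage is a Java application); the class and method names are hypothetical and the views only record which point IDs are highlighted.

```python
class SelectionHub:
    """Relays a selection made in one view to every other registered view."""

    def __init__(self):
        self.views = []

    def register(self, view):
        self.views.append(view)

    def broadcast(self, source, selected_ids):
        for view in self.views:
            if view is not source:
                view.highlight(selected_ids)

class View:
    """Stand-in for a histogram, scatter plot, table, or image view."""

    def __init__(self, name, hub):
        self.name = name
        self.hub = hub
        self.highlighted = set()
        hub.register(self)

    def select(self, ids):
        # A user selection in this view: update locally, then notify the others.
        self.highlighted = set(ids)
        self.hub.broadcast(self, self.highlighted)

    def highlight(self, ids):
        self.highlighted = set(ids)

hub = SelectionHub()
scatter = View("scatter plot", hub)
histogram = View("histogram", hub)
table = View("table", hub)

scatter.select([3, 7])
```

Because every view shares the same point IDs, a selection made in color space shows up immediately in the shape-parameter plot, the table, and the image.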

Page 19:

Software Features

• Based on Java Swing library

• Intuitive, easy-to-use graphical operations

• Multiple-page, arbitrary plot configurations

• Online or offline cluster analysis

• GUI- or script-driven command execution

• Database interface via JDBC

• Ready to be adapted for on-line monitoring

• Ready to be integrated with database access and decision support systems

Page 20:

Design Motivated by the Needs

Interactive plays, intuitive operations: to bring domain experts into the loop

Multiple types of plots, extensible for more: to visualize data or classifier geometry

Linked views, traversal actions: to track point/class correlations

Highlights, colors: to test tentative classifications

Projection to arbitrary subspaces: to compare groupings in different perspectives

Linking data with images: to relate numerical data to other types

Command scripting: to facilitate systematic, repeatable explorations

Page 21:

Challenges for the Analysis Tool

• Separate treatment of non-comparable groups of variables

• Versatile visualization utilities allowing many perspectives

• Support for exploratory discovery across diverse data types

• Integrate manual & automatic pattern recognition methods

Also, a good tool should

-- leverage existing visualization and analysis methods

-- enable continued growth: new visualization, analysis tools

-- support interface with existing databases

-- be scalable in data volume and processing speed

Page 22:

Towards Extensibility

[Architecture diagram: Mirage Core surrounded by four extension points]

Data Access Clients

Data Analysis Methods

Custom Data Views

Data Exchange Pipes

[External components named in the diagram: VO Data Archives (Cone Search, CAS); Web Services (Extinction Calculator); External Rendering Code (FITS viewer, …); Other Analysis Platforms (Python? Matlab?)]

Page 23:

VO Enabled Mirage (with Samuel Carliles, William O’Mullane, and Alex Szalay)

Page 24:

VO Enabled Mirage

• http://skyservice.pha.jhu.edu/develop/vo/mirage/

• Load VOTable data and perform VO Cone/SIAP and SDSS CAS searches using IVOA Client Package

• Astronomical imaging module loads FITS images using JSky classes, supporting image operations:

Select data points and broadcast selection to other views. Cut levels. Colormap. SAO DS9-style brightness/contrast enhance. Zoom.
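For readers unfamiliar with VO cone searches: under the IVOA Simple Cone Search convention, a query is an HTTP GET with `RA`, `DEC`, and `SR` (search radius) given in decimal degrees, and the service answers with a VOTable. The sketch below only builds such a URL. The base URL is a placeholder, and this is not the IVOA Client Package that the slide mentions.

```python
from urllib.parse import urlencode

def cone_search_url(base_url, ra_deg, dec_deg, radius_deg):
    """Build a Simple Cone Search query URL (all angles in decimal degrees)."""
    query = urlencode({"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg})
    separator = "&" if "?" in base_url else "?"
    return f"{base_url}{separator}{query}"

# Placeholder service address; substitute a real cone search endpoint.
url = cone_search_url("http://example.org/cone", 180.0, -0.5, 0.1)
```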

Page 25:

Extinction Web Service (with Chris Miller, Simon Krughoff)

Using DIRBE/IRAS Dust Maps by Schlegel et al.

Mirage Core

Object selection

Extracts RA, DEC, [mag] from Mirage data set

SOAP client calls Extinction server

Merges results with Mirage data set

Extinction Service

Positions, mags

Positions, mags, filterIDs

E(b-v), dered_mags

Enhanced data set

Result stream
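The final merge step can be sketched as follows, assuming the service has already returned an E(B-V) value per object. The correction uses the standard relation A = R × E(B-V) and dered_mag = mag - A, where R depends on the filter. The record layout, the field names, the E(B-V) value, and the use of R = 3.1 (the usual Milky Way value for the V band) are assumptions for illustration; the actual service takes filter IDs and returns dereddened magnitudes itself.

```python
def merge_extinction(records, ebv_by_id, r_coeff):
    """Return copies of the records with E(B-V) and a dereddened magnitude added."""
    enhanced = []
    for rec in records:
        ebv = ebv_by_id[rec["id"]]
        enhanced.append({
            **rec,
            "ebv": ebv,
            # Extinction in this band is r_coeff * E(B-V); subtract it from the magnitude.
            "dered_mag": rec["mag"] - r_coeff * ebv,
        })
    return enhanced

# Hypothetical selected objects and a made-up service response.
selected = [{"id": 1, "ra": 10.0, "dec": -5.0, "mag": 20.0}]
service_result = {1: 0.05}

enhanced_set = merge_extinction(selected, service_result, r_coeff=3.1)
```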

Page 26:

205th Meeting of the American Astronomical Society

9-13 January 2005 San Diego, CA

Wednesday, 12 January

Astronomical Research with the Virtual Observatory

More at

NVO Public Release 1.0

Page 27:

Analysis of Simulations of Control Dynamics in Optical Transport Systems

(with the FROG collaboration)

Head End Terminal

Repeater

Fiber link

Repeater

Repeater

Repeater

Gain Equalizer

Tail End Terminal

Signal Spectrum with noise floor

Page 28:

Monitoring Network Traffic

Software tool for online monitoring and analysis of QoS in IP networks

• continuously monitors traffic statistics at edge and core devices

• synthesizes statistics in real time to obtain network-wide QoS status and general network element health indicators

• Mirage refreshes displays on alerts of database updates via Java Messaging Service

SEQUIN

SNMP polling

SLA verification

Billing

Provisioning

MPLS IP Core (QoS-guaranteed paths)

DiffServ Edge (aggregation and classification)

(With Marina Thottan, Ken Swanson)