Similarity-Based Comparisons for Interactive Fine-Grained ...smaji/presentations/... · • We...

1
Qualitative Results Overview Approach A. Collect grid-based similarity comp- arisons that do not require prior expertise B. Broadcast grid-based comparisons to triplet comparisons Results Similarity Comparisons INTERACTIVE CATEGORIZATION Compute per-class probabilities as: where Efficient computation Approximate per-class probabilities as: i.e. sum of weights of examples of class where enumerates training examples Weight ݓ represents how likely ܢ is true location ܢ: such that Incorporating Users ܦis grid of images for each question Incorporate independent user response as: Method Avg. #Qs CV, Color Similarity 2.70 CV, Shape Similarity 2.67 CV, Pattern Similarity 2.67 CV, Color/Shape/Pattern Similarity 2.64 No CV, Color/Shape/Pattern Similarity 4.21 Multiple Metrics System supports multiple similarity metrics as different types of questions Simulate perceptual spaces using CUB-200-2011 attribute annotations Learned Embedding Learn category-level embedding of = 200 nodes Category-level embedding requires much fewer comparisons compared to at the instance-level Catherine Wah 1 Grant Van Horn 1 Steve Branson 2 Subhransu Maji 3 Pietro Perona 2 Serge Belongie 4 1 University of California, San Diego vision.ucsd.edu {cwah@cs,gvanhorn@}ucsd.edu 2 California Institute of Technology vision.caltech.edu {sbranson,perona}@caltech.edu 3 Toyota Technological Institute at Chicago ttic.edu [email protected] 4 Cornell Tech tech.cornell.edu [email protected] TRAIN TIME TEST TIME 2) Computer Vision | ݔ3) Learn Perceptual Embedding 2) Collect Similarity Comparisons ? 1) Image Database w/ Class Labels Mallard Cardinal 1) Query Image 3) Human-in-the-Loop Categorization | ݔ, ? Problem Parts and attributes exhibit weaknesses ¾ Scalability issues; costly; reliance on experts, but experts are scarce Proposed Solution Use relative similarity comparisons to reduce dependence on expert- derived part and attribute vocabularies Contributions We present an efficient, flexible, and scalable system for interactive fine-grained visual categorization ¾ Based on perceptual similarity ¾ Combines similarity metrics and computer vision methods in a unified framework Outperforms state-of-the-art relevance feedback-based and part/attribute-based approaches Is this more similar to… ݔ This one? ݔ Or this one? ݔ ݏ, > ݏ, ݏ, > ݏ, ݏ, > ݏ, ݏ, > ݏ, ݏ, > ݏ, ݏ, > ݏ, ݏ, > ݏ, ݏ, > ݏ, ? ݏ( , ): perceptual similarity between images ݔ and ݔ A B Q1: Most Similar? Query Image Q2: Most Similar? Vermilion Fly- catcher Q1: Most Similar By Color? Q2: Most Similar By Pattern? Hooded Merganser Query Image Interactive Categorization Similarity comparisons are advantageous compared to part/attribute questions Using computer vision reduces the burden on the user Intelligently selecting image displays reduces effort The system is robust to user noise Deterministic users No computer vision Deterministic users With computer vision Simulated noisy users With computer vision Learning a Metric Given set of triplet comparisons , learn embedding ܈of training images with stochastic triplet embedding [van der Maaten & Weinberger 2012] From ܈, generate similarity matrix א × Selecting the Display Approximate solution: maximizes expected information gain in terms of entropy of , ܢ , | ݔGroup images into equal-weight clusters [Fang & Geman 2005] From each cluster, select image with largest ݓ = , , ݔ more similar to ݔ than ݔ ݔQuery image Class ݐTime step User responses at ݐ ܢTrue location of ݔin perceptual space Computer Vision Easy to map off-the-shelf CV algorithms into framework, e.g., multiclass classification scores , ݔܢ ן | ݔ ,| ݔ, ן , | ݔ= , ܢ, | ݔ ܢ ܢ ݓ= , ܢ, | ݔ= | , ܢ, ݔ , ݔܢ ܢݑ= ߶ ݏ(ܢ, ܢ ) σ ߶ ݏ(ܢ, ܢ ) א ,| ݔ, σ ݓ , σ ݓ ݓ = , ܢ , | ݔ= | , ܢ , ݔ , ܢ ݔ ݓ ௧ାଵ = ݑ௧ାଵ ܢ ݓ = ߶ σ ߶ א ݓ Efficient update rule: Initialize weights ݓ = , ܢ ݔUpdate weights ݓ ௧ାଵ when user answers a similarity question Update per-class probabilities A A B B C C D D D D Similarity Comparisons for Interactive Fine-Grained Categorization

Transcript of Similarity-Based Comparisons for Interactive Fine-Grained ...smaji/presentations/... · • We...

  • Qualitative Results

    Overview Approach

    A. Collect grid-based similarity comp-arisons that do not require prior expertise

    B. Broadcast grid-based comparisons to triplet comparisons

    Results

    Similarity Comparisons

    INTERACTIVE CATEGORIZATION • Compute per-class probabilities as:

    where

    Efficient computation • Approximate per-class probabilities as:

    i.e. sum of weights of examples of class where enumerates training examples

    • Weight represents how likely is true location : such that

    Incorporating Users • is grid of images for each question

    Incorporate independent user response as:

    Method Avg. #Qs

    CV, Color Similarity 2.70

    CV, Shape Similarity 2.67

    CV, Pattern Similarity 2.67

    CV, Color/Shape/Pattern Similarity 2.64

    No CV, Color/Shape/Pattern Similarity 4.21

    Multiple Metrics • System supports multiple similarity

    metrics as different types of questions

    • Simulate perceptual spaces using CUB-200-2011 attribute annotations

    Learned Embedding • Learn category-level embedding of

    = 200 nodes • Category-level embedding requires

    much fewer comparisons compared to at the instance-level

    Catherine Wah1 Grant Van Horn1 Steve Branson2 Subhransu Maji3 Pietro Perona2 Serge Belongie4 1University of California, San Diego

    vision.ucsd.edu {cwah@cs,gvanhorn@}ucsd.edu

    2California Institute of Technology vision.caltech.edu

    {sbranson,perona}@caltech.edu

    3 Toyota Technological Institute at Chicago ttic.edu

    [email protected]

    4 Cornell Tech tech.cornell.edu

    [email protected]

    TRAI

    N T

    IME

    TEST

    TIM

    E

    2) Computer Vision

    |

    3) Learn Perceptual Embedding

    2) Collect Similarity Comparisons

    ?

    1) Image Database w/ Class Labels

    Mallard

    Cardinal

    1) Query Image 3) Human-in-the-Loop Categorization

    |,

    ?

    Problem • Parts and attributes exhibit weaknesses

    Scalability issues; costly; reliance on experts, but experts are scarce Proposed Solution • Use relative similarity comparisons to reduce dependence on expert-

    derived part and attribute vocabularies

    Contributions • We present an efficient, flexible, and scalable system for interactive

    fine-grained visual categorization Based on perceptual similarity Combines similarity metrics and computer vision methods in a unified framework

    • Outperforms state-of-the-art relevance feedback-based and part/attribute-based approaches

    Is this more similar to…

    This one?

    Or this one?

    , > , , > , , > , , > , , > , , > , , > , , > , …

    ?

    ( , ): perceptual similarity between images and

    A

    B

    Q1: Most Similar? Query Image

    Q2: Most Similar?

    Vermilion Fly-

    catcher

    Q1: Most Similar By Color? Q2: Most Similar By Pattern?

    Hooded Merganser

    Query Image

    Interactive Categorization • Similarity comparisons are advantageous compared to part/attribute questions • Using computer vision reduces the burden on the user • Intelligently selecting image displays reduces effort • The system is robust to user noise

    Deterministic users No computer vision

    Deterministic users With computer vision

    Simulated noisy users With computer vision

    Learning a Metric • Given set of triplet comparisons , learn

    embedding of training images with stochastic triplet embedding [van der Maaten & Weinberger 2012]

    • From , generate similarity matrix ×

    Selecting the Display • Approximate solution: maximizes

    expected information gain in terms of entropy of , , |

    • Group images into equal-weight clusters [Fang & Geman 2005]

    • From each cluster, select image with largest

    = , , more similar to than

    Query image Class

    Time step User responses at

    True location of in perceptual space

    Computer Vision • Easy to map off-the-shelf CV

    algorithms into framework, e.g., multiclass classification scores

    , |

    , | , , | = , , |

    = , , | = | , , ,

    = ( , )

    ( , )

    , | , ,

    = , , | = | , , ,

    =

    =

    Efficient update rule: Initialize weights = , Update weights when user answers a similarity question Update per-class probabilities

    A

    A

    B

    B

    C

    C

    D

    D

    D

    D

    Similarity Comparisons for Interactive Fine-Grained Categorization

    Subhransu Maji

    Slide Number 1