Dr. Virendra C. Bhavsar - Faculty of Computer Science | UNBwdu/ssworkshop/submissions/bhavsar... ·...


Page 1: Dr. Virendra C. Bhavsar - Faculty of Computer Science | UNBwdu/ssworkshop/submissions/bhavsar... · 2010-06-25 · Dr. Virendra C. Bhavsar ... graphics, software engineering, natural


Dr. Virendra C. Bhavsar

Professor and Director, Advanced Computational Res. Lab.

Faculty of Computer Science

University of New Brunswick (UNB)

Fredericton, Canada

[email protected]

Thanks:

BCS Student: Marcel Ball

MCS Students: Anurag Singh, Jin Jing, Sebastien Mathieu, Jie Li

PhD Student: Lu Yang

Post-Doctoral Fellows: Dr. Biplab Sarker and Dr. Manish Joshi

Collaborators: Dr. Riyanarto Sarno and Dr. Harold Boley

June 14, 2010

Semantic Matching


Virendra C. Bhavsar

• UNB: since 1983; > 35 years of software research and development experience

• Interests: real-time embedded systems, computer graphics, software engineering, natural language processing, databases, bioinformatics, parallel computing, artificial intelligence, …

• Bioinformatics - Canadian Potato Genomics Project

• Atlantic Computational Excellence Network (ACEnet): ~30 million Atlantic Canada project in high performance computing

• Semantic Matching


Outline

• Syntactic Matching

• Semantic Matching

• Semantic Matching: Taxonomy, Ontology and Partonomy

• UNB Semantic Matching Engines – Applications

• Conclusion


Syntactic Matching

Exact String Matching

• Binary result: 0.0 or 1.0

Permutation of strings

• “Java Programming” versus “Programming in Java”

$$\text{similarity} = \frac{\text{number of identical words}}{\text{maximum length of the two strings}}$$

Example 1

For two node labels “a b c” and “a b d e”, their similarity is $\frac{2}{4} = 0.5$.


Syntactic Matching

Example 2

Node labels “electric chair” and “committee chair”: similarity is $\frac{1}{2} = 0.5$. Is this meaningful?

• Syntactic matching does not consider additional domain knowledge

• Semantic matching techniques are needed for the above problems
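
The word-overlap measure used in Examples 1 and 2 can be sketched in Python as follows. This is a minimal illustration, not the UNB implementation: the function name `word_overlap_similarity` is hypothetical, and the choices to split on whitespace, compare case-sensitively, and count repeated words as a multiset are assumptions the slides do not specify.

```python
from collections import Counter


def word_overlap_similarity(label_a: str, label_b: str) -> float:
    """Identical words divided by the word count of the longer label."""
    words_a = label_a.split()
    words_b = label_b.split()
    if not words_a and not words_b:
        return 1.0  # assumption: two empty labels are treated as identical
    # Multiset intersection: each shared word counts once per matching occurrence.
    shared = sum((Counter(words_a) & Counter(words_b)).values())
    return shared / max(len(words_a), len(words_b))


print(word_overlap_similarity("a b c", "a b d e"))                   # 0.5
print(word_overlap_similarity("electric chair", "committee chair"))  # 0.5
```

Both calls return 0.5, which shows the limitation raised in Example 2: the score is the same whether or not the shared word carries the same meaning.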


Semantic Matching Applications

• Semantic searching, e.g. Google

• e-Business

• e-Learning

• Matchmaking portals

• Information Retrieval

• Web Services

• Information Integration

• Semantic Web


Semantic Matching

• Examples

{Car : Truck} {Toyota Corolla : Toyota Camry}

{Car : Automobile} {Car : Apple}

• Semantic Similarity versus Semantic Distance

Matching of: words, short texts, documents, schemas/structures, pictures, videos

• Taxonomy

• Partonomy

• Ontology


Taxonomy

• Practice and science of classification


Ontology

• Domain Ontology: Explicit formal specifications of the terms in a domain and relations among them

• Upper Ontology: Across domains


Concept Similarity in a Taxonomy

Given a taxonomy and two concepts (e.g., A and B), find the semantic similarity of the two concepts

[Figure: taxonomy with two concepts, A and B, marked]


Concept Similarity in a Taxonomy

[Figure: taxonomy fragment with a numeric value attached to each concept]

{Produce, Green goods} 3.034

{Fruit} 3.374

{Apple} 3.945

{Berry} 4.907

{Banana} 5.267

{Boxberry} 7.576

{Cranberry} 6.285
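
One common way to turn per-concept values like these into a similarity is to take the value of the most specific ancestor that two concepts share. The Python sketch below illustrates that idea only. It rests on two assumptions the slide does not confirm: that the numbers are specificity scores that grow with depth, and that the tree has the shape Produce → Fruit → {Apple, Berry, Banana} with Berry → {Boxberry, Cranberry}. The names `PARENT`, `SCORE`, and `shared_ancestor_score` are hypothetical.

```python
# Assumed tree shape (child -> parent), inferred from the slide layout.
PARENT = {
    "Fruit": "Produce",
    "Apple": "Fruit",
    "Berry": "Fruit",
    "Banana": "Fruit",
    "Boxberry": "Berry",
    "Cranberry": "Berry",
}

# Values copied from the slide; their interpretation as scores is an assumption.
SCORE = {
    "Produce": 3.034,
    "Fruit": 3.374,
    "Apple": 3.945,
    "Berry": 4.907,
    "Banana": 5.267,
    "Boxberry": 7.576,
    "Cranberry": 6.285,
}


def path_to_root(concept):
    """Return the concept followed by its ancestors, most specific first."""
    path = [concept]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def shared_ancestor_score(a, b):
    """Score of the most specific concept that subsumes both a and b."""
    ancestors_b = set(path_to_root(b))
    for concept in path_to_root(a):
        if concept in ancestors_b:
            return SCORE[concept]
    raise ValueError("concepts share no ancestor")


print(shared_ancestor_score("Boxberry", "Cranberry"))  # 4.907
print(shared_ancestor_score("Apple", "Cranberry"))     # 3.374
```

Under these assumptions, the two berries come out more similar to each other than an apple is to a cranberry, because their closest shared ancestor sits deeper in the taxonomy.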


Motivation

• More and more on-line transactions (e.g. e-Bay, Kijiji, etc.)

• Buyers and sellers input keywords and/or specify values for some product features

• A list of recommended sellers (with product advertisements) and/or buyers (with product requests) is presented

• Flat representation of products cannot represent the hierarchical “part-of” relationship of product parts

• Match-making is not precise

• Negotiation space is large


e-Business Applications

• e-business, e-learning …

• Buyer-Seller matching

• Metadata for buyers and sellers

• Keywords/keyphrases

[Figure: e-Market architecture showing User, Web Browser, Web, Main Server, User Info, User Profiles, User Agents, Agents, Matcher 1 … Matcher n, and a network link to other sites]


Semantic Matching ─ A Taxonomy Tree

• The taxonomy tree of “Programming Techniques” according to the ACM Computing Classification System

• Arc weights

[Figure: taxonomy tree rooted at “Programming Techniques” with nodes General, Applicative Programming, Automatic Programming, Concurrent Programming, Sequential Programming, Object-Oriented Programming, Distributed Programming, and Parallel Programming; arc weights shown: 0.6, 0.5, 0.8, 0.5, 0.9, 0.7, 0.7, 0.5]


Partonomy

• Tree representation for product/service descriptions

• Weights

[Figure: partonomy tree rooted at Car with arcs Make → Ford, Color → Black, and Year → 2002; arc weights shown: 0.3, 0.2, 0.5]


Similarity of Buyer and Sellers

[Figure: four partonomy trees, each rooted at Car with arcs Make, Color, and Year]

Tree      Make   Color   Year   Arc weights shown    Similarity to buyer
buyer     Ford   Black   2002   0.1, 0.1, 0.8
seller1   Ford   Red     2002   0.05, 0.05, 0.9      0.925
seller2   Ford   Red     2002   0.2, 0.2, 0.6        0.85
seller3   Ford   Red     2002   0.1, 0.6, 0.3        0.65
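
The three similarity values can be reproduced by hand under two assumptions that the flattened figure does not confirm. First, each shared arc contributes its local similarity $s_i$ (1 for equal leaves, 0 otherwise) multiplied by the average of the buyer's and seller's arc weights, $(w_i + w'_i)/2$, as in the expression on the later Partonomy Tree Similarity Algorithm slide. Second, the Color arc carries weight 0.1 for the buyer and 0.05, 0.2, and 0.6 for seller1, seller2, and seller3. Because Make and Year match in every pair and the weights of each tree sum to 1, only the Color weights affect the result:

$$
\begin{aligned}
\mathrm{sim}(\text{buyer}, \text{seller1}) &= 1 - \frac{0.1 + 0.05}{2} = 0.925 \\
\mathrm{sim}(\text{buyer}, \text{seller2}) &= 1 - \frac{0.1 + 0.2}{2} = 0.85 \\
\mathrm{sim}(\text{buyer}, \text{seller3}) &= 1 - \frac{0.1 + 0.6}{2} = 0.65
\end{aligned}
$$

The more weight a seller places on the mismatched Color attribute, the lower that seller ranks for this buyer.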


Semantic Matching ─ Local Similarity

• Local similarity measures for leaf nodes

• “Price” type

• “Date” type

• . . .


Semantic Matching ─ Price Matching Algorithm

• Pseudo code of the price-range similarity algorithm

PriceRangeSim ([Bpref, Bmax], [Smin, Spref])
Begin
    If Spref <= Bpref
        similarity = 1.0
    else if Bmax < Smin
        similarity = 0.0
    else if Bmax = Smin
        similarity = 0.005 / (MAX − MIN)
    else
    {
        MIN = min{MIN, Smin}
        MAX = max{MAX, Bmax}
        similarity = (Bmax − Smin) / (MAX − MIN)
    }
    return similarity
End.

• This algorithm can be easily adapted to other “price”-typed attributes, e.g. “salary range” in a job seeking and recruiting e-Market
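
The price-range pseudo code can be sketched as a runnable Python function. This is an illustration under stated assumptions rather than the engine's actual code: the function and parameter names are hypothetical, the running MIN and MAX values are passed in as `market_min` and `market_max`, the small constant used when the two ranges touch at a single price is taken to be 0.005, and the bounds are widened before both overlapping branches rather than only in the final one.

```python
def price_range_sim(b_pref, b_max, s_min, s_pref,
                    market_min, market_max, epsilon=0.005):
    """Similarity of a buyer range [b_pref, b_max] and a seller range [s_min, s_pref]."""
    if s_pref <= b_pref:
        return 1.0  # the seller's preferred price already satisfies the buyer
    if b_max < s_min:
        return 0.0  # the two ranges do not overlap
    # Widen the market-wide bounds so that they cover this pair.
    # Assumption: done up front, a simplification of the pseudo code.
    low = min(market_min, s_min)
    high = max(market_max, b_max)
    if high <= low:
        raise ValueError("market price range must have positive width")
    # Ranges that touch at one price get a small nominal overlap (assumed 0.005).
    overlap = epsilon if b_max == s_min else b_max - s_min
    return overlap / (high - low)


# Buyer prefers 100 and pays at most 150; seller accepts at least 120, prefers 140.
print(price_range_sim(100, 150, 120, 140, market_min=0, market_max=200))  # 0.15
```

Normalizing by the market-wide range keeps the result between 0 and 1, so it can be used as a local similarity at a leaf node alongside other attribute types.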


UNB Similarity Engines - Implementation

• Java Implementation

• Testing on systematically varied cases


Partonomy Tree Similarity Engine ─ eLearning Application

• eduSource e-Learning project

• Learning Object Metadata Generator: LOMGen

[Figure: system architecture. Components: UI (Java), Similarity Engine (Java), Translator (XSLT), CANLOM (XML), Prefilter (SQL), LOMGen (Java), LOR (HTML), and Database (Access) with a keyword table. Actors: end user and administrator. Labeled flows: user input, administrator input, prefilter parameters (Query URI), HTML files, partial CanCore files, CanCore files, prefiltered CanCore files, WOO RuleML files, recommended results, and search results. Steps are numbered (1) to (8) and (a) to (c).]


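Partonomy Tree Similarity Algorithm ─ Similarity Algorithm

• Partonomy similarity [Bhavsar et al. 2004]

$\sum_i \left( s_i \, \frac{w_i + w'_i}{2} \right)$ and $\sum_i \left( A(s_i) \, \frac{w_i + w'_i}{2} \right)$, where $A(s_i) \ge s_i$

[Figure: two learning object trees, t and t′, both rooted at lom. Tree t has arcs educational, general, and technical (top-level arc weights shown: 0.3334, 0.3333, 0.3333) leading to edu-set, gen-set, and tec-set; gen-set has language → en and title → Introduction to Oracle; tec-set has format → HTML and platform → WinXP; the four lower arcs each show weight 0.5. Tree t′ has arcs general and technical (top-level arc weights shown: 0.7, 0.3) leading to gen-set and tec-set; gen-set has language → en and title → Basic Oracle; tec-set has format → * and platform → WinXP; lower arc weights shown: 0.1, 0.9, 0.8, 0.2.]

* : Don’t Care

Fragments of learning object trees [Boley et al. 2005] for learning object matching (http://www.cs.unb.ca/agentmatcher)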
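
A recursive version of the weighted arc-average idea on this slide can be sketched in Python. This is a simplified illustration, not the published algorithm or the UNB engine: the tuple encoding of trees, the name `tree_similarity`, the use of the square root as the adjustment function $A$ (which satisfies $A(s) \ge s$ on [0, 1]), and the rule that an arc present in only one tree contributes nothing are all assumptions. The example trees and their weights are made up for the demonstration.

```python
import math

# Assumed encoding: a tree is (node_label, arcs), where arcs maps
# arc_label -> (weight, subtree). A leaf has an empty arcs dict.


def tree_similarity(t, t_prime, adjust=math.sqrt):
    """Sum over shared arcs of adjust(s_i) * (w_i + w'_i) / 2, applied recursively."""
    label, arcs = t
    label_p, arcs_p = t_prime
    if "*" in (label, label_p):
        return 1.0  # "*" is the slide's Don't Care marker
    if label != label_p:
        return 0.0
    if not arcs and not arcs_p:
        return 1.0  # two matching leaves
    total = 0.0
    # Assumption: arcs that appear in only one tree contribute nothing.
    for arc in arcs.keys() & arcs_p.keys():
        w, sub = arcs[arc]
        w_p, sub_p = arcs_p[arc]
        s = tree_similarity(sub, sub_p, adjust)
        total += adjust(s) * (w + w_p) / 2
    return total


def make_tree(title):
    """Build a small hypothetical learning-object tree with the given title."""
    return ("lom", {
        "general": (0.5, ("gen-set", {
            "language": (0.5, ("en", {})),
            "title": (0.5, (title, {})),
        })),
        "technical": (0.5, ("tec-set", {
            "platform": (1.0, ("WinXP", {})),
        })),
    })


tree_a = make_tree("Introduction to Oracle")
tree_b = make_tree("Basic Oracle")

print(tree_similarity(tree_a, tree_b, adjust=lambda s: s))  # 0.75
print(tree_similarity(tree_a, tree_b))  # higher, because sqrt lifts partial matches
```

With the identity function the partial match in the general branch is simply averaged in; with an adjustment such as the square root, a subtree that matches partially is penalized less as its similarity propagates toward the root.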


Partonomy Tree Similarity Engine ─ Matchmaking Application

• Teclantic portal http://www.teclantic.ca


Current Work

• Weighted Tree Semantic Tree Similarity Engines

• Semantic searching

• Weighted Graph Similarity Engines

• Multi-core and cluster implementations

• Matchmaking portals


Conclusion

• UNB Weighted Tree Similarity Engines

• Semantic Global and Local Matching

• Applications: e-Learning, e-Business, Matchmaking portals, …

• Looking to license the UNB technology to commercial partners and adapt it to their needs


Publications

5 Journal papers

10 Conference papers

1 Book Chapter

4 MCS Theses

1 PhD Thesis


Looking for a Post-doctoral Fellow

to start working right now!

Thank you !