Modeling Search Computing Applications


Page 1: Modeling Search Computing Applications

Modeling Search Computing Applications

Alessandro Bozzon, Marco Brambilla, Alessandro Campi, Stefano Ceri, Francesco Corcoglioniti, Piero Fraternali, Salvatore Vadacca

ICWE 2010, Vienna

Page 2: Modeling Search Computing Applications

SeCo – Search Computing

Motivating Examples

“Who are the strongest candidates in Europe for competing on software ideas?”

“Who is the best doctor who can cure insomnia in a close-by hospital?”

“Where can I attend an interesting scientific conference in my field and at the same time relax on a beautiful beach nearby?”

This information is available on the Internet, but no software system is capable of computing the answer.

Queries span multiple semantic domains and require composing the rankings of results.

Page 3: Modeling Search Computing Applications


Their Common Aspect

Multi-domain queries

Individual answers are on the Web

A knowledgeable user would do the query step-by-step:
– Search database conferences, get their city
– Check that the city average temperature is warm enough
– Search low-cost flights via a broker for that city
– Search luxury hotels via another broker

We want a system for supporting this search process:
– Build several “solutions” which already integrate all dimensions
– Rank “solutions” according to a rank function and output results in rank order
– Possibly add dimensions while the search proceeds or change the relative weight of each search
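The idea of building and ranking integrated "solutions" can be sketched in Python as follows. This is a minimal illustration under simplifying assumptions, not the SeCo implementation: the names `Item` and `rank_solutions`, the sample data, and the choice of a weighted sum as rank function are all hypothetical.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Item:
    name: str
    city: str
    score: float  # assumed normalized to [0, 1], higher is better


def rank_solutions(conferences, hotels, weights):
    """Combine items that share a city and rank the combinations.

    The rank function is a weighted sum of the individual scores
    (an assumption made for this sketch).
    """
    w_conf, w_hotel = weights
    solutions = [
        (w_conf * c.score + w_hotel * h.score, c, h)
        for c, h in product(conferences, hotels)
        if c.city == h.city
    ]
    # Output in rank order: best combined score first.
    return sorted(solutions, key=lambda s: s[0], reverse=True)


# Hypothetical sample data.
conferences = [Item("ConfA", "Vienna", 0.9), Item("ConfB", "Rome", 0.6)]
hotels = [Item("HotelX", "Vienna", 0.5), Item("HotelY", "Rome", 0.95)]

ranked = rank_solutions(conferences, hotels, weights=(0.5, 0.5))
# Changing the relative weight of each search can change the order.
reweighted = rank_solutions(conferences, hotels, weights=(0.9, 0.1))
```

With equal weights the Rome combination ranks first; when the conference score dominates, the Vienna combination does. Adding a dimension would amount to joining a further result list and extending the weighted sum.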

Page 4: Modeling Search Computing Applications


Search Computing architecture: overall view

[Figure: main query flow through the Search Computing architecture]

Components shown: Front End, Query Analysis, Query To Domain Mapper, Query Planner, Query Engine (operators OP 1, OP 2, ... OP N), Result Transformation, Domain Framework, Domain Repository, WS-Framework, Service Repository, WS World. Several components have an attached Cache. Components are linked by a <Uses> relation.

Data exchanged: High-Level Query, Sub-queries, Low-level queries, Concrete Query Plan, Merged Results, Final User Results.

Example

High-level query: “Where can I attend a DB scientific conference close to a beautiful beach reachable with cheap flights?”
– Sub-query 1: “Where can I attend a DB scientific conference?”
– Sub-query 2: “place close to a beautiful beach?”
– Sub-query 3: “place reachable with cheap flight?”

Low-level queries:
– Low-level query 1: ConfSearch(“DB”, placeX, dateY)
– Low-level query 2: TourSearch(“Beach”, PlaceX)
– Low-level query 3: Flight(“cost<200”, PlaceX, DateY)

Steps: Query plan, Services invocations and operators execution, Results, Presented results

Presented results:
– MSVVEIS’08 – Barcelona – Iberia
– LID’08 – Rome – Alitalia
– RCIS’08 – Marrakech – AirFrance
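To make the decomposition concrete, the following Python sketch chains three stub functions that stand in for the low-level queries ConfSearch, TourSearch and Flight. Everything inside the stubs is an assumption: the conference names, cities and airlines are taken from the slide, while the dates, prices, the extra non-matching entries and all function signatures are placeholders invented for illustration.

```python
# Stub data. Dates ("date-1", ...) and prices are invented placeholders.
CONFERENCES = [
    ("MSVVEIS'08", "Barcelona", "date-1"),
    ("LID'08", "Rome", "date-2"),
    ("RCIS'08", "Marrakech", "date-3"),
    ("OtherConf", "Nowhere", "date-4"),  # hypothetical non-matching entry
]
BEACH_PLACES = {"Barcelona", "Rome", "Marrakech"}  # what the stub reports
FLIGHTS = {
    ("Barcelona", "date-1"): [("Iberia", 150)],
    ("Rome", "date-2"): [("Alitalia", 120), ("OtherAir", 450)],
    ("Marrakech", "date-3"): [("AirFrance", 180)],
}


def conf_search(topic):
    """Stands in for ConfSearch("DB", placeX, dateY); ignores the topic."""
    return list(CONFERENCES)


def tour_search(kind, place):
    """Stands in for TourSearch("Beach", PlaceX)."""
    return kind == "Beach" and place in BEACH_PLACES


def flight(max_cost, place, date):
    """Stands in for Flight("cost<200", PlaceX, DateY)."""
    return [(a, c) for a, c in FLIGHTS.get((place, date), []) if c < max_cost]


def answer(topic, max_cost):
    """Pipe the place and date of each conference into the other services."""
    results = []
    for conf, place, date in conf_search(topic):
        if not tour_search("Beach", place):
            continue
        for airline, _cost in flight(max_cost, place, date):
            results.append((conf, place, airline))
    return results


presented = answer("DB", 200)
```

The output of the first service (place and date) feeds the inputs of the other two, which is the kind of dependency a query plan has to schedule.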

Page 5: Modeling Search Computing Applications


Service Registration

Workshop sessions:

• Semantic Resource Framework

• Wrapping Technology and Ontological Annotation

• Search Computing and Research Evaluation

Service Marts:

• Conceptual representation of resources as entities and connections

• Logical representation of signatures

• Physical representation as service implementations

Page 6: Modeling Search Computing Applications


Query Processing


Query Planner includes:

Language for querying services

Models for building (top-k vs top-flow) query plans

Methods for query optimization

Query Engine includes:

Panta Rhei, a query execution model.

Workshop sessions:

• Query Processing

• Rank Join

Page 7: Modeling Search Computing Applications


Front-end Research in the SeCo framework


Liquid Query

Client-side framework for configuration and automatic rendering of query and result interfaces

User interaction primitives that support exploratory search

Workshop sessions:

• Search as a Process

• Visual Interfaces for Complex Search

Page 8: Modeling Search Computing Applications


Model Driven Development Process of SeCo Applications

[Figure: development process organized in four phases]

1. Search Service Development: Implement search service
2. Service Adaptation and Registration: Wrap or materialize service; Register service mart and interface
3. Application Configuration: Design Query Template (Expert user)
4. Query Plan Refinement: decision “Manual optimization needed?” (Y/N); Refine Query Plan (SeCo expert)

Other roles shown: Service developer, Service publisher

Models shown: Service Mart model, Liquid Query model, Query Plan model

Page 9: Modeling Search Computing Applications


A Model-driven Perspective on Search Computing

MDE approaches applied to search computing:

1. metamodels describing the objects of interest
• shared knowledge and vision
• bases for future tool interoperability

2. specification of applications through model transformations
• formalized representation of the intended semantics
• tool interoperability

3. definition of a domain specific language (DSL) for query processing
• simplified definition and visual representation of the query manipulation processes

Page 10: Modeling Search Computing Applications


SeCo: MDE Overview

The SeCo system can be seen as a set of models and model transformations
– At design time
– At runtime (Query plan execution)

[Figure: SeCo models and transformations, split into DESIGN TIME and RUN TIME areas. Models shown: Service Mart Model, Query Model, Query Plan Model, Query Parameters, Result Model. Transformation shown: QueryToPlan, driven by Designer Choices. Two of the models are shown at three levels: Conceptual level (CIM), Logical level (PIM), Physical level (PSM).]

Page 11: Modeling Search Computing Applications


SeCo Overview: Models

4 artifact models – Service Mart, Query, Query Parameters, Result

A query plan model
– For the runtime query transformation


Page 12: Modeling Search Computing Applications


Service Mart Metamodel

[Figure: UML class diagram of the Service Mart metamodel; package labels: Composition, Marts, Interfaces, Patterns]

Classes and attributes:
– ServiceMart: id: Integer; name: String[0..1]; description: String[0..1]; semantics: Semantics[0..1]; domain: Semantics[0..1]
– Attribute: id: Integer; name: String[0..1]; description: String[0..1]; semantics: Semantics[0..1]
– ComposedAttribute: averageCardinality: Integer
– AtomicAttribute: type: dataType
– ConnectionPattern: id: Integer; name: String[0..1]; description: String[0..1]
– AttributeConstraint: operator: RelationalOperator
– RankingType: id: Integer; name: String[0..1]; description: String[0..1]; semantics: Semantics[0..1]; rankingFunction: Expression[0..1]; rankingDirection: SortDirection
– AccessPattern: id: Integer; name: String[0..1]; description: String[0..1]
– ServiceInterface: id: Integer; name: String[0..1]; description: String[0..1]; erspi: Float; cacheable: Boolean; cacheTimeToLive: Milliseconds[0..1]; cost: Pricing[0..1]; endpointURI: URI
– SearchServiceInterface: decayFun: Expression; chunked: Boolean; chunkSize: Integer[0..1]; initTime: Milliseconds; fetchTime: Milliseconds
– ExactServiceInterface: fetchTime: Milliseconds
– ServiceConnection
– AttributeDirection <<enumeration>>: IN, OUT, RANKING
– Attribute (second occurrence in the diagram): +dir: AttributeDirection

Page 13: Modeling Search Computing Applications


Service Mart Metamodel

ServiceMart
– A ServiceMart is an abstraction (e.g., Hotel) of one or more Web service implementations (e.g., Bookings and Expedia)
– capable of accepting queries and of returning results
– possibly ranked and chunked into pages

Attribute
– ServiceMart contains Attributes
– Attributes can be Atomic or Composite

AccessPattern
– An AccessPattern specifies RankingType and AttributeDirection (I/O) for every Attribute of the ServiceMart, thus allowing its actual invocation

ConnectionPattern
– is defined as an input-output relationship between pairs of service marts that can be exploited for joining them
• e.g., the output city of the Concert can be used as input for the Hotel

ServiceInterface
– physical interface of the service, with details about chunk size, cost, …
– Exact or Search (ranked)
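The relationship between service marts, access patterns and connection patterns can be sketched with Python dataclasses. This is a deliberately reduced illustration, not the SeCo metamodel itself: only a few of the concepts are modeled, attributes are plain strings, and the method names (`names`, `is_valid`) as well as the Concert and Hotel attribute lists are assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class AttributeDirection(Enum):
    IN = "IN"
    OUT = "OUT"
    RANKING = "RANKING"


@dataclass
class ServiceMart:
    name: str
    attributes: list  # attribute names, simplified to strings


@dataclass
class AccessPattern:
    mart: ServiceMart
    directions: dict  # attribute name -> AttributeDirection

    def __post_init__(self):
        # An access pattern must give a direction to every attribute.
        missing = set(self.mart.attributes) - set(self.directions)
        if missing:
            raise ValueError(f"no direction for attributes: {sorted(missing)}")

    def names(self, direction):
        return {a for a, d in self.directions.items() if d is direction}


@dataclass
class ConnectionPattern:
    source: AccessPattern
    target: AccessPattern
    pairs: list  # (source output attribute, target input attribute)

    def is_valid(self):
        outs = self.source.names(AttributeDirection.OUT)
        ins = self.target.names(AttributeDirection.IN)
        return all(s in outs and t in ins for s, t in self.pairs)


D = AttributeDirection
concert = ServiceMart("Concert", ["artist", "city", "date"])
hotel = ServiceMart("Hotel", ["city", "name", "stars"])
concert_ap = AccessPattern(concert, {"artist": D.IN, "city": D.OUT, "date": D.OUT})
hotel_ap = AccessPattern(hotel, {"city": D.IN, "name": D.OUT, "stars": D.RANKING})

# The output city of the Concert feeds the input city of the Hotel.
concert_to_hotel = ConnectionPattern(concert_ap, hotel_ap, [("city", "city")])
hotel_to_concert = ConnectionPattern(hotel_ap, concert_ap, [("city", "city")])
```

Under these access patterns the Concert-to-Hotel connection is usable for a join, while the reverse is not, because `city` is an input of Hotel rather than an output.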

Page 14: Modeling Search Computing Applications


Query Metamodel

LogicalQuery
– is a conjunctive query over services
– can be defined at an abstract level (AccessPatternLevelQuery) or at physical level (InterfaceLevelQuery)

QueryClause
– a LogicalQuery is composed of a set of QueryClauses
– a QueryClause can refer to the SM level or to the SI level
– Several types of clauses

[Figure: UML class diagram of the Query metamodel; package labels: Service marts, Logical queries]

Classes and attributes:
– LogicalQuery: id: String
– InterfaceLevelQuery
– MartLevelQuery
– QueryClause: id: String
– RankingClause: direction: SortDirection; weight: Float
– JoinClause
– InvocationClause
– PredicateClause: condition: Expression
– Referenced classes: ServiceMart, ServiceInterface, ConnectionPattern, Attribute

Association roles shown: +clauses (0..*), +serviceMart (1), +selectedInterface (0..1), +connectionPattern (1), +source (1), +target (1), +rankedInvocation (1), +rankedAttribute (1)
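One possible reading of the query metamodel, written as Python dataclasses. The assignment of association roles to clause types (for example `source` and `target` on the join clause) is an interpretation of the role names in the diagram, and the `dangling_references` check is a hypothetical helper added for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class InvocationClause:
    alias: str
    service_mart: str


@dataclass
class JoinClause:
    source: str  # alias of an invocation clause (assumed)
    target: str  # alias of an invocation clause (assumed)
    connection_pattern: str


@dataclass
class PredicateClause:
    condition: str


@dataclass
class RankingClause:
    ranked_invocation: str
    ranked_attribute: str
    direction: str
    weight: float


@dataclass
class LogicalQuery:
    id: str
    clauses: list = field(default_factory=list)

    def invoked_aliases(self):
        return {c.alias for c in self.clauses if isinstance(c, InvocationClause)}

    def dangling_references(self):
        """Aliases used by join or ranking clauses but never invoked."""
        known = self.invoked_aliases()
        missing = set()
        for c in self.clauses:
            if isinstance(c, JoinClause):
                missing |= {c.source, c.target} - known
            elif isinstance(c, RankingClause):
                missing |= {c.ranked_invocation} - known
        return missing


query = LogicalQuery("q1", [
    InvocationClause("c", "Concert"),
    InvocationClause("h", "Hotel"),
    JoinClause("c", "h", "ConcertToHotel"),
    PredicateClause("h.stars >= 4"),
    RankingClause("h", "stars", "DESC", 0.7),
])
broken = LogicalQuery("q2", [
    InvocationClause("c", "Concert"),
    JoinClause("c", "r", "ConcertToRestaurant"),
])
```

Because the query is conjunctive, the clause list is read as a set of conditions that must all hold; the order of the clauses carries no meaning in this sketch.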

Page 15: Modeling Search Computing Applications


SeCo Overview: Transformations

1. Vertical transformations for Queries and ServiceMarts

2. QueryToPlan transformation

3. Query Execution transformation (at runtime)

4. Result transformation (at runtime)


Page 16: Modeling Search Computing Applications


Vertical transformations

For moving among different conceptualization levels

For providing recommendations

For transforming information

Examples:
– service mart and query: for moving from conceptual to logical to physical level
– result:
• for reshaping the data in the result set (exploratory approach implemented by Liquid Query)
• for enriching the results with personalization and recommendations

Page 17: Modeling Search Computing Applications


Query Execution transformation

Query execution as a transformation
– model of the query parameters -> model of the query results

Represented as a Query Plan model
– well-defined scheduling of service invocations, possibly parallelized, that complies with their service interface and exploits the ranking in which search services return results to rank the combined query results

QueryPlan metamodel + Concrete Syntax = Panta Rhei Language

Page 18: Modeling Search Computing Applications


Query plan metamodel

[Figure: UML class diagram of the Query Plan metamodel; package labels: Execution plans, Service marts]

Classes and attributes:
– ExecutionPlan: id: Integer
– Edge: id: String
– ControlFlow: type: ControlType
– DataFlow
– Node: id: String; monitoredAttributes: String[0..*]
– Input
– Output Modifier: filter: Expression
– Selector: filter: Expression
– Sorter: criteria: SortCriteria; blocking: Boolean
– Chunker: chunkSize: Int[0..1]; stop: Int[0..1]
– Joiner: predicate: Expression
– PipeController: tilesPerFetch: Int; strategy: PipeStrategy
– ParallelController: tilesPerFetch: Int; strategy: ParallelStrategy
– Service: alias: String
– ExactService
– SearchService
– AttributeBinding: binding: Expression
– Referenced classes: Attribute, ServiceInterface

Association roles shown: +edges (0..*), +nodes (0..*), +attributeBindings (0..*), +source with +outgoingEdges (1, 0..*), +target with +ingoingEdges (1, 0..*), +attribute (1), +serviceInterface (1)

Page 19: Modeling Search Computing Applications


Transformations: Panta Rhei

Panta Rhei
– describes both the execution flow and the data flow between nodes
– Several types of nodes exist
• service invokers, sorting, join, and chunk operators, clocks (defining the frequency of invocations), caches, and others

The query result model is constructed stepwise, following the execution flow
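A dataflow of this kind can be approximated with Python generators, where each node consumes the tuples of its predecessor. This is only a sketch of the general idea and not the Panta Rhei execution model: the function names, the sample data, and in particular the reading of the Chunker's `stop` as a maximum number of chunks are assumptions.

```python
from itertools import islice


def service(results):
    """Stands in for a service invocation node; yields result tuples."""
    yield from results


def selector(tuples, predicate):
    """Keeps only the tuples that satisfy the filter."""
    return (t for t in tuples if predicate(t))


def sorter(tuples, key, reverse=False):
    """Blocking sorter: consumes all input before emitting anything."""
    yield from sorted(tuples, key=key, reverse=reverse)


def chunker(tuples, chunk_size, stop=None):
    """Groups tuples into chunks; `stop` caps the number of chunks (assumed)."""
    it = iter(tuples)
    emitted = 0
    while stop is None or emitted < stop:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk
        emitted += 1


# Hypothetical hotel tuples: (name, stars).
hotels = [("A", 3), ("B", 5), ("C", 4), ("D", 2), ("E", 5)]


def plan(stop=None):
    flow = service(hotels)
    flow = selector(flow, lambda t: t[1] >= 3)
    flow = sorter(flow, key=lambda t: t[1], reverse=True)
    return list(chunker(flow, chunk_size=2, stop=stop))


all_chunks = plan()
first_chunk_only = plan(stop=1)
```

The result is built stepwise as tuples move along the chain, and nothing downstream of the sorter runs until the sorter has seen its whole input, which is what the `blocking` flag in the metamodel suggests.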

Page 20: Modeling Search Computing Applications


Transformations: Query to Plan (1/2)

1st phase: an ATL helper (functional program) encapsulates the scheduling algorithm of the execution plan.
– The function produces a representation of a partial order of the clauses
– Several very different scheduling algorithms can be used in this phase, and the transformation structure makes it easy to swap the preferred one, also at runtime

2nd phase: generates the output Panta Rhei query plan. In this phase the following mappings are assumed:
– Invocation clauses become Service invocation nodes
– Join clauses become parallel joins or pipe joins
– The connections between the nodes are generated based on the ordering calculated in the first phase

A Higher Order Transformation (HOT) could be used to automatically modify the logic of the plan, based on domain-specific needs or insights
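The two phases can be mimicked in Python to show the structure of the transformation. This is not the ATL code: the scheduling strategy (a topological sort via the standard `graphlib` module, Python 3.9+), the function names, the node labels, and the simplification that every join is realized as a pipe join are all assumptions made for the sketch.

```python
from graphlib import TopologicalSorter


def schedule(invocations, joins):
    """Phase 1: order the invocations so that, for every join,
    the source service is scheduled before the target service."""
    sorter = TopologicalSorter({alias: set() for alias in invocations})
    for source, target in joins:
        sorter.add(target, source)  # target depends on source
    return list(sorter.static_order())


def query_to_plan(invocations, joins, scheduler=schedule):
    """Phase 2: turn the ordering into plan nodes and edges.

    Each invocation clause becomes a service node; consecutive nodes
    are connected in sequence, i.e. joins are treated as pipe joins.
    """
    order = scheduler(invocations, joins)
    nodes = ["input"] + [f"service:{alias}" for alias in order] + ["output"]
    edges = list(zip(nodes, nodes[1:]))
    return nodes, edges


# Hypothetical query: conference -> flight -> hotel.
invocations = ["hotel", "conference", "flight"]
joins = [("conference", "flight"), ("flight", "hotel")]

nodes, edges = query_to_plan(invocations, joins)

# The scheduler is a parameter, so a different strategy can be swapped in.
reversed_plan = query_to_plan(
    invocations, joins, scheduler=lambda inv, j: sorted(inv, reverse=True)
)
```

Passing the scheduler as a parameter mirrors the point that the scheduling algorithm is isolated in the first phase and can be replaced without touching node generation.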

Page 21: Modeling Search Computing Applications


User interaction metamodel

Implemented by the Liquid Query paradigm
– See: http://demo.search-computing.org

[Figure: UML class diagram of the user interaction metamodel; package labels: LiquidQueryType, LiquidQuery Instance]

Classes and attributes:
– ConcreteQuery: +ID: Integer; +Name: String
– AvailableOperation: +ID: Integer
– Operation: +ID: Integer; +Name: String
– Parameter: +ID: Integer; +Name: String; +Type: String; +DefaultValue: String
– Filter: +FilterAttribute: String; +Condition: String; +DefaultValue: String
– Expand: +ServiceMartName: String
– Group: +AttributeName: String
– LogicalQuery: +ID: Integer
– LiquidQueryInstance: +ID: Integer; +TimeStamp: Time
– OperationInstance: +TimeStamp: Time
– ParameterInstance: +ID: Integer; +Value: String
– LiquidResultSet: +ID: Integer; +TimeStamp: Time
– LiquidResult: +ID: Integer

Association roles shown: +QueryParameters, +InstancedQuery, +QueryOperations, +InstancedOperation, +OperationParameters, +QueryResults, +InstancedValues, +QueryParameterInstance, +ResultInstances, +OperationParametersInstance, +QueryImplementation
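Two of the operations named in the metamodel, Filter and Group, can be illustrated on a result set held as a list of dictionaries. This is a hedged sketch of what such operations might do on the client side, not the Liquid Query implementation: the function names, the set of supported conditions, and the sample results are assumptions.

```python
import operator
from collections import defaultdict

# Conditions supported by this sketch (an assumption).
CONDITIONS = {"==": operator.eq, ">=": operator.ge, "<=": operator.le}


def apply_filter(results, filter_attribute, condition, value):
    """Keep the results whose attribute satisfies the condition."""
    test = CONDITIONS[condition]
    return [r for r in results if test(r[filter_attribute], value)]


def apply_group(results, attribute_name):
    """Group the results by the value of one attribute."""
    groups = defaultdict(list)
    for r in results:
        groups[r[attribute_name]].append(r)
    return dict(groups)


# Hypothetical result set.
results = [
    {"hotel": "H1", "city": "Rome", "stars": 4},
    {"hotel": "H2", "city": "Rome", "stars": 2},
    {"hotel": "H3", "city": "Vienna", "stars": 5},
]

good_hotels = apply_filter(results, "stars", ">=", 4)
by_city = apply_group(good_hotels, "city")
```

Each call corresponds to one operation instance applied to the current result set, so a sequence of calls reproduces an exploratory session step by step.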

Page 22: Modeling Search Computing Applications


Model Transformation Challenges

Specification of mappings for data extraction
– Simple interface based on MT
– e.g. using Model Weaving, Transformations by Example

Transformations for building views of the results
– views and viewpoints on models
– i.e. model transformations to filter or change the representation of a given data set

Search process orchestration in light of model transformations
– the Panta Rhei DSL can be seen as a model transformation
– formalization is needed to represent query plans as composition of operations on models

Search on query models
– Search within the domain of the queries themselves
– Ex: most typical queries and their relationship to usage patterns

Page 23: Modeling Search Computing Applications


Experiments and prototypes

Main SeCo concept models in ECORE

Implemented ATL transformation that generates the query plan from query and service mart definitions, using trivial strategies

Future work: implementing different optimization strategies, by adopting rule-based optimization (old concept in the DB field)

Prototypes available online: http://dbgroup.como.polimi.it/brambilla/SeCoMDA

Page 24: Modeling Search Computing Applications


Conclusions

Search Computing as integration of several interacting models

Partition of the design space and responsibilities among the different roles and involved expertise, in a non-trivial way

Objective: replace programming with model-driven development wherever possible, yielding flexibility and efficiency

A model transformation approach is a good tool for clarifying the problem and solution space

Probably not viable for actual implementation of the search system, because of performance/scalability issues

Current status of the project and state of the art recorded in the book: Search Computing Challenges and Directions (Springer LNCS, vol. 5950, Ceri-Brambilla eds.)
– Part 1: Visions by Ceri, Baeza-Yates, Weikum
– Part 2: Technology Watch
– Part 3: Issues in Search Computing

Page 25: Modeling Search Computing Applications


Thanks!

Questions?