Understanding the Hidden Web
Pierre Senellart
Journées GEMO, 2nd June 2005

Introduction
Process description
Discovery
Wrappers
Semantic Analysis
Indexing and Querying
Summary
The Hidden Web
Definition (Hidden Web): The set of webpages (which may or may not be dynamically generated) not accessible from the hyperlinked structure of the World Wide Web.
Size estimate (2001): 500 times larger than the surface Web.
How to understand it and benefit from its content?
Web Service Semantic Interpretation Process
[Diagram: pipeline built up step by step, each arrow labeled with a processing step]
World Wide Web
→ discovery → HTML form, WSDL, UDDI
→ wrappers → WSDL
→ analysis → Analyzed Web Services
→ indexing → Web Services Index
→ querying → Results (from Queries)
Web Service Discovery
Crawling the World Wide Web for:
- HTML forms implementing a Web Service
- UDDI registries
- WSDL descriptions
- Other resources (XML, HTML, Web as a full-text index...)
Only interested in Web Services with no side effects:
- Ok: Yellow Pages, publication databases...
- Not Ok: booking services, mailing list management...
Wrapping Web Service Descriptions
Analyzing the structure of:
- HTML forms
- Result webpages
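As a rough illustration of what analyzing the structure of an HTML form could involve, the sketch below lists each form's target, method, and named fields using Python's standard `html.parser`. The class name `FormExtractor`, the helper `extract_forms`, and the sample page are all hypothetical; the talk does not say how the wrappers are actually built.

```python
from html.parser import HTMLParser


class FormExtractor(HTMLParser):
    """Collect each <form> with its action, method, and named input fields."""

    def __init__(self):
        super().__init__()
        self.forms = []       # one dict per form encountered
        self._current = None  # form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action") or "",
                "method": (attrs.get("method") or "get").lower(),
                "fields": [],
            }
            self.forms.append(self._current)
        elif tag in ("input", "select", "textarea") and self._current is not None:
            name = attrs.get("name")
            if name:  # unnamed controls (e.g. a bare submit button) carry no parameter
                kind = (attrs.get("type") or "text") if tag == "input" else tag
                self._current["fields"].append((name, kind))

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def extract_forms(html):
    """Parse an HTML string and return the list of form descriptions."""
    parser = FormExtractor()
    parser.feed(html)
    parser.close()
    return parser.forms


# Invented sample page, for illustration only.
SAMPLE_PAGE = """
<form action="/search" method="GET">
  <label>Title <input type="text" name="title"></label>
  <select name="year"><option>2005</option></select>
  <input type="submit" value="Go">
</form>
"""

forms = extract_forms(SAMPLE_PAGE)
```

The field names and kinds collected this way are the raw material a wrapper would need to describe the form as a service; the result pages would need a separate analysis that this sketch does not attempt.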
Conceptual Model
IsA ontology of concepts (simple DAG):
- Thing
  - Person
    - Man
    - Woman
  - Publication
    - Proceedings
    - Article
    - Book
n-ary typed roles:
- AuthorOf(Person, Publication)
- HasName(Person, Name)
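A minimal sketch of this conceptual model, assuming the diagram reads with Thing at the root, Person and Publication below it, and the remaining concepts as their children. The dictionary layout and the function names `ancestors`, `is_a`, and `well_typed` are illustrative choices, not taken from the talk.

```python
# Assumed reading of the slide's IsA diagram: concept -> direct parents.
# A list of parents is used so that the structure can be a DAG, not only a tree.
IS_A = {
    "Thing": [],
    "Person": ["Thing"],
    "Man": ["Person"],
    "Woman": ["Person"],
    "Publication": ["Thing"],
    "Proceedings": ["Publication"],
    "Article": ["Publication"],
    "Book": ["Publication"],
}

# Typed roles from the slide: role name -> tuple of argument concepts.
ROLES = {
    "AuthorOf": ("Person", "Publication"),
    "HasName": ("Person", "Name"),
}


def ancestors(concept):
    """Return the concept together with everything it IsA, walking the DAG upward."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(IS_A.get(current, []))
    return seen


def is_a(sub, sup):
    """True if `sub` is the same concept as `sup` or a descendant of it."""
    return sup in ancestors(sub)


def well_typed(role, *arg_concepts):
    """Check that each argument concept fits the role's declared argument type."""
    expected = ROLES[role]
    return len(arg_concepts) == len(expected) and all(
        is_a(actual, wanted) for actual, wanted in zip(arg_concepts, expected)
    )
```

With this encoding, `well_typed("AuthorOf", "Woman", "Article")` holds because Woman IsA Person and Article IsA Publication, while swapping the two arguments fails the check.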
Services and queries
Example: service giving authors from publication titles
A* ← WrittenBy(P,A), HasTitle(P,T), Input(T)
Query: service with no input
Example: <A,T*>* ← WrittenBy(P,A), Article(P), HasTitle(P,T), KeywordOf(“xml”,P)
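One way to read the first rule is as a conjunctive query whose input variable T is bound before evaluation. The sketch below evaluates it by a naive join over a tiny fact base. The facts, the single-uppercase-letter convention for variables, and the function names are assumptions made for illustration, and the starred output is treated here simply as a list of values.

```python
# Invented facts for illustration only; they are not from the talk.
FACTS = {
    ("WrittenBy", "p1", "Alice"),
    ("WrittenBy", "p1", "Bob"),
    ("WrittenBy", "p2", "Carol"),
    ("HasTitle", "p1", "Querying XML"),
    ("HasTitle", "p2", "Web Crawling"),
}


def is_var(term):
    """Variables are single uppercase letters (P, A, T), as written on the slide."""
    return len(term) == 1 and term.isupper()


def match(atom, fact, binding):
    """Unify one body atom with one fact; return the extended binding or None."""
    if len(atom) != len(fact) or atom[0] != fact[0]:
        return None
    new = dict(binding)
    for term, value in zip(atom[1:], fact[1:]):
        if is_var(term):
            # Bind the variable, or fail if it is already bound to something else.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def evaluate(body, binding):
    """Return every binding that satisfies all atoms of the body (nested-loop join)."""
    bindings = [binding]
    for atom in body:
        bindings = [
            extended
            for current in bindings
            for fact in FACTS
            if (extended := match(atom, fact, current)) is not None
        ]
    return bindings


# A* <- WrittenBy(P,A), HasTitle(P,T), Input(T)
# Input(T) is modeled by pre-binding T rather than by a stored fact.
AUTHORS_BY_TITLE = [("WrittenBy", "P", "A"), ("HasTitle", "P", "T")]


def authors_for_title(title):
    """Call the 'authors from publication titles' service on one title."""
    return sorted({b["A"] for b in evaluate(AUTHORS_BY_TITLE, {"T": title})})
```

A query with no input, like the second rule, would start from an empty binding and place its constant (“xml”) directly in the body atom.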
Semantic Interpretation of a Service
How to analyze a Web Service into this formalism?
- Field labels and variable names
- Example requests
- Concrete type descriptions
- Linguistic analysis of plain text descriptions
Web Service Indexing and Querying
Given a query, represented as an Analyzed Web Service, how to know which known web services to query?
Issues:
- Subsumption of input/output parameters
- Missing input parameters
- Composition of web services
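The first two issues can be illustrated with a small matching routine: a service is kept if one of its output concepts is subsumed by the concept the query asks for, and any input the query cannot supply is reported as missing. This is only a sketch under simplifying assumptions; the IsA fragment, the service descriptions, and the function names are invented, and composition of services is not handled.

```python
# Hypothetical single-parent IsA fragment, just enough for the example.
PARENT = {
    "Article": "Publication",
    "Book": "Publication",
    "Publication": "Thing",
    "Person": "Thing",
    "Title": "Thing",
}


def subsumed_by(concept, ancestor):
    """True if `concept` equals `ancestor` or reaches it by following IsA links."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = PARENT.get(concept)
    return False


# Invented service descriptions: name -> (input concepts, output concepts).
SERVICES = {
    "articles_by_title": (["Title"], ["Article"]),
    "books_by_author": (["Person"], ["Book"]),
    "people_directory": ([], ["Person"]),
}


def candidates(wanted_output, provided_inputs):
    """List (service, missing_inputs) for services whose output fits the wanted concept."""
    result = []
    for name, (inputs, outputs) in sorted(SERVICES.items()):
        # Output subsumption: the service must return something that IsA the wanted concept.
        if not any(subsumed_by(out, wanted_output) for out in outputs):
            continue
        # Input subsumption: a provided value fits if its concept IsA the required one.
        missing = [
            needed
            for needed in inputs
            if not any(subsumed_by(given, needed) for given in provided_inputs)
        ]
        result.append((name, missing))
    return result
```

For a query wanting Publication and providing a Title, `articles_by_title` matches directly while `books_by_author` is reported with a missing Person input. Supplying that input from another service (here, `people_directory`) is the composition problem, which this sketch leaves open.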
Web Service Semantic Interpretation Process
[Diagram: complete pipeline, each arrow labeled with a processing step]
World Wide Web
→ discovery → HTML form, WSDL, UDDI
→ wrappers → WSDL
→ analysis → Analyzed Web Services
→ indexing → Web Services Index
→ querying → Results (from Queries)