DBLP-SSE: A DBLP Search Support Engine

1


Yi Zeng1, Yiyu Yao1,2, Ning Zhong1,3, Yulin Qin1,4

1. International WIC Institute, Beijing University of Technology, P.R. China

2. University of Regina, Canada

3. Maebashi Institute of Technology, Japan

4. Carnegie Mellon University, USA

http://www.iwici.org/~yizeng

2

The Evolution Towards the Intelligent Web

Some general views on efforts towards “the Intelligent Web”:

• [Zhong, Liu, Yao 2002] AI on the Web, self-direction and learning, personalization, etc.

• [Nova 2002] 2020+, next big cycle: reasoning and A.I., intelligent personal agent.

[Nova 2002] Nova Spivack, CEO & Founder, Radar Networks. Making Sense of the Semantic Web, 2002

3

Back to the Origin of Web Intelligence

How?

[Zhong, Liu, Yao 2002] Ning Zhong, Jiming Liu, Yiyu Yao: In Search of the Wisdom Web. IEEE Computer 35(11): 27-31 (2002)

Questions from a more practical perspective:

How can we serve users more wisely from a personal perspective?

Can user personalization be realized from different perspectives?

Web-empowered systems should provide various supporting functionalities to users to meet their diverse needs.

User Interests

4

Search Support Engine

In this paper, we use a Search Support Engine (SSE) as an example of how to serve users wisely.

As one type of system for developing the Intelligent Web, a Search Support Engine (SSE) implements the basic principles of (Web) Information Retrieval Support Systems (IRSS) [Yao2002, Hoeber2008].

An SSE aims at meeting the diverse needs of different users by providing supporting functionalities and tools for tasks that go beyond the traditional search and browsing offered by current search engines.

In this talk, we concentrate on user-centric supporting functionalities for this intelligent Web-empowered system.

5

Creating a Context for User Interest: from Various Perspectives

The “basic level advantage” for accelerating problem solving [Rogers2007].

Concepts at the basic level are used by users more frequently than other terms [Wisniewski1989].

• (Frequency) Total Interest:

$$TI(i) = \sum_{j=1}^{n} m(i,j)$$

As a step beyond the “familiar terms” of the basic level, “interest retention”, which considers frequency and recency at the same time, can be developed based on cognitive memory retention models (exponential function model, power function model) [Anderson, Schooler 1991].

• (Frequency and Recency) Exponential Model for Interest Retention:

$$EIR(i) = \sum_{j=1}^{n} m(i,j) \times A e^{-b T_i}$$

• (Frequency and Recency) Power Model for Interest Retention:

$$PIR(i) = \sum_{j=1}^{n} m(i,j) \times A T_i^{-b}$$
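The three measures named on this slide (TI, EIR, PIR) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: m(i, j) is read as the number of times an interest term i appears in yearly interval j, and T is read as the number of years elapsed up to a reference year, counted from 1. The function names, the `elapsed` helper and the sample counts are hypothetical. A = 0.855 and b = 1.295 are the power-law values quoted elsewhere in this talk, and reusing them for the exponential model is purely illustrative.

```python
import math
from typing import Mapping

# counts maps a time interval (here: a publication year) to m(i, j),
# the number of times one interest term i appears in that interval.
Counts = Mapping[int, int]


def elapsed(year: int, now: int) -> int:
    # Assumption: T is measured in whole years up to `now`, starting at 1
    # so that the power model never raises zero to a negative exponent.
    return now - year + 1


def total_interest(counts: Counts) -> int:
    """TI(i): plain frequency, the sum of m(i, j) over all intervals."""
    return sum(counts.values())


def exponential_retention(counts: Counts, now: int, A: float, b: float) -> float:
    """EIR(i): each interval contributes m(i, j) * A * exp(-b * T)."""
    return sum(m * A * math.exp(-b * elapsed(y, now))
               for y, m in counts.items() if y <= now)


def power_retention(counts: Counts, now: int, A: float, b: float) -> float:
    """PIR(i): each interval contributes m(i, j) * A * T ** (-b)."""
    return sum(m * A * elapsed(y, now) ** (-b)
               for y, m in counts.items() if y <= now)


# Hypothetical counts: an old, frequent interest and a recent, rarer one.
old_topic = {1995: 10}
new_topic = {2008: 3}
A, b = 0.855, 1.295  # power-law parameters quoted in this talk

print(total_interest(old_topic), total_interest(new_topic))
print(power_retention(old_topic, 2009, A, b),
      power_retention(new_topic, 2009, A, b))
```

With these toy counts the older topic wins on total interest, while the recent topic wins on both retention measures, which is the frequency-versus-recency contrast the slide describes.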

6

Interest Retention and Interest Prediction

To some extent, future interests are related to interest retention.

Using the power law model with A = 0.855 and b = 1.295, we selected all authors with more than 100 publications (1226 persons) and predicted their top 9 interests from 2000 to 2007 using interest retention. For 49.54% of this sample, 3 out of 9 interests can be predicted.
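One way to read this experiment is: rank each author's terms by power-law retention up to a cut-off year, take the top k, and count how many of them reappear among the author's later interests. The sketch below follows that reading. It is an assumption about the procedure, not the authors' evaluation code. The function names and the toy history are hypothetical, and T is again taken as years elapsed, counted from 1.

```python
from typing import Iterable, List, Mapping

History = Mapping[str, Mapping[int, int]]  # term -> {year: count}


def power_retention(counts: Mapping[int, int], now: int,
                    A: float = 0.855, b: float = 1.295) -> float:
    # Power-law retention with the A and b values quoted on this slide.
    return sum(m * A * (now - y + 1) ** (-b)
               for y, m in counts.items() if y <= now)


def predict_top_interests(history: History, now: int, k: int = 9) -> List[str]:
    """Return the k terms with the highest retention at year `now`."""
    scores = {term: power_retention(c, now) for term, c in history.items()}
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


def hits(predicted: Iterable[str], actual: Iterable[str]) -> int:
    """Number of predicted terms that occur among the actual interests."""
    return len(set(predicted) & set(actual))


# Hypothetical author history up to 1999 and observed later interests.
history = {
    "web": {1998: 2, 1999: 4},
    "index": {1990: 10},
    "query": {1997: 1},
}
predicted = predict_top_interests(history, now=1999, k=2)
print(predicted, hits(predicted, {"web", "search"}))
```

Repeating this per author and counting how many authors reach a chosen number of hits would give a percentage comparable in kind to the one reported above.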

We analyzed research interest retention for all 615,124 computer scientists in the SwetoDBLP dataset.

We released the “computer scientists’ research interest” RDF dataset:

http://www.iwici.org/dblp-sse

We used interest retention to extend and refine the query.

A comparative study of total research interests through the years 1990-2008 and current research interests in 2009 (based on both the power law and exponential law models)

Difference on the contribution values from papers published in different years

7

DBLP-SSE: DBLP Search Support Engine

Let’s use our WI 2009 Program Chair, Ricardo Baeza-Yates, to test our system!

Recent interests are extracted using the power law interest retention model.

Terms with high frequency do not necessarily have high interest retention (e.g. “Model”).

Interest       Frequency   Interest Retention
Web            65          7.8095837
Search         64          5.587062
Distributed    7           3.1938698
Engine         19          2.269411
Mining         15          2.144009
Content        10          2.1001248
Query          26          1.2631637
Data           8           1.1330068
Index          8           1.0922915
User           6           1.0907176
Trade-off      2           1.0907129
Process        5           1.0531572
System         5           1.0516398
Impact         2           1.0515785
Model          15          1.0501344
Quality        4           1.0500001
Performance    2           1.05
Usage          1           1.05
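The gap between the two rankings can be checked directly from the values in this table. The sketch below copies the slide's numbers and compares each term's position when sorted by frequency with its position when sorted by interest retention. The variable and function names are hypothetical, and ties in frequency are broken by the order in which the slide lists the terms.

```python
# (term, frequency, interest retention), copied from the table on this slide.
ROWS = [
    ("Web", 65, 7.8095837), ("Search", 64, 5.587062),
    ("Distributed", 7, 3.1938698), ("Engine", 19, 2.269411),
    ("Mining", 15, 2.144009), ("Content", 10, 2.1001248),
    ("Query", 26, 1.2631637), ("Data", 8, 1.1330068),
    ("Index", 8, 1.0922915), ("User", 6, 1.0907176),
    ("Trade-off", 2, 1.0907129), ("Process", 5, 1.0531572),
    ("System", 5, 1.0516398), ("Impact", 2, 1.0515785),
    ("Model", 15, 1.0501344), ("Quality", 4, 1.0500001),
    ("Performance", 2, 1.05), ("Usage", 1, 1.05),
]


def ranking(column: int) -> list:
    # sorted() is stable, so equal values keep the slide's original order.
    return [row[0] for row in sorted(ROWS, key=lambda r: -r[column])]


by_frequency = ranking(1)
by_retention = ranking(2)

for term in ("Model", "Distributed"):
    print(term,
          "frequency rank:", by_frequency.index(term) + 1,
          "retention rank:", by_retention.index(term) + 1)
```

“Model” sits near the top by frequency but near the bottom by retention, while “Distributed” moves the other way.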

8

Representing User Interests

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE rdf:RDF [
  <!ENTITY xsd "http://www.w3.org/2001/XMLSchema#">
]>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:name>Ricardo Baeza-Yates</foaf:name>
    <foaf:workplaceHomepage rdf:resource="http://www.dcc.uchile.cl/~rbaeza/" />
    <foaf:mbox rdf:resource="[email protected]" />
    <rdfs:seeAlso rdf:resource="http://dblp.uni-trier.de/db/indices/a-tree/b/Baeza=Yates:Ricardo_A=.html"/>
    <rdf:Seq>
      <foaf:topic_interest>Web</foaf:topic_interest>
      <foaf:topic_interest>Search</foaf:topic_interest>
      <foaf:topic_interest>Distributed</foaf:topic_interest>
      <foaf:topic_interest>Engine</foaf:topic_interest>
    </rdf:Seq>
  </foaf:Person>
</rdf:RDF>
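A profile of this shape could be generated from a ranked interest list with Python's standard `xml.etree.ElementTree` module. The sketch below mirrors the element layout shown on this slide (name, `rdfs:seeAlso`, and an ordered list of `foaf:topic_interest` entries) and makes no claim about how the released dataset was actually produced. The `build_profile` function and its parameters are hypothetical, and the output is not validated against the RDF/XML or FOAF specifications.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
FOAF = "http://xmlns.com/foaf/0.1/"

# Make the serializer emit the same prefixes as the slide.
for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("foaf", FOAF)):
    ET.register_namespace(prefix, uri)


def build_profile(name: str, see_also: str, interests: list) -> str:
    """Serialize one person and their ordered interests, slide-style."""
    root = ET.Element(f"{{{RDF}}}RDF")
    person = ET.SubElement(root, f"{{{FOAF}}}Person")
    ET.SubElement(person, f"{{{FOAF}}}name").text = name
    ET.SubElement(person, f"{{{RDFS}}}seeAlso",
                  {f"{{{RDF}}}resource": see_also})
    # Interests are written in ranked order, most retained first.
    seq = ET.SubElement(person, f"{{{RDF}}}Seq")
    for term in interests:
        ET.SubElement(seq, f"{{{FOAF}}}topic_interest").text = term
    return ET.tostring(root, encoding="unicode")


profile = build_profile(
    "Ricardo Baeza-Yates",
    "http://dblp.uni-trier.de/db/indices/a-tree/b/Baeza=Yates:Ricardo_A=.html",
    ["Web", "Search", "Distributed", "Engine"],
)
print(profile)
```

Keeping the interests in list order means a consumer of the profile can recover the ranking without needing the retention scores themselves.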

9

DBLP-SSE: DBLP Search Support Engine

Log in: Ricardo Baeza-Yates (WI 2009 Program Chair)

Top 9 recent interests: Web, search, distributed, engine, mining, content, query, data, index

Query : Artificial Intelligence

List 1 : without a starting point (recent interests constraints)

* PROLOG Programming for Artificial Intelligence, Second Edition.

* Artificial Intelligence Architectures for Composition and Performance Environment.

* Artificial Intelligence in Music Education: A Critical Review.

* Music, Intelligence and Artificiality. Artificial Intelligence and Music Education.

* ......

List 2 : with a starting point (recent interests constraints)

* Searching in a Maze, in Search of Knowledge: Issues in Early Artificial Intelligence.

* Web Intelligence and Artificial Intelligence in Education.

* Using Distributed Data Mining and Distributed Artificial Intelligence for Knowledge Integration.

* Parallel, Distributed and Multi-Agent Production Systems – A Research Foundation for Distributed Artificial Intelligence.

* ......

A vague or incomplete query can be refined by the starting point (which contains the recent interests extracted through interest retention models).
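The effect of a starting point can be illustrated with a small filter-and-rank sketch over titles taken from the two lists above. The talk does not say how DBLP-SSE combines the query with the recent interests, so the rule used here is an assumption: keep titles that contain every query word and at least one interest term, then order them by how many interest terms they contain. The function names are hypothetical.

```python
import re
from typing import Iterable, List, Set


def tokens(text: str) -> Set[str]:
    """Lower-case alphabetic words of a title or query."""
    return set(re.findall(r"[a-z]+", text.lower()))


def refine(titles: Iterable[str], query: str,
           interests: Iterable[str]) -> List[str]:
    """Keep titles matching the query and rank them by interest overlap."""
    query_words = tokens(query)
    wanted = {term.lower() for term in interests}
    scored = []
    for title in titles:
        words = tokens(title)
        if not query_words <= words:
            continue  # the title does not contain every query word
        overlap = len(words & wanted)
        if overlap:
            scored.append((overlap, title))
    # Stable sort: titles with equal overlap keep their input order.
    scored.sort(key=lambda pair: -pair[0])
    return [title for _, title in scored]


TITLES = [
    "PROLOG Programming for Artificial Intelligence, Second Edition.",
    "Artificial Intelligence in Music Education: A Critical Review.",
    "Web Intelligence and Artificial Intelligence in Education.",
    "Using Distributed Data Mining and Distributed Artificial Intelligence "
    "for Knowledge Integration.",
    "Searching in a Maze, in Search of Knowledge: Issues in Early "
    "Artificial Intelligence.",
]
INTERESTS = ["Web", "search", "distributed", "engine", "mining",
             "content", "query", "data", "index"]

for title in refine(TITLES, "Artificial Intelligence", INTERESTS):
    print("*", title)
```

Exact word matching is the simplest choice here. A real system would probably need stemming or another form of term normalization so that, for example, “Searching” also counts as a match for “search”.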

10

Domain Analysis Support

Learning hierarchical knowledge structures from conference proceeding indexes.

An illustrative example:

Domain structure for Artificial Intelligence from conference indexes in the DBLP dataset.

A partial multi-level knowledge structure of Artificial Intelligence according to analysis on proceedings indexes of IJCAI 1969-2007.

A finer-grained knowledge sub-structure on robotics within the structure of Artificial Intelligence.

11

Domain Tracking Support

Domain Tracking Author Distribution

Domain Tracking of “learning” based on Proceedings of the IJCAI 1981-2007.

Author Distribution in some fields of Artificial Intelligence.

12

Why It Is a Short Paper: Future Work

The interest retention model captures only part of a user’s current interests and does not predict future interests very well. A better prediction model for user interests should be provided (for example, by using spreading activation in ACT-R).

Bridging cognitively inspired models for interest retention and prediction with concept drift (comments from Yiyu Yao and Orland Hoeber).

Although the current system is user-centric, a unified, practical architecture that bridges the various supporting functionalities should be provided, in order to avoid breaking different functions into unrelated, ill-structured fragments.

13

References

[Lu 1987] Lu, Ruqian. Artificial Intelligence (I). The Science Press, 1987.

[Shi 1993] Shi, Chunyi, Huang, Changning, Wang, Jiaqin. Principles of Artificial Intelligence, Tsinghua University Press, 1993.

[Brachman 2004] Brachman, Ronald J., Levesque, Hector J. Knowledge Representation and Reasoning, Morgan Kaufmann Publishers, 2004.

[Solso 2004] Solso, R.L., MacLin, M.K, MacLin, O.H.: Cognitive Psychology. Pearson Education, Inc. (2004).

[Anderson 1983] Anderson, J.R. The Architecture of Cognition. Cambridge, MA: Harvard University Press, 1983.

[LaBrie 2004] LaBrie, R.C. The Impact of Alternative Search Mechanisms on the Effectiveness of Knowledge Retrieval, Ph.D. thesis, Arizona State University, 2004.

[Landauer 1998] Landauer, C. Data, Information, Knowledge, Understanding: Computing Up the Meaning Hierarchy, Proceedings of the 1998 IEEE International Conference on Systems, Man, and Cybernetics, 1998.

14

[Collins 1969] Collins, A.M. and Quillian, M.R. Retrieval time from semantic memory. Journal of Verbal Learning and Verbal Behavior, 8, 240-247, 1969.

[Segal 2001] Segal, E.M. Semantic Memory, Course Lecture Notes on Reasoning and Problem Solving, State University of New York at Buffalo, 2001.

[Stokman 1988] Stokman, F.N. and Vries, P.H.de. Structuring knowledge in a graph. Human-Computer Interaction: Psychonomic Aspects, Springer, 1988.

[Stokman 1992] Stokman, F.N. Knowledge Graphs, In: Linguistic instruments in knowledge engineering, Elsevier, 1992.

[Bransford 2000] Bransford, J.D., Brown, A.L. and Cocking, R.R. (Eds.). How People Learn: Brain, Mind, Experience, and School, National Academy Press, 2000.

[Bruner 1981] Bruner, J.S. The organization of action and the nature of adult-infant transaction: Festschrift for J. R. Nuttin. Pp. 1-13 in Cognition in Human Motivation and Learning, D. d'Ydewalle and W. Lens, eds. Hillsdale, NJ: Erlbaum, 1981.

15

[Martin 2002] Philippe Martin. Knowledge Representation, Sharing and Retrieval on the Web. In: Web Intelligence, Springer, 2002.

[Bakker 1987] Bakker, R.R. Knowledge Graphs: representation and structuring of scientific knowledge. Ph.D. thesis, University of Twente, Enschede, 1987.

[Carre 1979] Carre, B. Graphs and Networks. Clarendon Press, Oxford, 1979.

[Simon 1957] Simon, H.A. Models of Man, Wiley, 1957.

[Simon 1953] Simon, H.A. A behavioral model of rational choice, the RAND Corporation, 1953.

[Yao 2002] Yao, Y.Y., Liau, C.-J.: A Generalized Decision Logic Language for Granular Computing. In: Proc. of FUZZ-IEEE’02, Hawaii, USA, 1092-1097, 2002.

[Yao 2003] Yao, Y.Y. A Framework for Web-based Research Support Systems, The 27th Annual International Computer Software and Applications Conference, 601, 2003.

[Yao 2006] Yao, Y.Y.: Three Perspectives of Granular Computing. Journal of Nanchang Institute of Technology, Vol. 25, No. 2, 16-21, 2006.

16

[Keil 1989] Keil, F.C. Concepts, kinds, and cognitive development. Cambridge, MA: MIT Press, 1989.

[Fensel 2007] Fensel, D. and van Harmelen, F. Unifying reasoning and search to web scale, IEEE Internet Computing, 2007, 11(2): 96, 94-95.

[Sowa 1984] Sowa, J.F. Conceptual Structures, Information Processing in Mind and Machine, Addison-Wesley, Reading, Massachusetts, 1984.

[Hawkins 2004] Hawkins, J. and Blakeslee, S. On Intelligence, Henry Holt and Company, 2004.

[Berners-Lee 2006] Berners-Lee, T., Hall, W., Hendler, J.A., O’Hara, K., Shadbolt, N. and Weitzner, D.J. A Framework for Web Science, Foundations and Trends in Web Science, 2006, 1(1): 1-130.

[Barsalou 2000] Barsalou, L.W. Concepts: Structure. In A. E. Kazdin (Ed.), Encyclopedia of psychology, Vol. 2, Washington, DC: American Psychological Association, 2000, 245-248.

17

Thank you!
