Data & Knowledge Engineering Group · 2013-03-19
Data & Knowledge Engineering Group
“Successful Search on the Internet: Strategies and Tools”
Workshop at the 9th Magdeburger Lehrertag
Tatiana Gossen, Andreas Nürnberger, Marcus Nitsche www.dke.ovgu.de
15.03.2013 T. Gossen, A. Nürnberger, M. Nitsche 2
Agenda
• Motivation
• Search Strategies
• Available Tools
• Research at DKE group
  • Exploratory Search
  • Search engines for children
• AMSL group
  • IT Security for children
Motivation
• The WWW plays a big role in our lives
• A lot of useful information that can lead to knowledge is hidden
• “Scientia potentia est”
• Skills to succeed in search are beneficial
• How to learn efficient searching?
Source: Wikipedia
Search Strategies
Tactics vs. Strategies
• Tactic: near-term goals and maneuvers; individual operations and actions
• Strategy: long-term (overall) planning; combines a set of operations to achieve a specific goal
Taxonomy of Search
• Search goal [Broder ’02]
  • Informational: find information about a topic assumed to be available on the web in order to read about it
  • Navigational: immediate intent to reach a particular website that the user has in mind
  • Transactional: carry out a further transaction, e.g. purchasing a product
• User skills
  • Novice user
  • Experienced user
• Search task
  • Well-defined vs. ill-defined
  • Simple vs. complex
Source: http://www.preachtheword.com
Search Task: well-defined
• First step: transform the information need into a query
• Problem: finding the right set of keywords
“The mere formulation of a problem is far more essential than its solution”
Source: Wikipedia
Search Strategies: Example Exercise
Q: Find information on whether drinking red wine is more effective than white wine at reducing your risk of heart attacks.
Query?
Search Strategies: well-defined
• First step: transform the information need into a query
• Problem: finding the right set of keywords
Solutions:
a) Think about the documents you would like to get
   1) Use words in the query that those documents should contain
   2) Use synonyms and alternative spellings
• If your query is too specific, you may miss relevant documents
• If your query is too vague, you may get overly generic results
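Trying synonyms and alternative spellings systematically can be sketched in a few lines of Python. This is only an illustration: the function name `query_variants` and the alternative wordings for the red-wine exercise are assumptions, not part of the workshop material.

```python
from itertools import product

def query_variants(concepts):
    """Return one query string per combination of alternative wordings.

    `concepts` is a list of lists: each inner list holds interchangeable
    wordings (synonyms or spelling variants) for one concept.
    """
    return [" ".join(choice) for choice in product(*concepts)]

# Hypothetical wordings for the red-wine exercise; choose your own.
concepts = [
    ["red wine"],
    ["heart attack", "myocardial infarction"],
    ["risk", "prevention"],
]

for query in query_variants(concepts):
    print(query)
```

Each printed line is one candidate query; starting with the most specific one and falling back to broader ones follows the "too specific vs. too vague" advice.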
Search Strategies: well-defined
• First step: transform the information need into a query
• Problem: finding the right set of keywords
Solutions:
b) Read some of the relevant documents you find and learn useful keywords from them
c) Ask your colleagues & friends → collaborative search
Search Strategies: well-defined
• First step: transform the information need into a query
• Problem: finding the right set of keywords
Solutions:
d) Handling “stopwords”
   • Prepositions, articles, conjunctions
   • Usually not useful in a query
   • Exceptions: e.g. The Who
e) Using search operators: +, -, “”
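A minimal sketch of points d) and e), assuming a small hand-picked stopword list and the operator meanings "+term must occur", "-term must not occur" and "quoted phrase must occur exactly". Operator support differs between search engines, so the syntax below is illustrative only; `build_query` is a hypothetical helper.

```python
# Small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"a", "an", "the", "of", "in", "on", "and", "or", "is", "to"}

def build_query(keywords, phrases=(), required=(), excluded=()):
    """Assemble a query string from plain keywords and operator terms."""
    # Drop stopwords from the plain keywords.
    parts = [w for w in keywords if w.lower() not in STOPWORDS]
    parts += [f'"{p}"' for p in phrases]   # exact phrase, stopwords kept
    parts += [f"+{w}" for w in required]   # term must occur
    parts += [f"-{w}" for w in excluded]   # term must not occur
    return " ".join(parts)

print(build_query(["the", "history", "of", "rock"],
                  phrases=["The Who"], excluded=["doctor"]))
```

Putting "The Who" into a quoted phrase is how the stopword exception is handled here: the article survives because phrases are not filtered.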
Search Strategies: well-defined
• First step: transform the information need into a query
• Problem: complex search task
Q: Is it colder in the Arctic Circle on Earth or on the planet Jupiter?
• A single query won't provide you with relevant documents. Why? The answer is spread across several documents!
• Solution: divide your task into several sub-searches, then merge the results of the sub-searches
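One simple way to merge the results of several sub-searches is to interleave the ranked lists and drop duplicates. The sketch below assumes each sub-search returns a ranked list of document identifiers; the identifiers are placeholders, not real search output, and round-robin interleaving is just one possible merging policy.

```python
def merge_results(result_lists):
    """Interleave ranked result lists round-robin, dropping duplicates."""
    merged, seen = [], set()
    longest = max((len(r) for r in result_lists), default=0)
    for rank in range(longest):
        for results in result_lists:
            if rank < len(results) and results[rank] not in seen:
                seen.add(results[rank])
                merged.append(results[rank])
    return merged

# Placeholder identifiers for two sub-searches of the Arctic/Jupiter task.
arctic = ["doc-arctic-1", "doc-climate", "doc-arctic-2"]
jupiter = ["doc-jupiter-1", "doc-climate"]
print(merge_results([arctic, jupiter]))
```

Interleaving keeps the top hit of every sub-search near the top of the merged list, so neither half of the question is buried.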
Search Task: ill-defined
• Search task: well-defined vs. ill-defined
• Paradox of Finding Out About: “The need to describe that which you do not know in order to find it” (Roland Hjerppe)
Search Strategies: ill-defined
• First step: transform the information need into a query
• Problem: the search task is ill-defined
• You cannot find a thing if you do not know what you are looking for.
• Solution: employ exploratory search
  “Exploratory search is a highly dynamic process of a user to interact with an information space in order to satisfy an information need that requires learning about structure and/or content of the information space.” [Bison book ’12]
• Put simply, during the process you may discover information you are interested in
Search Strategies: ill-defined
• First step: transform the information need into a query
• Problem 1: what exactly am I looking for?
• Problem 2: where to start looking?
• Problem 3: what are my options? …
Q: Planning a holiday trip with my family.
• Solution: first, you have to explore your options → exploratory search
• Then you can start parallel searches
• In real life: depth-first search
Search Strategies: be on guard!
• Can you trust the source? Anyone can create a website … and write “yesterday, aliens landed on Earth”
• Be aware of the context
• In this example, the webpages were created in 2011!
Web Search Engines: all for free?
• Organic results vs. advertisements
Search Strategies: be smart!
• Problem: you cannot find the information using Google
• Solution: use other search tools
• Different search tools differ in their features: coverage, ranking algorithm, user interface, ...
• Deep Web
Source: http://www.biblogtecarios.es/juanjoseprieto/deep-web
Available Tools
Available Tools
• Search engines: Google, Bing, Yahoo!, Yandex, … and many more
Search Engines: History (1)
• The simple user interface has not changed much
• Infoseek in 1997, Google in 2007
Search Engines: History (2)
[Screenshots labelled (2001), (2008), (2004)]
Search Engines: Future?
Search Engines: Future!
• Mobile search interfaces, e.g. with spoken queries
• Natural user interfaces: touch, gesture (e.g. Xbox Kinect), augmented reality (e.g. Google Glass), brain-computer interfaces, …
• Examples: Yahoo mobile search interfaces, Apple iOS Siri
Search Engines for Young Users
• Search engines: blinde-kuh.de, fragfinn.de, helles-koepfchen.de, loopilino.com, dipty.com, onekey.com, kids.yahoo.com, askkids.com, dibdabdoo.com, factmonster.com, kids.aol.com, kidsclick.org, … and many more
SE for Young Users (1)
• Major feature: help children find child-appropriate content on the WWW
• Very Google-like!
Presentation of the search results in the search engine dipty.com.
SE for Young Users (2)
• Unfortunately, current search engines for children do not always match the skills and abilities of children!
• The Askkids.com web search engine returns no results for the misspelled query “schience”.
• The Blinde-kuh.de web search engine returns no results for an empty query.
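Both failure cases could be softened with very little logic. The sketch below is not how any of the named engines work; it only assumes a known vocabulary of index terms and uses Python's standard `difflib` module to propose the closest term for a misspelled query. The vocabulary and the function name `suggest_query` are hypothetical.

```python
import difflib

def suggest_query(query, vocabulary):
    """Return the closest known term for a query, or None if nothing is close."""
    query = query.strip().lower()
    if not query:
        return None  # empty query: the caller could show popular topics instead
    if query in vocabulary:
        return query
    matches = difflib.get_close_matches(query, vocabulary, n=1)
    return matches[0] if matches else None

# Hypothetical index vocabulary.
vocabulary = ["science", "school", "silence", "dinosaur"]
print(suggest_query("schience", vocabulary))
```

A child-oriented interface could then offer "Did you mean …?" instead of an empty result page, and treat `None` for an empty query as a cue to display browsing suggestions.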
Research at FIN-OVGU
Research at FIN-OVGU
• DKE Group: search engines for children, exploratory search, multimedia retrieval, multilingual search, data analysis, …
• AMSL Group: IT security for children, …
The Data & Knowledge Engineering Group
• Head: Andreas Nürnberger
• Currently 7 local + 8 external Ph.D. students
• Various students on internships or working as research assistants
• http://www.dke.ovgu.de/
15.03.2013 DKE Group - Short Intro 30
Research interests
Data, Knowledge, Engineering
• Data: lab data, log files, texts, media objects, …
• Knowledge (example rules): If income > 1000 and nof_contracts > 10 then offer credit. If income < 1000 then do not offer credit.
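The two example rules translate directly into code. This sketch keeps the slide's names `income` and `nof_contracts`; the function name `credit_decision` and the choice to return `None` where neither rule applies are assumptions, since the slide leaves those cases open.

```python
def credit_decision(income, nof_contracts):
    """Apply the two rules from the slide; return None where neither fires."""
    if income > 1000 and nof_contracts > 10:
        return "offer credit"
    if income < 1000:
        return "do not offer credit"
    return None  # not covered by the two rules (e.g. income == 1000)

print(credit_decision(1500, 12))
print(credit_decision(800, 12))
print(credit_decision(1500, 3))
```

Writing the rules out this way also exposes the gaps in the rule set: a high income with few contracts, or an income of exactly 1000, is not decided.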
• Development of formal (data) structures to store data and knowledge (e.g. based on XML)
• Development of software tools to search, navigate, visualize, structure and maintain
• Development of data mining methods, especially to analyze and structure
• Data mining methods: “Knowledge Discovery in Databases”
[Figure: hierarchy of synsets linked by hyperonym and hyponym relations; each synset has the fields SynsetID | Synonyms | Glossary. Caption: User and Context Specific Organization]
The Data & Knowledge Engineering Group
• Founded in March ’03 as Information Retrieval Group
  • Adaptive text & multimedia retrieval
  • Focus on structuring and interaction, mainly text & images
  • Personalization
  • Multi-lingual / cross-lingual retrieval
• Since October ’07 DKE Group
  • Information visualization
  • Music information retrieval
  • Data mining
  • Bioinformatics
  • Exploration of information networks
  • Search engines for children
Web search engines for young users
Search Engines for Children
Research
• Specifics of Information Retrieval for Young Users: A Survey [Information Processing & Management ’13]
Information Seeking
• Information-seeking behaviour: logfile analysis on web search engines for children [SIGIR ’11]
Search Interface
• Usability evaluation [Personal and Ubiquitous Computing ’12]
• Challenges in design & possible solutions [euroHCIR ’12]: emotional support, language support, cognitive support, memory support, interaction support, relevance support
• Knowledge Journey [HCIR ’12]: children of primary school age, textual information retrieval, ad-hoc search & exploration, personal information management
Search User Interface for Children
Knowledge Journey
SUI: User Study
• Comparative user study
• Fixed back end using the same web document collection for both SUIs, plus controlled results (child-safe and appropriate)
• Well-defined (answer-oriented) search tasks
• Within-participants Latin square blocking design
Figure 8: Classic keyword-oriented search user interface.
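In a within-participants design every child uses every interface, so the presentation order has to be counterbalanced. A cyclic Latin square is one common way to do that; the sketch below shows the idea only and does not reproduce the study's actual assignment. The condition labels are assumed placeholders for the two interfaces being compared.

```python
def latin_square(conditions):
    """Cyclic Latin square: row i is the condition list rotated by i."""
    n = len(conditions)
    return [[conditions[(i + j) % n] for j in range(n)] for i in range(n)]

def order_for(participant_index, conditions):
    """Assign presentation orders to participants in rotation."""
    square = latin_square(conditions)
    return square[participant_index % len(square)]

# Placeholder labels for the two search user interfaces.
suis = ["classic SUI", "Knowledge Journey"]
for p in range(4):
    print(p, order_for(p, suis))
```

With two conditions the square reduces to simple alternation; with more conditions each interface still appears exactly once in every position across a block of participants.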
SUI: User Study
• Which search user interface do young children prefer, and why?
• What is children’s attitude towards new interface elements like the guidance figure, audio support, pie menu and treasure chest?
• How can both user interfaces be improved?
Knowledge Journey Version 2.0: Adapt & Evolve
Development Process
• In order to develop better systems:
  • Iterative design
  • Prototyping & evaluation
  • Evaluation of the whole system and individual components
• Cycle: Design → Implement → Evaluate
User Studies
In order to develop better search engines for children, we are constantly looking for ways to involve young users in our user studies.
Exploratory search: Task Example
What do Tom Cruise and Niels Bohr have in common?
Information Exploration (1) CET [Bison book’12]
Information Exploration (2) Trailblazer
Information Exploration (3)
AMSL: Selected Research Areas
• IT security and children
• Digital watermarking
• IT forensics (e.g. digital fingerprint traces)
• Biometrics (e.g. handwriting, face, speech)
• Security evaluations (e.g. cars, robots)
• etc.
Motivation: IT Security and Children
KIM study: regular use of computers, the Internet and mobile phones by primary school children in Germany

Age            Computer use    Internet use    Mobile phone use
6-7 years      approx. 60%     approx. 50%     approx. 14%
8-9 years      approx. 75%     approx. 60%     approx. 30%
10-11 years    approx. 90%     approx. 80%     approx. 75%

Source: KIM-Studie 2010 - Basisuntersuchung zum Medienumgang 6- bis 13-Jähriger in Deutschland, Medienpädagogischer Forschungsverbund Südwest, 2010 (Contact: Jana Fruth, Jana Dittmann)
Photo: J. Fruth (private)
State of Research: Teaching IT Security
Focus: raising awareness of threats when children use IT (computers, smartphones) and the Internet
Who educates?
• School: computer science classes
• Parents and guardians: rules
• Others: certain institutions, initiatives (e.g. Klicksafe)
How is awareness raised?
• Information brochures (e.g. Internet Guide für Kids by the Deutsches Kinderhilfswerk)
• Websites (e.g. www.klicksafe.de)
• Films (e.g. the “Sheeplive” project, de.sheeplive.eu)
• Online games (e.g. “Die Internauten”, www.internauten.de)
Conclusion: a wide range of awareness-raising approaches exists, BUT they do not seem to reach children! → New approaches are needed!
(Contact: Jana Fruth, Jana Dittmann)
AMSL Research Work – I/II
1) Security warnings on smartphones for primary school children (7-10 years)
• View 1
• View 2
Source: W. Menzel, S. Tuchscheerer, J. Fruth, C. Krätzer, J. Dittmann, Design and evaluation of security multimedia warnings for children's smartphones, SPIE: Conference Multimedia on Mobile Devices, Burlingame, ’12 (Contact: Jana Fruth, Jana Dittmann)
AMSL Research Work – II/II
2) Analysis of the security of children's websites
Source: S. Kuhlmann, T. Hoppe, J. Fruth, J. Dittmann: Voruntersuchungen und erste Ergebnisse zur Webseitengestaltung für die situationsbewusste Unterstützung von Kindern in IT-Sicherheitsfragen, GI, BS, Informatik 2012 (Contact: Jana Fruth, Jana Dittmann)
Results of the questionnaire survey (18 children, 6th grade, 12-13 years):
• Internet use since primary school
• Feeling of safety on the Internet (80% feel safe)
• Parents' rules: technical measures, time limits, certain websites
• Higher feeling of safety for frequently used activities (chat, video, games)
• Harassed/observed (20%, of whom 0% sought help)
• Measures that increase the feeling of safety (security symbols, software)
• Anonymity on the net (50% no, >25% yes, <25% yes, with precautions)
• Children rated the danger of websites more positively than experts did (except Google)
Contact
Otto-von-Guericke-Universität Magdeburg, Fakultät für Informatik, Institut für Technische und Betriebliche Informationssysteme (ITI), Advanced Multimedia and Security (AMSL)
Prof. Dr.-Ing. Jana Dittmann, E-Mail: [email protected], Tel.: +49 391 67-58965
Jana Fruth, E-Mail: [email protected], Tel.: +49 391 67-19357
ViERforES: grant number 01IM08003; ViERforES-II: grant number 01IM10002A
Q & A
Thank you for your attention!
Additional Slides…
Available Tools
General Features
Starting Points of Information Search
• In most cases the first thing you'll find is an input form …
• Alternatives:
  • list of sources
  • overviews
  • cluster
  • category hierarchies / special numbers
  • co-citation links
  • examples, wizards and guided tours
Query Specification via Entry Form Interfaces (1)
• Standard query specification forms
• Drop-down list showing queries that the user has issued in the past and that match the prefix typed so far
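The drop-down behaviour described here can be sketched as a prefix filter over the user's query history. This is an illustration under simple assumptions (history stored oldest to newest, case-insensitive matching, most recent suggestions first), not the implementation of any particular search engine; `suggest_from_history` is a hypothetical name.

```python
def suggest_from_history(history, prefix, limit=5):
    """Return past queries starting with `prefix`, most recent first."""
    prefix = prefix.lower()
    seen, suggestions = set(), []
    for query in reversed(history):  # history is ordered oldest -> newest
        key = query.lower()
        if key.startswith(prefix) and key not in seen:
            seen.add(key)
            suggestions.append(query)
            if len(suggestions) == limit:
                break
    return suggestions

history = ["harry potter sales", "weather magdeburg",
           "harry potter books", "harry potter sales"]
print(suggest_from_history(history, "har"))
```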
Query Specification via Entry Form Interfaces (2)
• Combination: a drop-down menu for the category of content to search within + the entry form
Alternative Query Specification: Overviews
• Category overviews: Open Directory Project, Fireball, Yahoo!, HiBrowse, MeSHBrowse
http://dmoz.org/
The Need for Query Reformulation
Task: Find out how many people have bought the new Harry Potter book so far
Sequence of queries:
• Harry Potter and the Half-Blood Prince sales
• Harry Potter and the Half-Blood Prince amount sales
• Harry Potter and the Half-Blood Prince quantity sales
• Harry Potter and the Half-Blood Prince actual quantity sales
• Harry Potter and the Half-Blood Prince sales actual quantity
• Harry Potter and the Half-Blood Prince all sales actual quantity
• all sales Harry Potter and the Half-Blood Prince
• worldwide sales Harry Potter and the Half-Blood Prince
[Figure labels: “User tried out”, “Engine provides”]
Query Reformulation
Illustration of term suggestions from Dogpile.com, 2008 InfoSpace
Query Suggestion
• Quintura.com search engine
• Term suggestions in a 2D map layout, or “cloud”
• Related terms are shown near one another but arranged somewhat arbitrarily
• Mousing over a term causes the others to shift away, and additional similar terms to appear nearby
General Features
Presentation of Search Results
Document Surrogates
• Common way: a vertical list of information summarizing the retrieved documents
• Document surrogate: title, summary, URL
KWIC, or Query-Oriented Summaries
• Query-oriented summaries: keyword-in-context (KWIC) extractions for use in the display of retrieval results
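A keyword-in-context extraction can be sketched as "find every occurrence of the query term and cut out a fixed window around it". The window size, the ellipsis markers and the function name `kwic` are assumptions for illustration; real engines choose snippet boundaries more carefully (e.g. at sentence or word boundaries).

```python
import re

def kwic(text, term, width=30):
    """Return snippets showing each occurrence of `term` with surrounding context."""
    snippets = []
    for m in re.finditer(re.escape(term), text, flags=re.IGNORECASE):
        start = max(0, m.start() - width)
        end = min(len(text), m.end() + width)
        prefix = "..." if start > 0 else ""
        suffix = "..." if end < len(text) else ""
        snippets.append(prefix + text[start:end].strip() + suffix)
    return snippets

text = "Exploratory search helps when the search task is ill-defined."
for snippet in kwic(text, "search", width=12):
    print(snippet)
```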
Highlighting Query Terms
• Highlighting can occur both in retrieval result listings and in the retrieved documents
Additional Features of Results Listings (1)
Previews of Document Content
Additional Features of Results Listings (2)
Indicators of Search Result Diversity
Clusty.com
Additional Features of Results Listings (3)
Indicators of Search Result Diversity
Exalead.com
Additional Features of Results Listings (4)
• Indicators of additional/related hits: sitelinks, shortcuts
Additional Features of Results Listings (5)
• Blended results and media types
• Search results from multiple information sources: Web, Twitter, blogs, PubMed, Credible, etc.
Hakia.com
Visualization of Search Results (1)
• The WebForager search system placed search results into virtual “books” that could be “flipped through” using animation in a 2.5D rendering
Offline
Visualization of Search Results (2)
• Searchme.com shows a small set of retrieval results for the query “Obama” as rendered web pages that are “flipped through” using the CoverFlow animation
Offline
Alternative Visualization of Search Results (1)
kartoo.com
Offline
Alternative Visualization of Search Results (2)
eyePlorer.com