A sneak peek into the web

A sneak peek into the web: another way to see the Internet. Guillaume Lebourgeois - December 2008

description

A short description of the topology of the web.

Transcript of A sneak peek into the web

Page 1: A sneak peek into the web

A sneak peek into the web

Another way to see the Internet

Guillaume Lebourgeois - December 2008

Page 2: A sneak peek into the web

Your vision

A browser

Websites

An interface: search engines

Page 3: A sneak peek into the web

Reality

Interconnected websites

Topology of the web

The web is a huge graph

Page 4: A sneak peek into the web

Hyperlinks

A graph is made of nodes linked together

A link can have a direction

Page 5: A sneak peek into the web

Hyperlinks

A link from website A to website B: A → B

A reciprocal link: A ↔ B
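As a minimal sketch of these two kinds of link, the web graph can be stored as a mapping from each site to the set of sites it links to. The toy graph and the helper names `has_link` and `is_reciprocal` are illustrative assumptions, not part of the presentation.

```python
# Hypothetical toy web: each key is a site, each value is the set of
# sites it links to (a directed graph stored as adjacency sets).
links = {
    "A": {"B", "C"},
    "B": {"A"},
    "C": set(),
}

def has_link(graph, src, dst):
    """Return True if src contains a hyperlink pointing to dst."""
    return dst in graph.get(src, set())

def is_reciprocal(graph, a, b):
    """Return True if a links to b and b links back to a."""
    return has_link(graph, a, b) and has_link(graph, b, a)

print(has_link(links, "A", "C"))       # A -> C exists
print(has_link(links, "C", "A"))       # C -> A does not
print(is_reciprocal(links, "A", "B"))  # A <-> B
```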

Page 6: A sneak peek into the web

Determining website quality

Two ways:

- Text mining, semantic approach
- Topological approach

It is better to combine both.

Page 7: A sneak peek into the web

Topological approach

Authorities (A): websites that many others link to

Hubs (H): websites that link out to many others

Page 8: A sneak peek into the web

Topological approach

The authority is judged by the others to be a reference website.

The hub has a good knowledge of its territory.
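One well-known way to turn these two notions into scores is Kleinberg's HITS algorithm. The slides do not name it, so the sketch below is only an assumption about how such scores could be computed. The toy graph and the site names (`portal`, `blog`, `ref1`, ...) are hypothetical.

```python
import math

def hits(graph, iterations=50):
    """Compute hub and authority scores by repeated mutual reinforcement."""
    nodes = set(graph) | {dst for targets in graph.values() for dst in targets}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # A site's authority is the sum of the hub scores of sites linking to it.
        auth = {n: 0.0 for n in nodes}
        for src, targets in graph.items():
            for dst in targets:
                auth[dst] += hub[src]
        norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        auth = {n: v / norm for n, v in auth.items()}
        # A site's hub score is the sum of the authority scores of sites it links to.
        hub = {n: sum(auth[dst] for dst in graph.get(n, ())) for n in nodes}
        norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        hub = {n: v / norm for n, v in hub.items()}
    return hub, auth

# Hypothetical territory: one portal pointing at three reference sites.
toy_web = {
    "portal": {"ref1", "ref2", "ref3"},
    "blog": {"ref1"},
}
hub_scores, auth_scores = hits(toy_web)
best_hub = max(hub_scores, key=hub_scores.get)
best_authority = max(auth_scores, key=auth_scores.get)
print(best_hub, best_authority)
```

In this toy graph the portal comes out as the strongest hub because it points at every reference site, and `ref1` as the strongest authority because both other sites link to it.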

Page 9: A sneak peek into the web

Topological approach

These two notions must be understood relative to a specific territory: a community.

Page 10: A sneak peek into the web

Communities

Topology: a community is a subpart of the web with a high link density.

Semantic: a community is a subpart of the web that shares a theme, ideas, ...
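The topological definition can be made concrete by measuring link density. The sketch below assumes one possible definition (internal links divided by the maximum number of directed links between members); the slides do not specify a formula, and the function name `link_density` and the toy graph are hypothetical.

```python
def link_density(graph, members):
    """Fraction of possible directed links between members that actually exist."""
    members = set(members)
    n = len(members)
    if n < 2:
        return 0.0
    internal = sum(
        1
        for src in members
        for dst in graph.get(src, ())
        if dst in members and dst != src
    )
    return internal / (n * (n - 1))

# Hypothetical graph: a, b and c all link to each other; c also links to x.
toy_web = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "x"},
    "x": set(),
}
dense = link_density(toy_web, {"a", "b", "c"})
diluted = link_density(toy_web, {"a", "b", "c", "x"})
print(dense, diluted)
```

Adding the loosely attached site `x` lowers the density, which is the kind of signal that can be used to decide where a community ends.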

Page 11: A sneak peek into the web

Communities

[Diagram: three communities, C1, C2 and C3, connected to each other by weak links]

Let’s observe weak links

Page 12: A sneak peek into the web

Communities

Weak links: they link distant communities together. These links are rare and strategic. They can be considered as bridges.

Six degrees: thanks to them, there are, in the worst case, 6 degrees of separation between 2 random websites.

Social: the situation is exactly the same in the social graph.
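To illustrate how a single weak link keeps distant communities within a few clicks of each other, here is a minimal sketch that counts degrees of separation with a breadth-first search. The two toy communities and the function name `degrees_of_separation` are assumptions made for the example.

```python
from collections import deque

def degrees_of_separation(graph, start, goal):
    """Return the fewest links to follow from start to goal, or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt == goal:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

# Two hypothetical communities; the only bridge is the weak link a3 -> b1.
toy_web = {
    "a1": {"a2", "a3"}, "a2": {"a1", "a3"}, "a3": {"a1", "a2", "b1"},
    "b1": {"b2", "b3"}, "b2": {"b1", "b3"}, "b3": {"b1", "b2"},
}
with_bridge = degrees_of_separation(toy_web, "a1", "b3")

# Remove the weak link: the communities are no longer connected.
cut_web = {site: set(targets) for site, targets in toy_web.items()}
cut_web["a3"].discard("b1")
without_bridge = degrees_of_separation(cut_web, "a1", "b3")
print(with_bridge, without_bridge)
```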

Page 13: A sneak peek into the web

Exploring

To explore these structures we can use a web crawler.

- Extracts links and information
- Stores data and visits the links found
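The link-extraction step can be sketched with Python's standard library alone. The class name `LinkExtractor`, the helper `extract_links`, and the example URLs are hypothetical; a real crawler would also need to fetch pages, respect robots.txt, and handle malformed markup.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag, resolved against the page URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Turn relative links into absolute URLs.
                    self.links.append(urljoin(self.base_url, value))

def extract_links(html, base_url):
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links

page = '<a href="/about">About</a> <a href="http://b.example/">B</a>'
found = extract_links(page, "http://a.example/index.html")
print(found)
```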

Page 14: A sneak peek into the web

Exploring

[Diagram: starting links → crawl → depth 1 links → crawl → depth 2 links → ...; each crawl sends data to storage]
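The depth-by-depth exploration can be sketched as a breadth-first crawl with a depth limit. To stay self-contained, this example walks a hypothetical in-memory web (`FAKE_WEB`) instead of fetching real pages; the `crawl` function and its `get_links` parameter are illustrative names.

```python
from collections import deque

# Hypothetical in-memory web: page -> links found on that page.
FAKE_WEB = {
    "start": ["p1", "p2"],
    "p1": ["p3"],
    "p2": ["p1"],
    "p3": ["p4"],
    "p4": [],
}

def crawl(seeds, get_links, max_depth):
    """Visit pages breadth-first and record the depth at which each was found."""
    storage = {}  # page -> depth; stands in for the storage box
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        url, depth = queue.popleft()
        if url in storage:
            continue  # already visited
        storage[url] = depth
        if depth < max_depth:
            for link in get_links(url):
                if link not in storage:
                    queue.append((link, depth + 1))
    return storage

result = crawl(["start"], lambda url: FAKE_WEB.get(url, []), max_depth=2)
print(result)
```

With `max_depth=2` the crawl stores the starting page, the depth 1 pages and the depth 2 pages, and stops before reaching `p4`.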

Page 15: A sneak peek into the web

Using data

Once you’ve collected data you can:

- produce a map of the territory you explored
- create a search engine
- imagine loads of different applications...
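As a rough illustration of the search engine idea, crawled text can be turned into an inverted index that maps each word to the pages containing it. This is only one common technique, not something the slides prescribe; the page contents and the names `build_index` and `search` are hypothetical.

```python
from collections import defaultdict

def build_index(pages):
    """Map each lowercase word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the pages that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical crawled data: URL -> page text.
pages = {
    "u1": "web graph topology",
    "u2": "web crawler",
}
index = build_index(pages)
print(search(index, "web"))
print(search(index, "web crawler"))
```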

Page 16: A sneak peek into the web

End of the presentation

Feel free to ask any questions