Web-Based Information Systems University of Alberta Dr. Osmar R. Zaïane, 2001-2004 1
Web-Based Information Systems
Dr. Osmar R. Zaïane
University of Alberta
Fall 2004
CMPUT 410: Internet and WWW
Course Content
• Introduction
• Internet and WWW
• Protocols
• HTML and beyond
• Animation & WWW
• CGI & HTML Forms
• Javascript
• Databases & WWW
• Dynamic Pages Preliminaries
• Perl & Cookies
• SGML / XML
• CORBA & SOAP
• Web Services
• Search Engines
• Recommender Syst.
• Web Mining
• Security Issues
• Selected Topics
Objectives of Lecture 2
• Get a brief overview of the history of the Internet and the different tools that exist on the Internet;
• Understand the distinction between the Internet and the World-Wide Web.
Internet and WWW
Outline of Lecture 2
• The Memex machine: the dream will come true
• Hypertext: linking new kinds of documents
• The Internet: infallible information exchange
• The World-Wide Web and the start of a new era
• Web-based applications
• Some terminology
When Did It All Start?
• In 1945, Vannevar Bush wrote the article “As We May Think”, describing a machine, the Memex, that would hold collective human knowledge organized by “trails” linking materials on the same topic.
• The article revolutionized information technology even before modern computers existed.
Where is the Memex?
• Memex is a hypothetical machine.
• The information stored ought to be accessible.
• We haven’t fulfilled the dream yet.
• But much has been achieved in 50 years.
Hypertext-Hyperlink-Hypermedia
• Following the Memex idea, Ted Nelson developed the Xanadu project, which aimed at placing the entire world’s literary corpus on-line.
• Ted Nelson coined the term hypertext in 1965.
A document is not contiguous but is a set of connected parts of documents. Hyperlinks are links that connect sub-documents. Hypermedia is a multimedia hypertext document.
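The definition above can be pictured as a small graph: fragments are nodes and hyperlinks are edges. The following is only an illustrative sketch; the fragment names, their texts and the `follow_trail` helper are hypothetical and do not come from Xanadu or any real system.

```python
# Hypothetical fragments: each one holds some text and the names of the
# fragments it links to. None of these names come from a real system.
FRAGMENTS = {
    "memex": {"text": "A machine holding linked knowledge.", "links": ["hypertext"]},
    "hypertext": {"text": "Text made of connected parts.", "links": ["hypermedia"]},
    "hypermedia": {"text": "Hypertext that includes multimedia.", "links": []},
}


def follow_trail(fragments, start):
    """Follow the first hyperlink of each fragment until a dead end or a repeat."""
    trail = []
    seen = set()
    current = start
    while current is not None and current not in seen:
        seen.add(current)
        trail.append(current)
        links = fragments[current]["links"]
        current = links[0] if links else None
    return trail


if __name__ == "__main__":
    print(" -> ".join(follow_trail(FRAGMENTS, "memex")))
```

Reading the document then means walking links between fragments rather than scrolling through one contiguous text.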
ARPAnet
• In the heart of the Cold War, ARPA (Advanced Research Projects Agency) was created (1958). The purpose was to outrun the Russians in the race for mastering rocket launching.
• In 1969, it was decided to link sensitive computer centres by a network in order to withstand a possible nuclear attack. The idea was to allow centres to communicate even after a centre is destroyed. (Bob Taylor’s idea)
• It connected government labs, major research centres and universities.
• It existed until 1988 and was officially dismantled in 1990.
• Backbone network speed: 64 kbit/s
• Major achievements:
– TCP/IP, Domain Name Service, e-mail (SMTP), FTP, Telnet...
NSFnet
• DARPA, the Defense Advanced Research Projects Agency, still exists and the military have their own network, but the original ARPAnet was integrated into the current Internet.
• The National Science Foundation in the USA funded the NSFnet which was created in 1985.
• Backbone network speed: T1 (1.5 Mb/s) to T3 (45 Mb/s)
• It originally connected 5 major universities with supercomputer centres, but rapidly included other universities, research centres and private companies.
• Replaced ARPAnet as the backbone of the Internet in 1990.
What about the Internet?
• The Internet didn’t originate in the USA alone.
• Other networks existed in North America and Europe and other places in the world.
• BitNet, for instance, connected many research centres and universities.
• Bridges connected these networks to create a larger international network: the Internet.
• Late 90s: Internet2, funded by US universities, a sequel to NSFnet with new protocols.
CA*net      Year   Speed      USA equivalent
CA*net      1990   1.5 Mb/s   NSFnet
CA*net 2    1997   155 Mb/s   Internet2
CA*net 3    1999   2.5 Gb/s   Internet2 Abilene & vBNS projects
Canada committed $110 million for CA*net 4, a 10 Gb/s optical network connecting research institutions across Canada.
Explosive Growth
Internet Timeline
• 1969: ARPANET commissioned by DoD
• 1971: FTP on NCP
• 1972: ARPANET demonstration
• 1973: First international connection (UK + Norway)
• 1974: TCP/IP
• 1979: USENET
• 1981: BITNET and CSNET come into being
• 1982: ARPANET transition to TCP/IP
• 1983: ARPANET splits into ARPANET and MILNET
• 1985: FTP
• 1986: NSF-Net created
• 1986: NNTP
• 1988: IRC
• 1990: ARPANET ceases to exist
• 1990: Archie
• 1991: Gopher
• 1991: WAIS
• 1991: Netfind
• 1992: WWW in CERN
• 1992: MBONE
• 1992: Veronica
• 1993: Mosaic
• 1993: Crawlers
• 1993: Aliweb
• 1993: W3C
• 1994: E-commerce
• 1994: Yahoo
• 1994: UCSTRI
• 1994: Harvest
• 1994: MLDB + WebQL
• 1995: Java
• 1995: VRML
• 1996: Internet phone
• 1996: AltaVista
• 1996: WebSQL
• 1997: Wireless Internet access
• 1997: WebOQL
• 1998: Internet Tax Freedom Act
• 1998: Google
• 1998: Clever
• 1999: Internet2 NGI
• 1999: RSVP

Year   # countries
1969   1
1973   3
1989   11
1991   33
1992   49
1993   59
1994   81
1995   96
1996   134
1997   171

Year   # hosts
1969   4
1974   62
1981   213
1985   1,961
1990   313,000
1993   1,486,000
1995   6,642,000
1998   36,739,000
Advent of the World-Wide Web
• In 1990, Tim Berners-Lee developed an on-line hypertext-based system to help researchers at CERN in Switzerland share information across a diverse computer network.
• He came up with the first versions of HTML (based on SGML) and the HTTP protocol.
• HTTP and HTML catapulted the Internet to new heights.
• The WWW revolutionized the use of the Internet thanks to a user-friendly multimedia interface: the web browser.
• Mosaic was developed at NCSA by students at the University of Illinois in 1993, among them Marc Andreessen, who went on to co-found Netscape in 1994.
The WWW is not alone
• There are other tools on the Internet. They could be classified as:
– Command line. Ex: FTP (1971)
– Menu-based. Ex: Gopher (1991)
– Search engine. Ex: WAIS (1991)
– Hypermedia. Ex: WWW (1991)
Other Taxonomy of Internet Tools
• Communication services
– E-mail, newsgroups (Usenet), telnet, Internet Relay Chat (IRC), …
• Information storage and exchange
– FTP, Gopher, Alex, …
• Information indexing
– Archie, Veronica, WAIS, UCSTRI, Whois, …
• Interactive multimedia information delivery
– WWW and its indexes.
Client-Server Architecture
The World-Wide Web is an assortment of interconnected computers. In this context, computers provide data to other computers.
• Client: requests the information
• Server: provides the information
Client-Server Architecture
• Client → Server: request (a URL), sent over HTTP
• Server → Client: response (an HTML page), returned over HTTP
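The request/response exchange can be tried locally with nothing but the Python standard library. This is a minimal sketch, not part of the original slides: the `PageHandler` class, the `fetch_once` helper, the page content and the `/index.html` path are all made up for illustration.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page content served for every request.
PAGE = b"<html><body><h1>Hello from the server</h1></body></html>"


class PageHandler(BaseHTTPRequestHandler):
    """Server side: answers every GET request with the same HTML page."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # suppress the default per-request log line


def fetch_once():
    """Start a local server, request one URL as a client, return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0: the OS picks a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/index.html"
        # An empty ProxyHandler makes sure the request goes straight to localhost.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(url) as response:
            return response.status, response.read()
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, body = fetch_once()
    print(status, body.decode())
```

The client only knows a URL; everything it gets back is an HTTP status and an HTML page, which is the whole contract between the two sides.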
Client-Server Architecture
• Browser (Javascript and Java) → HTTP server: request + data, sent over HTTP
• HTTP server → Application (CGI + Servlets, in Perl and Java) → DB
• HTTP server → Browser: response, returned over HTTP
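The server-side part of this architecture (application plus database) can be sketched as a single function that receives the form data, queries the database and builds the HTML response. This is only an illustration under stated assumptions: the `courses` table, its contents and the `handle_request` function are hypothetical, and an in-memory SQLite database stands in for the DB.

```python
import sqlite3
from html import escape
from urllib.parse import parse_qs


def make_db():
    """Create a hypothetical in-memory database with one sample row."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE courses (code TEXT PRIMARY KEY, title TEXT)")
    db.execute(
        "INSERT INTO courses VALUES (?, ?)",
        ("CMPUT410", "Web-Based Information Systems"),
    )
    db.commit()
    return db


def handle_request(db, query_string):
    """Application tier: decode the request data, query the DB, return an HTML page."""
    params = parse_qs(query_string)
    code = params.get("code", [""])[0]
    # A parameterized query keeps user input out of the SQL text.
    row = db.execute("SELECT title FROM courses WHERE code = ?", (code,)).fetchone()
    if row is None:
        return "<html><body><p>No such course.</p></body></html>"
    return f"<html><body><p>{escape(code)}: {escape(row[0])}</p></body></html>"


if __name__ == "__main__":
    print(handle_request(make_db(), "code=CMPUT410"))
```

In a real deployment the HTTP server would hand the query string to such an application, for instance through CGI or a servlet, and send the returned page back to the browser.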
Application / Application Communication – scenario 1
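• No browser involved: an application takes the place of the browser.
• Application → HTTP server: request + data, sent over HTTP. The application has to identify the fields and variables to send (wrapper needed).
• HTTP server (backed by a DB) → Application: response. The application parses the HTML page to extract the needed information (wrapper needed).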
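A wrapper of the second kind, one that parses a returned HTML page to pull out the data, might look like the sketch below. It assumes the page presents its data in a plain HTML table; the `TableWrapper` class, the `extract_rows` helper and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class TableWrapper(HTMLParser):
    """Collects the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None       # cells of the row currently being read
        self._in_cell = False  # True while inside a <td> element

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self._in_cell = False
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data.strip()


def extract_rows(html_text):
    """Return the table rows of an HTML page as lists of cell strings."""
    parser = TableWrapper()
    parser.feed(html_text)
    parser.close()
    return parser.rows


# Hypothetical page, as an application might receive it from the HTTP server.
SAMPLE_PAGE = (
    "<html><body><table>"
    "<tr><td>CMPUT 410</td><td>Fall 2004</td></tr>"
    "<tr><td>CMPUT 999</td><td>Winter 2005</td></tr>"
    "</table></body></html>"
)

if __name__ == "__main__":
    print(extract_rows(SAMPLE_PAGE))
```

The weakness of this scenario shows in the code: the wrapper depends on the page layout, so any change to the HTML presentation can break the extraction.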
Application / Application Communication – scenario 2:
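• Application → Web Service (HTTP server, backed by a DB): XML request + data, sent as an XML doc (SOAP over HTTP)
• Web Service → Application: response, returned as an XML doc
• The application parses the XML with a known DTD or schema.
• CORBA can also be used to exchange objects.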
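When the response is XML with an agreed structure, the receiving application can read fields by name instead of scraping a layout. The sketch below assumes a made-up document format (`courseList`, `course`, `title`, `term`); it uses Python's `xml.etree.ElementTree`, which parses the document but does not validate it against a DTD or schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML response; the element and attribute names are invented
# for illustration and stand in for a structure both sides agreed on.
RESPONSE_XML = """<courseList>
  <course code="CMPUT410">
    <title>Web-Based Information Systems</title>
    <term>Fall 2004</term>
  </course>
</courseList>
"""


def parse_courses(xml_text):
    """Extract one dictionary per <course> element, reading fields by name."""
    root = ET.fromstring(xml_text)
    return [
        {
            "code": course.get("code"),
            "title": course.findtext("title"),
            "term": course.findtext("term"),
        }
        for course in root.findall("course")
    ]


if __name__ == "__main__":
    print(parse_courses(RESPONSE_XML))
```

Compared with the HTML wrapper of scenario 1, nothing here depends on how the data is displayed, only on the names defined by the shared document structure.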
Terms in the Glossary
• Internet: a group of networks connected together. The Internet refers to the global connection of networks around the world.
• LAN: Local Area Network: a group of computers, usually all in the same room or building, connected for the purpose of sharing files, exchanging email, and collaboration.
• Intranet: an internal company network; internal use of web capabilities.
• Extranet: the ability to securely connect the intranet with defined external networks.
• CGI: Common Gateway Interface: a means of developing applications for the web on the server side.
• Middleware: a tier usually between a web application or a web server and a database or another application layer.