Comp II....Unit 1
7/27/2019 Comp II....Unit 1
BBA IV Sem/CAII/Unit-1
The Louvre's website also has links to the sites of other museums, such as the Vatican
Museum. When you click on that link, you access the web server for the Vatican Museum. In
this way, information scattered across the globe can be linked together.
The "glue" that holds the Web together is called hypertext and hyperlinks. This feature
allows electronic files on the Web to be linked so you can jump easily between them. On the
Web, you navigate through pages of information (commonly known as browsing or surfing)
based on what interests you at that particular moment.
To access the Web you need a web browser, such as Netscape Navigator or Microsoft
Internet Explorer. How does your web browser distinguish between web pages and other types
of data on the Internet? Web pages are written in a computer language called Hypertext
Markup Language, or HTML.
Some Web History
The World Wide Web was originally developed in 1990 at CERN, the European
Laboratory for Particle Physics. The original idea came from a young computer scientist, Tim
Berners-Lee. It is now managed by The World Wide Web Consortium.
The WWW Consortium is funded by a large number of corporate members, including AT&T,
Adobe Systems, Inc., Microsoft Corporation and Sun Microsystems, Inc. Its purpose is
to promote the growth of the Web by developing technical specifications and reference software
that will be freely available to everyone. The Consortium is run by MIT with INRIA (The French
National Institute for Research in Computer Science) acting as European host, in collaboration
with CERN.
The National Center for Supercomputing Applications (NCSA) at the University of
Illinois at Urbana-Champaign was instrumental in the development of early graphical software
utilizing the World Wide Web features created by CERN. NCSA focuses on improving the
productivity of researchers by providing software for scientific modeling, analysis, and
visualization. The World Wide Web was an obvious way to fulfill that mission. NCSA Mosaic,
one of the earliest web browsers, was distributed free to the public. It led directly to the
phenomenal growth of the World Wide Web.
World Wide Web Vs Internet
We often fail to distinguish between the Internet and the World Wide Web. Though they are
related to each other, they are not the same. The Internet is a massive network that connects
millions of computers across the globe, whereas the Web is one way of accessing information
over the Internet. Information on the Internet travels from computer to computer via
protocols: the Internet uses SMTP for sending electronic mail, FTP for sharing files (text,
images, video or MP3), and HTTP for exchanging web-related (i.e., hypertext) information.
The Web uses only HTTP to transmit data, share web pages (hyperlinked documents) and
exchange business logic. It relies on a browser, such as Internet Explorer or Netscape
Navigator, to display hypertext documents. The Web can therefore be said to be a portion of
the Internet.
Domain Name System (DNS)
Format. Numeric IP addresses are difficult to remember. People are better at remembering
names and mnemonics (symbols, letters, etc.). Therefore, numeric addresses have been mapped
onto names, each consisting of a host name and a domain (the group to which the computer
belongs). The general format of a domain name is given below:
Host Name. Second Level Domain Name. First Level Domain Name
where
(a) Host Name is the name of the service provider or network, e.g., VSNL.
(b) Domain Name signifies the kind of organisation. Some of the organisational and
geographic domain names are given in the table.
Rules: The rules followed for mapping numeric IP addresses into the DNS scheme are:
(a) The DNS is a distributed, hierarchical naming system.
(b) A node in the DNS is named by traversing the tree from the node itself to the root. At
each node, the node's name is added and a period (.) is appended to it, until the root is
reached.
(c) Each node can have any number of child nodes but only one parent node. Child nodes of
the same parent must have different names to ensure a unique naming system.
(d) Names are conventionally written in lower case (the DNS itself does not distinguish
upper case from lower case), with no spaces between the dots (periods).
The figure shows the domain name (address) of a node named APJ. A domain name
server on the Internet keeps a directory of all the nodes on it.
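Rule (b) can be illustrated with a short sketch. The Node class and domain_name function below are hypothetical names chosen for illustration, not part of any DNS software, and the tree is assumed from the example address apj.vsnl.net.in.

```python
# Hypothetical node type for illustration; not part of any DNS library.
class Node:
    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent  # each node has exactly one parent (None for the root)


def domain_name(node):
    """Walk from the node up to the root, joining the labels with periods."""
    labels = []
    while node.parent is not None:  # stop when the unnamed root is reached
        labels.append(node.label)
        node = node.parent
    return ".".join(labels)


# Assumed tree for the example: root -> in -> net -> vsnl -> apj
root = Node("")
india = Node("in", root)
net = Node("net", india)
vsnl = Node("vsnl", net)
apj = Node("apj", vsnl)

print(domain_name(apj))
```

Because every node has only one parent, the walk to the root is unambiguous, which is what makes each name unique.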
FIG: DOMAIN NAME
[Figure not reproduced. Node labels in the figure: com, in, mil, edu, gov, net, vsnl, apj, yahoo. Domain name shown: apj.vsnl.net.in]

Organizational & Geographical Domain Names
Domain   Meaning
com      Commercial Organisation
edu      Educational
gov      Government Agencies
mil      Military Organisation
net      Sites which perform some administrative functions for the Net
org      Non Profit Organization
au       Australia
ca       Canada
es       Spain
fr       France
hk       Hong Kong
in       India
jp       Japan
uk       United Kingdom
us       United States
IP Addressing
Every host and router on the Internet has a unique IP address, which encodes its network
number and host number. No two machines or routers can have the same IP address. The
addressing scheme on the Internet uses IPv4 (Internet Protocol version four), which is a
32-bit addressing scheme. In this scheme, the 32 bits are divided into four groups of 8 bits
each, joined by periods (i.e., 8 bits.8 bits.8 bits.8 bits). Eight bits can represent 256 (2^8)
numbers, so each eight-bit group can represent numbers from 0 to 255. A typical IP address
looks like 137.00.2.11. Based on this addressing scheme, networks connected to the Internet
have been classified into five types, as shown in the figure.
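The arithmetic above can be made concrete with a minimal sketch. The function names to_int and address_class are hypothetical, and the class boundaries are assumed from the leading-bit patterns of the classful scheme described in this section.

```python
def to_int(address):
    """Pack a dotted address (four 8-bit groups) into one 32-bit number."""
    parts = [int(p) for p in address.split(".")]
    if len(parts) != 4 or any(p < 0 or p > 255 for p in parts):
        raise ValueError("expected four numbers between 0 and 255")
    value = 0
    for p in parts:
        value = (value << 8) | p  # shift left 8 bits, then add the next group
    return value


def address_class(address):
    """Classify an address by the leading bits of its first 8-bit group."""
    first = to_int(address) >> 24  # the first (most significant) group
    if first < 128:
        return "A"  # leading bit 0
    if first < 192:
        return "B"  # leading bits 10
    if first < 224:
        return "C"  # leading bits 110
    if first < 240:
        return "D"  # leading bits 1110 (multicast)
    return "E"      # remaining addresses (reserved)


print(to_int("137.00.2.11"), address_class("137.00.2.11"))
```

The sample address from the text starts with 137, which lies between 128 and 191, so the sketch reports it as a class B address.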
FIG: IP ADDRESSING SYSTEM
Class  Leading bits  Fields                   Range of hosts                 Size
A      0             Network, Host            1.0.0.0 to 127.255.255.255     126 networks with 16 million hosts
B      10            Network, Host            128.0.0.0 to 191.255.255.255   16,382 networks with 64 K hosts
C      110           Network, Host            192.0.0.0 to 223.255.255.255   2 million networks with 254 hosts
D      1110          Multicast address        224.0.0.0 to 239.255.255.255
E      11110         Reserved for future use  240.0.0.0 to 247.255.255.255

URL (Uniform Resource Locator)
A URL is a string of characters that specifies the address of a Web page.
What the browser displays is hypertext that contains pointers to other documents. The
pointers are implemented using a concept that is central to Web browsers, the Uniform
Resource Locator. A URL can be thought of as a network extension of the standard file name
concept, except that in this case the file and its directory can exist on any computer on the
network. Typing a URL in the location area and hitting the return key will cause the browser to
attempt to retrieve that page. If the browser is successful in finding the page, it will display it.
This high-level explanation does not, however, convey any of the details of what is happening.
To go from a URL to having the Web page displayed, the browser needs to be able to answer
such questions as:
How can the page be accessed?
Where can the page be found?
What is the file name corresponding to the page?
The URL is designed to incorporate sufficient information to resolve these questions.
Quite naturally, then, the URL has three parts. We can view the format of a URL as follows:
how://where/what
In other words, a URL contains three parts: the first describes the type of resource
(protocol), the second gives the name of the server housing the resource, and the third
gives the full file name of the resource, i.e., directory, subdirectory and file name. The format is:
protocol://domain name of server/directory name/sub-directory name/file name
At this point, it is helpful to consider a sample URL to illustrate the three parts:
http://pubpages.uminn.edu/index.html
Let us break this example down into its components.
1. http: Defines the protocol or scheme by which to access the page. In this case, the
protocol is Hypertext Transfer Protocol. This protocol is the set of rules by which an
HTML document is transferred over the Web.
2. pubpages.uminn.edu: Identifies the domain name of the computer where the page
resides. The computer is a Web server capable of satisfying page requests. Just as a
waiter serves food, a Web server serves Web pages. The name pubpages.uminn.edu tells
the browser on which computer to find the Web page. In this case, the computer is
located at the University of Minnesota.
3. index.html: Provides the local name (usually a filename) uniquely identifying the
specific page. If no name is specified, the Web server where the page is located may
supply a default file. On many systems, the default file is named index.html or index.htm.
This example demonstrates that the URL consists of a protocol, a Web server's domain name,
and a file name.
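The how://where/what breakdown can be tried out with Python's standard urllib.parse module. This is only an illustrative sketch using the sample URL above; the variable names are arbitrary.

```python
from urllib.parse import urlsplit

# Split the sample URL into its three parts.
parts = urlsplit("http://pubpages.uminn.edu/index.html")

how = parts.scheme    # the protocol used to access the page
where = parts.netloc  # the domain name of the Web server
what = parts.path     # the file name (with any directories) on that server

print(how, where, what)
```

The same call works for other schemes such as ftp, which is one reason a single parsing step can serve a multiprotocol browser.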
Entering a URL in the location field of the browser will bring up the designated Web page,
barring any problems. For example, if the Web page has moved to another machine or has been
removed, or if you type an invalid URL, or if the server you are trying to access is unavailable,
an error message will be displayed. Another way to retrieve a Web page is to mouse over and
click on a hyperlink in the Web page that is currently being displayed.
In the URL example presented earlier, the protocol to access the page was http. This is used
for transferring an HTML document. Much of the power of browsers is that they are
multiprotocol. That is, they can retrieve and render information from a variety of servers and
sources. The table below provides a summary of other common protocols:
Protocol Name   Use              Example
ftp             File Transfer    ftp://ftp.bio.umaine.edu
gopher          Gopher           gopher://gopher.tc.umn.edu/11/libraries
http            Hypertext        http://www.chem.uab.edu/pauling/argon.html
telnet          Remote Login     telnet://www.amnesty.org
mailto          Sending E-mail   mailto:[email protected]
Concept of Protocol
For any network to exist, there must be connections between computers and agreements
(protocols) about the communication language. However, setting up connections and agreements
between disparate computers (PCs to mainframes) is complicated by the fact that, over the last
decade, systems have become increasingly heterogeneous in their software and hardware as well
as in their intended functionality. A range of standards for networking, called protocol stacks,
has been developed.
A protocol standard allows heterogeneous computers to talk to each other. Protocol
stacks are software that performs the variety of actions necessary for data transmission between
computers. They implement sets of rules for inter-computer communication that have been
agreed upon and implemented by many vendors, users and standards bodies. The protocol stack
works by residing either in a computer's memory or in the memory of a transmission device such
as a network interface card. When data is ready for transmission, the stack puts the data on the
wire. At the receiving end, it takes the data off the wire and prepares it for the application,
removing the error control information that was added at the transmission end. The Internet
uses TCP/IP (Transmission Control Protocol/Internet Protocol) as its protocol.
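The idea of adding error control information at the sending end and removing it at the receiving end can be sketched as follows. This is a simplified illustration that assumes a CRC-32 checksum appended to the data; it is not how TCP/IP itself formats its headers, and the function names are hypothetical.

```python
import zlib


def put_on_wire(data: bytes) -> bytes:
    """Sender side: append a 4-byte CRC-32 checksum to the data."""
    checksum = zlib.crc32(data).to_bytes(4, "big")
    return data + checksum


def take_off_wire(frame: bytes) -> bytes:
    """Receiver side: verify the checksum, strip it, and return the data."""
    data, checksum = frame[:-4], frame[-4:]
    if zlib.crc32(data).to_bytes(4, "big") != checksum:
        raise ValueError("frame damaged in transit")
    return data


frame = put_on_wire(b"hello, network")
print(take_off_wire(frame))
```

If any bit of the frame changes in transit, the recomputed checksum no longer matches and the receiver rejects the frame instead of passing bad data to the application.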
Web Caching
Web caching is the storage of Web objects near the user to allow fast access, thus
improving the user experience of the Web surfer. Examples of Web objects are Web pages
(the HTML itself), images in Web pages, etc. Web objects can be cached locally on the user's
computer or on a server on the Web.
Browser cache: Browsers cache Web objects on the user's machine. A browser first looks for
objects in its cache before requesting them from the website. Caching frequently used Web
objects speeds up Web surfing. For example, I often use google.com and yahoo.com. If their
logos and navigation bars are stored in my browser's cache, then the browser will pick them up
from the cache and will not have to get them from the respective websites. Getting the objects
from the cache is much faster than getting them from the websites.
Web objects can have an expiry time associated with them, after which the object is considered
to be stale. A stale object is not used. If the object in the cache is stale, then it is equivalent to
the object not being in the cache. An expiry date can be specified in the HTTP header of a Web
object. The expiry date is specified using the Expires and Cache-Control HTTP headers.
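The stale-object rule can be sketched in a few lines. BrowserCache, put, get and max_age are hypothetical names for illustration; a real browser derives the expiry time from the HTTP headers rather than from a parameter.

```python
import time


class BrowserCache:
    """Toy cache: each object is stored with the time at which it expires."""

    def __init__(self):
        self._store = {}  # url -> (body, expires_at)

    def put(self, url, body, max_age, now=None):
        now = time.time() if now is None else now
        self._store[url] = (body, now + max_age)

    def get(self, url, now=None):
        """Return the cached body, or None if it is absent or stale."""
        now = time.time() if now is None else now
        entry = self._store.get(url)
        if entry is None:
            return None
        body, expires_at = entry
        if now >= expires_at:
            del self._store[url]  # a stale object is treated as not cached
            return None
        return body


cache = BrowserCache()
cache.put("http://google.com/logo.gif", b"logo bytes", max_age=60, now=1000)
print(cache.get("http://google.com/logo.gif", now=1030))
print(cache.get("http://google.com/logo.gif", now=1061))
```

The optional now argument exists only so the behaviour can be demonstrated without waiting; when it is omitted the current clock time is used.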
What are the Advantages of Web Caching?
Web caching has the following advantages:
Faster delivery of Web objects to the end user.
Reduced bandwidth needs and cost. This benefits the user, the service provider and the
website owner.
Reduced load on the website servers.
Web Server
Web servers are computers that deliver (serve up) Web pages. In other words, a web server
is a computer that stores web pages and gives them to the client whenever asked. When a
client or browser sends a request message, the request is directed to the server identified by
the domain name. Every Web server has an IP address and possibly a domain name. For example,
if you enter the URL http://www.pcwebopedia.com/index.html in your browser, this sends a
request to the Web server whose domain name is pcwebopedia.com. The server then fetches the
page named index.html and sends it to your browser.
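The fetch-and-send step can be sketched without any networking. The PAGES dictionary and the serve function are hypothetical stand-ins for a server's stored files and its request handler; a real server would read files from disk and speak HTTP.

```python
# Hypothetical in-memory "site"; a real server would read files from disk.
PAGES = {"/index.html": "<html><body>Welcome</body></html>"}


def serve(path, pages=PAGES, default_file="index.html"):
    """Return (status_code, body) for a requested path."""
    if path.endswith("/"):
        path += default_file  # supply the default file when none is named
    if path in pages:
        return 200, pages[path]
    return 404, "Not Found"


print(serve("/index.html"))
print(serve("/missing.html"))
```

The 404 branch corresponds to the error message a browser shows when the requested page has been moved or removed.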
Any computer can be turned into a Web server by installing server software and connecting the
machine to the Internet. There are many Web server software applications, including public
domain software from NCSA and Apache, and commercial packages
from Microsoft, Netscape and others.
Proxy Server
A proxy server is a server that sits between a client application, such as a Web browser,
and a real server. It intercepts all requests to the real server to see if it can fulfill the
requests itself. If not, it forwards the request to the real server.
In computer networks, a proxy server is a server (a computer system or an application) that acts
as an intermediary for requests from clients seeking resources from other servers. A client
connects to the proxy server, requesting some service, such as a file, connection, web page, or
other resource available from a different server. The proxy server evaluates the request according
to its filtering rules. For example, it may filter traffic by IP address or protocol. If the request is
validated by the filter, the proxy provides the resource by connecting to the relevant server and
requesting the service on behalf of the client. A proxy server may optionally alter the client's
request or the server's response, and sometimes it may serve the request without contacting the
specified server. In this case, it 'caches' responses from the remote server, and answers subsequent
requests for the same content directly.
Proxy servers have two main purposes:
Improve Performance: Proxy servers can dramatically improve performance for groups
of users, because a proxy saves the results of all requests for a certain amount of time.
Consider the case where both user X and user Y access the World Wide Web through a
proxy server. First user X requests a certain Web page, which we'll call Page 1.
Sometime later, user Y requests the same page. Instead of forwarding the request to the
Web server where Page 1 resides, which can be a time-consuming operation, the proxy
server simply returns the Page 1 that it already fetched for user X. Since the proxy server
is often on the same network as the user, this is a much faster operation. Real proxy
servers support hundreds or thousands of users. The major online services such
as America Online, MSN and Yahoo, for example, employ an array of proxy servers.
Filter Requests: Proxy servers can also be used to filter requests. For example, a
company might use a proxy server to prevent its employees from accessing a specific set
of Web sites.
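Both purposes can be sketched in one small class. CachingProxy, fetch_from_origin and blocked_hosts are hypothetical names, and the origin server is simulated by a plain function rather than a network call.

```python
class CachingProxy:
    """Toy proxy: filters requests, then answers from its cache when it can."""

    def __init__(self, fetch_from_origin, blocked_hosts=()):
        self.fetch_from_origin = fetch_from_origin  # function(host, path) -> page
        self.blocked_hosts = set(blocked_hosts)
        self.cache = {}

    def request(self, host, path):
        if host in self.blocked_hosts:  # filter requests
            return "403 blocked by proxy"
        key = (host, path)
        if key not in self.cache:       # first request: go to the real server
            self.cache[key] = self.fetch_from_origin(host, path)
        return self.cache[key]          # later requests: served from the cache


origin_calls = []


def fake_origin(host, path):
    """Stand-in for the real Web server; records each time it is contacted."""
    origin_calls.append((host, path))
    return f"page {path} from {host}"


proxy = CachingProxy(fake_origin, blocked_hosts={"blocked.example"})
page_for_x = proxy.request("example.org", "/page1")  # user X
page_for_y = proxy.request("example.org", "/page1")  # user Y, same page
print(page_for_y, len(origin_calls))
```

In this sketch the cache never expires; a real proxy keeps results only for a certain amount of time, as described above.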
Firewall
A firewall is a system designed to prevent unauthorized access to or from a private network.
Firewalls can be implemented in hardware, in software, or in a combination of both.
Firewalls are frequently used to prevent unauthorized Internet users from accessing private
networks connected to the Internet, especially intranets. All messages entering or leaving the
intranet pass through the firewall, which examines each message and blocks those that do not
meet the specified security criteria.
There are several types of firewall techniques:
Packet filter: Looks at each packet entering or leaving the network and accepts or
rejects it based on user-defined rules. Packet filtering is fairly effective and transparent to
users, but it is difficult to configure. In addition, it is susceptible to IP spoofing.
Application gateway: Applies security mechanisms to specific applications, such
as FTP and Telnet servers. This is very effective, but can degrade performance.
Circuit-level gateway: Applies security mechanisms when a TCP or UDP connection
is established. Once the connection has been made, packets can flow between the hosts
without further checking.
Proxy server: Intercepts all messages entering and leaving the network. The proxy
server effectively hides the true network addresses.
In practice, many firewalls use two or more of these techniques in concert. A firewall is
considered a first line of defense in protecting private information. For greater security, data can
be encrypted.
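The packet filter technique can be sketched with user-defined rules checked in order. The Rule type, the decide function and the example rules are assumptions made for illustration; real firewalls match on many more fields.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    action: str          # "accept" or "reject"
    source: str          # source network in CIDR notation
    port: Optional[int]  # destination port, or None for any port


def decide(rules, source_ip, dest_port, default="reject"):
    """Return the action of the first rule matching the packet."""
    addr = ipaddress.ip_address(source_ip)
    for rule in rules:
        in_network = addr in ipaddress.ip_network(rule.source)
        port_matches = rule.port is None or rule.port == dest_port
        if in_network and port_matches:
            return rule.action
    return default  # nothing matched: fall back to the default policy


# Hypothetical rule set: block one network entirely, allow Web traffic from anywhere.
rules = [
    Rule("reject", "10.0.0.0/8", None),
    Rule("accept", "0.0.0.0/0", 80),
]

print(decide(rules, "203.0.113.5", 80))
print(decide(rules, "10.1.2.3", 80))
```

Because the filter trusts the source address written in the packet, a forged address can slip past a rule, which is the IP spoofing weakness mentioned above.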
Web Portal
A Web portal or public portal refers to a Web site or service that offers a broad array of
resources and services, such as e-mail, forums, search engines, and online shopping malls. The
first Web portals were online services, such as AOL, that provided access to the Web, but by
now most of the traditional search engines have transformed themselves into Web portals to
attract and keep a larger audience.
An enterprise portal is a Web-based interface for users of enterprise applications. Enterprise
portals also provide access to enterprise information such as corporate databases, applications
(including Web applications), and systems.
Home Page
This is the starting point or front page of a Web site. This page usually has some sort of
table of contents on it and often describes the purpose of the site. For example,
http://www.apple.com/index.html is the home page of Apple.com. When you type in a basic
URL, such as "http://www.cnet.com," you are typically directed to the home page of the Web
site. Many people have a "personal home page," which is another way the term "home page" can
be used.
Web Page and Web Site
Web pages are what make up the World Wide Web. These documents are written in
HTML (Hypertext Markup Language) and are translated by your Web browser. Web pages can
be either static or dynamic. Static pages show the same content each time they are viewed.
Dynamic pages have content that can change each time they are accessed. These pages are
typically written in scripting languages such as PHP, Perl, ASP, or JSP. The scripts in the pages
run functions on the server that return things like the date and time, and database information.
All the information is returned as HTML code, so when the page gets to your browser, all the
browser has to do is translate the HTML.
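A dynamic page can be sketched as a function that builds HTML on each request. Python stands in here for the scripting languages named above, and render_page with its parameters is a hypothetical name for illustration.

```python
from datetime import datetime
from html import escape


def render_page(visitor_name, now=None):
    """Build the HTML for one request; the content changes with each call."""
    now = now or datetime.now()
    return (
        "<html><body>"
        f"<p>Hello, {escape(visitor_name)}.</p>"        # escape user-supplied text
        f"<p>Server time: {now:%Y-%m-%d %H:%M}</p>"    # value computed on the server
        "</body></html>"
    )


print(render_page("Anu"))
```

The browser receives only the finished HTML, so it handles this page exactly as it would a static one.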
Please note that a Web page is not the same thing as a Web site. A Web site is a
collection of pages. A Web page is an individual HTML document. This is a good distinction to
know, as most techies have little tolerance for people who mix up the two terms.
Cookies
A cookie, also known as an HTTP cookie, web cookie, or browser cookie, is used by
an origin website to send state information to a user's browser and by the browser to return the
state information to the origin site. The state information can be used for authentication,
identification of a user session, user preferences, shopping cart contents, or anything else that
can be accomplished through storing text data on the user's computer.
Cookies cannot be programmed, cannot carry viruses, and cannot install malware on the
host computer. However, they can be used by spyware to track a user's browsing activities, a
major privacy concern that prompted European and US lawmakers to take action. Cookies can
also be stolen by hackers to gain access to a victim's web account.
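The round trip of state information can be sketched with Python's standard http.cookies module. The cookie name session_id and its value are made-up examples.

```python
from http.cookies import SimpleCookie

# Origin site: build the header that sends state to the browser.
outgoing = SimpleCookie()
outgoing["session_id"] = "abc123"
outgoing["session_id"]["path"] = "/"
header_line = outgoing.output()  # a "Set-Cookie: ..." header line
print(header_line)

# Later request: the browser returns the state; the site parses it.
incoming = SimpleCookie()
incoming.load("session_id=abc123")
print(incoming["session_id"].value)
```

Anyone who obtains the returned value can present it as their own, which is why a stolen cookie can give access to a victim's account.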
Browsers
A Web browser is a program that your computer runs to communicate with Web
servers on the Internet, which enables it to download and display the Web pages that you request.
A Web browser is an interface between the user and the internal workings of the Internet.
Browsers are also referred to as Web clients or universal clients, as they follow the principle of
client-server technology, in which the browser is the client.
When you type a URL in the address window or follow a hyperlink, the browser contacts
the server by sending a request for the required information. After receiving this information, the
browser displays it as a Web page in the user's window.
At a minimum, a Web browser must understand HTML and display text. In recent years,
however, Internet users have come to expect a lot more. A state-of-the-art Web browser provides
a full multimedia experience, complete with pictures, sound, video, and even 3-D imaging.
Because a Web browser has the ability to interpret or display so many types of files, you
may often use a Web browser even when you are not connected to the Internet. Windows 98, for
example, uses Internet Explorer to open most image files.
There are many types of browsers; you can obtain a comprehensive list from
the web site www.browsers.com. The most popular browsers, by far, are Netscape Navigator and
Microsoft Internet Explorer. Both are state-of-the-art browsers, and the competition between
them is fierce.
Both Navigator and Internet Explorer are available over the Internet at no charge.
Microsoft designed Internet Explorer for the Windows operating system, but it is now available
for Macintosh and some UNIX systems as well. Navigator is available for Windows, Macintosh,
UNIX, and Linux operating systems.
Features of a Good Browser
1. The most important feature of a web browser is the presentation of web pages without
distortion.
2. The browser should support multimedia features like sound, video, etc.
3. It should also support forms and frames. Frames divide web pages into sections, thus
improving readability.
4. A good browser should have the ability to open multiple windows.
5. The latest browsers support ActiveX technology, Java, VRML and other plug-ins.
6. E-mail, news, and FTP support should also be provided.
7. Last but not least, a certain amount of security, such as the ability to block
access to certain Web pages, should also exist.
Internet
The Internet (Interconnected Networks) is the best-known component of
the Information Superhighway (I-Way) infrastructure. Today, the Internet is an information
distribution system spanning several continents. Its general infrastructure targets not just one
electronic commerce application, such as video-on-demand or home shopping, but a wide range
of computer-based services, such as e-mail, EDI, information publishing, information retrieval
and video conferencing. Simply put, the Internet environment is a unique combination of postal
service, telephone system, research library, supermarket and talk show center that enables people
to share and purchase information. The Internet is viewed as a prototype for the emerging I-Way,
of which it will become one component.
The Internet began around 1965, when the US Department of Defense (DoD) financed the design
of a computer network to link a handful of universities and military research laboratories,
called the Advanced Research Projects Agency Network (ARPANET). In the mid-1980s the National
Science Foundation (NSF) took over control, when defence traffic moved from ARPANET to
MILNET. In 1987, the NSF created NSFNET. In 1991, commercial Internet traffic started using the
NSF backbone. In 1995, NSFNET was decommissioned and the modern Internet came into existence.
Internet Administration
The Internet, with its roots primarily in the research domain, has evolved and gained a
broader user base with significant commercial activity. Various groups that coordinate Internet
issues have guided its growth and development. The figure shows the general organization of
Internet administration.
Internet Society (ISOC)
The Internet Society (ISOC) is an international, non-profit organization formed in 1992
to provide support for the Internet standards process. ISOC accomplishes this by
maintaining and supporting other Internet administrative bodies such as IAB, IETF, IRTF and
IANA. ISOC also promotes research and other scholarly activities relating to the Internet.
Internet Architecture Board (IAB)
The Internet Architecture Board (IAB) is the technical advisor to the ISOC. The main
purpose of the IAB is to oversee the continuing development of the TCP/IP Protocol Suite and to
serve in an advisory capacity to research members of the Internet community. IAB accomplishes
this through its two primary components, the Internet Engineering Task Force (IETF) and the
Internet Research Task Force (IRTF). Another responsibility of the IAB is the editorial
management of the RFCs. IAB is also the external liaison between the Internet and other
standards organizations and forums.
Internet Engineering Task Force (IETF)
The Internet Engineering Task Force (IETF) is a forum of working groups managed by
the Internet Engineering Steering Group (IESG). IETF is responsible for identifying operational
problems and proposing solutions to these problems. IETF also develops and reviews
specifications intended as Internet standards. The working groups are collected into areas, and
each area concentrates on a specific topic. Currently nine areas have been defined, although this
is by no means a hard and fast number. The areas are:
Applications
Internet Protocols
Routing
Operations
User Services
Network Management
Transport
Internet protocol next generation (IPng)
Security
Internet Research Task Force
The Internet Research Task Force (IRTF) is a forum of working groups managed by the
Internet Research Steering Group (IRSG). IRTF focuses on long term research topics related to
Internet protocols, applications, architecture, and technology.
Internet Assigned Numbers Authority (IANA) and Internet Corporation for Assigned Names and
Numbers (ICANN)
The Internet Assigned Numbers Authority (IANA), supported by the U.S. government,
was responsible for the management of Internet domain names and addresses until October 1998.
At that time the Internet Corporation for Assigned Names and Numbers (ICANN), a private
non-profit corporation managed by an international board, assumed IANA operations.
Network Information Center (NIC)
The Network Information Center (NIC) is responsible for collecting and distributing
information about TCP/IP protocols.
History of Internet
1960s Telecommunications:- Essential to the early Internet concept was packet
switching, in which data to be transmitted is divided into small packets of information and
labeled to identify the sender and recipient. The packets were sent over a network and then
reassembled at their destination. If any packet did not arrive or was not intact, the original sender
was requested to resend the packet.
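Packet switching as described above can be sketched in a few lines. The packet fields (from, to, seq, total, data) and the function names are assumptions made for illustration, not the format of any real network.

```python
import random


def packetize(message, sender, recipient, size=8):
    """Divide a message into small packets labeled with sender and recipient."""
    chunks = [message[i:i + size] for i in range(0, len(message), size)]
    return [
        {"from": sender, "to": recipient, "seq": n, "total": len(chunks), "data": chunk}
        for n, chunk in enumerate(chunks)
    ]


def reassemble(packets):
    """Return (message, missing); message is None while packets are missing."""
    if not packets:
        return None, []
    total = packets[0]["total"]
    received = {p["seq"]: p["data"] for p in packets}
    missing = [n for n in range(total) if n not in received]
    if missing:
        return None, missing  # the receiver would ask the sender to resend these
    return "".join(received[n] for n in range(total)), []


packets = packetize("Packets may arrive in any order.", "host-a", "host-b")
random.shuffle(packets)  # simulate packets taking different routes
message, missing = reassemble(packets)
print(message)
```

The sequence number carried in each packet is what lets the destination rebuild the message regardless of arrival order and name exactly which packets need resending.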
ARPANET, 1969:- In 1969, Bolt, Beranek and Newman, Inc. (BBN) designed a
network called the ARPANET for the United States Department of Defense. The
military created ARPA to enable researchers to share super-computing power. It
was rumored that the military developed ARPANET in response to the threat of a
nuclear attack destroying the country's communication system.
1970s Telecommunications:- In this decade, the ARPANET was used primarily by the
military, some of the larger companies, such as IBM, and universities. The general population
was not yet connected to the system and very few people were online at work.
The use of Local Area Networks (LANs) became more prevalent during the 1970s. Also,
the idea of an open architecture was promoted; that is, networks making up the ARPANET could
have any design. In later years, this concept had a tremendous impact on the growth of the
ARPANET.
Twenty Three Nodes, 1972:- By 1972, the ARPANET was international, with nodes
in Europe at University College London, England, and the Royal Radar
Establishment in Norway. The number of nodes on the network was up to 23, and the
trend would be for that number to double every year from then on. Ray Tomlinson,
who worked at BBN, invented e-mail.
UUCP, 1976:- AT&T Bell Labs developed UNIX to UNIX Copy (UUCP). In 1977, UUCP
was distributed with UNIX.
USENET, 1979:- User Network (USENET) was started by using UUCP to connect
Duke University and the University of North Carolina at Chapel Hill. Newsgroups
emerged from this early development.
1980s Telecommunications:- In this decade, Transmission Control Protocol/Internet
Protocol (TCP/IP), a set of rules governing how networks making up the ARPANET
communicate, was established. For the first time, the term Internet was being used to describe
the ARPANET. Security became a concern, as viruses appeared. As the Internet became larger,
the Domain Name System was developed to allow the network to expand more easily by
assigning names to host computers in a distributed fashion.
CSNET, 1980:- The Computer Science Network (CSNET) connected all university
computer science departments in the United States. Computer science departments
were relatively new and only a limited number existed in 1980. CSNET joined the
ARPANET in 1981.
BITNET, 1981:- The Because It's Time Network (BITNET) formed at the City
University of New York and connected to Yale University. Many mailing lists
originated with BITNET.
TCP/IP, 1983:- The United States Defense Communication Agency required that
TCP/IP be used for all ARPANET hosts. Since TCP/IP was distributed at no charge,
the Internet became what is called an open system. This allowed the Internet to grow
quickly, as all connected computers were now speaking the same language. Central
administration was no longer necessary to run the network.
NSFNET, 1985:- The National Science Foundation Network (NSFNET) was formed
to connect the National Science Foundation's (NSF's) five super-computing centers.
This allowed researchers to access the most powerful computers in the world, at a
time when large, powerful, and expensive computers were a rarity and generally
inaccessible.
The Internet Worm and IRC, 1988:- The virus called the Internet Worm (created by
Robert Morris while he was a computer science graduate student at Cornell
University) was released. It infected 10 percent of all Internet hosts. Also in this year,
Internet Relay Chat (IRC) was written by Jarkko Oikarinen.
NSF Assumes Control of the ARPANET, 1989:- NSF took over control of the
ARPANET in 1989. This changeover went unnoticed by nearly all users. Also, the
number of hosts on the Internet exceeded the 100,000 mark.
1990s Telecommunications:- During the 1990s, many commercial organizations started getting on-line. This stimulated the growth of the Internet like never before. URLs appeared in television advertisements and, for the first time, young children went on-line in significant numbers.
Graphical browsing tools were developed, and the markup language HTML allowed users all over the world to publish on what was called the World Wide Web. Millions of people went on-line to work, shop, bank, and be entertained. The Internet played a much more significant role in society, as many nontechnical users from all walks of life got involved with computers. Computer literacy and Internet courses sprang up all over the world.
Gopher, 1991:- Gopher was developed at the University of Minnesota, whose sports teams' mascot is the Golden Gopher. Gopher allowed you to "go for" or fetch files on the Internet using a menu-based system. Many Gophers sprang up all over the country, and all types of information could be located on Gopher servers. Gopher is still available and accessible through Web browsers, but its popularity has faded; for the most part, it is only of historical interest. (gopher://gopher.well.sf.ca.us/)
World Wide Web, 1991:- The World Wide Web (WWW) was created by Tim Berners-Lee at CERN (a French acronym for the European Laboratory for Particle Physics), as a simple way to publish information and make it available on the Internet.
WWW Publicly Available, 1992:- The interesting nature of the Web caused it to spread, and it became available to the public in 1992. Those who first used the system were immediately impressed.
Netscape Communications, 1994:- The company called Netscape Communications, formed by Marc Andreessen and Jim Clark, released Netscape Navigator, a Web browser that captured the imagination of everyone who used it. The number of users of this software grew at a phenomenal rate. Netscape made its money largely through advertising on its Web pages.
Yahoo, 1994:- Stanford graduate students David Filo and Jerry Yang developed their Internet search engine and directory called Yahoo, which is now world famous.
Java, 1995:- The Internet programming environment, Java, was released by Sun Microsystems, Inc. This language, originally called Oak, allowed programmers to develop Web pages that were more interactive.
Microsoft Discovers the Internet, 1995:- The software giant committed many of its resources to developing its browser, Microsoft Internet Explorer, and Internet applications.
Netscape Releases Source Code, 1998:- Netscape Communications released the source code for its Web browser.
Internet Services
The Internet provides a mechanism for millions of computers to communicate, but what kind of information is transmitted? Many services are available over the Internet, and the following are the most popular ones.
1) E-Mail:- Enables people to send private messages, as well as files, to one or more other people.
2) Mailing Lists:- Enable groups of people to conduct group conversations by e-mail, and provide a way of distributing newsletters by e-mail.
3) On-line Chat:- Provides a way for real-time online chatting to occur, whereby participants read each other's messages within seconds of when they are sent.
4) Voice and video conferencing:- Enable two or more people to hear and see each other and share other applications.
5) The World Wide Web:- A distributed system of interlinked pages that include text, pictures, sound, and other information.
6) File Transfer:- Lets people download files from public file servers, including a wide variety of programs.
7) Remote Login:- There are two programs that allow you to log in to another computer from an account in which you are already logged in; they let you use and interact with software on the remote machine. To do this, you will need a second computer account and password that is accessible to you.
8) Internet Telephony:- As the name suggests, Internet telephony involves using the Internet to transmit real-time audio from one personal computer to another (or, in some instances, to a telephone itself).
9) USENET:- It is a bulletin board service featuring a large number of discussion groups involving millions of people around the world.
10) Archie:- It is an indexing service, similar to a library catalogue. The contents of a large number of FTP servers are indexed on a number of Archie servers on the Internet.
11) Gopher:- Before the Web came into existence, the University of Minnesota developed a system called Gopher connecting universities, colleges and government authorities. The Gopher system is based on a set of related menus. All the interconnected Gopher servers are collectively known as Gopher Space.
12) Veronica:- It provides Archie-like search services for Gopher. Veronica searches are not necessarily always easy or fast, as Gopher servers are widely distributed.
13) WAIS:- It is an Internet service which looks for specific information in Internet databases. Searching is done by keywords, and source documents are indexed for fast retrieval.
Basic Structure of Internet
The Internet is a network of networks. The basic elements of the Internet and associated components are shown schematically in the figure. Various terms have the following meanings: -
(a) Internet Service Provider (ISP):- An ISP acts as an interface between end-users (which could be a stand-alone PC or LANs) and the Internet. An ISP acts like the main crossing of a town, which allows traffic to come out of the town and join the national highway. An ISP has routers and servers, through which it connects end-users to the Internet backbone. For all problems and management at end-user level, an end-user interacts with the ISP only.
(b) Router:- A special-purpose computer that directs packets of data along a network.
(c) Gateway:- An ISP gets connected to the Internet's backbone through a Gateway. A Gateway functions as a door to enter the Internet backbone. It connects a number of ISPs to the Internet backbone. In India, VSNL has been the sole Gateway service provider until recently. However, private operators are now permitted to provide Gateway services.
(d) Internet Backbone:- The Internet backbone is high-bandwidth (high-speed) fiber optic cable, on which a number of routers are in place, and is managed through the Network Operations Center of the Internet. The Internet backbone is of different bandwidth in various segments.
The basic elements of the Internet are a user (standalone PC or a LAN), ISP, routers, gateways and the Internet backbone. Thus, an end-user who wishes to establish a link with another user on a LAN goes through his LAN - ISP - Gateway and gets connected to the distant end-user through Gateway - ISP - LAN (refer figure).
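The chain of hops described above can be traced with a small sketch. The node names (`lan1`, `isp_a`, and so on) and the topology are hypothetical, chosen only to mirror the LAN - ISP - Gateway - backbone chain in the text; real Internet routing is far more involved.

```python
from collections import deque

# Hypothetical topology: each node lists the nodes it is directly linked to.
LINKS = {
    "lan1": ["isp_a"],
    "isp_a": ["lan1", "gateway_a"],
    "gateway_a": ["isp_a", "backbone"],
    "backbone": ["gateway_a", "gateway_b"],
    "gateway_b": ["backbone", "isp_b"],
    "isp_b": ["gateway_b", "lan2"],
    "lan2": ["isp_b"],
}

def find_path(links, source, destination):
    """Breadth-first search for the shortest chain of hops between two nodes."""
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == destination:
            return path
        for neighbour in links.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None  # no route between the two nodes

print(" - ".join(find_path(LINKS, "lan1", "lan2")))
```

Each entry in the printed chain corresponds to one of the elements listed above, with the backbone in the middle.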
FIG: BASIC ELEMENTS OF INTERNET
Intranet
An intranet is a private network (usually a LAN, but may be larger) that uses TCP/IP and other Internet standard protocols. Because it uses TCP/IP, the standard Internet communications protocol, an intranet supports TCP/IP-based protocols, such as HTTP (the protocol that web browsers use to talk to web servers), and SMTP and POP (the protocols that e-mail programmes use to send and receive mail). In other words, an intranet can run web servers, web clients, mail servers, and mail clients. An intranet is a network for a single organization with the following features: -
- It uses Internet technology (browser and TCP/IP)
- All services available on the Internet can be implemented on an intranet
- It could be implemented on a single LAN or a combination of LANs
- It could be implemented on a MAN or WAN
- An intranet need not be connected with the Internet (for outside connectivity it can be connected through the Internet)
- It is a private Internet of an organisation
Architecture of Intranet
The architecture of an intranet is shown in the figure. A simplified intranet consists of the following components: -
(a) Workstations & Client Software. A PC with any operating system (Win 95, 98, Mac, Unix) that supports networking can be connected to the intranet as a workstation. In addition to other application programmes, workstations run client software that provides the user with access to network servers. On an intranet, client software will typically include (depending upon the services provided) a browser (MS Internet Explorer, Netscape Navigator), an e-mail client (Outlook Express), newsreaders, and chat or FTP clients. These clients may be integrated with the OS or added on.
(b) Servers, NOS & Server Software. This is an important area of the intranet in respect of hardware and software requirements, viz.,
(i) The servers provide services to the workstations connected to the intranet. A network server is required to manage the LAN. Besides this, depending on the services to be provided, further servers would be required, e.g., Web server, mail server, FTP server, application servers and printer server.
(ii) A Network Operating System (Windows NT, Unix, or Linux) is required to run on the network server. The client part of the NOS needs to run on the workstations.
(iii) Server software includes web server, mail server etc. (depending on the server and services required). Many intranet server programmes run on Unix and some on NT. Lots of freeware and shareware server programmes are available for Unix. Windows NT Server comes with a Web server (MS Internet Information Server).
(iv) An intranet also needs middleware, the software that provides access to a database from a web browser, e.g., calls to the database programme to read and write records.
(c) Network Cards, Cabling, Switches/Hubs. These are the components required to set up a LAN. The commonly used network adapter card is Ethernet, the most common LAN configuration is a star topology, and commonly used cables are CAT-5 or CAT-6 UTP cables.
(d) Security Systems (Firewall). If the intranet is connected to the Internet, we need to control the kind of information that can pass between the intranet and the Internet. The hardware, software and procedures that provide access control make up a firewall. Firewall systems are of two categories, viz.,
FIG: ARCHITECTURE OF INTRANET
(i) Network-Level Firewalls. These firewalls examine only the headers of each packet of information passing to or from the Internet. The firewall accepts or rejects packets based on the packet's sender, receiver and port number (each Internet service, such as e-mail or WWW, has a different port number). For example, a firewall might allow e-mail/Web packets to and from any computer on the intranet, but allow remote login packets to and from only selected computers.
(ii) Application-Level Firewalls. These firewalls handle packets for each Internet service separately, usually by running a programme called a proxy server, which accepts e-mail, Web, chat, newsgroup and other packets from computers on the intranet, strips off the information that identifies the source of the packet and passes it along to the Internet, or vice versa. When the replies return, the proxy server passes the replies back to the computer that sent the original message. To the rest of the Internet, all packets appear to be from the proxy server, so no information leaks out about the individual computers on your intranet. A proxy server can keep a log of all packets that pass by. The proxy server can be configured to allow one-way login and disallow the other way.
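The two firewall categories can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `Packet` fields, the `RULES` policy, the `10.0.0.` intranet prefix and the example addresses are hypothetical, and a real firewall inspects far more than three header fields.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Packet:
    sender: str
    receiver: str
    port: int

# Hypothetical policy: port -> set of intranet hosts allowed to use it,
# or None when every intranet host may use it.
SMTP, HTTP, TELNET = 25, 80, 23
RULES = {SMTP: None, HTTP: None, TELNET: {"10.0.0.5"}}

def network_level_allows(packet, rules=RULES, intranet_prefix="10.0.0."):
    """Accept or reject a packet by looking only at its header fields."""
    if packet.port not in rules:
        return False  # unknown service: reject
    allowed_hosts = rules[packet.port]
    if allowed_hosts is None:
        return True
    # The intranet-side host is whichever end lies inside the intranet.
    if packet.sender.startswith(intranet_prefix):
        inside = packet.sender
    else:
        inside = packet.receiver
    return inside in allowed_hosts

class ProxyServer:
    """Application-level sketch: hides the internal sender and logs traffic."""

    def __init__(self, public_address):
        self.public_address = public_address
        self.log = []
        self._pending = {}  # (remote host, port) -> original internal sender

    def forward(self, packet):
        """Replace the internal sender with the proxy's own address."""
        self.log.append(packet)
        self._pending[(packet.receiver, packet.port)] = packet.sender
        return Packet(self.public_address, packet.receiver, packet.port)

    def deliver_reply(self, reply):
        """Pass a reply back to the computer that sent the original message."""
        self.log.append(reply)
        internal = self._pending.pop((reply.sender, reply.port))
        return Packet(reply.sender, internal, reply.port)
```

With this policy, Web and e-mail packets pass for any intranet host, remote login (port 23) passes only for `10.0.0.5`, and every packet leaving through the proxy carries the proxy's address instead of the internal one.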
Advantages and Disadvantages of an Intranet
LANs and intranets both let you share hardware, software, and information by connecting computers together. You don't need an intranet to share files and printers, or to send e-mail among the people on your network: a LAN can do those jobs. The following are some reasons to convert a LAN to an intranet, or to connect your computers together into an intranet: -
(a) Intranets Use Standard Protocols. Internet protocols such as TCP/IP are used on a huge number of diverse computers. More development is happening for Internet-based communication than for other types of communication. For example, intranet users can choose from a wide variety of e-mail programmes, because so many have been written for the Internet.
(b) Intranets are Scalable. TCP/IP works fine on the Internet, which has millions of host computers, so you don't have to worry about your network outgrowing its communications protocol.
(c) Intranet Components are Relatively Cheap and Some are Free. Because the Internet started as an academic and military network (rather than a commercial one), there is a long tradition of free, cheap, and cooperative software development. Some of the best Internet software is free, including Apache (the most widely used web server), Pegasus, and Eudora Lite (two excellent e-mail client programmes).
(d) Intranets Enable You to Set Up Internet-style Information Services. You can have your own private web, using web servers on your intranet to serve web pages to members of your organisation only. You can also support chat, Usenet, telnet, FTP, or other Internet services privately on your network. Push technology (web channels) can deliver assignments, job status, and group schedules to the user's desktop via his or her browser.
(e) Intranets Let People Share their Information. Everyone in your organisation can make their information available to other employees by creating web pages for the intranet. Because many word processing programmes can now save documents as web pages, creating pages for an intranet does not require a lot of training. Rather than printing and distributing reports, people can put them on the intranet and send e-mail to tell everyone where the report is stored.
Of course, intranets have some disadvantages too, including these: -
(a) Intranets Cost Money. You may need to upgrade computers, buy new software, run new cabling, and teach people to use the new systems.
(b) People in Your Organisation May Waste Time. If you connect your intranet to the Internet, people may spend hours a week watching sports results or checking their stock options. Even if you don't connect to the Internet, people can use the intranet to build web sites about the company softball team and send e-mail about upcoming baby showers. You'll need policies in place to determine how the intranet may be used.
What can you do with an Intranet?
Many organisations, especially those with large existing computer systems, have lots of information that is hard to get at. The intranet can change all that, by using Internet tools. Here are some ways that your organisation, large or small, can use an intranet.
(a) E-mail within the organisation and to and from the Internet. People can use one e-mail programme to exchange mail both with other intranet users and with the Internet.
(b) Private Discussion Groups. Using a mailing list manager or a news server accessible only to people in your organisation, you can set up mailing lists or newsgroups to encourage people to share information within departments or across the organisation.
(c) Private Websites. Each department in your organisation can create a website that is accessible only to people on the intranet. Instead of circulating memos and handbooks, information can go on these web sites. For example, the human resource department can post all employee policies, job postings, and upcoming training opportunities. The marketing department can post information about products, including upcoming release dates, how products are targeted, and other information that is not appropriate for a public site on the Internet-based web. Every department can post web pages to share its information with the other departments in the organisation. By using the intranet instead of printing on paper, it is economical to publish large documents and documents that change frequently.
(d) Access to Legacy Databases. If your organisation has information that is locked away in an inaccessible database, you can convert the information to web pages so that everyone on the intranet can see it. (Legacy systems are those considered outdated by whoever is describing the system.) For example, a non-profit organisation might have a proprietary database containing all of its fundraising and membership information. By using a programme that can display database information as web pages and enter information from web page forms into the database, all the people at the non-profit organisation can see, and even update, selected information from the database by using only a web browser. Naturally, the programme would need to limit who could see and change particular information in the database.
(e) Teleconferencing. Rather than spend huge amounts on video teleconferencing systems, think about using your intranet (and the Internet) instead. If your organisation has offices in several locations, you can use the Internet for online chats with text, voice, and even limited video.
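The legacy-database idea in item (d) can be sketched as a tiny piece of middleware. This is an illustration under stated assumptions: an in-memory SQLite table stands in for the proprietary database, and the `members` table, the role names and the `CAN_VIEW`/`CAN_EDIT` policy are all hypothetical.

```python
import sqlite3
from html import escape

def open_legacy_db():
    """Stand-in for the legacy database: an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE members (name TEXT, donation REAL)")
    conn.executemany("INSERT INTO members VALUES (?, ?)",
                     [("Asha", 500.0), ("Ravi", 250.0)])
    return conn

# Hypothetical access policy: which roles may read or change the data.
CAN_VIEW = {"staff", "fundraiser"}
CAN_EDIT = {"fundraiser"}

def members_page(conn, role):
    """Render the members table as a small HTML page for permitted roles."""
    if role not in CAN_VIEW:
        return "<p>Access denied</p>"
    rows = conn.execute("SELECT name, donation FROM members ORDER BY name")
    cells = "".join(
        f"<tr><td>{escape(name)}</td><td>{donation:.2f}</td></tr>"
        for name, donation in rows
    )
    return f"<table><tr><th>Name</th><th>Donation</th></tr>{cells}</table>"

def update_donation(conn, role, name, donation):
    """Apply a value submitted through a web form, if the role may edit."""
    if role not in CAN_EDIT:
        return False
    cursor = conn.execute("UPDATE members SET donation = ? WHERE name = ?",
                          (float(donation), name))
    return cursor.rowcount == 1
```

The two permission checks correspond to the requirement that the programme limit who can see and who can change particular information; in practice the page would be served by a web server on the intranet rather than returned as a string.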
Security Policies
In addition to a firewall, you need to take steps to make sure that the intranet is used appropriately in your organisation: -
(a) Establish Acceptable-use Policies. Post rules for using the intranet, including the use of e-mail, the web, and discussion groups both within the intranet and on the Internet.
(b) Monitor Usage. This does not mean that you should look over everyone's shoulders while they use the intranet, but make sure that someone monitors the content of the intranet's web sites and discussion groups. Look for copyright infringements, personnel issues, and security lapses.
(c) Close the Door behind Departing Employees. When someone leaves the organisation, make sure that a system is in place to close the person's accounts, change passwords, and deny other access to the intranet.
(d) Be Vigilant about Data in General, not just about the Intranet. The intranet's connection to the Internet can certainly be a security hazard, but important data can also walk out of your organisation's door on a diskette in someone's pocket, in a fax, or in many other ways.
Extranet
An extranet is a network that links selected resources of the intranet of a company with its customers, suppliers and other business partners. The main features of an extranet are: -
(a) The link between the intranet and its business partners is achieved through TCP/IP, the standard Internet protocol.
(b) The extranet is an extended intranet, which isolates business communication from the open Internet through secure solutions.
(c) Extranets provide the privacy and security of an intranet while retaining the global reach of the Internet.
(d) Extranets use cryptography and authorization procedures for securing data flows between intranets through the Internet.
An extranet connects the intranets of business partners, suppliers, financial services, distributors, customers etc. by an agreement between collaborating partners. The emphasis is on allowing access to authorized groups through a strictly controlled mechanism.
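One way to picture the "cryptography and authorization procedures" mentioned above is a message tag checked against a list of known partners. The sketch below uses HMAC-SHA256 from Python's standard library; the partner identifier and the shared secret in `PARTNER_KEYS` are hypothetical, and the example shows only authentication and integrity checking, not encryption of the data itself.

```python
import hashlib
import hmac

# Hypothetical shared secrets agreed between collaborating partners.
PARTNER_KEYS = {"supplier-b": b"secret-shared-with-b"}

def sign(partner_id, message):
    """Return a hex HMAC-SHA256 tag for a message sent by a partner."""
    key = PARTNER_KEYS[partner_id]
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def accept(partner_id, message, tag):
    """Accept a message only from a known partner with a valid tag."""
    key = PARTNER_KEYS.get(partner_id)
    if key is None:
        return False  # not an authorised partner
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)
```

A message from an unknown partner, or one altered in transit, fails the check, which is the "strictly controlled mechanism" in miniature.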
Extranets have led to a true proliferation of e-commerce and act as an engine for B2B collaboration. It is the combination of intranets and extranets that has established the virtual corporation paradigm. This new virtual paradigm of e-commerce allows corporations to take advantage of any market opportunity anywhere, anytime, and to offer customized services and products. It is this combination that provides the technological backbone for strategic advantage to organizations in terms of reach, intensity, response time and innovative skills.
Architecture of Extranet
The figure shows the basic architecture of an intranet with its extension to one LAN or a single user. This makes it an extranet. Similar logic can be extended to make it a general infrastructure of extranet plus intranets, as shown in figure-2.
FIG: ARCHITECTURE OF EXTRANET
Components of Extranet
Since an extranet is an extension of an intranet, the additional hardware and software needed to extend an intranet are: -
(a) Firewall servers and their software
(b) Router
(c) Internet connection (at least ISDN)
Basic Level Applications of Extranet
The basic level applications of extranet are given below: -
S No  Service                 Applications
1     Secure e-mail           B2B communications
2     Usenet Services         Bulletin board services, one-to-many info exchange, EDI messages, floating tenders
3     Mailing List            Private one-to-many e-mail, online newsletter, discussion group
4     File Transfer (FTP)     Exchange of data between supply chains, between Corp HQ & various companies, customer support & sales data
5     Conferencing & Chat     Electronic meetings
6     Remote login (Telnet)   Access to databases & ERP software
7     Calendar                Scheduling tasks
ISP
An ISP is a company that supplies Internet connectivity to home and business customers. ISPs support one or more forms of Internet access, ranging from traditional modem dial-up to DSL and cable modem broadband service to dedicated T1/T3 lines.
More recently, wireless Internet service providers or WISPs have emerged that offer Internet access through wireless LAN or wireless broadband networks.
In addition to basic connectivity, many ISPs also offer related Internet services like e-mail, Web hosting and access to software tools.
A few companies also offer free ISP service to those who need occasional Internet connectivity. These free offerings feature limited connect time and are often bundled with some other product or service.
ISP Architecture
As stated earlier, to avail of Internet services, each user must be connected to an ISP. For each modem at the user end, there is a corresponding modem at the ISP. An ISP has a number of servers, one for each service that it provides. The versatility of an ISP can be measured by the number and type of services (in terms of value addition) it provides to its customers. The figure shows the typical ISP architecture.
FIG: ARCHITECTURE OF AN ISP
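The ISP figure includes a step that verifies the user's log-in and password and a billing server. A rough sketch of those two roles follows; the class name `DialUpISP`, the per-minute rate and the fixed demonstration salt are assumptions made for illustration, and a real ISP would use a dedicated authentication service with a random salt per user.

```python
import hashlib
import hmac

def hash_password(password, salt):
    """Derive a password hash; PBKDF2 is used here purely as an example."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

class DialUpISP:
    def __init__(self, rate_per_minute):
        self.rate_per_minute = rate_per_minute
        self.accounts = {}  # user -> (salt, password hash)
        self.minutes = {}   # user -> total connect time in minutes

    def register(self, user, password, salt=b"demo-salt"):
        # A fixed salt keeps the example short; it is not safe in practice.
        self.accounts[user] = (salt, hash_password(password, salt))
        self.minutes[user] = 0

    def login(self, user, password):
        """Verify user log-in and password before granting access."""
        if user not in self.accounts:
            return False
        salt, stored = self.accounts[user]
        return hmac.compare_digest(stored, hash_password(password, salt))

    def record_session(self, user, minutes):
        """Add one session's connect time to the user's running total."""
        self.minutes[user] += minutes

    def bill(self, user):
        """Charge for the accumulated connect time."""
        return self.minutes[user] * self.rate_per_minute
```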
Searching
Searching the World Wide Web
With the advent of the World Wide Web came the widespread availability of on-line information. It is no longer necessary to travel to the library to find the answer to a question or engage in research on a specialized topic. Much of what you might want to know is available through the Web. Since anyone can publish on the Web, the range of topics that can be found is nearly all-encompassing. However, while a lot of information is available on-line, not all of it is completely accurate.
In all likelihood, the answers to your questions are somewhere on the Web, but how do you locate them? In the early days of the Web, unless you knew exactly where to look, you had trouble finding what you wanted. Unlike books on the shelves of a library, the pages on the Web are not neatly organized, nor are Web pages completely cataloged in one central location. Even knowing where to look for information is not a guarantee that you will find it, since Web page addresses are constantly changing. Usually, a forwarding address is provided for a page that has moved, but it may only be available for a short time.
The rapid growth of the Web, as well as its huge size, has ruled out trying to keep track manually of "what is what" and "what is where". As people were spending their time trying to find things on the Web, rather than actually reading the material they were after, the first directories and search engines were developed. These tools allow you to find information more quickly and easily. You have probably already been using these tools, but perhaps not as effectively as possible.
Methods of Searching
1. Directories:- The first method of finding and organizing Web information is the directory approach. A Web directory or Web guide is a hierarchical representation of hyperlinks. The top level of the directory typically provides a wide range of very general topics, such as arts, automobiles, education, entertainment, news, science, sports, and so on. Each of these topics is a hyperlink that leads to more specialized subtopics. They in turn have a number of subtopics, and so on until you reach a specific web page.
In addition to being very easy to use, another benefit of a directory structure is that you need not know exactly what you are looking for in order to find something worthwhile. You select the category for the topic in which you are interested. You continue to move down through the hierarchy, selecting subcategories and narrowing the search at each level, until you are presented with a list of hyperlinks that pertain to your topic.
As you begin to zero in on your topic, you may find other interesting items of which you were previously unaware. On the other hand, you may reach the bottom of the directory without finding the information you were after. In such a case, you may need to backtrack, going up several levels and then proceeding down again. Of course, it is possible that the directory you are searching does not contain the information you want; in this case you may decide to try either a different directory or a search engine.
When traversing a directory downward, you are moving toward more specific topics. When going upward, you are heading back to more general topics. Directories are useful if you want to explore a topic and its related areas, or if you want to research a subject, but not at a very detailed level.
If you are interested in a very specific topic, you may want to start off by using a search engine or a meta search engine. Arriving at a very specific topic in a directory structure involves traversing between five and ten hyperlink levels.
Note that while the directory structure is logically organized as a hierarchy, a specific Web page may occur in many different parts of the hierarchy. There is usually more than one way to reach a given page.
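The directory idea can be sketched as a nested structure that is walked one topic at a time. The topics and the `example.org` addresses below are made up for illustration; they show both the downward traversal and the point that one page may be reachable by more than one route.

```python
# Hypothetical directory: nested dicts are topics, lists hold page URLs.
DIRECTORY = {
    "science": {
        "astronomy": ["http://example.org/planets"],
        "biology": ["http://example.org/cells"],
    },
    "education": {
        "science teaching": ["http://example.org/planets"],
    },
}

def browse(directory, topics):
    """Follow a chain of topic hyperlinks downward from the top level."""
    node = directory
    for topic in topics:
        node = node[topic]
    return node

def paths_to(directory, url, trail=()):
    """Return every chain of topics that leads to the given page."""
    found = []
    for topic, child in directory.items():
        if isinstance(child, dict):
            found.extend(paths_to(child, url, trail + (topic,)))
        elif url in child:
            found.append(trail + (topic,))
    return found
```

Calling `browse` with a shorter list of topics corresponds to backtracking to a more general level, and `paths_to` lists the different routes to the same page.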
Popular Directories
AOL NetFind - www.aol.com/netfind
CNET Search.com - www.search.com
Excite - www.excite.com
Infoseek - www.infoseek.com
Looksmart - www.looksmart.com
Lycos - www.lycos.com
Magellan - www.mckinley.com
Yahoo - www.yahoo.com
Rediff - www.rediff.com
2. Search Engine:- The second approach to organizing and locating information on the Web is a search engine, which is a computer program that does the following:
(a) Allows you to submit a form containing a query that consists of a word or phrase describing the specific information you are trying to locate on the Web.
(b) Searches its database to try to match your query.
(c) Collates and returns a list of clickable URLs containing presentations that match your query; the list is usually ordered, with the better matches appearing at the top.
(d) Permits you to revise and resubmit a query.
A number of search engines also provide URLs for related or suggested topics.
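Steps (a) to (d) can be sketched with a very small word index. The three pages in `PAGES` are hypothetical, and ranking by the number of matching query words is an assumption made for the example; real search engines use far more elaborate databases and ranking methods.

```python
from collections import defaultdict

# Hypothetical mini-collection standing in for the engine's database.
PAGES = {
    "http://example.org/a": "history of the internet and the arpanet",
    "http://example.org/b": "internet search engines and web directories",
    "http://example.org/c": "cooking with search for the best recipes",
}

def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return URLs ordered by how many query words each one matches."""
    scores = defaultdict(int)
    for word in set(query.lower().split()):
        for url in index.get(word, ()):
            scores[url] += 1
    # Better matches first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda url: (-scores[url], url))

INDEX = build_index(PAGES)
print(search(INDEX, "internet search"))
```

Revising and resubmitting a query, step (d), is simply another call to `search` with different words.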
Many people find that search engines are not as easy to use as directories. To use a search engine, you supply a query by entering information into a field on the screen. To be effective, that is, to have the search engine return a small list of URLs on your topic of interest, you often need to be very specific. To pose such queries, you must learn the query syntax of the search engine with which you are working. Learning the syntax so that you can phrase effective and legal queries often requires that you read and understand the documentation accompanying the search engine. A hyperlink to the documentation is usually provided next to the query field, and example queries are often given.
Once you learn to use a specific search engine's query language effectively, you can quickly zoom in on very narrow topics; this is the advantage of a search engine. The disadvantages are that you have to learn the query language and you have to learn a search strategy.
The user-friendliness and power of query languages vary from search engine to search engine. We recommend you try several of them and then learn the syntax of one search engine's query language. Since each search engine searches a different database, you would be best off learning about a search engine that has indexed a large part of the Web; you can gauge this by posing similar queries to a number of search engines and seeing which one finds the best matches.
Popular Search Engines
AOL NetFind - www.aol.com/netfind
Excite - www.excite.com
Infoseek - www.infoseek.com
Looksmart - www.looksmart.com
Lycos - www.lycos.com
Magellan - www.mckinley.com
Yahoo - www.yahoo.com
Rediff - www.rediff.com
AltaVista - altavista.digital.com
Hot Bot - www.hotbot.com
Google - www.google.com
Web Crawler - www.webcrawler.com
3. Meta Search Engine:- A meta search engine or all-in-one search engine performs a search by calling on more than one other search engine to do the actual work. The results are collated, duplicate retrievals are eliminated, and the results are ranked according to how well they match your query. You are then presented with a list of URLs.
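The collate, de-duplicate and rank steps can be sketched as follows. The two engines are hypothetical stand-ins that return canned results, and scoring a URL by its position in each engine's list is an assumption made for the example rather than how any particular meta search engine ranks matches.

```python
def meta_search(query, engines):
    """Query several engines, merge their results and drop duplicates.

    Each engine is a callable returning a best-first list of URLs.
    """
    scores = {}
    for engine in engines:
        results = engine(query)
        for rank, url in enumerate(results):
            # Earlier positions earn more points; a URL returned by
            # several engines accumulates points from each of them.
            scores[url] = scores.get(url, 0) + (len(results) - rank)
    return sorted(scores, key=lambda url: (-scores[url], url))

# Two hypothetical engines with canned results for illustration.
def engine_one(query):
    return ["http://example.org/x", "http://example.org/y"]

def engine_two(query):
    return ["http://example.org/y", "http://example.org/z"]
```

Because the scores are kept in a dictionary keyed by URL, a page returned by both engines appears only once in the final list, ranked by its combined score.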
The advantage of a meta search engine is that you can access a number of different search
engines with a single query. The disadvantage is that you will often have a high noise-to-signalratio; that is a lot of matches will not be of interest to you. This means you will need to spend
more time evaluating the results and deciding which hyperlinks to follow.
For very specific, hard to locate topics, meta search engines can often be a good starting
point. For example, if you try to locate a topic using your favorite search engine, but fail to turn
up anything useful, you may want to query a meta search engine.
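The collate, de-duplicate, and rank steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two engines, their URLs, and their scores are made up, and keeping the best score for a duplicate URL is one possible merging rule, not the rule any particular meta search engine uses.

```python
from typing import Callable, Dict, List, Tuple

# Each hypothetical engine takes a query and returns (url, score) pairs,
# where score is that engine's own relevancy score from 1 to 100.
Engine = Callable[[str], List[Tuple[str, int]]]


def meta_search(query: str, engines: List[Engine]) -> List[Tuple[str, int]]:
    """Collate results from several engines, drop duplicates, rank by score."""
    best: Dict[str, int] = {}
    for engine in engines:
        for url, score in engine(query):
            # Duplicate retrievals collapse to one entry; keep the best score.
            if url not in best or score > best[url]:
                best[url] = score
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))


def engine_a(query: str) -> List[Tuple[str, int]]:
    # Made-up results standing in for a real search engine.
    return [("www.example.com/lions", 90), ("www.example.com/cats", 40)]


def engine_b(query: str) -> List[Tuple[str, int]]:
    # Made-up results standing in for a second search engine.
    return [("www.example.com/lions", 75), ("www.example.com/zoo", 60)]


results = meta_search("lion", [engine_a, engine_b])
for url, score in results:
    print(score, url)
```

The page returned by both engines appears only once in the final list, which is the duplicate elimination the text describes.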
Popular Meta Search Engines
Meta Search - www.metasearch.com
Meta Crawler - www.metacrawler.com
Meta Find - www.metafind.com
Savvy Search - guaraldi.cs.colostate.edu:2000
4. Web Ring:- A web ring is a community of related Web pages that are organized into a circular
ring. Each page in a ring has links that enable visitors to move to an adjacent site on the ring,
access a ring index, or jump to a random site. Web sites are added continuously to the web rings.
Each ring is managed from one of the sites. Web rings are fun to visit, but they do not contain the
volume of information of the other search tools. Currently, web rings are available on many
topics, including acrobatics, religion, Spanish hotels, Disneyland, and medieval studies. Most web
rings are devoted to games. The Web Ring home page at www.webring.com contains more
information on web rings and how to search them. Another site devoted to web rings is the
RingSurf site, located at www.ringsurf.com.
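The navigation a web ring offers (adjacent site, ring index, random site) maps naturally onto a circular list. The sketch below is a hypothetical model for illustration only; the class name, method names, and site names are assumptions and do not describe how www.webring.com is implemented.

```python
import random
from typing import List, Optional


class WebRing:
    """A circular list of member sites (all names here are hypothetical)."""

    def __init__(self, sites: List[str]) -> None:
        self.sites = list(sites)

    def index(self) -> List[str]:
        # The ring index lists every member site.
        return list(self.sites)

    def next_site(self, current: str) -> str:
        # Moving past the last site wraps around to the first.
        position = self.sites.index(current)
        return self.sites[(position + 1) % len(self.sites)]

    def previous_site(self, current: str) -> str:
        # Moving back from the first site wraps around to the last.
        position = self.sites.index(current)
        return self.sites[(position - 1) % len(self.sites)]

    def random_site(self, rng: Optional[random.Random] = None) -> str:
        # Jump to any member site, chosen at random.
        return (rng or random).choice(self.sites)

    def add_site(self, site: str) -> None:
        # In this sketch, new members join at the end of the ring.
        self.sites.append(site)


ring = WebRing(["site-a.example", "site-b.example", "site-c.example"])
print(ring.next_site("site-c.example"))      # wraps around to site-a.example
print(ring.previous_site("site-a.example"))  # wraps around to site-c.example
```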
Search Terminology
Here are a few common search-related terms you should know.
Search Tool:- Any mechanism for locating information on the Web; the term usually refers to a
search engine, a meta search engine, or a directory.
Query:- Information entered into a form on a search engine's Web page that describes the
information being sought. A query need not be a question; usually a word or a phrase is
used. A phrase is put within quotes, e.g. "Indian Tigers".
Query Syntax:- A set of rules describing what constitutes a legal query. On some search
engines, special symbols may be used in a query. Syntax defines the grammar of query
writing. Each search engine may have different syntax rules, which are described in the
Help menu of the search engine.
Query Semantics:- A set of rules that defines the meaning of a query.
Page View:- The viewing of one specific HTML file, without counting any graphics or
other items on the page, is referred to as a page view.
Hit/Match:- A URL that a search engine returns in response to a query. In traffic measurement,
a hit is commonly thought of as the number of times a page on a web site is requested by a
browser, but this is not accurate. Hits also include the number of times all other files, such as
graphic images, are requested. For example, if your home page has nine graphics on it, each time
someone views your home page, the log file registers one hit for the HTML file and nine
hits for the graphics, for a total of ten hits. Because the term "hits" has such an
ambiguous meaning, most people now measure traffic in terms of page views.
Visit:- All the pages viewed by a user within one continuous session, whether that is a
single HTML file or a session that lasts for a given duration, are together called a visit.
Relevancy Score:- A value that indicates how close a match a URL was to a query;
usually expressed as a value from 1 to 100, with a higher score meaning more relevant.
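The difference between hits and page views in the nine-graphics example can be checked with a short calculation. The sketch below assumes a simplified log that is just a list of requested file names, and it assumes that only files ending in .html or .htm count as pages; real server logs carry more fields.

```python
from typing import List, Tuple


def count_traffic(requested_files: List[str]) -> Tuple[int, int]:
    """Return (hits, page_views) for a list of requested file names."""
    hits = len(requested_files)  # every requested file is one hit
    # Only HTML files count as page views (assumed .html/.htm extensions).
    page_views = sum(
        1 for name in requested_files if name.lower().endswith((".html", ".htm"))
    )
    return hits, page_views


# One visitor loads a home page that has nine graphics on it.
log = ["index.html"] + [f"image{n}.gif" for n in range(1, 10)]
hits, page_views = count_traffic(log)
print(hits, page_views)  # prints: 10 1
```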
Search Engine Components
If you understand how a search tool works, there is a good chance you will be able to use
it more effectively. For the most part, these same ideas apply to directories; the main difference
is that the hierarchical organizational structure and categorizations for directories need to be in
place and displayed. The references include additional information about how directories are put
together.
To describe how a search engine works, we split up its functions into a number of
components: user interface, searcher, and evaluator.
User Interface:- The screen in which you type a query and which displays the search results.
Searcher:- The part that searches a database for information to match your query.
Evaluator:- The function that assigns relevancy scores to the information.
In addition, a search engine's database is created using the following.
Gatherer:- The component that traverses the Web, collecting information about pages.
Indexer:- The function that categorizes the data obtained by the gatherer.
For comparison, think of the different facets of a typical library, such as acquisitions,
cataloging, indexing, and on-line searching.
User Interface:- The user interface must provide a mechanism by which a user can submit
queries to the search engine. This is universally done using forms. In addition, the user interface
needs to display the results of the search in a convenient way. The user should be presented with
a list of hits from their search, a relevancy score for each hit, and a summary of each page that
was matched. This way, the user can make an informed choice as to which hyperlinks to follow.
Searcher:- The searcher is a program that uses the search engine's index and database to see if
any matches can be found for the query. Your query must first be transformed into a syntax that
the searcher can process. Since the databases associated with search engines are extremely large
(with perhaps 25,000,000 to 50,000,000 indexed pages), a highly efficient search strategy must
be applied.
Evaluator:- The searcher locates any URLs that match your query. The hits retrieved by your
query are called the result set of the search. Not all of the hits will match your query equally
well. For example, a query about "Honey Bees" might be matched by a page containing the
phrase "honey bees" in the following sentence:
Ants, honey bees, and crickets are all insects.
Or by the page title
Everything You Ever Wanted To Know About Honey Bees.
Clearly, in most cases, it would be better to rank this second page much higher, as it
probably contains many more references to honey bees.
The ranking process is carried out by the evaluator, a program that assigns a relevancy
score to each page in the result set. The relevancy score is an indication of how well a given page
matched your query.
How is the relevancy score computed by the evaluator? This varies from search engine to
search engine. A number of different factors are involved, and each one contributes a different
percentage towards the overall ranking of a page. Some of the factors typically considered are:
a) How many times the words in the query appear in the page.
b) Whether or not the query words appear in the title.
c) The proximity of the query words to the beginning of the page.
d) Whether the query words appear in the CONTENT attribute of the meta tag.
e) How many of the query words appear in the document.
Some search engines also consider other factors in computing a relevancy score. Each
factor is weighted, and a value is computed that rates the page. The values are usually
normalized and assigned numbers between 1 and 100, with 100 representing the best possible
match. As part of the user interface, the result set and the relevancy scores computed by the
evaluator are displayed for the user, with the best matches appearing first. Hyperlinks to each hit
are provided, and a short description of the page is usually given.
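To make the idea of weighted factors concrete, here is a minimal sketch of an evaluator that combines factors (a) to (e) into a score from 1 to 100. Everything specific in it is an assumption made for illustration: the weights, the cap on word frequency, and the way each factor is turned into a fraction are invented, since each search engine uses its own formula.

```python
import re
from typing import List

# Hypothetical weights; they sum to 1.0 so the total stays between 0 and 1.
WEIGHTS = {"frequency": 0.3, "title": 0.3, "position": 0.1, "meta": 0.1, "coverage": 0.2}


def tokens(text: str) -> List[str]:
    """Split text into lowercase words, ignoring punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def relevancy_score(query: str, title: str, body: str, meta_content: str = "") -> int:
    """Combine the five listed factors into a score from 1 to 100."""
    words = tokens(query)
    if not words:
        return 1
    body_words = tokens(body)
    title_words = tokens(title)
    meta_words = tokens(meta_content)

    # (a) How often the query words appear, capped so long pages do not dominate.
    occurrences = sum(body_words.count(w) for w in words)
    frequency = min(occurrences / (5 * len(words)), 1.0)
    # (b) Fraction of query words found in the title.
    in_title = sum(w in title_words for w in words) / len(words)
    # (c) Closeness of the first matching word to the beginning of the page.
    positions = [body_words.index(w) for w in words if w in body_words]
    position = 1.0 - min(positions) / len(body_words) if positions else 0.0
    # (d) Fraction of query words found in the meta tag's CONTENT attribute.
    in_meta = sum(w in meta_words for w in words) / len(words)
    # (e) Fraction of query words that appear anywhere in the document.
    coverage = sum(w in body_words or w in title_words for w in words) / len(words)

    total = (
        WEIGHTS["frequency"] * frequency
        + WEIGHTS["title"] * in_title
        + WEIGHTS["position"] * position
        + WEIGHTS["meta"] * in_meta
        + WEIGHTS["coverage"] * coverage
    )
    # Normalize to the 1..100 range.
    return max(1, min(100, round(total * 100)))


insects = relevancy_score(
    "honey bees", "Insects", "Ants, honey bees, and crickets are all insects."
)
bee_page = relevancy_score(
    "honey bees",
    "Everything You Ever Wanted To Know About Honey Bees",
    "Honey bees live in hives. Honey bees make honey.",
)
print(insects, bee_page)
```

With these made-up weights, the page whose title mentions honey bees scores higher than the page that mentions them once in passing, which matches the ranking argued for in the text.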
Gatherer:- A search engine obtains its information by using a gatherer, a program that traverses
the Web and collects information about web documents. The gatherer does not collect the
information every time a query is made. Rather, the gatherer is run at regular intervals, and it
returns information that is incorporated into the search engine's database and indexed.
Alternate names for a gatherer are bot, crawler, robot, spider, and worm.
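A gatherer's traversal can be illustrated without touching the network. The sketch below walks a tiny, made-up link graph breadth-first and collects each page's text; the page names, texts, and the choice of breadth-first order are assumptions, and a real gatherer would fetch pages over HTTP rather than read them from a dictionary.

```python
from collections import deque
from typing import Dict

# A tiny stand-in for the Web: each page has some text and outgoing links.
FAKE_WEB: Dict[str, dict] = {
    "a.example": {"text": "Lions and tigers", "links": ["b.example", "c.example"]},
    "b.example": {"text": "Indian lions", "links": ["a.example"]},
    "c.example": {"text": "Honey bees", "links": ["b.example", "d.example"]},
    "d.example": {"text": "Crickets", "links": []},
}


def gather(start_url: str, web: Dict[str, dict]) -> Dict[str, str]:
    """Traverse the link graph breadth-first and collect each page's text."""
    collected: Dict[str, str] = {}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in collected or url not in web:
            continue  # already visited, or a dead link
        collected[url] = web[url]["text"]
        queue.extend(web[url]["links"])  # follow the page's hyperlinks later
    return collected


pages = gather("a.example", FAKE_WEB)
print(list(pages))
```

Tracking which pages were already collected is what keeps the traversal from looping forever when pages link back to each other.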
Indexer:- Once the gatherer retrieves information about Web pages, the information is put into a
database and indexed. The indexer function creates a set of keys (an index) that organizes the
data, so that high-speed electronic searches can be conducted and the desired information can be
located and retrieved quickly.
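One common way to build such a set of keys is an inverted index, which maps each word to the pages that contain it. The sketch below shows that technique as an assumption about what an indexer might do; the page names and texts are made up.

```python
import re
from collections import defaultdict
from typing import Dict, Set


def build_index(pages: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each word (the key) to the set of URLs whose text contains it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return dict(index)


pages = {
    "a.example": "Lions and tigers",
    "b.example": "Indian lions",
    "c.example": "Honey bees",
}
index = build_index(pages)
print(sorted(index["lions"]))  # prints: ['a.example', 'b.example']
```

Looking up a word is then a single dictionary access instead of a scan through every page, which is why the index makes searches fast.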
Types of Queries
Two types of queries are generally used for searching:
(a) Pattern Matching Queries:- This is the most basic type of query. To
formulate a pattern-matching query, a keyword or a group of keywords is typed
into the query submission form. The search engine returns the URL of any page that contains
these keywords. The result set varies from one search engine to another. The search result
may also vary depending on whether singular or plural words are used. Words separated by a
space are treated as separate keywords. We can also use the (+) and (-) signs to include or
exclude a word, e.g. the query +Indian +Lion -Tiger will search for pages that contain the
words Indian and Lion but not Tiger. Any words within quotes are taken as one word or phrase.
These syntax rules may vary between search engines. For details, consult the Help
pages of the search engine.
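The (+), (-), and quoted-phrase rules can be sketched as a small matcher. This is a simplified illustration under several assumptions: matching is case-insensitive substring matching, a page must contain at least one plain keyword when no (+) terms are given, and the sample pages are invented. Real search engines differ in all of these details.

```python
import shlex
from typing import Dict, List


def pattern_match(query: str, pages: Dict[str, str]) -> List[str]:
    """Return URLs of pages matching a simple +word / -word keyword query."""
    required, excluded, optional = [], [], []
    # shlex.split keeps a quoted phrase such as "indian lion" together.
    for term in shlex.split(query.lower()):
        if term.startswith("+") and len(term) > 1:
            required.append(term[1:])
        elif term.startswith("-") and len(term) > 1:
            excluded.append(term[1:])
        else:
            optional.append(term)

    matches = []
    for url, text in pages.items():
        text = text.lower()
        if any(term in text for term in excluded):
            continue  # the page contains an excluded word
        if not all(term in text for term in required):
            continue  # the page lacks a required word
        # With no required terms, at least one plain keyword must appear.
        if not required and not any(term in text for term in optional):
            continue
        matches.append(url)
    return matches


pages = {
    "a.example": "The Indian lion lives in Gir forest.",
    "b.example": "Indian lion and tiger reserves.",
    "c.example": "African lion facts.",
}
print(pattern_match("+Indian +Lion -Tiger", pages))  # prints: ['a.example']
```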
(b) Boolean Queries:- Boolean queries involve the Boolean operations AND, OR and NOT.
Most search engines permit Boolean queries. Some examples of Boolean queries
are given below:
(i) Lion AND Tiger: shows all pages that contain both Lion and Tiger.
(ii) Lion OR Tiger: shows all pages that contain either Lion or Tiger or
both, i.e. at least one of the words.
(iii) Lion NOT Tiger: shows all pages that contain information about Lion
but not Tiger. Thus, the Boolean NOT operation is used to exclude a word.
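The three Boolean operations correspond directly to set intersection, union, and difference over the pages that contain each word. The sketch below uses a made-up index with invented page names to show the correspondence.

```python
from typing import Dict, Set

# A hypothetical index: each word maps to the set of pages that contain it.
INDEX: Dict[str, Set[str]] = {
    "lion": {"page1", "page2", "page3"},
    "tiger": {"page2", "page4"},
}


def pages_with(word: str) -> Set[str]:
    """Return the set of pages containing the word (empty if unknown)."""
    return INDEX.get(word.lower(), set())


lion_and_tiger = pages_with("Lion") & pages_with("Tiger")  # AND: intersection
lion_or_tiger = pages_with("Lion") | pages_with("Tiger")   # OR: union
lion_not_tiger = pages_with("Lion") - pages_with("Tiger")  # NOT: difference

print(sorted(lion_and_tiger))  # prints: ['page2']
print(sorted(lion_or_tiger))   # prints: ['page1', 'page2', 'page3', 'page4']
print(sorted(lion_not_tiger))  # prints: ['page1', 'page3']
```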
Search Strategies
Determining which search engine to use can be challenging. You can begin by testing a
number of different search engines, trying to find one that you believe meets the following
conditions:
- Possesses a user-friendly interface.
- Has easy-to-understand, comprehensive documentation.
- Is convenient to access; that is, you do not have to wait several minutes before being able
to submit a query.
- Contains a large database, so that it knows a lot about the information for which you are
searching.
- Does a good job of assigning relevancy scores.
If you can find a search engine that meets most of these criteria, you should concentrate on
learning it well, rather than learning a little bit about several different search engines.
Once you have learned the query syntax of that search engine, you can begin to formulate
your search strategy. When you post queries to the search engine, two common situations can
occur: either your query does not turn up a sufficient number of hits, or your query turns up too
many hits. In the next sections, you will learn strategies for dealing with these situations.
1. Too Few Hits: Search Generalization
Suppose your query returns no hits or only a couple of hits, none of which is very useful to
you. In this case, you need to generalize your search. The ways to do this include:
- If you used a pattern matching query, eliminate one of the more specific keywords from
your query.
- If you used a Boolean query, remove one of the keywords or phrases with which you
used AND, or delete a NOT item you specified.
- If you restricted your search domain, enlarge it.
- If you are still having no luck, try keywords that are more general, or exchange a couple
of the keywords with synonyms.
If this fails, you may decide to use a directory and work your way down to the topic of
interest. Another alternative would be to use a meta search engine.
2. Too Many Hits: Search Specialization
Suppose your query returns more URLs than you could possibly look through. In this case,
you need to specialize your search.
- If you started with a pattern matching query, you may want to add more keywords.
- If you began with a Boolean query, you might want to AND another keyword, or use the
NOT operator to exclude some pages.
- If you are still retrieving too many hits, try capitalizing proper nouns or names.
- If nothing seems to work, try reviewing the first 20 or so URLs, since search engines list
the best matches near the top. If they do not contain what you are looking for, the
information they do contain may help you refine your search.
If this fails, you could resort to a directory and work your way down to the topic of
interest.
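The two strategies can be pictured as one small loop: drop keywords while there are too few hits, and add keywords while there are too many. The sketch below is only a rough model under stated assumptions. The function name, the thresholds, and the tiny corpus are invented, keywords are assumed to be listed from most general to most specific, and every keyword is treated as an AND term.

```python
from typing import Callable, List

# A made-up corpus of page texts used to count hits.
CORPUS = [
    "indian lion gir forest",
    "indian lion population",
    "african lion",
    "indian tiger",
    "lion king film",
]


def count_hits(keywords: List[str]) -> int:
    """Count the pages that contain every keyword."""
    return sum(all(k in page.split() for k in keywords) for page in CORPUS)


def refine(
    keywords: List[str],
    extras: List[str],
    hits: Callable[[List[str]], int],
    min_hits: int = 1,
    max_hits: int = 20,
) -> List[str]:
    """Generalize or specialize a keyword query until the hit count is manageable."""
    keywords = list(keywords)
    extras = list(extras)
    # Too few hits: drop the last (assumed most specific) keyword.
    while hits(keywords) < min_hits and len(keywords) > 1:
        keywords.pop()
    # Too many hits: add one more keyword at a time.
    while hits(keywords) > max_hits and extras:
        keywords.append(extras.pop(0))
    return keywords


generalized = refine(["lion", "indian", "sanctuary"], [], count_hits)
specialized = refine(["lion"], ["indian", "gir"], count_hits, max_hits=2)
print(generalized)  # prints: ['lion', 'indian']
print(specialized)  # prints: ['lion', 'indian']
```

A loop like this cannot replace judgment: adding a keyword can overshoot and leave too few hits, which is why the text also suggests synonyms, reviewing the top results, or falling back to a directory.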