Managing local area networks

David Coffield and David Hutchison consider the need to provide some form of local area network management

This paper considers the need to provide some form of local area network management. Although Ethernet systems form the main example, the requirements are the same, in general, for other local networks. The authors argue why network management is necessary, examine the difficulties involved, and take a look at the current standards work in the area. Finally, they discuss management of their own particular local area network environment.

Keywords: management computer networks, LANs, Ethernet

A computer network can be simply defined as an interconnected collection of computers and computer-based devices, including terminals. A great advantage of networked machines is that they can exchange information, offering such facilities as electronic mail, file transfer, facsimile and voice transmission. The connection is generally made by some form of copper cable, although satellites and fibre optics can also be used.

Networks are generally classified as wide area networks (WANs) or local area networks (LANs). In WANs, machines communicate via a series of switching nodes. Data to be transferred is broken into a series of packets, each of which is sent from the source host to the destination host via the switches. Each packet may travel by the same route or by completely different routes; it depends on the rules, or protocols, that the individual network uses.
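The packet idea described above can be sketched in a few lines of Python. This is only an illustration: the `Packet` fields, the function names and the 128-byte payload limit are assumptions made for the example, not part of any protocol discussed in this paper.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Packet:
    """One unit of a larger transfer (hypothetical layout)."""
    source: str
    destination: str
    sequence: int
    payload: bytes


def packetize(data: bytes, source: str, destination: str,
              max_payload: int = 128) -> List[Packet]:
    """Break data into packets of at most max_payload bytes each."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    return [
        Packet(source, destination, seq, data[offset:offset + max_payload])
        for seq, offset in enumerate(range(0, len(data), max_payload))
    ]


def reassemble(packets: Iterable[Packet]) -> bytes:
    """Rebuild the data; packets may arrive in any order."""
    ordered = sorted(packets, key=lambda p: p.sequence)
    return b"".join(p.payload for p in ordered)
```

Because each packet carries a sequence number, the destination can rebuild the data even when packets have taken different routes and arrive out of order.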

There are now hundreds of different WANs in use throughout the world. In the USA, one of the largest is the Defense Advanced Research Projects Agency's (DARPA) Catenet, which is an example of an internetwork, i.e. a collection of interconnected packet switched networks. In the UK, British Telecom's Packet SwitchStream (PSS) is widely used, and the Joint Academic Network (JANET) serves the academic community.

Department of Computing, University of Lancaster, Bailrigg, Lancaster, LA1 4YR, UK. David Coffield is a CASE postgraduate research student sponsored jointly by the UK Science and Engineering Research Council and Digital Equipment Company, Scotland Ltd.

0140-3664/85/050240-07 $03.00 ©

LANs are a fairly recent development encouraged by the general fall in the cost of computer hardware, by the increasing number and variety of peripheral equipment available, and especially by improvements in chip technology. Unlike WANs, which are usually operated by a country's telephone companies, LANs belong to the user organization that installs them and are therefore 'free' as regards operational tariffs. They possess much faster transmission speeds than WANs. This is due to the physical media that LANs use and the fact that only short distances are spanned, typically confined to a single building or a campus environment, say up to a maximum of 10 km. LANs are also highly reliable, again owing to the media used and the fact that many LAN technologies make use of passive components.

The use of LANs is increasing, especially in office automation and real-time process control environments.

TOPOLOGY AND PROTOCOLS

WANs consist of point-to-point links between switching elements and have an arbitrarily complex mesh topology. LANs, on the other hand, fall into one of three main topologies: star, bus or ring (see Figure 1).


Figure 1. LAN topologies

1985 Butterworth & Co (Publishers) Ltd

computer communications


All networks have protocols by which they operate. A protocol is basically a set of rules defining the procedures, both software and hardware, that will allow any machine on the network to communicate with any other.

A common set of protocols is provided by the International Standards Organization's Open Systems Interconnection (OSI) Reference Model [1]. The protocols differ between WANs and LANs and from manufacturer to manufacturer, but the architectures of most systems are similar to the OSI model, although the names of the individual layers may be different. Work has been progressing on standards for the individual layers from the Physical layer upwards. The Physical layer is defined in the X.21 standard [2], the data link in High-level Data Link Control (HDLC) [3], and the first three layers are all covered in the X.25 standard [4].

Work is progressing in the layers above these. In the meantime, the UK academic community has been using an interim set of standards, based on what are known as the 'coloured book protocols'. These are a series of protocol definitions for various network tasks, e.g. the Blue Book defines file transfer and the Grey Book defines mail. As the ISO standards become more stable they will replace the 'coloured book protocols'; the transition from one to the other is currently under discussion.

A reference model has been proposed specifically for LANs by the Institute of Electrical and Electronic Engineers' (IEEE) special project committee 802 and the European Computer Manufacturers Association (ECMA) [5]. Figure 2 compares this with the OSI model.

ISO OSI Reference Model      LAN reference model

7  Application
6  Presentation
5  Session
4  Transport
3  Network                   Interface to upper layers
2  Data link                 Logical link control
                             Medium access control
1  Physical                  Physical signalling
   Medium                    Medium

Figure 2. LAN reference model compared with OSI model

NETWORK MANAGEMENT

Wide area networks

The term 'network management' applied to WANs generally refers to functions such as auditing and accounting. This is because users, in using the network for communication, are consuming a service provided by a common carrier. The carrier has to know how much to charge users per time period and, therefore, has to have statistics on who sent how many packets, and when. Charging is primarily related to the number of hops a packet has made in travelling to its destination node and, as this route is not necessarily the optimum, on any 'regional boundaries' crossed.
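As a rough illustration of the accounting described above, the following Python sketch totals a charge per user from per-packet records. The record layout and both rates are invented for the example; real carrier tariffs are not given in this paper.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable

# Hypothetical tariff units, chosen only to make the arithmetic visible.
HOP_RATE = 1
BOUNDARY_RATE = 5


@dataclass(frozen=True)
class PacketRecord:
    """What the carrier logs for one packet (assumed fields)."""
    user: str
    hops: int
    boundaries_crossed: int


def bill(records: Iterable[PacketRecord]) -> Dict[str, int]:
    """Sum charges per user from hop counts and regional boundaries crossed."""
    charges: Counter = Counter()
    for record in records:
        charges[record.user] += (record.hops * HOP_RATE
                                 + record.boundaries_crossed * BOUNDARY_RATE)
    return dict(charges)
```

The same records, grouped by node pair rather than by user, would also serve the capacity-planning use of the statistics.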

Gathering this type of information also enables the carrier to determine whether the facilities provided are adequate; e.g. if congestion is a frequent problem between two particular nodes, then perhaps another route between them should be arranged.

Thus, there is a need for network management in WANs; hence the increasing number of available monitoring tools and software for them.

Local area networks

Network management for LANs has only recently come under investigation. Although there is now considerable interest, little work has yet been done. This section discusses the management of LANs in general terms. Although the 'high-level' aims of management systems are similar for all LAN architectures, the 'low-level' methods of achieving them are not. The high-level ideas discussed here are relevant to all LAN architectures.

There are two main application areas in which LANs are being put to use:

• The loosely coupled interconnection of computers and devices in a local area. This provides the basis for resource sharing types of application.

• The distributed computing system where the LAN plays the part of an extended backplane to distributed nodes that make up a more closely coupled computer system.

Ethernet is currently the main LAN technology being used by systems manufacturers in both classes of application.

Many people argue that there is no need for network management for LANs. An Ethernet LAN is 'simply' a length of coaxial cable with a few branch lines coming from it to the resident machines on the network. The net is local after all, so what is there to manage?

As a simple example, consider that a single Ethernet segment of coaxial cable can be a maximum of 500 m in length, and on to this segment can be connected up to 100 nodes. If several segments are connected via repeaters, it is possible to have several hundred nodes (the maximum for a single Ethernet is 1024). If gateways are added for interconnection to other LANs and/or WANs, the network can grow to a system much more complex than previously envisaged. This may lead, in practice, to someone being given responsibility for the network, i.e. becoming 'network manager'. The network manager will require information to assist in the planning and development of the network and also for general maintenance of the network's present configuration.
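The limits quoted above (500 m per segment, 100 nodes per segment, 1024 nodes per Ethernet) lend themselves to a simple planning check. The sketch below is a hypothetical helper, not an existing tool; it looks only at these three figures and ignores other configuration rules such as repeater counts.

```python
from typing import List, Tuple

MAX_SEGMENT_LENGTH_M = 500
MAX_NODES_PER_SEGMENT = 100
MAX_NODES_PER_ETHERNET = 1024


def check_ethernet(segments: List[Tuple[int, int]]) -> List[str]:
    """Return a list of problems for segments given as (length_m, node_count)."""
    problems = []
    for index, (length_m, nodes) in enumerate(segments, start=1):
        if length_m > MAX_SEGMENT_LENGTH_M:
            problems.append(
                f"segment {index}: {length_m} m exceeds {MAX_SEGMENT_LENGTH_M} m")
        if nodes > MAX_NODES_PER_SEGMENT:
            problems.append(
                f"segment {index}: {nodes} nodes exceeds {MAX_NODES_PER_SEGMENT}")
    total_nodes = sum(nodes for _, nodes in segments)
    if total_nodes > MAX_NODES_PER_ETHERNET:
        problems.append(
            f"network: {total_nodes} nodes exceeds {MAX_NODES_PER_ETHERNET}")
    return problems
```

An empty result means the planned configuration stays within the three limits; each string in a non-empty result names one violated limit.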

vol 8 no 5 october 1985 241


Much management information is present on any LAN but is generally hidden from the users, the majority of whom do not need it in any case. To acquire and accumulate the necessary data for management, certain software tools are necessary, e.g. traffic monitors. However, these are not as readily available for LANs as they are for WANs. The general belief, so far, seems to be that they are unnecessary.

Nevertheless, a number of bodies have begun to recognize the eventual need for a network management system for LANs. ECMA, in particular, have put forward a proposal to add a network management function to their ECMA-82 standard [6]. They define the network management system as having three main functions:

• to aid network planning, i.e. help users determine when and how to expand the network;

• to assist in daily operations, such as start up, monitoring and performance optimization;

• to help with maintenance, i.e. fault detection, isolation and repair.

They state that the network management system should be placed at the Link Layer level of the OSI model, the reason being that it is at this level that a data packet has the greatest amount of framing information surrounding it. Through the interface, at this level, network management can perform two main functions:

• Configuration control:
  ○ initiating, suspending and resuming Link Layer operation,
  ○ setting the station's physical address,
  ○ setting the station's addressing modes (normal, promiscuous, multicast).

• Observation of:
  ○ the values of certain data link state variables and parameters,
  ○ activity data (frames sent and received, bad frames, collisions),
  ○ error conditions (operation of carrier sense and collision detect).
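The two groups of functions listed above can be pictured as a small management interface to one station. The Python sketch below is an assumption-laden model for illustration only: the class, method and counter names are invented here and are not taken from the ECMA proposal.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class AddressingMode(Enum):
    NORMAL = "normal"
    PROMISCUOUS = "promiscuous"
    MULTICAST = "multicast"


def _zeroed_activity() -> Dict[str, int]:
    return {"frames_sent": 0, "frames_received": 0,
            "bad_frames": 0, "collisions": 0}


@dataclass
class LinkLayerStation:
    """Hypothetical management view of one station's Link Layer."""
    physical_address: str
    addressing_mode: AddressingMode = AddressingMode.NORMAL
    operating: bool = False
    activity: Dict[str, int] = field(default_factory=_zeroed_activity)

    # Configuration control
    def initiate(self) -> None:
        self.activity = _zeroed_activity()
        self.operating = True

    def suspend(self) -> None:
        self.operating = False

    def resume(self) -> None:
        self.operating = True

    # Observation
    def record(self, counter: str, amount: int = 1) -> None:
        """Count an activity item; nothing is counted while suspended."""
        if counter not in self.activity:
            raise KeyError(counter)
        if self.operating:
            self.activity[counter] += amount

    def observe(self) -> Dict[str, int]:
        """Return a copy so the caller cannot alter the live counters."""
        return dict(self.activity)
```

The physical address and addressing mode are plain attributes here, so 'setting' them is an ordinary assignment; a real controller would need a defined procedure for changing them while the link is running.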

The ISO are themselves investigating network management, as are the IEEE. Both approaches are reviewed in the section on current working documents below.

The COST 11 BIS LAN Management group [7] have been looking at network management for real time applications, although many of their ideas are applicable to other application areas such as office automation. The COST 11 group argue that the ISO management model is not quite good enough because it appears to be aimed at 'systems management' activities such as resource control, application process management and commitment control.

There is the possibility of confusion here as to what is meant by LAN management. The difference between 'systems management' and 'network management' is subtle. For effective systems management there has to be effective LAN management. They are not the same. Distributed systems management should be considered as an application built on the lower level of LAN management.

Having defined their own management model, the group then split systems management (their definition) into the usual categories of configuration, maintenance, and performance measurement and optimization.

PREVIOUS WORK

A detailed examination of the work that has been done is outside the scope of this paper, but it is mostly WAN based. Some examples include:

• University College London's work in managing their distributed service environment [8]. They have a large network configuration consisting of interconnected Cambridge Rings and gateways to JANET and to Arpanet in the USA.

• Bell Laboratories' network management is described by Coates and Mackey [9], giving details of the Bell Labs network expansion and the consequent necessity for network management.

• Digital's network management in DECnet [10] is an actual layer specified in their network architecture. DEC seem to have realized the importance of network management prior to ISO, IEEE and others who are now busy defining standards to allow implementation.

• The Hatfield Polytechnic network monitoring work described by Vassiliades [11]. They use a Cambridge Ring and run WAN protocols (Digital Data Communication Message Protocol, DDCMP) on top of basic Cambridge Ring protocols.

• Network management facilities for JANET (previously known as SERCnet) are described by Kummer [12].

CURRENT WORKING DOCUMENTS

This section reviews some of the work in progress on defining standards for network management.

IEEE approach

The IEEE refer to systems management across the layers of the OSI model [13]. Systems management is achieved through layer management. The systems management facilities considered necessary by the IEEE are:

Initialization and closedown. A means of remotely initiating, resetting and closing down whole communications systems and individual nodes should be provided.

Software load and dump. Some nodes may require software load and dump facilities. Multicasting would allow parallel loading if this was required.

Routing and configuration control. Means must be provided to inform directory functions of configuration changes for the purpose of updating a configuration database. Also, means must be provided to allow an entity to obtain routing information from a remote directory.

Software distribution. Facilities for managing the distribution and installation of protocol updates are necessary.

Event reporting. An 'event' is an occurrence of something that should be brought to the attention of a management entity, e.g. a counter about to overflow, or an attempt to breach security. A standard method for generalized event reporting is necessary. This will be a basic tool upon which error handling, statistics, accounting, tracing and so on may be built. All of these services are based on events that will be recorded. External utilities using the data as input will build the information necessary to perform these additional services.
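A minimal Python sketch of this arrangement follows: layers report events to a log, and separate utilities read the log to build further services. The `Event` fields and all function names are assumptions for illustration; the IEEE draft does not define them in this form.

```python
import time
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Event:
    timestamp: float
    node: str
    layer: int
    kind: str
    detail: str = ""


class EventLog:
    """Records events brought to the attention of a management entity."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def report(self, node: str, layer: int, kind: str, detail: str = "",
               timestamp: Optional[float] = None) -> Event:
        stamp = time.time() if timestamp is None else timestamp
        event = Event(stamp, node, layer, kind, detail)
        self._events.append(event)
        return event

    def events(self) -> List[Event]:
        return list(self._events)


# 'External utilities' that use the recorded events as input.
def count_by_kind(log: EventLog) -> Dict[str, int]:
    return dict(Counter(event.kind for event in log.events()))


def events_for_node(log: EventLog, node: str) -> List[Event]:
    return [event for event in log.events() if event.node == node]
```

Keeping the log itself free of any statistics or accounting logic is the point of the design: new services can be added as further readers of the same recorded events.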

Error handling. A method of remote notification of error or exceptional situations is required. Also, a reflect-test capability may be required on a per layer basis. Together, these facilities allow passive and active fault detection by a manager. In addition, it may be desirable to allow configuration of a loopback-test facility.

Statistics. Traffic statistics, retry counts, and error levels will be achieved by the provision of counters or meters within the appropriate layer. A means should be provided to read (and reset) the counters.
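The 'read (and reset)' requirement can be sketched as follows. The class and counter names are hypothetical, and a real implementation would also have to make the read-and-reset step atomic with respect to the layer that is updating the counters.

```python
from typing import Dict, Iterable


class LayerCounters:
    """Counters kept within one protocol layer (illustrative only)."""

    def __init__(self, names: Iterable[str]) -> None:
        self._values: Dict[str, int] = {name: 0 for name in names}

    def increment(self, name: str, amount: int = 1) -> None:
        # Raises KeyError for a counter the layer does not provide.
        self._values[name] += amount

    def read(self, reset: bool = False) -> Dict[str, int]:
        """Return a snapshot; optionally zero the counters afterwards."""
        snapshot = dict(self._values)
        if reset:
            for name in self._values:
                self._values[name] = 0
        return snapshot
```

Reading with `reset=True` at the end of each reporting interval gives per-interval figures without any subtraction by the manager.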

Accounting. The correlation of statistics accumulated within several layers to allocate costs. This is achieved by an application which makes use of the statistical facilities of systems management.

OSI approach

The ISO state that the management necessary depends on the environment, and management tools can be basic or extensive when it comes to implementation [14]. OSI management should be optional and have a basic optional set of functional capabilities. Management functions are listed as for the IEEE. The functions are stated as being the means of fulfilling management requirements, which are as follows:

Standard management activities to allow the planning, organization, supervision and control of the communication services. Requirements include: allocation, deallocation, access control and status indication of communications objects; configuration management (logical connectivity); activation and deactivation; monitoring (including deadlock and its prevention); statistical reporting and accounting; appropriate command and response languages; and time determination.

Flexibility to accommodate changing and new applications as a consequence of changing requirements, such as reconfiguration and the handling of names and their synonyms.

Open systems which support their applications in a secure and predictable manner. Included are requirements for: repeatability; integrity against system malfunction and interference from other users; use of commitment control; and the testing of application behaviour.

Protecting their information and authenticating their sources of, and destinations for, information. Included are requirements for: authentication and name validation; encryption and key management; and log on (identification) and log off.

Reliable reports whenever failure of open systems components renders their applications unavailable or unreliable. Included are requirements for: error reporting and recovery; failure diagnostics; and journalling.

Monitoring and controlling costs. Included are requirements for: accounting information; cost parameters; and performance monitoring and audit trails.

Specifying the performance criteria and the quality of service required for their applications.

Communicating with managers responsible for open systems. Requirements include appropriate command and response languages, and complaints and requests.

It should be remembered that the OSI document is not specifically aimed at LANs.

MAP approach

General Motors are presently defining a MAP (Manufacturing Automation Protocol) for use on LANs within their manufacturing plants [15]. As part of that work they have outlined their approach to network management.

MAP places the management responsibility on a network manager, which is aided in its function by a series of system managers, one of which resides at each node. MAP divides network management into five elements:

Monitoring. Responsible for the collection and storage of information on the network. Examples of monitoring functions include: traffic throughput in packets; traffic throughput in messages; packet response time; and mean packet size, etc.
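Two of the monitoring figures named above, traffic throughput in packets and mean packet size, can be derived from a list of observed packet sizes. The sketch below assumes the monitor has already collected the sizes seen during one interval; the function name and the shape of the result are invented for the example.

```python
from typing import Dict, Sequence


def summarize_traffic(packet_sizes: Sequence[int],
                      interval_seconds: float) -> Dict[str, float]:
    """Summarize packets (sizes in bytes) observed during one interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    count = len(packet_sizes)
    total_bytes = sum(packet_sizes)
    return {
        "packets_per_second": count / interval_seconds,
        "bytes_per_second": total_bytes / interval_seconds,
        "mean_packet_size": total_bytes / count if count else 0.0,
    }
```

Message throughput and packet response time would need extra information (message boundaries and timestamps) that this sketch does not model.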

Control. This network management element contains those functions that are used to directly alter the state of network devices including recovery from catastrophic failures. Examples of control functions include: initiating an IPL (Initial Program Load); validating node operation; comparing directories for integrity; resetting statistics; and loading/dumping memory.

Configuration. Those functions that alter the network topology and propagate this information to any affected nodes. Examples of configuration functions include: changing node address directories; generating a topology map via a report generator; and generating a node status list via a report generator.

Problem determination. Problem Determination Techniques (PDT) are used to recognize the existence of a problem, track down the source, and take corrective action. PD action is usually initiated after an error condition has been identified by a monitor function. The PDT element is 'intelligent' in that it obtains additional information about the problem and derives a solution. Some examples of PD diagnostics include: line status/condition; processor status/condition; modem status/condition; modem loopback techniques; and counter inquiry/text.

Recovery. This element is made up of procedures that describe what action and functions must be used to recover the network from catastrophic failures. These procedures serve as a guide for the use of the functions contained in the other four elements of network management. Some examples of recovery procedures include: graceful shutdown; forced shutdown; status verification; physical recovery; and restarting a node.

The OSI document is not directly LAN based; rather, it refers to the OSI architecture in general. The IEEE and MAP documents are LAN specific.

The approach to defining the architecture is different in each case. The IEEE architecture seems to have been designed from the bottom up. It is a detailed, and hence somewhat confusing, document. The OSI architecture is more top down, with slightly more emphasis on what management should achieve rather than how to achieve it. The MAP document follows more of a middle course. It is aimed at factory-based, real-time LANs, where effective network management is particularly important. All three documents, however, vary in the terminology used.

Although the basic goals of each architecture are similar, as are the means of achieving those goals, the variations in terminology do not make this clear. The one idea that is common to all is the provision, in each layer, of network management facilities. Such facilities would typically be processes that are active and can be accessed by the layers above. Common to all three approaches is the concept of layer management, on which network management is built.

OTHER POINTS IN MANAGING A LAN

Managing an Ethernet is a different proposition to managing a WAN. Ethernet is a reliable technology with well shielded coaxial cable, and the transceiver taps are passive. More dangerous failure modes may concern the higher-level protocol software.

Line-cost management is not a problem in Ethernet systems. The cost of use is constant. Getting more performance out of a given set of nodes, thus avoiding having to add equipment, requires weighing of costs and benefits.

Tuning for packet length can have a positive effect on general system performance. Also, tuning the packet-buffering requirements of performance-sensitive nodes can add significant performance benefits.
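One reason packet length matters is the fixed per-frame overhead. The Python sketch below estimates the fraction of cable time that carries user data for a given payload size. The constants are the usual 10 Mbit/s Ethernet frame figures (8-byte preamble, 14-byte header, 4-byte frame check sequence, 46 to 1500 byte data field, 12 byte-times between frames) and are assumptions brought in for the example rather than values stated in this paper; collisions and higher-level protocol headers are ignored.

```python
PREAMBLE_BYTES = 8
HEADER_BYTES = 14          # destination, source and type/length fields
FCS_BYTES = 4              # frame check sequence
INTERFRAME_GAP_BYTES = 12  # minimum spacing between frames, in byte times
MIN_PAYLOAD_BYTES = 46     # shorter payloads are padded up to this size
MAX_PAYLOAD_BYTES = 1500


def wire_efficiency(payload_bytes: int) -> float:
    """Fraction of time on the cable spent carrying payload bytes."""
    if not 1 <= payload_bytes <= MAX_PAYLOAD_BYTES:
        raise ValueError("payload must be between 1 and 1500 bytes")
    padded = max(payload_bytes, MIN_PAYLOAD_BYTES)
    on_wire = (PREAMBLE_BYTES + HEADER_BYTES + padded
               + FCS_BYTES + INTERFRAME_GAP_BYTES)
    return payload_bytes / on_wire
```

Under these assumptions a full 1500-byte payload uses 1500 of 1538 byte times, whereas a 46-byte payload uses only 46 of 84, which is the kind of difference that packet-length tuning tries to exploit.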

Some support for system simulation may be useful: the ability to create certain conditions and observe the results, e.g. what might happen if the network were fully loaded.

NETWORK MANAGEMENT HARDWARE SUPPORT

The group responsible for the IEEE 802.3 Ethernet standard have proposed a set of statistics counters and events that should be available for use by a network management entity (see Table 1) [16]. The intention is that these counters and event signals should be built into the Ethernet controller boards of all machines. Nineteen counters and events are listed in Table 1, of which only five counters are stated to be mandatory; these are all frame oriented.

Table 2 shows the counters available on the Deuna, the Ethernet-to-Unibus controller board for the Digital range of Vax and PDP minicomputers [17].

Table 1. Proposed IEEE 802.3 counters and events

Classification  Measurement
M               Number of frames received OK
O               Number of bytes received OK
O               Number of multicast frames received OK
M               Number of frames received with frame check sequence error
M               Number of frames received with alignment error
M               Number of frames received with length error
O (SE)          Frame received longer than the permitted maximum
R               Number of good frames lost due to lack of Rx resources
R               Number of frames transmitted OK
O               Number of bytes transmitted OK
O               Number of pad bytes transmitted
O               Number of transmissions with collision
O               Number of transmissions with more than one collision (but less than the maximum)
R (SE)          Transmission aborted owing to late collision
M               Number of Tx frames aborted owing to excessive collisions
R (SE)          Heartbeat failure
R (SE)          Carrier sense
R               Result of time domain reflectometer test
R               Number of Tx frames that had to be deferred on first try

M = mandatory, O = optional, R = recommended, SE = signal event, Tx = transmit(ted), Rx = receive(d)

Table 2. Deuna on board counters

Seconds since last zeroed
Packets received
Multicast packets received
Packets received with error
Data bytes received
Multicast data bytes received
Receive packets lost - internal buffer error
Receive packets lost - local buffer error
Packets transmitted
Packets transmitted 3+ attempts
Packets transmitted 2 attempts
Packets transmitted - deferred
Data bytes transmitted
Multicast data bytes transmitted
Transmit packets aborted
Transmit collision check failure

Ignoring the confusion in terminology between the two tables (assume that a frame is the same as a packet), the following differences may be noted. All five of the mandatory IEEE counters are present on the Deuna, viz: Number of frames received OK; Number of frames received with CRC error; Number of frames received with alignment error; Number of frames received with length error; and Number of Tx frames aborted owing to excessive collisions.

Of the recommended counters, only the number of frames transmitted OK, and the number of frames lost owing to lack of Rx resources are recorded.

Three of the optional counters are implemented: number of bytes received OK; number of multicast frames received OK; and number of bytes transmitted.

The Deuna, therefore, compares favourably, considering that it was available prior to the IEEE proposal. It may well be that the IEEE considered the Deuna, the facilities it provided and those it lacked.
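The comparison can be restated as a small coverage check. In the Python sketch below, the classifications are transcribed from Table 1 and the set of counters credited to the Deuna is taken from the preceding paragraphs, not from an independent reading of Table 2. The shortened counter names and the function name are conveniences for this example.

```python
from typing import Dict, Set, Tuple

# Classification of each proposed item: M = mandatory, O = optional,
# R = recommended (signal-event markers omitted).
IEEE_COUNTERS: Dict[str, str] = {
    "frames received OK": "M",
    "bytes received OK": "O",
    "multicast frames received OK": "O",
    "frames received with FCS error": "M",
    "frames received with alignment error": "M",
    "frames received with length error": "M",
    "frame longer than permitted maximum": "O",
    "good frames lost due to lack of Rx resources": "R",
    "frames transmitted OK": "R",
    "bytes transmitted OK": "O",
    "pad bytes transmitted": "O",
    "transmissions with collision": "O",
    "transmissions with more than one collision": "O",
    "transmission aborted owing to late collision": "R",
    "Tx frames aborted owing to excessive collisions": "M",
    "heartbeat failure": "R",
    "carrier sense": "R",
    "time domain reflectometer test result": "R",
    "Tx frames deferred on first try": "R",
}

# Counters the text above credits to the Deuna.
DEUNA_AS_REPORTED: Set[str] = {
    "frames received OK",
    "frames received with FCS error",
    "frames received with alignment error",
    "frames received with length error",
    "Tx frames aborted owing to excessive collisions",
    "frames transmitted OK",
    "good frames lost due to lack of Rx resources",
    "bytes received OK",
    "multicast frames received OK",
    "bytes transmitted OK",
}


def coverage(implemented: Set[str]) -> Dict[str, Tuple[int, int]]:
    """Return (implemented, proposed) counts for each classification."""
    result = {}
    for classification in ("M", "R", "O"):
        names = [name for name, cls in IEEE_COUNTERS.items()
                 if cls == classification]
        present = sum(1 for name in names if name in implemented)
        result[classification] = (present, len(names))
    return result
```

On these figures the Deuna covers all five mandatory items, two of the seven recommended ones and three of the seven optional ones.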

LOCAL ENVIRONMENT

Lastly, this section gives a brief description of the problems in managing our own local environment. We possess an Ethernet, illustrated in Figure 3, to which are attached a variety of machines: presently three DEC Vaxes, three Sun Microsystems workstations, and a DEC 350 Professional. The Suns and Whitechapel, along with one of the Vaxes, run 4.2 BSD Unix. The other Vaxes run VMS and the Professional runs P/OS.

[Figure 3 diagram: two Ethernet segments (Segment 1 and Segment 2) with attached Sun workstations, Vax 750, Vax 780 and Vax 785 machines, a Pro 350 and an X.25 link]

Figure 3. Lancaster Ethernet configuration (mid 1985)

4.2 Unix systems carry strong InterProcess Communication (IPC) facilities [18]. The basic building block for communication in 4.2 is the socket, which is an endpoint of communication to which a name may be bound. Each socket has a type (virtual circuit, datagram or 'raw') and one or more associated processes. Sockets exist within communication domains, of which two are presently supported: the Unix domain, for communication within the same machine, and the Internet domain, for communication over a network using the DARPA standard protocols. The VMS machines, and the Professional, communicate via DECnet, DEC's proprietary network architecture.
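The socket abstraction described above survives largely unchanged in present-day systems, so it can be illustrated with Python's standard `socket` module, which wraps the same interface. The sketch below is not the 4.2 BSD C code itself; the two function names are invented. It shows a virtual circuit in the Unix domain and a datagram in the Internet domain, using the loopback address so that no network is needed.

```python
import socket


def unix_domain_roundtrip(message: bytes) -> bytes:
    """Send a short message over a connected pair of Unix-domain stream sockets."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        left.sendall(message)
        return right.recv(len(message))
    finally:
        left.close()
        right.close()


def internet_domain_roundtrip(message: bytes) -> bytes:
    """Send one datagram to a socket bound to a name in the Internet domain."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Binding gives the socket a name: (address, port). Port 0 asks
        # the system to choose a free port.
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(2.0)
        sender.sendto(message, receiver.getsockname())
        data, _sender_address = receiver.recvfrom(65535)
        return data
    finally:
        sender.close()
        receiver.close()
```

The socket type is chosen by the second argument (`SOCK_STREAM` for a virtual circuit, `SOCK_DGRAM` for datagrams) and the domain by the first, which mirrors the type and domain distinction made in the text.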

This highlights an important problem, namely that with different operating systems supporting different protocol sets, intermachine communication is extremely difficult. Typically, there is also a lack of operating system 'hooks' which interface to the networking software, making network management, in particular, difficult to implement on existing systems. For our part, we shall attempt to use sockets to manage the Unix machines. This leaves the problem of the VMS machines. Either Internet software needs to exist on the VMS systems or the Unix machines must be given a DECnet capability. However, a problem also exists with other versions of Unix. For example, we have a PDP 11/44 running Version 7 Unix. V7 Unix provides no socket abstraction so if we connect the PDP to the Ethernet there is no immediate means of getting it to communicate with the 4.2 systems. System V has a socket type abstraction known as streams, but the exact relationship with 4.2 sockets, and how the two might interface, is unclear.

Another problem is that the Deuna is currently the only controller we have which provides hardware counters for network management. This should, however, ease as the Ethernet chips become more widely available and used, and if the IEEE counter recommendations are followed.

4.2 Unix has a number of commands that may be considered network management functions. We intend to improve these and/or bring them together as the basis of a network management system. This may be used in a Sun workstation that has complete control over its local domain. One particular area we believe to be worth investigating is that of 'load sharing' over the Ethernet.

CONCLUSION

Although we have predominantly discussed Ethernet, network management needs are much the same in other LAN systems. The general requirements are similar, although there may be specifically different problems. For instance, in token ring and token bus systems the handling of the various token loss timers may be considered as a problem that network management should deal with. In broadband systems, the management task may be even more substantial. As broadband networks have the potential for the attachment of more nodes, and over a greater distance, such networks may become more complex than other LAN types.

For any local network system, the implementation of network management is not straightforward and relies on several factors if it is to be done properly. Most important is the provision of the correct 'hooks', both in software and hardware, to allow its implementation. In present local network systems, these hooks are few and far between, but the new network management standards being developed are likely to improve matters in the near future.

REFERENCES

[1] ISO 'Information processing systems - Open systems interconnection - Basic Reference Model' International Standard ISO/IS 7498 (1983)

[2] McNamara, J E Technical Aspects of Data Communication Digital Equipment Corporation 2nd Edition (1982) (Chapter 7)

[3] ISO 'Data communication - High-level data link control procedures - frame structure' ISO 3309 (1979)

[4] Anon 'Draft revised recommendation X.25' ACM Comput. Commun. Rev. Vol 10 No 1 and 2 (1980) pp 56-129

[5] IEEE Draft IEEE Standard 802.1 IEEE Computer Society (August 1984). Also 'Introduction to Local Area Networks' Technical Guide TG101/5, UK Department of Trade and Industry (December 1984)

[6] Bulnes 'Proposal to add a Network Management function to ECMA-82' ECMA/TC24/83/87 ECMA (June 1983)

[7] Sloman, M 'Management of local area networks' Part 2 of Final Report of COST 11 BIS Local Area Network Group (October 1984)

[8] Winfield, B, Daniel, T and Hall, B 'Network management in a distributed service environment' INDRA Note 1577 University College London, UK (April 1984)

[9] Coates, K E and Mackey, K E 'The evolution of network management services in the Bell Labs network: throes and aftermath' Compcon 82 - High Technology in the Information Industry IEEE Computer Society pp 220-230 (February 1982)

[10] Stewart, R L and Wecker, S 'Network management in DECnet' Compcon 80 - Distributed Computing - 21st International Conference IEEE Computer Society (September 1980)

[11] Vassiliades, S Measurements on the Hatfield Network School of Information Sciences, Hatfield Polytechnic, UK

[12] Kummer, P S Management and operation of a wide-area network Network Development Group, SERC, Daresbury Laboratory, Warrington, UK (1980)

[13] IEEE 'Systems Management' Draft IEEE Standard 802.1: Section 5 IEEE Computer Society (August 1984)

[14] ECMA 'OSI Management Architecture' ECMA TR/YY (December 1984)

[15] MAP 'Manufacturing Automation Protocol' V2.0 General Motors, MI, USA (February 1985)

[16] Appendix D 'Intercept Recommendations for Local Area Networks according to the CSMA/CD access method' Technical Guide TG101/2, UK Department of Trade and Industry, UK (April 1984)

[17] DEC Deuna Users Guide Digital Equipment Corporation (1983)

[18] Leffler, S J, Fabry, R S and Joy, W N A 4.2 BSD Interprocess Communication Primer Computer Systems Research Group, Department of Electrical Engineering and Computer Science, University of California, Berkeley, USA
