
Applied Ergonomics 40 (2009) 325–340

journal homepage: www.elsevier.com/locate/apergo

Normal people working in normal organizations with normal equipment: System safety and cognition in a mid-air collision

Paulo Victor Rodrigues de Carvalho a,b,*, José Orlando Gomes b, Gilbert Jacob Huber b, Mario Cesar Vidal c

a Comissão Nacional de Energia Nuclear, Instituto de Engenharia Nuclear, Cidade Universitária, Ilha do Fundão, CEP 21945-970 Rio de Janeiro, RJ, Brazil
b Graduate Program in Informatics-NCE&IM, Cidade Universitária, Rio de Janeiro, RJ, Brazil
c Luis Alberto Coimbra Research Institute, COPPE, Federal University of Rio de Janeiro, Rio de Janeiro, RJ, Brazil

Article info

Article history:
Received 1 February 2008
Accepted 28 November 2008

Keywords:
Mid-air collision
System safety
Cognitive strategies
Air traffic management system

* Corresponding author. Comissão Nacional de Energia Nuclear, Instituto de Engenharia Nuclear, Cidade Universitária, Ilha do Fundão, CEP 21945-970 Rio de Janeiro, RJ, Brazil. Tel.: +55 21 21733835.
E-mail address: [email protected] (P.V. Rodrigues de Carvalho).

0003-6870/$ – see front matter © 2008 Elsevier Ltd. All rights reserved.
doi:10.1016/j.apergo.2008.11.013

Abstract

A fundamental challenge in improving the safety of complex systems is to understand how accidents emerge in normal working situations, with equipment functioning normally in normally structured organizations. We present a field study of the en route mid-air collision between a commercial carrier and an executive jet, in the clear afternoon Amazon sky, in which 154 people lost their lives, that illustrates one response to this challenge. Our focus was on how and why the several safety barriers of a well structured air traffic system melted down, enabling the occurrence of this tragedy without any catastrophic component failure, and in a situation where everything was functioning normally. We identify strong consistencies and feedbacks regarding factors of system day-to-day functioning that made monitoring and awareness difficult, and the cognitive strategies that operators have developed to deal with overall system behavior. These findings emphasize the active problem-solving behavior needed in air traffic control work, and highlight how the day-to-day functioning of the system can jeopardize such behavior. An immediate consequence is that safety managers and engineers should review their traditional safety approach and accident models based on equipment failure probability, linear combinations of failures, rules and procedures, and human errors, to deal with complex patterns of coincidence possibilities, unexpected links, resonance among system functions and activities, and system cognition.

© 2008 Elsevier Ltd. All rights reserved.

...if there is no seed, if the bramble of cause, agency, and procedure does not issue from a fault nucleus, but is rather unstably perched between scales, between human and non-human, and between protocol and judgment, then the world is a more disordered and dangerous place
Galison (2000), p. 32

1. Introduction

Mid-air collisions of en route aircraft are extremely rare events. A review of air traffic management (ATM) related accidents worldwide from 1980 to 2001 (Van Es, 2003) showed that ATM-related accidents account for 8% of all accidents (the ATM-related accident rate is 0.44 per million flights). This review also showed that most of the fatalities are caused by mid-air collisions (63%), and that the major causal factors (classified according to the ICAO taxonomy – Flight Crew, Air Traffic Controller (ATC), Environmental, and Aircraft System) that contributed to the mid-air collisions were: 1) ATC – Failure to provide separation – air, and 2) Flight Crew – Lack of positional awareness – in air.

In this paper, we use a systemic framework to analyze the functioning of the ATM cognitive system during the mid-air collision between flight GLO1907 (a commercial aircraft Boeing 737-800) and flight N600XL (an EMBRAER E-145 Legacy jet) to understand how and why this tragedy happened. This ATM-related accident occurred at 16:56 Brazilian time on September 29, 2006, in the clear afternoon Amazon sky. Our aim is to understand how a mid-air collision can still happen, despite the various defense layers that exist in the ATM system to prevent just such an event. Based on our findings we develop some insights about the cognitive functioning of the Brazilian Air Traffic Controllers and its safety implications for ATM system operation.

Accidents and incidents by themselves cannot be considered absolute and direct indicators of the safety of any system (Woods and Cook, 2006). However, the analysis of the dynamic interplay of loosely and tightly coupled subsystems during the emergence of incidents and accidents can reveal patterns of behavior in the functioning of the whole system that may indicate a drift to unsafe states (Snook, 2000). The basic safety goal of the ATM system during en route commercial flights is to avoid mid-air collisions. Therefore, an analysis of the system operation during a mid-air collision can provide insights about system weaknesses and safety.

1.1. A systemic framework for accident analysis

Modern organizations are complex sociotechnical systems comprised of many nested levels: government, regulators, company, management, staff, and activities/work/processes/technical systems. According to Rasmussen and Svedung (2000), safety can be viewed as a control problem involving all these levels and must be managed by a control structure embedded in an adaptive sociotechnical system. From this perspective, accidents are emergent properties of complex systems that are more prone to occur when the control systems (safety barriers included) do not adequately handle the day-to-day system failures/disturbances in a broad sense (external disturbances, component failures, human failures, or dysfunctional interactions among system components), throughout the system’s life cycle (Hollnagel, 2004). The outcome of a controlled system – the result of reasonably foreseen input changes – is directly influenced by the system’s control mechanisms in place. Hazardous situations occur when flaws in the control mechanisms (technical, human, organizational) enable the emergence of unexpected outcomes that generate negative consequences. Accidents in safety-critical systems, like ATM, with many layers of control (defense-in-depth concept) can happen only when there are simultaneous failures in various control mechanisms in different system layers (Hollnagel, 2006).

ATM systems are composed of many nested layers resulting in complex interactions. Interactions occur between human operators (controllers and pilots), between human operators and procedures (flight plans, rules to define the controlled air space, the air space sectors that must be handled by some specific controller team, general safety rules for the control of traffic, and so forth), and between operators and hardware/software technical systems (radar systems, computer processing of radar and flight data, aircraft navigation systems, traffic alert and collision avoidance system – TCAS, communication systems between controllers and pilots, flight progress strips).

Brooker (2006) simplified the ATM system description into three structural system layers acting as the system controls: Planning (pre-operational), Operation (the flight in progress), and Alert (the ground and air protection enabled by conflict alert systems, on which the controller/pilot will act). Humans in the several ATM system layers participate in control of the system, acting to operate the system in a safe manner. Failure of system control occurs when the mechanisms used to keep system stability fail and make the situation worse against reasonably foreseen threats. The fundamental issue here is how to identify reasonably foreseen threats in dynamic real-life situations. To do so, we must understand how people address the overall constraints of the control system during their daily work. The most important question regarding human control performance in safety-critical systems is to understand how and why normal work done by normal people enables the emergence of accidents (Dekker, 2006).

2. Method

The research methodology to investigate the functioning and safety of the ATM system based on the analysis of a single case study – the GLO1907/N600XL collision – can be justified by Yin’s (1994) rationale. According to Yin, case studies can be used when how and why questions are being posed, and when the focus is on a contemporary phenomenon within some real-life context. Understanding HOW two airplanes collided in the clear afternoon Amazon sky satisfies Yin’s HOW criterion. The concurrence of organizational and cognitive factors that enabled the emergence of this tragedy – directly related to the ATM system’s safety – satisfies Yin’s WHY criterion.

Catastrophic accidents in the domain of ultra-safe systems – such as commercial fixed wing scheduled flight, with risk lower than 10⁻⁶ – provide a unique opportunity to study the safety of complex systems. This condition also justifies the use of a single case study according to Yin’s rationale: an extreme and unique case or a revelatory one (Yin, 1994). The mid-air collision satisfies these two criteria. It is unique and extreme in the sense that mid-air collisions are the least frequent ATM-related accidents, and it is revelatory due to the generation of a considerable amount of data that became available to the public and researchers through secondary sources. The availability of these data is especially important in the case of the Brazilian ATM system, as it is operated under military administration and the data about its functioning (e.g. danger reports, near misses) are not usually available to the public and social scientists.

In our research method, we search for the mid-air collision antecedents, traced through the concurrence of performances of the several actors (pilots and controllers) and their interaction with the contextual conditions of the several ATM subsystems. Our aim is to understand how and why this accident happened based on the rich data set that, because of the occurrence of this tragedy, is now available to the public. Data and evidence, antecedents and consequences of this mid-air collision come from a wide range of publicly available sources. These sources include official government documents, congressional hearings (including controllers’, pilots’, and air traffic management authorities’ testimonies), video tapes, audio tapes of many media centers, press releases, newspaper clippings, flight plans, regulations, maps, directives and so forth.

3. The air traffic management system – ATM

To understand the control functions of the ATM system, we use the systems layers concepts as defined by Brooker (2006). The ATM system layers are formed by human, technological (hardware and software), and organizational components. Some of them, important to avoid mid-air collisions, are briefly described:

– Controllers and pilots, sharp-end operators at the bottom layer of the system;

– Prescribed safety rules for the control of traffic, including the minimum separation to be permitted between aircraft, flight plans;

– Communications equipment between controllers and pilots (special radio communications frequencies);

– Airways to structure aircraft traffic in the controlled air space and provide references for traffic separation;

– Air space sectors to allow the division of air traffic control among different controller teams;

– Many other software rules to discipline where and how the different types of aircraft must fly;

– Flight progress strips based on the flight plan data used by controllers to quickly recall the order (in time) and the details of the flights they have to handle; recommendations regarding the maximum number of flights that can be managed by the controller;

– Radar systems:

  ○ Primary Surveillance Radar (PSR), which works by passively bouncing a radio signal off the skin of the aircraft and whose advantage is that it operates totally independently of the target aircraft – that is, no action from the aircraft is required for it to provide a radar return – but which does not positively identify the aircraft, generates imprecise altitude information, and has a range limited by distance, altitude, terrain and rain or snow;

  ○ Secondary Surveillance Radar (SSR), which overcomes these limitations but depends on a transponder in the aircraft to respond to interrogations from the ground station to make the plane more visible and to report the aircraft’s altitude.

– Computer processing complementing the information coming from the radar and flight data;

– High quality aircraft navigation systems, from Inertial Navigation Systems through to satellite-based aids such as GPS;

– Short Term Conflict Alert (STCA) and Separation Monitoring Function (SMF), the computer processing systems that analyze the radar tracks to predict whether aircraft might come into proximity soon and, if they might, warn the controller by flashing a message on the radar screen;

– Traffic Alert and Collision Avoidance System (TCAS), the on board collision avoidance system based on detection of other aircraft in the vicinity through their SSR transponders. It informs the pilot of nearby traffic – TA (Traffic Advisory) – and of aircraft coming into conflict – RA (Resolution Advisory). RAs instruct the pilot to climb or descend as appropriate to take the flight out of risk.
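The conflict-prediction idea behind a short term conflict alert can be illustrated with a minimal sketch. It is not the algorithm of any operational STCA: it simply extrapolates two radar tracks in a straight line and reports the first time at which both horizontal and vertical separation would be lost. The `Track` type, the `predict_conflict` function, the 5 NM horizontal threshold, the two-minute look-ahead and the 5 s step are all illustrative assumptions; only the 1000 ft vertical separation comes from the text.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Track:
    """Hypothetical radar track: position in NM, altitude in ft, speed in kt."""
    x_nm: float
    y_nm: float
    alt_ft: float
    vx_kt: float
    vy_kt: float
    vz_fpm: float = 0.0  # vertical speed, ft/min


def predict_conflict(a: Track, b: Track,
                     lookahead_s: int = 120, step_s: int = 5,
                     min_horiz_nm: float = 5.0,      # assumed threshold
                     min_vert_ft: float = 1000.0) -> Optional[int]:
    """Return the first time (s) at which both separations are lost, else None.

    Tracks are extrapolated linearly; a real system would also handle
    turns, cleared levels and track uncertainty.
    """
    for t in range(0, lookahead_s + 1, step_s):
        hours = t / 3600.0
        dx = (a.x_nm + a.vx_kt * hours) - (b.x_nm + b.vx_kt * hours)
        dy = (a.y_nm + a.vy_kt * hours) - (b.y_nm + b.vy_kt * hours)
        dz = (a.alt_ft + a.vz_fpm * t / 60.0) - (b.alt_ft + b.vz_fpm * t / 60.0)
        if math.hypot(dx, dy) < min_horiz_nm and abs(dz) < min_vert_ft:
            return t
    return None


if __name__ == "__main__":
    # Two aircraft head-on at the same level, 21 NM apart, 450 kt each.
    east = Track(0.0, 0.0, 37000.0, 450.0, 0.0)
    west = Track(21.0, 0.0, 37000.0, -450.0, 0.0)
    print(predict_conflict(east, west))
    # Same geometry, but with the 1000 ft lane separation respected.
    west_low = Track(21.0, 0.0, 36000.0, -450.0, 0.0)
    print(predict_conflict(east, west_low))
```

Note that such a prediction depends on reliable altitude data for both tracks, which in the system described here is reported through the SSR transponder.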

In an ATM control system functioning as described above, the safety control barriers to avoid a mid-air collision can be summarized as:

– Controlled air space with straight route, with two traffic lanes,

– Opposite traffic flows along each lane,

– Distance between the two lanes of 1000 ft (vertical separation),

– Well defined flight plans received before the flight,

– Traffic flow per lane lower than 4 aircraft/hour (longitudinal separation),

– The two aircraft are equipped with Traffic Alert and Collision Avoidance System (TCAS),

– The two aircraft are under surveillance of ground controllers, with radar tracking and radio communication.

In a system with this entire set of control safety barriers, how could an accident like this ever happen? How could a sophisticated ATM system with several safety barriers, many coordinating mechanisms, and surveillance and communication equipment providing redundant layers of cross-checking possibilities allow this to happen? There was no emergency situation, no weather problems in the clear afternoon Amazon sky, and no sudden equipment failure that can be considered the cause of this tragedy.

3.1. The configuration of the Brazilian ATM system

Because the research approach attaches great significance to the work environment as the root of variation in decision-making and cognitive behavior, we present a brief description of the Brazilian ATM system using the means-end hierarchy (Rasmussen et al., 1994). In particular, we will use the Rasmussen and Svedung (2000) framework for risk management to look at management and organization structures. Fig. 1 presents the generic actor map for the Brazilian ATM system. This diagram provides a view of the whole system, including many levels ranging from governmental structures at the top, down to the local environment of sharp-end operators (controllers and pilots), the ultimate level related to the collision. The lower level represents the individual operators that are interacting with the process being controlled. The third level describes the company managers responsible for the companies’ policy and strategies, and for the supervision of the operators’ activities. The second level describes activities of regulators and associations responsible for monitoring the activities of the companies in the aviation sector. The first level details activities of the government and juridical aspects related to the same sector. The representation of all these levels is necessary because they all interact with each other, mutually directing and influencing each level’s behavior, to control the system and provide adaptation to environmental changes. To understand the events at any particular level, it is therefore important to understand what has gone on at all levels in the system. As pointed out by Rasmussen and Svedung (2000), any sociotechnical system is subject to severe environmental pressure in a dynamic society. Societal pressure first appears at the higher levels in the definition of legislation, regulations, budget, and so forth, going down to the companies’ policies and strategies, and reaching the sharp-end operators’ behaviors and actions. Adequate control strategies at all levels enable the system to operate at low risk, in which proper co-ordination and decision-making at all levels can be achieved. These observations are particularly important considering the rapid growth of commercial fixed wing flights in Brazil, which is increasing the demands on the entire Brazilian air traffic system.

The regulation of the Brazilian air traffic system complies with the international ICAO legislation. The Brazilian constitution is the source for the development of specific regulations governing the functioning of the Brazilian Air Space Control System (SISCEAB), which is also regulated by the Department of Air space Control (DECEA), responsible for the integrated air defense and air traffic control system. The SISCEAB encompasses all of the system’s operational functions: communication, air traffic control, air defense, aeronautic meteorology, cartography, aeronautic information, search and rescue, and accident investigation.

Brazilian authorities decided to have only one ATM system, the Integrated Air Defense and Air Traffic Control System (SISDACTA). The SISDACTA is responsible for the co-ordination and joint operation of the air defense and air traffic control activities. The SISDACTA acts on the entire Brazilian air space and in the international regions under Brazilian responsibility, according to ICAO agreements. The total area covered corresponds to 22 million square kilometers (8 511 965 km² above Brazilian land). The SISDACTA comprises:

1) Tower Control Centers (TWRs) – control air traffic up to 5 km from the airports;

2) Approach Control Centers (APPs) – control air traffic between 5 and 74 km from the airports;

3) Area Control Centers (ACCs) – known as CINDACTAs (acronym for Integrated Air Defense and Air Traffic Control Centers) – control air traffic in the airways.

For radar surveillance and air traffic control purposes, Brazil has been divided into four large regions. All radar data from each region are processed by an area control center (ACC) or CINDACTA. Each one of the four CINDACTAs has radar control of its region and is responsible for the co-ordination of the regional air traffic. Due to its central location, CINDACTA I (Brasilia ACC), which operates from Brasilia and was the first center installed, is responsible for controlling most of the air traffic in Brazil. CINDACTA II operates in the south region, CINDACTA III covers the northeast air space, and CINDACTA IV (Amazon ACC) covers the north region, comprising the Amazon forest. The CINDACTAs control traffic of en route aircraft, when the airplane is above 19,500 feet. During the approach to airports, air traffic is controlled by the Approach Control Centers (APPs), and the final descent is controlled by the airport’s Tower Control Center (TWR).

[Figure 1: actor map arranged in four levels: 1. Government, Legislation & Budgeting; 2. Regulatory Bodies and Associations; 3. Companies; 4. Operators.]

Fig. 1. Generic actor map of Brazilian ATM system.

In each CINDACTA, there are two air traffic control systems in two different control rooms:

– Military Operations Centers (COPM) that handle flight control of military aircraft that are in military operation, operated by military controllers, under military rules.

– Area Control Centers (ACCs), responsible for the air traffic control of aircraft (civilian and military) flying under the general aviation circulation rules, operated by civilian and military controllers, under civilian rules.

Military controllers operate both centers. The integration of air space and air defense control in Brazil – an option that is not used in most countries – enables the use of the same resources for communication, detection, surveillance, control and early warning for air traffic control and for air space defense purposes.

4. The facts – what happened?

The collision was a very brief event. About 1 hour elapsed from the moment when flight N600XL, the Embraer E-145 Legacy jet, entered the air space controlled by the Brasilia ACC (CINDACTA I), until the collision with flight GLO1907, the Boeing 737-800, in the surveillance intersection zone between the Amazon and Brasilia control centers. Fig. 2 shows the collision path. Flight N600XL, even with some damage, was able to land at a military base airport.

Fig. 3 shows a time-line with the main events before and after the collision. Flight N600XL, an Embraer Legacy executive jet, took off from São José dos Campos airport (in São Paulo state) at 14:30 (Brasilia time) to Eduardo Gomes airport in Manaus (in Amazonas state) with a crew of two (pilot and co-pilot) and 4 passengers. Flight N600XL’s plan stipulated two level changes (see Fig. 3). However, the flight level stipulated for the first leg (until Brasilia) of the flight, 370 (or 37 000 ft), was reached at 15:33 and was maintained up to the moment of the collision.

Flight GLO1907, the Boeing 737-800, took off from Eduardo Gomes airport in Manaus to Brasilia International Airport at 15:35. At 15:58, it reached the 370 flight level in the UZ6 airway (a dual lane airway with 1000 ft of vertical separation between lanes), and maintained this level, as stipulated by its flight plan, up to the collision moment.

The last successful bilateral contact between the N600XL and the Brasilia ACC happened at 15:51. At 15:55, the N600XL flew over the Brasília VOR vertical line and entered the UZ6 airway (the same as flight 1907, but in the opposite direction), without requesting or receiving any instruction from the Brasilia traffic control center and keeping the 370 flight level. At 16:02, the Brasilia ACC lost the secondary surveillance radar (SSR) information about the N600XL, which presents accurate altitude information to the traffic controller. At 16:30, there was a 2 min loss of primary radar contact with the N600XL, which transmits the aircraft’s geographic position to the controller. No contact was attempted between 15:51 and 16:26 by either the N600XL or the Brasilia traffic control center. From 16:26 to 16:53, the Brasilia ACC made seven unsuccessful call attempts. At 16:38, the Brasilia center definitively lost primary radar contact with the N600XL (it should have been transferred to the Amazon control center).

The N600XL, at 16:48, began a series of 12 call attempts to the Brasilia ACC. At 16:53:39, the N600XL was able to hear the last (unilateral) call by the Brasilia center, instructing the N600XL to call the Amazon ACC, but the crew was not able to copy the frequencies provided. At 16:53:57, the N600XL radioed the Brasilia center requesting repetition of the decimals of the first frequency, because it had not been able to copy these values. The Brasilia center did not receive this message. After that, the N600XL made seven more unsuccessful call attempts to the Brasilia center from 16:54 to 16:57.
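To make the intervals in this sequence easier to follow, the short sketch below computes how long before the collision each reported event occurred. The event names in `EVENTS` and the helper functions are illustrative, not part of the investigation data; times given in the text only to the minute are assumed to fall at zero seconds, and the collision time is the 16:56:54 reported for the accident.

```python
from datetime import datetime

# Event times (Brasilia time) as reported in the text.
# Assumption: events given only to the minute are taken at :00 seconds.
EVENTS = {
    "last_bilateral_contact": "15:51:00",
    "entered_uz6_airway": "15:55:00",
    "ssr_information_lost": "16:02:00",
    "first_acc_call_attempt": "16:26:00",
    "primary_radar_lost": "16:38:00",
    "first_n600xl_call_attempt": "16:48:00",
    "unilateral_call_heard": "16:53:39",
    "collision": "16:56:54",
}


def seconds_between(start: str, end: str) -> int:
    """Seconds elapsed from start to end, both 'HH:MM:SS' on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


def gaps_to_collision() -> dict:
    """Map each pre-collision event to its lead time (s) before the collision."""
    return {
        name: seconds_between(time, EVENTS["collision"])
        for name, time in EVENTS.items()
        if name != "collision"
    }


if __name__ == "__main__":
    for name, gap in gaps_to_collision().items():
        print(f"{name}: {gap // 60} min {gap % 60} s before the collision")
```

Under these assumptions, the last bilateral contact preceded the collision by 65 min 54 s, and the SSR information was lost 54 min 54 s before it.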

[Figure 2: map of the UZ6 airway between Brasília and Manaus, showing the TERES and NABOL positions, the Legacy and Boeing paths, and the surveillance intersection zone between the Amazon and Brasilia control centers.]

Fig. 2. The mid-air collision occurred approximately 20 km northwest of Nabol, at flight level 370 (37 000 ft), geographic coordinates 10° 44′ S/053° 31′ W, at 16:56:54 Brazilian time. Both aircraft were flying in the UZ6 airway in opposite directions at the same altitude when the left wing of the Legacy, flight N600XL to Manaus, collided with the left wing of the Boeing 737, flight GLO1907 to Brasilia (source: Ferreira, 2006).

The mid-air collision occurred when both aircraft were in the UZ6 airway, flying in opposite directions, in the same lane – FL370 or 37 000 ft – at 16:56:54. With the collision, flight 1907 became uncontrollable, immediately going into a dive until it crashed into the ground, causing the death of all passengers and crew members (154 people), while flight N600XL was still able to fly and succeeded in making an emergency landing at the Brigadeiro Veloso military air base.

5. The analysis – how and why it happened

The commander of the Department of Air space Control (DECEA), responding to a senator’s question in the senate public hearing after the accident, said: ‘‘I am as puzzled as you Sir. A thing like this is impossible to happen.’’

5.1. Preliminary investigation questions

The preliminary report prepared by CENIPA (Ferreira, 2006) indicated the following important things that did not happen in this accident:

• There was no loss of radar surveillance between the Amazon ACC (CINDACTA IV) and flight 1907, until its transference to the Brasilia ACC.
• There is no evidence in the communication records of any N600XL request to the air traffic control centers to change its flight level, after having reached the 370 flight level.
• There is no registered evidence regarding any instruction for the N600XL to change its flight level coming from air traffic control, after the last successful bilateral contact (15:51) between this aircraft and Brasilia center.
• There is no registered evidence of any traffic alert alarm or instruction for evasive action to the respective crews to avoid collision in the TCAS systems, existing in both aircraft.
• There is no registered evidence of any manifestation in either crew related to any possible visual perception of the approaching aircraft.
• There is no attempt for action or evasive maneuver, according to the data existing in flight recorders.

Looking through the list, it seems that the entire system (people, that is, pilots and controllers, as well as technology and organization) was not aware that two aircraft were flying in the same lane and in opposite directions.

Fig. 3. Time-line with the main events before and after the collision (position and time versus flight level, for flights N600XL and 1907).

Table 1
Dialog between TWR controller and Legacy N600XL pilots – dialog in English.

Time | Operator | Communications
14:26:40 | Legacy pilot | Sao Jose ground november six zero zero x-ray lima.
14:26:47 | TWR controller | November six zero zero x-ray lima go ahead.
14:26:51 | Legacy pilot | Yes sir (.) start engines.
14:26:59 | TWR controller | Er, did you request, er, about weather?
14:27:02 | Legacy pilot | Yes sir, weather and runway.
14:27:05 | TWR controller | Roger. Er, Sao Jose operating under visual conditions, ceiling five thousand feet, visibility one zero kilometers, runway in use one five, wind two two zero degrees, eight knots, quiueneiti one zero one nine, temperature two zero, time check two five.
14:27:37 | Legacy pilot | Thank you.
14:31:46 | Legacy pilot | Ground, november six zero zero x-ray lima like to have push back for taxi.
14:32:02 | Legacy pilot | Ground, november six zero zero lima, x-ray lima, like to give ready, clear to push for taxi.
14:32:10 | TWR controller | Ah, november six zero zero x-ray lima, er, clear to start up, temperature two zero. Er, are you ready to taxi?
14:32:24 | Legacy pilot | Yes sir, we’ll be in turn right now (.) to the taxi back.
14:32:31 | TWR controller | Er, report ready for taxi.
14:32:34 | Legacy pilot | Report ready to taxi, six hundred x-ray lima.
14:40:31 | Legacy pilot | Sao Jose ground, november six zero zero x-ray lima ready to taxi.
14:40:38 | TWR controller | Er, roger. Er, maintain position, november six zero zero x-ray lima.
14:40:44 | Legacy pilot | November six zero zero x-ray lima maintaining position.
14:41:50 | TWR controller | Are you ready to copy the clearance?
14:41:53 | Legacy pilot | Ah, affirmative, yes.
14:41:57 | TWR controller | November six zero zero x-ray lima, ATC clearance to Eduardo Gomes, flight level three seven zero, direct Poços de Caldas, squawk transponder code four five seven four. After take-off perform OREN departure.
14:42:26 | Legacy pilot | Okay sir, I get the runway one five to so. ah SBEG, flight level three seven zero, I didn’t get the first fix, I get squawk four five seven four, OREN departure.

Source: CPI final report (Camara dos Deputados, 2007).

5.2. Some official explanations

Charles Perrow pointed out that conventional explanations for accidents are “operator error, faulty design or equipment, lack of attention to safety features, lack of operating experience, inadequately trained personnel, failure to use the most advanced technology, and systems that are too big, under financed, or poorly run” (Perrow, 1984, p. 63). The conclusions of the Senate Public Hearing report confirm Perrow’s view and lean toward faulting the operators: “(...) it is possible to learn that several factors contributed to the accident: possible technical failures, pilot failures and air traffic controller failures. However, analyzing the event, I conclude that the human factor is the main cause. Eliminating the human errors from the causal chain, the accident would never have happened (...)” (Senado Federal, 2007, p. 61).

We want to go beyond these obvious explanations to consider the complexity of the events, understanding the details of each situation (how and why things happened). Our aim is to uncover the underlying cognitive and organizational dynamics that enabled the emergence of the tragedy, using the rich set of data made available by the accident investigations as a window onto a more fundamental understanding of the ATC system’s normal functioning. To do so, in the following sections we discuss HOW and WHY the safety barriers against mid-air collisions described in Section 4 eroded enough to allow the emergence of this accident.

5.3. The controlled air space

The controlled air space, the first safety barrier, is an abstract construction that aims to create “airways” in space by prescribing routes, altitudes, and directions to be followed by air traffic. These airways are represented on aeronautical navigation charts and should be followed by pilots and controllers. The flight plan is the artifact that enables pilots and controllers to virtually construct a flight, allocate a portion of the controlled air space to it, and then discipline the flights and air traffic control.

In this accident, the flight levels prescribed in flight N600XL’s flight plan were not followed. How could this happen? We will use the investigation findings and human decision-making theories to explain how normal pilots’ and controllers’ cognitive behaviors led to this unwanted system outcome.

5.3.1. Pilots’ behavior

Fig. 3 includes a representation of the prescribed altitudes of flight N600XL’s approved flight plan. Starting in Sao Jose dos Campos, it passed through Poços de Caldas, in the UW2 airway at flight level 370 (37 000 ft) until Brasilia, where it would drop to flight level 360 (36 000 ft) and enter the UZ6 airway. The flight plan called for another altitude change at the Teres (virtual) notification point, after which the aircraft would continue in the UZ6 airway, but at FL380 (38 000 ft). The UZ6 is a dual lane airway with 1000 ft vertical separation distances, in which the odd flight levels (370, 390, 410) are used for north-south navigation (Manaus to Brasilia), and the even flight levels (360, 380, 420) are used for south-north navigation (Brasilia to Manaus). According to the CPI final report (Camara dos Deputados, 2007), the submitted flight plan was approved without modification by the Brasilia ACC.
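The lane rule of the UZ6 airway can be illustrated with a minimal Python sketch. It is not part of any ATC software: the names (`UZ6_LEVELS`, `N600XL_UZ6_PLAN`, `level_matches_direction`) are hypothetical, and the level sets are simply those quoted in the paragraph above.

```python
# Illustrative sketch only: the level sets are those quoted in the text,
# and every name here is hypothetical (not taken from any ATC system).
UZ6_LEVELS = {
    "north-south": {370, 390, 410},  # Manaus to Brasilia
    "south-north": {360, 380, 420},  # Brasilia to Manaus
}

# UZ6 legs of flight N600XL's approved plan, as described in the text.
N600XL_UZ6_PLAN = [("Brasilia to Teres", 360), ("after Teres", 380)]


def level_matches_direction(flight_level: int, direction: str) -> bool:
    """Return True if flight_level belongs to the lane for this direction."""
    return flight_level in UZ6_LEVELS[direction]


if __name__ == "__main__":
    # Both planned UZ6 legs use levels of the south-north lane.
    for leg, level in N600XL_UZ6_PLAN:
        print(leg, level, level_matches_direction(level, "south-north"))
    # FL370 belongs to the opposite (north-south) lane.
    print("FL370 south-north:", level_matches_direction(370, "south-north"))
    print("FL370 north-south:", level_matches_direction(370, "north-south"))
```

Under these assumptions, both planned UZ6 legs match the south-north lane, whereas FL370 matches only the north-south lane.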

Flight N600XL’s pilots had never flown in Brazil before September 29, when they came to Sao Paulo to fly the brand new Embraer Legacy aircraft just acquired by American Excelaire. The flight plan they were using was both prepared and submitted by Embraer to the air traffic control center for approval. This flight plan preparation procedure (a normal way to prepare flight plans) did not require the pilots to look at Brazilian airways on the local aeronautical charts to configure their flight. As a consequence, some of the details of the plan, and in particular the situation regarding the UZ6 airway, may not have come to the pilots’ attention.

The dialog in Table 1 shows the communications between flight N600XL’s pilot and the TWR controller in Sao Jose (SP) just before departure.

The dialog in Table 1 seems to be a normal departure communication between a tower controller and a pilot. In fact, we note that, in line with the basic verbal protocol rules, there are communication feedbacks and redundancies: flight N600XL’s pilot repeated the authorizations received. However, the air traffic controller did not communicate the complete flight plan (level changes at Brasilia to 360 and at Teres to 380), saying only “(...) ATC clearance to Eduardo Gomes, flight level three seven zero (...)”, and the pilot did not challenge the clearance by asking for its limits, that is, how far he should fly at level 370. The implication was that he would fly at FL370 all the way to Eduardo Gomes airport, in Manaus (against the flight plan information).


From this dialog, we conclude that the N600XL pilots had two conflicting sets of information at the moment of departure: 1) the flight plan with its level changes, and 2) the ATC clearance communication that mentioned only FL370. Given the conflicting information (inputs) they had, why did the pilots not evaluate and compare the inputs and query the controller?

Traditional research in decision-making has developed normative models for human decisions that involve the following steps: first, generate a range of options; then generate a set of criteria for evaluating these options; assign weights to the evaluation criteria; rate each option on each evaluation criterion; perform calculations; and finally, compare the options to determine the best choice (Doherty, 1993). If people made decisions during their normal work according to this model, we could regard the pilots’ actual behavior as completely unusual or abnormal (and probably very unlikely to happen again with other pilots), because their decision process did not follow any of the normative steps described.
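To make the normative steps concrete, the following is a minimal sketch of a weighted-sum decision matrix in Python. It is only one possible reading of the model summarized above. The options, criteria, weights and ratings are hypothetical numbers invented for illustration; they are not data from the accident investigation.

```python
def weighted_scores(ratings, weights):
    """Return {option: sum of weight * rating over all criteria}."""
    return {
        option: sum(weights[criterion] * score for criterion, score in scores.items())
        for option, scores in ratings.items()
    }


def best_option(ratings, weights):
    """Compare the options and return the one with the highest weighted score."""
    scores = weighted_scores(ratings, weights)
    return max(scores, key=scores.get)


# Hypothetical weights and ratings (1 to 5, higher is better), for illustration only.
WEIGHTS = {"safety": 3, "effort": 1}
RATINGS = {
    "query the controller": {"safety": 5, "effort": 2},
    "accept the clearance as heard": {"safety": 2, "effort": 5},
}

if __name__ == "__main__":
    print(weighted_scores(RATINGS, WEIGHTS))
    print(best_option(RATINGS, WEIGHTS))
```

The point of the sketch is the amount of explicit enumeration and calculation the normative model requires, which is what the descriptive models discussed next argue operators rarely perform in real work.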

However, situated or naturalistic human decision-making models that identify deviations from the norms, e.g. Prospect Theory (Tversky and Kahneman, 1974), suggest that decision-makers generally apply a wide range of heuristics, even when these result in sub-optimal performance. Heuristic-based theories have been superseded by descriptive models emphasizing the phenomena themselves, without reference to abstract norms, such as Rasmussen’s model of Cognitive Control (Rasmussen, 1983), Klein’s Recognition-Primed Decision-Making (RPD) model (Klein, 1993) and, more recently, the efficiency-thoroughness trade-off, or ETTO principle (Hollnagel, 2004). In such models, rather than making a concurrent evaluation of the relative advantages and disadvantages of several courses of action (the normative approach), decision-makers in actual situations select a course of action that is generated through heuristics and recognition. A situation similar to a previous experience (a familiar situation) is evaluated for its adequacy in the particular set of circumstances of the present situation. There are many empirical findings that even in safety-critical systems, for instance nuclear systems (Carvalho, 2005; Carvalho et al., 2005; Carvalho et al., 2006), experienced operators use pattern recognition and simple heuristics to make decisions during their daily work in operating the system.

Flight N600XL’s pilots’ behavior (not querying the ATC controller, not searching for more information) indicates that they decided based on the recognition that the clearance they received from ATC was a familiar situation: a flight plan changed by ATC authorities. Given also their lack of familiarity with Brazilian air space, the pilots did not have any doubt about the new flight plan they got after the ATC clearance communication. The pilots confirmed that they had no doubts in an interview for the Brazilian newspaper Folha de Sao Paulo on Feb. 19, 2007. The pilot said, “It is common for there to be differences, it happens all the time. You have to fly according to the authorization. The actual flight plan is the clearance that you receive from the control center.”, and the co-pilot, “As he said, it happens all the time, we have a flight plan to fly at one altitude and we are authorized to fly at another one. Let me say that this happens 99% of the time. The flight plan is just a proposal.”

This behavior can also be explained by the belief-bias effect, which occurs in reasoning when people make judgments based on prior beliefs and general knowledge, rather than on the rules of logic and the information available (Evans, 2004; Quinn and Markovits, 1998). In general, people are likely to make inadequate choices when the logic of a reasoning problem (conflicting information about the flight plan) is not supported by their background knowledge (flight plan changes happen all the time) (Holyoak and Simon, 1999).

Thus, an unwanted system outcome, an aircraft flying at a wrong level in a dual lane airway, was made possible by normal behaviors of the pilots: normal in Perrow’s sense, that is, not desired or expected, but a perfectly possible outcome of the system.

5.3.2. Air traffic controller behavior

On the other side, the air traffic controller gave a flight authorization in which only the level of the first leg (FL370) was communicated, for a route that included two other level changes. The ICA 100-12 publication (DECEA, 2006) regulates the air traffic services in Brazil. Chapter 8, Area Control Services, defines the rules to be applied for ATC services. According to these rules, the submitted flight plan can be different from the actual plan, and the controller must indicate the various levels of the route, or the limit for the route authorization he gave. In the ICA 100-12 there are the following definitions:

• Flight plan – specific information, related to a planned flight or part of a flight of an aircraft, supplied to the air traffic services.
• Submitted flight plan – flight plan such as it is submitted by the pilot, or his representative, to the air traffic services.
• Current flight plan – actual flight plan, including the modifications (if they were necessary) made by air traffic services.

Based on these rules and on the controllers’ behavior, the Congressional Hearing Report concludes: “(...) a message with partial authorization for the flight of an aircraft is a procedure without any normative support” (Camara dos Deputados, 2007, p. 58).
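The requirement summarized above (the controller states the various levels of the route, or else the limit of the authorization given) can be expressed as a simple check. The sketch below is a paraphrase of the rule as described in this section, not the wording of ICA 100-12, and the function and parameter names are hypothetical.

```python
from typing import Optional, Sequence


def clearance_is_complete(
    planned_levels: Sequence[int],
    stated_levels: Sequence[int],
    level_limit: Optional[str] = None,
) -> bool:
    """Return True if the clearance states every planned level, or names
    the point up to which the stated level applies (assumed reading of
    the rule as summarized in the text)."""
    states_all_levels = list(stated_levels) == list(planned_levels)
    return states_all_levels or level_limit is not None


if __name__ == "__main__":
    plan = [370, 360, 380]  # levels of flight N600XL's plan, from the text
    # Only FL370 stated and no limit for that level: incomplete.
    print(clearance_is_complete(plan, [370]))
    # Hypothetical alternatives that would satisfy the check.
    print(clearance_is_complete(plan, [370, 360, 380]))
    print(clearance_is_complete(plan, [370], level_limit="Brasilia"))
```

With the levels of flight N600XL’s plan, a clearance that mentions only FL370 and gives no limit for that level fails this check, which is consistent with the “partial authorization” described in the report.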

The procedure used for flight plan authorization has many steps. First, the plan is submitted in the Air Information Service Room, located in the departure airport. In this room, a Sergeant specialized in aeronautical information (not a flight controller) receives and checks the submitted plan. Next, the plan is presented to another Sergeant, specialized in communications, who inputs the flight plan data into the computerized system and sends these data to the regional ACC, where a flight controller in the Flight Plan Room checks the proposed route, comparing it to the other flight plans in the region. If it is OK, he/she confirms the insertion of the flight plan data in the system and sends the “Traffic Authorization or Clearance Delivery” by an electronic message to the departure airport. Finally, the TWR controller at the departure airport, responsible for the “traffic authorization”, uses a private telephone line (called the “hot line”) to call the ACC controller and confirm the flight plan information for the clearance delivery. Having received the ACC’s confirmation, the TWR controller radios the “Authorized Flight Plan” to the flight’s pilot.

The last internal step of this procedure, the final communication between flight controllers (the Sao Jose TWR controller and the Brasilia ACC) to confirm flight N600XL’s information, is transcribed in Table 2.

In this dialog, which happened about 10 min before the clearance communication between the TWR controller and the pilot presented in Table 1, the controllers’ verbalizations refer only to level 370. In both dialogs, there are no explicit verbalizations regarding the complete N600XL flight plan. In all conversations involving the pilots, the TWR controller in Sao Jose, and the ACC controller in Brasilia, the level changes in N600XL’s flight plan were not verbalized. Therefore, judging from this dialog and from the dialog between the TWR controller and the pilot, the controllers did not follow the DECEA prescriptions.

The easiest way to address this situation is to consider that we have a human error. Doing so, the problem can be confined to those specific controllers, who behaved in a completely different way from the other controllers working in the system.

However, from a systemic point of view, this can also be viewed as normal behavior. On long routes, with many level changes, the ATC controller may assume that pilots are aware of the need for level changes during the flight. Controllers may also think that there will be other communication opportunities later on, when the aircraft reaches the clearance limits (in this flight the first one was over Brasilia), at which time the pilots will receive the information regarding the other legs of the plan.

Table 2
Dialog between TWR and ACC controllers – dialog in Portuguese, translated to English.

Time | Operator | Communications
14:33:33 | Brasilia ACC controller | Talk Sao Jose
14:33:35 | Sao Jose TWR controller | Hi Brasilia. The November six zero zero x-ray lima to Eduardo Gomes Sao Jose Eduardo Gomes ... requesting level three seven zero.
14:33:50 | Brasilia ACC controller | ... Level three seven zero transponder four five seven four Poços de Caldas.
14:33:55 | Sao Jose TWR controller | Three seven zero direction Poços. What is the frequency he calls you there?
14:33:59 | Brasilia ACC controller | One two six fifteen ... one three three five
14:34:04 | Sao Jose TWR controller | One three three five three seven zero direction Poços. OK
14:34:09 | Brasilia ACC controller | Bye bye
14:34:10 | Sao Jose TWR controller | Bye bye

Source: CPI final report (Camara dos Deputados, 2007).

Both behaviors, the pilots’ and the controllers’, can be explained by the efficiency-thoroughness trade-off, the ETTO principle (Hollnagel, 2004). Pilots and controllers use common and simple heuristics such as “someone will communicate this later”, “there is no need to pay attention to that now”, “flight plans can always be changed” that characterize the ETTO principle. Initially conceived to explain coping strategies that reduce cognitive task demands (e.g. time pressure, information overload etc.), the ETTO principle is currently viewed as a typical strategy to cope with complexity in a long-term perspective (Hollnagel and Woods, 2005). Using ETTO heuristics it is possible to reduce cognitive effort and keep spare capacity for emergency situations, making a trade-off before it is objectively required by the current situation. Thus, the choice of a coping strategy not only represents a short term or temporary adjustment but may equally well indicate a more permanent way of doing things in the system’s functioning. The risks inherent in the use of these strategies are obvious, as can be seen from their part in the explanation of this accident.

We argue that the controlled air space as a cognitive system (a system that involves people, technology and organization) may be a weaker safety barrier than we usually expect, enabling unwanted system outcomes that may result in system accidents, depending on how the system is functioning daily. Due to the limitations inherent in analyzing only one case, we are not able to know how frequently the ETTO based strategies are actually being used in the entire Brazilian ATC system. However, if they are frequent we may have a serious safety problem, since the first barrier against mid-air collisions may not be working properly, characterizing a sociotechnical drift-to-failure situation (Snook, 2000; Rasmussen and Svedung, 2000). In this particular case, pilots and air traffic controllers, all acting normally (albeit inadequately in hindsight), interacted with each other in a way that to all appearances, in foresight, was normal, but allowed a mid-air collision to occur.

5.4. The TCAS functioning

Another barrier used to avoid mid-air collisions is the Traffic Alert and Collision Avoidance System (TCAS). Both aircraft involved in this accident were equipped with TCAS. According to Brazilian flight regulations, before entering air space with reduced vertical separation minimum (RVSM space, where the vertical separation distance is 1000 ft), the pilot must check (besides other equipment) that the transponder is working normally in Mode C or S. In Mode C, the transponder reports its code and the aircraft’s altitude with good accuracy. In Mode S, in addition to the Mode C information, the transponder sends other flight parameters such as speed, direction etc. The TCAS requires the transponder function to operate, and goes into standby mode (effectively off) whenever the transponder is not enabled.
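The dependency between transponder and TCAS described above can be sketched as a small state model. This is an illustration of the relationships stated in the text only, not of any avionics implementation; the enum, the function names and the label “ACTIVE” are assumptions introduced here.

```python
from enum import Enum


class TransponderMode(Enum):
    STANDBY = "STANDBY"  # transponder not enabled
    MODE_C = "C"
    MODE_S = "S"


def reported_fields(mode: TransponderMode) -> list:
    """Information sent by the transponder in each mode, per the text."""
    if mode is TransponderMode.STANDBY:
        return []
    fields = ["code", "altitude"]  # Mode C content
    if mode is TransponderMode.MODE_S:
        fields += ["speed", "direction"]  # additional Mode S parameters
    return fields


def tcas_state(mode: TransponderMode) -> str:
    """TCAS goes into standby whenever the transponder is not enabled."""
    return "STANDBY" if mode is TransponderMode.STANDBY else "ACTIVE"


def ready_for_rvsm(mode: TransponderMode) -> bool:
    """Pre-entry check: transponder working in Mode C or Mode S."""
    return mode in (TransponderMode.MODE_C, TransponderMode.MODE_S)


if __name__ == "__main__":
    for mode in TransponderMode:
        print(mode.name, reported_fields(mode), tcas_state(mode), ready_for_rvsm(mode))
```

The sketch makes one design consequence visible: a single upstream state (transponder not enabled) silently disables the downstream collision-avoidance function.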

One of the most puzzling notes of the preliminary accident investigation report is that “There was no attempt for action or evasive maneuver, according to the data existing in flight recorders” (Ferreira, 2006), indicating that TCAS did not give the TA (Traffic Advisory) or RA (Resolution Advisory) alarms to the pilots. According to the time-line presented in Fig. 3, Brasilia ACC failed to receive any transponder replies from flight N600XL for approximately the last 50 min of the flight before the collision.

Further investigations indicated that the transponder and TCAS of flight GLO1907 were working properly during the accident, and that the transponder of flight N600XL was not functioning (it was in the OFF mode) at the moment of the collision. Evidence that flight N600XL’s transponder was in the OFF mode came from several sources:

– The lack of N600XL transponder contact with the Brasilia ACC;
– The lack of N600XL transponder contact with the Amazon ACC;
– The dialog between N600XL pilots registered in the CVR (Cockpit Voice Recorder), just after the collision.

This dialog is presented in Table 3.

Clearly, the Transponder/TCAS system of the N600XL aircraft failed. However, even after extensive tests, teardowns and simulations at the aircraft manufacturer and at the transponder/TCAS manufacturer, the reason why the transponder was set in the STAND BY mode (turned off) remains “unexplained” to this day. Three main possibilities were investigated: 1) hardware/technical failure, 2) pilots deliberately turned off the transponder, and 3) pilots accidentally turned it off with an unintentional slip. Regarding the hardware/technical failure, all post-accident information made public up to this moment (the investigation has not finished yet) indicates that the critical transponder components were operational. The second possibility, a professional crew willingly switching off such an important piece of equipment in RVSM space, would have been a basic procedure violation (Reason, 1997). Such behavior would simply be inadmissible for any professional crew and has no support in the flight data collected (Camara dos Deputados, 2007). The last possibility, pilots accidentally turning off the TCAS with an unintentional slip, is also not supported by the investigation findings. The only known way to put the transponder in STAND BY mode is to press the button on the Radio Management Unit (RMU) control screen twice, within less than 20 s, as shown in Fig. 4, which could not be attributed to a slip.

In his testimony to the Senate Congressional Hearing, the official responsible for the Brazilian aeronautic accident investigation said, “In accordance with the description documents and its certifications, the transponder and TCAS systems did not present design or integration errors. They functioned as they had to function. (...) We now focus our investigation on the operational factor, or either, in the relation of the operation of... of the human being with that system”.

In another part of this testimony, he added some new information related to the “operational factor”: “The transponder stopped functioning. We did all the tests to try to exclude the possibility of a technical failure in the transponder. We did not find technical malfunctions. The equipment functioned according to its specifications. Therefore, we are focusing on the operational factor. What I have to say at this moment is that I have no indication that the pilots turned off the transponder intentionally. I do not have, looking at the CVR, I have nothing that indicates an action related to this equipment.” (Camara dos Deputados, 2007).

Table 3
Dialog in the Legacy cockpit at the moment of the collision – dialog in English.

Time | Operator | Communication
16:56:38 | Co-pilot on Legacy radio | Brasilia, November six zero zero X-ray Lima.
16:56:50 | Co-pilot on Legacy radio | Brasilia, November six zero zero X-ray Lima.
16:56:54 | Cockpit microphone | Impact sound
16:56:56 | Pilot, Co-pilot | Uh oh
16:56:56 | Cockpit microphone | Automatic pilot (.)
16:59:08 | Co-pilot | This.
16:59:12 | Pilot or Co-pilot | Deep breath sound
16:59:13 | Co-pilot | Man, are you with TCAS on?
16:59:15 | Pilot | The TCAS is off
16:59:25 | Co-pilot | All right, pay attention only at the traffic. We will succeed, we will succeed we will succeed. I know that.
Note: Just after the collision, the co-pilot started a visual flight (“pay attention only at the traffic”), which makes sense only in the absence of the TCAS/transponder. After that, when the pilot set the emergency code 7700, the transponder worked well.
17:02:06 | Pilot | I will set to seven thousand and seven hundred. It is an emergency!
17:02:08 | Co-pilot | Yes, set it up.
Note: The following dialog presents some comments of the crew just after the collision, which show some issues about the crew’s rationale.
17:28:36 | Pilot | So. if we beat in somebody. I want to say, we were in the right altitude.
17:28:39 | Co-pilot | Well, they were trying to give the frequency and I was trying to answer them. I only got three numbers. I did not get the last two, then I was calling all the frequencies.
17:28:47 | Co-pilot | They were probably trying to induce us to go down. But, probably, we were not in the radar. Or they ... (fucks), and they did not make.
17:28:54 | Passenger (Embraer employee) | At any moment we receive a clearance to leave this altitude. Then I left in the altitude.
17:28:59 | Co-pilot | The guys forgot us. Previous frequency forgot us completely. And I started to risk. This is not right. I was without speaking with somebody for too much time.
17:29:11 | Co-pilot | We are alive.
17:29:13 | Pilot | But, I am worried about the other airplane. If we beat in another airplane (.) whatever more this can be?

Source: Camara dos Deputados, 2007.

Fig. 4. Legacy cockpit and RMU units with the transponder deactivation procedure and the indication of the TCAS in STAND BY on the RMU (source: Camara dos Deputados, 2007).

However, the transponder was in OFF mode and TCAS was in STAND BY mode for 50 min before the collision. The question that remains is: why did the pilots not perceive that these important pieces of equipment were not functioning properly? The indications in the cockpit that the transponder is switched off, and that as a consequence the TCAS is in STAND BY mode, are difficult to perceive, as shown in Fig. 5, because they are not displayed in the standard failure colors, and there is no alarm signal to draw the pilots’ attention. The indication that the TCAS is in standby mode appears on the Radio Management Unit (RMU) as a small indication in green letters inside a yellow window, just underneath the selected transponder code. There are two other indications of the TCAS operation mode: on the right side of the Primary Flight Display (PFD) there is a message in small white letters indicating TCAS OFF, and on the left side of the Multi Function Display (MFD) (if this specific page is the selected one) the message is repeated in the same small white letters (see Fig. 5).

Problems with these indications had already been noted, back in 2005, by the European air traffic authorities, in a similar avionic system: “When this reversion to standby mode occurs, the ATC/TCAS standby mode is indicated on the RMU and Cockpit Displays (PFD/MFD), however these indications may not be apparent to pilots, especially during periods of high workload” (Irish Aviation Authority, 2005, p. 2). Note that, in spite of this 2005 observation that the indications may not be apparent to pilots, the same indication methods are still used in avionic systems.

There were other cases of spurious failures or automation surprises in Transponder/TCAS operation in a similar avionic system. At the end of 2003, the European ATC providers noted many transponder tracks lost for several minutes. An Embraer E-145 remained invisible in the busy European air space for more than 45 min before it was identified as an “unknown target” by French military control, and the first steps for intercepting the target were initiated. The inquiry identified a software problem in a particular transponder type, and the European Aviation Safety Agency (EASA) issued an Airworthiness Directive (AD) in August 2005. According to this AD, “A design deficiency causes the transponder to revert to standby mode if a change of the 4096 ATC code (also called the Mode A code) is not completed within 5 seconds. As a consequence, the SSR radar symbol and label associated with the aircraft’s position will no longer be shown on the ATC ground radar display. In addition, aircraft collision avoidance systems (ACAS) on board own and other aircraft will be compromised. Current operational procedures, typically, do not require the crew to recheck the transponder status after changing the 4096 ATC Code. This type of failure will increase ATC workload and will result in improper functioning of ACAS” (EASA, 2005, p. 2).
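The design deficiency quoted from the AD can be pictured with a toy timing model. The sketch below follows only the wording of the quotation (a code change not completed within 5 seconds causes a reversion to standby). It is not the actual transponder software, it says nothing about flight N600XL, and all names are hypothetical.

```python
CODE_ENTRY_LIMIT_S = 5.0  # time limit quoted in the AD text


def mode_after_code_change(entry_duration_s: float, mode_before: str = "ON") -> str:
    """Toy model: the unit reverts to standby if the code change was not
    completed within the limit; otherwise the mode is unchanged."""
    if entry_duration_s > CODE_ENTRY_LIMIT_S:
        return "STANDBY"
    return mode_before


def recheck_needed(entry_duration_s: float) -> bool:
    """A crew recheck would catch the reversion; the AD notes that procedures
    typically did not require one."""
    return mode_after_code_change(entry_duration_s) == "STANDBY"


if __name__ == "__main__":
    for seconds in (3.0, 5.0, 6.0):
        print(seconds, mode_after_code_change(seconds), recheck_needed(seconds))
```

In this simplified reading, a slow but otherwise correct code entry is enough to leave the unit in standby, with no additional crew action that would prompt a status check.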

Although the specific transponder failure mentioned above has already been corrected in the new versions of the transponder software, we can conclude this section by observing that, even though no catastrophic technical or operational failure was identified in flight N600XL’s transponder after more than a year of investigations in different organizations involving two countries, on September 29, 2006 an aircraft remained invisible for about 50 min in the Brasilia RVSM controlled air space, without a transponder signal. Therefore, we add normal equipment to the Dekker (2006) citation: in complex systems, accidents occur with “Normal people working in normal organizations, with normal equipment”, considering that the transponder functioned (and is still functioning) in a normal way.

To conclude this point, we argue that a search for failure in the transponder (mal)functioning or for human error in its operation (as traditional accident analysis does) would not be sufficient to explain why such an important safety-critical function (collision avoidance) remained out of operation without being noticed before the collision. As already pointed out in Section 5.3, the overall system function and the interactions that agents perform on a daily basis must be addressed to explain the mechanisms that allow the function to fail.

Fig. 5. Indications of TCAS OFF in the PFD (left) and MFD (right) (source: Camara dos Deputados, 2007).

Fig. 6. A schematic view of the Brasilia ACC radar screen. Labels shown in the schematic: the data block of flight N600XL (NX600XL, 197+370, 35 S017W), the Legacy target position symbol, and the labels ROCHO, TEXAS and BCO.

5.5. The Brasilia ACC and their communications with flight N600XL

After the flight plan misunderstanding and with the TCAS in standby mode, flight N600XL was flying in the UZ6 dual lane airway in the wrong direction near Brasilia. In this situation, the communication loop between flight N600XL’s crew and the Brasilia ACC became the last barrier to tragedy. The fundamental issues about the quality of the communication feedback loops, and how they can affect system safety, have already been addressed (see Carvalho et al., 2007). In the following sections, we describe how and why this last safety barrier did not function to avoid this accident.

5.5.1. What the Brasilia ACC controller actually saw

During the investigations, the representations of flight N600XL on the Brasilia ACC controller radar screen were reproduced. We use this information to show what the controller was actually seeing as the flight progressed. In Fig. 6, we show a radar screenshot of the Brasilia ACC. Flight N600XL, with its main flight data, is represented by the target symbol and a data block.

The meaning of the data block is explained in Fig. 7.

The aircraft altitude information is located on the second line of the data block. This line is composed of three segments: a three digit number, a symbol, and another three digit number. The three digits on the left side of the second line represent the aircraft’s altitude (usually the last surveillance radar information). The symbol between the numbers is a status indicator and specifies the type of altitude information displayed by the digits to its left (actual, estimated) and its immediate future development (stable, climbing, descending). A “=” (equals) symbol indicates level flight, a “+” (plus) symbol indicates a climbing aircraft, and a “−” (minus) symbol indicates a descending aircraft. The display of the =, +, or − symbols also provides visual confirmation to the controller that the aircraft’s transponder is providing Mode C or S altitude information to air traffic control. Modes C and S of the transponder report altitude data to air traffic control that is, in turn, displayed on the radarscope in hundred foot increments. The Z (capital letter Z) symbol indicates that the aircraft altitude is not transponder reported but is estimated by the primary radar. The three digits on the right hand side of the second line of the data block represent the altitude authorized by the flight plan for that particular flight segment.
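The reading rules for the second line of the data block can be sketched as a small parser. This is a minimal illustration, not the ATC system’s actual logic: the names `AltitudeLine` and `parse_altitude_line` are hypothetical, and the “?” symbol is taken from the Fig. 7 legend.

```python
import re
from dataclasses import dataclass

# Symbols that, per the text, confirm Mode C or S altitude from the transponder.
TRANSPONDER_SYMBOLS = {"=", "+", "-"}
STATUS_MEANING = {
    "=": "level flight (transponder altitude)",
    "+": "climbing (transponder altitude)",
    "-": "descending (transponder altitude)",
    "Z": "altitude estimated by primary radar",
    "?": "Mode C altitude momentarily lost",
}

# Three digits, one status symbol, three digits, with optional spaces.
_LINE = re.compile(r"^\s*(\d{3})\s*([=+\-Z?])\s*(\d{3})\s*$")


@dataclass(frozen=True)
class AltitudeLine:
    displayed_level: int  # left digits, in hundreds of feet
    status: str           # altitude status indicator
    planned_level: int    # right digits, flight plan level

    @property
    def transponder_reporting(self) -> bool:
        return self.status in TRANSPONDER_SYMBOLS

    @property
    def meaning(self) -> str:
        return STATUS_MEANING[self.status]


def parse_altitude_line(text: str) -> AltitudeLine:
    # Accept a typographic minus or en dash as the descending symbol.
    normalized = text.replace("\u2212", "-").replace("\u2013", "-")
    match = _LINE.match(normalized)
    if match is None:
        raise ValueError(f"unrecognized altitude line: {text!r}")
    level, status, planned = match.groups()
    return AltitudeLine(int(level), status, int(planned))
```

For example, `parse_altitude_line("370=360")` reports a transponder altitude, whereas `parse_altitude_line("370Z360")` does not: the only difference between the two displays is a single character.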

Each controller is responsible for the surveillance of specific sectors of the controlled air space that are displayed at his/her workplace. The sectors are displayed on the controller radar screen or radarscope. The screen of workplace 8, which controls sectors 7 (the sector of the collision), 8 and 9, at 15:55 (four minutes after flight N600XL entered sector 7) is presented in Fig. 8. The colors (the black background and white lines) were changed for better visualization on paper. On the radar screen, in addition to the target information (aircraft flying), the controller also has the flight plan information in the electronic flight strips, displayed in the vertical column on the right side of the screen.

Fig. 7. The meaning of the data block in radarscope.

Data block example:
– NX600XL: call sign;
– 197 + 370: current altitude (Mode C), altitude status indicator, and flight planned altitude;
– 35 S017W: ground speed (x10) and ACC sector ID.

Altitude status indicator:
– “+”: aircraft climbing (aircraft level from transponder signals, Modes C and S);
– “−”: aircraft descending (aircraft level from transponder signals, Modes C and S);
– “=”: level flight (aircraft level from transponder signals, Modes C and S);
– “Z”: primary radar estimating aircraft altitude;
– “?”: Mode C altitude momentarily lost.

Target position symbols in the legend:
– target detected by primary and secondary radar with velocity vector (Mode S transponder);
– target detected by primary and secondary radar (Mode C transponder);
– target detected only by primary radar, no transponder returns;
– notification points.

The first changes in flight N600XL’s data block occurred when the aircraft entered sector 7. It appeared to the controller as presented in Fig. 9 (with inverted colors).

At 16:01 flight N600XL’s transponder failed and the ACC screen changed automatically to that presented in Fig. 10.

Even without the transponder information, the N600XL data block remains on the screen because the system “understands” that the target detected by the primary radar is the same aircraft that earlier transmitted the identification code. The data block is thus maintained by correlating the target position with the information previously received from the secondary radar.

Under the Brazilian ATC regulations (ICA 100-12), when an aircraft’s transponder ceases to present the required response signal in the RVSM air space (the case of Brasilia Vertical line), the controller must ask the pilot to verify the functioning of the transponder. In addition, in the Reduced Vertical Separation Minimum (RVSM) air space, a vertical separation of 1000 ft is permitted only to appropriately equipped aircraft (i.e., those with, among other things, a functioning transponder that has Mode C and Mode S capabilities). With the loss of the transponder signal, ATC was required to suspend RVSM operations for flight N600XL and provide at least 2000 ft of vertical separation (Non-RVSM separation in the Upper Control Area is 2000 ft) between it and other traffic along its route of flight. This included flight 1907. Despite the importance that the loss of the transponder signal has in the regulations and in the controller procedures, there is no active warning signal delivered by the system. The controller must actively seek the information on the screen (the symbol changes in the aircraft data blocks) as described above.
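The separation rule described above can be written as a short check. This is a simplified sketch under a stated assumption: a functioning transponder is treated as the only RVSM equipment criterion, whereas the text notes it is one requirement among others. The function names are hypothetical.

```python
RVSM_SEPARATION_FT = 1000      # permitted only between appropriately equipped aircraft
NON_RVSM_SEPARATION_FT = 2000  # required once RVSM operations are suspended


def required_separation_ft(transponder_ok_a: bool, transponder_ok_b: bool) -> int:
    """Minimum vertical separation between two aircraft in the upper air space."""
    # Assumption: transponder status stands in for full RVSM equipage.
    if transponder_ok_a and transponder_ok_b:
        return RVSM_SEPARATION_FT
    return NON_RVSM_SEPARATION_FT


def vertically_separated(level_a: int, level_b: int,
                         transponder_ok_a: bool, transponder_ok_b: bool) -> bool:
    """Flight levels are given in hundreds of feet, as on the data block."""
    actual_ft = abs(level_a - level_b) * 100
    return actual_ft >= required_separation_ft(transponder_ok_a, transponder_ok_b)
```

Under this rule, two aircraft at FL360 and FL370 are separated while both transponders reply, but the same pair is no longer separated once one transponder stops replying, even though nothing about their flight paths has changed.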

On the three screens presented in Fig. 11, we see the primary radar indications with level variations, and the moment when flight N600XL’s target disappeared completely from the radar screen, before going out of sector 7 (still without a transponder signal). The flight data recorder confirms that flight N600XL remained level at Flight Level 370 until the collision. However, the information received from the height-finding radar estimated the aircraft at a different altitude with nearly every ten second sweep of the radarscope (at 16:29:58 the indicated level was 348). The estimated altitudes varied considerably not only from the aircraft’s actual altitude, but also from the flight’s planned altitude, depicted in the right-hand portion of the second line of the data block. When it was reaching the TERES notification point, there was an intermittent detection from primary radar, and flight N600XL disappeared from the controller screen at 16:30:08 and appeared again at 16:31:28, as shown in Fig. 11.

5.5.2. The communication problems

Table 4 summarizes the major events and communication attempts that occurred in the Brasilia ACC regarding flight N600XL.

The communication frequencies used in the controlled air space are divided according to the sector the aircraft is currently flying in and are described in the aeronautical charts. Each aircraft must set its equipment to a communications frequency on which it can receive and transmit voice communications. For its part, the ACC broadcasts on all frequencies activated for each sector. Therefore, its transmissions should be received by all aircraft flying in the region. Table 5 shows the frequencies used in sectors 7, 8 and 9, controlled by workstation 8 of the Brasilia ACC.

Fig. 8. A schematic view of the workplace 8 controller screen at 15:55 (source: Camara dos Deputados, 2007).

Based on the data presented in Tables 4 and 5, we note that the communication difficulties were related to the frequencies selected, not to flaws in the radio system. The first 6 attempts made by Brasilia ACC used the frequency of 125.05 and were not received by flight N600XL’s radio system. This probably occurred because flight N600XL was reaching the limit of the communication range of the sector 7 frequency and needed a new frequency. Indeed, in his second attempt, the controller tried to communicate a new frequency to flight N600XL’s crew. Flight N600XL’s first attempts used the 123.30 and 133.05 frequencies, which were not activated in workplace 8. Finally, the last and only successful communication occurred on the 135.90 frequency. However, at the moment of this communication flight N600XL was leaving sector 7 and entering the Amazon ACC region. The Brasilia ACC clearly registered the communications of flight GLO1907 with the Amazon ACC just before the collision, during ATC service transfer from the Amazon ACC to Brasilia ACC, on the 125.20 frequency, which indicated that the ATC radio system was working properly, in a normal way, at that moment.
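The mismatch between the frequencies the crew tried and those activated at the workplace can be expressed as a simple lookup. The sketch below uses only the sector 7 row of Table 5 and the identified crew attempts listed in Table 4; the function and constant names are hypothetical, and frequencies are kept as strings to avoid floating-point comparison issues.

```python
# Sector 7 row of Table 5 (MHz), as reported for workplace 8.
SECTOR7_ACTIVATED = frozenset({"135.90"})
SECTOR7_NOT_ACTIVATED = frozenset({"123.30", "128.00", "133.05"})

# Identified crew attempts from Table 4: two on 123.30 and one on 133.05.
CREW_IDENTIFIED_ATTEMPTS = ["123.30", "123.30", "133.05"]


def workplace8_status(frequency_mhz: str) -> str:
    """Classify a frequency against the sector 7 columns of Table 5."""
    if frequency_mhz in SECTOR7_ACTIVATED:
        return "activated"
    if frequency_mhz in SECTOR7_NOT_ACTIVATED:
        return "not activated"
    return "not listed for sector 7"


def attempts_on_activated_frequencies(attempts: list[str]) -> list[str]:
    """Return the attempts made on a frequency that workplace 8 was listening to."""
    return [f for f in attempts if workplace8_status(f) == "activated"]
```

Applied to the identified attempts, the lookup returns an empty list: every identified call was made on a charted sector 7 frequency that was not activated at workplace 8, which is consistent with the observation that the radios themselves were working normally.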

Fig. 9. Part of the sector 7 screen at 15:55. Two minutes before the Legacy reached the Brasilia VOR (VHF Omnidirectional Range), the system automatically changed the Legacy’s data block from 370=370 to 370=360, indicating the new flight level after Brasilia VOR contained in the filed flight plan. The = sign indicates secondary surveillance radar (transponder functioning) (source Camara dos Deputados, 2007). Panels: at 15:55:23 the Legacy data block reads NX600XL, 370=370, 45 S077W; at 15:55:28 it reads NX600XL, 370=360, 45 S077W.

Fig. 10. The Brasilia ACC system stops receiving replies to the surveillance radar interrogations of the Legacy’s transponder. In the controller scope at 16:01:53 the circle disappeared from the Legacy’s target position symbol, leaving only the + to represent the aircraft. In place of the = that previously showed the Legacy level at a reported Mode C altitude of Flight Level 370, the altitude status indicator began to display a Z, indicating that the three numerals to the left of the Z represent the aircraft’s estimated altitude derived from information supplied by ground-based height-finding primary radar (source Camara dos Deputados, 2007). Panels: at 16:01:43 the Legacy data block reads NX600XL, 370=360, 46 S077W; at 16:01:53 it reads NX600XL, 370Z360, 46 S077W.

5.5.3. The controllers’ inaction

As already noted long ago by Dailey (1984), “The central skill of the controller seems to be the ability to respond to a variety of quantitative inputs about several aircraft simultaneously and to form a continuously changing mental picture to be used as basis for planning and controlling the courses of the aircraft” (p. 134). Therefore, the aim of the air traffic controller is to construct a mental model that matches the dynamics and evolution of the air traffic situation. This mental model is constructed by a continuous comparison of the current air traffic situation with the anticipation of the situation in the near future, to plan required actions. Following Endsley (1995), this special type of mental representation is called Situation Awareness (SA). To achieve an adequate SA, the controller must understand the current air traffic situation by monitoring the radarscope and other information sources (flight plans, air charts, communication frequencies), and anticipate the future aircraft trajectories using sector maps, aircraft velocities and altitudes, flight strips, communication with pilots and co-ordination with other controllers. Therefore, the controller’s work activity requires active monitoring and control strategies to cope with workload variations, maintaining the SA that is paramount to ensure that the mental picture remains consistent with the actual situation.

The dominant paradigm of the post hoc analysis of human error in accidents indicates that many operational errors can be attributed to lack of SA or to SA problems (e.g. Jones and Endsley, 1996), and in this accident the investigations reached similar conclusions. According to the final report of the Senate investigation, the accident was caused by a chain of human errors (Senado Federal, 2007), including the ATC controllers’ failures in providing adequate air traffic services to flight N600XL to ensure that it was properly separated from other traffic (flight GLO1907 included). These failures were related to monitoring the radarscope over an extended period without perceiving the potential conflict situation, procedure non-compliance (not terminating RVSM operation for flight N600XL after the loss of Mode C transponder flight level information), failures in the co-ordination with other controllers (e.g. transfer of control between sectors should include any abnormal communications status or uncertainties about flight data; incomplete relief briefings during shift changeovers), and failures in taking immediate actions (e.g. to locate an aircraft which has been simultaneously or unexpectedly lost from radar and radio).

In fact, the main indications of “abnormal” behavior of flight N600XL can be summarized as:

– Actual flight level different from the planned flight level;
– Transponder ceased to reply to the ATC surveillance radar (2 signals: Z letter in the data block and target symbol change);
– Erratic level variation (FL331 to FL396);
– The disappearance of the aircraft inside sector 7 (before leaving sector 7).

We note that after the first automatic change in the radarscope (15:55), when flight N600XL entered sector 7, the radarscope indicated the loss of transponder returns (16:02) and differences between the actual and planned flight levels. The controller on duty, despite these screen changes, remained almost 22 minutes without taking any action. After the shift changeover at workstation 8 (16:17), the substitute controller took 10 min to try, without success, to contact flight N600XL’s crew.

Fig. 11. N600XL with a big level variation (348) due to the low accuracy of the primary height-finding radar. The DRD1451 flight indicated that the ATC system was able to receive transponder signals at that moment. At 16:30:08 the Legacy, still in sector 7, disappeared from the radarscope, indicating that it was flying without being detected by the primary and secondary radar systems. At 16:31:26 the N600XL appeared again on the screen. The data block is missing because it appeared as a non-correlated target (source Camara dos Deputados, 2007). Panels: 16:29:58 (NX600XL, 348Z360, 45 S077W; DRD1451, 380=380, 47 S077W), 16:30:08 and 16:31:28, with the labels EGOLA and TERES.


Table 4
Major events and communications.

Time                    Event
15:50:37                Legacy is leaving sector 5 of the Brasilia ACC to enter sector 7. It received instructions from the sector 5 controller to use the radio frequency of 125.05 in the new sector
15:51                   Legacy entered sector 7 of the Brasilia ACC
15:55:28                Planned flight level automatically (no controller action) changed to 360 (370=360)
16:01:53                Legacy transponder ceased to reply to the ATC’s secondary radar (370Z360 and +)
16:17                   Shift changeover in workplace 8 (sectors 7, 8 and 9)
16:26:51 to 16:34:08    Brasilia ACC controller tries to communicate with Legacy: 6 attempts using the frequency of 125.05 MHz
16:48:13 to 16:52:07    Legacy tries to communicate with Brasilia ACC controller: 12 attempts using several frequencies (2 attempts freq. 123.30, 1 attempt freq. 133.05 and 8 attempts with other non-identified frequencies)
16:53:38                Brasilia ACC contacts Legacy on the frequency 135.09, with instructions for Legacy to contact Amazon ACC on the frequency 123.32 or 126.45. However, Legacy did not get the numbers.
16:56                   The collision
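The intervals discussed in the text can be recomputed directly from the timestamps in Table 4. The sketch below is a small helper for that arithmetic; the function names are hypothetical, and times that Table 4 gives without seconds (16:17, 16:56) are assumed to fall on the full minute.

```python
from datetime import datetime


def to_time(stamp: str) -> datetime:
    """Parse an HH:MM or HH:MM:SS timestamp from Table 4."""
    if stamp.count(":") == 1:
        stamp += ":00"  # assumption: missing seconds are treated as :00
    return datetime.strptime(stamp, "%H:%M:%S")


def elapsed_seconds(start: str, end: str) -> int:
    """Seconds between two same-day timestamps."""
    return int((to_time(end) - to_time(start)).total_seconds())


def as_min_sec(seconds: int) -> tuple[int, int]:
    """Split a duration into whole minutes and remaining seconds."""
    return divmod(seconds, 60)
```

For instance, `as_min_sec(elapsed_seconds("15:55:28", "16:17"))` gives 21 min 32 s between the automatic flight level change and the shift changeover, and `as_min_sec(elapsed_seconds("16:01:53", "16:26:51"))` gives 24 min 58 s between the loss of transponder replies and the first call attempt.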


However, how much time is needed to perceive screen changes and, more importantly, to consider that they are important enough to act on? Could other controllers working in the same system behave in the same way? In such a cluttered display (Fig. 8 shows only a small part of the radar screen, where target (aircraft) data blocks are superimposed on each other, making their visualization difficult), an automatic change made by the system does not favor immediate perception by the controller. In fact, the automatic change of the “cleared altitude” in the data block, without any controller action or effective warning signal, can be easily overlooked by controllers and is potentially misleading, especially if we consider that the controller normally has his/her attention divided among several data blocks (aircraft under his/her responsibility) on the screen. Indeed, Means et al.’s (1988) cognitive task analysis indicates that controllers are more aware of the flight data of aircraft on which they have performed control actions than of aircraft on which they have not. Barber (1988), analyzing the mid-air collision in Yugoslavia back in 1976, had already noted that in a divided-attention task, when people try to pay attention to two or more simultaneous tasks, the consequences of divided attention can be disastrous. Perception of the environment characteristics depends on the attention that allows our cognitive processes to take in selected aspects of our sensory world in an efficient and accurate manner (Palmer, 1999). Therefore, small screen changes without warning require, in order to be quickly perceived, a concentration of mental activity (attention) on these inputs alone, which is very difficult in divided-attention tasks.

The other issue is: Is perceiving the changes enough to trigger a controller action? This depends on the strategies controllers normally use to control their workload and cognitive demands.

Table 5
Frequencies used in the workplace 8. In bold the identified frequencies used by the Legacy crew trying to communicate with Brasilia ACC. Two of them were not activated in the workplace 8 (source Camara dos Deputados, 2007).

Sector   Sector frequencies as presented in the     Frequencies activated    Frequencies not activated
         aeronautic chart of the UZ6 airway         in the workplace 8       in the workplace 8
7        123.30–128.00–133.05–135.90–121.50         135.90                   123.30–128.00–133.05
8        122.25–125.20–135.00–121.50                122.25–125.20            135.00
9        125.05–133.10–121.50                       122.25–133.10            –

Source: CPI final report (Camara dos Deputados, 2007).

Using controller interviews to study conflict resolution, Kallus et al. (1999) found that under low workload, the controllers take more time to solve conflicts than in high workload situations, because under low workload they have enough time and prefer to monitor aircraft movement closely and intervene only if there is a real need. Early ergonomic field studies about controllers’ work activity (Bisseret, 1971; Sperandio, 1971) indicated that controllers regulate their work activity, using different strategies to control the workload. For example, controllers may reduce their workload by regulating the amount of attention they give to some aircraft. This regulation is based on the controller’s previous experience with the aircraft/flight characteristics (e.g. aircraft that they believe have conflict possibilities require more attention). These strategies emerge within the controller’s daily work with the system and are framed according to the system (man/technology/organization) behavior. Therefore, to understand how and why this accident emerged, there are fundamental questions about the controllers’ activities that must be answered, such as: How frequently do the “abnormal” indications seen in this event occur in the controllers’ daily work? How much feedback do they have from supervisors? What is the normal way they coordinate their activities with other controllers? Unfortunately, we do not have ergonomic field studies about Brazilian controllers’ activities to shed some light on these questions.

The first action taken by a Brasilia ACC controller regarding flight N600XL occurred at 16:28. It was a radio call requesting flight N600XL’s crew to change to a new frequency to enable further communication with Amazon ACC. This call occurred approximately 26 min after ATC ceased receiving transponder returns from the flight. Flight N600XL was now flying almost at the limit of the range of the sector 7 ATC radio transmitter, resulting in a situation where, although its radios were operating in a normal (proper) way, it could not receive the ATC radio calls clearly.

The sector 7 Brasilia ACC controller also communicated with the Amazon control center before the collision, to inform it that flight N600XL was proceeding northwest on UZ6. However, he did not inform the Amazon controller that Brasilia ATC was not receiving the flight’s transponder returns, or that the flight was no longer in radar contact. This communication is presented in Table 6.

Table 6
Communication between Brasilia and Amazon controller. Dialog in Portuguese, translated to English.

Time        Controller      Communication
16:53:30    Amazon ACC      Hi Brasilia
16:53:32    Brasilia ACC    November six zero zero x-ray lima. Do you have?
16:53:35    Amazon ACC      Yes, here.
16:53:37    Brasilia ACC    It is now entering in your area.
16:53:41    Amazon ACC      Yes, I have it. Yes, I have it.
16:53:45    Brasilia ACC    Good, three six zero. He is calling you there.
16:53:49    Amazon ACC      OK
16:53:50    Brasilia ACC    Bye

Source: Camara dos Deputados, 2007.

In this conversation, we see the same communication pattern observed before, when two other controllers were talking about flight N600XL’s flight plan. We note that the Brasilia controller only informed about the communication frequency. He simply did not mention any of the “abnormal” (at least in hindsight) indications they had on the radarscope. At this moment, he (or his supervisor) had probably already perceived the indication changes, but he did not think that these indications were important enough to be communicated to the fellow controller. This raises the important question of how the system functions daily. If the situations uncovered by this accident, like loss of radar contact, communication difficulties, and level variations due to height-finding radar inaccuracy, are frequent enough that “abnormal” indications were being considered “normal”, then the ETTO principle and associated heuristics (these things always happen, it is not important to act now, the system is always changing symbols) function as an important factor in the construction of cognitive strategies. In this situation, we cannot attribute the cause of the accident to a chain of human errors. Doing so would blind us to the real safety threats throughout the functioning of the ATC system.

5.5.4. The workplace in the Brasilia ACC during the event

All flight controllers who in some way acted in this event (including those who worked during the clearance of the N600XL flight in Sao Jose) are military controllers, sergeants of the Brazilian Air Force, performing a military activity, the control of the air space, a task that Brazilian law attributes to the Brazilian Air Force. The controllers work in military workplaces, the Air Space Control Centers (ACCs), all under the administration of the Department of Air Space Control (DECEA), part of the structure of the Air Force Command (see Fig. 2).

In the ACCs, each workplace has 2 controllers, a main controller and an assistant controller. A senior controller supervises each 2 or 3 workplaces. The investigation showed that at workplace 8 the controllers that monitored sector 7 (including flight N600XL) did not have an excessive number of aircraft under control (the maximum number of aircraft is 12) during the period before the collision, and were therefore working in a low workload situation. Despite the variation of the on-screen indications, the investigation concluded that there was no problem in the radar system software/hardware, and the system functioned as it was designed to function. The investigation also emphasized that there were no wrong indications or unexpected signals on the radarscope (Camara dos Deputados, 2007). Therefore, we can conclude that the systems normally function as described above. To explain their actions, the controllers said in their testimonies (Camara dos Deputados, 2007):

– Main controller: Before flight N600XL entered sector 7, he communicated with the crew, verifying that the aircraft was at level 370 (the correct level at that moment). He did not anticipate the need to change the level when the aircraft entered sector 7. In the period from 15:55 (aircraft entered sector 7) up to 16:17 (shift changeover), he perceived the loss of the surveillance radar information, but it did not alarm him. He said he was satisfied with the information coming from the primary radar. He informed his relief controller that flight N600XL was at level 360, because he knew about the inaccuracy of the primary radar information and assumed that the aircraft was following the flight plan that was displayed in the electronic strips.

– Assistant controller: He perceived that flight N600XL did not have complete information on the radar screen, and considered that to be a normal situation. Even though uninformed of the aircraft’s actual altitude, he coordinated the level of 360 with the Amazon ACC controller, based on the electronic flight strip indication.

– Controller after shift changeover: He received flight N600XL at level 360 and did not question the outgoing controller. He said he had noticed the abnormal transponder functioning, and tried 8 times to contact flight N600XL. However, he did not take any action to avoid the conflict.

6. Conclusion

The accident described here opened a window onto the functioning of the Brazilian air traffic system. As already noted in many ergonomic field studies, in safety-critical systems operators’ cognitive strategies to maintain situation awareness are shaped by the real conditions under which they perform their work, where resource limitations and uncertainty severely constrain the choices and action opportunities. Cognitive task analysis has been widely used (unfortunately not in Brazil) to examine how air traffic controllers develop cognitive strategies to manage their workload while maintaining situation awareness. However, most of the research focus is on how extreme traffic situations influence the cognitive control strategies developed by the controllers, rather than on how normal controllers working with normal equipment in normal organizations shape their cognitive strategies. In the events described here, we saw the influence of the working constraints in shaping cognitive strategies that affect system safety.

During the antecedents of the collision, there was no special air traffic situation, no catastrophic equipment failure, and no trigger event as required in traditional accident models. This accident emerges as a complex phenomenon within the normal variability of the system functioning. This tragedy and many other accidents in complex systems raise serious questions about how safety is thought about in complex safety-critical systems. This accident and accidents in other safety-critical systems have complex patterns of emergence, where coincidences, unexpected links, and resonance replace the old bullets such as equipment failure probability, linear combinations of failures, human errors, and so forth. Therefore, safety managers and engineers should review, among many other things, how safety barriers should be used to be effective in a defense-in-depth safety approach. The use of safety barriers to stop the propagation of some trigger event cannot avoid this type of accident simply because, as we have seen in this study, there is nothing to be stopped. Almost didactically, we saw all the barriers developed to avoid mid-air collisions melt down in a situation where everything functioned normally.

References

Barber, P., 1988. Applied Cognitive Psychology. Methuen, London.Bisseret, A., 1971. An analysis of mental model processes involved in air traffic

control. Ergonomics 14, 565–570.Brooker, P., 2006. Air traffic management accident risk. Part 1: the limits of realistic

modeling. Safety Science 44, 419–450.Camara dos Deputados, 2007. Relatorio final da comissao parlamentar de inquerito

crise do sistema de trafego aereo. Camara dos Deputados, Brasil.Carvalho, P.V.R., 2005. Ergonomic field studies in a nuclear power plant control

room. Progress in Nuclear Energy 48 (1), 51–69.Carvalho, P.V.R., Vidal, M.C.R., Carvalho, E.F., 2007. Nuclear power plant communi-

cations in normative and actual practice: a field study of control room opera-tors’ communications. Human Factors in Ergonomics and Manufacturing 17 (1),43–78.

Carvalho, P.V.R., Santos, I.J.A., Vidal, M.C., 2006. Safety implications of some culturaland cognitive issues in nuclear power plant operation. Applied Ergonomics 37,211–223.

Carvalho, P.V.R., Vidal, M.C., Santos, I.L., 2005. Nuclear power plant shift supervisor’sdecision-making during micro incidents. International Journal of IndustrialErgonomics 35 (7), 619–644.

DECEA, 2006. Regras do ar e serviços de trafego aereo, ICA 100–12. Ministerio daAeronautica, Brasil (in Portuguese).

Dekker, S., 2006. Resilience engineering: chronicling the emergence of confusedconsensus. In: Hollnagel, E., Woods, D.D., Leveson, N. (Eds.), Resilience Engi-neering. Concepts and Precepts. Ashgate, Aldershot, UK.

Dailey, J., 1984. Characteristics of air traffic controller. In: Sells, S.B., Dailey, J.T.,Pickerel, E.W. (Eds.), Selection of Air Traffic Controllers (No. FAA-AM-84-2).Federal Aviation Administration Office of Aviation Medicine, Washington DC,pp. 128–141.

Doherty, M.E., 1993. A laboratory scientist’s view of naturalistic decision making. In:Klein, G.A., Orasanu, J., Calderwood, R., Zsambok, C.E. (Eds.), Decision Making inAction: Models and Methods. Ablex Publishing Corp, Norwood, NJ.

EASA, 2005. Airworthiness Directive AD no: 2005-0021. European Aviation SafetyAgency, Germany.

Endsley, M.R., 1995. Toward a theory of situation awareness in dynamic systems.Human Factors 37, 65–84.

Evans, J., 2004. Biases in deductive reasoning. In: R, Pohl (Ed.), Cognitive Illusions:Handbook of Fallacies and Biases in Thinking, Judgment, and Memory.Psychology Press, Hove, England.

Ferreira, R., 2006. Colisao em Voo, Presentation to the Brazilian Press in 29September, 2006 (in Portuguese).

P.V. Rodrigues de Carvalho et al. / Applied Ergonomics 40 (2009) 325–340

Galison, P., 2000. An accident of history. In: Galison, P., Roland, A. (Eds.), Atmospheric Flight in the 20th Century. Kluwer Academic, Dordrecht, The Netherlands, pp. 3–44.

Holyoak, K., Simon, D., 1999. Bidirectional reasoning in decision making by constraint satisfaction. Journal of Experimental Psychology: General 128, 3–31.

Hollnagel, E., 2004. Barriers and Accident Prevention. Ashgate, Aldershot, UK.

Hollnagel, E., 2006. Resilience – the challenge of the unstable. In: Hollnagel, E., Woods, D.D., Leveson, N. (Eds.), Resilience Engineering. Concepts and Precepts. Ashgate, Aldershot, UK.

Hollnagel, E., Woods, D.D., 2005. Joint Cognitive Systems: An Introduction to Cognitive Systems Engineering. Taylor & Francis.

Irish Aviation Authority, 2005. Aeronautical Notice NR 0.52, issue 1, date 21 Dec. 2005. Safety Regulation Division Irish Aviation Authority, Ireland.

Jones, D.G., Endsley, M.R., 1996. Sources of situational awareness errors in aviation. Aviation, Space and Environmental Medicine 67, 507–512.

Kallus, K.W., Van Damme, D., Barbarino, M., 1999. Model of the Cognitive Aspects of Air Traffic Control. European Air Traffic Management Programme Report. Eurocontrol, Brussels.

Klein, G.A., 1993. A recognition-primed decision (RPD) model of rapid decision making. In: Klein, G.A., Orasanu, J., Calderwood, R., Zsambok, C.E. (Eds.), Decision Making in Action: Models and Methods. Ablex Publishing Corp, Norwood, NJ.

Means, B., Mumaw, R.J., Roth, C., Schlager, M.S., McWilliams, E., Gagne, E., 1988. ATC Training Analysis Study: Design of the Next Generation of ATC Training System (Rep. No. FAA/OPM 342–036). U.S. Department of Transportation – Federal Aviation Administration, Washington DC.

Palmer, S.E., 1999. Vision Science: Photons to Phenomenology. MIT Press, Cambridge, MA.

Perrow, C., 1984. Normal Accidents. Basic Books, New York.

Quinn, S., Markovits, H., 1998. Conditional reasoning, causality, and the structure of semantic memory: strength of association as a predictive factor for content effects. Cognition 68, 93–101.

Rasmussen, J., 1983. Skills, rules and knowledge: signals, signs and symbols, and other distinctions in human performance models. IEEE Transactions on Systems, Man and Cybernetics 13, 257–266.

Rasmussen, J., Pejtersen, A., Goodstein, L., 1994. Cognitive Systems Engineering. Wiley, New York.

Rasmussen, J., Svedung, I., 2000. Proactive Risk Management in a Dynamic Society. Swedish Rescue Services Agency, Karlstad SW.

Reason, J., 1997. Managing the Risks of Organizational Accidents. Ashgate, London, UK.

Senado Federal, 2007. Relatorio parcial dos trabalhos da Cpi “do apagao aereo”. Senado Federal, Brasília (in Portuguese).

Snook, S.A., 2000. Friendly Fire: The Accidental Shootdown of U.S. Black Hawk Helicopters Over Northern Iraq. Princeton University Press, U.K.

Sperandio, J.C., 1971. Variation of operator’s strategies and regulating effects on workload. Ergonomics 14, 571–577.

Tversky, A., Kahneman, D., 1974. Judgment under uncertainty: heuristics and biases. Science 185, 1124–1131.

Van Es, G.W.H., 2003. Review of Air Traffic Management-Related Accidents Worldwide: 1980–2001, NLR-TP-2003-376. National Aerospace Laboratory NLR, The Netherlands.

Woods, D.D., Cook, R.I., 2006. Incidents – markers of resilience or brittleness? In: Hollnagel, E., Woods, D.D., Leveson, N. (Eds.), Resilience Engineering. Concepts and Precepts. Ashgate, Aldershot, UK.

Yin, R.K., 1994. Case Study Research: Design and Methods. Sage, Thousand Oaks, CA.