Process Monitoring and Diagnosis: A Model-Based Approach

Daniel Dvorak, AT&T Bell Laboratories
Benjamin Kuipers, University of Texas at Austin

IEEE Expert, June 1991. 0885/9000/91/0600-0067 $1.00 © 1991 IEEE

Process plants must deal with changing states, multiple faults, and incomplete and unreliable measurements. Our model-based advisory system exploits three recent technologies to improve process monitoring and diagnosis.

In a book with the curious title Normal Accidents,1 Charles Perrow examines several of the most notable accidents involving complex systems in the modern industrial world, such as the 1979 Three Mile Island nuclear power accident, the 1977 New York City blackout, and the 1969 Texas City explosion of a butadiene refining unit. Perrow highlights the difficult job of plant operators who are responsible for physical systems with complex interactions and tight coupling. With current monitoring technology, alarms are triggered whenever fixed thresholds are exceeded. A nuclear power plant, for example, can have over a thousand distinct alarms, and hundreds of them can be activated within a minute, as in a loss-of-coolant accident. In such situations, process operators tend to overlook relevant information, respond too slowly, and panic when the rate of information flow is too great.

Not surprisingly, operator advisory systems have become an important area of application for expert systems. ESCORT2 (an expert system for complex operations in real time) and REALM3 (a reactor emergency action level monitor) are two of many expert systems developed for process industries. These systems aim to reduce the cognitive load on operators, usually by helping to diagnose the cause of alarms and possibly by suggesting corrective actions. Most of these expert systems get their knowledge of symptoms, faults, and corrective actions through the usual process of codifying human expertise in rules or decision trees. The problem, as with all expert systems, is reliability. As Denning observes,4

The trial-and-error process by which knowledge is elicited, programmed, and tested is likely to produce inconsistent and incomplete databases; hence, an expert system may exhibit important gaps in knowledge at unexpected times.

Obviously, these “gaps in knowledge” can have serious consequences in some process industries.

An alternate approach, one that is not based on an expert’s imperfect recall of symptoms and faults, is to use a model of the process to predict its behavior or to check consistency among observed variables. When observations disagree with the model’s predictions, some diagnostic technique is initiated to identify the fault candidates. These model-based approaches to diagnosis have emerged from two different communities. In the engineering community, fault detection and isolation techniques generally rely on a precise mathematical model of the process and on pre-enumerated symptom-fault patterns known as fault signatures. In the computer science/artificial intelligence community, model-based diagnosis relies on models of structure and behavior.5 For example, given symptoms of misbehavior (as detected by the behavioral model), fault candidates are identified using the structural model by following a dependency chain back from a violated prediction to each component and parameter that contributed to that prediction.

Figure 1. The three tasks of process monitoring.

The model-based approach we discuss here has evolved within the AI community, but similar ideas have appeared independently in the fault detection and isolation literature.6 The type of model used in any model-based approach determines many of the capabilities and limitations of the specific method. Model types vary a great deal: There are numerical models, dynamic qualitative models, extended signed directed graphs, causal models and confluence equations, fuzzy qualitative models, and this article’s semiquantitative model.

We focus here on process monitoring and diagnosis, the basic elements of an operator advisory system. In this setting, several conditions hold that challenge diagnostic methods. First, the plant is a continuous-variable dynamic system with feedback loops and changing states. Second, diagnosis must be performed while the system operates. Third, many system quantities are not sensed. Finally, measurements are unreliable due to sensor failures.

The basic idea: mimicry

The key cognitive skill for process operators is the formation of a mental model that not only accounts for current observations but also lets the operators predict near-term behavior as well as the effect of possible control actions. This observation underlies our framework for process monitoring, named Mimic. The basic idea is quite simple: Mimic the physical system with a predictive model, and when the system changes behavior due to a fault or repair, change the model accordingly so that it continues to give accurate predictions of expected behavior.

Figure 1 depicts the Mimic framework. Two tasks maintain the model. The tracking task advances the model’s state in step with observations from the physical system. When observations disagree with predictions, Mimic uses model-based diagnosis to determine the possible faults. After identifying a fault, the diagnosis task injects it into the current model so that the predictions will continue to agree with actual observations. To be precise, Mimic maintains a set of candidate models since a given behavior might be caused by one of several faults. Each candidate model represents a possible condition of the system, including its state and faults.

The key benefit of this approach is that we can use the model as our window into the physical system. Specifically, the model can

• detect early deviations from expected behavior more quickly than with fixed-threshold alarms (this method, known as analytical redundancy, uses known analytical relationships among sets of signals to check for mutual consistency);

• predict the values of unobserved variables (signal reconstruction) to permit alarms or other inferences on unseen variables, and to help the operator understand process conditions;

• predict near-term undesirable or hazardous conditions, thus providing early warnings; and

• predict the effect of proposed control actions to test if they will have the desired effect, a valuable capability in complex systems.

The end purpose of monitoring and diagnosis is advice to the operator about what’s happening and what to do about it. The advising task applies expert knowledge of safety conditions, recommended operating procedures, and performance objectives to produce advice in the form of alarms, warnings, and recommended actions. Although we do not discuss it further here, the advising task is a major beneficiary of the model-based approach in that the candidate models (and their tracked states) provide a testbed for generating warnings and for testing proposed control actions.

Three key technologies

Mimic exploits three relatively new technologies: semiquantitative simulation, measurement interpretation (tracking), and model-based diagnosis. These technologies work together in a hypothesize-build-simulate-match cycle, as shown in Figure 2. This figure gives a more detailed view of the tracking and diagnosis tasks of Figure 1.

Semiquantitative simulation. Industrial process plants (such as chemical refineries and nuclear power plants) are examples of continuous-variable dynamic systems. In modern control theory, these dynamic systems are modeled with a set of coupled first-order differential equations consisting of balance equations, physical-chemical state equations, and phenomenological laws. Given a set of initial values, a numerical simulation of the equations yields precise predictions of values for each variable over time.

Oddly enough, standard numeric simulation is too exact and too narrow for our purposes. Real process systems contain much imprecision. Sensors, actuators, and functional units operate within certain tolerances; parameter values and some functional relations are known only approximately. One approach, of course, is to do precise numerical simulation using average values, and then use some form of approximate matching of simulation results to observations. This approach presents two problems. First, given initial conditions, numerical simulation predicts only one behavior from a model, even though more than one might be possible, given the real imprecision. For example, a tiny difference in one parameter can determine whether or not a rocket achieves escape velocity. In effect, numerical simulation makes an inappropriate commitment to a single behavior. In contrast, qualitative simulation guarantees that all possible behaviors will be predicted. This capability is especially important in testing a fault hypothesis, which can exhibit several qualitatively distinct behaviors.

The second problem is the approximate-matching problem: how do you decide, in a principled way, when a difference between prediction and observation is due to imprecision and when it is due to a fault? What we really want is to explicitly express imprecise knowledge as part of the model and have the simulator use it, producing valid ranges for each variable and permitting direct matching of observations to predictions. Semiquantitative simulation provides this capability. Furthermore, when observations match the predictions of two (or more) distinct behaviors, we want to track both hypotheses until they diverge, which Mimic does.

Qualitative simulation,7 the foundation for semiquantitative simulation, has two important characteristics for our application. First, it uses a qualitative level of description that lets us express imprecise knowledge. This purely qualitative description uses no numbers (but can take advantage of quantitative information when available). Second, qualitative simulation generates all the qualitatively distinct behaviors that are attainable from a starting state and consistent with the given imprecise knowledge. This property is essential in monitoring a physical system, whether healthy or faulty.

Semiquantitative simulation8 takes advantage of quantitative knowledge when it is available, which is always the case in process plants. This knowledge consists of numeric ranges for landmark values (for example, the pressure relief valve opens at 200-210 pounds per square inch) plus envelope functions that define the limits of monotonic relationships (for instance, an approximate relationship between the volume of fluid in a tank and its height). Quantitative values are expressed as numeric ranges and simulated with a modified interval arithmetic. Interval arithmetic is normally subject to an uncertainty explosion, but this problem is avoided because all reasoning takes place with respect to the fixed set of landmarks provided by the qualitative behavior. The resulting semiquantitative simulation retains all the properties of qualitative simulation, but with two additional benefits: (1) it eliminates behaviors that are qualitatively possible but quantitatively inconsistent, and (2) it permits direct comparison of numeric sensor readings to the numeric ranges predicted for each variable. Qsim, which we have used in our research, provides this form of semiquantitative simulation.
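To make the idea of numeric ranges concrete, here is a minimal sketch of range propagation and direct matching against a sensor reading. It uses plain interval arithmetic, not the modified form used in Qsim, and the `Interval` class, the `valve_state` helper, and the base and rise figures are illustrative assumptions. Only the 200-210 psi landmark comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A closed numeric range [lo, hi] standing in for an imprecise value."""
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        products = [a * b for a in (self.lo, self.hi) for b in (other.lo, other.hi)]
        return Interval(min(products), max(products))

    def contains(self, reading):
        """Direct matching: does a numeric sensor reading fall in the range?"""
        return self.lo <= reading <= self.hi


# Landmark from the text: the relief valve opens somewhere in 200-210 psi.
RELIEF_OPENS = Interval(200.0, 210.0)


def valve_state(pressure_psi):
    """Classify a pressure reading against the imprecise landmark."""
    if pressure_psi < RELIEF_OPENS.lo:
        return "closed"
    if pressure_psi > RELIEF_OPENS.hi:
        return "open"
    return "closed-or-open"  # inside the landmark's uncertainty band


# Hypothetical prediction: pressure = base + rise, each known only as a range.
predicted = Interval(150.0, 160.0) + Interval(20.0, 30.0)  # [170, 190]
```

A reading of 185 psi matches `predicted`, while 195 psi does not and would count as a discrepancy. Repeated plain interval operations widen the ranges quickly, which is the uncertainty explosion the text says is avoided by reasoning relative to a fixed set of landmarks.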

The version of Mimic we describe here has evolved from an earlier design9 that based hypothesis generation on a decision tree and used a less sophisticated form of semiquantitative simulation.


Figure 2. The architecture of Mimic.

Tracking. Mimic seeks to maintain a model whose state and faults (if any) reflect the current state of the physical system. More precisely, it maintains a set of models, each in a state consistent with the most recent observations. This set is called the “tracking set,” and each model in the set embodies different fault hypotheses and therefore represents an alternate interpretation of the system. During diagnosis, Mimic adds models to the tracking set as it generates fault/repair hypotheses. During tracking, Mimic deletes models from the tracking set when their predictions fail to track observations.

Qualitative simulation generates a behavior graph, which is a directed graph of the system’s possible qualitative states and the transitions among them. A behavior is a path through the directed graph and comprises a sequence of states alternating between states representing an instant of time and states representing an interval of time. Tracking, also called measurement interpretation, is the process of using observations to follow a path (a behavior) through the behavior graph.10

Using the behavior graph fragment in Figure 3, we can describe several details of the process. If a model is in state E, Mimic compares each new set of observations to the values of state E. If the observations match the predictions (that is, if each observation falls within the predicted range), then the model remains in state E. The usual noise filtering of sensor readings should still be performed before matching.

When an observation does not match the current state E, Mimic uses incremental simulation to generate the immediate successor states F, G, and H. (Recall that a state in a semiquantitative simulation can have more than one possible successor state because of the imprecise knowledge expressed in the model.) If, say, the observations match state G, then the model is retained with its state now set to G. Incremental simulation refers to the control that the tracking task exerts over the simulator. When triggered, the simulator generates only the immediate qualitative successor states to the current state. Thus, the simulation is advanced only as needed; it is never “run to completion.”

If the observations do not match any of the immediate successor states, Mimic repeats the incremental simulation and compares the observations to the second-generation successor states (I, J, K, L, M). The model needs this limited-distance lookahead to jump over instantaneous states that fall between consecutive observations.
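The tracking steps just described can be sketched as a small search with limited lookahead. This is an illustration under stated assumptions, not Mimic’s implementation: the function names, the edges among states E through M, and every numeric range are hypothetical, and the distinction between instant and interval states is ignored.

```python
def matches(state_ranges, observation):
    """True if every observed variable lies in the state's predicted range."""
    return all(
        lo <= observation[var] <= hi
        for var, (lo, hi) in state_ranges.items()
        if var in observation
    )


def track(current, observation, ranges, successors, lookahead=2):
    """Return the states consistent with the observation, or [] if refuted.

    `successors` is called only when the current state stops matching, so
    states are generated one generation at a time (incremental simulation).
    """
    if matches(ranges[current], observation):
        return [current]
    frontier = [current]
    for _ in range(lookahead):
        frontier = [nxt for state in frontier for nxt in successors(state)]
        hits = [s for s in frontier if matches(ranges[s], observation)]
        if hits:
            return hits
    return []


# Hypothetical graph fragment and temperature ranges (all values assumed).
GRAPH = {"E": ["F", "G", "H"], "F": ["I"], "G": ["J", "K"], "H": ["L", "M"]}
RANGES = {
    "E": {"temp": (60.0, 65.0)}, "F": {"temp": (65.0, 70.0)},
    "G": {"temp": (55.0, 60.0)}, "H": {"temp": (50.0, 55.0)},
    "I": {"temp": (70.0, 75.0)}, "J": {"temp": (45.0, 50.0)},
    "K": {"temp": (40.0, 45.0)}, "L": {"temp": (35.0, 40.0)},
    "M": {"temp": (30.0, 35.0)},
}


def successors(state):
    """Stand-in for the simulator: immediate successors of one state."""
    return GRAPH.get(state, [])
```

With these assumed ranges, a reading of 57.0 moves the model from E to G, a reading of 47.0 is matched only at the second generation (state J), and a reading of 10.0 returns an empty list, which corresponds to the model being dropped from the tracking set. Returning a list rather than a single state keeps every matching successor alive until later observations separate them.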

Figure 3. Tracking through a behavior graph.

Figure 4. An electric water heater.

Observations can include independent variables, that is, exogenous variables whose values cannot be predicted. When an independent variable changes value, tracking must reinitialize the models in the tracking set using the current observations and most recent predictions for integrated quantities. Thus the simulation, just like the physical system, is made to react to changes in independent variables.

Through progressive step-size refinements of the semiquantitative simulation, we can attain a desired precision for the quantitative predictions.8 Specifically, we can use the time interval between qualitative time points to refine the step size of the quantitative simulation, inserting new quantitative states that have time points within that interval. These new quantitative states more precisely bound the predicted behavior.

Mimic never has to generate the full behavior graph (envisionment), a computation that can be prohibitively expensive for complex systems because of the intractable branching problem of qualitative simulation. Instead, Mimic performs incremental simulation to generate only the states in the immediate vicinity of the last tracked state, abandoning any branches that do not track the observations. In effect, the observations act as a filter that eliminates irrelevant branches in the behavior graph.

Model-based diagnosis. Process systems are designed for continuous operation, and are therefore somewhat fault tolerant. Economic pressures to keep a plant in operation often mean that the system will continue running with multiple faults. Thus, single-fault diagnosis is inadequate. However, complete multiple-fault diagnosis is combinatorially explosive and therefore unrealistic for real-time monitoring. As a middle approach, Mimic incrementally constructs and tests multiple-fault hypotheses. Specifically, since sensor measurements are frequent, we assume that only a single new fault (or a single repair) can occur between successive measurements. Thus, Mimic can construct multiple-fault hypotheses, one hypothesis at a time.
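The incremental construction of multiple-fault hypotheses can be sketched as follows. This is a simplified illustration, not Mimic’s code: a hypothesis is represented as a set of fault names, and the function name `extend_hypotheses` is an assumption. Each step changes a tracked hypothesis by exactly one fault or one repair.

```python
def extend_hypotheses(tracked, suspects):
    """Extend each tracked fault set by one new fault or one repair.

    `tracked` is an iterable of frozensets of fault names; `suspects` is an
    iterable of fault names proposed by diagnosis for the new discrepancy.
    """
    tracked = set(tracked)
    extended = set()
    for faults in tracked:
        for suspect in suspects:
            if suspect in faults:
                extended.add(faults - {suspect})  # a single repair
            else:
                extended.add(faults | {suspect})  # a single new fault
    return extended - tracked


# Starting from the fault-free model, one step yields only single faults.
first = extend_hypotheses({frozenset()}, ["bad-h1", "bad-h2"])
```

Each call produces at most one new hypothesis per tracked model and suspect, so the work per measurement grows with the product of the two rather than with the number of all fault combinations. A double fault such as bad-h1 with bad-h2 can appear only in a later step, after one of the single-fault models has survived tracking.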

Let’s examine when and how diagnosis occurs. The tracking task discards a model when it finds a discrepancy between predictions and observations. However, before discarding the model, Mimic tries to modify it to bring its predictions into agreement with observations. This is similar in intent to the debug phase of the generate-test-debug paradigm,11 though it and Mimic differ in many other ways. Using the structural model containing components, connections, and parameters, Mimic’s algorithm traces upstream from the site of the discrepancy to identify all components and parameters that could have contributed to the discrepancy (dependency tracing). Assuming that the discrepancies are due to a single new fault or a single new repair, the only suspects are those that can account for all discrepancies. Mimic further checks these suspects for global consistency through constraint suspension; if it finds no assignment of values consistent with all symptoms, then the suspect is exonerated. For each remaining suspect, each of its other operating modes is tested for compatibility with the observations. Mimic adds to the tracking set whatever model variations survive this test.
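Dependency tracing under the single-fault assumption can be sketched as a backward walk over a structural graph followed by an intersection. The graph below is a hypothetical water-heater-like structure, and the names `upstream`, `single_fault_suspects`, `FEEDS`, and `COMPONENTS` are assumptions made for illustration. Constraint suspension and the testing of operating modes are not shown.

```python
def upstream(site, feeds):
    """All nodes that can influence `site`, found by walking `feeds` backward."""
    seen, stack = set(), [site]
    while stack:
        node = stack.pop()
        for source in feeds.get(node, ()):
            if source not in seen:  # guard so feedback loops terminate
                seen.add(source)
                stack.append(source)
    return seen


def single_fault_suspects(discrepancy_sites, feeds, components):
    """Components upstream of every discrepancy (single-fault assumption)."""
    candidate_sets = [upstream(site, feeds) & components
                      for site in discrepancy_sites]
    return set.intersection(*candidate_sets) if candidate_sets else set()


# Hypothetical structure: each key depends on the listed nodes.
FEEDS = {
    "temp_reading": ["temp_sensor", "tank_temp"],
    "tank_temp": ["upper_heater", "lower_heater", "inflow", "tank"],
    "upper_heater": ["thermostat", "voltage_supply"],
    "lower_heater": ["thermostat", "voltage_supply"],
    "inflow": ["flow_sensor"],
    "voltage_reading": ["voltage_sensor", "thermostat", "voltage_supply"],
}
COMPONENTS = {
    "temp_sensor", "flow_sensor", "voltage_sensor", "upper_heater",
    "lower_heater", "thermostat", "voltage_supply", "tank",
}
```

With a discrepancy only at `temp_reading`, every component except the voltage sensor is a suspect. If `voltage_reading` is discrepant as well, the intersection narrows the suspects to the thermostat and the voltage supply, the only components in this assumed graph that could explain both symptoms with one fault.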

Unlike many diagnostic methods, model-based diagnosis does not rely on a set of symptom-fault patterns. Such patterns are often incomplete, since it is difficult for an expert to anticipate all possible faults and predict their symptoms, especially the symptoms of interacting faults. Even if we collect symptom-fault patterns from exhaustive fault-model simulations, using these patterns is not necessarily more efficient.

Another important property of model-based diagnosis is that it handles failed sensors and missing data naturally, not as a special case. A sensor is just another component that affects an observation; dependency tracing will identify it as a suspect in the usual way. As Scarl observes,12 model-based reasoning avoids combinatoric problems in handling failed sensors and unavailable data because it matches against predictions rather than symptomatic patterns. If a sensor is bad and thus gives readings different from those predicted, the sensor becomes a suspect simply because it is upstream of the discrepancy. If a datum is unavailable, it is not compared with predictions and therefore cannot cause discrepancies.
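The treatment of unavailable data amounts to skipping the comparison, as this small sketch shows. The function name `find_discrepancies`, the variable names, and the numeric ranges are all hypothetical.

```python
def find_discrepancies(predicted, observed):
    """Variables whose available reading lies outside the predicted range."""
    discrepant = []
    for var, (lo, hi) in predicted.items():
        reading = observed.get(var)
        if reading is None:  # datum unavailable: nothing to compare
            continue
        if not lo <= reading <= hi:
            discrepant.append(var)
    return discrepant


# Hypothetical predicted ranges for two sensed variables.
PREDICTED = {"temp": (57.5, 59.0), "flow": (29.0, 31.0)}
```

A missing flow reading leaves only the temperature to be checked, so no special-case pattern for "flow sensor offline" is needed. A failed sensor that reports a wrong value simply shows up in the returned list, and dependency tracing then treats the sensor as one more upstream suspect.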

An example

To illustrate Mimic at work, consider the electric water heater shown in Figure 4, which we have modeled and tested with Mimic. The water heater has a single thermostat that controls whether or not power is applied to the two heating elements (on-off control). Raw sensor information comes from a temperature sensor near the thermostat, from a flow-rate sensor on the cold-water inlet, and from a voltage sensor on the heating elements. In a real monitoring situation we would want to diagnose various possible faults such as defective heating elements, a stuck thermostat, a faulty flow-rate sensor, and loss of electrical power. However, to keep this example simple, we consider only the possibility of defective heating elements.

The water heater is modeled as a two-compartment model in which two masses of water (upper and lower) are connected with thermal flow and mass flow between them, as shown in Figure 5. Each compartment is treated as well mixed (the temperature is the same everywhere within the compartment). Each compartment’s temperature is affected by five heat flows: heat gain from the heating element, heat loss to the room through the insulating jacket, heat gain due to water inflow, heat loss due to water outflow, and heat transfer through thermal contact with the other compartment. The semiquantitative Qsim model of the water heater contains the usual equations that relate mass, mass flow, heat, heat flow, thermal resistance, and temperature. It also contains numeric ranges for landmark values such as room temperature, inlet water temperature, nominal heating rate, and thermal resistance of the insulation. This model does not require any envelope functions because it does not contain imprecise monotonic relationships.

In the normal (fault-free) model, all components (tank, heating elements, thermostat, temperature sensor, flow-rate sensor, voltage sensor, voltage supply) operate according to their intended purposes. In a fault model, a faulty component operates according to a failure mode, such as a heating element that generates no heat when power is applied.

Table 1 summarizes an example of monitoring the water heater, showing how monitoring progresses over eight moments in a series of observations. The observations are simulated from a separate numerical model of a faulty water heater in which the lower heating element produces no heat. For each moment, the table shows what hypotheses have been proposed and what models are being tracked. The water heater begins in a state where the tank’s water is hot, the heating elements are off, no water is flowing, and the temperature is slowly falling. These readings are consistent with the normal model. Now, someone starts to draw water for a bath. A high flow rate is measured, but all other readings remain the same. Since water flow is an independent variable, Mimic reinitializes every tracked model (just the normal model in this case) to reflect the change. Since the normal model remains consistent with the new values, it is retained.

As time continues, the temperature inside the tank drops because of the cooler inlet water. These readings are consistent with the current state of the normal model, so no change occurs to the tracking set. At moment 3 the temperature drops to the point where the heating elements turn on, as observed by the voltage sensor. Since this event is predicted by the normal model as an immediate successor state of the most recently tracked state, the normal model is retained with its state updated.

At moment 4 the temperature continues to drop. Although this observation is qualitatively consistent with the normal model, it is inconsistent with the associated quantitative ranges. In effect, the model is saying that for this flow rate, tank capacity, heating rate, and inlet temperature, the water temperature should not be dropping so fast. Thus, the tracking task discards the normal model. At the same time, this discrepancy triggers dependency tracing, which identifies two possible faults: a bad upper heating element or a bad lower heating element (denoted bad-h1 and bad-h2). This causes Mimic to build two fault models. Both models are successfully initialized, so Mimic is now tracking two models. Since Mimic assumes that only one fault occurs at a time, it does not hypothesize a double fault (bad-h1 and bad-h2). In a more detailed example, other hypotheses would also be proposed, such as a faulty temperature sensor, faulty flow sensor, and faulty thermostat, since all are upstream of the temperature discrepancy.

Figure 5. A structural model of the water heater.

The water flow stops at moment 5 (somebody turned off the faucet). With this change in an independent variable, Mimic reinitializes the two models. At moment 6, the temperature is observed to be rising. Mimic then compares the observed temperature to the quantitative predictions of the two models. Because the observed temperature exceeds the range predicted by the bad-h1 model, Mimic discards that model. The predictions of the one remaining model, bad-h2, are compatible with the observations, so the model is retained. This model continues to track future readings, and emerges as the sole fault hypothesis.

Table 1. Diagnosing the water heater based on its dynamic behavior.

Moment | Synopsis            | Time (min.) | Flow (liters/min.) | Temperature (deg. C) | Power (on or off) | New fault hypotheses | New fault models | Tracked models
0      | Temp hot            | 0.0         | 0                  | 64.9                 | off               | none                 | none             | normal
1      | Flow starts         | 1.0         | 30                 | 64.8                 | off               | none                 | none             | normal
2      | Temp dropping       | 2.0         | 30                 | 61.4                 | off               | none                 | none             | normal
3      | Heater on           | 2.4         | 30                 | 58.9                 | on                | none                 | none             | normal
4      | Temp still dropping | 2.7         | 30                 | 57.1                 | on                | bad-h1, bad-h2       | bad-h1, bad-h2   | bad-h1, bad-h2
5      | Flow stops          | 3.0         | 0                  | 55.9                 | on                | none                 | none             | bad-h1, bad-h2
6      | Temp rising         | 13.0        | 0                  | 60.0                 | on                | none                 | none             | bad-h2
7      | Heater off          | 27.7        | 0                  | 66.0                 | off               | none                 | none             | bad-h2
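The dynamic evidence in Table 1 is easier to see as rates of change. The short calculation below uses only the time and temperature values from the table; the function name `rates_of_change` is an assumption, and the rates themselves are derived here rather than reported in the original.

```python
# Time (min.) and temperature (deg. C) from Table 1, moments 0 through 7.
TIME = [0.0, 1.0, 2.0, 2.4, 2.7, 3.0, 13.0, 27.7]
TEMP = [64.9, 64.8, 61.4, 58.9, 57.1, 55.9, 60.0, 66.0]


def rates_of_change(times, temps):
    """Average temperature rate (deg. C per minute) between adjacent moments."""
    pairs = list(zip(times, temps))
    return [
        (b_temp - a_temp) / (b_time - a_time)
        for (a_time, a_temp), (b_time, b_temp) in zip(pairs, pairs[1:])
    ]


RATES = rates_of_change(TIME, TEMP)
```

The temperature falls at about 6 deg. C per minute between moments 2 and 4, which includes the interval after the heaters switch on at moment 3. That is the behavior the normal model's quantitative ranges reject. Once the flow stops, the temperature rises at roughly 0.4 deg. C per minute through moments 5 to 7, and the text reports that this observed rise exceeds what the bad-h1 model predicts.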


Advantages

The water heater example shows how Mimic can diagnose a system by observing its dynamic behavior with few observable variables. In general, the speed at which Mimic can narrow its diagnosis depends on the number of monitored variables and the dynamic activity of the system. With more monitored variables and more system activity, more opportunities arise to refute incorrect hypotheses.

Diagnostic systems often rank competing hypotheses by probability, based on the previously known fault probabilities of components. Because Mimic monitors continuously, it can also rank hypotheses by age. The longer a hypothesis survives changing observations, the stronger the evidence supporting that hypothesis. This age-ranking also focuses attention on hypotheses that account for the earliest manifestations of a fault, before other manifestations and corresponding hypotheses appear. In short, the system’s natural time delays help identify the correct hypothesis.
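Age-ranking can be combined with prior fault probabilities in a simple sort, sketched below. The record layout, the prior values, and the `bad-thermostat` entry are hypothetical; only the creation time of 2.7 minutes for bad-h1 and bad-h2 comes from the water heater example.

```python
def rank_hypotheses(hypotheses, now):
    """Order hypotheses oldest first, breaking ties by higher prior."""
    return sorted(
        hypotheses,
        key=lambda h: (-(now - h["born"]), -h["prior"]),
    )


# "born" is the time (min.) at which the hypothesis entered the tracking set.
CANDIDATES = [
    {"name": "bad-thermostat", "born": 13.0, "prior": 0.05},
    {"name": "bad-h1", "born": 2.7, "prior": 0.01},
    {"name": "bad-h2", "born": 2.7, "prior": 0.02},
]

RANKED = [h["name"] for h in rank_hypotheses(CANDIDATES, now=27.7)]
```

Here the two hypotheses created at 2.7 minutes outrank the later one even though its assumed prior is higher, which reflects the idea that surviving a longer run of changing observations is itself evidence. Using age as the primary key is one possible design; a weighted combination would be another.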

Mimic treats alarms in a new way, in that they are based primarily on the predictions of the models in the tracking set. This has several positive consequences:

• Alarm thresholds can be dynamic rather than fixed, thus allowing earlier alerting.

• Alarms can be based on unobserved variables, permitting more freedom in alarm design.

• Alarms can reveal any mutually inconsistent readings (extreme analytical redundancy).

• Alarms, called forewarnings, can be based on near-future predicted states.

• False alarms due to operating-mode changes (for example, startup or shutdown) should not occur if the model faithfully predicts such dynamic behavior.
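The difference between a fixed-threshold alarm, a model-based alarm, and a forewarning can be sketched in a few lines. The function names and every numeric limit and range here are hypothetical; the reading of 57.1 deg. C is the moment 4 temperature from the water heater example.

```python
def fixed_alarm(reading, low_limit):
    """Conventional alarm: fires only when a fixed limit is crossed."""
    return reading < low_limit


def model_alarm(reading, predicted_range):
    """Model-based alarm: fires when the reading leaves the predicted range."""
    lo, hi = predicted_range
    return not lo <= reading <= hi


def forewarning(future_ranges, hazard_range):
    """True if any predicted near-future range overlaps the hazardous range."""
    hazard_lo, hazard_hi = hazard_range
    return any(lo <= hazard_hi and hi >= hazard_lo for lo, hi in future_ranges)
```

With an assumed fixed low-temperature limit of 50.0, the reading of 57.1 raises no conventional alarm, but it does raise a model-based alarm against an assumed predicted range of (57.5, 59.0). A forewarning fires as soon as any predicted future range, such as (49.0, 53.0), overlaps a hazardous range like (0.0, 50.0), before any reading actually gets there.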

Limitations

If Mimic cannot quickly refute invalid hypotheses, the tracking set will grow and Mimic will slow down correspondingly. Mimic refutes hypotheses through tracking: there must be an observed discrepancy between the model’s predictions and the sensor readings. In practice, this means the model’s quantitative predictions must be reasonably well bounded and that there must be an adequate number of well-placed sensors.

Mimic assumes that faults occur one at a time. More precisely, it assumes that the manifestations of different faults appear at different times with respect to its sampling rate. This assumption can be violated in the case of a catastrophic event (such as an explosion) or cascading faults, where discrepancies due to more than one fault can appear simultaneously.

A qualitative simulation algorithm can predict spurious behaviors when the qualitative model does not explicitly preserve an invariant quantity (for example, energy) from a qualitative state to its successors. If the manifestations of an actual fault happen to match a predicted but spurious behavior, then the system will find no discrepancy and the fault will go undetected. However, this problem has been substantially reduced by introducing several global constraints, that is, constraints that eliminate spurious behaviors through global consistency checks, such as applying the nonintersection constraint to trajectories in qualitative phase space, automatically deriving energy constraints by recognizing conservative and nonconservative forces, and using higher-order derivative constraints.

The Qsim algorithm guarantees that all behaviors are predicted, and only under a qualitative level of description does this give a tractable set of possibilities. In simple cases such as the water heater, this is tractable in practice as well as in theory. For more complex systems, controlling the size of the hypothesis set is still a potential problem.

Related work

Kay13 has demonstrated the Mimic approach in monitoring the pump-down phase of a vacuum system for semiconductor fabrication, which requires ultrahigh vacuums (10^-9 Torr). Since no practical theory exists for the sorption of gases, it is difficult to model the process numerically. Kay’s semiquantitative model, with dynamic envelopes that bound the expected observations, permits reasoning with uncertainties and still achieves detection of faults early in the pump-down phase.

Abbott’s approach to monitoring and diagnosis,14 like Mimic, takes advantage of the sequence in which symptoms appear, although the mechanisms differ somewhat. Her Draphys system detects symptoms (discrepancies) by comparing sensor readings to expected values computed from a numerical simulation model of the fault-free system. Fault hypotheses are then generated by tracing upstream from the symptom through a graph model of the paths of interaction among components, tracing both functional and physical paths. As new symptoms appear, Draphys tests each existing hypothesis to see if propagating its effects further downstream in the graph model covers the new symptoms. This latter step is akin to Mimic’s tracking, but at a more abstract level. Specifically, the graph model in Draphys represents only that a fault in one component might affect another component; there is no information about whether the affected sensor should read high or low, nor about time delays in fault propagation. Such information could be used to refute some hypotheses. Mimic can refute some hypotheses because the semiquantitative model it uses during tracking provides such information.

Isermann’s model-based approach to process fault diagnosis,6 like Mimic, uses dynamic mathematical models and measurable input and output signals to allow estimation of unmeasurable internal quantities, which can then be used for fault detection. Unlike Mimic, however, Isermann’s models are strictly quantitative and are expected to “describe the process behavior precisely.” The resulting approximate-matching problem (to determine if an observation is “normal”) is handled with a Bayes decision algorithm. After recognizing a symptom, this system classifies the fault by comparing it with fault signatures, which have been established beforehand. Although Isermann’s work uses different methods for simulation, measurement interpretation, and diagnosis, he reaches a conclusion that we share: “Dynamic process behavior yields considerably more information on process faults than can be achieved in the static case.”

Several expert systems have been built that share Mimic’s operational goal, that of relieving some of the monitoring burden from process operators.15 Mimic focuses primarily on determining the physical system’s state, but most of these expert systems have the broader scope of trying to advise the operator on corrective actions. ESCORT,2 for example, gets its knowledge of faults, anomalies, and corrective actions through the usual process of codifying human expertise in rules, rather than by encoding a predictive model of the physical system as Mimic does.

COMPARED TO EXISTING METHODS based on fixed-threshold alarms, fault dictionaries, decision trees, and expert systems, our method for monitoring and diagnosing process systems has several advantages beyond those we’ve already discussed:

• A semiquantitative model of a physical system can predict all possible behaviors that are consistent with the incomplete and imprecise knowledge of the system’s devices and processes. This ensures, for example, that rare but hazardous behaviors will not be overlooked.

• By using a structural model of the plant and tracing upstream from the site of unmatched observations, model-based diagnosis generates fault candidates efficiently, without resorting to precompiled (and often incomplete) symptom-fault patterns.

• By injecting a hypothesized fault into the model and tracking its predictions against observations, the dynamic behavior of the plant is exploited to corroborate or refute hypotheses.

• By simulating ahead in time from the current state, an operator can be warned of nearby undesirable states that the plant might enter. Similarly, the effects of proposed control actions can be determined by simulating from the current state.

WITH MORE MONITORED VARIABLES AND MORE SYSTEM ACTIVITY, MORE OPPORTUNITIES ARISE TO REFUTE INCORRECT HYPOTHESES.

The three foundational technologies that support Mimic’s method of monitoring and diagnosing process systems - semiquantitative simulation, tracking, and model-based diagnosis - continue to be active areas of research. Mimic stands to inherit many benefits from this research. For example, recent research8 has improved quantitative reasoning mechanisms to provide tighter bounds on predictions of semiquantitative models. Also, research on reasoning about energy16 has eliminated an important source of spurious behaviors in qualitative simulation.

An important task that we have not discussed is model building. Model-based reasoning can and should be decomposed into a model-building task, which creates semiquantitative differential equations from a higher-level description of a physical system, and a simulation task, which takes the equations and predicts possible behaviors. Although we described our example model at the level of semiquantitative differential equations used in Qsim, it is usually more convenient to describe a model at a higher level of abstraction. For example, a device ontology17 views a system as a collection of interconnected devices (such as tanks, pumps, and pipes), while a process ontology18 views a system as a set of processes (such as liquid flow and heat flow) plus the preconditions that enable each process. These popular ontologies can be compiled into the qualitative mathematics of Qsim, but additional work on automated model building is needed to add partial quantitative information and permit automatic injection of faults.

Acknowledgments

We thank the guest editors and anonymous referees for pointing out several shortcomings in the earlier version of this article. Daniel Dvorak has been supported by the AT&T Doctoral Support Program. Benjamin Kuipers has been supported in part by NSF grants IRI-8602665, IRI-8905494, and IRI-8904454, by NASA grants NAG 2-507 and NAG 9-200, and by the Texas Advanced Research Program under grant 003658175.

References

1. C. Perrow, Normal Accidents, Basic Books, New York, 1984.

2. P.A. Sachs, A.M. Paterson, and M.H.M. Turner, “ESCORT - An Expert System for Complex Operations in Real Time,” Expert Systems, Vol. 3, No. 1, Jan. 1986, pp. 22-29.

3. R.A. Touchton, “Emergency Classification: A Real-Time Expert System Application,” Proc. Southcon 1986, Electronics Conventions Management, Los Angeles, 1986, pp. 2,321-2,323.

4. P.J. Denning, “Towards a Science of Expert Systems,” IEEE Expert, Vol. 1, No. 2, Summer 1986, pp. 80-83.

5. R. Davis and W. Hamscher, “Model-Based Reasoning: Troubleshooting,” in Exploring Artificial Intelligence: Survey Talks from the Nat’l Conferences on Artificial Intelligence, H.E. Shrobe, ed., MIT Press, Cambridge, Mass., 1988, pp. 297-346.

6. R. Isermann, “Process Fault Diagnosis Based on Dynamic Models and Parameter Estimation Methods,” in Fault Diagnosis in Dynamic Systems: Theory and Applications, Chapter 7, R. Patton, P. Frank, and R. Clark, eds., Prentice Hall, Englewood Cliffs, N.J., 1989.

7. B. Kuipers, “Qualitative Simulation,” Artificial Intelligence, Vol. 29, No. 3, Sept. 1986, pp. 289-338.

8. D. Berleant and B. Kuipers, “Qualitative-Numeric Simulation with Q3,” to appear in Recent Advances in Qualitative Physics, B. Faltings and P. Struss, eds., MIT Press, Cambridge, Mass., 1991.

9. D. Dvorak and B. Kuipers, “Model-Based Monitoring of Dynamic Systems,” in Proc. 11th Int’l Joint Conf. Artificial Intelligence (IJCAI 89), Morgan Kaufmann, San Mateo, Calif., 1989, pp. 1,238-1,243.

10. K.D. Forbus, “Interpreting Measurements of Physical Systems,” Proc. Fifth Nat’l Conf. Artificial Intelligence (AAAI 86), MIT Press, Cambridge, Mass., 1986, pp. 113-117.

11. R. Simmons and R. Davis, “Generate, Test, and Debug: Combining Associational Rules and Causal Models,” Proc. 10th Int’l Joint Conf. Artificial Intelligence (IJCAI 87), Morgan Kaufmann, San Mateo, Calif., 1987, pp. 1,071-1,078.


12. E. Scarl, “Sensor Failure and Missing Data: Further Inducements for Reasoning with Models,” Proc. 1989 AAAI Workshop on Model-Based Reasoning, 1989, pp. 1-6, available from E. Scarl, ed., Boeing Computer Services, M/S 7L-64, Seattle, Wash.

13. H. Kay, Monitoring and Diagnosis of Multitank Flows Using Qualitative Reasoning, master’s thesis, Univ. of Texas at Austin, 1990.

14. K. Abbott, Robust Fault Diagnosis of Physical Systems in Operation, doctoral dissertation, Rutgers Univ., New Brunswick, N.J., 1990.

15. D. Dvorak, “Expert Systems for Monitoring and Control,” Tech. Report AI87-55, Dept. of Computer Sciences, Univ. of Texas at Austin, 1987.

16. P. Fouche and B. Kuipers, “Reasoning About Energy in Qualitative Simulation,” to be published in IEEE Trans. Systems, Man, and Cybernetics, 1991.

17. J. de Kleer and J.S. Brown, “A Qualitative Physics Based on Confluences,” in Qualitative Reasoning about Physical Systems, D.G. Bobrow, ed., MIT Press, Cambridge, Mass., 1985, pp. 7-83.

18. K.D. Forbus, “Qualitative Process Theory,” in Qualitative Reasoning about Physical Systems, D.G. Bobrow, ed., MIT Press, Cambridge, Mass., 1985, pp. 85-168.

Further reading

D. Dalle Molle, Qualitative Simulation of Dynamic Chemical Processes, doctoral dissertation, Univ. of Texas at Austin, 1989.

Fault Diagnosis in Dynamic Systems: Theory and Applications, R. Patton, P. Frank, and R. Clark, eds., Prentice Hall, Englewood Cliffs, N.J., 1989.

T.J. Laffey et al., “Real-Time Knowledge-Based Systems,” AI Magazine, Vol. 9, No. 1, 1988, pp. 27-45.


O.O. Oyeleye and M.A. Kramer, “Qualitative Simulation of Chemical Process Systems: Steady-State Analysis,” AIChE J., Vol. 34, No. 9, Sept. 1988.

Q. Shen and R. Leitch, “Synchronized Qualitative Simulation in Diagnosis,” to be published in Working Papers Fifth Int’l Workshop Qualitative Reasoning about Physical Systems, AI Lab, Univ. of Texas at Austin, 1991.

V. Venkatasubramanian and S.H. Rich, “An Object-Oriented Two-Tier Architecture for Integrating Compiled and Deep-Level Knowledge for Process Diagnosis,” Computers and Chemical Eng., Vol. 12, No. 9/10, 1988, pp. 903-921.


Daniel Dvorak is a distinguished member of technical staff at AT&T Bell Laboratories and a doctoral candidate in computer science at the University of Texas at Austin. His research interests include model-based reasoning, qualitative reasoning, and case-based reasoning. Recent work has focused on knowledge-based monitoring of the UUCP computer network.

Dvorak received his BS in electrical engineering from Rose-Hulman Institute of Technology in 1972 and his MS in computer engineering from Stanford University in 1974. He is a member of the IEEE Computer Society, IEEE, AAAI, and Computer Professionals for Social Responsibility.

His address is AT&T Bell Laboratories, 2000 N. Naperville Rd., Naperville, IL 60566-7033; e-mail, [email protected]

Benjamin Kuipers is an associate professor of computer science at the University of Texas at Austin and a new member of IEEE Expert’s editorial board (his photo appears on p. 2). His research interests include commonsense knowledge; qualitative reasoning with incomplete knowledge; resource-limited inference; and spatial exploration, learning, and problem solving.

Kuipers received his BA in mathematics from Swarthmore College in 1970 and his PhD in mathematics from the Massachusetts Institute of Technology in 1977. He is a member of AAAI, ACM, the Cognitive Science Society, and the New York Academy of Sciences.

His address is Department of Computer Science, University of Texas at Austin, Austin, TX 78712; e-mail, [email protected]
