
Int. J. Pres. Ves. & Piping 12 (1983) 63-105

Evaluation of the Risk of Pressure Vessel Failure Due to Errors in the Manufacturing Process

D. L. Marriott

Visiting Associate Professor, Materials and Design Division, Department of Mechanical and Industrial Engineering,

University of Illinois at Urbana-Champaign, Urbana, Illinois, USA

and

C. J. E. Beyers

Standards Division, Licensing Branch, South African Atomic Energy Board, Pelindaba, South Africa

(Received: 17 February, 1982)

INTRODUCTION

Increasing costs and concern over safety have created a need to demonstrate the failure probability of engineering structures in a quantitative manner. In the case of large one-of-a-kind structures, such as bridges and pressure vessels, it is not possible to use a statistical approach based on direct observation of service failure rates for this purpose, and it becomes necessary to infer failure probability from indirect sources.

In recent years there have been several major studies of pressure vessel failure probability, with special interest in the integrity of light water reactor vessels. This work falls into two main categories.

(i) Statistical analyses of the service performance of populations of pressure vessels.

(ii) Theoretical analysis of failure probability using so-called probabilistic fracture mechanics.

Int. J. Pres. Ves. & Piping 0308-0161/83/0012-0063/$03.00 Applied Science Publishers Ltd, England, 1983. Printed in Great Britain

In the first category three studies deserve special mention. These are: Kellerman; 1 an ongoing study by the Safety and Reliability Directorate of the UKAEA; 2,3 and a project undertaken by the ACRS in the USA, which was one of the reference documents for the Reactor Safety Study. 4,5 All these studies refer to very mixed populations of vessels. In all cases the failure rate for all pressure vessels, taken as a single population, is of the order of 10^-5. This figure is not representative of any sub-population, however. In particular it is difficult to draw any firm conclusions for the sub-population of special interest in safety studies, i.e. the one consisting of heavy section, all-welded construction. To overcome this problem each of the studies quoted included some attempt to reinterpret the overall statistics so as to apply more specifically to heavy wall construction. Smith and Warwick 3 addressed the problem by eliminating all but nuclear vessels from their base population and calculating a failure rate specifically for these components. Unfortunately this simply reintroduced the sample size problem, and the resulting very pessimistic estimate of failure rate is of little value. Kellerman and the ACRS both inferred significantly lower failure rates for nuclear vessels than for vessels for other applications. Each study predicted an improvement in failure probability to about 10^-6/vessel year. However, Kellerman's analysis contains basic errors in the interpretation of the data. When these are corrected a higher figure is actually calculated for heavy vessels than for pressure vessels as a whole. The ACRS approach was based on subjective judgement. Experts in the manufacture of nuclear vessels were asked to rate the quality of nuclear vessels compared with non-nuclear components. The consensus was that reliability would be improved by a factor of between 10 and 100.
While the increased surveillance undertaken in a nuclear quality assurance (QA) program is expected to bring about some improvement in performance, there is no clearly defined link between QA activities and subsequent component behaviour at present, in the absence of which it is not possible to say whether the improvement is a matter of some orders of magnitude, or whether it is only a few per cent.

It can be concluded from the above discussion that, while statistical studies may have provided some useful insight into the problem, their quantitative predictions cannot be accepted without reservation at this time.

The literature on theoretical analysis of pressure vessel failure probability has grown rapidly in recent years, and has been surveyed comprehensively by Johnson.6 The first analysis of significance was made by Becher and Pederson.7 They assumed a uniform stress field in a cylindrical vessel under internal pressure. The only mode of failure to be considered was fast fracture from a defect following growth to a critical size by fatigue. Since then several similar analyses have been published. 8-14 Each of the analyses referred to contains different refinements of the original Becher and Pederson problem but, in the one essential feature of failure mode, they are all limited to the same single mode of fast fracture following slow fatigue growth. It is also assumed in every case that the only contributions to variability consist of the inevitable variations of parameters of manufacture and environment which are expected within normal limits of statistical control. Variations of this type are well characterised by bell-shaped frequency distributions of Gaussian form, with a peak around some mean value and rapidly diminishing tails (see Fig. 1(a)).

These two assumptions regarding the source of failures lead to a very restrictive model which does not coincide with the observations of many service failures. The fatigue/fracture failure model is only one, and a relatively infrequently observed, member of a wide spectrum of metallurgical causes of failure. Furthermore it is seldom, if ever, that the variabilities leading to a service failure are entirely the result of variations within statistical control. More often than not, a major contribution to the failure is some gross variation from expectation, due for instance to complete failure of a processing stage, introduction of environmental factors which were not considered at all at the design stage, or incorrect installation. All these types of error can be identified as major contributors to the failure of pressure vessels, 15-19 as well as bridge failures 20,21 and aircraft crashes such as the Turkish Airlines DC-10 disaster. 22 From the UKAEA studies 2,3 it is possible to deduce that a large proportion of those failures for which causes were given contained a significant element of gross error. Gross errors in manufacture and operation have been recognised in the past as a factor to be considered in reliability studies, e.g. Bompas-Smith. 23 The effect of such errors is, unfortunately, to add contributions to the tails of both load and strength distributions for the component, without any easily perceived trend. Unlike variations within statistical control, there is no necessary relationship between the probability of occurrence of a gross error and the consequent deviation from a mean state. The resulting probability distribution can be represented schematically as shown in Fig. 1(b).

Fig. 1. Failure probability diagrams for variations (a) within controlled limits and (b) outside controlled limits. [Each panel plots the probability density f(x) against X = S (strength) or L (load); f_L(X) and f_S(X) are the p.d.f.s of L and S, and the shaded area indicates p(failure).]
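The contrast between variations within statistical control and gross errors can be illustrated numerically. The sketch below is not taken from the paper: it assumes independent Gaussian load and strength for the controlled case, and it represents gross errors by a simple two-component mixture, which is only one of many possible forms since the text stresses that gross errors follow no perceived trend. All numerical values, and the names `interference_failure_probability` and `failure_probability_with_gross_error`, are hypothetical.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def interference_failure_probability(mean_s, sd_s, mean_l, sd_l):
    """P(L > S) for independent Gaussian strength S and load L.

    The margin S - L is Gaussian, so failure is the event margin < 0.
    """
    margin_mean = mean_s - mean_l
    margin_sd = math.sqrt(sd_s ** 2 + sd_l ** 2)
    return normal_cdf(-margin_mean / margin_sd)


def failure_probability_with_gross_error(p_controlled, q_gross, p_given_gross):
    """Assumed mixture: controlled production plus a rare gross error.

    q_gross is the probability that a gross error occurred and
    p_given_gross the failure probability if it did (both hypothetical).
    """
    return (1.0 - q_gross) * p_controlled + q_gross * p_given_gross


# Illustrative numbers only (arbitrary stress units), not from the paper.
p_controlled = interference_failure_probability(
    mean_s=400.0, sd_s=20.0, mean_l=250.0, sd_l=25.0
)
p_total = failure_probability_with_gross_error(
    p_controlled, q_gross=1e-3, p_given_gross=0.5
)
print(f"within statistical control: {p_controlled:.2e}")
print(f"including gross errors:     {p_total:.2e}")
```

With these assumed inputs the gross-error term dominates the total by more than two orders of magnitude, which is the sense in which a model restricted to controlled variation would be optimistic.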

The problem of gross errors has also been identified in one of the most recent studies of nuclear vessel integrity, the Marshall Report. 24 In this report the problem is reviewed in the introductory chapter, but is specifically excluded from further consideration on the grounds that it is the task of quality assurance (QA) to avoid such failures. In terms of design strategy to minimise failure this stance may be acceptable. However, sources of gross error cannot be ignored if the objective is to assess the risk of failure. To do so is to assume that QA procedures are 100 per cent effective. The examples referred to earlier, as well as other sources of service failure experience, show that errors in QA procedure clearly contribute significantly to the overall risk of failure. In fact the case studies published by the American Society for Metals, 25 and summarised by the author, 26 suggest that the relative frequency of gross errors, falling into the category of QA failures, is higher in advanced technology industries such as defence, aerospace and nuclear fields than in general engineering applications. In the nuclear field in particular, examination of the data on plant construction and operation produced by the USNRC 27 clearly shows that the problem has not been eradicated by the imposition of rigorous QA requirements.

It is apparent that gross errors, or QA failures, play an important role in determining the reliability of mechanical components. Although the problem has been recognised before, no attempt has been made to date to include its effect in any theoretical model of structural reliability, with the result that existing models are likely to be limited in scope and, more importantly, optimistic in their predictions. In view of the neglected state of the problem a great deal of work needs to be done to develop understanding to the same level as the rest of reliability theory. This paper describes an initial attempt to define the problem of assessing the risk of component failure caused by gross errors. It was recognised that failure can be caused by errors in design, manufacture and operation, and that, of these, operational errors are possibly the most frequent. However, in order to make a start, it was decided to examine the manufacturing phase first. The rationale for this decision was that there is more control over manufacture and it is easier to observe. This offers a greater prospect of providing a general methodology of failure assessment, which can later be generalised to other aspects of plant operation.

The starting-point of the investigation was a pilot study based on the construction of a small pressure vessel to be used in an experimental chemical process plant. The main body of this paper is devoted to a description of that pilot study, and the conclusions drawn from it.

DESCRIPTION OF THE PILOT STUDY

The objective of the study was twofold:

(i) The narrow objective was to evaluate the risk of failure of the vessel under scrutiny as part of the safety assessment of the chemical plant.

(ii) The broader objective was to develop a general assessment strategy for evaluating failure risk caused by gross errors.

Description of the vessel

The vessel was the product of a small volume mechanical engineering jobbing shop, which specialised in components for chemical process plant. This particular example was chosen for several reasons. Firstly, it was a real problem involving a significant hazard in the event of failure. Secondly, the operation was sufficiently small to be easily observable, while retaining sufficient complexity to represent the type of problem where this type of assessment might be applied in the future. Thirdly, the workshop in question was accessible, and information on all aspects of design and fabrication was readily available. This degree of accessibility was undoubtedly greater than is normally afforded in industry, and made the task of problem identification considerably easier than would otherwise have been the case.

Fig. 2. Pressure vessel description and dimensions. = denotes weld. [Dimensions marked on the drawing: 1520 mm, 150 mm, and a third value printed as '4 4,57 mm'.]

The vessel dimensions are given in Fig. 2. Further details of materials and construction methods are given in Table 1. The design procedure was according to ASME VIII. Manufacturing control and material supply requirements were based on ASME VIII practice, but with modifications to allow for local problems of material availability and manufacturing methods.

TABLE 1
Pressure Vessel Design Specifications

(A) Design requirements
    (i) Loading: internal pressure on a 3 to 7 day load/unload cycle.
    (ii) Temperature: ambient indoor to 80 °C. Full pressure loading only applied at elevated end of temperature range.
    (iii) Environment:
        (a) Internal content: halogen compounds in all three states.
        (b) External: normally clean and dry, but under frequent upset conditions deposits of halogen compounds and moisture from leaking pipes.
(B) Material and manufacture
    (i) Design and fabrication code: ASME Boiler and Pressure Vessel Code, Section VIII (adapted to suit local supply and other restrictions).
    (ii) Material: low carbon, semi-killed steel boiler plate to BS 1501:161:28A, local supplier. Thickness = 12 mm.
    (iii) Forming: all plate bending and pressing performed cold in one operation.
    (iv) Welding: manual, argon shielded tungsten arc method (TIG).
    (v) Inspection:
        (a) All stages visually inspected by independent inspector.
        (b) All weld runs subject to dye penetrant testing.
        (c) Final welds subject to 100 per cent X-radiography.
    (vi) Mechanical testing: final assembly hydrotested to 130 per cent of design pressure.

No specific QA programme was drawn up for the vessel, it being considered that its size did not justify the expense. However, the manufacturer had experience of fabricating similar components, and had developed a set of internal procedures which constituted a de facto QA programme.

There is no doubt that the vessel studied in no way represents the complexity to be expected in the manufacture of a large, thick-walled vessel for nuclear application. However, it is believed that it is valid to deliberately choose a relatively simple problem initially, in order to avoid losing the basic principles in a mass of detail. It is considered that the example has sufficient elements of the manufacturing process to allow any lessons learned to be extrapolated to more complex operations at a later time.

Scope of the investigation

The scope was limited to the manufacturing phase. It was assumed that all design work had been carried out correctly, and that the only causes of failure remaining were deviations from the specifications as laid down by the design department. The steps included in the assessment began with the release of instructions from the production department, and ended with installation of the vessel. It is obviously impossible to isolate a manufacturing process completely from the development phases preceding and following it, so there was inevitably some encroachment into other areas such as design and operation. In particular, it was necessary to examine the service environment for factors which could modify, or add to, the specified design loading conditions.

Work performed in the field study

The first stage of the investigation consisted of a field study to collect basic data. One of the authors (C.J.E.B.) spent approximately one week in the workshop, observing workshop practices and interviewing personnel. At this stage only the most general notions about the eventual analysis procedure had been formulated. It was decided therefore that since the opportunity to collect information was particularly favourable, as complete a picture as possible should be built of the manufacturing process, reserving judgement on the usefulness of the information until the elements of an assessment strategy had emerged. Accordingly, data were collected on the following subjects:

(i) Design: procedures used, codes, anticipated loadings, materials.

(ii) Production planning: responsibilities, information flow, material procurement and other procedures.

(iii) Manufacturing processes: cutting, plate bending, welding, etc.

(iv) Inspection and testing: destructive and non-destructive, extent, frequency, point in manufacturing route.

(v) Installation: movements, positioning, surrounding activities during construction.

(vi) Working conditions: normal and accident environments, potential departures from, or additions to, specified loading.

Most of this information was not required in the final analysis. However, for this first attempt it would have been very difficult to make any progress toward a realistic assessment procedure without it.

Analysis of data

The first impression gained from the data was the very large number of potential errors to be found in even a modest sized manufacturing process. It was apparent that a major problem would be to find an efficient method of identifying those errors which contribute to service failure, and of rejecting errors having no safety significance. This problem would have to be solved before advancing to the calculation of failure probabilities. In common with most risk assessments, therefore, data analysis proceeded in two stages.

(i) Failure mode identification.

(ii) Failure probability estimation.

Failure mode identification

It is necessary to distinguish between two types of failure. Firstly, there is failure in the manufacturing process itself. This may be a procedural error, a deviation from specification, or failure to detect a defect by test or inspection. Secondly, there is failure of the component, as a consequence of the manufacturing failures. In any given situation relatively few of the manufacturing failures are likely to contribute to failure of the component in service. Furthermore, a component failure is invariably the consequence of a series of manufacturing errors. The identification process therefore involves an exhaustive search among many possibilities for relatively few significant combinations of elementary errors. In principle this is no different from the classical systems reliability problem. In practice, in the case of manufacturing errors, it is made more difficult by the fact that there is only an indirect connection between the elemental errors causing the failure and the form of the final failure event. This is because elemental errors in manufacture are, generally speaking, of human origin, whereas the resulting component failure is usually in the form of material deterioration. Before a sequence of errors can be judged significant, their consequence in terms of material property changes must be evaluated. In contrast, most systems subject to safety evaluation display a more direct cause-effect relationship between elemental and overall system failures. For instance the elemental failures in a typical delivery system would be 'pump fails to start' or 'valve sticks closed', and the system failure would take the form of 'failure of delivery pressure'. In situations such as this it is easy to identify failure sequences using only a simple model of the system and its interactions.

The sequential nature of failures in general suggested that an event tree approach might be most appropriate to identify significant combinations of errors (see Reference 5 for details of event tree analysis). In fact, this turned out not to be so, mainly because of the large number of sequences of errors possible when human errors are taken into account. The number of branches in an event tree has a maximum of 2^n, where n is the number of events in a sequence. As will be shown in the welding consumable problem examined later in this paper, the number of events in a sequence of human actions can be large; in the example up to 18. The resulting number of branches is too large to handle without some method of rejecting meaningless branches at an early stage, and this in turn cannot be done until the consequence of each sequence has been worked out in terms of its effect on material condition. The only application for the event tree approach was found to be in analysing sequences of events with well defined starting and end points, e.g. the welding problem already mentioned.
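The growth in the number of branches is easy to check. The short sketch below simply evaluates the upper bound of 2^n for a few sequence lengths, including the 18 events quoted for the welding consumable example; the function name is a hypothetical label, and the bound assumes every event has exactly two outcomes.

```python
def max_event_tree_branches(n_events: int) -> int:
    """Upper bound on end branches when each event has two outcomes."""
    return 2 ** n_events


for n in (5, 10, 18):
    print(n, max_event_tree_branches(n))
# prints:
# 5 32
# 10 1024
# 18 262144
```

A sequence of 18 binary events therefore allows up to 262 144 distinct paths, each of which would need its material consequence assessed before it could be rejected.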

The alternative method of identifying failure sequences is the so-called top-down approach, using fault tree analysis. 5,28 This approach starts at the final failure, the 'top event', then traces the causes of this top event down through progressively more detailed levels of the system until some basic initiating events are reached. Fault tree construction has the attribute that it does not lead to wasted effort in investigating large numbers of spurious branches. The only drawback is that an exhaustive set of top events must be known beforehand. This is an acceptable situation in systems where undesirable events can be easily defined in terms of the operation of the system, e.g. failure to start or failure to deliver power. In the case of material failures, however, it is not obvious which top events are relevant until events at a lower level on the fault tree have been defined. This statement will be more apparent following some explanation of the possible classes of failure to be observed in a mechanical component. Observations from case studies 25,26 suggest that service failures may be grouped into three classes:

(i) Design load cases: In the design specification certain failure modes are basic to the process of defining allowable stress levels, e.g. yield stress and bursting failure by over-pressurisation. These failure modes are easily identified by examination of the list of assumed design load conditions. In the case of nuclear components this would include several categories, such as normal, upset and accident conditions.

(ii) Potential failures inherent in the manufacturing process: These are invariably forms of material deterioration which would not be acceptable if allowed to pass into service, but constitute a risk which is recognised and prevented by careful control over manufacturing operations. An example of this type is the risk of hydrogen embrittlement in the welding of medium carbon steels. The only way of removing the inherent nature of the risk is to change the design to use lower stress levels or less critical materials. Either of these moves almost certainly imposes a weight or cost penalty, so the risk is accepted and then minimised by careful control.

(iii) Error induced failures: These are failure mechanisms which would not have been considered feasible at the time of construction but, as a result of errors which effectively change the construction route, a component is produced whose sensitivity to material deterioration is basically different from the original design. A common form of this type of failure is a material substitution which leads to accelerated deterioration in the working environment by a mechanism not experienced by the intended material of construction. The danger of this class of failure is that, since it is not expected under normal circumstances, there may be no provision in the surveillance program for its detection, either during construction or in service.

Errors in the third class do not simply increase the risk of failure. They actually create completely new top events. Identification of this class requires not only the specified production system to be investigated, but also all possible parallel systems which can be constructed as a result of deviations at the detailed process element level. In terms of more conventional systems, such as piping or electrical systems, this is equivalent to examining the likelihood of incorrectly connected pipe runs, extra units added to the system and substitution of specified components by others of different characteristics. Problems of this nature have been known to be the cause of failure in conventional systems, e.g. incorrect cable routing at Brown's Ferry nuclear plant, 29 but it is not normal practice to take them into account in systems reliability assessment. The reason for this omission is unknown. Presumably, it is assumed that their relative frequency is low enough in conventional systems to be neglected. This is certainly not the case where material failures are concerned. While interpretation of reports of service failures can only be done with caution, it is possible to conclude, with a reasonable degree of confidence, that most failures involving material deterioration contain at least an element of the third class of failure described earlier, if not in the manufacturing route, then through analogous errors in either design or operation. Both the Immingham and Tippi Oy vessels, 16,17 for instance, were subject to errors in their heat treatments; a gross error in material and errors in heat treatment figure in the Thiokol Motor Casing failure; 19 similar instances can be found for less spectacular failures in the ASM Handbook on Failure Analysis, 25 and other sources. 30,31

Fig. 3. Fault tree for disruptive failure: (a) main diagram, (b) supplementary diagram for material deterioration. [Legible node labels: Disruptive failure in service; Brittle fracture; Ductile fracture; Strain age embrittlement; Material deterioration in service; Cracking; Weld metal substitution; Material substitution; Wrong material supplied with false certificate; Wrong material issued from store; Substitute material sensitive to strain age; Deformation before hydrotest. Probabilities marked on the branches are 5 x 10^-3, approximately 10^-5, and 1.]

It was eventually possible to represent the failure mechanisms identified for the subject of the pilot study in the form of a fault tree, as shown in Fig. 3. However, this figure is little more than a convenient way of displaying the results. During the course of the study the actual work of identifying and analysing failure mechanisms was carried out in an intuitive manner, iterating between examination of the actual processes and generic material failure mechanisms to find points of correspondence between possible deviations and feasible failure mechanisms. This procedure was not the basis of a useful general strategy for subsequent analyses, but it is believed to have been reasonably exhaustive, and gave guidance on how a more systematic approach might be developed. Alternative methods will be discussed later in this paper. The following section summarises the findings of the assessment of the pressure vessel.
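The way probabilities combine in a fault tree of the kind shown in Fig. 3 can be sketched as follows. This is a minimal illustration, not the authors' tree: the gate structure is a simplified assumption, the basic events are assumed independent, and the probabilities are chosen only to be of the same order as values marked on the figure. The class names `BasicEvent` and `Gate` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class BasicEvent:
    name: str
    probability: float


@dataclass
class Gate:
    name: str
    kind: str  # "AND" or "OR"
    inputs: List[Union["Gate", BasicEvent]] = field(default_factory=list)


def probability(node) -> float:
    """Probability of a node, assuming all basic events are independent."""
    if isinstance(node, BasicEvent):
        return node.probability
    probs = [probability(child) for child in node.inputs]
    if node.kind == "AND":
        result = 1.0
        for p in probs:
            result *= p
        return result
    if node.kind == "OR":
        none_occurs = 1.0
        for p in probs:
            none_occurs *= 1.0 - p
        return 1.0 - none_occurs
    raise ValueError(f"unknown gate kind: {node.kind}")


# Assumed, simplified structure with illustrative probabilities.
sae = Gate("strain age embrittlement", "AND", [
    BasicEvent("material substitution", 5e-3),
    BasicEvent("substitute material sensitive to strain ageing", 1.0),
    BasicEvent("deformation before hydrotest", 1.0),
])
top = Gate("disruptive failure in service", "OR", [
    Gate("brittle fracture", "OR", [sae]),
    BasicEvent("ductile fracture", 1e-5),
])

print(f"{sae.name}: {probability(sae):.3e}")
print(f"{top.name}: {probability(top):.3e}")
```

In this sketch the top event is dominated by the single substitution error, because the events that follow it are taken as certain once the wrong material is in the vessel.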

Description of identified failure modes

By examination of the processes and materials involved it is possible to identify a large number of potential failure mechanisms. This number could be rapidly reduced by limiting the investigation to failure modes displaying disruptive forms of release. The alternative form of release is a leak which, in the context of the chemical plant operation, would not constitute a hazard, and in fact could be dealt with adequately using normal operating procedures. Further reduction in the number of significant mechanisms was possible by excluding any that would be revealed during the hydrotest. An obvious example is gross yielding due to sub-standard proof strength. Other examples include inadvertent use of sub-thickness plate, and any form of embrittlement which shows up immediately during fabrication. Eventually three failure mechanisms of major significance were identified. These were:

(i) Strain age embrittlement.

(ii) Low cycle fatigue.

(iii) Stress corrosion cracking.
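The two screening criteria described above (disruptive release, and not revealed by the hydrotest) amount to a simple filter over the candidate mechanisms. The sketch below is illustrative only: the `Mechanism` record and its two flags are hypothetical, the flag values are a reading of the text, and the leak entry is an invented example of a non-disruptive release.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mechanism:
    name: str
    disruptive: bool             # release is disruptive rather than a leak
    revealed_by_hydrotest: bool  # would show up during the proof test


def significant(mechanisms):
    """Keep disruptive mechanisms that the hydrotest would not reveal."""
    return [
        m for m in mechanisms
        if m.disruptive and not m.revealed_by_hydrotest
    ]


candidates = [
    Mechanism("gross yielding (sub-standard proof strength)", True, True),
    Mechanism("sub-thickness plate", True, True),
    Mechanism("leak at a weld defect", False, False),  # hypothetical entry
    Mechanism("strain age embrittlement", True, False),
    Mechanism("low cycle fatigue", True, False),
    Mechanism("stress corrosion cracking", True, False),
]

for m in significant(candidates):
    print(m.name)
```

Applied to this assumed candidate list, the filter leaves the three mechanisms named in the text.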


Failure by any of the above mechanisms can occur in two ways. The cause can be either stacking up of parameter variations within acceptable control limits, or gross errors which result in major deviations. Only the first of these has the characteristic of a smaller probability of occurrence as the deviation from a mean condition increases. When failure is caused by completely new factors being introduced due to gross errors, there is no reason to believe that any relationship will exist between the degree of structural degradation and the relative frequency of the error which causes it. For instance the effect of performing a heat treatment incorrectly, e.g. at the wrong temperature, can in some cases be more damaging to fracture toughness than omitting the treatment altogether. The consequence of a gross error is therefore likely to be a completely random change in material properties, with virtually unlimited range. Not all such changes will necessarily constitute a degradation; it is even possible to achieve an improvement in properties by accident. However, it can be assumed that a modern production process has been reasonably well optimised, and that any serious deviations from it will tend toward a loss of desirable properties in the end product. It seems reasonable therefore to assume a simple binary model for the consequences of gross errors. If the error creates the conditions which allow a mechanism to occur, the risk of failure by that mechanism will be unacceptably high; otherwise the error is judged insignificant. It is sufficient, in these circumstances, to limit the analysis to simple deterministic calculations. Since this paper is concerned primarily with the gross error problem, the following assessments are approximate, in keeping with the above arguments. It is believed that the level of complexity is sufficient for the purpose of risk assessment, where it is not the object of the exercise to recreate the original design analysis.

Strain age embrittlement. Strain age embrittlement (SAE) was identified as a potential failure mechanism because the specified material, a semi-killed steel, is known to be marginally susceptible to this form of embrittlement. 32 This fact does not, by itself, suggest that the risk of SAE is a high one, but is simply the means of identifying it as a possibility for further consideration. Before it can be judged a high risk other conditions must be satisfied. These may be determined by studying the basic phenomenon of SAE.

Fig. 4. Examples of patterns in observed failure sequences: (a) strain-age embrittlement, (b) stress corrosion of stainless steel. [Panel (a) stages: susceptible material, plastic deformation, moderate temperatures, no annealing heat treatment; the examples include a crane component. Panel (b) stages: susceptible material, sensitisation, plastic deformation, environment, no heat treatment; the examples are PWR piping and a spun head.]

Rimmed, and to a certain extent semi-killed, steels contain free carbon and nitrogen atoms in solid solution. According to some authorities 33,34 these solute atoms migrate to the sites of newly formed dislocations following plastic deformation, locking the dislocations in place. The effect at the macroscopic level is a time-dependent rise in yield stress, loss of ductility, and degradation of toughness, usually displayed as a rise in the brittle/ductile transition temperature. While the metallurgical reasons for SAE are understood, 33,34 quantitative data on its effect on material properties are virtually non-existent. Eventually, the only information of any practical use proved to be detailed descriptions of service failures in which SAE was believed to have played a part. 18,35 From these accounts, and others not published in the open literature, 36 it was observed that an SAE induced failure follows a definite sequence of events, as illustrated in Fig. 4, i.e.:

(i) Existence of susceptible material.

(ii) Cold work in excess of about 4 per cent.

(iii) Application of low heat (e.g. circa 200 °C or less). This event is not necessary, but speeds up the embrittlement rate by increasing the diffusion rate.

(iv) No post-work anneal to dissipate the locking process. Or

(v) No incubation period between fabrication and proof testing, so that embrittlement only develops in service.

(vi) Service conditions to include loading at ambient or sub-ambient temperature.

It is unlikely that every component experiencing the above conditions will actually fail in service. The binary model is conservative. In practice, failure will also depend on the degree of susceptibility of the material and the severity of the embrittling environment it experiences. As understanding of the phenomenon grows it is anticipated that a more precise model of SAE will be developed. In the meantime the binary model is the only one available, and it must be assumed that occurrence of the event sequence listed above is an empirically established indicator of a high, but currently unquantifiable, risk of failure.

Although SAE was identified as a potential mechanism of failure because the specified material was a semi-killed steel, the actual risk comes from a different source. According to the documentation which accompanied the delivered plate, the material used in construction was not semi-killed but fully killed, containing about 0.2 per cent silicon, but otherwise to the same specification as the ordered plate. If the composition of the supplied plate could have been guaranteed, it would have been possible to ignore the prospect of SAE. Since one substitution had already been made, the procurement process was reviewed for further opportunities of material substitution. It was found that mixups could occur before delivery, resulting in the wrong test certificate being attached to a plate, and in the store issuing procedure in the shop itself. In either case the substitute material could be rimmed, construction grade steel, which would be strongly susceptible to SAE if subjected to the processing used in manufacture of the vessel under consideration. For this reason SAE was recorded as a feasible failure mechanism.

Low cycle fatigue. If the code design route was followed correctly, the peak strain in the vessel, due to the pressure cycle alone, is limited to approximately twice the yield strain. For the specified material, this value is of the order of 0.23 per cent. According to design curves given in ASME III, 37 the corresponding design life would be about $10^4$ cycles, with a factor of safety on life of 20. The expected 1000 cycles is therefore exceeded by a considerable margin. Fatigue in the absence of an initial defect or a severe thermal cycle can be discounted. Details of the chemical processes involved cannot be discussed, but it can be stated that the thermal gradients experienced by the vessel are negligible, so that thermal fatigue is not a significant factor.

It is still possible that fatigue cracks could propagate from initial weld defects. No initial defects were found during manufacture, but recognising that non-destructive examination is not 100 per cent effective, it was considered that the consequences of an initial defect should be investigated.

The problem is that neither the size nor the location of a hypothetical defect is known. The procedure adopted was to assume that the defect occurs at the point of peak stress, and to calculate the size of defect which would lead to significant growth. This is conservative, but not as much as it appears at first sight, because the most likely points for the formation of initial defects are the same weld features which cause the stress concentrations. A detailed stress analysis is required to determine the peak stress range accurately. An approximate estimate can be obtained from the assumption that the design analysis had been done correctly. If this is correct the peak strain range will have been limited automatically to about twice the yield strain, i.e. 0.23 per cent. This value can be used to estimate cyclic crack growth by substitution into a modified form of Paris's equation, Paris having first identified the now well known relationship between crack growth rate and stress intensity range:

$$\frac{da}{dN} = C\,(\Delta K)^{n}$$

where $a$ is the crack depth (edge crack) or half the crack depth (subsurface crack), $N$ is the number of cycles and $\Delta K$ is the stress intensity range.

The modified equation is due to El Haddad. 39 It replaces the stress range in the stress intensity term of Paris's equation with an equivalent stress range as follows:

$$\frac{da}{dN} = C\,\left(E\,\Delta\varepsilon\sqrt{\pi a}\right)^{n}$$

where $E\,\Delta\varepsilon$ is the 'stress intensity range', according to the ASME definition, i.e. the pseudo-elastic stress range assuming linear elastic behaviour.

Crack growth data for low carbon steels are given in Reference 39. A close fit to the experimental points is obtained with the following expression:

$$\frac{da}{dN} = 10^{-9}\,\left(E\,\Delta\varepsilon\sqrt{\pi a}\right)^{2}$$

At the peak strain range for the material the prediction of crack growth is

$$\Delta a / a = 0.15$$

If the stress conditions around a pre-existing crack are elastic, this amount of crack growth would cause an increase in the stress intensity of about 7 per cent. A crack with the potential to grow to critical size during the design life could be detected by an initial proof test at 107 per cent of the working pressure. If post-yield conditions hold, the proof test would need to be increased to 115 per cent of the working pressure to allow for the fact that post-yield fracture criteria, such as the J-integral 4 and crack opening displacement (COD) 41, are proportional to the crack size. The vessel in question passed a hydro-test to 130 per cent of the working pressure. It can be assumed therefore that, as long as the service temperature is always higher than the test temperature, and material deterioration in service is excluded from consideration, the risk of failure from fatigue can be ignored. The operating conditions of the chemical plant ensure that the vessel is not pressurised at temperatures lower than ambient, so that low temperature is not a problem. On the other hand, material deterioration in the form of SAE has already been identified as a significant failure mode. This could lead to a failure in which fatigue takes some part. However, fatigue is not an independent failure mechanism, because SAE introduces approximately the same degree of risk regardless of whether a small amount of crack growth occurs or not. It may be concluded that low cycle fatigue, in this particular example, is not a primary consideration in evaluating failure risk.
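The crack growth figure and the two proof-test margins quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes that the fitted constant applies in units of ksi and inches, that Young's modulus is 30 000 ksi, and that growth is accumulated over the design life of about $10^4$ cycles. None of these is stated explicitly in the text; they are the assumptions under which the quoted value of 0.15 is recovered.

```python
import math

# Assumed inputs. The strain range, fitted coefficient and design life come
# from the text; Young's modulus and the unit system (ksi, inches) are
# assumptions made for this sketch.
E_KSI = 30_000.0        # assumed Young's modulus of steel, ksi
STRAIN_RANGE = 0.0023   # peak strain range of 0.23 per cent
C = 1e-9                # fitted coefficient of the growth law
DESIGN_CYCLES = 1e4     # design life in cycles


def growth_exponent(c, modulus, strain_range, cycles):
    """Return C * (E * d_eps)**2 * pi * N.

    With an exponent of 2 the growth law is da/dN = C * (E * d_eps)**2 * pi * a,
    so ln(a_final / a_initial) equals this quantity.
    """
    pseudo_stress = modulus * strain_range
    return c * pseudo_stress ** 2 * math.pi * cycles


x = growth_exponent(C, E_KSI, STRAIN_RANGE, DESIGN_CYCLES)
first_order_growth = x                # small-growth estimate of delta_a / a
exact_growth = math.exp(x) - 1.0      # integrated (exponential) growth

# Elastic case: stress intensity varies as sqrt(a).
k_ratio_elastic = math.sqrt(1.0 + first_order_growth)
# Post-yield case: J and COD vary in proportion to a.
ratio_post_yield = 1.0 + first_order_growth

print(f"delta_a/a (first order) = {first_order_growth:.3f}")
print(f"delta_a/a (integrated)  = {exact_growth:.3f}")
print(f"elastic proof factor    = {k_ratio_elastic:.3f}")
print(f"post-yield proof factor = {ratio_post_yield:.3f}")
```

Under these assumptions the first-order estimate is close to 0.15, the elastic proof factor is about 1.07 and the post-yield factor about 1.15, all below the 1.30 hydro-test actually applied.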

Although it turned out not to be necessary in the final analysis, an attempt was made to predict the probability of occurrence of weld defects in the pressure vessel. The analysis, together with other assessments of event probabilities, is discussed in a later section.

Stress corrosion cracking. Under normal circumstances no unusual corrosion problems were expected with the specified design. The same grade of steel had been used before in similar environments, and it was known that the only effect was a slow, general wasting corrosion, which could be accommodated by an extra allowance on the wall thickness. In the search for alternative failure mechanisms, associated with the process, it was found that elements of a potential stress corrosion mechanism existed. These were:

(i) The presence of halide compounds in the service environment, and


(ii) The presence of stainless steel in various forms and grades in the bonded store.

It was postulated that stress corrosion cracking was feasible if it was possible to have an inadvertent material substitution. Investigation of the material procurement and issuing procedures revealed that stainless steel welding wire was kept in the same store as the specified ferritic wire. Although both rolls were carefully marked, and the issuing procedure subject to surveillance by the inspection department, no physical barrier was placed between the two materials to positively prevent a mix-up. While the two materials look different, and can be readily differentiated during welding by their distinctive handling characteristics, it was not uncommon, in this workshop, to use the stainless wire to weld ferritic-to-austenitic transitions. The unusual feel of welding ferritic plate with austenitic consumable would not therefore be a natural check. On the basis of these findings it was judged that stress corrosion of substituted welding wire was a potential failure mechanism. It became necessary to evaluate both the consequences and probability of such a substitution.

If stainless welding wire is used by mistake two failure mechanisms are possible. The first of these is a transgranular crack in the austenitic weld deposit caused by the normal vessel contents, combined with residual tensile stresses in the weldment. The second is an intergranular crack in the fusion zone where the Cr content of the austenitic phase is diluted by contact with the parent material. In this case the electrolyte would be chemical deposits on the outside surface combined with condensation from leaks. In either case a likely form of cracking would be a fairly uniform attack along a sizeable length of weld, e.g. as experienced in some of the US boiling water reactor piping. 42 On reaching a depth of 2/3 to 3/4 of the wall thickness, ductile tearing could lead to a disruptive failure.

It was considered by the authors, and confirmed by discussion with metallurgists connected with the project, that welding wire substitution would introduce a high, multiple risk of failure. This is one example where the risk is almost entirely dependent on the probability of errors in procedure. For this reason a detailed analysis was made of the welding wire issuing procedure.

Risk quantification

Assuming that the failure mechanisms identified in the previous section are the most important contributions to overall risk, the fault tree shown in Fig. 3 represents the failure logic for the system. If probabilities can be obtained for the individual events on this diagram it is possible to employ this logic structure to calculate the overall failure probability, and hence the risk. It is apparent that the top event probability is governed largely by four events:

(i) Substitution of parent plate (event A of Fig. 3(b))

(ii) Critical level of embrittlement achieved by processing (event B of Fig. 3(b))

(iii) Existence of initial defects (event C of Fig. 3(a))

(iv) Welding wire substitution (event D of Fig. 3(a)).

Error rates are difficult to obtain with any accuracy in a workshop environment. For risk assessment purposes, however, order-of-magnitude estimates are adequate. Even then the problem of data acquisition is a serious one. Several approximate methods of error estimation were used, including the following:

(i) Direct statistics from work records

(ii) Bounding estimates

(iii) Adaptation of generic data

(iv) Subjective judgement.

Parent plate substitution. Two opportunities exist for material substitution. Firstly, an error in the issuing procedure from the bonded store can occur. Secondly, the incorrect material can be delivered with a false test certificate attached. In the workshop under review it is not the normal practice to perform independent chemical analyses on every plate received when the material is supplied with a mill certificate.

The probability of the incorrect material being issued from the bonded store is a function of the control procedures in force. The procedure used to estimate the error probability is essentially the same as that used to estimate the probability of welding wire substitution in 'Probability of weld consumable substitution' in this section. No analysis will be reproduced here because the probability of incorrect issuing procedure is many times less than the probability of an error in supply, and can be ignored.

The major source of parent plate substitution is a supply error. This fact was not recognised initially, otherwise an independent chemical analysis would have been specified as part of the surveillance system. At the time of the investigation no statistics were available on the incidence of false documentation. The only recourse was to subjective methods. Several such methods exist, the simplest and best known of these being the Delphi technique. 43 This technique involves interrogating a group of individuals with experience of the problem, and converting their responses into a quantitative estimate. The method has been used in the IEEE sponsored Project 500 to augment failure rate data on electrical and electronic components, and application of the method is described in the Project 500 report. 44 The group used for this exercise had a range of experience in workshop practice, inspection and QA, and tackled the problem as one of several examples to study the potential of Delphi methods. The estimated error rate was $5 \times 10^{-3}$. It was impossible to verify this estimate at the time. If a better estimate could have been made it would have been used in preference to the Delphi estimate. Some confidence in the method was gained from the results of other examples attempted, which did have verifiable answers, varying from elementary problems such as telephone call error rates to estimation of the resolution of ultrasonic inspection equipment. The results were surprisingly good, usually giving answers well within an order of magnitude of the observed value. It was therefore considered that the Delphi method could be used with reasonable confidence. (About six months after completion of the field study some records were obtained which verified the frequency of false documentation to be between 1 in 200 and 1 in 300 cases.)
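As a small illustration of the order-of-magnitude criterion used to judge the Delphi results, the following sketch compares the panel estimate with the range later found in the records. The function name is hypothetical; the numbers are those quoted in the paragraph above.

```python
import math

DELPHI_ESTIMATE = 5e-3                 # Delphi panel estimate of the error rate
RECORDED_RANGE = (1 / 300, 1 / 200)    # later records: 1 in 300 to 1 in 200


def orders_of_magnitude_apart(estimate, observed):
    """Absolute difference between two rates, in decades."""
    return abs(math.log10(estimate / observed))


gaps = [orders_of_magnitude_apart(DELPHI_ESTIMATE, r) for r in RECORDED_RANGE]
print([round(g, 3) for g in gaps])
```

Both gaps are well under one decade, so the estimate satisfies the order-of-magnitude requirement stated for risk assessment purposes.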

Probability of plate degradation. Given that a substitution of the specified material by an inferior one such as rimming steel occurs, it is not certain that failure by SAE will be an automatic consequence. Failure depends on the degree of embrittlement being sufficient to reduce the pressure vessel to a critical state in service. A survey of the current knowledge was made in an attempt to quantify the effect. Although the literature on the subject is extensive, 33,34 virtually all the available information is concerned with qualitative understanding of the phenomenon, and numerical data relating to design and safety are virtually non-existent. When faced with the situation of knowing why something happens, but not when or by how much, it is necessary to consider other factors. The vessel under review has a high hazard potential. It is also known, from past experience, that similar circumstances have led to disruptive failures; therefore it must be assumed that the same high risk obtains in the present case until a better understanding of SAE, or specific tests on the supplied material, is able to show whether a high risk situation exists or not. The actual probability of failure by SAE is unknown, but the uncertainty, i.e. the probability of making a wrong decision, given the available information, is high and approaches unity. The probability that SAE could occur, given the available data, is therefore taken as 1 in the analysis described here. The risk eventually calculated is a subjective one related to the quality of the information supplied. The suitability of this measure for assessing the failure risk of one-of-a-kind components will be discussed in more detail at a later stage of this paper.

Probability of occurrence of weld defects. Weld defects in TIG welds are largely confined to four types: 45

(i) Tungsten inclusions, caused by electrode contacts.

(ii) Lack-of-fusion defects due to the relatively low power level.

(iii) Root cracks in the first root run.

(iv) Oxide inclusions or porosity due to failure of the argon shield.

Porosity and inclusions of all types can form initiation sites for both fatigue and brittle fracture. It is considered that, in the event of embrittlement, it is virtually certain that a discontinuity of sufficient severity will be present somewhere in the degraded area, in the form of surface irregularities, or scratches incurred in service, to initiate fracture. Three-dimensional defects are unlikely to add significantly to the existent risk due to these other, inevitable features, and can be ignored as an independent hazard. This assertion is supported by the quoted examples of SAE induced failure, 18,35,36 none of which displayed any macroscopic initiating defect. As far as fatigue is concerned, the stress concentrations caused by three-dimensional defects will be less severe than the postulated cracklike defect used in the analysis reported earlier in this paper, and are therefore judged not to be critical.

The major sources of risk are the cracklike defects formed by root cracks and lack-of-fusion (l.o.f.) defects. In order to detect such defects every weld run was inspected by dye penetrant testing, and the completed welds subjected to 100 per cent radiography. No defects were found by either test method, but this does not necessarily mean that none was present, because it is well known that inspection techniques are not fully reliable. Data on the reliability of non-destructive test methods are sparse, but the available information, for instance that obtained from Packman 46 and Yang, 47 indicates that radiography may be as little as 10 per cent effective in detecting cracklike defects. Surface inspection techniques, such as dye-penetrant and magnetic-particle testing, are somewhat more successful, a figure of 90 per cent being more typical. The figures quoted above are very rough estimates. However, the type of calculation being carried out here does not merit more accurate estimates, so these figures will be used.

Given the levels of reliability stated above, it is obvious that an examination of the vessel alone provides insufficient evidence on which to judge the presence or otherwise of an initial defect. Assuming the existence of a pre-existing defect, the probability of detecting it in a single inspection is only 10 per cent if it is internal, and 90 per cent if it is a surface crack. If the presence of such a crack was important in the present case, the risk of missing it would be unacceptably high. Even multiple inspections would not improve the situation much because of the common-cause element in detection errors. The only recourse is to obtain information from alternative sources.

The most relevant alternative information source was found to be the inspection records of similar work carried out in the same workshop by the same procedure. Approximately 2000 m of similar weldment had been fabricated using the same techniques over several jobs. No cracklike defects had been found by either method of inspection. Using the reliability figures quoted earlier, it is possible to obtain an estimate of the likely initial defect rate.

The following analysis should be carried out independently for surface and subsurface cracks, because their respective detection rates are different. The subsurface defect case is the more critical because of the lower detection rate, and will be the only case considered here. It will also be assumed that detection reliability is independent of crack size. There is evidence to show that, in general, detection reliability increases with defect size, 46 but for low levels of reliability, as experienced with radiography in detecting small defects, an average value is a reasonable approximation, as shown by Packman's data. 46

Assuming the defect rate to be Poisson distributed, with an average of $\theta$ defects per unit length, it can be shown that the probability of observing no defects in the examination of length $L$ of weld is

$$P(\bar{D}\mid\theta) = \sum_{n=0}^{\infty} (1 - P_D)^{n}\,\frac{(\theta L)^{n}\exp(-\theta L)}{n!}$$

where $P_D$ is the probability of defect detection per defect (= 0.1 here). This can be simplified to

$$P(\bar{D}\mid\theta) = \exp(-P_D\,\theta L)$$
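The simplification follows because the series is the expansion of $\exp[(1 - P_D)\theta L]$ multiplied by $\exp(-\theta L)$. A minimal numerical check is sketched below; the trial defect rate of 0.005/m is an arbitrary illustrative value, and the function names are hypothetical.

```python
import math


def p_no_detection_series(theta, length, p_detect, terms=200):
    """Sum the Poisson series term by term, starting from n = 0."""
    mean_count = theta * length
    term = math.exp(-mean_count)          # n = 0 term
    total = 0.0
    for n in range(terms):
        total += term
        # Ratio of term n + 1 to term n of the series.
        term *= (1.0 - p_detect) * mean_count / (n + 1)
    return total


def p_no_detection_closed(theta, length, p_detect):
    """Closed form: exp(-P_D * theta * L)."""
    return math.exp(-p_detect * theta * length)


# Illustrative values: trial rate 0.005/m (assumed), L = 2000 m, P_D = 0.1.
series = p_no_detection_series(0.005, 2000.0, 0.1)
closed = p_no_detection_closed(0.005, 2000.0, 0.1)
print(series, closed)
```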

To find the likely defect rate $\theta$, given that no defects were found, use is made of Bayes's theorem:

$$f(\theta\mid\bar{D}) = \frac{P(\bar{D}\mid\theta)\,P(\theta)}{P(\bar{D})}$$

where $f(\theta\mid\bar{D})$ is the probability density of $\theta$, given zero defect detection, $P(\theta)$ is the prior distribution of $\theta$, representing prior knowledge, and

$$P(\bar{D}) = \int_{0}^{\infty} P(\bar{D}\mid\theta)\,P(\theta)\,d\theta$$

Depending on the choice of prior distribution, different estimates of the defect rate $\theta$ can be obtained. For instance, assuming complete ignorance of the likely rate, $P(\theta)$ may be taken as a constant. In this case,

$$f(\theta\mid\bar{D}) = P_D L \exp(-P_D L\,\theta)$$

An estimate of $\theta$ can be obtained from this equation by calculating the expectation $\langle\theta\rangle$, as suggested, for instance, in Ang and Tang: 48

$$\langle\theta\rangle = \int_{0}^{\infty} \theta\,f(\theta\mid\bar{D})\,d\theta = \frac{1}{P_D L} = 5 \times 10^{-3}/\mathrm{m}$$

It can also be shown that there is only a 10 per cent probability of $\theta$ exceeding $1.15 \times 10^{-2}$/m. It may be concluded, therefore, that the actual defect rate probably lies in the range $10^{-3}$ to $10^{-2}$/m.
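For a constant prior the posterior is an exponential density with rate $P_D L$, so both the expectation and any upper limit follow in closed form. The sketch below evaluates them for $P_D = 0.1$ and $L = 2000$ m; the function names are hypothetical and the calculation is an illustration, not the authors' own working.

```python
import math

P_DETECT = 0.1          # detection probability per defect (radiography)
WELD_LENGTH_M = 2000.0  # length of similar weld inspected with no findings


def posterior_mean(p_detect, length):
    """Mean of the exponential posterior with rate P_D * L."""
    return 1.0 / (p_detect * length)


def upper_limit(p_detect, length, confidence):
    """Defect rate that is exceeded with probability 1 - confidence."""
    return -math.log(1.0 - confidence) / (p_detect * length)


mean_rate = posterior_mean(P_DETECT, WELD_LENGTH_M)
limit_90 = upper_limit(P_DETECT, WELD_LENGTH_M, 0.90)
print(f"mean = {mean_rate:.4f}/m, 90 per cent limit = {limit_90:.4f}/m")
```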

Bayes's theorem can also be used to incorporate additional information by modification of the prior distribution. In the present case, data collected by Salter and Gethin 49 at the British Welding Institute suggest that an appropriate prior distribution would be an exponential distribution, with a mean rate of occurrence for lack-of-fusion defects of 0.0167/m. The data from which these figures were obtained were surveys of 2336 m of weldment, subdivided into several categories, allowing both average and variance of the defect rate to be calculated. These turned out to be approximately equal, indicating an exponential distribution; otherwise a more general gamma distribution would have been used. Recent Japanese studies give incidence of lack-of-fusion defects of 0.015/m, which supports the values calculated from the Salter and Gethin study.


The prior distribution, $P(\theta)$, therefore becomes

$$P(\theta) = \theta_p^{-1}\exp(-\theta/\theta_p)$$

where $\theta_p$ is the mean l.o.f. defect rate of 0.0167/m. Substituting into Bayes's equation and simplifying, the frequency distribution for $\theta$ becomes

$$f(\theta\mid\bar{D}) = (P_D L + \theta_p^{-1})\exp[-\theta\,(P_D L + \theta_p^{-1})]$$

From this equation the expectation $\langle\theta\rangle$ is

$$\langle\theta\rangle = 0.00385/\mathrm{m}$$

with an upper 90 per cent confidence limit on the defect rate of 0.00885/m. These figures are slightly lower than the estimates assuming a constant prior distribution, as might be expected from the improved information content of the estimate.
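With an exponential prior the posterior remains exponential, and its rate is simply the sum of the evidence term $P_D L$ and the prior term $1/\theta_p$. The following sketch, with hypothetical function names, reproduces the two quoted figures to within rounding.

```python
import math

P_DETECT = 0.1          # detection probability per defect
WELD_LENGTH_M = 2000.0  # inspected weld length, m
PRIOR_MEAN = 0.0167     # prior mean l.o.f. defect rate, per m


def posterior_rate(p_detect, length, prior_mean):
    """Rate parameter of the exponential posterior: P_D * L + 1 / theta_p."""
    return p_detect * length + 1.0 / prior_mean


def summarise(rate, confidence=0.90):
    """Return (mean, upper limit) of an exponential density with this rate."""
    mean = 1.0 / rate
    upper = -math.log(1.0 - confidence) / rate
    return mean, upper


rate = posterior_rate(P_DETECT, WELD_LENGTH_M, PRIOR_MEAN)
mean_rate, limit_90 = summarise(rate)
print(f"mean = {mean_rate:.5f}/m, 90 per cent limit = {limit_90:.5f}/m")
```

The prior contributes the equivalent of roughly 60 defect-free "effective" units to the 200 supplied by the inspection record, which is why the estimates fall only slightly below the constant-prior values.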

Probability of weld consumable substitution. As discussed in the section headed 'Description of identified failure modes', the substitution of the specified welding wire by stainless steel is virtually certain to introduce problems of stress corrosion cracking. Substitution requires the simultaneous breakdown of a number of imposed and natural controls. Imposed controls include the issuing procedure used in the bonded store. An important natural control is the experience of the welder, who should know the difference in handling characteristics between ferritic and austenitic materials. In order to evaluate the reliability of the system, methods developed by Swain and co-workers at Sandia Laboratories, and recently presented in a Handbook on Human Reliability by the USNRC, were used. 5 The analysis is carried out in several steps:

(i) Task analysis: this involves breaking the process down into elemental steps, as illustrated in Table 3.

(ii) Event tree construction: the event tree, also referred to by Swain as a THERP diagram, is shown in Fig. 5. Some degree of judgement is required in the construction of this tree, in order to eliminate all but the meaningful branches. The inclusion of all possible branches would have led to about $2^{18}$ end points, and would have been impossible to show graphically.

(iii) Assignment of probabilities: probabilities are assigned to all branches of the tree and sequences leading to the use of incorrect material are evaluated.
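The evaluation step can be sketched as follows. Each sequence through an event tree is a product of branch probabilities, and because the sequences are mutually exclusive, the probability of ending in the failed state is the sum over the failure sequences. The branch values below are hypothetical placeholders chosen for illustration; they are not the values used in Fig. 5.

```python
from math import prod


def path_probability(branch_probabilities):
    """Probability of one sequence: product of its branch probabilities."""
    return prod(branch_probabilities)


def failure_probability(failure_paths):
    """Sum over mutually exclusive sequences that end in the failed state."""
    return sum(path_probability(path) for path in failure_paths)


# Hypothetical failure sequences (illustrative numbers only):
# path 1: wrong wire issued, store check missed, welder does not notice
# path 2: correct issue, but old wire left in bay and not removed
example_paths = [
    [0.01, 0.1, 0.1],
    [0.99, 0.001, 0.5],
]

p_fail = failure_probability(example_paths)
print(f"P(incorrect wire used) = {p_fail:.3e}")
```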


TABLE 2
Failure Mechanism Checklist (adapted from Collins 52)

(1) Force and/or temperature-induced elastic deformation
(2) Yielding
(3) Brinelling
(4) Ductile rupture
(5) Brittle fracture
    (a) Temper embrittlement
    (b) Strain aging
    (c) Martensitic transformation
    (d) Grain growth
(6) Fatigue
    (a) High cycle
    (b) Low cycle
    (c) Thermal fatigue
    (d) Impact fatigue
    (e) Surface fatigue
    (f) Corrosion fatigue
    (g) Fretting fatigue
(7) Corrosion
    (a) Direct chemical attack
    (b) Galvanic action
    (c) Crevice corrosion
    (d) Pitting corrosion
    (e) Intergranular corrosion
    (f) Selective leaching
    (g) Erosion corrosion
    (h) Cavitation corrosion
    (i) Hydrogen damage
    (j) Biological corrosion
    (k) Stress corrosion
    (l) Oxidation
(8) Wear
    (a) Adhesive wear
    (b) Abrasive wear
    (c) Corrosive wear
    (d) Surface fatigue wear
    (e) Deformation wear
    (f) Impact wear
    (g) Fretting
(9) Impact
    (a) Impact fracture
    (b) Impact deformation
    (c) Impact wear
    (d) Impact fretting
(10) Fretting
    (a) Fretting fatigue
    (b) Fretting wear
    (c) Fretting corrosion
(11) Creep
    (a) Deformation
    (b) Fracture
    (c) Buckling
    (d) Stress relaxation
(12) Thermal shock
(13) Galling and seizure
(14) Spalling
(15) Radiation damage
(16) Combined effects
    (a) Creep fatigue
    (b) Stress corrosion
    (c) Corrosion fatigue
    (d) Creep oxidation

Fig. 5. Event tree (THERP diagram) for welding consumable issue and use. c denotes correct action while f denotes wrong action in relation to previous events. (Numbers relate to steps in Task Analysis; see Table 3.)

The major problem in this analysis is obtaining a reliable source of failure rate data. The data used in this study were taken largely from WASH 1400. 5 These data, with some additions, are contained in the USNRC Handbook. 5 In general, there is a severe shortage of data in this area, and it is necessary to augment them with subjective estimates at times. It is believed that the figures assumed in constructing Fig. 5 are acceptable for the purpose, for which order-of-magnitude values are sufficient. What is more important than correctly estimating individual error rates is making sure that common cause effects and sequential dependencies are realistically modelled. For instance, if a human operator has a probability of error of $10^{-3}$ on a first trial then, if not corrected in the meantime, he may be much more likely to commit the same error on a second trial. Furthermore, a human can be positively or negatively biased toward a certain action, depending on previous experience. The particular circumstances of each error must therefore be taken into account when estimating error rates based on tables of generic data. Unfortunately there is no well established approach to this complex problem at present, although the USNRC Handbook gives valuable guidance using current understanding of the problem. It is likely that this type of analysis will continue to be largely an intuitive exercise for some time.
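One way of representing sequential dependence is the five-level scheme tabulated in the THERP handbook (NUREG/CR-1278), in which the conditional probability of a second error, given a first, is raised above its basic value according to the judged level of dependence. The sketch below is offered as an illustration of that scheme; it is an assumption that these equations match the draft of the Handbook used in this study, and the basic error probability of $10^{-3}$ is illustrative.

```python
# Conditional probability of a second error given a first error, for each
# dependence level of the THERP handbook (assumed applicable here).
DEPENDENCE_LEVELS = {
    "zero": lambda p: p,
    "low": lambda p: (1.0 + 19.0 * p) / 20.0,
    "moderate": lambda p: (1.0 + 6.0 * p) / 7.0,
    "high": lambda p: (1.0 + p) / 2.0,
    "complete": lambda p: 1.0,
}


def conditional_error(basic_probability, level):
    """P(second error | first error) for the named dependence level."""
    return DEPENDENCE_LEVELS[level](basic_probability)


def joint_error(basic_probability, level):
    """P(both errors) = P(first) * P(second | first)."""
    return basic_probability * conditional_error(basic_probability, level)


basic = 1e-3  # illustrative basic error probability
for name in DEPENDENCE_LEVELS:
    print(f"{name:9s} conditional = {conditional_error(basic, name):.4g}, "
          f"joint = {joint_error(basic, name):.4g}")
```

Treating two such errors as independent would give a joint probability of $10^{-6}$, whereas complete dependence leaves it at $10^{-3}$, which shows why the dependence assumption can matter more than the individual error rates.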

DISCUSSION OF RESULTS

From Fig. 3 it can be seen that the estimate of failure probability is dictated mainly by the single event of plate substitution. The overall probability is therefore approximately $5 \times 10^{-3}$. This is a high figure for a pressure vessel, and is probably unacceptable. However, if it were possible to make a large number of identical vessels, and put them into service under identical conditions, it is unlikely that the observed failure rate would be as high as the estimated value. The reason for this is that it was necessary to make conservative assumptions at a number of stages in the analysis, in order to quantify the problem at all. For instance, only a small proportion of vessels suffering strain-age embrittlement would be affected beyond a critical level in practice. Unfortunately the current state of knowledge on this phenomenon, and others, is insufficient to make any finer distinction than the rough go/no-go criterion adopted in this study. In fact, the probability figure calculated is a measure of the uncertainty of the available information and, as such, is a valid measure of risk in its own right, but a different one from the more conventional population failure rate and its estimators. If more information is made available about either the failure mechanisms involved or the procedures for material processing, it is possible to evaluate the risk of failure with greater precision until, when all relevant knowledge has been provided, it should be possible to state with absolute certainty whether the component will fail in service or not. The alternative use of information uncertainty as a measure of risk will be discussed in more detail in the section of this paper titled 'General Discussion'.

On a more qualitative level the failure mode identification has value even without postulating any error rates. From the structure of the feasible failure mechanisms identified, and the errors required to cause them as illustrated in Fig. 3, it is possible to make recommendations for modifying the process so as to eliminate these failures, or at least reduce their likelihood. In the case of the stress corrosion mechanism, for instance, a physical barrier between the stocks of ferritic and austenitic consumables in the bonded store would be an effective move, as would a clear statement of inspection duties regarding the removal of old welding wire from the welding bay prior to the start of a new job. The problem of strain aging requires a greater investment of effort. A number of preventive strategies are suggested by the risk analysis.

(i) Independent chemical analysis of all supplied plate: this action would establish whether the material is killed, semi-killed or rimmed steel, and hence whether it is prone to strain-age embrittlement or not. The probability of an incorrect or inaccurate analysis would still have to be considered as a contribution to failure risk.

(ii) Mechanical tests on plate material: a test specifically devised to reveal a possible strain-aging tendency would need to be included in the surveillance program. A conventional Charpy V-notch test would probably not be suitable and would not decrease the uncertainty of failure by itself. An appropriate test would be to cold bend coupons, followed by a soak treatment at a temperature of about 200 to 300 °C and machining into impact specimens for dynamic testing.

(iii) Specification of post-weld heat treatment: regardless of whether the material actually used is prone to strain aging or not, this treatment would remove any detrimental effects of cold forming and welding.

Any or all of the above proposed measures can be incorporated in the process at some extra cost. No difficulties are involved in including them in the process fault tree and, following evaluation of their error rates by the same techniques already used for the original processes, it is possible, in principle at least, to calculate the cost of improved surveillance. We therefore have the basis of a method for planning manufacturing surveillance procedures, i.e. QA activities, on a cost-effective criterion.

GENERAL DISCUSSION

The broad objective of this study was, firstly, to investigate the feasibility of performing a risk assessment of the errors of manufacture, and secondly, to formulate a general strategy for assessment. Although the problem addressed in the study was a very simple one, it is believed that the question of feasibility has been answered satisfactorily. In fact, it is difficult to see how any rational approach to evaluation of QA activities can be made without an exercise of the type described here. As far as a general strategy is concerned, the study itself was carried out in an intuitive way, with much trial and error. In retrospect, however, it is possible to discern a logical structure which could form the basis of a more formalised technique. In addition, some insight was gained into the relationship between QA activities and reliability, and the interpretation of probability as a risk criterion for the case of one-of-a-kind components. These topics will be discussed in some detail.

Comments on a general assessment strategy

In common with most safety assessments, it is considered that the most important step is the systematic identification of potential failure mechanisms. As discussed in the section headed 'Failure mode identification', where material deterioration is involved, the conditions for feasibility of a given failure mechanism depend on opportunities for error at a detailed level in the process, which makes it very difficult to identify feasible mechanisms at an early stage in the analysis. In the pilot study this step was performed iteratively, some mechanisms being identified by postulating accident situations in operation and tracing back, using a top-down technique, to reveal significant errors in the process, while others were found by retracing individual errors to find out the consequences. As long as the accumulated experience of the analysis team is adequate, and the need to continually re-examine the process in the light of new information is recognised, this essentially intuitive approach may be acceptable, at least for jobs of modest extent. In order to place less reliance on individual experience, however, and to make the solution of complex problems more tractable, a more formalised strategy is desirable. Such a strategy has been developed, and is shown in flow diagram form in Fig. 6.

The suggested strategy is based on several observations from experience of this study, combined with data derived from service failure studies in general. These are:

(i) Material failures invariably follow a specific sequence of events. These both define the existence of the failure mechanism in the process and act as identifiers for search purposes.

(ii) It is necessary to examine not only the specified processes for identifiers of potential failures, but also any deviations either from specification of individual stages or the process as a whole, e.g. uncontrolled temperature, or total omission of a heat treatment.

(iii) Most observed service failures are due to previously known causes, either documented or postulated before the event. The primary cause in most cases is therefore neglect of well understood situations rather than the emergence, in service, of a totally new or unexpected phenomenon.

Fig. 6. Flow diagram of proposed assessment strategy. Inputs shown: process description, failure checklist, error rate estimates. A. Process as specified: 1, initial screening, search for partial correspondence; 2, detailed matching, search for complete correspondence of all elements. B. Process with gross errors (failures caused by gross deviations in process): 3, initial screening, repeat Step 1 with deviations in process steps included; 4, detailed matching, complete matching of all elements of mechanisms found in Step 3. Risk quantification: 5, failure probability, calculate probability of failure from error rates and mechanism; 6, recommendations, revise process, add extra inspections, etc.

Description of assessment strategy

This description is divided into the usual two subsections of failure mode identification and risk quantification.


Failure mode identification

To implement the search procedure proposed here, a comprehensive file of generic failure mechanisms is required. The need for information of this type has become increasingly recognised recently, and the result is a growing number of publications relating to failure analysis. Prominent among these are the publications of the American Society for Metals. The most comprehensive source of data is the ASM Handbook, Volume X, Failure Analysis and Prevention. 25 Work is also being done by research workers to develop systematic classifications of failure mechanisms, e.g. the work of Dolan 51 and, more recently, Collins. 52 An abridged version of Collins's checklist is reproduced in Table 3.

The search procedure starts from the premise that all listed failure mechanisms are feasible unless there is evidence to the contrary.

TABLE 3
Task Analysis of Welding Wire Issue Procedure

(1) Foreman receives work order and specification.
(2) Foreman explains job to welder.
(3) Welder writes requisition.
(4) Foreman checks welder's requisition.
(5) Clerk in bonded store checks requisition against planning department supply order.
(6) Inspector checks material in store against test certificate.
(7) Material issued according to specification against requisition (mild steel in mild steel rack in welding bay).
(8) Welder makes visual inspection of material.
(9) Welder accepts material and returns to workshop.
(10) Welder places material in multipartition rack after clearing it of leftovers from previous job.
(11) Welder draws rod for test weld.
(12) Welder makes further visual inspection.
(13) Welder resorts rods if mixup detected.
(14) Welder performs test weld (possibility of detection of error at this point).
(15) Welder draws rod for pressure vessel weld.
(16) Welder has opportunity for a second (casual) visual check.
(17) Welder performs pressure vessel weld.
(18) Final weld visually inspected.
(19) Stainless steel in rack by mistake.
(20) Mild steel in stainless steel compartment of rack.
(21) Stainless steel in mild steel compartment.
(22) Stainless steel in stainless steel compartment.
(23) Resort rods after detection of mixup.
(24) Incorrect material delivered with wrong test certificate.


The first step is a rough screening operation, in which each mechanism is taken in turn and compared with the materials used, the processes involved in manufacture and the expected operating conditions of the finished component, for even a partial correspondence. Referring to the sub-section headed 'Strain age embrittlement' in this paper, the possible existence of strain-age embrittlement could be indicated by either the presence of plain carbon steel, or by cold work. Further examples are:

(a) Susceptibility of specific materials to certain failure mechanisms, such as temper brittleness in the case of some low alloy steels.

(b) Effects produced by certain processes, such as introduction of hydrogen by plating or manual metal arc welding.

(c) Environmental indicators, such as cyclic load and aggressive chemicals, introducing the possibility of fatigue and corrosion mechanisms respectively.

This first step is done at a relatively superficial level, and serves only to eliminate the most obviously inappropriate mechanisms from further consideration. The second step takes the remaining subset of failure mechanisms and compares their event sequences in detail with the actual manufacturing processes and operating conditions for the component. At this stage the degree of matching which can be achieved depends on how much information is available about the process, and the current state of understanding of the failure mechanism. Where any uncertainty remains it is always possible to retain the mechanism as feasible until further information is forthcoming. Within the bounds of known failure mechanisms, therefore, this search technique is inherently conservative.

The two steps described above might suggest that an inordinate amount of work is involved. In practice it is believed that this is not so. The multiple requirements which must be satisfied before a failure mechanism is finally judged feasible are so restricting that the allowable subset reduces very rapidly. In a well designed component and its accompanying manufacturing process it is expected that all possible failure modes will be eliminated by this search procedure.
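The two-step search described above can be sketched in a few lines of Python. This is only an illustrative sketch: the mechanism names, the identifier labels and the `unknown` argument are hypothetical and are not taken from the checklist used in the study. Step 1 keeps any mechanism with even a partial correspondence; step 2 keeps only those whose complete event sequence is matched, and anything not yet ruled out is treated as present so that the search stays conservative.

```python
# Hypothetical identifier sets for three generic failure mechanisms.
MECHANISMS = {
    "strain_age_embrittlement": {"plain_carbon_steel", "cold_work",
                                 "aging_period", "service_load"},
    "temper_brittleness": {"low_alloy_steel", "slow_cool", "service_load"},
    "hydrogen_cracking": {"hydrogen_source", "hard_microstructure",
                          "tensile_stress"},
}


def initial_screening(mechanisms, observed):
    """Step 1: keep mechanisms sharing at least one identifier with the process."""
    return {name: ids for name, ids in mechanisms.items() if ids & observed}


def detailed_matching(candidates, observed, unknown=frozenset()):
    """Step 2: keep mechanisms whose every identifier is observed,
    or cannot yet be ruled out (conservative treatment of uncertainty)."""
    admissible = observed | unknown
    return sorted(name for name, ids in candidates.items() if ids <= admissible)


# Assumed description of the process as specified (illustrative only).
observed = {"plain_carbon_steel", "cold_work", "service_load"}

candidates = initial_screening(MECHANISMS, observed)
feasible_certain = detailed_matching(candidates, observed)
feasible_conservative = detailed_matching(candidates, observed,
                                          unknown={"aging_period"})
```

In this sketch hydrogen cracking is dropped at the first step, temper brittleness at the second, and strain age embrittlement survives only while the aging period remains unresolved, which mirrors the rule of retaining a mechanism until further information is forthcoming.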

So far only the materials and processes as specified have been considered. The real problem of failure mode identification is associated with errors which cause fundamental changes in the conditions under which the component is made or operated (errors of kind). To identify these errors it is necessary to return to step one and, taking each listed failure mechanism in turn, examine each manufacturing or operating stage to determine whether any deviation is able to bring about correspondence with all or part of the failure event sequence. An example of this from the pilot study is the introduction of stress corrosion as a feasible mechanism by recognising the possibility of substituting the specified welding wire with austenitic material. Once a single correspondence has been identified, the remaining events in the failure sequence are fixed, and immediately point to other critical stages in the process. The final subset of feasible mechanisms is arrived at, as before, by searching for detailed correspondence between the failure mechanism and the process.

The following two examples are offered to illustrate how the procedure just described can reveal unexpected potential failures:

(a) Creep deformation: at least 350 °C is needed to cause significant creep deformation in ferritic steel. The vessel considered in this paper is part of a chemical process which limits temperatures to no more than 100 °C. However, external sources of heating, such as a fire in the process building, can be postulated, in which case a creep failure is feasible. This is a realistic and major hazard, which would have to be considered as part of the safety analysis of the complete system. It so happens that there is no intrinsic manufacturing error that could cause a creep failure in service, so that this eventuality falls outside the scope of this paper.

(b) Buckling: according to the rules of ASME VIII, the vessel is much too thick to experience circumferential buckling of the ellipsoidal ends in the knuckle region under internal pressure. However, thinner vessels have been manufactured in the same workshop, and there would be no reason to question the use of thinner material if a common cause administrative error led to the ordering and issue of such material. The fault would be revealed by the hydrotest, which in this case is a satisfactory safeguard, but in other circumstances could cause unacceptable economic penalties.

Risk quantification

With the feasible failure mechanisms identified it is a relatively simple task, in principle, to evaluate the probability of occurrence, and hence the risk of failure. Since each mechanism defines a logical combination of elementary events, the overall failure probability can be calculated without difficulty if the event probabilities are known. It is convenient, but not necessary, to represent the failure events in a logic diagram, such as the fault tree shown in Fig. 3. This representation helps to identify common cause events, and is a useful illustration of the critical combinations of events in a manufacturing process.

The most important obstacle to risk quantification is the lack of information on error rates. A number of alternative techniques to estimate error rates have been demonstrated in this study, but these are not ideal answers, and there is no doubt that this is one aspect of failure assessment that requires much more work.
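As a minimal sketch of the calculation, the probability of a logical combination of elementary events can be obtained by summing over all truth assignments. The event names, the example expression and every probability value below are hypothetical and are not estimates from this study; the sketch also assumes that the elementary events are statistically independent, which would not hold where common cause events are present.

```python
from itertools import product


def failure_probability(expr, probs):
    """Sum the probability of every truth assignment for which expr is true.

    Assumes the elementary events are statistically independent.
    """
    names = sorted(probs)
    total = 0.0
    for values in product([False, True], repeat=len(names)):
        state = dict(zip(names, values))
        if expr(state):
            weight = 1.0
            for name in names:
                weight *= probs[name] if state[name] else 1.0 - probs[name]
            total += weight
    return total


# Hypothetical event probabilities (illustrative values only).
probs = {"A": 1e-3, "B": 0.5, "D1": 0.1, "D2": 0.2}


# Example mechanism: A and B must both occur, together with D1 or D2.
def mechanism(state):
    return state["A"] and state["B"] and (state["D1"] or state["D2"])


p_fail = failure_probability(mechanism, probs)  # 1e-3 * 0.5 * (1 - 0.9 * 0.8)
```

Enumerating assignments avoids double counting when the same event appears in more than one branch of the expression, at the price of a cost that grows exponentially with the number of events; for the handful of events in a single mechanism this is not a limitation.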

Discussion of assessment strategy

The chief characteristic of the assessment procedure is the progressive use of information to constrain the admissible subset of failure mechanisms. One consequence of this approach is that, assuming the file of generic failure mechanisms to have been sufficiently inclusive, uncertainty leads to more mechanisms being accepted as feasible than necessary. This is inherently a safe approach, if wasteful of some effort. However, the option always exists to expend some of this extra effort in learning more about the specific conditions surrounding the manufacture and operation of the component, and thereby further constrain the number of potentially feasible failure mechanisms. It is possible to represent the search procedure as a Venn diagram, as shown in Fig. 7. In this diagram all failure mechanisms are the set U. Mechanisms which are specific to the material of construction, the design configuration, the manufacturing process and the environment are designated M, D, P and E respectively. Once all these constraints are applied simultaneously, the only feasible mechanisms remaining are those contained in the intersection, F, where

$$F = M \cap D \cap P \cap E$$

In general, safety analyses tend to become more extensive as the level of detail increases. The opposite is true of the strategy proposed here. As can be seen from Fig. 7, the progressively more specific description of manufacturing processes, first as the set of all processes P, then as the welding subset W, and finally as the closely defined manual argon-shielded TIG process, reduces the residual subset of feasible mechanisms. When the process is imprecisely defined it is necessary to retain for consideration a large number of hypothetical defects. Some of these, for instance hydrogen cracks, can be rejected once the welding process is identified as the TIG method as shown. It can be concluded, therefore, that this assessment strategy has the attribute of focusing attention on critical operations, and that this focusing becomes stronger as the quality of the information improves.
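The narrowing effect can be illustrated with ordinary set operations. The sketch below is hypothetical throughout: the mechanism labels and the membership of each set are invented for illustration and do not reproduce Fig. 7. It shows only that the intersection of the material, design, process and environment constraints can shrink, and never grow, as the process description is tightened from P to W to TIG.

```python
# Hypothetical constraint sets (illustrative membership only).
M = {"hydrogen_crack", "lack_of_fusion", "porosity", "slag_inclusion",
     "creep", "casting_shrinkage"}                       # material
D = {"hydrogen_crack", "lack_of_fusion", "porosity", "slag_inclusion",
     "creep", "casting_shrinkage"}                       # design
E = {"hydrogen_crack", "lack_of_fusion", "porosity", "slag_inclusion",
     "casting_shrinkage"}                                # environment

# Progressively more specific descriptions of the manufacturing process.
P = {"hydrogen_crack", "lack_of_fusion", "porosity", "slag_inclusion",
     "casting_shrinkage"}                                # all processes
W = {"hydrogen_crack", "lack_of_fusion", "porosity", "slag_inclusion"}  # welding
TIG = {"lack_of_fusion", "porosity"}                     # manual TIG welding

base = M & D & E
sizes = [len(base & process) for process in (P, W, TIG)]
feasible = base & TIG
```

Here the residual set falls from five mechanisms to four and then to two as the process definition is refined, with the hydrogen crack entry disappearing once the TIG process is specified.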


Fig. 7. Venn diagram illustrating progressive application of constraints. U, set of all known failure mechanisms; M, subset of failure mechanisms constrained to methods of processing; W, subset of M related to welding processes; TIG, subset of W relating to manual TIG process.

Discussion of probability measures for one-of-a-kind structures

One of the original objectives of this study was to predict the probability of failure for the pressure vessel, or at least that component of failure probability contributed by intrinsic defects in the vessel itself. It will be recalled that this objective was not achieved. Instead a probability figure was obtained which could be more accurately described as a measure of the uncertainty in the data available for judging the integrity of the vessel. At worst this figure can be considered to be a very conservative estimate of the true failure probability. However, there are some circumstances in which this measure of uncertainty may be adopted as an alternative to the more conventional failure rate interpretation of probability, as a risk criterion in its own right. These circumstances are precisely those which exist in the present problem, i.e. a one-of-a-kind structure, with the cause of failure being intrinsic defects in the structure itself. For this class of problem the population failure rate does not hold equally for all members, as would be the case for an external cause of failure such as a random overload situation. The true situation can be more accurately modelled by postulating two subpopulations, one with no intrinsic defects, and hence a negligible failure rate, and a second, much smaller, population of defective structures, with a very high failure rate. The population as a whole will display a failure rate which is the average of these two subpopulation rates. As long as the penalties incurred by failure are also averaged over a large number of structures there is no need to make a distinction between intrinsic or external causes of failure, and a relative frequency definition of failure probability can be used for making decisions. On the other hand, if only one component is involved, the question to be asked is not what the failure rate for some notional population of similar structures might be if they were to be built, but whether the individual under scrutiny is good or bad. Where individual differences in construction and supervision are important, as might be expected in pressure vessel manufacture, the only way to answer the question is by examining the information which relates specifically to that particular vessel. The population failure rate, if it exists, has no relevance in this situation. If the vessel under scrutiny can be judged completely defect-free with absolute certainty, it is obvious that there is no risk regardless of the failure rate in all similar vessels. There is also little consolation to be drawn from a low population failure rate if the chosen component happens to be defective, and this fact is not detected. The conclusion is that, for one-of-a-kind structures, the risk is associated with the likelihood of a defective component being placed in service in the mistaken belief that it is defect-free. This is a subjective measure, related to the uncertainty of the available information, and is in fact the probability calculated in this study.
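The two-subpopulation model can be made concrete with a short worked equation. The symbols $q$ (fraction of defective structures), $\lambda_g$ (failure rate of sound structures) and $\lambda_d$ (failure rate of defective structures) are introduced here for illustration only, and the numerical values are hypothetical rather than results of this study.

```latex
\lambda_{\mathrm{pop}} = (1 - q)\,\lambda_g + q\,\lambda_d
% Hypothetical values: q = 10^{-4}, \lambda_g = 10^{-8}, \lambda_d = 10^{-1} per vessel year
\lambda_{\mathrm{pop}} = (1 - 10^{-4})(10^{-8}) + (10^{-4})(10^{-1})
                       \approx 1.0 \times 10^{-5} \ \text{per vessel year}
```

With these assumed values the population average is dominated by the small defective subpopulation, yet no individual vessel actually experiences the rate $\lambda_{\mathrm{pop}}$: each one fails at either $\lambda_g$ or $\lambda_d$. For a single vessel the quantity of interest is therefore the probability that it belongs to the defective group without this having been detected.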

Adoption of this alternative measure has some consequences for the possible quantitative evaluation of QA activities. It is entirely consistent with the basic premises of QA that a situation about which nothing is known should be unacceptable. In measuring the uncertainty in the information provided, the assessment method used in this study is, in effect, a measure of the capability of the QA surveillance system. Any change in QA activities can therefore be assessed in terms of its influence on the overall uncertainty of the derived information. In principle, this can be used to plan improvements in QA programs on a cost-effective basis.

Alternative failure mode identification techniques

The search strategy outlined in this paper is considered to be reasonably effective, but it can require a degree of interdisciplinary involvement which is not always possible to achieve. It would be an advantage if a more structured approach could be devised. One possibility being investigated is the concept of a Material Failure Logic Model (MFLM). 53 It was noted, as a result of this study, that a large proportion of known material failure mechanisms can be described by Boolean logic expressions, for storage and automated search purposes, using a computer. The computer based system is described elsewhere. 53 Comment in this paper will be limited to illustration of the principles.

Fig. 8. Material processing flow diagram for pressure vessel manufacture. NDE 1, check mill certificate against specification; NDE 2, 100% dye penetrant inspection; NDE 3, surface visual and 100% dye penetrant on all welds; NDE 4, 100% radiography on all welds. Legible diagram labels: material specification, procurement, supplies, bond store (plate, weld rods); cylindrical plate, cold bend; elliptical end, cold dish; cut holes, tack, root, filler; service environment, halide compounds; load, 1000 cycles 0 to 100%.

Fig. 9. Model of strain age embrittlement mechanism showing matching procedure. A, susceptible material (semi-killed steel); B, cold work > 4 C; C, post-cold work anneal; D', partial incubation; D, full incubation (requires several years at ambient); E, ambient temperature service loading. Failure F = ABCDE.

Fig. 10. Model of stress corrosion mechanism showing matching procedure. x, error X introduced; A, inadvertent issue of S.S. welding rod; B, heating of S.S. in the range of 550-850 °C; C, solution treatment > 1000 °C; D1, corrosive environment, halides; D2, corrosive environment, possibility of intermittent external surface wetting. Failure $F = A B \bar{C} (D_1 + D_2)$.

Figure 8 is a simplified flow diagram of the manufacturing process for the pressure vessel. In Figs 9 and 10 the event sequences for strain aging and stress corrosion, as outlined in Fig. 4, have been formalised as logic expressions, and matched element-for-element with the manufacturing process, so identifying the existence of the mechanisms. Note that the process model must include gross errors, otherwise the stress corrosion mechanism would not be found.
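The idea of storing mechanisms as Boolean expressions and searching them against a process description can be sketched as follows. This is not the computer based system referred to above: the event labels, the two expressions and the treatment of heat treatments as inhibiting events are assumptions made for illustration.

```python
# Each mechanism is a Boolean expression over a set of process events.
# Event labels and expressions are hypothetical illustrations.
MECHANISMS = {
    "strain_aging": lambda e: (
        "susceptible_steel" in e
        and "cold_work" in e
        and "stress_relief_anneal" not in e   # assumed to inhibit the mechanism
        and "incubation" in e
        and "service_load" in e
    ),
    "stress_corrosion": lambda e: (
        "stainless_weld_metal" in e
        and "sensitising_heat" in e
        and "solution_treatment" not in e     # assumed to inhibit the mechanism
        and ("halides" in e or "surface_wetting" in e)
    ),
}


def search(process_events):
    """Return the mechanisms whose complete expression is satisfied."""
    return sorted(name for name, expr in MECHANISMS.items()
                  if expr(process_events))


# Assumed event set for the process as specified.
specified = {"susceptible_steel", "cold_work", "incubation", "service_load",
             "sensitising_heat", "halides"}

# The same process with one gross error: stainless rod issued by mistake.
with_gross_error = specified | {"stainless_weld_metal"}

found_specified = search(specified)
found_with_error = search(with_gross_error)
```

In this sketch the stress corrosion expression is satisfied only when the wrong-rod event is added to the process description, which is the point made in the text about including gross errors in the search.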

The above example is not much different from what could be carried out by hand. The full system under development has plans for a comprehensive library of MFLMs, which can be continually updated in the light of new findings, a procedure for automatic assembly of manufacturing process simulations from a library of basic units, and a search routine.

CONCLUSIONS

(i) This study has demonstrated that the application of safety assessment to a manufacturing process is feasible, and that useful information regarding the improvement of control of such processes can be so derived.

(ii) In the specific problem considered it was not possible to calculate a failure probability with any accuracy, but the exercise provided insight into the problem, which would enable changes to be made to the inspection and data collection procedures, so that a more realistic figure could be calculated if required.

(iii) The most important contribution is considered to be the development of a systematic strategy for the identification of potential material failure mechanisms in a given manufacturing process.

(iv) As a spin-off from the main concern of the paper, an alternative probability measure is proposed for evaluating risk in the case of one-of-a-ki
