Transcript of Ibtesam's Thesis
PREDICTING THE MECHANICAL PROPERTIES OF PLAIN CARBON STEELS USING ANN
By
1. S. Ibtesam Hasan Abidi (Group Leader) 01 MT 23 2. Jamal Ahmed (A.G.L) 01 MT 21 3. S. Raheem shah 2k-01 MT 31
Supervised by:
Professor Mohammad Hayat Jokhio, Chairman
Department of Metallurgy and Materials Engineering, Mehran University of Engineering and Technology, Jamshoro
Submitted in partial fulfillment of the requirements for the degree
of Bachelor of Metallurgy and Materials Engineering
Dec 2004
Dedication
Dedicated to Time, our best teacher, and then to our Parents, our first teachers
CERTIFICATE
This is to certify that the work presented in this project thesis on “Predicting the
Mechanical Properties of Plain Carbon Steels Using ANN” is entirely written by the
following students under the supervision of Prof. Mohammad Hayat Jokhio.
1. S. Ibtesam Hasan Abidi (Group Leader) 01 MT 23
2. Jamal Ahmed (A.G.L) 01 MT 21
3. S. Raheem Shah 2k-01 MT 31
Thesis Supervisor External Examiner
______, Dec 2004
Chairman
Department of Metallurgy and Materials Engineering
Acknowledgements
Every praise is to ALLAH alone, and the grace of ALLAH is on Prophet Muhammad (PBUH) the only source of guidance and knowledge for all humanity.
It gives us great pleasure to record our sincere thanks to our supervisor, Professor
Mohammad Hayat Jokhio, Chairman of the Department of Metallurgy and Materials
Engineering, who gave his consent to guide us in this project. He was very
encouraging and cooperative while the work was carried out.
We are thankful to Mr. Isahque Abro, Assistant Professor, Department of
Metallurgy and Materials Engineering, for his contributions in making this work
possible.
We consider ourselves extremely lucky to be members of Mehran University of
Engineering and Technology. Our friends were great fun to be with; they always helped
us during the hard times, and we will miss them all dearly.
We also thank all those who helped and contributed to the preparation of this thesis.
Finally, we would like to say how much we love our families. Without their support
and encouragement, these years would not have been possible.
ABSTRACT
Neural networks are an emerging and prominent technology in the field of
artificial intelligence. The present work, carried out at Mehran University of Engineering
and Technology, Jamshoro, is an attempt to use a feed-forward neural network with the
back-propagation training algorithm to predict the mechanical properties of plain carbon
steels. In the present work the composition of the plain carbon steels was used as the
input parameter. Forty samples were used to train the network, while it was validated on
7 samples. The network was built with the tansig and purelin transfer functions and the
trainlm training function.
This work should help materials engineers and manufacturers design products
with suitable, high-performance mechanical properties.
CONTENTS
Chapter No. 1
INTRODUCTION
Chapter No. 2
LITERATURE REVIEW
2.1 NEURAL NETWORK
2.2 CHARACTERISTICS OF NEURAL NETWORK
2.2.1 Capabilities Of Modeling
2.2.2 Ease Of Use
2.3 ANALOGY TO THE BRAIN
2.4 THE BIOLOGICAL NETWORK
2.4.1 The Work Mechanism
2.5 THE ARTIFICIAL NEURON
2.5.1 The Work Mechanism
2.6 OUTSTANDING FEATURES OF NEURAL NETWORK
2.7 MAIN FEATURES OF NEURAL NETWORK
2.8 LIMITATIONS
2.9 ADVANTAGES
2.10 CLASSIFICATION OF NEURAL NETWORKS
2.10.1 Feed Forward Networks
2.10.2 The Back Propagation
2.10.3 Single layer perceptron
2.10.4 Multi-layer perceptron
2.10.5 Simple recurrent network
2.10.6 Hopfield network
2.10.7 Boltzmann machine
2.10.8 Committee of machines
2.10.9 Self-organizing map
2.11 DESIGN
2.11.1 Layers
2.11.2 Communicating And Types Of Connections
2.11.2.1 Inter-layer connections
2.11.2.2 Intra-layer connections
2.11.3 Learning
2.11.3.1 Off-Line Or Online
2.11.3.2 Learning Laws
Chapter No. 3
MECHANICAL PROPERTIES OF STEEL AND FACTORS AFFECTING
MECHANICAL PROPERTIES
3.1.1 Tensile Strength
3.1.2 Yield Strength
3.1.3 Elasticity
3.1.4 Plasticity
3.1.5 Ductility
3.1.6 Brittleness
3.1.7 Toughness
3.1.8 Hardness
3.1.9 Fatigue
3.1.10 Creep
3.2 FACTORS AFFECTING MECHANICAL PROPERTIES
3.2.1 Effect Of Grain Size On Properties Of Metals
3.2.2 Effect Of Heat Treatment On Properties Of Metals
3.2.3 Effect Of Environmental Variables
3.2.4 Effect Of Alloying Elements
Chapter No. 4
TESTING TECHNIQUES
4.1 TENSILE TEST
4.1.1 Tensile Test Results
4.1.2 Proof stress
4.1.3 The Interpretation of tensile test results
4.1.4 The effect of grain size and structure on tensile testing:
4.2 IMPACT TESTING
4.2.1 The Izod Test
4.2.2 The Charpy Test
4.2.3 The Interpretation Of Impact Tests
4.2.4 The Effect Of Processing On Toughness
4.3 HARDNESS TESTING
4.3.1 The Brinell Hardness Test
4.3.1.1 Machinability
4.3.1.2 Relationship Between Hardness And Tensile Strength
4.3.1.3 Work-Hardening Capacity
4.3.2 The Vickers Hardness Test
4.3.3 The Rockwell Hardness Test
4.3.4 Shore scleroscope
4.3.4.1 The Effect Of Processing On Hardness
Chapter No. 5
USING THE NEURAL NETWORK TOOLBOX
5.1 INTRODUCTION
5.2 THE STRUCTURE OF THE NEURAL NETWORK TOOLBOX
5.3 NETWORK LAYERS
5.3.1 Constructing Layers
5.3.2 Connecting Layers
5.4 SETTING TRANSFER FUNCTIONS
5.4.1 Activation Functions
5.5 WEIGHTS AND BIASES
5.6 TRAINING FUNCTIONS & PARAMETERS
5.6.1 Performance Functions
5.6.2 Train Parameters
5.6.3 Adapt Parameters
5.7 BASIC NEURAL NETWORK EXAMPLE
5.7.1 Manually Set Weights
5.7.2 Training Algorithms
5.8 GRAPHICAL USER INTERFACE
5.8.1 Introduction of GUI
5.8.2 Create A Perceptron Network
5.8.3 Input And Target
5.8.4 Create Network
5.8.5 Train the Perceptron
5.8.6 Export Perceptron Results To Workspace
5.8.7 Clear Network/Data Window
5.8.8 Importing From Command Line
5.8.9 Save A Variable To A File And Load It Later
Chapter No. 6
EXPERIMENTAL WORK
6.1 DATA SET
6.2 METHODOLOGY
6.2.1 Algorithm
Chapter No. 7
RESULTS AND CONCLUSION
7.1 RESULTS
7.2 CONCLUSION
7.3 FUTURE WORK
APPENDICES
Appendix A
Appendix B
BIBLIOGRAPHY
List of Tables and Illustrations
Figure 2.1 A schematic diagram of a single neuron
Figure 2.3 Artificial neuron
Figure 2.4 Feed-forward networks
Figure 2.5 Back propagation
Figure 2.6 Network layers
Figure 3.1 Yield point and yield strength
Figure 4.1 Testing machine
Figure 4.2 Tensile test specimen (round)
Figure 4.3 Tensile test specimen (flat)
Figure 4.4 Load/extension curve for low-carbon steel
Figure 4.5 Proof stress
Figure 4.6 Typical stress/strain curves
Figure 4.7 Effect of grain orientation on material testing
Figure 4.8 Effect of processing on the properties of low-carbon steel
Figure 4.9 Effect of tempering on tensile test
Figure 4.10 Effect of temperature on cold-worked material
Figure 4.11 Typical impact testing machine
Figure 4.12 Impact loading
Figure 4.13 Izod test
Figure 4.14 Charpy test
Figure 4.15 Standard Charpy notches
Figure 4.16 Effect of temperature on toughness
Figure 4.17 Effect of annealing on the toughness of low-carbon steel
Figure 4.18 Effect of tempering on quench-hardened high-carbon steel
Figure 4.19 Brinell hardness tester
Figure 4.20 Brinell hardness principle
Figure 4.21 Work-hardening capacity
Figure 4.22 Micro-Vickers and Vickers hardness testers
Figure 4.23 Rockwell hardness tester
Table 4.1 Rockwell hardness test conditions
Table 4.2 Rockwell superficial hardness test conditions
Figure 4.24 Effect of cold-working on the hardness of various
Figure 4.25 Effect of heating cold-worked 70/30 brass
Figure 4.26 Effect of heating a quench-hardened 0.8% plain carbon
Table 5.1 The XOR problem
Figure 5.1 The logsig activation function
Figure 5.2 The targets and the actual output
Figure 5.3 Network data manager
Figure 5.4 Create new data window
Figure 5.5 Create new network window
Figure 5.6 View network window
Figure 5.7 Main network window
Figure 5.8 Training result window
Figure 5.9 Import/load window
Table 6.1 Data set to be used for training the network
Table 6.2 Data set to be used as unseen data for the network
Figure 6.1 Training of the network with the TRAINLM function and 1 neuron
Figure 6.2 Training of the network with 7 neurons
Figure 6.3 Training of the network with 9 neurons
Figure 6.4 The network
Graph 7.1 Comparison of actual and predicted values with 1 neuron
Graph 7.2 Comparison of actual and predicted values with 7 neurons
Graph 7.3 Comparison of actual and predicted values with 9 neurons
Graph 7.4 Regression line for the predicted and actual values
Chapter No. 1
INTRODUCTION

Neural networks are an emerging and prominent technology in the field of
artificial intelligence. Used in almost every area, including engineering, finance, defence,
and economics, neural networks have proven themselves an excellent prediction and
control tool. The present work is an attempt in the Department of Metallurgy and
Materials Engineering, MUET, Jamshoro, to use these neural networks to predict the
mechanical properties of plain carbon steels, as the mechanical properties are the single
most important factor considered while designing a new composition; this work will also
be useful in creating a reference against which tested mechanical properties can be
compared. In the present work we used the composition of the plain carbon steels to
model the mechanical properties, although many other parameters affect the mechanical
properties of plain carbon steel, such as heat treatment, grain size, cold working, and
environment, which are explained in Chapter 3. Forty samples were used to train the
network, while it was tested on 7 samples. As metallurgy and materials engineers, we
adopted and preferred a theoretical approach; however, the literature reveals that a
mathematical approach has been adopted on similar topics. The main objectives of this
thesis work are (1) to condense neural networks into a comprehensive and simple
tutorial that could be helpful to future students applying neural networks in the field of
engineering materials, and (2) to create an example by implementing this technology for
the prediction of mechanical properties.
During the research we found neural networks to be an intelligent, powerful,
and useful tool for engineering applications, and especially for the prediction of the
mechanical properties of plain carbon steel. All the work was carried out at Mehran
University of Engineering and Technology, Jamshoro. The data were obtained from
Pakistan Steel Mills and the ASTM Materials Handbook on Properties of Metals, Vol. 9.
The work includes:
Chapter 2 highlights a brief but comprehensive theoretical treatment of neural networks,
training laws, the kinds of networks used, and some history.
Chapter 3 provides an introduction to the mechanical properties of plain carbon steel and
the parameters affecting these properties. The parameters are significant because they
can be used to enhance the predictive capabilities of the neural network.
Chapter 4 covers the experimental techniques used to measure the mechanical properties
on a laboratory scale. In this chapter, the relationships between the mechanical properties
are also briefly explained.
Chapter 5 consists of a tutorial on how to use the neural network toolbox in MATLAB.
MATLAB is a very powerful engineering and mathematical language, provided with a
built-in neural network toolbox that is very easy to use.
Chapter 6 describes the experimental work, and Chapter 7 presents the results,
conclusion, and future work.
Chapter No. 2
LITERATURE REVIEW
2.1 NEURAL NETWORK
An artificial neural network can best be defined as a system loosely modeled on the
human brain. The field goes by many names, such as connectionism, parallel distributed
processing, neuro-computing, natural intelligent systems, machine learning algorithms,
and artificial neural networks.
Neural networks have seen an explosion of interest over the last few years, and
are being successfully applied across an extraordinary range of problems, in areas as
diverse as finance, medicine, engineering, geology, and physics. In short, wherever there
is a problem of prediction, classification, or control, neural networks are finding
applications.
A neural network is an attempt to simulate, within specialized hardware or
sophisticated software, multiple layers of simple processing elements called neurons.
Each neuron is linked to certain of its neighbors with varying coefficients of connectivity
that represent the strengths of these connections; learning is accomplished by adjusting
these strengths so that the overall network outputs appropriate results.
2.2 CHARACTERISTICS OF NEURAL NETWORK
2.2.1 Capability of modeling: -
Neural networks are very sophisticated modeling techniques capable of modeling
extremely complex functions; in particular, neural networks are non-linear. For many
years, linear modeling was the commonly used technique in most modeling domains,
since linear models have well-known optimization strategies. Where the linear
approximation was not valid, the models suffered accordingly.
2.2.2 Ease of use: -
Since neural networks learn from examples, the neural network user gathers
representative data first, and then uses training algorithms to automatically learn the
structure of the data. Although the user does need some heuristic knowledge of
how to select and prepare the data, how to select an appropriate neural network, and how
to interpret the results, the level of user knowledge needed to successfully apply neural
networks is much lower than would be the case with traditional non-linear statistical
methods.
2.3 ANALOGY TO THE BRAIN
The most basic components of neural networks are modeled after the structure of
the brain. Some neural network structures do not correspond closely to the brain, and
some do not have a biological counterpart at all. However, neural networks have
a strong similarity to the biological brain, and therefore a great deal of the terminology is
borrowed from neuroscience.
2.4 THE BIOLOGICAL NETWORK
The most basic element of the human brain is a specific type of cell, which
provides us with the abilities to remember, think, and apply previous experiences to our
every action. These cells are known as neurons; each neuron can connect with up to
20,000 other neurons. The power of the brain comes from the number of these basic
components and the multiple connections between them.
All neurons have four basic components: dendrites, soma, axon, and synapses
(Fig 2.1). Basically, a biological neuron receives inputs from other sources, combines
them in some way, performs a generally non-linear operation on the result, and then
outputs the final result. The figure below shows a simplified biological neuron and the
relationship of its four components.
2.4.1 The work mechanism
The brain is principally composed of a very large number (circa 10,000,000,000)
of neurons, massively interconnected (with an average of several thousand interconnections
per neuron, although this varies enormously). Each neuron is a specialized cell, which
can propagate an electrochemical signal. The neuron has a branching input structure (the
dendrites), a cell body, and a branching output structure (the axon). The axons of one cell
connect to the dendrites of another via a synapse. When a neuron is activated, it fires an
electrochemical signal along the axon. This signal crosses the synapses to other neurons,
which may in turn fire. A neuron fires only if the total signal received at the cell body
from the dendrites exceeds a certain level (the firing threshold).
The strength of the signal received by a neuron (and therefore its chances of firing)
critically depends on the efficiency of the synapse. Each synapse actually contains a gap,
with neurotransmitter chemicals poised to transmit a signal across the gap.
Thus, from a very large number of extremely simple processing units (each
performing a weighted sum of its inputs, and then firing a binary signal if the total input
exceeds a certain level) the brain manages to perform extremely complex tasks. Of
course, there is a great deal of complexity in the brain which has not been discussed here,
but it is interesting that artificial Neural Networks can achieve some remarkable results
using a model not much more complex than this.
Figure: 2.1 A schematic diagram of a single neuron
2.5 THE ARTIFICIAL NEURON
The basic unit of neural networks, the artificial neuron, simulates the four basic
functions of natural neurons. Artificial neurons are much simpler than biological neurons;
the figure below shows the basics of an artificial neuron.
Figure: 2.3 Artificial Neuron
Note that the various inputs to the network are represented by the mathematical
symbol x(n). Each of these inputs is multiplied by a connection weight; these weights
are represented by w(n). In the simplest case, these products are simply summed, fed
through a transfer function to generate a result, and then output.
Even though all artificial neural networks are constructed from this basic building block,
networks differ in how these building blocks are arranged and in their details.
2.5.1 The work mechanism
A neuron receives a number of inputs (either from original data, or from the outputs of
other neurons in the network). Each input comes via a connection that has a
strength (or weight); these weights correspond to synaptic efficacy in a biological neuron.
Each neuron also has a single threshold value. The weighted sum of the inputs is formed,
and the threshold subtracted, to compose the activation of the neuron (also known as the
post-synaptic potential, or PSP, of the neuron).
The activation signal is passed through an activation function (also known as a transfer
function) to produce the output of the neuron.
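The mechanism just described can be sketched in a few lines of code. This is an illustrative sketch only (the function names and the sample weights are our own, not taken from any particular toolbox): a weighted sum of the inputs, minus the threshold, is passed through a sigmoid transfer function.

```python
# Sketch of the artificial neuron described above (illustrative names and
# values): form the weighted sum of the inputs, subtract the threshold to
# get the activation, then pass it through a transfer function.
import math

def logsig(a):
    # Log-sigmoid transfer function: squashes the activation into (0, 1).
    return 1.0 / (1.0 + math.exp(-a))

def neuron_output(inputs, weights, threshold):
    # Weighted sum minus threshold = activation (post-synaptic potential).
    activation = sum(x * w for x, w in zip(inputs, weights)) - threshold
    return logsig(activation)

print(neuron_output([1.0, 0.5], [0.8, -0.2], 0.3))
```

Raising the threshold lowers the activation and hence the output, which mirrors a biological neuron needing a stronger total signal before it fires.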
2.6 OUTSTANDING FEATURES OF NEURAL NETWORK
Neural networks perform successfully where other methods do not, recognizing
and matching complicated, vague, or incomplete patterns. Neural networks have been
applied in solving a wide variety of problems.
Prediction: - The most common use for neural networks is to predict what will
most likely happen. There are many areas where prediction can help in setting
priorities.
For example, the emergency room at a hospital can be a hectic place; knowing
who needs the most critical help can enable a more successful operation.
Basically, all organizations must establish priorities, which govern the allocation
of their resources. Neural networks have been used as a mechanism of knowledge
acquisition for expert systems in stock market forecasting, with astonishingly
accurate results. Neural networks have also been used for bankruptcy prediction
for credit card institutions.
Other most common applications of neural networks fall into the following categories:
Classification: - Use input values to determine the classification.
Data association: - Like classification, but also recognizing data that contains
errors. For example, not only identify the characters that were scanned, but also
identify when the scanner is not working properly.
Data conceptualization: - Analyze the inputs so that grouping relationships can
be inferred. For example, extract from a database the names of the customers
most likely to buy a particular product.
Data filtering: - Smooth an input signal. For example, take the noise out of a
telephone signal.
Generally speaking, neural network systems can be applied to interpretation,
prediction, diagnosis, planning, monitoring, debugging, repair, instruction, and control.
Applications of neural networks in materials:
Most neural network applications in materials science and engineering lie
in the category of prediction, modeling, and control. For example, the current work
is the prediction of the mechanical properties of plain carbon steels; Jokhio [2004]
has found applications of neural networks in the field of powder metallurgy.
Iqbal Shah [2002] worked on predicting the tensile properties of austenitic
stainless steels. H. K. D. H. Bhadeshia [1999] describes neural network applications
in controlling welding robots; predicting the solidification cracking of welds,
the strength of steel welds, hot cracking of welds, creep, fatigue
properties, and the fatigue threshold; the martensite start temperature; and, most
importantly, the prediction of continuous cooling transformation (CCT) diagrams.
2.7 MAIN FEATURES OF NEURAL NETWORK
• Artificial neural networks (ANNs) learn by experience rather than by modeling
or programming.
• ANN architectures are distributed, inherently parallel and potentially real time.
• They have the ability to generalize.
• They do not require a prior understanding of the process or phenomenon being
studied.
• They can form arbitrary continuous non-linear mappings.
• They are robust to noisy data.
• VLSI implementation is easy.
2.8 LIMITATIONS
• Tools for analysis and model validation are not well established.
• An intelligent machine can only solve some specific problem for which it is
trained.
• The human brain is very complex and cannot be fully simulated with present
computing power. An artificial neural network does not have the capability of
the human brain.
2.9 ADVANTAGES:
I. Adaptive learning: An ability to learn how to do tasks based on the data given
for training or initial experience.
II. Self – organization: An ANN can create its own organization or
representation of the information it receives during learn time.
III. Real Time operation: ANN computations may be carried out in parallel, and
special hardware devices are being designed and manufactured which take
advantage of this capability.
IV. Fault Tolerance via redundant information coding: Partial destruction of a
network leads to the corresponding degradation of performance. However,
some network capabilities may be retained even with major network damage.
2.10 CLASSIFICATION OF NEURAL NETWORKS
Neural networks can be classified as [Wikipedia]
a) Feed forward b) Back propagation
According to architecture, neural networks can be classified as
a) Single-layer perceptrons b) Multi-layer perceptrons
Other types include
a) Simple recurrent network b) Hopfield network
c) Boltzmann machine d) Support vector machine
e) Committee of machines f) Self-organizing map
2.10.1 Feed forward Networks
Feed forward ANNs allow signals to travel one way only, from input to output.
There is no feedback (loop) i.e. the output of any layer does not affect that same layer.
Feed forward ANNs tend to be straightforward networks that associate inputs with
outputs. They are extensively used in pattern recognition. This type of organization is
also referred to as bottom up or top down.
Figure: 2.4 Feed Forward Networks
2.10.2 The back propagation
The term is an abbreviation for "backwards propagation of errors". Back
propagation still has advantages in some circumstances, and is the easiest algorithm to
understand. There are also heuristic modifications of back propagation which work well
for some problem domains, such as quick propagation.
In back propagation, the gradient vector of the error surface is calculated. This
vector points along the line of steepest descent from the current point, so we know that if
we move along it a “short” distance, we will decrease the error. A sequence of such
moves (slowing as we near the bottom) will eventually find a minimum of some sort. The
difficult part is to decide how large the steps should be.
Large steps may converge more quickly, but may also overstep or (if the error
surface is very eccentric) go off in the wrong direction.
A classic example of this in neural network training is where the algorithm
progresses very slowly along a steep, narrow, valley, bouncing from one side across to
the other. In contrast, very small steps may go in the correct direction, but they also
require a large number of iterations. In practice, the step size is taken proportional to the
slope (so that the algorithm settles down in a minimum) and to a special constant: the
learning rate. The correct setting for the learning rate is application-dependent, and is
typically chosen by experiment; it may also be time-varying, getting smaller as the
algorithm progresses.
The algorithm is also usually modified by the inclusion of a momentum term: this
encourages movement in a fixed direction, so that if several steps are taken in the same
direction, the algorithm “picks up speed”, which gives it the ability to (sometimes) escape
local minima, and also to move rapidly over flat spots and plateaus.
The algorithm therefore progresses iteratively, through a number of epochs. On
each epoch, the training cases are each submitted in turn to the network, target and
actual outputs are compared, and the error is calculated. This error, together with the
error surface gradient, is used to adjust the weights, and then the process repeats. The
initial network configuration is random, and training stops when a given number of
epochs elapses, when the error reaches an acceptable level, or when the error stops
improving (you can select which of these stopping conditions to use).
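The step-size and momentum ideas above can be sketched on a toy error surface. The quadratic error function, the learning rate, and the momentum value below are illustrative choices standing in for a real network's error surface, not the thesis's actual training setup:

```python
# Sketch of gradient descent with a learning rate and a momentum term, as
# described above. The error surface is a toy quadratic, E(w) = (w - 3)^2,
# chosen only so the behaviour is easy to follow.

def error(w):
    return (w - 3.0) ** 2

def gradient(w):
    return 2.0 * (w - 3.0)

def descend(w, learning_rate=0.1, momentum=0.8, epochs=200, goal=1e-8):
    step = 0.0
    for _ in range(epochs):                 # progress through epochs
        if error(w) < goal:                 # error reached an acceptable level
            break
        # Momentum keeps part of the previous step; the rest is a short move
        # down the line of steepest descent, scaled by the learning rate.
        step = momentum * step - learning_rate * gradient(w)
        w += step
    return w

print(descend(0.0))  # approaches the minimum at w = 3
```

With momentum the early steps overshoot and oscillate slightly around the minimum, but the search picks up speed over the flat approach and still settles, which is exactly the trade-off the text describes.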
Figure: 2.5 Back propagation
2.10.3 Single layer perceptron
The earliest kind of neural network is a single-layer perceptron network, which
consists of a single layer of output nodes; the inputs are fed directly to the outputs via a
series of weights. In this way it can be considered the simplest kind of feedforward
network. The sum of the products of the weights and the inputs is calculated in each
node, and if the value is above some threshold (typically 0) the neuron fires and takes the
value 1; otherwise it takes the value -1. Neurons with this kind of activation function are
also called McCulloch-Pitts neurons or threshold neurons. In the literature the term
perceptron often refers to networks consisting of just one of these units.
Perceptrons can be trained by a simple learning algorithm that is usually called
the delta-rule. It calculates the errors between calculated output and sample output data,
and uses this to create an adjustment to the weights, thus implementing a form of
gradient descent.
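As an illustrative sketch of the delta rule (the AND-style data set and the learning rate are our own choices, not from the source), a threshold perceptron can be trained on a small linearly separable problem:

```python
# Sketch of a single-layer perceptron trained with the delta rule, as
# described above. The data set and learning rate are illustrative.

def predict(x, w, b):
    # Threshold (McCulloch-Pitts) neuron: fire +1 above threshold, else -1.
    s = sum(xi * wi for xi, wi in zip(x, w)) + b
    return 1 if s > 0 else -1

def train(samples, lr=0.1, epochs=100):
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in samples:
            # Delta rule: adjust the weights by the error between the
            # sample (target) output and the calculated output.
            err = target - predict(x, w, b)
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

# Linearly separable data: logical AND with +1/-1 outputs.
data = [([0, 0], -1), ([0, 1], -1), ([1, 0], -1), ([1, 1], 1)]
w, b = train(data)
print([predict(x, w, b) for x, _ in data])  # [-1, -1, -1, 1]
```

Because the data are linearly separable, the weight updates stop changing once every sample is classified correctly; on a non-separable problem such as XOR this single-layer network can never converge, which motivates the multi-layer perceptron of the next section.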
2.10.4 Multi-layer perceptron
This class of networks consists of multiple layers of computational units, usually
interconnected in a feedforward way. This means that each neuron in one layer has
directed connections to the neurons of the subsequent layer. In many applications the
units of these networks apply a sigmoid function as an activation function.
The universal approximation theorem for neural networks states that every
continuous function that maps intervals of real numbers to some output interval of real
numbers can be approximated arbitrarily closely by a multi-layer perceptron with just
one hidden layer. This result holds only for restricted classes of activation functions, e.g.
for the sigmoidal functions.
Multi-layer networks use a variety of learning techniques, the most popular being
backpropagation. Here the output values are compared with the correct answer to
compute the value of some predefined error-function. By various techniques the error is
then fed back through the network. Using this information, the algorithm adjusts the
weights of each connection in order to reduce the value of the error-function by some
small amount. After repeating this process for a sufficiently large number of training
cycles the network will usually converge to some state where the error of the
calculations is small. In this case one says that the network has learned a certain target
function. To adjust the weights properly, one applies a general method for nonlinear
optimization tasks called gradient descent: the derivative of the error function
with respect to the network weights is calculated, and the weights are then
changed such that the error decreases (thus going downhill on the surface of the error
function). For this reason backpropagation can only be applied to networks with
differentiable activation functions.
In general the problem of reaching a network that performs well, even on
examples that were not used as training examples, is a quite subtle issue that requires
additional techniques. This is especially important for cases where only very limited
numbers of training examples are available. The danger is that the network overfits the
training data and fails to capture the true statistical process generating the data.
Statistical learning theory is concerned with training classifiers on a limited amount of
data. In the context of neural networks a simple heuristic, called early stopping, often
ensures that the network will generalize well to examples not in the training set.
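The early-stopping heuristic can be sketched as follows. The validation-error sequence and the "patience" parameter below are invented for illustration; in practice the errors come from evaluating the network on a held-out set after each training epoch:

```python
# Sketch of the early-stopping heuristic mentioned above: stop training when
# the error on a held-out validation set stops improving, even though the
# training error may still be falling.

def early_stop_epoch(validation_errors, patience=3):
    # Return the epoch whose weights should be kept.
    best_err, best_epoch, waited = float("inf"), 0, 0
    for epoch, err in enumerate(validation_errors):
        if err < best_err:
            best_err, best_epoch, waited = err, epoch, 0
        else:
            waited += 1                 # no improvement this epoch
            if waited >= patience:      # assume the network is now overfitting
                break
    return best_epoch

# Validation error falls, then rises as the network begins to overfit.
errs = [0.9, 0.5, 0.3, 0.25, 0.27, 0.31, 0.40, 0.50]
print(early_stop_epoch(errs))  # 3
```

The weights saved at the returned epoch are the ones that generalized best to data the network never trained on, which is the point of the technique.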
Other typical problems of the back-propagation algorithm are the speed of
convergence and the possibility of ending up in a local minimum of the error function.
Today there are practical solutions that make backpropagation in multi-layer perceptrons
the solution of choice for many machine learning tasks.
2.10.5 Simple recurrent network
A simple recurrent network (SRN) is a variation on the multi-layer perceptron,
sometimes called an "Elman network" due to its invention by Professor Jeff Elman. A
three-layer network is used, with the addition of a set of "context units" in the input
layer. There are connections from the middle ("hidden") layer to these context units fixed
with weight 1. At each time step, the input is propagated in a standard feedforward
fashion, and then a learning rule (usually backpropagation) is applied. The fixed back
connections result in the context units always maintaining a copy of the previous values
of the hidden units (since they propagate over the connections before the learning rule is
applied). Thus the network can maintain a sort of state, allowing it to perform such tasks
as sequence-prediction that are beyond the power of a standard multi-layer perceptron.
2.10.6 Hopfield network
The Hopfield net is a recurrent neural network in which all connections are
symmetric. This network has the property that its dynamics are guaranteed to converge.
If the connections are trained using Hebbian learning then the Hopfield network can
perform robust content-addressable memory, robust to connection alteration.
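A minimal sketch of Hebbian storage and recall in a Hopfield net follows (the stored pattern and the flipped unit are illustrative choices of ours):

```python
# Sketch of a Hopfield net: symmetric weights built with a Hebbian rule from
# a stored pattern, then a recall loop that repeatedly updates the +1/-1
# units until the state settles on a stored memory.

def hebbian_weights(patterns):
    n = len(patterns[0])
    # w[i][j] = sum of x_i * x_j over stored patterns; symmetric, zero diagonal.
    return [[0 if i == j else sum(p[i] * p[j] for p in patterns)
             for j in range(n)] for i in range(n)]

def recall(w, state, sweeps=10):
    state = list(state)
    for _ in range(sweeps):
        for i in range(len(state)):
            # Each unit takes the sign of its weighted input from the others.
            s = sum(w[i][j] * state[j] for j in range(len(state)))
            state[i] = 1 if s >= 0 else -1
    return state

stored = [1, -1, 1, -1, 1, -1]
w = hebbian_weights([stored])
noisy = [1, -1, -1, -1, 1, -1]   # one unit flipped
print(recall(w, noisy))          # recovers the stored pattern
```

Starting from a corrupted state and still settling on the stored pattern is the content-addressable-memory behaviour the paragraph describes.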
2.10.7 Boltzmann machine
The Boltzmann machine can be thought of as a noisy Hopfield network. The
Boltzmann machine was important because it was one of the first neural networks in
which learning of latent variables (hidden units) was demonstrated. Boltzmann machine
learning was slow to simulate, but the Contrastive Divergence algorithm of Geoff Hinton
allows models including Boltzmann machines and Product of Experts to be trained much
faster.
2.10.8 Committee of machines
A committee of machines (CoM) is a collection of different neural networks that
together vote on a given example. This has been seen to give much better results. In
fact, in many cases, starting with the same architecture and training data but different
initial random weights gives vastly different networks. A CoM tends to stabilize the result.
2.10.9 Self-organizing map
The Self-organizing map (SOM), sometimes referred to as "Kohonen map", is an
unsupervised learning technique that reduces the dimensionality of data through the use
of a self-organizing neural network. A probabilistic version of SOM is the Generative
Topographic Map (GTM).
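A minimal SOM sketch follows (a 1-D map of 2-D points keeps it short; the data, node count, and learning rate are illustrative, and real SOMs usually shrink the neighbourhood radius and learning rate over time):

```python
# Sketch of a self-organizing map: each input is matched to its closest map
# node (the best matching unit, BMU), and that node and its neighbours are
# pulled toward the input, so nearby nodes come to represent nearby data.

def bmu(nodes, x):
    # Index of the node closest (squared Euclidean distance) to input x.
    return min(range(len(nodes)),
               key=lambda i: sum((nodes[i][d] - x[d]) ** 2 for d in range(len(x))))

def train_som(data, n_nodes=5, epochs=20, lr=0.3, radius=1):
    # Nodes start spread along the diagonal of the unit square.
    nodes = [[i / (n_nodes - 1), i / (n_nodes - 1)] for i in range(n_nodes)]
    for _ in range(epochs):
        for x in data:
            win = bmu(nodes, x)
            # Pull the winner and its neighbours (within `radius` on the
            # 1-D map) toward the input.
            for i in range(max(0, win - radius), min(n_nodes, win + radius + 1)):
                for d in range(len(x)):
                    nodes[i][d] += lr * (x[d] - nodes[i][d])
    return nodes

# Two well-separated clusters end up represented by different map nodes.
data = [[0.1, 0.1], [0.15, 0.15], [0.85, 0.85], [0.9, 0.9]]
nodes = train_som(data)
print(bmu(nodes, [0.1, 0.1]), bmu(nodes, [0.9, 0.9]))
```

No target outputs are supplied anywhere, which is what makes the SOM an unsupervised, dimensionality-reducing technique.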
2.11 DESIGN
The developer must go through a period of trial and error in the design
decisions before coming up with a satisfactory design. The design issues in neural
networks are complex and are the major concerns of system developers.
Designing a neural network consist of:
• Arranging neurons in various layers.
• Deciding the type of connections among neurons for different
layers, as well as among the neurons within a layer.
• Deciding the way a neuron receives input and produces output.
• Determining the strength of the connections within the network by
allowing the network to learn the appropriate values of the
connection weights from a training data set.
The process of designing a neural network is an iterative process. Below are its
basic steps.
2.11.1 Layers
Biologically, neural networks are constructed in a three-dimensional way
from microscopic components, and these neurons seem capable of nearly unrestricted
interconnection. This is not true of any man-made network. Artificial neural
networks are simple clusterings of primitive artificial neurons. This
clustering occurs by creating layers, which are then connected to one another.
How these layers connect may also vary. Basically, all artificial neural networks
have a similar topology. Some of the neurons interface with the real world
to receive its inputs, and other neurons provide the real world with the network's
outputs. All the rest of the neurons are hidden from view.
Figure: 2.6 network layers
As the figure above shows, the neurons are grouped into layers. The input
layer consists of neurons that receive input from the external environment. The
output layer consists of neurons that communicate the output of the system to the
user or external environment. There are usually a number of hidden layers
between these two; the figure above shows a simple structure with only one
hidden layer.
When the input layer receives the input, its neurons produce output, which
becomes input to the other layers of the system. The process continues until a
certain condition is satisfied or until the output layer fires its
output to the external environment.
To determine the number of hidden neurons the network needs to
perform at its best, one is often left with trial and error. Increasing the
number of hidden neurons too far leads to overfitting: the network
memorizes the training data and has trouble generalizing, making it
useless on new data sets.
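The input–hidden–output flow described above can be sketched as a simple forward pass. The layer sizes, weights and input values here are hypothetical, chosen only to illustrate the structure.

```python
import numpy as np

# Layered network sketch: 3 input neurons, one hidden layer of 4 neurons,
# 1 output neuron, fully connected between consecutive layers.

rng = np.random.default_rng(0)
W_hidden = rng.normal(size=(3, 4))   # input -> hidden connection weights
W_output = rng.normal(size=(4, 1))   # hidden -> output connection weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x):
    """Each layer's neurons sum their weighted inputs and pass the result
    through an activation function before handing it to the next layer."""
    hidden = sigmoid(x @ W_hidden)   # input layer feeds the hidden layer
    return sigmoid(hidden @ W_output)  # hidden layer feeds the output layer

out = forward(np.array([0.2, -0.5, 0.9]))
print(out)   # a single output value between 0 and 1
```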
2.11.2 Communication and types of connections
Neurons are connected via a network of paths carrying the output of one
neuron as input to another neuron. These paths are normally unidirectional; there
might, however, be a two-way connection between two neurons, because there may
be another path in the reverse direction. A neuron receives input from many neurons
but produces a single output, which is communicated to other neurons.
The neurons in a layer may communicate with each other, or they may
have no connections. The neurons of one layer are always connected to the
neurons of at least one other layer.
2.11.2.1 Inter-layer connections
There are different types of connections used between layers; these
connections between layers are called inter-layer connections.
• Fully connected
Each neuron on the first layer is connected to every neuron
on the second layer.
• Partially connected
A neuron of the first layer does not have to be connected to
all neurons on the second layer.
• Feed forward
The neurons on the first layer send their output to the
neurons on the second layer, but they do not receive any input back
from the neurons on the second layer.
• Bi-directional
There is another set of connections carrying the output of
the neurons of the second layer into the neurons of the first layer.
Feed forward and bi-directional connections could be fully- or partially
connected.
• Hierarchical
If a neural network has a hierarchical structure, the neurons of a lower
layer may only communicate with neurons on the next level of layer.
• Resonance
The layers have bi-directional connections, and they can continue
sending messages across the connections a number of times until a
certain condition is achieved.
2.11.2.2 Intra-layer connections
In more complex structures the neurons communicate among themselves
within a layer, this is known as intra-layer connections. There are two types of
intra-layer connections.
• Recurrent
The neurons within a layer are fully- or partially connected to one
another. After these neurons receive input from another layer, they
communicate their outputs with one another a number of times before they
are allowed to send their outputs to another layer. Generally some
conditions among the neurons of the layer should be achieved before they
communicate their outputs to another layer.
• On-center/off-surround
A neuron within a layer has excitatory connections to itself and its
immediate neighbors, and inhibitory connections to the other neurons.
One can imagine this type of connection as a competitive gang of
neurons. Each gang excites itself and its gang members and inhibits all
members of other gangs. After a few rounds of signal interchange, the
neuron with the most active output value wins, and is allowed to update
its weights and those of its gang members. (There are two types of
connections between two neurons, excitatory and inhibitory. In an
excitatory connection, the output of one neuron increases the action
potential of the neuron to which it is connected. When the connection
between two neurons is inhibitory, the output of the sending neuron
reduces the activity or action potential of the receiving neuron. One
causes the summing mechanism of the next neuron to add while the other
causes it to subtract; one excites while the other inhibits.)
2.11.3 Learning
The brain basically learns from experience. Neural networks are
sometimes called machine-learning algorithms, because adjusting their
connection weights (training) causes the network to learn the solution to a
problem. The strength of the connection between two neurons is stored as a
weight value for that specific connection, and the system learns new knowledge
by adjusting these connection weights.
The learning ability of a neural network is determined by its architecture
and by the algorithmic method chosen for training.
The training method usually consists of one of three schemes:
1. Unsupervised learning
The hidden neurons must find a way to organize themselves without
help from the outside. In this approach, no sample outputs are
provided to the network against which it can measure its predictive
performance for a given vector of inputs. This is learning by doing.
2. Reinforcement learning
This method works on reinforcement from the outside. The
connections among the neurons in the hidden layer are randomly
arranged, then reshuffled as the network is told how close it is to
solving the problem. Reinforcement learning is also called supervised
learning, because it requires a teacher. The teacher may be a training
set of data or an observer who grades the performance of the network
results.
Both unsupervised and reinforcement learning suffer from relative slowness
and inefficiency, relying on random shuffling to find the proper
connection weights.
3. Back propagation
This method has proven highly successful in training multilayered
neural nets. The network is not just given reinforcement for how it is
doing on a task; information about errors is also filtered back through
the system and used to adjust the connections between the layers,
thus improving performance. It is a form of supervised learning.
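The error-filtering idea described above can be sketched compactly. This is an illustrative implementation, not the thesis's own code; the XOR task, layer sizes and learning rate are hypothetical choices.

```python
import numpy as np

# Back-propagation sketch: the output error is propagated back through
# the hidden layer and used to adjust both weight matrices.

rng = np.random.default_rng(42)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets

W1 = rng.normal(size=(2, 4))   # input -> hidden weights
W2 = rng.normal(size=(4, 1))   # hidden -> output weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
errors = []
for epoch in range(5000):
    H = sigmoid(X @ W1)                 # forward pass
    Y = sigmoid(H @ W2)
    E = T - Y
    errors.append(float(np.mean(E ** 2)))
    # backward pass: filter the error back one layer at a time
    dY = E * Y * (1 - Y)
    dH = (dY @ W2.T) * H * (1 - H)
    W2 += lr * H.T @ dY
    W1 += lr * X.T @ dH

print(errors[0], errors[-1])   # mean squared error falls during training
```

The falling error illustrates how feeding the error back through the layers steadily improves performance on the task.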
2.11.3.1 Off-line or On-line
One can categorize the learning methods into yet another group, off-line or
on-line. When the system uses input data to change its weights to learn the
domain knowledge, the system could be in training mode or learning mode. When
the system is being used as a decision aid to make recommendations, it is in the
operation mode; this is also sometimes called recall.
• Off-line
In the off-line learning methods, once the system enters the
operation mode, its weights are fixed and do not change any more.
Most of the networks are of the off-line learning type.
• On-line
In on-line or real time learning, when the system is in operating
mode (recall), it continues to learn while being used as a decision
tool. This type of learning has a more complex design structure.
2.11.3.2 Learning laws
There are a variety of learning laws in common use. These laws
are mathematical algorithms used to update the connection weights. Most of them
are variations of the best-known and oldest learning law,
Hebb's Rule. Our understanding of how neural processing actually works is
very limited, and learning is certainly more complex than the simplification
represented by the learning laws developed so far. Research into different
learning functions continues as new ideas routinely show up in trade publications.
A few of the major laws are given as examples below.
• Hebb’s Rule
The first and the best-known learning rule was introduced by Donald
Hebb. This basic rule is: If a neuron receives an input from another
neuron, and if both are highly active (mathematically have the same
sign), the weight between the neurons should be strengthened.
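Hebb's rule as stated above can be written in one line. The variable names and the learning rate here are hypothetical, for illustration only.

```python
# Hebb's rule sketch: when the activities of the connected neurons share
# a sign, the weight grows; when they oppose, it shrinks.

def hebb_update(w, pre, post, lr=0.1):
    """Change w in proportion to the product of the two activities."""
    return w + lr * pre * post

w = 0.0
w = hebb_update(w, pre=1.0, post=1.0)    # both active: weight strengthened
w = hebb_update(w, pre=-1.0, post=1.0)   # opposite signs: weight weakened
print(w)   # the two updates cancel: 0.0
```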
• Hopfield Law
This law is similar to Hebb’s Rule with the exception that it specifies
the magnitude of the strengthening or weakening. It states, "if the
desired output and the input are both active or both inactive, increment
the connection weight by the learning rate, otherwise decrement the
weight by the learning rate." (Most learning functions have some
provision for a learning rate, or a learning constant. Usually this term
is positive and between zero and one.)
• The Delta Rule
The Delta Rule is a further variation of Hebb’s Rule, and it is one of
the most commonly used. This rule is based on the idea of
continuously modifying the strengths of the input connections to
reduce the difference (the delta) between the desired output value and
the actual output of a neuron. This rule changes the connection weights
in the way that minimizes the mean squared error of the network. The
error is back propagated into previous layers one layer at a time. The
process of back-propagating the network errors continues until the first
layer is reached. The network type called Feed forward, Back-
propagation derives its name from this method of computing the error
term.
This rule is also referred to as the Widrow-Hoff Learning Rule and
the Least Mean Square (LMS) Learning Rule.
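The Delta (Widrow-Hoff / LMS) update described above can be sketched as follows. The example mapping and learning parameters are hypothetical.

```python
import numpy as np

# Delta-rule sketch: the weight change is proportional to the difference
# (the delta) between the desired output and the actual output.

def delta_rule_fit(X, t, lr=0.1, epochs=100):
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xi, ti in zip(X, t):
            y = xi @ w                  # actual output of the neuron
            w += lr * (ti - y) * xi     # reduce the delta (t - y)
    return w

# Learn a known linear mapping t = 3*x1 - 2*x2 (hypothetical example data).
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (40, 2))
t = X @ np.array([3.0, -2.0])
w = delta_rule_fit(X, t)
print(w.round(3))   # approaches [3, -2]
```

Repeatedly shrinking the delta drives the mean squared error of the neuron toward its minimum, which is exactly the behaviour the rule is named for.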
• Kohonen’s Learning Law
This procedure, developed by Teuvo Kohonen, was inspired by
learning in biological systems. In this procedure, the neurons compete
for the opportunity to learn, or to update their weights. The processing
neuron with the largest output is declared the winner and has the
capability of inhibiting its competitors as well as exciting its
neighbors. Only the winner is permitted output, and only the winner
plus its neighbors are allowed to update their connection weights.
The Kohonen rule does not require a desired output; therefore it is
implemented in unsupervised methods of learning. Kohonen has
used this rule, combined with the on-center/off-surround intra-layer
connection, to create the self-organizing neural network, which has an
unsupervised learning method.
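A winner-take-all step of Kohonen's rule can be sketched as below. The names and numbers are illustrative; a full SOM would also update the winner's neighbours.

```python
import numpy as np

# Kohonen-rule sketch: the neuron whose weight vector best matches the
# input wins the competition, and only the winner updates its weights.

def kohonen_step(weights, x, lr=0.2):
    dists = np.linalg.norm(weights - x, axis=1)
    winner = int(np.argmin(dists))                  # competition for the input
    weights[winner] += lr * (x - weights[winner])   # only the winner learns
    return winner

weights = np.array([[0.0, 0.0], [1.0, 1.0]])
winner = kohonen_step(weights, np.array([0.9, 0.8]))
print(winner, weights[winner])   # node 1 wins and moves toward the input
```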
Chapter No. 3
MECHANICAL PROPERTIES OF STEEL AND FACTORS
AFFECTING MECHANICAL PROPERTIES
The mechanical properties of materials describe their ability to withstand
internal or external physico-mechanical forces such as pulling, pushing, twisting,
bending and sudden impact. In general terms, these properties are various kinds of
strength.
These properties are measured by means of destructive testing of materials in
the laboratory; however, it is very difficult to reproduce actual service conditions
in the laboratory. The following material properties are of great importance.
3.1.1 Tensile Strength
The ratio of the maximum load to the original cross-sectional area is called the
tensile strength or ultimate tensile strength. It relates to the ability of a material
to withstand external mechanical forces such as pulling, pushing, twisting, bending
and sudden impact. The tensile or ultimate strength is the maximum point shown on
the stress-strain curve (Fig. 3.1c).
The tensile strength value is commonly taken as a basis for fixing the working
stresses, especially in brittle materials. The units of tensile strength are kg/cm².
(a) low-carbon steel  (b) non-ferrous metals  (c) stress-strain curve showing types of static strength.
Figure 3.1 Yield point and yield strength
3.1.2 Yield Strength
When metals are subjected to a tensile force, they stretch or elongate as the stress
increases. The point where the stretch suddenly increases is known as the yield strength
of the material. The yield strength represents the stress below which the
deformation is almost entirely elastic; it is the value of stress at which a material
exhibits a specified deviation from proportionality of stress and strain.
It can also be defined as the ability of a material to resist plastic deformation,
and is calculated by dividing the force initiating yield by the original cross-sectional
area of the specimen.
In materials where the proportional limit or the elastic limit (Fig. 3.1b) is less
obvious, it is common to define the yield load as the force required to give a 0.2%
plastic offset. In other words, the yield strength is defined as the stress required to
produce an arbitrary permanent deformation. The deformation most often used is 0.2%
(Fig. 3.1), commonly referred to as the proof strain.
3.1.3 Elasticity
Loading a solid will change its dimensions, but the resulting deformation will
disappear upon unloading. This tendency of a deformed solid to seek its original
dimensions upon unloading is ascribed to a property called elasticity.
The recovery from the distorting effects of the loads may be instantaneous or
gradual, complete or partial. A solid is called perfectly elastic if this recovery is
instantaneous and complete; it is said to exhibit delayed elasticity or inelastic effects,
respectively, if the recovery is gradual or incomplete. Accurate measurements reveal
some delayed elasticity and inelastic effects in all solids.
3.1.4 Plasticity
Plasticity is that property of a material by virtue of which it may be permanently
deformed when subjected to an externally applied force great enough to exceed the
elastic limit. It is of great importance to the fabrication engineer because it is the
property that enables a material to be shaped in the solid state.
For most materials, plastic deformation follows elastic deformation.
Referring to the stress-strain curve (Fig. 3.1), a material obeys the law of elastic
solids for stresses below the yield stress, and this is followed by plastic deformation.
The mechanism of plastic deformation is essentially different in crystalline
materials and amorphous materials. Crystalline materials undergo plastic deformation as
the result of slip along definite crystallographic planes, whereas in amorphous materials
plastic deformation occurs when individual molecules or groups of molecules slide past
one another.
3.1.5 Ductility
Ductility refers to the capability of a material to undergo deformation under
tension without rupture. It is the ability of a material to be drawn from a large section
to a small section, as in wire drawing.
Ductility may be expressed as percent elongation (%EL) or percent area reduction
(%AR). From a tensile test:
Percent elongation: %EL = (lf − lo) × 100 / lo
Percent area reduction: %AR = (Ao − Af) × 100 / Ao
where
lf is the fracture length,
lo is the original gauge length,
Ao is the original cross-sectional area, and
Af is the cross-sectional area at the point of fracture.
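The two ductility measures can be computed directly from tensile-test readings. The numbers below are hypothetical, for illustration only.

```python
# Ductility measures from a tensile test, as defined above.

def percent_elongation(l_o, l_f):
    """%EL = (lf - lo) * 100 / lo"""
    return (l_f - l_o) * 100.0 / l_o

def percent_area_reduction(a_o, a_f):
    """%AR = (Ao - Af) * 100 / Ao"""
    return (a_o - a_f) * 100.0 / a_o

# A 50 mm gauge length stretching to 60 mm at fracture:
print(percent_elongation(50.0, 60.0))       # 20 %
# Cross-section shrinking from 78.5 mm^2 to 62.8 mm^2:
print(percent_area_reduction(78.5, 62.8))   # about 20 %
```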
Ductility is a measure of the degree of plastic deformation that has been sustained
at fracture. Knowledge of the ductility of materials is important for at least two reasons:
• First it indicates to a designer the degree to which a structure will deform
plastically before fracture.
• Second, it specifies the degree of allowable deformation during fabrication
operation.
3.1.6 Brittleness
Brittleness is defined as a tendency to fracture without appreciable deformation
and is therefore the opposite of ductility or malleability. A brittle material will fracture
with little permanent deformation/distortion; it is a sudden failure. A brittle material is
hard and has little ductility. It will not stretch or bend before breaking. Cast iron is an
example of a brittle material.
If a material can be mechanically worked to a different size or shape without
breaking or shattering, it is ductile or malleable; but if little or no change in the
dimensions can be made before fracture occurs, it is brittle.
Technically speaking, if the elongation is less than 5% in a 50 mm gauge length,
the material is regarded as brittle. Brittle fractures normally follow the grain
boundaries (intergranular or intercrystalline), whereas ductile fractures normally
occur through the grains (transgranular or transcrystalline).
3.1.7 Toughness
Toughness is the ability of a material to absorb energy during plastic
deformation up to fracture. It refers to the ability of a material to withstand bending
or the application of shear stresses without fracture. By this definition, copper is
extremely tough but cast iron is not.
Specimen geometry as well as the manner of load application is important in
toughness determinations. For dynamic loading conditions and when a notch (or point of
stress concentration) is present, notch toughness is assessed by using an impact test.
Furthermore, fracture toughness is a property indicative of a material’s resistance to
fracture when a crack is present.
For the static situation, toughness may be ascertained from the results of a tensile
stress-strain test. Toughness of a material, then, is indicated by the total area under the
material’s tensile stress-strain curve up to the point of fracture.
3.1.8 Hardness
Hardness is the resistance of a material to penetration or scratching. The term
may also refer to stiffness or temper, or to resistance to abrasion or cutting. Tests
such as Brinell, Rockwell and Vickers are generally employed to measure hardness.
The hardness of materials depends upon the type of bonding forces between
atoms, ions or molecules and increases, like strength, with the magnitude of these forces.
Thus molecular solids such as plastics are relatively soft, metallic and ionic solids are
harder than molecular solids, and covalent solids are the hardest materials known.
3.1.9 Fatigue
When subjected to fluctuating or repeated loads (or stresses), materials tend to
develop a characteristic behavior which is different from that under
steady load. Fatigue is the phenomenon that leads to fracture under such conditions.
Fracture takes place under repeated or fluctuating stresses whose maximum value is less
than the tensile strength of the material (under steady loads). Fatigue fracture is
progressive, beginning as minute cracks that grow under the action of fluctuating stress.
The term fatigue is used because this type of failure normally occurs after a
lengthy period of repeated stress or strain cycling.
Fatigue is important inasmuch as it is the single largest cause of failure in metals
(bridges, aircraft and machine components), estimated to account for approximately 90%
of all metallic failures; it is catastrophic and insidious, occurring very suddenly and
without warning.
Fatigue failure is brittle-like in nature even in normally ductile metals, in that
there is very little, if any, gross plastic deformation associated with failure. The process
occurs by the initiation and propagation of cracks, and ordinarily the fracture surface is
perpendicular to the direction of an applied tensile stress.
3.1.10 Creep
Creep is the time-dependent permanent deformation that occurs under stress; for
most materials, it is important only at elevated temperatures. Materials are often placed in
service at elevated temperatures and exposed to static mechanical stresses (e.g., turbine
rotors in jet engines and steam generators that experience centrifugal stresses and high-
pressure steam lines). Deformation under such circumstances is termed creep.
3.2 FACTORS AFFECTING MECHANICAL PROPERTIES
Mechanical properties of materials are affected by:
1. Alloy contents such as addition of W, Cr, etc. improve hardness and strength.
2. Grain size and microstructure.
3. Crystal imperfections such as dislocations.
4. Manufacturing defects such as cracks, blowholes etc.
5. Physio-mechanical treatments.
3.2.1 Effect Of Grain Size On Properties Of Metals
On the basis of grain size, materials may be classified as:
1. Coarse-grained materials, (the grain size is large).
2. Fine-grained materials, (the grain size is small).
Grain size is very important in deciding the properties of polycrystalline materials
because it affects the area and length of the grain boundaries.
Various effects of grain size on mechanical properties of metals are:
1. Fine-grained materials possess higher strength, toughness, hardness and
resistance to suddenly applied force.
2. Fine-grained materials possess better fatigue resistance, and impact strength.
3. Fine-grained materials are more crack-resistant and provide a better finish in
deep drawing, unlike coarse-grained ones, which give rise to the orange-peel effect.
4. Fine-grained steel develops hardness faster in carburising (heat treatment).
5. Fine-grained materials are preferred for structural applications.
6. Fine-grained materials generally exhibit greater yield stresses than coarse-
grained materials at low temperature, whereas at high temperatures grain
boundaries become weak and sliding occurs.
7. A coarse-grained material is responsible for surface roughness.
8. A coarse grained material possesses more ductility, malleability (forging,
rolling, etc.) and better machinability.
9. Coarse-grained metals are difficult to polish or plate (a rough surface
remains visible even after polishing).
10. Coarse-grained steels have greater depth of hardening power as compared to
fine-grained ones.
11. At elevated temperatures, coarse-grained materials show better creep strength
than the fine-grained ones.
3.2.2 Effect Of Heat Treatment On Properties Of Metals
Heat treatment is an operation or combination of operations involving heating and
cooling of a metal or alloy in the solid state to obtain a desirable behavior or set of
properties. It affects the grain size and shape in the metal, and a change in
microstructure may or may not take place. By controlling the grain size and type of
microstructure, the desired mechanical properties can be achieved through heat
treatment.
Some important heat-treatment processes are:
Annealing Normalizing
Hardening Tempering
Martempering Austempering etc.
One or another of these heat-treatment processes produces the following effects on
the properties of metals:
1. Hardens and strengthens the metals.
2. Improves machinability.
3. Changes or refines grain size.
4. Softens metals for further working as in wire drawing.
5. Improves ductility and toughness.
6. Increases resistance of materials to heat, wear, shock and corrosion.
7. Improves electrical and magnetic properties.
8. Homogenises the metal structure.
9. Relieves internal stresses developed in metals / alloys during cold working,
welding, casting, forging etc.
10. Produces a hard wear resistant surface on a ductile steel piece (as in case
hardening).
11. Improves thermal properties such as conductivity.
3.2.3 Effect of environmental variables
Gaseous environment: The atmosphere contains mainly nitrogen and oxygen, to
which are added gaseous products such as sulphur dioxide, hydrogen sulphide, moisture,
chlorine and fluorine as industrial and other pollutants.
On account of oxygen, an oxide film forms on metals. In the presence of humid
air, an oxide film (rust) forms on the surface of mild steel, which is not desirable.
Liquid environment: When exposed to moist (and saline) atmosphere, the metals
may corrode. Corrosion is a gradual chemical attack on a metal under the influence of a
moist atmosphere, (or of a natural or artificial solution).
Working temperature: When exposed to very cold atmosphere, even ductile
metals may behave like brittle metals. Water pipes in very cold countries normally burst
and this is the effect of atmospheric exposure.
When the metals are subjected to a very hot atmosphere there is
1. Accelerated oxidation and / or corrosion.
2. Creep.
3. Grain boundary weakening.
4. Allotropic and other phase changes.
5. Change of conventional properties.
6. Reduction in tensile strength and yield point.
3.2.4 Effect of alloying elements
Carbon – With an increase in the amount of carbon, the hardness and tensile strength of
the steel also increase (though the gain slows as the carbon level rises). An increase
in carbon also causes a decrease in both ductility and weldability.
Manganese – will also increase hardness as levels increase, but not to the same degree as
carbon. Ductility and weldability are decreased but, again, to a lesser degree than caused
by carbon.
Phosphorus – Benefits machinability and resistance to atmospheric corrosion. It
increases strength and hardness, much akin to carbon, but it decreases ductility and
impact strength (toughness). Phosphorus is often considered an impurity except in
specific situations.
Sulphur – Like phosphorus, sulphur is generally undesired, except where machinability
is an important goal for the steel. Ductility, impact strength or toughness, weldability, and
surface quality are all adversely affected by sulphur content.
Silicon – Serves as a principal deoxidizer in steel. Its content in the steel is dependent
upon the steel type. Killed steel has the highest percentage of silicon, upwards of 0.60
percent.
Copper – The sole purpose of copper is to increase resistance to atmospheric corrosion.
It does not significantly affect mechanical properties, but it causes brittleness in
the steel at high temperatures, thereby degrading surface quality.
Chromium (Cr) – Increases the steel's hardenability and corrosion resistance, and
provides wear and abrasion resistance in the presence of carbon. It is largely present
in stainless steels, usually ranging from 12 to 20%.
Molybdenum (Mo) – Its use as an alloying element in steel increases hardenability.
Nickel (Ni) – One of the most widely used alloying elements in steel. In amounts of
0.50% to 5.00%, its use in alloy steels increases toughness and tensile strength
without detrimental effect on ductility. Nickel also increases hardenability. In larger
quantities, 8.00% and upwards, nickel is a constituent, together with chromium, of many
corrosion-resistant and stainless austenitic steels.
Titanium (Ti) – Small amounts added to steel contribute to its soundness and give a
finer grain size. Titanium carbide is also used with tungsten carbide in the manufacture
of hard metal tools.
Tungsten (W) – When used as an alloying element it increases the strength of steel at
normal and elevated temperatures. Its "red hardness" makes it suitable for cutting
tools, as it enables the tool edge to be maintained at high temperatures.
Vanadium (V) - Steels containing vanadium have a much finer grain structure
than steels of similar composition without vanadium. It raises the temperature at
which grain coarsening sets in and increases hardenability where it is in solution
in the austenite prior to quenching. It also lessens softening on tempering and
confers secondary hardness on high-speed steels.
In the present study we have touched only on the effect of composition
on the mechanical properties.
Chapter No. 4
TESTING TECHNIQUES
4.1 TENSILE TEST
Strength is defined as the ability of a material to resist applied forces without
yielding or fracturing. By convention, strength usually denotes the resistance of a
material to a tensile load applied axially to a specimen; this is the principle of the
tensile test. Figure 4.1 shows a sophisticated machine suitable for industrial and
research laboratories. This machine is capable of performing compression, shear and
bending tests as well as tensile tests. It applies a carefully controlled tensile load
to a standard specimen and measures the corresponding extension of that specimen.
Figure 4.1 Testing machine
Figures 4.2 and 4.3 show some standard specimens and the direction of the
applied load. These specimens are based upon British Standard BS 18. For the test
results to be consistent for any given material, it is most important that the standard
dimensions and profiles are adhered to. The shoulder radii are particularly critical,
and small variations, or the presence of tooling marks, can cause considerable
differences in the test data obtained. Flat specimens are usually machined only on
their edges, so that the plate or sheet surface finish, and any structural deformation
at the surface caused by the rolling process, are taken into account in the test
results.
The gauge length is the length over which the elongation of the specimen is
measured. The minimum parallel length is the minimum length over which the specimen
must maintain a constant cross-sectional area before the test load is applied. The
lengths Lo, Lc, L1 and the cross-sectional area (a) are all specified in BS 18.
Cylindrical test specimens are proportioned so that the gauge length Lo and the
cross-sectional area a maintain a constant relationship; hence such specimens are
called proportional test pieces. The relationship is given by the expression:
Lo = 5.65 √a
Since a = 0.25 π d², √a = 0.886 d
Thus Lo = 5.65 × 0.886 d = 5.01 d ≈ 5 d
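The proportional relationship can be expressed as a small helper. This is an illustrative sketch assuming the standard proportionality constant 5.65 for Lo = 5.65 √a; the function name and the 5 mm example are hypothetical.

```python
import math

# Gauge length of a proportional cylindrical test piece: Lo = 5.65 * sqrt(a),
# where a = 0.25 * pi * d^2 is the cross-sectional area (dimensions in mm).

def gauge_length(diameter_mm):
    area = 0.25 * math.pi * diameter_mm ** 2
    return 5.65 * math.sqrt(area)

print(gauge_length(5.0))   # about 25 mm for a 5 mm diameter specimen
```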
Therefore a specimen 5 mm in diameter will have a gauge length of 25 mm. The
elongation obtained for a given force depends upon the length and cross-sectional
area of the specimen or component, since:
elongation = (force × L) / (E × a)
where
L = length
a = cross-sectional area
E = elastic modulus
Therefore, if L/a is kept constant (as it is in a proportional test piece), and E
remains constant for a given material, then comparisons can be made between elongation
and applied force for specimens of different sizes.
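The elastic-elongation relationship above can be evaluated numerically. The load, dimensions and modulus below are hypothetical values chosen only to show consistent units.

```python
# Elastic extension of a specimen: extension = F * L / (E * a).
# Using N, mm, mm^2 and MPa keeps the result in mm.

def elastic_extension(force_n, length_mm, area_mm2, e_mpa):
    return force_n * length_mm / (e_mpa * area_mm2)

# A 10 kN load on a 25 mm gauge length, 19.6 mm^2 steel specimen
# (E taken as roughly 200 GPa = 200 000 MPa):
ext = elastic_extension(10_000, 25.0, 19.6, 200_000)
print(ext)   # extension in mm, about 0.064 mm
```

Because extension scales with L/a, two proportional test pieces of different sizes under the same force give directly comparable elongations, as the paragraph above notes.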
Figure 4.2 Tensile test specimen ( round )
Figure 4.3 Tensile test specimen ( flat )
4.1.1 Tensile test results
The load applied to the specimen and the corresponding extension are plotted in
the form of a graph, as shown in Fig. 4.4.
(a) From a to b the extension is proportional to the applied load. Also, if the applied
load is removed the specimen returns to its original length. Under these relatively
lightly loaded conditions the material is showing elastic properties.
(b) From b to c the metal suddenly extends with no increase in load. If the load
is removed at this point the metal will not spring back to its original length,
and it is said to have taken a permanent set. This is the yield point, and the
yield stress (the stress at the yield point) is the load at b divided by the
original cross-sectional area of the specimen. Usually the designer works at
about 50 percent of this figure to allow for a factor of safety.
(c) From c to d extension is no longer proportional to the load and if the load is
removed little or no spring back will occur. Under these relatively greater loads
the material is showing plastic properties.
(d) The point d is referred to as the 'ultimate tensile strength' when referring to
load/extension graphs or the 'ultimate tensile stress' (UTS) when referring to
stress/strain graphs. The ultimate tensile stress is calculated by dividing the load
at d by the original cross-sectional area of the specimen. Although a useful figure
for comparing the relative strengths of materials, it has little practical value since
engineering equipment is not usually operated so near to the breaking point.
(e) From d to e the specimen appears to be stretching under reduced load conditions.
In fact the specimen is thinning out (necking) so that the load per unit area, or
stress, is actually increasing. The specimen finally work hardens to such an extent
that it breaks at e. In practice, values of load and extension are of limited use since
they apply only to one particular size of specimen, and it is more usual to plot the
stress/strain curve. (An example of a stress/strain curve for a low-carbon steel is
shown in fig. 4.6(a).) Stress and strain are calculated as follows:

Stress = load / cross-sectional area
Strain = extension / original length
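These two definitions can be sketched directly; the load, area, extension and gauge length below are assumed, illustrative test figures:

```python
def stress(load_n, area_m2):
    """Stress = load / cross-sectional area (Pa)."""
    return load_n / area_m2

def strain(extension_m, original_length_m):
    """Strain = extension / original length (dimensionless)."""
    return extension_m / original_length_m

# Assumed test figures: 31.4 kN on a 78.5 mm^2 specimen,
# 0.05 mm extension over a 50 mm gauge length.
print(stress(31.4e3, 78.5e-6) / 1e6, "MPa")
print(strain(0.05e-3, 50e-3))
```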
Figure 4.4 load / extension curve for low-carbon steel
4.1.2 Proof stress

Only very ductile materials such as fully annealed mild steel show a clearly defined yield
point. The yield point will not even appear on bright drawn low carbon steel which has become
slightly work hardened during the drawing process. Under such circumstances the proof stress
is used. Proof stress is defined as the stress which produces a specified amount of plastic strain,
such as 0.1 or 0.2 percent. Figure 4.5 shows a typical stress/strain curve for a material of
relatively low ductility, such as hardened and tempered medium carbon steel. If a point such as C
is taken, the corresponding strain is given by D and this consists of a combination of plastic and
elastic components. If the stress is now gradually reduced (by reducing the load on the specimen),
the strain is also reduced and the stress/strain relationship during this reduction in stress is
represented by the line CB. During the reduction in stress the elastic deformation is recovered so
that the line CB is straight and parallel to the initial stages of the loading curve for the material,
that is, the part of the loading curve where the material is showing elastic properties.
In the example shown, the stress at C has produced a plastic strain of 0.2 percent as
represented by AB. Thus the stress at C is referred to as 0.2 percent proof stress, AB being the
plastic deformation and BD being the elastic deformation when the specimen is stressed to the
point C. The material will have fulfilled its specification if, after the proof stress has been applied
for 15 seconds and removed, the permanent set of the specimen is not greater than the specified
percentage of the gauge length which, in this example, is 0.2 percent.
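The offset construction described above can be sketched numerically: the plastic strain at each point is the total strain minus the elastic part (stress/E), and the proof stress is read off where this reaches the specified offset. The curve data and modulus below are assumed, illustrative values only:

```python
def proof_stress(strains, stresses, e_modulus, offset=0.002):
    """0.2 percent proof stress by the offset method: the stress at
    which plastic strain (total strain minus stress/E) equals the
    specified offset, found by linear interpolation."""
    plastic = [s - sig / e_modulus for s, sig in zip(strains, stresses)]
    for i in range(1, len(plastic)):
        if plastic[i] >= offset:
            # interpolate between points i-1 and i
            f = (offset - plastic[i - 1]) / (plastic[i] - plastic[i - 1])
            return stresses[i - 1] + f * (stresses[i] - stresses[i - 1])
    raise ValueError("offset strain not reached")

# Assumed data for a low-ductility material (stresses in MPa)
E = 70e3  # MPa, assumed
eps = [0.0, 0.002, 0.004, 0.006, 0.008, 0.010]
sig = [0.0, 140.0, 270.0, 360.0, 420.0, 460.0]
print(proof_stress(eps, sig, E))
```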
Figure 4.5 Proof stress
4.1.3 The Interpretation of tensile test results
The interpretation of tensile test data requires skill born of experience, since many
factors can affect the test results: for instance the temperature at which the test is carried out,
since the tensile modulus and tensile strength decrease as the temperature rises for most metals
and plastics, whereas the ductility increases as the temperature rises. The test results are also
influenced by the rate at which the specimen is strained.
Figure 4.6(a) shows a typical stress\strain curve for annealed mild steel. From
such a curve the following information can be deduced.
(a) The material is ductile since there is a long plastic range.
(b) The material is fairly rigid since the slope of the initial elastic range is steep.
(c) The limit of proportionality (elastic limit) occurs at about 230 MPa.
(d) The upper yield point occurs at about 260 MPa.
(e) The lower yield point occurs at about 230 MPa.
(f) The ultimate tensile stress (UTS) occurs at about 400 MPa.
Figure 4.6(b) shows a typical stress/strain curve for a gray cast iron. From such a
curve the following information can be deduced.
(a) The material is brittle since there is little plastic deformation before it fractures.
(b) Again the material is fairly rigid since the slope of the initial elastic range is
steep.
(c) It is difficult to determine the point at which the limit of proportionality occurs,
but it is approximately 200 MPa.
(d) The ultimate tensile stress (UTS) is the same as the breaking stress for this
sample. This indicates negligible reduction in cross-section (necking) and
minimal ductility and malleability. It occurs at approximately 250 MPa.
Figure 4.6(c) shows a typical stress/strain curve for a wrought light alloy. From
such a curve the following information can be deduced.
(a) The material has a high level of ductility since it shows a long plastic range. The
material is much less rigid than either low carbon steel or cast iron, since the slope
of the initial elastic range is much less steep when plotted to the same scale. The
limit of proportionality is almost impossible to determine, so proof stress will be
specified instead. For this sample the 0.2 percent proof stress is approximately 500
MPa (AB). The tensile test can also yield other important facts about a material
under test.
( a ) Stress/strain curve for annealed mild steel
( b) Stress/strain curve for gray cast iron
( c ) Stress/strain curve for light alloy
Figure 4.6 Typical stress/strain curves
4.1.4 The effect of grain size and structure on tensile testing:
The test piece should be chosen so that it reflects as closely as possible the
component and the material from which the component is produced. This is relatively
easy for components produced from bar stock, but not so easy for components produced
from forgings, as the grain flow will be influenced by the contour of the component and
will not be uniform. Castings also present problems since the properties of a specially
cast test piece are unlikely to reflect those of the actual casting. This is due to the
difference in size and the corresponding difference in cooling rates.
The lay of the grain in rolled bar and plate can greatly affect the tensile strength
and other properties of a specimen taken from them. Figure 4.7 shows the relative grain
orientation for transverse and longitudinal test pieces. The tensile strength for the
longitudinal test piece is substantially greater than that of the transverse test piece, a
factor which the designer of large fabrications must take into account.
Figure 4.8 shows the effect of processing upon the properties of a material. A low
carbon steel of high ductility, in the annealed condition, shows the classical stress/strain
curve with a pronounced yield point and a long plastic deformation range. The same
material, after finishing by cold drawing, no longer shows a yield point and the plastic
range is considerably reduced.
Figure 4.7 Effect of grain orientation on material testing
( i ) Annealed low-carbon steel ( ii ) Cold-drawn low-carbon steel
Figure 4.8 Effect of processing on the properties of low-carbon steel
Figure 4.9 shows the effect of heat treatment upon the properties of a medium carbon
steel. In this example the results have been obtained by quench hardening a batch of
identical specimens and then tempering them at different temperatures.
Figure 4.10 shows the effect of heat treatment upon the properties of a work
hardened metallic material. Stress relief (recovery) has very little effect upon the tensile
strength and elongation (ductility) until the recrystallization (annealing) temperature is
reached. The metal initially shows the high tensile strength and lack of ductility
associated with a severely distorted grain structure. After stress relief the tensile strength
rises and the ductility falls until the recrystallization range temperature is reached. During
the recrystallization range there is a marked change in properties. The tensile strength is
rapidly reduced and the ductility, in terms of elongation percentage, rapidly increases.
Figure 4.9 Effect of tempering on tensile test
Figure 4.10 Effect of temperature on cold-worked material

4.2 IMPACT TESTING
The tensile test does not tell the whole story. Figure 4.12 shows how a piece of
high carbon steel rod will bend when in the annealed condition yet snap easily in the
quench hardened condition despite the fact that in the latter condition it will show a much
higher value of tensile strength. Impact tests consist of striking a suitable specimen with a
controlled blow and measuring the energy absorbed in bending or breaking the specimen.
The energy value indicates the toughness of the material under test. Figure 4.11 shows a
typical impact-testing machine. This machine has a hammer which is suspended like a
pendulum, a vice for holding the specimen in the correct position relative to the hammer,
and a dial for indicating the energy absorbed in carrying out the test in joules (J). If there
is maximum over-swing, as there would be if no specimen was placed in the vice, then
zero energy absorption is indicated. If the hammer is stopped by the specimen with no
over-swing, then maximum energy absorption is indicated. Intermediate readings are the
impact values (J) of the materials being tested (their toughness or lack of brittleness).
There are two standard tests currently in use.
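The energy reading described above comes from the pendulum's loss of height between release and over-swing. A minimal sketch, with assumed hammer mass and heights (illustrative figures, not the standard machine values):

```python
G = 9.81  # m/s^2, standard gravity

def absorbed_energy(mass_kg, drop_height_m, rise_height_m):
    """Energy absorbed by the specimen (J): the pendulum's loss of
    potential energy between release height and over-swing height."""
    return mass_kg * G * (drop_height_m - rise_height_m)

# Assumed figures: a 20 kg hammer released from 1.0 m that swings
# through to 0.45 m after breaking the specimen.
print(round(absorbed_energy(20.0, 1.0, 0.45), 2), "J")
```

A rise height equal to the drop height (maximum over-swing, no specimen) gives zero absorbed energy, matching the dial behaviour described above.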
Figure 4.11 Typical impact testing machine
Figure 4.12 Impact loading. (a) A piece of high-carbon steel rod (1.0%) in the annealed (soft) condition will bend when struck with a hammer; UTS 925 MPa. (b) The same piece of high-carbon steel rod, as in (a), after hardening and lightly tempering will fracture when hit with a hammer, despite its UTS having increased to 1285 MPa.
4.2.1 The Izod Test
In this test a 10 mm square notched specimen is used. The striker of the
pendulum hits the specimen with a kinetic energy of 162.72 J at a velocity of 3.8 m/s.
Figure 4.13 shows details of the specimen and the manner in which it is supported.
Figure 4.13 Izod test: detail of notch; section of test piece; position of the striker
4.2.2 The Charpy Test
In the Izod test the specimen is supported as a cantilever, but in the Charpy test it
is supported as a beam. It is struck with a kinetic energy of 298.3 J at a velocity of 5 m/s.
Figure 4.14 shows details of the Charpy test specimen and the manner in which it is
supported.
Since both tests use a notched specimen, useful information can be obtained
regarding the resistance of the material to the spread of a crack which may originate from
a point of stress concentration such as sharp corners, undercuts, sudden changes in
section, and machining marks in stressed components. Such points of stress concentration
should be eliminated during design and manufacture.
Figure 4.14 Charpy test
4.2.3 The interpretation of impact tests
The results of an impact test should specify the energy used to bend or break the
specimen and the particular test used, i.e. Izod or Charpy. In the case of the Charpy test it
is also necessary to specify the type of notch used, as this test allows for three types of
notch, as shown in fig. 4.15. A visual examination of the fractured surface after the test
also provides useful information.
(a) Brittle metals: a clean break with little deformation and little reduction in cross-
sectional area at the point of fracture. The fractured surface will show a granular
structure.
(b) Ductile metals: the fracture will be rough and fibrous. In very ductile materials
the fracture will not be complete, the specimen bending over and only showing
slight tearing from the notch. There will also be some reduction in cross-sectional
area at the point of fracture or bending.
Figure 4.15 Standard charpy notches
The temperature of the specimen at the time of making the test also has an
important influence on the test results. Figure 4.16 shows the embrittlement of low
carbon steels at refrigerated temperatures and hence their unsuitability for use in
refrigeration plant and space vehicles.
4.2.4 The effect of processing on toughness
Impact tests are frequently used to determine the effectiveness of annealing
temperatures on the grain structure and impact strength of cold worked ductile metals. In
the case of cold –worked low carbon steel, the impact strength is quite low, initially, as
the heavily deformed grain structure will be relatively brittle and lacking in ductility,
particularly if the limit of cold working has been approached. Annealing at low
temperatures has little effect as it only promotes recovery of the crystal lattice on the
atomic scale and does not result in recrystallization. In fact during recovery there may even
be a slight reduction in the impact strength.
However, at about 550 °C to 650 °C recrystallization of low carbon steels occurs
with only slight grain growth. Annealing in this temperature range results in the impact
strength increasing dramatically, as shown in fig. 4.17, and the appearance of the fracture
changes from that of a brittle material to that of a ductile material. Annealing at higher
temperatures or prolonged soaking at the lower annealing temperature results in grain
growth and a corresponding fall in impact strength.
Figure 4.16 Effect of temperature on toughness
The effect of tempering on the impact value of a quench hardened high carbon
steel is shown in fig. 4.18. Initially, only stress relief occurs but, as the tempering
temperature increases, the toughness also increases, which is why cutting tools are
tempered. Tempering modifies the extremely hard and brittle martensitic structure of
quench hardened plain carbon steels and causes a considerable increase in toughness with
very little loss of hardness.
Figure 4.17 Effect of Annealing on the toughness of low-carbon steel
Figure 4.18 Effect of tempering on quench-hardened high carbon steel
4.3 HARDNESS TESTING
Hardness is defined as the resistance of a material to indentation or abrasion by
another hard body. It is by indentation that most hardness tests are performed. A hard
indenter is pressed into the specimen by a standard load, and the magnitude of the
indentation (either area or depth) is taken as a measure of hardness.
4.3.1 The Brinell hardness test
In this test, hardness is measured by pressing a hard steel ball into the surface of
the test piece, using a known load. It is important to choose the combination of load and
ball size carefully so that the indentation is free from distortion and suitable for
measurement. The relationship between load P(kg) and the diameter D(mm) of the
hardened ball indenter is given by the expression:
P/D² = K
Where K is a constant. Typical values of K are:
Ferrous metals K = 30
Copper and copper alloys K = 10
Aluminium and aluminium alloys K = 5
Lead, tin, and white bearing metals K = 1
Thus, for steel, a load of 3000 kg is required if a 10mm diameter ball indenter is
used.
Figure 4.20 shows the principle of the Brinell hardness test. The diameter of the
indentation d is measured in two directions at right-angles and the average taken. The
hardness number HB is the load divided by the spherical area of the indentation which can
be calculated knowing the values of d and D. In practice, conversion tables are used to
translate the value of diameter d directly into hardness numbers HB.
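The load selection rule and the hardness number can both be sketched. The spherical-cap area below gives the standard Brinell expression HB = 2P / (πD(D − √(D² − d²))); the 4.0 mm indentation diameter is an assumed example reading:

```python
import math

def brinell_load(k, ball_diameter_mm):
    """Test load P (kg) chosen from the rule P / D^2 = K."""
    return k * ball_diameter_mm ** 2

def brinell_hardness(load_kg, d_ball_mm, d_indent_mm):
    """HB = load / spherical (curved) area of the indentation."""
    D, d = d_ball_mm, d_indent_mm
    area = math.pi * D * (D - math.sqrt(D * D - d * d)) / 2.0
    return load_kg / area

print(brinell_load(30, 10))                      # 3000 kg for steel
print(round(brinell_hardness(3000, 10, 4.0), 1))
```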
Figure 4.19 Brinell hardness tester
Figure 4.20 Brinell hardness principle
To ensure consistent results the following precautions should be observed:
(a) the thickness of the specimen should be at least seven times the depth of the
indentation to allow unrestricted plastic flow below the indenter;
(b) the edge of the indentation should be at least three times the diameter of the
indentation from the edge of the test piece;
(c) the test is unsuitable for materials whose hardness exceeds 500 HB, as the ball
indenter tends to flatten.
4.3.1.1 Machinability

With high-speed steel cutting tools, the hardness of the stock being cut is important:
material softer than about HB = 100 will tend to tear and leave a poor surface finish.
4.3.1.2 Relationship Between Hardness And Tensile strength
There is a definite relationship between strength and hardness, and the ultimate
tensile stress (UTS) of a component can be approximated as follows:
UTS (MPa) = HB * 3.54 (for annealed plain-carbon steels);
= HB * 3.25 (for quench-hardened and tempered plain-carbon steels);
= HB * 5.6 (for ductile brass alloys);
= HB * 4.2 (for wrought aluminium alloys).
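These conversion factors can be applied directly. A minimal sketch (the HB = 200 reading is an assumed example; the factors are the ones quoted above):

```python
# Approximate UTS-from-hardness factors quoted above
UTS_FACTOR = {
    "annealed plain-carbon steel": 3.54,
    "quench-hardened and tempered plain-carbon steel": 3.25,
    "ductile brass alloy": 5.6,
    "wrought aluminium alloy": 4.2,
}

def estimate_uts(hb, material):
    """Approximate UTS (MPa) = HB * factor for the material class."""
    return hb * UTS_FACTOR[material]

print(estimate_uts(200, "annealed plain-carbon steel"))
```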
Figure 4.21 Work-hardening capacity
4.3.1.3 Work-hardening capacity
Materials which will cold work without work hardening unduly will pile up round
the indenter, as shown in fig. 4.21(a). Materials which work-harden readily will sink
around the indenter, as shown in fig. 4.21(b).
4.3.2 The Vickers hardness test
This test is preferable to the Brinell test where hard materials are concerned, as it
uses a diamond indenter. (Diamond is the hardest material known, approximately 6000
HB.) The diamond indenter is in the form of a square-based pyramid with an angle of
136° between opposite faces. Since only one type of indenter is used, the load has to be
varied for different hardness ranges. Standard loads are 5, 10, 20, 30, 50 and 100 kg. It is
necessary to state the load when specifying a Vickers hardness number. For example, if
the hardness number is found to be 200 when using a 50 kg load, then the hardness
number is written HD(50) = 200. Figure 4.22(a) shows a universal hardness testing
machine suitable for performing both Brinell and Vickers hardness tests, whilst fig.
4.22(b) shows the measuring screen for determining the distance across the corners of the
indentation. The screen can be rotated so that two readings at right-angles can be taken
and the average is used to determine the hardness number (HD). This is calculated by
dividing the load by the projected area of the indentation:
HD = P/D²,
Where D = the average diagonal (mm), P = load (kg).
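For comparison, the standard Vickers definition divides the load by the sloping surface area of the pyramid indentation (rather than the projected area), which gives the familiar constant of about 1.854. A sketch with an assumed reading:

```python
import math

def vickers_hardness(load_kg, diagonal_mm):
    """Standard Vickers number: load divided by the sloping surface
    area of the square-pyramid indentation, which works out to
    HV = 2 * P * sin(136 deg / 2) / d^2  (about 1.854 * P / d^2)."""
    return 2 * load_kg * math.sin(math.radians(136 / 2)) / diagonal_mm ** 2

# Assumed reading: 50 kg load, 0.68 mm average diagonal
print(round(vickers_hardness(50, 0.68), 1))
```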
Figure 4.22 Micro-Vickers and Vickers hardness testers
4.3.3 The Rockwell hardness test
Although not as reliable as the Brinell and Vickers hardness tests for laboratory
purposes, the Rockwell test is widely used in industry, as it is quick, simple, and direct
reading. Figure 4.23 shows a typical hardness indicating scale. Universal electronic
hardness testing machines are now widely used which, at the turn of a switch, can
provide either Brinell, Vickers, or Rockwell tests and which show the hardness number as
a digital readout automatically. They also give a ‘hard copy’ printout of the test result
together with the test conditions and date. However, the mechanical testing machines
described in this chapter are still widely used and will be for some time to come.
In principle the Rockwell hardness test compares the difference in depth of
penetration of the indenter when using forces of two different values, that is, a minor
force is first applied (to take up the backlash and pierce the skin of the component) and
the scales are set to read zero. Then a major force is applied over and above the minor
force and the increased depth of penetration is shown on the scales of the machine as a
direct reading of hardness without the need for calculation or conversion tables. The
indenters most commonly used are a 1.6 mm diameter hard steel ball and a diamond cone
with an apex angle of 120°. The minor force in each instance is 98 N. Table 4.1 gives the
combinations of type of indenter and additional (major) force for the range of Rockwell
scales, together with typical applications. The B and C scales are the most widely used in
engineering.
The standard Rockwell test cannot be used for very thin sheet and foils, and for
these the Rockwell Superficial Hardness Test is used. The minor force is reduced from 98
N to 29.4 N and the major force is also reduced. Typical values are listed in Table 4.2.
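The depth-difference principle can be sketched as follows. The 0.002 mm-per-point increment and the 100/130 scale constants are the standard Rockwell definitions; the depth reading itself is an assumed example:

```python
def rockwell_number(depth_increase_mm, scale_constant):
    """Rockwell reading from the extra depth produced by the major
    force: each Rockwell point corresponds to 0.002 mm of
    penetration.  scale_constant is 100 for diamond-cone scales
    (e.g. C) and 130 for ball scales (e.g. B)."""
    return scale_constant - depth_increase_mm / 0.002

# Assumed reading: the major force drives the cone 0.08 mm deeper
print(rockwell_number(0.08, 100))   # HRC 60 - a hardened steel
```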
Figure 4.23 Rockwell hardness tester
Scale  Indenter          Additional force (kN)  Applications
A      Diamond cone      0.59   Steel sheet; shallow case-hardened components
B      Ball, ∅ 1.588 mm  0.98   Copper alloys; aluminium alloys; annealed low-carbon steels
C      Diamond cone      1.47   Most widely used range: hardened steels; cast irons; deep case-hardened components
D      Diamond cone      0.98   Thin but hard steel; medium-depth case-hardened components
E      Ball, ∅ 3.175 mm  0.98   Cast iron; aluminium alloys; magnesium alloys; bearing metals
F      Ball, ∅ 1.588 mm  0.59   Annealed copper alloys; thin soft sheet metals
G      Ball, ∅ 1.588 mm  1.47   Malleable irons; phosphor bronze; gun-metal; cupro-nickel alloys, etc.
H      Ball, ∅ 3.175 mm  0.59   Soft materials; high-ferritic aluminium; lead; zinc
K      Ball, ∅ 3.175 mm  1.47   Aluminium and magnesium alloys

Table 4.1 Rockwell hardness test conditions
Scale  Indenter          Additional force (kN)
15-N   Diamond cone      0.14
30-N   Diamond cone      0.29
45-N   Diamond cone      0.44
15-T   Ball, ∅ 1.588 mm  0.14
30-T   Ball, ∅ 1.588 mm  0.29
45-T   Ball, ∅ 1.588 mm  0.44

Table 4.2 Rockwell superficial hardness test conditions
4.3.4 Shore scleroscope

In the tests previously described, the test piece must be small enough to mount in
the testing machine, and hardness is measured as a function of indentation. However the
scleroscope works on a different principle: hardness is measured as a function of
resilience. Further, since the scleroscope can be carried to the work piece, it is useful for
testing large surfaces such as the slideways on machine tools. A diamond-tipped hammer
of mass 2.5g drops through a height of 250mm. The height of the first rebound indicates
the hardness on a 140-division scale.
4.3.4.1 The effect of processing on hardness
All metals work-harden to some extent when cold-worked. Figure 4.24 shows the
relationship between the Vickers hardness number (HD) and the percentage reduction in
thickness for rolled strip. The metals become harder and more brittle as the amount of
cold-working increases until a point is reached where the metal is so hard and brittle that
cold-working cannot be continued. Aluminium reaches this state when a 60 percent
reduction in strip thickness is achieved in one pass through the rolls of a rolling mill. In
this condition the material is said to be fully work-hardened. The degree of work-
hardening or 'temper' of strip and sheet material is arbitrarily stated as soft (fully
annealed), ¼ hard, ½ hard, ¾ hard, and hard (fully work-hardened).
The effect of heating a work-hardened material such as α brass is shown in Fig.
4.25. Once again very little effect occurs until the temperature of recrystallisation is
reached. At this temperature there is a rapid fall off in hardness, after which the decline in
hardness becomes more gradual as grain growth occurs and the metal becomes fully
annealed.
The effect of heating a quench-hardened plain-carbon steel is more gradual as
shown in Fig. 4.26. During the tempering range of the steel no grain growth occurs, but
there are structural changes. Initially, there is a change in the very hard martensite as
particles of carbide precipitate out. As tempering proceeds and the temperature is
increased, the structure loses its acicular martensitic appearance and spheroidal carbide
particles in a matrix of ferrite can be seen under high magnification. These structural
changes increase the toughness of the metal considerably, but with some loss of hardness.
Figure 4.24 Effect of cold-working on the hardness of various metals
Chapter No. 5
USING NEURAL NETWORK TOOL BOX
5.1 INTRODUCTION
Neural networks involve a very large amount of calculation, in the form of
manipulating data for training and verification, which becomes much easier with the help of a
computer. The manipulation can be carried out in any programming language, e.g.
C++, Fortran or Matlab. In this work Matlab is used for this purpose because Matlab is
a very powerful tool for mathematical calculation, visualization and programming. In
addition to the pure mathematical part of Matlab there are several toolboxes available to
expand the capabilities of Matlab; the Neural Network Toolbox (NN Toolbox) is one of
these toolboxes.
This chapter is intended to help students unacquainted with Matlab and the neural network
toolbox to get practice in using these tools. The contents of this chapter are focused more
on practical examples and problems than on theory. The reason for this is that most
theory is covered in the help files of Matlab itself. In fact, one will not be able to learn
Matlab and the NN Toolbox with this tutorial alone; one will have to actively explore the
documentation and demos available in Matlab. What Matlab lacks is a set of
examples and problems that helps the user learn to use the tools.
5.2 THE STRUCTURE OF THE NEURAL NETWORK TOOLBOX

The toolbox is based on the network object. This object contains information
about everything that concerns the neural network. Type network at the Matlab command
prompt, and an empty network will be created and its parameters will be shown.

>> network
ans =
Neural Network object:
architecture:
numInputs: 0
numLayers: 0
biasConnect: []
inputConnect: []
layerConnect: []
outputConnect: []
targetConnect: []
numOutputs: 0 (read-only)
numTargets: 0 (read-only)
numInputDelays: 0 (read-only)
numLayerDelays: 0 (read-only)
First the architecture parameters are shown. Because the network command creates an
empty network, all parameters are set to 0. The subobject structures follow:
subobject structures:
inputs: {0x1 cell} of inputs
layers: {0x1 cell} of layers
outputs: {1x0 cell} containing no outputs
targets: {1x0 cell} containing no targets
biases: {0x1 cell} containing no biases
inputWeights: {0x0 cell} containing no input weights
layerWeights: {0x0 cell} containing no layer weights
This paragraph lists the subobject structures, that is, the various input and output
matrices, biases and input weights.
functions:
adaptFcn: (none)
initFcn: (none)
performFcn: (none)
trainFcn: (none)
The next paragraph is interesting: it contains the training, initialization and performance
functions. The trainFcn and adaptFcn are essentially the same, but trainFcn will be
used in this tutorial. By setting the trainFcn parameter you tell Matlab which training
algorithm it should use. The NN Toolbox includes almost 20 training functions. The
performance function is the function that determines how well the ANN is doing its task.
The initFcn is the function that initializes the weights and biases of the network. To get
a list of the functions that are available, type help nnet. To change one of these functions
to another one in the toolbox, or one that you have created, just assign the name of the
function to the parameter, e.g.

>> net.trainFcn = 'mytrainingfun';

The parameters that concern these functions are listed in the next paragraph.

parameters:
adaptParam: (none)
initParam: (none)
performParam: (none)
trainParam: (none)
By changing these parameters you can change the default behavior of the functions
mentioned above. The parameters you will use the most are probably the components of
trainParam. The most used of these are net.trainParam.epochs, which tells the
algorithm the maximum number of epochs to train, and net.trainParam.show that tells
the algorithm how many epochs there should be between each presentation of the
performance. Type help train for more information. The weights and biases are also
stored in the network structure:

weight and bias values:
IW: {0x0 cell} containing no input weight matrices
LW: {0x0 cell} containing no layer weight matrices
b: {0x1 cell} containing no bias vectors
other:
userdata: (user stuff)
The .IW component is a cell array that holds the weights between the input layer and the
first hidden layer. The .LW component holds the weights between the hidden layers and
the output layer.
5.3 CONSTRUCTING LAYERS

It is assumed that you have an empty network object named `net' in your
workspace; if not, type

>> net = network;

to get one.
Let's start with defining the properties of the input layer. The NN Toolbox supports
networks that have multiple input layers. Let's set this number to 1:

>> net.numInputs = 1;

Now we should define the number of neurons in the input layer. This should of
course be equal to the dimensionality of your data set. The appropriate property to set is
net.inputs{i}.size, where i is the index of the input layer. So to make a network
which has 2-dimensional points as inputs, type:

>> net.inputs{1}.size = 2;
This defines (for now) the input layer.
The next properties to set are net.numLayers, which not surprisingly sets the
total number of layers in the network, and net.layers{i}.size, which sets the number
of neurons in the ith layer. To build our example network, we define 2 extra layers (a
hidden layer with 3 neurons and an output layer with 1 neuron), using:

>> net.numLayers = 2;
>> net.layers{1}.size = 3;
>> net.layers{2}.size = 1;

For details refer to Appendix B.
5.3.2 Connecting Layers
Now it's time to define which layers are connected. First, define to which layer
the inputs are connected by setting net.inputConnect(i) to 1 for the appropriate layer i
(usually the first, so i = 1).
The connections between the rest of the layers are defined in a connectivity matrix
called net.layerConnect, which can have either 0 or 1 as element entries. If element
(i,j) is 1, then the outputs of layer j are connected to the inputs of layer i.
We also have to define which layer is the output layer by setting
net.outputConnect(i) to 1 for the appropriate layer i.
Finally, if we have a supervised training set, we also have to define which layers
are connected to the target values. (Usually, this will be the output layer.) This is done by
setting net.targetConnect(i) to 1 for the appropriate layer i. So, for our example, the
appropriate commands would be

>> net.inputConnect(1) = 1;
>> net.layerConnect(2, 1) = 1;
>> net.outputConnect(2) = 1;
>> net.targetConnect(2) = 1;

5.4 SETTING TRANSFER FUNCTIONS
Each layer has its own transfer function, which is set through the
net.layers{i}.transferFcn property. So to make the first layer use sigmoid transfer
functions and the second layer linear transfer functions, use

>> net.layers{1}.transferFcn = 'logsig';
>> net.layers{2}.transferFcn = 'purelin';

For details refer to Appendix B.
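To make concrete what the configured network computes, here is a plain-Python (rather than Matlab) sketch of one forward pass through the same 2-3-1 architecture: three logsig hidden neurons feeding one purelin (identity) output neuron. The weights and biases are arbitrary illustrative values, not ones produced by the toolbox:

```python
import math

def logsig(x):
    """Log-sigmoid transfer function, as used in the hidden layer."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, iw, b1, lw, b2):
    """One forward pass through the 2-3-1 network built above:
    3 logsig hidden neurons, 1 purelin (identity) output neuron."""
    hidden = [logsig(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(iw, b1)]
    return sum(w * h for w, h in zip(lw, hidden)) + b2

# Assumed (illustrative) weights and biases
iw = [[0.5, -0.3], [1.0, 0.2], [-0.7, 0.8]]   # input -> hidden
b1 = [0.1, -0.2, 0.0]
lw = [0.6, -1.1, 0.9]                          # hidden -> output
b2 = 0.05
print(forward([0.3, 1.2], iw, b1, lw, b2))
```

In the toolbox these weights live in net.IW, net.LW and net.b and are filled in by the initialization routines described below.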
5.5 WEIGHTS AND BIASES

Now, define which layers have biases by setting the elements of
net.biasConnect to either 0 or 1, where net.biasConnect(i) = 1 means layer i has
biases attached to it.
To attach biases to each layer in our example network, we'd use

>> net.biasConnect = [ 1 ; 1];

Now you should decide on an initialization procedure for the weights and biases.
When done correctly, you should be able to simply issue

>> net = init(net);

to reset all weights and biases according to your choices.
The first thing to do is to set net.initFcn. Unless you have built your own
initialization routine, the value 'initlay' is the way to go. This lets each layer of weights
and biases use its own initialization routine.

>> net.initFcn = 'initlay';
Exactly which function this is should of course be specified as well. This is done
through the property net.layers{i}.initFcn for each layer. The two most practical
options here are Nguyen-Widrow initialization ('initnw', type help initnw for details),
or 'initwb', which lets you choose the initialization for each set of weights and biases
separately.
When using 'initnw' you only have to set

>> net.layers{i}.initFcn = 'initnw';

for each layer i.
When using 'initwb', you have to specify the initialization routine for each set of
weights and biases separately. The most common option here is 'rands', which sets all
weights or biases to a random number between -1 and 1. First, use

>> net.layers{i}.initFcn = 'initwb';

for each layer i. Next, define the initialization for the input weights,

>> net.inputWeights{1,1}.initFcn = 'rands';

and for each set of biases

>> net.biases{i}.initFcn = 'rands';

and weight matrices

>> net.layerWeights{i,j}.initFcn = 'rands';

where net.layerWeights{i,j} denotes the weights from layer j to layer i.
5.6 TRAINING FUNCTIONS & PARAMETERS The difference between train and adapt
One of the more counterintuitive aspects of the NNT is the distinction between
train and adapt. Both functions are used for training a neural network, and most of the
time both can be used for the same network.
What then is the difference between the two? The most important one has to do with
incremental training (updating the weights after the presentation of each single training
sample) versus batch training (updating the weights after presenting the complete
data set).
When using adapt, both incremental and batch training can be used. Which one is
actually used depends on the format of your training set. If it consists of two matrices of
input and target vectors, like
>> P = [ 0.3 0.2 0.54 0.6 ; 1.2 2.0 1.4 1.5]
P =
0.3000 0.2000 0.5400 0.6000
1.2000 2.0000 1.4000 1.5000
>> T = [ 0 1 1 0 ]
T =
0 1 1 0
The network will be updated using batch training. (In this case, we have 4
samples of 2 dimensional input vectors, and 4 corresponding 1D target vectors).
If the training set is given in the form of a cell array,
>> P = {[0.3 ; 1.2] [0.2 ; 2.0] [0.54 ; 1.4] [0.6 ; 1.5]}
P =
[2x1 double] [2x1 double] [2x1 double] [2x1 double]
>> T = { [0] [1] [1] [0] }
T =
[0] [1] [1] [0]
Then incremental training will be used.
When using train, on the other hand, only batch training will be used, regardless of
the format of the data (you can use either format).
The big plus of train is that it gives you a lot more choice in training functions (gradient
descent, gradient descent w/ momentum, Levenberg-Marquardt, etc.), which are
implemented very efficiently. So unless you have a good reason for doing
incremental training, train is probably your best choice. (And it usually saves you
from setting some parameters.)
Another difference between adapt and train lies in the terminology of
passes and epochs. When using adapt, the property that determines how many
times the complete training data set is used for training the network is called
net.adaptParam.passes. But, when using train, the exact same property is now called
net.trainParam.epochs.
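The distinction can be made concrete with a sketch, written in Python rather than Matlab and for a single-weight linear neuron rather than a real NNT network: one adapt-style pass updates the weight after every single sample, while one train-style epoch updates it once after the complete set.

```python
def incremental_pass(w, samples, lr=0.1):
    """One adapt-style 'pass': update w immediately after each sample."""
    for x, t in samples:
        e = t - w * x          # error of a one-weight linear neuron
        w = w + lr * e * x     # incremental update
    return w

def batch_epoch(w, samples, lr=0.1):
    """One train-style 'epoch': accumulate over the whole data set,
    then update once."""
    grad = sum((t - w * x) * x for x, t in samples)
    return w + lr * grad
```

Both see the training set exactly once, but the incremental version moves the weight between samples, so the two generally end the pass at different values.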
5.6.1 Performance Functions
The two most common options here are the Mean Absolute Error (mae) and the
Mean Squared Error (mse). The mae is usually used in networks for classification, while
the mse is most commonly seen in function approximation networks.
The performance function is set with the net.performFcn property, for instance:
>> net.performFcn = 'mse';
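For reference, the two performance functions compute the following (a Python sketch of the formulas, not the NNT implementations):

```python
def mae(targets, outputs):
    """Mean absolute error, as the 'mae' performance function computes it."""
    return sum(abs(t - y) for t, y in zip(targets, outputs)) / len(targets)

def mse(targets, outputs):
    """Mean squared error, as the 'mse' performance function computes it."""
    return sum((t - y) ** 2 for t, y in zip(targets, outputs)) / len(targets)
```

Because mse squares each error, it penalizes large deviations much more heavily than mae, which is one reason it is the usual choice for function approximation.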
5.6.2 Train Parameters
If you are going to train your network using train, the last step is defining
net.trainFcn, and setting the appropriate parameters in net.trainParam. Which
parameters are present depends on your choice for the training function.
So if you, for example, want to train your network using a Gradient Descent w/
Momentum algorithm, you'd set
>> net.trainFcn = 'traingdm';
and then set the parameters
>> net.trainParam.lr = 0.1;
>> net.trainParam.mc = 0.9;
to the desired values. (In this case, lr is the learning rate, and mc the momentum
term.)
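The momentum update itself can be sketched as follows. This is a common textbook formulation written in Python; traingdm's exact scaling of the two terms may differ slightly.

```python
def momentum_step(w, grad, prev_step, lr=0.1, mc=0.9):
    """One gradient-descent-with-momentum update: the new step blends the
    previous step (scaled by mc) with the current gradient (scaled by lr)."""
    step = mc * prev_step - lr * grad
    return w + step, step

# Two updates with the same gradient: the second step is larger,
# because momentum keeps part of the previous step's velocity.
w, s = momentum_step(1.0, 0.5, 0.0)
w, s = momentum_step(w, 0.5, s)
```

When successive gradients point the same way, momentum accelerates progress; when they oscillate, it damps the oscillation.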
Two other useful parameters are net.trainParam.epochs, which is the
maximum number of times the complete data set may be used for training, and
net.trainParam.show, which is the time between status reports of the training function.
For example,
>> net.trainParam.epochs = 1000;
>> net.trainParam.show = 100;
5.6.3 Adapt Parameters
The same general scheme is also used in setting adapt parameters. First, set
net.adaptFcn to the desired adaptation function. We'll use adaptwb (from 'adapt
weights and biases'), which allows for a separate update algorithm for each layer. Again,
check the Matlab documentation for a complete overview of possible update algorithms.
>> net.adaptFcn = 'adaptwb';
Next, since we're using adaptwb, we'll have to set the learning function for all
weights and biases:
>> net.inputWeights{1,1}.learnFcn = 'learnp';
>> net.biases{1}.learnFcn = 'learnp';
where in this example we've used learnp, the Perceptron learning rule. (Type 'help
learnp' for details.)
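The rule learnp applies is simply dW = e*p and db = e, where e = t - a is the output error. A Python sketch, with hardlim behaving as in MATLAB and a plain loop standing in for adapt, trains a perceptron on the AND function:

```python
def hardlim(n):
    """MATLAB's hardlim: output 1 when the net input is >= 0, else 0."""
    return 1 if n >= 0 else 0

def learnp_step(w, b, p, t):
    """One application of the perceptron rule 'learnp':
    e = t - a, then dW = e * p and db = e."""
    a = hardlim(sum(wi * pi for wi, pi in zip(w, p)) + b)
    e = t - a
    return [wi + e * pi for wi, pi in zip(w, p)], b + e

# Train a perceptron on AND with repeated passes over the data.
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = [0.0, 0.0], 0.0
for _ in range(10):          # ten passes are plenty for AND
    for p, t in data:
        w, b = learnp_step(w, b, p, t)
```

Because AND is linearly separable, the perceptron convergence theorem guarantees this loop settles on weights that classify all four inputs correctly.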
Finally, a useful parameter is net.adaptParam.passes, which is the maximum
number of times the complete training set may be used for updating the network:
>> net.adaptParam.passes = 10;
5.7 BASIC NEURAL NETWORK EXAMPLE
The task is to create and train a neural network that solves the XOR problem.
XOR is a function that returns 1 when the two inputs are not equal; see Table 5.1.
Table 5.1: The XOR-problem
A B A XOR B
1 1 0
1 0 1
0 1 1
0 0 0
To solve this we will need a feedforward neural network with two input neurons,
and one output neuron. Because the problem is not linearly separable, it will also need
a hidden layer with two neurons.
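That XOR is not linearly separable can be checked by brute force: no single hardlim neuron reproduces its truth table, while one easily reproduces AND. A Python sketch (the small integer weight range is an assumption that happens to suffice for these two tables):

```python
from itertools import product

def realizable(truth_table):
    """Brute-force: is there a single hardlim neuron (w1, w2, b) that
    reproduces the truth table? Integer weights in [-3, 3] suffice here."""
    for w1, w2, b in product(range(-3, 4), repeat=3):
        if all((w1 * x1 + w2 * x2 + b >= 0) == bool(t)
               for (x1, x2), t in truth_table):
            return True
    return False

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
# realizable(AND) is True; realizable(XOR) is False, hence the hidden layer.
```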
Now we know what our network should look like, but how do we create it?
To create a new feed forward neural network use the command newff. You have
to enter the max and min of the input values, the number of neurons in each layer and
optionally the activation functions.
>> net = newff([0 1; 0 1],[2 1],{'logsig','logsig'})
The variable net will now contain an untrained feedforward neural network with
two neurons in the input layer, two neurons in the hidden layer and one output neuron,
exactly as we want it. The [0 1; 0 1] tells Matlab that the input values range between
0 and 1. The {'logsig','logsig'} tells Matlab that we want to use the logsig function
as activation function in all layers. The first parameter tells the network how many nodes
there should be in the input layer, hence you do not have to specify this in the second
parameter. You have to specify at least as many transfer functions as there are layers, not
counting the input layer. If you do not specify any transfer function Matlab will use the
default settings.
Figure 5.1: The logsig activation function
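The logsig curve in the figure is just 1/(1 + e^(-n)); a one-line Python sketch:

```python
import math

def logsig(n):
    """The logsig transfer function: an S-shaped curve that squashes
    any input into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-n))
```

Because the output is always strictly between 0 and 1, logsig is a natural choice when the targets are 0/1 values, as in the XOR problem.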
Now we want to test how good our untrained network is on the XOR problem.
First we construct a matrix of the inputs. The input to the network is always in the
columns of the matrix. To create a matrix with the inputs "1 1", "1 0", "0 1" and "0 0" we
enter:
>> input = [1 1 0 0; 1 0 1 0]
input =
1 1 0 0
1 0 1 0
Now we have constructed inputs to our network. Let us push these into the
network to see what it produces as output. The command sim is used to simulate the
network and calculate the outputs, for more information on how to use the command type
helpwin sim. The simplest way to use it is to enter the name of the neural network and
the input matrix; it returns an output matrix.
>> output=sim(net,input)
output =
0.5923 0.0335 0.9445 0.3937
The output was not exactly what we wanted! We wanted (0 1 1 0) but got roughly
(0.60 0.03 0.95 0.40). (Note that your network might give a different result, because the
network's weights are given random values at the initialization.)
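What sim computes for this 2-2-1 logsig architecture is two weighted sums with a squashing function after each. A Python sketch of that forward pass, using hand-picked weights (not obtained by training) that happen to solve XOR, to show the shape of the computation:

```python
import math

def logsig(n):
    return 1.0 / (1.0 + math.exp(-n))

def sim_2_2_1(iw, b1, lw, b2, p):
    """The forward pass sim performs for a 2-2-1 logsig network:
    a1 = logsig(IW*p + b1), then a2 = logsig(LW*a1 + b2)."""
    a1 = [logsig(sum(w * x for w, x in zip(row, p)) + b)
          for row, b in zip(iw, b1)]
    return logsig(sum(w * a for w, a in zip(lw, a1)) + b2)

# Hand-picked weights that implement XOR (one hidden unit fires for
# "at least one input on", the other for "not both inputs on"):
iw, b1 = [[10, 10], [-10, -10]], [-5, 15]
lw, b2 = [10, 10], -15
```

With an untrained network the weights are random, which is exactly why the outputs above landed nowhere near (0 1 1 0).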
You can now plot the output and the targets; the targets are the values that we
want the network to generate. Construct the target vector:
>> target = [0 1 1 0]
target =
0 1 1 0
To plot points we use the command "plot". We want the targets to appear as
small circles, so we use the command:
>> plot(target, 'o')
We want to plot the output in the same window. Normally the contents in a
window are erased when you plot something new in it. In this case we want the targets to
remain in the picture so we use the command hold on. The output is plotted as +'s.
>> hold on
>> plot(output, '+')
In the resulting figure (Fig 5.2) it is easy to see that the network does not give the
desired results. To change this we have to train it. First, though, we will adjust the
weights manually.
Figure 5.2: The targets and the actual output from an untrained XOR network. The
targets are represented as 'o' and the output as '+'
5.7.1 Manually set weights
The network we have constructed so far does not really behave as it should. To
correct this, the weights will be adjusted. All the weights are stored in the net structure
that was created with newff. The weights are numbered by the layers they connect and
the neurons within these layers. To get the value of the weights between the input layer
and the first hidden layer we type:
>> net.IW
ans =
[2x2 double]
[]
>> net.IW{1,1}
ans =
5.5008 -5.6975
2.5404 -7.5011
This means that the weight from the second neuron in the input layer to the
first neuron in the first hidden layer is -5.6975. To change it to 1, enter:
>> net.IW{1,1}(1,2)=1;
>> net.IW{1,1}
ans =
5.5008 1.0000
2.5404 -7.5011
The weights between the hidden layer and the output layer are stored in the .LW
component, which can be used in the same manner as .IW.
>> net.LW
ans =
[] []
[1x2 double] []
>> net.LW{2,1}
ans =
-3.5779 -4.3080
The change we made in the weight makes our network give another output when
we simulate it; try it by entering:
>> output=sim(net,input)
output =
0.8574 0.0336 0.9445 0.3937
>> plot(output,'g*');
Now the new output will appear as green stars in your picture. Are they closer to
the o's than the +'s were?
5.7.2 Training Algorithms
In the neural network toolbox there are several training algorithms already
implemented. That is good, because they can do the heavy work of training much more
smoothly and quickly than we can by manually adjusting the weights. Now let us apply the
default training algorithm to our network. The Matlab command to use is train; it takes
the network, the input matrix and the target matrix as input. The train command returns
a new trained network. For more information type helpwin train. In this example we
do not need all the information that the training algorithm shows, so we turn it off by
entering:
>> net.trainParam.show=NaN;
The most important training parameters are .epochs, which determines the
maximum number of epochs to train, and .show, the interval between each presentation of
training progress. If the gradient of the performance is less than .min_grad the training is
ended. The .time component determines the maximum time to train.
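These stopping criteria can be sketched as a simple loop (Python, for a one-parameter problem; train's real bookkeeping is more elaborate):

```python
import time

def train_sgd(w, grad, lr=0.1, epochs=100, min_grad=1e-10, max_time=60.0):
    """Sketch of train's stopping logic: stop after .epochs epochs, when
    the gradient magnitude falls below .min_grad, or when .time runs out."""
    start = time.time()
    for _ in range(epochs):
        g = grad(w)
        if abs(g) < min_grad or time.time() - start > max_time:
            break
        w -= lr * g
    return w

# Minimizing the performance (w - 3)^2; its gradient is 2*(w - 3):
w_final = train_sgd(0.0, lambda w: 2.0 * (w - 3.0))
```

Whichever criterion fires first ends the training, which is why a run can stop well before the epoch limit.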
And to train the network enter:
>> net = train(net,input,target);
Because of the small size of the network, the training is done in only a second or two.
Now we try to simulate the network again, to see how it reacts to the inputs:
>> output = sim(net,input)
output =
0.0000 1.0000 1.0000 0.0000
That was exactly what we wanted the network to output! You may now plot the
output and see that the +'s fall on the o's. Now examine the weights that the training
algorithm has set; do they look like the weights that you found?
>> net.IW{1,1}
ans =
11.0358 -9.5595
16.8909 -17.5570
>> net.LW{2,1}
ans =
25.9797 -25.7624
It is also possible to enter the name of the training algorithm when the network is
created; see help newff for more information.
5.8 GRAPHICAL USER INTERFACE
5.8.1 Introduction to the GUI
The graphical user interface (GUI) is designed to be simple and user friendly, but
we will go through a simple example to get you started.
In what follows you bring up a GUI Network/Data Manager window. This
window has its own work area, separate from the more familiar command line
workspace. Thus, when using the GUI, you might "export" the GUI results to the
(command line) workspace. Similarly you may want to "import" results from the
command line workspace to the GUI.
Once the Network/Data Manager is up and running, you can create a network,
view it, train it, simulate it and export the final results to the workspace. Similarly, you
can import data from the workspace for use in the GUI.
The following example deals with a perceptron network. We go through all the
steps of creating a network and show you what you might expect to see as you go along.
5.8.2 Create a Perceptron Network (nntool)
We create a perceptron network to perform the AND function in this example. It
has an input vector p= [0 0 1 1;0 1 0 1] and a target vector t=[0 0 0 1]. We call
the network ANDNet. Once created, the network will be trained. We can then save the
network, its output, etc., by "exporting" it to the command line.
5.8.3 Input and target
To start, type nntool. The following window appears.
Figure 5.3 Network data manager
Click on Help to get started on a new problem and see descriptions of the buttons and lists.
First, we want to define the network input, which we call p, as having the
particular value [0 0 1 1;0 1 0 1]. Thus, the network has a two-element input, and four sets
of such two-element vectors are presented to it in training. To define this data, click on
New Data, and a new window, Create New Data, appears. Set the Name to p, the Value
to [0 0 1 1;0 1 0 1], and make sure that Data Type is set to Inputs. The Create New Data
window will then look like this:
Figure 5.4 Create new data window
Now click Create to actually create an input file p. The Network/Data Manager
window comes up and p shows as an input.
Next we create a network target. Click on New Data again, and this time enter the
variable name t, specify the value [0 0 0 1], and click on Target under data type.
Again click on Create and you will see in the resulting Network/Data Manager window
that you now have t as a target as well as the previous p as an input.
5.8.4 Create Network
Now we want to create a new network, which we will call ANDNet. To do this,
click on New Network, and a Create New Network window appears. Enter ANDNet
under Network Name. Set the Network Type to Perceptron, for that is the kind of
network we want to create. The input ranges can be set by entering numbers in that field,
but it is easier to get them from the particular input data that you want to use. To do this,
click on the down arrow at the right side of Input Range. This pull-down menu shows
that you can get the input ranges from the file p if you want. That is what we want to do,
so click on p. This should lead to input ranges [0 1;0 1]. We want to use a hardlim
transfer function and a learnp learning function, so set those values using the arrows for
Transfer function and Learning function respectively. By now your Create New
Network window should look like:
Figure 5.5 Create new network window
Next you might look at the network by clicking on View. For example:
Figure 5.6 View network window
This picture shows that you are about to create a network with a single input
(composed of two elements), a hardlim transfer function, and a single output. This is the
perceptron network that we wanted.
Now click Create to generate the network. You will get back the Network/Data
Manager window. Note that ANDNet is now listed as a network.
5.8.5 Train the Perceptron
To train the network, click on ANDNet to highlight it. Then click on Train. This
leads to a new window labeled Network:ANDNet. At this point you can view the
network again by clicking on the top tab Train. You can also check on the initialization
by clicking on the top tab Initialize. Now click on the top tab Train. Specify the inputs
and output by clicking on the left tab Training Info and selecting p from the pull-down
list of inputs and t from the pull-down list of targets. The Network:ANDNet window
should look like:
Figure 5.7 Main network window
Note that the Training Result Outputs and Errors have the
name ANDNet appended to them. This makes them easy to identify later when they are
exported to the command line.
While you are here, click on the Training Parameters tab. It shows you
parameters such as the epochs and error goal. You can change these parameters at this
point if you want.
Now click Train Network to train the perceptron network. You will see the
following training results.
Figure 5.8 training result window
Thus, the network was trained to zero error in four epochs. (Note that other kinds
of networks commonly do not train to zero error, and their errors commonly cover a much
larger range. On that account, we plot their errors on a log scale rather than on a linear
scale such as that used above for perceptrons.)
You can check that the trained network does indeed give zero error by using the
input p and simulating the network. To do this, get to the Network/Data Manager
window and click on Network Only: Simulate. This will bring up the
Network:ANDNet window. Click there on Simulate. Now use the Input pull-down
menu to specify p as the input, and label the output as ANDNet_outputsSim to
distinguish it from the training output. Now click Simulate Network in the lower right
corner. Look at the Network/Data Manager and you will see a new variable in the
output: ANDNet_outputsSim. Double-click on it and a small window
Data:ANDNet_outputsSim appears with the value
[0 0 0 1]
Thus, the network does perform the AND of the inputs, giving a 1 as an output
only in this last case, when both inputs are 1.
5.8.6 Export Perceptron Results to Workspace
To export the network outputs and errors to the MATLAB command line
workspace, click in the lower left of the Network:ANDNet window to go back to the
Network/Data Manager. Note that the output and error for the ANDNet are listed in the
Outputs and Error lists on the right side. Next click on Export. This will give you an
Export or Save from Network/Data Manager window. Click on ANDNet_outputs and
ANDNet_errors to highlight them, and then click the Export button. These two variables
now should be in the command line workspace. To check this, go to the command line
and type who to see all the defined variables. The result should be
who
Your variables are:
ANDNet_errors ANDNet_outputs
You might type ANDNet_outputs and ANDNet_errors to obtain the following
ANDNet_outputs =
0 0 0 1
and
ANDNet_errors =
0 0 0 0.
You can export p, t, and ANDNet in a similar way. You might do this and check
with who to make sure that they got to the command line.
Now that ANDNet is exported you can view the network description and examine
the network weight matrix. For instance, the command
ANDNet.iw{1,1}
gives
ans =
2 1
Similarly,
ANDNet.b{1}
yields
ans =
-3.
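With these exported values, the perceptron's decision can be checked by hand. A Python sketch of the hardlim computation with the weights [2 1] and bias -3 reported above:

```python
def hardlim(n):
    """MATLAB's hardlim: output 1 when the net input is >= 0, else 0."""
    return 1 if n >= 0 else 0

w, b = [2, 1], -3   # the trained weights and bias reported above
outputs = [hardlim(w[0] * p1 + w[1] * p2 + b)
           for p1, p2 in [(0, 0), (0, 1), (1, 0), (1, 1)]]
# outputs == [0, 0, 0, 1], i.e. the AND function
```

Only for the input (1, 1) does the net input 2 + 1 - 3 = 0 reach the hardlim threshold, so only that case fires.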
5.8.7 Clear Network/Data Window
You can clear the Network/Data Manager window by highlighting a variable
such as p and clicking the Delete button until all entries in the list boxes are gone. By
doing this, we start from a clean slate.
Alternatively, you can quit MATLAB. A restart with a new MATLAB, followed
by nntool, gives a clean Network/Data Manager window.
Recall however, that we exported p, t, etc., to the command line from the
perceptron example. They are still there for your use even after you clear the
Network/Data Manager.
5.8.8 Importing from the Command Line
To make things simple, quit MATLAB. Start it again, and type nntool to begin a
new session.
Create a new vector.
r= [0; 1; 2; 3]
r =
0
1
2
3
Now click on Import, and set the destination Name to R (to distinguish between
the variable named at the command line and the variable in the GUI). You will have a
window that looks like this
Figure 5.9 Import / load window
Now click Import and verify by looking at the Network/Data Manager that the
variable R is there as an input.
5.8.9 Save a Variable to a File and Load It Later
Bring up the Network/Data Manager and click on New Network. Set the name to
mynet. Click on Create. The network name mynet should appear in the Network/Data
Manager. In this same manager window click on Export. Select mynet in the variable
list of the Export or Save window and click on Save. This leads to the Save to a MAT
file window. Save to a file mynetfile.
Now let's get rid of mynet in the GUI and retrieve it from the saved file. First go to
the Data/Network Manager, highlight mynet, and click Delete. Next click on Import.
This brings up the Import or Load to Network/Data Manager window. Select the
Load from Disk button and type mynetfile as the MAT-file Name. Now click on
Browse. This brings up the Select MAT file window with mynetfile as an option that
you can select as a variable to be imported. Highlight mynetfile, press Open, and you
return to the Import or Load to Network/Data Manager window. On the Import As
list, select Network. Highlight mynet and click on Load to bring mynet into the GUI. Now
mynet is back in the GUI Network/Data Manager window.
Chapter No. 6
EXPERIMENTAL WORK
The data shown in Table 6.1 was used to train the network. Some of the data
was held back as unseen data to test the network; this is shown in Table 6.2.
6.1 DATA SET USED
C-min = minimum carbon, Mn-min = minimum manganese, P-max = maximum phosphorus,
S-max = maximum sulphur, T-S = tensile strength
S.No.  SAE No.  C-min  Mn-min  P-max  S-max  T-S
 1     1006     0.08   0.25    0.04   0.05    43000
 2     1008     0.10   0.25    0.04   0.05    44000
 3     1009     0.15   0.60    0.04   0.05    43000
 4     1010     0.08   0.30    0.04   0.05    47000
 5     1012     0.10   0.30    0.04   0.05    48000
 6     1015     0.13   0.30    0.04   0.05    50000
 7     1016     0.13   0.60    0.04   0.05    55000
 8     1017     0.15   0.30    0.04   0.05    53000
 9     1018     0.15   0.60    0.04   0.05    58000
10     1019     0.15   0.70    0.04   0.05    59000
11     1020     0.18   0.30    0.04   0.05    55000
12     1022     0.18   0.70    0.04   0.05    62000
13     1023     0.20   0.30    0.04   0.05    56000
14     1024     0.19   1.35    0.04   0.05    74000
15     1025     0.22   0.30    0.04   0.05    58000
16     1027     0.22   1.20    0.04   0.05    75000
17     1030     0.28   0.60    0.04   0.05    68000
18     1033     0.30   0.70    0.04   0.05    72000
19     1035     0.32   0.60    0.04   0.05    72000
20     1037     0.32   0.70    0.04   0.05    74000
21     1038     0.35   0.60    0.04   0.05    75000
22     1039     0.37   0.70    0.04   0.05    79000
23     1040     0.37   0.60    0.04   0.05    76000
24     1042     0.40   0.60    0.04   0.05    80000
25     1043     0.40   0.70    0.04   0.05    82000
26     1045     0.43   0.60    0.04   0.05    82000
27     1046     0.43   0.70    0.04   0.05    85000
28     1050     0.48   0.60    0.04   0.05    90000
29     1052     0.47   1.20    0.04   0.05   108000
30     1055     0.50   0.60    0.04   0.05    94000
31     1060     0.55   0.60    0.04   0.05    98000
32     1065     0.60   0.60    0.04   0.05   100000
33     1070     0.65   0.60    0.04   0.05   102000
34     1074     0.70   0.50    0.04   0.05   105000
35     1078     0.72   0.30    0.04   0.05   100000
36     1084     0.80   0.60    0.04   0.05   119000
37     1085     0.80   0.70    0.04   0.05   121000
38     1086     0.80   0.30    0.04   0.05   112000
39     1090     0.85   0.60    0.04   0.05   122000
40     1095     0.90   0.30    0.04   0.05   120000
Table 6.1. Data set to be used for training the network
S.No. SAE No. C-min Mn-min P-max S-max T-S
 1     1021     0.18   0.60    0.04   0.05    61000
 2     1026     0.22   0.60    0.04   0.05    64000
 3     1036     0.30   1.20    0.04   0.05    83000
 4     1041     0.36   1.35    0.04   0.05    92000
 5     1049     0.46   0.60    0.04   0.05    87000
 6     1064     0.60   0.50    0.04   0.05    97000
 7     1080     0.75   0.60    0.04   0.05   112000
Table 6.2. Data set to be used as unseen data for the network
6.2 METHODOLOGY
The data shown in table 6.1 was used to train the network while the performance
of the network was tested on the data shown in table 6.2.
6.2.1 Algorithm
% p1,p2,p3 and p4 are the variables with the values of C, Mn, P, and S respectively%
p1=[0.08 0.1 0.15 0.08 0.1 0.13 0.13 0.15 0.15 0.15 0.18 0.18 0.2 0.19 0.22 0.22 0.28 0.3
0.32 0.32 0.35 0.37 0.37 0.4 0.4 0.43 0.43 0.48 0.47 0.5 0.55 0.6 0.65 0.7 0.72 0.8 0.8 0.8
0.85 0.9];
p2=[0.25 0.25 0.6 0.3 0.3 0.3 0.6 0.3 0.6 0.7 0.3 0.7 0.3 1.35 0.3 1.2 0.6 0.7 0.6 0.7 0.6
0.7 0.6 0.6 0.7 0.6 0.7 0.6 1.2 0.6 0.6 0.6 0.6 0.5 0.3 0.6 0.7 0.3 0.6 0.3];
p3=[0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04
0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04
0.04 0.04 0.04 0.04 0.04];
p4=[0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05
0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05
0.05 0.05 0.05 0.05 0.05];
% t is the target value for the above inputs, they will be used for training%
t=[43000 44000 43000 47000 48000 50000 55000 53000 58000 59000 55000 62000
56000 74000 58000 75000 68000 72000 72000 74000 75000 79000 76000 80000 82000
82000 85000 90000 108000 94000 98000 100000 102000 105000 100000 119000
121000 112000 122000 120000];
p=[p1;p2;p3;p4];
% Initializing "net" with one neuron in the hidden layer, tansig transfer function for the
% hidden layer, purelin for the output layer, and trainlm as the training function %
net=newff(minmax(p),[1,1],{'tansig','purelin'},'trainlm');
** Warning in INIT
** Network "input{1}.range" has a row with equal min and max values.
** Constant inputs do not provide useful information.
net.trainparam.show=10;
net.trainparam.goal=0.001;
%Training the network with inputs ‘p’ and target ‘t’%
net=train(net,p,t);
TRAINLM, Epoch 0/100, MSE 6.6312e+009/0.001, Gradient 3.15473e+006/1e-010
TRAINLM, Epoch 4/100, MSE 5.57749e+008/0.001, Gradient 5.05325e-006/1e-010
TRAINLM, Maximum MU reached, performance goal was not met.
Figure 6.1 Training of the network with TRAINLM function and 1 neuron
% Defining input vectors for testing %
in1=[0.18;0.6;0.04;0.05];
in2=[0.22;0.6;0.04;0.05];
in3=[0.3;1.2;0.04;0.05];
in4=[0.36;1.35;0.04;0.05];
in5=[0.46;0.6;0.04;0.05];
in6=[0.6;0.5;0.04;0.05];
in7=[0.75;0.6;0.04;0.05];
NUMBER OF NEURONS = 1
%Simulating the network with inputs 1, 2, 3, 4, 5, 6,and 7%
y=sim(net,in1)
y = 7.7270e+004
y=sim(net,in2)
y = 7.7270e+004
y=sim(net,in3)
y = 8.5667e+004
y=sim(net,in4)
y = 8.5667e+004
y=sim(net,in5)
y = 7.7270e+004
y=sim(net,in6)
y = 7.7270e+004
y=sim(net,in7)
y = 7.8113e+004
NUMBER OF NEURONS = 7
Figure 6.2 Training of the network with 7 neurons
% Simulating the network with inputs 1, 2, 3, 4, 5, 6 and 7 %
y=sim(net,in1)
y = 5.2638e+004
y=sim(net,in2)
y = 6.8513e+004
y=sim(net,in3)
y = 6.8513e+004
y=sim(net,in4)
y = 6.8513e+004
y=sim(net,in5)
y = 8.8258e+004
y=sim(net,in6)
y = 1.0500e+005
y=sim(net,in7)
y = 8.8258e+004
NUMBER OF NEURONS = 9
Figure 6.3 Training of the network with 9 neurons
% Simulating the network with inputs 1, 2, 3, 4, 5, 6 and 7 %
y=sim(net,in1)
y = 5.5400e+004
y=sim(net,in2)
y = 8.2920e+004
y=sim(net,in3)
y = 1.0800e+005
y=sim(net,in4)
y = 1.0800e+005
y=sim(net,in5)
y = 8.2920e+004
y=sim(net,in6)
y = 8.2920e+004
y=sim(net,in7)
y = 8.2920e+004
Figure 6.4 The network
Chapter No. 7
RESULTS AND CONCLUSION
7.1 RESULTS
S.No.  SAE No.  C-min  Mn-min  P-max  S-max  T-S (act)  1 neuron  7 neurons  9 neurons
 1     1021     0.18   0.60    0.04   0.05    61000      77270     52638      55400
 2     1026     0.22   0.60    0.04   0.05    64000      77270     68513      82920
 3     1036     0.30   1.20    0.04   0.05    83000      85667     68513     108000
 4     1041     0.36   1.35    0.04   0.05    92000      85667     68513     108000
 5     1049     0.46   0.60    0.04   0.05    87000      77270     88258      82920
 6     1064     0.60   0.50    0.04   0.05    97000      77270    105000      82920
 7     1080     0.75   0.60    0.04   0.05   112000      78113     88258      82920
Average Error (%)                                        18.6062   16.26343   18.26556
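The thesis does not state the error formula, but the tabulated averages are reproduced exactly by the mean of |actual - predicted| / predicted, expressed as a percentage. A Python sketch for the 1-neuron column:

```python
actual    = [61000, 64000, 83000, 92000, 87000, 97000, 112000]
predicted = [77270, 77270, 85667, 85667, 77270, 77270, 78113]  # 1 neuron

# Average error = mean of |actual - predicted| / predicted, in percent.
errors = [abs(a - p) / p * 100 for a, p in zip(actual, predicted)]
avg_error = sum(errors) / len(errors)   # about 18.61 for the 1-neuron network
```

The same formula applied to the 7-neuron column gives 16.26, matching the table, which confirms the measure being used.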
Weights to layer 1 from input
net.iw{1,1}
ans =
1.0e+004 *
0.8415 -0.2418 0.0978 0.1222
0.5361 -5.3917 -0.1597 -0.1996
-4.1999 2.2293 -0.2352 -0.2940
4.8301 -1.5508 0.0239 0.0299
-0.2510 -0.9045 -0.0411 -0.0513
-1.7587 3.4961 -0.1782 -0.2228
-0.9265 -2.8471 0.1268 0.1585
0.5230 -6.9870 0.0877 0.1096
-2.4701 -6.0088 0.1428 0.1785
Weights to layer 2 from layer 1
net.lw{2,1}
[9468.7077 27719.2353 -11582.5085 16750 3650.5453 -20946.5199 -12540 16540 -30300]
Bias to layer 1
net.b{1}
[24454.0639; -39904.3358; -58806.4363; 5984.5838; -10268.4537; -44553.1615;
31696.3873; 21915.2677; 35697.9616]
Bias to layer 2
net.b{2}
[54322.0444]
Graph 7.1 Comparisons of actual value and predicted value with 1 neuron (log-scale plot)
Graph 7.2 Comparisons of actual value and predicted value with 7 neurons (log-scale plot)
Graph 7.3 Comparisons of actual value and predicted value with 9 neurons (log-scale plot)
Graph 7.4 Regression line for the values predicted and actual
7.2 CONCLUSION
The mechanical properties of plain carbon steels were predicted using a feed-forward
back-propagation artificial neural network, with an average error of 18.61% with 1 neuron,
16.26% with 7 neurons and 18.27% with 9 neurons. One reason for these errors is the use of
constant data for sulfur and phosphorus: constant inputs provide no useful information to the
network. Using more parameters and experimental data can reduce the errors. Overall, the
performance of the neural network was very satisfactory; it is a highly significant and
beneficial tool in the design, development and analysis of plain carbon steels that can
result in increased efficiency and productivity.
7.3 FUTURE WORK
In the future, more data and parameters can be used to extend the present work, such as:
• Process of manufacturing
• Heat treatment performed
• Type of product
• Mechanical working done
A genetic algorithm may be applied to obtain reverse predictions, such as obtaining the
composition by using the mechanical properties as input parameters.
APPENDICES
APPENDIX A
DEFINITION OF TERMS
Activation
The time-varying value that is the output of a neuron. The activation
function maps a neuron's net input to this value.
Backpropagation (generalized delta-rule)
A name given to the process by which a multilayer (Perceptron-style) neural
network is "trained" to produce good responses to a set of input patterns. In
light of this, such a network is sometimes called a "back-prop" network.
Bias
A quantity added to a neuron's weighted input sum; the bias (or threshold)
sets the amount that incoming neural activations must exceed in order for the
neuron to fire.
Connectivity
The amount of interaction in a system, the structure of the weights in a
neural network, or the relative number of edges in a graph.
Pattern recognition
The act of identifying patterns within previously learned data. A neural
network can carry this out even in the presence of noise or when some data is
missing.
Epoch
One complete presentation of the training set to the network during
training.
Input layer
Neurons whose inputs are fed from the outside world.
Learning algorithms (supervised, unsupervised)
An adaptation process whereby synapses, the weights of a neural network,
classifier strengths, or some other set of adjustable parameters is automatically
modified so that some objective is more readily achieved. The backpropagation
and bucket brigade algorithms are two types of learning procedures.
Learning rule
The algorithm used for modifying the connection strengths, or weights, in
response to training patterns while training is being carried out.
Layer
A group of neurons that have a specific function and are processed as a
whole. The most common example is in a feedforward network that has an input
layer, an output layer and one or more hidden layers.
Monte-Carlo method
The Monte-Carlo method provides approximate solutions to a variety of
mathematical problems by performing statistical sampling experiments on a
computer.
Multilayer-perceptron (MLP)
A type of feedforward neural network that is an extension of the
perceptron in that it has at least one hidden layer of neurons. Layers are updated
by starting at the inputs and ending with the outputs. Each neuron computes a
weighted sum of the incoming signals, to yield a net input, and passes this value
through its sigmoidal activation function to yield the neuron's activation value.
Unlike the perceptron, an MLP can solve linearly inseparable problems.
Neural Network (NN)
A network of neurons that are connected through synapses or weights.
Each neuron performs a simple calculation that is a function of the activations of
the neurons that are connected to it. Through feedback mechanisms and/or the
nonlinear output response of neurons, the network as a whole is capable of
performing extremely complicated tasks, including universal computation and
universal approximation. Three different classes of neural networks are
feedforward, feedback, and recurrent neural networks, which differ in the degree
and type of connectivity that they possess.
Neuron
A simple computational unit that performs a weighted sum on incoming
signals, adds a threshold or bias term to this value to yield a net input, and maps
this last value through an activation function to compute its own activation. Some
neurons, such as those found in feedback or Hopfield networks, will retain a
portion of their previous activation.
Output neuron
A neuron within a neural network whose outputs are the result of the
network.
Perceptron
An artificial neural network capable of simple pattern recognition and
classification tasks. It is composed of three layers where signals only pass forward
from nodes in the input layer to nodes in the hidden layer and finally out to the
output layer. There are no connections within a layer.
Sigmoid function
An S-shaped function that is often used as an activation function in a
neural network.
Threshold
A quantity added to (or subtracted from) the weighted sum of inputs into a
neuron, which forms the neuron's net input. Intuitively, the threshold sets the
amount that the incoming neural activations must exceed in order for a neuron
to fire.
Training set
A neural network is trained using a training set, which comprises example input stimuli for the problem to be solved, together with the corresponding desired outputs. In some computing systems the training set is called the "facts" file.
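As an illustration, a training set for the XOR problem could be laid out as plain Python lists, pairing each input stimulus with its desired output (a hypothetical layout, not the format of any particular tool's "facts" file):

```python
# Each row of `inputs` is one stimulus; `targets` holds the desired outputs.
inputs  = [[0, 0], [0, 1], [1, 0], [1, 1]]
targets = [0, 1, 1, 0]

# Training repeatedly presents these pairs and adjusts the weights
# to reduce the error between the actual and desired outputs.
for x, t in zip(inputs, targets):
    print(x, "->", t)
```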
Weight
In a neural network, the strength of a synapse (or connection) between two
neurons. Weights may be positive (excitatory) or negative (inhibitory). The
thresholds of a neuron are also considered weights, since they undergo adaptation
by a learning algorithm.
APPENDIX B
Network Layers
The term "layer" in the neural network sense means different things to different people. In the Neural Network Toolbox (NNT), a layer is defined as a layer of neurons, with the exception of the input layer. So in NNT terminology this would be a one-layer network:
Figure: a one-layer network.
And this would be a two-layer network:
Figure: a two-layer network.
Each layer has a number of properties, the most important being the transfer functions of its neurons and the function that computes each neuron's net input from its weights and the output of the previous layer.
Activation Functions
When a neuron updates, it passes the sum of the incoming signals through an activation function, or transfer function as Matlab calls it. There are different types of activation functions: some are saturating and ensure that the output value lies within a specific range, such as logsig, tansig, hardlims, and satlin, while others, such as purelin, are not saturating. Some of the transfer functions in the Neural Network Toolbox are plotted in figure 5.3. The transfer function is chosen when you create the network and is assigned to each layer. To create a feedforward network with two inputs, three tansig neurons in the hidden layer, and one logsig neuron in the output layer, enter:

>> net = newff([0 1; 0 1], [3 1], {'tansig','logsig'});
Figure: Transfer functions supplied by Matlab, plotted on the same scale. Note the difference between tansig and logsig: tansig ranges between -1 and 1, while logsig ranges between 0 and 1. The same relationship holds between hardlims and hardlim, and between satlins and satlin.
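For reference, the transfer functions named above can be sketched in Python using their standard mathematical definitions (the names follow the toolbox's conventions; these are illustrative equivalents, not the toolbox code):

```python
import math

def logsig(n):
    """Log-sigmoid: saturating, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-n))

def tansig(n):
    """Hyperbolic-tangent sigmoid: saturating, output in (-1, 1)."""
    return math.tanh(n)

def hardlim(n):
    """Hard limit: 0 or 1."""
    return 1.0 if n >= 0 else 0.0

def hardlims(n):
    """Symmetric hard limit: -1 or 1."""
    return 1.0 if n >= 0 else -1.0

def satlin(n):
    """Saturating linear: clipped to [0, 1]."""
    return min(1.0, max(0.0, n))

def purelin(n):
    """Pure linear: unbounded, returns n itself."""
    return n
```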
BIBLIOGRAPHY
1. R. L. Timings, "Engineering Materials", Vol. 1, Longman Group, United Kingdom, 1994.
2. H. N. Koivo, "Neural Networks: Basics Using Matlab, Neural Network Toolbox", USA, 2000.
3. Iqbal Shah, "Tensile Properties of Austenitic Stainless Steels", UK, 2002.
4. H. K. D. H. Bhadeshia, "Neural Networks in Materials Science", ISIJ International, Vol. 39, No. 10, 1999, pp. 966-979.
5. E. P. DeGarmo, "Materials and Processes in Manufacturing", 9th edition, John Wiley and Sons, United States, 2003.
6. Vito J. Colangelo, "Analysis of Metallurgical Failures", 2nd edition, John Wiley and Sons, Singapore, 1987.
7. O. P. Khanna, "Text Book of Metallurgy and Materials Engineering", India, 2002.
8. M. H. Jokhio, M. A. Unar, "Application of Neural Network in Powder Metallurgy", Engineering Materials Proceedings, 2004.
9. Internet web site: www.ide.his.se
10. H. Demuth, M. Beale, "Neural Network Toolbox for Matlab", MathWorks Inc., USA, 2000.
11. Internet web site: www.astm.org
12. Internet web site: www.mathworks.com
13. Internet web site: www.igi.trgraz.at
14. Internet web site: www.cs.wisc.edu
15. Internet web site: www.statsoftinc.com
16. Internet web site: www.azom.com
17. Internet web site: http://carol.wins.ura.nl
18. Internet web site: http://envistat.esa.cn
19. Internet web site: http://www.brain.web-us.com
20. Internet web site: http://njuct.edu.cn
21. Internet web site: www.baldwininternational.com
22. Internet web site: www.cs.man.ac.uk
23. Internet web site: www.tms.org/pubs/jom.htm
24. Internet web site: http://www.torch.ch/matos/convolutions.pdf
25. Carlos Gershenson, "Artificial Neural Networks for Beginners", UK, 2003.
26. Internet web site: www.benbest.com
27. Ivan Galkin, U. Mass Lowell, "Crash Introduction to Artificial Neural Networks" (Materials for UML 91.531 Data Mining Course).