Transcript of Ibtesam's Thesis
PREDICTING THE MECHANICAL PROPERTIES OF PLAIN CARBON STEELS USING ANN
By
1. S. Ibtesam Hasan Abidi (Group Leader) 01 MT 23 2. Jamal Ahmed (A.G.L) 01 MT 21 3. S. Raheem shah 2k-01 MT 31
Supervised by:
Professor Mohammad Hayat Jokhio, Chairman
Department of Metallurgy and Materials Engineering, Mehran University of Engineering and Technology, Jamshoro
Submitted in partial fulfillment of the requirements for the degree
of Bachelor of Metallurgy and Materials Engineering
Dec 2004
Dedication
Dedicated to Time, our best teacher, and then to our Parents, our first teachers
CERTIFICATE
This is to certify that the work presented in this project thesis on “Predicting the
Mechanical Properties of Plain Carbon Steels Using ANN” is entirely written by the
following students under the supervision of Prof. Mohammad Hayat Jokhio.
1. S. Ibtesam Hasan Abidi (Group Leader) 01 MT 23
2. Jamal Ahmed (A.G.L) 01 MT 21
3. S. Raheem Shah 2k-01 MT 31
Thesis Supervisor External Examiner
______, Dec 2004
Chairman
Department of Metallurgy and Materials Engineering
Acknowledgements
Every praise is to ALLAH alone, and the grace of ALLAH is on Prophet Muhammad (PBUH) the only source of guidance and knowledge for all humanity.
It gives us great pleasure to record our sincere thanks to our supervisor, Professor
Mohammad Hayat Jokhio, Chairman of the Department of Metallurgy and Materials
Engineering, who gave his consent to guide us in this project. He was very
encouraging and cooperative while the work was carried out.
We are thankful to Mr. Isahque Abro, Assistant Professor, Department of
Metallurgy and Materials Engineering, for his contributions in making this work
possible.
We consider ourselves extremely lucky to be members of Mehran University of
Engineering and Technology. Our friends were great fun to be with; they always helped
us during the hard times, and we will miss them all dearly.
We also thank all those who helped and contributed to the preparation of this thesis.
Finally, we would like to say how much we love our families. Without their support
and encouragement, these years would not have been possible.
ABSTRACT
Neural networks are an emerging and prominent technology in the field of
artificial intelligence. The present work, carried out at Mehran University of Engineering
and Technology, Jamshoro, is an attempt to use a feed-forward neural network with the
back-propagation training algorithm to predict the mechanical properties of plain carbon
steels. In the present work the composition of the plain carbon steels was used as the
input parameter. Forty samples were used to train the network, while it was validated on
7 samples. The network was built with the tansig and purelin transfer functions and the
trainlm training function.
This work should help materials engineers and manufacturers design products
with suitable, high-performance mechanical properties.
CONTENTS
Chapter No. 1
INTRODUCTION
Chapter No. 2
LITERATURE REVIEW
2.1 NEURAL NETWORK
2.2 CHARACTERISTICS OF NEURAL NETWORK
2.2.1 Capabilities Of Modeling
2.2.2 Ease Of Use
2.3 ANALOGY TO THE BRAIN
2.4 THE BIOLOGICAL NETWORK
2.4.1 The Work Mechanism
2.5 THE ARTIFICIAL NEURON
2.5.1 The Work Mechanism
2.6 OUTSTANDING FEATURES OF NEURAL NETWORK
2.7 MAIN FEATURES OF NEURAL NETWORK
2.8 LIMITATIONS
2.9 ADVANTAGES
2.10 CLASSIFICATION OF NEURAL NETWORKS
2.10.1 Feed Forward Networks
2.10.2 The Back Propagation
2.10.3 Single layer perceptron
2.10.4 Multi-layer perceptron
2.10.5 Simple recurrent network
2.10.6 Hopfield network
2.10.7 Boltzmann machine
2.10.8 Committee of machines
2.10.9 Self-organizing map
2.11 DESIGN
2.11.1 Layers
2.11.2 Communicating And Types Of Connections
2.11.2.1 Inter-layer connections
2.11.2.2 Intra-layer connections
2.11.3 Learning
2.11.3.1 Off-Line Or Online
2.11.3.2 Learning Laws
Chapter No. 3
MECHANICAL PROPERTIES OF STEEL AND FACTORS AFFECTING
MECHANICAL PROPERTIES
3.1.1 Tensile Strength
3.1.2 Yield Strength
3.1.3 Elasticity
3.1.4 Plasticity
3.1.5 Ductility
3.1.6 Brittleness
3.1.7 Toughness
3.1.8 Hardness
3.1.9 Fatigue
3.1.10 Creep
3.2 FACTORS AFFECTING MECHANICAL PROPERTIES
3.2.1 Effect Of Grain Size On Properties Of Metals
3.2.2 Effect Of Heat Treatment On Properties Of Metals
3.2.3 Effect Of Environmental Variables
3.2.4 Effect Of Alloying Elements
Chapter No. 4
TESTING TECHNIQUES
4.1 TENSILE TEST
4.1.1 Tensile Test Results
4.1.2 Proof stress
4.1.3 The Interpretation of tensile test results
4.1.4 The effect of grain size and structure on tensile testing:
4.2 IMPACT TESTING
4.2.1 The Izod Test
4.2.2 The Charpy Test
4.2.3 The Interpretation Of Impact Tests
4.2.4 The Effect Of Processing On Toughness
4.3 HARDNESS TESTING
4.3.1 The Brinell Hardness Test
4.3.1.1 Machinability
4.3.1.2 Relationship Between Hardness And Tensile Strength
4.3.1.3 Work-Hardening Capacity
4.3.2 The Vickers Hardness Test
4.3.3 The Rockwell Hardness Test
4.3.4 Shore scleroscope
4.3.4.1 The Effect Of Processing On Hardness
Chapter No. 5
USING THE NEURAL NETWORK TOOLBOX
5.1 INTRODUCTION
5.2 THE STRUCTURE OF THE NEURAL NETWORK TOOLBOX
5.3 NETWORK LAYERS
5.3.1 Constructing Layers
5.3.2 Connecting Layers
5.4 SETTING TRANSFER FUNCTIONS
5.4.1 Activation Functions
5.5 WEIGHTS AND BIASES
5.6 TRAINING FUNCTIONS & PARAMETERS
5.6.1 Performance Functions
5.6.2 Train Parameters
5.6.3 Adapt Parameters
5.7 BASIC NEURAL NETWORK EXAMPLE
5.7.1 Manually Set Weights
5.7.2 Training Algorithms
5.8 GRAPHICAL USER INTERFACE
5.8.1 Introduction of GUI
5.8.2 Create A Perceptron Network
5.8.3 Input And Target
5.8.4 Create Network
5.8.5 Train the Perceptron
5.8.6 Export Perceptron Results To Workspace
5.8.7 Clear Network/Data Window
5.8.8 Importing From Command Line
5.8.9 Save A Variable To A File And Load It Later
Chapter No. 6
EXPERIMENTAL WORK
6.1 DATA SET
6.2 METHODOLOGY
6.2.1 Algorithm
Chapter No. 7
RESULTS AND CONCLUSION
7.1 RESULTS
7.2 CONCLUSION
7.3 FUTURE WORK
APPENDICES
Appendix A
Appendix B
BIBLIOGRAPHY
List of Tables and Illustrations
Figure 2.1 A schematic diagram of a single neuron
Figure 2.3 Artificial neuron
Figure 2.4 Feed-forward networks
Figure 2.5 Back propagation
Figure 2.6 Network layers
Figure 3.1 Yield point and yield strength
Figure 4.1 Testing machine
Figure 4.2 Tensile test specimen (round)
Figure 4.3 Tensile test specimen (flat)
Figure 4.4 Load/extension curve for low-carbon steel
Figure 4.5 Proof stress
Figure 4.6 Typical stress/strain curves
Figure 4.7 Effect of grain orientation on material testing
Figure 4.8 Effect of processing on the properties of low-carbon steel
Figure 4.9 Effect of tempering on tensile test
Figure 4.10 Effect of temperature on cold-worked material
Figure 4.11 Typical impact testing machine
Figure 4.12 Impact loading
Figure 4.13 Izod test
Figure 4.14 Charpy test
Figure 4.15 Standard Charpy notches
Figure 4.16 Effect of temperature on toughness
Figure 4.17 Effect of annealing on the toughness of low-carbon steel
Figure 4.18 Effect of tempering on quench-hardened high-carbon steel
Figure 4.19 Brinell hardness tester
Figure 4.20 Brinell hardness principle
Figure 4.21 Work-hardening capacity
Figure 4.22 Micro-Vickers and Vickers hardness testers
Figure 4.23 Rockwell hardness tester
Table 4.1 Rockwell hardness test conditions
Table 4.2 Rockwell superficial hardness test conditions
Figure 4.24 Effect of cold-working on the hardness of various
Figure 4.25 Effect of heating cold-worked 70/30 brass
Figure 4.26 Effect of heating a quench-hardened 0.8% plain carbon
Table 5.1 The XOR problem
Figure 5.1 The logsig activation function
Figure 5.2 The targets and the actual output
Figure 5.3 Network data manager
Figure 5.4 Create new data window
Figure 5.5 Create new network window
Figure 5.6 View network window
Figure 5.7 Main network window
Figure 5.8 Training result window
Figure 5.9 Import/load window
Table 6.1 Data set to be used for training the network
Table 6.2 Data set to be used as unseen data for the network
Figure 6.1 Training of the network with the TRAINLM function and 1 neuron
Figure 6.2 Training of the network with 7 neurons
Figure 6.3 Training of the network with 9 neurons
Figure 6.4 The network
Graph 7.1 Comparison of actual and predicted values with 1 neuron
Graph 7.2 Comparison of actual and predicted values with 7 neurons
Graph 7.3 Comparison of actual and predicted values with 9 neurons
Graph 7.4 Regression line for the predicted and actual values
Chapter No. 1
INTRODUCTION

Neural networks are an emerging and prominent technology in the field of
artificial intelligence. Used in almost every area, including engineering, finance, defence,
and economics, neural networks have proven themselves an excellent prediction and
control tool. The present work is an attempt in the Department of Metallurgy and
Materials Engineering, MUET, Jamshoro, to use these neural networks to predict the
mechanical properties of plain carbon steels, as the mechanical properties are the single
most important factor considered while designing a new composition; this work will also
be useful in creating a reference against which tested mechanical properties can be
compared. In the present work we used the composition of the plain carbon steels to
model the mechanical properties, although many other parameters affect the mechanical
properties of plain carbon steel, such as heat treatment, grain size, cold working, and
environment, which are explained in Chapter 3. Forty samples were used to train the
network, while it was tested on 7 samples. As metallurgy and materials engineers, we
adopted and preferred a theoretical approach; however, the literature reveals that a
mathematical approach has been adopted on similar topics. The main objectives of this
thesis work are (1) to condense neural networks into a comprehensive and simple
tutorial that could be helpful to future students applying neural networks in the field of
engineering materials, and (2) to create an example by implementing this technology for
the prediction of mechanical properties.
During the research we found neural networks to be an intelligent, powerful,
and useful tool for engineering applications, and especially for the prediction of the
mechanical properties of plain carbon steel. All the work was carried out at Mehran
University of Engineering and Technology, Jamshoro. The data were obtained from
Pakistan Steel Mills and the ASTM Materials Handbook on Properties of Metals, Vol. 9.
The work includes:
Chapter 2 highlights a brief but comprehensive theoretical treatment of neural networks,
training laws, the kinds of networks used, and some history.
Chapter 3 provides an introduction to the mechanical properties of plain carbon steel and
the parameters affecting these properties. The parameters are significant because they
can be used to enhance the predictive capabilities of the neural network.
Chapter 4 covers the experimental techniques used to measure the mechanical properties
on a laboratory scale. In this chapter, the relationships between the mechanical properties
are also briefly explained.
Chapter 5 consists of a tutorial on how to use the neural network toolbox in MATLAB.
MATLAB is a very powerful engineering and mathematical language, provided with a
built-in neural network toolbox that is very easy to use.
Chapter 6 describes the experimental work, and Chapter 7 presents the results,
conclusion, and future work.
Chapter No. 2
LITERATURE REVIEW
2.1 NEURAL NETWORK
An artificial neural network can best be defined as a system loosely modeled on the
human brain. The field goes by many names, such as connectionism, parallel distributed
processing, neuro-computing, natural intelligent systems, machine learning algorithms,
and artificial neural networks.
Neural networks have seen an explosion of interest over the last few years, and
are being successfully applied across an extraordinary range of problems, in areas as
diverse as finance, medicine, engineering, geology, and physics. In short, wherever there
is a problem of prediction, classification, or control, neural networks are finding
applications.
A neural network is an attempt to simulate, within specialized hardware or
sophisticated software, multiple layers of simple processing elements called neurons.
Each neuron is linked to certain of its neighbors with varying coefficients of connectivity
that represent the strengths of these connections; learning is accomplished by adjusting
these strengths so that the overall network outputs appropriate results.
2.2 CHARACTERISTICS OF NEURAL NETWORK
2.2.1 Capability of modeling: -
Neural networks are very sophisticated modeling techniques capable of modeling
extremely complex functions; in particular, neural networks are non-linear. For many
years, linear modeling was the commonly used technique in most modeling domains,
since linear models have well-known optimization strategies. Where the linear
approximation was not valid, the models suffered accordingly.
2.2.2 Ease of use: -
Since neural networks learn from examples, the neural network user gathers
representative data first, and then uses training algorithms to automatically learn the
structure of the data. Although the user does need some heuristic knowledge of
how to select and prepare the data, how to select an appropriate neural network, and how
to interpret the results, the level of user knowledge needed to successfully apply neural
networks is much lower than would be the case with traditional non-linear statistical
methods.
2.3 ANALOGY TO THE BRAIN
The most basic components of neural networks are modeled after the structure of
the brain. Some neural network structures do not correspond closely to the brain, and
some do not have a biological counterpart at all. However, neural networks have
a strong similarity to the biological brain, and therefore a great deal of the terminology is
borrowed from neuroscience.
2.4 THE BIOLOGICAL NETWORK
The most basic element of the human brain is a specific type of cell, which
provides us with the abilities to remember, think, and apply previous experiences to our
every action. These cells are known as neurons; each neuron can connect with up to
20,000 other neurons. The power of the brain comes from the number of these basic
components and the multiple connections between them.
All neurons have four basic components: dendrites, soma, axon, and synapses
(Fig 2.1). Basically, a biological neuron receives inputs from other sources, combines
them in some way, performs a generally non-linear operation on the result, and then
outputs the final result. The figure below shows a simplified biological neuron and the
relationship of its four components.
2.4.1 The work mechanism
The brain is principally composed of a very large number (circa 10,000,000,000)
of neurons, massively interconnected (with an average of several thousand interconnections
per neuron, although this varies enormously). Each neuron is a specialized cell, which
can propagate an electrochemical signal. The neuron has a branching input structure (the
dendrites), a cell body, and a branching output structure (the axon). The axons of one cell
connect to the dendrites of another via a synapse. When a neuron is activated, it fires an
electrochemical signal along the axon. This signal crosses the synapses to other neurons,
which may in turn fire. A neuron fires only if the total signal received at the cell body
from the dendrites exceeds a certain level (the firing threshold).
The strength of the signal received by a neuron (and therefore its chances of firing)
critically depends on the efficiency of the synapse. Each synapse actually contains a gap,
with neurotransmitter chemicals poised to transmit a signal across the gap.
Thus, from a very large number of extremely simple processing units (each
performing a weighted sum of its inputs, and then firing a binary signal if the total input
exceeds a certain level) the brain manages to perform extremely complex tasks. Of
course, there is a great deal of complexity in the brain which has not been discussed here,
but it is interesting that artificial Neural Networks can achieve some remarkable results
using a model not much more complex than this.
Figure: 2.1 A schematic diagram of a single neuron
2.5 THE ARTIFICIAL NEURON
The basic unit of neural networks, the artificial neuron, simulates the four basic
functions of natural neurons. Artificial neurons are much simpler than biological neurons;
the figure below shows the basics of an artificial neuron.
Figure: 2.3 Artificial Neuron
Note that the various inputs to the network are represented by the mathematical
symbol x(n). Each of these inputs is multiplied by a connection weight; these weights
are represented by w(n). In the simplest case, these products are simply summed, fed
through a transfer function to generate a result, and then output.
Even though all artificial neural networks are constructed from this basic building block,
networks differ in how these building blocks are arranged and in their details.
2.5.1 The work mechanism
A neuron receives a number of inputs (either from original data, or from the outputs of
other neurons in the network). Each input comes via a connection that has a
strength (or weight); these weights correspond to synaptic efficacy in a biological neuron.
Each neuron also has a single threshold value. The weighted sum of the inputs is formed,
and the threshold subtracted, to compose the activation of the neuron (also known as the
post-synaptic potential, or PSP, of the neuron).
The activation signal is passed through an activation function (also known as a transfer
function) to produce the output of the neuron.
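The mechanism just described can be sketched in a few lines of code. This is an illustrative sketch only (the function names and the sample weights are our own, not taken from any particular toolbox): a weighted sum of the inputs, minus the threshold, is passed through a sigmoid transfer function.

```python
# Sketch of the artificial neuron described above (illustrative names and
# values): form the weighted sum of the inputs, subtract the threshold to
# get the activation, then pass it through a transfer function.
import math

def logsig(a):
    # Log-sigmoid transfer function: squashes the activation into (0, 1).
    return 1.0 / (1.0 + math.exp(-a))

def neuron_output(inputs, weights, threshold):
    # Weighted sum minus threshold = activation (post-synaptic potential).
    activation = sum(x * w for x, w in zip(inputs, weights)) - threshold
    return logsig(activation)

print(neuron_output([1.0, 0.5], [0.8, -0.2], 0.3))
```

Raising the threshold lowers the activation and hence the output, which mirrors a biological neuron needing a stronger total signal before it fires.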
2.6 OUTSTANDING FEATURES OF NEURAL NETWORK
Neural networks perform successfully where other methods do not, recognizing
and matching complicated, vague, or incomplete patterns. Neural networks have been
applied in solving a wide variety of problems.
Prediction: - The most common use for neural networks is to predict what will
most likely happen. There are many areas where prediction can help in setting
priorities.
For example, the emergency room at a hospital can be a hectic place; knowing
who needs the most critical help can enable a more successful operation.
Basically, all organizations must establish priorities, which govern the allocation
of their resources. Neural networks have been used as a mechanism of knowledge
acquisition for expert systems in stock market forecasting, with astonishingly
accurate results. Neural networks have also been used for bankruptcy prediction
for credit card institutions.
Other most common applications of neural networks fall into the following categories:
Classification: - Use input values to determine the classification.
Data association: - Like classification, but also recognizing data that contains
errors. For example, not only identify the characters that were scanned, but also
identify when the scanner is not working properly.
Data conceptualization: - Analyze the inputs so that grouping relationships can
be inferred. For example, extract from a database the names of the customers
most likely to buy a particular product.
Data filtering: - Smooth an input signal. For example, take the noise out of a
telephone signal.
Generally speaking, neural network systems can be applied to interpretation,
prediction, diagnosis, planning, monitoring, debugging, repair, instruction, and control.
Applications of neural networks in materials:
Most neural network applications in materials science and engineering lie
in the category of prediction, modeling, and control. For example, the current work
is the prediction of the mechanical properties of plain carbon steels; Jokhio [2004]
has found applications of neural networks in the field of powder metallurgy.
Iqbal Shah [2002] worked on predicting the tensile properties of austenitic
stainless steels. H. K. D. H. Bhadeshia [1999] describes neural network applications
in controlling welding robots; predicting the solidification cracking of welds,
the strength of steel welds, hot cracking of welds, creep, fatigue
properties, and the fatigue threshold; the martensite start temperature; and, most
importantly, the prediction of continuous cooling transformation (CCT) diagrams.
2.7 MAIN FEATURES OF NEURAL NETWORK
• Artificial neural networks (ANNs) learn by experience rather than by modeling
or programming.
• ANN architectures are distributed, inherently parallel and potentially real time.
• They have the ability to generalize.
• They do not require a prior understanding of the process or phenomenon being
studied.
• They can form arbitrary continuous non-linear mappings.
• They are robust to noisy data.
• VLSI implementation is easy.
2.8 LIMITATIONS
• Tools for analysis and model validation are not well established.
• An intelligent machine can only solve some specific problem for which it is
trained.
• The human brain is very complex and cannot be fully simulated with present
computing power. An artificial neural network does not have the capability of
the human brain.
2.9 ADVANTAGES:
I. Adaptive learning: An ability to learn how to do tasks based on the data given
for training or initial experience.
II. Self – organization: An ANN can create its own organization or
representation of the information it receives during learn time.
III. Real Time operation: ANN computations may be carried out in parallel, and
special hardware devices are being designed and manufactured which take
advantage of this capability.
IV. Fault Tolerance via redundant information coding: Partial destruction of a
network leads to the corresponding degradation of performance. However,
some network capabilities may be retained even with major network damage.
2.10 CLASSIFICATION OF NEURAL NETWORKS
Neural networks can be classified as [Wikipedia]
a) Feed forward b) Back propagation
According to architecture, neural networks can be classified as
a) Single-layer perceptrons b) Multi-layer perceptrons
Other types include
a) Simple recurrent network b) Hopfield network
c) Boltzmann machine d) Support vector machine
e) Committee of machines f) Self-organizing map
2.10.1 Feed forward Networks
Feed forward ANNs allow signals to travel one way only, from input to output.
There is no feedback (loop) i.e. the output of any layer does not affect that same layer.
Feed forward ANNs tend to be straightforward networks that associate inputs with
outputs. They are extensively used in pattern recognition. This type of organization is
also referred to as bottom up or top down.
Figure: 2.4 Feed Forward Networks
2.10.2 The back propagation
The term is an abbreviation for "backwards propagation of errors". Back
propagation still has advantages in some circumstances, and is the easiest algorithm to
understand. There are also heuristic modifications of back propagation which work well
for some problem domains, such as quick propagation.
In back propagation, the gradient vector of the error surface is calculated. This
vector points along the line of steepest descent from the current point, so we know that if
we move along it a “short” distance, we will decrease the error. A sequence of such
moves (slowing as we near the bottom) will eventually find a minimum of some sort. The
difficult part is to decide how large the steps should be.
Large steps may converge more quickly, but may also overstep or (if the error
surface is very eccentric) go off in the wrong direction.
A classic example of this in neural network training is where the algorithm
progresses very slowly along a steep, narrow, valley, bouncing from one side across to
the other. In contrast, very small steps may go in the correct direction, but they also
require a large number of iterations. In practice, the step size is taken proportional to the
slope (so that the algorithm settles down in a minimum) and to a special constant: the
learning rate. The correct setting for the learning rate is application-dependent, and is
typically chosen by experiment; it may also be time-varying, getting smaller as the
algorithm progresses.
The algorithm is also usually modified by the inclusion of a momentum term: this
encourages movement in a fixed direction, so that if several steps are taken in the same
direction, the algorithm “picks up speed”, which gives it the ability to (sometimes) escape
local minima, and also to move rapidly over flat spots and plateaus.
The algorithm therefore progresses iteratively, through a number of epochs. On
each epoch, the training cases are each submitted in turn to the network, target and
actual outputs are compared, and the error is calculated. This error, together with the
error surface gradient, is used to adjust the weights, and then the process repeats. The
initial network configuration is random, and training stops when a given number of
epochs elapses, when the error reaches an acceptable level, or when the error stops
improving (you can select which of these stopping conditions to use).
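The step-size and momentum ideas above can be sketched on a toy error surface. The quadratic error function, the learning rate, and the momentum value below are illustrative choices standing in for a real network's error surface, not the thesis's actual training setup:

```python
# Sketch of gradient descent with a learning rate and a momentum term, as
# described above. The error surface is a toy quadratic, E(w) = (w - 3)^2,
# chosen only so the behaviour is easy to follow.

def error(w):
    return (w - 3.0) ** 2

def gradient(w):
    return 2.0 * (w - 3.0)

def descend(w, learning_rate=0.1, momentum=0.8, epochs=200, goal=1e-8):
    step = 0.0
    for _ in range(epochs):                 # progress through epochs
        if error(w) < goal:                 # error reached an acceptable level
            break
        # Momentum keeps part of the previous step; the rest is a short move
        # down the line of steepest descent, scaled by the learning rate.
        step = momentum * step - learning_rate * gradient(w)
        w += step
    return w

print(descend(0.0))  # approaches the minimum at w = 3
```

With momentum the early steps overshoot and oscillate slightly around the minimum, but the search picks up speed over the flat approach and still settles, which is exactly the trade-off the text describes.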
Figure: 2.5 Back propagation
2.10.3 Single layer perceptron
The earliest kind of neural network is a single-layer perceptron network, which
consists of a single layer of output nodes; the inputs are fed directly to the outputs via a
series of weights. In this way it can be considered the simplest kind of feedforward
network. The sum of the products of the weights and the inputs is calculated in each
node, and if the value is above some threshold (typically 0) the neuron fires and takes the
value 1; otherwise it takes the value -1. Neurons with this kind of activation function are
also called McCulloch-Pitts neurons or threshold neurons. In the literature the term
perceptron often refers to networks consisting of just one of these units.
Perceptrons can be trained by a simple learning algorithm that is usually called
the delta-rule. It calculates the errors between calculated output and sample output data,
and uses this to create an adjustment to the weights, thus implementing a form of
gradient descent.
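As an illustrative sketch of the delta rule (the AND-style data set and the learning rate are our own choices, not from the source), a threshold perceptron can be trained on a small linearly separable problem:

```python
# Sketch of a single-layer perceptron trained with the delta rule, as
# described above. The data set and learning rate are illustrative.

def predict(x, w, b):
    # Threshold (McCulloch-Pitts) neuron: fire +1 above threshold, else -1.
    s = sum(xi * wi for xi, wi in zip(x, w)) + b
    return 1 if s > 0 else -1

def train(samples, lr=0.1, epochs=100):
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in samples:
            # Delta rule: adjust the weights by the error between the
            # sample (target) output and the calculated output.
            err = target - predict(x, w, b)
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

# Linearly separable data: logical AND with +1/-1 outputs.
data = [([0, 0], -1), ([0, 1], -1), ([1, 0], -1), ([1, 1], 1)]
w, b = train(data)
print([predict(x, w, b) for x, _ in data])  # [-1, -1, -1, 1]
```

Because the data are linearly separable, the weight updates stop changing once every sample is classified correctly; on a non-separable problem such as XOR this single-layer network can never converge, which motivates the multi-layer perceptron of the next section.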
2.10.4 Multi-layer perceptron
This class of networks consists of multiple layers of computational units, usually
interconnected in a feedforward way. This means that each neuron in one layer has
directed connections to the neurons of the subsequent layer. In many applications the
units of these networks apply a sigmoid function as an activation function.
The universal approximation theorem for neural networks states that every
continuous function that maps intervals of real numbers to some output interval of real
numbers can be approximated arbitrarily closely by a multi-layer perceptron with just
one hidden layer. This result holds only for restricted classes of activation functions, e.g.
for the sigmoidal functions.
Multi-layer networks use a variety of learning techniques, the most popular being
backpropagation. Here the output values are compared with the correct answer to
compute the value of some predefined error-function. By various techniques the error is
then fed back through the network. Using this information, the algorithm adjusts the
weights of each connection in order to reduce the value of the error-function by some
small amount. After repeating this process for a sufficiently large number of training
cycles the network will usually converge to some state where the error of the
calculations is small. In this case one says that the network has learned a certain target
function. To adjust the weights properly, one applies a general method for nonlinear
optimization tasks called gradient descent: the derivative of the error function
with respect to the network weights is calculated, and the weights are then
changed such that the error decreases (thus going downhill on the surface of the error
function). For this reason backpropagation can only be applied to networks with
differentiable activation functions.
In general the problem of reaching a network that performs well, even on
examples that were not used as training examples, is a quite subtle issue that requires
additional techniques. This is especially important for cases where only very limited
numbers of training examples are available. The danger is that the network overfits the
training data and fails to capture the true statistical process generating the data.
Statistical learning theory is concerned with training classifiers on a limited amount of
data. In the context of neural networks a simple heuristic, called early stopping, often
ensures that the network will generalize well to examples not in the training set.
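The early-stopping heuristic can be sketched as follows. The validation-error sequence and the "patience" parameter below are invented for illustration; in practice the errors come from evaluating the network on a held-out set after each training epoch:

```python
# Sketch of the early-stopping heuristic mentioned above: stop training when
# the error on a held-out validation set stops improving, even though the
# training error may still be falling.

def early_stop_epoch(validation_errors, patience=3):
    # Return the epoch whose weights should be kept.
    best_err, best_epoch, waited = float("inf"), 0, 0
    for epoch, err in enumerate(validation_errors):
        if err < best_err:
            best_err, best_epoch, waited = err, epoch, 0
        else:
            waited += 1                 # no improvement this epoch
            if waited >= patience:      # assume the network is now overfitting
                break
    return best_epoch

# Validation error falls, then rises as the network begins to overfit.
errs = [0.9, 0.5, 0.3, 0.25, 0.27, 0.31, 0.40, 0.50]
print(early_stop_epoch(errs))  # 3
```

The weights saved at the returned epoch are the ones that generalized best to data the network never trained on, which is the point of the technique.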
Other typical problems of the back-propagation algorithm are the speed of
convergence and the possibility of ending up in a local minimum of the error function.
Today there are practical solutions that make backpropagation in multi-layer perceptrons
the solution of choice for many machine learning tasks.
2.10.5 Simple recurrent network
A simple recurrent network (SRN) is a variation on the multi-layer perceptron,
sometimes called an "Elman network" due to its invention by Professor Jeff Elman. A
three-layer network is used, with the addition of a set of "context units" in the input
layer. There are connections from the middle ("hidden") layer to these context units fixed
with weight 1. At each time step, the input is propagated in a standard feedforward
fashion, and then a learning rule (usually backpropagation) is applied. The fixed back
connections result in the context units always maintaining a copy of the previous values
of the hidden units (since they propagate over the connections before the learning rule is
applied). Thus the network can maintain a sort of state, allowing it to perform such tasks
as sequence-prediction that are beyond the power of a standard multi-layer perceptron.
2.10.6 Hopfield network
The Hopfield net is a recurrent neural network in which all connections are
symmetric. This network has the property that its dynamics are guaranteed to converge.
If the connections are trained using Hebbian learning then the Hopfield network can
perform robust content-addressable memory, robust to connection alteration.
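A minimal sketch of Hebbian storage and recall in a Hopfield net follows (the stored pattern and the flipped unit are illustrative choices of ours):

```python
# Sketch of a Hopfield net: symmetric weights built with a Hebbian rule from
# a stored pattern, then a recall loop that repeatedly updates the +1/-1
# units until the state settles on a stored memory.

def hebbian_weights(patterns):
    n = len(patterns[0])
    # w[i][j] = sum of x_i * x_j over stored patterns; symmetric, zero diagonal.
    return [[0 if i == j else sum(p[i] * p[j] for p in patterns)
             for j in range(n)] for i in range(n)]

def recall(w, state, sweeps=10):
    state = list(state)
    for _ in range(sweeps):
        for i in range(len(state)):
            # Each unit takes the sign of its weighted input from the others.
            s = sum(w[i][j] * state[j] for j in range(len(state)))
            state[i] = 1 if s >= 0 else -1
    return state

stored = [1, -1, 1, -1, 1, -1]
w = hebbian_weights([stored])
noisy = [1, -1, -1, -1, 1, -1]   # one unit flipped
print(recall(w, noisy))          # recovers the stored pattern
```

Starting from a corrupted state and still settling on the stored pattern is the content-addressable-memory behaviour the paragraph describes.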
2.10.7 Boltzmann machine
The Boltzmann machine can be thought of as a noisy Hopfield network. The
Boltzmann machine was important because it was one of the first neural networks in
which learning of latent variables (hidden units) was demonstrated. Boltzmann machine
learning was slow to simulate, but the Contrastive Divergence algorithm of Geoff Hinton
allows models including Boltzmann machines and Product of Experts to be trained much
faster.
2.10.8 Committee of machines
A committee of machines (CoM) is a collection of different neural networks that
together vote on a given example. This has been seen to give much better results. In
fact, in many cases, starting with the same architecture and training data but different
initial random weights gives vastly different networks. A CoM tends to stabilize the result.
2.10.9 Self-organizing map
The Self-organizing map (SOM), sometimes referred to as "Kohonen map", is an
unsupervised learning technique that reduces the dimensionality of data through the use
of a self-organizing neural network. A probabilistic version of SOM is the Generative
Topographic Map (GTM).
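A minimal SOM sketch follows (a 1-D map of 2-D points keeps it short; the data, node count, and learning rate are illustrative, and real SOMs usually shrink the neighbourhood radius and learning rate over time):

```python
# Sketch of a self-organizing map: each input is matched to its closest map
# node (the best matching unit, BMU), and that node and its neighbours are
# pulled toward the input, so nearby nodes come to represent nearby data.

def bmu(nodes, x):
    # Index of the node closest (squared Euclidean distance) to input x.
    return min(range(len(nodes)),
               key=lambda i: sum((nodes[i][d] - x[d]) ** 2 for d in range(len(x))))

def train_som(data, n_nodes=5, epochs=20, lr=0.3, radius=1):
    # Nodes start spread along the diagonal of the unit square.
    nodes = [[i / (n_nodes - 1), i / (n_nodes - 1)] for i in range(n_nodes)]
    for _ in range(epochs):
        for x in data:
            win = bmu(nodes, x)
            # Pull the winner and its neighbours (within `radius` on the
            # 1-D map) toward the input.
            for i in range(max(0, win - radius), min(n_nodes, win + radius + 1)):
                for d in range(len(x)):
                    nodes[i][d] += lr * (x[d] - nodes[i][d])
    return nodes

# Two well-separated clusters end up represented by different map nodes.
data = [[0.1, 0.1], [0.15, 0.15], [0.85, 0.85], [0.9, 0.9]]
nodes = train_som(data)
print(bmu(nodes, [0.1, 0.1]), bmu(nodes, [0.9, 0.9]))
```

No target outputs are supplied anywhere, which is what makes the SOM an unsupervised, dimensionality-reducing technique.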
2.11 DESIGN
The developer must go through a period of trial and error in the design
decisions before coming up with a satisfactory design. The design issues in neural
networks are complex and are the major concerns of system developers.
Designing a neural network consist of:
• Arranging neurons in various layers.
• Deciding the type of connections among neurons for different
layers, as well as among the neurons within a layer.
• Deciding the way a neuron receives input and produces output.
• Determining the strength of the connections within the network by
allowing the network to learn the appropriate values of the
connection weights from a training data set.
The process of designing a neural network is an iterative process. Below are its
basic steps.
2.11.1 Layers
Biologically, neural networks are constructed in a three-dimensional way
from microscopic components, and these neurons seem capable of nearly unrestricted
interconnection. This is not true of any man-made network. Artificial neural
networks are simple clusterings of primitive artificial neurons. This
clustering occurs by creating layers, which are then connected to one another.
How these layers connect may also vary. Basically, all artificial neural networks
have a similar topology. Some of the neurons interface with the real world
to receive its inputs, and other neurons provide the real world with the network's
outputs. All the rest of the neurons are hidden from view.
Figure: 2.6 network layers
As the figure above shows, the neurons are grouped into layers. The input
layer consists of neurons that receive input from the external environment. The
output layer consists of neurons that communicate the output of the system to the
user or external environment. There are usually a number of hidden layers
between these two; the figure above shows a simple structure with only one
hidden layer.
When the input layer receives the input, its neurons produce output, which
becomes input to the other layers of the system. The process continues until a
certain condition is satisfied or until the output layer fires its
output to the external environment.
To determine the number of hidden neurons the network needs to
perform at its best, one is often left with trial and error. Increasing the
number of hidden neurons too far leads to overfitting: the network
memorizes the training data and has trouble generalizing, making it
useless on new data sets.
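The input–hidden–output flow described above can be sketched as a simple forward pass. The layer sizes, weights and input values here are hypothetical, chosen only to illustrate the structure.

```python
import numpy as np

# Layered network sketch: 3 input neurons, one hidden layer of 4 neurons,
# 1 output neuron, fully connected between consecutive layers.

rng = np.random.default_rng(0)
W_hidden = rng.normal(size=(3, 4))   # input -> hidden connection weights
W_output = rng.normal(size=(4, 1))   # hidden -> output connection weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x):
    """Each layer's neurons sum their weighted inputs and pass the result
    through an activation function before handing it to the next layer."""
    hidden = sigmoid(x @ W_hidden)   # input layer feeds the hidden layer
    return sigmoid(hidden @ W_output)  # hidden layer feeds the output layer

out = forward(np.array([0.2, -0.5, 0.9]))
print(out)   # a single output value between 0 and 1
```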
2.11.2 Communication and types of connections
Neurons are connected via a network of paths carrying the output of one
neuron as input to another neuron. These paths are normally unidirectional; there
might, however, be a two-way connection between two neurons, because there may
be another path in the reverse direction. A neuron receives input from many neurons
but produces a single output, which is communicated to other neurons.
The neurons in a layer may communicate with each other, or they may
have no connections. The neurons of one layer are always connected to the
neurons of at least one other layer.
2.11.2.1 Inter-layer connections
There are different types of connections used between layers; these
connections between layers are called inter-layer connections.
• Fully connected
Each neuron on the first layer is connected to every neuron
on the second layer.
• Partially connected
A neuron of the first layer does not have to be connected to
all neurons on the second layer.
• Feed forward
The neurons on the first layer send their output to the
neurons on the second layer, but they do not receive any input back
from the neurons on the second layer.
• Bi-directional
There is another set of connections carrying the output of
the neurons of the second layer into the neurons of the first layer.
Feed forward and bi-directional connections could be fully- or partially
connected.
• Hierarchical
If a neural network has a hierarchical structure, the neurons of a lower
layer may only communicate with neurons on the next level of layer.
• Resonance
The layers have bi-directional connections, and they can continue
sending messages across the connections a number of times until a
certain condition is achieved.
2.11.2.2 Intra-layer connections
In more complex structures the neurons communicate among themselves
within a layer, this is known as intra-layer connections. There are two types of
intra-layer connections.
• Recurrent
The neurons within a layer are fully- or partially connected to one
another. After these neurons receive input from another layer, they
communicate their outputs with one another a number of times before they
are allowed to send their outputs to another layer. Generally some
conditions among the neurons of the layer should be achieved before they
communicate their outputs to another layer.
• On-center/off-surround
A neuron within a layer has excitatory connections to itself and its
immediate neighbors, and inhibitory connections to the other neurons.
One can imagine this type of connection as a competitive gang of
neurons. Each gang excites itself and its gang members and inhibits all
members of other gangs. After a few rounds of signal interchange, the
neuron with the most active output value wins, and is allowed to update
its weights and those of its gang members. (There are two types of
connections between two neurons, excitatory and inhibitory. In an
excitatory connection, the output of one neuron increases the action
potential of the neuron to which it is connected. When the connection
between two neurons is inhibitory, the output of the sending neuron
reduces the activity or action potential of the receiving neuron. One
causes the summing mechanism of the next neuron to add while the other
causes it to subtract; one excites while the other inhibits.)
2.11.3 Learning
The brain basically learns from experience. Neural networks are
sometimes called machine-learning algorithms, because adjusting their
connection weights (training) causes the network to learn the solution to a
problem. The strength of the connection between two neurons is stored as a
weight value for that specific connection, and the system learns new knowledge
by adjusting these connection weights.
The learning ability of a neural network is determined by its architecture
and by the algorithmic method chosen for training.
The training method usually consists of one of three schemes:
1. Unsupervised learning
The hidden neurons must find a way to organize themselves without
help from the outside. In this approach, no sample outputs are
provided to the network against which it can measure its predictive
performance for a given vector of inputs. This is learning by doing.
2. Reinforcement learning
This method works on reinforcement from the outside. The
connections among the neurons in the hidden layer are randomly
arranged, then reshuffled as the network is told how close it is to
solving the problem. Reinforcement learning is also called supervised
learning, because it requires a teacher. The teacher may be a training
set of data or an observer who grades the performance of the network
results.
Both unsupervised and reinforcement learning suffer from relative slowness
and inefficiency, relying on random shuffling to find the proper
connection weights.
3. Back propagation
This method has proven highly successful in training multilayered
neural nets. The network is not just given reinforcement for how it is
doing on a task; information about errors is also filtered back through
the system and used to adjust the connections between the layers,
thus improving performance. It is a form of supervised learning.
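The error-filtering idea described above can be sketched compactly. This is an illustrative implementation, not the thesis's own code; the XOR task, layer sizes and learning rate are hypothetical choices.

```python
import numpy as np

# Back-propagation sketch: the output error is propagated back through
# the hidden layer and used to adjust both weight matrices.

rng = np.random.default_rng(42)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets

W1 = rng.normal(size=(2, 4))   # input -> hidden weights
W2 = rng.normal(size=(4, 1))   # hidden -> output weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
errors = []
for epoch in range(5000):
    H = sigmoid(X @ W1)                 # forward pass
    Y = sigmoid(H @ W2)
    E = T - Y
    errors.append(float(np.mean(E ** 2)))
    # backward pass: filter the error back one layer at a time
    dY = E * Y * (1 - Y)
    dH = (dY @ W2.T) * H * (1 - H)
    W2 += lr * H.T @ dY
    W1 += lr * X.T @ dH

print(errors[0], errors[-1])   # mean squared error falls during training
```

The falling error illustrates how feeding the error back through the layers steadily improves performance on the task.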
2.11.3.1 Off-line or On-line
One can categorize the learning methods into yet another group, off-line or
on-line. When the system uses input data to change its weights to learn the
domain knowledge, the system could be in training mode or learning mode. When
the system is being used as a decision aid to make recommendations, it is in the
operation mode; this is also sometimes called recall.
• Off-line
In the off-line learning methods, once the system enters the
operation mode, its weights are fixed and do not change any more.
Most of the networks are of the off-line learning type.
• On-line
In on-line or real time learning, when the system is in operating
mode (recall), it continues to learn while being used as a decision
tool. This type of learning has a more complex design structure.
2.11.3.2 Learning laws
There are a variety of learning laws in common use. These laws
are mathematical algorithms used to update the connection weights. Most of them
are variations of the best-known and oldest learning law,
Hebb's Rule. Our understanding of how neural processing actually works is
very limited, and learning is certainly more complex than the simplification
represented by the learning laws developed so far. Research into different
learning functions continues as new ideas routinely show up in trade publications.
A few of the major laws are given as examples below.
• Hebb’s Rule
The first and the best-known learning rule was introduced by Donald
Hebb. This basic rule is: If a neuron receives an input from another
neuron, and if both are highly active (mathematically have the same
sign), the weight between the neurons should be strengthened.
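Hebb's rule as stated above can be written in one line. The variable names and the learning rate here are hypothetical, for illustration only.

```python
# Hebb's rule sketch: when the activities of the connected neurons share
# a sign, the weight grows; when they oppose, it shrinks.

def hebb_update(w, pre, post, lr=0.1):
    """Change w in proportion to the product of the two activities."""
    return w + lr * pre * post

w = 0.0
w = hebb_update(w, pre=1.0, post=1.0)    # both active: weight strengthened
w = hebb_update(w, pre=-1.0, post=1.0)   # opposite signs: weight weakened
print(w)   # the two updates cancel: 0.0
```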
• Hopfield Law
This law is similar to Hebb’s Rule with the exception that it specifies
the magnitude of the strengthening or weakening. It states, "if the
desired output and the input are both active or both inactive, increment
the connection weight by the learning rate, otherwise decrement the
weight by the learning rate." (Most learning functions have some
provision for a learning rate, or a learning constant. Usually this term
is positive and between zero and one.)
• The Delta Rule
The Delta Rule is a further variation of Hebb’s Rule, and it is one of
the most commonly used. This rule is based on the idea of
continuously modifying the strengths of the input connections to
reduce the difference (the delta) between the desired output value and
the actual output of a neuron. This rule changes the connection weights
in the way that minimizes the mean squared error of the network. The
error is back propagated into previous layers one layer at a time. The
process of back-propagating the network errors continues until the first
layer is reached. The network type called Feed forward, Back-
propagation derives its name from this method of computing the error
term.
This rule is also referred to as the Widrow-Hoff Learning Rule and
the Least Mean Square (LMS) Learning Rule.
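The Delta (Widrow-Hoff / LMS) update described above can be sketched as follows. The example mapping and learning parameters are hypothetical.

```python
import numpy as np

# Delta-rule sketch: the weight change is proportional to the difference
# (the delta) between the desired output and the actual output.

def delta_rule_fit(X, t, lr=0.1, epochs=100):
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xi, ti in zip(X, t):
            y = xi @ w                  # actual output of the neuron
            w += lr * (ti - y) * xi     # reduce the delta (t - y)
    return w

# Learn a known linear mapping t = 3*x1 - 2*x2 (hypothetical example data).
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (40, 2))
t = X @ np.array([3.0, -2.0])
w = delta_rule_fit(X, t)
print(w.round(3))   # approaches [3, -2]
```

Repeatedly shrinking the delta drives the mean squared error of the neuron toward its minimum, which is exactly the behaviour the rule is named for.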
• Kohonen’s Learning Law
This procedure, developed by Teuvo Kohonen, was inspired by
learning in biological systems. In this procedure, the neurons compete
for the opportunity to learn, or to update their weights. The processing
neuron with the largest output is declared the winner and has the
capability of inhibiting its competitors as well as exciting its
neighbors. Only the winner is permitted output, and only the winner
plus its neighbors are allowed to update their connection weights.
The Kohonen rule does not require a desired output; therefore it is
implemented in unsupervised methods of learning. Kohonen has
used this rule, combined with the on-center/off-surround intra-layer
connection, to create the self-organizing neural network, which has an
unsupervised learning method.
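A winner-take-all step of Kohonen's rule can be sketched as below. The names and numbers are illustrative; a full SOM would also update the winner's neighbours.

```python
import numpy as np

# Kohonen-rule sketch: the neuron whose weight vector best matches the
# input wins the competition, and only the winner updates its weights.

def kohonen_step(weights, x, lr=0.2):
    dists = np.linalg.norm(weights - x, axis=1)
    winner = int(np.argmin(dists))                  # competition for the input
    weights[winner] += lr * (x - weights[winner])   # only the winner learns
    return winner

weights = np.array([[0.0, 0.0], [1.0, 1.0]])
winner = kohonen_step(weights, np.array([0.9, 0.8]))
print(winner, weights[winner])   # node 1 wins and moves toward the input
```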
Chapter No. 3
MECHANICAL PROPERTIES OF STEEL AND FACTORS
AFFECTING MECHANICAL PROPERTIES
The mechanical properties of materials describe their ability to withstand
internal or external physico-mechanical forces such as pulling, pushing, twisting,
bending and sudden impact. In general terms, these properties are various kinds of
strength.
These properties are measured by means of destructive testing of materials in
the laboratory; however, it is very difficult to reproduce actual service conditions
in the laboratory. The following material properties are of great importance.
3.1.1 Tensile Strength
The ratio of the maximum load to the original cross-sectional area is called the
tensile strength or ultimate tensile strength. It relates to the ability of a material
to withstand external mechanical forces such as pulling, pushing, twisting, bending
and sudden impact. The tensile or ultimate strength is the maximum point shown on
the stress-strain curve (Fig. 3.1c).
The tensile strength value is commonly taken as a basis for fixing the working
stresses, especially in brittle materials. The units of tensile strength are kg/cm².
(a) low-carbon steel  (b) non-ferrous metals  (c) stress-strain curve showing types of static strength.
Figure 3.1 Yield point and yield strength
3.1.2 Yield Strength
When metals are subjected to a tensile force, they stretch or elongate as the stress
increases. The point where the stretch suddenly increases is known as the yield strength
of the material. The yield strength represents the stress below which the
deformation is almost entirely elastic; it is the value of stress at which a material
exhibits a specified deviation from proportionality of stress and strain.
It can also be defined as the ability of a material to resist plastic deformation,
and is calculated by dividing the force initiating yield by the original cross-sectional
area of the specimen.
In materials where the proportional limit or the elastic limit (Fig. 3.1b) is less
obvious, it is common to define the yield load as the force required to give a 0.2%
plastic offset. In other words, the yield strength is defined as the stress required to
produce an arbitrary permanent deformation. The deformation most often used is 0.2%
(Fig. 3.1), commonly referred to as the proof strain.
3.1.3 Elasticity
Loading a solid will change its dimensions, but the resulting deformation will
disappear upon unloading. This tendency of a deformed solid to seek its original
dimensions upon unloading is ascribed to a property called elasticity.
The recovery from the distorting effects of the loads may be instantaneous or
gradual, complete or partial. A solid is called perfectly elastic if this recovery is
instantaneous and complete; it is said to exhibit delayed elasticity or inelastic effects,
respectively, if the recovery is gradual or incomplete. Accurate measurements reveal
some delayed elasticity and inelastic effects in all solids.
3.1.4 Plasticity
Plasticity is that property of a material by virtue of which it may be permanently
deformed when subjected to an externally applied force great enough to exceed the
elastic limit. It is of great importance to the fabrication engineer because it is the
property that enables a material to be shaped in the solid state.
For most materials, plastic deformation follows elastic deformation.
Referring to the stress-strain curve (Fig. 3.1), a material obeys the law of elastic
solids for stresses below the yield stress, and this is followed by plastic deformation.
The mechanism of plastic deformation is essentially different in crystalline
materials and amorphous materials. Crystalline materials undergo plastic deformation as
the result of slip along definite crystallographic planes, whereas in amorphous materials
plastic deformation occurs when individual molecules or groups of molecules slide past
one another.
3.1.5 Ductility
Ductility refers to the capability of a material to undergo deformation under
tension without rupture. It is the ability of a material to be drawn from a large section
to a small section, as in wire drawing.
Ductility may be expressed as percent elongation (%EL) or percent area reduction
(%AR). From a tensile test:
Percent elongation: %EL = (lf − lo) × 100 / lo
Percent area reduction: %AR = (Ao − Af) × 100 / Ao
where
lf is the fracture length,
lo is the original gauge length,
Ao is the original cross-sectional area, and
Af is the cross-sectional area at the point of fracture.
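The two ductility measures can be computed directly from tensile-test readings. The numbers below are hypothetical, for illustration only.

```python
# Ductility measures from a tensile test, as defined above.

def percent_elongation(l_o, l_f):
    """%EL = (lf - lo) * 100 / lo"""
    return (l_f - l_o) * 100.0 / l_o

def percent_area_reduction(a_o, a_f):
    """%AR = (Ao - Af) * 100 / Ao"""
    return (a_o - a_f) * 100.0 / a_o

# A 50 mm gauge length stretching to 60 mm at fracture:
print(percent_elongation(50.0, 60.0))       # 20 %
# Cross-section shrinking from 78.5 mm^2 to 62.8 mm^2:
print(percent_area_reduction(78.5, 62.8))   # about 20 %
```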
Ductility is a measure of the degree of plastic deformation that has been sustained
at fracture. Knowledge of the ductility of materials is important for at least two reasons:
• First it indicates to a designer the degree to which a structure will deform
plastically before fracture.
• Second, it specifies the degree of allowable deformation during fabrication
operation.
3.1.6 Brittleness
Brittleness is defined as a tendency to fracture without appreciable deformation
and is therefore the opposite of ductility or malleability. A brittle material will fracture
with little permanent deformation/distortion; it is a sudden failure. A brittle material is
hard and has little ductility. It will not stretch or bend before breaking. Cast iron is an
example of a brittle material.
If a material can be mechanically worked to a different size or shape without
breaking or shattering, it is ductile or malleable; but if little or no change in the
dimensions can be made before fracture occurs, it is brittle.
Technically speaking, if the elongation is less than 5% in a 50 mm gauge length,
the material is regarded as brittle. Brittle fractures normally follow the grain
boundaries (intergranular or intercrystalline), whereas ductile fractures normally
occur through the grains (transgranular or transcrystalline).
3.1.7 Toughness
Toughness is the ability of a material to absorb energy during plastic
deformation up to fracture. It refers to the ability of a material to withstand bending
or the application of shear stresses without fracture. By this definition, copper is
extremely tough but cast iron is not.
Specimen geometry as well as the manner of load application is important in
toughness determinations. For dynamic loading conditions and when a notch (or point of
stress concentration) is present, notch toughness is assessed by using an impact test.
Furthermore, fracture toughness is a property indicative of a material’s resistance to
fracture when a crack is present.
For the static situation, toughness may be ascertained from the results of a tensile
stress-strain test. Toughness of a material, then, is indicated by the total area under the
material’s tensile stress-strain curve up to the point of fracture.
3.1.8 Hardness
Hardness is the resistance of a material to penetration or scratching. The term
may also refer to stiffness or temper, or to resistance to abrasion or cutting. Tests
such as Brinell, Rockwell and Vickers are generally employed to measure hardness.
The hardness of materials depends upon the type of bonding forces between
atoms, ions or molecules and increases, like strength, with the magnitude of these forces.
Thus molecular solids such as plastics are relatively soft, metallic and ionic solids are
harder than molecular solids, and covalent solids are the hardest materials known.
3.1.9 Fatigue
When subjected to fluctuating or repeated loads (or stresses), materials tend to
develop a characteristic behavior which is different from that under
steady load. Fatigue is the phenomenon that leads to fracture under such conditions.
Fracture takes place under repeated or fluctuating stresses whose maximum value is less
than the tensile strength of the material (under steady loads). Fatigue fracture is
progressive, beginning as minute cracks that grow under the action of fluctuating stress.
The term fatigue is used because this type of failure normally occurs after a
lengthy period of repeated stress or strain cycling.
Fatigue is important inasmuch as it is the single largest cause of failure in metals
(bridges, aircraft and machine components), estimated to account for approximately 90%
of all metallic failures; it is catastrophic and insidious, occurring very suddenly and
without warning.
Fatigue failure is brittle-like in nature even in normally ductile metals, in that
there is very little, if any, gross plastic deformation associated with failure. The process
occurs by the initiation and propagation of cracks, and ordinarily the fracture surface is
perpendicular to the direction of an applied tensile stress.
3.1.10 Creep
Creep is the time-dependent permanent deformation that occurs under stress; for
most materials, it is important only at elevated temperatures. Materials are often placed in
service at elevated temperatures and exposed to static mechanical stresses (e.g., turbine
rotors in jet engines and steam generators that experience centrifugal stresses and high-
pressure steam lines). Deformation under such circumstances is termed creep.
3.2 FACTORS AFFECTING MECHANICAL PROPERTIES
Mechanical properties of materials are affected by:
1. Alloy contents such as addition of W, Cr, etc. improve hardness and strength.
2. Grain size and microstructure.
3. Crystal imperfections such as dislocations.
4. Manufacturing defects such as cracks, blowholes etc.
5. Physio-mechanical treatments.
3.2.1 Effect Of Grain Size On Properties Of Metals
On the basis of grain size, materials may be classified as:
1. Coarse-grained materials, (the grain size is large).
2. Fine-grained materials, (the grain size is small).
Grain size is very important in deciding the properties of polycrystalline materials
because it affects the area and length of the grain boundaries.
Various effects of grain size on mechanical properties of metals are:
1. Fine-grained materials possess higher strength, toughness, hardness and
resistance to suddenly applied force.
2. Fine-grained materials possess better fatigue resistance, and impact strength.
3. Fine-grained materials are more crack-resistant and provide a better finish in
deep drawing, unlike coarse-grained ones, which give rise to the orange-peel effect.
4. Fine-grained steel develops hardness faster in carburising (heat treatment).
5. Fine-grained materials are preferred for structural applications.
6. Fine-grained materials generally exhibit greater yield stresses than coarse-
grained materials at low temperature, whereas at high temperatures grain
boundaries become weak and sliding occurs.
7. A coarse-grained material is responsible for surface roughness.
8. A coarse grained material possesses more ductility, malleability (forging,
rolling, etc.) and better machinability.
9. Coarse-grained metals are difficult to polish or plate (a rough surface
remains visible even after polishing).
10. Coarse-grained steels have greater depth of hardening power as compared to
fine-grained ones.
11. At elevated temperatures, coarse-grained materials show better creep strength
than the fine-grained ones.
3.2.2 Effect Of Heat Treatment On Properties Of Metals
Heat treatment is an operation or combination of operations involving heating and
cooling of a metal or alloy in the solid state to obtain a desirable behavior or set of
properties. It affects the grain size and shape in the metal, and a change in
microstructure may or may not take place. By controlling the grain size and type of
microstructure, the desired mechanical properties can be achieved through heat
treatment.
Some important heat-treatment processes are:
Annealing Normalizing
Hardening Tempering
Martempering Austempering etc.
One or another of these heat-treatment processes produces the following effects on
the properties of metals:
1. Hardens and strengthens the metals.
2. Improves machinability.
3. Changes or refines grain size.
4. Softens metals for further working as in wire drawing.
5. Improves ductility and toughness.
6. Increases resistance of materials to heat, wear, shock and corrosion.
7. Improves electrical and magnetic properties.
8. Homogenises the metal structure.
9. Relieves internal stresses developed in metals / alloys during cold working,
welding, casting, forging etc.
10. Produces a hard wear resistant surface on a ductile steel piece (as in case
hardening).
11. Improves thermal properties such as conductivity.
3.2.3 Effect of environmental variables
Gaseous environment: The atmosphere contains mainly nitrogen and oxygen, to
which are added gaseous products such as sulphur dioxide, hydrogen sulphide, moisture,
chlorine and fluorine as industrial and other pollutants.
On account of oxygen, an oxide film forms on metals. In the presence of humid
air, an oxide film (rust) forms on the surface of mild steel, which is not desirable.
Liquid environment: When exposed to moist (and saline) atmosphere, the metals
may corrode. Corrosion is a gradual chemical attack on a metal under the influence of a
moist atmosphere, (or of a natural or artificial solution).
Working temperature: When exposed to very cold atmosphere, even ductile
metals may behave like brittle metals. Water pipes in very cold countries normally burst
and this is the effect of atmospheric exposure.
When the metals are subjected to a very hot atmosphere there is
1. Accelerated oxidation and / or corrosion.
2. Creep.
3. Grain boundary weakening.
4. Allotropic and other phase changes.
5. Change of conventional properties.
6. Reduction in tensile strength and yield point.
3.2.4 Effect of alloying elements
Carbon – With an increase in the amount of carbon, the hardness and tensile strength of
the steel also increase (though the gain slows as the carbon level rises). An increase
in carbon also causes a decrease in both ductility and weldability.
Manganese – will also increase hardness as levels increase, but not to the same degree as
carbon. Ductility and weldability are decreased but, again, to a lesser degree than caused
by carbon.
Phosphorus – Benefits machinability and resistance to atmospheric corrosion. It
increases strength and hardness, much akin to carbon, but it decreases ductility and
impact strength (toughness). Phosphorus is often considered an impurity except in
specific situations.
Sulphur – Like phosphorus, sulphur is generally undesired, except where machinability
is an important goal for the steel. Ductility, impact strength or toughness, weldability, and
surface quality are all adversely affected by sulphur content.
Silicon – Serves as a principal deoxidizer in steel. Its content in the steel is dependent
upon the steel type. Killed steel has the highest percentage of silicon, upwards of 0.60
percent.
Copper – The sole purpose of copper is to increase resistance to atmospheric corrosion.
It does not significantly affect mechanical properties, but it causes brittleness in
the steel at high temperatures, thereby degrading surface quality.
Chromium (Cr) – Increases the steel's hardenability and corrosion resistance, and
provides wear and abrasion resistance in the presence of carbon. It is largely present
in stainless steels, usually ranging from 12 to 20%.
Molybdenum (Mo) – Its use as an alloying element in steel increases hardenability.
Nickel (Ni) – One of the most widely used alloying elements in steel. In amounts of
0.50% to 5.00%, its use in alloy steels increases toughness and tensile strength
without detrimental effect on ductility. Nickel also increases hardenability. In larger
quantities, 8.00% and upwards, nickel is a constituent, together with chromium, of many
corrosion-resistant and stainless austenitic steels.
Titanium (Ti) – Small amounts added to steel contribute to its soundness and give a
finer grain size. Titanium carbide is also used with tungsten carbide in the manufacture
of hard metal tools.
Tungsten (W) – When used as an alloying element it increases the strength of steel at
normal and elevated temperatures. Its "red hardness" makes it suitable for cutting
tools, as it enables the tool edge to be maintained at high temperatures.
Vanadium (V) - Steels containing vanadium have a much finer grain structure
than steels of similar composition without vanadium. It raises the temperature at
which grain coarsening sets in and increases hardenability where it is in solution
in the austenite prior to quenching. It also lessens softening on tempering and
confers secondary hardness on high-speed steels.
In the present study we have touched only on the effect of composition
on the mechanical properties.
Chapter No. 4
TESTING TECHNIQUES
4.1 TENSILE TEST
Strength is defined as the ability of a material to resist applied forces without
yielding or fracturing. By convention, strength usually denotes the resistance of a
material to a tensile load applied axially to a specimen; this is the principle of the
tensile test. Figure 4.1 shows a sophisticated machine suitable for industrial and
research laboratories. This machine is capable of performing compression, shear and
bending tests as well as tensile tests. It applies a carefully controlled tensile load
to a standard specimen and measures the corresponding extension of that specimen.
Figure 4.1 Testing machine
Figures 4.2 and 4.3 show some standard specimens and the direction of the
applied load. These specimens are based upon British Standard BS 18. For the test
results to be consistent for any given material, it is most important that the standard
dimensions and profiles are adhered to. The shoulder radii are particularly critical,
and small variations, or the presence of tooling marks, can cause considerable
differences in the test data obtained. Flat specimens are usually machined only on
their edges, so that the plate or sheet surface finish, and any structural deformation
at the surface caused by the rolling process, are taken into account in the test
results.
The gauge length is the length over which the elongation of the specimen is
measured. The minimum parallel length is the minimum length over which the specimen
must maintain a constant cross-sectional area before the test load is applied. The
lengths Lo, Lc, L1 and the cross-sectional area (a) are all specified in BS 18.
Cylindrical test specimens are proportioned so that the gauge length Lo and the
cross-sectional area a maintain a constant relationship; hence such specimens are
called proportional test pieces. The relationship is given by the expression:
Lo = 5.65 √a
Since a = 0.25 π d², √a = 0.886 d
Thus Lo = 5.65 × 0.886 d = 5.01 d ≈ 5 d
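The proportional relationship can be expressed as a small helper. This is an illustrative sketch assuming the standard proportionality constant 5.65 for Lo = 5.65 √a; the function name and the 5 mm example are hypothetical.

```python
import math

# Gauge length of a proportional cylindrical test piece: Lo = 5.65 * sqrt(a),
# where a = 0.25 * pi * d^2 is the cross-sectional area (dimensions in mm).

def gauge_length(diameter_mm):
    area = 0.25 * math.pi * diameter_mm ** 2
    return 5.65 * math.sqrt(area)

print(gauge_length(5.0))   # about 25 mm for a 5 mm diameter specimen
```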
Therefore a specimen 5 mm in diameter will have a gauge length of 25 mm. The
elongation obtained for a given force depends upon the length and cross-sectional
area of the specimen or component, since:
elongation = (force × L) / (E × a)
where
L = length
a = cross-sectional area
E = elastic modulus
Therefore, if L/a is kept constant (as it is in a proportional test piece), and E
remains constant for a given material, then comparisons can be made between elongation
and applied force for specimens of different sizes.
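The elastic-elongation relationship above can be evaluated numerically. The load, dimensions and modulus below are hypothetical values chosen only to show consistent units.

```python
# Elastic extension of a specimen: extension = F * L / (E * a).
# Using N, mm, mm^2 and MPa keeps the result in mm.

def elastic_extension(force_n, length_mm, area_mm2, e_mpa):
    return force_n * length_mm / (e_mpa * area_mm2)

# A 10 kN load on a 25 mm gauge length, 19.6 mm^2 steel specimen
# (E taken as roughly 200 GPa = 200 000 MPa):
ext = elastic_extension(10_000, 25.0, 19.6, 200_000)
print(ext)   # extension in mm, about 0.064 mm
```

Because extension scales with L/a, two proportional test pieces of different sizes under the same force give directly comparable elongations, as the paragraph above notes.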
Figure 4.2 Tensile test specimen ( round )
Figure 4.3 Tensile test specimen ( flat )
4.1.1 Tensile test results
The load applied to the specimen and the corresponding extension are plotted in
the form of a graph, as shown in Fig. 4.4.
(a) From a to b the extension is proportional to the applied load. Also, if the applied
load is removed the specimen returns to its original length. Under these relatively
lightly loaded conditions the material is showing elastic properties.
(b) From b to c the metal suddenly extends with no increase in load. If the load
is removed at this point the metal will not spring back to its original length,
and it is said to have taken a permanent set. This is the yield point, and the
yield stress (the stress at the yield point) is the load at b divided by the
original cross-sectional area of the specimen. Usually the designer works at
about 50 percent of this figure to allow for a factor of safety.
(c) From c to d extension is no longer proportional to the load and if the load is
removed little or no spring back will occur. Under these relatively greater loads
the material is showing plastic properties.
(d) The point d is referred to as the 'ultimate tensile strength' when referring to
load/extension graphs or the 'ultimate tensile stress' (UTS) when referring to
stress/strain graphs. The ultimate tensile stress is calculated by dividing the load
at d by the original cross-sectional area of the specimen. Although a useful figure
for comparing the relative strengths of materials, it has little practical value since
engineering equipment is not usually operated so near to the breaking point.
(e) From d to e the specimen appears to be stretching under reduced load conditions.
In fact the specimen is thinning out (necking) so that the load per unit area, or
stress, is actually increasing. The specimen finally work hardens to such an extent
that it breaks at e. In practice, values of load and extension are of limited use since
they apply only to one particular size of specimen, and it is more usual to plot the
stress/strain curve. (An example of a stress/strain curve for a low-carbon steel is
shown in fig. 4.6(a).) Stress and strain are calculated as follows:

Stress = load / cross-sectional area
Strain = extension / original length
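These two definitions can be sketched directly; the load, area, extension and gauge length below are assumed, illustrative test figures:

```python
def stress(load_n, area_m2):
    """Stress = load / cross-sectional area (Pa)."""
    return load_n / area_m2

def strain(extension_m, original_length_m):
    """Strain = extension / original length (dimensionless)."""
    return extension_m / original_length_m

# Assumed test figures: 31.4 kN on a 78.5 mm^2 specimen,
# 0.05 mm extension over a 50 mm gauge length.
print(stress(31.4e3, 78.5e-6) / 1e6, "MPa")
print(strain(0.05e-3, 50e-3))
```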
Figure 4.4 load / extension curve for low-carbon steel
4.1.2 Proof stress

Only very ductile materials such as fully annealed mild steel show a clearly defined yield
point. The yield point will not even appear on bright drawn low carbon steel which has become
slightly work hardened during the drawing process. Under such circumstances the proof stress
is used. Proof stress is defined as the stress which produces a specified amount of plastic strain,
such as 0.1 or 0.2 percent. Figure 4.5 shows a typical stress/strain curve for a material of
relatively low ductility, such as hardened and tempered medium carbon steel. If a point such as C
is taken, the corresponding strain is given by D and this consists of a combination of plastic and
elastic components. If the stress is now gradually reduced (by reducing the load on the specimen),
the strain is also reduced and the stress/strain relationship during this reduction in stress is
represented by the line CB. During the reduction in stress the elastic deformation is recovered so
that the line CB is straight and parallel to the initial stages of the loading curve for the material,
that is, the part of the loading curve where the material is showing elastic properties.
In the example shown, the stress at C has produced a plastic strain of 0.2 percent as
represented by AB. Thus the stress at C is referred to as 0.2 percent proof stress, AB being the
plastic deformation and BD being the elastic deformation when the specimen is stressed to the
point C. The material will have fulfilled its specification if, after the proof stress has been applied
for 15 seconds and removed, the permanent set of the specimen is not greater than the specified
percentage of the gauge length which, in this example, is 0.2 percent.
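The offset construction described above can be sketched numerically: the plastic strain at each point is the total strain minus the elastic part (stress/E), and the proof stress is read off where this reaches the specified offset. The curve data and modulus below are assumed, illustrative values only:

```python
def proof_stress(strains, stresses, e_modulus, offset=0.002):
    """0.2 percent proof stress by the offset method: the stress at
    which plastic strain (total strain minus stress/E) equals the
    specified offset, found by linear interpolation."""
    plastic = [s - sig / e_modulus for s, sig in zip(strains, stresses)]
    for i in range(1, len(plastic)):
        if plastic[i] >= offset:
            # interpolate between points i-1 and i
            f = (offset - plastic[i - 1]) / (plastic[i] - plastic[i - 1])
            return stresses[i - 1] + f * (stresses[i] - stresses[i - 1])
    raise ValueError("offset strain not reached")

# Assumed data for a low-ductility material (stresses in MPa)
E = 70e3  # MPa, assumed
eps = [0.0, 0.002, 0.004, 0.006, 0.008, 0.010]
sig = [0.0, 140.0, 270.0, 360.0, 420.0, 460.0]
print(proof_stress(eps, sig, E))
```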
Figure 4.5 Proof stress
4.1.3 The Interpretation of tensile test results
The interpretation of tensile test data requires skill born of experience, since many
factors can affect the test results: for instance the temperature at which the test is carried out,
since the tensile modulus and tensile strength decrease as the temperature rises for most metals
and plastics, whereas the ductility increases as the temperature rises. The test results are also
influenced by the rate at which the specimen is strained.
Figure 4.6(a) shows a typical stress\strain curve for annealed mild steel. From
such a curve the following information can be deduced.
(a) The material is ductile since there is a long plastic range.
(b) The material is fairly rigid since the slope of the initial elastic range is steep.
(c) The limit of proportionality (elastic limit) occurs at about 230 MPa.
(d) The upper yield point occurs at about 260 MPa.
(e) The lower yield point occurs at about 230 MPa.
(f) The ultimate tensile stress (UTS) occurs at about 400 MPa.
Figure 4.6(b) shows a typical stress/strain curve for a gray cast iron. From such a
curve the following information can be deduced.
(a) The material is brittle since there is little plastic deformation before it fractures.
(b) Again the material is fairly rigid since the slope of the initial elastic range is
steep.
(c) It is difficult to determine the point at which the limit of proportionality occurs,
but it is approximately 200 MPa.
(d) The ultimate tensile stress (UTS) is the same as the breaking stress for this
sample. This indicates negligible reduction in cross-section (necking) and
minimal ductility and malleability. It occurs at approximately 250 MPa.
Figure 4.6(c) shows a typical stress/strain curve for a wrought light alloy. From
such a curve the following information can be deduced.
(a) The material has a high level of ductility since it shows a long plastic range. The
material is much less rigid than either low carbon steel or cast iron, since the slope
of the initial elastic range is much less steep when plotted to the same scale. The
limit of proportionality is almost impossible to determine, so proof stress will be
specified instead. For this sample the 0.2 percent proof stress is approximately 500
MPa (AB). The tensile test can also yield other important facts about a material
under test.
( a ) Stress/strain curve for annealed mild steel
( b) Stress/strain curve for gray cast iron
( c ) Stress/strain curve for light alloy
Figure 4.6 Typical stress/strain curves
4.1.4 The effect of grain size and structure on tensile testing:
The test piece should be chosen so that it reflects as closely as possible the
component and the material from which the component is produced. This is relatively
easy for components produced from bar stock, but not so easy for components produced
from forgings, as the grain flow will be influenced by the contour of the component and
will not be uniform. Castings also present problems since the properties of a specially
cast test piece are unlikely to reflect those of the actual casting. This is due to the
difference in size and the corresponding difference in cooling rates.
The lay of the grain in rolled bar and plate can greatly affect the tensile strength
and other properties of a specimen taken from them. Figure 4.7 shows the relative grain
orientation for transverse and longitudinal test pieces. The tensile strength for the
longitudinal test piece is substantially greater than that of the transverse test piece, a
factor which the designer of large fabrications must take into account.
Figure 4.8 shows the effect of processing upon the properties of a material. A low
carbon steel of high ductility, in the annealed condition, shows the classical stress/strain
curve with a pronounced yield point and a long plastic deformation range. The same
material, after finishing by cold drawing, no longer shows a yield point and the plastic
range is considerably reduced.
Figure 4.7 Effect of grain orientation on material testing
( i ) Annealed low-carbon steel ( ii ) Cold-drawn low-carbon steel
Figure 4.8 Effect of processing on the properties of low-carbon steel
Figure 4.9 shows the effect of heat treatment upon the properties of a medium carbon
steel. In this example the results have been obtained by quench hardening a batch of
identical specimens and then tempering them at different temperatures.
Figure 4.10 shows the effect of heat treatment upon the properties of a work
hardened metallic material. Stress relief (recovery) has very little effect upon the tensile
strength and elongation (ductility) until the recrystallization (annealing) temperature is
reached. The metal initially shows the high tensile strength and lack of ductility
associated with a severely distorted grain structure. After stress relief the tensile strength
rises and the ductility falls until the recrystallization range temperature is reached. During
the recrystallization range there is a marked change in properties. The tensile strength is
rapidly reduced and the ductility, in terms of elongation percentage, rapidly increases.
Figure 4.9 Effect of tempering on tensile test
Figure 4.10 Effect of temperature on cold-worked material

4.2 IMPACT TESTING
The tensile test does not tell the whole story. Figure 4.12 shows how a piece of
high carbon steel rod will bend when in the annealed condition yet snap easily in the
quench hardened condition despite the fact that in the latter condition it will show a much
higher value of tensile strength. Impact tests consist of striking a suitable specimen with a
controlled blow and measuring the energy absorbed in bending or breaking the specimen.
The energy value indicates the toughness of the material under test. Figure 4.11 shows a
typical impact-testing machine. This machine has a hammer which is suspended like a
pendulum, a vice for holding the specimen in the correct position relative to the hammer,
and a dial for indicating the energy absorbed in carrying out the test in joules (J). If there
is maximum over-swing, as there would be if no specimen was placed in the vice, then
zero energy absorption is indicated. If the hammer is stopped by the specimen with no
over-swing, then maximum energy absorption is indicated. Intermediate readings are the
impact values (J) of the materials being tested (their toughness or lack of brittleness).
There are two standard tests currently in use.
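The energy reading described above comes from the pendulum's loss of height between release and over-swing. A minimal sketch, with assumed hammer mass and heights (illustrative figures, not the standard machine values):

```python
G = 9.81  # m/s^2, standard gravity

def absorbed_energy(mass_kg, drop_height_m, rise_height_m):
    """Energy absorbed by the specimen (J): the pendulum's loss of
    potential energy between release height and over-swing height."""
    return mass_kg * G * (drop_height_m - rise_height_m)

# Assumed figures: a 20 kg hammer released from 1.0 m that swings
# through to 0.45 m after breaking the specimen.
print(round(absorbed_energy(20.0, 1.0, 0.45), 2), "J")
```

A rise height equal to the drop height (maximum over-swing, no specimen) gives zero absorbed energy, matching the dial behaviour described above.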
Figure 4.11 Typical impact testing machine
Figure 4.12 Impact loading. (a) A piece of high-carbon steel rod (1.0%) in the annealed (soft) condition will bend when struck with a hammer; UTS 925 MPa. (b) The same piece of high-carbon steel rod, as in (a), after hardening and lightly tempering will fracture when hit with a hammer, despite its UTS having increased to 1285 MPa.
4.2.1 The Izod Test
In this test a 10 mm square notched specimen is used. The striker of the
pendulum hits the specimen with a kinetic energy of 162.72 J at a velocity of 3.8 m/s.
Figure 4.13 shows details of the specimen and the manner in which it is supported.
Figure 4.13 Izod test: detail of notch; section of test piece; position of the striker
4.2.2 The Charpy Test
In the Izod test the specimen is supported as a cantilever, but in the Charpy test it
is supported as a beam. It is struck with a kinetic energy of 298.3 J at a velocity of 5 m/s.
Figure 4.14 shows details of the Charpy test specimen and the manner in which it is
supported.
Since both tests use a notched specimen, useful information can be obtained
regarding the resistance of the material to the spread of a crack which may originate from
a point of stress concentration such as sharp corners, undercuts, sudden changes in
section, and machining marks in stressed components. Such points of stress concentration
should be eliminated during design and manufacture.
Figure 4.14 Charpy test
4.2.3 The interpretation of impact tests
The results of an impact test should specify the energy used to bend or break the
specimen and the particular test used, i.e. Izod or Charpy. In the case of the Charpy test it
is also necessary to specify the type of notch used, as this test allows for three types of
notch, as shown in fig. 4.15. A visual examination of the fractured surface after the test
also provides useful information.
(a) Brittle metals: a clean break with little deformation and little reduction in cross-
sectional area at the point of fracture. The fractured surface will show a granular
structure.
(b) Ductile metals: the fracture will be rough and fibrous. In very ductile materials
the fracture will not be complete, the specimen bending over and only showing
slight tearing from the notch. There will also be some reduction in cross-sectional
area at the point of fracture or bending.
Figure 4.15 Standard charpy notches
The temperature of the specimen at the time of making the test also has an
important influence on the test results. Figure 4.16 shows the embrittlement of low
carbon steels at refrigerated temperatures and hence their unsuitability for use in
refrigeration plant and space vehicles.
4.2.4 The effect of processing on toughness
Impact tests are frequently used to determine the effectiveness of annealing
temperatures on the grain structure and impact strength of cold worked ductile metals. In
the case of cold –worked low carbon steel, the impact strength is quite low, initially, as
the heavily deformed grain structure will be relatively brittle and lacking in ductility,
particularly if the limit of cold working has been approached. Annealing at low
temperatures has little effect as it only promotes recovery of the crystal lattice on the
atomic scale and does not result in recrystallization. In fact during recovery there may even
be a slight reduction in the impact strength.
However, at about 550 °C to 650 °C recrystallization of low carbon steels occurs
with only slight grain growth. Annealing in this temperature range results in the impact
strength increasing dramatically, as shown in fig. 4.17, and the appearance of the fracture
changes from that of a brittle material to that of a ductile material. Annealing at higher
temperatures or prolonged soaking at the lower annealing temperature results in grain
growth and a corresponding fall in impact strength.
Figure 4.16 Effect of temperature on toughness
The effect of tempering on the impact value of a quench hardened high carbon
steel is shown in fig. 4.18. Initially, only stress relief occurs but, as the tempering
temperature increases, the toughness also increases, which is why cutting tools are
tempered. Tempering modifies the extremely hard and brittle martensitic structure of
quench hardened plain carbon steels and causes a considerable increase in toughness with
very little loss of hardness.
Figure 4.17 Effect of Annealing on the toughness of low-carbon steel
Figure 4.18 Effect of tempering on quench-hardened high carbon steel
4.3 HARDNESS TESTING
Hardness is defined as the resistance of a material to indentation or abrasion by
another hard body. It is by indentation that most hardness tests are performed. A hard
indenter is pressed into the specimen by a standard load, and the magnitude of the
indentation (either area or depth) is taken as a measure of hardness.
4.3.1 The Brinell hardness test
In this test, hardness is measured by pressing a hard steel ball into the surface of
the test piece, using a known load. It is important to choose the combination of load and
ball size carefully so that the indentation is free from distortion and suitable for
measurement. The relationship between load P(kg) and the diameter D(mm) of the
hardened ball indenter is given by the expression:
P/D² = K
Where K is a constant. Typical values of K are:
Ferrous metals K = 30
Copper and copper alloys K = 10
Aluminium and aluminium alloys K = 5
Lead, tin, and white bearing metals K = 1
Thus, for steel, a load of 3000 kg is required if a 10mm diameter ball indenter is
used.
Figure 4.20 shows the principle of the Brinell hardness test. The diameter of the
indentation d is measured in two directions at right-angles and the average taken. The
hardness number HB is the load divided by the spherical area of the indentation which can
be calculated knowing the values of d and D. In practice, conversion tables are used to
translate the value of diameter d directly into hardness numbers HB.
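The load selection rule and the hardness number can both be sketched. The spherical-cap area below gives the standard Brinell expression HB = 2P / (πD(D − √(D² − d²))); the 4.0 mm indentation diameter is an assumed example reading:

```python
import math

def brinell_load(k, ball_diameter_mm):
    """Test load P (kg) chosen from the rule P / D^2 = K."""
    return k * ball_diameter_mm ** 2

def brinell_hardness(load_kg, d_ball_mm, d_indent_mm):
    """HB = load / spherical (curved) area of the indentation."""
    D, d = d_ball_mm, d_indent_mm
    area = math.pi * D * (D - math.sqrt(D * D - d * d)) / 2.0
    return load_kg / area

print(brinell_load(30, 10))                      # 3000 kg for steel
print(round(brinell_hardness(3000, 10, 4.0), 1))
```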
Figure 4.19 Brinell hardness tester
Figure 4.20 Brinell hardness principle
To ensure consistent results the following precautions should be observed:
(a) the thickness of the specimen should be at least seven times the depth of the
indentation to allow unrestricted plastic flow below the indenter;
(b) the edge of the indentation should be at least three times the diameter of the
indentation from the edge of the test piece;
(c) the test is unsuitable for materials whose hardness exceeds 500 HB, as the ball
indenter tends to flatten.
4.3.1.1 Machinability

With high-speed steel cutting tools, the hardness of the stock being cut is important:
material softer than about HB = 100 will tend to tear and leave a poor surface finish.
4.3.1.2 Relationship Between Hardness And Tensile strength
There is a definite relationship between strength and hardness, and the ultimate
tensile stress (UTS) of a component can be approximated as follows:
UTS (MPa) = HB * 3.54 (for annealed plain-carbon steels);
= HB * 3.25 (for quench-hardened and tempered plain-carbon steels);
= HB * 5.6 (for ductile brass alloys);
= HB * 4.2 (for wrought aluminium alloys).
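These conversion factors can be applied directly. A minimal sketch (the HB = 200 reading is an assumed example; the factors are the ones quoted above):

```python
# Approximate UTS-from-hardness factors quoted above
UTS_FACTOR = {
    "annealed plain-carbon steel": 3.54,
    "quench-hardened and tempered plain-carbon steel": 3.25,
    "ductile brass alloy": 5.6,
    "wrought aluminium alloy": 4.2,
}

def estimate_uts(hb, material):
    """Approximate UTS (MPa) = HB * factor for the material class."""
    return hb * UTS_FACTOR[material]

print(estimate_uts(200, "annealed plain-carbon steel"))
```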
Figure 4.21 Work-hardening capacity
4.3.1.3 Work-hardening capacity
Materials which will cold work without work hardening unduly will pile up round
the indenter, as shown in fig. 4.21(a). Materials which work-harden readily will sink
around the indenter, as shown in fig. 4.21(b).
4.3.2 The Vickers hardness test
This test is preferable to the Brinell test where hard materials are concerned, as it
uses a diamond indenter. (Diamond is the hardest material known, approximately 6000
HB.) The diamond indenter is in the form of a square-based pyramid with an angle of
136° between opposite faces. Since only one type of indenter is used, the load has to be
varied for different hardness ranges. Standard loads are 5, 10, 20, 30, 50 and 100 kg. It is
necessary to state the load when specifying a Vickers hardness number. For example, if
the hardness number is found to be 200 when using a 50 kg load, then the hardness
number is written HD(50) = 200. Figure 4.22(a) shows a universal hardness testing
machine suitable for performing both Brinell and Vickers hardness tests, whilst fig.
4.22(b) shows the measuring screen for determining the distance across the corners of the
indentation. The screen can be rotated so that two readings at right-angles can be taken
and the average is used to determine the hardness number (HD). This is calculated by
dividing the load by the projected area of the indentation:
HD = P/D²,
Where D = the average diagonal (mm), P = load (kg).
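For comparison, the standard Vickers definition divides the load by the sloping surface area of the pyramid indentation (rather than the projected area), which gives the familiar constant of about 1.854. A sketch with an assumed reading:

```python
import math

def vickers_hardness(load_kg, diagonal_mm):
    """Standard Vickers number: load divided by the sloping surface
    area of the square-pyramid indentation, which works out to
    HV = 2 * P * sin(136 deg / 2) / d^2  (about 1.854 * P / d^2)."""
    return 2 * load_kg * math.sin(math.radians(136 / 2)) / diagonal_mm ** 2

# Assumed reading: 50 kg load, 0.68 mm average diagonal
print(round(vickers_hardness(50, 0.68), 1))
```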
Figure 4.22 Micro-Vickers and Vickers hardness testers
4.3.3 The Rockwell hardness test
Although not as reliable as the Brinell and Vickers hardness tests for laboratory
purposes, the Rockwell test is widely used in industry, as it is quick, simple, and direct
reading. Figure 4.23 shows a typical hardness indicating scale. Universal electronic
hardness testing machines are now widely used which, at the turn of a switch, can
provide either Brinell, Vickers, or Rockwell tests and which show the hardness number as
a digital readout automatically. They also give a ‘hard copy’ printout of the test result
together with the test conditions and date. However, the mechanical testing machines
described in this chapter are still widely used and will be for some time to come.
In principle the Rockwell hardness test compares the difference in depth of
penetration of the indenter when using forces of two different values, that is, a minor
force is first applied (to take up the backlash and pierce the skin of the component) and
the scales are set to read zero. Then a major force is applied over and above the minor
force and the increased depth of penetration is shown on the scales of the machine as a
direct reading of hardness without the need for calculation or conversion tables. The
indenters most commonly used are a 1.6 mm diameter hard steel ball and a diamond cone
with an apex angle of 120°. The minor force in each instance is 98 N. Table 4.1 gives the
combinations of type of indenter and additional (major) force for the range of Rockwell
scales, together with typical applications. The B and C scales are the most widely used in
engineering.
The standard Rockwell test cannot be used for very thin sheet and foils, and for
these the Rockwell Superficial Hardness Test is used. The minor force is reduced from 98
N to 29.4 N and the major force is also reduced. Typical values are listed in Table 4.2.
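The depth-difference principle can be sketched as follows. The 0.002 mm-per-point increment and the 100/130 scale constants are the standard Rockwell definitions; the depth reading itself is an assumed example:

```python
def rockwell_number(depth_increase_mm, scale_constant):
    """Rockwell reading from the extra depth produced by the major
    force: each Rockwell point corresponds to 0.002 mm of
    penetration.  scale_constant is 100 for diamond-cone scales
    (e.g. C) and 130 for ball scales (e.g. B)."""
    return scale_constant - depth_increase_mm / 0.002

# Assumed reading: the major force drives the cone 0.08 mm deeper
print(rockwell_number(0.08, 100))   # HRC 60 - a hardened steel
```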
Figure 4.23 Rockwell hardness tester
Scale  Indenter          Additional force (kN)  Applications
A      Diamond cone      0.59   Steel sheet; shallow case-hardened components
B      Ball, ∅ 1.588 mm  0.98   Copper alloys; aluminium alloys; annealed low-carbon steels
C      Diamond cone      1.47   Most widely used range: hardened steels; cast irons; deep case-hardened components
D      Diamond cone      0.98   Thin but hard steel; medium-depth case-hardened components
E      Ball, ∅ 3.175 mm  0.98   Cast iron; aluminium alloys; magnesium alloys; bearing metals
F      Ball, ∅ 1.588 mm  0.59   Annealed copper alloys; thin soft sheet metals
G      Ball, ∅ 1.588 mm  1.47   Malleable irons; phosphor bronze; gun-metal; cupro-nickel alloys, etc.
H      Ball, ∅ 3.175 mm  0.59   Soft materials; high-ferritic aluminium; lead; zinc
K      Ball, ∅ 3.175 mm  1.47   Aluminium and magnesium alloys

Table 4.1 Rockwell hardness test conditions
Scale  Indenter          Additional force (kN)
15-N   Diamond cone      0.14
30-N   Diamond cone      0.29
45-N   Diamond cone      0.44
15-T   Ball, ∅ 1.588 mm  0.14
30-T   Ball, ∅ 1.588 mm  0.29
45-T   Ball, ∅ 1.588 mm  0.44

Table 4.2 Rockwell superficial hardness test conditions
4.3.4 Shore scleroscope

In the tests previously described, the test piece must be small enough to mount in
the testing machine, and hardness is measured as a function of indentation. However the
scleroscope works on a different principle: hardness is measured as a function of
resilience. Further, since the scleroscope can be carried to the work piece, it is useful for
testing large surfaces such as the slideways on machine tools. A diamond-tipped hammer
of mass 2.5g drops through a height of 250mm. The height of the first rebound indicates
the hardness on a 140-division scale.
4.3.4.1 The effect of processing on hardness
All metals work-harden to some extent when cold-worked. Figure 4.24 shows the
relationship between the Vickers hardness number (HD) and the percentage reduction in
thickness for rolled strip. The metals become harder and more brittle as the amount of
cold-working increases until a point is reached where the metal is so hard and brittle that
cold-working cannot be continued. Aluminium reaches this state when a 60 percent
reduction in strip thickness is achieved in one pass through the rolls of a rolling mill. In
this condition the material is said to be fully work-hardened. The degree of work-
hardening or 'temper' of strip and sheet material is arbitrarily stated as soft (fully
annealed), ¼ hard, ½ hard, ¾ hard, and hard (fully work-hardened).
The effect of heating a work-hardened material such as α brass is shown in Fig.
4.25. Once again very little effect occurs until the temperature of recrystallisation is
reached. At this temperature there is a rapid fall off in hardness, after which the decline in
hardness becomes more gradual as grain growth occurs and the metal becomes fully
annealed.
The effect of heating a quench-hardened plain-carbon steel is more gradual as
shown in Fig. 4.26. During the tempering range of the steel no grain growth occurs, but
there are structural changes. Initially, there is a change in the very hard martensite as
particles of carbide precipitate out. As tempering proceeds and the temperature is
increased, the structure loses its acicular martensitic appearance and spheroidal carbide
particles in a matrix of ferrite can be seen under high magnification. These structural
changes increase the toughness of the metal considerably, but with some loss of hardness.
Figure 4.24 Effect of cold-working on the hardness of various metals
Chapter No. 5
USING NEURAL NETWORK TOOL BOX
5.1 INTRODUCTION
Neural networks involve a very large amount of calculation, in the form of
manipulating data for training and verification, which becomes much easier with the help of a
computer. The manipulation can be carried out in any programming language, e.g.
C++, Fortran or Matlab. In this work Matlab is used for this purpose because Matlab is
a very powerful tool for mathematical calculation, visualization and programming. In
addition to the pure mathematical part of Matlab there are several toolboxes available to
expand the capabilities of Matlab; the Neural Network Toolbox (NN Toolbox) is one of
these toolboxes.
This chapter is intended to help students unacquainted with Matlab and the neural network
toolbox to get practice in using these tools. The contents of this chapter are focused more
on practical examples and problems than on theory. The reason for this is that most
theory is covered in the help files of Matlab itself. In fact, one will not be able to learn
Matlab and the NN Toolbox with this tutorial alone; one will have to actively explore the
documentation and demos available in Matlab. What Matlab lacks is a set of
examples and problems that helps the user learn to use the tools.
5.2 THE STRUCTURE OF THE NEURAL NETWORK TOOLBOX

The toolbox is based on the network object. This object contains information
about everything that concerns the neural network. Type network at the Matlab command
prompt, and an empty network will be created and its parameters will be shown.

>> network
ans =
Neural Network object:
architecture:
numInputs: 0
numLayers: 0
biasConnect: []
inputConnect: []
layerConnect: []
outputConnect: []
targetConnect: []
numOutputs: 0 (read-only)
numTargets: 0 (read-only)
numInputDelays: 0 (read-only)
numLayerDelays: 0 (read-only)
First the architecture parameters are shown. Because the network command creates an
empty network, all parameters are set to 0. The subobject structures follow:
subobject structures:
inputs: {0x1 cell} of inputs
layers: {0x1 cell} of layers
outputs: {1x0 cell} containing no outputs
targets: {1x0 cell} containing no targets
biases: {0x1 cell} containing no biases
inputWeights: {0x0 cell} containing no input weights
layerWeights: {0x0 cell} containing no layer weights
This paragraph lists the subobject structures, that is, the various input and output
matrices, biases and input weights.
functions:
adaptFcn: (none)
initFcn: (none)
performFcn: (none)
trainFcn: (none)
The next paragraph is interesting: it contains the training, initialization and performance
functions. The trainFcn and adaptFcn are essentially the same, but trainFcn will be
used in this tutorial. By setting the trainFcn parameter you tell Matlab which training
algorithm it should use. The NN Toolbox includes almost 20 training functions. The
performance function is the function that determines how well the ANN is doing its task.
The initFcn is the function that initializes the weights and biases of the network. To get
a list of the functions that are available, type help nnet. To change one of these functions
to another one in the toolbox, or one that you have created, just assign the name of the
function to the parameter, e.g.

>> net.trainFcn = 'mytrainingfun';

The parameters that concern these functions are listed in the next paragraph.

parameters:
adaptParam: (none)
initParam: (none)
performParam: (none)
trainParam: (none)
By changing these parameters you can change the default behavior of the functions
mentioned above. The parameters you will use the most are probably the components of
trainParam. The most used of these are net.trainParam.epochs, which tells the
algorithm the maximum number of epochs to train, and net.trainParam.show that tells
the algorithm how many epochs there should be between each presentation of the
performance. Type help train for more information. The weights and biases are also
stored in the network structure:

weight and bias values:
IW: {0x0 cell} containing no input weight matrices
LW: {0x0 cell} containing no layer weight matrices
b: {0x1 cell} containing no bias vectors
other:
userdata: (user stuff)
The .IW component is a cell array that holds the weights between the input layer and the
first hidden layer. The .LW component holds the weights between the hidden layers and
the output layer.
5.3 CONSTRUCTING LAYERS

It is assumed that you have an empty network object named `net' in your
workspace; if not, type

>> net = network;

to get one.
Let's start with defining the properties of the input layer. The NN Toolbox supports
networks that have multiple input layers. Let's set this number to 1:

>> net.numInputs = 1;

Now we should define the number of neurons in the input layer. This should of
course be equal to the dimensionality of your data set. The appropriate property to set is
net.inputs{i}.size, where i is the index of the input layer. So to make a network
which has 2-dimensional points as inputs, type:

>> net.inputs{1}.size = 2;
This defines (for now) the input layer.
The next properties to set are net.numLayers, which not surprisingly sets the
total number of layers in the network, and net.layers{i}.size, which sets the number
of neurons in the ith layer. To build our example network, we define 2 extra layers (a
hidden layer with 3 neurons and an output layer with 1 neuron), using:

>> net.numLayers = 2;
>> net.layers{1}.size = 3;
>> net.layers{2}.size = 1;

For details refer to Appendix B.
5.3.2 Connecting Layers
Now it's time to define which layers are connected. First, define to which layer
the inputs are connected by setting net.inputConnect(i) to 1 for the appropriate layer i
(usually the first, so i = 1).
The connections between the rest of the layers are defined in a connectivity matrix
called net.layerConnect, which can have either 0 or 1 as element entries. If element
(i,j) is 1, then the outputs of layer j are connected to the inputs of layer i.
We also have to define which layer is the output layer by setting
net.outputConnect(i) to 1 for the appropriate layer i.
Finally, if we have a supervised training set, we also have to define which layers
are connected to the target values. (Usually, this will be the output layer.) This is done by
setting net.targetConnect(i) to 1 for the appropriate layer i. So, for our example, the
appropriate commands would be

>> net.inputConnect(1) = 1;
>> net.layerConnect(2, 1) = 1;
>> net.outputConnect(2) = 1;
>> net.targetConnect(2) = 1;

5.4 SETTING TRANSFER FUNCTIONS
Each layer has its own transfer function, which is set through the
net.layers{i}.transferFcn property. So to make the first layer use sigmoid transfer
functions and the second layer linear transfer functions, use

>> net.layers{1}.transferFcn = 'logsig';
>> net.layers{2}.transferFcn = 'purelin';

For details refer to Appendix B.
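To make concrete what the configured network computes, here is a plain-Python (rather than Matlab) sketch of one forward pass through the same 2-3-1 architecture: three logsig hidden neurons feeding one purelin (identity) output neuron. The weights and biases are arbitrary illustrative values, not ones produced by the toolbox:

```python
import math

def logsig(x):
    """Log-sigmoid transfer function, as used in the hidden layer."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, iw, b1, lw, b2):
    """One forward pass through the 2-3-1 network built above:
    3 logsig hidden neurons, 1 purelin (identity) output neuron."""
    hidden = [logsig(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(iw, b1)]
    return sum(w * h for w, h in zip(lw, hidden)) + b2

# Assumed (illustrative) weights and biases
iw = [[0.5, -0.3], [1.0, 0.2], [-0.7, 0.8]]   # input -> hidden
b1 = [0.1, -0.2, 0.0]
lw = [0.6, -1.1, 0.9]                          # hidden -> output
b2 = 0.05
print(forward([0.3, 1.2], iw, b1, lw, b2))
```

In the toolbox these weights live in net.IW, net.LW and net.b and are filled in by the initialization routines described below.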
5.5 WEIGHTS AND BIASES

Now, define which layers have biases by setting the elements of
net.biasConnect to either 0 or 1, where net.biasConnect(i) = 1 means layer i has
biases attached to it.
To attach biases to each layer in our example network, we'd use

>> net.biasConnect = [ 1 ; 1];

Now you should decide on an initialization procedure for the weights and biases.
When done correctly, you should be able to simply issue

>> net = init(net);

to reset all weights and biases according to your choices.
The first thing to do is to set net.initFcn. Unless you have built your own
initialization routine, the value 'initlay' is the way to go. This lets each layer of weights
and biases use its own initialization routine.

>> net.initFcn = 'initlay';
Exactly which function this is should of course be specified as well. This is done
through the property net.layers{i}.initFcn for each layer. The two most practical
options here are Nguyen-Widrow initialization ('initnw', type help initnw for details),
or 'initwb', which lets you choose the initialization for each set of weights and biases
separately.
When using 'initnw' you only have to set

>> net.layers{i}.initFcn = 'initnw';

for each layer i.
When using 'initwb', you have to specify the initialization routine for each set of
weights and biases separately. The most common option here is 'rands', which sets all
weights or biases to a random number between -1 and 1. First, use

>> net.layers{i}.initFcn = 'initwb';

for each layer i. Next, define the initialization for the input weights,

>> net.inputWeights{1,1}.initFcn = 'rands';

and for each set of biases

>> net.biases{i}.initFcn = 'rands';

and weight matrices

>> net.layerWeights{i,j}.initFcn = 'rands';

where net.layerWeights{i,j} denotes the weights from layer j to layer i.
5.6 TRAINING FUNCTIONS & PARAMETERS The difference between train and adapt
One of the more counterintuitive aspects of the NNT is the distinction between
train and adapt. Both functions are used for training a neural network, and most of the
time both can be used for the same network.
What then is the difference between the two? The most important one has to do with
incremental training (updating the weights after the presentation of each single training
sample) versus batch training (updating the weights after presenting the complete
data set).
When using adapt, both incremental and batch training can be used. Which one is
actually used depends on the format of your training set. If it consists of two matrices of
input and target vectors, like
>> P = [ 0.3 0.2 0.54 0.6 ; 1.2 2.0 1.4 1.5]
P =
0.3000 0.2000 0.5400 0.6000
1.2000 2.0000 1.4000 1.5000
>> T = [ 0 1 1 0 ]
T =
0 1 1 0
The network will be updated using batch training. (In this case, we have 4
samples of 2 dimensional input vectors, and 4 corresponding 1D target vectors).
If the training set is given in the form of a cell array,
>> P = {[0.3 ; 1.2] [0.2 ; 2.0] [0.54 ; 1.4] [0.6 ; 1.5]}
P =
[2x1 double] [2x1 double] [2x1 double] [2x1 double]
>> T = { [0] [1] [1] [0] }
T =
[0] [1] [1] [0]
Then incremental training will be used.
When using train, on the other hand, only batch training will be used, regardless of
the format of the data (you can use either format).
The big plus of train is that it gives you a lot more choice in training functions (gradient
descent, gradient descent w/ momentum, Levenberg-Marquardt, etc.), which are
implemented very efficiently. So unless you have a good reason for doing
incremental training, train is probably your best choice. (And it usually saves you
from setting some parameters.)
Another difference between adapt and train lies in the terminology of
passes and epochs. When using adapt, the property that determines how many
times the complete training data set is used for training the network is called
net.adaptParam.passes. But, when using train, the exact same property is now called
net.trainParam.epochs.
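The distinction can be made concrete with a sketch, written in Python rather than Matlab and for a single-weight linear neuron rather than a real NNT network: one adapt-style pass updates the weight after every single sample, while one train-style epoch updates it once after the complete set.

```python
def incremental_pass(w, samples, lr=0.1):
    """One adapt-style 'pass': update w immediately after each sample."""
    for x, t in samples:
        e = t - w * x          # error of a one-weight linear neuron
        w = w + lr * e * x     # incremental update
    return w

def batch_epoch(w, samples, lr=0.1):
    """One train-style 'epoch': accumulate over the whole data set,
    then update once."""
    grad = sum((t - w * x) * x for x, t in samples)
    return w + lr * grad
```

Both see the training set exactly once, but the incremental version moves the weight between samples, so the two generally end the pass at different values.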
5.6.1 Performance Functions
The two most common options here are the Mean Absolute Error (mae) and the
Mean Squared Error (mse). The mae is usually used in networks for classification, while
the mse is most commonly seen in function approximation networks.
The performance function is set with the net.performFcn property, for instance:
>> net.performFcn = 'mse';
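For reference, the two performance functions compute the following (a Python sketch of the formulas, not the NNT implementations):

```python
def mae(targets, outputs):
    """Mean absolute error, as the 'mae' performance function computes it."""
    return sum(abs(t - y) for t, y in zip(targets, outputs)) / len(targets)

def mse(targets, outputs):
    """Mean squared error, as the 'mse' performance function computes it."""
    return sum((t - y) ** 2 for t, y in zip(targets, outputs)) / len(targets)
```

Because mse squares each error, it penalizes large deviations much more heavily than mae, which is one reason it is the usual choice for function approximation.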
5.6.2 Train Parameters
If you are going to train your network using train, the last step is defining
net.trainFcn, and setting the appropriate parameters in net.trainParam. Which
parameters are present depends on your choice for the training function.
So if you, for example, want to train your network using a Gradient Descent w/
Momentum algorithm, you'd set
>> net.trainFcn = 'traingdm';
and then set the parameters
>> net.trainParam.lr = 0.1;
>> net.trainParam.mc = 0.9;
to the desired values. (In this case, lr is the learning rate, and mc the momentum
term.)
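The momentum update itself can be sketched as follows. This is a common textbook formulation written in Python; traingdm's exact scaling of the two terms may differ slightly.

```python
def momentum_step(w, grad, prev_step, lr=0.1, mc=0.9):
    """One gradient-descent-with-momentum update: the new step blends the
    previous step (scaled by mc) with the current gradient (scaled by lr)."""
    step = mc * prev_step - lr * grad
    return w + step, step

# Two updates with the same gradient: the second step is larger,
# because momentum keeps part of the previous step's velocity.
w, s = momentum_step(1.0, 0.5, 0.0)
w, s = momentum_step(w, 0.5, s)
```

When successive gradients point the same way, momentum accelerates progress; when they oscillate, it damps the oscillation.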
Two other useful parameters are net.trainParam.epochs, which is the
maximum number of times the complete data set may be used for training, and
net.trainParam.show, which is the time between status reports of the training function.
For example,
>> net.trainParam.epochs = 1000;
>> net.trainParam.show = 100;
5.6.3 Adapt Parameters
The same general scheme is also used in setting adapt parameters. First, set
net.adaptFcn to the desired adaptation function. We'll use adaptwb (from 'adapt
weights and biases'), which allows for a separate update algorithm for each layer. Again,
check the Matlab documentation for a complete overview of possible update algorithms.
>> net.adaptFcn = 'adaptwb';
Next, since we're using adaptwb, we'll have to set the learning function for all
weights and biases:
>> net.inputWeights{1,1}.learnFcn = 'learnp';
>> net.biases{1}.learnFcn = 'learnp';
where in this example we've used learnp, the Perceptron learning rule. (Type 'help
learnp' for details.)
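The rule learnp applies is simply dW = e*p and db = e, where e = t - a is the output error. A Python sketch, with hardlim behaving as in MATLAB and a plain loop standing in for adapt, trains a perceptron on the AND function:

```python
def hardlim(n):
    """MATLAB's hardlim: output 1 when the net input is >= 0, else 0."""
    return 1 if n >= 0 else 0

def learnp_step(w, b, p, t):
    """One application of the perceptron rule 'learnp':
    e = t - a, then dW = e * p and db = e."""
    a = hardlim(sum(wi * pi for wi, pi in zip(w, p)) + b)
    e = t - a
    return [wi + e * pi for wi, pi in zip(w, p)], b + e

# Train a perceptron on AND with repeated passes over the data.
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = [0.0, 0.0], 0.0
for _ in range(10):          # ten passes are plenty for AND
    for p, t in data:
        w, b = learnp_step(w, b, p, t)
```

Because AND is linearly separable, the perceptron convergence theorem guarantees this loop settles on weights that classify all four inputs correctly.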
Finally, a useful parameter is net.adaptParam.passes, which is the maximum
number of times the complete training set may be used for updating the network:
>> net.adaptParam.passes = 10;
5.7 BASIC NEURAL NETWORK EXAMPLE
The task is to create and train a neural network that solves the XOR problem.
XOR is a function that returns 1 when the two inputs are not equal; see Table 5.1.
Table 5.1: The XOR-problem
A B A XOR B
1 1 0
1 0 1
0 1 1
0 0 0
To solve this we will need a feedforward neural network with two input neurons,
and one output neuron. Because the problem is not linearly separable, it will also need
a hidden layer with two neurons.
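That XOR is not linearly separable can be checked by brute force: no single hardlim neuron reproduces its truth table, while one easily reproduces AND. A Python sketch (the small integer weight range is an assumption that happens to suffice for these two tables):

```python
from itertools import product

def realizable(truth_table):
    """Brute-force: is there a single hardlim neuron (w1, w2, b) that
    reproduces the truth table? Integer weights in [-3, 3] suffice here."""
    for w1, w2, b in product(range(-3, 4), repeat=3):
        if all((w1 * x1 + w2 * x2 + b >= 0) == bool(t)
               for (x1, x2), t in truth_table):
            return True
    return False

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
# realizable(AND) is True; realizable(XOR) is False, hence the hidden layer.
```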
Now we know what our network should look like, but how do we create it?
To create a new feed forward neural network use the command newff. You have
to enter the max and min of the input values, the number of neurons in each layer and
optionally the activation functions.
>> net = newff([0 1; 0 1],[2 1],{'logsig','logsig'})
The variable net will now contain an untrained feedforward neural network with
two neurons in the input layer, two neurons in the hidden layer and one output neuron,
exactly as we want it. The [0 1; 0 1] tells Matlab that the input values range between
0 and 1. The {'logsig','logsig'} tells Matlab that we want to use the logsig function
as activation function in all layers. The first parameter tells the network how many nodes
there should be in the input layer, hence you do not have to specify this in the second
parameter. You have to specify at least as many transfer functions as there are layers, not
counting the input layer. If you do not specify any transfer function Matlab will use the
default settings.
Figure 5.1: The logsig activation function
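The logsig curve in the figure is just 1/(1 + e^(-n)); a one-line Python sketch:

```python
import math

def logsig(n):
    """The logsig transfer function: an S-shaped curve that squashes
    any input into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-n))
```

Because the output is always strictly between 0 and 1, logsig is a natural choice when the targets are 0/1 values, as in the XOR problem.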
Now we want to test how good our untrained network is on the XOR problem.
First we construct a matrix of the inputs. The input to the network is always in the
columns of the matrix. To create a matrix with the inputs "1 1", "1 0", "0 1" and "0 0" we
enter:
>> input = [1 1 0 0; 1 0 1 0]
input =
1 1 0 0
1 0 1 0
Now we have constructed inputs to our network. Let us push these into the
network to see what it produces as output. The command sim is used to simulate the
network and calculate the outputs, for more information on how to use the command type
helpwin sim. The simplest way to use it is to enter the name of the neural network and
the input matrix; it returns an output matrix.
>> output=sim(net,input)
output =
0.5923 0.0335 0.9445 0.3937
The output was not exactly what we wanted! We wanted (0 1 1 0) but got roughly
(0.60 0.03 0.95 0.40). (Note that your network might give a different result, because the
network's weights are given random values at the initialization.)
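What sim computes for this 2-2-1 logsig architecture is two weighted sums with a squashing function after each. A Python sketch of that forward pass, using hand-picked weights (not obtained by training) that happen to solve XOR, to show the shape of the computation:

```python
import math

def logsig(n):
    return 1.0 / (1.0 + math.exp(-n))

def sim_2_2_1(iw, b1, lw, b2, p):
    """The forward pass sim performs for a 2-2-1 logsig network:
    a1 = logsig(IW*p + b1), then a2 = logsig(LW*a1 + b2)."""
    a1 = [logsig(sum(w * x for w, x in zip(row, p)) + b)
          for row, b in zip(iw, b1)]
    return logsig(sum(w * a for w, a in zip(lw, a1)) + b2)

# Hand-picked weights that implement XOR (one hidden unit fires for
# "at least one input on", the other for "not both inputs on"):
iw, b1 = [[10, 10], [-10, -10]], [-5, 15]
lw, b2 = [10, 10], -15
```

With an untrained network the weights are random, which is exactly why the outputs above landed nowhere near (0 1 1 0).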
You can now plot the output and the targets; the targets are the values that we
want the network to generate. Construct the target vector:
>> target = [0 1 1 0]
target =
0 1 1 0
To plot points we use the command "plot". We want the targets to appear as
small circles, so we use the command:
>> plot(target, 'o')
We want to plot the output in the same window. Normally the contents in a
window are erased when you plot something new in it. In this case we want the targets to
remain in the picture so we use the command hold on. The output is plotted as +'s.
>> hold on
>> plot(output, '+')
In the resulting figure (Fig 5.2) it is easy to see that the network does not give the
desired results. To change this we have to train it. First, though, we will adjust the
weights manually.
Figure 5.2: The targets and the actual output from an untrained XOR network. The
targets are represented as 'o' and the output as '+'
5.7.1 Manually set weights
The network we have constructed so far does not really behave as it should. To
correct this, the weights will be adjusted. All the weights are stored in the net structure
that was created with newff. The weights are numbered by the layers they connect and
the neurons within these layers. To get the value of the weights between the input layer
and the first hidden layer we type:
>> net.IW
ans =
[2x2 double]
[]
>> net.IW{1,1}
ans =
5.5008 -5.6975
2.5404 -7.5011
This means that the weight from the second neuron in the input layer to the
first neuron in the first hidden layer is -5.6975. To change it to 1, enter:
>> net.IW{1,1}(1,2)=1;
>> net.IW{1,1}
ans =
5.5008 1.0000
2.5404 -7.5011
The weights between the hidden layer and the output layer are stored in the .LW
component, which can be used in the same manner as .IW.
>> net.LW
ans =
[] []
[1x2 double] []
>> net.LW{2,1}
ans =
-3.5779 -4.3080
The change we made in the weight makes our network give another output when
we simulate it; try it by entering:
>> output=sim(net,input)
output =
0.8574 0.0336 0.9445 0.3937
>> plot(output,'g*');
Now the new output will appear as green stars in your picture. Are they closer to
the o's than the +'s were?
5.7.2 Training Algorithms
In the neural network toolbox there are several training algorithms already
implemented. That is good, because they can do the heavy work of training much more
smoothly and quickly than we can by manually adjusting the weights. Now let us apply the
default training algorithm to our network. The Matlab command to use is train; it takes
the network, the input matrix and the target matrix as input. The train command returns
a new trained network. For more information type helpwin train. In this example we
do not need all the information that the training algorithm shows, so we turn it off by
entering:
>> net.trainParam.show=NaN;
The most important training parameters are .epochs, which determines the
maximum number of epochs to train, and .show, the interval between each presentation of
training progress. If the gradient of the performance is less than .min_grad the training is
ended. The .time component determines the maximum time to train.
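These stopping criteria can be sketched as a simple loop (Python, for a one-parameter problem; train's real bookkeeping is more elaborate):

```python
import time

def train_sgd(w, grad, lr=0.1, epochs=100, min_grad=1e-10, max_time=60.0):
    """Sketch of train's stopping logic: stop after .epochs epochs, when
    the gradient magnitude falls below .min_grad, or when .time runs out."""
    start = time.time()
    for _ in range(epochs):
        g = grad(w)
        if abs(g) < min_grad or time.time() - start > max_time:
            break
        w -= lr * g
    return w

# Minimizing the performance (w - 3)^2; its gradient is 2*(w - 3):
w_final = train_sgd(0.0, lambda w: 2.0 * (w - 3.0))
```

Whichever criterion fires first ends the training, which is why a run can stop well before the epoch limit.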
And to train the network enter:
>> net = train(net,input,target);
Because of the small size of the network, the training is done in only a second or two.
Now we try to simulate the network again, to see how it reacts to the inputs:
>> output = sim(net,input)
output =
0.0000 1.0000 1.0000 0.0000
That was exactly what we wanted the network to output! You may now plot the
output and see that the +'s fall on the o's. Now examine the weights that the training
algorithm has set; do they look like the weights that you found?
>> net.IW{1,1}
ans =
11.0358 -9.5595
16.8909 -17.5570
>> net.LW{2,1}
ans =
25.9797 -25.7624
It is also possible to enter the name of the training algorithm when the network is
created; see help newff for more information.
5.8 GRAPHICAL USER INTERFACE
5.8.1 Introduction to the GUI
The graphical user interface (GUI) is designed to be simple and user friendly, but
we will go through a simple example to get you started.
In what follows you bring up a GUI Network/Data Manager window. This
window has its own work area, separate from the more familiar command line
workspace. Thus, when using the GUI, you might "export" the GUI results to the
(command line) workspace. Similarly you may want to "import" results from the
command line workspace to the GUI.
Once the Network/Data Manager is up and running, you can create a network,
view it, train it, simulate it and export the final results to the workspace. Similarly, you
can import data from the workspace for use in the GUI.
The following example deals with a perceptron network. We go through all the
steps of creating a network and show you what you might expect to see as you go along.
5.8.2 Create a Perceptron Network (nntool)
We create a perceptron network to perform the AND function in this example. It
has an input vector p= [0 0 1 1;0 1 0 1] and a target vector t=[0 0 0 1]. We call
the network ANDNet. Once created, the network will be trained. We can then save the
network, its output, etc., by "exporting" it to the command line.
5.8.3 Input and target
To start, type nntool. The following window appears.
Figure 5.3 Network data manager
Click on Help to get started on a new problem and see descriptions of the buttons and lists.
First, we want to define the network input, which we call p, as having the
particular value [0 0 1 1;0 1 0 1]. Thus, the network has a two-element input, and four sets
of such two-element vectors are presented to it in training. To define this data, click on
New Data, and a new window, Create New Data, appears. Set the Name to p, the Value
to [0 0 1 1;0 1 0 1], and make sure that Data Type is set to Inputs. The Create New Data
window will then look like this:
Figure 5.4 Create new data window
Now click Create to actually create an input file p. The Network/Data Manager
window comes up and p shows as an input.
Next we create a network target. Click on New Data again, and this time enter the
variable name t, specify the value [0 0 0 1], and click on Target under data type.
Again click on Create and you will see in the resulting Network/Data Manager window
that you now have t as a target as well as the previous p as an input.
5.8.4 Create Network
Now we want to create a new network, which we will call ANDNet. To do this,
click on New Network, and a Create New Network window appears. Enter ANDNet
under Network Name. Set the Network Type to Perceptron, for that is the kind of
network we want to create. The input ranges can be set by entering numbers in that field,
but it is easier to get them from the particular input data that you want to use. To do this,
click on the down arrow at the right side of Input Range. This pull-down menu shows
that you can get the input ranges from the file p if you want. That is what we want to do,
so click on p. This should lead to input ranges [0 1;0 1]. We want to use a hardlim
transfer function and a learnp learning function, so set those values using the arrows for
Transfer function and Learning function respectively. By now your Create New
Network window should look like:
Figure 5.5 Create new network window
Next you might look at the network by clicking on View. For example:
Figure 5.6 View network window
This picture shows that you are about to create a network with a single input
(composed of two elements), a hardlim transfer function, and a single output. This is the
perceptron network that we wanted.
Now click Create to generate the network. You will get back the Network/Data
Manager window. Note that ANDNet is now listed as a network.
5.8.5 Train the Perceptron
To train the network, click on ANDNet to highlight it. Then click on Train. This
leads to a new window labeled Network:ANDNet. At this point you can view the
network again by clicking on the top tab Train. You can also check on the initialization
by clicking on the top tab Initialize. Now click on the top tab Train. Specify the inputs
and output by clicking on the left tab Training Info and selecting p from the pull-down
list of inputs and t from the pull-down list of targets. The Network:ANDNet window
should look like:
Figure 5.7 Main network window
Note that the Training Result Outputs and Errors have the
name ANDNet appended to them. This makes them easy to identify later when they are
exported to the command line.
While you are here, click on the Training Parameters tab. It shows you
parameters such as the epochs and error goal. You can change these parameters at this
point if you want.
Now click Train Network to train the perceptron network. You will see the
following training results.
Figure 5.8 training result window
Thus, the network was trained to zero error in four epochs. (Note that other kinds
of networks commonly do not train to zero error, and their errors commonly cover a much
larger range. On that account, we plot their errors on a log scale rather than on a linear
scale such as that used above for perceptrons.)
You can check that the trained network does indeed give zero error by using the
input p and simulating the network. To do this, get to the Network/Data Manager
window and click on Network Only: Simulate. This will bring up the
Network:ANDNet window. Click there on Simulate. Now use the Input pull-down
menu to specify p as the input, and label the output as ANDNet_outputsSim to
distinguish it from the training output. Now click Simulate Network in the lower right
corner. Look at the Network/Data Manager and you will see a new variable in the
output: ANDNet_outputsSim. Double-click on it and a small window
Data:ANDNet_outputsSim appears with the value
[0 0 0 1]
Thus, the network does perform the AND of the inputs, giving a 1 as an output
only in this last case, when both inputs are 1.
5.8.6 Export Perceptron Results to Workspace
To export the network outputs and errors to the MATLAB command line
workspace, click in the lower left of the Network:ANDNet window to go back to the
Network/Data Manager. Note that the output and error for the ANDNet are listed in the
Outputs and Error lists on the right side. Next click on Export. This will give you an
Export or Save from Network/Data Manager window. Click on ANDNet_outputs and
ANDNet_errors to highlight them, and then click the Export button. These two variables
now should be in the command line workspace. To check this, go to the command line
and type who to see all the defined variables. The result should be
who
Your variables are:
ANDNet_errors ANDNet_outputs
You might type ANDNet_outputs and ANDNet_errors to obtain the following
ANDNet_outputs =
0 0 0 1
and
ANDNet_errors =
0 0 0 0.
You can export p, t, and ANDNet in a similar way. You might do this and check
with who to make sure that they got to the command line.
Now that ANDNet is exported you can view the network description and examine
the network weight matrix. For instance, the command
ANDNet.iw{1,1}
gives
ans =
2 1
Similarly,
ANDNet.b{1}
yields
ans =
-3.
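With these exported values, the perceptron's decision can be checked by hand. A Python sketch of the hardlim computation with the weights [2 1] and bias -3 reported above:

```python
def hardlim(n):
    """MATLAB's hardlim: output 1 when the net input is >= 0, else 0."""
    return 1 if n >= 0 else 0

w, b = [2, 1], -3   # the trained weights and bias reported above
outputs = [hardlim(w[0] * p1 + w[1] * p2 + b)
           for p1, p2 in [(0, 0), (0, 1), (1, 0), (1, 1)]]
# outputs == [0, 0, 0, 1], i.e. the AND function
```

Only for the input (1, 1) does the net input 2 + 1 - 3 = 0 reach the hardlim threshold, so only that case fires.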
5.8.7 Clear Network/Data Window
You can clear the Network/Data Manager window by highlighting a variable
such as p and clicking the Delete button until all entries in the list boxes are gone. By
doing this, we start from a clean slate.
Alternatively, you can quit MATLAB. A restart with a new MATLAB, followed
by nntool, gives a clean Network/Data Manager window.
Recall however, that we exported p, t, etc., to the command line from the
perceptron example. They are still there for your use even after you clear the
Network/Data Manager.
5.8.8 Importing from the Command Line
To make things simple, quit MATLAB. Start it again, and type nntool to begin a
new session.
Create a new vector.
r= [0; 1; 2; 3]
r =
0
1
2
3
Now click on Import, and set the destination Name to R (to distinguish between
the variable named at the command line and the variable in the GUI). You will have a
window that looks like this
Figure 5.9 Import / load window
Now click Import and verify by looking at the Network/Data Manager that the
variable R is there as an input.
5.8.9 Save a Variable to a File and Load It Later
Bring up the Network/Data Manager and click on New Network. Set the name to
mynet. Click on Create. The network name mynet should appear in the Network/Data
Manager. In this same manager window click on Export. Select mynet in the variable
list of the Export or Save window and click on Save. This leads to the Save to a MAT
file window. Save to a file mynetfile.
Now let's get rid of mynet in the GUI and retrieve it from the saved file. First go to
the Data/Network Manager, highlight mynet, and click Delete. Next click on Import.
This brings up the Import or Load to Network/Data Manager window. Select the
Load from Disk button and type mynetfile as the MAT-file Name. Now click on
Browse. This brings up the Select MAT file window with mynetfile as an option that
you can select as a variable to be imported. Highlight mynetfile, press Open, and you
return to the Import or Load to Network/Data Manager window. On the Import As
list, select Network. Highlight mynet and click on Load to bring mynet into the GUI. Now
mynet is back in the GUI Network/Data Manager window.
Chapter No. 6
EXPERIMENTAL WORK
The data shown in Table 6.1 was used to train the network. Some of the data
was held back as unseen data to test the network; this is shown in Table 6.2.
6.1 DATA SET USED
C-min = minimum carbon, Mn-min = minimum manganese, P-max = maximum phosphorus,
S-max = maximum sulphur, T-S = tensile strength
S.No.  SAE No.  C-min  Mn-min  P-max  S-max  T-S
 1     1006     0.08   0.25    0.04   0.05    43000
 2     1008     0.10   0.25    0.04   0.05    44000
 3     1009     0.15   0.60    0.04   0.05    43000
 4     1010     0.08   0.30    0.04   0.05    47000
 5     1012     0.10   0.30    0.04   0.05    48000
 6     1015     0.13   0.30    0.04   0.05    50000
 7     1016     0.13   0.60    0.04   0.05    55000
 8     1017     0.15   0.30    0.04   0.05    53000
 9     1018     0.15   0.60    0.04   0.05    58000
10     1019     0.15   0.70    0.04   0.05    59000
11     1020     0.18   0.30    0.04   0.05    55000
12     1022     0.18   0.70    0.04   0.05    62000
13     1023     0.20   0.30    0.04   0.05    56000
14     1024     0.19   1.35    0.04   0.05    74000
15     1025     0.22   0.30    0.04   0.05    58000
16     1027     0.22   1.20    0.04   0.05    75000
17     1030     0.28   0.60    0.04   0.05    68000
18     1033     0.30   0.70    0.04   0.05    72000
19     1035     0.32   0.60    0.04   0.05    72000
20     1037     0.32   0.70    0.04   0.05    74000
21     1038     0.35   0.60    0.04   0.05    75000
22     1039     0.37   0.70    0.04   0.05    79000
23     1040     0.37   0.60    0.04   0.05    76000
24     1042     0.40   0.60    0.04   0.05    80000
25     1043     0.40   0.70    0.04   0.05    82000
26     1045     0.43   0.60    0.04   0.05    82000
27     1046     0.43   0.70    0.04   0.05    85000
28     1050     0.48   0.60    0.04   0.05    90000
29     1052     0.47   1.20    0.04   0.05   108000
30     1055     0.50   0.60    0.04   0.05    94000
31     1060     0.55   0.60    0.04   0.05    98000
32     1065     0.60   0.60    0.04   0.05   100000
33     1070     0.65   0.60    0.04   0.05   102000
34     1074     0.70   0.50    0.04   0.05   105000
35     1078     0.72   0.30    0.04   0.05   100000
36     1084     0.80   0.60    0.04   0.05   119000
37     1085     0.80   0.70    0.04   0.05   121000
38     1086     0.80   0.30    0.04   0.05   112000
39     1090     0.85   0.60    0.04   0.05   122000
40     1095     0.90   0.30    0.04   0.05   120000
Table 6.1. Data set to be used for training the network
S.No. SAE No. C-min Mn-min P-max S-max T-S
 1     1021     0.18   0.60    0.04   0.05    61000
 2     1026     0.22   0.60    0.04   0.05    64000
 3     1036     0.30   1.20    0.04   0.05    83000
 4     1041     0.36   1.35    0.04   0.05    92000
 5     1049     0.46   0.60    0.04   0.05    87000
 6     1064     0.60   0.50    0.04   0.05    97000
 7     1080     0.75   0.60    0.04   0.05   112000
Table 6.2. Data set to be used as unseen data for the network
6.2 METHODOLOGY
The data shown in table 6.1 was used to train the network while the performance
of the network was tested on the data shown in table 6.2.
6.2.1 Algorithm
% p1,p2,p3 and p4 are the variables with the values of C, Mn, P, and S respectively%
p1=[0.08 0.1 0.15 0.08 0.1 0.13 0.13 0.15 0.15 0.15 0.18 0.18 0.2 0.19 0.22 0.22 0.28 0.3
0.32 0.32 0.35 0.37 0.37 0.4 0.4 0.43 0.43 0.48 0.47 0.5 0.55 0.6 0.65 0.7 0.72 0.8 0.8 0.8
0.85 0.9];
p2=[0.25 0.25 0.6 0.3 0.3 0.3 0.6 0.3 0.6 0.7 0.3 0.7 0.3 1.35 0.3 1.2 0.6 0.7 0.6 0.7 0.6
0.7 0.6 0.6 0.7 0.6 0.7 0.6 1.2 0.6 0.6 0.6 0.6 0.5 0.3 0.6 0.7 0.3 0.6 0.3];
p3=[0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04
0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04 0.04
0.04 0.04 0.04 0.04 0.04];
p4=[0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05
0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05
0.05 0.05 0.05 0.05 0.05];
% t is the target value for the above inputs, they will be used for training%
t=[43000 44000 43000 47000 48000 50000 55000 53000 58000 59000 55000 62000
56000 74000 58000 75000 68000 72000 72000 74000 75000 79000 76000 80000 82000
82000 85000 90000 108000 94000 98000 100000 102000 105000 100000 119000
121000 112000 122000 120000];
p=[p1;p2;p3;p4];
% Initializing "net" with one neuron in the hidden layer, tansig transfer function for the
% hidden layer, purelin for the output layer, and trainlm as the training function %
net=newff(minmax(p),[1,1],{'tansig','purelin'},'trainlm');
** Warning in INIT
** Network "input{1}.range" has a row with equal min and max values.
** Constant inputs do not provide useful information.
net.trainparam.show=10;
net.trainparam.goal=0.001;
%Training the network with inputs ‘p’ and target ‘t’%
net=train(net,p,t);
TRAINLM, Epoch 0/100, MSE 6.6312e+009/0.001, Gradient 3.15473e+006/1e-010
TRAINLM, Epoch 4/100, MSE 5.57749e+008/0.001, Gradient 5.05325e-006/1e-010
TRAINLM, Maximum MU reached, performance goal was not met.
Figure 6.1 Training of the network with TRAINLM function and 1 neuron
% Defining input vectors for testing %
in1=[0.18;0.6;0.04;0.05];
in2=[0.22;0.6;0.04;0.05];
in3=[0.3;1.2;0.04;0.05];
in4=[0.36;1.35;0.04;0.05];
in5=[0.46;0.6;0.04;0.05];
in6=[0.6;0.5;0.04;0.05];
in7=[0.75;0.6;0.04;0.05];
NUMBER OF NEURONS = 1
%Simulating the network with inputs 1, 2, 3, 4, 5, 6,and 7%
y=sim(net,in1)
y = 7.7270e+004
y=sim(net,in2)
y = 7.7270e+004
y=sim(net,in3)
y = 8.5667e+004
y=sim(net,in4)
y = 8.5667e+004
y=sim(net,in5)
y = 7.7270e+004
y=sim(net,in6)
y = 7.7270e+004
y=sim(net,in7)
y = 7.8113e+004
NUMBER OF NEURONS = 7
Figure 6.2 Training of the network with 7 neurons
% Simulating the network with inputs 1, 2, 3, 4, 5, 6 and 7 %
y=sim(net,in1)
y = 5.2638e+004
y=sim(net,in2)
y = 6.8513e+004
y=sim(net,in3)
y = 6.8513e+004
y=sim(net,in4)
y = 6.8513e+004
y=sim(net,in5)
y = 8.8258e+004
y=sim(net,in6)
y = 1.0500e+005
y=sim(net,in7)
y = 8.8258e+004
NUMBER OF NEURONS = 9
Figure 6.3 Training of the network with 9 neurons
% Simulating the network with inputs 1, 2, 3, 4, 5, 6 and 7 %
y=sim(net,in1)
y = 5.5400e+004
y=sim(net,in2)
y = 8.2920e+004
y=sim(net,in3)
y = 1.0800e+005
y=sim(net,in4)
y = 1.0800e+005
y=sim(net,in5)
y = 8.2920e+004
y=sim(net,in6)
y = 8.2920e+004
y=sim(net,in7)
y = 8.2920e+004
Figure 6.4 The network
Chapter No. 7
RESULTS AND CONCLUSION
7.1 RESULTS
S.No.  SAE No.  C-min  Mn-min  P-max  S-max  T-S (act)  1 neuron  7 neurons  9 neurons
 1     1021     0.18   0.60    0.04   0.05    61000      77270     52638      55400
 2     1026     0.22   0.60    0.04   0.05    64000      77270     68513      82920
 3     1036     0.30   1.20    0.04   0.05    83000      85667     68513     108000
 4     1041     0.36   1.35    0.04   0.05    92000      85667     68513     108000
 5     1049     0.46   0.60    0.04   0.05    87000      77270     88258      82920
 6     1064     0.60   0.50    0.04   0.05    97000      77270    105000      82920
 7     1080     0.75   0.60    0.04   0.05   112000      78113     88258      82920
Average Error (%)                                        18.6062   16.26343   18.26556
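The thesis does not state the error formula, but the tabulated averages are reproduced exactly by the mean of |actual - predicted| / predicted, expressed as a percentage. A Python sketch for the 1-neuron column:

```python
actual    = [61000, 64000, 83000, 92000, 87000, 97000, 112000]
predicted = [77270, 77270, 85667, 85667, 77270, 77270, 78113]  # 1 neuron

# Average error = mean of |actual - predicted| / predicted, in percent.
errors = [abs(a - p) / p * 100 for a, p in zip(actual, predicted)]
avg_error = sum(errors) / len(errors)   # about 18.61 for the 1-neuron network
```

The same formula applied to the 7-neuron column gives 16.26, matching the table, which confirms the measure being used.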
Weights to layer 1 from input
net.iw{1,1}
ans =
1.0e+004 *
0.8415 -0.2418 0.0978 0.1222
0.5361 -5.3917 -0.1597 -0.1996
-4.1999 2.2293 -0.2352 -0.2940
4.8301 -1.5508 0.0239 0.0299
-0.2510 -0.9045 -0.0411 -0.0513
-1.7587 3.4961 -0.1782 -0.2228
-0.9265 -2.8471 0.1268 0.1585
0.5230 -6.9870 0.0877 0.1096
-2.4701 -6.0088 0.1428 0.1785
Weights to layer 2 from layer 1
net.lw{2,1}
[9468.7077 27719.2353 -11582.5085 16750 3650.5453 -20946.5199 -12540 16540 -30300]
Bias to layer 1
net.b{1}
[24454.0639; -39904.3358; -58806.4363; 5984.5838; -10268.4537; -44553.1615;
31696.3873; 21915.2677; 35697.9616]
Bias to layer 2
net.b{2}
[54322.0444]
Graph 7.1 Comparisons of actual value and predicted value with 1 neuron (log-scale plot)
Graph 7.2 Comparisons of actual value and predicted value with 7 neurons (log-scale plot)
Graph 7.3 Comparisons of actual value and predicted value with 9 neurons (log-scale plot)
Graph 7.4 Regression line for the values predicted and actual
7.2 CONCLUSION
The mechanical properties of plain carbon steels were predicted using a feed-forward
back-propagation artificial neural network, with an average error of 18.61% with 1 neuron,
16.26% with 7 neurons and 18.27% with 9 neurons. One reason for these errors is the use of
constant data for sulfur and phosphorus: constant inputs provide no useful information to the
network. Using more parameters and experimental data can reduce the errors. Overall, the
performance of the neural network was very satisfactory; it is a highly significant and
beneficial tool in the design, development and analysis of plain carbon steels that can
result in increased efficiency and productivity.
7.3 FUTURE WORK
In the future, more data and parameters can be used to extend the present work, such as:
• Process of manufacturing
• Heat treatment performed
• Type of product
• Mechanical working done
A genetic algorithm may be applied to obtain reverse predictions, such as obtaining the
composition by using the mechanical properties as input parameters.
APPENDICES
APPENDIX A
DEFINITION OF TERMS
Activation
The time-varying value that is the output of a neuron. The activation
function maps a neuron's net input to this value.
Backpropagation (generalized delta-rule)
A name given to the process by which a multilayer (Perceptron-style) neural
network is "trained" to produce good responses to a set of input patterns. In
light of this, such a network is sometimes called a "back-prop" network.
Bias
A quantity added to a neuron's weighted input sum; the bias (or threshold)
sets the amount that incoming neural activations must exceed in order for the
neuron to fire.
Connectivity
The amount of interaction in a system, the structure of the weights in a
neural network, or the relative number of edges in a graph.
Pattern recognition
The act of identifying patterns within previously learned data. A neural
network can carry this out even in the presence of noise or when some data is
missing.
Epoch
One complete presentation of the training set to the network during
training.
Input layer
Neurons whose inputs are fed from the outside world.
Learning algorithms (supervised, unsupervised)
An adaptation process whereby synapses, the weights of a neural network,
classifier strengths, or some other set of adjustable parameters is automatically
modified so that some objective is more readily achieved. The backpropagation
and bucket brigade algorithms are two types of learning procedures.
Learning rule
The algorithm used for modifying the connection strengths, or weights, in
response to training patterns while training is being carried out.
Layer
A group of neurons that have a specific function and are processed as a
whole. The most common example is in a feedforward network that has an input
layer, an output layer and one or more hidden layers.
Monte-Carlo method
The Monte-Carlo method provides approximate solutions to a variety of
mathematical problems by performing statistical sampling experiments on a
computer.
Multilayer-perceptron (MLP)
A type of feedforward neural network that is an extension of the
perceptron in that it has at least one hidden layer of neurons. Layers are updated
by starting at the inputs and ending with the outputs. Each neuron computes a
weighted sum of the incoming signals, to yield a net input, and passes this value
through its sigmoidal activation function to yield the neuron's activation value.
Unlike the perceptron, an MLP can solve linearly inseparable problems.
Neural Network (NN)
A network of neurons that are connected through synapses or weights.
Each neuron performs a simple calculation that is a function of the activations of
the neurons that are connected to it. Through feedback mechanisms and/or the
nonlinear output response of neurons, the network as a whole is capable of
performing extremely complicated tasks, including universal computation and
universal approximation. Three different classes of neural networks are
feedforward, feedback, and recurrent neural networks, which differ in the degree
and type of connectivity that they possess.
Neuron
A simple computational unit that performs a weighted sum on incoming
signals, adds a threshold or bias term to this value to yield a net input, and maps
this last value through an activation function to compute its own activation. Some
neurons, such as those found in feedback or Hopfield networks, will retain a
portion of their previous activation.
Output neuron
A neuron within a neural network whose outputs are the result of the
network.
Perceptron
An artificial neural network capable of simple pattern recognition and
classification tasks. It is composed of three layers where signals only pass forward
from nodes in the input layer to nodes in the hidden layer and finally out to the
output layer. There are no connections within a layer.
Sigmoid function
An S-shaped function that is often used as an activation function in a
neural network.
Threshold
A quantity added to (or subtracted from) the weighted sum of inputs into a
neuron, which forms the neuron's net input. Intuitively, the threshold sets the
amount that the incoming neural activations must exceed in order for a neuron
to fire.
Training set
A neural network is trained using a training set, which comprises example input stimuli for the problem to be solved, together with the corresponding desired outputs. In some computing systems the training set is called the "facts" file.
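As an illustration, a training set for the XOR problem could be laid out as plain Python lists, pairing each input stimulus with its desired output (a hypothetical layout, not the format of any particular tool's "facts" file):

```python
# Each row of `inputs` is one stimulus; `targets` holds the desired outputs.
inputs  = [[0, 0], [0, 1], [1, 0], [1, 1]]
targets = [0, 1, 1, 0]

# Training repeatedly presents these pairs and adjusts the weights
# to reduce the error between the actual and desired outputs.
for x, t in zip(inputs, targets):
    print(x, "->", t)
```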
Weight
In a neural network, the strength of a synapse (or connection) between two
neurons. Weights may be positive (excitatory) or negative (inhibitory). The
thresholds of a neuron are also considered weights, since they undergo adaptation
by a learning algorithm.
APPENDIX B
Network Layers
The term "layer" in the neural network sense means different things to different people. In the Neural Network Toolbox (NNT), a layer is defined as a layer of neurons, with the exception of the input layer. So in NNT terminology this would be a one-layer network:
Figure: a one-layer network.
And this would be a two-layer network:
Figure: a two-layer network.
Each layer has a number of properties, the most important being the transfer functions of its neurons and the function that computes each neuron's net input from its weights and the output of the previous layer.
Activation Functions
When a neuron updates, it passes the sum of the incoming signals through an activation function, or transfer function as Matlab calls it. There are different types of activation functions: some are saturating and ensure that the output value lies within a specific range, such as logsig, tansig, hardlims, and satlin, while others, such as purelin, are not saturating. Some of the transfer functions in the Neural Network Toolbox are plotted in figure 5.3. The transfer function is chosen when you create the network and is assigned to each layer. To create a feedforward network with two inputs, three tansig neurons in the hidden layer, and one logsig neuron in the output layer, enter:

>> net = newff([0 1; 0 1], [3 1], {'tansig','logsig'});
Figure: Transfer functions supplied by Matlab, plotted on the same scale. Note the difference between tansig and logsig: tansig ranges between -1 and 1, while logsig ranges between 0 and 1. The same relationship holds between hardlims and hardlim, and between satlins and satlin.
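For reference, the transfer functions named above can be sketched in Python using their standard mathematical definitions (the names follow the toolbox's conventions; these are illustrative equivalents, not the toolbox code):

```python
import math

def logsig(n):
    """Log-sigmoid: saturating, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-n))

def tansig(n):
    """Hyperbolic-tangent sigmoid: saturating, output in (-1, 1)."""
    return math.tanh(n)

def hardlim(n):
    """Hard limit: 0 or 1."""
    return 1.0 if n >= 0 else 0.0

def hardlims(n):
    """Symmetric hard limit: -1 or 1."""
    return 1.0 if n >= 0 else -1.0

def satlin(n):
    """Saturating linear: clipped to [0, 1]."""
    return min(1.0, max(0.0, n))

def purelin(n):
    """Pure linear: unbounded, returns n itself."""
    return n
```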
BIBLIOGRAPHY
1. R. L. Timings, "Engineering Materials", Vol. 1, Longman Group, United Kingdom, 1994.
2. H. N. Koivo, "Neural Networks: Basics Using Matlab, Neural Network Toolbox", USA, 2000.
3. Iqbal Shah, "Tensile Properties of Austenitic Stainless Steels", UK, 2002.
4. H. K. D. H. Bhadeshia, "Neural Networks in Materials Science", ISIJ International, Vol. 39, No. 10, 1999, pp. 966-979.
5. E. P. DeGarmo, "Materials and Processes in Manufacturing", 9th edition, John Wiley and Sons, United States, 2003.
6. Vito J. Colangelo, "Analysis of Metallurgical Failures", 2nd edition, John Wiley and Sons, Singapore, 1987.
7. O. P. Khanna, "Text Book of Metallurgy and Materials Engineering", India, 2002.
8. M. H. Jokhio, M. A. Unar, "Application of Neural Network in Powder Metallurgy", Engineering Materials Proceedings, 2004.
9. Internet web site: www.ide.his.se
10. H. Demuth, M. Beale, "Neural Network Toolbox for Matlab", MathWorks Inc., USA, 2000.
11. Internet web site: www.astm.org
12. Internet web site: www.mathworks.com
13. Internet web site: www.igi.trgraz.at
14. Internet web site: www.cs.wisc.edu
15. Internet web site: www.statsoftinc.com
16. Internet web site: www.azom.com
17. Internet web site: http://carol.wins.ura.nl
18. Internet web site: http://envistat.esa.cn
19. Internet web site: http://www.brain.web-us.com
20. Internet web site: http://njuct.edu.cn
21. Internet web site: www.baldwininternational.com
22. Internet web site: www.cs.man.ac.uk
23. Internet web site: www.tms.org/pubs/jom.htm
24. Internet web site: http://www.torch.ch/matos/convolutions.pdf
25. Carlos Gershenson, "Artificial Neural Networks for Beginners", UK, 2003.
26. Internet web site: www.benbest.com
27. Ivan Galkin, U. Mass Lowell, "Crash Introduction to Artificial Neural Networks" (Materials for UML 91.531 Data Mining Course).