
ARTIFICIAL NEURAL NETWORK ANALYSIS OF MULTIPLE IBA SPECTRA

H.F.R. Pinho 1,2, A. Vieira 2, N.R. Nené 1, N.P. Barradas 1,3

1 Instituto Tecnológico e Nuclear, E.N. 10, 2685 Sacavém, Portugal; 2 ISEP, R. São Tomé, 4200 Porto, Portugal; 3 Centro de Física Nuclear da Universidade de Lisboa, Avenida Prof. Gama Pinto 2, 1699 Lisboa, Portugal

1. INTRODUCTION

We have previously used artificial neural networks (ANNs) for the automated analysis of Rutherford backscattering (RBS) data [1,2].

One limitation was that only a single spectrum could be analyzed from each sample.

When more than one spectrum is collected, each has to be analyzed separately, leading to different results and hence reduced accuracy.

Furthermore, often complementary data are collected, either using different experimental conditions, or even using a different technique such as elastic recoil detection (ERDA) or particle-induced x-ray analysis. In that case, simultaneous and self-consistent analysis of all data is essential.

In this work we developed an ANN-based code to analyze multiple RBS and ERDA spectra collected from the same sample.

The ANN developed was applied to a very simple case: determination of the stoichiometry of TiNOH samples.

3. EXPERIMENTAL CONDITIONS

Samples: 15 TiNxOyHz samples were deposited by reactive rf magnetron sputtering from a high-purity Ti target (99.731 %) onto polished high-speed stainless steel (AISI M2).

Spectra: 35 MeV 35Cl7+ ERDA spectra were collected. Five spectra were obtained simultaneously from each sample: one recoil spectrum for each element, and the backscattering spectrum from the Ti (fig. 2). Conventional data analysis was done with the code NDF [4].

2. ARTIFICIAL NEURAL NETWORKS

• Recognize recurring patterns in input data, without Physics knowledge.

• Ideal to do automatically what analysts have long done: relate specific data features to specific sample properties… because spectra can be treated just as pictures.

• How? By supervised learning from known examples:
- training phase: give the ANN many examples where the solution is known beforehand;
- let the ANN adjust itself to this training set;
- check it against an independent set of data: the test set;
- repeat until no improvement can be found.
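The training and test procedure above can be sketched with a toy one-parameter model; all names, values, and the stopping rule below are illustrative, not taken from the original analysis:

```python
import numpy as np

rng = np.random.default_rng(0)

# Known examples: inputs x whose outputs y = 2x (plus noise) are known beforehand.
x_train = rng.uniform(0, 1, 200)
y_train = 2 * x_train + rng.normal(0, 0.05, 200)
x_test = rng.uniform(0, 1, 50)            # independent test set
y_test = 2 * x_test + rng.normal(0, 0.05, 50)

w = 0.0                                   # the model's single adjustable weight
best_err, stall, patience = np.inf, 0, 10

for step in range(1000):
    # Let the model adjust itself to the training set (one gradient step).
    grad = np.mean(2 * (w * x_train - y_train) * x_train)
    w -= 0.5 * grad
    # Check against the independent test set.
    err = np.mean((w * x_test - y_test) ** 2)
    if err < best_err - 1e-12:
        best_err, stall = err, 0
    else:
        stall += 1
        if stall >= patience:             # no improvement found: stop
            break

# w ends close to the true slope of 2
```

The same loop structure (fit on the train set, monitor an independent test set, stop when the test error plateaus) carries over unchanged to multi-layer networks.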

Network Architecture
• (N, I1, …, In, M) (fig. 1)
N: number of inputs
M: number of outputs
Ii: number of nodes in hidden layer i

• Present work
Without pre-processing:
N = 130 yield values (26 channels of each spectrum, one after the other, including the leading edge)
M = 4 (concentration of each element)
With pre-processing:
N = 5 (integrals of the 26 channels of each spectrum)
M = 4 (concentration of each element)
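The two input encodings can be sketched as follows, a minimal illustration assuming the five spectra of one sample are stacked in a (5, 26) array:

```python
import numpy as np

# 5 spectra (RBS: Ti; ERDA: Ti, N, O, H), 26 channels each.
rng = np.random.default_rng(1)
spectra = rng.poisson(100.0, size=(5, 26)).astype(float)

raw_inputs = spectra.ravel()        # without pre-processing: N = 130 yields
integrals = spectra.sum(axis=1)     # with pre-processing: N = 5 integrals
```

The raw encoding keeps the shape of each spectrum; the integral encoding discards it but shrinks the network dramatically.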

Network Connectivity
• Fully linked: full connectivity (standard architecture) from all nodes in one layer to all nodes in the next layer.
• Cluster linked: the input corresponding to each spectrum (either its 26 yields or their integral) is treated as one cluster, connected only to a given set of nodes in the first hidden layer. The other clusters, corresponding to the other spectra, are not connected to that set of nodes.
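One way to realise cluster linking in code is a block-diagonal mask on the first-layer weight matrix, so that each spectrum's channels reach only its own group of hidden nodes. A minimal sketch (the choice of 2 nodes per cluster is illustrative):

```python
import numpy as np

n_spectra, n_channels = 5, 26            # 130 inputs in total
nodes_per_cluster = 2                    # illustrative first-hidden-layer split
n_hidden = n_spectra * nodes_per_cluster

# Block-diagonal mask: cluster k's channels connect only to cluster k's nodes.
mask = np.zeros((n_hidden, n_spectra * n_channels))
for k in range(n_spectra):
    rows = slice(k * nodes_per_cluster, (k + 1) * nodes_per_cluster)
    cols = slice(k * n_channels, (k + 1) * n_channels)
    mask[rows, cols] = 1.0

rng = np.random.default_rng(2)
W1 = rng.normal(0, 0.1, mask.shape) * mask   # forbidden connections stay zero

x = rng.poisson(100.0, n_spectra * n_channels).astype(float)
h = np.tanh(W1 @ x)                          # first-hidden-layer activations
```

During training the gradient update would be multiplied by the same mask, so the forbidden connections remain zero.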

Backpropagation
• Each connection has a given weight, initially random;
• give known values to the inputs and outputs;
• adjust the weights to minimize the difference between the given outputs and the calculated ones.
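A minimal NumPy sketch of this procedure for a fully linked (130, 10, 4) network; the data, learning rate, and initialisation here are illustrative, not those of the networks actually trained:

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative data: 64 examples, 130 inputs (yields), 4 outputs (concentrations).
X = rng.normal(size=(64, 130))
Y = rng.uniform(size=(64, 4))

# Each connection has a given weight, initially random.
W1 = rng.normal(0, 0.1, (130, 10))
W2 = rng.normal(0, 0.1, (10, 4))
lr = 0.01

def forward(X):
    H = np.tanh(X @ W1)                  # hidden-layer activations
    return H, H @ W2                     # calculated outputs

_, out = forward(X)
loss_before = np.mean((out - Y) ** 2)

for _ in range(50):                      # 50 iteration steps
    H, out = forward(X)
    d_out = 2 * (out - Y) / len(X)       # gradient of the squared difference
    dW2 = H.T @ d_out                    # backpropagate to the output weights...
    dH = (d_out @ W2.T) * (1 - H ** 2)   # ...through the tanh nonlinearity...
    dW1 = X.T @ dH                       # ...to the input weights
    W1 -= lr * dW1                       # adjust the weights
    W2 -= lr * dW2

_, out = forward(X)
loss_after = np.mean((out - Y) ** 2)     # smaller than loss_before
```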

ANN Training: 1800 examples, corresponding to a very broad range of possible elemental concentrations, were used to train the net. These examples were constructed from simulated experimental data (fig. 3) with Poisson noise added. The training consisted of 50 iteration steps.

ANN Testing: 200 independent examples (not used during training) were used to test the trained networks.

A thin carbon layer of varying thickness was considered, in order to emulate real-life conditions, where such layers are often deposited during the experiment due to poor vacuum conditions.
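The construction of such a training set can be sketched as below; `make_example` is a hypothetical stand-in for the NDF-style spectrum simulation, not the actual physics code:

```python
import numpy as np

rng = np.random.default_rng(4)

def make_example(concentrations):
    """Hypothetical stand-in for an NDF-style simulation: returns a noiseless
    (5, 26) set of spectra for given (Ti, N, O, H) concentrations."""
    heights = np.append(concentrations, concentrations.sum())  # 5 spectra
    return np.outer(heights, np.linspace(50, 150, 26))

# Build the training set: random concentrations, simulate, add Poisson noise.
examples, targets = [], []
for _ in range(1800):
    c = rng.dirichlet(np.ones(4))       # broad range of concentrations, sum = 1
    noisy = rng.poisson(make_example(c))  # counting statistics (Poisson noise)
    examples.append(noisy.ravel())      # 130 inputs per example
    targets.append(c)                   # 4 outputs per example

X = np.array(examples, dtype=float)
Y = np.array(targets)
```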

4. TRAIN AND TEST SET

6. SUMMARY AND OUTLOOK

• We developed artificial neural networks capable of analysing multiple ion beam analysis spectra collected from the same sample. The ANNs were applied to a simple problem, the determination of the stoichiometry of TiNOH samples measured with heavy ion ERDA. This allowed us to make a thorough study of network architecture, connectivity, and effectiveness of pre-processing.

• Small networks using the spectral yields with no pre-processing achieve the best results in real experimental data. However, the ANNs must be cluster-linked so that each spectrum is only connected to a subset of nodes in the first hidden layer exclusively dedicated to that spectrum.

• Effective automatic pre-processing (as opposed to a priori pre-processing) is achieved, leading to very efficient networks that are easy to train.

• We expect this type of architecture to be easy to scale up to complex multiple IBA spectra problems.

The authors gratefully acknowledge the financial support of FCT under grant POCTI/CTM/40059/2001.

5. RESULTS

Fig. 1: Schematic of the network architecture, from the inputs, through the hidden layers, to the outputs.

[1] N.P. Barradas and A. Vieira, Phys. Rev. E 62 (2000) 5818.
[2] V. Matias, G. Öhl, J.C. Soares, N.P. Barradas, A. Vieira, P.P. Freitas, S. Cardoso, Phys. Rev. E 67 (2003) 046705.
[3] E. Alves, A. Ramos, N.P. Barradas, F. Vaz, P. Cerqueira, L. Rebouta, U. Kreissig, Surf. Coatings Technol. 180-181 (2004) 372.
[4] N.P. Barradas, C. Jeynes, and R.P. Webb, Appl. Phys. Lett. 71 (1997) 291.
[5] C.M. Bishop, Neural Networks for Pattern Recognition (Oxford: Oxford University Press, 1995).

Fig. 2: Yield (counts) vs. channel for the five spectra of one sample: RBS: Ti; ERDA: Ti; ERDA: N; ERDA: O; ERDA: H.

Architecture    erms (train set)  erms (test set)  (Ti)   (N)    (O)    (H)
(130,5,4)       0.024             0.025            0.029  0.043  0.049  0.147
(130,15,4)      0.016             0.017            0.032  0.046  0.045  0.081
(130,25,4)      0.016             0.017            0.033  0.046  0.043  0.074
(130,50,4)      0.015             0.016            0.029  0.040  0.046  0.083
(130,10,5,4)    0.017             0.018            0.035  0.051  0.052  0.080
(130,20,10,4)   0.015             0.017            0.029  0.047  0.043  0.097
(130,20,20,4)   0.013             0.015            0.033  0.055  0.051  0.111
(130,30,20,4)   0.015             0.016            0.022  0.047  0.044  0.081
(130,40,20,4)   0.015             0.016            0.030  0.042  0.044  0.081
(130,40,30,4)   0.015             0.015            0.025  0.055  0.046  0.076
(130,50,30,4)   0.016             0.017            0.033  0.037  0.043  0.100
(130,50,40,4)   0.015             0.016            0.022  0.043  0.041  0.058

Architecture    erms (train set)  erms (test set)  (Ti)   (N)    (O)    (H)
(5,5,4)         0.030             0.029            0.020  0.055  0.044  0.134
(5,15,4)        0.027             0.025            0.022  0.049  0.045  0.093
(5,25,4)        0.027             0.025            0.019  0.052  0.033  0.102
(5,50,4)        0.029             0.027            0.019  0.059  0.038  0.089
(5,10,5,4)      0.031             0.030            0.025  0.043  0.040  0.124
(5,20,10,4)     0.028             0.027            0.021  0.055  0.040  0.097
(5,20,20,4)     0.027             0.026            0.025  0.049  0.045  0.105
(5,30,20,4)     0.037             0.036            0.025  0.051  0.044  0.106
(5,40,20,4)     0.037             0.037            0.023  0.060  0.039  0.109
(5,40,30,4)     0.026             0.026            0.021  0.046  0.035  0.113
(5,50,30,4)     0.027             0.026            0.022  0.046  0.045  0.081
(5,50,40,4)     0.035             0.032            0.021  0.046  0.045  0.097

Architecture    erms (train set)  erms (test set)  (Ti)   (N)    (O)    (H)
(130,5,4)       0.025             0.024            0.019  0.053  0.037  0.079
(130,10,4)      0.019             0.018            0.027  0.054  0.036  0.054
(130,15,4)      0.019             0.020            0.023  0.059  0.040  0.084
(130,25,4)      0.021             0.021            0.025  0.047  0.042  0.092
(130,50,4)      0.020             0.020            0.029  0.041  0.041  0.107
(5,5,4)         0.033             0.032            0.017  0.057  0.038  0.107
(5,10,4)        0.031             0.030            0.019  0.064  0.036  0.095
(5,15,4)        0.031             0.029            0.018  0.061  0.041  0.077
(5,25,4)        0.031             0.029            0.019  0.062  0.040  0.102
(5,50,4)        0.029             0.028            0.024  0.063  0.033  0.101

erms: root-mean-square error, i.e. the relative error averaged over all samples measured, taking as reference the results given by NDF (fig. 3 and tabs. 1, 2, 3).
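As a sketch, this figure of merit can be computed as below; the function and the numbers are an illustrative reconstruction of the definition above, not the actual evaluation code:

```python
import numpy as np

def e_rms(ann_conc, ndf_conc):
    """Root-mean-square relative error of ANN-derived concentrations,
    taking the NDF results as reference (illustrative reconstruction)."""
    ann = np.asarray(ann_conc, dtype=float)
    ndf = np.asarray(ndf_conc, dtype=float)
    rel = (ann - ndf) / ndf              # relative error per sample
    return np.sqrt(np.mean(rel ** 2))    # averaged over all samples

# Made-up example: three samples' Ti concentrations.
ndf = np.array([0.50, 0.52, 0.48])       # reference values from NDF
ann = np.array([0.51, 0.50, 0.49])       # ANN predictions
err = e_rms(ann, ndf)
```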

Tab. 1: Fully linked networks without pre-processing. Tab. 2: Fully linked networks with pre-processing. Tab. 3: Cluster linked networks with and without pre-processing.

Fig. 3: Schematic of the procedure: NDF generates simulated RBS and ERDA measurements, which form the train set and test set used in the ANN training process, leading to the results.

Fig. 4: Normalised test set error vs. training iteration (0 to 50) for the (130,10,4) architecture, cluster linked and fully linked.