Prediction of Ac3 and Martensite Start Temperatures by a ...



ISIJ International, Vol. 57 (2017), No. 12

© 2017 ISIJ

ISIJ International, Vol. 57 (2017), No. 12, pp. 2229–2236

* Corresponding author: E-mail: [email protected]

DOI: http://dx.doi.org/10.2355/isijinternational.ISIJINT-2017-212

Prediction of Ac3 and Martensite Start Temperatures by a Data-driven Model Selection Approach

Hoheok KIM,1)* Junya INOUE,1,2) Masato OKADA3) and Kenji NAGATA3)

1) Graduate School of Materials Engineering, The University of Tokyo, 7-3-1 Bunkyo, Hongo, Tokyo, 113-8656 Japan.
2) Research Center for Advanced Science and Technology, The University of Tokyo, 4-6-1 Komaba, Meguro, Tokyo, 153-0041 Japan.
3) Graduate School of Frontier Science, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba, 277-8651 Japan.

(Received on April 13, 2017; accepted on August 2, 2017)

Four different information criteria, which are widely used for model selection problems, are applied to reveal the explanatory variables for two phase transformation temperatures of steels, the austenite transformation temperature (Ac3) and the martensite-start temperature (Ms). Using existing datasets of CCT diagrams for various steels, predictive equations for these critical temperatures are derived. A number of empirical equations have been proposed to enable efficient prediction of the Ac3 and Ms temperatures of steels. However, the key parameters in those equations are usually chosen by researchers through trial and error. In this study, the performance of the information criteria is first evaluated using a simulated dataset mimicking the characteristics of the datasets for the Ac3 and Ms temperatures. Then the criteria are applied to experimental data obtained from two different sources. The key parameters are chosen for the Ac3 and Ms temperatures, and the derived equations are found to be in better agreement with the experimental data than the previous empirical equations. Thus, it was clarified that the methods can be applied to automatically discover the hidden mechanism from complex multi-dimensional datasets of steels’ chemical composition.

KEY WORDS: steel; martensite start temperature; Ac3 temperature; modeling; model selection criterion; AIC; ABIC; BIC; cross validation.

1. Introduction

Due to the importance of phase transformations and heat treatments for the mechanical properties of steels, a large number of studies have been conducted to clarify the effect of various alloying elements on phase transformation temperatures, such as the martensite-start temperature (Ms) and the austenite transformation temperature (Ac3). As a result, various empirical models have been proposed, and the most commonly used equations are listed in Table 1. 1–7) All these equations were derived by multivariate regression analysis using large experimental datasets composed of several tens or hundreds of steels with various chemical compositions.

However, the key parameters in these equations, which are considered to have notable effects on the phase transformation temperatures, were selected mostly on the basis of the insight of experienced researchers. As a result, there are several disagreements on how alloying elements affect phase transformation temperatures. For instance, the terms corresponding to the effect of carbon on Ac3 are totally different between the equations by Andrews and Hougardy.1,2)

For reliable prediction, therefore, determination of the key parameters is the most important process, which unfortunately is not done automatically by conventional multivariate regression analysis. Accordingly, several data-driven approaches have been introduced. For instance, a Bayesian neural network has been used to establish predictive models for Ac3 by Vermeulen et al.8) and for Ms by Sourmail and Garcia-Mateo,9) and it was demonstrated that this data-driven approach both enables automatic model selection and provides superior estimation of those transformation temperatures.

Although the neural network approach does provide a good estimation, it does not allow the explicit separation of the different roles of alloying elements. Of course, it is possible to infer the effect of each alloying element from the output of the neural network for a series of controlled inputs. However, from an engineering point of view, the explicit correlation is usually more important, so the previous empirical regression models are still widely used.10,11)

Recently, another data-driven approach called data-driven model selection has been increasingly used to solve physical problems. Data-driven model selection is a method in which the model most suitable for a given dataset is selected based on information criteria. There exist several such criteria, such as the Akaike information criterion (AIC) by Akaike,12) the Bayesian information criterion (BIC) by Schwarz,13) Akaike’s Bayesian information criterion (ABIC) by Akaike,14) and cross-validation (CV).15) Some applications of these methods in materials engineering are as follows: Al-Rubaie et al.16) used the AIC to select the model for fatigue crack growth rate, and Cockayne and van de Walle17) obtained a cluster expansion model for the CaZr1−xTixO3 solid solution by applying CV.

All of these criteria can be used to select the model most suitable for a given dataset, but differences in their assumptions may lead to different performance under different circumstances, such as the amount of data and the kinds of models to be selected. Therefore, a comprehensive understanding of the behavior of each model selection criterion is needed when applying it to a specific problem.

In this paper, the AIC, BIC, ABIC, and CV are first briefly explained, and then the performance of each model selection criterion is evaluated using a simulated dataset that reflects the characteristics of the databases of Ac3 and Ms temperatures. Finally, the criteria are applied to determine the key parameters for the Ac3 and Ms temperatures using two data sources: one is exactly the same dataset as that used by Andrews,1) and the other is taken from the CCT database for steels provided by the NIMS Materials Database (MatNavi).18)

2. Likelihood Function and Model Selection Criteria

This section presents a brief review of the basis of typical model selection criteria.

2.1. Likelihood Function

In many scientific studies, the objective is to find the underlying relationship that yields the data. This problem usually leads to the evaluation of a set of candidate models. In such a problem, a likelihood is used to estimate the probability of a model for given observations. Assuming that each observation yn includes Gaussian noise in a model and parameters, the likelihood of yn is given by Eq. (1), where x, θ, and s indicate the input data, the parameters, and the standard deviation of noise, respectively.

$$
p(y_n \mid x_n, \theta) = \frac{1}{\sqrt{2\pi s^2}} \exp\left( -\frac{1}{2 s^2} \left( y_n - \theta x_n \right)^2 \right) \tag{1}
$$

If each observation is generated independently, the likelihood function of the entire data, p(y|x, θ), can be represented by

$$
p(y \mid x, \theta) = \prod_{n=1}^{N} p(y_n \mid x_n, \theta) = \frac{1}{\left( 2\pi s^2 \right)^{N/2}} \exp\left( -\frac{1}{2 s^2} \sum_{n=1}^{N} \left( y_n - \theta x_n \right)^2 \right) \tag{2}
$$

where N represents the number of data. The model parameters that maximize the likelihood are chosen for the given data. This maximizes the agreement of the candidate model with the observations, and the method is known as maximum likelihood estimation (MLE). With MLE, the maximum likelihood L̂ is written as

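$$
\hat{L} = p(y \mid x, \hat{\theta}) = \frac{1}{\left( 2\pi s^2 \right)^{N/2}} \exp\left( -\frac{1}{2 s^2} \sum_{n=1}^{N} \left( y_n - \hat{\theta} x_n \right)^2 \right) \tag{3}
$$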

where θ̂ is a parameter determined by the least-squares method. In other words, the probability of a model with the optimized parameters for given data can be calculated by MLE.
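As an illustration of Eqs. (1) to (3), the following minimal Python sketch fits a linear model by least squares and evaluates the logarithm of the maximum likelihood. It is not the authors' code: the function name `max_log_likelihood`, the synthetic data, and the choice to estimate the noise variance by its maximum likelihood value (residual sum of squares divided by N) are assumptions made here for illustration.

```python
import numpy as np


def max_log_likelihood(X, y):
    """Least-squares fit of y = X @ theta + Gaussian noise.

    Returns (theta_hat, s2_hat, lnL_hat), where lnL_hat is the log of
    Eq. (3) evaluated at theta_hat. Assumption: the noise variance s^2
    is replaced by its maximum likelihood estimate RSS / N.
    """
    N = len(y)
    theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ theta_hat
    s2_hat = float(resid @ resid) / N
    # ln L = -(N/2) ln(2 pi s^2) - RSS / (2 s^2); with s^2 = RSS / N the
    # second term reduces to -N/2.
    lnL_hat = -0.5 * N * (np.log(2.0 * np.pi * s2_hat) + 1.0)
    return theta_hat, s2_hat, lnL_hat


# Hypothetical data: three inputs, one of which has no effect on y.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 3))
y = X @ np.array([3.0, 0.0, 7.0]) + rng.normal(0.0, 1.0, size=200)
theta_hat, s2_hat, lnL_hat = max_log_likelihood(X, y)
print(theta_hat, s2_hat, lnL_hat)
```

Working with the log-likelihood rather than the likelihood itself avoids numerical underflow when N is large.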

2.2. Model Selection Criteria

Simply comparing candidate models by their maximum likelihood may lead to overfitting. This occurs when a model is too complex, for example when it has too many parameters relative to the number of observations. The model then describes error or noise instead of the underlying relationship. This problem is illustrated in Fig. 1(a). A model that is too simple cannot describe the given data precisely and also gives poor predictions. When the complexity is moderate, the model fits both the given data and new observations successfully. A more complex model, on the other hand, describes the given data perfectly but fails to predict new observations. As shown in Fig. 1(b), the disagreement of a model with the given data, which is usually expressed as the training error, decreases as the model becomes more complex. However, the prediction error of the model for unseen data is high when the model is too simple or too complex and low when the complexity is moderate. Therefore, the maximum likelihood, which measures how well a model fits the observations, should not be used as the only indicator when choosing the best model. For this reason, a variety of methods have been suggested that add a penalty term to the maximum likelihood to account for model complexity.

Table 1. A summary of past equations for the Ac3 and Ms temperatures based on the chemical composition of steels.

Ac3 temperature

| Proposed by | Equation |
|---|---|
| Andrews (1965) | Ac3(°C) = 910 − 203C^(1/2) + 44.7Si − 15.2Ni + 31.5Mo + 104.4V + 13.1W |
| Hougardy (1984) | Ac3(°C) = 902 − 255C + 19Si − 11Mn − 5Cr + 13Mo − 20Ni + 55V |
| Kasatkin et al. (1984) | Ac3(°C) = 912 − 370C − 27.4Mn + 27.3Si − 6.35Cr − 32.7Ni + 95.2V + 190Ti + 72Al + 65.6Nb + 5.57W + 332S + 276P + 485N − 900B + 16.2C·Mn + 32.3C·Si + 15.4C·Cr + 48C·Ni + 4.32Si·Cr − 17.3Si·Mo + 18.6Si·Ni + 4.8Mn·Ni + 40.5Mo·V + 174C^2 + 2.46Mn^2 − 6.86Si^2 + 0.322Cr^2 + 9.9Mo^2 + 1.24Ni^2 − 60.2V^2 |
| Trzaska and Dobrzański (2007) | Ac3(°C) = 973 − 224.5C^(1/2) − 17Mn + 34Si − 14Ni + 21.6Mo + 41.8V − 20Cu |

Ms temperature

| Proposed by | Equation |
|---|---|
| Payson and Savage (1944) | Ms(°C) = 489.9 − 316.7C − 33.3Mn − 27.8Cr − 16.7Ni − 11.1(Si + Mo + W) |
| Grange and Stewart (1946) | Ms(°C) = 537.8 − 361.1C − 38.9(Mn + Cr) − 19.4Ni − 27.8Mo |
| Andrews (linear, 1965) | Ms(°C) = 539 − 423C − 30.4Mn − 17.7Ni − 12.1Cr − 7.5Mo |
| Andrews (nonlinear, 1965) | Ms(°C) = 512 − 453C − 16.9Ni + 15Cr − 9.5Mo + 217C^2 − 71.5C·Mn − 67.6C·Cr |
| Wang et al. (2000) | Ms(°C) = 545 − 470.4C − 3.96Si − 37.7Mn − 21.5Cr + 38.9Mo |

Akaike12) proposed the AIC based on information theory. It estimates the relative quality of a given model by measuring the Kullback–Leibler distance, and thereby provides an estimate of the information lost when a model is used to express a data-generating process. In this way, it evaluates both the goodness of fit and the complexity of a model. The AIC is defined by Eq. (4), which evaluates the model fit with the term −2 ln L̂ and penalizes the model complexity with the term 2k, where k is the number of parameters.

$$
\mathrm{AIC} = -2 \ln \hat{L} + 2k \tag{4}
$$

The BIC is a criterion for model selection among a finite set of models and was developed by Schwarz.13) The equation for the probability of a model for given data was derived by the Laplace approximation. It balances the increase in the likelihood with the term calculated from the number of parameters (k) and the number of data (N).

$$
\mathrm{BIC} = -2 \ln \hat{L} + k \ln(N) \tag{5}
$$
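Eqs. (4) and (5) can be evaluated directly from a least-squares fit. The sketch below is an illustration under stated assumptions, not the authors' implementation: the helper name `aic_bic` and the synthetic data are hypothetical, the noise variance is replaced by its maximum likelihood estimate, and k counts only the regression coefficients.

```python
import numpy as np


def aic_bic(X, y):
    """Return (AIC, BIC) of the linear model y = X @ theta + noise.

    Assumptions: Gaussian noise, noise variance set to RSS / N, and
    k equal to the number of columns of X.
    """
    N, k = X.shape
    theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ theta_hat) ** 2))
    lnL_hat = -0.5 * N * (np.log(2.0 * np.pi * rss / N) + 1.0)
    aic = -2.0 * lnL_hat + 2.0 * k          # Eq. (4)
    bic = -2.0 * lnL_hat + k * np.log(N)    # Eq. (5)
    return aic, bic


# Hypothetical data: only the first two of five inputs affect y.
rng = np.random.default_rng(1)
X_full = rng.uniform(0.0, 1.0, size=(100, 5))
y = X_full[:, :2] @ np.array([5.0, 10.0]) + rng.normal(0.0, 1.0, size=100)

print("two-input model :", aic_bic(X_full[:, :2], y))
print("five-input model:", aic_bic(X_full, y))
```

Because ln(N) exceeds 2 once N is larger than about 7, the BIC penalizes each additional parameter more heavily than the AIC for any dataset of practical size.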

Next, Akaike’s Bayesian information criterion (ABIC) proposed by Akaike14) is considered. With the assumption that the noise in data follows a Gaussian distribution and the prior follows a uniform distribution, the probability of a model for given data can be expressed as Eq. (6)

$$
\mathrm{ABIC} = \frac{1}{2 s^2} y^T y - \frac{1}{2 s^2} \mu^T \Lambda^{-1} \mu + \frac{N - k}{2} \log\left( 2\pi s^2 \right) - \frac{1}{2} \log \left| \Lambda \right| \tag{6}
$$

where

$$
\Lambda = \left( x x^T \right)^{-1} \tag{7}
$$

$$
\mu = \left( x x^T \right)^{-1} x y. \tag{8}
$$

The first three terms estimate how well a model is fitted to the data and the last term is a penalty for the model complexity.

Finally, we consider cross-validation (CV), which is a technique for assessing how well the results of an analysis will generalize to an independent dataset.15) The calculation procedure is illustrated in Fig. 2. In the calculation, a part of the data is used for analysis (training dataset) and the remainder for validation (validation dataset). The goal of cross-validation is to avoid overfitting by verifying a model with the validation dataset. In particular, in leave-one-out cross-validation (LOOCV), only one datum is used as the validation set and the remaining data are used as the training set. For all four criteria, a model with a lower AIC, BIC, ABIC, or CV score is preferred.
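The leave-one-out procedure described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function name `loocv_mse`, the use of the mean squared validation error as the CV score, and the synthetic data are assumptions.

```python
import numpy as np


def loocv_mse(X, y):
    """Leave-one-out cross-validation score of a linear model.

    Each datum is held out in turn, the model is fitted by least squares
    to the remaining N - 1 data, and the squared prediction error on the
    held-out datum is recorded. The mean of these errors is returned.
    """
    N = len(y)
    errors = np.empty(N)
    for i in range(N):
        train = np.arange(N) != i            # training mask without datum i
        theta, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
        errors[i] = y[i] - X[i] @ theta      # validation error on datum i
    return float(np.mean(errors ** 2))


# Hypothetical data with three inputs.
rng = np.random.default_rng(2)
X = rng.uniform(0.0, 1.0, size=(40, 3))
y = X @ np.array([3.0, 0.0, 7.0]) + rng.normal(0.0, 1.0, size=40)
cv_score = loocv_mse(X, y)
print(cv_score)
```

For linear least squares, the same score can also be obtained from a single fit by using the diagonal of the hat matrix; the explicit loop is kept here because it mirrors the procedure in the text.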

3. Validation of Each Criterion Using Toy Model

3.1. Setting of Toy Models

First, to demonstrate the effectiveness of each criterion in the feature and model selection problem, the following simple linear combination model is considered:

$$
\text{Model \#1:} \quad y = f(x) + w, \qquad f(x) = \sum_{i=1}^{10} \theta_i x_i, \qquad w \sim N\left( 0, \sigma^2 \right) \tag{9}
$$

where y, x_i, f(x), w, and θ_i are the observation, input data, underlying relationship, noise, and parameters, respectively. The input data are distributed uniformly between 0 and 1, and the noise follows a zero-mean normal distribution with standard deviation σ. An overview of the toy model is given in Table 2. Only five of the 10 inputs are designed to actually affect the output. The standard deviation σ is varied from 1 to 10, and between 50 and 800 combinations of input data and output observations are randomly generated to observe the trend of the model selection result with an increasing number of data. The parameters for two cross terms of the input data (a9 for x1·x7 and a10 for x2·x8) are also considered. The whole procedure is repeated 100 times to evaluate the performance of each criterion.
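A dataset of the kind described for Model #1 could be generated as in the following sketch. The parameter values and zero-data percentages follow Table 2, but the code itself is an assumption: the names `THETA`, `ZERO_FRACTION`, and `make_toy_data` are hypothetical, and zero entries are drawn independently per datum, so the realized zero fractions only approximate the nominal ones.

```python
import numpy as np

# Parameter values a1..a10 from Table 2 (a9 and a10 belong to cross terms).
THETA = np.array([3, 0, 7, 0, 11, 0, 15, 0, 19, 0], dtype=float)
# Nominal fraction of zero data for the eight base inputs x1..x8 (Table 2).
ZERO_FRACTION = np.array([0.1, 0.1, 0.3, 0.3, 0.5, 0.5, 0.7, 0.7])


def make_toy_data(n, sigma, rng):
    """Return (X, y) with n data for Model #1 at noise level sigma."""
    base = rng.uniform(0.0, 1.0, size=(n, 8))
    # Randomly replace entries by zero, column by column (assumption).
    base[rng.uniform(size=base.shape) < ZERO_FRACTION] = 0.0
    # Cross terms x1*x7 and x2*x8 become the ninth and tenth inputs.
    cross = np.column_stack([base[:, 0] * base[:, 6],
                             base[:, 1] * base[:, 7]])
    X = np.hstack([base, cross])
    y = X @ THETA + rng.normal(0.0, sigma, size=n)
    return X, y


rng = np.random.default_rng(3)
X, y = make_toy_data(n=400, sigma=1.0, rng=rng)
print(X.shape, (X == 0).mean(axis=0).round(2))
```

With these settings a cross term is zero whenever either factor is zero, which gives an expected zero fraction of 1 − 0.9 × 0.3 = 0.73, in line with the 70–80% range listed in Table 2.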

Additionally, zero data are inserted into the input data with different ratios to investigate the effect of blank data on the model selection procedure. The zero data are randomly inserted into the input data, and their fraction for each parameter is increased from 33% to 99%. The following simple linear combination model is considered:

$$
\text{Model \#2:} \quad y = f(x) + w, \qquad f(x) = \sum_{i=1}^{20} \theta_i x_i, \qquad w \sim N\left( 0, \sigma^2 \right) \tag{10}
$$

Fig. 1. Schematic plot of (a) the input, the observations, and models with different complexity, and (b) the training and prediction errors as the model complexity increases.

Fig. 2. Schematic representation of the cross-validation method.

Table 2. An overview of the values of the parameters of Model #1 and the percentage of zero data in the corresponding inputs.

| Parameter | a1 | a2 | a3 | a4 | a5 | a6 | a7 | a8 | a9 | a10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Value | 3 | 0 | 7 | 0 | 11 | 0 | 15 | 0 | 19 | 0 |
| % of zero data | 10 | 10 | 30 | 30 | 50 | 50 | 70 | 70 | 70–80 | 70–80 |

An overview of the parameters and their fractions of zero data is given in Table 3.

3.2. Simulation Result for Toy Model

Figure 3 illustrates the frequency of selecting the true relationship for each criterion with Model #1 under different numbers of data and noise levels. In this analysis, if θ1, θ3, θ5, θ7, and θ9 are all selected and θ2, θ4, θ6, θ8, and θ10 are all excluded, the result is counted as successful. For example, Fig. 3(a) plots the model selection result for the AIC with different numbers of data (N). When the noise level is 1, the AIC succeeds 39 times out of 100 trials with 50 data and 54 times with 800 data. All four criteria share the trend that the frequency of selecting the true model decreases as the noise level increases. Also, when there are more data, the true model is more likely to be found. It is natural that the true underlying relationship becomes more difficult to find when there is more noise and less data.
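One way to carry out such a selection is to score every subset of candidate inputs and keep the subset with the lowest criterion value. The sketch below does this with the BIC on a small hypothetical problem. The exhaustive search, the names `bic` and `best_subset`, and the data are assumptions made for illustration; the text does not state how the candidate models were enumerated.

```python
import itertools

import numpy as np


def bic(X, y):
    """BIC of y = X @ theta + Gaussian noise (variance set to RSS / N)."""
    N, k = X.shape
    if k == 0:
        rss = float(y @ y)                   # empty model predicts zero
    else:
        theta, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = float(np.sum((y - X @ theta) ** 2))
    # -2 ln L_hat = N ln(2 pi RSS / N) + N, plus the penalty k ln N.
    return N * np.log(2.0 * np.pi * rss / N) + N + k * np.log(N)


def best_subset(X, y):
    """Return (score, subset) minimizing the BIC over all input subsets."""
    p = X.shape[1]
    best = (np.inf, ())
    for k in range(p + 1):
        for subset in itertools.combinations(range(p), k):
            score = bic(X[:, list(subset)], y)
            if score < best[0]:
                best = (score, subset)
    return best


# Hypothetical data: inputs 0, 2, and 4 are the true working parameters.
rng = np.random.default_rng(4)
X = rng.uniform(0.0, 1.0, size=(150, 6))
y = X @ np.array([3.0, 0.0, 7.0, 0.0, 11.0, 0.0]) + rng.normal(0.0, 0.1, size=150)
score, subset = best_subset(X, y)
print(score, subset)
```

A trial would be counted as successful in the sense of Fig. 3 when the selected subset coincides exactly with the set of true working parameters. The search visits 2^p subsets, which is feasible for the 10 inputs of Model #1 but grows quickly with the number of candidate terms.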

Among the four criteria, the BIC shows the best performance in selecting the true model: with the largest dataset, it finds the true model with a probability of 70% over the entire range of noise levels. With small datasets, it still chooses the true model in 70% of cases when there is little noise.

The frequency of finding the true model was lowest with the AIC and CV and was less than 60% even with a large dataset. With a small dataset, the frequency of finding the true model was less than 50%. In addition to their poor performance, the two criteria show similar trends. Stone19) proved that the AIC and CV are equivalent for linear model selection. Therefore, both criteria can be treated as the same method in this linear case.

Furthermore, the frequency with which each feature is chosen as a working parameter was also evaluated to investigate the model selection performance from the viewpoint of feature selection with Model #1, and the result is plotted in Fig. 4. This figure shows how frequently each parameter is judged to be a necessary feature by the four methods for various numbers of data and noise levels. The result shows that parameters with lower values and larger percentages of zero data (see Table 2) tend to be excluded as the noise level increases. This suggests that such parameters are considered unnecessary because they are buried in noise. Among the four criteria, the BIC recorded the highest ratio of finding the true key parameters (a1, a3, a5, a7 and a9) and excluding the unnecessary ones. The ABIC is not an effective method with small datasets, but it showed improved performance with a large number of data. The AIC and CV produced almost the same results. From the viewpoints of both true model selection and feature selection, the BIC showed the best performance.

Table 3. An overview of the values of the parameters of Model #2 and the percentage of zero data in the corresponding inputs.

| Parameter | a1 | a2 | a3 | a4 | a5 | a6 | a7 | a8 | a9 | a10 | a11 | a12 | a13 | a14 | a15 | a16 | a17 | a18 | a19 | a20 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Value | 0 | 0 | 0 | 0 | 0 | 5 | 5 | 5 | 5 | 5 | 10 | 10 | 10 | 10 | 10 | 20 | 20 | 20 | 20 | 20 |
| % of zero data | 33 | 66 | 77 | 88 | 99 | 33 | 66 | 77 | 88 | 99 | 33 | 66 | 77 | 88 | 99 | 33 | 66 | 77 | 88 | 99 |

Fig. 3. The frequency of choosing the true model by each criterion. Panels (a), (b), (c) and (d) show the results for the AIC, BIC, ABIC and CV, respectively.

Fig. 4. The result of parameter selection by each criterion. The y axis indicates the number of times that each parameter is chosen as a key parameter by each model selection criterion for different noise levels (the x axis) and numbers of data (n) when the calculation is repeated 100 times.

In addition, Fig. 5 shows the effect of zero data on the model selection result using the BIC with Model #2. Each chart indicates the fraction of success for various fractions of zero data, noise levels, and parameter values. When there are no zero data and the number of data is 100, the parameter with a value of 20 is chosen in 100% of trials over the whole noise range. As the value of the parameter decreases, the fraction of success decreases with increasing noise. The trend of the result changes only slightly until the fraction of zero data reaches 88%. However, with so little information, the key parameters were barely chosen when the standard deviation of the noise was higher than 5. This result suggests that zero data have no considerable effect on the model selection as long as the quality of the input data is good and the noise level is sufficiently low.

4. Derivation of Working Parameters of Ac3 and Ms Temperatures of Steels Using Existing Datasets

4.1. Datasets Used to Derive Equations for Ac3 and Ms Temperatures

Two sets of data are used to find working parameters for the Ac3 and Ms temperatures. The first dataset is taken from the various sources in Andrews’ paper,1) which include high carbon contents and low contents of alloying elements. The second dataset is provided by the NIMS Materials Database (MatNavi),18) which covers lower carbon contents and higher contents of alloying elements than the sources used by Andrews. An overview of the data and the distribution of important elements in both databases are listed in Table 4 and plotted in Fig. 6, respectively.

In this study, the data used by Andrews are first analyzed and the results are compared with the equations obtained by Andrews. Then, both sets of data are employed to derive new equations.

Fig. 5. The model selection result with different ratios of zero data. The results do not differ greatly until the percentage of zero data reaches 88%.

Table 4. The chemical composition ranges of the data used to derive predictive equations for the Ac3 and Ms temperatures. Each entry gives the minimum/maximum content of the alloying element (wt%); a dash indicates that the element is not reported.

| Temperature | Data source | No. of data | C | Si | Mn | Ni | Cr | Cu | Mo | V | Ti | Nb | B | P | S | Al | N | W | As |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ac3 | Andrews | 155 | 0.11/0.95 | 0.06/1.78 | 0.04/1.98 | 0/5.00 | 0/4.48 | 0/0.91 | 0/1.02 | 0/0.7 | 0/0.05 | – | – | 0/0.06 | 0/0.06 | 0/0.10 | – | 0/4.1 | 0/0.07 |
| Ac3 | NIMS | 198 | 0.03/0.4 | 0.01/1.76 | 0/1.98 | 0/9.33 | 0/9.04 | 0/0.92 | 0/1.32 | 0/0.56 | 0/0.03 | 0/0.18 | 0/0.02 | 0/0.19 | 0/0.14 | 0/0.1 | 0/0.01 | – | – |
| Ms | Andrews | 243 | 0.11/0.58 | 0.11/1.89 | 0.04/4.87 | 0/5.04 | 0/4.61 | 0/0.91 | 0/5.4 | 0/0.7 | – | – | – | 0/0.05 | 0/0.04 | – | – | 0/8.88 | 0/0.07 |
| Ms | NIMS | 258 | 0.02/0.94 | 0/1.76 | 0/2.05 | 0/9.11 | 0/9.04 | 0/1.1 | 0/1.66 | 0/0.56 | 0/0.1 | 0/0.18 | 0/0.02 | 0/0.19 | 0/0.04 | 0/0.1 | 0/0.01 | – | – |

4.2. Model Selection Using Andrews’ Dataset

First, the four criteria are applied to the datasets used by Andrews. In this way, the determination of the key parameters by the researcher and by the data-driven approach can be compared. The following linear combination models including lower and higher-order terms of carbon are considered:

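$$
\begin{aligned}
\mathrm{Ac_3}\,(^{\circ}\mathrm{C}) = 910 &+ a_1 \mathrm{C}^{1/3} + a_2 \mathrm{C}^{1/2} + a_3 \mathrm{C} + a_4 \mathrm{C}^{2} \\
&+ a_5 \mathrm{Si} + a_6 \mathrm{Mn} + a_7 \mathrm{Ni} + a_8 \mathrm{Cr} + a_9 \mathrm{Cu} \\
&+ a_{10} \mathrm{Mo} + a_{11} \mathrm{V} + a_{12} \mathrm{Ti} + a_{13} \mathrm{P} + a_{14} \mathrm{S} \\
&+ a_{15} \mathrm{Al} + a_{16} \mathrm{W} + a_{17} \mathrm{As}
\end{aligned} \tag{11}
$$

$$
\begin{aligned}
\mathrm{Ms}\,(^{\circ}\mathrm{C}) = 539 &+ a_1 \mathrm{C}^{1/3} + a_2 \mathrm{C}^{1/2} + a_3 \mathrm{C} + a_4 \mathrm{C}^{2} \\
&+ a_5 \mathrm{Si} + a_6 \mathrm{Mn} + a_7 \mathrm{Ni} + a_8 \mathrm{Cr} + a_9 \mathrm{Cu} \\
&+ a_{10} \mathrm{Mo} + a_{11} \mathrm{V} + a_{12} \mathrm{Ti} + a_{13} \mathrm{P} + a_{14} \mathrm{S} \\
&+ a_{15} \mathrm{Al} + a_{16} \mathrm{W} + a_{17} \mathrm{As}.
\end{aligned} \tag{12}
$$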

The symbol of each element in Eqs. (11) and (12) represents its content in the steel in wt%. In many equations for Ac3, the carbon term is expressed as either a linear term or a square-root term. For example, Hougardy2) derived his equation using a linear form. On the other hand, Andrews1) assumed that the Ac3 temperature is proportional to the square root of the carbon content. For this reason, the basic model for the Ac3 and Ms temperatures in the present paper includes all the carbon terms found in the literature.
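The candidate terms of Eq. (11) can be assembled into a design matrix as in the following sketch, which then evaluates the Andrews Ac3 equation from Table 1 as one particular choice of coefficients. The names `ELEMENTS`, `design_matrix`, and `ANDREWS_AC3`, as well as the three steel compositions, are hypothetical and serve only as an illustration.

```python
import numpy as np

# Linear terms of Eq. (11) after the four carbon terms, in the same order.
ELEMENTS = ["Si", "Mn", "Ni", "Cr", "Cu", "Mo", "V", "Ti",
            "P", "S", "Al", "W", "As"]


def design_matrix(comp):
    """Build the 17 candidate columns of Eq. (11).

    comp maps an element symbol to an array of contents in wt%.
    Elements missing from comp are treated as zero (assumption).
    """
    C = np.asarray(comp["C"], dtype=float)
    cols = [np.cbrt(C), np.sqrt(C), C, C ** 2]
    for el in ELEMENTS:
        cols.append(np.asarray(comp.get(el, np.zeros_like(C)), dtype=float))
    return np.column_stack(cols)


# Coefficients a1..a17 corresponding to the Andrews Ac3 equation in Table 1.
ANDREWS_AC3 = np.array([0, -203, 0, 0, 44.7, 0, -15.2, 0, 0,
                        31.5, 104.4, 0, 0, 0, 0, 13.1, 0])

# Hypothetical compositions of three steels (wt%).
comp = {"C": [0.40, 0.20, 0.60],
        "Si": [0.25, 0.30, 0.20],
        "Mn": [0.80, 1.20, 0.60],
        "Ni": [0.00, 1.50, 0.00]}
A = design_matrix(comp)
ac3 = 910.0 + A @ ANDREWS_AC3   # the constant is fixed, as in Eq. (11)
print(A.shape, ac3)
```

Since the constant term is fixed at 910 for Ac3 (and 539 for Ms), a regression on this design matrix would be fitted to the measured temperature minus the constant rather than with a free intercept.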

4.3. Model Selection Using Both Andrews’ and NIMS Datasets

Next, we applied each criterion to the combined dataset of Andrews and NIMS. In this analysis, the linear combination models also include cross terms between carbon and strong carbide-forming elements such as chromium, vanadium, titanium, manganese, and molybdenum:

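$$
\begin{aligned}
\mathrm{Ac_3}\,(^{\circ}\mathrm{C}) = 910 &+ a_1 \mathrm{C}^{1/3} + a_2 \mathrm{C}^{1/2} + a_3 \mathrm{C} + a_4 \mathrm{C}^{2} \\
&+ a_5 \mathrm{Si} + a_6 \mathrm{Mn} + a_7 \mathrm{Ni} + a_8 \mathrm{Cr} + a_9 \mathrm{Cu} \\
&+ a_{10} \mathrm{Mo} + a_{11} \mathrm{V} + a_{12} \mathrm{Ti} + a_{13} \mathrm{P} + a_{14} \mathrm{S} \\
&+ a_{15} \mathrm{Al} + a_{16} \mathrm{W} + a_{17} \mathrm{As} + a_{18} \mathrm{C}\,\mathrm{Cr} \\
&+ a_{19} \mathrm{C}\,\mathrm{Mo} + a_{20} \mathrm{C}\,\mathrm{Ti} + a_{21} \mathrm{C}\,\mathrm{Mn} + a_{22} \mathrm{C}\,\mathrm{V}
\end{aligned} \tag{13}
$$

$$
\begin{aligned}
\mathrm{Ms}\,(^{\circ}\mathrm{C}) = 539 &+ a_1 \mathrm{C}^{1/3} + a_2 \mathrm{C}^{1/2} + a_3 \mathrm{C} + a_4 \mathrm{C}^{2} \\
&+ a_5 \mathrm{Si} + a_6 \mathrm{Mn} + a_7 \mathrm{Ni} + a_8 \mathrm{Cr} + a_9 \mathrm{Cu} \\
&+ a_{10} \mathrm{Mo} + a_{11} \mathrm{V} + a_{12} \mathrm{Ti} + a_{13} \mathrm{P} + a_{14} \mathrm{S} \\
&+ a_{15} \mathrm{Al} + a_{16} \mathrm{W} + a_{17} \mathrm{As} + a_{18} \mathrm{C}\,\mathrm{Cr} \\
&+ a_{19} \mathrm{C}\,\mathrm{Mo} + a_{20} \mathrm{C}\,\mathrm{Ti} + a_{21} \mathrm{C}\,\mathrm{Mn} + a_{22} \mathrm{C}\,\mathrm{V}.
\end{aligned} \tag{14}
$$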

4.4. Model Selection Result with the Andrews’ Dataset

4.4.1. Ac3 Temperature

The same procedure as that used for the toy model was conducted. The model selection results are compared with the equations derived by Andrews in Table 5. The results for each criterion are similar to the equation defined by Andrews. However, Andrews did not include terms for titanium, sulfur, aluminum, and arsenic. This is because the amounts of these elements were less than 0.1 wt% in the steels used in his study, which made it difficult to discover their effects on the Ac3 and Ms temperatures. The result for the ABIC differs from the others in that it includes many parameters that are determined to be unnecessary by the other methods. Considering that the number of samples is only 155, the ABIC is likely to overfit the data, as indicated above using the toy model.

4.4.2. Ms Temperature

It is clear from Table 5 that the terms for manganese, nickel, and chromium, which are considered important in Andrews’ equation, also appear in every equation obtained from the criteria. The results for the Ms temperature, however, are considerably different from those of Andrews in the selection of terms concerning the carbon content. Andrews’ equation and the equation obtained by CV include only the linear carbon term, as listed in Table 5. However, the other criteria select all the carbon terms, indicating that the carbon content does not affect the Ms temperature monotonically. In addition, molybdenum, which is included in Andrews’ equation, does not appear in the equations obtained by the AIC, BIC, and ABIC. This is because the actual effect of molybdenum on the Ms temperature is not greater than the noise level; therefore, its effect is buried in the noise in the data.

These results clearly indicate that the data-driven approach can efficiently find the working parameters without the experience and knowledge of experts such as Andrews, simply by collecting a large number of existing datasets. Note that Andrews selected working parameters in accordance with the knowledge obtained from his experiments, in which the content of each alloying element was systematically controlled.1)

Fig. 6. Chemical composition ranges of major alloying elements in the Andrews and NIMS datasets.

4.5. Model Selection Result Using Both Andrews and NIMS Datasets

4.5.1. Ac3 Temperature

The equations for the Ac3 temperature based on the chemical composition are estimated using the same procedure as that for the toy model. Each criterion selects different features, as shown in Table 6. Carbon, silicon, and nickel are included in all the Ac3 equations. According to the data-driven methods, molybdenum, which is a well-known carbide-forming element, turned out to be a working parameter appearing in a cross term rather than in a linear term as in Andrews’ equation. This can be explained from the austenite region of the steel. Figure 7(a) illustrates the region where only the austenite phase exists. Its lower left and lower right boundaries indicate the Ac3 line and the Acm line, respectively. The Ac3 line of the Fe–C–Mo system based on the equation derived by the BIC, and the Ac3 and Acm lines of the same system derived using the thermodynamic calculation software Thermo-Calc, are plotted in Fig. 7(b). The Ac3 line for 0 wt% molybdenum estimated using the equation derived by the BIC corresponds well with that derived from the thermodynamic calculation. It is clear from the thermodynamic calculation that, as the content of molybdenum increases, cementite becomes stable and the Acm line goes up. The regression equation is thus actually tracing the behavior of the Acm line instead of the Ac3 line as the molybdenum content increases. To verify this, the Ac3 and Acm temperatures were calculated from the chemical compositions of the 353 experimental Ac3 data with the Thermo-Calc software. The estimated Acm of 14 out of the 353 data were found to be higher than the estimated Ac3. In addition, the differences between the experimental Ac3 and the estimated Acm of these 14 data were smaller than those

Table 6. The equations for Ac3 and Ms temperatures derived from the combined data. The total 353 Ac3 temperature data (155 from Andrews and 198 from NIMS) and 501 Ms temperature (243 from Andrews and 258 from NIMS) are used for the model selection calculation.

ModelSelection Constant C1/3 C1/2 C C2 Si Mn Ni Cr Cu Mo V Ti P S Al W As CCr CMo CTi CMn CV

Ac3

AIC 910 1 282 −2 664 1 985 −1 003 29 −22 −16 0 0 20 0 732 182 0 −149 19 139 0 0 0 0 165

BIC 910 0 −161 0 0 27 −27 −16 0 0 0 0 1 078 0 0 0 25 0 0 108 0 0 0

ABIC 910 1 301 −2 629 1 850 −912 30 −28 −17 1 0 0 0 1 069 208 −127 −158 21 131 0 85 −982 27 110

CV 910 1 393 −2 826 2 020 −964 29 −24 −17 0 0 0 0 724 258 0 0 26 129 0 111 0 0 0

Ms

AIC 539 −985 1 722 −1 525 512 0 −32 −14 −5 17 0 −123 0 202 −446 −206 6 −363 −30 −18 0 0 352

BIC 539 −954 1 663 −1 513 535 0 −32 −14 −5 0 0 0 0 0 0 −211 6 −304 −26 0 0 0 0

ABIC 539 −993 1 738 −1 537 517 0 −33 −14 −5 17 0 −121 271 203 −445 −205 6 −363 −30 −18 −3 865 0 348

CV 539 −985 1 722 −1 525 512 0 −32 −14 −5 17 0 −123 0 202 −446 −206 6 −363 −30 −18 0 0 352

Table 5. The comparison of coefficients of the Andrews’ equations for Ac3 and Ms with the results derived by each cri-terion. The same data which Andrews collected were used for the model selection calculation.

Estimated by Constant C1/3 C1/2 C C2 Si Mn Ni Cr Cu Mo V Ti P S Al W As

Ac3

Andrews(1965) 910 0 −203 0 0 44.7 0 −15.2 0 0 31.5 104.4 0 0 0 0 13.1 0

AIC 910 0 −191 0 0 35 0 −15 0 0 29 98 673 0 −281 231 14 0

BIC 910 0 −198 0 0 33 0 −15 0 0 29 102 772 0 0 0 0 14

ABIC 910 123 −434 203 −112 35 0 −15 0 −12 29 97 700 64 −297 234 14 28

CV 910 0 −191 0 0 35 0 −15 0 0 29 98 673 0 −281 231 14 0

Ms

Andrews(1965) 539 0 0 −423 0 0 −30.4 −17.7 −12.1 0 38.9 0 0 0 0 0 0 0

AIC 539 −4 758 8 491 −6 014 2 252 0 −25 −14 −11 34 0 0 0 336 −438 0 5 −343

BIC 539 −4 925 8 832 −6 301 2 401 0 −26 −15 −10 0 0 0 0 0 0 0 0 −248

ABIC 539 −4 914 8 771 −6 195 2 316 0 −25 −14 −11 33 0 30 0 319 −438 0 0 −342

CV 539 0 0 −394 0 −9 −34 −18 −16 0 −9 19 0 0 0 0 4 −262

between the experimental and the estimated Ac3. This result suggests that some of the Ac3 temperatures in the data might be confused with the Acm temperatures, which is difficult to distinguish simply from the dilatometric analysis commonly used to determine Ac3 temperatures.
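The effect of the carbon–molybdenum cross term can be made concrete by evaluating the BIC equation for Ac3 directly. The following is a minimal sketch, not the authors' code: the coefficients are read from the BIC row for Ac3 in Table 6, and the function name `ac3_bic`, the use of wt% for compositions, and °C for the result are assumptions made for illustration.

```python
import math

def ac3_bic(c, si=0.0, mn=0.0, ni=0.0, ti=0.0, w=0.0, mo=0.0):
    """Ac3 estimate from the BIC row of Table 6.

    Compositions are assumed to be in wt% and the result in degrees Celsius.
    Only terms with non-zero BIC coefficients are included.
    """
    return (910.0
            - 161.0 * math.sqrt(c)   # C^(1/2) term
            + 27.0 * si
            - 27.0 * mn
            - 16.0 * ni
            + 1078.0 * ti
            + 25.0 * w
            + 108.0 * c * mo)        # C x Mo cross term

# At fixed carbon content, the cross term raises the estimate linearly in Mo.
for mo in (0.0, 0.5, 1.0):
    print(mo, ac3_bic(0.25, mo=mo))
```

Because molybdenum enters only through the product with carbon, the estimated line is unchanged at zero carbon and rises with molybdenum more steeply at higher carbon contents, which is the trend the text attributes to the Acm line.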

The derived equations are in good agreement with the experimental data, as shown in Fig. 8. The root mean square errors (RMSEs) of the previously reported equations (Figs. 8(a) and 8(b)) are as high as 39, while those of the newly derived equations (Figs. 8(c)–8(f)) are about 30.
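For reference, the RMSE used in this comparison can be computed as sketched below. The function name `rmse` and the example values are hypothetical and are not taken from the datasets.

```python
import math

def rmse(measured, predicted):
    """Root mean square error between measured and predicted temperatures."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical values, for illustration only.
print(rmse([800.0, 850.0], [830.0, 810.0]))
```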

4.5.2. Ms Temperature

The equations for the Ms temperature were derived in the same way as those for the Ac3 temperature. Both the newly derived equations and the earlier results of Andrews1) and of Payson and Savage5) indicate that carbon, manganese, nickel, and chromium are key elements for the Ms temperature. In addition, the cross term between carbon and chromium, which is included in one of Andrews’ equations, is also found in all four data-driven equations. Further research is required to clarify why these cross terms are chosen. Figure 9 shows that the Ms temperatures estimated by these equations agree better with the observed data than those from the previously reported equations.
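As an illustration of how the carbon–chromium cross term acts, the BIC equation for Ms can be evaluated in the same manner. This is a sketch under stated assumptions: the coefficients are read from the BIC row for Ms in Table 6, and the function name `ms_bic`, wt% compositions, and a result in °C are assumptions.

```python
def ms_bic(c, mn=0.0, ni=0.0, cr=0.0, al=0.0, w=0.0, as_=0.0):
    """Ms estimate from the BIC row of Table 6.

    Compositions are assumed to be in wt% and the result in degrees Celsius.
    `as_` is the arsenic content (trailing underscore avoids the Python keyword).
    """
    return (539.0
            - 954.0 * c ** (1.0 / 3.0)
            + 1663.0 * c ** 0.5
            - 1513.0 * c
            + 535.0 * c ** 2
            - 32.0 * mn
            - 14.0 * ni
            - 5.0 * cr
            - 211.0 * al
            + 6.0 * w
            - 304.0 * as_
            - 26.0 * c * cr)   # C x Cr cross term

# The Ms reduction per wt% Cr grows with carbon content because of the cross term.
for c in (0.1, 0.4, 0.8):
    print(c, ms_bic(c, cr=1.0) - ms_bic(c))
```

In this form, chromium lowers the estimate by 5 + 26·C per wt%, so its effect is stronger in higher-carbon steels.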


5. Conclusion

To demonstrate the effectiveness of the AIC, BIC, ABIC, and CV, these criteria were evaluated using a toy model. The results suggested that they could find meaningful parameters, which are not buried in noise in the given data. Among the four criteria, the BIC method showed the best performance for a linear combination model.

We applied the technique to the experimental dataset used by Andrews1) and to that provided by NIMS to derive predictive equations for the Ac3 and Ms temperatures. By applying the model selection criteria to these datasets, it was demonstrated that the key parameters, traditionally chosen by skilled researchers on the basis of systematically controlled experiments, could be found efficiently. In addition, simply by integrating data obtained under various randomly selected conditions, the derived equations not only show improved agreement with the experimental data but also clarify hidden mechanisms that are usually difficult to identify because of the high dimensionality of materials datasets. These results suggest that the model selection method can be applied successfully to other problems involving feature and working-parameter selection, such as the estimation of fatigue and creep lifetimes.
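The kind of BIC-based selection summarized above can be sketched for a linear combination model as follows. This is an illustrative example, not the authors' implementation: it assumes Gaussian noise, uses the common form BIC = n ln(RSS/n) + k ln n (up to an additive constant), and searches all feature subsets exhaustively. The toy data and all function names are hypothetical, and plain Python is used only to keep the example free of dependencies.

```python
import itertools
import math
import random

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def bic(X, y, subset):
    """BIC (up to a constant) of a least-squares fit using the chosen columns."""
    n, k = len(y), len(subset)
    rows = [[row[j] for j in subset] for row in X]
    # Normal equations: (R^T R) coef = R^T y
    gram = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    rhs = [sum(r[a] * t for r, t in zip(rows, y)) for a in range(k)]
    coef = solve_linear(gram, rhs)
    rss = sum((t - sum(c * v for c, v in zip(coef, r))) ** 2
              for r, t in zip(rows, y))
    return n * math.log(rss / n) + k * math.log(n)

def select_by_bic(X, y):
    """Search all non-empty feature subsets and return the one minimizing BIC."""
    p = len(X[0])
    subsets = (s for size in range(1, p + 1)
               for s in itertools.combinations(range(p), size))
    return min(subsets, key=lambda s: bic(X, y, s))

# Hypothetical toy data: y depends only on features 0 and 2, plus small noise.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(200)]
y = [3.0 * row[0] - 2.0 * row[2] + rng.gauss(0.0, 0.1) for row in X]
selected = select_by_bic(X, y)
print(selected)
```

The exhaustive search is feasible here because only 31 subsets exist for five candidate features. With the larger candidate sets in Tables 5 and 6, the number of subsets grows as 2^p, so the cost of each fit matters considerably more.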

REFERENCES

1) K. W. Andrews: J. Iron Steel Inst., 203 (1965), 721.
2) H. P. Hougardy: Werkstoffkunde Stahl, Band 1: Grundlagen, Springer/Verlag Stahleisen, Düsseldorf, (1984), 229.
3) O. G. Kasatkin, B. B. Vinokur and V. L. Pilyushenko: Met. Sci. Heat Treat., 26 (1984), 27.
4) J. Trzaska and L. A. Dobrzański: J. Mater. Process. Technol., 192 (2007), 504.
5) P. Payson and C. H. Savage: Trans. ASM, 33 (1944), 261.
6) R. A. Grange and H. M. Stewart: Trans. AIME, 167 (1946), 467.
7) J. Wang, P. J. van der Wolk and S. van der Zwaag: Mater. Trans. JIM, 41 (2000), 761.
8) W. G. Vermeulen, P. F. Morris, A. P. De Weijer and S. Van der Zwaag: Ironmaking Steelmaking, 23 (1996), 433.
9) T. Sourmail and C. Garcia-Mateo: Comput. Mater. Sci., 34 (2005), 323.
10) E. J. Seo, L. Cho and B. C. De Cooman: Metall. Mater. Trans. A, 46 (2015), 27.
11) C.-N. Li, F.-Q. Ji, G. Yuan, J. Kang, R. D. K. Misra and G.-D. Wang: Mater. Sci. Eng. A, 662 (2016), 100.
12) H. Akaike: IEEE Trans. Autom. Control, 19 (1974), 716.
13) G. Schwarz: Ann. Stat., 6 (1978), 461.
14) H. Akaike: Trab. Estad. Investig. Oper., 31 (1980), 143.
15) C. E. Rasmussen and C. Williams: Gaussian Processes for Machine Learning, The MIT Press, Cambridge, (2006), 105.
16) K. S. Al-Rubaie, E. K. L. Barroso and L. B. Godefroid: Int. J. Fatigue, 28 (2006), 934.
17) E. Cockayne and A. van de Walle: Phys. Rev. B, 81 (2010), 12104.
18) NIMS Materials Database (MatNavi), NIMS, http://mits.nims.go.jp/index_en.html, (accessed 2015-04-10).
19) M. Stone: J. R. Stat. Soc. Ser. B, 39 (1974), 44.

Fig. 7. (a) The austenite region of Fe–C steel calculated with Thermo-Calc and (b) the Ac3 and Acm lines of Fe–C–Mo steel calculated with Thermo-Calc and from the BIC result for the Andrews and NIMS data.

Fig. 8. Comparison of the experimental Ac3 temperatures with those estimated by past researchers (a and b) and by the model selection criteria (c, d, e and f).

Fig. 9. Comparison of the experimental Ms temperatures with those estimated by past researchers (a and b) and by the model selection criteria (c, d, e and f).