Daniel Baur / Numerical Methods for Chemical Engineers / RSM & PCA
Response Surface Method
Principal Component Analysis
$x_{\text{opt}} = \arg\max_{x} S(x)$
Daniel Baur
ETH Zurich, Institut für Chemie- und Bioingenieurwissenschaften
ETH Hönggerberg / HCI F128 – Zürich
E-Mail: [email protected]
http://www.morbidelli-group.ethz.ch/education/index
Definitions
The response surface method is a tool to
  - investigate the response of a variable to changes in a set of design or explanatory variables
  - find the optimal conditions for the response
Example: Consider a chemical process where the yield is an (unknown) function of temperature and pressure, and you want to maximize the yield:
$Y = Y(T, P)$
COVT Approach
COVT stands for «Change One Variable per Time».
This approach makes a fundamental assumption: the effect of changing one parameter is independent of the effects of changes in the others. This is usually not true!
Often, experimentation starts in a region far from the optimum.
Example: We do not know the response surface for Y(T,P), but we start investigating it by first changing T, then P.
COVT Approach (Example)
[Figure: contour curves for the yield (Y), levels 50 to 80, in the P-T plane. Marked are the starting point, the COVT design of experiments, the apparent optimum found by COVT ("Optimum ???") and the true optimum ("Optimum !!!").]
2^k Factorial Design
[Figure: contour curves for the yield (Y), levels 50 to 80, in the P-T plane, with the design of experiments at the coded levels -1 and +1 and the optimum marked.]
  P    T    Y
 -1   -1   40
 -1   +1   78
 +1   -1   59
 +1   +1   58
Initial investigation starts with a first order approximation of the response surface.
Example: Plastic Wrap
The strength of a plastic wrap (Y) is a function of the sealing temperature (T) and the percentage of polyethylene additive (P). A process engineer tries to make the wrap as strong as possible (maximize Y).
The response function (unknown to the engineer!) reads:
$Y = -20 + 0.85\,T + 1.5\,P - 0.0025\,T^2 - 0.375\,P^2 + 0.025\,T P$
Starting conditions: T = 140 °C, P = 4.0%
Optimal conditions (analytical): T = 216 °C, P = 9.2%
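The analytical optimum can be checked by setting both partial derivatives of Y to zero, which gives a 2 x 2 linear system. The following is a minimal sketch in base Matlab; the variable names (H, g, xOpt) are chosen here for illustration, and the signs of the coefficients are the ones consistent with the stated optimum.

```matlab
% Response function of the plastic wrap example (T in deg C, P in %)
Y = @(T, P) -20 + 0.85*T + 1.5*P - 0.0025*T.^2 - 0.375*P.^2 + 0.025*T.*P;

% dY/dT = 0.85 - 0.005*T + 0.025*P = 0
% dY/dP = 1.5  + 0.025*T - 0.750*P = 0
H = [-0.005,  0.025;
      0.025, -0.750];          % Hessian of Y (constant for a quadratic)
g = [0.85; 1.5];               % linear coefficients of T and P

xOpt = H \ (-g);               % stationary point [T; P]
isMax = all(eig(H) < 0);       % negative definite Hessian -> maximum

fprintf('T_opt = %.1f C, P_opt = %.1f %%\n', xOpt(1), xOpt(2));
fprintf('Y(start) = %.2f, Y(opt) = %.2f\n', Y(140, 4), Y(xOpt(1), xOpt(2)));
```

The engineer cannot do this in practice, since the function is unknown; the response surface method approximates it from experiments.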
True Response Surface
[Figure: contour plot of the true response surface over PE additive (%), 0 to 15, and temperature (°C), 100 to 300, with contour levels from 20 to 78. The starting point and the optimum are marked.]
2^k Factorial Design
   T     P    t (coded)   p (coded)
  120    2       -1          -1
  120    6       -1          +1
  160    2       +1          -1
  160    6       +1          +1
Coding: $t = \frac{T - 140}{20}$, $p = \frac{P - 4}{2}$
Initial regression model: $Y = b_0 + b_1 p + b_2 t$
[Figure: the four design points at the corners (±1, ±1) of the coded square.]
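As an illustration of the coding and the initial regression, the sketch below evaluates the plastic wrap function at the four design points and fits the first order model by least squares. It assumes noise-free responses (real experiments would scatter around these values) and uses only the backslash operator.

```matlab
Y = @(T, P) -20 + 0.85*T + 1.5*P - 0.0025*T.^2 - 0.375*P.^2 + 0.025*T.*P;

% 2^2 factorial design around the starting point (T = 140, P = 4)
T = [120; 120; 160; 160];
P = [  2;   6;   2;   6];

% Coded variables, each at the levels -1 and +1
t = (T - 140) / 20;
p = (P - 4) / 2;

% "Experimental" responses: noise-free values of the true function
y = Y(T, P);

% Least-squares fit of Y = b0 + b1*p + b2*t
X = [ones(4, 1), p, t];
b = X \ y;
disp(b.');   % b0, b1, b2
```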
2^2 Factorial Design
[Figure: 3D plot of the true response surface Y over the coded variables p and t (each from -1 to +1), with the contour curves of Y.]
First Order Regression
[Figure: 3D plot of the regressed response Y over the coded variables p and t.]
2^k Factorial Design with Center Point
   T     P    t (coded)   p (coded)
  120    2       -1          -1
  120    6       -1          +1
  160    2       +1          -1
  160    6       +1          +1
  140    4        0           0
Coding: $t = \frac{T - 140}{20}$, $p = \frac{P - 4}{2}$
Initial regression model: $Y = b_0 + b_1 p + b_2 t$
[Figure: the four corner points (±1, ±1) and the center point (0, 0) in the coded plane.]
The center point does not influence the regression of the slope.
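That the center point leaves the slopes untouched can be seen numerically. The sketch below fits the plane once with the corners only and once with the center point added; the responses are noise-free values of the plastic wrap function, and the helper fitPlane is a name introduced here for illustration.

```matlab
% Coded design: four corners plus one center point (last row)
p = [-1;  1; -1;  1; 0];
t = [-1; -1;  1;  1; 0];

% Noise-free responses of the plastic wrap function at these points
y = [53.5; 59.5; 61.5; 71.5; 64.0];

% Least-squares fit of Y = b0 + b1*p + b2*t
fitPlane = @(p, t, y) [ones(numel(y), 1), p, t] \ y;

bCorners = fitPlane(p(1:4), t(1:4), y(1:4));   % corners only
bAll     = fitPlane(p, t, y);                  % corners + center point

disp([bCorners, bAll]);   % only the first row (b0) differs
```

Because the coded columns p and t are zero at the center, the extra point only enters the estimate of the intercept b0.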
2^2 Factorial Design with Center Point
[Figure: 3D plot of the true response surface Y over the coded variables p and t, with the contour curves of Y and the experimental responses.]
First Order Regression
[Figure: 3D plot of the regressed response Y over the coded variables p and t.]
Curvature
The center point can give us an indication about the curvature of the surface and its statistical significance
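If there is no curvature and the linear model is appropriate in the region of interest, then the average response at the center point(s) and the average response at the corners are roughly equal (within the standard deviation):
$C = E(Y_{\text{center}}) - E(Y_{\text{corner}}), \qquad C_{\pm} = C \pm s_{\text{curv}}$
Here $s_{\text{curv}}$ is obtained from the Student t quantile with $n_{\text{center}} - 1$ degrees of freedom and the variance of the center point responses, $\operatorname{var}(Y_{\text{center}})$. The curvature is significant if the interval $[C_-, C_+]$ does not contain zero.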
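A curvature check along these lines could look as follows. This is only a sketch: the replicated center point measurements are made-up numbers, the standard error uses the usual formula for the difference of two means (an assumption, not necessarily the exact expression of the slide), and the t quantile is hard-coded because tinv requires the statistics toolbox.

```matlab
% Corner responses of the 2^2 design (noise-free values of the example)
yCorner = [53.5; 59.5; 61.5; 71.5];
% Hypothetical replicated center point measurements (made-up noise)
yCenter = [62.1; 65.8; 64.9; 62.5; 64.7];

nCenter = numel(yCenter);
nCorner = numel(yCorner);

% Curvature estimate: difference of the two averages
C = mean(yCenter) - mean(yCorner);

% Assumed standard error of the difference of two means, with the
% variance estimated from the center replicates only
se = sqrt(var(yCenter) * (1/nCenter + 1/nCorner));

tQuantile = 2.776;            % t(0.975) for nCenter - 1 = 4 degrees of freedom
sCurv = tQuantile * se;

CI = [C - sCurv, C + sCurv];  % [C-, C+]
curvatureSignificant = (CI(1) > 0) || (CI(2) < 0);
```

With these numbers the interval contains zero, so the curvature would not be considered significant and the first order model would be kept.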
Tukey-Anscombe Plot
[Figure: residuals (about -4 to 3) versus regressed Y (50 to 75).]
Steepest Ascent Direction
[Figure: contour lines of the regressed first order surface in the coded p-t plane, with the experimental points and the steepest ascent direction.]
Steepest Ascent Direction
[Figure: 3D plot of Y over the coded variables p and t, showing the steepest ascent direction.]
Monodimensional Search
[Figure: contour plot of the true response surface over P (0 to 15) and T (100 to 300), with the steepest ascent direction and the points of the monodimensional search.]
Monodimensional Search
[Figure: Y (64 to 80) versus step number (0 to 9), showing the experimental points and the true response along the steepest ascent direction.]
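The monodimensional search can be sketched as below. It assumes the slopes b1 = 4 and b2 = 5 that a noise-free 2^2 design around (T = 140, P = 4) yields for the plastic wrap function, and a step length of one coded unit; both are illustrative choices, not values taken from the slides.

```matlab
Y = @(T, P) -20 + 0.85*T + 1.5*P - 0.0025*T.^2 - 0.375*P.^2 + 0.025*T.*P;

% First order coefficients of Y = b0 + b1*p + b2*t (assumed, noise-free fit)
b1 = 4;
b2 = 5;
d = [b1; b2] / norm([b1; b2]);   % unit steepest ascent direction in (p, t)

steps = 0:9;
p = steps * d(1);                % coded additive level at each step
t = steps * d(2);                % coded temperature at each step

% Back-transform to physical units: T = 140 + 20*t, P = 4 + 2*p
Tsearch = 140 + 20*t;
Psearch = 4 + 2*p;
Ysearch = Y(Tsearch, Psearch);   % "experiments" along the direction

[Ymax, idx] = max(Ysearch);
fprintf('Best step: %d, T = %.1f C, P = %.2f %%, Y = %.2f\n', ...
        steps(idx), Tsearch(idx), Psearch(idx), Ymax);
```

The best point of this search becomes the center of the next factorial design.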
2^2 Factorial Design with Center Points
[Figure: new 2^k factorial design in the coded p-t plane, centered on the maximum from the monodimensional search; the (unknown) maximum of the response surface is marked.]
2^2 Factorial Design with Center Points
[Figure: 3D plot of the true response surface (Y from 70 to 80) over the coded variables p and t, with the experimental points.]
First Order Regression
[Figure: 3D plot of the regressed response (Y from 70 to 80) over the coded variables p and t.]
Central Composite Design
[Figure: design points in the coded p-t plane: the 2^k factorial design (corners at ±1) and the central composite design, which adds axial points at a distance r = 2^{1/2} from the center.]
At least three different levels are needed to estimate a second order function.
Central Composite Design
[Figure: 3D plot of the second order response surface (Y from 70 to 80) over the coded variables p and t.]
$Y = \beta_0 + \beta_1 p + \beta_2 t + \beta_3 p^2 + \beta_4 t^2 + \beta_5\, p t$
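A central composite design and the corresponding second order fit might be set up as in this sketch. The design center (T0 = 218, P0 = 10) and the half-ranges are hypothetical values chosen near the result of a monodimensional search, and the responses are noise-free values of the plastic wrap function.

```matlab
Y = @(T, P) -20 + 0.85*T + 1.5*P - 0.0025*T.^2 - 0.375*P.^2 + 0.025*T.*P;

% Hypothetical design center and half-ranges (not taken from the slides)
T0 = 218;  dT = 20;
P0 = 10;   dP = 2;

r = sqrt(2);
% Coded points: 4 corners, 4 axial points at distance r, 1 center point
p = [-1;  1; -1; 1; -r; r;  0; 0; 0];
t = [-1; -1;  1; 1;  0; 0; -r; r; 0];

% Noise-free "experimental" responses at the design points
y = Y(T0 + dT*t, P0 + dP*p);

% Y = beta0 + beta1*p + beta2*t + beta3*p^2 + beta4*t^2 + beta5*p*t
X = [ones(size(p)), p, t, p.^2, t.^2, p.*t];
beta = X \ y;
disp(beta.');
```

Each variable now takes five levels (-r, -1, 0, +1, +r), which is what makes the quadratic terms identifiable.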
Central Composite Design
[Figure: Tukey-Anscombe plot, residuals (-2.5 to 2.5) versus regressed Y (73 to 79).]
Response Surface Method Algorithm
1. Use a 2^k factorial design to generate linearization points around a starting point x^(0), where k is the number of variables
2. Fit a linear regression model
   $Y = b_0 + b_1 x_1 + \dots + b_k x_k$
3. Check if the curvature is large. If so, jump to point 7. If you think you are far from the maximum, you can try smaller steps.
4. Find the steepest ascent direction
   $d = \frac{1}{\sqrt{\sum_{i=1}^{k} b_i^2}} \, [b_1, b_2, \dots, b_k]^T$
Response Surface Method Algorithm (Continued)
5. Conduct experiments at points along the steepest ascent direction
   $x^{(j)} = x^{(0)} + j\,d, \quad j = 1, 2, \dots$
6. When a maximum in the response variable occurs at step j, set x^(0) = x^(j) and go back to point 1.
7. Perform a central composite design around the current point. Fit a second order linear regression.
8. Find the extremum of the regression surface by setting the gradient equal to zero and solving the resulting linear system
9. Check that the Hessian (the Jacobian of the gradient) is negative definite (all eigenvalues < 0) to ensure a maximum in the function
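Points 8 and 9 can be sketched for two coded variables p and t as follows. The coefficient values are hypothetical; they correspond to a noise-free second order fit of the plastic wrap function around an assumed design center T = 218, P = 10 with half-ranges 20 and 2.

```matlab
% Hypothetical second order coefficients in coded variables (p, t):
% Y = beta0 + beta1*p + beta2*t + beta3*p^2 + beta4*t^2 + beta5*p*t
beta = [78.49; -1.1; 0.2; -1.5; -1.0; 1.0];

g = beta(2:3);                       % gradient at p = t = 0
H = [2*beta(4),   beta(6);
       beta(6), 2*beta(5)];          % Hessian (constant for a quadratic)

% Point 8: gradient g + H*x = 0 is a linear system for x = [p; t]
xStat = -H \ g;

% Point 9: all eigenvalues negative -> the stationary point is a maximum
isMaximum = all(eig(H) < 0);

% Back-transform to physical units (assumed center and half-ranges)
Popt = 10 + 2 * xStat(1);
Topt = 218 + 20 * xStat(2);
fprintf('T = %.1f C, P = %.1f %%, maximum: %d\n', Topt, Popt, isMaximum);
```

For these coefficients the stationary point maps back to T = 216 and P = 9.2, the analytical optimum of the example.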
Principal Component Analysis (PCA)
Consider a large set of data (e.g., many spectra (n) of a chemical reaction as a function of the wavelength (p)).
Objective: data reduction. Find a smaller set of (k) derived (composite) variables that retain as much information as possible.
[Diagram: the n x p data matrix A is reduced to an n x k matrix X.]
PCA
PCA takes a data matrix of n objects by p variables, which may be correlated, and summarizes it by uncorrelated axes (principal components or principal axes) that are linear combinations of the original p variables.
  - New axes = new coordinate system
  - Construct the covariance matrix of the data (which need to be centered), and find its eigenvalues and eigenvectors
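The covariance route can be sketched in a few lines of base Matlab. The data matrix below is a small synthetic example (6 objects, 3 variables) made up for illustration; the first two columns are strongly correlated on purpose.

```matlab
% Synthetic data: n = 6 objects (rows), p = 3 variables (columns)
data = [2.0  4.1 1.0;
        3.0  6.2 0.8;
        4.0  7.9 1.1;
        5.0 10.1 0.9;
        6.0 12.0 1.2;
        7.0 13.8 1.0];
n = size(data, 1);

% Center each column, then build the covariance matrix
centered = data - repmat(mean(data, 1), n, 1);
C = (centered' * centered) / (n - 1);

% Eigen-decomposition, sorted so the largest variance comes first
[V, D] = eig(C);
[lambda, order] = sort(diag(D), 'descend');
V = V(:, order);                    % principal axes (loading vectors)

scores = centered * V;              % data in the new coordinate system
explained = lambda / sum(lambda);   % fraction of variance per axis
disp(explained.');
```

Here nearly all of the variance falls on the first axis, so k = 1 derived variable would already summarize most of the data.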
PCA in Matlab
There are two possibilities to perform PCA with Matlab:
1) Use singular value decomposition: [U,S,V] = svd(data); where data is the centered data matrix, U contains the scores (normalized; U*S gives them in the units of the data) and V the eigenvectors of the covariance matrix, or loading vectors. SVD does not require the statistics toolbox.
2) [COEFF,Scores] = princomp(data); is a specialized command to perform principal component analysis. It requires the statistics toolbox (newer Matlab releases provide pca for this purpose).
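The link between the SVD and the covariance matrix can be checked on a small example. The sketch below uses a synthetic data matrix (made up for illustration) and only base Matlab functions.

```matlab
% Synthetic data: n = 6 objects (rows), p = 3 variables (columns)
data = [2.0  4.1 1.0;
        3.0  6.2 0.8;
        4.0  7.9 1.1;
        5.0 10.1 0.9;
        6.0 12.0 1.2;
        7.0 13.8 1.0];
n = size(data, 1);
centered = data - repmat(mean(data, 1), n, 1);

% Economy-size SVD: centered = U * S * V'
[U, S, V] = svd(centered, 'econ');

scores = U * S;                      % same as centered * V
lambdaSvd = diag(S).^2 / (n - 1);    % variances along the principal axes

% Reference: eigenvalues of the covariance matrix
lambdaEig = sort(eig(cov(data)), 'descend');
disp([lambdaSvd, lambdaEig]);
```

The squared singular values divided by n - 1 equal the eigenvalues of the covariance matrix, which is why the columns of V can be used as loading vectors.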
Exercise 1
A chemical engineer tries to optimize a reaction by maximizing the yield. There are two variables which influence the yield: the reaction time and the reaction temperature. Currently, the reaction is carried out for 35 minutes at 155 °F, resulting in a yield of about 40%.
Three sets of experiments were conducted, given in the data files reactionYield-1 through 3. The datasets are structured identically: the first two columns are time and temperature, the third and fourth columns are the same variables in coded units (-1, +1, etc.), and the last column is the yield y.
Assignment 1
1. The first data set is near the current operating point.
   - Fit a first order (planar) surface to the data. What is the direction of the steepest ascent?
   - Plot the operating conditions, experimental design points and the direction you found in the parameter plane Time vs. Temperature.
2. The second data set contains more experiments in the direction found in part 1.
   - Plot the data (for example as Yield vs. Temperature) and find out where the yield reaches a maximum along this direction.
Assignment 1 (Continued)
3. The maximum in 2. is used for another first order design; this data is found in the third data set.
   - Show that the curvature of the response surface is significantly different from zero.
4. The data from 3. is now extended to a central composite design. Fit a second order (quadratic) response surface to the data and calculate the maximum analytically.
   - If you are using LinearModel, you can specify second order terms in the modelspec by using the * and ^ operators; for example, 'y ~ a*b' will incorporate a, b and a*b, and 'y ~ a^2' will use the quadratic term. So for two variables a and b, the modelspec string for a second order linear regression will read 'y ~ a^2 + a*b + b^2'
Assignment 2
The dataset d_react contains data of IR spectra measured during a chemical reaction (122 x 700). The first row contains the wavelength, all other rows the spectra.
1. Create a matrix centeredData, obtained by centering the data, i.e. subtracting the column mean from each column. What can you observe when looking at the centered spectra? What distinguishes the different observations (spectra) regarding the different variables (wavelengths)?
2. Perform singular value decomposition on the centered data. The U matrix of this decomposition contains the «scores» in terms of PCA. Use [U,S,V] = svd(centeredData);
Assignment 2 (Continued)
3. Plot the first 3 scores in a scatterplot matrix using the plotmatrix function.
4. Plot the first three loading vectors (columns of V) versus the wavelength. What can you observe? Compare with what you have seen in point 2.