Monitoring Wafer Geometric Quality using Additive Gaussian Process
Linmiao Zhang (1), Kaibo Wang (2), Nan Chen (1)
(1) Department of Industrial and Systems Engineering, National University of Singapore
(2) Department of Industrial Engineering, Tsinghua University
May 23, 2013
-
Outline
1 Introduction
2 Statistical Quantification using AGP Model
3 Statistical Monitoring of Geometric Quality
4 Case Studies
5 Conclusion and Future Directions
-
Introduction AGP Model Statistical Testing Case Studies Conclusion References
Motivation
Integrated Circuits
-
Motivation
Semiconductor Fabrication Process
Ingot → Slicing → Lapping → Polishing → Cleaning → Wafer → Inspection
Inspection: Reject → Disposal; Accept → Front End → Back End → Chips
-
Motivation
Challenges
Transistor size: 32 nm → 28 nm → 22 nm → 16 nm → 14 nm → ···
Wafer size: 130 mm → 150 mm → 200 mm → 300 mm → 450 mm
-
Motivation
Wafer Preparation Process
[Diagram labels: Wafer Quality (Good / Bad), Require, Cause, Larger Diameter, Higher Integration, IC Companies, Wafer Fabs]
-
Motivation
Wafer’s Geometric Quality
Contact method: touching probes;
Non-contact method: wavelength scanning interferometer;
[Figure: measurements on the wafer surface, axes x1 and x2 from −60 to 60, with a scale from Thinner to Thicker]
Engineers’ problem: how to check whether the surface is desirable?
-
Motivation
Testing Problem
-
Problem formulation
Framework
Surface as the Response Variable
Modeling
• Without covariate
• Regression with covariates
• Design optimization
Monitoring
• Change detection
• Design optimization
Process Control
• Run-to-Run control
• Fault diagnostics
-
Problem formulation
Difficulties
Complete measurement of the wafer is slow
Geometric profile is too complex to be modeled by parametric functions
Measurements on different surfaces might not be aligned well
Deviations (errors) are spatially correlated
-
Problem formulation
State of the art
One sample model: Gaussian process (Jin, Chang, and Shi 2012), PDE-constrained Gaussian process (Zhao, Jin, Wu, and Shi 2011)
Only applicable for a single surface
Primitive testing: summary indicators of the whole profile
Total Thickness Variation (TTV), Bow, Warp, Site TIR (Doering and Nishi 2007);
Need to fill in the gap
-
Review of GP
Gaussian Process
$Y(x) = \mu + Z(x)$ with positive-definite (PD) covariance function $k(x_i, x_j)$
Suitable for spatially correlated data (Cressie 1993);
Able to approximate complex function (Sacks et al. 1989);
Able to evaluate prediction error (Santner et al. 2003).
[Figure: left panel, prediction, sample points, and true function for x from 0 to 0.5; right panel, MSE of prediction over the same range]
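The three properties listed on this slide (interpolation of correlated data, function approximation, and a prediction error estimate) can be illustrated with a minimal sketch. The Gaussian correlation function, the parameter values, and the test function `sin(10x)` are assumptions made for illustration; the slides do not state which kernel or function was used.

```python
import numpy as np

def gauss_corr(a, b, theta):
    """Gaussian correlation exp(-theta * (a - b)^2); the kernel form is an assumption."""
    return np.exp(-theta * (a[:, None] - b[None, :]) ** 2)

def gp_predict(x_obs, y_obs, x_new, mu, sigma2, theta, jitter=1e-10):
    """Conditional mean and prediction MSE of Y(x_new) given noise-free data."""
    K = sigma2 * gauss_corr(x_obs, x_obs, theta) + jitter * np.eye(len(x_obs))
    k_new = sigma2 * gauss_corr(x_new, x_obs, theta)      # cross-covariance
    mean = mu + k_new @ np.linalg.solve(K, y_obs - mu)
    mse = sigma2 - np.sum(k_new * np.linalg.solve(K, k_new.T).T, axis=1)
    return mean, np.maximum(mse, 0.0)

x_obs = np.linspace(0.0, 0.5, 6)
y_obs = np.sin(10.0 * x_obs)                               # hypothetical true function
mean, mse = gp_predict(x_obs, y_obs, np.array([0.05, 0.25]),
                       mu=0.0, sigma2=1.0, theta=100.0)
```

In this sketch the MSE is essentially zero at sampled sites and grows between them, which is the qualitative behavior the MSE panel describes.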
-
Review of GP
Gaussian Process with Errors
Errors are present in physical processes or stochastic simulations: $Y(x) = \mu + Z(x) + \epsilon(x)$
$\epsilon(x_i)$ are i.i.d. normally distributed: covariance $\Sigma + \sigma^2 I$
$\epsilon(x_i)$ are independently and normally distributed, but $\mathrm{var}(\epsilon(x_i)) = \sigma^2(x_i)$: covariance $\Sigma + \Lambda$ (Ankenman et al. 2010)
$\epsilon(x_i)$ are correlated, then?
[Figure: predicted mean, samples, and standard profile for x from 0 to 0.5]
[Embedded page excerpt from a paper on cycle time quantile estimation: Fig. 5, G/G/1 quantile regression curves with empirical quantile estimates]
-
AGP Model
Data Characteristics
[Figure: profile value versus location, showing the standard profile $f(x)$ and two measured profiles $f(x) + \epsilon_1(x)$ and $f(x) + \epsilon_2(x)$, each with sampled points $(x_{ij}, y_{ij})$]
-
AGP Model
AGP Model
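$$Y_i(x) = f(x) + \epsilon_i(x)$$
$f(x)$: standard surface; $\epsilon_i(x)$: deviation surface
Assumption
$f(x)$ is a realization of $\mathrm{GP}(\mu, s(\cdot))$
$\epsilon_i(x)$ is a realization of $\mathrm{GP}(0, v(\cdot))$
$f(x)$ and $\epsilon_i(x)$ are independent
$\epsilon_i(x)$ and $\epsilon_j(x)$ are independent for $i \neq j$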
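A minimal simulation sketch of these assumptions: one standard surface is drawn once and shared, and each profile adds its own independent deviation surface. The Gaussian correlation function and the helper name `simulate_agp` are assumptions for illustration; the parameter values are borrowed from the simulation study later in the deck.

```python
import numpy as np

def gauss_corr(a, b, theta):
    """Gaussian correlation exp(-theta * (a - b)^2); the kernel form is an assumption."""
    return np.exp(-theta * (a[:, None] - b[None, :]) ** 2)

def simulate_agp(x, n_profiles, mu, sigma1_sq, theta1, sigma2_sq, theta2, rng, jitter=1e-8):
    """Draw one standard surface f and n_profiles deviation surfaces on shared sites x."""
    n = len(x)
    L_f = np.linalg.cholesky(sigma1_sq * gauss_corr(x, x, theta1) + jitter * np.eye(n))
    L_e = np.linalg.cholesky(sigma2_sq * gauss_corr(x, x, theta2) + jitter * np.eye(n))
    f = mu + L_f @ rng.standard_normal(n)                  # shared by every profile
    eps = (L_e @ rng.standard_normal((n, n_profiles))).T   # independent across profiles
    return f, f[None, :] + eps

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 15)
f, Y = simulate_agp(x, n_profiles=3, mu=1.0, sigma1_sq=0.2, theta1=3.0,
                    sigma2_sq=0.05, theta2=10.0, rng=rng)
```

Each row of `Y` is one profile $Y_i$; the rows differ only through their deviation surfaces.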
-
AGP Model
Distributional view
A Gaussian process can be used as a prior probability distribution over functions in Bayesian inference (Rasmussen and Williams 2006).
[Figure: two realizations (Realization 1, Realization 2) of the generated value versus x on [0, 1]]
Linear model: $Y(x) = f(x) + \epsilon$, with $\epsilon \overset{\text{i.i.d.}}{\sim} F_\epsilon$
AGP model: $Y(x) = f(x) + \epsilon(x)$, with $\epsilon(x) \overset{\text{i.i.d.}}{\sim} \mathrm{GP}(0, v(\cdot))$
-
AGP Model
Model Estimation
Estimate the model parameters $\beta \equiv [\mu, \sigma_1^2, \theta_1, \sigma_2^2, \theta_2]$ from observations
[Figure: profile value versus location, showing the standard profile $f(x)$ and two measured profiles $f(x) + \epsilon_1(x)$ and $f(x) + \epsilon_2(x)$, each with sampled points $(x_{ij}, y_{ij})$]
-
AGP Model
Structure of Σ0
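$$\mathrm{cov}(y_{ij}, y_{i'k}) = \begin{cases} s(x_{ij}, x_{i'k}) + v(x_{ij}, x_{i'k}), & i = i' \\ s(x_{ij}, x_{i'k}), & i \neq i' \end{cases} \qquad i, i' = 1, 2, \ldots, N_0$$
[Diagram: $\Sigma_0$ is the sum of a dense $M_0 \times M_0$ matrix with entries $s(x_{ij}, x_{i'k} \mid \theta_1)$ over all sites $X_{IC} = (X_1, X_2, \ldots, X_{N_0})$ and a block-diagonal matrix with blocks of size $n_1 \times n_1, n_2 \times n_2, \ldots, n_{N_0} \times n_{N_0}$ holding $v(x_{ij}, x_{i'k} \mid \theta_2)$, with zeros elsewhere]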
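The dense-plus-block-diagonal structure can be sketched by masking the deviation kernel with a same-profile indicator. This is an illustrative sketch only: the Gaussian correlation function and the helper name `build_sigma0` are assumptions, and the scaling $\sigma_1^2 S + \sigma_2^2 V$ follows the form used in the likelihood on the MLE slide.

```python
import numpy as np

def gauss_corr(a, b, theta):
    """Gaussian correlation exp(-theta * (a - b)^2); the kernel form is an assumption."""
    return np.exp(-theta * (a[:, None] - b[None, :]) ** 2)

def build_sigma0(x, profile_id, sigma1_sq, theta1, sigma2_sq, theta2):
    """Sigma_0 = sigma1^2 * S + sigma2^2 * V for stacked sites x with profile labels."""
    S = gauss_corr(x, x, theta1)                        # shared by all pairs of sites
    same_profile = profile_id[:, None] == profile_id[None, :]
    V = gauss_corr(x, x, theta2) * same_profile         # block-diagonal by profile
    return sigma1_sq * S + sigma2_sq * V, S, V

x = np.array([0.1, 0.4, 0.2, 0.9, 0.5])                 # stacked sites, M0 = 5
profile_id = np.array([0, 0, 1, 1, 1])                  # n_1 = 2, n_2 = 3
sigma0, S, V = build_sigma0(x, profile_id, 0.2, 3.0, 0.05, 10.0)
```

Sites on different profiles are correlated only through the shared standard surface, so the corresponding entries of `V` are exactly zero.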
-
AGP Model
MLE
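Given the data from all surface profiles $(X_{IC}, Y_{IC})$, we can estimate $\beta$ as
$$\hat\beta = \arg\max_\beta \left\{ -\frac{1}{2}\log[\det(\sigma_1^2 S + \sigma_2^2 V)] - \frac{1}{2}(Y_{IC} - \mu 1_{M_0})^T (\sigma_1^2 S + \sigma_2^2 V)^{-1} (Y_{IC} - \mu 1_{M_0}) \right\}.$$
Maximizing profile likelihood: given $\theta_1, \theta_2$, the correlation matrices $S, V$ are fixed. Then $\mu, \sigma_1^2, \sigma_2^2$ can be obtained easily.
$$\mu = \frac{1_{M_0}^T (S + \rho V)^{-1} Y_{IC}}{1_{M_0}^T (S + \rho V)^{-1} 1_{M_0}}, \qquad \rho = \sigma_2^2 / \sigma_1^2$$
$$\sigma_1^2 = \frac{(Y_{IC} - \mu 1_{M_0})^T (S + \rho V)^{-1} (Y_{IC} - \mu 1_{M_0})}{M_0}$$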
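The closed-form step can be sketched as follows, assuming $S$, $V$, and the ratio $\rho$ are held fixed. The function names are hypothetical. The second function evaluates the likelihood with $\mu$ and $\sigma_1^2$ profiled out, up to an additive constant; in practice it would be minimized numerically over $(\theta_1, \theta_2, \rho)$, a step not shown here.

```python
import numpy as np

def profile_estimates(Y, S, V, rho):
    """Closed-form mu and sigma1^2 for fixed correlation matrices S, V and ratio rho."""
    C = S + rho * V
    ones = np.ones(len(Y))
    mu = (ones @ np.linalg.solve(C, Y)) / (ones @ np.linalg.solve(C, ones))
    resid = Y - mu * ones
    sigma1_sq = resid @ np.linalg.solve(C, resid) / len(Y)
    return mu, sigma1_sq

def neg_profile_loglik(Y, S, V, rho):
    """Negative log-likelihood (up to a constant) with mu and sigma1^2 profiled out."""
    _, sigma1_sq = profile_estimates(Y, S, V, rho)
    _, logdet = np.linalg.slogdet(S + rho * V)
    return 0.5 * (len(Y) * np.log(sigma1_sq) + logdet)

# Toy check with identity correlation matrices (illustrative values only).
Y = np.array([1.0, 2.0, 3.0])
mu_hat, s1_hat = profile_estimates(Y, np.eye(3), np.eye(3), rho=1.0)
```

With identity matrices the estimate of $\mu$ reduces to the sample mean, which is a quick sanity check on the formula.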
-
AGP Model
Prediction
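For new unmeasured sites $(X_l, Y_l)$:
$$\begin{pmatrix} Y_l \\ Y_{IC} \end{pmatrix} \sim N\left[ \begin{pmatrix} \mu 1_{n_l} \\ \mu 1_{M_0} \end{pmatrix}, \begin{pmatrix} \Sigma_l & \Sigma_{l,0} \\ \Sigma_{l,0}^T & \Sigma_0 \end{pmatrix} \right]$$
$Y_l \mid Y_{IC} \sim N(\tilde\mu_l, \tilde\Sigma_l)$, where
$$\tilde\mu_l = \mu 1_{n_l} + \Sigma_{l,0} \Sigma_0^{-1} (Y_{IC} - \mu 1_{M_0})$$
$$\tilde\Sigma_l = \Sigma_l - \Sigma_{l,0} \Sigma_0^{-1} \Sigma_{l,0}^T$$
$\Sigma_{l,0}$ may have a different form depending on whether $Y_l$ are taken from existing profiles or new ones.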
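A sketch of the conditional formulas, together with the covariance blocks for the case where $X_l$ lies on a new profile. In that case only the standard surface is shared with the in-control data, so the cross-covariance contains the $s$ term alone. The Gaussian correlation function and the function names are assumptions for illustration.

```python
import numpy as np

def gauss_corr(a, b, theta):
    """Gaussian correlation exp(-theta * (a - b)^2); the kernel form is an assumption."""
    return np.exp(-theta * (a[:, None] - b[None, :]) ** 2)

def conditional_normal(mu, sigma_l, sigma_l0, sigma_0, y_ic):
    """Mean and covariance of Y_l given Y_IC under the joint normal model."""
    mean = mu + sigma_l0 @ np.linalg.solve(sigma_0, y_ic - mu)
    cov = sigma_l - sigma_l0 @ np.linalg.solve(sigma_0, sigma_l0.T)
    return mean, cov

def new_profile_blocks(x_l, x_ic, sigma1_sq, theta1, sigma2_sq, theta2):
    """Sigma_l and Sigma_{l,0} when x_l lies on a profile not in the IC data."""
    sigma_l = (sigma1_sq * gauss_corr(x_l, x_l, theta1)
               + sigma2_sq * gauss_corr(x_l, x_l, theta2))
    sigma_l0 = sigma1_sq * gauss_corr(x_l, x_ic, theta1)   # only f is shared
    return sigma_l, sigma_l0

# One in-control profile measured at three sites (illustrative values).
x_ic = np.array([0.0, 0.5, 1.0])
sigma_0 = 0.2 * gauss_corr(x_ic, x_ic, 3.0) + 0.05 * gauss_corr(x_ic, x_ic, 10.0)
sigma_l, sigma_l0 = new_profile_blocks(np.array([0.5]), x_ic, 0.2, 3.0, 0.05, 10.0)
mean, cov = conditional_normal(1.0, sigma_l, sigma_l0, sigma_0, np.array([1.1, 0.9, 1.2]))
```

For a new profile the predictive variance cannot fall below $\sigma_2^2$, because the new deviation surface is independent of all in-control data.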
-
AGP Model
Prediction Demonstration
[Figure: left panel, predicted mean with samples and standard profile for x from 0 to 0.5; right panel, predicted variance over the same range (values between about 0.0581 and 0.059)]
-
$T^2$ Test
Statistical Testing
[Figure: profile value versus location, with sampled points on a new profile]
Whether the new profile deviates from $f(x)$ within an acceptable region
Statistical testing based on the samples (where to sample?)
-
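$T^2$ Test
$T^2$ Test
If the new surface conforms with the model, $Y_l \sim N(\tilde\mu_l, \tilde\Sigma_l)$
Reducing surface comparison to multivariate normal data comparison
$$H_0: Y_l \sim N(\tilde\mu_l, \tilde\Sigma_l) \qquad H_1: Y_l \not\sim N(\tilde\mu_l, \tilde\Sigma_l).$$
Testing statistic:
$$T_l^2 = (Y_l - \tilde\mu_l)^T \tilde\Sigma_l^{-1} (Y_l - \tilde\mu_l),$$
Under $H_0$, $T_l^2 \sim \chi^2_{n_l}$. Reject $H_0$ when $T_l^2 > H_T$.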
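The statistic is a quadratic form and can be computed with a linear solve instead of an explicit inverse. The sketch below uses hypothetical function names and toy numbers; the control limit `h_t` is passed in by the caller and would typically be an upper quantile of $\chi^2_{n_l}$ (for example from `scipy.stats.chi2.ppf`).

```python
import numpy as np

def t2_statistic(y_l, mu_tilde, sigma_tilde):
    """T^2 = (y - mu)^T Sigma^{-1} (y - mu), computed via a linear solve."""
    r = y_l - mu_tilde
    return float(r @ np.linalg.solve(sigma_tilde, r))

def t2_reject(y_l, mu_tilde, sigma_tilde, h_t):
    """Reject H0 when the statistic exceeds the control limit h_t."""
    return t2_statistic(y_l, mu_tilde, sigma_tilde) > h_t

# Toy example with n_l = 2; 5.991 is the 0.95 quantile of chi-square(2).
y_l = np.array([1.0, 2.0])
mu_tilde = np.zeros(2)
sigma_tilde = np.diag([1.0, 4.0])
t2 = t2_statistic(y_l, mu_tilde, sigma_tilde)   # 1/1 + 4/4 = 2.0
```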
-
Generalized likelihood ratio test
GLR Test
Only focus on a certain class of alternative models
Another deviation source is considered as the alternative models
$$Y_l(x) = f(x) + \epsilon_l(x) + \xi(x)$$
$\xi(x)$ is a realization of another $\mathrm{GP}(\delta, w(\cdot))$.
Suitable to model the global change effects
Testing hypothesis
$$H_0: Y_l(x) = f(x) + \epsilon_l(x)$$
$$H_1: Y_l(x) = f(x) + \epsilon_l(x) + \xi(x)$$
-
Generalized likelihood ratio test
GLR Test
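With a finite number of observations
Testing hypothesis:
$$H_0: Y_l \sim N(\tilde\mu_l, \tilde\Sigma_l)$$
$$H_1: Y_l \sim N(\tilde\mu_l + \delta 1_{n_l}, \tilde\Sigma_l + \Sigma_w) \text{ for some nonzero } \delta, \gamma^2, \theta_l$$
GLR statistic:
$$R_l = 2 \ln \frac{\sup_{\delta, \gamma^2, \theta_l} \det(\tilde\Sigma_l + \Sigma_w)^{-1/2} \exp\left[ -(Y_l - \tilde\mu_l - \delta 1_{n_l})^T (\tilde\Sigma_l + \Sigma_w)^{-1} (Y_l - \tilde\mu_l - \delta 1_{n_l}) / 2 \right]}{\det(\tilde\Sigma_l)^{-1/2} \exp\left[ -(Y_l - \tilde\mu_l)^T \tilde\Sigma_l^{-1} (Y_l - \tilde\mu_l) / 2 \right]}$$
$R_l \sim$ equal mixture of $\chi^2_1$ and $\chi^2_2$ asymptotically under $H_0$. Reject $H_0$ when $R_l > H_R$.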
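If the asymptotic null distribution is read as a 50:50 mixture of $\chi^2_1$ and $\chi^2_2$ (an interpretation of the slide's wording, not something it spells out), a p-value for an observed $R_l$ has a simple closed form. The function name is hypothetical, and computing $R_l$ itself requires a numerical maximization that is not shown.

```python
import math

def glr_pvalue(r_l):
    """Tail probability of a 50:50 mixture of chi-square(1) and chi-square(2)."""
    if r_l <= 0.0:
        return 1.0
    sf_chi2_1 = math.erfc(math.sqrt(r_l / 2.0))   # P(chi2_1 > r)
    sf_chi2_2 = math.exp(-r_l / 2.0)              # P(chi2_2 > r)
    return 0.5 * sf_chi2_1 + 0.5 * sf_chi2_2

p = glr_pvalue(2.0)
```

Rejecting when this p-value falls below a chosen significance level is equivalent to comparing $R_l$ with the corresponding limit $H_R$.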
-
Generalized likelihood ratio test
Summary
[Flowchart: $N_0$ in-control (IC) units with $n_i$ measurements on unit $i$ → AGP model → $(\tilde\mu_l, \tilde\Sigma_l)$; new unit with measurements $Y_l$ at sites $X_l$ → $T^2$ test / GLR test → Accept (Continue) or Reject (Disposal)]
-
Approximation and Estimation Performance
Approximation Performance
Standard profile (Shpak 1995):
$$f(x) = \sin(x) + \sin(10x/3) + \log(x) - 0.84x + 3$$
Spatially correlated error: $\epsilon(x) \sim \mathrm{GP}(0, 0.05 \times v(\cdot \mid 5))$
[Figure: left panel, predicted mean versus x (3 to 7) showing f(x), measurements, AGP, and OGP; right panel, predicted variance versus x for AGP and OGP]
OGP model: $Y_i(x) = \mu + \epsilon_i(x)$
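The simulation setup on this slide can be sketched as below. The standard profile is taken directly from the formula above; the Gaussian form of $v(\cdot \mid 5)$, the site layout, and the function names are assumptions for illustration.

```python
import numpy as np

def standard_profile(x):
    """Standard profile f(x) from the slide (attributed to Shpak 1995)."""
    return np.sin(x) + np.sin(10.0 * x / 3.0) + np.log(x) - 0.84 * x + 3.0

def simulate_measurements(x, n_profiles, rng, var=0.05, theta=5.0, jitter=1e-10):
    """f(x) plus GP(0, var * v(.|theta)) errors, drawn independently per profile."""
    d = x[:, None] - x[None, :]
    cov = var * np.exp(-theta * d ** 2) + jitter * np.eye(len(x))  # assumed Gaussian v
    eps = rng.multivariate_normal(np.zeros(len(x)), cov, size=n_profiles)
    return standard_profile(x)[None, :] + eps

x = np.linspace(3.0, 7.0, 10)
Y = simulate_measurements(x, n_profiles=10, rng=np.random.default_rng(0))
```

Each row of `Y` is one simulated profile; fitting the AGP and OGP models to such data is what the figure compares.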
-
Approximation and Estimation Performance
Bias and RMSE of MLE
Accuracy of the MLE with different sample size:
(N0, n0)  Metric  µ = 1    σ² = 0.2  θ1 = 3  τ² = 0.05  θ2 = 10
(10,10)   Bias    -0.0043  -0.0189   0.4375  -0.0002    0.7089
(10,10)   RMSE     0.1824   0.1001   1.6348   0.0080    4.3834
(10,20)   Bias    -0.0013  -0.0189   0.1756   0.0001    0.0011
(10,20)   RMSE     0.1831   0.0975   0.9608   0.0066    0.9204
(20,10)   Bias     0.0106  -0.0103   0.2528   0.0000    0.4140
(20,10)   RMSE     0.1903   0.1038   1.1990   0.0056    3.1826
(20,20)   Bias     0.0015  -0.0169   0.1317   0.0002    0.0001
(20,20)   RMSE     0.1850   0.0920   0.7562   0.0045    0.5976
-
Monitoring Performance
Three Change Scenarios
$Y(x) = f(x) + \epsilon(x)$
[Figure: three panels comparing standard and shifted profiles for x from 2.5 to 7.5, one per change scenario: Mean ($\mu$), Variance ($\sigma_2^2$), Correlation ($\theta_2$)]
-
Monitoring Performance
Performance of Different Tests
Three tests to compare:
Max-Min Test
GLR Test
$T^2$ Test
[Figure: beta error versus shift magnitude for the MaxMin, GLR, and $T^2$ tests; panels: Mean, Variance, Correlation]
-
Monitoring Performance
Effect of Testing Sample Size ($n_l$)
-
Monitoring Performance
Effect of In Control Sample Size ($N_0$, $n_0$)
[Figure: beta error versus shift magnitude for the GLR and $T^2$ tests under Mean, Variance, and Correlation shifts, with $(N_0, n_0)$ = (10,10), (20,10), (10,20), (20,20)]
-
Real Application
Monitoring Wafer Thickness Profile
Data are collected from a real production plant;
8 in-control wafers to construct the AGP model, 30 wafers to be tested;
120 measurements from each in-control wafer to construct the AGP model;
480 measurements from each testing wafer to conduct tests.
-
Real Application
Demos of Thickness Profile
[Figure panels: in-control wafer #2; in-control wafer #7; approximated standard profile]
-
Real Application
p-Values of the Tests
[Figure: p-values of the $T^2$ and GLR tests for the 30 wafer surfaces, with the significance level marked]
-
Real Application
Rejected Wafers (p-values)
#12: $T^2$: 0.9250; GLR: $1.3051 \times 10^{-4}$
#23: $T^2$: 0.0018; GLR: $2.2178 \times 10^{-11}$
#24: $T^2$: $3.7191 \times 10^{-4}$; GLR: $3.4084 \times 10^{-14}$
#26: $T^2$: $2.5678 \times 10^{-4}$; GLR: $9.5180 \times 10^{-9}$
#28: $T^2$: $1.1102 \times 10^{-16}$; GLR: 0
#30: $T^2$: $7.2819 \times 10^{-4}$; GLR: $2.5700 \times 10^{-11}$
-
Open issues
Optimal Design for AGP
Nonparametric model, Fisher information matrix is not enough
[Figure: plot with both axes ranging from −60 to 60]
Ordinary space filling design for deterministic experiments
does not consider geometric feature
does not consider the error process
-
Open issues
Optimality Criteria
Prediction accuracy: minimize (integrated) RMSE
Determine $N_0$, $n_0$ and $x_{ij}$
Approximation accuracy of $f(x)$ and error process estimation $\sigma_2^2, \theta_2$
Sequential allocation strategy (Ankenman et al. 2010)
Detection power: minimize $\beta$ error
$T^2$ test: when only $\mu$ changes, the Mahalanobis distance $\delta' \tilde\Sigma_l^{-1} \delta$ determines the power, where
$$\tilde\Sigma_l = \Sigma_l - \Sigma_{l,0} \Sigma_0^{-1} \Sigma_{l,0}^T$$
Constant mean shift: $\max_{X_l} 1' \tilde\Sigma_l^{-1} 1$
D-optimal: $\max_{X_l} \det \tilde\Sigma_l^{-1} \Longrightarrow \min_{X_l} \det \tilde\Sigma_l$
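The link between the Mahalanobis distance and the power of the $T^2$ test can be made explicit with a standard result on quadratic forms of normal vectors. The sketch below assumes a pure mean shift $\delta$ (a vector of length $n_l$) with the covariance $\tilde\Sigma_l$ unchanged; the noncentrality symbol $\lambda$ and the scalar $c$ are introduced here for illustration.

```latex
% Assumption: a pure mean shift \delta with \tilde\Sigma_l unchanged.
Y_l \sim N(\tilde\mu_l + \delta,\ \tilde\Sigma_l)
\;\Longrightarrow\;
T_l^2 \sim \chi^2_{n_l}(\lambda), \qquad
\lambda = \delta' \tilde\Sigma_l^{-1} \delta .

% Power of the test, increasing in \lambda for a fixed limit H_T:
1 - \beta = P\!\left( \chi^2_{n_l}(\lambda) > H_T \right).

% Constant shift \delta = c\,1:
\lambda = c^2 \, 1' \tilde\Sigma_l^{-1} 1 .
```

For a constant shift the site-dependent factor is $1' \tilde\Sigma_l^{-1} 1$, which is the quantity the constant mean shift criterion maximizes over $X_l$.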
-
Open issues
GP with “Covariates”
Surface profile depends on other factors: speed, force, materials, etc.
GP model
GP with independent errors
• Ankenman et al. (2010)
GP with dependent errors
Multivariate output/response
• Co-kriging
• Zhou et al. (2011); Qian et al. (2008)
• Different distance metrics
Surface response
-
Open issues
Conclusion
AGP model is suitable to approximate surface profile andquantify dependent deviations;
A simple and flexible framework for process monitoring
Need to further consider design issues and extend the model to the case with covariates
-
References
Ankenman, B., Nelson, B., and Staum, J. (2010), “Stochastic kriging for simulation metamodeling,” Operations Research, 58, 371–382.
Cressie, N. (1993), Statistics for Spatial Data, revised edition, vol. 928, Wiley, New York.
Doering, R. and Nishi, Y. (2007), Handbook of Semiconductor Manufacturing Technology, CRC Press, Boca Raton, FL.
Jin, R., Chang, C., and Shi, J. (2012), “Sequential measurement strategy for wafer geometric profile estimation,” IIE Transactions, 44, 1–12.
Qian, P. Z. G., Wu, H., and Wu, C. F. J. (2008), “Gaussian Process Models for Computer Experiments with Qualitative and Quantitative Factors,” Technometrics, 50, 383–396.
Rasmussen, C. E. and Williams, C. K. I. (2006), Gaussian Processes for Machine Learning, MIT Press, Boston.
Sacks, J., Welch, W., Mitchell, T., and Wynn, H. (1989), “Design and analysis of computer experiments,” Statistical Science, 4, 409–423.
Santner, T., Williams, B., and Notz, W. (2003), The Design and Analysis of Computer Experiments, Springer, New York.
Shpak, A. (1995), “Global optimization in one-dimensional case using analytically defined derivatives of objective function,” Computer Science Journal of Moldova, 3, 168–184.
Zhao, H., Jin, R., Wu, S., and Shi, J. (2011), “PDE-constrained Gaussian process model on material removal rate of wire saw slicing process,” Journal of Manufacturing Science and Engineering, 133, 21012.1–21012.9.
Zhou, Q., Qian, P. Z. G., and Zhou, S. (2011), “A Simple Approach to Emulation for Computer Models with Qualitative and Quantitative Factors,” Technometrics, 53, 266–273.
-
Thanks and questions