DYNAMIC FEATURE MONITORING TECHNIQUE APPLIED TO …schedules [1]-[2]. Instead, maintenance is almost...

1 Copyright © 2011 by ASME

Proceedings of the ASME 2011 International Manufacturing Science and Engineering Conference MSEC2011

June 13-17, 2011, Corvallis, Oregon, USA

MSEC2011-50041

DYNAMIC FEATURE MONITORING TECHNIQUE APPLIED TO THIN FILM DEPOSITION PROCESSES IN AN INDUSTRIAL PECVD TOOL

Alexander Bleakie University of Texas at Austin

Department of Mechanical Engineering Austin, TX, United States

Dr. Dragan Djurdjanovic University of Texas at Austin

Department of Mechanical Engineering Austin, TX, United States

ABSTRACT

In semiconductor fabrication processes, reliable feature extraction and condition monitoring is critical to understanding equipment degradation and implementing the proper maintenance decisions. This paper presents an integrated feature extraction and equipment monitoring approach based on standard built-in sensors from a modern 300mm-technology industrial Plasma Enhanced Chemical Vapor Deposition (PECVD) tool. Linear Discriminant Analysis was utilized to determine the set of dynamic features that are the most sensitive to different tool conditions brought about by chamber cleaning. Gaussian Mixture Models of the dynamic feature distributions were used to statistically quantify changes of these features as the condition of the tool changed. Data was collected in the facilities of a well-known microelectronics manufacturer from a PECVD tool used for depositing various thin films on silicon wafers, which is one of the key steps in semiconductor manufacturing. Dynamic features coming from the radio frequency (RF) plasma power generator, matching capacitors, pedestal temperature, and chamber temperature sensors were shown to consistently have significant statistical changes as a consequence of repeated cleaning cycles, indicating physical connections to the chamber condition.

INTRODUCTION

Maintenance is one of the key issues in modern

semiconductor manufacturing [1]-[2]. Presently, maintenance scheduling in this industry is predominantly based on the historical reliability information pertaining to individual components, and short-term performance signatures for detecting sudden failures. Due to constantly changing operating conditions, these historical records are hard to interpret and are

often insufficient to make appropriate decisions. Therefore, today’s semiconductor industry does not efficiently use the available built-in sensors to infer or predict the equipment condition in order to implement efficiently the maintenance schedules [1]-[2]. Instead, maintenance is almost exclusively age-based or usage-based, leading to significant waste due to maintenance performed on machines that do not need it or due to missed maintenance actions that result in machine failures, bad product quality, and unscheduled downtimes [2]-[5].

The condition-based maintenance (CBM) paradigm establishes a connection between the equipment condition and its performance expressed in the readings of sensors existing in that equipment [5]-[6]. Such information about the actual condition of the equipment can be used to make maintenance decisions that are optimally synchronized with human and material resources in the manufacturing system, and the least intrusive on the overall manufacturing operations. If implemented successfully, CBM can optimize maintenance decisions for future generations of manufacturing facilities.

Many challenges exist that have prevented the semiconductor manufacturing industry from more pervasively implementing proactive CBM policies based on available equipment sensors, product quality, and equipment reliability. Several of those challenges are [4]-[5]:

1. High complexity in semiconductor manufacturing

systems brings inherent stochastic behavior to a typical facility. Thus, relations between outermost observables in a semiconductor manufacturing system (sensor readings, work in progress levels, metrology and yield information) and the condition of the system and its components are highly complex and stochastic.

2. Equipment health is usually not directly observable with any sensors. Instead, equipment condition must be inferred through in-place tool process sensors.


3. Fragmented and separate maintenance, production, and product quality information brings complex integration problems. These information streams need to be considered simultaneously through a fusion of various data domains [2], [5].

4. Variable operating conditions necessary for the production of a variety of microelectronics place the equipment under highly complex degradation patterns. In a typical semiconductor fabrication facility, different products, operations, and recipes are processed on a single tool, and one same operation could be executed on various tools. This builds complex operating conditions on the tools and makes equipment degradation highly unpredictable. Accurate operation mode-dependent degradation models are needed to truly understand and describe tool behavior in such an environment. The work of finding a set of features (extracted from

equipment sensor readings) that are indicative of the degradation processes and utilizing appropriate condition monitoring techniques is fundamental for enabling proactive CBM [7]. In general, the following techniques have been widely accepted as feature extraction methods for CBM in manufacturing [8]:

Time domain methods: statistical moments, event

counts, energy operators, short-time dynamic features, synchronized averaging…etc.

Frequency domain methods: Fourier analysis, Hilbert transforms, SB ratio, residual analysis, bi-coherence, cyclo-stationarity…etc.

Time-frequency methods: spectrograms, wavelet transforms, Wigner-Ville distribution, Choi-Williams distribution…etc.

Model-based methods: first-principle physics, wideband demodulation, virtual sensors, embedded model matching…etc. Unfortunately, the in-place sensors on even the most

modern semiconductor manufacturing equipment are sampled at very low frequencies (1-10 Hz), compared to some other areas, such as aerospace or rotating machinery applications (sampling frequencies in these areas are often on the order of kHz or even MHz ranges) [7]. This greatly limits the feasibility of frequency and time-frequency methods in this area. Instead, great majority of semiconductor CBM applications utilize raw sensor outputs or their statistical moments (mean, variance, etc.) for monitoring purposes. This is appropriate whenever sensor readings are very closely correlated to the monitored or controlled processes. For example, the thermocouple readings from a chemical vapor deposition (CVD) chamber can be used directly as a monitored feature in statistical process control (SPC) charts. In addition, the in-situ particle count can be used

as an indicator of chamber contamination and is hence directly used for monitoring, if these sensors are available on the equipment. The chemical mechanical planarization (CMP) process is an example where layer thickness, copper residuals, and the wafer temperature are directly measured and used as monitoring parameters in [9]. Spanos et al. [10] depart from this trend of utilization of raw sensor readings for monitoring and present a novel feature extraction method for plasma-etch equipment sensors. Time series models [11] for each of these signals are utilized to analyze residuals of the signals after eliminating trends.

A second fundamental level for enabling proactive CBM policies is condition monitoring of the equipment using the features extracted from the sensor readings. Condition monitoring techniques determine the current system or subsystem conditions based on the outputs from the feature extraction module [7]. SPC and advanced process control (APC), based on statistical or pattern recognition methods, are commonly utilized in semiconductor manufacturing. Montgomery [12] gives a thorough description of SPC, along with applications from various industries. SPC methods can detect statistically significant departures in a time series from normal conditions. They are thus capable of monitoring the process condition and aiding in scheduling maintenance based on current tool conditions. For example, Spanos et al. [10] raised alarms if the univariate T2 statistic [13] (composed of equipment features) falls outside the process control limits established by SPC methods [12]. Bunkofske [14] used SPC for condition monitoring after reducing the number of equipment features using principal component analysis. Mai [15] used SPC in monitoring the contamination inside a lithography tool, which may grow over time to cause errors in the products.

APC methods for process monitoring use elaborate multivariate models that capture process dynamics. These methods are increasingly being employed in recent years to deal with diagnostic challenges associated with larger wafer sizes, higher costs, and smaller critical dimensions of electronic components. Card et al. [16] utilized neural network predictions to monitor and control a plasma etch process. Pompier et al. [17] monitored a multi-chamber oxide deposition process by tracking individual chambers using APC methods. Velichko [18] proposed a nonlinear multiple-input, multiple-output (MIMO) model-based APC framework for semiconductor manufacturing processes, and shows its benefits over tracking raw features. Baek et al. [19] analyzed the electron collisions during plasma processes using APC methods in order to identify small, previously unknown, trends between wet cleans.

This paper presents integrated methods for feature extraction and condition monitoring applied to an industrial Plasma Enhanced Chemical Vapor Deposition (PECVD) tool commonly found in the semiconductor manufacturing industry. The remainder of the paper is organized as follows. In the next section, PECVD tools are described briefly, followed by a description of the layout of the subsystems, sensors, and


operation cycle of the tool used in this study. Afterwards, the following section lists the dynamic features that were extracted from the tool sensors, and introduces the Linear Discriminant Analysis (LDA) as a viable method for determining which of these dynamic features are most sensitive to tool condition changes brought about by maintenance. The overlap of probability density functions, of features yielded by the tool at various stages of its degradation, is introduced as a way to quantify and monitor statistical changes of the dynamic features through time as the condition of the tool changed. Then, the results show one short-term monitoring example (between in-situ clean cycles of the PECVD tool) and one long-term monitoring example (over many in-situ clean cycles) of dynamic features extracted from the tool sensors, as observed over several months of actual production in a major 300mm domestic fab. Finally, conclusions and future work are discussed. DESCRIPTION OF PECVD TOOL

PECVD tools are used for depositing thin films onto silicon wafer substrates, which is one of the crucial steps in manufacturing of microelectronics. It is the most common method for producing conductors and dielectrics with excellent film growth properties necessary for small chip components. Inside a PECVD tool chamber, reactive gases pass over silicon wafers and are absorbed onto the surfaces to form a thin layer. The gases are excited through radio frequency (RF) electrical power that creates energetic plasma used to deposit the film on

the wafers. The plasma state allows the reaction to take place at lower temperatures more suitable for large silicon wafers. Ultimately, many stacked layers of conducting and insulating films with etched patterns between them (forming thousands of microscopic electrical components) form an integrated circuit [20]. PECVD Tool Subsystems and Sensors

A general PECVD tool is composed of a reaction chamber, radio frequency (RF) plasma generation system, gas delivery system, wafer load locks, and a robotic arm to carry wafers to and from the tool. Figure 1 shows a diagram of the main components of a PECVD tool.

The RF matching network for generating plasma is shown on the top-left of the diagram. The high frequency energy is sent through two matching capacitors (load and tune capacitors) that control the power delivered to the chamber. By varying their capacitances, the capacitors try to tune the impedance of the circuit to the load impedance of the chamber and thus deliver maximum RF power to the gases in the chamber [20]. The RF energy excites the flowing gas into the plasma state necessary for lower temperature depositions [20].

The gas delivery system is on the right. It consists of mass flow controllers (MFCs) for each gas used in various depositions. Gas flows at specified durations to ensure processing of specific thin-film recipes. A control valve (bottom of Figure 1) controls the chamber pressure and evacuates deposition gases from the chamber.

Plasma

HF Gen

LF Gen

HF Match

LF Match

Robot Chemistry/

Flow

PECVD Tool Chamber- Sensors

Pump

MFC

MFC

MFC

Pressure

RF Power

Temperature

Showerhead

Pedestal

LL ManoChamber Mano

Pressure

Control Valve

…..

Figure 1: Diagram representing the various components that make up the PECVD tool.


Temperature controlled top and lower chamber plates enclose the chamber and the walls are heated to minimize on-wall deposition and speed up the reaction during the automatic cleaning process. It has been noted that chamber heating may have an adverse effect of degrading the tool faster (e.g., seals) [20]. The PECVD tool used in this study was a standard 300mm wafer tool with numerous in-place sensors measuring the physics of the process (to ensure that the study applies to a real semiconductor fab, only the standard sensors that came with the tool were analyzed and no additional sensors were added). The sensors used were the RF power characteristics (forward, load, and reflected power), voltages of RF matching network capacitors, MFC flow rates, top plate temperature, chamber temperature, pedestal temperatures, chamber pressure, and the pendulum valve angle. Tool Operation and Maintenance Schedule

Silicon Nitride (Si-N), Silicon Dioxide (Si-O2), and Tetraethyl Orthosilicate (TEOS) are some of the most common thin films deposited in PECVD tools, even though other compounds can be used, depending on the conductivity, mechanical and reliability requirements on the film [20].

In addition to the deposition cycles that contribute to the process of chip-making, PECVD tools in semiconductor manufacturing facilities also perform automatic in-situ cleaning programs after a predetermined total film accumulation limit (corresponding to approximately 25-100 wafers, depending on the film thickness). The usual way of performing the in-situ clean is done by flowing plasma-excited Fluorine (F-) into the chamber to eat away deposited films on the tool surfaces. These cleans are performed periodically in order to bring the tool back into a lower state of degradation. Unfortunately, they do not fully bring the tool back to a brand new state. This results in a long-term degradation of the tool, which over time leads to the production of wafers with noticeable defects, unless preventive maintenance (PM) actions are undertaken. Thus, besides the short-term accumulation drifts caused by successive wafer depositions, one can also observe a long-term drift of the tool condition as numerous in-situ clean cycles are executed. Both the long and short-term drifts are indicative of the tool condition.

Figure 2 outlines the scheduling for different levels of PECVD tool cleaning and maintenance. An automatic in-situ clean program is performed after deposition on a predetermined number of wafers (approximately, every 25-100 wafers). The tool loops through these programs using different chemistries and physical parameters until the fixed-time maintenance schedule requires long-term preventive maintenance (approximately, every 25,000-100,000 wafers). This long-term PM action usually consists of a physical wipe down of the chamber and replacements of various critical tool components.

PECVD Tool Degradation

Qual.

Wafer Batch (25-100 wfrs)

In-situ Clean

Preventive Maintenance

Time

Accumulation Drift(ARMA Modeling)

PM Drift(HMM Modeling)

Production

Figure 2: Production cycle of the PECVD tool.

In the remainder of this paper, a set of parameters (chemistry and physical properties) characterizing a given operation will be referred to as an operation recipe. The next section details the methods for feature extraction and condition monitoring for the PECVD tool used in this study.

ARCHITECTURE OF FEATURE EXTRACTION AND CONDITION MONITORING Dynamic Feature Extraction Methodology

Typical dynamic features one obtains from raw time-domain sensor readings include rise time, overshoot, steady state value, and settling times [21]. Additional dynamic features used in condition monitoring include timing between signal actuations, frequencies, and damping coefficient estimates [5]-[6]. Statistical features (mean and variance) are also commonly used for monitoring tool condition. The features extracted from PECVD tool sensor readings in this study are summarized in Table 1. There are 54 features analyzed in total. Analyses are performed on one process at a time (for example, depositions, pre-coats, in-situ clean, etc), by comparing the features of the same process, throughout tool operation and maintenance events.

Table 1: Features extracted from the data analyzed in this study.

Signal Dynamic FeaturesTop Plate Temperature Mean Minimum AmplitudeChamber Temperature Mean Minimum AmplitudePedestal 1 Temperature Mean Minimum AmplitudeLF Forward Power Steady State Error Tune Time LF Load Power Steady State Error Tune TimeLF Reflected Power Steady State Error Tune TimeHF Forward Power Steady State Error Tune TimeHF Load Power Steady State Error Tune TimeHF Reflected Power Steady State Error Tune Time MaximumLoad Capacitor Voltage Steady State Tune Time Overshoot High Overshoot Low Tune Capacitor Voltage Steady State Tune Time Overshoot High Overshoot LowPendulum Valve Angle Steady State MaximumProcess Chamber Pressure Steady State Error Rise Time Overshoot MinimumLiquid Flow Rate 1 Steady State Error Rise Time OvershootLiquid Flow Rate 2 Steady State Error Rise Time Overshoot …….. …….. …….. ……..Liquid Flow Rate N Steady State Error Rise Time OvershootGas Flow Rate 1 Steady State Error Rise Time OvershootGas Flow Rate 2 Steady State Error Rise Time Overshoot …….. …….. …….. ……..Gas Flow Rate N Steady State Error Rise Time Overshoot


In order to perform a multivariate analysis of the numerous features extracted, the features were standardized to eliminate the physical units and thus make them dimensionally homogeneous. Standardization of each feature was accomplished by subtracting its mean and dividing it by its standard deviation, where the mean and standard deviation were calculated from the data set considered representative of normal operation. In addition, an exponential weighted moving average (EWMA) [11] of some raw features was used to smooth out short-term fluctuations, and eliminate noise for long-term trend analyses. Sensitivity Analysis Methodology: Linear Discriminant Analysis

The dynamic features were analyzed for sensitivity by comparing features extracted from the sensor readings emitted during times when the tool operated under different conditions. This comparison has been performed on dynamic features gathered from a given process (for example, deposition) before and after maintenance events.

Changes in the dynamic features that are initiated by the changes in the tool conditions are examined using statistical sensitivity analysis, where the sensitivity of a feature is assessed as the amount of statistical change seen in the feature data as the tool shifts from one condition to another (presence or absence of a fault). Linear Discriminant Analysis (LDA) [22] is used for finding the most sensitive features between two classes of data. This method is one of the most basic classification schemes whenever one wants to find the discrimination boundary of data belonging to known classes. Due to the linear structure, its implementation on feature vectors of large dimensionality is simple, which motivated its use in this work.

LDA is a method used in statistics and machine learning to find a linear combination of features that characterize and separate two or more classes of data the best. The resulting combination may be used as a linear classifier, or, more commonly, for dimensionality reduction before later classification. Other classifiers such as a Quadratic Classifier [22] or a Support Vector Machine [23] may be superior in dealing with multi-class problems, but they require solving of large-scale optimization problems, while LDA [22] requires the solutions to a simple Eigenvalue problem. This is particularly important when one deals with a large feature set, as we do in this study. More details on the LDA are given below:

Suppose the unit-less feature vector extracted at a processing cycle is denoted by ,...4321

TNfffffx (1)

where if represent individual features from sensor readings. From the list of features in Table 1, one can see that this vector will be around dimensionality 54. Let the data set be partitioned into two classes (clean/dirty tool, or pre/post maintenance) denoted 1 and 2 with 1n and 2n samples in each, respectively.

Then, the mean vector for the ith class i , of all the feature vectors belonging to the class is

ixi

i xn

m

1 (2)

The so called “within-class” scatter matrices are calculated as

ix

Tiii mxmxS

))(( (3)

while the overall within class scatter matrix is found as 21 SSSw (4)

LDA finds a vector pointing in the direction of the features having the largest mean differences and variance shifts between the two considered classes. More specifically, the principal discriminant unit vector is found by maximizing the Rayleigh quotient [22], which leads to the solution in the form )( 21

1 mmSw w (5) The unit vector w is used to project the feature vectors in the direction that separates the two classes the best. Projecting the feature vectors onto this line will form a weighted linear combination of the features, where the projected data points, labeled y are found by the inner product between w and the feature vectors x . The scatter of each class (clean and dirty tool) and the difference in the means can also be calculated in the new space along the principal discriminant unit vector ( w ). The scatter of each subset and the difference of the means in the projected space quantify the changes that can be seen in the features belonging to the two classes, as expressed in the direction of the principal discriminant vector w (direction where changes are the most emphasized).

The features with the most weight in the principal discriminant unit vector can be interpreted as being the most sensitive between the two classes analyzed. These features are then tracked using the overlaps of their probability density functions characterizing different tool conditions [6].

Performance Assessment Methodology: Gaussian Mixture Models and Confidence Value Analysis

In order to quantify the changes in the most sensitive features, we can track the probability density functions (PDFs) of the linear combinations of the most sensitive features (probabilistically track changes of feature projections along the w -vector described in the previous section). Alternatively, we can also use a multi-dimensional PDF to simultaneously track several features that were determined by the sensitivity analysis to be the most sensitive to a given tool condition.

The statistical overlap is a highly effective and frequently utilized tool for the tracking of changes in feature PDFs. It combines the changes in PDFs into a single quantifying number and can be defined as the overlapping volume between feature PDFs characterizing normal and current system behavior [6]. As in [6], we will refer to this


overlap as the performance confidence value (CV). The CV between the two probability distributions is calculated as

CV = )()()()(

xfxgxfxg

(6)

where )(xg and )(xf represent the PDFs of the normal and current feature PDFs respectively (random variable of features “x”). ...... denotes the inner product in the space of PDFs

and ... denotes the corresponding norm. From Equation (6), it is easy to see why another possible interpretation of the CV concept is that it is essentially the “cosine” of the angle between any two vectors in the Euclidean vector space of PDFs [24].

In the case of a perfect match (tool operation is identical to that observed during training), the distributions will be directly on top of one another and the CV will be equal to one. As the system degrades, the PDF of the features describing the most recent behavior will drift away from the PDF corresponding to the normal system behavior, thus causing the CV to fall toward zero. Figure 3 shows a graph indicating the CV corresponding to two PDFs.

Figure 3: Definition of a Confidence Value (CV), characterizing the changes in the sensitive feature distributions from [6].

One common way of estimating the PDFs of feature distributions is to describe them using Gaussian Mixture Models (GMMs) [25] because of their universal approximation capabilities and the ability to easily track such PDFs through on-line updating of the corresponding means and covariance matrices with a moving window. Such an approach was successfully used in [26] for monitoring of various dynamic systems.

For single Gaussians, the following formulae express the inner product and norms of the PDFs )(xg and )(xf ,

)]()()(21[

2/

1

)det()2(1)()( gfgf

Tgf mmKKmm

gfn

eKK

xfxg

(7)

)det(2)2(1)(

2/

2

fnn K

xf

(8)

)det(2)2(

1)(2/

2

gnn K

xg

(9)

where n is the dimension of the distribution (number of features), fm and gm are the mean vectors of each class, and

fK and gK are the covariance matrices of each class. For

GMMs, expectation maximization (EM) is used to construct the model, and Equations 7-9 are used along with appropriate weights for each component of the model [25]. IMPLEMENTATION OF FEATURE EXTRACTION AND CONDITION MONITORING METHODS ON A FACTORY DATASET Description of Dataset

The data set used in this study was gathered from a standard 300mm PECVD tool in the facilities of a major domestic manufacturer of integrated circuits. It was obtained from a single tool continuously depositing recipes of TEOS films with various thicknesses. In this study, we focused on the recipe (film thickness) that significantly dominated the operations (about 80% of operation). All the signals in Table 1 are recorded with a sampling rate of 10 Hz. The approximate size of the data set for the recipe considered here is 40,000 wafers, with 1500 in-situ cleans, and two scheduled PM wet cleans (open chamber wipe downs).

Analysis of Short-Term Feature Behavior

The following are the results of the analysis of the features that are the most sensitive to accumulation drifts in between in-situ cleans (i.e. features that probabilistically change the most in between two of the in-situ cleans). The data presented here corresponds to about 80 batches (40 wafers per batch, with in-situ cleans performed about every batch). In this relatively short period (corresponding to a few weeks; due to the proprietary nature of this information, we cannot be more specific), long-term feature drifts or sudden changes were not detected, and the analysis could be focused on short-term accumulation drifts between in-situ cleans.


Intuition about the physics of the tool can give knowledge on what to expect as the most sensitive dynamic features to the in-situ cleans. As described in the Tool Operation and Maintenance Schedule Section, the in-situ cleans are an automatic cleaning cycles performed after a set number of wafers (25-100). Plasma-excited Fluorine (F-) flows into the chamber to eat away deposited films on inside walls. First, since the chamber walls are stripped during in-situ cleaning, the chamber and pedestal temperature features were predicted to be affected by the cleaning. Additionally, the gas flow and chamber pressure features were predicted to be slightly shifted by the cleaning of the chamber due to the cleaner surfaces giving an easier flow. Industry partners have documented that minor changes in the behavior of the deposition chamber (pressure, temperature, gas flow) can bring changes to the chamber load impedance as seen by the RF plasma generator. Therefore, due to the high sensitivity that the RF plasma generator has on slight changes in the chamber load impedance, its tuning capacitors (load and tune) were hypothesized to give the most sensitive features between the in-situ cleans.

The sensitivity analysis was performed on signals obtained just prior to the in-situ cleans (the last 5 depositions just before cleans) and depositions just after in-situ cleans (the first 5 depositions just after cleans). Table 2 lists the top 10 sensitive features found from LDA between the feature sets obtained before and after the in-situ cleans. All other features had significantly smaller weighting in the principal discriminant vector and showed no visible trends in the data between in-situ cleans. Table 2: Results of the Sensitivity Analysis between Pre and Post In-situ Clean Features.

Principal Vector (w) Feature Name 0.60 Load Capacitor Overshoot Low 0.41 Load Capacitor Overshoot High 0.40 Load Capacitor Steady State 0.27 HF Load Power Steady State Error 0.26 Top Plate Temperature Mean 0.25 Top Plate Temperature Minimum 0.22 Pedestal 1 Temperature Minimum 0.21 Pedestal 1 Temperature Mean 0.13 LF Reflected Power Tune Time 0.06 Process Pressure Minimum

One can see that the Load Capacitor features make up the top sensitive features, in agreement with our physical intuition. The load capacitor signal sees a different chamber impedance before and after the in-situ cleaning and its tuning characteristics change in order to deliver the same RF power. Many temperature features appear sensitive; indicating that the chamber and pedestal surfaces shift in temperature before and after the in-situ cleans.

The drifts in these sensitive dynamic features are analyzed by updating the CV every cycle. A “normal operation” GMM was constructed using the depositions just after the in-situ cleans and the CV was updated with each new deposition by re-evaluating the overlap between the normal operation GMM and the GMM of features corresponding to most recent deposition cycles.1

Figure 4, on the next page, presents the results obtained by projecting the features in Table 2 onto the principal discriminant vector. The Figure shows nine processed batches, where the data points are the projected features and the solid line is the corresponding CV for every cycle. The CVs show obvious recoveries after each clean and then gradual drops as the next clean nears. This is indicative of the features having a distribution well aligned with the template distribution immediately after the clean, and then drifting as more and more wafers are deposited.

Due to the consistent behavior of the trending CV, one can use this as an indicator to perform timely in-situ cleans on a PECVD tool. For further implementation, depending on the length of the batch for a particular recipe, the template distribution and the moving window of deposition cycles for CV updating can be adjusted to yield visible CV trends, like those shown in Figure 4.

Analysis of Long-Term Feature Behavior

The following are the results showing the most sensitive dynamic features to a single Preventive Maintenance (PM) event, which occurred during the data collection. This PM consisted of leak checks and a wipe down of the chamber and pedestals. LDA was used to find the features that show the largest statistical change before and after the PM event. The sensitivity analysis was performed between the depositions just prior to the PM (consisting of 3 batches, 120 wafers) and depositions just after the PM (consisting of 3 batches, 120 wafers).

1 The moving window of recent deposition cycles would not cross over the in-situ clean in order to not mix cycles before and after cleans.


2250 2300 2350 2400 2450 2500 2550 2600-3

-2.5

-2

-1.5

-1

-0.5

0

0.5

1

1.5

CVProjected Data

Deposition Cycle

Figure 4: Projected sensitive features and CV displaying statistical drift between in-situ cleans.

The same hypotheses for the most sensitive features are made for the long-term PM as was described in the short-term analysis of the in-situ cleans. It is expected that the RF power capacitors will be most sensitive to the PM due to small chamber impedance shifts propagating into the RF network. However, due to the long period, these features will be less probable of having a consistent trend between PM events. Only the temperature and pressure features were expected to yield any consistent long-term trends.

Table 3 lists the top 10 features showing the largest statistical change, as yielded by the LDA between the feature sets before and after the PM. All other features had significantly less weighting in the principal discriminant vector and showed no visible changes in the data before and after the PM. One can see that many RF power features appear to be sensitive, including features coming from the Load and Tune Capacitors. Temperature and pressure features appear sensitive also, agreeing with our hypotheses.

Table 3: Results of the Sensitivity Analysis between Pre and Post PM Features.

Principal Vector (w) Feature Name 0.49 Process Pressure Steady State Error 0.46 Tune Capacitor Overshoot High 0.39 HF Load Power Steady State Error 0.38 Process Pressure Rise Time 0.36 Load Capacitor Overshoot Low 0.27 LF Load Power Steady State Error 0.18 TEOS Flow Steady State Error 0.08 HF Refl. Power Steady State Error 0.07 Pedestal 1 Temperature Amplitude 0.01 Top Plate Temperature Mean

Clean Boundaries ------


The changes in these sensitive dynamic features indicate that certain subsystems are affected by the wipe down of the chamber and leak checks performed during PM. However, even though these features changed after the PM, there is no indication of long-term trends between two PMs, unlike what was seen between in-situ cleans when the features in Table 2 were analyzed.

Amongst the top 10 sensitive features to the PM event, only the Pedestal 1 Temperature Amplitude displayed a consistent trend that was aligned with the PM. This result was not a surprise because this feature comes from a sensor measuring the temperature of a pedestal that was cleaned during the PM. Build-up of residual films on the pedestal over a long

period gradually affects its thermal resistance and emissivity, which is visible in the temperature behavior.

Figure 5 displays the scaled Pedestal 1 Temperature Amplitude along with the corresponding CVs obtained using a template distribution yielded by temperatures from 75 batches (3000 wafers) immediately after the PM. The CVs use bi-modal GMMs that are fit to the scaled Pedestal 1 Temperature Amplitude data. The data points in Figure 5 are the scaled Pedestal 1 Temperature Amplitudes and the solid line denotes the CVs for every cycle. The strong trends in the scaled temperature feature and the corresponding CVs, as well as the recovery after the PM are obvious. Thus, the pedestal temperature sensor could be used as a trigger for long-term PMs on this tool.

0 2000 4000 6000 8000 10000 12000 14000 16000 18000

-1

0

1

2

3

4

5

6

Scal

ed T

empe

ratu

re A

mpl

itude

& C

V

Scaled FeatureCV

Deposition Cycle

Figure 5: Scaled Pedestal 1 Temperature Amplitude feature and CV displaying statistical drift between PM. CONCLUSIONS AND FUTURE WORK

The work presented in this paper is an integrated feature extraction and equipment monitoring approach based on in-place sensors from a commonly used industrial PECVD tool. The use of Linear Discriminant Analysis (LDA) and Gaussian Mixture Models (GMMs) of dynamic features were used to recognize and track features that show the most prominent statistical change due to tool degradation and maintenance events. The concept of CVs, based on the overlaps of GMMs of features that are the most sensitive to maintenance events is shown to be a reliable condition monitoring indicator that combines both mean and variance changes of features, and

scales the multi-dimensional feature trends into a number between zero and one. These techniques were applied to the identification of sensitive features and their monitoring over short-term and long-term maintenance cycles of PECVD tool operation. Sensitive dynamic features were shown to agree with predictions made based on physical knowledge about the tool process. Using the feature extraction and condition monitoring techniques described in this work, further research can be done to explore proper degradation models and maintenance decisions for PECVD tools (and possibly others) in a fabrication facility.

Our current research efforts are aimed at utilizing the features discussed in this paper to build dynamic models of the

PM ------


evolution of the tool condition in order to predict the degradation states. Hidden Markov Models [27], regression models [11], and similarity-based predictive [28] approaches are all viable options for formulating statistical degradation models in the space of sensitive features. The ultimate goal will be to use these models to optimize the maintenance scheduling in a semiconductor fabrication facility.

ACKNOWLEDGMENTS This work was supported in part by the International

SEMATECH Manufacturing Initiative (ISMI).

REFERENCES

[1] http://www.sematech.org/meetings/archives/mfg/8194/8194.pdf

[2] Fulton, S., and Kim, M., 2007, “ISMI Consensus Preventive and Predictive Maintenance Vision Ver. 1.1,” International SEMATECH Manufacturing Initiative.

[3] Buchner, A., Anand, S., and Hughes, J., 1997, “Data Mining in Manufacturing Environments: Goals, Techniques and Applications,” Studies in Informatics and Control, 6(4), pp. 319-328.

[4] Liu, Y., Kumar, P., Zhang, J., Djurdjanovic, D., and Ni, J., 2005, “Predictive Modeling and Intelligent Maintenance Tools for High Yield Next Generation Fab,” TECHCON.

[5] Djurdjanovic, D., and Liu, Y., 2006, "Survey of Predictive Maintenance Research and Industry Best Practice," University of Michigan, Ann Arbor, MI.

[6] Djurdjanovic, D., Lee, J., and Ni, J., 2003, “Watchdog Agent -An Infotronics Based Prognostics Approach for Product Performance Assessment and Prediction,” International Journal of Advanced Engineering Informatics, Special Issue on Intelligent Maintenance Systems, (17)3-4, pp. 109-125.

[7] Lebold, M., Reichard, K., and Boylan, D., 2003, "Using DCOM in an Open System Architecture Framework for Machinery Monitoring and Diagnostics," IEEE Aerospace Conference, pp. 1227–1235.

[8] Li, J. C., 2006, "Signal processing in manufacturing monitoring," Condition Monitoring and Control for Intelligent Manufacturing: Springers, pp. 245- 265.

[9] Tang, J., Dornfeld, D., Pangrle, S. K., and Dangca, A., 1998, "In-process Detection of Microscratching During CMP using Acoustic Emission Sensing Technology," Journal of Electronic Materials, (27), pp. 1099-1103.

[10] Spanos, C., Guo, H., Miller, A., and Levine-Parrill, J., 1992, “Real Time Statistical Process Control Using Tool Data,” IEEE Transactions on Semiconductor Manufacturing, (5), pp. 308-318.

[11] Pandit, S. M. and Wu, S. M., 1983, Time Series and System Analysis with Applications, John Wiley.

[12] Montgomery, D. C., 2001, Introduction to statistical quality control, New York, John Wiley (4).

[13] Harris, R., 1975, A Primer of Multivariate Statistics, New York: Academic Press.

[14] Bunkofske, R. J., Pascoe, N. T., Colt, J. Z., and Smit, M. W., 1996, "Real-time process monitoring," Proceedings of the 1996 7th Annual IEEE/SEMI Advanced Semiconductor Manufacturing Conference, ASMC 96, Nov 12-14 1996, Cambridge, MA, USA, pp. 382-390.

[15] Mai, K., and Tuckermann, M., 2005, "SPC based in-line reticle monitoring on product wafers," IEEE/SEMI Advanced Semiconductor Manufacturing Conference and Workshop-Advancing Semiconductor Manufacturing Excellence, Munich, Germany, pp. 185-189.

[16] Card, J. P., Naimo, M., and Ziminsky, W., 1998, "Run-to-run Process Control of a Plasma Etch Process with Neural Network Modeling," Quality and Reliability Engineering International, (14), pp. 247-260.

[17] Pompier, D., Giuliani, C., Csejtei, N., Paire, E., and Cholvy, J., 2006, "Oxide Deposition Monitored through APC System," 7th AEC/APC Europe, Aix-en-Provence.

[18] Velichko, S. A., 2004, "Multi-parameter Model Based Advanced Process Control," Advanced Semiconductor Manufacturing, IEEE Conference and Workshop, pp. 443-447.

[19] Baek, K. H., Jung, Y., Min, G. J., Kang, C., Cho, H. K., and Moon, J. T., 2005, "Chamber Maintenance and Fault Detection Technique for a Gate Etch Process via Self-Excited Electron Resonance Spectroscopy," Journal of Vacuum Science; Technology B (Microelectronics and Nanometer Structures), (23), pp. 125-129.

[20] Geng, Hwaiyu, 2005, Semiconductor Manufacturing Handbook, McGraw-Hill Handbooks, NY, (14).

[21] Franklin, G., Emami-Naeini, A., 2006, Feedback Control of Dynamic Systems, Pearson and Prentice Hall, NJ, (5), pp. 115-121.

[22] Duda, R., Hart, P., and Stork, D., 2001, Pattern Classification, Wiley & Sons, NY, (2), pp. 117-124.

[23] Cristianini, N., and Shawe-Taylor, J., 2000, An Introduction to Support Vector Machines, Cambridge University Press, Cambridge.

[24] Lay, D., 2005, Linear Algebra and its Applications, Addison Wesley, (3).

[25] Lindsay, B., 1995, Mixture Models: Theory, Geometry and Applications, Institute of Mathematical Statistics, CA, pp. 1-6.

[26] Liu, J., 2007, “Autonomous Anomaly Detection and Fault Diagnosis,” PhD thesis, University of Michigan.

[27] Liu, Y., 2008, “Predictive Modeling for Intelligent Maintenance in Complex Semiconductor Manufacturing Processes,” PhD thesis, University of Michigan.

[28] Liu, J., Djurdjanovic, D., Ni, J., Casoetto, N., and Lee, J., 2007, “Similarity Based Method for Manufacturing Process Performance Prediction and Diagnosis,” Computers in Industry, (58), pp. 558–566.

DYNAMIC FEATURE MONITORING TECHNIQUE APPLIED TO …schedules [1]-[2]. Instead, maintenance is almost...

Documents

Transcript of DYNAMIC FEATURE MONITORING TECHNIQUE APPLIED TO …schedules [1]-[2]. Instead, maintenance is almost...