Research Papers

Prediction of Material Removal Rate for Chemical Mechanical Planarization Using Decision Tree-Based Ensemble Learning

Zhixiong Li

Department of Mechanical and Aerospace Engineering,
University of Central Florida,
Orlando, FL 32816
e-mail: zhixiong.li@Knights.ucf.edu

Dazhong Wu

Department of Mechanical and Aerospace Engineering,
Department of Industrial Engineering and Management Systems,
University of Central Florida,
Orlando, FL 32816
e-mail: dazhong.wu@ucf.edu

Tianyu Yu

Department of Mechanical and Aerospace Engineering,
University of Central Florida,
Orlando, FL 32816
e-mail: tianyu.yu@ucf.edu

1Corresponding author.

Manuscript received March 27, 2018; final manuscript received November 17, 2018; published online January 17, 2019. Assoc. Editor: Qiang Huang.

J. Manuf. Sci. Eng. 141(3), 031003 (Jan 17, 2019) (14 pages). Paper No: MANU-18-1189; doi: 10.1115/1.4042051

Chemical mechanical planarization (CMP) has been widely used in the semiconductor industry to create planar surfaces through a combination of chemical and mechanical forces. A CMP process is complex because it involves several chemical and mechanical phenomena (e.g., surface kinetics, electrochemical interfaces, contact mechanics, stress mechanics, hydrodynamics, and tribochemistry). Predicting the material removal rate (MRR) in a CMP process with sufficient accuracy is essential to achieving a uniform surface finish. While physics-based methods have been introduced to predict MRRs, little research has been reported on monitoring and predictive modeling of the MRR in CMP. This paper presents a novel decision tree-based ensemble learning algorithm for training a predictive model of the MRR. The stacking technique is used to combine three decision tree-based learning algorithms, namely random forests (RF), gradient boosting trees (GBT), and extremely randomized trees (ERT), via a meta-regressor. The proposed method is demonstrated on data collected from a CMP tool that removes material from the surface of wafers. Experimental results show that the decision tree-based ensemble learning algorithm using stacking can predict the MRR in the CMP process with very high accuracy.
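The stacking scheme described in the abstract can be illustrated with a minimal sketch using scikit-learn. This is not the authors' implementation: the data are synthetic stand-ins generated with `make_regression` (not the CMP dataset), the hyperparameters are arbitrary, and the choice of a single decision tree as the meta-regressor is an assumption made only for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    ExtraTreesRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption): features play the role of extracted
# process features, and the target plays the role of the MRR.
X, y = make_regression(
    n_samples=400, n_features=20, n_informative=8, noise=5.0, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# First layer: the three decision tree-based base learners named in the abstract.
# The number of trees is an illustrative value, not taken from the paper.
base_learners = [
    ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
    ("gbt", GradientBoostingRegressor(n_estimators=50, random_state=0)),
    ("ert", ExtraTreesRegressor(n_estimators=50, random_state=0)),
]

# Second layer: a meta-regressor trained on cross-validated predictions of the
# base learners. A shallow decision tree is an assumed choice for this sketch.
stack = StackingRegressor(
    estimators=base_learners,
    final_estimator=DecisionTreeRegressor(max_depth=4, random_state=0),
    cv=5,
)
stack.fit(X_train, y_train)

# Evaluate on the held-out validation split.
y_pred = stack.predict(X_val)
rmse = mean_squared_error(y_val, y_pred) ** 0.5
r2 = r2_score(y_val, y_pred)
print(f"RMSE = {rmse:.3f}, R2 = {r2:.3f}")
```

`StackingRegressor` fits each base learner on the training data and trains the meta-regressor on out-of-fold predictions, which keeps the second layer from simply memorizing the base learners' training-set fit. The scores printed here describe the synthetic data only and say nothing about the results reported in the paper.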

Copyright © 2019 by ASME


Figures

Fig. 1 A predictive modeling framework based on ensemble learning

Fig. 2 Two-layer ensemble learning using stacking

Fig. 3 Schematic diagram of the CMP process

Fig. 4 Variable importance of the extracted 85 features

Fig. 5 Prediction performance versus a varying number of features: (a) R2, (b) RE, (c) S-score, (d) RMSE, and (e) training time

Fig. 6 Variability of RMSE for GBT, RF, and ERT algorithms

Fig. 7 Prediction performance on the validation dataset using GBT and 35 features

Fig. 8 Prediction performance on the validation dataset using RF and 35 features

Fig. 9 Prediction performance on the validation dataset using ERT and 35 features

Fig. 10 Prediction performance on the validation dataset using GBT and 85 features

Fig. 11 Prediction performance on the validation dataset using RF and 85 features

Fig. 12 Prediction performance on the validation dataset using ERT and 85 features

Fig. 13 Stacked ensemble results using different number of features: (a) R2, (b) RE, (c) S-score, and (d) RMSE

Fig. 14 Prediction performance for the validation dataset in stage A using the stacked ensemble: ((a) and (b)) stacking-CART and ((c) and (d)) stacking-ELM

Fig. 15 Prediction results for the validation dataset in stage B using the stacked ensemble: ((a) and (b)) stacking-CART and ((c) and (d)) stacking-ELM

Fig. 16 Training time of base learners using different number of trees
