

# POLITEHNICA UNIVERSITY OF BUCHAREST



# Doctoral School of Electronics, Telecommunications and Information Technology

**Decision No.** 779 from 03-12-2021

# Ph.D. THESIS SUMMARY

#### Elena-Diana GROSU-ŞANDRU

# DATA-DRIVEN FABRICATION PROCESS VARIATION ASSESSMENT IN CIRCUIT DESIGN ANALYSIS USING MACHINE LEARNING

## EVALUAREA VARIAȚIEI PROCESULUI DE FABRICAȚIE ÎN ANALIZA PROIECTĂRII CIRCUITELOR BAZATĂ PE DATE, UTILIZÂND MACHINE LEARNING

#### THESIS COMMITTEE

| Member | Affiliation | Role |
|--------|-------------|------|
| **Prof. Dr. Ing. Mihai CIUC** | Politehnica Univ. of Bucharest | President |
| **Prof. Dr. Ing. Corneliu BURILEANU** | Politehnica Univ. of Bucharest | PhD Supervisor |
| **Prof. Dr. Ing. Marina TOPA** | Technical Univ. of Cluj-Napoca | Referee |
| **Dr. Rer. Mat. Georg PELZ** | Infineon Technologies AG & Univ. Duisburg-Essen | Referee |
| **Prof. Dr. Ing. Cristian RAVARIU** | Politehnica Univ. of Bucharest | Referee |

**BUCHAREST 2021** 

# **Table of contents**

- 1 Introduction
  - 1.1 Process Variation Impact on ICs
    - 1.1.1 Machine Learning in the Semiconductor Industry
  - 1.2 Problem Description
  - 1.3 Motivation
  - 1.4 Scope of the Research
  - 1.5 Thesis Structure
- 2 Related Work
  - 2.1 IC Verification and PCM Parameters
    - 2.1.1 Pre-Silicon Verification
    - 2.1.2 Post-Silicon Validation
    - 2.1.3 DoE Testing and Production Testing
    - 2.1.4 Process Control Monitor
  - 2.2 Sensitivity Analysis
    - 2.2.1 Global Sensitivity Analysis
    - 2.2.2 Local Sensitivity Analysis
  - 2.3 Model Checking
  - 2.4 Yield Prediction
    - 2.4.1 General Concepts
    - 2.4.2 Advanced Approaches
- 3 Theoretical Fundamentals
  - 3.1 Regression Algorithms
    - 3.1.1 Supervised Learning
    - 3.1.2 State-of-the-Art Regression Algorithms
  - 3.2 Model Improvement through Feature Selection
  - 3.3 Bayesian Optimization
- 4 The Verification for Manufacturability by Modelling Process Variation - Circuit Performance Dependency Methodology (P2P4M)
  - 4.1 Methodology Flow
    - 4.1.1 General Framework
    - 4.1.2 Problem Formulation and **P2P4M** Methodology Flow
  - 4.2 Feature Selection Block
  - 4.3 Model Fitting based on preSi Data
  - 4.4 Bayesian Optimization of the Fitted Model
  - 4.5 P2P4M Methodology Summary
  - 4.6 postSi Distribution Modelling and Validation of the Methodology
  - 4.7 The Application of **P2P4M** Methodology
    - 4.7.1 Unifying Feature Selection and Hyperparameter Bayesian Optimization
    - 4.7.2 Functional Dependency Modelling and Finding the Influential PCM
    - 4.7.3 preSi and postSi Distribution Modelling and Validation
    - 4.7.4 Methodology Reliability
- 5 P2P4M Methodology Use-Cases
  - 5.1 Use-case 1: Sensitivity Analysis with Process Variation
    - 5.1.1 Global Sensitivity Analysis Methodology
    - 5.1.2 Local Sensitivity Analysis Methodology
    - 5.1.3 Experimental Results
  - 5.2 Use-case 2: Simulation Model Checking
    - 5.2.1 Methodology Flow
    - 5.2.2 Experimental Results
  - 5.3 Use-case 3: Yield Prediction
    - 5.3.1 preSi Yield Prediction for Normal Distributions Methodology
    - 5.3.2 Parametric Yield Prediction Methodology
    - 5.3.3 Experimental Results
- 6 Conclusions
  - 6.1 General Objectives and Results
  - 6.2 Original Contributions
  - 6.3 List of Original Publications
  - 6.4 Future Work
- References

# Chapter 1

## Introduction

The general goal of this research is to develop strategies, methodologies and tools that assess the impact of process variation on circuit design analysis by linking the reality of an existing fabrication process to a new IC design, thus enhancing the manufacturer's ability to increase production.

#### 1.1 Process Variation Impact on ICs

Process variation is a long-standing issue in the manufacturing industries. It is therefore not a new topic for the semiconductor industry, where it has been addressed for nearly 50 years [23] [21]. Process variation is defined as the deviation of parameters from their nominal specifications [18], while being an entirely random process. It usually occurs in transistor attributes, such as channel lengths or gate widths, which are in general easier to control, or in the substrate doping profile and quantity, which are much harder to control. Process variation can be classified into systematic variation and random variation [13], both considered equally important. Moreover, the variation can be spatial or temporal and may occur lot-to-lot, wafer-to-wafer, inter-die or intra-die. It also translates into circuit performance discrepancies, almost always in the form of degradation. Still, circuit performances are affected differently, depending on a series of factors ranging from circuit logic to circuit implementation. Therefore, in order to reduce process variation and limit its undesired effects, process variation management has been introduced [18], bringing together a series of techniques.

#### 1.1.1 Machine Learning in the Semiconductor Industry

The potential of Machine Learning (ML) in the semiconductor industry is great: it can address the development and production challenges from design through to fab manufacturing, helping to speed up product development [28], [27].

#### 1.2 Problem Description

The deviation from the nominal process parameters introduced by process variation translates into limited IC performance, i.e. the IC product no longer works as intended, which is a major concern. The circuit performance parameters, also known as Electrical Parameters (EP) or probe-test parameters, are measures of circuit performance given a specific set of operating conditions. They are simulated during the pre-silicon (preSi) verification and measured during post-silicon (postSi) validation, at different temperatures. The Process Control Monitor Parameters (PCM), also named technology parameters, represent a direct measure of the process variation management during production, performed on a few special narrow structures called Process Control Monitor structures, located among the productive dies.

It is currently possible to correlate the EP to the PCM, which are a closer indication of the process variation, during postSi validation, by using Design of Experiments (DoE) lots. However, this analysis is performed late, after the long, costly and time-consuming steps of production and testing. Also, in postSi, the number of measurements of each PCM on a wafer is far smaller than the number of measurements available for each EP (a dozen versus thousands), making the analysis inaccurate due to the lack of 1-to-1 correspondence between the two types of parameters. Consequently, in today's production processes there is a high demand for an accurate estimation of the relationship between IC performances and the process variation early in the design and production process, as it would significantly reduce the time-to-market and overall costs of the products.

#### 1.3 Motivation

The first motivation of the thesis is to develop a methodology able to predict the relationship between the circuit performances and the process variation (using technology parameters as indicators) at an early stage, by employing the PCM for an alternative use. The second motivation is to extract new and valuable knowledge from historical production data, using ML techniques. The third motivation relates to the areas where such a relationship could be helpful: earlier diagnosis of circuit responses sensitive to process variation, improvement of yield prediction and diagnosis of inaccuracies in the simulation device model. The fourth motivation is linked to other problems that may be addressed based on an early relationship between EP and PCM: fab-to-fab migration, the approximation effect of the device models, technology-specific DoE plans, root cause analysis and IC model checking with process variation.

#### 1.4 Scope of the Research

The main objective of this thesis is to develop a comprehensive methodology for accurately assessing the process variation impact on circuit design at an early stage. The methodology relies on data acquired from different verification phases (preSi simulations and postSi measurements) and aims to understand and predict the relationship between the technology parameters (as indicators for the process variations) and the circuit performance parameters. First, this thesis introduces the methodology for modeling the early-stage functional and statistical dependencies between the circuit performances and the process variation, in the form of reliable prediction *metamodels*. Further, this thesis proposes three more methods that extend the above-mentioned methodology to other areas of interest for the semiconductor industry: sensitivity analysis with process variation (global and local), simulation model checking and multivariate parametric yield prediction.

#### 1.5 Thesis Structure

The rest of this thesis is divided into 5 chapters. Chapter 2 presents the related work, with a highlight on the state-of-the-art methods involving the use of process variation measures in functional verification. Chapter 3 describes the theoretical fundamentals of the ML regression algorithms and of the related methods employed in this thesis. Chapter 4 introduces the proposed methodology for accurately assessing the process variation impact on circuit design at an early stage: the Verification for Manufacturability by Modelling Process Variation - Circuit Performance Dependency methodology (**P2P4M** - Process to Performance for Manufacturability), capable of offering insights into the process variation impact for manufacturability purposes. Chapter 5 highlights a series of use-cases where the accurate regression *metamodels* can be employed. Lastly, Chapter 6 draws the final conclusions, along with a concise summary of the contributions described in the thesis and the perspectives for future research.

# Chapter 2

## Related Work

#### 2.1 IC Verification and PCM Parameters



Fig. 2.1 IC Design and Production verification stages

#### 2.1.1 Pre-Silicon Verification

The preSi verification involves testing the circuit during the design stage, to ensure the functional correctness of the design before tape-out. This implies evaluating the IC design in a virtual environment, through simulations, under various scenarios based on sets of operating conditions. The techniques are more diverse and mature, but very often they do not provide sufficient coverage, making it impossible to remove all bugs at the design stage [1]; in practice, despite all efforts, the process corners tend to be insufficiently tested.

#### 2.1.2 Post-Silicon Validation

The postSi validation goal is to certify the correct behavior of the manufactured chip considering a set of predefined operating conditions [17]. It is operated on the first manufactured chips (usually obtained on test wafers) in actual environments.

#### 2.1.3 DoE Testing and Production Testing

During DoE testing, the first fabricated wafers (DoE lots) are processed and tested to decide whether the design will be unaffected by future process variations [7]. The main objective of production testing is to certify that the packaged ICs function in full compliance with their data sheet specifications.

#### 2.1.4 Process Control Monitor

In order to keep the process variation under control, it is constantly supervised during the manufacturing process, using the Process Control Monitor technique. The measurements are performed on narrow electrical structures placed on the wafer among the productive dies. The outcome is the PCM: device parameters (usually very numerous) that cover distinct areas, namely device features, metallization attributes and electrical defect monitoring. They are used to characterize and control the technology with reference to the technology specification [17]. Simulating the Process Control Monitor schematics represents a new paradigm, first presented in [26], to help select the most important test structures. In [19], the simulated and measured PCM are used to solve the covariance equation for statistical modeling.

#### 2.2 Sensitivity Analysis

#### 2.2.1 Global Sensitivity Analysis

Global SA analyzes the relationship between the uncertainty in a system's outputs and the uncertainty in each input factor, evaluated over the entire range of each input factor [33]. Its global character derives from the simultaneous variation of the input factors. The SA methodologies found in the literature can be categorized as follows: variable-based methods (where the factors' influence on the output is quantified directly, based on correlation-like metrics) [24] and model-based methods (where the functional dependencies between the input factors and the output are derived first and the resulting model is then used to quantify the influence) [8].

#### 2.2.2 Local Sensitivity Analysis

Local SA assesses the local impact of inputs on the system's response in the proximity of a set of predefined values. It can be useful in several semiconductor areas, e.g. the DoE case, but one of its main drawbacks is the high computational cost [5]. Usually, the local SA is computed using gradients or partial derivatives of the output function at specific input factor values, while the values of the other input factors are kept constant [34].
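As a minimal sketch of the gradient-based computation described above, the partial derivatives can be approximated by central finite differences, perturbing one input factor at a time while the others stay fixed. The function `toy_response`, the nominal point and the step size are illustrative assumptions, not part of the thesis.

```python
def local_sensitivities(f, x0, rel_step=1e-4):
    """Approximate df/dx_i at x0 by central differences, one factor at a time."""
    gradients = []
    for i, xi in enumerate(x0):
        # Step proportional to the factor value (absolute step if the value is 0).
        h = rel_step * (abs(xi) if xi != 0 else 1.0)
        upper = list(x0)
        lower = list(x0)
        upper[i] = xi + h
        lower[i] = xi - h
        gradients.append((f(upper) - f(lower)) / (2.0 * h))
    return gradients


def toy_response(x):
    # Hypothetical system response: y = 3*x0 + x1^2
    return 3.0 * x[0] + x[1] ** 2


sensitivities = local_sensitivities(toy_response, [1.0, 2.0])
```

For this toy response the analytical partial derivatives at (1, 2) are 3 and 4, which the finite-difference estimates reproduce closely. Each factor requires two evaluations of the response, which illustrates why the cost grows when every evaluation is a circuit simulation.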

#### 2.3 Model Checking

Model checking is part of formal verification and represents the primary technique to inspect the behavior of the circuit model over time, by establishing whether it meets a given specification [22]. Statistical model checking represents a simulation-based approach used to verify the statistical properties of complex circuits, where traditional model checking is not applicable [14]. Statistical model checking can become inefficient and time consuming when the verification requires the simulation of a large number of rare (low probability) events. It can be accelerated by generating the rare events more frequently. Model verification in terms of process variation is the next step towards efficient model checking, due to the high importance of determining the effect of process variation on the design specification. The work in [32] addresses the problem of estimating a safe region of the parameter space that ensures the design specifications, based on parameters originating from process variations.

#### 2.4 Yield Prediction

#### 2.4.1 General Concepts

The yield represents the percentage of IC chips that fulfill the specifications. The Out-of-Spec (OOS) count is the actual count of the chips falling outside the specification limits. When this count metric is applied to a large number of samples (thousands), it can offer a reliable estimation with low variance.

$$OOS = \frac{\text{No. of chips out-of-spec}}{\text{Total no. of chips}} \tag{4.1}$$
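The OOS ratio above can be illustrated with a short sketch. The measurement values and specification limits below are made-up numbers, and the yield is taken as the complement of the OOS ratio, following the definition of yield as the share of chips that fulfill the specifications.

```python
def oos_fraction(values, lower, upper):
    """Fraction of samples falling outside the [lower, upper] specification limits."""
    out_of_spec = sum(1 for v in values if v < lower or v > upper)
    return out_of_spec / len(values)


# Hypothetical EP measurements and specification limits (illustrative only).
measurements = [0.98, 1.02, 1.10, 0.91, 1.00, 1.05, 0.89, 1.01]
oos = oos_fraction(measurements, lower=0.90, upper=1.08)
yield_estimate = 1.0 - oos
```

Here two of the eight samples (1.10 and 0.89) violate the limits, so the OOS ratio is 0.25 and the yield estimate is 0.75. With so few samples the estimate has a high variance, which is why the count is only considered reliable on thousands of samples.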

Parametric yield prediction mainly involves forecasting either by generating samples in the preSi stage using MC simulations (enough of them to be statistically relevant) and then applying failure count methods [9] (simulate-and-count approaches), or by applying statistical methods on postSi (measured) data.

#### 2.4.2 Advanced Approaches

Clear improvements in terms of accuracy and computational effectiveness have been highlighted when including diversified measures of the process variation. A new methodology for yield estimation relying entirely on silicon measurements is introduced in [2]. In [31], the authors exploit the wafer-level spatial correlations between e-test measurements (scribe line test structures) and probe-test measurements (circuit performances), using GP-based regression algorithms to predict the probe-test measurements for the remaining die locations on the wafer.

# Chapter 3

## Theoretical Fundamentals

#### 3.1 Regression Algorithms

#### 3.1.1 Supervised Learning

Supervised learning algorithms create a predictive model, i.e. a function that maps the inputs to an output, based on labeled input-output training data [10], [15]. Consequently, supervised learning relies on the availability of the target response and aims at minimizing a cost function when approximating the inferred function. This function is then used to predict the output of the system for new inputs that are not part of the training data.

The flow of a supervised learning algorithm is depicted in Fig. 3.1; besides the training dataset, it involves an error signal used to refine the algorithm in order to determine the best prediction model.

#### 3.1.2 State-of-the-Art Regression Algorithms

A typical regression problem, i.e. fitting a model that links an output to a set of input predictors, can be addressed through several regression algorithms: linear, Ridge, Support Vector Machines, Gaussian Process (GP) and Multilayer Perceptron (MLP) Neural Network (NN).



Fig. 3.1 Supervised learning flow

#### 3.2 Model Improvement through Feature Selection

Feature selection, also known as variable selection, is the procedure of selecting a subset of relevant features (input variables) that explain the response, while removing the others from the regression. It is an important step in most regression problems [3], [16]. The redundancy is thus minimized and the relevant information maximized, in order to obtain the best regressor, fitted on a reduced variable space [30].

Based on the search strategy, feature selection methods can be classified into three categories: filter methods, wrapper methods and embedded methods.

#### 3.3 Bayesian Optimization

Hyperparameter optimization, or tuning, can be defined as selecting a set of optimal hyperparameters for a specific ML algorithm. It is another important problem that needs to be addressed when training regressors, since their performance depends heavily on the hyperparameter choice. The simplest ways of searching for the optimal hyperparameters are Grid Search (GS) and random search. Another approach is automated hyperparameter tuning. One such technique is Bayesian Optimization (BO), one of the most powerful methods for finding the extrema of objective functions whose evaluation is expensive [4], as it uses surrogate optimization algorithms to iteratively map the dependence of the error on the hyperparameters, given the dataset [4], [25].

BO employs two elements in solving the optimization problem. The first is a probabilistic surrogate function, comprising a prior distribution that captures beliefs about the behavior of the objective function (function evaluations data) and an observation model (posterior distribution) that describes how the data are generated. The second is an acquisition function used to guide the search by indicating the next point to be evaluated, planned based upon the posterior distribution. One of the main characteristics of BO is that it does not rely on local gradient and Hessian approximations (it is derivative-free); instead, it uses all the information available from previous evaluations of the function [25]. Consequently, it can find the minimum even of difficult non-convex functions in relatively few iterations, at the price of spending more time on determining the next sample point. In addition, it can easily cope with uncertainties related to black-box stochastic functions, although it can become rather slow in problems with a large number of hyperparameters.
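To make the role of the acquisition function more concrete, the sketch below evaluates the plain expected-improvement criterion for a minimization problem, given the posterior mean and standard deviation of the surrogate at a candidate point. This is the textbook closed form under a Gaussian posterior, not necessarily the exact variant used in this thesis; the candidate values are invented for illustration.

```python
import math


def expected_improvement(mu, sigma, best):
    """Expected improvement over `best` (minimization) for a Gaussian posterior."""
    if sigma <= 0.0:
        # No posterior uncertainty: the improvement is deterministic.
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best - mu) * cdf + sigma * pdf


# Hypothetical posterior (mean, std) of the error at two candidate points.
candidates = {"A": (0.040, 0.001), "B": (0.045, 0.020)}
best_so_far = 0.041
scores = {name: expected_improvement(mu, sigma, best_so_far)
          for name, (mu, sigma) in candidates.items()}
next_point = max(scores, key=scores.get)
```

In this example candidate B is selected although its predicted mean is worse than that of A, because its larger posterior uncertainty leaves more room for improvement. This trade-off between exploiting good predictions and exploring uncertain regions is what allows BO to reach a good optimum in few function evaluations.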

# Chapter 4

## The Verification for Manufacturability by Modelling Process Variation - Circuit Performance Dependency Methodology (P2P4M)

This chapter introduces the proposed methodology for assessing the process variation impact on circuit design. The proposed Verification for Manufacturability by Modelling Process Variation - Circuit Performance Dependency methodology (**P2P4M**) is based on modeling the functional and statistical dependencies between the circuit performances (EP) and the manufacturing process variation (PCM), at an early stage, by employing ML algorithms on preSi data.

#### 4.1 Methodology Flow

#### 4.1.1 General Framework

A schematic representation of a unified feature selection and hyperparameter BO methodology for training regression models is presented in Figure 4.1.

#### 4.1.2 Problem Formulation and P2P4M Methodology Flow

The goal of the **P2P4M** methodology is to express an optimal relationship under the form of equation 1.1 through a ML model, by only using preSi MC simulation and no additional prior knowledge regarding the possible dependencies between the considered variables, namely each one of the EP and PCM, as Figure 4.2 shows.

$$EP = f(PCM_i) + \varepsilon \tag{1.1}$$



Fig. 4.1 Schematic representation of the generalized framework for optimization-based regression fitting



Fig. 4.2 Schematic representation of **P2P4M** methodology [42]

#### 4.2 Feature Selection Block

For the proposed **P2P4M** methodology, we employed a feature selection method based on two correlation metrics. Both are recent advances in the field and are reported to be among the best metrics for quantifying the dependence between any two random variables (EP and PCM in our case):

- (Brownian) Distance Correlation *DistCorr*: measures the degree of dependence between two variables (it is zero only when the variables are independent) [29] and it is based on Brownian motion.
- Maximal Information Coefficient *MIC*: a mutual information-based metric that measures the degree of correlation between the two considered variables [20].
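The filter-type selection described above can be sketched as follows for the *DistCorr* metric. The implementation is the standard sample distance correlation computed from double-centered pairwise distance matrices; the helper `select_features`, the toy data and the threshold value are illustrative assumptions and do not reproduce the thesis implementation.

```python
import math


def centered_distance_matrix(values):
    """Pairwise |v_j - v_k| distances, double-centered (row, column, grand mean)."""
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]
    row_mean = [sum(row) / n for row in dist]
    grand_mean = sum(row_mean) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [[dist[j][k] - row_mean[j] - row_mean[k] + grand_mean
             for k in range(n)] for j in range(n)]


def distance_correlation(x, y):
    """Sample distance correlation (between 0 and 1) of two 1-D samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    a = centered_distance_matrix(x)
    b = centered_distance_matrix(y)
    dcov2 = sum(a[j][k] * b[j][k] for j in range(n) for k in range(n)) / n**2
    dvar_x = sum(v * v for row in a for v in row) / n**2
    dvar_y = sum(v * v for row in b for v in row) / n**2
    denom = math.sqrt(dvar_x * dvar_y)
    if denom == 0.0:
        return 0.0  # a constant sample carries no dependence information
    return math.sqrt(max(dcov2, 0.0) / denom)


def select_features(ep, pcm_columns, threshold):
    """Keep the PCM whose DistCorr with the EP reaches the threshold."""
    scores = {name: distance_correlation(col, ep)
              for name, col in pcm_columns.items()}
    selected = [name for name, score in scores.items() if score >= threshold]
    return selected, scores


# Toy data (hypothetical): the EP depends quadratically on pcm_a only,
# while pcm_b does not vary at all.
pcm = {
    "pcm_a": [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0],
    "pcm_b": [1.0] * 7,
}
ep = [v * v for v in pcm["pcm_a"]]
selected, scores = select_features(ep, pcm, threshold=0.3)
```

In this toy case the Pearson correlation between `pcm_a` and the EP is zero by symmetry, yet the distance correlation is clearly positive, so `pcm_a` is retained while the constant `pcm_b` is discarded. This ability to detect non-linear dependencies is the reason such metrics are attractive for screening PCM before fitting a non-linear regressor.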

#### 4.3 Model Fitting based on preSi Data

The second step of the **P2P4M** methodology consists of fitting a regression model (*metamodel*) for the studied EP: a 3-layer feedforward MLP NN trained with the Levenberg-Marquardt backpropagation algorithm and using the hyperbolic tangent sigmoid (*tansig*) activation function. The model is fitted at each iteration step determined by the BO, based on the subset of likely influential PCM determined by the feature selection block and on the number of hidden-layer neurons selected by the BO. The fitting uses the training subset of the available data, i.e. pairs of pre-determined PCM and the corresponding values of each studied EP. Next, the estimated values  $(\widetilde{EP})$  are compared to the target ones  $(EP_{train})$  and the NN parameters are adjusted.

#### 4.4 Bayesian Optimization of the Fitted Model

The **P2P4M** methodology employs BO to iteratively develop a global statistical model of an objective function of 2 variables: the feature selection metric threshold and the number of hidden-layer neurons of the NN. The input space is therefore 2-dimensional. At each iteration, the goal is to find a promising input pair (*FSthreshold* and *neuronsNumber*) and fit a new *metamodel* using it; the function evaluation then consists of computing the *metamodel*'s residual error on the testing set  $(Error = \widetilde{EP} - EP)$, which the BO seeks to minimize. The BO implementation uses a GP (based on the Matérn 5/2 kernel) as surrogate function and the *expected-improvement-plus* acquisition function.
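For reference, the Matérn 5/2 covariance mentioned above can be written in a few lines. The sketch uses the standard isotropic form with a single length scale; the length scale, the signal standard deviation and the example points in the (*FSthreshold*, *neuronsNumber*) space are assumptions chosen for illustration only.

```python
import math


def matern52(p, q, length_scale=1.0, signal_std=1.0):
    """Matern 5/2 covariance between two points given as coordinate tuples."""
    r = math.dist(p, q)  # Euclidean distance between the two inputs
    s = math.sqrt(5.0) * r / length_scale
    return signal_std**2 * (1.0 + s + s * s / 3.0) * math.exp(-s)


# Hypothetical (FSthreshold, neuronsNumber) pairs.
reference = (0.10, 4.0)
near = matern52(reference, (0.12, 5.0), length_scale=3.0)
far = matern52(reference, (0.30, 12.0), length_scale=3.0)
```

The covariance equals the signal variance for identical inputs and decays smoothly with distance, so evaluations at nearby hyperparameter pairs inform the surrogate more strongly than distant ones. In practice the two inputs have very different ranges, so a separate length scale per dimension or a prior normalization would usually be preferred over the single length scale assumed here.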

#### 4.5 P2P4M Methodology Summary

The resulting algorithm for **P2P4M** methodology is summarized in Table 4.1.

Table 4.1 **P2P4M** methodology algorithm

```
Required parameters: EP, PCM, maxNN, itNumber

1: Split training-testing dataset (EP_train, EP_test, PCM_train, PCM_test)
2: FSmetrics = DistCorr(EP_train, PCM_train)   (v1)
   FSmetrics = MIC(EP_train, PCM_train)        (v2)
3: neuronsNumber ← optimizableVariable([2, maxNN])
4: FSthreshold ← optimizableVariable([min(FSmetrics), max(FSmetrics)])
5: OptResults = BayesianOptimization(EP_train, EP_test, PCM_train, PCM_test, FSmetrics, neuronsNumber, itNumber)
6: PCM_i = PCM(FSmetrics ≥ OptResults.FSthreshold)
7: metamodel = NNtrain(EP_train, PCM_i_train, OptResults.neuronsNumber)
```

#### 4.6 postSi Distribution Modelling and Validation of the Methodology

Figure 4.3 illustrates the framework for predicting postSi EP distributions based on the **P2P4M** *metamodels*. It requires the distribution that reflects the current process window  $(P(PCM_r))$ , the previously obtained individual EP's *metamodel*, under the form of a mathematical equation linking each one of the studied EP and the subset of influential PCMs  $(f(PCM_{r-i}))$ , as well as the prediction error  $(P(\varepsilon))$ . The distribution of the circuit performances  $(P(\widetilde{EP_r}))$  is computed as the superposition between the predicted EP's values returned by the *metamodel* and the non-zero fitting error, that quantifies the impact of other factors, not included in the initial analysis, on the EP.
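The superposition described above can be sketched as a simple Monte Carlo procedure: PCM samples are drawn from the process-window distribution, passed through the *metamodel*, and an independently drawn fitting error is added. The linear `toy_metamodel`, the Gaussian PCM and error distributions, and the independence between error and prediction are all assumptions made for illustration; a trained NN and empirical distributions would take their place in practice.

```python
import random
import statistics


def predict_ep_distribution(metamodel, sample_pcm, sample_error, n_samples, seed=0):
    """Draw EP samples as metamodel(PCM) + error, with PCM and error sampled."""
    rng = random.Random(seed)
    return [metamodel(sample_pcm(rng)) + sample_error(rng)
            for _ in range(n_samples)]


def toy_metamodel(pcm):
    # Hypothetical metamodel linking one EP to two influential PCM.
    return 1.0 + 0.5 * pcm[0] - 0.2 * pcm[1]


def toy_pcm_sampler(rng):
    # Assumed process window: two independent standard normal PCM.
    return (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))


def toy_error_sampler(rng):
    # Assumed fitting-error distribution: zero mean, standard deviation 0.05.
    return rng.gauss(0.0, 0.05)


ep_samples = predict_ep_distribution(
    toy_metamodel, toy_pcm_sampler, toy_error_sampler, n_samples=20000)
ep_mean = statistics.fmean(ep_samples)
ep_std = statistics.stdev(ep_samples)
```

Under these assumptions the predicted EP distribution has a theoretical mean of 1.0 and a standard deviation of about 0.54 (the square root of 0.25 + 0.04 + 0.0025), which the sample statistics approach as the number of samples grows. The resulting samples can then be compared with measured postSi data using a statistical distance.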



Fig. 4.3 postSi EP distribution prediction framework

#### 4.7 The Application of P2P4M Methodology

#### 4.7.1 Unifying Feature Selection and Hyperparameter Bayesian Optimization

Table 4.2 summarizes the results obtained on a real dataset of n = 93 input variables and one output variable, split into  $n_{train} = 862$  and  $n_{test} = 100$  samples. The RMSE computed on the testing dataset has the same order of magnitude for all four implementations. Yet, the smallest RMSE was obtained when applying feature selection on the input variables. The main advantages of the **P2P4M** methodology are its speed and computational cost: the same RMSE was obtained with only 7% of the iterations.

Table 4.2 RMSE, number of neurons and *DistCorr* threshold obtained on the real dataset

| Train & Test Settings                                 | minRMSE | Number of Neurons | DistCorr Threshold |
|-------------------------------------------------------|---------|-------------------|--------------------|
| GS on neurons number                                  | 0.0481  | 3                 | -                  |
| BO on neurons number                                  | 0.0525  | 7                 | -                  |
| BO on neurons number & GS on DistCorr threshold       | 0.0396  | 2                 | 0.1                |
| BO on neurons number & DistCorr threshold (**P2P4M**) | 0.0396  | 2                 | 0.1444             |

#### 4.7.2 Functional Dependency Modelling and Finding the Influential PCM

Table 4.3 shows the **P2P4M** methodology results and accuracy metrics on a dataset of 92 PCM and 5 EP. Feature selection improves the prediction accuracy, reducing the *MSPE* by a factor of 2 to 8. The *metamodels* are reliable, considering the correlation between the predicted and the target values ( $\rho_{\widetilde{EP}-EP}$ ), as well as the lack of correlation between the fitting error and the predicted values ( $\rho_{\varepsilon-\widetilde{EP}}$ ).
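The three accuracy metrics discussed above can be computed as in the following sketch. The *MSPE* is taken here as the plain mean of the squared prediction errors and the correlations as Pearson coefficients; any normalization applied in the thesis is not reproduced, and the target and predicted values are made-up numbers.

```python
import math


def mspe(predicted, target):
    """Mean squared prediction error between predicted and target values."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def pearson(a, b):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical target EP values and metamodel predictions.
target = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.9, 5.1]
residual = [p - t for p, t in zip(predicted, target)]  # Error = predicted - target

error_mspe = mspe(predicted, target)
rho_pred_target = pearson(predicted, target)
rho_error_pred = pearson(residual, predicted)
```

A reliable *metamodel* shows a low `error_mspe`, a `rho_pred_target` close to 1 and a `rho_error_pred` close to 0. A residual that remains strongly correlated with the prediction indicates structure that the model failed to capture.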

Table 4.3 The training metrics and results of the **P2P4M** methodology for: (1) No feature selection, (2) Feature selection with *DistCorr* metric, (3) Feature selection with *MIC* metric

| Feature selection    | Metric                               | EP<sub>1</sub> | EP<sub>2</sub> | EP<sub>3</sub> | EP<sub>4</sub> | EP<sub>5</sub> |
|----------------------|--------------------------------------|----------------|----------------|----------------|----------------|----------------|
| No Feature Selection | MSPE                                 | 0.088          | 0.181          | 0.187          | 0.145          | 0.166          |
|                      | $\rho_{\widetilde{EP}-EP}$           | 0.385          | 0.289          | 0.273          | 0.214          | 0.232          |
|                      | $\rho_{\varepsilon-\widetilde{EP}}$  | 0.529          | -0.649         | -0.615         | -0.529         | -0.426         |
| DistCorr             | MSPE                                 | 0.041          | 0.083          | 0.086          | 0.024          | 0.024          |
|                      | $\rho_{\widetilde{EP}-EP}$           | 0.820          | 0.607          | 0.619          | 0.900          | 0.851          |
|                      | $\rho_{\varepsilon-\widetilde{EP}}$  | -0.026         | 0.022          | -0.030         | -0.010         | 0.065          |
| MIC                  | MSPE                                 | 0.041          | 0.083          | 0.087          | 0.022          | 0.024          |
|                      | $\rho_{\widetilde{EP}-EP}$           | 0.823          | 0.592          | 0.592          | 0.899          | 0.841          |
|                      | $\rho_{\varepsilon-\widetilde{EP}}$  | -0.077         | -0.079         | 0.019          | -0.113         | 0.065          |

#### 4.7.3 preSi and postSi Distribution Modelling and Validation

Table 4.4 summarizes the values of two similarity metrics, the Bhattacharyya distance and the Wasserstein metric, which quantify the statistical distance between the real EP distribution and the **P2P4M** modelled EP distribution. The preSi - postSi distance  $(EP_{preSi} - EP_{postSi})$  is considered a benchmark. As expected, the similarity between the simulation distribution and its predicted counterpart is high for all parameters. A high degree of similarity can also be observed between the postSi modelled distribution; moreover, there is one order of magnitude closeness between the postSi estimated

Table 4.4 Similarity metrics between the EP real and modelled distributions, based on preSi MC simulations or postSi real measurements

| Parameter                                 | Similarity metric      | EP<sub>1</sub> | EP<sub>2</sub> | EP<sub>3</sub> | EP<sub>4</sub> | EP<sub>5</sub> |
|-------------------------------------------|------------------------|----------------|----------------|----------------|----------------|----------------|
| $\widetilde{EP}_{preSi} - EP_{preSi}$     | Bhattacharyya distance | 0.0025         | 0.030          | 0.0040         | 0.0011         | 0.0015         |
|                                           | Wasserstein metric     | 0.0123         | 0.0139         | 0.0158         | 0.0081         | 0.0097         |
| $\widetilde{EP}_{postSi} - EP_{postSi}$   | Bhattacharyya distance | 0.0017         | 0.0026         | 0.0032         | 0.0033         | 0.0029         |
|                                           | Wasserstein metric     | 0.0120         | 0.0144         | 0.0146         | 0.0155         | 0.0145         |
| $EP_{preSi} - EP_{postSi}$                | Bhattacharyya distance | 0.0491         | 0.0317         | 0.0265         | 0.0342         | 0.6187         |
|                                           | Wasserstein metric     | 0.2208         | 0.1496         | 0.1451         | 0.1787         | 0.4583         |

distribution and the measured postSi distribution (3rd row), compared to the benchmark. The EP marginal *cdf* plots in Figure 4.4 illustrate this more clearly. The postSi distribution modelled with the proposed approach from measured postSi PCM samples (red line) closely resembles the real postSi distribution (green line). Even when there is a significant difference between the preSi and postSi distributions, as is the case for EP<sub>4</sub> (Figure 4.4(d)), the **P2P4M** methodology is able to adapt to the technology changes and the modelled distribution is much closer to the real one (green line).
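The two similarity metrics reported in Table 4.4 can be illustrated with a minimal one-dimensional sketch. It assumes equal-size samples for the Wasserstein metric and a histogram approximation with an arbitrary bin count for the Bhattacharyya distance; the function names are hypothetical, and the thesis may rely on a different (multidimensional) implementation.

```python
import math


def wasserstein_1d(u, v):
    """First-order Wasserstein distance between two equal-size 1-D samples."""
    if len(u) != len(v):
        raise ValueError("this sketch assumes equal sample sizes")
    # For equal-size samples the optimal transport plan pairs order statistics.
    return sum(abs(a - b) for a, b in zip(sorted(u), sorted(v))) / len(u)


def bhattacharyya_distance(u, v, bins=30):
    """Histogram-based Bhattacharyya distance between two 1-D samples."""
    lo = min(min(u), min(v))
    hi = max(max(u), max(v))
    width = (hi - lo) / bins or 1.0  # guard against a degenerate range

    def histogram(sample):
        counts = [0] * bins
        for x in sample:
            counts[min(int((x - lo) / width), bins - 1)] += 1
        return [c / len(sample) for c in counts]

    p, q = histogram(u), histogram(v)
    coefficient = sum(math.sqrt(a * b) for a, b in zip(p, q))
    if coefficient <= 0.0:
        return math.inf  # the histograms share no occupied bin
    return -math.log(min(coefficient, 1.0))


# Illustrative data only: a reference sample and a copy shifted by one unit.
real = [i / 100.0 for i in range(200)]
shifted = [x + 1.0 for x in real]
print(wasserstein_1d(real, shifted), bhattacharyya_distance(real, shifted))
```

Both functions return values close to zero for nearly identical samples and grow as the distributions drift apart, which is the behaviour the table relies on when comparing the modelled distributions against the preSi-postSi benchmark.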

#### 4.7.4 Methodology Reliability

As a final step, the methodology's reliability and consistency were assessed over four runs, in order to evaluate its behavior across two PCM datasets: *initialSet*93, containing 93 PCM that are simulated during preSi and measured in the postSi phase, and *extendedSet*198, containing 198 PCM, i.e. the *initialSet*93 plus an additional 105 PCM that are not monitored in postSi. The strong influence of the extended dataset on the performance of the **P2P4M** methodology is emphasized in Figure 4.5. The *initialSet*93 results for EP<sub>11</sub> are a clear example of an EP whose behavior is poorly explained by the available set of PCM parameters. Even with a *DistCorr* threshold spanning 0.19 to 0.31, the *metamodels'* *MSPE* stays around 0.12. Furthermore, the low accuracy is confirmed by a correlation below 0.35 between the predicted and the target values of the EP. The methodology selected two influential PCM, PCM<sub>2</sub> and PCM<sub>30</sub>, with distance correlation values of 0.3468 and 0.32549, respectively. Nevertheless, taking all the results into consideration, one can conclude that the initial set of available PCM parameters is not suited to describe EP<sub>11</sub> adequately. The PCM with indexes from 189 to 193 display the highest correlation with EP<sub>11</sub> and should be monitored by the technologist during the postSi production phase.



Fig. 4.4 *cdf* plots for the studied EP' distributions in: (1) simulation (preSi), (2) production (postSi), (3) *metamodel* prediction based on postSi samples (postSi estimated)



Fig. 4.5 **P2P4M**'s reliability and consistency results obtained during the 4 runs for EP<sub>11</sub>

Employing the *extendedSet*198 contributes to significant improvements in the *metamodels*' accuracy. The average *MSPE* decreased to 0.02, while the Pearson's correlation between the target and the predicted values reached 0.97. The accurate and optimal *metamodels* are trained on a PCM set consisting of 14 parameters, as the last plot in Figure 4.5 shows. Similarly, Figure 4.6 presents the correlation coefficients between the extended set of PCM and EP<sub>11</sub>. There is a straightforward relationship between the improved *metamodels* and the *DistCorr* values displayed by the additional 105 PCM: the maximum distance correlation reaches 0.6983, almost double the maximum correlation between EP<sub>11</sub> and the PCM of the *initialSet*93.



Fig. 4.6 *DistCorr* correlation coefficients between the *extendedSet*198's PCM and EP<sub>11</sub>

# Chapter 5

# **P2P4M Methodology Use-Cases**

This chapter contains the work developed in the context of several use-case applications of the **P2P4M** methodology, aimed to provide efficient instruments in tackling long-known semiconductor issues. The motivation was the opportunity to fully benefit from the *metamodel* assets, as well as to introduce new tailored methodologies for classical IC challenges, within the process variation context.

# 5.1 Use-case 1: Sensitivity Analysis with Process Variation

#### 5.1.1 Global Sensitivity Analysis Methodology

Figure 5.1 describes the steps of the proposed global SA methodology in detail. The first step involves applying the **P2P4M** framework to the target EP to obtain the optimal *metamodel*. Next, the captured relationship is assessed to extract the resulting set of PCM that act as influencing factors ( $PCM_i$ ), as optimally determined by the BO framework based on the residual error of the NN model on the test set. In the last step, the sensitivity index  $S_j$  is computed with the *DistCorr* or *MIC* metric, as described in equation 1.1, and the PCMs are ranked. Note that a higher  $S_j$  means a greater impact.

$$\begin{aligned} S_{j} &= DistCorr(EP, PCM_{i-j}) \\ S_{j} &= MIC(EP, PCM_{i-j}) \end{aligned} \tag{1.1}$$
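The ranking step can be sketched as follows, for the *DistCorr* variant of equation 1.1. This is a minimal illustration under stated assumptions: it uses the standard sample distance correlation estimator (double-centered pairwise distances), the function names are hypothetical, and the data are synthetic rather than taken from the thesis.

```python
import math
import random


def _centered_distances(values):
    """Double-centered matrix of pairwise absolute differences."""
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]
    row_mean = [sum(row) / n for row in dist]  # equals column means (symmetry)
    grand_mean = sum(row_mean) / n
    return [[dist[i][j] - row_mean[i] - row_mean[j] + grand_mean
             for j in range(n)] for i in range(n)]


def _mean_product(a, b):
    """Mean of the element-wise product of two square matrices."""
    n = len(a)
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / (n * n)


def distance_correlation(x, y):
    """Sample distance correlation between two equal-length 1-D samples."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of samples")
    a, b = _centered_distances(x), _centered_distances(y)
    denominator = math.sqrt(_mean_product(a, a) * _mean_product(b, b))
    if denominator == 0.0:
        return 0.0  # a constant sample carries no dependence information
    return math.sqrt(max(_mean_product(a, b), 0.0) / denominator)


def rank_influential_pcm(ep, pcm_columns):
    """Return (PCM name, S_j) pairs sorted from most to least influential."""
    scores = {name: distance_correlation(ep, column)
              for name, column in pcm_columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Synthetic illustration only: EP depends non-linearly on PCM_a, not on PCM_b.
rng = random.Random(0)
pcm_a = [rng.gauss(0.0, 1.0) for _ in range(200)]
pcm_b = [rng.gauss(0.0, 1.0) for _ in range(200)]
ep = [value ** 2 for value in pcm_a]
ranking = rank_influential_pcm(ep, {"PCM_a": pcm_a, "PCM_b": pcm_b})
print(ranking)
```

The quadratic dependence in the example would be invisible to Pearson's correlation, which is one reason a dependence measure such as distance correlation is a natural choice for the sensitivity index.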

#### 5.1.2 Local Sensitivity Analysis Methodology

The steps of the proposed local SA methodology are detailed in Figure 5.2. The first step is the same as for the global SA methodology: the **P2P4M** framework must be applied to the targeted EP in order to obtain the best EP *metamodel*. The relationship



Fig. 5.1 Global sensitivity analysis methodology steps

encapsulated by each *metamodel* is formulated as in eq. 1.2, where hyperbolic tangent (tanh) represents the NN activation function,  $IW^T$  is the input-to-hidden layer weights vector,  $LW^T$  is the hidden-to-output layer weights vector and B denotes the input-to-hidden layer bias vector.

$$\widehat{EP} = LW^T \tanh(IW^T \mathbf{PCM}_i + B) \tag{1.2}$$

The second step consists of computing the derivative of each  $\widehat{EP}$  (the *metamodel*'s output) with respect to its influential PCM,  $PCM_i$  (the *metamodel*'s inputs), to obtain the partial derivatives. Finally, the local sensitivity indices are computed in the form of equation 1.3, where  $PCM_0$  is the point at which the local SA of the EP is computed with respect to its  $j^{th}$  influential PCM,  $PCM_{i-j}$.

$$s_{j|PCM_0} = \frac{\partial \widehat{EP}}{\partial PCM_{i-j}} (PCM_0) \tag{1.3}$$

The local sensitivity of a circuit performance with *n* influential PCM is defined as the Euclidean norm of all the local sensitivity indices:

$$S_{|\mathbf{PCM}_0} = \|[s_{1|PCM_0}, s_{2|PCM_0}, ..., s_{n|PCM_0}]\|_2 \tag{1.4}$$
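Equations 1.2 to 1.4 can be worked through for a single-hidden-layer tanh network with the short sketch below. The weights, the evaluation point and the function names are hypothetical and serve only as an illustration; the derivative follows from the chain rule, using the fact that the derivative of tanh(z) is 1 - tanh(z)^2.

```python
import math


def _hidden(pcm, iw, b):
    """Hidden activations tanh(IW^T PCM + B); iw[k][j] links input k to unit j."""
    return [math.tanh(sum(iw[k][j] * pcm[k] for k in range(len(pcm))) + b[j])
            for j in range(len(b))]


def metamodel(pcm, iw, lw, b):
    """EP estimate LW^T tanh(IW^T PCM + B), as in eq. 1.2."""
    return sum(w * h for w, h in zip(lw, _hidden(pcm, iw, b)))


def local_sensitivities(pcm0, iw, lw, b):
    """Partial derivatives of the metamodel output at pcm0 (eq. 1.3)."""
    h = _hidden(pcm0, iw, b)
    # Chain rule: sum over hidden units of LW_j * (1 - h_j^2) * IW_kj.
    return [sum(lw[j] * (1.0 - h[j] ** 2) * iw[k][j] for j in range(len(b)))
            for k in range(len(pcm0))]


def total_local_sensitivity(pcm0, iw, lw, b):
    """Euclidean norm of the local sensitivity indices (eq. 1.4)."""
    return math.sqrt(sum(s * s for s in local_sensitivities(pcm0, iw, lw, b)))


# Hypothetical weights for a 2-input, 3-hidden-unit metamodel.
IW = [[0.5, -1.0, 0.3], [0.8, 0.2, -0.6]]
LW = [1.2, -0.7, 0.4]
B = [0.1, 0.0, -0.2]
PCM_0 = [0.25, -0.5]
gradient = local_sensitivities(PCM_0, IW, LW, B)
print(gradient, total_local_sensitivity(PCM_0, IW, LW, B))
```

Because the derivative is available in closed form, evaluating the sensitivity in every process corner, or inside a local optimizer searching for the point of maximum sensitivity, requires no additional circuit simulations.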



Fig. 5.2 Local sensitivity analysis methodology steps

#### **5.1.3** Experimental Results

#### **Global Sensitivity Analysis**

Table 5.1 lists the ranked set of influential PCM for each EP in the considered dataset (92 PCM), together with the corresponding sensitivity index, for both metrics, *DistCorr* and *MIC*. When connecting this data with the results from Table 4.3, it can be easily observed that the overall magnitude of these metrics for each EP is strongly correlated with the regression models' accuracy. The scatter plots between EP<sub>3</sub> and PCM<sub>29</sub> and PCM<sub>92</sub> display a visual axis symmetry that indicates statistical independence between the EP and these PCM. On the other hand, the scatter plot between EP<sub>4</sub> and PCM<sub>42</sub>, depicted in Figure 5.3, highlights the presence of statistical dependence between the two parameters. Combined with the slightly higher correlation between the predicted and target values of EP<sub>4</sub> ( $\rho_{\widetilde{EP_4}-EP_4}$, Table 4.3), this outcome confirms the influence that PCM<sub>42</sub> exerts on the considered EP.

Moreover, this last result shows that *DistCorr* is more suitable than *MIC* as a feature selection method for this particular case.

Table 5.1 DistCorr and MIC sensitivity analysis indexes and the influential PCM' ranking

| Parameter       | Influential PCM (*DistCorr*) | $S_{DistCorr}$ | Influential PCM (*MIC*) | $S_{MIC}$ |
|-----------------|------------------------------|----------------|-------------------------|-----------|
| EP<sub>1</sub>  | PCM<sub>2</sub>              | 0.4567         | PCM<sub>2</sub>         | 0.2885    |
|                 | PCM<sub>27</sub>             | 0.4532         | PCM<sub>27</sub>        | 0.2710    |
|                 | PCM<sub>4</sub>              | 0.4068         | PCM<sub>4</sub>         | 0.2504    |
|                 | PCM<sub>29</sub>             | 0.2816         | PCM<sub>29</sub>        | 0.2075    |
| EP<sub>2</sub>  | PCM<sub>27</sub>             | 0.2252         | PCM<sub>4</sub>         | 0.1721    |
|                 | PCM<sub>2</sub>              | 0.2235         | PCM<sub>2</sub>         | 0.1697    |
|                 | PCM<sub>4</sub>              | 0.2183         | PCM<sub>27</sub>        | 0.1686    |
| EP<sub>3</sub>  | PCM<sub>4</sub>              | 0.1901         | PCM<sub>2</sub>         | 0.1977    |
|                 | PCM<sub>27</sub>             | 0.1573         | PCM<sub>27</sub>        | 0.1888    |
|                 | PCM<sub>2</sub>              | 0.1401         | PCM<sub>92</sub>        | 0.1657    |
|                 |                              |                | PCM<sub>4</sub>         | 0.1646    |
|                 |                              |                | PCM<sub>29</sub>        | 0.1619    |
| EP<sub>4</sub>  | PCM<sub>43</sub>             | 0.8031         | PCM<sub>43</sub>        | 0.6528    |
|                 | PCM<sub>42</sub>             | 0.2060         |                         |           |
| EP<sub>5</sub>  | PCM<sub>42</sub>             | 0.8096         | PCM<sub>42</sub>        | 0.6074    |
|                 | PCM<sub>43</sub>             | 0.3798         | PCM<sub>43</sub>        | 0.2202    |

#### **Local Sensitivity Analysis**

Figure 5.4 summarizes the methodology's performance for the two proposed experiments: computing the total local sensitivities in all corners, and finding the point of maximum sensitivity within the influential PCM using a local optimizer. There are no points of maximum sensitivity that exceed the maximum value of the local sensitivity computed



Fig. 5.3 Scatter plot between EP<sub>4</sub> and the supplementary influential PCM selected by the *DistCorr* metric (PCM<sub>42</sub>)



Fig. 5.4 Ranked total local SA values calculated for the studied EP

in the corners of the influential PCM for any of the EP. However, the newly obtained critical points located within the specification limits of the influential PCM still introduce high sensitivity. For example, for EP<sub>2</sub> the total local sensitivity at the point of maximum sensitivity (1.2849) is comparable with the total local sensitivity in the worst corner (obtained for PCM<sub>2</sub> at LSL, PCM<sub>4</sub> at LSL, PCM<sub>27</sub> at USL).

# 5.2 Use-case 2: Simulation Model Checking

#### 5.2.1 Methodology Flow

Figure 5.5 highlights the steps of the introduced methodology for simulation model checking with process variation, which relies on the existence of discrepancies or disparities between distributions. The methodology should be employed for one EP at a time, for which the **P2P4M** methodology has already been applied, thus obtaining the optimal *metamodel*, along with the introduced global SA methodology and the resulting ranking of the influential PCM. In addition, the methodology requires the preSi and postSi distributions of the studied EP and of the entire set of influential PCM,  $PCM_i$.



Fig. 5.5 Model checking methodology steps

#### **5.2.2** Experimental Results

Figure 5.6 depicts a critical EP, namely EP<sub>6</sub>, showing both distribution discrepancies and a postSi distribution that exceeds the specification limits. One can easily observe that the distributions of wafers 47 and 48 are clearly distorted and right-shifted compared to the rest of the wafers. Besides that, the distributions of wafers 4, 39, 40, 43 and 46 have their mean shifted to the right compared to the preSi distribution, which, in theory, should cover all postSi variations. EP<sub>6</sub> has eight influential PCM found by the **P2P4M** methodology; the most influential one, PCM<sub>91</sub>, does not display distribution disparities. Yet PCM<sub>5</sub>, ranked second ( $S_{DistCorr}$ of 0.2815), displays in Figure 5.7 a right shift of the postSi and historical distributions, both of which violate the upper specification limit, while the preSi distribution fails to cover the process variation.



Fig. 5.6 cdf plot for preSi and postSi distributions of EP<sub>6</sub>

# 5.3 Use-case 3: Yield Prediction

#### 5.3.1 preSi Yield Prediction for Normal Distributions Methodology

The first methodology aims at predicting the parametric yield at an early stage, based on the **P2P4M** methodology's *metamodels*, for EP and PCM displaying normal distributions. Figure 5.8 describes the steps of the proposed approach for one EP. Instead of simulating the circuit responses using MC, it uses the dependency between the EP and the dataset of influential PCM. More precisely, it starts by artificially generating a large dataset (millions of samples) for the influential PCM ( $PCM_i$ ), by modeling the PCM distributions as a multivariate normal distribution, followed by estimating the corresponding EP samples ( $\widetilde{EP}$ ), based on the relationship encapsulated by the



Fig. 5.7 *cdf* plot for preSi, postSi and historical distributions of PCM<sub>5</sub>



Fig. 5.8 preSi yield prediction for normal distributions methodology flow

corresponding *metamodel*. Finally, the yield prediction is computed using the *OOS* metric, by counting the EP samples that fall outside the specification limits.
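The flow described above can be sketched as a small Monte Carlo routine. It is only an illustration under stated assumptions: the metamodel is any callable mapping a PCM vector to an EP value (a hypothetical linear stand-in is used here instead of a trained network), the mean vector, covariance matrix and specification limits are invented, and far fewer samples are drawn than the millions mentioned in the text.

```python
import math
import random


def cholesky(cov):
    """Lower-triangular factor L of a positive-definite matrix (cov = L L^T)."""
    n = len(cov)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(cov[i][i] - partial)
            else:
                lower[i][j] = (cov[i][j] - partial) / lower[j][j]
    return lower


def predict_oos_normal(metamodel, mean, cov, lsl, usl, n_samples, seed=0):
    """Fraction of estimated EP samples outside [lsl, usl]."""
    rng = random.Random(seed)
    lower = cholesky(cov)
    out_of_spec = 0
    for _ in range(n_samples):
        # Draw one correlated PCM vector from the multivariate normal model.
        z = [rng.gauss(0.0, 1.0) for _ in mean]
        pcm = [m + sum(lower[i][k] * z[k] for k in range(i + 1))
               for i, m in enumerate(mean)]
        ep = metamodel(pcm)  # estimated EP sample
        if ep < lsl or ep > usl:
            out_of_spec += 1
    return out_of_spec / n_samples


def toy_metamodel(pcm):
    """Hypothetical linear stand-in for a trained metamodel."""
    return pcm[0] + pcm[1]


# The sum of the two PCM has variance 3, so the limits sit at +/- 2 sigma.
oos = predict_oos_normal(toy_metamodel, mean=[0.0, 0.0],
                         cov=[[1.0, 0.5], [0.5, 1.0]],
                         lsl=-2.0 * math.sqrt(3.0), usl=2.0 * math.sqrt(3.0),
                         n_samples=100_000)
print(f"predicted OOS: {100 * oos:.2f} %")
```

Since evaluating the metamodel is cheap compared with a circuit simulation, the sample count can be raised until the statistical uncertainty of the out-of-spec fraction becomes negligible.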

#### **5.3.2** Parametric Yield Prediction Methodology

The algorithm is presented in Table 5.2. It needs the EP's metamodel, together with the influential PCM distributions ( $PCM_i$ ) and the metamodel fitting error distribution for the analyzed performance parameter. In order to compute the OOS metric, the parameter specification limits LSL and USL are required as well. The key point of this new approach is its ability to deal with non-normal distributions, for both the PCM and the fitting error, by employing the distFit method presented in [11], a multivariate distribution fitting framework capable of generating data according to the modeled distribution. Then, the corresponding EP values are estimated based on the *metamodel*, the synthetically generated PCM values ( $PCM_{i-gen}$ ) and the modeled fitting error ( $\varepsilon_{gen}$ ), thus resulting in  $\widetilde{EP}$. Finally, the OOS is computed for the studied EP based on eq. 3.5.

$$OOS_{EP} = \frac{\text{no. of samples out of spec}}{\text{total no. of samples}} \tag{3.5}$$

Table 5.2 Parametric yield prediction algorithm

Required parameters:  $\widetilde{EP} = f(PCM_i) + \varepsilon$ ,  $PCM_i$ ,  $\varepsilon$ ,  $LSL_{EP}$ ,  $USL_{EP}$ 

- 1: Generate  $\varepsilon_{gen} \sim distFit(\varepsilon)$
- 2: Generate  $PCM_{i-gen} \sim distFit(PCM_i)$
- 3:  $\widetilde{EP} = f(\mathbf{PCM}_{i-gen}) + \varepsilon_{gen}$
- 4: Count  $OOS_{\widetilde{EP}}$
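The four steps of Table 5.2 can be sketched as below. Note the substitution: *distFit* [11] is not reproduced here, so plain empirical resampling of the measured PCM rows and of the fitting errors is used as a stand-in generator. It preserves non-normal shapes and the dependence between PCM, but it cannot extrapolate beyond the observed samples. All names and data are hypothetical.

```python
import random


def predict_oos_resampled(metamodel, pcm_rows, residuals, lsl, usl,
                          n_samples, seed=0):
    """Out-of-spec fraction following the four steps of Table 5.2.

    Empirical resampling replaces distFit in this sketch (an assumption).
    """
    rng = random.Random(seed)
    out_of_spec = 0
    for _ in range(n_samples):
        eps_gen = rng.choice(residuals)    # step 1: fitting-error sample
        pcm_gen = rng.choice(pcm_rows)     # step 2: joint PCM sample (whole row)
        ep = metamodel(pcm_gen) + eps_gen  # step 3: estimated EP value
        if ep < lsl or ep > usl:           # step 4: count out-of-spec samples
            out_of_spec += 1
    return out_of_spec / n_samples


def toy_metamodel(pcm):
    """Hypothetical stand-in for a trained metamodel."""
    return pcm[0]


# Hypothetical measured PCM rows and metamodel fitting errors.
pcm_rows = [[0.0, 1.0], [1.0, 1.5], [2.0, 0.5], [3.0, 2.0]]
residuals = [-0.1, 0.0, 0.1]
oos = predict_oos_resampled(toy_metamodel, pcm_rows, residuals,
                            lsl=0.5, usl=2.5, n_samples=20_000)
print(f"predicted OOS: {100 * oos:.1f} %")
```

In this toy setting two of the four PCM rows map outside the limits regardless of the added error, so the estimate settles near 50 %. Resampling whole rows, rather than each PCM independently, is what keeps the correlation structure of the measured PCM intact.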

#### **5.3.3** Experimental Results

#### preSi Yield Prediction for Normal Distributions

Table 5.3 presents the results obtained for 3 EP and the entire product (considering the dataset on which the proposed methodology was applied). The proposed solution achieves *OOS* values similar to the considered benchmark (the *OOS* values computed on the initial preSi dataset). On the other hand, the high variance of *OOS* when computed on small datasets, which is exactly the case for the preSi data used here, must be taken into consideration. Consequently, the proposed methodology performs better than the state-of-the-art approach of simply computing *OOS* on a small dataset of MC simulations, since it obtains a lower variance in prediction.

Table 5.3 Yield loss prediction results for both the three studied EP and the entire product - the preSi benchmark and the proposed methodology results

| Method                                              | Statistic | EP<sub>1</sub> | EP<sub>2</sub> | EP<sub>3</sub> | Product |
|-----------------------------------------------------|-----------|----------------|----------------|----------------|---------|
| *OOS* $EP_{preSi}$ (%), 962 samples                 |           | 0.84           | 0.60           | 0.73           | 0.94    |
| *OOS* $\widetilde{EP}$, 1,200,000 samples           | μ (%)     | 0.92           | 0.75           | 0.85           | 1.04    |
|                                                     | σ (%)     | 0.15           | 0.13           | 0.14           | 0.17    |

#### **Parametric Yield Prediction**

Figure 5.9 illustrates the visual representation of the obtained results for  $EP_3$  (a) and for the multivariate (product) case (b), concentrating on the 10-iteration distribution of each



Fig. 5.9 Box plots of the OOS metrics computed with the two methods, along with the corresponding benchmark values, for: (a) EP<sub>3</sub> and (b) the entire product

method and the benchmark values (*OOS*s on preSi and postSi samples). *Method* 2 (the algorithm presented in Table 5.2) outperforms *Method* 1, since it displays lower variance. *Method* 1 consisted of computing the *OOS* metric for each EP on a dataset obtained as follows: first, the 220-sample EP was estimated using the corresponding **P2P4M** *metamodels* and the available 220 measured samples of **PCM**<sub>postSi</sub>. Next, the fitting error was modelled with *distFit* to reach the desired number of EP samples (in this case, 164,129 samples). The yield predictions of both methods (regardless of the particular estimation type) tend to be closer to the considered ideal reference (i.e. the *OOS* of the postSi measurements) than to the simulation-based *OOS*, but the proposed yield prediction algorithm shows a stable behavior and the resulting loss is within the safety margins.

# Chapter 6

### **Conclusions**

Chapter 6 draws the final conclusions and underlines the original contributions of this research, along with future work perspectives.

#### **6.1** General Objectives and Results

This research was dedicated to the study of fabrication process variation assessment in circuit design analysis, by employing Machine Learning approaches. The main objective was to develop a comprehensive methodology for a precise assessment of the impact of process variation on circuit design performances, as early as possible in the development flow, followed by its application in various research domains of semiconductor integrated circuits (ICs), such as the sensitivity analysis of ICs' performance with process variation, the diagnosis of inaccuracies in the simulation device model, or the improvement of yield prediction considering the technological variation.

The **P2P4M** methodology represents an efficient and automated methodology for modelling the relationship between the circuit performance parameters (EPs) and the technology process variations (measured through the PCM), in the preSi stage. It connects the preSi verification and the postSi validation with respect to the process variation, regardless of the data distribution and correlation configuration, in order to overcome the limitations of the current state-of-the-art methods. Moreover, it represents the only available methodology able to establish the mathematical relation between EPs and PCMs during the preSi phase. The resulting reliable EP regression *metamodels* are able to express both the functional and the statistical influence of the technology parameters on the circuit performances. Thus, the methodology allows for an instant and accurate snapshot of the circuit performances' behavior when technological changes appear, without the need for additional simulations or postSi measurements, since it is able to adapt to the PCMs' distributions immediately. The data-driven methodologies are applicable to almost any IC involving continuous variables, when co-simulating the analog circuit model and the PCM structures' schematics is possible. They do not require new experiments, but only standard MC simulations and postSi measurement samples.

The outcomes of the methodologies proposed in this research enable viable solutions to the five problems of the semiconductor industry highlighted at the outset. For the fab-to-fab migration problem, the solution is straightforward: the **P2P4M** methodology is applied to a historical PCM dataset able to characterize the process window of the target fabrication plant at a given moment, and the distributions of the circuit electrical performances of the product to be migrated are obtained upfront, thus enabling the yield prediction. Regarding the approximate effect of the device model, a large difference between the predicted performances' distribution and the actual measured ones, when employing the **P2P4M** methodology for distribution modelling, may provide a strong indication of accuracy problems in the device models. As previously stated, the local sensitivity analysis method assists the designer in setting up a fast and computationally efficient product-specific DoE test plan, since the fitted *metamodels*, through the global sensitivity analysis method, are able to provide the necessary information regarding the limited set of influential PCMs. Similarly, besides predicting the parametric yield loss due to process variation with the help of the introduced yield prediction methods, the relationship between the circuit performance and the process variation encapsulated in the *metamodels* can be used to easily determine the process variation parameter responsible for the decrease of the IC's performance, thus resolving the root cause analysis problem. Last but not least, the IC model checking with process variation can be performed by using the introduced model checking method, which links the performances' disparities in distribution to their counterparts in the influential PCMs, thus providing a sufficient examination.

#### **6.2** Original Contributions

The author's major original contributions of this research (methodologies and concepts) are summarized as follows, divided by chapter.

#### In Chapter 4:

- The development of a unified feature selection and hyperparameter Bayesian Optimization methodology for training regression models, which stands out by combining the best feature selection metrics recommended by the literature (*DistCorr* and *MIC*) with the Bayesian Optimization framework, to train a Neural Network regression model [40];
  - The application on a controlled synthetic dataset generated by a custom function with random (independent) extra variables to enable a comparative study with grid search, in terms of performance;
  - The application on a real dataset generated by an integrated circuit that depends on an unknown limited set of parameters, the rest of them having little or no influence, followed by a comparative study with grid search;
  - To the best of our knowledge, there are no similar approaches found in the literature;
- The development of the Verification for Manufacturability by Modelling Process Variation Circuit Performance Dependency (**P2P4M**) methodology for modelling the functional and statistical dependencies between the circuit performances (EPs) and the manufacturing process variation (monitored through PCMs), during preSi development stage [6], [38];
  - The application on MC simulations of an experimental IC (LDO) to obtain the optimal *metamodels* for each studied EP, which encapsulate the influential PCMs subset;
  - A reliability and consistency study of the P2P4M methodology, illustrated by experimental results obtained on a real IC product;
  - To the best of our knowledge, it represents the only available method that determines the relationship between EPs and PCMs under a mathematical form in preSi stage;
- The development of a framework for predicting postSi EP distributions based on the **P2P4M** *metamodels* (reflecting the modelling of the circuit performance behavior with the technology process variation) and a reference process window information [6];
  - The application on an analog IC, for quickly estimating the postSi EPs' distributions based on PCMs measurements' distribution and validating the approach accuracy using multidimensional similarity distribution metrics;

#### In **Chapter 5**:

- The development of a Global Sensitivity Analysis with process variation methodology for identifying and ranking the most important PCM factors, based on correlation metrics [37];
  - A comparative study of the *DistCorr*, *MIC* and Pearson's correlation metrics, highlighting their limitations;
  - The application on classical MC simulations of an IC product for determining the circuit performances sensitivities with process variations, by obtaining a ranked subset of influential PCMs;

- The development of a Local Sensitivity Analysis with process variation methodology, based on the partial derivative method applied to **P2P4M** *metamodels*, for determining the local sensitivities of the circuit performances in the process corners and the point of maximum sensitivity in the process variation window [42];
  - The application on an IC product preSi dataset to prove the methodology behavior and assist the IC designer in customizing a product-oriented DoE;
- The development of a simulation model checking methodology for adequately examining the process variation impact on circuit performances distributions' disparities, an alternate semi-formal solution to assess the inherent process variation causing coverage problems [12];
  - The application on an analog IC, by employing preSi and postSi distributions to identify the EPs displaying distributions' discrepancies, followed by classifying them as critical and non-critical and analyzing the distributions of the influential PCMs;
- The development of the yield prediction methodology for computing the parametric yield of normal distributions at an early stage, based entirely on preSi MC simulations [41];
  - The application for multivariate yield prediction of an IC product, by employing classical MC simulations and the illustration of the results compared to a predefined benchmark;
- The development of the parametric yield prediction with process variation, an enhanced version of the previously-presented methodology to fit also non-normal distributions of the Process Control Monitor parameters [6];
  - The application for multivariate parametric yield prediction of an IC product influenced by PCMs displaying normal and non-normal distributions, based on a limited number of wafers from two productive lots.

#### **6.3** List of Original Publications

1. **Elena-Diana Şandru**, A. Buzo, H. Cucu, and C. Burileanu. Recent Experiments and Findings in Baby Cry Classification. In *Fratu O., Militaru N., Halunga S. (eds) Future Access Enablers for Ubiquitous and Intelligent Infrastructures. FABULOUS 2017. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering*, 241:253–360, July 2018, ISI WOS: 000481658200037, DOI: 10.1007/978-3-319-92213-3\_37 [39].

- Elena-Diana Şandru and E. David. Unified Feature Selection and Hyperparameter Bayesian Optimization for Machine Learning based Regression. In *Proceedings of the International Symposium on Signals, Circuits and Systems* (ISSCS), pages 1–5, Iași (Romania), July 2019, ISI WOS: 000503459500003, DOI: 10.1109/ISSCS.2019.8801728 [40].
- Elena-Diana Şandru, C. Burileanu, E. David, A. Buzo, and G. Pelz. Modeling the Dependencies between Circuit and Technology Parameters for Sensitivity Analysis using Machine Learning Techniques. In *Proceedings of the 16th Interna*tional Conference on Synthesis, Modeling, Analysis and Simulation Methods and Applications to Circuit Design (SMACD), pages 237–240, Lausanne (Switzerland), July 2019, ISI WOS: 000503265100060, DOI: 10.1109/SMACD.2019.8795266.
- 4. **Elena-Diana Şandru**, E. David, and G. Pelz. Pre-Silicon Yield Estimation using Machine Learning Regression. In *Proceedings of the 26th IEEE International Conference on Electronics Circuits and Systems (ICECS)*, pages 103–104, Genoa (Italy), November 2019, ISI WOS: 000534573400028, DOI: 10.1109/ICECS46596. 2019.8964997 *Best Poster Award* [41].
- Elena-Diana Şandru, E. David, and G. Pelz. Machine Learning-Based Local Sensitivity Analysis of Integrated Circuits to Process Variations. In *Proceedings of the 27th IEEE International Conference on Electronics Circuits and Systems (ICECS)*, pages 1–2, Glasgow (Scotland), November 2020, ISI WOS: 000612696300171, DOI: 10.1109/ICECS49266.2020.9294956 *Best Presentation Award* [42].
- 6. I. Kovacs, M. Topa, C. Pop, **Elena-Diana Şandru**, A. Buzo, and G. Pelz. Correlating Electrical and Process Parameters for Yield Detractors' Detection. In *Proceedings of the 14th International Symposium on Electronics and Telecommunications (ISETC)*, pages 1–4, Timiṣoara (Romania), November 2020, ISI WOS: 000612681000093, DOI: 10.1109/ISETC50328.2020.9301121 [12].
- 7. **Elena-Diana Ṣandru**, C. Burileanu, E. David, and G. Pelz. On the Robustness of the Methodology for Modelling the Dependencies between Circuit and Technology Parameters of Integrated Circuits. In *University POLITEHNICA of Bucharest Scientific Bulletin, Series C: Electrical Engineering and Computer Science*, 83(4):97–110, 2021, **ISSN: 2286-3540** [38].
- Elena-Diana Şandru, E. David, I. Kovacs, A. Buzo, C. Burileanu, and G. Pelz.
   Modeling the Dependency of Analog Circuit Performance Parameters on Manufacturing Process Variations with Applications in Sensitivity Analysis and Yield Prediction. In *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, 41(1):129–142, 2022 Q2, IF: 2.807, ISI WOS: 000732986400014, DOI: 10.1109/TCAD.2021.3054804, ISSN: 0278-0070 [6].

- 9. **Elena-Diana Şandru**. General Concepts of Process Variation on Circuit Design and Semiconductor Products Verification. Technical Report No. 1, University POLITEHNICA of Bucharest, May 2017 [35].
- 10. **Elena-Diana Şandru**. A Qualitative Analysis of the Process Variation on Circuit Design. State-of-the-Art. Technical Report No. 2, University POLITEHNICA of Bucharest, November 2017 [35].

#### **6.4** Future Work

Future work can focus on addressing some of the following topics, in order to improve the proposed approaches:

- The improvement of the **P2P4M** methodology in the context of computational expense, by adding a stopping criterion to end the Bayesian Optimization iterations when a specific minimum is reached;
- The application of the **P2P4M** methodology in analog circuits calibration, for tuning the design towards technology variations, as the performances in production test can be represented as a function of process parameters and the tuning knobs;
- The extension of the analysis by adding other influencing factors from an IC point of view, such as temperature or input voltages, in addition to the process variation measured through the Process Control Monitor parameters;
- The integration of the global and local sensitivity analysis methodologies with respect to process variation in the IC development phases, to assist the designer in setting the experiment plan for the post-silicon verification and validation (DoE);
- The enhancement of the parametric yield prediction methodology for high-yield product cases, where very low-probability distribution tails are involved, by tuning the distribution modelling with importance sampling techniques;
- The integration of the simulation model checking methodology in the IC product development and its advancement to allow yield detractors' detection to contribute to yield optimization.
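
As an illustration of the importance sampling idea mentioned above, the following minimal sketch estimates a very small failure probability for a single performance parameter. It is not the yield prediction methodology of the thesis: the standard normal performance distribution, the specification limit of 4.5 sigma and the function names are all assumptions chosen only to keep the example self-contained.

```python
import math
import random


def normal_pdf(x, mean=0.0, std=1.0):
    """Probability density of a normal distribution N(mean, std^2)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def tail_probability_is(spec_limit, n_samples=20000, seed=0):
    """Estimate P(X > spec_limit) for X ~ N(0, 1) by importance sampling.

    Assumption: the proposal distribution is N(spec_limit, 1), i.e. the
    nominal distribution shifted onto the (hypothetical) specification limit,
    so that roughly half of the samples fall in the failure region.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(spec_limit, 1.0)  # draw from the proposal
        if x > spec_limit:
            # Re-weight each failing sample by nominal pdf / proposal pdf.
            total += normal_pdf(x) / normal_pdf(x, mean=spec_limit)
    return total / n_samples


def tail_probability_exact(spec_limit):
    """Closed-form P(X > spec_limit) for X ~ N(0, 1), used as a reference."""
    return 0.5 * math.erfc(spec_limit / math.sqrt(2.0))


if __name__ == "__main__":
    limit = 4.5  # hypothetical specification limit, in sigma units
    print("importance sampling:", tail_probability_is(limit))
    print("closed form        :", tail_probability_exact(limit))
```

With plain Monte Carlo, 20000 samples would almost never contain a single failure at this limit, so the estimate would typically be zero. Shifting the sampling distribution towards the failure region and re-weighting the samples gives a usable estimate with the same budget.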

### References

- [1] Adir, A., Copty, S., Landa, S., Nahir, A., Shurek, G., Ziv, A., Meissner, C., and Schumann, J. (2011). A Unified Methodology for pre-silicon Verification and post-silicon Validation. In *Proceedings of the Design, Automation and Test in Europe Conference and Exhibition (DATE)*, pages 1–6, Grenoble (France).
- [2] Ahmadi, A., Stratigopoulos, H.-G., Huang, K., Nahar, A., Orr, B., Pas, M., Carulli, J.-M., and Makris, Y. (2017). Yield Forecasting Across Semiconductor Fabrication Plants and Design Generations. *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, 36(12):2120–2133.
- [3] Barragan, M. J. and Leger, G. (2019). Feature Selection and Feature Design for Machine Learning Indirect Test: a Tutorial Review. In *Proceedings of the 16th International Conference on Synthesis, Modeling, Analysis and Simulation Methods and Applications to Circuit Design (SMACD)*, pages 237–240, Lausanne (Switzerland).
- [4] Brochu, E., Cora, V. M., and de Freitas, N. (2010). A tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning. https://arxiv.org/abs/1012.2599.
- [5] Cacuci, D. G., Ionescu-Buhor, M., and Navon, I. M. (2005). *Sensitivity and Uncertainty Analysis, Volume II.* Taylor & Francis.
- [6] Şandru, E.-D., David, E., Kovacs, I., Buzo, A., Burileanu, C., and Pelz, G. (2022). Modeling the Dependency of Analog Circuit Performance Parameters on Manufacturing Process Variations with Applications in Sensitivity Analysis and Yield Prediction. *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, 41(1):129–142.
- [7] Director, S., Strojwas, A. J., and Maly, W. (1989). *VLSI Design for Manufacturing: Yield Enhancement*. Kluwer Academic Publishers, Norwell, MA (USA).
- [8] Durrande, N., Ginsbourger, D., Roustant, O., and Carraro, L. (2013). ANOVA kernels and RKHS of zero mean functions for model-based sensitivity analysis. *Journal of Multivariate Analysis*, 115(1):57–67.
- [9] Gong, F., Shi, Y., Yu, H., and Yu, H. (2014). Variability-aware Parametric Yield Estimation for Analog/Mixed-signal Circuits: Concepts, Algorithms and Challenges. *IEEE Design & Test*, 31(4):6–15.
- [10] Haykin, S. (2009). Neural Networks and Learning Machines. Pearson.
- [11] Kovacs, I., Topa, M., Buzo, A., and Pelz, G. (2017). An Accurate Yield Estimation Approach for Multivariate Non-normal Data in Semiconductor Quality Analysis. In *Proceedings of the 14th International Conference on Synthesis, Modeling, Analysis and Simulation Methods and Applications to Circuit Design (SMACD)*, Taormina (Italy).

- [12] Kovacs, I., Ţopa, M., Pop, C., Şandru, E.-D., Buzo, A., and Pelz, G. (2020). Correlating Electrical and Process Parameters for Yield Detractors' Detection. In *Proceedings of the 14th International Symposium on Electronics and Telecommunications (ISETC)*, pages 1–4, Timişoara (Romania).
- [13] Kuhn, K. J., Giles, M. D., Becher, D., Kolar, P., Kornfeld, A., Kotlyar, R., Ma, S. T., Maheshwari, A., and Mudanai, S. (2011). Process Technology Variation. *IEEE Transactions on Electron Devices*, 58(8):2197–2208.
- [14] Kumar, J. A., Ahmadyan, S. N., and Vasudevan, S. (2014). Efficient Statistical Model Checking of Hardware Circuits with Multiple Failure Regions. *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, 33(6):945–958.
- [15] Liu, Q. and Wu, Y. (2012). *Supervised Learning*, chapter Supervised Learning, pages 3349–3351. Encyclopedia of the Sciences of Learning. Springer.
- [16] Miao, J. and Niu, L. (2016). A survey on feature selection. *Procedia Computer Science*, 91:919–926.
- [17] Mitra, S., Seshia, S. A., and Nicolici, N. (2010). Post-silicon Validation Opportunities, Challenges and recent Advances. In *Proceedings of the 47th Design Automation Conference (DAC)*, pages 12–17, Anaheim, CA (USA).
- [18] Mittal, S. (2016). A Survey of Architectural Techniques for Managing Process Variation. *ACM Computing Surveys (CSUR)*, 48(4).
- [19] Pieper, K.-W. and Gontro, E. (2011). An Effective Method for Solving the Covariance Equation for Statistical Modeling. In *Proceedings of 2011 Semiconductor Conference Dresden*, pages 1–4, Dresden (Germany).
- [20] Reshef, D. N. et al. (2011). Detecting Novel Associations in Large Data Sets. *Science*, 334(6062):1518–1524.
- [21] Schemmert, W. and Zimmer, G. (1974). Threshold-voltage Sensitivity of Ion-implanted MOS Transistors due to Process Variations. *Electronics Letters*, 10(9):151–152.
- [22] Seligman, E., Schubert, T., Achutha, M. V., and Kumar, K. (2015). *Basic Formal Verification Algorithms*, chapter 2, pages 23–47. Formal Verification: An Essential Toolkit for Modern VLSI Design. Morgan Kaufmann.
- [23] Shockley, W. (1961). Problems related to p-n Junctions in Silicon. *Solid-State Electronics*, 2:35–67.
- [24] Snedecor, G. W. and Cochran, W. G. (1989). *Statistical Methods*. Iowa State University Press, Ames, IA (USA), 8th edition.
- [25] Snoek, J., Larochelle, H., and Adams, R. P. (2012). Practical Bayesian Optimization of Machine Learning Algorithms. In *Proceedings of the 25th International Conference on Neural Information Processing Systems (NIPS)*, Lake Tahoe, NV (USA).
- [26] Sobe, U., Rooch, K.-H., and Mörtl, D. (2007). Simulation and Analysis of Analog Circuit and PCM (Process Control Monitor) Test Structures in Circuit Design. In *Dresden Workshop on Circuit and System Design 2007*, Dresden (Germany).

- [27] Stanisavljevic, D. and Spitzer, M. (2015). A Review of Related Work on Machine Learning in Semiconductor Manufacturing and Assembly Lines. In *Proceedings of the 15th International Conference on Knowledge Technologies and Data-driven Business (i-KNOW)*, Graz (Austria).
- [28] Stratigopoulos, H.-G. (2018). Machine Learning Applications in IC Testing. In *Proceedings of the 23rd IEEE European Test Symposium (ETS)*, Bremen (Germany).
- [29] Szekely, G., Rizzo, M., and Bakirov, N. K. (2007). Measuring and testing dependence by correlation of distances. *The Annals of Statistics*, 35(6):2769–2794.
- [30] Tang, J., Alelyani, S., and Liu, H. (2014). Feature Selection for Classification: a Review. CRC Press.
- [31] Xanthopoulos, C., Huang, K., Ahmadi, A., Kupp, N., Carulli, J., Nahar, A., Orr, B., Pass, M., and Makris, Y. (2019). *Gaussian Process-Based Wafer-Level Correlation Modeling and Its Applications*, chapter 5, pages 119–173. Machine Learning in VLSI Computer-Aided Design. Springer.
- [32] Zhang, Y., Sankaranarayanan, S., Somenzi, F., Chen, X., and Abraham, E. (2013). From Statistical Model Checking to Statistical Model Inference: Characterizing the Effect of Process Variations in Analog Circuits. In *Proceedings of the 2013 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)*, San Jose, CA (USA).
- [33] Zhou, X. and Lin, H. (2008a). *Global Sensitivity Analysis*, chapter Global Sensitivity Analysis, page 787. Encyclopedia of GIS. Springer.
- [34] Zhou, X. and Lin, H. (2008b). *Local Sensitivity Analysis*, chapter Local Sensitivity Analysis, pages 1130–1131. Encyclopedia of GIS. Springer.
- [35] Şandru, E.-D. (2017a). A Qualitative Analysis of the Process Variation on Circuit Design. State-of-the-Art. Technical Report No. 2, University POLITEHNICA of Bucharest.
- [36] Şandru, E.-D. (2017b). General Concepts of Process Variation on Circuit Design and Semiconductor Products Verification. Technical Report No. 1, University POLITEHNICA of Bucharest.
- [37] Şandru, E.-D., Burileanu, C., David, E., Buzo, A., and Pelz, G. (2019a). Modeling the Dependencies between Circuit and Technology Parameters for Sensitivity Analysis using Machine Learning Techniques. In *Proceedings of the 16th International Conference on Synthesis, Modeling, Analysis and Simulation Methods and Applications to Circuit Design (SMACD)*, pages 237–240, Lausanne (Switzerland).
- [38] Şandru, E.-D., Burileanu, C., David, E., and Pelz, G. (2021). On the Robustness of the Methodology for Modelling the Dependencies between Circuit and Technology Parameters of Integrated Circuits. *University POLITEHNICA of Bucharest Scientific Bulletin, Series C: Electrical Engineering and Computer Science*, 83(4):97–110.
- [39] Şandru, E.-D., Buzo, A., Cucu, H., and Burileanu, C. (2018). Recent Experiments and Findings in Baby Cry Classification. Fratu O., Militaru N., Halunga S. (eds) Future Access Enablers for Ubiquitous and Intelligent Infrastructures. FABULOUS 2017. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, 241:253–360.

- [40] Şandru, E.-D. and David, E. (2019). Unified Feature Selection and Hyperparameter Bayesian Optimization for Machine Learning based Regression. In *Proceedings of the International Symposium on Signals, Circuits and Systems (ISSCS)*, pages 1–5, Iaşi (Romania).
- [41] Şandru, E.-D., David, E., and Pelz, G. (2019b). Pre-Silicon Yield Estimation using Machine Learning Regression. In *Proceedings of the 26th IEEE International Conference on Electronics Circuits and Systems (ICECS)*, pages 103–104, Genoa (Italy).
- [42] Şandru, E.-D., David, E., and Pelz, G. (2020). Machine Learning-Based Local Sensitivity Analysis of Integrated Circuits to Process Variations. In *Proceedings of the 27th IEEE International Conference on Electronics Circuits and Systems (ICECS)*, pages 1–2, Glasgow (Scotland).