## Abstract

Mode shape information plays an essential role in determining the spatial pattern of the vibratory response of a structure. Uncertainty quantification of mode shapes, i.e., predicting mode shape variation when the structure is subjected to uncertainty, can provide guidance for robust design and control. Nevertheless, computational efficiency is a challenging issue. Direct Monte Carlo simulation is unlikely to be feasible, especially for a complex structure with a large number of degrees-of-freedom. In this research, we develop a new probabilistic framework built upon the Gaussian process meta-modeling architecture to analyze mode shape variation. To expedite the generation of the input data set for meta-model establishment, a multi-level strategy is adopted that blends a large amount of low-fidelity data acquired from order-reduced analysis with a small amount of high-fidelity data produced by high-dimensional full finite element analysis. To take advantage of the intrinsic relation of the spatial distribution of a mode shape, a multi-response strategy is incorporated to predict mode shape variations at different locations simultaneously. Together, these yield a multi-level, multi-response Gaussian process that can efficiently and accurately quantify the effect of structural uncertainty on mode shape variation. Comprehensive case studies are carried out for demonstration and validation.

## 1 Introduction

Mode shapes can be acquired from modal testing experiments on an actual structure or from numerical simulation of its finite element model. Mode shape information is one of the most fundamental properties of a structure, as it essentially determines the spatial pattern of the structural vibratory response. Real structures, meanwhile, are inevitably subjected to uncertainties caused by material imperfection, manufacturing tolerance, and in-service degradation [1]. Deterministic analysis of the nominal model without considering uncertainties may render the subsequent design or control ineffective [2]. Incorporating uncertainties into dynamic modeling and analysis therefore has clear significance. Intuitively, prediction of the dynamic response variation of a structure can be conducted through direct Monte Carlo simulation under given uncertainty parameters. However, for a complicated structure, the number of degrees-of-freedom (DOFs) in its finite element model is large, leading to high computational cost in solving the eigenvalue problem. When a single run of finite element simulation is computationally demanding, conducting the repeated analyses needed for direct Monte Carlo simulation becomes infeasible [3]. Moreover, in some cases, the introduction of many different types of uncertainties leads to a high-dimensional uncertainty parametric space, which further compounds the computational cost issue.

In recent years, there have been continuous efforts in uncertainty quantification of structural dynamic responses. One class of methods aims at reducing the computational time needed for a single run through model order reduction. Indeed, along with the advancement of finite element analysis, model order reduction has been an important research subject in computational mechanics/dynamics. A simple and well-known approach is Guyan reduction, where the DOFs in a structure are divided into master DOFs and slave DOFs [4]. The effects of the slave DOFs are transformed onto the master DOFs through static condensation, thereby eliminating the slave DOFs from the original model. Salvini and Vivio [5] applied Guyan reduction to modal analysis at high frequencies. Panayirci et al. [6] utilized it directly to facilitate stochastic structural analysis. To improve the modeling accuracy over Guyan reduction, a variety of component mode synthesis (CMS) approaches have been developed to produce order-reduced models. The fundamental idea of CMS is to retrieve, at least in part, the dynamic effects of truncated DOFs in the order-reduced model. Masson et al. [7] developed a CMS-based model reduction transformation that can be used throughout the entire optimization process to enhance computational efficiency. Shanmugam and Padmanabhan [8] developed a fixed- and free-interface hybrid CMS method to accurately predict the whirl frequencies of rotor dynamic systems. Zhou et al. [9] adopted a NURBS finite element-based free-interface CMS to conduct robust geometry design. While these order-reduction approaches have shown certain effectiveness in mitigating the computational cost of a single run, the subsequent sampling-based statistical analysis using direct Monte Carlo simulation still poses a significant challenge because it normally requires a very large sample size. Moreover, it is generally difficult to guarantee the accuracy of the results, due to the error introduced by model order reduction.

A different way of realizing uncertainty quantification is to enhance the efficiency of statistical sampling by means of meta-modeling. The establishment of a meta-model involves a significantly reduced sample size (i.e., the size of the data set containing the concerned responses under sampled uncertainty parameters) for training. Once established, or trained, the meta-model can directly predict the responses of a process for given uncertainty parameters without going through the original physics model-based simulation. This can greatly reduce the computational cost of response variation analysis. Amongst various meta-modeling techniques, the Gaussian process architecture exhibits several important advantages [10–12]. The underlying idea of Gaussian processes is to extend the multivariate Gaussian distribution from a finite-dimensional space to an infinite-dimensional space. It yields a probabilistic framework for nonparametric regression, thereby addressing the issue of the prohibitive sample size required in Monte Carlo simulation. In recent years, there have been attempts to extend Gaussian processes to structural dynamic analysis. DiazDelaO and Adhikari [13] employed a Gaussian process as an emulator to approximate the frequency response function (FRF) of simple structures, and later [14] applied Gaussian processes together with polynomial chaos expansion to reduce the computational cost of stochastic finite element analyses. Xia and Tang [15] computed frequency response values at a small number of frequency points and then built a Gaussian process meta-model, based on these values, to predict frequency responses with a high resolution of frequency points. Wan et al. [16] utilized the Gaussian process meta-model for analytical uncertainty quantification of natural frequencies under uniformly or normally distributed random parameters, and further generalized this method to the uncertainty quantification of complex systems with arbitrary parametric probability distributions [17]. Zhou et al. [18] used a Gaussian process to emulate the response surface (i.e., the objective function) with respect to design variables in the vibration analysis of a periodic structure with uncertainty, and carried out robust geometry design to mitigate vibration localization. It is worth noting that the applications in these studies generally focus on a scalar-type response of concern. For example, while a frequency response function naturally represents a series of responses at different frequency points, only the responses at certain frequency points are investigated, and each response of concern needs a separate Gaussian process meta-model. This not only leads to large computational effort if a large number of responses are of interest, but also overlooks the intrinsic relation among these responses.

For a typical frequency response function, the responses at multiple frequency points are correlated. Similarly, for a given mode shape of a structure, the amplitudes at different DOFs (locations) are intrinsically correlated. Moreover, mode shapes characterize the spatial distribution of responses at their corresponding natural frequencies, and in many cases it is the distribution pattern, rather than the amplitude at a specific DOF, that is of interest. Therefore, for the uncertainty quantification of mode shapes, the single-response Gaussian process mentioned above is obviously not an ideal approach. In the realm of statistical meta-modeling, the multi-response Gaussian process (MRGP), which is capable of emulating multiple correlated responses concurrently, has seen recent progress. Arendt et al. [19] pointed out that the MRGP can be used to infer true responses when multiple responses are mutually associated with the same input parameters. Wei et al. [20] employed MRGP to produce meta-models for the failure surfaces of a system, which are then utilized in reliability-based robust design. Bostanabad et al. [21] utilized MRGP for uncertainty quantification of woven fiber composites across multiple scales. Ariza Ramirez et al. [22] implemented MRGP to identify ship dynamics and further facilitate system modeling. Pan et al. [23] built an MRGP meta-model to efficiently characterize frequency responses with inherent correlation under uncertainties. These investigations illustrate the possibility of formulating MRGP-based meta-modeling to tackle the uncertainty quantification of mode shape information as vectors. In addition to MRGP meta-models, the multi-task Gaussian process model (MTGPM) has also been developed for the reconstruction of missing monitoring data considering the correlations among tasks [24].

Both the quality and the quantity of training data sets are important in establishing a Gaussian process meta-model with high accuracy. In structural dynamic analysis, high-fidelity data can be acquired from experimental measurement or full-scale finite element analysis, the size of which is usually very limited. If one only uses a small amount of high-fidelity data as the training data set, the desired performance of the meta-model cannot be ensured. According to O'Hagan and Kennedy [11], blending a small amount of high-fidelity data with a large amount of low-fidelity data is a promising path. It is worth mentioning that in structural dynamic analysis, the aforementioned order-reduced models derived through, for example, Guyan reduction or CMS techniques can be utilized to generate a large amount of response predictions directly. These responses, containing possible order-reduction error, are naturally low-fidelity data. Indeed, in a recent study, a multi-level Gaussian process (MLGP) meta-model was established to investigate the variation of a single response of a vibration system, such as a natural frequency [2]. With the large amount of low-fidelity data from the order-reduced model, the Gaussian process may avoid the errors associated with the inference procedure. Meanwhile, with the introduction of a few high-fidelity data points from the full-scale finite element model, one may correct the error of the low-fidelity data inherited from the order-reduction procedure.

The objective of this research is to develop an efficient tool for the uncertainty quantification of mode shape variation. We specifically investigate the development of a Gaussian process-based meta-model. As indicated, mode shape information is a distributed quantity, requiring a multi-response Gaussian process (MRGP). Meanwhile, we explore the feasibility of incorporating a multi-level Gaussian process (MLGP) that takes advantage of the order-reduced modeling techniques available for computational dynamic analysis. This yields a multi-level, multi-response Gaussian process (MLMRGP) that can adequately address the uncertainty quantification of mode shape information. The rest of the paper is organized as follows. Section 2 first explains the high-fidelity and low-fidelity models and the corresponding data sets to be produced and then used in the new framework. Without loss of generality, order-reduced modeling based on Guyan reduction and CMS is outlined. Subsequently, the mathematical formulation of the proposed MLMRGP is presented in detail. In this framework, we train the low-level Gaussian process emulator using the low-fidelity data set produced from the order-reduced model, and then train the high-level Gaussian process emulator by further employing the high-fidelity data set to minimize the residual between high- and low-fidelity outputs. The final meta-model, with hyper-parameters optimized through the low- and high-level emulator trainings, is used to predict the output for a given input. Meanwhile, the output correlations identified in the low- and high-level emulators are assembled to characterize the actual correlation of unseen/testing outputs. In Sec. 3, comprehensive case studies on a benchmark plate structure are presented, where the effectiveness of the new framework is demonstrated. Concluding remarks are summarized in Sec. 4.

## 2 Multi-Level Multi-Response Gaussian Process for Computational Modal Analysis

### 2.1 Baseline Model and Two-Level Data Sets.

The equation of motion of a structure discretized by the finite element method can be written as

$$\mathbf{M}\ddot{\mathbf{Z}}+\mathbf{C}\dot{\mathbf{Z}}+\mathbf{K}\mathbf{Z}=\mathbf{f}\qquad(1)$$

where **M**, **C**, and **K** are the $N\times N$ mass, damping, and stiffness matrices, *N* is the total number of DOFs, and **Z** and **f** are the *N*-dimensional displacement response vector and external force vector, respectively. In this research, we study the effects of structural uncertainty on mode shape information and let **M** and **K** be functions of $\theta$, where $\theta$ represents the set of uncertain parameters. That is, under the uncertainty effect, the mass and stiffness matrices are denoted as $\mathbf{M}(\theta)$ and $\mathbf{K}(\theta)$. We assume small damping or proportional damping. Therefore, the natural frequencies and mode shapes of the system are determined by the following eigenvalue problem:

$$\big[\mathbf{K}(\theta)-\omega^{2}\mathbf{M}(\theta)\big]\psi=\mathbf{0}\qquad(2)$$

Apparently, the natural frequency $\omega$ and the mode shape $\psi$ are affected by $\theta$. Equations (1) and (2) are referred to as the baseline model hereafter, corresponding to the full-scale finite element mesh. The goal of this research is to formulate an efficient and accurate framework with which we can predict the variation of mode shape information. Specifically, we plan to incorporate a two-level Gaussian process approach that balances the computational cost of training data acquisition against meta-model accuracy. High-fidelity data can be produced directly from Eq. (2) (i.e., the baseline model) under sampled uncertainty parameter sets $\theta$. In actual practice, the amount of high-fidelity data is usually limited, due to the computational cost involved in full-scale finite element analysis.
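As an illustration of producing high-fidelity samples by repeatedly solving an eigenvalue problem of the form of Eq. (2), consider the following minimal numpy/scipy sketch. The 3-DOF spring–mass chain and the 5% parameter perturbations are hypothetical stand-ins for the full finite element model and the actual uncertainty description:

```python
import numpy as np
from scipy.linalg import eigh

def modal_analysis(K, M):
    """Solve the generalized eigenvalue problem K psi = omega^2 M psi."""
    lam, Psi = eigh(K, M)      # eigenvalues returned in ascending order
    omega = np.sqrt(lam)       # natural frequencies (rad/s)
    return omega, Psi

def assemble(m, k):
    """Hypothetical 3-DOF chain: masses m_i, springs k_i (first spring grounded)."""
    M = np.diag(m)
    K = np.array([[k[0] + k[1], -k[1],        0.0],
                  [-k[1],        k[1] + k[2], -k[2]],
                  [0.0,         -k[2],         k[2]]])
    return K, M

rng = np.random.default_rng(0)
nominal_m, nominal_k = np.ones(3), np.ones(3)
modes = []
for _ in range(100):                       # direct Monte Carlo over theta
    theta_m = nominal_m * (1 + 0.05 * rng.standard_normal(3))
    theta_k = nominal_k * (1 + 0.05 * rng.standard_normal(3))
    K, M = assemble(theta_m, theta_k)
    omega, Psi = modal_analysis(K, M)
    modes.append(Psi[:, 0] / np.linalg.norm(Psi[:, 0]))   # first mode shape
modes = np.asarray(modes)                  # 100 samples of the mode shape
```

Each Monte Carlo pass re-assembles $\mathbf{M}(\theta)$ and $\mathbf{K}(\theta)$ and stores a normalized mode shape; for a full-scale model, each such pass is exactly the expensive step the meta-model is meant to avoid.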

Low-fidelity data, on the other hand, are generated from an order-reduced model, whose equation of motion is

$$\mathbf{M}_r\ddot{\mathbf{z}}+\mathbf{C}_r\dot{\mathbf{z}}+\mathbf{K}_r\mathbf{z}=\mathbf{f}_r\qquad(3)$$

where $\mathbf{M}_r$, $\mathbf{C}_r$, and $\mathbf{K}_r$ are the $n\times n$ mass, damping, and stiffness matrices, *n* is the total number of DOFs in the order-reduced model, and **z** and $\mathbf{f}_r$ are the *n*-dimensional displacement vector and external force vector, respectively. Under the uncertainty effect, the eigenvalue problem becomes

$$\big[\mathbf{K}_r(\theta)-\omega_r^{2}\mathbf{M}_r(\theta)\big]\psi_r=\mathbf{0}\qquad(4)$$

where $\omega_r$ and $\psi_r$ denote the natural frequency and the corresponding mode shape solved from the order-reduced model. After a coordinate transformation, one can obtain the mode shape with respect to the original coordinate system. As the order of the system is reduced significantly, a large amount of mode shape information under uncertainty can be produced efficiently by repeatedly solving the eigenvalue problem (Eq. (4)).
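The static condensation underlying Guyan reduction (detailed in the Appendix) can be sketched as follows; the 4-DOF chain and the choice of master DOFs are hypothetical:

```python
import numpy as np

def guyan_reduce(K, M, master):
    """Guyan (static) condensation onto the master DOFs.

    Builds T = [I; -Kss^-1 Ksm] so that z ~= T z_m, then
    K_r = T^T K T and M_r = T^T M T.
    """
    n = K.shape[0]
    master = np.asarray(master)
    slave = np.setdiff1d(np.arange(n), master)
    Kss = K[np.ix_(slave, slave)]
    Ksm = K[np.ix_(slave, master)]
    T = np.zeros((n, master.size))
    T[master, np.arange(master.size)] = 1.0           # identity on masters
    T[np.ix_(slave, np.arange(master.size))] = -np.linalg.solve(Kss, Ksm)
    return T.T @ K @ T, T.T @ M @ T, T

# Usage on a hypothetical clamped-clamped 4-DOF chain, keeping DOFs 0 and 2
k = 1.0
K = k * (2 * np.eye(4) - np.eye(4, k=1) - np.eye(4, k=-1))
M = np.eye(4)
K_r, M_r, T = guyan_reduce(K, M, master=[0, 2])
```

The low-fidelity mode shape in the original coordinates is then recovered as $T\psi_r$, which is the coordinate transformation mentioned above.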

### 2.2 Multi-Level Multi-Response Gaussian Process Framework.

We employ the Gaussian process architecture to establish the meta-model. In the Gaussian process formulation, an unknown system is denoted as $f(\mathbf{x})$, where **x** is an input vector. Here, for uncertainty quantification of mode shape variation, the input vector is the set of uncertainty parameters $\theta$ shown in Eqs. (2) and (4). The observed value of $f(\mathbf{x})$, i.e., the training data of the response, is denoted as **y**. It is worth noting that, as we are interested in mode shape variation, **y** represents the specific mode shape of interest. Therefore, $f(\mathbf{x})$ and **y** are both vectors, which is the basis of the multi-response Gaussian process (MRGP). In the context of computational analysis, we neglect the noise effect, which results in $f(\mathbf{x})=\mathbf{y}$. Given a set of $n_s$ observations described as $\vartheta=\{(\mathbf{y}_i,\mathbf{x}_i),\ i=1,2,\dots,n_s\}$ ($n_s$ is the number of training data points), a single-level Gaussian process regression can be implemented to predict the output over a target input. Each input $\mathbf{x}_i$ and output $\mathbf{y}_i$ are *r*-dimensional and *q*-dimensional vectors, respectively. In this research, we formulate a two-level MRGP with two types of data sets, i.e., low- and high-fidelity data sets introduced as $\vartheta^{(u)}=\{(\mathbf{y}_i^{(u)},\mathbf{x}_i^{(u)}),\ i=1,2,\dots,n_s^{(u)};\ u=1,2\}$. The superscript *u* indicates the fidelity level of the data, and each $(\mathbf{y}_i^{(u)},\mathbf{x}_i^{(u)})$ is referred to as a data point at the *u*th level. We predict the output vector at a target input given the two observed data sets $\vartheta^{(1)}$ and $\vartheta^{(2)}$. Specifically, $\vartheta^{(1)}$ is the low-fidelity data set acquired from the order-reduced model (Eq. (4)), and $\vartheta^{(2)}$ is the high-fidelity data set acquired from the full-scale baseline model (Eq. (2)). Essentially, we will establish a multi-level multi-response Gaussian process (MLMRGP) to facilitate the uncertainty quantification of modal information.

Let the outputs **Y** in the different data sets be expressed, under assumed quasi-linear relations, as

$$\mathbf{Y}^{(1)}(\mathbf{x})=\boldsymbol{\delta}^{(1)}(\mathbf{x})\qquad(5a)$$

$$\mathbf{Y}^{(2)}(\mathbf{x})=\rho^{(1)}\mathbf{Y}^{(1)}(\mathbf{x})+\boldsymbol{\delta}^{(2)}(\mathbf{x})\qquad(5b)$$

where $\rho^{(1)}$ is a regression parameter, and $\boldsymbol{\delta}^{(1)}$ and $\boldsymbol{\delta}^{(2)}$ are modeled as two *q*-dimensional independent stationary multivariate Gaussian processes [2]. As the summation of independent Gaussians remains in closed form, we can derive the Gaussian process representation of the observed low- and high-fidelity data points as the matrix-variate normal distribution

$$\begin{bmatrix}\mathbf{Y}^{(1)}\\ \mathbf{Y}^{(2)}\end{bmatrix}\sim\mathcal{MN}\big(\mathbf{H}(\mathbf{X})\boldsymbol{\beta},\ \boldsymbol{\Sigma},\ \mathbf{Q}\big)\qquad(6)$$

where **X** denotes the samples of both low- and high-fidelity inputs. We use

$$h(\mathbf{X}^{(u)})=\begin{bmatrix}1 & x_{1,1}^{(u)} & \dots & x_{1,r}^{(u)}\\ 1 & x_{2,1}^{(u)} & \dots & x_{2,r}^{(u)}\\ \vdots & \vdots & & \vdots\\ 1 & x_{n_s^{(u)},1}^{(u)} & \dots & x_{n_s^{(u)},r}^{(u)}\end{bmatrix}$$

to capture the linear characteristic under small uncertainties, where $n_s^{(1)}$ and $n_s^{(2)}$ are the numbers of low- and high-fidelity data points, respectively. Therefore, the dimension of $\mathbf{H}(\mathbf{X})$ is $(n_s^{(1)}+n_s^{(2)})\times(2r+2)$. In practice, $n_s^{(2)}$ is much smaller than $n_s^{(1)}$, owing to the costly acquisition of high-fidelity data through full-scale finite element simulation. $\boldsymbol{\beta}$ is the unknown regression coefficient matrix with dimension $(2r+2)\times q$, **Q** is the non-spatial $q\times q$ matrix representing the covariance among the output variables, and $\boldsymbol{\Sigma}$ is the spatial covariance matrix formed from the spatial inputs, with dimension $(n_s^{(1)}+n_s^{(2)})\times(n_s^{(1)}+n_s^{(2)})$. The above equation can be re-organized into the vector representation form

$$\mathrm{vec}(\mathbf{Y})\sim\mathcal{N}\big(\mathrm{vec}(\mathbf{H}(\mathbf{X})\boldsymbol{\beta}),\ \mathbf{Q}\otimes\boldsymbol{\Sigma}\big)$$
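The stated dimension $(n_s^{(1)}+n_s^{(2)})\times(2r+2)$ suggests a block-diagonal arrangement of the two levels' linear bases; a small numpy sketch under that assumption (sample counts mirror the case study, the random inputs are placeholders):

```python
import numpy as np

def h_block(X):
    """Linear regression basis [1, x_1, ..., x_r] for one fidelity level."""
    return np.hstack([np.ones((X.shape[0], 1)), X])

def build_H(X1, X2):
    """Stack the two levels block-diagonally: dim (n1 + n2) x (2r + 2)."""
    h1, h2 = h_block(X1), h_block(X2)
    H = np.zeros((h1.shape[0] + h2.shape[0], h1.shape[1] + h2.shape[1]))
    H[:h1.shape[0], :h1.shape[1]] = h1
    H[h1.shape[0]:, h1.shape[1]:] = h2
    return H

rng = np.random.default_rng(1)
r = 12                                  # e.g., 12 uncertainty parameters
X1 = rng.standard_normal((300, r))      # low-fidelity inputs
X2 = rng.standard_normal((30, r))       # high-fidelity inputs
H = build_H(X1, X2)                     # shape (330, 2r + 2) = (330, 26)
```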

The spatial covariance is constructed from an anisotropic squared exponential kernel,

$$\Sigma_{ij}^{(u)}=\exp\!\left(-\sum_{k=1}^{r}b_k^{(u)}\big(x_{i,k}^{(u)}-x_{j,k}^{(u)}\big)^{2}\right)$$

The hyper-parameters, i.e., the regression parameter $\rho^{(1)}$ and the reciprocal scale-lengths $b_k^{(u)}$ in the covariance kernel, will be identified through learning from the training data sets. Since **Q** and $\boldsymbol{\beta}$ depend on these hyper-parameters, only the hyper-parameters denoted as $\varphi=[b_1^{(1)},b_2^{(1)},\dots,b_r^{(1)},b_1^{(2)},b_2^{(2)},\dots,b_r^{(2)},\rho^{(1)}]^{\mathrm{T}}$ need to be optimized. The number of parameters $b_k$ does not have to equal the input dimension *r* and generally can be adjusted to accommodate the computational capacity. The selection of the kernel and the related hyper-parameters depends on the nature of the data, which can be examined through a cross-validation procedure [26]. The key step here is to optimize the hyper-parameters following the Bayesian formula, i.e., by maximizing the likelihood of the observed data at each level with respect to $\varphi$.
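A minimal sketch of an anisotropic squared exponential covariance built from reciprocal scale-lengths $b_k$ (the values used here are hypothetical), including the small diagonal perturbation discussed in Sec. 2.3:

```python
import numpy as np

def sq_exp_cov(X, b):
    """Sigma_ij = exp(-sum_k b_k (x_ik - x_jk)^2), with b_k > 0."""
    d = X[:, None, :] - X[None, :, :]        # pairwise coordinate differences
    return np.exp(-(d**2 * b).sum(axis=-1))

rng = np.random.default_rng(2)
X = rng.standard_normal((50, 12))            # 50 input samples, r = 12
b = np.full(12, 0.1)                         # reciprocal scale-lengths
Sigma = sq_exp_cov(X, b)
jitter = 1e-8 * np.eye(50)                   # small diagonal perturbation
L = np.linalg.cholesky(Sigma + jitter)       # succeeds iff positive definite
```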

In these level-wise likelihood functions, **Y** represents $Y^{(1)}$ if *u* = 1; otherwise, **Y** represents $Y^{(2)}-\rho^{(1)}Y^{(1),2}$. $\beta^{(u)}$ and $Q^{(u)}$ denote, respectively, the unknown regression coefficient matrix of the mean function and the output covariance matrix involved in the *u*th-level emulator.

With the hyper-parameters optimized through the objective functions in Eqs. 11(*a*) and 11(*b*), the regression coefficient matrices and output covariance matrices of the two emulators are identified. Collectively, they are used to characterize the posterior Gaussian process of the outputs over the target inputs (Eq. (13)).
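The two-level training-then-correction procedure can be illustrated end to end on a scalar toy problem. The analytic low-/high-fidelity functions, the fixed hyper-parameters, and the zero-mean interpolation form below are hypothetical simplifications of the actual emulators and their likelihood-based optimization:

```python
import numpy as np

def k_se(A, B, b):
    """Anisotropic squared exponential kernel between row sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :])**2 * b).sum(-1)
    return np.exp(-d2)

class GP:
    """Zero-mean GP regressor with fixed reciprocal scale-lengths b."""
    def __init__(self, b, jitter=1e-6):
        self.b, self.jitter = b, jitter
    def fit(self, X, y):
        self.X = X
        K = k_se(X, X, self.b) + self.jitter * np.eye(len(X))
        self.alpha = np.linalg.solve(K, y)
        return self
    def predict(self, Xs):
        return k_se(Xs, self.X, self.b) @ self.alpha

rng = np.random.default_rng(3)
f_lo = lambda x: np.sin(3 * x[:, 0])                        # cheap model
f_hi = lambda x: 1.1 * np.sin(3 * x[:, 0]) + 0.1 * x[:, 0]  # expensive model

X1 = rng.uniform(-1, 1, (60, 1))     # many low-fidelity runs
X2 = rng.uniform(-1, 1, (8, 1))      # few high-fidelity runs
rho = 1.0                            # regression parameter (fixed here)

gp_lo = GP(b=np.array([5.0])).fit(X1, f_lo(X1))             # level-1 emulator
resid = f_hi(X2) - rho * gp_lo.predict(X2)                  # Y(2) - rho Y(1)
gp_hi = GP(b=np.array([5.0])).fit(X2, resid)                # level-2 emulator

Xt = rng.uniform(-1, 1, (200, 1))
pred = rho * gp_lo.predict(Xt) + gp_hi.predict(Xt)          # two-level prediction
err = np.abs(pred - f_hi(Xt)).max()
```

The level-2 emulator only has to learn the smooth low-to-high discrepancy, which is why a handful of expensive samples suffices.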

### 2.3 Computational Treatment.

Since the objective functions (Eqs. 11(*a*) and 11(*b*)) cannot be expressed in closed form with respect to the hyper-parameters, sampling-based optimization approaches are preferred. In this study, two algorithms, i.e., simulated annealing [28] and particle swarm [29], are examined. It is found that particle swarm outperforms simulated annealing in prediction accuracy; hence, the particle swarm algorithm is adopted in the case analysis. The evaluation of the objective functions necessitates the computation of matrix inverses and determinants associated with the covariance matrices $\Sigma^{(u)}$ and $Q^{(u)}$. In general, this may lead to the following numerical issues:

$\Sigma^{(u)}$ is theoretically positive definite as long as $b_k^{(u)}>0$ (a negative $b_k^{(u)}$ contradicts the physical nature of this kernel). However, it may be nearly singular or ill-conditioned when a small value of $b_k^{(u)}$ is statistically sampled during optimization. An extremely ill-conditioned $\Sigma^{(u)}$ causes numerical instability in matrix inversion. Moreover, numerical computation is subject to finite precision (i.e., there is a smallest representable non-zero number), so the determinant of an ill-conditioned $\Sigma^{(u)}$ cannot be reliably distinguished from zero.

$Q^{(u)}$ should be positive definite as well. However, it is often close to singular, especially when a large number of output variables are involved. Earlier studies have noted that a very large number of output variables is not recommended, since it may induce numerical instability [30].

$Q(u)\u2297\Sigma (u)$ yields a high-dimensional matrix when many response variables and training data sets are taken into account. The inversion of such a large matrix required in each iteration of objective function evaluation is computationally expensive.

Our strategies to address these issues are summarized as follows:

- *Matrix inversion:* We monitor the condition numbers of $\Sigma^{(u)}$ and $Q^{(u)}$ and set a threshold to decide whether the current objective evaluation is executed or skipped. Meanwhile, we add a small diagonal perturbation to the matrices to be inverted.
- *Matrix determinant:* We incorporate matrix decomposition, i.e., LU decomposition or eigenvalue analysis, to compute the determinant [23].
- *Large-size matrix inverse:* We take advantage of the Kronecker product identity $(Q^{(u)}\otimes\Sigma^{(u)})^{-1}=Q^{(u)^{-1}}\otimes\Sigma^{(u)^{-1}}$ [31], where $\Sigma^{(u)}$ and $Q^{(u)}$ are both invertible. Recall that the computational complexity of inverting a $P\times P$ matrix is $O(P^{3})$; the original complexity $O\big((n_s^{(u)}\times q)^{3}\big)$ is thus reduced to $O\big(n_s^{(u)3}\big)+O(q^{3})$.
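The Kronecker identity in the last item, as well as the determinant-via-factorization strategy, can be checked numerically with a small self-contained sketch (matrix sizes hypothetical):

```python
import numpy as np

rng = np.random.default_rng(4)
q, n = 5, 20
A = rng.standard_normal((q, q))
Q = A @ A.T + q * np.eye(q)            # SPD q x q output covariance
B = rng.standard_normal((n, n))
S = B @ B.T + n * np.eye(n)            # SPD n x n spatial covariance

# Direct inversion of the (q*n) x (q*n) matrix: O((q n)^3)
big_inv = np.linalg.inv(np.kron(Q, S))

# Kronecker identity: (Q ⊗ S)^-1 = Q^-1 ⊗ S^-1, costing O(q^3) + O(n^3)
fast_inv = np.kron(np.linalg.inv(Q), np.linalg.inv(S))
max_dev = np.abs(big_inv - fast_inv).max()

# Determinant via factorization (cf. the LU/eigenvalue strategy above)
sign, logdet = np.linalg.slogdet(S)    # stable log-determinant
```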

## 3 Case Studies: Meta-Model Establishment and Uncertainty Quantification Illustration

In this section, we demonstrate the effectiveness of the proposed framework. We specifically focus on mode shape variations and utilize the multi-level multi-response Gaussian process approach. We highlight the influences of the low-fidelity and high-fidelity data sets on the uncertainty quantification performance.

### 3.1 Benchmark Structure and Data Preparation

#### 3.1.1 Nominal Structure and Model Order Reduction.

We consider a benchmark structure shown in Fig. 1(a). It essentially consists of three rectangular plates connected together. For the nominal structure without uncertainty, the mass density and Young's modulus are 7850 kg/m^{3} and 206 GPa. From bottom to top, these three plates have, respectively, 2214, 630, and 858 DOFs. Altogether, the full-scale finite element model of this benchmark structure has 3510 DOFs. We choose this structural configuration so that interested readers can readily re-construct the mesh for validation and comparison. This structure can be directly decomposed into substructures to facilitate various order-reduction analyses. The order-reduction approach adopted hereafter can be extended easily to more complicated structures where substructure decomposition is straightforward.

As mentioned in the preceding sections, one important component of the proposed methodology is to incorporate a low-fidelity data set into uncertainty quantification, which has the prospect of significantly reducing the computational cost of generating training data for meta-model establishment. Commonly, Guyan reduction and component mode synthesis (CMS) approaches are used for order reduction in structural dynamic analysis. The standard Guyan reduction and fixed-interface CMS are outlined in the Appendix. In Guyan reduction, the DOFs are first divided into master DOFs and slave DOFs, and the responses of the slave DOFs are transformed onto the master DOFs through static condensation. The fixed-interface CMS takes into consideration the dynamic effects of the DOFs that are truncated in order reduction and is therefore generally more accurate, at the price of additional computations compared with Guyan reduction. Here, we analyze both order-reduction methods. We consider the first three *z*-direction bending modes. For Guyan reduction, the master DOFs selected are indicated in Fig. 1(b). Apparently, the *z*-direction DOFs away from the clamped boundaries play a dominant role in these modes; these are selected as the master DOFs. Specifically, 150, 48, and 32 master DOFs are selected for the three substructures (from bottom to top), respectively, yielding an order-reduced model with 230 DOFs in total. For fixed-interface CMS, we keep the first 10, 5, and 2 modes of the substructures (from bottom to top), respectively. We also keep all the interface DOFs (between neighboring substructures) in the order-reduced model. The CMS order-reduced model thus has 209 DOFs in total.

The computation is carried out on a two-processor desktop (Intel E5620@2.4 GHz) in the MATLAB environment [32]. In this research, we use self-developed finite element code to carry out the investigations, which facilitates a streamlined process for generating data sets with multiple fidelity levels. The finite element model of the benchmark structure used in the analysis is fully validated against ANSYS [33]. When the first 20 natural frequencies and mode shapes are sought, the full finite element analysis takes 1.15 s to complete one run. In comparison, the Guyan reduction and the fixed-interface CMS take 0.13 s and 0.25 s, respectively. The first 5 natural frequencies are listed in Table 1, and the mode shapes are shown in Fig. 2. For illustration purposes, we present only the *z*-direction DOFs in the mode shape comparison, since the modes involved are all bending modes. In general, the natural frequencies and mode shapes obtained from the order-reduced approaches have good accuracy as compared with full-scale finite element analysis. The performance degrades as the mode order increases. Unsurprisingly, the fixed-interface CMS generally outperforms Guyan reduction in terms of accuracy, although the results are comparable. Since our goal is to demonstrate that a two-level Gaussian process that integrates a small amount of high-fidelity data with a large amount of low-fidelity data can yield a satisfactory meta-model, in what follows we adopt Guyan reduction as the low-fidelity data generator. Guyan reduction features faster computation with lower fidelity (i.e., less accuracy) and therefore better highlights the advantage of two-level meta-modeling.

#### 3.1.2 Model Uncertainties and Data Set Preparation.

We assume model uncertainties come from the material properties, i.e., mass density and Young's modulus. Specifically, the benchmark structure is divided into six segments (as shown in Fig. 1), and each segment features its own material property uncertainties, leading to 12 uncertainty parameters. We let these 12 uncertainty parameters follow a multivariate normal distribution, in which the means take the nominal values and the standard deviations are set as 20% of the means. As we assume the uncertainty parameters are independent and identically distributed random variables, the covariance matrix of this multivariate normal distribution is diagonal. Following Latin hypercube sampling [34], we generate 1000 uncertainty input samples. The sampled uncertainty parameters are then employed in Monte Carlo simulations of both the full-scale finite element analysis and the Guyan reduction. In this case study, since the full-scale finite element mesh of the benchmark structure has relatively low dimension, we can readily produce the Monte Carlo simulation results, which are then used for validation.
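The input sampling described above can be sketched with scipy's quasi-Monte Carlo module; mapping Latin hypercube samples through the normal inverse CDF is one standard way to obtain the stated distribution (the treatment of the two material properties as a flat 12-vector is an illustrative assumption):

```python
import numpy as np
from scipy.stats import norm, qmc

d = 12                                        # 6 segments x (density, modulus)
nominal = np.concatenate([np.full(6, 7850.0),       # mass density, kg/m^3
                          np.full(6, 206e9)])       # Young's modulus, Pa
std = 0.20 * nominal                          # 20% of the means

sampler = qmc.LatinHypercube(d=d, seed=5)
u = sampler.random(n=1000)                    # stratified samples in [0, 1)^12
theta = norm.ppf(u, loc=nominal, scale=std)   # map to independent normals
```

Each row of `theta` is one Monte Carlo realization of the 12 uncertainty parameters, to be fed into both the full-scale and the order-reduced analyses.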

As mentioned in Sec. 2.3, the number of output variables of the meta-model should not be very large, in order to yield tractable computation and also to avoid numerical instability. As such, for each vibration mode of interest, we focus on the 50 DOFs on the top surface of the structure where the mode shape amplitudes of the nominal structure have the largest absolute values. In other words, these DOFs are used to represent/characterize the respective mode shapes. The multiple responses defined in the MLMRGP hence consist of these 50 interrelated mode shape amplitudes.
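Picking the 50 largest-amplitude DOFs from a nominal mode shape is straightforward; a sketch with a random placeholder vector standing in for the actual nominal mode shape:

```python
import numpy as np

def select_dofs(psi_nominal, n_keep=50):
    """Indices of the n_keep DOFs with the largest absolute mode amplitudes."""
    idx = np.argsort(np.abs(psi_nominal))[::-1][:n_keep]
    return np.sort(idx)              # restore spatial ordering for plotting

rng = np.random.default_rng(6)
psi = rng.standard_normal(3510)      # placeholder for a nominal mode shape
dofs = select_dofs(psi)
# Restrict each Monte Carlo mode shape sample to the selected DOFs
Y = rng.standard_normal((1000, 3510))[:, dofs]   # multi-response outputs
```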

Figure 3 shows the variations of the first two bending modes obtained from the Monte Carlo simulation of the full-scale finite element analysis. At each selected DOF, the probability density function (PDF) based on the 1000 uncertainty samples is shown as a violin plot. Similarly, Fig. 4 shows the results of the first two bending modes obtained from the Monte Carlo simulation of the Guyan reduction analysis. There are notable discrepancies when we compare Figs. 3 and 4. Apparently, the second-mode shape is more sensitive to uncertainties, as larger amplitude variations can be observed. In addition, the output distribution at each DOF varies. In the subsequent analysis, these mode shape data will be used for meta-model training and validation.

### 3.2 Multi-Level Multi-Response Gaussian Process Meta-Model Establishment and Validation

#### 3.2.1 Meta-Model Establishment.

The multi-level multi-response Gaussian process (MLMRGP) proposed in this research takes advantage of the multi-fidelity data sets generated in Sec. 3.1.2 and takes into consideration the inherent correlation among mode shape amplitudes. We start by employing 30 high-fidelity data points and 300 low-fidelity data points, both randomly selected from the respective databases generated by the Monte Carlo simulations of the full-scale finite element and order-reduced analyses (Sec. 3.1.2). The remaining 700 low-fidelity data points and the corresponding 700 high-fidelity data points (under the same uncertainty inputs/parameters) are used for validation. To facilitate the optimization of the hyper-parameters through Eqs. 11(*a*) and 11(*b*), preprocessing of the input/output data is necessary. The input data (i.e., the set of uncertainty parameters) are converted into standard normal distributions, and the output data, i.e., the mode shape amplitudes, are scaled to [−1, 1].

In establishing the meta-model using MLMRGP, we adopt a linear mean and anisotropic squared exponential covariance kernels (Eqs. (6) and 8(*b*)). The squared exponential covariance kernel of each level's emulator includes six reciprocal scale-length values. Each scale-length is used to weight the spatial correlation of two input samples, i.e., the variations of mass density and Young's modulus of one specific segment of the structure analyzed. For example, $b_k^{(u)}$ characterizes the spatial correlation of the parameterized inputs, i.e., the mass density and Young's modulus of the *k*th segment, in the *u*th-level emulator. In addition to the six $b_k^{(u)}$ of each level's emulator, one regression coefficient is considered. Therefore, a total of 13 hyper-parameters are to be optimized. Using fewer hyper-parameters would render the meta-model incapable of capturing the underlying data features; on the other hand, more hyper-parameters would increase the computational cost and may cause model overfitting. The particle swarm algorithm is used for hyper-parameter optimization, and design bounds need to be defined for all hyper-parameters. Here, the six $b_k^{(1)}$ and six $b_k^{(2)}$ are specified with bounds [0.01, 50] and [0.01, 300], respectively. The regression coefficient $\rho^{(1)}$ is specified with bounds [0.001, 1]. The simulation variables, i.e., the swarm size and the maximum iteration number of the particle swarm algorithm, are set as 300 and 50,000, respectively. The MLMRGP-based meta-model is then established following the procedure outlined in Sec. 2.2.
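A basic particle swarm loop of the kind described above can be sketched as follows; the sphere objective is a hypothetical stand-in for the actual likelihood-based objective, and the swarm settings are scaled down from those in the text:

```python
import numpy as np

def pso(objective, lb, ub, n_particles=40, n_iter=200,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective over the box [lb, ub] with a basic particle swarm."""
    rng = np.random.default_rng(seed)
    d = lb.size
    x = rng.uniform(lb, ub, (n_particles, d))        # positions
    v = np.zeros((n_particles, d))                   # velocities
    pbest = x.copy()                                 # personal bests
    pbest_f = np.apply_along_axis(objective, 1, x)
    g = pbest[pbest_f.argmin()].copy()               # global best
    for _ in range(n_iter):
        r1, r2 = rng.random((2, n_particles, d))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lb, ub)                   # enforce design bounds
        f = np.apply_along_axis(objective, 1, x)
        improved = f < pbest_f
        pbest[improved], pbest_f[improved] = x[improved], f[improved]
        g = pbest[pbest_f.argmin()].copy()
    return g, pbest_f.min()

# 13-dimensional box mimicking the bounds above: six b^(1), six b^(2), rho^(1)
lb = np.array([0.01] * 6 + [0.01] * 6 + [0.001])
ub = np.array([50.0] * 6 + [300.0] * 6 + [1.0])
sphere = lambda p: np.sum((p - lb)**2)       # toy objective, minimum at lb
best, best_f = pso(sphere, lb, ub)
```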

#### 3.2.2 Characterization of the First-Mode Shape Variation.

Once the meta-model is trained using the MLMRGP framework, we can use it as an emulator to predict mode shape variation under given input parameters (i.e., various uncertainty parameters). Recall that 700 high-fidelity and low-fidelity data sets, under the same 700 samples of uncertainty parameters, are not used in training. They will be used in validation. Using these 700 samples of uncertainty parameters, we can predict the corresponding mode shape outputs through the meta-model established. We consider the high-fidelity, full-scale finite element results as the accurate results. The prediction errors of the first-mode shape amplitudes by the meta-model are shown in Fig. 5, where the PDF of the prediction error at a given DOF is estimated based on the 700 prediction error values. It is worth noting that the peaks of the PDFs do not truly represent the worst-case (i.e., maximum) errors. For clear illustration, we include the worst-case errors over all DOFs in the plot, which are marked as crosses (Fig. 5). The results show that the mean errors vary slightly at different DOFs. Overall, however, the mean errors are all below 2%. The worst-case errors versus DOFs follow a similar trend and are all under 8%.

To evaluate the *overall* error level, we define the average of mean errors (AMR)

$$\mathrm{AMR} = \frac{1}{m}\sum_{i=1}^{m}\bar{e}_i \tag{15}$$

where $\bar{e}_i$ denotes the mean prediction error at the *i*-th DOF (over the 700 samples), and *m* denotes the number of DOFs selected in mode shape characterization. In this case study, *m* = 50. The AMR of the first-mode shape variation prediction is 1.08%. Recall that the low-fidelity data set is generated by Guyan reduction and bears order-reduction error. For comparison purposes, we calculate the AMR for the corresponding 700 low-fidelity data directly and find an AMR of 1.18% for the first mode. The AMR is only slightly reduced by using MLMRGP, as the original AMR for the first mode of the low-fidelity data is already small. As will be shown subsequently, more significant accuracy improvement by MLMRGP is achieved in the prediction of the second-mode shape variation.
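Given matrices of predicted and true mode shape amplitudes, the AMR of Eq. (15) can be computed as below. This is a sketch that assumes the per-DOF mean error $\bar{e}_i$ is a mean relative error over the validation samples; the paper's exact error definition may differ.

```python
import numpy as np

def amr(Y_pred, Y_true):
    """Average of mean errors (AMR), Eq. (15): average over the m DOFs of
    the mean error at each DOF (here taken as the mean relative error
    over the validation samples)."""
    rel_err = np.abs(Y_pred - Y_true) / np.abs(Y_true)  # shape (n_samples, m)
    e_bar = rel_err.mean(axis=0)                        # per-DOF mean error
    return e_bar.mean()

# A uniform 1% relative error over 700 samples and 50 DOFs gives AMR = 1%.
Y_true = np.ones((700, 50))
Y_pred = 1.01 * Y_true
```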

We now take a further look at some prediction instances. For example, the 26th and the 43rd DOFs show larger prediction errors. Recall that the posterior mean values of MLMRGP are employed as the prediction results here, while the prediction results of MLMRGP are actually statistically characterized by the posterior mean and covariance; the posterior covariance essentially indicates the confidence/likelihood of the posterior mean. We then analyze the PDFs of the predicted mode amplitudes at DOFs of interest, built upon the posterior mean and covariance. The covariance in this case reduces to the variance, as we focus on the statistical relation of samples that have the same response variable. For comparison, we also include good prediction instances, i.e., the 42nd and the 48th DOFs with smaller prediction errors. We attempt to make the comparison for DOFs with similar nominal mode shape amplitudes. Figure 6 shows the prediction performance comparison from a probabilistic perspective. A shaded area denotes the region between plus and minus one standard deviation. It can be observed that all true values fall within the shaded areas of the predicted PDFs. While the worst-case errors (i.e., deviations between the posterior mean and the actual value) in the top two sub-plots (the 43rd and the 26th DOFs) are much larger, the corresponding variances increase significantly. This indicates that while relatively larger errors occur at these DOFs, the meta-model is capable of pointing out the low confidence at these locations. This capability of probabilistic prediction illustrates that the Gaussian process is a powerful statistical meta-modeling technique.
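The observation that all true values fall within plus/minus one posterior standard deviation can be checked numerically. A small sketch (the variable names are hypothetical):

```python
import numpy as np

def one_sigma_coverage(post_mean, post_var, y_true):
    """Fraction of true values lying within one posterior standard
    deviation of the posterior mean prediction."""
    sd = np.sqrt(post_var)
    return np.mean(np.abs(y_true - post_mean) <= sd)
```

A coverage of 1.0 over the validation samples corresponds to the situation shown in Fig. 6, where every true value lies in the shaded band.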

The posterior covariance matrix reflects the most probable correlation among outputs. In order to facilitate the comparison, this covariance matrix is converted to the correlation matrix in the following manner [35]:

$$\mathbf{Corr}(a, b) = \frac{\mathbf{Cov}(a, b)}{\sqrt{\mathbf{Cov}(a, a)\,\mathbf{Cov}(b, b)}}$$

where **Cov** and **Corr** represent the original covariance matrix and the resultant correlation matrix, respectively. The output correlation is then used to evaluate the MLMRGP framework by comparing it with the true correlation of the testing data sets. In this case, it is interesting to observe that the values in the correlation matrix are all close to 1, which indicates the high correlation of all response variables. We arbitrarily choose 4 of the 50 response variables for comparison, as shown in Fig. 7. The top-left and bottom-right numbers represent the true correlation of the testing data set and the correlation identified from meta-model prediction, respectively. Clearly, they match quite well, illustrating that MLMRGP is capable of accurately identifying the statistical correlation among different response variables.
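This covariance-to-correlation conversion is a direct elementwise normalization and can be implemented in a few lines:

```python
import numpy as np

def cov_to_corr(cov):
    """Convert a covariance matrix into a correlation matrix:
    Corr(a, b) = Cov(a, b) / sqrt(Cov(a, a) * Cov(b, b))."""
    d = np.sqrt(np.diag(cov))            # standard deviations of each output
    return cov / np.outer(d, d)          # normalize every entry by sigma_a * sigma_b
```

The diagonal of the result is identically 1, and each off-diagonal entry is the correlation coefficient between a pair of response variables.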

#### 3.2.3 Characterization of the Second-Mode Shape Variation.

Following a similar process, we analyze and interpret the meta-model prediction of the second-mode shape variation. We first analyze the order-reduction error of the entire low-fidelity testing data set as a whole, and the AMR (Eq. (15)) calculated is 6.91%. Utilizing the MLMRGP meta-model, we carry out emulation and the AMR is calculated as 3.39%. This indicates a significant improvement through the MLMRGP process due to the incorporation of a small amount (30) of high-fidelity data. The results are shown in Fig. 8. The largest error occurs at the 48th DOF with a mean error of around 9%. The 30th, 37th, and 40th DOFs also exhibit considerable errors. Recall Fig. 3. One may readily notice that the error magnitude is generally associated with the original response distribution. The larger the variance of the original output distribution, the larger the corresponding errors will likely be, simply because the variance reflects the sensitivity of the mode shape with respect to the input uncertainty parameters. Additionally, the low-fidelity data set of the second mode inherently has greater error than that of the first-mode shape, which can be seen in the deterministic analysis result (Fig. 2). The framework of MLMRGP allows us to tune/optimize the hyper-parameters to ensure prediction accuracy of multiple response variables (i.e., mode shape amplitudes of interest). It reduces large errors at certain DOFs while at the same time it may yield tradeoffs at some other DOFs. The final prediction errors reflect how the trained meta-model fits the testing data sets. While the facts mentioned above indeed pose a challenge for the mode shape amplitude prediction, the MLMRGP outperforms the Monte Carlo simulation utilizing Guyan order-reduction analysis with much higher accuracy.

Some example prediction instances are examined probabilistically as shown in Fig. 9. Once again, all true values are within the region between plus and minus one standard deviation. The result illustrates that large prediction error generally occurs with large variance of the predicted PDF, which reflects the confidence level of the predicted output. The identified correlation (bottom-right number) and the true correlation (top-left number) extracted from the testing data sets are put together for comparison in Fig. 10. The correlation among different outputs becomes more complicated due to the larger sensitivity of the second-mode shape with respect to the uncertainty parameters. The negative correlation here indicates a relationship between two response variables in which one variable increases as the other decreases. It can be observed that the general trend/pattern of correlation is well captured, which verifies that MLMRGP takes output correlation into consideration during emulation analysis.

#### 3.2.4 Effect of Training Data Set Size.

An important condition in the formulation of the MLMRGP framework (Sec. 2.2) is that the inputs of the high-fidelity training data set are a subset of the inputs of the low-fidelity training data set. In other words, within this case study setup, when the size of the low-fidelity data set remains the same (i.e., 300), the size of the high-fidelity training data set can be adjusted from 0 to 300. The high-fidelity training data serve as reliable evidence used to correct the meta-model error owing to the low-fidelity model truncation. Figure 11 shows the AMR results for the first two modes as we increase the high-fidelity training data size. Unsurprisingly, both show performance improvement as the high-fidelity training data size increases.

#### 3.2.5 Meta-Model Cross-Validation.

The effect of increasing the high-fidelity training data size indicated in the preceding sub-section is intuitive. It is also worth noting that the accuracy improvement may not be simply proportional to the training data size. Essentially, the performance has to do with whether the training data set captures the underlying features of the output variables. The training data set, on the other hand, is generated based on random inputs; when the training data set changes, one may expect the prediction performance to change. In order to examine how well a meta-model generalizes to new data sets and to avoid model under-fitting or overfitting, we apply cross-validation analysis. Particularly, here we use bootstrap sampling-based cross-validation, which allows random sampling with replacement [36]. We use the same sizes of low-fidelity and high-fidelity data sets, 300 and 30, respectively. Five emulations with different randomly selected training and testing data sets are implemented. Table 2 shows the AMR values of both mode shapes under the different emulations. It is found that the results under different emulations are quite consistent, showing the robustness of MLMRGP. Moreover, the mean of the AMR values using MLMRGP is always smaller than the AMR of the low-fidelity data evaluated as a whole, which demonstrates the effectiveness of MLMRGP.
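The bootstrap split generation can be sketched as below. The data set sizes follow the case study, but the pool size and the way the high-fidelity subset is drawn are assumptions for illustration; the only structural constraint enforced is that the high-fidelity training inputs form a subset of the low-fidelity training inputs, as required by the MLMRGP formulation.

```python
import numpy as np

def bootstrap_splits(n_pool, n_low, n_high, n_rep, seed=0):
    """Bootstrap sampling-based cross-validation splits: low-fidelity
    training indices are drawn with replacement from the sample pool, and
    the high-fidelity training indices are chosen as a subset of them."""
    rng = np.random.default_rng(seed)
    splits = []
    for _ in range(n_rep):
        low = rng.choice(n_pool, size=n_low, replace=True)        # with replacement
        high = rng.choice(np.unique(low), size=n_high, replace=False)
        splits.append((low, high))
    return splits

# Five emulations with 300 low-fidelity and 30 high-fidelity training
# samples drawn from a pool of 1000 Monte Carlo samples.
splits = bootstrap_splits(n_pool=1000, n_low=300, n_high=30, n_rep=5)
```

Each split would be used to retrain the meta-model, and the remaining pool samples serve as the testing set for that emulation.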

## 4 Conclusion

A new MLMRGP meta-modeling technique is developed in this research, aiming at uncertainty quantification of mode shape variation. This framework allows the usage of a small amount of high-fidelity data produced by full-scale finite element analysis together with a large amount of low-fidelity data produced by order-reduced model such as Guyan reduction as training data sets for meta-model establishment. This reduces significantly the computational cost needed for generating the training data. The new framework also yields the simultaneous prediction of mode shape amplitudes at different DOFs, thereby capturing their intrinsic correlations. Case studies using a benchmark structure indicate that the MLMRGP technique can effectively characterize the mode shape variations. The incorporation of a small amount of high-fidelity data can increase the prediction accuracy compared with using order-reduced data alone. This framework can be extended to general structural dynamic analysis concerning multiple output responses.

## Acknowledgment

This research is supported in part by the National Science Foundation under Grant No. CMMI-1825324.

### Appendix: Order-Reduced Models

This section outlines the mathematical formulations of Guyan reduction and fixed-interface component mode synthesis (CMS).

#### Guyan Reduction

In Guyan reduction, the structural DOFs are partitioned into master and slave DOFs, and the undamped free-vibration equation of motion is written in partitioned form as

$$\begin{bmatrix} \mathbf{M}_{mm} & \mathbf{M}_{ms} \\ \mathbf{M}_{sm} & \mathbf{M}_{ss} \end{bmatrix}\begin{Bmatrix} \ddot{\mathbf{x}}_m \\ \ddot{\mathbf{x}}_s \end{Bmatrix} + \begin{bmatrix} \mathbf{K}_{mm} & \mathbf{K}_{ms} \\ \mathbf{K}_{sm} & \mathbf{K}_{ss} \end{bmatrix}\begin{Bmatrix} \mathbf{x}_m \\ \mathbf{x}_s \end{Bmatrix} = \mathbf{0}$$

where subscripts *m* and *s* denote the master and slave DOFs, respectively. Neglecting the inertia terms, the second row in the above matrix equation yields

$$\mathbf{x}_s = -\mathbf{K}_{ss}^{-1}\mathbf{K}_{sm}\mathbf{x}_m$$

so the transformation between the original DOFs and the master DOFs is

$$\begin{Bmatrix} \mathbf{x}_m \\ \mathbf{x}_s \end{Bmatrix} = \begin{bmatrix} \mathbf{I} \\ -\mathbf{K}_{ss}^{-1}\mathbf{K}_{sm} \end{bmatrix}\mathbf{x}_m$$
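The Guyan reduction above can be sketched numerically as follows; this is a minimal NumPy sketch in which the DOF partitioning by index arrays is an implementation choice.

```python
import numpy as np

def guyan_reduce(K, M, master):
    """Guyan (static) reduction: build T = [I; -K_ss^{-1} K_sm] and
    condense the slave DOFs out of the stiffness and mass matrices."""
    n = K.shape[0]
    master = np.asarray(master)
    slave = np.setdiff1d(np.arange(n), master)
    Kss = K[np.ix_(slave, slave)]
    Ksm = K[np.ix_(slave, master)]
    T = np.zeros((n, master.size))
    T[master, np.arange(master.size)] = 1.0             # identity block on master DOFs
    T[np.ix_(slave, np.arange(master.size))] = -np.linalg.solve(Kss, Ksm)
    return T.T @ K @ T, T.T @ M @ T

# Two unit springs in series (fixed-free), master DOF at the free tip:
# the reduced stiffness recovers the exact series stiffness of 0.5.
K = np.array([[2.0, -1.0], [-1.0, 1.0]])
M = np.eye(2)
K_red, M_red = guyan_reduce(K, M, master=[1])
```

The reduced stiffness is exact for static loading on the master DOFs; the order-reduction error discussed in Sec. 3 comes from the neglected slave-DOF inertia.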

#### Fixed-Interface Component Mode Synthesis

For the *s*th substructure, the DOFs are divided into interior DOFs and interface DOFs (between adjacent substructures). Its equation of motion under the free vibration condition can be written as

$$\begin{bmatrix} \mathbf{M}_{ii}^{s} & \mathbf{M}_{ij}^{s} \\ \mathbf{M}_{ji}^{s} & \mathbf{M}_{jj}^{s} \end{bmatrix}\begin{Bmatrix} \ddot{\mathbf{z}}_i^{s} \\ \ddot{\mathbf{z}}_j^{s} \end{Bmatrix} + \begin{bmatrix} \mathbf{K}_{ii}^{s} & \mathbf{K}_{ij}^{s} \\ \mathbf{K}_{ji}^{s} & \mathbf{K}_{jj}^{s} \end{bmatrix}\begin{Bmatrix} \mathbf{z}_i^{s} \\ \mathbf{z}_j^{s} \end{Bmatrix} = \begin{Bmatrix} \mathbf{0} \\ \mathbf{f}_j^{s} \end{Bmatrix}$$

where subscripts *i* and *j* indicate the interior and interface DOFs, and $\mathbf{f}_j^{s}$ represents the internal force due to the neighboring substructures. In fixed-interface CMS, at the substructure level we let $\mathbf{z}_j^{s} = \mathbf{0}$ and subsequently solve the eigenvalue problem

$$\mathbf{K}_{ii}^{s}\,\boldsymbol{\phi}^{s} = \lambda^{s}\,\mathbf{M}_{ii}^{s}\,\boldsymbol{\phi}^{s}$$

to obtain the fixed-interface normal modes $\boldsymbol{\Phi}^{s}$ of the *s*th substructure. As a basic fixed-interface CMS, we apply a static condensation to take into account the coupling between the interface DOFs and the interior DOFs, with constraint modes $\boldsymbol{\Psi}^{s} = -(\mathbf{K}_{ii}^{s})^{-1}\mathbf{K}_{ij}^{s}$. The transformation between the original DOFs and the order-reduced DOFs for the *s*th substructure can then be expressed as

$$\begin{Bmatrix} \mathbf{z}_i^{s} \\ \mathbf{z}_j^{s} \end{Bmatrix} = \begin{bmatrix} \boldsymbol{\Phi}^{s} & \boldsymbol{\Psi}^{s} \\ \mathbf{0} & \mathbf{I} \end{bmatrix}\begin{Bmatrix} \mathbf{p}^{s} \\ \mathbf{z}_j^{s} \end{Bmatrix}$$

where $\mathbf{p}^{s}$ denotes the modal coordinates of the retained fixed-interface modes.
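The fixed-interface CMS transformation can be sketched as follows; this is a minimal sketch using SciPy's generalized symmetric eigensolver, in which the number of retained modes and the interior-then-interface DOF ordering are assumptions for illustration.

```python
import numpy as np
from scipy.linalg import eigh

def craig_bampton_T(K, M, interior, interface, n_modes):
    """Fixed-interface CMS transformation: retained fixed-interface
    normal modes Phi plus static constraint modes Psi = -K_ii^{-1} K_ij,
    with rows ordered interior DOFs first, then interface DOFs."""
    Kii = K[np.ix_(interior, interior)]
    Kij = K[np.ix_(interior, interface)]
    Mii = M[np.ix_(interior, interior)]
    _, Phi = eigh(Kii, Mii)               # fixed-interface eigenproblem
    Phi = Phi[:, :n_modes]                # keep the lowest modes
    Psi = -np.linalg.solve(Kii, Kij)      # constraint (static) modes
    ni, nj = len(interior), len(interface)
    T = np.zeros((ni + nj, n_modes + nj))
    T[:ni, :n_modes] = Phi
    T[:ni, n_modes:] = Psi
    T[ni:, n_modes:] = np.eye(nj)
    return T

# 3-DOF chain with interior DOFs {0, 1} and interface DOF {2}, one mode kept.
K = np.array([[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 1.0]])
M = np.eye(3)
T = craig_bampton_T(K, M, interior=[0, 1], interface=[2], n_modes=1)
```

Projecting `K` and `M` through `T` (i.e., `T.T @ K @ T`) yields the order-reduced substructure matrices used in assembly.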

## References

ANSYS® Academic Research Mechanical, Release 19.2, ANSYS, Inc., Canonsburg, PA.