# An approach using ensemble empirical mode decomposition to remove noise from prototypical observations on dam safety

- Huaizhi Su^{1,2} (corresponding author)
- Hao Li^{3}
- Zhexin Chen^{3}
- Zhiping Wen^{4}

**Received:** 14 September 2015

**Accepted:** 8 May 2016

**Published:** 17 May 2016

## Abstract

It is very important for dam safety control to identify dam behavior reasonably from prototypical observations of deformation, seepage, stress, etc. However, noise often corrupts the prototypical observations and must be removed from the data. Considering the nonlinear and non-stationary characteristics of data series with signal intermittency, an ensemble empirical mode decomposition (EEMD)-based method is presented to remove noise from prototypical observations on dam safety. Its basic principle and implementation process are discussed. The key parameters and rules, adapted to the noise removal requirements of prototypical observations on dam safety, are given. The displacement of an actual dam is taken as an example, and the noise removal capability of the EEMD-based method is assessed. The results indicate that removing noise from the displacement observations reflects the dam displacement features more clearly, and that the statistical model built on the noise-removed data series provides a more precise forecast of structural behavior.

### Keywords

Dam safety, Prototypical observations, Noise removal, Ensemble empirical mode decomposition

## Background

Due to its public and economic impacts and consequences, the safety of a dam is of high priority. Based on prototypical observations of the dam body, dam foundation, high slopes, the surrounding environment, and the impact of landslides (Pudasaini 2014; Kafle et al. 2016) and seepage (Pudasaini 2016) on the reservoir dam, mathematical, mechanical and artificial intelligence theories and methods are usually used to analyze and evaluate dam behavior. This is regarded as an effective approach to ensuring the service safety of dam engineering (Su et al. 2011). Noise, which can be caused by environmental, man-made and other uncertain factors, is an inevitable part of prototypical observations. Noisy observations sometimes fail to reflect the true characteristics of dam behavior, and the noise also reduces the precision of further data analysis. Smoothing or filtering methods are therefore usually adopted to remove noise from prototypical observations.

At present, wavelet methods are regarded as a powerful alternative tool for removing noise (Shark and Yu 2000; Athanasia and Theofanis 2011; Mohideen 2012). The wavelet coefficients of signal and noise have different characteristics at each wavelet scale. The appropriate wavelet basis function and decomposition layer number are determined according to analyzed signals. The reconstruction of decomposed signals is implemented to fulfill the noise removal. These methods have been widely applied to data pretreatment. However, it is well known that the basis function needs to be fixed in advance for implementing wavelet analysis. It is difficult to approximate accurately the local signal characteristics at different scales with the wavelet function, which is derived from basis function.

Huang et al. (1998) proposed the empirical mode decomposition (EMD) for time–frequency analysis of nonlinear and non-stationary time series. EMD-based noise removal has recently been used in many fields, such as biology, oceanography, medicine, acoustics and fault diagnosis (Huang et al. 1999; Liu et al. 2006; Lee et al. 2011; Park et al. 2011; Ahrabian et al. 2013; Moghtaderi et al. 2013). It does not require the basis function to be selected in advance and is more adaptive. However, when the signal is a superposition of an intermittent component and a continuous basic component, unexpected mode mixing arises during the decomposition. Frequent mode mixing prevents EMD from separating the intrinsic mode function (IMF) components effectively: a single IMF component consists of signals of widely disparate scales, or a signal of a similar scale resides in different IMF components. Mode mixing is often a consequence of signal intermittency, which can leave too few signal extreme points or distribute them unevenly. The upper and lower envelopes generated from these points are then a superposition of the intermittent signal envelope and the basic signal envelope, which not only causes serious aliasing in the time–frequency distribution but also makes the physical meaning of individual IMF components unclear.

To overcome the scale separation problem, Wu and Huang (2009) proposed the ensemble empirical mode decomposition (EEMD), which inherits the advantages of EMD. Exploiting the statistical characteristics of Gaussian white noise, namely its uniform frequency distribution, white noise is added to the original signal. This method solves the mode mixing problem caused by signal intermittency. In this paper, EEMD is introduced to reduce the noise level of prototypical observations on dam safety. The paper is organized as follows. First, the general principle and steps of EEMD are reviewed briefly in the “Ensemble empirical mode decomposition of nonlinear and non-stationary signal” section. Next, the EEMD-based noise removal process for prototypical observations on dam safety is presented and the algorithm is described in the “Noise removal of prototypical observations on dam safety” section. In the “Actual case analysis” section, the proposed method is applied to noise removal of prototypical observations on an actual dam and to statistical model construction. The validity of the proposed method is discussed by comparing the fitting and forecasting precision of statistical models before and after noise removal. Finally, the “Conclusions” section summarizes the work.

## Ensemble empirical mode decomposition of nonlinear and non-stationary signal

As an adaptive time–frequency data analysis method, EMD takes a nonlinear and non-stationary signal as integration of some intrinsic mode function (IMF) components. The signal is decomposed layer by layer according to the characteristic scale of signal extrema. A series of IMF components from high frequency to low frequency can be produced, and a residual can be obtained. The handled IMF components are chosen to implement signal reconstruction and fulfill noise removal.

For a signal \(x(t)\), all local extrema of \(x(t)\) are identified first. Cubic spline curves are fitted to the local maxima and to the local minima, respectively, to generate the upper and lower envelopes of \(x(t)\). Secondly, the mean of the upper and lower envelopes, \(m_{1}(t)\), is calculated. The mean \(m_{1}(t)\) is subtracted from \(x(t)\) and the differential signal, \(h_{1}(t) = x(t) - m_{1}(t)\), is obtained, where \(h_{1}(t)\) is a signal without low frequency. If \(h_{1}(t)\) satisfies the IMF condition, then \(h_{1}(t)\) is regarded as the first IMF component of the signal \(x(t)\). If not, the second sifting operation needs to be implemented, namely the above procedure is repeated for \(h_{1}(t)\), to obtain \(h_{11}(t) = h_{1}(t) - m_{11}(t)\). The sifting process is repeated \(j\) times, until \(h_{1j}(t) = h_{1(j-1)}(t) - m_{1j}(t)\) satisfies the IMF condition. \(h_{1j}(t)\) is regarded as the first IMF component of the signal \(x(t)\), namely \(c_{1}(t) = h_{1j}(t)\). Let \(r_{1}(t) = x(t) - c_{1}(t)\). The component \(c_{1}(t)\) is extracted from \(x(t)\) and a residual signal \(r_{1}(t)\), in which the high frequency component is filtered, is obtained. For \(r_{1}(t)\), the above sifting operation is implemented again. Similarly, the second IMF component \(c_{2}(t)\) of the signal \(x(t)\) and the residual signal \(r_{2}(t)\) are extracted. This sifting procedure is repeated until the stopping criterion of signal decomposition is satisfied. Once this is achieved, the signal \(x(t)\) is decomposed adaptively into \(n\) IMF components from high frequency to low frequency, namely \(c_{1}(t), c_{2}(t), \ldots, c_{n}(t)\), and a residual \(r_{n}(t)\):

\[
x(t) = \sum_{i=1}^{n} c_{i}(t) + r_{n}(t)
\]
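The sifting step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the envelopes are piecewise-linear with flat ends instead of cubic splines (to keep the sketch free of dependencies), the sifting count is fixed rather than tested against the IMF condition, and the names `local_extrema`, `envelope`, `sift_once` and `first_imf` are hypothetical.

```python
def local_extrema(x):
    """Return (maxima, minima) index lists of strict interior local extrema."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] > x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] < x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, idx):
    """Envelope through the points (i, x[i]) for i in idx.

    Assumption: linear interpolation between knots and flat extension at
    both ends; the text itself uses cubic splines.
    """
    n = len(x)
    env = [float(x[idx[0]])] * n              # flat before the first knot
    for a, b in zip(idx, idx[1:]):
        for t in range(a, b + 1):
            w = (t - a) / (b - a)
            env[t] = (1 - w) * x[a] + w * x[b]
    for t in range(idx[-1], n):               # flat after the last knot
        env[t] = float(x[idx[-1]])
    return env


def sift_once(x):
    """One sifting step: h(t) = x(t) - m(t), m = mean of the two envelopes."""
    maxima, minima = local_extrema(x)
    if not maxima or not minima:
        raise ValueError("too few extrema to build envelopes")
    upper = envelope(x, maxima)
    lower = envelope(x, minima)
    return [xi - (u + l) / 2 for xi, u, l in zip(x, upper, lower)]


def first_imf(x, n_sift=10):
    """Apply sift_once a fixed number of times and return c_1(t)."""
    h = list(x)
    for _ in range(n_sift):
        h = sift_once(h)
    return h
```

Subtracting the result of `first_imf(x)` from `x` sample by sample gives the residual \(r_{1}(t)\), to which the same procedure would be applied again to extract the next component.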

According to the characteristic scale of signal extrema, the components of the signal \(x(t)\) are decomposed successively from high frequency to low frequency. The residual \(r_{n}(t)\) is the signal trend component, which represents the average trend of the signal \(x(t)\). Thus the EMD algorithm has good filtering properties: the decomposition can be regarded as a filtering process in which the characteristic scale of signal extrema is the measure criterion. Furthermore, the algorithm decomposes a signal based on the signal's own information, and no basis function needs to be fixed in advance. To alleviate the mode mixing problem of EMD, a noise-assisted data analysis method, namely the ensemble EMD (EEMD), was proposed. The principle of the EEMD is as follows. It defines the true IMF components as the mean of an ensemble of trials, each consisting of the original signal plus white noise of finite amplitude. The added white noise populates the whole time–frequency space uniformly with constituting components of different scales. When the signal is added to this uniformly distributed white background, the signal components of different scales are automatically projected onto the proper reference scales established by the white noise, so the intermittent component of the signal becomes continuous. By adding finite noise, the EEMD largely eliminates the mode mixing problem (Taraphder and Chakraverty 2015).

For a signal \(x(t)\), the effective algorithm of EEMD can be summarized as follows. Firstly, set the total number \(N\) of added white noise sequences and the noise amplitude \(\varepsilon\). Secondly, add the random Gaussian white noise sequence \(\omega_{k}(t)\) to the original signal \(x(t)\) to obtain the noise-added signal \(x_{k}(t)\), namely

\[
x_{k}(t) = x(t) + \omega_{k}(t)
\]

Then, decompose \(x_{k}(t)\) with EMD to obtain \(n\) IMF components \(c_{ik}(t)\), \(i = 1, 2, \ldots, n\), where \(c_{ik}(t)\) represents the \(i\)th IMF component obtained with EMD of the signal to which the \(k\)th white noise sequence has been added. Lastly, calculate the ensemble mean of each IMF component:

\[
c_{i}(t) = \frac{1}{N} \sum_{k=1}^{N} c_{ik}(t)
\]
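The ensemble loop summarized above can be sketched as follows. This is a hedged illustration, not a reference implementation: `eemd` and its parameters are hypothetical names, and `decompose` stands for any EMD routine supplied by the caller, assumed to return the same number of equally long components on every trial (for example by fixing the number of IMFs in advance).

```python
import random
import statistics


def eemd(x, decompose, n_trials=200, eps=0.2, seed=0):
    """Ensemble mean of the components returned by `decompose`.

    x         : list of floats, the original series x(t)
    decompose : hypothetical callable, list -> list of component lists
    n_trials  : number N of white noise sequences
    eps       : noise amplitude as a fraction of the standard deviation of x
    """
    rng = random.Random(seed)
    sigma = eps * statistics.pstdev(x)
    sums = None
    for _ in range(n_trials):
        # x_k(t) = x(t) + w_k(t), with w_k Gaussian white noise
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        comps = decompose(noisy)
        if sums is None:
            sums = [[0.0] * len(x) for _ in comps]
        for acc, comp in zip(sums, comps):
            for t, value in enumerate(comp):
                acc[t] += value
    # c_i(t) = (1/N) * sum over k of c_ik(t)
    return [[value / n_trials for value in acc] for acc in sums]
```

Passing the decomposition routine as an argument keeps the averaging logic independent of how the sifting itself is implemented.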

## Noise removal of prototypical observations on dam safety

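After the EEMD of an observation series, the noise-removed series \(x'(t)\) is reconstructed as

\[
x'(t) = \sum_{i=1}^{k} c_{i}'(t) + \sum_{i=k+1}^{n} c_{i}(t) + r_{n}(t) \tag{4}
\]

where \(k\) denotes the number of IMF components which are chosen to implement the noise removal, \(c_{i}(t)\) is the IMF component with noise, \(c_{i}'(t)\) is the noise-removed IMF component, and \(r_{n}(t)\) is a residual.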

### Total number of added white noise and its amplitude

The amplitude and the number of the added white noise sequences are related to the decomposition error by

\[
\varepsilon_{n} = \frac{\varepsilon}{\sqrt{N}}
\]

where \(\varepsilon_{n}\) is the standard deviation representing the difference between the input signal and the final reconstructed result of IMF components, \(\varepsilon\) denotes the amplitude of added noise, and \(N\) is the total number of added noise sequences. If the amplitude of added noise is too small, the added noise cannot affect the expected selection of extreme points. Furthermore, once the amplitude is appropriate and the number of noise sequences is sufficient, further increasing either has little additional effect on the decomposition results. It is suggested that the amplitude of added white noise be taken as 0.2 times the standard deviation of the signal (Wu and Huang 2009). For a signal dominated by high frequency components, a small noise amplitude should be chosen. In general, when the number of added noise sequences reaches 100 or 200, a satisfactory result can be obtained.
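As a quick numerical check of this guidance, assume the relation \(\varepsilon_{n} = \varepsilon/\sqrt{N}\) given by Wu and Huang (2009) and take \(\varepsilon = 0.2\), expressed in units of the signal's standard deviation:

\[
N = 100:\ \varepsilon_{n} = \frac{0.2}{\sqrt{100}} = 0.02, \qquad N = 200:\ \varepsilon_{n} = \frac{0.2}{\sqrt{200}} \approx 0.014
\]

Doubling the ensemble size from 100 to 200 therefore lowers the residual noise level by only about 30 %, which is consistent with the diminishing effect of further increases noted above.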

### Stopping criterion of sifting process

In fact, the EMD is a process of sifting IMF components. The stopping criterion of the sifting process controls the number of sifting iterations used to generate one IMF component, namely the fulfillment of the two conditions in the IMF definition. Too strict a stopping criterion causes over-sifting of IMF components and eliminates amplitude changes. Too loose a stopping criterion leads to under-sifting: the riding waves cannot be eliminated and the condition of local zero mean cannot be satisfied. The conventional stopping criteria include the standard deviation criterion and the overall local combination rule (Huang et al. 1998, 1999). However, with these stopping criteria the decomposition process is very sensitive to local disturbance of the signal, so the decomposition results of target signals with different local disturbances are very different and irregular. These conventional criteria are therefore not applicable to the EEMD algorithm, in which white noise is added repeatedly. To overcome this problem, Wu and Huang (2004) proposed fixing the number of sifting iterations, and showed that the upper and lower envelopes of an IMF component are almost symmetrical about the zero axis when the number of sifting iterations reaches 10.

### Stopping condition of decomposition process

For the EMD algorithm, the decomposition process can be terminated when any of the following conditions is satisfied: the \(n\)th IMF component \(c_{n}(t)\) or the residual \(r_{n}(t)\) is less than a preset value, or the residual \(r_{n}(t)\) can be regarded as a monotonic function. It is known that, for white noise populating the whole time–frequency space uniformly with constituting components of different scales, EMD acts as a dyadic filter bank. The white noise is decomposed into a series of IMF components with different average periods, and the average period of any IMF is about double that of the previous IMF (Flandrin et al. 2004; Wu and Huang 2004). The average period is the total number of data points, namely the signal length, divided by the number of peaks, or local maxima. Therefore, for the EEMD algorithm, in which the added white noise populates the whole time–frequency space uniformly, the total number \(n\) of IMF components in a complete decomposition approximates \(\log_{2} M - 1\), where \(M\) represents the signal length. In practice, other appropriate conditions can be adopted to terminate the decomposition process according to actual requirements. For example, the decomposition can be stopped when the number of extreme points falls below a certain number, or when the number of extracted IMF components reaches a certain number.
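The stopping conditions listed above can be combined into a single check, sketched below. The function names, the tolerance `tol`, and the rounding of \(\log_{2} M\) down to an integer are assumptions made for illustration only.

```python
import math


def expected_imf_count(signal_length):
    """Approximate IMF count for a complete decomposition: log2(M) - 1.

    Assumption: log2(M) is rounded down to an integer.
    """
    return int(math.log2(signal_length)) - 1


def is_monotonic(r):
    """True if the series never increases or never decreases."""
    diffs = [b - a for a, b in zip(r, r[1:])]
    return all(d >= 0 for d in diffs) or all(d <= 0 for d in diffs)


def should_stop(imf, residual, n_found, max_imfs, tol=1e-8):
    """True when any stopping condition of the decomposition holds.

    imf      : the most recently extracted IMF component c_n(t)
    residual : the current residual r_n(t)
    n_found  : number of IMF components extracted so far
    max_imfs : cap on the number of IMF components
    tol      : hypothetical preset value for "small" components
    """
    small = (max(abs(v) for v in imf) < tol
             or max(abs(v) for v in residual) < tol)
    return small or is_monotonic(residual) or n_found >= max_imfs
```

Capping the number of components with `max_imfs` also guarantees that every noise-added trial in an ensemble yields the same number of IMFs, which makes the ensemble averaging straightforward.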

### Endpoint effect

### EEMD-based noise removal process of prototypical observations on dam safety

#### Implement EEMD

The amplitude of the added white noise is taken as 0.2 times the standard deviation of the prototypical observation series. The number of added noise sequences is set to 200 and the number of sifting iterations to 10. The decomposition process is terminated when the number \(n\) of extracted IMF components reaches \(\log_{2} M - 4\), where \(M\) is the length of the observation series. The EEMD of the prototypical observation series on dam safety is thus completed and \(n\) IMF components are obtained.
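These settings can be collected in a small configuration object, sketched here. The class and function names are hypothetical, and two details are assumptions rather than statements of the method: the population standard deviation is used, and \(\log_{2} M\) is rounded down.

```python
import math
import statistics
from dataclasses import dataclass


@dataclass(frozen=True)
class EemdSettings:
    noise_std: float   # standard deviation of the added white noise
    n_trials: int      # number of added white noise sequences
    n_sift: int        # fixed number of sifting iterations per IMF
    max_imfs: int      # stop once this many IMF components are extracted


def dam_settings(series):
    """Settings following the values quoted in the text for a given series."""
    m = len(series)
    return EemdSettings(
        noise_std=0.2 * statistics.pstdev(series),  # 0.2 x std of the series
        n_trials=200,
        n_sift=10,
        max_imfs=int(math.log2(m)) - 4,             # assumed rounding: floor
    )
```

For a series of 2048 observations, for instance, this rule caps the decomposition at 7 IMF components.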

#### Select the IMF components to remove noise

Let \(E_{i}\) denote the energy density of the \(i\)th IMF component \(c_{i}\) of white noise, \(M\) the signal length, \(\bar{T}_{i}\) the average period of \(c_{i}\), and \(M_{\max}\) the number of maxima of \(c_{i}\), so that \(\bar{T}_{i} = M / M_{\max}\).

The index \(R_{k}\) is defined in terms of \(E_{k}\) and \(\bar{T}_{k}\), which represent, respectively, the energy density and the average period of the \(k\)th IMF component \(c_{k}\) obtained by implementing the EEMD of the prototypical observation series on dam safety.

When \(R_{k} \ge C\), where \(C\) is usually between 2 and 3, most of the noise is contained in the first \(k\) IMF components, and noise removal needs to be implemented for these \(k\) IMF components.
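The two quantities that enter the selection index can be computed as sketched below. The average period follows the definition given earlier (signal length divided by the number of local maxima); taking the energy density as the mean square of the samples is an assumption, and both function names are hypothetical. The sketch stops at these two ingredients and does not attempt to compute \(R_{k}\) itself.

```python
def energy_density(c):
    """Energy density of a component, assumed to be its mean square value."""
    return sum(v * v for v in c) / len(c)


def average_period(c):
    """Signal length divided by the number of strict interior local maxima."""
    n_max = sum(
        1 for i in range(1, len(c) - 1) if c[i - 1] < c[i] > c[i + 1]
    )
    if n_max == 0:
        raise ValueError("component has no local maximum")
    return len(c) / n_max


def energy_period_product(c):
    """Product of energy density and average period for one component."""
    return energy_density(c) * average_period(c)
```

Computing `energy_period_product` for each IMF in turn gives the sequence of values from which a departure from white-noise behaviour can be judged.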

#### Implement the noise removal with the threshold method

The threshold method is applied to each selected IMF component \(c_{i}(t)\), where \(c_{i}'(t)\) represents the noise-removed IMF component and \(\lambda_{i}\) denotes the threshold of the IMF component \(c_{i}(t)\).

When \(i \le 2\), the noise energy of the corresponding IMF component is larger and the signal-to-noise ratio is lower. The threshold \(\lambda_{i}\) is determined from \(m\), the median absolute deviation of \(c_{1}(t)\), and \(M\), the sequence length.

When \(2 < i \le k\), the useful signal energy of the corresponding IMF component is close to the noise energy, so the threshold should be reduced and a smaller \(\lambda_{i}\) is taken.
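A thresholding step of this kind can be sketched as follows. The exact threshold formula and thresholding function are not reproduced here, so the sketch relies on loudly stated assumptions: the widely used universal threshold \(\sigma\sqrt{2 \ln M}\) with \(\sigma = m / 0.6745\), and soft thresholding. The function names are hypothetical.

```python
import math
import statistics


def universal_threshold(c1, length):
    """Assumed threshold: (m / 0.6745) * sqrt(2 ln M).

    m is the median absolute deviation of the first IMF component c1 and
    `length` is the sequence length M. The constant 0.6745 and the
    square-root factor are assumptions, not taken from the text.
    """
    med = statistics.median(c1)
    m = statistics.median([abs(v - med) for v in c1])
    return (m / 0.6745) * math.sqrt(2.0 * math.log(length))


def soft_threshold(c, lam):
    """Shrink each sample toward zero by lam; samples within lam become 0.

    Soft thresholding is an assumed choice of thresholding function.
    """
    return [math.copysign(max(abs(v) - lam, 0.0), v) for v in c]
```

A reduced threshold for the later components would simply be passed to `soft_threshold` as a smaller `lam`.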

#### Reconstruct the signal

Equation (4) is applied to the signal reconstruction. The reconstructed results *x*′(*t*) form a noise-removed observation series of dam safety.
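The reconstruction amounts to summing the noise-removed components, the untouched components and the residual. A minimal sketch follows, in which `reconstruct` is a hypothetical name and `denoise` is a caller-supplied function (for example a thresholding rule) that receives the 1-based component index and the component.

```python
def reconstruct(imfs, residual, k, denoise):
    """Sum denoised IMFs 1..k, untouched IMFs k+1..n and the residual.

    imfs     : list of IMF components c_1(t), ..., c_n(t), each a list
    residual : the residual r_n(t)
    k        : number of leading IMF components to denoise
    denoise  : hypothetical callable (index, component) -> component
    """
    out = list(residual)
    for i, comp in enumerate(imfs, start=1):
        used = denoise(i, comp) if i <= k else comp
        for t, value in enumerate(used):
            out[t] += value
    return out
```

With `k = 0` the function returns the plain sum of all components and the residual, which should reproduce the input series of the decomposition.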

## Actual case analysis

Seven IMF components, \(c_{1}, c_{2}, \ldots, c_{7}\), and one residual \(r_{7}\) are obtained, as shown in Fig. 6.

For \(k = 3\), \(R_{k} = 3.9 > C\) (\(C = 3\)). So the first 3 IMF components are selected for the noise removal operation, with thresholds of 0.0468, 0.0468 and 0.0338, respectively. The sum of the noise-removed components, the other IMF components and the residual, namely the noise-removed observation series, is shown in Fig. 7.

Comparison between Figs. 5 and 7 shows that after the EEMD-based noise removal is implemented, most of the fluctuations with small amplitude appearing in the original observation series have been filtered. The time-varying feature of horizontal displacement can be reflected more clearly.

The function \(F\) is adopted to build the statistical model (Su et al. 2012, 2015), where \(H\) represents the upstream reservoir water depth, \(t\) denotes the cumulative number of days from the beginning day to the monitoring day, and \(\theta = t/100\).

\(y'\) denotes the model calculation, and \(a_{0}\), \(a_{i}\), \(b_{1i}\), \(b_{2i}\), \(d_{1}\), \(d_{2}\) represent the regression coefficients.

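The model performance is assessed with \(r^{2}\) and the following mean square error (MSE):

\[
\mathrm{MSE} = \frac{1}{l} \sum_{i=1}^{l} \left( y_{i} - y_{i}' \right)^{2}
\]

where \(y_{i}\) and \(y_{i}'\) denote the dam displacement observation and the model calculation, respectively, and \(l\) represents the number of measured values.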
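Both measures are simple to compute, as the sketch below shows. The MSE follows the usual definition over the \(l\) measured values; interpreting \(r^{2}\) as the coefficient of determination is an assumption, and the function names are hypothetical.

```python
def mse(observed, predicted):
    """Mean square error between observations y_i and model values y_i'."""
    return sum(
        (y - yp) ** 2 for y, yp in zip(observed, predicted)
    ) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot.

    Assumption: this is the meaning of r^2 used for model assessment.
    """
    mean = sum(observed) / len(observed)
    ss_res = sum((y - yp) ** 2 for y, yp in zip(observed, predicted))
    ss_tot = sum((y - mean) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot
```

Evaluating both functions separately on the fitting period and on the forecasting period gives the four figures used to compare the models.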

For the statistical model built on the original observation series of horizontal displacement, the fitting MSE is 0.0051 and the forecasting MSE is 0.0073, while the fitting \(r^{2}\) is 0.9536 and the forecasting \(r^{2}\) is 0.9250. For the statistical model built on the noise-removed observation series of horizontal displacement, the fitting MSE is 0.0050 and the forecasting MSE is 0.0071, while the fitting \(r^{2}\) is 0.9861 and the forecasting \(r^{2}\) is 0.9568. Noise removal thus improves the performance of the model.

## Conclusions

Considering the nonlinear and non-stationary characteristics of prototypical observations on dam safety, an EEMD-based method is introduced to remove noise from original observation series with a certain degree of intermittency. Its basic principle and implementation process are presented. To meet the noise removal requirements of prototypical observations on dam safety, the key control parameters of the EEMD algorithm are given and some improvement strategies are discussed.

The application example illustrates that the proposed method can filter out the small-amplitude fluctuations appearing in the prototypical observation series on dam safety. The statistical model built on the noise-removed observations performs better in forecasting dam behavior. Owing to its ability to handle the mode mixing and endpoint effect problems, the EEMD-based method is well suited to removing noise from prototypical observations on dam safety, particularly those with a certain degree of intermittency.

## Declarations

### Authors’ contributions

HS and ZC drafted the manuscript. HL and ZW made some revisions of the manuscript. All authors read and approved the final manuscript.

### Acknowledgements

This research has been partially supported by National Natural Science Foundation of China (SN: 51579083, 41323001, 51139001, 51479054), Jiangsu Natural Science Foundation (SN: BK2012036), the Doctoral Program of Higher Education of China (SN: 20130094110010), Open Foundation of State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering (SN: 20145027612), the Fundamental Research Funds for the Central Universities (Grant No. 2015B25414), Research Program on Natural Science for Colleges and Universities in Jiangsu Province (SN: 14KJB520016), and Science and Technology Innovation Foundation by Nanjing Institute of Technology (SN: CKJ2010010).

### Competing interests

The authors declare that they have no competing interests.

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

## Authors’ Affiliations

## References

- Ahrabian A, Rehman N, Mandic DP (2013) Bivariate empirical mode decomposition for unbalanced real-world signals. IEEE Signal Proc Lett 20(3):245–248
- Athanasia P, Theofanis S (2011) On the estimation of the function and its derivatives in nonparametric regression: a bayesian testimation approach. Sankhya A 73(2):231–244
- Flandrin P, Rilling G, Goncalves P (2004) Empirical mode decomposition as a filter bank. IEEE Signal Proc Lett 11(2):112–114
- Huang NE, Shen Z, Long SR, Wu MC, Shih HH, Zheng Q, Yen NC, Tung CC, Liu H (1998) The empirical mode decomposition and Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc R Soc Lond 454:903–995
- Huang NE, Shen Z, Long SR (1999) A new view of nonlinear water waves: the Hilbert Spectrum. Annu Rev Fluid Mech 31(1):417–457
- Kafle J, Pokhrel PR, Khattri KB, Kattel P, Tuladhar BM, Pudasaini SP (2016) Landslide-generated tsunami and particle transport in mountain lakes and reservoirs. Ann Glaciol 57(71):232–244
- Lee YS, Tsakirtzis S, Vakakis AF, Bergman LA, McFarland DM (2011) A time-domain nonlinear system identification method based on multiscale dynamic partitions. Meccanica 46(4):625–649
- Liu B, Riemenschneider S, Xu Y (2006) Gearbox fault diagnosis using empirical mode decomposition and Hilbert spectrum. Mech Syst Signal Process 20(3):718–734
- Moghtaderi A, Flandrin P, Borgnat P (2013) Trend filtering via empirical mode decompositions. Comput Stat Data Anal 58:114–126
- Mohideen SK (2012) Denosing of images using complex wavelet transform. Int J Adv Sci Tech Res 1(2):176–184
- Park C, Looney D, Kidmose P, Ungstrup M, Mandic DP (2011) Time-frequency analysis of EEG asymmetry using bivariate empirical mode decomposition. IEEE Trans Neural Syst Rehabil 19(4):366–373
- Pudasaini SP (2014) Dynamics of submarine debris flow and tsunami. Acta Mech 225:2423–2434
- Pudasaini SP (2016) A novel description of fluid flow in porous and debris materials. Eng Geol 202:62–73
- Shark LK, Yu C (2000) Denoising by optimal fuzzy thresholding in wavelet domain. Electron Lett 36(6):581–582
- Su HZ, Wen ZP, Wu ZR (2011) Study on an intelligent inference engine in early-warning system of dam health. Water Resour Manag 25(6):1545–1563
- Su HZ, Hu J, Wu ZR (2012) A study of safety evaluation and early-warning method for dam global behavior. Struct Health Monit 11(3):269–279
- Su HZ, Wen ZP, Sun XR, Yang M (2015) Time-varying identification model for dam behavior considering structural reinforcement. Struct Saf 57:1–7
- Taraphder A, Chakraverty BK (2015) Early damage detection of roller bearings using wavelet packet decomposition, ensemble empirical mode decomposition and support vector machine. Meccanica 50(3):865–874
- Wu ZH, Huang NE (2004) A study of the characteristics of white noise using the empirical mode decomposition. Proc R Soc Lond 460(2046):1597–1611
- Wu ZH, Huang NE (2009) Ensemble empirical mode decomposition: a noise assisted data analysis method. Adv Adapt Data Anal 1(1):1–41