In estimating the linear prediction coefficients for an autoregressive spectral model, the Yule-Walker equations are often used. In the case of additive white Gaussian noise (AWGN), a typical parameter compensation method uses a minimal set of Yule-Walker equation evaluations and removes a noise variance estimate from the principal diagonal of the autocorrelation matrix. Due to a potential over-subtraction of the noise variance, however, this method may not retain the symmetric Toeplitz structure of the autocorrelation matrix and thereby may not guarantee a positive-definite matrix estimate. As a result, a significant decrease in estimation performance may occur. To counteract this problem, a parametric modelling method for speech contaminated by AWGN, assuming that the noise variance can be estimated, is presented herein. It is shown that by combining a suitable noise variance estimator with an efficient iterative scheme, a significant improvement in modelling performance can be achieved. The noise variance is estimated from the least squares analysis of an overdetermined set of p lower-order Yule-Walker equations. Simulation results indicate that the proposed method provides better parameter estimates than the standard Least Mean Squares (LMS) technique, which uses a minimal set of evaluations for determining the spectral parameters.

The linear prediction (LP) analysis has been extensively used for estimating vocal tract characteristics of speech. In performing this analysis, a standard p-th order autoregressive (AR) process is typically used to model the spectral envelope of the speech signal. Under noise-free conditions, or at very high SNRs, the conventional methods, such as the Yule-Walker, the Burg, the covariance, and the modified covariance methods [

As the suitable model for an AR (p) plus noise process is an ARMA (p, p) model spectrum [

The second class includes those methods that are designed to mitigate the bias error induced by the noise. Generally, these noise-compensation methods [

The main contributions of this paper are twofold. First, the variance of the noise, which is assumed to be white Gaussian, is evaluated from the least squares analysis of an overdetermined set of p lower-order Yule-Walker equations. Second, the convergence condition involves both the magnitude of the reflection coefficients and the smallest eigenvalue extracted from the autocorrelation matrix. The paper is structured as follows. The next section presents a brief description of the overdetermined modelling approach and the singular value decomposition (SVD) method involved in estimating the noise variance. Section 3 provides a thorough description and derivation of the noise variance estimator. The noise variance is then used by an efficient iterative scheme to compute the magnitude of noise that should be removed from the zero-order autocorrelation lag prior to the determination of the LP parameters. This iterative scheme uses both the smallest eigenvalue of the autocorrelation matrix and the magnitude of the extracted reflection coefficients as the convergence criteria. Provided that the expected noise variance is smaller than the smallest eigenvalue, or that the magnitudes of the extracted reflection coefficients are smaller than one, stable and accurate prediction parameters are guaranteed. This iterative scheme is described in Section 4. Section 5 presents the simulation results supporting the analysis and providing a comparison with the standard LMS technique. Section 6 concludes the paper.

This section briefly discusses the overdetermined modelling and the singular value decomposition (SVD) methods that will be used in the next section to derive the noise variance estimator. The discussion will be conducted at an informal level to make the material as accessible as possible.

An ARMA model of order (p, q) is considered to have a frequency characterization which is the composite of both forward and backward AR models. This model assumes that the time series {x_{n}} can be modelled according to the following recursive relationship

where a_{k} and b_{k} are the k-th parameters of the all-pole and all-zero models, respectively, {u_{n}} is a normalized white noise with distribution _{k} and b_{k} parameters of this rational model. In this method, only 2p correlation coefficients are involved. It was shown by many researchers, such as Cadzow [

Every autocorrelation lag provides information about the underlying data sequence, and the value of this information is inversely proportional to the lag order. The zero-order lag is proportional to the signal power. The first lag measures the correlation between two copies of the signal offset by one sample in time, and so on. As the lag order increases, the information provided becomes less valuable. It is worth noting that estimating a correct model order (i.e., number of equations) is an important issue in signal modelling. Increasing the number of equations is particularly valuable for narrowband processes. In this case, a slow decay of the correlation sequence is expected, with relatively large values assigned to the higher lag coefficients. For broadband processes, a fast decay of the correlation sequence is expected, with little information provided by the higher lag coefficients.
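The difference in decay rate between narrowband and broadband processes can be illustrated numerically. The following is a minimal sketch and not part of the original experiments: it assumes two first-order AR processes with hypothetical pole positions (0.95 and 0.30) and uses the biased sample autocorrelation; the function names are chosen for illustration only.

```python
import numpy as np

def simulate_ar1(a, n, rng):
    """Simulate x[t] = a * x[t-1] + u[t] with unit-variance white Gaussian u."""
    u = rng.standard_normal(n)
    x = np.empty(n)
    x[0] = u[0]
    for t in range(1, n):
        x[t] = a * x[t - 1] + u[t]
    return x

def normalized_autocorr(x, max_lag):
    """Biased sample autocorrelation for lags 0..max_lag, normalized by lag 0."""
    n = len(x)
    r = np.array([np.dot(x[: n - k], x[k:]) / n for k in range(max_lag + 1)])
    return r / r[0]

rng = np.random.default_rng(0)
# Hypothetical pole positions: close to the unit circle (narrowband, low-pass)
# versus far from it (broadband).
rho_narrow = normalized_autocorr(simulate_ar1(0.95, 20000, rng), 10)
rho_broad = normalized_autocorr(simulate_ar1(0.30, 20000, rng), 10)

print("normalized lag 10, narrowband:", rho_narrow[10])
print("normalized lag 10, broadband: ", rho_broad[10])
```

Under these assumptions, the tenth lag of the narrowband process is still a sizeable fraction of the zero-order lag, whereas that of the broadband process is close to zero, which is why additional higher-order equations mainly benefit the narrowband case.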

To extract the prerequisite model order values from the overdetermined set of linear equations, the SVD method is often used [

where U and V are m × m and n × n unitary matrices, respectively, p = min(m, n), and S is a rectangular diagonal matrix of the same size as A with real nonnegative entries, {d_{i}}. These diagonal entries, called the singular values of A, are ordered in descending order of magnitude (i.e., d_{1} ≥ d_{2} ≥ ... ≥ d_{p} ≥ 0) and are the nonnegative square roots of the eigenvalues of AA^{T} or A^{T}A. Thus, the rank of A equals the number of non-zero singular values. The columns of U and V are the eigenvectors of AA^{T} and A^{T}A, respectively.

The pseudo-inverse A^{#} of Equation (2) is related to the SVD of the matrix A by the formula A^{#} = V S^{#} U^{T}, where S^{#} is obtained from S by substituting each positive diagonal entry by its reciprocal. For the SVD computation of a rectangular matrix, highly accurate and numerically stable algorithms were developed [...]. The introduction of a small amount of noise tends to change the situation. Some or likely all of the formerly zero singular values of matrix S become small nonzero values. These small diagonal values of S become large diagonal values of S^{#}, which leads to large perturbations in the estimated LP parameters. This undesirable effect can be minimized by replacing matrix A in the case of noisy data by a lower rank approximation Â prior to computation of the LP parameters.
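The sensitivity of the pseudo-inverse to small singular values can be checked with a short numerical sketch. The example below is illustrative only: the matrix sizes, the perturbation level and the helper name `truncated_pinv` are assumptions, not values taken from this work.

```python
import numpy as np

def truncated_pinv(A, rank):
    """Pseudo-inverse of A built from its `rank` largest singular values only."""
    U, d, Vt = np.linalg.svd(A, full_matrices=False)
    d_inv = np.zeros_like(d)
    d_inv[:rank] = 1.0 / d[:rank]          # reciprocal of the retained singular values
    return Vt.T @ np.diag(d_inv) @ U.T

rng = np.random.default_rng(1)
# Hypothetical exact rank-2 matrix (8 x 4) and a slightly perturbed copy.
A = rng.standard_normal((8, 2)) @ rng.standard_normal((2, 4))
A_noisy = A + 1e-6 * rng.standard_normal(A.shape)

# Keeping all four singular values inverts the tiny noise-induced ones.
full_norm = np.linalg.norm(truncated_pinv(A_noisy, rank=4))
# Keeping only the two dominant ones corresponds to the lower rank approximation.
trunc_norm = np.linalg.norm(truncated_pinv(A_noisy, rank=2))

print("norm of full pseudo-inverse:     ", full_norm)
print("norm of truncated pseudo-inverse:", trunc_norm)
```

In this sketch the full pseudo-inverse of the perturbed matrix has a norm several orders of magnitude larger than the truncated one, which is the mechanism by which small perturbations in the data produce large perturbations in the solution.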

In this section, we first evaluate the extended-order autocorrelation matrix from the higher-order Yule-Walker equations (HOYWE). Then, the SVD method outlined in the previous section is used to compensate for the trivial singular values induced by the AWGN. The resultant AR spectral parameters are subsequently used in the derivation of the noise variance. In the following, the notations a^{l} and a^{h} denote the AR spectral estimates obtained from the lower-order Yule-Walker equations (LOYWE) and the HOYWE, respectively, as they provide a more compact description.

As previously stated, the overdetermined modeling method uses more correlations than the minimal number required by the predictor model to establish the normal equations. In this method, the q non-singular higher-order Yule-Walker equations are solved in a least squares sense as specified by

where the first component _{x} (n) satisfies the relation:

3N/4 provides higher resolution and stable estimates. Accordingly, there exists an inherent trade-off between estimation accuracy and model complexity. Expressing Equation (3) in a compact matrix form gives a set of q linear equations in the p unknowns,

where a^{h} is the p × 1 AR parameter vector, r_{x} is a q × 1 column vector, and R_{x} denotes the q × p ARMA autocorrelation matrix. They are given, respectively, by

Since the autocorrelation lags are unknown, then they may be replaced with estimated autocorrelations using a sample realization of the AR process. However, there is an important issue in AR modelling related to the selection of a suitable autocorrelation lag estimation scheme. The typical unbiased and biased autocorrelation estimates are among the most commonly used techniques in spectral estimation. According to [_{n} is available, the autocorrelation lags can be generated according to

The required ARMA model’s AR parameters a^{h} can be found by solving the q overdetermined linear equations given in (4). Under noise-free conditions, where almost exact autocorrelation lag estimates can be found, the rank of the autocorrelation matrix R_{x} will be equal to p. Due to noise and a finite number of data samples, however, there will be inherent statistical errors that affect the autocorrelation lag estimates. As a result, R_{x} will be of full rank, leading to inaccurate AR parameter estimates. It has been shown that a suitable approach to deal with this problem consists in evaluating the rank-p approximation of the matrix R_{x}.
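The benefit of the higher-order equations can be sketched with exact autocorrelation lags. The example below is a minimal illustration under stated assumptions, not the implementation used in this work: it assumes the sign convention x[n] = -sum_k a_k x[n-k] + u[n], a hypothetical AR(2) parameter vector, unit noise variance, and higher-order rows taken at lags p+1 to p+q so that the contaminated zero-order lag never appears in them.

```python
import numpy as np

def ar_autocorr(a, max_lag, n_terms=4000):
    """Lags r(0..max_lag) of x[n] = -sum_k a[k-1] x[n-k] + u[n] (unit-variance u),
    computed from the truncated impulse response of the all-pole filter."""
    h = np.zeros(n_terms)
    for n in range(n_terms):
        acc = 1.0 if n == 0 else 0.0
        for k in range(1, min(len(a), n) + 1):
            acc -= a[k - 1] * h[n - k]
        h[n] = acc
    return np.array([np.dot(h[: n_terms - m], h[m:]) for m in range(max_lag + 1)])

def yule_walker_rows(r, p, first_lag, n_rows):
    """Rows of r(m) + sum_k a_k r(|m-k|) = 0 for m = first_lag .. first_lag+n_rows-1."""
    lags = range(first_lag, first_lag + n_rows)
    R = np.array([[r[abs(m - k)] for k in range(1, p + 1)] for m in lags])
    rhs = -np.array([r[m] for m in lags])
    return R, rhs

a_true = np.array([-1.2728, 0.81])   # hypothetical AR(2), poles of radius 0.9
p, q, noise_var = 2, 10, 1.0         # hypothetical orders and noise variance

r_noisy = ar_autocorr(a_true, p + q)
r_noisy[0] += noise_var              # white noise only raises the zero-order lag

# Minimal set (lags 1..p): involves the contaminated zero-order lag.
R_lo, rhs_lo = yule_walker_rows(r_noisy, p, 1, p)
a_lo = np.linalg.solve(R_lo, rhs_lo)

# Overdetermined higher-order set (lags p+1..p+q), solved in the least squares sense.
R_hi, rhs_hi = yule_walker_rows(r_noisy, p, p + 1, q)
a_hi, *_ = np.linalg.lstsq(R_hi, rhs_hi, rcond=None)

print("minimal set:     ", a_lo)
print("higher-order set:", a_hi)
```

With exact lags, the higher-order solution recovers the assumed parameters while the minimal set is visibly biased by the noise term. With lags estimated from finite data the higher-order matrix becomes full rank, which motivates the rank reduction discussed here.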

As stated previously, SVD is a particularly attractive method for numerically evaluating the rank of a matrix. Let l denote the rank of the underlying matrix R_{x} and

rank [R_{x}] = l). Having calculated the singular value matrix S from (2), the closest rank-k approximation of the ARMA matrix R_{x} can be evaluated by the Frobenius norm minimization

According to (9), the extent to which the rank-k approximation approaches R_{x} depends on the sum of the (l - k) smallest singular values squared. To measure the validity of this approximation in a way that is independent of the size of the matrix R_{x}, the best rank-k approximation can be determined by evaluating the following normalized ratio

The normalized ratio in (10) approaches its maximum value of one as k approaches l. Upon solving Equations (9) and (10), the required ARMA model’s order p can readily be determined: it is set equal to the smallest value of k for which Q_{k} is close to one. Thus, the q × p matrix of lower rank p that best approximates the underlying ARMA matrix R_{x} is generated by setting to zero all but the p largest singular values of the matrix S, that is,
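The rank selection step can be sketched as follows. This is an illustration under assumptions rather than the exact procedure of this work: the normalized ratio is assumed to take the commonly used form Q_k = sqrt(sum_{i<=k} d_i^2 / sum_{i<=l} d_i^2), the threshold of 0.995 is hypothetical, and the test matrix is synthetic with prescribed singular values.

```python
import numpy as np

def normalized_ratio(d):
    """Q_k for k = 1..l, assuming Q_k = sqrt(sum_{i<=k} d_i^2 / sum_{i<=l} d_i^2)."""
    energy = np.cumsum(np.asarray(d, dtype=float) ** 2)
    return np.sqrt(energy / energy[-1])

def select_rank(d, threshold=0.995):
    """Smallest k whose ratio reaches the (hypothetical) threshold."""
    return int(np.argmax(normalized_ratio(d) >= threshold)) + 1

def low_rank_approx(A, k):
    """Keep the k largest singular values of A and set the others to zero."""
    U, d, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :k] * d[:k]) @ Vt[:k, :]

rng = np.random.default_rng(2)
# Synthetic 12 x 6 matrix: three dominant singular values, three noise-like ones.
U, _ = np.linalg.qr(rng.standard_normal((12, 6)))
V, _ = np.linalg.qr(rng.standard_normal((6, 6)))
d_true = np.array([5.0, 3.0, 2.0, 0.01, 0.01, 0.005])
A = (U * d_true) @ V.T

d = np.linalg.svd(A, compute_uv=False)
k = select_rank(d)
err = np.linalg.norm(A - low_rank_approx(A, k))   # Frobenius norm of the residual

print("selected rank:", k)
print("approximation error:", err)
```

The Frobenius error of the truncated matrix equals the square root of the sum of the discarded squared singular values, which is the quantity the normalized ratio is built from.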

Using this rank approximation approach, the first p singular values of the diagonal matrix S represent the useful data of the matrix R_{x} whereas the last (l - p) singular values are attributed to noise. Combining Equations (11) and (4) and solving for the ARMA model’s AR parameters gives

where the matrix

It is worth noting that the above equation is derived from the least squares analysis of the overdetermined set of p lower-order Yule-Walker equations.
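As a hedged illustration of how a noise variance estimate can follow from a least squares analysis of the p lower-order equations, the sketch below uses one estimator consistent with that description; it is not claimed to be the exact expression derived in this work. It assumes the sign convention x[n] = -sum_k a_k x[n-k] + u[n], a hypothetical AR(2) parameter vector standing in for a^{h}, and exact autocorrelation lags in which white noise only raises the zero-order lag.

```python
import numpy as np

a = np.array([-1.2728, 0.81])   # hypothetical AR(2) parameters (stand-in for a^h)
p = len(a)
noise_var = 0.5                 # hypothetical true noise variance

# Exact clean lags r_x(0..2) of the AR(2) process with unit-variance excitation.
r0 = (1 + a[1]) / ((1 - a[1]) * ((1 + a[1]) ** 2 - a[0] ** 2))
r1 = -a[0] * r0 / (1 + a[1])
r2 = -a[0] * r1 - a[1] * r0
r_y = np.array([r0 + noise_var, r1, r2])   # noisy lags: only lag 0 changes

# Lower-order Yule-Walker residuals e_m = r_y(m) + sum_k a_k r_y(|m-k|), m = 1..p.
R_y = np.array([[r_y[abs(m - k)] for k in range(1, p + 1)] for m in range(1, p + 1)])
e = r_y[1 : p + 1] + R_y @ a

# Each residual equals sigma^2 * a_m, i.e. p equations in the single unknown
# sigma^2; the least squares solution is a^T e / a^T a.
sigma2_hat = float(a @ e / (a @ a))

print("estimated noise variance:", sigma2_hat)
```

With exact lags and exact AR parameters this estimator returns the assumed noise variance. In practice both are estimated, so the result is only approximate and may over-estimate the noise, which is the situation the iterative scheme is meant to handle.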

In this section, the estimated noise variance is used in an efficient iterative scheme, which is essentially a gradient method, to determine the AR parameters of a lower-order ARMA model with AWGN. The iterative scheme rests on the well-known assumption that, in the correlation domain, only the zero-order lag is affected by white noise, while the remaining lags are unaffected. The non-singularity of the ARMA matrix is evaluated by considering both the smallest eigenvalue of the autocorrelation matrix and the magnitude of the estimated reflection coefficients.

In order to gain insight into the convergence behavior of the iterative scheme, we define the modulus of the (j + 1)-th reflection coefficient (0 ≤ j < p) of the matrix R_{x} and the smallest eigenvalue (in magnitude) as Γ_{j+1} and λ_{min}, respectively. This scheme determines the proportion of noise that should be removed from the zero-order lag so that the matrix R_{x} is non-singular. Note that the positive-definite property prevents R_{x} from becoming singular. The singularity of the matrix R_{x} can be assessed by comparing the estimated noise variance to λ_{min} (first criterion) and/or Γ_{j+1} for all j to one (second criterion). Provided that the estimated noise variance is smaller than the smallest eigenvalue, or that the estimated reflection coefficients are all less than one in magnitude, accurate and stable LP parameters are obtained. It is worth noting that the second criterion is considered only when the first one is not satisfied. In this case, the update equation for the entries of the underlying matrix at the i-th iteration is given by

where γ is an iteration step size that provides a reasonable tradeoff between the rate of convergence and the estimation accuracy of the autocorrelation matrix [
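The two convergence criteria can be sketched in code. The example below is a rough illustration only: the update rule (shrinking the subtracted amount by a factor of one minus the step size at each iteration) is a hypothetical stand-in and not the update equation of this work, the autocorrelation lags are hypothetical, and all function names are chosen for illustration.

```python
import numpy as np

def toeplitz_from_lags(r):
    """Symmetric Toeplitz matrix whose (i, j) entry is r[|i - j|]."""
    idx = np.arange(len(r))
    return r[np.abs(idx[:, None] - idx[None, :])]

def levinson(r, p):
    """Levinson-Durbin recursion: returns (a_1..a_p, reflection coefficients)."""
    a = np.zeros(p + 1)
    a[0] = 1.0
    refl = np.full(p, np.inf)          # inf marks a breakdown of the recursion
    err = r[0]
    for j in range(p):
        if err <= 0:
            break
        k = -np.dot(a[: j + 1], r[j + 1 : 0 : -1]) / err
        refl[j] = k
        a[: j + 2] = a[: j + 2] + k * a[: j + 2][::-1]
        err *= 1.0 - k * k
    return a[1:], refl

def compensate(r_noisy, noise_var, p, step=0.05, max_iter=500):
    """Subtract noise from lag 0 until one of the two criteria is satisfied."""
    lam_min = np.linalg.eigvalsh(toeplitz_from_lags(r_noisy[: p + 1]))[0]
    amount = noise_var
    for _ in range(max_iter):
        r_c = r_noisy[: p + 1].copy()
        r_c[0] -= amount
        a, refl = levinson(r_c, p)
        # First criterion: amount below the smallest eigenvalue.
        # Second criterion: all reflection coefficients below one in magnitude.
        if amount < lam_min or np.all(np.abs(refl) < 1.0):
            return a, amount
        amount *= 1.0 - step           # hypothetical update, NOT the paper's equation
    raise RuntimeError("no admissible noise level found")

# Hypothetical lags: an AR(2) process plus unit-variance white noise at lag 0.
r_noisy = np.array([6.752, 4.045, 0.489])
a_ok, used_ok = compensate(r_noisy, 1.0, p=2)       # accurate variance estimate
a_over, used_over = compensate(r_noisy, 3.0, p=2)   # over-estimated variance

print("accurate estimate: removed", used_ok, "->", a_ok)
print("over-estimate:     removed", used_over, "->", a_over)
```

In this sketch an accurate variance estimate passes the first criterion immediately, while an over-estimate is reduced until the compensated autocorrelation matrix is positive definite again.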

This section presents computer simulation results which support the analysis presented above. The objective of these simulations was to evaluate the effectiveness of the proposed method by comparing its spectral estimation performance with that of the conventional LMS method (i.e., minimal set of LOYWE evaluations). The first set of experiments involves using LP spectral analysis for estimating the AR model’s parameters. In the second set of experiments, the statistical bias in estimating the AR spectral parameters is evaluated.

Designed to be equally intelligible in noise, the sentence “A boy ran down the path” taken from the Hearing in Noise Test (HINT) database [

As this work is directed towards parametric modeling and analysis of speech, three segments corresponding to the vowels /a/, /e/ and /o/ were selected from the above sentence. Sampled values of white Gaussian noise were generated independently by simulation and added acoustically to achieve an input SNR of roughly 0 dB. These segments were subsequently analyzed with N = 256 data points, p = 15 and p + q = N/2. The iteration step size was optimized through simulation and then set to 0.05. No pre-emphasis was applied to the selected segments. In the following, the proposed method is called noise compensation based SVD (NCSVD).

Figures 3-5 show the estimated power spectra resulting from the NCSVD and the conventional LMS method, obtained by averaging 50 realizations for each of the three natural vowels considered in this analysis. We observe from these figures that the formant peaks in the NCSVD spectral estimates are located more accurately and shift towards their correct positions. In contrast, the conventional LMS method yields poor spectral estimates at lower SNR levels, as expected.

For each of the three vowels, 50 independent realizations were generated and 300 data points were used from steady-state data to estimate the AR spectral parameters for each realization. The statistical bias in estimating the AR parameters is computed by ensemble-averaging over all realizations. It is evaluated for the first four parameters, which are related to the first two formant peaks of each of the three vowels.

Vowel | Method | a1 | a2 | a3 | a4 |
---|---|---|---|---|---|
/a/ | LOYWE | -1.1266 | 0.0488 | 0.0535 | 0.0106 |
/a/ | NCSVD | 0.0981 | -0.0049 | 0.0055 | -0.0013 |
/e/ | LOYWE | -1.3539 | 0.2061 | 0.0598 | 0.0862 |
/e/ | NCSVD | 0.1327 | -0.0305 | 0.0059 | -0.0009 |
/o/ | LOYWE | -1.2199 | 0.1742 | -0.1393 | 0.3920 |
/o/ | NCSVD | 0.1027 | -0.0191 | 0.0199 | -0.0643 |

For the purpose of AR spectral estimation in additive white Gaussian noise, a robust parametric modelling method was presented which combines an appropriate noise variance estimator with an efficient iterative scheme. The noise variance estimator was derived from the combination of the ODNE method and the truncated SVD of the autocorrelation matrix. The method provides both accurate and stable LP parameters. It was found from computer simulations that its spectral performance was better than that of the conventional LMS approach in terms of formant peak tracking. This improvement, however, was achieved at the cost of the additional computational effort required for calculating the SVD. Fortunately, there exist efficient and computationally tractable algorithms for speeding up the SVD calculation. Moreover, this extra computational effort should not be a problem given the increasing use of parallel computing architectures in signal processing applications.

The authors gratefully acknowledge Teodora Oliveira who participated in a valuable manner in the preparation of this manuscript.