Estimation of sinusoid and exponential signals using eigen-analysis of covariance matrices.
Many functions, used to model physical systems in engineering or physics, arise as the solution to a homogeneous differential equation. Let μ(t) be the concentration or intensity of the physical process at time t. It is often the case that there are physical reasons for supposing that μ(t) satisfies a differential equation like
(1)   ∑_{k=1}^{p+1} ξk D^(k−1) μ(t) = 0   for all t
where D is the differential operator and the ξk are unknown coefficients. A typical solution to (1) is a sum of exponential functions
(2)   μ(t) = ∑_{j=1}^{p} αj exp(βj t)
where the βj are rate constants (usually negative) and the αj are amplitudes. Such a solution is appropriate for a transient system which dies to zero as time goes on, such as the concentration of an intermediary compound in a chemical reaction. Another typical solution to (1) is a sum of sinusoids
(3)   μ(t) = ∑_{j=1}^{p/2} αj sin(βj t + φj)
Such a solution is persistent and periodic, and is appropriate for electrical or physical systems driven by persistent regular oscillating forces. Signals in speech recognition are often of this type. A third common solution is the damped sinusoid,
(4)   μ(t) = ∑_{j=1}^{q} αj exp(βj t) sin(ωj t)
which combines transient and periodic behaviour. An example would be a pendulum which swings with an exponentially decreasing amplitude.
In practice we don't observe μ(t) exactly, because there is always experimental error. Instead we observe yi = μ(ti) + εi, where the εi are random observation errors. If the errors are approximately Gaussian, then we will try to estimate or identify the unknown parameters in the exponential or sinusoid signal by minimizing the sum of squared errors,
∑_{i=1}^{n} [yi − μ(ti)]²
This is a nonlinear least squares problem in the unknown parameters αj and βj. The basic idea of Prony estimation is to represent the signal μ(t) not in terms of the αj and βj but in terms of the coefficients of the differential equation (1). The coefficients are identified by computing the eigenvectors of a suitably calculated covariance matrix.
Prony's method is a technique for extracting the sinusoid or exponential signals by solving a set of linear equations for the coefficients of the recurrence equation that the signals satisfy. It is closely related to Pisarenko's method, which finds the smallest eigenvalue of an estimated covariance matrix.
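In outline, the classical computation can be sketched in a few lines of numpy (the function name `prony` and the sample values below are ours, for illustration only): solve the linear recurrence for its coefficients, take the roots of the characteristic polynomial, then recover the amplitudes by linear least squares.

```python
import numpy as np

def prony(y, p):
    """Classical Prony fit of p exponentials to 2p equally spaced samples.

    Returns (amplitudes, roots) such that y[t] ≈ sum_j amplitudes[j] * roots[j]**t.
    """
    y = np.asarray(y, dtype=float)
    # Step 1: solve the exact linear recurrence
    #   y[t] = a1*y[t-1] + ... + ap*y[t-p]  for t = p, ..., 2p-1.
    A = np.column_stack([y[p - k:2 * p - k] for k in range(1, p + 1)])
    a = np.linalg.solve(A, y[p:2 * p])
    # Step 2: roots of the characteristic polynomial z^p - a1*z^(p-1) - ... - ap
    # give the per-sample growth factors exp(beta_j * dt).
    roots = np.roots(np.concatenate(([1.0], -a)))
    # Step 3: amplitudes by linear least squares on the Vandermonde matrix
    # with entries roots[j]**t.
    V = np.vander(roots, N=2 * p, increasing=True).T
    amp, *_ = np.linalg.lstsq(V, y[:2 * p], rcond=None)
    return amp, roots
```

With noisy data the recurrence in step 1 holds only approximately, which is the source of the poor noise performance discussed below.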
The classical method of Count de Prony models a sequence of 2p observations made at equally spaced times by a linear combination of p exponential functions. Prony's ingenious method converted the problem to a system of linear equations. See the MacTutor biography of Baron Gaspard Riche de Prony. The original reference is
de Prony, Baron Gaspard Riche (1795). Essai expérimental et analytique: sur les lois de la dilatabilité de fluides élastiques et sur celles de la force expansive de la vapeur de l'alkool, à différentes températures. Journal de l'École Polytechnique, volume 1, cahier 22, 24-76.
A contemporary treatment of modern Prony methods can be found in
Marple, S. L. (1987). Digital Spectral Analysis with Applications. Prentice-Hall.
Unfortunately, Prony's method is well known to perform poorly when the signal is embedded in noise; Kahn et al (1992) show that it is actually inconsistent. The Pisarenko form of the method is consistent but inefficient for estimating sinusoid signals, and inconsistent for estimating damped sinusoids or exponential signals.
A modified Prony algorithm that is equivalent to maximum likelihood estimation for Gaussian noise was originated by Osborne (1975). It was generalized in Smyth (1985) and Osborne and Smyth (1991) to estimate any function which satisfies a difference equation with coefficients linear and homogeneous in the parameters. Osborne and Smyth (1991) considered in detail the special case of rational function fitting, and proved that the algorithm is asymptotically stable in that case.
The modified Prony algorithm for exponential fitting will estimate, for fixed p, any function μ(t) that solves a constant coefficient differential equation (1). Perturbed observations, yi = μ(ti) + εi , are made at equi-spaced times ti, i = 1, ..., n, where the εi are independent with mean zero and variance σ2. The solutions to (1) include complex exponentials, damped and undamped sinusoids and real exponentials, depending on the roots of the polynomial with the ξk as coefficients. The modified Prony algorithm has the great practical advantage that it will estimate any of these functions according to which best fits the available observations.
The modified Prony algorithm uses the fact that the μ(ti) satisfy an exact difference equation when the ti are equally spaced. The algorithm directly estimates the coefficients, γk say, of this difference equation. The residual sum of squares after estimating the αj can be written in terms of the γk. The derivative with respect to γ = (γ1, ..., γp+1)^T can then be written as 2B(γ)γ, where B is a symmetric matrix function of γ. The modified Prony algorithm solves the eigenproblem B(γ)γ = λγ at λ = 0 by a fixed point iteration in which γ^(k+1) is the eigenvector of B(γ^(k)) with eigenvalue nearest zero. The eigenvalue λ is the Lagrange multiplier for the scale of γ in the homogeneous difference equation. Inverse iteration proves very suitable for the actual computations.
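The eigen-computation at the heart of the iteration can be sketched as follows. The construction of B(γ) is specific to the Osborne and Smyth derivation and is represented here only by a user-supplied `build_B` function, so this is a structural sketch of the fixed point iteration, not the published algorithm.

```python
import numpy as np

def eigvec_nearest_zero(B, n_iter=50):
    """Inverse iteration: unit eigenvector of symmetric B whose eigenvalue
    is nearest zero.  Repeatedly solving B x = v amplifies the component of
    v along the eigenvector with the smallest-magnitude eigenvalue."""
    v = np.ones(B.shape[0]) / np.sqrt(B.shape[0])
    for _ in range(n_iter):
        x = np.linalg.solve(B, v)
        v = x / np.linalg.norm(x)
    return v

def modified_prony_iteration(build_B, gamma0, n_outer=30):
    """Fixed point iteration gamma^(k+1) = eigenvector of B(gamma^(k))
    with eigenvalue nearest zero.  build_B maps gamma -> symmetric matrix
    (its construction, from the profiled sum of squares, is not shown)."""
    gamma = np.asarray(gamma0, dtype=float)
    for _ in range(n_outer):
        gamma = eigvec_nearest_zero(build_B(gamma))
    return gamma
```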
Although the algorithm estimates all functions in the same way, the practical considerations and asymptotic arguments differ depending on whether the signals are periodic or transient, real or complex. The algorithm has been applied to real exponential signals by Osborne and Smyth (1991, 1995), to real sinusoidal signals by Mackisack et al (1991) and Kahn et al (1992) and to exponentials with imaginary exponents in complex noise by Kundu (1993).
When the polynomial with coefficients ξk has real distinct roots, the solution to (1) is a sum of real exponential functions (2). If the rate constants βj are negative, this represents a transient function, i.e., one which decays to zero as time goes on. This represents the outcome of a chemical experiment or other process in which the reaction completes in a finite amount of time.
Real exponential fitting is one of the most important, difficult and frequently occurring problems of applied data analysis. Applications include radioactive decay, compartment models and atmospheric transfer functions. Estimation of the αj and βj is well known to be numerically difficult. General purpose algorithms often have great difficulty converging to a minimum of the sum of squares. This can be caused by difficulty in choosing initial values, by ill-conditioning when two or more βj are close, and by other less important difficulties associated with the fact that the ordering of the βj is arbitrary. The modified Prony algorithm solves the problem of ordering the βj and is relatively insensitive to starting values. It also solves the ill-conditioning problem as far as convergence of the algorithm is concerned, although it may return a pair of damped sinusoids in place of two exponentials which are coalescing.
In some applications the restriction to positive coefficients αj is natural. A convex cone characterization is then possible, and special algorithms have been proposed. We prefer to treat the general problem with freely varying coefficients since this is appropriate for most compartment models. A common attempt to reduce the difficulty of the general problem has been to treat it as a separable regression, i.e., to estimate the coefficients by linear least squares conditional on the rate constants βj. Another approach is that of Ross (Nonlinear Regression, 1990, Section 3.1.4), who suggests that the coefficients of the differential equation (1) comprise a more "stable" parametrization of the problem than do the parameters of (2). Both of these strategies are part of the modified Prony algorithm.
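The separable regression idea can be sketched directly: for trial rate constants βj the amplitudes αj enter linearly, so they are profiled out by linear least squares, leaving a sum of squares that depends on the βj alone (function name ours, for illustration).

```python
import numpy as np

def profiled_ss(beta, t, y):
    """Residual sum of squares for a sum of exponentials after profiling
    out the amplitudes: given rate constants beta, the alphas solve a
    linear least squares problem with design columns exp(beta_j * t)."""
    X = np.exp(np.outer(t, beta))          # column j is exp(beta[j] * t)
    alpha, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ alpha
    return float(r @ r), alpha
```

Minimizing the profiled sum of squares over the βj alone reduces the dimension of the nonlinear search from 2p parameters to p.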
When the polynomial with coefficients ξk has distinct complex roots on the unit circle, the solution to (1) is a sum of sinusoidal signals (3) with p/2 distinct frequencies. This represents a function which is periodic and which can be observed on an ongoing basis.
The frequency estimation problem is mathematically similar to the real exponential problem, but has its own difficulties and features. The least squares frequency estimators are known to be consistent at rate O(n^−3/2) while the coefficient estimators are consistent at rate O(n^−1/2). The sum of squares objective function is not convex as a function of the frequencies; rather it has local minima which are approximately 2π/n apart. General purpose least squares algorithms have no trouble converging when used to fit sinusoidal models to data, but they typically converge to a local minimum rather than to the global minimum of the sum of squares. To achieve efficient estimation of the frequencies it is necessary to resolve the frequencies to the correct global minimum. One algorithm which achieves this is the ORA Prony algorithm of Kahn et al (1992). The Prony methods can be constrained to give periodic solutions when the solution is known a priori to be a sum of sinusoids. See Kannan and Kundu (1994), Kundu and Kannan (1997) and Smyth (1998).
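For a single sinusoid, one straightforward way to resolve the global minimum is to note that the amplitude and phase enter linearly through a·cos(ωt) + b·sin(ωt), profile them out, and scan a frequency grid much finer than the 2π/n spacing of the local minima. This is a brute-force sketch with our own function name, not the ORA algorithm of Kahn et al.

```python
import numpy as np

def best_frequency(y, n_grid=2048):
    """Global frequency search for one sinusoid by grid scan.

    For each candidate omega, the amplitude and phase are profiled out by
    linear least squares on cos(omega*t) and sin(omega*t); the grid step is
    far smaller than the 2*pi/n spacing of the local minima, so the global
    minimum of the sum of squares is located reliably."""
    t = np.arange(len(y))
    best_ss, best_w = np.inf, None
    for w in np.linspace(0.05, np.pi - 0.05, n_grid):
        X = np.column_stack([np.cos(w * t), np.sin(w * t)])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        ss = float(np.sum((y - X @ coef) ** 2))
        if ss < best_ss:
            best_ss, best_w = ss, w
    return best_w
```

The grid value can then be refined by a local least squares iteration, which converges to the correct minimum because the starting value is already in its basin.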
Another method which can be used to handle the non-convexity of the sum of squares is to use elemental sets. An elemental set is the minimum number of observations sufficient to identify the parameters in the model, 3p/2 for the frequency estimation problem. The elemental set method uses Prony's method to identify the frequencies and coefficients for a large number of randomly selected elemental sets. This methodology not only allows the global minimum of the sum of squares to be resolved, but also provides a way to handle outliers in the data. Smyth and Hawkins (1997, 1999) show how to obtain frequency estimators with both high efficiency and high breakdown properties.
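For a single sinusoid (p = 2) an elemental set is three points, and three consecutive noiseless samples satisfy the exact difference equation y[t−1] + y[t+1] = 2 cos(ω) y[t]. A minimal sketch of the idea follows; it uses consecutive triples and keeps the candidate frequency with the smallest full-sample sum of squares, and is only an illustration, not the Smyth and Hawkins estimator.

```python
import numpy as np

rng = np.random.default_rng(0)

def elemental_frequency(y, n_sets=200):
    """Elemental-set frequency estimate for one sinusoid.

    Each randomly chosen consecutive triple gives a Prony-style candidate
    cos(omega) = (y[t-1] + y[t+1]) / (2*y[t]); candidates are ranked by the
    full-sample residual sum of squares with amplitude and phase profiled
    out, which both resolves the global minimum and downweights triples
    contaminated by outliers."""
    t = np.arange(len(y))
    best_ss, best_w = np.inf, None
    for _ in range(n_sets):
        i = rng.integers(1, len(y) - 1)
        if abs(y[i]) < 1e-9:
            continue                        # triple does not identify omega
        c = (y[i - 1] + y[i + 1]) / (2 * y[i])
        if not -1.0 < c < 1.0:
            continue                        # no real frequency for this triple
        w = np.arccos(c)
        X = np.column_stack([np.cos(w * t), np.sin(w * t)])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        ss = float(np.sum((y - X @ coef) ** 2))
        if ss < best_ss:
            best_ss, best_w = ss, w
    return best_w
```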