Best Linear Unbiased Estimation in Linear Models
Simo Puntanen1
University of Tampere, Finland
George P. H. Styan2
McGill University, Montréal, Canada
Keywords and Phrases: Best linear unbiased, BLUE, BLUP, Gauss-Markov Theorem, Generalized inverse, Ordinary least squares, OLSE.
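In this article we consider the general linear model (Gauss-Markov model)
$$ y = X\beta + \varepsilon, $$
or in short
$$ \mathcal{M} = \{ y, \, X\beta, \, \sigma^2 V \}, $$
where $X$ is a known $n \times p$ model matrix, the vector $y$ is an observable $n$-dimensional random vector, $\beta$ is a $p \times 1$ vector of unknown parameters, and $\varepsilon$ is an unobservable vector of random errors with expectation $E(\varepsilon) = 0$ and covariance matrix $\mathrm{cov}(\varepsilon) = \sigma^2 V$, where $\sigma^2 > 0$ is an unknown constant. The nonnegative definite (possibly singular) matrix $V$ is known. In our considerations $\sigma^2$ has no role and hence we may put $\sigma^2 = 1$.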
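As regards the notation, we will use the symbols $A'$, $A^{-}$, $A^{+}$, $\mathcal{C}(A)$, $\mathcal{C}(A)^{\perp}$, and $\mathcal{N}(A)$ to denote, respectively, the transpose, a generalized inverse, the Moore-Penrose inverse, the column space, the orthogonal complement of the column space, and the null space, of the matrix $A$. By $(A : B)$ we denote the partitioned matrix with $A$ and $B$ as submatrices. By $A^{\perp}$ we denote any matrix satisfying $\mathcal{C}(A^{\perp}) = \mathcal{N}(A') = \mathcal{C}(A)^{\perp}$. Furthermore, we will write $P_A = AA^{+} = A(A'A)^{-}A'$ to denote the orthogonal projector (with respect to the standard inner product) onto $\mathcal{C}(A)$. In particular, we denote $H = P_X$ and $M = I_n - H$. One choice for $X^{\perp}$ is of course the projector $M$.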
Let $K'\beta$ be a given vector of parametric functions specified by $K' \in \mathbb{R}^{q \times p}$. Our object is to find a (homogeneous) linear estimator $Ay$ which would provide an unbiased and in some sense ``best'' estimator for $K'\beta$ under the model $\mathcal{M}$. However, not all parametric functions have linear unbiased estimators; those which have are called estimable parametric functions, and then there exists a matrix $A$ such that
$$ E(Ay) = AX\beta = K'\beta \quad \text{for all } \beta \in \mathbb{R}^{p}. $$
Hence $K'\beta$ is estimable if and only if there exists a matrix $A$ such that $K' = AX$, i.e., $\mathcal{C}(K) \subset \mathcal{C}(X')$.
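The ordinary least squares estimator of $K'\beta$ is defined as $\mathrm{OLSE}(K'\beta) = K'\hat\beta$, where $\hat\beta$ is any solution to the normal equation $X'X\beta = X'y$; hence $\hat\beta$ minimizes $(y - X\beta)'(y - X\beta)$ and it can be expressed as $\hat\beta = (X'X)^{-}X'y$, while $X\hat\beta = Hy$. Now the condition $\mathcal{C}(K) \subset \mathcal{C}(X')$ guarantees that $K'\hat\beta$ is unique, even though $\hat\beta$ may not be unique.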
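The uniqueness claim can be illustrated numerically. The following is a minimal sketch, not part of the original article: it assumes NumPy, a hypothetical rank-deficient design matrix (an intercept plus two group indicators), and a helper `is_estimable` introduced here only for illustration. Two different solutions of the normal equation give the same value for an estimable function, but different values for a non-estimable one.

```python
import numpy as np

rng = np.random.default_rng(0)

def pinv(A):
    # Moore-Penrose inverse with an explicit cutoff for numerically zero singular values.
    return np.linalg.pinv(A, rcond=1e-10)

# Hypothetical design: column 1 = column 2 + column 3, so rank(X) = 2 < p = 3.
X = np.array([[1., 1., 0.],
              [1., 1., 0.],
              [1., 0., 1.],
              [1., 0., 1.]])
y = rng.normal(size=4)

# Two different solutions of the normal equation X'X b = X'y.
b1 = pinv(X.T @ X) @ X.T @ y
null_dir = np.array([1., -1., -1.])      # X @ null_dir = 0
b2 = b1 + 2.5 * null_dir

def is_estimable(K, X, tol=1e-10):
    """K'beta is estimable iff C(K) is contained in C(X'), i.e. rank(X' : K) == rank(X')."""
    r_joint = np.linalg.matrix_rank(np.hstack([X.T, K]), tol=tol)
    return r_joint == np.linalg.matrix_rank(X.T, tol=tol)

K_est = np.array([[0.], [1.], [-1.]])    # group difference: estimable
K_non = np.array([[0.], [1.], [0.]])     # a single group effect: not estimable

H = X @ pinv(X)                          # orthogonal projector onto C(X)
```

Here `K_est.T @ b1` and `K_est.T @ b2` agree, and `X @ b1` equals `H @ y`, whereas `K_non.T @ b1` and `K_non.T @ b2` differ.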
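The expectation $X\beta$ is trivially estimable and $Gy$ is unbiased for $X\beta$ whenever $GX = X$. An unbiased linear estimator $Gy$ for $X\beta$ is defined to be the best linear unbiased estimator, BLUE, for $X\beta$ under $\mathcal{M}$ if
$$ \mathrm{cov}(Gy) \le_{\mathrm{L}} \mathrm{cov}(Ly) \quad \text{for all } L \colon LX = X, $$
where ``$\le_{\mathrm{L}}$'' refers to the Löwner partial ordering. In other words, $Gy$ has the smallest covariance matrix (in the Löwner sense) among all linear unbiased estimators. We denote the BLUE of $X\beta$ as $\mathrm{BLUE}(X\beta) = X\tilde\beta$. If $X$ has full column rank, then $\beta$ is estimable and an unbiased estimator $Ay$ is the BLUE for $\beta$ if $AVA' \le_{\mathrm{L}} BVB'$ for all $B$ such that $BX = I_p$. The Löwner ordering is a very strong ordering implying for example
$$ \mathrm{var}(\tilde\beta_i) \le \mathrm{var}(\beta^{*}_i), \quad i = 1, \dots, p, \qquad \mathrm{trace}\,\mathrm{cov}(\tilde\beta) \le \mathrm{trace}\,\mathrm{cov}(\beta^{*}), \qquad \det \mathrm{cov}(\tilde\beta) \le \det \mathrm{cov}(\beta^{*}), $$
for any linear unbiased estimator $\beta^{*}$ of $\beta$; here $\mathrm{var}$ refers to the variance and ``det'' denotes the determinant.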
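A small numerical sketch may make the Löwner ordering concrete. It is an illustration under stated assumptions rather than anything from the original article: NumPy, randomly generated full-column-rank $X$, a positive definite $V$, and a hypothetical helper `loewner_leq`. It compares the covariance matrix of the generalized least squares estimator (the BLUE of $\beta$ when $V$ is positive definite) with that of the OLSE, which is another linear unbiased estimator.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 8, 3                                  # hypothetical sizes
X = rng.normal(size=(n, p))                  # full column rank with probability 1
C = rng.normal(size=(n, n))
V = C @ C.T + 0.1 * np.eye(n)                # positive definite covariance matrix

Vinv = np.linalg.inv(V)
A_blue = np.linalg.inv(X.T @ Vinv @ X) @ X.T @ Vinv   # A_blue @ X = I_p
B_olse = np.linalg.inv(X.T @ X) @ X.T                 # B_olse @ X = I_p

cov_blue = A_blue @ V @ A_blue.T             # cov(A y) = A V A'
cov_olse = B_olse @ V @ B_olse.T

def loewner_leq(A, B, tol=1e-9):
    """A <=_L B iff B - A is nonnegative definite."""
    D = B - A
    return np.linalg.eigvalsh((D + D.T) / 2).min() >= -tol
```

With these matrices, `loewner_leq(cov_blue, cov_olse)` holds, and so do the implied inequalities for the diagonal elements, the trace and the determinant.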
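The following theorem gives the ``Fundamental BLUE equation''; see, e.g., Rao (1967), Zyskind (1967) and Puntanen, Styan and Werner (2000).
Theorem 1 (Fundamental BLUE equation) Consider the general linear model $\mathcal{M} = \{y, X\beta, V\}$. Then the estimator $Gy$ is the BLUE for $X\beta$ if and only if $G$ satisfies the equation
$$ G(X : VX^{\perp}) = (X : 0). \qquad (1) $$
The corresponding condition for $Ay$ to be the BLUE of an estimable parametric function $K'\beta$ is
$$ A(X : VX^{\perp}) = (K' : 0). $$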
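The fundamental BLUE equation $G(X : VX^{\perp}) = (X : 0)$ is easy to check numerically. The sketch below is illustrative only and assumes NumPy, a positive definite $V$, the choice $X^{\perp} = M = I_n - H$, and a hypothetical helper `satisfies_blue_equation`. It confirms that the generalized least squares matrix solves the equation while, for a generic $V$, the OLSE projector $H$ does not.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 6, 2                                  # hypothetical sizes
X = rng.normal(size=(n, p))
C = rng.normal(size=(n, n))
V = C @ C.T + 0.1 * np.eye(n)                # positive definite

H = X @ np.linalg.pinv(X)                    # orthogonal projector onto C(X)
M = np.eye(n) - H                            # one choice of X^perp

def satisfies_blue_equation(G, X, V, M, tol=1e-8):
    """Check G(X : VM) = (X : 0), i.e. GX = X and GVM = 0."""
    return (np.allclose(G @ X, X, atol=tol)
            and np.allclose(G @ V @ M, 0.0, atol=tol))

Vinv = np.linalg.inv(V)
G = X @ np.linalg.inv(X.T @ Vinv @ X) @ X.T @ Vinv   # GLS representation
```

`satisfies_blue_equation(G, X, V, M)` is true, whereas `satisfies_blue_equation(H, X, V, M)` is false for this randomly generated $V$, because $HVM \neq 0$.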
It is sometimes convenient to express (1) in the following form, see Rao (1971).
Theorem 2 (Pandora's Box) Consider the general linear model $\mathcal{M} = \{y, X\beta, V\}$. Then the estimator $Gy$ is the BLUE for $X\beta$ if and only if there exists a matrix $L \in \mathbb{R}^{p \times n}$ so that $G$ is a solution to
$$ \begin{pmatrix} V & X \\ X' & 0 \end{pmatrix} \begin{pmatrix} G' \\ L \end{pmatrix} = \begin{pmatrix} 0 \\ X' \end{pmatrix}. $$
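The equation (1) has a unique solution for $G$ if and only if $\mathcal{C}(X : V) = \mathbb{R}^{n}$. Notice that under $\mathcal{M}$ we assume that the observed value of $y$ belongs to the subspace $\mathcal{C}(X : V)$ with probability $1$; this is the consistency condition of the linear model, see, e.g., Baksalary, Rao and Markiewicz (1992). The consistency condition means, for example, that whenever we have some statements which involve the random vector $y$, these statements need hold only for those values of $y$ that belong to $\mathcal{C}(X : V)$. The general solution for $G$ can be expressed, for example, in the following ways:
$$ G_1 = X(X'W^{-}X)^{-}X'W^{-} + F_1(I_n - WW^{-}), $$
$$ G_2 = H - HVM(MVM)^{-}M + F_2[I_n - MVM(MVM)^{-}]M, $$
where $F_1$ and $F_2$ are arbitrary matrices, $W = V + XUX'$ and $U$ is any arbitrary conformable matrix such that $\mathcal{C}(W) = \mathcal{C}(X : V)$. Notice that even though $G$ may not be unique, the numerical value of $Gy$ is unique because $y \in \mathcal{C}(X : V)$. If $V$ is positive definite, then $\mathrm{BLUE}(X\beta) = X(X'V^{-1}X)^{-}X'V^{-1}y$. Clearly $\mathrm{OLSE}(X\beta) = Hy$ is the BLUE under $\{y, X\beta, \sigma^2 I\}$. It is also worth noting that the matrix $G$ satisfying (1) can be interpreted as a projector: it is a projector onto $\mathcal{C}(X)$ along $\mathcal{C}(VX^{\perp})$, see Rao (1974).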
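The non-uniqueness of $G$ and the uniqueness of $Gy$ can be seen in a singular example. The following sketch is an illustration, not the authors' computation. It assumes NumPy, uses Moore-Penrose inverses as the particular generalized inverses, takes $U = I_p$ (so that $W = V + XX'$), sets $F_2 = 0$ and draws $F_1$ at random; all sizes are hypothetical. It builds the representations $G_1 = X(X'W^{-}X)^{-}X'W^{-} + F_1(I_n - WW^{-})$ and $G_2 = H - HVM(MVM)^{-}M$.

```python
import numpy as np

def pinv(A):
    # Moore-Penrose inverse with an explicit cutoff, so that numerically
    # zero singular values of rank-deficient matrices are discarded.
    return np.linalg.pinv(A, rcond=1e-10)

rng = np.random.default_rng(4)
n, p, r = 6, 2, 3                        # hypothetical sizes; rank(V) = r < n
X = rng.normal(size=(n, p))
B = rng.normal(size=(n, r))
V = B @ B.T                              # singular nonnegative definite V

I = np.eye(n)
H = X @ pinv(X)                          # orthogonal projector onto C(X)
M = I - H                                # one choice of X^perp

# Representation G_2 with F_2 = 0.
G2 = H - H @ V @ M @ pinv(M @ V @ M) @ M

# Representation G_1 with U = I_p and a random F_1 term.
W = V + X @ X.T                          # C(W) = C(X : V)
Wp = pinv(W)
F1 = rng.normal(size=(n, n))
G1 = X @ pinv(X.T @ Wp @ X) @ X.T @ Wp + F1 @ (I - W @ Wp)

def solves_eq1(G):
    # Fundamental BLUE equation G(X : VM) = (X : 0).
    return (np.allclose(G @ X, X, atol=1e-8)
            and np.allclose(G @ V @ M, 0.0, atol=1e-8))

# A consistent observation: y lies in C(X : V).
y = X @ rng.normal(size=p) + B @ rng.normal(size=r)
```

Both `G1` and `G2` satisfy equation (1) and differ as matrices, since here $\mathcal{C}(X : V)$ has dimension 5 in $\mathbb{R}^{6}$, yet `G1 @ y` and `G2 @ y` coincide for the consistent `y`.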
Characterizing the equality of the ordinary least squares estimator (OLSE) and the BLUE has received a lot of attention in the literature since Anderson (1948), but the major breakthroughs were made by Rao (1967) and Zyskind (1967); for a detailed review, see Puntanen and Styan (1989). For some further references from those years we may mention Kruskal (1968), Watson (1967), and Zyskind and Martin (1969).
We present below six characterizations for the OLSE and the BLUE to be equal (with probability $1$).
Theorem 3 (OLSE vs. BLUE) Consider the general linear model $\mathcal{M} = \{y, X\beta, V\}$, and let $H = P_X$ and $M = I_n - H$. Then $\mathrm{OLSE}(X\beta) = \mathrm{BLUE}(X\beta)$ if and only if any one of the following six equivalent conditions holds:
(a) $HV = VH$,
(b) $HVM = 0$,
(c) $\mathcal{C}(VH) \subset \mathcal{C}(H)$,
(d) $\mathcal{C}(X)$ has a basis comprising $r = \mathrm{rank}(X)$ orthonormal eigenvectors of $V$,
(e) $V = HAH + MBM$ for some $A$ and $B$,
(f) $V = \alpha I_n + HKH + MLM$ for some $\alpha \in \mathbb{R}$ and some $K$ and $L$.
Theorem 3 shows at once that under $\{y, X\beta, \sigma^2 I\}$ the OLSE of $X\beta$ is trivially the BLUE; this result is often called the Gauss-Markov Theorem.
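One well-known characterization of the equality is the commutativity condition $HV = VH$, where $H$ is the orthogonal projector onto $\mathcal{C}(X)$. The sketch below is a hedged illustration assuming NumPy and two hypothetical covariance structures chosen for this example: an equicorrelation matrix, for which the condition holds when the model contains an intercept, and an AR(1)-type matrix, for which it generically fails. The helper names `olse_equals_blue` and `blue_matrix` are introduced here only for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 6                                            # hypothetical sample size
X = np.column_stack([np.ones(n), rng.normal(size=n)])   # intercept + one regressor
H = X @ np.linalg.pinv(X)                        # orthogonal projector onto C(X)

def olse_equals_blue(X, V, tol=1e-9):
    """Check the commutativity condition HV = VH."""
    H = X @ np.linalg.pinv(X)
    return np.allclose(H @ V, V @ H, atol=tol)

def blue_matrix(X, V):
    """G with BLUE(X beta) = G y, for positive definite V and full-rank X."""
    Vinv = np.linalg.inv(V)
    return X @ np.linalg.inv(X.T @ Vinv @ X) @ X.T @ Vinv

rho = 0.4
V_equi = (1 - rho) * np.eye(n) + rho * np.ones((n, n))   # equicorrelation
idx = np.arange(n)
V_ar1 = rho ** np.abs(idx[:, None] - idx[None, :])       # AR(1)-type correlation
```

For `V_equi` the check succeeds and `blue_matrix(X, V_equi)` coincides with `H`; for `V_ar1` the check fails and the two matrices differ.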
Consider now two linear models $\mathcal{M}_1 = \{y, X\beta, V_1\}$ and $\mathcal{M}_2 = \{y, X\beta, V_2\}$, which differ only in their covariance matrices. For the proof of the following proposition and related discussion, see, e.g., Rao (1971, Th. 5.2, Th. 5.5), and Mitra and Moore (1973, Th. 3.3, Th. 4.1-4.2).
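Theorem 4 Consider the linear models $\mathcal{M}_1 = \{y, X\beta, V_1\}$ and $\mathcal{M}_2 = \{y, X\beta, V_2\}$, and let the notation
$$ \{\mathrm{BLUE}(X\beta \mid \mathcal{M}_1)\} \subset \{\mathrm{BLUE}(X\beta \mid \mathcal{M}_2)\} $$
mean that every representation of the BLUE for $X\beta$ under $\mathcal{M}_1$ remains the BLUE for $X\beta$ under $\mathcal{M}_2$. Then the following statements are equivalent:
(a) $\{\mathrm{BLUE}(X\beta \mid \mathcal{M}_1)\} \subset \{\mathrm{BLUE}(X\beta \mid \mathcal{M}_2)\}$,
(b) $\mathcal{C}(V_2X^{\perp}) \subset \mathcal{C}(V_1X^{\perp})$,
(c) $V_2 = V_1 + XN_1X' + V_1MN_2MV_1$ for some $N_1$ and $N_2$,
(d) $V_2 = XN_3X' + V_1MN_4MV_1$ for some $N_3$ and $N_4$.
Notice that obviously
$$ \{\mathrm{BLUE}(X\beta \mid \mathcal{M}_1)\} = \{\mathrm{BLUE}(X\beta \mid \mathcal{M}_2)\} \iff \mathcal{C}(V_2X^{\perp}) = \mathcal{C}(V_1X^{\perp}). $$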
For the equality between the BLUEs of $X_1\beta_1$ under two partitioned models, see Haslett and Puntanen (2010a).
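The comparison of two covariance matrices can also be sketched numerically. The example below is an illustration under assumptions, not the cited authors' procedure: it assumes NumPy, a positive definite $V_1$, and nonnegative definite matrices $N_1$, $N_2$ generated at random. It constructs $V_2 = V_1 + XN_1X' + V_1MN_2MV_1$ with $M = I_n - P_X$, and checks that the BLUE representation under the first model still solves $G(X : V_2M) = (X : 0)$ and that $\mathcal{C}(V_2M) \subset \mathcal{C}(V_1M)$. An unrelated covariance matrix $V_3$ serves as a contrast.

```python
import numpy as np

rng = np.random.default_rng(6)
n, p = 6, 2                                      # hypothetical sizes
X = rng.normal(size=(n, p))
H = X @ np.linalg.pinv(X)
M = np.eye(n) - H                                # one choice of X^perp

def random_nnd(k):
    C = rng.normal(size=(k, k))
    return C @ C.T

V1 = random_nnd(n) + 0.1 * np.eye(n)             # positive definite
N1, N2 = random_nnd(p), random_nnd(n)
V2 = V1 + X @ N1 @ X.T + V1 @ M @ N2 @ M @ V1    # structured perturbation of V1
V3 = random_nnd(n) + 0.1 * np.eye(n)             # unrelated covariance matrix

def blue_matrix(X, V):
    Vinv = np.linalg.inv(V)
    return X @ np.linalg.inv(X.T @ Vinv @ X) @ X.T @ Vinv

def is_blue(G, X, V, tol=1e-7):
    """Fundamental BLUE equation G(X : VM) = (X : 0)."""
    return (np.allclose(G @ X, X, atol=tol)
            and np.allclose(G @ V @ M, 0.0, atol=tol))

def col_space_contained(A, B, tol=1e-8):
    """C(A) is contained in C(B) iff rank(B : A) == rank(B)."""
    r_joint = np.linalg.matrix_rank(np.hstack([B, A]), tol=tol)
    return r_joint == np.linalg.matrix_rank(B, tol=tol)

G1 = blue_matrix(X, V1)                          # BLUE representation under V1
```

Here `is_blue(G1, X, V2)` and `col_space_contained(V2 @ M, V1 @ M)` both hold, while the corresponding checks with `V3` fail.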
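Consider the model $\mathcal{M} = \{y, X\beta, V\}$ and let $y_f$ denote an $m \times 1$ unobservable random vector containing new observations. The new observations are assumed to follow the linear model $y_f = X_f\beta + \varepsilon_f$, where $X_f$ is a known $m \times p$ model matrix associated with new observations, $\beta$ is the same vector of unknown parameters as in $\mathcal{M}$, and $\varepsilon_f$ is an $m \times 1$ random error vector associated with new observations. Our goal is to predict the random vector $y_f$ on the basis of $y$. The expectation and the covariance matrix are
$$ E\begin{pmatrix} y \\ y_f \end{pmatrix} = \begin{pmatrix} X\beta \\ X_f\beta \end{pmatrix}, \qquad \mathrm{cov}\begin{pmatrix} y \\ y_f \end{pmatrix} = \begin{pmatrix} V & V_{12} \\ V_{21} & V_{22} \end{pmatrix}, $$
which we may write as
$$ \mathcal{M}_f = \left\{ \begin{pmatrix} y \\ y_f \end{pmatrix}, \; \begin{pmatrix} X\beta \\ X_f\beta \end{pmatrix}, \; \begin{pmatrix} V & V_{12} \\ V_{21} & V_{22} \end{pmatrix} \right\}. $$
A linear predictor $Ay$ is said to be unbiased for $y_f$ if $E(Ay) = E(y_f) = X_f\beta$ for all $\beta \in \mathbb{R}^{p}$. Then the random vector $y_f$ is said to be unbiasedly predictable. Now an unbiased linear predictor $Ay$ is the best linear unbiased predictor, BLUP, for $y_f$ if the Löwner ordering
$$ \mathrm{cov}(Ay - y_f) \le_{\mathrm{L}} \mathrm{cov}(By - y_f) $$
holds for all $B$ such that $By$ is an unbiased linear predictor for $y_f$.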
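The following theorem characterizes the BLUP; see, e.g., Christensen (2002, p. 283), and Isotalo and Puntanen (2006, p. 1015).
Theorem 5 (Fundamental BLUP equation) Consider the linear model with new observations, where $X_f\beta$ is a given estimable parametric function. Then the linear predictor $Ay$ is the best linear unbiased predictor (BLUP) for $y_f$ if and only if $A$ satisfies the equation
$$ A(X : VX^{\perp}) = (X_f : V_{21}X^{\perp}), $$
where $V_{21} = \mathrm{cov}(y_f, y)$.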
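A numerical sketch of best linear unbiased prediction follows. It is an illustration under assumptions rather than a definitive implementation: NumPy, a positive definite joint covariance matrix for $(y, y_f)$, full-column-rank $X$, hypothetical sizes, and a helper `pred_error_cov` introduced here. The candidate predictor is the familiar form $X_f\tilde\beta + V_{21}V^{-1}(y - X\tilde\beta)$. The code checks that its matrix $A$ satisfies $A(X : VM) = (X_f : V_{21}M)$, with $M = I_n - P_X$ as the choice of $X^{\perp}$, and compares its prediction error covariance with that of the unbiased predictor $X_f\hat\beta$ based on the OLSE.

```python
import numpy as np

rng = np.random.default_rng(7)
n, m, p = 6, 2, 2                                # hypothetical sizes
X = rng.normal(size=(n, p))
Xf = rng.normal(size=(m, p))

# Joint covariance matrix of (y, y_f), taken positive definite for simplicity.
C = rng.normal(size=(n + m, n + m))
S = C @ C.T + 0.1 * np.eye(n + m)
V, V21, V22 = S[:n, :n], S[n:, :n], S[n:, n:]

M = np.eye(n) - X @ np.linalg.pinv(X)            # one choice of X^perp
Vinv = np.linalg.inv(V)
T = np.linalg.inv(X.T @ Vinv @ X) @ X.T @ Vinv   # beta_tilde = T y
G = X @ T                                        # BLUE(X beta) = G y

# Candidate BLUP matrix: X_f beta_tilde + V21 V^{-1} (y - X beta_tilde).
A = Xf @ T + V21 @ Vinv @ (np.eye(n) - G)

lhs = A @ np.hstack([X, V @ M])
rhs = np.hstack([Xf, V21 @ M])

def pred_error_cov(P):
    """cov(P y - y_f) = P V P' - P V12 - V21 P' + V22."""
    return P @ V @ P.T - P @ V21.T - V21 @ P.T + V22

B_olse = Xf @ np.linalg.pinv(X)                  # unbiased predictor X_f beta_hat
diff = pred_error_cov(B_olse) - pred_error_cov(A)
```

`lhs` equals `rhs` up to rounding, and `diff` is nonnegative definite, in line with the Löwner ordering in the definition of the BLUP.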
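A mixed linear model can be presented as
$$ y = X\beta + Z\gamma + \varepsilon, $$
or shortly
$$ \mathcal{M}_{\mathrm{mix}} = \{ y, \, X\beta + Z\gamma, \, D, \, R \}, $$
where $X \in \mathbb{R}^{n \times p}$ and $Z \in \mathbb{R}^{n \times q}$ are known matrices, $\beta \in \mathbb{R}^{p}$ is a vector of unknown fixed effects, $\gamma$ is an unobservable vector ($q$ elements) of random effects with $\mathrm{cov}(\gamma) = D$, $\mathrm{cov}(\gamma, \varepsilon) = 0$ and $E(\gamma) = 0$, $\mathrm{cov}(\varepsilon) = R$, $E(\varepsilon) = 0$. Consequently $\mathrm{cov}(y) = ZDZ' + R$ and $\mathrm{cov}(\gamma, y) = DZ'$, so that $\gamma$ plays the role of the new observations in the preceding prediction problem, with zero expectation and with $DZ'$ as its covariance matrix with $y$.
This leads directly to: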
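Theorem 6 Consider the mixed model $\mathcal{M}_{\mathrm{mix}} = \{ y, X\beta + Z\gamma, D, R \}$. Then the linear estimator $By$ is the BLUE for $X\beta$ if and only if
$$ B(X : \Sigma X^{\perp}) = (X : 0), $$
where $\Sigma = ZDZ' + R$. Moreover, $Ay$ is the BLUP for $\gamma$ if and only if
$$ A(X : \Sigma X^{\perp}) = (0 : DZ'X^{\perp}). $$
In terms of Pandora's Box (Theorem 2), $Ay = \mathrm{BLUP}(\gamma)$ if and only if there exists a matrix $L$ such that $A$ satisfies the equation
$$ \begin{pmatrix} \Sigma & X \\ X' & 0 \end{pmatrix} \begin{pmatrix} A' \\ L \end{pmatrix} = \begin{pmatrix} ZD \\ 0 \end{pmatrix}. $$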
For the equality between the BLUPs under two mixed models, see Haslett and Puntanen (2010b, 2010c).
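The mixed-model conditions can be checked in the same spirit. The sketch below is illustrative and rests on assumptions: NumPy, positive definite $D$ and $R$, full-column-rank $X$, hypothetical sizes, and $M = I_n - P_X$ as the choice of $X^{\perp}$. With $\Sigma = ZDZ' + R$ it forms the candidate matrices $B = X(X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}$ and $A = DZ'\Sigma^{-1}(I_n - B)$, verifies the BLUE and BLUP equations for the mixed model, and recovers the same $A$ by solving the bordered (Pandora's Box) system.

```python
import numpy as np

rng = np.random.default_rng(8)
n, p, q = 8, 2, 3                                # hypothetical sizes
X = rng.normal(size=(n, p))
Z = rng.normal(size=(n, q))

Cd = rng.normal(size=(q, q))
D = Cd @ Cd.T + 0.1 * np.eye(q)                  # cov(gamma), positive definite
Cr = rng.normal(size=(n, n))
R = Cr @ Cr.T + 0.1 * np.eye(n)                  # cov(epsilon), positive definite

Sigma = Z @ D @ Z.T + R                          # cov(y)
I = np.eye(n)
M = I - X @ np.linalg.pinv(X)                    # one choice of X^perp

Sinv = np.linalg.inv(Sigma)
B = X @ np.linalg.inv(X.T @ Sinv @ X) @ X.T @ Sinv   # candidate: BLUE(X beta) = B y
A = D @ Z.T @ Sinv @ (I - B)                         # candidate: BLUP(gamma) = A y

XS = np.hstack([X, Sigma @ M])                   # (X : Sigma X^perp)
blue_rhs = np.hstack([X, np.zeros((n, n))])      # (X : 0)
blup_rhs = np.hstack([np.zeros((q, p)), D @ Z.T @ M])   # (0 : D Z' X^perp)

# Pandora's Box: [[Sigma, X], [X', 0]] [A'; L] = [Z D; 0].
bordered = np.block([[Sigma, X], [X.T, np.zeros((p, p))]])
box_rhs = np.vstack([Z @ D, np.zeros((p, q))])
sol = np.linalg.solve(bordered, box_rhs)
A_box = sol[:n, :].T                             # q x n block corresponding to A
```

`B @ XS` matches `blue_rhs`, `A @ XS` matches `blup_rhs`, and `A_box` agrees with `A` up to rounding.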
Reprinted with permission from Lovric, Miodrag (2011), International Encyclopedia of Statistical Science. Heidelberg: Springer Science+Business Media, LLC.
- Anderson, T. W. (1948). On the theory of testing serial correlation. Skandinavisk Aktuarietidskrift, 31, 88-116.
- Baksalary, Jerzy K.; Rao, C. Radhakrishna and Markiewicz, Augustyn (1992). A study of the influence of the `natural restrictions' on estimation problems in the singular Gauss-Markov model, Journal of Statistical Planning and Inference, 31, 335-351.
- Christensen, Ronald (2002). Plane Answers to Complex Questions: The Theory of Linear Models, 3rd Edition. Springer, New York.
- Haslett, Stephen J. and Puntanen, Simo (2010a). Effect of adding regressors on the equality of the BLUEs under two linear models. Journal of Statistical Planning and Inference, 140, 104-110.
- Haslett, Stephen J. and Puntanen, Simo (2010b). Equality of BLUEs or BLUPs under two linear models using stochastic restrictions. Statistical Papers, 51, 465-475.
- Haslett, Stephen J. and Puntanen, Simo (2010c). On the equality of the BLUPs under two linear mixed models. Metrika, available online, DOI 10.1007/s00184-010-0308-6.
- Isotalo, Jarkko and Puntanen, Simo (2006). Linear prediction sufficiency for new observations in the general Gauss-Markov model. Communications in Statistics: Theory and Methods, 35, 1011-1023.
- Kruskal, William (1968). When are Gauss-Markov and least squares estimators identical? A coordinate-free approach. The Annals of Mathematical Statistics, 39, 70-75.
- Mitra, Sujit Kumar and Moore, Betty Jeanne (1973). Gauss-Markov estimation with an incorrect dispersion matrix. Sankhya, Series A, 35, 139-152.
- Puntanen, Simo and Styan, George P. H. (1989). The equality of the ordinary least squares estimator and the best linear unbiased estimator [with comments by Oscar Kempthorne and by Shayle R. Searle and with ``Reply'' by the authors]. The American Statistician, 43, 153-164.
- Puntanen, Simo; Styan, George P. H. and Werner, Hans Joachim (2000). Two matrix-based proofs that the linear estimator Gy is the best linear unbiased estimator. Journal of Statistical Planning and Inference, 88, 173-179.
- Rao, C. Radhakrishna (1967). Least squares theory using an estimated dispersion matrix and its application to measurement of signals. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability: Berkeley, California, 1965/1966, vol. 1 (Eds. Lucien M. Le Cam & Jerzy Neyman), University of California Press, Berkeley, pp. 355-372.
- Rao, C. Radhakrishna (1971). Unified theory of linear estimation. Sankhya, Series A, 33, 371-394. [Corrigenda (1972), 34, p. 194 and p. 477.]
- Rao, C. Radhakrishna (1974). Projectors, generalized inverses and the BLUE's. Journal of the Royal Statistical Society, Series B, 36, 442-448.
- Watson, Geoffrey S. (1967). Linear least squares regression. The Annals of Mathematical Statistics, 38, 1679-1699.
- Zyskind, George (1967). On canonical forms, non-negative covariance matrices and best and simple least squares linear estimators in linear models. The Annals of Mathematical Statistics, 38, 1092-1109.
- Zyskind, George and Martin, Frank B. (1969). On best linear estimation and general Gauss-Markov theorem in linear models with arbitrary nonnegative covariance structure. SIAM Journal on Applied Mathematics, 17, 1190-1202.
Footnotes
- 1 Department of Mathematics and Statistics, FI-33014 University of Tampere, Tampere, Finland. Email: simo.puntanen@uta.fi
- 2 Department of Mathematics and Statistics, McGill University, 805 ouest rue Sherbrooke Street West, Montréal (Québec), Canada H3A 2K6. Email: styan@math.mcgill.ca