GetWiki
generalized least squares
please note:
- the content below is remote from Wikipedia
- it has been imported raw for GetWiki
In statistics, generalized least squares (GLS) is a method used to estimate the unknown parameters in a linear regression model. It is used when there is a non-zero amount of correlation between the residuals in the regression model. GLS is employed to improve statistical efficiency and reduce the risk of drawing erroneous inferences, as compared to conventional least squares and weighted least squares methods. It was first described by Alexander Aitken in 1935 (Aitken, A. C. (1935). "On Least Squares and Linear Combinations of Observations". Proceedings of the Royal Society of Edinburgh. 55: 42–48. doi:10.1017/s0370164600014346). GLS requires knowledge of the covariance matrix for the residuals. If this is unknown, estimating the covariance matrix gives the method of feasible generalized least squares (FGLS). However, FGLS provides fewer guarantees of improvement.
Method
In standard linear regression models, one observes data $\{y_i, x_{ij}\}_{i=1,\dots,n,\; j=2,\dots,k}$ on $n$ statistical units with $k - 1$ predictor values and one response value each. The response values are placed in a vector,

$$\mathbf{y} \equiv \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix},$$

and the predictor values are placed in the design matrix,

$$\mathbf{X} \equiv \begin{pmatrix} 1 & x_{12} & x_{13} & \cdots & x_{1k} \\ 1 & x_{22} & x_{23} & \cdots & x_{2k} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n2} & x_{n3} & \cdots & x_{nk} \end{pmatrix},$$

where each row is a vector of the $k$ predictor variables (including a constant) for the $i$th data point.

The model assumes that the conditional mean of $\mathbf{y}$ given $\mathbf{X}$ is a linear function of $\mathbf{X}$ and that the conditional variance of the error term given $\mathbf{X}$ is a known non-singular covariance matrix, $\mathbf{\Omega}$. That is,

$$\mathbf{y} = \mathbf{X} \boldsymbol{\beta} + \boldsymbol{\varepsilon}, \quad \operatorname{E}[\boldsymbol{\varepsilon} \mid \mathbf{X}] = 0, \quad \operatorname{Cov}[\boldsymbol{\varepsilon} \mid \mathbf{X}] = \mathbf{\Omega},$$
where $\boldsymbol{\beta} \in \mathbb{R}^k$ is a vector of unknown constants, called "regression coefficients", which are estimated from the data.

If $\mathbf{b}$ is a candidate estimate for $\boldsymbol{\beta}$, then the residual vector for $\mathbf{b}$ is $\mathbf{y} - \mathbf{X} \mathbf{b}$. The generalized least squares method estimates $\boldsymbol{\beta}$ by minimizing the squared Mahalanobis length of this residual vector:

\begin{align}
\hat{\boldsymbol{\beta}} & = \underset{\mathbf{b}}{\operatorname{arg\,min}}\; (\mathbf{y} - \mathbf{X} \mathbf{b})^{\mathrm{T}} \mathbf{\Omega}^{-1} (\mathbf{y} - \mathbf{X} \mathbf{b}) \\
& = \underset{\mathbf{b}}{\operatorname{arg\,min}}\; \mathbf{y}^{\mathrm{T}} \mathbf{\Omega}^{-1} \mathbf{y} + (\mathbf{X} \mathbf{b})^{\mathrm{T}} \mathbf{\Omega}^{-1} \mathbf{X} \mathbf{b} - \mathbf{y}^{\mathrm{T}} \mathbf{\Omega}^{-1} \mathbf{X} \mathbf{b} - (\mathbf{X} \mathbf{b})^{\mathrm{T}} \mathbf{\Omega}^{-1} \mathbf{y},
\end{align}

which is equivalent to:

$$\hat{\boldsymbol{\beta}} = \underset{\mathbf{b}}{\operatorname{arg\,min}}\; \mathbf{y}^{\mathrm{T}} \mathbf{\Omega}^{-1} \mathbf{y} + \mathbf{b}^{\mathrm{T}} \mathbf{X}^{\mathrm{T}} \mathbf{\Omega}^{-1} \mathbf{X} \mathbf{b} - 2 \mathbf{b}^{\mathrm{T}} \mathbf{X}^{\mathrm{T}} \mathbf{\Omega}^{-1} \mathbf{y},$$

which is a quadratic programming problem. The stationary point of the objective function occurs when:

$$2 \mathbf{X}^{\mathrm{T}} \mathbf{\Omega}^{-1} \mathbf{X} \mathbf{b} - 2 \mathbf{X}^{\mathrm{T}} \mathbf{\Omega}^{-1} \mathbf{y} = 0,$$

so the estimator is:

$$\hat{\boldsymbol{\beta}} = \left( \mathbf{X}^{\mathrm{T}} \mathbf{\Omega}^{-1} \mathbf{X} \right)^{-1} \mathbf{X}^{\mathrm{T}} \mathbf{\Omega}^{-1} \mathbf{y}.$$
The quantity $\mathbf{\Omega}^{-1}$ is known as the precision matrix (or dispersion matrix), a generalization of the diagonal weight matrix.
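The closed-form estimator above can be illustrated numerically. The following is a minimal sketch, not a reference implementation: it assumes NumPy is available, and the function name `gls_estimate`, the simulated data, and the covariance structure $\Omega_{ij} = 0.7^{|i-j|}$ are all hypothetical choices made for illustration.

```python
import numpy as np

def gls_estimate(X, y, omega):
    """GLS estimate (X^T Omega^{-1} X)^{-1} X^T Omega^{-1} y."""
    # Solve linear systems rather than inverting Omega explicitly.
    omega_inv_X = np.linalg.solve(omega, X)   # Omega^{-1} X
    omega_inv_y = np.linalg.solve(omega, y)   # Omega^{-1} y
    return np.linalg.solve(X.T @ omega_inv_X, X.T @ omega_inv_y)

# Hypothetical data: n = 50 units, a constant plus two predictors (k = 3).
rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta_true = np.array([1.0, 2.0, -0.5])

# Assumed known covariance of the errors: Omega[i, j] = 0.7 ** |i - j|.
idx = np.arange(n)
omega = 0.7 ** np.abs(idx[:, None] - idx[None, :])

# Draw errors with covariance Omega and build the response vector.
eps = rng.multivariate_normal(np.zeros(n), omega)
y = X @ beta_true + eps

beta_hat = gls_estimate(X, y, omega)
print(beta_hat)
```

Solving the two linear systems avoids forming $\mathbf{\Omega}^{-1}$ explicitly, which is usually the numerically safer choice; the result satisfies the stationarity condition $\mathbf{X}^{\mathrm{T}} \mathbf{\Omega}^{-1} (\mathbf{y} - \mathbf{X} \hat{\boldsymbol{\beta}}) = 0$ up to rounding.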
Properties
The GLS estimator is unbiased, consistent, efficient, and asymptotically normal with:

$$\operatorname{E}[\hat{\boldsymbol{\beta}} \mid \mathbf{X}] = \boldsymbol{\beta}, \quad \text{and} \quad \operatorname{Cov}[\hat{\boldsymbol{\beta}} \mid \mathbf{X}] = (\mathbf{X}^{\mathrm{T}} \mathbf{\Omega}^{-1} \mathbf{X})^{-1}.$$

GLS is equivalent to applying ordinary least squares (OLS) to a linearly transformed version of the data. This can be seen by factoring $\mathbf{\Omega} = \mathbf{C} \mathbf{C}^{\mathrm{T}}$ using a method such as Cholesky decomposition. Left-multiplying both sides of $\mathbf{y} = \mathbf{X} \boldsymbol{\beta} + \boldsymbol{\varepsilon}$ by $\mathbf{C}^{-1}$ yields an equivalent linear model:

$$\mathbf{y}^{*} = \mathbf{X}^{*} \boldsymbol{\beta} + \boldsymbol{\varepsilon}^{*}, \quad \text{where} \quad \mathbf{y}^{*} = \mathbf{C}^{-1} \mathbf{y}, \quad \mathbf{X}^{*} = \mathbf{C}^{-1} \mathbf{X}, \quad \boldsymbol{\varepsilon}^{*} = \mathbf{C}^{-1} \boldsymbol{\varepsilon}.$$

In this model, $\operatorname{Var}[\boldsymbol{\varepsilon}^{*} \mid \mathbf{X}] = \mathbf{C}^{-1} \mathbf{\Omega} \left( \mathbf{C}^{-1} \right)^{\mathrm{T}} = \mathbf{I}$, where $\mathbf{I}$ is the identity matrix. Then, $\boldsymbol{\beta}$ can be efficiently estimated by applying OLS to the transformed data, which requires minimizing over candidate estimates $\mathbf{b}$ the objective

$$\left( \mathbf{y}^{*} - \mathbf{X}^{*} \mathbf{b} \right)^{\mathrm{T}} (\mathbf{y}^{*} - \mathbf{X}^{*} \mathbf{b}) = (\mathbf{y} - \mathbf{X} \mathbf{b})^{\mathrm{T}} \mathbf{\Omega}^{-1} (\mathbf{y} - \mathbf{X} \mathbf{b}).$$
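The equivalence between GLS and OLS on transformed data can be checked numerically. The following is a rough sketch under stated assumptions: it uses NumPy, a made-up covariance $\Omega_{ij} = 0.5^{|i-j|}$, and arbitrary simulated data; the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = rng.normal(size=n)  # any response works for checking the algebraic identity

# Assumed covariance matrix (hypothetical): Omega[i, j] = 0.5 ** |i - j|.
idx = np.arange(n)
omega = 0.5 ** np.abs(idx[:, None] - idx[None, :])

# Factor Omega = C C^T with C lower triangular (Cholesky decomposition).
C = np.linalg.cholesky(omega)

# Transformed data: X* = C^{-1} X and y* = C^{-1} y.
X_star = np.linalg.solve(C, X)
y_star = np.linalg.solve(C, y)

# OLS on the transformed data.
beta_ols_star, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)

# Direct GLS formula on the original data.
beta_gls = np.linalg.solve(
    X.T @ np.linalg.solve(omega, X),
    X.T @ np.linalg.solve(omega, y),
)

# Covariance of the transformed errors: C^{-1} Omega C^{-T}, expected to be I.
whitened = np.linalg.solve(C, np.linalg.solve(C, omega).T)

# Both sides of the objective identity, evaluated at an arbitrary candidate b.
b = np.array([0.3, -1.2])
lhs = (y_star - X_star @ b) @ (y_star - X_star @ b)
rhs = (y - X @ b) @ np.linalg.solve(omega, y - X @ b)

print(np.allclose(beta_ols_star, beta_gls), np.allclose(whitened, np.eye(n)))
```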
This transformation effectively standardizes the scale of and de-correlates the errors. Because OLS is then applied to data with homoscedastic, uncorrelated errors, the Gauss–Markov theorem applies, so the GLS estimate is the best linear unbiased estimator for $\boldsymbol{\beta}$.
Weighted least squares
A special case of GLS, called weighted least squares (WLS), occurs when all the off-diagonal entries of $\mathbf{\Omega}$ are 0. This situation arises when the variances of the observed values are unequal (that is, heteroscedasticity is present) but the errors are uncorrelated. The weight for unit $i$ is proportional to the reciprocal of the variance of the response for unit $i$ (Strutz, T. (2016). Data Fitting and Uncertainty (A practical introduction to weighted least squares and beyond). Springer Vieweg. ISBN 978-3-658-11455-8. Chapter 3).
Derivation by maximum likelihood estimation
Ordinary least squares can be interpreted as maximum likelihood estimation under the assumption that the errors are independent and normally distributed with zero mean and common variance. In GLS, this assumption is generalized to the case where errors may not be independent and may have differing variances. For given fit parameters $\mathbf{b}$, the conditional probability density function of the errors is assumed to be

$$p(\boldsymbol{\varepsilon} \mid \mathbf{b}) = \frac{1}{\sqrt{(2\pi)^n \det \boldsymbol{\Omega}}} \exp\left( -\frac{1}{2} \boldsymbol{\varepsilon}^{\mathrm{T}} \boldsymbol{\Omega}^{-1} \boldsymbol{\varepsilon} \right).$$

By Bayes' theorem,

$$p(\mathbf{b} \mid \boldsymbol{\varepsilon}) = \frac{p(\boldsymbol{\varepsilon} \mid \mathbf{b})\, p(\mathbf{b})}{p(\boldsymbol{\varepsilon})}.$$

In GLS, a uniform (improper) prior is taken for $p(\mathbf{b})$, and as $p(\boldsymbol{\varepsilon})$ is a marginal distribution, it does not depend on $\mathbf{b}$. Therefore the log-probability is

$$\log p(\mathbf{b} \mid \boldsymbol{\varepsilon}) = \log p(\boldsymbol{\varepsilon} \mid \mathbf{b}) + \cdots = -\frac{1}{2} \boldsymbol{\varepsilon}^{\mathrm{T}} \boldsymbol{\Omega}^{-1} \boldsymbol{\varepsilon} + \cdots,$$

where the hidden terms are those that do not depend on $\mathbf{b}$, and $\log p(\boldsymbol{\varepsilon} \mid \mathbf{b})$ is the log-likelihood. The maximum a posteriori (MAP) estimate is then the maximum likelihood estimate (MLE), which is equivalent to the optimization problem from above,

\begin{align}
\hat{\boldsymbol{\beta}} & = \underset{\mathbf{b}}{\operatorname{arg\,max}}\; p(\mathbf{b} \mid \boldsymbol{\varepsilon}) \\
& = \underset{\mathbf{b}}{\operatorname{arg\,max}}\; \log p(\mathbf{b} \mid \boldsymbol{\varepsilon}) \\
& = \underset{\mathbf{b}}{\operatorname{arg\,max}}\; \log p(\boldsymbol{\varepsilon} \mid \mathbf{b}) \\
& = \underset{\mathbf{b}}{\operatorname{arg\,min}}\; \frac{1}{2} (\mathbf{y} - \mathbf{X} \mathbf{b})^{\mathrm{T}} \boldsymbol{\Omega}^{-1} (\mathbf{y} - \mathbf{X} \mathbf{b}),
\end{align}

where $\mathbf{y} - \mathbf{X} \mathbf{b}$ has been substituted for $\boldsymbol{\varepsilon}$, and the optimization problem has been re-written using the fact that the logarithm is a strictly increasing function and the property that the argument solving an optimization problem is independent of terms in the objective function which do not involve that argument.
Feasible generalized least squares
If the covariance of the errors $\Omega$ is unknown, one can get a consistent estimate of $\Omega$, say $\widehat{\Omega}$ (Baltagi, B. H. (2008). Econometrics (4th ed.). New York: Springer), using an implementable version of GLS known as the feasible generalized least squares (FGLS) estimator. In FGLS, modeling proceeds in two stages:
- The model is estimated by OLS or another consistent (but inefficient) estimator, and the residuals are used to build a consistent estimator of the errors' covariance matrix. Doing so often requires imposing additional constraints on the model; for example, if the errors follow a time series process, a statistician generally needs some theoretical assumptions on this process to ensure that a consistent estimator is available.
- Then, using the consistent estimator of the covariance matrix of the errors, one can implement GLS ideas.
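The two stages above can be sketched in code. This is only an illustration under a strong assumption that is not part of the text: the errors are taken to follow a stationary AR(1) process, so that $\Omega_{ij} \propto \rho^{|i-j|}$ and only $\rho$ needs to be estimated. The function names (`ols`, `gls`, `fgls_ar1`) and the simulated data are hypothetical, and NumPy is assumed.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares estimate."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def gls(X, y, omega):
    """GLS estimate for a given error covariance matrix omega."""
    return np.linalg.solve(
        X.T @ np.linalg.solve(omega, X),
        X.T @ np.linalg.solve(omega, y),
    )

def fgls_ar1(X, y):
    """Two-stage FGLS, ASSUMING stationary AR(1) errors (illustrative only)."""
    n = len(y)
    # Stage 1: OLS fit, then estimate rho from the lag-1 residual regression.
    resid = y - X @ ols(X, y)
    rho_hat = (resid[1:] @ resid[:-1]) / (resid[:-1] @ resid[:-1])
    # Keep the estimate strictly inside (-1, 1) so omega_hat stays positive definite.
    rho_hat = np.clip(rho_hat, -0.99, 0.99)
    # Covariance implied by AR(1), up to a scale factor that does not affect the estimate.
    idx = np.arange(n)
    omega_hat = rho_hat ** np.abs(idx[:, None] - idx[None, :])
    # Stage 2: GLS with the estimated covariance matrix.
    return gls(X, y, omega_hat), rho_hat

# Hypothetical data with AR(1) errors, rho = 0.6.
rng = np.random.default_rng(2)
n, rho = 200, 0.6
X = np.column_stack([np.ones(n), rng.normal(size=n)])
u = rng.normal(size=n)
eps = np.empty(n)
eps[0] = u[0] / np.sqrt(1 - rho**2)  # draw the first error from the stationary distribution
for t in range(1, n):
    eps[t] = rho * eps[t - 1] + u[t]
y = X @ np.array([1.0, 2.0]) + eps

beta_fgls, rho_hat = fgls_ar1(X, y)
print(beta_fgls, rho_hat)
```

Other error structures (for example, heteroscedastic but uncorrelated errors) would need a different first-stage covariance model; the AR(1) choice here is just one concrete way to make the first stage estimable.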
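An alternative is to apply OLS but discard the classical variance estimator

$$\sigma^2 (X^{\mathrm{T}} X)^{-1}$$

and instead use a HAC (heteroskedasticity and autocorrelation consistent) estimator. In the context of autocorrelation, the Newey–West estimator can be used, and in heteroscedastic contexts, the Eicker–White estimator can be used instead. This approach is much safer, and it is the appropriate path to take unless the sample is large, where "large" is sometimes a slippery issue (e.g., if the error distribution is asymmetric, the required sample will be much larger).

The ordinary least squares (OLS) estimator is calculated by:

$$\widehat{\beta}_{OLS} = (X^{\mathrm{T}} X)^{-1} X^{\mathrm{T}} Y.$$

Its residuals give a first estimate of the covariance matrix, from which a first FGLS estimate $\widehat{\beta}_{FGLS1}$ is computed. The residuals of that estimate are

$$\widehat{u}_{FGLS1} = Y - X \widehat{\beta}_{FGLS1}.$$

Under regularity conditions, the FGLS estimator is asymptotically distributed as

$$\sqrt{n} (\hat{\beta}_{FGLS} - \beta) \xrightarrow{d} \mathcal{N}\!\left(0, V\right),$$

where $n$ is the sample size, and:

$$V = \operatorname{p-lim} \left( X^{\mathrm{T}} \Omega^{-1} X / n \right)^{-1}.$$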
- content above as imported from Wikipedia
- time: 5:41am EDT - Wed, May 22 2024
© 2024 M.R.M. PARROTT | ALL RIGHTS RESERVED