Variance
{{Short description|Statistical measure of how far values spread from their average}}
{{About|the mathematical concept|other uses|Variance (disambiguation)}}
[[File:Comparison standard deviations.svg|thumb|400px|right|Example of samples from two populations with the same mean but different variances. The red population has mean 100 and variance 100 (SD = 10), while the blue population has mean 100 and variance 2500 (SD = 50), where SD stands for standard deviation.]]

In probability theory and statistics, variance is the expected value of the squared deviation from the mean of a random variable. The standard deviation (SD) is obtained as the square root of the variance. Variance is a measure of dispersion, meaning it is a measure of how far a set of numbers is spread out from their average value. It is the second central moment of a distribution and the covariance of the random variable with itself, and it is often represented by \sigma^2, s^2, \operatorname{Var}(X), V(X), or \mathbb{V}(X).<ref>{{cite book |last=Wasserman |first=Larry |title=All of Statistics: A Concise Course in Statistical Inference |series=Springer Texts in Statistics |year=2005 |page=51 |isbn=978-1-4419-2322-6}}</ref>

An advantage of variance as a measure of dispersion is that it is more amenable to algebraic manipulation than other measures of dispersion such as the expected absolute deviation; for example, the variance of a sum of uncorrelated random variables is equal to the sum of their variances. A disadvantage of the variance for practical applications is that, unlike the standard deviation, its units differ from those of the random variable, which is why the standard deviation is more commonly reported as a measure of dispersion once the calculation is finished.

There are two distinct concepts that are both called "variance". One, as discussed above, is part of a theoretical probability distribution and is defined by an equation. The other is a characteristic of a set of observations. When variance is calculated from observations, those observations are typically measured from a real-world system. If all possible observations of the system are present, then the calculated variance is called the population variance. Normally, however, only a subset is available, and the variance calculated from this is called the sample variance. The variance calculated from a sample is considered an estimate of the full population variance. There are multiple ways to calculate an estimate of the population variance, as discussed in the section below.

The two kinds of variance are closely related. To see how, consider that a theoretical probability distribution can be used as a generator of hypothetical observations. If an infinite number of observations are generated using a distribution, then the sample variance calculated from that infinite set will match the value calculated using the distribution's equation for variance. Variance has a central role in statistics, where some ideas that use it include descriptive statistics, statistical inference, hypothesis testing, goodness of fit, and Monte Carlo sampling.

{{TOC limit}}
[[File:Variance_visualisation.svg|thumb|Geometric visualisation of the variance of an arbitrary distribution (2, 4, 4, 4, 5, 5, 7, 9): {{ordered list
|A frequency distribution is constructed.
|The centroid of the distribution gives its mean.
|A square with sides equal to the difference of each value from the mean is formed for each value.
|Arranging the squares into a rectangle with one side equal to the number of values, n, results in the other side being the distribution's variance, \sigma^2.
}}]]

Definition

The variance of a random variable X is the expected value of the squared deviation from the mean of X, \mu = \operatorname{E}[X]:
\operatorname{Var}(X) = \operatorname{E}\left[(X - \mu)^2 \right].
This definition encompasses random variables that are generated by processes that are discrete, continuous, neither, or mixed. The variance can also be thought of as the covariance of a random variable with itself:
\operatorname{Var}(X) = \operatorname{Cov}(X, X).
The variance is also equivalent to the second cumulant of a probability distribution that generates X. The variance is typically designated as \operatorname{Var}(X), or sometimes as V(X) or \mathbb{V}(X), or symbolically as \sigma^2_X or simply \sigma^2 (pronounced "sigma squared"). The expression for the variance can be expanded as follows:
\begin{align}
\operatorname{Var}(X) &= \operatorname{E}\left[(X - \operatorname{E}[X])^2\right] \\[4pt]
&= \operatorname{E}\left[X^2 - 2X\operatorname{E}[X] + \operatorname{E}[X]^2\right] \\[4pt]
&= \operatorname{E}\left[X^2\right] - 2\operatorname{E}[X]\operatorname{E}[X] + \operatorname{E}[X]^2 \\[4pt]
&= \operatorname{E}\left[X^2\right] - 2\operatorname{E}[X]^2 + \operatorname{E}[X]^2 \\[4pt]
&= \operatorname{E}\left[X^2\right] - \operatorname{E}[X]^2
\end{align}
In other words, the variance of {{mvar|X}} is equal to the mean of the square of {{mvar|X}} minus the square of the mean of {{mvar|X}}. This equation should not be used for computations using floating-point arithmetic, because it suffers from catastrophic cancellation if the two components of the equation are similar in magnitude. For other numerically stable alternatives, see Algorithms for calculating variance.
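The cancellation can be seen concretely in the following minimal Python sketch (the data values are invented for illustration): it compares the expanded formula with the numerically stable two-pass computation from the definition.

<syntaxhighlight lang="python">
# Minimal sketch of catastrophic cancellation (illustrative data):
# E[X^2] and E[X]^2 are both ~1e18 here, so their float64 difference
# loses the true variance (22.5) entirely; the two-pass formula keeps it.

def naive_variance(xs):
    # expanded formula E[X^2] - E[X]^2
    n = len(xs)
    return sum(x * x for x in xs) / n - (sum(xs) / n) ** 2

def two_pass_variance(xs):
    # definition E[(X - mu)^2], computed after first finding mu
    n = len(xs)
    mu = sum(xs) / n
    return sum((x - mu) ** 2 for x in xs) / n

data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]  # true variance: 22.5
print(naive_variance(data))     # dominated by rounding error, not 22.5
print(two_pass_variance(data))  # 22.5
</syntaxhighlight>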

Discrete random variable

If the generator of random variable X is discrete with probability mass function x_1 \mapsto p_1, x_2 \mapsto p_2, \ldots, x_n \mapsto p_n, then
\operatorname{Var}(X) = \sum_{i=1}^n p_i \cdot (x_i - \mu)^2,
where \mu is the expected value. That is,
\mu = \sum_{i=1}^n p_i x_i .
(When such a discrete weighted variance is specified by weights whose sum is not 1, then one divides by the sum of the weights.)

The variance of a collection of n equally likely values can be written as
\operatorname{Var}(X) = \frac{1}{n} \sum_{i=1}^n (x_i - \mu)^2
where \mu is the average value. That is,
\mu = \frac{1}{n}\sum_{i=1}^n x_i .
The variance of a set of n equally likely values can be expressed equivalently, without directly referring to the mean, in terms of the squared pairwise distances between points:<ref>{{cite conference |first1=Yuli |last1=Zhang |first2=Huaiyu |last2=Wu |first3=Lei |last3=Cheng |date=June 2012 |title=Some new deformation formulas about variance and covariance |conference=Proceedings of the 4th International Conference on Modelling, Identification and Control (ICMIC 2012) |pages=987–992}}</ref>
\operatorname{Var}(X) = \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \frac{1}{2}(x_i - x_j)^2 = \frac{1}{n^2}\sum_i \sum_{j>i} (x_i-x_j)^2.
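As a quick check, the following sketch (using the data set from the figure above) confirms that the pairwise form agrees with the mean-based definition:

<syntaxhighlight lang="python">
# Verifies the pairwise-distance identity against the mean-based
# definition, using the data set from the figure above (illustrative).
from itertools import combinations

xs = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(xs)
mu = sum(xs) / n

mean_based = sum((x - mu) ** 2 for x in xs) / n
pairwise = sum((a - b) ** 2 for a, b in combinations(xs, 2)) / n**2

print(mean_based, pairwise)  # both 4.0
</syntaxhighlight>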

Absolutely continuous random variable

If the random variable X has a probability density function f(x), and F(x) is the corresponding cumulative distribution function, then
\begin{align}
\operatorname{Var}(X) = \sigma^2 &= \int_{\mathbb{R}} (x-\mu)^2 f(x) \, dx \\[4pt]
&= \int_{\mathbb{R}} x^2 f(x)\,dx - 2\mu\int_{\mathbb{R}} x f(x)\,dx + \mu^2\int_{\mathbb{R}} f(x)\,dx \\[4pt]
&= \int_{\mathbb{R}} x^2 \,dF(x) - 2\mu\int_{\mathbb{R}} x \,dF(x) + \mu^2\int_{\mathbb{R}} \,dF(x) \\[4pt]
&= \int_{\mathbb{R}} x^2 \,dF(x) - 2\mu \cdot \mu + \mu^2 \cdot 1 \\[4pt]
&= \int_{\mathbb{R}} x^2 \,dF(x) - \mu^2,
\end{align}
or equivalently,
\operatorname{Var}(X) = \int_{\mathbb{R}} x^2 f(x) \,dx - \mu^2,
where \mu is the expected value of X given by
\mu = \int_{\mathbb{R}} x f(x) \, dx = \int_{\mathbb{R}} x \, dF(x).
In these formulas, the integrals with respect to dx and dF(x) are Lebesgue and Lebesgue–Stieltjes integrals, respectively.

If the function x^2 f(x) is Riemann-integrable on every finite interval [a,b] \subset \mathbb{R}, then
\operatorname{Var}(X) = \int^{+\infty}_{-\infty} x^2 f(x) \, dx - \mu^2,
where the integral is an improper Riemann integral.

Examples

Exponential distribution

The exponential distribution with parameter {{mvar|λ}} is a continuous distribution whose probability density function is given by
f(x) = \lambda e^{-\lambda x}
on the interval {{math|[0, ∞)}}. Its mean can be shown to be
\operatorname{E}[X] = \int_0^\infty x \lambda e^{-\lambda x} \, dx = \frac{1}{\lambda}.
Using integration by parts and making use of the expected value already calculated, we have:
\begin{align}
\operatorname{E}\left[X^2\right] &= \int_0^\infty x^2 \lambda e^{-\lambda x} \, dx \\
&= \left[ -x^2 e^{-\lambda x} \right]_0^\infty + \int_0^\infty 2x e^{-\lambda x} \,dx \\
&= 0 + \frac{2}{\lambda}\operatorname{E}[X] \\
&= \frac{2}{\lambda^2}.
\end{align}
Thus, the variance of {{mvar|X}} is given by
\operatorname{Var}(X) = \operatorname{E}\left[X^2\right] - \operatorname{E}[X]^2 = \frac{2}{\lambda^2} - \left(\frac{1}{\lambda}\right)^2 = \frac{1}{\lambda^2}.
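The derivation can also be checked empirically; a small Monte Carlo sketch (with an assumed rate λ = 2) compares the sample variance of simulated draws with 1/λ²:

<syntaxhighlight lang="python">
# Monte Carlo sketch: sample variance of exponential draws vs. 1/lambda^2.
# The rate lam = 2.0 is an assumed value for illustration.
import random

random.seed(0)
lam, n = 2.0, 200_000
xs = [random.expovariate(lam) for _ in range(n)]

mu = sum(xs) / n
var = sum((x - mu) ** 2 for x in xs) / n

print(var)         # close to 0.25
print(1 / lam**2)  # theoretical value: 0.25
</syntaxhighlight>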

Fair die

A fair six-sided die can be modeled as a discrete random variable, {{mvar|X}}, with outcomes 1 through 6, each with equal probability 1/6. The expected value of {{mvar|X}} is (1 + 2 + 3 + 4 + 5 + 6)/6 = 7/2. Therefore, the variance of {{mvar|X}} is
\begin{align}
\operatorname{Var}(X) &= \sum_{i=1}^6 \frac{1}{6}\left(i - \frac{7}{2}\right)^2 \\[5pt]
&= \frac{1}{6}\left((-5/2)^2 + (-3/2)^2 + (-1/2)^2 + (1/2)^2 + (3/2)^2 + (5/2)^2\right) \\[5pt]
&= \frac{35}{12} \approx 2.92.
\end{align}
The general formula for the variance of the outcome, {{mvar|X}}, of an {{nowrap|{{mvar|n}}-sided}} die is
\begin{align}
\operatorname{Var}(X) &= \operatorname{E}\left(X^2\right) - (\operatorname{E}(X))^2 \\[5pt]
&= \frac{1}{n}\sum_{i=1}^n i^2 - \left(\frac{1}{n}\sum_{i=1}^n i\right)^2 \\[5pt]
&= \frac{(n + 1)(2n + 1)}{6} - \left(\frac{n + 1}{2}\right)^2 \\[4pt]
&= \frac{n^2 - 1}{12}.
\end{align}
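A short sketch confirming the closed form against the direct definition (the die sizes are chosen arbitrarily):

<syntaxhighlight lang="python">
# Checks the closed form (n^2 - 1)/12 against the direct definition
# for a fair n-sided die.
def die_variance(n):
    mu = sum(range(1, n + 1)) / n
    return sum((i - mu) ** 2 for i in range(1, n + 1)) / n

for n in (6, 20):
    print(die_variance(n), (n * n - 1) / 12)  # the pairs agree
</syntaxhighlight>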

Commonly used probability distributions

The following table lists the variance for some commonly used probability distributions.

{| class="wikitable"
! Name of the probability distribution !! Probability mass or density function !! Mean !! Variance
|-
| Binomial distribution || \Pr(X=k) = \binom{n}{k}p^k(1 - p)^{n-k} || np || np(1 - p)
|-
| Geometric distribution || \Pr(X=k) = (1 - p)^{k-1}p || \frac{1}{p} || \frac{1 - p}{p^2}
|-
| Normal distribution || f\left(x \mid \mu, \sigma^2\right) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x - \mu)^2}{2\sigma^2}} || \mu || \sigma^2
|-
| Uniform distribution (continuous) || f(x \mid a, b) = \begin{cases} \frac{1}{b - a} & \text{for } a \le x \le b, \\[3pt] 0 & \text{for } x < a \text{ or } x > b \end{cases} || \frac{a + b}{2} || \frac{(b - a)^2}{12}
|-
| Exponential distribution || f(x \mid \lambda) = \lambda e^{-\lambda x} || \frac{1}{\lambda} || \frac{1}{\lambda^2}
|-
| Poisson distribution || f(k \mid \lambda) = \frac{e^{-\lambda}\lambda^{k}}{k!} || \lambda || \lambda
|}

Properties

Basic properties

Variance is non-negative because the squares are positive or zero:
\operatorname{Var}(X) \ge 0.
The variance of a constant is zero:
\operatorname{Var}(a) = 0.
Conversely, if the variance of a random variable is 0, then it is almost surely a constant. That is, it always has the same value:
\operatorname{Var}(X) = 0 \iff \exists a : P(X=a) = 1.

Issues of finiteness

If a distribution does not have a finite expected value, as is the case for the Cauchy distribution, then the variance cannot be finite either. However, some distributions may not have a finite variance, despite their expected value being finite. An example is a Pareto distribution whose index k satisfies 1 < k \leq 2.
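This can be illustrated numerically; in the sketch below (a shape index k = 1.5 and minimum value 1 are assumed), the sample variance fails to settle down as the sample grows, even though the sample mean converges:

<syntaxhighlight lang="python">
# Illustrative sketch: a Pareto distribution with shape k = 1.5 (and
# minimum value 1, as random.paretovariate assumes) has a finite mean
# k/(k-1) = 3 but infinite variance, so the sample variance does not
# converge as the sample size grows.
import random

random.seed(1)
k = 1.5

for n in (10**3, 10**5, 10**6):
    xs = [random.paretovariate(k) for _ in range(n)]
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    print(n, var)  # tends to grow and fluctuate rather than settle
</syntaxhighlight>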

Decomposition

The general formula for variance decomposition or the law of total variance is: If X and Y are two random variables, and the variance of X exists, then
\operatorname{Var}[X] = \operatorname{E}(\operatorname{Var}[X\mid Y]) + \operatorname{Var}(\operatorname{E}[X\mid Y]).
The conditional expectation \operatorname{E}(X\mid Y) of X given Y, and the conditional variance \operatorname{Var}(X\mid Y), may be understood as follows. Given any particular value y of the random variable Y, there is a conditional expectation \operatorname{E}(X\mid Y=y) given the event Y = y. This quantity depends on the particular value y; it is a function g(y) = \operatorname{E}(X\mid Y=y). That same function evaluated at the random variable Y is the conditional expectation \operatorname{E}(X\mid Y) = g(Y).

In particular, if Y is a discrete random variable assuming possible values y_1, y_2, y_3, \ldots with corresponding probabilities p_1, p_2, p_3, \ldots, then in the formula for total variance, the first term on the right-hand side becomes
\operatorname{E}(\operatorname{Var}[X \mid Y]) = \sum_i p_i \sigma^2_i,
where \sigma^2_i = \operatorname{Var}[X \mid Y = y_i]. Similarly, the second term on the right-hand side becomes
\operatorname{Var}(\operatorname{E}[X \mid Y]) = \sum_i p_i \mu_i^2 - \left(\sum_i p_i \mu_i\right)^2 = \sum_i p_i \mu_i^2 - \mu^2,
where \mu_i = \operatorname{E}[X \mid Y = y_i] and \mu = \sum_i p_i \mu_i. Thus the total variance is given by
\operatorname{Var}[X] = \sum_i p_i \sigma^2_i + \left( \sum_i p_i \mu_i^2 - \mu^2 \right).
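A small numeric sketch of this decomposition for a discrete Y with two values (the probabilities and conditional moments are invented for illustration):

<syntaxhighlight lang="python">
# Numeric sketch of the law of total variance for a discrete Y with two
# values; probabilities and conditional moments are invented.
p        = [0.3, 0.7]   # P(Y = y_i)
mu_i     = [1.0, 5.0]   # E[X | Y = y_i]
sigma2_i = [4.0, 9.0]   # Var[X | Y = y_i]

mu = sum(pi * mi for pi, mi in zip(p, mu_i))
within  = sum(pi * s2 for pi, s2 in zip(p, sigma2_i))         # E(Var[X|Y])
between = sum(pi * mi**2 for pi, mi in zip(p, mu_i)) - mu**2  # Var(E[X|Y])

print(within, between, within + between)  # 7.5, 3.36, total 10.86
</syntaxhighlight>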
A similar formula is applied in analysis of variance, where the corresponding formula is
\mathit{MS}_\text{total} = \mathit{MS}_\text{between} + \mathit{MS}_\text{within};
here \mathit{MS} refers to the mean of the squares. In linear regression analysis the corresponding formula is
\mathit{MS}_\text{total} = \mathit{MS}_\text{regression} + \mathit{MS}_\text{residual}.
This can also be derived from the additivity of variances, since the total (observed) score is the sum of the predicted score and the error score, where the latter two are uncorrelated.

Similar decompositions are possible for the sum of squared deviations (sum of squares, \mathit{SS}):
\mathit{SS}_\text{total} = \mathit{SS}_\text{between} + \mathit{SS}_\text{within}, \qquad \mathit{SS}_\text{total} = \mathit{SS}_\text{regression} + \mathit{SS}_\text{residual}.

Calculation from the CDF

The population variance for a non-negative random variable can be expressed in terms of the cumulative distribution function F using
2\int_0^\infty u(1 - F(u))\,du - \left(\int_0^\infty (1 - F(u))\,du\right)^2.
This expression can be used to calculate the variance in situations where the CDF, but not the density, can be conveniently expressed.
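For instance, a sketch using the exponential survival function 1 − F(u) = e^{−λu} (a rate λ = 2 is assumed), approximating both integrals by simple Riemann sums:

<syntaxhighlight lang="python">
# Sketch: variance computed from the CDF alone, for the exponential
# distribution with assumed rate lam = 2 (so 1 - F(u) = exp(-lam*u)).
# Both integrals are approximated by Riemann sums truncated at u = 50.
import math

lam = 2.0
def survival(u):
    return math.exp(-lam * u)  # 1 - F(u)

du = 1e-4
grid = [i * du for i in range(int(50.0 / du))]
first  = sum(2 * u * survival(u) * du for u in grid)  # 2 * int u(1-F(u)) du
second = sum(survival(u) * du for u in grid)          # int (1-F(u)) du

print(first - second**2)  # about 0.25 = 1/lam^2
</syntaxhighlight>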

Characteristic property

The second moment of a random variable attains the minimum value when taken around the first moment (i.e., the mean) of the random variable, i.e. \mathrm{argmin}_m \, \mathrm{E}\left(\left(X - m\right)^2\right) = \mathrm{E}(X). Conversely, if a continuous function \varphi satisfies \mathrm{argmin}_m \, \mathrm{E}(\varphi(X - m)) = \mathrm{E}(X) for all random variables X, then it is necessarily of the form \varphi(x) = a x^2 + b, where {{nowrap|a > 0}}. This also holds in the multidimensional case.<ref>{{cite journal |last1=Kagan |first1=A. |last2=Shepp |first2=L. A. |year=1998 |title=Why the variance? |journal=Statistics & Probability Letters |volume=38 |issue=4 |pages=329–333 |doi=10.1016/S0167-7152(98)00041-8}}</ref>
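The first claim can be seen numerically: scanning candidate values of m shows the mean squared deviation is smallest at the mean. A small sketch (the data set and scan grid are invented for illustration):

<syntaxhighlight lang="python">
# Sketch: among candidate constants m, the mean squared deviation
# E[(X - m)^2] is minimized at the mean (data and grid are illustrative).
xs = [2, 4, 4, 4, 5, 5, 7, 9]

def mean_sq_dev(m):
    return sum((x - m) ** 2 for x in xs) / len(xs)

candidates = [i / 100 for i in range(300, 701)]  # scan m over [3, 7]
best = min(candidates, key=mean_sq_dev)
print(best, sum(xs) / len(xs))  # both 5.0
</syntaxhighlight>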

Units of measurement

Unlike the expected absolute deviation, the variance of a variable has units that are the square of the units of the variable itself. For example, a variable measured in meters will have a variance measured in meters squared. For this reason, describing data sets via their standard deviation or root mean square deviation is often preferred over using the variance. In the dice example the standard deviation is {{math|{{sqrt|2.9}} ≈ 1.7}}, slightly larger than the expected absolute deviation of 1.5.

The standard deviation and the expected absolute deviation can both be used as an indicator of the "spread" of a distribution. The standard deviation is more amenable to algebraic manipulation than the expected absolute deviation and, together with variance and its generalization covariance, is used frequently in theoretical statistics; however, the expected absolute deviation tends to be more robust, as it is less sensitive to outliers arising from measurement anomalies or an unduly heavy-tailed distribution.

Propagation

Addition and multiplication by a constant

Variance is invariant with respect to changes in a location parameter. That is, if a constant is added to all values of the variable, the variance is unchanged:
\operatorname{Var}(X + a) = \operatorname{Var}(X).
If all values are scaled by a constant, the variance is scaled by the square of that constant:
\operatorname{Var}(aX) = a^2 \operatorname{Var}(X).
The variance of a sum of two random variables is given by
\operatorname{Var}(aX + bY) = a^2 \operatorname{Var}(X) + b^2 \operatorname{Var}(Y) + 2ab\, \operatorname{Cov}(X,Y),
\operatorname{Var}(aX - bY) = a^2 \operatorname{Var}(X) + b^2 \operatorname{Var}(Y) - 2ab\, \operatorname{Cov}(X,Y),
where \operatorname{Cov}(X,Y) is the covariance.
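These identities hold exactly for empirical (population-style) moments as well, as the following numpy sketch illustrates (the sample data and coefficients are invented):

<syntaxhighlight lang="python">
# Sketch: Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X,Y), checked
# on invented sample data with population-style (ddof=0) moments.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.0, 2.0, size=100_000)
y = 0.5 * x + rng.normal(0.0, 1.0, size=100_000)  # correlated with x
a, b = 3.0, -2.0

lhs = np.var(a * x + b * y)
cov_xy = np.cov(x, y, ddof=0)[0, 1]
rhs = a**2 * np.var(x) + b**2 * np.var(y) + 2 * a * b * cov_xy

print(lhs, rhs)  # equal up to floating-point rounding
</syntaxhighlight>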

Linear combinations

In general, for the sum of N random variables \{X_1,\dots,X_N\}, the variance becomes
\operatorname{Var}\left(\sum_{i=1}^N X_i\right) = \sum_{i,j=1}^N \operatorname{Cov}(X_i,X_j) = \sum_{i=1}^N \operatorname{Var}(X_i) + 2\sum_{1\le i<j\le N} \operatorname{Cov}(X_i,X_j);
see also the general Bienaymé's identity. These results lead to the variance of a linear combination as
\begin{align}
\operatorname{Var}\left(\sum_{i=1}^N a_i X_i\right) &= \sum_{i,j=1}^{N} a_i a_j \operatorname{Cov}(X_i,X_j) \\
&= \sum_{i=1}^N a_i^2 \operatorname{Var}(X_i) + \sum_{i\not=j} a_i a_j \operatorname{Cov}(X_i,X_j) \\
&= \sum_{i=1}^N a_i^2 \operatorname{Var}(X_i) + 2\sum_{1\le i<j\le N} a_i a_j \operatorname{Cov}(X_i,X_j).
\end{align}
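Equivalently, this is the quadratic form a^T \Sigma a, where \Sigma is the covariance matrix of the X_i; an illustrative numpy sketch (the covariance matrix and coefficients are invented):

<syntaxhighlight lang="python">
# Sketch: the linear-combination formula above is the quadratic form
# a^T Sigma a, with Sigma the covariance matrix of the X_i.
import numpy as np

rng = np.random.default_rng(42)
X = rng.multivariate_normal(mean=[0.0, 0.0, 0.0],
                            cov=[[4.0, 1.0, 0.0],
                                 [1.0, 9.0, 2.0],
                                 [0.0, 2.0, 1.0]],
                            size=200_000)
a = np.array([1.0, -2.0, 0.5])

direct = np.var(X @ a)                    # Var(sum_i a_i X_i), ddof=0
sigma  = np.cov(X, rowvar=False, ddof=0)  # empirical covariance matrix
print(direct, a @ sigma @ a)              # equal up to rounding
</syntaxhighlight>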
