It’s well known that the average of $n$ i.i.d. r.v.’s with variance $\sigma^2$ has variance $\frac{1}{n}\sigma^2$.

Suppose instead that the r.v.’s $X_i$ are only identically distributed, with variance $\sigma^2$ and pairwise correlation $\rho$. Then we have:

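$$V\left(\frac{1}{n}\sum_{i=1}^{n}X_i\right)=\frac{1}{n^2}\left(n\sigma^2+n(n-1)\rho\sigma^2\right)=\frac{\sigma^2}{n}+\frac{(n-1)\rho\sigma^2}{n}=\rho\sigma^2+\frac{1-\rho}{n}\sigma^2.$$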

This equation suggests that as the size $n$ increases, the second term vanishes but the first term remains constant. So averaging over more and more correlated r.v.’s will not keep reducing the variance: it levels off at $\rho\sigma^2$.
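As a quick sanity check, the closed form $\rho\sigma^2+\frac{1-\rho}{n}\sigma^2$ can be compared with a direct computation that sums every entry of the covariance matrix and divides by $n^2$. This is only a minimal sketch: the function names and the choice $\rho=0.3$ are illustrative assumptions, not part of the derivation.

```python
def var_of_mean_formula(n, rho, sigma2=1.0):
    """Closed form: rho*sigma^2 + (1 - rho)*sigma^2 / n."""
    return rho * sigma2 + (1.0 - rho) * sigma2 / n


def var_of_mean_direct(n, rho, sigma2=1.0):
    """Sum all n*n covariance entries, then divide by n^2."""
    total = 0.0
    for i in range(n):
        for j in range(n):
            # Diagonal entries are variances, off-diagonal ones are covariances.
            total += sigma2 if i == j else rho * sigma2
    return total / n**2


# Illustrative value rho = 0.3: both columns should agree and approach 0.3.
for n in (1, 10, 100):
    print(n, var_of_mean_formula(n, 0.3), var_of_mean_direct(n, 0.3))
```

The two columns should agree up to floating-point rounding and settle near $\rho\sigma^2$ rather than shrinking to zero.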

Wait a second! Our derivation of the above equation doesn’t depend on the sign of $\rho$. But the equation seems to fail when $\rho<0$, since the variance would then be negative for sufficiently large $n$. What’s wrong with it?

Let’s consider the simple case when $n=3$ and $\rho=-1$. First of all, is it even possible to construct 3 such r.v.’s? Since the $X_i$ are identically distributed, if $\rho_{X_1,X_2}=-1$ and $\rho_{X_1,X_3}=-1$, we have $X_2=-X_1$ and $X_3=-X_1$ (taking the common mean to be zero). But then $X_2=X_3$, so $\rho_{X_2,X_3}=1$, which contradicts the requirement that $\rho_{X_2,X_3}=-1$.

Actually, we cannot have an arbitrary number of r.v.’s that are identically distributed with a negative pairwise correlation. We will show this using the fact that a correlation matrix must be positive semi-definite, so its determinant must be nonnegative.

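Let $\rho>0$, and denote the $n\times n$ correlation matrices for positive and negative pairwise correlations by

$$\Sigma(\rho)=\begin{pmatrix}1&\rho&\cdots&\rho\\\rho&1&\cdots&\rho\\\vdots&\vdots&\ddots&\vdots\\\rho&\rho&\cdots&1\end{pmatrix}\quad\text{and}\quad\Sigma(-\rho)=\begin{pmatrix}1&-\rho&\cdots&-\rho\\-\rho&1&\cdots&-\rho\\\vdots&\vdots&\ddots&\vdots\\-\rho&-\rho&\cdots&1\end{pmatrix},$$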

respectively.

Then we have the following:

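$$\begin{aligned}
\det\Sigma(\rho)&=\det\left(\begin{pmatrix}1-\rho&&\\&\ddots&\\&&1-\rho\end{pmatrix}+\begin{pmatrix}\sqrt{\rho}\\\vdots\\\sqrt{\rho}\end{pmatrix}\begin{pmatrix}\sqrt{\rho}&\cdots&\sqrt{\rho}\end{pmatrix}\right)\\
&=\left(1+\begin{pmatrix}\sqrt{\rho}&\cdots&\sqrt{\rho}\end{pmatrix}\begin{pmatrix}(1-\rho)^{-1}&&\\&\ddots&\\&&(1-\rho)^{-1}\end{pmatrix}\begin{pmatrix}\sqrt{\rho}\\\vdots\\\sqrt{\rho}\end{pmatrix}\right)(1-\rho)^n\\
&=\left(1+\frac{n\rho}{1-\rho}\right)(1-\rho)^n>0,\quad\forall\rho\in(0,1),
\end{aligned}$$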

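where we used the matrix determinant lemma: $\det(A+uv^\top)=(1+v^\top A^{-1}u)\det(A)$. It shows that $\det\Sigma(\rho)$ is always positive for $\rho\in(0,1)$. Since every leading principal submatrix of $\Sigma(\rho)$ has the same form with a smaller $n$, all leading principal minors are positive, and $\Sigma(\rho)$ is positive definite by Sylvester’s criterion. So there’s no problem with identically distributed r.v.’s that have a positive pairwise correlation $\rho$.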

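Now, let’s compute the determinant of $\Sigma(-\rho)$:

$$\begin{aligned}
\det\Sigma(-\rho)&=\det\left(\begin{pmatrix}1+\rho&&\\&\ddots&\\&&1+\rho\end{pmatrix}+\begin{pmatrix}-\sqrt{\rho}\\\vdots\\-\sqrt{\rho}\end{pmatrix}\begin{pmatrix}\sqrt{\rho}&\cdots&\sqrt{\rho}\end{pmatrix}\right)\\
&=\left(1+\begin{pmatrix}\sqrt{\rho}&\cdots&\sqrt{\rho}\end{pmatrix}\begin{pmatrix}(1+\rho)^{-1}&&\\&\ddots&\\&&(1+\rho)^{-1}\end{pmatrix}\begin{pmatrix}-\sqrt{\rho}\\\vdots\\-\sqrt{\rho}\end{pmatrix}\right)(1+\rho)^n\\
&=\left(1-\frac{n\rho}{1+\rho}\right)(1+\rho)^n.
\end{aligned}$$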
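Both determinant formulas are instances of one expression: for an $n\times n$ matrix with ones on the diagonal and $r$ elsewhere, the determinant is $\left(1+\frac{nr}{1-r}\right)(1-r)^n$, with $r=\rho$ or $r=-\rho$. The sketch below checks that expression against an exact Gaussian-elimination determinant. The helper names are hypothetical, and exact rational arithmetic via `fractions.Fraction` is an implementation choice made here to avoid rounding issues.

```python
from fractions import Fraction


def equicorrelation(n, r):
    """n x n matrix with 1 on the diagonal and r everywhere else."""
    r = Fraction(r)
    return [[Fraction(1) if i == j else r for j in range(n)] for i in range(n)]


def det(matrix):
    """Exact determinant by Gaussian elimination with row swaps."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for k in range(n):
        # Find a row at or below k with a nonzero entry in column k.
        pivot_row = next((i for i in range(k, n) if a[i][k] != 0), None)
        if pivot_row is None:
            return Fraction(0)  # the whole column is zero: singular matrix
        if pivot_row != k:
            a[k], a[pivot_row] = a[pivot_row], a[k]
            result = -result  # a row swap flips the sign
        result *= a[k][k]
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
    return result


def det_closed_form(n, r):
    """(1 + n*r/(1 - r)) * (1 - r)^n, valid for r != 1."""
    r = Fraction(r)
    return (1 + n * r / (1 - r)) * (1 - r) ** n


for n in (2, 3, 4, 5):
    for r in (Fraction(1, 2), Fraction(-1, 5), Fraction(-1, 2)):
        print(n, r, det(equicorrelation(n, r)), det_closed_form(n, r))
```

The last two columns should match on every row, and for the negative values of $r$ the determinant changes sign as $n$ grows.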

To ensure $\det\Sigma(-\rho)\ge 0$, we need the condition that

$$1-\frac{n\rho}{1+\rho}\ge 0\iff 1+\rho\ge n\rho\iff\rho\le\frac{1}{n-1}\iff-\rho\ge\frac{1}{1-n}.$$

Thus, we see that, for $n$ r.v.’s that are identically distributed with a negative pairwise correlation $\rho$, $\rho$ actually cannot be smaller than $\frac{1}{1-n}$. For example, 3 identically distributed r.v.’s cannot have a pairwise correlation smaller than $-\frac{1}{2}$, and this bound becomes $-\frac{1}{3}$ for 4 identically distributed r.v.’s.
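This bound also resolves the earlier puzzle about negative variance. Plugging the smallest admissible correlation $\frac{1}{1-n}$ into the variance formula $\rho\sigma^2+\frac{1-\rho}{n}\sigma^2$ gives exactly zero, and only inadmissible values below the bound give a negative result. The snippet below is a small sketch of that check in exact arithmetic; the function names are illustrative, and $n\ge 2$ is assumed.

```python
from fractions import Fraction


def min_correlation(n):
    """Lower bound 1/(1 - n) on the common pairwise correlation (n >= 2)."""
    return Fraction(1, 1 - n)


def var_of_mean(n, rho, sigma2=1):
    """Variance of the average: rho*sigma^2 + (1 - rho)*sigma^2 / n."""
    rho = Fraction(rho)
    return rho * sigma2 + (1 - rho) * Fraction(sigma2) / n


for n in (2, 3, 4, 10):
    bound = min_correlation(n)
    # At the bound the variance of the average is exactly zero;
    # slightly below the bound the formula turns negative.
    below = bound - Fraction(1, 100)
    print(n, bound, var_of_mean(n, bound), var_of_mean(n, below))
```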

Lastly, we make two more observations based on the above findings.

  • For identically distributed r.v.’s $X_i$ with pairwise correlation $\rho$, what are the maximum and minimum values of $\rho$?
$$\frac{1}{1-n}\le\rho\le 1.$$
  • How do we construct such $X_i$ with pairwise correlation $\rho$?

Well, for a positive pairwise correlation, we’ve shown that $\Sigma(\rho)$ is always positive definite. Thus, there exists a decomposition $\Sigma(\rho)=CC^\top$ for some matrix $C$ (for example, the Cholesky factor). Then, for any i.i.d. r.v.’s $Z_i$ with unit variance, the linear transformation $X=CZ$ has the desired variance-covariance matrix, since

$$V(X)=V(CZ)=C\,V(Z)\,C^\top=CC^\top=\Sigma(\rho).$$

For a negative pairwise correlation $\rho$, as long as $\rho>\frac{1}{1-n}$, we know $\Sigma(\rho)$ is also positive definite, so the above construction works in this case as well.
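The construction can be sketched in plain Python as follows. Several choices here are assumptions rather than anything fixed by the argument: $C$ is taken to be the lower-triangular Cholesky factor, the $Z_i$ are standard normal (so the resulting $X_i$ are jointly Gaussian and therefore also identically distributed), and the values $n=4$, $\rho=-0.3$, the seed, and all helper names are illustrative.

```python
import math
import random


def equicorrelation(n, rho):
    """n x n matrix with 1.0 on the diagonal and rho everywhere else."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]


def cholesky(a):
    """Lower-triangular C with C C^T = a; ValueError if a is not positive definite."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(c[i][k] * c[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                c[i][i] = math.sqrt(s)
            else:
                c[i][j] = s / c[j][j]
    return c


def sample(c, rng):
    """Draw X = C Z with Z i.i.d. standard normal (a modelling assumption)."""
    z = [rng.gauss(0.0, 1.0) for _ in c]
    return [sum(c[i][k] * z[k] for k in range(i + 1)) for i in range(len(c))]


def sample_correlation(xs, i, j):
    """Empirical correlation between coordinates i and j of the draws xs."""
    m = len(xs)
    mean_i = sum(x[i] for x in xs) / m
    mean_j = sum(x[j] for x in xs) / m
    cov = sum((x[i] - mean_i) * (x[j] - mean_j) for x in xs) / m
    var_i = sum((x[i] - mean_i) ** 2 for x in xs) / m
    var_j = sum((x[j] - mean_j) ** 2 for x in xs) / m
    return cov / math.sqrt(var_i * var_j)


rng = random.Random(0)   # fixed seed so the run is repeatable
n, rho = 4, -0.3         # admissible because -0.3 > 1/(1 - 4) = -1/3
c = cholesky(equicorrelation(n, rho))
xs = [sample(c, rng) for _ in range(20000)]
print(sample_correlation(xs, 0, 1), sample_correlation(xs, 2, 3))
```

The printed sample correlations should land close to $\rho$, up to sampling noise. Asking for a correlation below the bound, for instance `cholesky(equicorrelation(3, -0.9))`, raises `ValueError`, because the matrix is no longer positive definite.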