Transcript for:
Minimum Possible Value of Correlation (Rho)

X, Y and Z are three random variables with mutual pairwise correlation rho. What is the minimum possible value that rho can take?

Before moving on with the actual question, let's establish some base concepts and terminology. If you are comfortable with the notions of correlation, covariance and positive semi-definite matrices, you can skip right ahead.

As a refresher, let us define the correlation between two random variables: divide the covariance of X and Y by the product of X's and Y's standard deviations. The formula for the covariance is the expectation of the product of the centered versions of X and Y. Expand the product, apply the linearity of the expectation in the numerator, write the standard deviations as the square roots of the formula for variance, and you will get an alternative equation for the correlation coefficient.

Here is one concise example to quiz your newly acquired knowledge. If we look at the definitions of covariance and correlation, we can see that the covariance is not bounded, since it scales with the magnitude of each variable. This implies that if the question were about the minimum covariance, the answer would be undefined, since it could take any real value. On the other hand, the correlation is always in the interval [-1, 1], with one proof of it up on the problem's web page.

We define the correlation matrix of three variables, with the possibility to generalize to more, as the matrix whose elements are the pairwise correlations of the variables. One of the basic properties of the correlation matrix is that it is positive semi-definite. A proof of this that uses the definition of such a matrix can also be found on our website. Next we define the terms minor and leading principal minor: a minor is the determinant of a submatrix, and the leading principal minor of order k is the determinant of the matrix obtained by deleting the last n minus k rows and columns from the original n by n matrix. Finally, we need one last property of a positive semi-definite matrix: all its leading principal minors are non-negative.

When asked to give a minimum value, two things must
be done: find the lower bound, and then prove that this bound is attainable by providing an example.

The loosest bounds for rho are minus one and one. We can easily see that rho cannot be equal to minus one: if the pairs X, Y and Y, Z have a correlation of minus one, the correlation between X and Z must be one. We could also choose X, Y and Z independent, in which case rho would be zero. So our minimal value is in the interval between minus one and zero, open at minus one.

To get a tighter inequality, we use the necessary properties of the correlation matrix outlined in the first part of the video. Write the correlation matrix of X, Y and Z and set the conditions for it to be positive semi-definite. Successively eliminate trailing rows and columns to get the leading principal minors. The first is 1, clearly non-negative. The second is 1 - rho^2, which is non-negative for values of rho between -1 and 1; this condition on the correlation is already satisfied due to the previously mentioned constraints. The third and last one is the determinant of the matrix itself. We use Sarrus's rule to compute the determinant, which is 1 + 2 rho^3 - 3 rho^2. We factor the polynomial and get that (1 - rho)^2 (1 + 2 rho) is greater than or equal to zero. This implies that either rho is 1, or 1 + 2 rho is greater than or equal to zero, so rho is greater than or equal to -1/2.

Now we have reduced the interval for rho, and the minimal value that it can take is at least -1/2. If we find a triplet of random variables with pairwise correlations of -0.5, we have proved that this value is attainable, hence the minimum. Unfortunately, this is the tricky part. One way to construct such variables is: let A_1, A_2 and A_3 be independent, identically distributed standard normal random variables, and consider X_i to be A_i minus the mean of the variables A, for each value of i between 1 and 3.
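As a quick check of the determinant argument above, here is a minimal Python sketch that evaluates the three leading principal minors of the 3 by 3 correlation matrix with all off-diagonal entries equal to rho, and compares the last one with the factored form (1 - rho)^2 (1 + 2 rho). The function names `equicorrelation_matrix` and `leading_principal_minors` are illustrative choices, not something from the video, and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction

def equicorrelation_matrix(rho):
    """3x3 correlation matrix with every off-diagonal entry equal to rho."""
    return [[Fraction(1) if i == j else Fraction(rho) for j in range(3)]
            for i in range(3)]

def leading_principal_minors(m):
    """Return the 1x1, 2x2 and 3x3 leading principal minors of a 3x3 matrix."""
    first = m[0][0]
    second = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    # Sarrus's rule for the full 3x3 determinant
    third = (m[0][0] * m[1][1] * m[2][2]
             + m[0][1] * m[1][2] * m[2][0]
             + m[0][2] * m[1][0] * m[2][1]
             - m[0][2] * m[1][1] * m[2][0]
             - m[0][0] * m[1][2] * m[2][1]
             - m[0][1] * m[1][0] * m[2][2])
    return first, second, third

for rho in (Fraction(-3, 5), Fraction(-1, 2), Fraction(0), Fraction(1, 2)):
    minors = leading_principal_minors(equicorrelation_matrix(rho))
    factored = (1 - rho) ** 2 * (1 + 2 * rho)
    # The third minor should match the factored polynomial for every rho
    print(rho, minors, minors[2] == factored)
```

For rho = -1/2 the third minor is exactly zero, and for rho = -3/5 it is negative, which is what rules out values below -1/2.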
If we expand the average and collect the coefficients, we get a simplified formula for the X_i's: X_1 = (2/3) A_1 - (1/3) A_2 - (1/3) A_3, and similarly for X_2 and X_3. We compute the variance of the X_i's by using the formula for the variance of a linear combination of independent random variables, in our case the A_i's. The variance of A_i is 1 and the covariance of A_i and A_j is 0, so the variance of X_i is 2/3 for any value of i.

Thinking back to the correlation formula, we are missing the covariance between X_i and X_k. We again linearly expand the covariance, keeping in mind that the covariance of independent random variables is zero and the covariance of a random variable with itself is its variance. The result for the covariance of X_i and X_k is -1/3. Putting the two formulas together, we get that the correlation of X_i and X_k is -1/2. This example implies that the value of -0.5 is attainable, hence it is the minimum value that rho can take.

We have the correct result for the case of three random variables. Can we generalize it? How about the minimum value of rho when we have n random variables with equal pairwise correlations? Just as before, we consider the correlation matrix and its properties. Its determinant must be at least zero. Computing it is not as trivial, since there is no shortcut like Sarrus's rule for an n by n matrix. We can use the expansion along a column and induction to prove that this determinant D is (1 - rho)^(n-1) (1 + (n-1) rho). For this to be greater than or equal to 0, we must have that rho is at least -1/(n-1).
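The closed form for the n by n determinant can be sanity-checked without redoing the induction. The sketch below is not the proof from the video: it simply computes the determinant by exact Gaussian elimination and compares it with (1 - rho)^(n-1) (1 + (n-1) rho) for a few small n. The helper names `determinant`, `equicorrelation_det` and `closed_form` are hypothetical.

```python
from fractions import Fraction

def determinant(matrix):
    """Exact determinant by Gaussian elimination over Fractions."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero pivot
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # a zero column means the matrix is singular
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def equicorrelation_det(n, rho):
    """Determinant of the n x n matrix with 1 on the diagonal and rho elsewhere."""
    return determinant([[1 if i == j else rho for j in range(n)] for i in range(n)])

def closed_form(n, rho):
    """The formula stated above: (1 - rho)^(n-1) * (1 + (n-1) * rho)."""
    return (1 - rho) ** (n - 1) * (1 + (n - 1) * rho)

for n in range(2, 7):
    bound = Fraction(-1, n - 1)
    # The determinant should vanish exactly at the lower bound -1/(n-1)
    print(n, bound, equicorrelation_det(n, bound), closed_form(n, bound))
```

Agreement on a handful of values is only a check, not a substitute for the induction argument.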
We again construct the random variables X_i as the difference between A_i and the mean of the A's, where the A_i are i.i.d. standard normal. With similar reasoning as in the previous part, we compute the variance of X_i: it is (n-1)/n. The covariance of X_i and X_k turns out to be -1/n. From the correlation formula, we obtain the correlation between any distinct X_i and X_k to be -1/(n-1), exactly the lower bound we observed before.

This generalization is consistent with our result for n equals three. At the same time, we can see that the value of the minimal correlation converges to zero when n goes to infinity. This aligns with the natural idea that adding more random variables to a set will increase the minimum correlation.

Thanks for watching. If you enjoyed this and would love to see more videos like this, subscribe to the channel and hit the notification bell to be notified when new videos are released. Leave any comments about this problem below or on the problem's dedicated web page. For more info, please check the description box below. See you next time.
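As a companion to the construction X_i = A_i minus the mean of the A's, the following sketch computes the covariance matrix of the X_i exactly, assuming the A_i are independent with variance 1 (the assumption the variance and covariance values above rely on). It writes X = C A with C the centering matrix I - (1/n) J, so that Cov(X) = C C^T. The function names are illustrative and not from the video.

```python
from fractions import Fraction

def centered_covariance(n):
    """Covariance matrix of X_i = A_i - mean(A), assuming Cov(A) is the identity."""
    # Centering matrix C = I - (1/n) * J, where J is the all-ones matrix
    c = [[(1 if i == j else 0) - Fraction(1, n) for j in range(n)]
         for i in range(n)]
    # With Cov(A) = I, the covariance of X = C A is C C^T
    return [[sum(c[i][k] * c[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]

def pairwise_correlation(n):
    """Correlation between two distinct X_i and X_k."""
    cov = centered_covariance(n)
    # All variances are equal, so sqrt(Var(X_i) * Var(X_k)) equals Var(X_i)
    return cov[0][1] / cov[0][0]

for n in (3, 4, 5, 10):
    cov = centered_covariance(n)
    # Prints n, Var(X_i), Cov(X_i, X_k) and the resulting correlation
    print(n, cov[0][0], cov[0][1], pairwise_correlation(n))
```

For n = 3 this gives variance 2/3, covariance -1/3 and correlation -1/2, and for general n the correlation is -1/(n-1), which moves toward zero as n grows.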