Concerning linear regression, i.e. the situation where we know that E(Y | X) = X^T β for some β ∈ R^d, we proved in class that the value of β can be found by minimizing the EPE for the problem, namely E((Y − X^T β)^2), and this minimization gives rise to

β = (E(XX^T))^{-1} E(XY)        (1)

Prove this formula directly from the fact that

E(Y | X) = X^T β        (2)

In particular, (a) multiply both sides of (2) by X (from the left), then (b) take the expectation w.r.t. X on both sides (i.e. you will get E(X E(Y | X)) = E(XX^T β)), and (c) show that this implies formula (1).
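
As a numerical sanity check on formula (1) (not a substitute for the proof), the sketch below simulates data satisfying (2) and replaces the two expectations in (1) by sample means. Everything in it is an assumption made for illustration: the helper name `plug_in_beta`, the dimension d = 3, the value of `beta_true`, and the choice of standard normal X and noise. It also assumes the sample estimate of E(XX^T) is invertible, which formula (1) needs as well.

```python
import numpy as np


def plug_in_beta(X, y):
    """Sample analogue of formula (1): expectations replaced by sample means.

    X has shape (n, d), one observation of the random vector X per row;
    y has shape (n,). `plug_in_beta` is a hypothetical helper name.
    """
    n = X.shape[0]
    exx = X.T @ X / n  # estimate of E(X X^T), shape (d, d)
    exy = X.T @ y / n  # estimate of E(X Y), shape (d,)
    # Solve exx @ beta = exy rather than forming the inverse explicitly.
    return np.linalg.solve(exx, exy)


# Simulated data for which E(Y | X) = X^T beta_true holds by construction.
rng = np.random.default_rng(0)
n, d = 200_000, 3
beta_true = np.array([1.5, -2.0, 0.5])  # arbitrary illustrative value
X = rng.normal(size=(n, d))
noise = rng.normal(size=n)              # mean zero, independent of X
y = X @ beta_true + noise

beta_hat = plug_in_beta(X, y)
print("beta_true:", beta_true)
print("beta_hat: ", beta_hat)
```

With a large sample, `beta_hat` should land close to `beta_true`, which is consistent with (1); the agreement is only approximate because sample means stand in for the expectations.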