2.1 What is Statistical Learning?


2.1.1 Why Estimate $f$?

There are two main reasons that we may wish to estimate $f$: prediction and inference.

Prediction

In many situations, a set of inputs $X$ is readily available, but the output $Y$ cannot be easily obtained. In this setting, since the error term averages to zero, we can predict $Y$ using

$$ \hat{Y}=\hat{f}(X) $$

In this setting, $\hat{f}$ is often treated as a black box, in the sense that one is not typically concerned with the exact form of $\hat{f}$, provided that it yields accurate predictions for $Y$.
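The prediction step can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a prescribed method: the true function `f`, the noise level, the sample size, and the choice of a least-squares line for `f_hat` are all hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical true relationship, chosen only for illustration.
def f(x):
    return 2.0 + 3.0 * x

# Simulate training data Y = f(X) + eps, where eps has mean zero.
x_train = rng.uniform(0.0, 1.0, size=200)
y_train = f(x_train) + rng.normal(0.0, 0.5, size=200)

# Estimate f with a least-squares line (an assumed choice;
# any other fitting method could be swapped in here).
slope, intercept = np.polyfit(x_train, y_train, deg=1)

def f_hat(x):
    # Used as a black box: callers only need its predictions.
    return intercept + slope * x

# Predict Y for new inputs whose outputs have not been observed.
x_new = np.array([0.1, 0.5, 0.9])
y_hat = f_hat(x_new)
print(y_hat)
```

Only the predictions `y_hat` are used downstream; the fitted coefficients are never inspected, which is what treating $\hat{f}$ as a black box amounts to here.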

The accuracy of $\hat{Y}$ as a prediction for $Y$ depends on two quantities, which we will call the reducible error and the irreducible error.
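The two quantities can be made explicit with a short derivation. It assumes the usual model $Y = f(X) + \epsilon$, where the error term $\epsilon$ has mean zero and is independent of $X$, and it treats both $\hat{f}$ and $X$ as fixed, so that the only randomness comes from $\epsilon$:

$$ E(Y-\hat{Y})^2 = E[f(X)+\epsilon-\hat{f}(X)]^2 = \underbrace{[f(X)-\hat{f}(X)]^2}_{\text{reducible}} + \underbrace{\operatorname{Var}(\epsilon)}_{\text{irreducible}} $$

The cross term $2[f(X)-\hat{f}(X)]\,E(\epsilon)$ vanishes because $E(\epsilon)=0$. A better estimate $\hat{f}$ can shrink the first term, but no choice of $\hat{f}$ affects $\operatorname{Var}(\epsilon)$.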

Inference

In this situation, we wish to estimate $f$, but our goal is not necessarily to make predictions for $Y$. We want to answer the following questions:

2.1.2 How Do We Estimate $f$?

Broadly, there are two types of statistical learning methods: parametric and non-parametric.
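The distinction can be sketched on simulated data. The example below is illustrative only and rests on assumptions: the sine-shaped truth `f`, the noise level, the linear model standing in for a parametric method, and the $k$-nearest-neighbors average (with the hypothetical default `k=15`) standing in for a non-parametric one.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical nonlinear truth, chosen only for illustration.
def f(x):
    return np.sin(2.0 * np.pi * x)

x_train = rng.uniform(0.0, 1.0, size=300)
y_train = f(x_train) + rng.normal(0.0, 0.2, size=300)

# Parametric: assume f is linear, so estimating f reduces to
# estimating two coefficients.
slope, intercept = np.polyfit(x_train, y_train, deg=1)

def f_hat_linear(x):
    return intercept + slope * np.asarray(x)

# Non-parametric: no functional form is assumed; each prediction is
# the average response of the k nearest training points.
def f_hat_knn(x, k=15):
    x = np.atleast_1d(x)
    dist = np.abs(x[:, None] - x_train[None, :])
    nearest = np.argsort(dist, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)

# Compare both estimates against the (normally unknown) truth.
x_grid = np.linspace(0.05, 0.95, 50)
mse_linear = np.mean((f_hat_linear(x_grid) - f(x_grid)) ** 2)
mse_knn = np.mean((f_hat_knn(x_grid) - f(x_grid)) ** 2)
print(mse_linear, mse_knn)
```

In this particular setup the linear assumption is wrong, so the non-parametric estimate tracks the truth more closely. If the truth were close to linear, the parametric fit would need far fewer observations to do well.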