Linear Regression Refresher
Part 1 - The Framework
January 01, 2020
Feel like you know what linear regression is but didn’t learn the math, or just forgot and need a refresher? Join the club! We’ve all been there, whether we admit it or not :) A little math knowledge is assumed but my hope is that the parts that may be unfamiliar aren’t necessary to get the big picture.
Gist
Linear regression is a method for modelling a process with a quantitative output using observed data and simple assumptions.
Input -> System -> Output
Input -> System + Noise -> Output
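To make the "System + Noise" picture concrete, here is a minimal simulation sketch. The specific system (a line with slope 3 and intercept 1) and the noise scale are hypothetical choices, just to have something to run:

```python
import numpy as np

rng = np.random.default_rng(0)

def system(x):
    # The unknown real-world process (a hypothetical linear choice here)
    return 3.0 * x + 1.0

x = rng.uniform(0, 10, size=100)       # input
noise = rng.normal(0.0, 1.0, size=100) # noise mixed into the system
y = system(x) + noise                  # output we actually observe
```

All we ever get to see are `x` and `y`; `system` and `noise` are hidden from us.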
Learning Regression
If you are just looking for a refresher, you can skip this section.
I think one big stumbling block to learning about linear regression, especially for the first time, is that it’s usually the introduction of several different concepts at once, all of which actually are pretty important! Whether or not these exact terms are used, to understand linear regression end-to-end you have to understand:
- loss functions
- function minimization
The Regression Problem
Let’s consider a minimal version of the regression problem: given our data in the form of a dependent variable $Y$ and an independent variable $X$1, we want to find a function $f$ so that we can model the unknown system that generated our data. Our model looks like this,

$$Y = f(X) + \epsilon$$

where $\epsilon$ represents the error that comes from imperfect data2. This formulation is at the heart of everything we are doing3, so let’s linger here for a minute.
What is $f$?
Typically there is no actual hope of finding the “real” $f$ - we can’t possibly account for all the factors that go into real-world phenomena! There are unmeasured variables, changing conditions, and interns who knock over test tubes in the lab… there is always noise. So what we do instead is estimate $f$ with $\hat{f}$, to make inferences and predictions about $Y$. The equation we actually deal with is then

$$\hat{Y} = \hat{f}(X)$$

I will keep using $f$ when talking about the function of $X$ we are trying to find, though.
We will assume as usual that $\epsilon$ is Normally distributed with mean zero and is independent from $X$. Why is this important? Well, one thing we are often interested in is the expected value of $Y$ given an input. Formulating this will help us figure out where to go next:

$$E[Y \mid X = x] = E[f(X) + \epsilon \mid X = x] = f(x) + E[\epsilon] = f(x)$$

so what we see is that all we have to do to estimate the expected value of $Y$ is calculate $\hat{f}(x)$ - pretty cool!
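We can sanity-check that conditional-mean claim empirically: if we fix the input and draw many noisy outputs, their average should land on $f$ at that point. A small sketch, with a hypothetical $f(x) = 2x + 3$ and unit-variance Normal noise:

```python
import numpy as np

rng = np.random.default_rng(1)

def f(x):
    # Hypothetical "true" function, for illustration only
    return 2.0 * x + 3.0

# Many draws of Y at a single fixed input x0: Y = f(x0) + eps
x0 = 4.0
eps = rng.normal(0.0, 1.0, size=100_000)
y = f(x0) + eps

# The sample mean of Y at X = x0 should be close to f(x0) = 11
print(np.mean(y), f(x0))
```

The zero-mean noise averages itself away, which is exactly what the expectation calculation above says should happen.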
When $f$ is a linear function of $X$, we call this task linear regression.
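In the linear case, estimating $\hat{f}$ amounts to estimating an intercept and a slope. One standard way to do this is ordinary least squares; here is a sketch using NumPy's least-squares solver on simulated data (the true coefficients 2 and 3 are hypothetical, chosen so we can check the recovery):

```python
import numpy as np

rng = np.random.default_rng(2)

# Noisy data from a hypothetical linear system: y = 2x + 3 + eps
x = rng.uniform(0, 10, size=500)
y = 2.0 * x + 3.0 + rng.normal(0.0, 1.0, size=500)

# f_hat(x) = b0 + b1 * x, estimated by ordinary least squares
X = np.column_stack([np.ones_like(x), x])  # design matrix with an intercept column
b0, b1 = np.linalg.lstsq(X, y, rcond=None)[0]
print(b0, b1)  # close to (3, 2)
```

The design matrix's column of ones is what lets the intercept be estimated alongside the slope.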
Assumption: Our Data Is Imperfect
How We Solve The Regression Problem
We don’t usually want just any function though - we want the best function! As mathematicians, that word “best” should raise a lot of questions, and this is where we need the concept of a loss function. Without going too deep, a loss function provides the framework for evaluating the performance of $\hat{f}$ and defines what “best” means by answering the question: how off is our guess of $Y$? We then have two tasks to solve in our simplified regression problem:
- Choose an appropriate loss function
- Find the function $\hat{f}$ such that the value of the loss function is the lowest
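Those two tasks can be sketched in a few lines: define a loss, then pick the candidate function that minimizes it. This toy version searches over just three hand-picked candidates rather than all possible functions, and the data-generating line $2x + 3$ is a hypothetical choice:

```python
import numpy as np

rng = np.random.default_rng(3)

x = rng.uniform(0, 10, size=200)
y = 2.0 * x + 3.0 + rng.normal(0.0, 1.0, size=200)

def squared_error_loss(f_hat):
    # Task 1: a loss function — how "off" are the guesses of Y, summed over the data?
    return np.sum((y - f_hat(x)) ** 2)

# Task 2: find the candidate with the lowest loss
candidates = {
    "f(x) = x":      lambda x: x,
    "f(x) = 2x + 3": lambda x: 2.0 * x + 3.0,
    "f(x) = 3x":     lambda x: 3.0 * x,
}
best = min(candidates, key=lambda name: squared_error_loss(candidates[name]))
print(best)
```

Real regression minimizes over a continuous family of functions rather than a finite menu, but the structure of the problem — a loss plus a minimization — is the same.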
In introductory stats, the loss function we use is almost always the squared error loss. Often we are not even taught that this is part of the :sparkles: art :sparkles: of regression! It is common enough in today’s ML world that I think it’s useful not to take it as a given here.
Squared Errors
Minimizing Squared Errors, Conditional Mean
Why not another loss function?