Conversation

@Shubham-Agarwall

No description provided.

@EricThomson left a comment

This is coming along great, thanks a lot! Just a few things:
1. More on error metrics
If you could say more about the error metrics when you introduce them, that would be helpful; right now it's just one line about MSE and R^2. Since you use RMSE (which is a better measure) later, I think introducing MSE and then RMSE at that point would be good. RMSE is better because it's in the same units as the original dataset. So you have a nice image for MSE, and RMSE we could handle with something like:

MSE is the average of squared errors. Squaring makes big misses matter more, but it also makes the units weird (dollars^2, mpg^2). So most people look at RMSE = sqrt(MSE). RMSE is in the same units as the target: “our price predictions are off by about $48K on average.” If you want a number with real-world meaning, use RMSE.
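A tiny sketch could go right under that paragraph to make the units point concrete (the prices here are made up, purely for illustration):

```python
# Illustrative only: made-up house prices to show MSE vs. RMSE units.
import numpy as np

y_true = np.array([200_000, 350_000, 500_000, 275_000])  # actual prices (dollars)
y_pred = np.array([230_000, 310_000, 520_000, 260_000])  # model's predictions

mse = np.mean((y_true - y_pred) ** 2)  # units: dollars squared (hard to interpret)
rmse = np.sqrt(mse)                    # back to plain dollars

print(f"MSE:  {mse:,.0f} dollars^2")
print(f"RMSE: {rmse:,.0f} dollars")    # "off by about $28K on average"
```

The RMSE number is the one a reader can actually picture.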

Then someone may wonder why we need R^2 if we have RMSE. They are related, but we could say something like:

R2 compares your model to a super-simple model that always predicts the average of y. Think of that baseline as your “just guess the average” strategy. Each of these models has an error measured by MSE. R2 tells you how much smaller the model's error is compared to the baseline error!

R2 = 1 - (your model’s MSE) / (baseline’s MSE)

If R2 = 0.72, your model got rid of 72% of the error you would make by always guessing the average. If R2 = 0, your model is no better than guessing the average. In more plain language, R2 measures the fraction of the original error (measured as MSE) that the model removes.
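We could even show the "fraction of baseline error removed" idea with a few lines of code (toy numbers, just to make the formula tangible):

```python
# Toy data to show R^2 = 1 - (model MSE) / (baseline MSE).
import numpy as np

y_true = np.array([3.0, 5.0, 7.0, 9.0, 11.0])
y_pred = np.array([3.5, 4.5, 7.0, 9.5, 10.5])  # pretend these came from our model

model_mse = np.mean((y_true - y_pred) ** 2)
# The "just guess the average" baseline always predicts y_true.mean()
baseline_mse = np.mean((y_true - y_true.mean()) ** 2)

r2 = 1 - model_mse / baseline_mse
print(f"R^2 = {r2:.3f}")
```

This also matches what `sklearn.metrics.r2_score` computes, so it connects directly back to the code example.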

One thing you might point out is that R2 is literally the square of the Pearson correlation coefficient that you discussed back in week 1! This makes intuitive sense: if the data all fall along a line, then the regression model is a great fit! We could say something here connecting the two lessons:

We learned in Week 1 that correlation measures how strongly two variables move together. In simple one-feature regression, R2 has a direct connection to correlation: it is literally the square of the Pearson correlation between the feature and the target. So if the correlation is strong, R2 will be high. If the correlation is weak, R2 will be low.
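If we wanted, a short demo could verify this identity on synthetic data (random numbers here, nothing from the lesson's dataset):

```python
# Check: for a one-feature least-squares fit, R^2 equals the squared Pearson correlation.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2.0 * x + rng.normal(scale=0.5, size=50)  # linear trend plus noise

# Fit the least-squares line and compute R^2 from its predictions
slope, intercept = np.polyfit(x, y, 1)
y_pred = slope * x + intercept
r2 = 1 - np.mean((y - y_pred) ** 2) / np.mean((y - y.mean()) ** 2)

pearson = np.corrcoef(x, y)[0, 1]
print(r2, pearson ** 2)  # the two numbers agree
```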

Also, we can add something to the lesson that explains how regression is actually performed: it uses least squares, literally finding the "best" line, the one that minimizes the MSE. We could say something like this:

Linear regression finds the straight line that keeps the prediction errors (in the MSE sense) as small as possible, using a method called least squares.

It would be useful to mention this, so the method that scikit-learn uses to find the line isn't a mystery.
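If we want to go one step further, the one-feature least-squares solution even has a small closed form we could show (toy numbers again), so "minimizes MSE" isn't just an assertion:

```python
# Closed-form least-squares line for one feature (toy data for illustration).
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# slope = cov(x, y) / var(x); intercept makes the line pass through the means
slope = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
intercept = y.mean() - slope * x.mean()
print(slope, intercept)

# Nudging the slope away from the least-squares value increases the MSE:
best_mse = np.mean((y - (slope * x + intercept)) ** 2)
worse_mse = np.mean((y - ((slope + 0.1) * x + intercept)) ** 2)
print(best_mse < worse_mse)  # True
```

This is the same line `LinearRegression.fit` finds, so the scikit-learn call stops being a black box.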

2. The code example
The code example is pretty good; I just recommend interspersing more explanation with the code, rather than having an explanation first and then a bunch of code. Also, it says things like "The model is doing an okay job; Predictions are reasonable, not perfect". This is pretty generic; let's add some actual numbers and discuss what they mean more concretely.

Just more generally, if you can fill out this example with more walkthrough and explanation that would be great.

3. Minor: images are not rendering
Not sure why; I didn't look at the paths.

Thanks this is looking good!
