Metrics you can use to evaluate your Regression Model:

1.       R Square (Coefficient of Determination) - This metric is the proportion of the variance in the response that is explained by the covariates in the model. For a least-squares model with an intercept, evaluated on the data it was fitted to, it ranges between 0 and 1. It is good practice to look at adjusted R² rather than R² alone when judging model fit.

2.       Adjusted R² - The problem with R² is that it never decreases as you add variables, regardless of whether the new variable actually adds information to the model. To overcome that, we use adjusted R², which penalizes the number of predictors: it increases only when the newly added variable improves the fit enough to offset that penalty, and otherwise stays the same or decreases.

3.       F Statistic - It evaluates the overall significance of the model. It is the ratio of the variance explained by the model to the unexplained variance, each divided by its degrees of freedom. It compares the full model with an intercept-only (no predictors) model. Its value can range from zero to an arbitrarily large number. For a fixed sample size and number of predictors, a higher F statistic is stronger evidence that the model fits better than the intercept-only model.

4.       RMSE / MSE / MAE - Error metrics. All of these measure prediction error: the lower the number, the better the model.

·         MSE - This is the mean squared error. For example, suppose the actual y is 10 and the predicted y is 30; the squared error for that single observation would be (30-10)² = 400.

·         MAE - This is the mean absolute error. Using the previous example, the absolute error would be |30-10| = 20.

·         RMSE - This is the root mean squared error. It can be interpreted as the typical distance of the residuals from zero. Taking the square root undoes the squaring in MSE and gives the result in the same units as the data. Here, the resultant RMSE would be √((30-10)²) = 20. In practice, we calculate these numbers by averaging the errors (actual - predicted) over all observations in the data, not from a single observation.

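·         P-Values – A p-value is a number between 0 and 1: the probability of seeing a result at least as extreme as the one observed if the null hypothesis were true. It is compared against a threshold to make a decision. A commonly used threshold is 0.05, which means that when there is in fact no difference, about 5% of tests would still report one. Getting a small p-value when there is no difference is called a False Positive.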

Eg: Trying to determine whether two drugs are the same or not is called hypothesis testing. The null hypothesis is that the drugs are the same, and the p-value helps us decide whether or not to reject the null hypothesis.
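To make the drug example concrete, here is a minimal sketch of one way to get a p-value. It uses an exact permutation test, which is a technique swapped in for illustration and not something the post above prescribes. The recovery times are made-up numbers, and `permutation_p_value` is a hypothetical helper name.

```python
from itertools import combinations

# Hypothetical recovery times in days (made-up data for illustration only).
drug_a = [6.1, 5.8, 7.0, 6.4, 5.9]
drug_b = [7.2, 7.9, 6.8, 8.1, 7.5]


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(a, b):
    """Two-sided exact permutation test on the difference in means.

    Under the null hypothesis (the drugs are the same), the group labels
    are interchangeable, so we try every way of splitting the pooled data
    into two groups and count how often the gap is at least as large as
    the one we actually observed.
    """
    observed = abs(mean(a) - mean(b))
    pooled = a + b
    indices = range(len(pooled))
    extreme = 0
    total = 0
    for chosen in combinations(indices, len(a)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        # Small tolerance guards against floating-point rounding on ties.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


p_value = permutation_p_value(drug_a, drug_b)
print(f"p-value = {p_value:.4f}")
if p_value < 0.05:
    print("Reject the null hypothesis: the drugs appear to differ.")
else:
    print("Fail to reject the null hypothesis.")
```

With these made-up numbers only 4 of the 252 possible splits are as extreme as the observed one, so the p-value falls below the 0.05 threshold.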

The p-value for R² comes from a statistic called “F”.

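F = (the variation in Y explained by X) / (the variation in Y not explained by X)

F = [(SS(mean) - SS(fit)) / (p_fit - p_mean)] / [SS(fit) / (n - p_fit)]. This equation tells us whether R² is significant. Here SS(mean) is the sum of squared residuals around the mean of y, SS(fit) is the sum of squared residuals around the fitted line, and n is the number of observations.

(p_fit - p_mean) and (n - p_fit) are the degrees of freedom of the numerator and the denominator, respectively.

p_fit is the number of parameters in the fit line. E.g., y = y-intercept + slope × x involves 2 parameters, so p_fit = 2.

p_mean is the number of parameters in the mean line, y = y-intercept, so p_mean = 1.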
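To tie the metrics above together, here is a small sketch that fits a straight line by least squares and computes MSE, MAE, RMSE, R², adjusted R² and F in plain Python. The data points are made up for illustration, and the variable names are my own; the adjusted R² line uses the standard formula 1 - (1 - R²)(n - 1)/(n - k - 1), where k is the number of predictors.

```python
import math

# Hypothetical data (made up for illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(y)

# Least-squares fit of the line y = intercept + slope * x.
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

predicted = [intercept + slope * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

# Error metrics: averaged over all observations, lower is better.
mse = sum(r ** 2 for r in residuals) / n
mae = sum(abs(r) for r in residuals) / n
rmse = math.sqrt(mse)

# Sums of squares around the mean line and around the fitted line.
ss_mean = sum((yi - y_bar) ** 2 for yi in y)
ss_fit = sum(r ** 2 for r in residuals)

# R²: share of the variation around the mean that the fit explains.
r_squared = (ss_mean - ss_fit) / ss_mean

# Parameter counts: the fit line has an intercept and a slope,
# the mean line has only an intercept.
p_fit = 2
p_mean = 1
k = p_fit - 1  # number of predictors

adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# F statistic, using the formula from the text.
f_stat = ((ss_mean - ss_fit) / (p_fit - p_mean)) / (ss_fit / (n - p_fit))

print(f"MSE = {mse:.4f}, MAE = {mae:.4f}, RMSE = {rmse:.4f}")
print(f"R² = {r_squared:.4f}, adjusted R² = {adj_r_squared:.4f}")
print(f"F = {f_stat:.4f} on ({p_fit - p_mean}, {n - p_fit}) degrees of freedom")
# Turning F into a p-value needs the F distribution, which is not in the
# standard library; with SciPy installed, scipy.stats.f.sf(f_stat, 1, 3)
# would give it.
```

For this toy data the fit gives R² = 0.6 and F = 4.5, while adjusted R² is noticeably lower than R² because there are only five observations.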
