Metrics you can use to evaluate your Regression Model:
1. R-Squared (Coefficient of Determination) - This metric gives the
proportion of the variance in the response that is explained by the covariates
in the model. For a least-squares fit with an intercept, it ranges between 0
and 1. It is good practice to look at adjusted R² rather than R² alone when
judging model fit.
2. Adjusted R² - The problem with R² is that it never decreases as you
increase the number of variables, regardless of whether the new variable
actually adds information to the model. To overcome that, we use adjusted R²,
which penalizes the number of predictors: it stays about the same or decreases
unless the newly added variable improves the fit enough to offset that penalty.
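As a minimal sketch of the two definitions above, the snippet below computes R² from sums of squares and then applies the standard adjusted R² correction, 1 - (1 - R²)(n - 1)/(n - k - 1). The function names, the sample data, and the choice of k = 1 predictor are assumptions made for illustration only.

```python
def r_squared(actual, predicted):
    """R² = 1 - SS(fit) / SS(mean)."""
    mean_y = sum(actual) / len(actual)
    # Variation of y around its mean (intercept-only model).
    ss_mean = sum((y - mean_y) ** 2 for y in actual)
    # Variation of y around the model's predictions.
    ss_fit = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return 1 - ss_fit / ss_mean


def adjusted_r_squared(r2, n, k):
    """Adjusted R² for n observations and k predictors (intercept excluded)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Hypothetical actual and predicted values, for illustration only.
actual = [3.0, 5.0, 7.0, 9.0, 11.0]
predicted = [2.5, 5.5, 7.0, 9.5, 10.5]

r2 = r_squared(actual, predicted)
adj = adjusted_r_squared(r2, n=len(actual), k=1)
print(f"R^2 = {r2:.4f}, adjusted R^2 = {adj:.4f}")
```

Because the penalty factor (n - 1)/(n - k - 1) grows with k, adding a predictor that barely changes R² pulls the adjusted value down.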
3. F Statistic - It evaluates the overall significance of the model. It is
the ratio of the variance explained by the model to the unexplained variance,
and it compares the full model with an intercept-only (no predictors) model.
Its value can range from zero to an arbitrarily large number. Other things
being equal, the higher the F statistic, the stronger the evidence that the
model fits better than the intercept-only model.
4. RMSE / MSE / MAE - Error metrics. All of these measure prediction error,
so the lower the number, the better the model.
· MSE - This is the mean squared error. For example, suppose the actual y is
10 and the predicted y is 30; the squared error is (30-10)² = 400, which is
also the MSE for this single observation.
· MAE - This is the mean absolute error. Using the previous example (actual
y = 10, predicted y = 30), the resulting MAE would be |30-10| = 20.
· RMSE - This is the root mean squared error. It is interpreted as how far,
on average, the residuals are from zero. Taking the square root undoes the
squaring in MSE and gives the result in the same units as the data. Here, the
resulting RMSE would be √((30-10)²) = 20. In practice, we calculate these
numbers by averaging the errors (actual - predicted) over all observations in
the data.
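The three error metrics can be sketched in a few lines of plain Python. The single-observation case reproduces the numbers from the text (actual 10, predicted 30); the three-observation lists are hypothetical values added only to show the averaging step.

```python
import math


def mse(actual, predicted):
    """Mean of the squared errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    """Mean of the absolute errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Square root of the MSE, in the same units as the data."""
    return math.sqrt(mse(actual, predicted))


# Single observation from the text: actual y = 10, predicted y = 30.
single_mse = mse([10], [30])
single_mae = mae([10], [30])
single_rmse = rmse([10], [30])
print(single_mse, single_mae, single_rmse)

# Hypothetical multi-observation case: errors are averaged over all rows.
actual = [10, 20, 30]
predicted = [30, 20, 25]
print(mse(actual, predicted), mae(actual, predicted), rmse(actual, predicted))
```

With more than one observation, RMSE and MAE generally differ: squaring weights large errors more heavily, so RMSE is never smaller than MAE.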
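5. P-Values - A p-value is a number between 0 and 1. It is the probability of
seeing a result at least as extreme as the one observed if the null hypothesis
were true. It is compared against a threshold to make a decision; a commonly
used threshold is 0.05, which means that when there is no real difference,
about 5% of tests would wrongly report one. Getting a small p-value when there
is no difference is called a False Positive.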
E.g., deciding whether two drugs are the same or not is called hypothesis
testing. The null hypothesis is that the drugs are the same, and the p-value
helps us decide whether or not to reject the null hypothesis.
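The p-value for R² comes from something called “F”:
F = (the variation of Y explained by X) / (the variation of Y not explained by X)
F = [(SS(mean) - SS(fit)) / (p_fit - p_mean)] / [SS(fit) / (n - p_fit)]
This equation tells us whether R² is significant. Here SS(mean) is the sum of
squared residuals around the mean of y, SS(fit) is the sum of squared residuals
around the fitted line, and n is the number of observations.
(p_fit - p_mean) and (n - p_fit) are the degrees of freedom of the numerator
and the denominator, respectively.
p_fit is the number of parameters in the fitted line, e.g. y = y-intercept +
slope * x involves 2 parameters, so p_fit = 2.
p_mean is the number of parameters in the mean line, y = y-intercept, so
p_mean = 1.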
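The F formula above can be sketched as follows. The function `f_statistic` and the small x/y data set are hypothetical; the line is fitted with the closed-form least-squares slope and intercept, so p_fit = 2 and p_mean = 1 as in the text.

```python
def f_statistic(ss_mean, ss_fit, n, p_fit, p_mean=1):
    """F = [(SS(mean) - SS(fit)) / (p_fit - p_mean)] / [SS(fit) / (n - p_fit)]."""
    explained = (ss_mean - ss_fit) / (p_fit - p_mean)
    unexplained = ss_fit / (n - p_fit)
    return explained / unexplained


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(y)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Closed-form least-squares fit of y = intercept + slope * x.
slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
intercept = mean_y - slope * mean_x

# Residual sums of squares around the mean line and around the fitted line.
ss_mean = sum((yi - mean_y) ** 2 for yi in y)
ss_fit = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

r2 = (ss_mean - ss_fit) / ss_mean
f_value = f_statistic(ss_mean, ss_fit, n, p_fit=2)
print(f"R^2 = {r2:.4f}, F = {f_value:.1f}")
```

To turn F into a p-value, one would look up the upper tail of the F distribution with (p_fit - p_mean, n - p_fit) degrees of freedom; assuming SciPy is available, `scipy.stats.f.sf(f_value, 1, n - 2)` is one way to do that.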
