## The classification problem and the logistic regression

**The problem: We want to classify whether a person will default on their credit card payment based on their credit card balance. We use the simulated Default dataset and, to keep things simple, assume that credit card default (Yes/No) depends only on the credit card balance.**

## From the problem to a math problem

P\left(\mathrm{default}=\mathrm{yes}\mid \mathrm{balance}\right) (1)

## Conditional probability as a logistic model

If we rearrange the logistic function p\left(X\right)=\frac{{e}^{{\beta}_{0}+{\beta}_{1}X}}{1+{e}^{{\beta}_{0}+{\beta}_{1}X}}, we get

\frac{p\left(X\right)}{1-p\left(X\right)}={e}^{{\beta}_{0}+{\beta}_{1}X}

The quantity \frac{p\left(X\right)}{1-p\left(X\right)} is called the odds and can take any value from 0 to infinity. Odds close to 0 indicate a very low chance of default, and very large odds indicate a very high chance of default. Taking the logarithm of both sides of the above equation gives

\mathrm{log}\left[\frac{p\left(X\right)}{1-p\left(X\right)}\right]={\beta}_{0}+{\beta}_{1}X

The quantity on the left-hand side is called the log-odds or logit, and it is a linear function of X.
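To make the relationship between probability, odds, and log-odds concrete, here is a minimal Python sketch. The function names (`odds`, `logit`, `inverse_logit`) are our own, chosen for illustration; they are not taken from any particular library.

```python
import math


def odds(p):
    """Odds p / (1 - p) for a probability with 0 <= p < 1."""
    return p / (1.0 - p)


def logit(p):
    """Log-odds: the natural logarithm of the odds."""
    return math.log(odds(p))


def inverse_logit(z):
    """Logistic function: maps a log-odds value z back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Probabilities below 0.5 give odds below 1 and negative log-odds;
# probabilities above 0.5 give odds above 1 and positive log-odds.
for p in (0.1, 0.5, 0.9):
    print(p, odds(p), logit(p))
```

Applying `inverse_logit` to `logit(p)` recovers `p`, which is the rearrangement between the logistic function and the log-odds written as code.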

## Estimation of the logistic regression coefficients and maximum likelihood

Because the p-values for both coefficients are very low, both coefficients are statistically significant. Using this table, we can conclude that there is a relationship between credit card default and credit card balance, given by the fitted model below:

\hat{p}\left(x\right)=\frac{{e}^{-10.6513+0.0055x}}{1+{e}^{-10.6513+0.0055x}}
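As a quick check of what this fitted model implies, the sketch below evaluates it for two example balances. The coefficients are copied from the equation above; the function name `default_probability` and the balances 1000 and 2000 are our own illustrative choices.

```python
import math

# Coefficients copied from the fitted model in the text.
BETA_0 = -10.6513  # intercept
BETA_1 = 0.0055    # coefficient on balance


def default_probability(balance):
    """Estimated probability of default for a given credit card balance."""
    z = BETA_0 + BETA_1 * balance
    return math.exp(z) / (1.0 + math.exp(z))


# Illustrative balances (assumed values, not taken from the dataset).
for balance in (1000, 2000):
    print(balance, round(default_probability(balance), 4))
```

With these coefficients, a balance of 1,000 gives an estimated default probability below 1%, while a balance of 2,000 gives a probability above 50%.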

## Making predictions of the class
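One common way to turn an estimated probability into a class label is to compare it with a cutoff. The sketch below assumes a cutoff of 0.5, which is our assumption rather than something fixed by the model, and reuses the fitted coefficients -10.6513 and 0.0055 from the estimation section. The names `default_probability` and `predict_default` are hypothetical.

```python
import math

BETA_0 = -10.6513  # fitted intercept from the estimation section
BETA_1 = 0.0055    # fitted coefficient on balance
THRESHOLD = 0.5    # assumed cutoff; other values may suit other use cases


def default_probability(balance):
    """Estimated probability of default for a given balance."""
    z = BETA_0 + BETA_1 * balance
    return 1.0 / (1.0 + math.exp(-z))


def predict_default(balance, threshold=THRESHOLD):
    """Return "Yes" if the estimated probability reaches the threshold."""
    return "Yes" if default_probability(balance) >= threshold else "No"


# With a 0.5 cutoff the prediction flips where the log-odds equal zero,
# that is, at balance = -BETA_0 / BETA_1.
decision_boundary = -BETA_0 / BETA_1

for balance in (1000, 2000):
    print(balance, predict_default(balance))
print("decision boundary:", round(decision_boundary, 1))
```

Lowering the threshold flags more customers as likely defaulters at the cost of more false alarms, so the cutoff is a design choice rather than a property of the fitted model.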

## Conclusion

*Reference: James, Gareth, et al. An Introduction to Statistical Learning. Vol. 112. New York: Springer, 2013.*