Logistic Regression in python

Published 06-14-2017 00:00:00

Explained

Logistic regression is a supervised classification algorithm and therefore is useful for estimating discrete values. It is typically used for predicting the probability of an event using the logistic function in order to get an output between 0 and 1.

When first learning this logistic regression, I was under the impression that it was a sort of a niche thing and therefore I didn’t give it my full attention. In hindsight, I couldn’t have been more wrong. Some of the underlying aspects of logistic regression come up in many other important machine learning algorithms like neural networks. With this in mind, go ahead and check out the video below for more.

Youtube - Basic ideas of logitic regression

Now that you’ve got a grasp on the concepts behind logistic regression, let’s implement it in Python.

Getting Started

from sklearn.linear_model import LogisticRegression
df = pd.read_csv(‘logistic_regression_df.csv’)
df.columns = [‘X’, ‘Y’]
df.head()

Visualization

sns.set_context(“notebook”, font_scale=1.1)
sns.set_style(“ticks”)
sns.regplot(‘X’,’Y’, data=df, logistic=True)
plt.ylabel(‘Probability’)
plt.xlabel(‘Explanatory’)

Implementation

logistic = LogisticRegression()
X = (np.asarray(df.X)).reshape(-1, 1)
Y = (np.asarray(df.Y)).ravel()
logistic.fit(X, Y)
logistic.score(X, Y)
print(‘Coefficient: \n’, logistic.coef_)
print(‘Intercept: \n’, logistic.intercept_)
print(‘R² Value: \n’, logistic.score(X, Y))