Logistic Regression implementation in Python

Enough talk, show me the code:

The complete code for all algorithms can be found here: https://github.com/geekRishabhjain/MLpersonalLibrary/tree/master/RlearnPy

Sigmoid function:

import numpy as np

def sigmoid(self, z):
    # Apply the logistic function element-wise to z
    g = 1 / (1 + np.exp(-z))
    return g
Simple: you pass in the matrix z, and the function returns a matrix of the same shape with the sigmoid applied to each element.
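As a quick check, here is a minimal standalone sketch, written as a plain function rather than the class method above. The input values are made up for illustration.

```python
import numpy as np

def sigmoid(z):
    # Element-wise logistic function; works on scalars and arrays alike
    return 1 / (1 + np.exp(-z))

# Hypothetical input matrix
z = np.array([[-2.0, 0.0], [2.0, 10.0]])
g = sigmoid(z)

print(g.shape)   # same shape as the input
print(g[0, 1])   # sigmoid(0) is 0.5
```

Every output lies strictly between 0 and 1, which is what lets us read it as a probability.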

The Cost function:

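def costFunction(self, theta, X, y):
    m = len(y)
    # Use a column-vector copy/view of y instead of reshaping the caller's array in place
    y = y.reshape(m, 1)
    # Predicted probabilities for the current theta, shape (m, 1)
    h = self.sigmoid(X @ theta)
    # Cross-entropy cost averaged over the m examples (J is a 1x1 array)
    temp = (-y.T @ np.log(h)) - ((1 - y).T @ np.log(1 - h))
    J = (1 / m) * temp
    # Gradient of J with respect to theta, same shape as theta
    grad = (1 / m) * (X.T @ (h - y))
    return J, grad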
The h variable holds the predictions for the given theta. The next two lines evaluate the cost function as defined earlier, and the last line computes the gradient that gradient descent will use. Note that theta is expected to be a column vector with one entry per column of X.
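One way to convince yourself that the cost and gradient agree is a finite-difference check. The sketch below uses standalone functions (named `cost_function` here, an assumption for illustration) and a tiny made-up dataset; it is not part of the library linked above.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost_function(theta, X, y):
    # Standalone variant: returns a scalar cost and an (n, 1) gradient
    m = len(y)
    y = y.reshape(m, 1)
    h = sigmoid(X @ theta)
    J = ((-y.T @ np.log(h)) - ((1 - y).T @ np.log(1 - h))).item() / m
    grad = (X.T @ (h - y)) / m
    return J, grad

# Hypothetical toy data; the first column of ones is the intercept term
X = np.array([[1.0, 0.5], [1.0, 1.5], [1.0, 3.0], [1.0, 4.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
theta = np.zeros((2, 1))

J, grad = cost_function(theta, X, y)
print(J)  # with theta = 0 every prediction is 0.5, so the cost equals ln(2)

# Central finite differences, one coordinate of theta at a time
eps = 1e-6
num_grad = np.zeros_like(theta)
for j in range(theta.shape[0]):
    step = np.zeros_like(theta)
    step[j] = eps
    num_grad[j] = (cost_function(theta + step, X, y)[0]
                   - cost_function(theta - step, X, y)[0]) / (2 * eps)

print(np.max(np.abs(num_grad - grad)))  # should be close to zero
```

If the analytic gradient had a sign or transpose error, the two would disagree by far more than rounding noise.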

The gradient descent:

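def gradientDescent(self, X, y, theta, alpha, num_iters):
    J_history = []
    for i in range(num_iters):
        cost, grad = self.costFunction(theta, X, y)
        # Record the cost for the current theta so it can be plotted later
        J_history.append(np.squeeze(cost))
        # Step against the gradient; grad has the same shape as theta
        theta = theta - alpha * grad
    return theta, np.array(J_history)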
Gradient descent works the same way as discussed in Linear Regression: we update theta num_iters times and return the result. At each iteration we also record the cost for the current theta in J_history, which is returned as well and can be used to plot cost against the number of iterations.
'alpha' is the same learning rate as in Linear Regression, and the same rules apply for choosing it.
X is the feature matrix with one extra column of ones for the intercept term, and 'y' holds the labels.
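To see the loop in action, here is a minimal end-to-end sketch on a made-up dataset. The function names, the data, and the choices alpha=0.1 and num_iters=200 are all assumptions for illustration, not values from the library.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost_function(theta, X, y):
    m = len(y)
    y = y.reshape(m, 1)
    h = sigmoid(X @ theta)
    J = ((-y.T @ np.log(h)) - ((1 - y).T @ np.log(1 - h))).item() / m
    grad = (X.T @ (h - y)) / m
    return J, grad

def gradient_descent(X, y, theta, alpha, num_iters):
    J_history = []
    for _ in range(num_iters):
        cost, grad = cost_function(theta, X, y)
        J_history.append(cost)          # cost before this update
        theta = theta - alpha * grad    # move against the gradient
    return theta, np.array(J_history)

# Hypothetical toy data; the first column of ones is the intercept term
X = np.array([[1.0, 0.5], [1.0, 1.5], [1.0, 3.0], [1.0, 4.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

theta, J_history = gradient_descent(X, y, np.zeros((2, 1)), alpha=0.1, num_iters=200)
print(J_history[0], J_history[-1])  # first and last recorded cost
```

With a learning rate this small the recorded cost should fall steadily; if it oscillates or grows on your own data, alpha is probably too large.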

The predict function:

def predict(self, theta, X):
    # True where X @ theta > 0, i.e. where the predicted probability exceeds 0.5
    p = X.dot(theta) > 0
    return p
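As discussed, we predict one when $H > 0.5$, which happens exactly when $X@theta > 0$, because the sigmoid of zero is 0.5 and the sigmoid is increasing; otherwise we predict zero. The code above does the same and returns a boolean array in which True stands for class one.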
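The equivalence between the two thresholds can be checked directly. This is a small standalone sketch; the theta values and the data are hypothetical, and the cast to integers is an addition for readability rather than something the method above does.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def predict(theta, X):
    # 1 where X @ theta > 0, else 0
    return (X @ theta > 0).astype(int)

# Hypothetical data and parameters; this theta puts the decision boundary at x = 2
X = np.array([[1.0, 0.5], [1.0, 1.5], [1.0, 3.0], [1.0, 4.0]])
y = np.array([0, 0, 1, 1])
theta = np.array([[-4.0], [2.0]])

p = predict(theta, X)

# Thresholding the probability at 0.5 gives the same labels
same = np.array_equal(p, (sigmoid(X @ theta) > 0.5).astype(int))

# Fraction of examples classified correctly
accuracy = np.mean(p.ravel() == y)

print(p.ravel())
print(same, accuracy)
```

Skipping the sigmoid at prediction time saves a little work and changes nothing about the labels.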

