While I've used many great tutorials online, this post is based mostly on the "Machine Learning Full Course - Learn Machine Learning 10 Hours | Machine Learning Tutorial | Edureka" video on YouTube. The other sites I've used are listed in the references.

In this post I rebuild the Linear Regression algorithm from scratch; in the next post we use Scikit-learn's Linear Regression.

```python
#!/usr/bin/env python3
'''
This code is based on me learning more about Linear Regression.
This is part of me expanding my knowledge on machine learning.
In this version I'm rebuilding the algorithm.
Author: Nik Alleyne
blog: www.securitynik.com
filename: linearRegresAlgo_v2.py
'''
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

plt.rcParams['figure.figsize'] = (20.0, 10.0)


def main():
    print('[*] Beginning Linear Regression ...')

    # Reading data - this file was downloaded from GitHub.
    # See the reference section for the URL
    df = pd.read_csv('./headbrain.csv', sep=',', dtype='int64', verbose=True)

    # Gather information on the shape of the dataset
    print('[*] {} (rows, columns) in the training dataset'.format(df.shape))
    print('[*] First 10 records of the training dataset')
    print(df.head(10))

    # Create X and Y from the head size and brain weight columns
    X = df['Head Size(cm^3)'].values
    Y = df['Brain Weight(grams)'].values

    # Find the mean of X and Y
    mean_x = np.mean(X)
    mean_y = np.mean(Y)
    print('[*] The mean of X is {} || The mean of Y is {}'.format(mean_x, mean_y))

    # Calculating the coefficients
    # See formula here: https://support.minitab.com/en-us/minitab-express/1/help-and-how-to/modeling-statistics/regression/how-to/multiple-regression/methods-and-formulas/methods-and-formulas/#coefficient-coef
    numerator = 0
    denominator = 0
    for i in range(len(X)):
        numerator += (X[i] - mean_x) * (Y[i] - mean_y)
        denominator += (X[i] - mean_x) ** 2

    b1 = numerator / denominator
    b0 = mean_y - (b1 * mean_x)
    print('[*] Coefficients:-> slope (b1): {} || intercept (b0): {}'.format(b1, b0))
    # When compared to the equation y = mx + c, we can say m = b1 and c = b0

    # Create the graph
    max_x = np.max(X) + 100
    min_x = np.min(X) - 100

    # Calculating line values x and y
    x = np.linspace(min_x, max_x, 1000)
    y = b0 + b1 * x

    # Plotting the regression line and the data points
    plt.plot(x, y, color='r', label='Regression Line')
    plt.scatter(X, Y, c='b', label='Scatter Plot')
    plt.xlabel('Head Size(cm^3)')
    plt.ylabel('Brain Weight(grams)')
    plt.legend()
    plt.show()

    # Let's now use R^2 to determine how good the model is
    # Formula can be found here:
    # https://support.minitab.com/en-us/minitab-express/1/help-and-how-to/modeling-statistics/regression/how-to/multiple-regression/methods-and-formulas/methods-and-formulas/#coefficient-coef
    ss_total = 0
    ss_error = 0
    for i in range(len(X)):
        y_pred = b0 + b1 * X[i]
        ss_total += (Y[i] - mean_y) ** 2
        ss_error += (Y[i] - y_pred) ** 2

    r_sq = 1 - (ss_error / ss_total)
    print('[*] Your R^2 value is: {}'.format(r_sq))


if __name__ == '__main__':
    main()
```
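The loops above compute the classic closed-form estimates, b1 = Σ(xᵢ - x̄)(yᵢ - ȳ) / Σ(xᵢ - x̄)² and b0 = ȳ - b1·x̄. As a side note, the same computation can be written without explicit loops using vectorized NumPy. The sketch below uses synthetic stand-in data (since headbrain.csv isn't bundled here) and cross-checks the result against NumPy's own least-squares fit, `np.polyfit`:

```python
import numpy as np

# Synthetic stand-in for the headbrain data (illustrative values only)
rng = np.random.default_rng(42)
X = rng.uniform(3000, 4500, 200)                    # head sizes in cm^3
Y = 325.0 + 0.26 * X + rng.normal(0, 70, 200)       # noisy linear response

# Closed-form simple linear regression, fully vectorized
mean_x, mean_y = X.mean(), Y.mean()
b1 = ((X - mean_x) * (Y - mean_y)).sum() / ((X - mean_x) ** 2).sum()
b0 = mean_y - b1 * mean_x

# R^2 from the same residual/total sums of squares as the loop version
y_pred = b0 + b1 * X
r_sq = 1 - ((Y - y_pred) ** 2).sum() / ((Y - mean_y) ** 2).sum()

# Cross-check against NumPy's degree-1 least-squares fit
slope, intercept = np.polyfit(X, Y, 1)
assert np.isclose(b1, slope) and np.isclose(b0, intercept)
```

On real data this gives identical coefficients to the loop version; the vectorized form is just shorter and faster on large arrays.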

When we run the code, we get:

```
root@securitynik:~/ML# ./linearRegresAlgo_v2.py | more
[*] Beginning Linear Regression ...
Tokenization took: 0.06 ms
Type conversion took: 0.23 ms
Parser memory cleanup took: 0.00 ms
[*] (237, 4) (rows, columns) in the training dataset
[*] First 10 records of the training dataset
   Gender  Age Range  Head Size(cm^3)  Brain Weight(grams)
0       1          1             4512                 1530
1       1          1             3738                 1297
2       1          1             4261                 1335
3       1          1             3777                 1282
4       1          1             4177                 1590
5       1          1             3585                 1300
6       1          1             3785                 1400
7       1          1             3559                 1255
8       1          1             3613                 1355
9       1          1             3982                 1375
[*] The mean of X is 3633.9915611814345 || The mean of Y is 1282.873417721519
[*] Coefficients:-> slope (b1): 0.26342933948939945 || intercept (b0): 325.57342104944223
[*] Your R^2 value is: 0.6393117199570003
```
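As a quick sanity check on the fitted line, we can plug a head size into y = b0 + b1·x using the coefficients printed above. The head size of 4000 cm^3 below is just an illustrative value:

```python
# Coefficients taken from the program's output above
b0 = 325.57342104944223   # intercept
b1 = 0.26342933948939945  # slope

head_size = 4000  # cm^3, an illustrative input
brain_weight = b0 + b1 * head_size
print(round(brain_weight, 1))  # -> 1379.3 grams
```

That prediction sits comfortably within the range of brain weights shown in the first ten records, which is a reasonable smoke test for the model.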

That's it, my first shot at machine learning. In the next post, we use Scikit-learn rather than building the algorithm ourselves.

References:

- Machine Learning Full Course (Edureka): https://www.youtube.com/watch?v=GwIo3gDZCVQ&list=PL9ooVrP1hQOHUfd-g8GUpKI3hHOwM_9Dn&index=1
- Matplotlib customizing tutorial: https://matplotlib.org/3.1.1/tutorials/introductory/customizing.html#sphx-glr-tutorials-introductory-customizing-py
- Headbrain.csv
- read_csv
- Calculating Coefficient
- R2
