Overview
Good libraries exist for building machine learning models, and they are usually enough to reach the goal you set for a model.
You do not need to write the algorithms yourself. In fact, a model written from scratch is likely to contain more bugs than one from a widely used library, so you should use established libraries unless they cannot do what you need.
But writing an existing algorithm yourself is a very good exercise for deepening your understanding of machine learning.
Here, I show how to write the perceptron algorithm.
Why perceptron
Although there are many machine learning algorithms, I chose the perceptron as my first from-scratch exercise. The perceptron is relatively easy to write, and its mechanism is fundamental to many other algorithms.
Many people think that writing a machine learning algorithm is mathematically difficult, takes a lot of time, and needs long code. The perceptron is not like that. It needs only basic linear algebra.
So it is a very good choice for a first from-scratch exercise.
What is perceptron
The perceptron is an algorithm that takes input data and outputs the predicted class the data belongs to. More precisely, the procedure is as follows.
- take the input and compute its linear combination with the weights
- pass the linear combination to the activation function
The drawing below shows how data flows.
The only parameters we need to care about are the weights. In the training phase, the model updates the weights, trying to find values that separate the data well.
The activation function takes the linear combination of the weights and the data as its argument.
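The two steps described above can be sketched for a single sample as follows. This is only a minimal illustration: the function names net_input and step and all the numbers are hypothetical, chosen for this example.

```python
import numpy as np

def net_input(x, weights, bias):
    """Step 1: linear combination of the inputs and the weights, plus a bias."""
    return np.dot(x, weights) + bias

def step(z):
    """Step 2: threshold activation; gives 1 when z >= 0, otherwise -1."""
    return np.where(z >= 0, 1, -1)

# Hypothetical values chosen only for illustration.
x = np.array([2.0, 1.0])
weights = np.array([0.5, -1.0])
bias = -0.5

z = net_input(x, weights, bias)   # 0.5*2.0 + (-1.0)*1.0 - 0.5 = -0.5
label = int(step(z))              # -0.5 < 0, so the predicted class is -1
```

Changing only the weights or the bias changes which side of the threshold a sample falls on, which is why training only has to adjust those values.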
Code
import numpy as np


class Perceptron:
    def __init__(self, eta=0.1, iter_num=100):
        self.eta = eta            # step size of each weight update
        self.iter_num = iter_num  # number of passes over the training data

    @staticmethod
    def activate(linear_combination):
        # threshold at 0: values >= 0 map to 1, the rest to -1
        return np.where(linear_combination >= 0, 1, -1)

    def predict(self, x):
        # weights[0] is the bias, weights[1:] pair with the features
        linear_combination = np.dot(x, self.weights[1:]) + self.weights[0]
        y_pred = Perceptron.activate(linear_combination)
        return y_pred

    def fit(self, X, Y):
        self.weights = np.zeros(1 + X.shape[1])
        for _ in range(self.iter_num):
            self.error = 0
            for x, y in zip(X, Y):
                y_pred = self.predict(x)
                # update is 0 when the prediction is already correct
                update = self.eta * (y - y_pred)
                self.weights[1:] += update * x
                self.weights[0] += update
                self.error += int(update != 0.0)
            # share of samples that triggered an update in this pass
            print(self.error/len(Y))
        return self
from sklearn import datasets

# prepare the data: class 0 is relabeled as -1, the other two classes as 1
iris = datasets.load_iris()
features = iris.data
iris.target = np.where(iris.target == 0, -1, 1)

perceptron = Perceptron()
perceptron.fit(features, iris.target)
Let’s go through it piece by piece.
def __init__(self, eta=0.1, iter_num=100):
    self.eta = eta
    self.iter_num = iter_num
In __init__(), eta and iter_num are set. eta defines the step size of the weight updates. iter_num is the number of passes over the data during training.
@staticmethod
def activate(linear_combination):
    return np.where(linear_combination >= 0, 1, -1)
This is the activation function. It takes the linear combination and outputs 1 or -1. The threshold is 0.
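As a quick check of the threshold behavior described above, here is a small sketch with made-up input values. Note that a value of exactly 0 is mapped to 1, because the comparison is >=.

```python
import numpy as np

z = np.array([-2.0, 0.0, 3.0])       # hypothetical linear-combination values
labels = np.where(z >= 0, 1, -1)     # elementwise threshold at 0
print(labels.tolist())               # [-1, 1, 1]
```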
def predict(self, x):
    linear_combination = np.dot(x, self.weights[1:]) + self.weights[0]
    y_pred = Perceptron.activate(linear_combination)
    return y_pred
This function makes predictions.
linear_combination is the dot product of the input data and the weights, plus self.weights[0], which is the bias term. y_pred is the predicted class.
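The same expression also works when several samples are passed at once, since np.dot of a 2-D array with a 1-D weight vector gives one value per row. The sketch below assumes hypothetical weights and data, chosen so the arithmetic is easy to follow by hand.

```python
import numpy as np

# Hypothetical weights: index 0 is the bias, the rest pair with the features.
weights = np.array([-1.0, 2.0, 0.5])

X = np.array([[1.0, 0.0],    # -1.0 + 2.0*1.0 + 0.5*0.0 =  1.0
              [0.0, 1.0],    # -1.0 + 2.0*0.0 + 0.5*1.0 = -0.5
              [0.5, 0.0]])   # -1.0 + 2.0*0.5 + 0.5*0.0 =  0.0

z = np.dot(X, weights[1:]) + weights[0]   # shape (3,): one value per row
y_pred = np.where(z >= 0, 1, -1)
print(y_pred.tolist())                    # [1, -1, 1]
```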
def fit(self, X, Y):
    self.weights = np.zeros(1 + X.shape[1])
    for _ in range(self.iter_num):
        self.error = 0
        for x, y in zip(X, Y):
            y_pred = self.predict(x)
            update = self.eta * (y - y_pred)
            self.weights[1:] += update * x
            self.weights[0] += update
            self.error += int(update != 0.0)
        print(self.error/len(Y))
    return self
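This function trains the model on the data. On the first line, the weights are initialized to 0. The number of weights is the number of features in the data + 1.
This +1 is for the bias term.
For each sample, update is eta times the difference between the true label and the prediction, so the weights change only when the prediction is wrong. self.error counts the samples that triggered an update, and the resulting rate is printed after each pass over the data.
This is how the model’s weights are updated.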
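To make the update concrete, here is one update step traced by hand. It is a sketch that follows the same arithmetic as fit(), but the sample x and the label y are hypothetical values picked for illustration.

```python
import numpy as np

eta = 0.1
weights = np.zeros(3)            # [bias, w1, w2], all zero as at the start of fit()
x = np.array([2.0, 1.0])         # hypothetical sample
y = -1                           # hypothetical true label

z = np.dot(x, weights[1:]) + weights[0]   # 0.0 with zero weights
y_pred = int(np.where(z >= 0, 1, -1))     # 0.0 >= 0, so the prediction is 1

update = eta * (y - y_pred)      # 0.1 * (-1 - 1) = -0.2
weights[1:] += update * x        # feature weights move by -0.2 * x
weights[0] += update             # bias moves by -0.2

print(weights)                   # approximately [-0.2, -0.4, -0.2]

# The same sample now falls on the negative side of the threshold.
z_after = np.dot(x, weights[1:]) + weights[0]
```

Here the prediction was wrong, so the weights moved toward the true label. If the prediction had been correct, y - y_pred would be 0 and the weights would stay unchanged.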