Welcome to the 27th part of our machine learning tutorial series and the next part in our Support Vector Machine section. In this tutorial, we're going to continue working on the SVM optimization problem in Python code.
Where we left off, our code was:
import matplotlib.pyplot as plt
from matplotlib import style
import numpy as np
style.use('ggplot')

class Support_Vector_Machine:
    def __init__(self, visualization=True):
        self.visualization = visualization
        self.colors = {1:'r',-1:'b'}
        if self.visualization:
            self.fig = plt.figure()
            self.ax = self.fig.add_subplot(1,1,1)

    # train
    def fit(self, data):
        self.data = data
        # { ||w||: [w,b] }
        opt_dict = {}

        transforms = [[1,1],
                      [-1,1],
                      [-1,-1],
                      [1,-1]]

        all_data = []
        for yi in self.data:
            for featureset in self.data[yi]:
                for feature in featureset:
                    all_data.append(feature)

        self.max_feature_value = max(all_data)
        self.min_feature_value = min(all_data)
        all_data = None

        step_sizes = [self.max_feature_value * 0.1,
                      self.max_feature_value * 0.01,
                      # point of expense:
                      self.max_feature_value * 0.001,]

        # extremely expensive
        b_range_multiple = 5
        #
        b_multiple = 5
        latest_optimum = self.max_feature_value*10

        for step in step_sizes:
            w = np.array([latest_optimum,latest_optimum])
            # we can do this because convex
            optimized = False
            while not optimized:
                pass

    def predict(self,features):
        # sign( x.w+b )
        classification = np.sign(np.dot(np.array(features),self.w)+self.b)
        return classification

data_dict = {-1:np.array([[1,7],
                          [2,8],
                          [3,8],]),

             1:np.array([[5,1],
                         [6,-1],
                         [7,3],])}
Picking up with the while not optimized part:
optimized = False
while not optimized:
    for b in np.arange(-1*(self.max_feature_value*b_range_multiple),
                       self.max_feature_value*b_range_multiple,
                       step*b_multiple):
Here, we also begin iterating through possible b values, and now you can see the b-related variables we set earlier in action. I will note that we're iterating straight through b with a constant step size; we could also shrink the b steps as we go, just like we do with w. To make things more accurate and precise, you would probably want to implement that. That said, I am going to skip it for brevity, since we'll achieve similar results either way and we're not trying to win any awards here.
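To make that range concrete: with the data_dict from this series, max_feature_value comes out to 8, so on the first (largest) step size the b loop works out to roughly the following. This is just an illustrative trace of the values being searched, not new code for the class:

import numpy as np

max_feature_value = 8            # max(all_data) for the data_dict above
b_range_multiple = 5
b_multiple = 5
step = max_feature_value * 0.1   # first, largest step size: 0.8

# the b loop in fit() then expands to:
bs = np.arange(-1*(max_feature_value*b_range_multiple),   # -40
               max_feature_value*b_range_multiple,        #  40 (exclusive)
               step*b_multiple)                            #  step of 4
print(bs)   # [-40. -36. -32. ... 28. 32. 36.]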
Adding further:
optimized = False
while not optimized:
    for b in np.arange(-1*(self.max_feature_value*b_range_multiple),
                       self.max_feature_value*b_range_multiple,
                       step*b_multiple):
        for transformation in transforms:
            w_t = w*transformation
            found_option = True
            # weakest link in the SVM fundamentally
            # SMO attempts to fix this a bit
            # yi(xi.w+b) >= 1
            #
            # #### add a break here later..
            for i in self.data:
                for xi in self.data[i]:
                    yi=i
                    if not yi*(np.dot(w_t,xi)+b) >= 1:
                        found_option = False

            if found_option:
                opt_dict[np.linalg.norm(w_t)] = [w_t,b]
Now we iterate through each of the transformations, testing each resulting w_t against our constraint requirements. If any featureset in our data fails the constraint, we toss those values of w and b and move on. I commented in a suggestion for a break here: if even a single featureset doesn't satisfy the constraint, that's enough to throw out this w and b, so there's no point checking the rest. You could break there, as well as in the preceding for loop. For now, I will leave the code as I originally had it, but I thought of the change while filming the video version.
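As a rough sketch of that suggested change (not the code we're actually carrying forward), the constraint check could be pulled into a small helper that bails out on the first failure. The meets_constraints name here is hypothetical, just for illustration:

import numpy as np

def meets_constraints(data, w_t, b):
    # True only if every featureset satisfies yi*(xi.w + b) >= 1
    for yi in data:
        for xi in data[yi]:
            if not yi*(np.dot(w_t, xi) + b) >= 1:
                return False   # one failing featureset is enough; stop immediately
    return True

# inside fit(), the check would then read:
#     if meets_constraints(self.data, w_t, b):
#         opt_dict[np.linalg.norm(w_t)] = [w_t, b]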
Now we finish off the fit method, which I will post in full and then explain the additions:
    def fit(self, data):
        self.data = data
        # { ||w||: [w,b] }
        opt_dict = {}

        transforms = [[1,1],
                      [-1,1],
                      [-1,-1],
                      [1,-1]]

        all_data = []
        for yi in self.data:
            for featureset in self.data[yi]:
                for feature in featureset:
                    all_data.append(feature)

        self.max_feature_value = max(all_data)
        self.min_feature_value = min(all_data)
        all_data = None

        # support vectors yi(xi.w+b) = 1

        step_sizes = [self.max_feature_value * 0.1,
                      self.max_feature_value * 0.01,
                      # point of expense:
                      self.max_feature_value * 0.001,]

        # extremely expensive
        b_range_multiple = 5
        # we dont need to take as small of steps
        # with b as we do w
        b_multiple = 5
        latest_optimum = self.max_feature_value*10

        for step in step_sizes:
            w = np.array([latest_optimum,latest_optimum])
            # we can do this because convex
            optimized = False
            while not optimized:
                for b in np.arange(-1*(self.max_feature_value*b_range_multiple),
                                   self.max_feature_value*b_range_multiple,
                                   step*b_multiple):
                    for transformation in transforms:
                        w_t = w*transformation
                        found_option = True
                        # weakest link in the SVM fundamentally
                        # SMO attempts to fix this a bit
                        # yi(xi.w+b) >= 1
                        #
                        # #### add a break here later..
                        for i in self.data:
                            for xi in self.data[i]:
                                yi=i
                                if not yi*(np.dot(w_t,xi)+b) >= 1:
                                    found_option = False

                        if found_option:
                            opt_dict[np.linalg.norm(w_t)] = [w_t,b]

                if w[0] < 0:
                    optimized = True
                    print('Optimized a step.')
                else:
                    w = w - step

            norms = sorted([n for n in opt_dict])
            #||w|| : [w,b]
            opt_choice = opt_dict[norms[0]]
            self.w = opt_choice[0]
            self.b = opt_choice[1]
            latest_optimum = opt_choice[0][0]+step*2
Once we've passed zero with our stepping of the w vector, there's no reason to continue, since the transformations have already tested the negative values for us, so we're done with that step size and either move on to the next one or finish entirely. If we haven't passed 0, we take another step. Once we've taken all of the steps for a given step size, we sort a list of all the keys from our opt_dict (which contains ||w|| : [w,b]). We want the w vector with the smallest magnitude, so we take the first item in that list. From there we set self.w and self.b, update latest_optimum, and then either take another pass with the next (smaller) step size or finish the entire process (if we have no more step sizes to work through).
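To make the selection step concrete, here is a tiny, self-contained illustration with made-up numbers (these w and b values are not from the actual optimization run) of how the smallest ||w|| gets picked out of opt_dict:

import numpy as np

# hypothetical contents of opt_dict after a pass: { ||w|| : [w, b] }
opt_dict = {5.0: [np.array([3.0, -4.0]),  1.2],
            2.5: [np.array([1.5, -2.0]), -0.4]}

norms = sorted([n for n in opt_dict])   # [2.5, 5.0]
opt_choice = opt_dict[norms[0]]         # the entry with the smallest ||w||
w, b = opt_choice[0], opt_choice[1]
print(w, b)                             # [ 1.5 -2. ] -0.4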
At this point, our full code:
import matplotlib.pyplot as plt
from matplotlib import style
import numpy as np
style.use('ggplot')

class Support_Vector_Machine:
    def __init__(self, visualization=True):
        self.visualization = visualization
        self.colors = {1:'r',-1:'b'}
        if self.visualization:
            self.fig = plt.figure()
            self.ax = self.fig.add_subplot(1,1,1)

    # train
    def fit(self, data):
        self.data = data
        # { ||w||: [w,b] }
        opt_dict = {}

        transforms = [[1,1],
                      [-1,1],
                      [-1,-1],
                      [1,-1]]

        all_data = []
        for yi in self.data:
            for featureset in self.data[yi]:
                for feature in featureset:
                    all_data.append(feature)

        self.max_feature_value = max(all_data)
        self.min_feature_value = min(all_data)
        all_data = None

        # support vectors yi(xi.w+b) = 1

        step_sizes = [self.max_feature_value * 0.1,
                      self.max_feature_value * 0.01,
                      # point of expense:
                      self.max_feature_value * 0.001,]

        # extremely expensive
        b_range_multiple = 5
        # we dont need to take as small of steps
        # with b as we do w
        b_multiple = 5
        latest_optimum = self.max_feature_value*10

        for step in step_sizes:
            w = np.array([latest_optimum,latest_optimum])
            # we can do this because convex
            optimized = False
            while not optimized:
                for b in np.arange(-1*(self.max_feature_value*b_range_multiple),
                                   self.max_feature_value*b_range_multiple,
                                   step*b_multiple):
                    for transformation in transforms:
                        w_t = w*transformation
                        found_option = True
                        # weakest link in the SVM fundamentally
                        # SMO attempts to fix this a bit
                        # yi(xi.w+b) >= 1
                        #
                        # #### add a break here later..
                        for i in self.data:
                            for xi in self.data[i]:
                                yi=i
                                if not yi*(np.dot(w_t,xi)+b) >= 1:
                                    found_option = False

                        if found_option:
                            opt_dict[np.linalg.norm(w_t)] = [w_t,b]

                if w[0] < 0:
                    optimized = True
                    print('Optimized a step.')
                else:
                    w = w - step

            norms = sorted([n for n in opt_dict])
            #||w|| : [w,b]
            opt_choice = opt_dict[norms[0]]
            self.w = opt_choice[0]
            self.b = opt_choice[1]
            latest_optimum = opt_choice[0][0]+step*2

    def predict(self,features):
        # sign( x.w+b )
        classification = np.sign(np.dot(np.array(features),self.w)+self.b)
        return classification

data_dict = {-1:np.array([[1,7],
                          [2,8],
                          [3,8],]),

             1:np.array([[5,1],
                         [6,-1],
                         [7,3],])}
Now we're ready to visualize and test predictions with our support vector machine. We'll pick that up in the next tutorial.
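If you want to sanity-check things before then, a minimal usage sketch (assuming the class and data_dict defined above) looks like this; [4,5] is just an arbitrary example point, not part of the training data:

svm = Support_Vector_Machine()
svm.fit(data=data_dict)

# classify an arbitrary new point; prints 1.0 or -1.0 depending on the learned w and b
print(svm.predict([4,5]))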