Machine learning SVM – the usefulness of kernels

If you’ve read through how Support Vector Machines work, you probably know the simple linear SVM might not work in all cases… but how does it fail? Let’s take a look at an example similar to my earlier simple one, but grown to a larger space than just 4 points: a 4×4 grid, with a positive region in the middle and a negative region around it (the labelled areas to learn):

#See also https://github.com/rogue-hack-lab/PythonMachineLearningSVMExamples/blob/master/coffee.py
from __future__ import division
# [row,col] coordinates:
data = [[ 0,0 ], [ 0,1 ], [ 0,2 ], [ 0,3 ],
        [ 1,0 ], [ 1,1 ], [ 1,2 ], [ 1,3 ],
        [ 2,0 ], [ 2,1 ], [ 2,2 ], [ 2,3 ],
        [ 3,0 ], [ 3,1 ], [ 3,2 ], [ 3,3 ],
 ]

#Nonlinearly separable labels: only the central 2x2 block is positive.
category = [ -1,  -1,  -1, -1,
             -1,   1,   1, -1,
             -1,   1,   1, -1,
             -1,  -1,  -1, -1, ]

import numpy
from sklearn.svm import SVC

clf = SVC(kernel='linear', C=1)
clf.fit(data, category)

#Get the learned M coefficients and intercept b:
coef = clf.coef_[0]
b = clf.intercept_[0]

print('This is the M*X+b=0 equation...')
print('M=%s' % (coef))
print('b=%s' % (b))
print('So the equation of the separating line in this 2d svm is:')
print('%f*x + %f*y + %f = 0' % (coef[0],coef[1],b))
print('The support vector limit lines are:')
print('%f*x + %f*y + %f = -1' % (coef[0],coef[1],b))
print('%f*x + %f*y + %f = 1' % (coef[0],coef[1],b))

#Reshape the coefficients into a column vector so numpy.dot(d, vertmatrix) computes M*x:
vertmatrix = [[x] for x in coef]

good = 0
bad = 0
for i, d in enumerate(data):
	#Evaluate M*x + b for the i-th data point d:
	calculatedValue = numpy.dot(d, vertmatrix)[0] + b
	print( 'Mx+b for x=%s calculates to %s' % (d, calculatedValue) )
	if calculatedValue > 0 and category[i] > 0:
		good += 1
	elif calculatedValue < 0 and category[i] < 0:
		good += 1
	else:
		bad += 1 #the sign should have matched the category.
		
print('accuracy=%f' % (good/(good+bad)) )
#The same as the builtin "score" accuracy:
print('accuracy=%f' % clf.score(data, category) )

while True:
	#Try other points, e.g. [0, 2] (eval turns the typed text into a list):
	d = eval(input('vector='))
	calculatedValue = numpy.dot(d, vertmatrix)[0] + b
	print(calculatedValue)

Now the result is not great: the linear separator can’t do anything with this layout, and the best solution it finds is a degenerate one that labels every point negative:

This is the M*X+b=0 equation...
M=[0. 0.]
b=-1.0
So the equation of the separating line in this 2d svm is:
0.000000*x + 0.000000*y + -1.000000 = 0
The support vector limit lines are:
0.000000*x + 0.000000*y + -1.000000 = -1
0.000000*x + 0.000000*y + -1.000000 = 1
Mx+b for x=[0, 0] calculates to -1.0
Mx+b for x=[0, 1] calculates to -1.0
Mx+b for x=[0, 2] calculates to -1.0
Mx+b for x=[0, 3] calculates to -1.0
Mx+b for x=[1, 0] calculates to -1.0
Mx+b for x=[1, 1] calculates to -1.0
Mx+b for x=[1, 2] calculates to -1.0
Mx+b for x=[1, 3] calculates to -1.0
Mx+b for x=[2, 0] calculates to -1.0
Mx+b for x=[2, 1] calculates to -1.0
Mx+b for x=[2, 2] calculates to -1.0
Mx+b for x=[2, 3] calculates to -1.0
Mx+b for x=[3, 0] calculates to -1.0
Mx+b for x=[3, 1] calculates to -1.0
Mx+b for x=[3, 2] calculates to -1.0
Mx+b for x=[3, 3] calculates to -1.0
accuracy=0.750000
accuracy=0.750000
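Before reaching for a kernel, it’s worth convincing yourself this is a structural failure and not a tuning problem: every positive point is the midpoint of two negative points (for example [1,1] sits halfway between the negatives [0,2] and [2,0]), so no single line can put the center block on its own side. Here’s a minimal sketch to confirm it, reusing data, category, and SVC from the script above (the C values are arbitrary picks of mine, not from the original script):

#Sweep the regularization parameter C (arbitrary illustration values).
#No amount of regularization should rescue a linear kernel here; the score stays at 0.75.
for c in [0.01, 1, 100, 10000]:
	clf = SVC(kernel='linear', C=c)
	clf.fit(data, category)
	print('C=%s accuracy=%f' % (c, clf.score(data, category)))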

So what if we use a kernel function, as is so often done with SVMs? I changed this line to find out:

clf = SVC(kernel='rbf', C=1)

and removed the coef_ extraction and the M*x+b logic, which of course can’t work with a nonlinear boundary:

from __future__ import division
# [row,col] coordinates:
data = [[ 0,0 ], [ 0,1 ], [ 0,2 ], [ 0,3 ],
        [ 1,0 ], [ 1,1 ], [ 1,2 ], [ 1,3 ],
        [ 2,0 ], [ 2,1 ], [ 2,2 ], [ 2,3 ],
        [ 3,0 ], [ 3,1 ], [ 3,2 ], [ 3,3 ],
 ]

#Nonlinearly separable labels: only the central 2x2 block is positive.
category = [ -1,  -1,  -1, -1,
             -1,   1,   1, -1,
             -1,   1,   1, -1,
             -1,  -1,  -1, -1, ]

import numpy
from sklearn.svm import SVC

clf = SVC(kernel='rbf', C=1)
clf.fit(data, category)

print('accuracy=%f' % clf.score(data, category) )

while True:
	#Try other points, e.g. [[1,1]] (eval turns the typed text into a list of points):
	d = input('vector=')
	calculatedValue = clf.predict(eval(d))
	print(calculatedValue)

With this revised code the output shows 100% accuracy, and new inputs are separated according to that positive region centered on [1.5, 1.5]:

accuracy=1.000000
vector=[[1,1]]
[1]
vector=[[-1,-1]]
[-1]
vector=[[1.5, 1.5]]
[1]
vector=[[3.1, 3.1]]
[-1]
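If you’re curious what the RBF kernel is actually computing, the fitted model’s decision function is a weighted sum of Gaussian bumps centered on the support vectors: f(x) = sum_i dual_coef_i * exp(-gamma * ||sv_i - x||^2) + b. Here’s a minimal sketch that rebuilds it by hand from the fitted clf above and checks it against scikit-learn’s own decision_function (it assumes the default gamma='scale' setting, which scikit-learn defines as 1 / (n_features * X.var())):

#Rebuild the RBF decision function from the fitted model's public attributes.
X = numpy.array(data)
gamma = 1.0 / (X.shape[1] * X.var()) #the gamma='scale' default

def rbf_decision(point):
	point = numpy.array(point)
	#Gaussian similarity of the query point to each support vector:
	k = numpy.exp(-gamma * ((clf.support_vectors_ - point) ** 2).sum(axis=1))
	return numpy.dot(clf.dual_coef_[0], k) + clf.intercept_[0]

for p in [[1, 1], [1.5, 1.5], [-1, -1], [3.1, 3.1]]:
	print('by hand: %f  sklearn: %f' % (rbf_decision(p), clf.decision_function([p])[0]))

Positive values mean the point falls in the center region (category 1), negative values mean it falls outside; the two columns should agree.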

For more information on how SVMs work in scikit-learn, check out the full documentation!
