Support Vector Machine - Python Implementation Using CVXOPT
Background
This notebook assumes previous knowledge and understanding of the mathematics behind SVMs and the formulation of the
primal / dual optimization problem. For a summary of this topic please have a look at the following post on stats.stackexchange:
https://stats.stackexchange.com/questions/23391/how-does-a-support-vector-machine-svm-work/353605#353605
Libraries
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import seaborn as sns
sns.set()
#Data set
x_neg = np.array([[3,4],[1,4],[2,3]])
y_neg = np.array([-1,-1,-1])
x_pos = np.array([[6,-1],[7,-1],[5,-3]])
y_pos = np.array([1,1,1])
x1 = np.linspace(-10,10)
x = np.vstack((np.linspace(-10,10),np.linspace(-10,10)))
#Plot
fig = plt.figure(figsize = (10,10))
plt.scatter(x_neg[:,0], x_neg[:,1], marker = 'x', color = 'r', label = 'Negative -1')
plt.scatter(x_pos[:,0], x_pos[:,1], marker = 'o', color = 'b',label = 'Positive +1')
plt.plot(x1, x1 - 3, color = 'darkblue')
plt.plot(x1, x1 - 7, linestyle = '--', alpha = .3, color = 'b')
plt.plot(x1, x1 + 1, linestyle = '--', alpha = .3, color = 'r')
plt.xlim(0,10)
plt.ylim(-5,5)
plt.xticks(np.arange(0, 10, step=1))
plt.yticks(np.arange(-5, 5, step=1))
#Lines
plt.axvline(0, color = 'black', alpha = .5)
plt.axhline(0,color = 'black', alpha = .5)
plt.plot([2,6],[3,-1], linestyle = '-', color = 'darkblue', alpha = .5 )
plt.plot([4,6],[1,1],[6,6],[1,-1], linestyle = ':', color = 'darkblue', alpha = .5 )
plt.plot([0,1.5],[0,-1.5],[6,6],[1,-1], linestyle = ':', color = 'darkblue', alpha = .5 )
#Annotations
plt.annotate('$A \\ (6,-1)$', xy = (5,-1), xytext = (6,-1.5))
plt.annotate('$B \\ (2,3)$', xy = (2,3), xytext = (2,3.5))
plt.annotate('$2$', xy = (5,1.2), xytext = (5,1.2))
plt.annotate('$2$', xy = (6.2,.5), xytext = (6.2,.5))
plt.annotate('$2\\sqrt{2}$', xy = (4.5,-.5), xytext = (4.5,-.5))
plt.annotate('$2\\sqrt{2}$', xy = (2.5,1.5), xytext = (2.5,1.5))
plt.annotate('$w^Tx + b = 0$', xy = (8,4.5), xytext = (8,4.5))
plt.annotate('$(\\frac{1}{4},-\\frac{1}{4}) \\binom{x_1}{x_2} - \\frac{3}{4} = 0$', xy = (7.5,4), xytext = (7.5,4))
plt.annotate('$\\frac{3}{\\sqrt{2}}$', xy = (.5,-1), xytext = (.5,-1))
plt.legend(loc = 'lower right')
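As a quick check (not in the original post), the annotated hyperplane $w = (\frac{1}{4}, -\frac{1}{4})$, $b = -\frac{3}{4}$ puts the support vectors $A(6,-1)$ and $B(2,3)$ exactly on the margins $w^Tx + b = \pm 1$:

#Sanity check of the annotated geometry (illustrative sketch)
w_plot = np.array([0.25, -0.25])
b_plot = -0.75
print(w_plot @ np.array([6, -1]) + b_plot)   #  1.0 -> A lies on the +1 margin
print(w_plot @ np.array([2, 3]) + b_plot)    # -1.0 -> B lies on the -1 margin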
Implementing the SVM algorithm (Hard margin)
Case 1) Linearly separable, binary classification
We will solve the dual QP with CVXOPT, determine the set of support vectors $S$ by finding the indices such that $\alpha_i > 0$, and then recover $w$ and $b$. CVXOPT solves quadratic programs in the standard form
$$\min_x \ \frac{1}{2} x^T P x + q^T x$$
$$\text{s.t.} \quad Gx \leq h, \quad Ax = b$$
with the corresponding API call
cvxopt.solvers.qp(P, q[, G, h[, A, b[, solver[, initvals]]]])
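To make the call signature concrete, here is a minimal toy sketch (not from the original post): minimize $\frac{1}{2}(x_1^2 + x_2^2) - x_1$ subject to $x_1 + x_2 = 1$ and $x \geq 0$, whose optimum is $x = (1, 0)$:

#Toy QP to illustrate the cvxopt API (illustrative sketch)
import numpy as np
from cvxopt import matrix as cvxopt_matrix
from cvxopt import solvers as cvxopt_solvers

P = cvxopt_matrix(np.eye(2))              #quadratic term
q = cvxopt_matrix(np.array([-1.0, 0.0]))  #linear term
G = cvxopt_matrix(-np.eye(2))             #-x <= 0, i.e. x >= 0
h = cvxopt_matrix(np.zeros(2))
A = cvxopt_matrix(np.ones((1, 2)))        #x1 + x2 = 1
b = cvxopt_matrix(np.ones(1))
sol = cvxopt_solvers.qp(P, q, G, h, A, b)
print(np.array(sol['x']))                 #expected optimum: [1, 0]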
The dual problem of the hard margin SVM is

$$\max_{\alpha} \ \sum_i^m \alpha_i - \frac{1}{2} \sum_{i,j}^m y^{(i)} y^{(j)} \alpha_i \alpha_j \langle x^{(i)}, x^{(j)} \rangle$$
$$\text{s.t.} \quad \alpha_i \geq 0, \quad \sum_i^m \alpha_i y^{(i)} = 0$$
We convert the sums into vector form and multiply both the objective and the constraints by $-1$, which turns this into a minimization problem and reverses the inequalities:
$$\min_{\alpha} \ \frac{1}{2} \alpha^T H \alpha - 1^T \alpha$$
$$\text{s.t.} \quad -\alpha_i \leq 0, \quad y^T \alpha = 0$$

where $H$ is the $m \times m$ matrix with entries $H_{ij} = y^{(i)} y^{(j)} \langle x^{(i)}, x^{(j)} \rangle$.
We are now ready to convert our numpy arrays into the cvxopt format, using the same notation as in the documentation this
gives
$P := H$ a matrix of size $m \times m$
$q := -\vec{1}$ a vector of size $m \times 1$
$G := -\mathrm{diag}[1]$ a diagonal matrix of $-1$s of size $m \times m$
$h := \vec{0}$ a vector of zeros of size $m \times 1$
$A := y^T$ the label vector, as a row vector of size $1 \times m$
$b := 0$ a scalar
Note that in the simple example of $m = 2$, the matrix $G$ and vector $h$ which define the inequality constraint are

$$G = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} \quad \text{and} \quad h = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$
Consider, for instance, a training set of $m = 2$ samples in two dimensions:

$$X = \begin{bmatrix} x_1^{(1)} & x_2^{(1)} \\ x_1^{(2)} & x_2^{(2)} \end{bmatrix} \qquad y = \begin{bmatrix} y^{(1)} \\ y^{(2)} \end{bmatrix}$$
We now proceed to creating a new matrix $X'$ where each input sample $x$ is multiplied by the corresponding output label $y$. This can be done easily in numpy using vectorization and broadcasting:
$$X' = \begin{bmatrix} x_1^{(1)} y^{(1)} & x_2^{(1)} y^{(1)} \\ x_1^{(2)} y^{(2)} & x_2^{(2)} y^{(2)} \end{bmatrix}$$
$$H = X' X'^T = \begin{bmatrix} x_1^{(1)} y^{(1)} & x_2^{(1)} y^{(1)} \\ x_1^{(2)} y^{(2)} & x_2^{(2)} y^{(2)} \end{bmatrix} \begin{bmatrix} x_1^{(1)} y^{(1)} & x_1^{(2)} y^{(2)} \\ x_2^{(1)} y^{(1)} & x_2^{(2)} y^{(2)} \end{bmatrix}$$
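In numpy this is a one-liner thanks to broadcasting. Below is a minimal sketch, assuming the toy arrays x_neg, x_pos, y_neg, y_pos from the first code block; the names X, y, m and H are reused by the snippets that follow:

#Stack the toy data and compute H (illustrative sketch)
X = np.vstack((x_neg, x_pos))
y = np.concatenate((y_neg, y_pos)).reshape(-1,1) * 1.   #labels as a float column
m = X.shape[0]
X_dash = y * X            #broadcasting: each sample scaled by its label
H = X_dash @ X_dash.T     #the m x m matrix H derived above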
Implementation in Python
CVXOPT solver and resulting α
#Importing with custom names to avoid issues with numpy / sympy matrix
from cvxopt import matrix as cvxopt_matrix
from cvxopt import solvers as cvxopt_solvers
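#The construction of the solver inputs is not shown in this extract;
#a minimal sketch following the mapping above (H, y, m as defined in the
#previous snippet -- an assumed reconstruction, not the author's exact code)
P = cvxopt_matrix(H)
q = cvxopt_matrix(-np.ones((m, 1)))
G = cvxopt_matrix(-np.eye(m))
h = cvxopt_matrix(np.zeros(m))
A = cvxopt_matrix(y.reshape(1, -1))
b = cvxopt_matrix(np.zeros(1))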
#Run solver
sol = cvxopt_solvers.qp(P, q, G, h, A, b)
alphas = np.array(sol['x'])
#w parameter in vectorized form
w = ((y * alphas).T @ X).reshape(-1,1)
#Selecting the set of indices S with non zero alphas (the support vectors)
S = (alphas > 1e-4).flatten()
#Computing b
b = y[S] - np.dot(X[S], w)
#Display results
print('Alphas = ',alphas[alphas > 1e-4])
print('w = ', w.flatten())
print('b = ', b[0])
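As a usage sketch (not from the original post), a new point can be classified with the recovered hyperplane; x_new below is a made-up test point:

x_new = np.array([6.0, 0.0])          #hypothetical test point
print(np.sign(x_new @ w + b[0]))      #expected +1: the point falls on the positive side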
#New data set: the added point (2,4) in the positive class makes it non separable
x_neg = np.array([[3,4],[1,4],[2,3]])
y_neg = np.array([-1,-1,-1])
x_pos = np.array([[6,-1],[7,-1],[5,-3],[2,4]])
y_pos = np.array([1,1,1,1])
x1 = np.linspace(-10,10)
x = np.vstack((np.linspace(-10,10),np.linspace(-10,10)))
#Plot
fig = plt.figure(figsize = (10,10))
plt.scatter(x_neg[:,0], x_neg[:,1], marker = 'x', color = 'r', label = 'Negative -1')
plt.scatter(x_pos[:,0], x_pos[:,1], marker = 'o', color = 'b', label = 'Positive +1')
#Lines
plt.axvline(0, color = 'black', alpha = .5)
plt.axhline(0, color = 'black', alpha = .5)
plt.xlabel('$x_1$')
plt.ylabel('$x_2$')
plt.legend(loc = 'lower right')
Case 2) Non fully linearly separable, binary classification
For the soft margin SVM, recall that the optimization problem can be expressed as
$$\max_{\alpha} \ \sum_i^m \alpha_i - \frac{1}{2} \alpha^T H \alpha$$
$$\text{s.t.} \quad 0 \leq \alpha_i \leq C, \quad \sum_i^m \alpha_i y^{(i)} = 0$$

As before, multiplying by $-1$ turns this into a minimization problem in standard form:

$$\min_{\alpha} \ \frac{1}{2} \alpha^T H \alpha - 1^T \alpha$$
$$\text{s.t.} \quad -\alpha_i \leq 0, \quad \alpha_i \leq C, \quad y^T \alpha = 0$$
This is almost the same problem as previously, except for the additional inequality constraint on $\alpha$. We translate this new constraint into standard form by stacking a diagonal matrix of 1s of size $m \times m$ below the matrix $G$. Similarly, the value $C$ is appended $m$ times to the vector $h$.
Note that in the simple example of m = 2 the matrix G and vector h which define the constraint are
$$G = \begin{bmatrix} -1 & 0 \\ 0 & -1 \\ 1 & 0 \\ 0 & 1 \end{bmatrix} \quad \text{and} \quad h = \begin{bmatrix} 0 \\ 0 \\ C \\ C \end{bmatrix}$$
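In numpy, the stacked constraints and the solver inputs for the new data set can be built as below. This is a sketch reusing the recipe from the hard margin case; C = 10 is an assumption chosen to match the sklearn comparison that follows:

#Rebuild solver inputs for the non separable data set (illustrative sketch)
X = np.vstack((x_neg, x_pos))
y = np.concatenate((y_neg, y_pos)).reshape(-1,1) * 1.
m = X.shape[0]
X_dash = y * X
C = 10                                                       #assumed value
P = cvxopt_matrix(X_dash @ X_dash.T)
q = cvxopt_matrix(-np.ones((m, 1)))
G = cvxopt_matrix(np.vstack((-np.eye(m), np.eye(m))))        #-I stacked on I
h = cvxopt_matrix(np.hstack((np.zeros(m), np.ones(m) * C)))  #0s then C, m times
A = cvxopt_matrix(y.reshape(1, -1))
b = cvxopt_matrix(np.zeros(1))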
#Run solver
sol = cvxopt_solvers.qp(P, q, G, h, A, b)
alphas = np.array(sol['x'])
#Recompute w and b; for the soft margin, b must be computed from the
#support vectors strictly inside the box (0 < alpha < C)
w = ((y * alphas).T @ X).reshape(-1,1)
S = ((alphas > 1e-4) & (alphas < C - 1e-4)).flatten()
b = y[S] - np.dot(X[S], w)
#Display results
print('Alphas = ',alphas[alphas > 1e-4])
print('w = ', w.flatten())
print('b = ', b[0])
#Comparison with sklearn's SVC using a linear kernel and the same C
from sklearn.svm import SVC

clf = SVC(C = 10, kernel = 'linear')
clf.fit(X, y.ravel())
print('w = ',clf.coef_)
print('b = ',clf.intercept_)
print('Indices of support vectors = ', clf.support_)
print('Support vectors = ', clf.support_vectors_)
print('Number of support vectors for each class = ', clf.n_support_)
print('Coefficients of the support vector in the decision function = ', np.abs(clf.dual_coef_))
w = [[ 0.25 -0.25]]
b = [-0.75]
Indices of support vectors = [0 2 3 6]
Support vectors = [[ 3. 4.]
[ 2. 3.]
[ 6. -1.]
[ 2. 4.]]
Number of support vectors for each class = [2 2]
Coefficients of the support vector in the decision function = [[ 5.      6.3125  1.3125 10.    ]]
References
http://goelhardik.github.io/2016/11/28/svm-cvxopt/
https://cvxopt.org/userguide/coneprog.html#cvxopt.solvers.coneqp
Comments
SVM requires an optimization algorithm, but not necessarily a QP or any particular type of solver. As long as your solver finds the optimal solutions given the constraints then you are OK.