Support Vector Machine - Part 2 - Computing the Margin
Introduction
In the previous article, we saw that the goal of SVM is to find the hyperplane that maximizes the margin of the training data. We also learned that the margin is twice the distance between the hyperplane and the data point closest to it. In this part, we will see how to compute the margin by applying a few vector operations.
The Equation of the Hyperplane
Before we compute the margin, we need the equation of the hyperplane, as it will be used to determine the position of a point relative to the hyperplane.
If we apply SVM in a two-dimensional space, the hyperplane is simply a line, whose equation is y = ax + b. Now, suppose we have two vectors, namely w = (-b, -a, 1) and x = (1, x, y). We will show that these two vectors give us another way to write the line equation, so that we end up with two equivalent representations. Afterwards, we will choose the representation that best suits the hyperplane, based on several considerations.
Here is an illustration of the computation.
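In case it helps to have the computation written out, expanding the dot product of w and x gives:

w · x = (-b)(1) + (-a)(x) + (1)(y) = y - ax - b

so the condition w · x = 0 is exactly the line equation y = ax + b, just written in vector form.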
Based on the above computation, we can see that both equations describe the same line; in other words, we have found another way to express the line equation. The new form uses two vectors as the variables and combines them with a dot product. We will use this form as the equation of the hyperplane, for the following reasons:
- It is easier to carry this equation over to higher-dimensional spaces, since it is expressed in terms of vectors, which makes it more flexible.
- The _dot product_ equals 0, which means that the vector **w** is always perpendicular to the hyperplane (it is a normal vector). This is very helpful when we want to compute the distance between a data point and the hyperplane, as sketched below.
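To make these two points concrete, here is a minimal sketch in Python. The line y = 2x + 1 (a = 2, b = 1) is only an assumed example, not taken from the article's figures; the code checks that w · x = 0 for points on the line and that the (x, y) part of w is perpendicular to the line's direction.

```python
import numpy as np

# Hypothetical example line: y = 2x + 1 (a = 2, b = 1); these numbers are
# chosen for illustration only and do not come from the article's figures.
a, b = 2.0, 1.0

# Vector form of the line equation: w = (-b, -a, 1) and x = (1, x, y).
w = np.array([-b, -a, 1.0])

def augmented(x, y):
    """Return the point (x, y) as the augmented vector (1, x, y)."""
    return np.array([1.0, x, y])

# Every point on the line satisfies w . x = 0.
for x in [-1.0, 0.0, 3.0]:
    y = a * x + b                       # a point on the line y = ax + b
    print(np.dot(w, augmented(x, y)))   # 0.0 each time

# The (x, y) part of w, namely (-a, 1), is perpendicular to the line's
# direction vector (1, a): their dot product is -a + a = 0.
print(np.dot(np.array([-a, 1.0]), np.array([1.0, a])))  # 0.0
```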
Computing the Margin
Let’s take a look at this illustration.
We can see that the vector w is perpendicular to the hyperplane. Based on the previous explanation, we can read the vector's components directly from the coefficients of the line equation in standard form: if the line equation is y - ax - b = 0, then w = (-b, -a, 1). In the above illustration the line is y = 3x (so a = 3 and b = 0), which gives w = (0, -3, 1), whereas the vector x is (1, x, y). In this case we can drop the first component (0), as it only encodes the offset of the hyperplane from the origin (0, 0).
Our task is to compute the distance between the data point A and the hyperplane; in other words, we want to find the norm (the magnitude) of the vector d. Since the vector d is the projection of the vector a onto the vector w, we can apply the projection formula d = (u · a)u, where u = w / ||w|| is the unit vector in the direction of w.
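Written out with that unit vector, the length of the projection, which is exactly the distance we are after, is:

||d|| = |u · a| = |w · a| / ||w||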
After we have the distance, simply double the value to get the margin.
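Putting the two steps together, here is a minimal numerical sketch in Python. The hyperplane y = 3x comes from the article's example (w = (-3, 1) once the bias component is dropped), while the coordinates of the point A are an assumption made purely for illustration.

```python
import numpy as np

# Article's example hyperplane y = 3x, i.e. w = (-3, 1) once the bias
# component (0) is dropped. The data point A is an assumed example and
# is not taken from the article's figure.
w = np.array([-3.0, 1.0])
A = np.array([3.0, 4.0])   # hypothetical data point

# Unit vector in the direction of w.
u = w / np.linalg.norm(w)

# d is the projection of A onto u: d = (u . A) u.
d = np.dot(u, A) * u

# The distance from A to the hyperplane is the norm of d,
# and the margin is twice that distance.
distance = np.linalg.norm(d)
margin = 2 * distance

print(distance)  # |(-3)(3) + (1)(4)| / sqrt(10) = 5 / sqrt(10) ≈ 1.5811
print(margin)    # ≈ 3.1623
```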
That’s all for the second part of this SVM tutorial. Next, we’ll see how to find the optimal hyperplane now that we know how to compute the margin.
References
- Images - Personal Documents