# Example

Suppose the dataset contains two positive samples $x^{(1)}=[1,1]^T$ and $x^{(2)}=[2,2]^T$, and two negative samples $x^{(3)}=[0,0]^T$ and $x^{(4)}=[-1,0]^T$. Please calculate the SVM decision hyperplane.

# Calculate

$$
\min_\lambda \mathcal{J}(\lambda) = \frac{1}{2}\sum_{i=1}^N\sum_{j=1}^N \lambda_i\lambda_j y^{(i)}y^{(j)}(x^{(i)})^Tx^{(j)} - \sum_{i=1}^N\lambda_i
$$

$$
\text{s.t.}\quad \lambda_i \geqslant 0, \quad \sum_{i=1}^N\lambda_iy^{(i)}=0
$$

Substituting the dataset $D:\{x:\{[1,1],[2,2],[0,0],[-1,0]\},\ y:\{1,1,-1,-1\}\}$ gives:

$$
\begin{aligned}
\min_\lambda \mathcal{J}(\lambda) ={}& \frac{1}{2}(2\lambda_1^2+8\lambda_2^2+\lambda_4^2+8\lambda_1\lambda_2+2\lambda_1\lambda_4+4\lambda_2\lambda_4) - \lambda_1-\lambda_2-\lambda_3-\lambda_4\\
\text{s.t.}\quad & \lambda_1 \geqslant 0,\ \lambda_2\geqslant 0,\ \lambda_3\geqslant 0,\ \lambda_4\geqslant 0\\
& \lambda_1+\lambda_2-\lambda_3-\lambda_4 = 0
\end{aligned}
$$
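The coefficients of the expanded objective can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the names `X`, `y`, `Q`, and `dual_objective` are illustrative and do not come from the original text.

```python
import numpy as np

# Samples and labels from the example above.
X = np.array([[1, 1], [2, 2], [0, 0], [-1, 0]], dtype=float)
y = np.array([1, 1, -1, -1], dtype=float)

# Q[i, j] = y_i * y_j * <x_i, x_j>; the quadratic part of the dual
# objective is 0.5 * lam^T Q lam.
Q = np.outer(y, y) * (X @ X.T)

def dual_objective(lam):
    """Value of J(lambda) for a vector of four multipliers."""
    lam = np.asarray(lam, dtype=float)
    return 0.5 * lam @ Q @ lam - lam.sum()

print(Q)
```

Each off-diagonal entry of `Q` appears twice in the double sum, so `Q[0, 1] = 4` yields the $8\lambda_1\lambda_2$ term, `Q[0, 3] = 1` yields $2\lambda_1\lambda_4$, and `Q[1, 3] = 2` yields $4\lambda_2\lambda_4$. The row and column for $x^{(3)}=[0,0]^T$ are zero, which is why $\lambda_3$ only appears in the linear part.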
Since $\lambda_1+\lambda_2 = \lambda_3+\lambda_4$, we have $\lambda_3 = \lambda_1+\lambda_2 - \lambda_4$, and the problem becomes:

$$
\begin{aligned}
\min_\lambda \mathcal{J}(\lambda) ={}& \lambda_1^2+4\lambda_2^2+\frac{1}{2}\lambda_4^2+4\lambda_1\lambda_2+\lambda_1\lambda_4+2\lambda_2\lambda_4 - 2\lambda_1-2\lambda_2\\
\text{s.t.}\quad & \lambda_1 \geqslant 0,\ \lambda_2\geqslant 0
\end{aligned}
$$

Taking the partial derivatives and setting them to zero:

$$
\left\{\begin{matrix}
\frac{\partial \mathcal{J}}{\partial \lambda_1} = 2\lambda_1 +4\lambda_2+\lambda_4-2=0 \\
\frac{\partial \mathcal{J}}{\partial \lambda_2} = 4\lambda_1 +8\lambda_2+2\lambda_4-2=0 \\
\frac{\partial \mathcal{J}}{\partial \lambda_4} = \lambda_1 +2\lambda_2+\lambda_4=0
\end{matrix}\right.
$$
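This system has no solution (twice the first equation gives $4\lambda_1+8\lambda_2+2\lambda_4-4=0$, which contradicts the second), so the minimum lies on the boundary: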
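The inconsistency of the three stationarity equations can also be confirmed with a rank test. This is a small sketch, assuming NumPy; the matrix `A` and vector `c` are just the coefficients of the system written as $A\,[\lambda_1,\lambda_2,\lambda_4]^T = c$, and the variable names are illustrative.

```python
import numpy as np

# Stationarity conditions written as A @ [lam1, lam2, lam4] = c.
A = np.array([[2, 4, 1],
              [4, 8, 2],
              [1, 2, 1]], dtype=float)
c = np.array([2, 2, 0], dtype=float)

rank_A = np.linalg.matrix_rank(A)
rank_aug = np.linalg.matrix_rank(np.column_stack([A, c]))

# A linear system is solvable iff rank(A) == rank([A | c]).
has_solution = rank_A == rank_aug
print(rank_A, rank_aug, has_solution)
```

The second row of `A` is twice the first, but the right-hand sides do not scale the same way, so the augmented matrix gains one rank and the system has no solution.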

• $\lambda_1 = 0$: substituting $\lambda_1 = 0$ and $\lambda_3 = \lambda_1+\lambda_2 - \lambda_4$ into $\mathcal{J}(\lambda)$ gives:

$$
\mathcal{J}(\lambda) = 4\lambda_2^2+\frac{1}{2}\lambda_4^2+2\lambda_2\lambda_4 -2\lambda_2
$$

Taking the partial derivatives:

$$
\left\{\begin{matrix}
\frac{\partial \mathcal{J}}{\partial \lambda_2} = 8\lambda_2+2\lambda_4-2=0 \\
\frac{\partial \mathcal{J}}{\partial \lambda_4} = 2\lambda_2+\lambda_4=0
\end{matrix}\right.
\Longrightarrow
\left\{\begin{matrix}
\lambda_2=\frac{1}{2} \\
\lambda_4=-1
\end{matrix}\right.
$$

Here $\lambda_4=-1 \le 0$ violates the constraint. So instead set $\lambda_2 = 0$, which gives $\lambda_4=0$ and $\mathcal{J}(\lambda) = 0$; or set $\lambda_4 = 0$, which gives $\lambda_2=\frac{1}{4}$ and $\mathcal{J}(\lambda) = -\frac{1}{4}$.

• $\lambda_2 = 0$: if $\lambda_1 = 0$, then $\lambda_4=0$ and $\mathcal{J}(\lambda) = 0$; or if $\lambda_4 = 0$, then $\lambda_1=1$ and $\mathcal{J}(\lambda) =-1$.

• $\lambda_3 = 0$: if $\lambda_1 = 0$, then $\lambda_2=\frac{2}{13}$ and $\mathcal{J}(\lambda) = -\frac{2}{13}$; or if $\lambda_2 = 0$, then $\lambda_1=\frac{2}{5}$ and $\mathcal{J}(\lambda) =-\frac{2}{5}$.

• $\lambda_4 = 0$: if $\lambda_1 = 0$, then $\lambda_2=\frac{1}{4}$ and $\mathcal{J}(\lambda) = -\frac{1}{4}$; or if $\lambda_2 = 0$, then $\lambda_1=1$ and $\mathcal{J}(\lambda) =-1$.
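The boundary candidates listed above can be compared by evaluating the full dual objective at each one. The sketch below assumes NumPy; the helper names `dual_objective` and `is_feasible` and the tolerance are illustrative choices. Each candidate is written as $(\lambda_1,\lambda_2,\lambda_3,\lambda_4)$ with $\lambda_3$ recovered from $\lambda_3=\lambda_1+\lambda_2-\lambda_4$.

```python
import numpy as np

X = np.array([[1, 1], [2, 2], [0, 0], [-1, 0]], dtype=float)
y = np.array([1, 1, -1, -1], dtype=float)
Q = np.outer(y, y) * (X @ X.T)

def dual_objective(lam):
    lam = np.asarray(lam, dtype=float)
    return 0.5 * lam @ Q @ lam - lam.sum()

def is_feasible(lam, tol=1e-12):
    """Check lam_i >= 0 and sum_i lam_i * y_i = 0."""
    lam = np.asarray(lam, dtype=float)
    return bool(np.all(lam >= -tol) and abs(lam @ y) <= tol)

# Candidate points (lam1, lam2, lam3, lam4) from the boundary cases.
candidates = [
    (0, 0, 0, 0),
    (0, 1/4, 1/4, 0),
    (1, 0, 1, 0),
    (0, 2/13, 0, 2/13),
    (2/5, 0, 0, 2/5),
]

values = {lam: dual_objective(lam) for lam in candidates if is_feasible(lam)}
best = min(values, key=values.get)
print(best, values[best])
```

This only ranks the candidates that the case analysis produced; it is not a general-purpose quadratic programming solver.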

In summary, the minimum $\mathcal{J}(\lambda)=-1$ is attained at:

$$
\lambda_{1,2,3,4} =\{1,0,1,0\}
$$

With $x^{(j)}$ taken to be any support vector (a sample with $\lambda_j > 0$):

$$
\left\{\begin{matrix}
W=\sum_{i=1}^{N} \lambda_{i} y^{(i)} x^{(i)}\\
b=y^{(j)}-\sum_{i=1}^{N} \lambda_{i} y^{(i)}\left(x^{(i)}\right)^{T} x^{(j)}
\end{matrix}\right.
\Longrightarrow
\left\{\begin{matrix}
W = [1,1]^T\\
b=-1
\end{matrix}\right.
$$

The decision hyperplane is therefore $x_1 + x_2 - 1 = 0$, where $x_1$ and $x_2$ are the two coordinates of a point $x$.
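As a sanity check on the result, the weight vector, the bias, and the functional margins $y^{(i)}(W^Tx^{(i)}+b)$ can be computed directly from $\lambda=\{1,0,1,0\}$. This is a minimal sketch, assuming NumPy; the variable names are illustrative.

```python
import numpy as np

X = np.array([[1, 1], [2, 2], [0, 0], [-1, 0]], dtype=float)
y = np.array([1, 1, -1, -1], dtype=float)
lam = np.array([1, 0, 1, 0], dtype=float)

# W = sum_i lam_i * y_i * x_i
w = (lam * y) @ X

# Use the first support vector (lam_j > 0) to recover b = y_j - W^T x_j.
j = int(np.argmax(lam > 0))
b = y[j] - w @ X[j]

# Hard-margin feasibility: every sample must satisfy y_i * (W^T x_i + b) >= 1,
# with equality exactly at the support vectors.
margins = y * (X @ w + b)
print(w, b, margins)
```

The margins equal 1 for $x^{(1)}$ and $x^{(3)}$, the two samples with nonzero multipliers, and exceed 1 for the other two, which is consistent with the KKT conditions of the hard-margin SVM.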
