A Comparison of Several Algorithms (Draft)

Each algorithm below is summarized by four fields: name, model, strategy (loss function), and solver.
Linear Regression
- Model: $f(x) = W^T x + b$
- Strategy: least squares, $L(W, b) = (f(x) - y)^2$
- Solver: gradient descent, Newton's method (see the sketch below)
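A minimal gradient-descent sketch for the least-squares loss above, assuming NumPy; the function name and the toy data are illustrative, not from the source.

```python
import numpy as np

def linear_regression_gd(X, y, lr=0.01, n_iter=1000):
    """Fit f(x) = W^T x + b by gradient descent on the squared error."""
    n_samples, n_features = X.shape
    W = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iter):
        residual = X @ W + b - y                        # f(x) - y
        W -= lr * (2 / n_samples) * (X.T @ residual)    # dL/dW, averaged
        b -= lr * (2 / n_samples) * residual.sum()      # dL/db, averaged
    return W, b

# Toy usage: recover W = [2.0], b = 1.0 from noiseless data.
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 2.0 * X[:, 0] + 1.0
W, b = linear_regression_gd(X, y, lr=0.01, n_iter=5000)
```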
LASSO Regression
- Model: $f(x) = W^T x + b$
- Strategy: least squares with an $L_1$ penalty, $L(W, b) = (f(x) - y)^2 + \lambda \|W\|_1$
- Solver: coordinate descent (see the sketch below)
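A minimal coordinate-descent sketch for the $L_1$-penalized loss above, assuming NumPy. Because the loss here is an unnormalized sum $\sum_i (f(x_i) - y_i)^2 + \lambda \|W\|_1$, the per-coordinate minimizer soft-thresholds at $\lambda/2$; the helper names are my own.

```python
import numpy as np

def soft_threshold(rho, lam):
    """Closed-form minimizer of a quadratic plus lam * |w_j| in one coordinate."""
    if rho < -lam:
        return rho + lam
    if rho > lam:
        return rho - lam
    return 0.0

def lasso_coordinate_descent(X, y, lam=1.0, n_iter=100):
    """Minimize sum((Xw + b - y)^2) + lam * ||w||_1, one coordinate at a time."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = y.mean()                       # the intercept is not penalized
    for _ in range(n_iter):
        b = (y - X @ w).mean()         # exact minimizer for b given w
        for j in range(n_features):
            # Partial residual with coordinate j's contribution removed.
            r = y - b - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ r
            z = X[:, j] @ X[:, j]
            w[j] = soft_threshold(rho, lam / 2) / z
    return w, b
```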
Ridge Regression
- Model: $f(x) = W^T x + b$
- Strategy: least squares with an $L_2$ penalty, $L(W, b) = (f(x) - y)^2 + \frac{1}{2}\lambda \|W\|^2$
- Solver: gradient descent (see the sketch below)
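A minimal gradient-descent sketch for the ridge loss above, assuming NumPy and averaging the data term over samples for step-size stability; the $\frac{1}{2}\lambda\|W\|^2$ penalty contributes exactly $\lambda w$ to the gradient. The function name is illustrative.

```python
import numpy as np

def ridge_gd(X, y, lam=1.0, lr=0.01, n_iter=1000):
    """Gradient descent on mean((Xw + b - y)^2) + (lam / 2) * ||w||^2."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iter):
        residual = X @ w + b - y
        grad_w = (2 / n_samples) * (X.T @ residual) + lam * w  # penalty adds lam * w
        grad_b = (2 / n_samples) * residual.sum()              # intercept unpenalized
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```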
Logistic Regression
- Model: $f(x) = \dfrac{1}{1 + e^{-(W^T x + b)}}$
- Strategy: cross-entropy loss, $-\ln p(y \mid x) = -\dfrac{1}{m}\sum_{i=1}^{m}\left(y \ln \hat{y} + (1 - y)\ln(1 - \hat{y})\right)$, where $\hat{y} = \dfrac{1}{1 + e^{-(W^T x + b)}}$
- Solver: gradient descent, Newton's method (see the sketch below)
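A minimal gradient-descent sketch for the cross-entropy loss above, assuming NumPy and labels in {0, 1}; it relies on the standard fact that the sigmoid-plus-cross-entropy gradient collapses to $\hat{y} - y$. The names are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_regression_gd(X, y, lr=0.1, n_iter=1000):
    """Minimize the average cross-entropy -1/m * sum(y ln yhat + (1-y) ln(1-yhat))."""
    m, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iter):
        y_hat = sigmoid(X @ w + b)
        # The sigmoid + cross-entropy gradient collapses to (y_hat - y).
        w -= lr * (X.T @ (y_hat - y)) / m
        b -= lr * (y_hat - y).sum() / m
    return w, b
```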
Perceptron
- Model: $f(x) = \mathrm{sign}(W^T x + b)$, where sign is the sign function
- Strategy: make the misclassified points as close as possible to the current separating hyperplane: $L(w, b) = -\sum_{x_i \in M} y_i (w \cdot x_i + b)$, where $M$ is the set of misclassified points
- Solver: stochastic gradient descent, updating the parameters once for each misclassified sample (see the sketch below)
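A minimal sketch of the per-mistake SGD rule above, assuming NumPy and labels in {-1, +1}; updating only on misclassified points is exactly a stochastic gradient step on $-y_i(w \cdot x_i + b)$. The function name is illustrative.

```python
import numpy as np

def perceptron(X, y, lr=1.0, n_epochs=100):
    """Update (w, b) once per misclassified point; labels y must be in {-1, +1}."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_epochs):
        errors = 0
        for xi, yi in zip(X, y):
            if yi * (w @ xi + b) <= 0:   # xi is misclassified
                w += lr * yi * xi        # gradient step on -yi * (w . xi + b)
                b += lr * yi
                errors += 1
        if errors == 0:                  # converged: every point is correct
            break
    return w, b
```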
K-Nearest Neighbors
- Model: $y = \arg\max_{c_j} \sum_{x_i \in N_K(x)} I(y_i = c_j)$, a majority vote over the $K$ nearest neighbors $N_K(x)$
- Strategy / Solver: none; KNN has no explicit loss or training step, and prediction is a direct vote (see the sketch below)
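A minimal prediction sketch for the majority vote above, assuming NumPy and a Euclidean distance; the function name is illustrative.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Majority vote over the k nearest neighbors N_k(x):
    argmax_c sum_{x_i in N_k(x)} I(y_i = c)."""
    distances = np.linalg.norm(X_train - x, axis=1)  # Euclidean distances to x
    nearest = np.argsort(distances)[:k]              # indices of N_k(x)
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]
```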
Naive Bayes
- Model: $P(Y = c_k \mid X = x) = \dfrac{P(Y = c_k)\prod_j P(X^{(j)} = x^{(j)} \mid Y = c_k)}{\sum_k P(Y = c_k)\prod_j P(X^{(j)} = x^{(j)} \mid Y = c_k)}$, under the conditional-independence assumption
- Strategy: expected risk minimization, equivalent to maximizing the posterior: $y = f(x) = \arg\max_{c_k} P(Y = c_k)\prod_j P(X^{(j)} = x^{(j)} \mid Y = c_k)$
- Solver: maximum likelihood estimation: $P(Y = c_k) = \dfrac{\sum_{i=1}^{N} I(y_i = c_k)}{N}$ and $P(X^{(j)} = a_{jl} \mid Y = c_k) = \dfrac{\sum_{i=1}^{N} I(x_i^{(j)} = a_{jl},\, y_i = c_k)}{\sum_{i=1}^{N} I(y_i = c_k)}$ (see the sketch below)
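A minimal sketch of the count-based MLE estimates and the argmax prediction above, assuming discrete features and no smoothing, exactly as the formulas state (in practice Laplace smoothing is usually added to avoid zero probabilities). The function names are illustrative.

```python
from collections import Counter, defaultdict

def naive_bayes_fit(X, y):
    """MLE estimates: class priors P(Y=c_k) and per-feature
    conditionals P(X^(j)=a | Y=c_k), as frequency ratios."""
    N = len(y)
    prior = {c: n / N for c, n in Counter(y).items()}
    cond = defaultdict(Counter)            # (class, feature j) -> value counts
    for xi, yi in zip(X, y):
        for j, v in enumerate(xi):
            cond[(yi, j)][v] += 1
    return prior, cond

def naive_bayes_predict(prior, cond, x):
    """y = argmax_{c_k} P(Y=c_k) * prod_j P(X^(j)=x^(j) | Y=c_k)."""
    best_c, best_p = None, -1.0
    for c, p_c in prior.items():
        p = p_c
        for j, v in enumerate(x):
            counts = cond[(c, j)]
            p *= counts[v] / sum(counts.values())  # MLE conditional estimate
        if p > best_p:
            best_c, best_p = c, p
    return best_c
```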

SVM (linearly separable, hard margin)
- Model: $f(x) = \mathrm{sign}(W^T x + b)$
- Strategy: margin maximization: $\min \dfrac{1}{2}\|w\|^2$ subject to the constraints $y_i(W^T x_i + b) - 1 \geq 0$
- Solver: Lagrangian duality, SMO (see the sketch below)
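A minimal sketch showing that the constrained problem above is directly solvable; it hands the primal to a generic constrained optimizer (SciPy's SLSQP) rather than SMO, just to make the formulation concrete. The toy data are my own.

```python
import numpy as np
from scipy.optimize import minimize

# Toy linearly separable data, labels in {-1, +1}.
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

def objective(params):
    w = params[:-1]                    # params = [w_1, ..., w_d, b]
    return 0.5 * (w @ w)               # (1/2) * ||w||^2

constraints = [
    # y_i * (w . x_i + b) - 1 >= 0 for every training point.
    {"type": "ineq",
     "fun": lambda p, xi=xi, yi=yi: yi * (p[:-1] @ xi + p[-1]) - 1}
    for xi, yi in zip(X, y)
]

res = minimize(objective, x0=np.zeros(X.shape[1] + 1),
               method="SLSQP", constraints=constraints)
w, b = res.x[:-1], res.x[-1]           # maximum-margin hyperplane
```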
SVM (approximately linearly separable, soft margin)
- Model: $f(x) = \mathrm{sign}(W^T x + b)$
- Strategy: margin maximization / hinge loss: $\min \dfrac{1}{2}\|w\|^2 + C\sum_{i=1}^{N}\xi_i$ subject to $y_i(w \cdot x_i + b) \geq 1 - \xi_i$ and $\xi_i \geq 0$
- Solver: Lagrangian duality, SMO (see the sketch below)
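A minimal sketch using the equivalence between the slack formulation above and the unconstrained hinge loss $\frac{1}{2}\|w\|^2 + C\sum_i \max(0, 1 - y_i(w \cdot x_i + b))$, solved by subgradient descent instead of SMO; assuming NumPy and labels in {-1, +1}, with illustrative names.

```python
import numpy as np

def soft_margin_svm_sgd(X, y, C=1.0, lr=0.001, n_epochs=200):
    """Subgradient descent on (1/2)||w||^2 + C * sum_i max(0, 1 - y_i (w . x_i + b)),
    where the hinge term plays the role of the slack xi_i."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_epochs):
        grad_w = w.copy()                 # gradient of (1/2) * ||w||^2
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * (w @ xi + b) < 1:     # margin violated, i.e. xi_i > 0
                grad_w -= C * yi * xi
                grad_b -= C * yi
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```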
SVM (general form, kernelized dual)
- Model: $f(x) = \mathrm{sign}(W^T x + b)$
- Strategy: margin maximization; the dual problem is $\min_{\alpha} \dfrac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\alpha_i \alpha_j y_i y_j K(x_i, x_j) - \sum_{i=1}^{N}\alpha_i$ subject to $\sum_{i=1}^{N}\alpha_i y_i = 0$ and $0 \leq \alpha_i \leq C$
- Solver: Lagrangian duality, SMO (see the sketch below)
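A minimal sketch of the dual problem above, handed to a generic constrained optimizer (SciPy's SLSQP with box bounds and the equality constraint) instead of SMO; the RBF kernel and all names are my own choices for illustration.

```python
import numpy as np
from scipy.optimize import minimize

def rbf_kernel(a, b, gamma=0.5):
    """A common kernel choice; any positive-definite K(x_i, x_j) works here."""
    return np.exp(-gamma * np.sum((a - b) ** 2))

def svm_dual_solve(X, y, C=1.0, gamma=0.5):
    """Minimize (1/2) sum_ij a_i a_j y_i y_j K(x_i, x_j) - sum_i a_i
    subject to sum_i a_i y_i = 0 and 0 <= a_i <= C."""
    N = len(y)
    K = np.array([[rbf_kernel(X[i], X[j], gamma) for j in range(N)]
                  for i in range(N)])
    Q = (y[:, None] * y[None, :]) * K          # Q_ij = y_i y_j K(x_i, x_j)

    def dual(a):
        return 0.5 * (a @ Q @ a) - a.sum()

    res = minimize(dual, x0=np.zeros(N), method="SLSQP",
                   bounds=[(0.0, C)] * N,
                   constraints=[{"type": "eq", "fun": lambda a: a @ y}])
    return res.x                               # the alphas; support vectors have a_i > 0
```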