Machine Learning in Action: Support Vector Machines (Handwritten Notes + Code)
Published: 2019-06-26


 


 

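Note: I studied SVM about half a year ago, but back then I only read the theory and called the OpenCV functions, and by now I have forgotten nearly all of it. I also skipped the SVM lectures in Ng's videos, so I am going to work through it systematically one more time.

 Earlier blog posts on this topic: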

 

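1. SVM Theory: Key Points and Difficulties

  Note: This part covers the key and difficult points of the formula derivations and proofs. It does not walk through the SVM solution step by step; think of it as a supplement to the standard derivation.

     If you have never come across SVM before, read these experts' blog posts first: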

                    

                    

                    

                    

                    

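  1.1 Where the point-to-line distance formula comes from

    We first derive the distance from a point to a plane, then generalize it to the point-to-line and point-to-hyperplane distance formulas.

Figure (handwritten notes): derivation of the point-to-plane distance formula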
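
The same formula, |w·x + b| / ||w||, covers the line, the plane, and the hyperplane case; only the dimension of w changes. A minimal sketch follows; the function name `distance_to_hyperplane` and the sample points are illustrative and not from the original notes.

```python
import numpy as np

def distance_to_hyperplane(x, w, b):
    """Distance from point x to the hyperplane w . x + b = 0."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    return abs(np.dot(w, x) + b) / np.linalg.norm(w)

# 2-D case: point (3, 4) and the line 3x + 4y - 5 = 0
d_line = distance_to_hyperplane([3.0, 4.0], [3.0, 4.0], -5.0)

# 3-D case: point (1, 2, 3) and the plane z = 0
d_plane = distance_to_hyperplane([1.0, 2.0, 3.0], [0.0, 0.0, 1.0], 0.0)

print(d_line, d_plane)
```

In SVM this distance, taken for the training points closest to the separating hyperplane, is the margin that gets maximized.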

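Figure (handwritten notes): SVM formula derivation, part 1

Figure (handwritten notes): SVM formula derivation, part 2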

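  1.2 The Lagrangian dual problem

    Lagrangian duality is used to solve optimization problems with constraints. By the end you will see that most of the work in SVM, from start to finish, goes into solving for the optimum efficiently. As I see it, other learning algorithms mostly deal with optimization over the data itself, and this is the fundamental difference between SVM and the rest.

 Figure (handwritten notes): the primal problem

Figure (handwritten notes): the dual problem, part 1

Figure (handwritten notes): the dual problem, part 2

Figure (handwritten notes): the relationship between the primal and the dual problem

Figure (handwritten notes): the KKT conditions

Figure (handwritten notes): SVM formula derivation, part 3

Figure (handwritten notes): SVM formula derivation, part 4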
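
For readers without the handwritten pages at hand, the standard hard-margin formulation can be summarized as follows. This is the textbook form and may use different symbols from the notes; the training samples are written $(x_i, y_i)$ with $y_i \in \{+1, -1\}$.

```latex
% Primal problem
\min_{w,\,b}\ \frac{1}{2}\lVert w \rVert^2
\quad \text{s.t.} \quad y_i\,(w^\top x_i + b) \ge 1,\quad i = 1,\dots,m

% Lagrangian, with multipliers \alpha_i \ge 0
L(w, b, \alpha) = \frac{1}{2}\lVert w \rVert^2
  - \sum_{i=1}^{m} \alpha_i \bigl[\, y_i\,(w^\top x_i + b) - 1 \,\bigr]

% Setting the derivatives with respect to w and b to zero
w = \sum_{i=1}^{m} \alpha_i y_i x_i, \qquad \sum_{i=1}^{m} \alpha_i y_i = 0

% Dual problem
\max_{\alpha}\ \sum_{i=1}^{m} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} \alpha_i \alpha_j y_i y_j\, x_i^\top x_j
\quad \text{s.t.} \quad \alpha_i \ge 0,\quad \sum_{i=1}^{m} \alpha_i y_i = 0

% KKT complementary slackness
\alpha_i \bigl[\, y_i\,(w^\top x_i + b) - 1 \,\bigr] = 0
```

The complementary slackness condition is why only the points lying exactly on the margin, the support vectors, can have $\alpha_i > 0$.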

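  1.3 Deriving the kernel function

    Purpose: 1. Handle data that is not linearly separable. 2. Make the solution easier to compute.

    Process: when the data cannot be separated in two dimensions, map it into three, four, or more dimensions and separate it there.

    Intuitive explanation: this is where the Zhihu experts get to show off.

    Theory: see the derivation below.

Figure (handwritten notes): introducing the kernel function, part 1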
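
A small numerical check can make the "map to a higher dimension" idea concrete. The sketch below uses the degree-2 polynomial kernel K(x, z) = (x·z)², chosen here only for illustration (it is not necessarily the kernel used in the notes). The explicit map `phi` sends a 2-D point to 3-D, and the kernel returns the same inner product without ever building the 3-D vectors.

```python
import numpy as np

def phi(x):
    """Explicit 2-D -> 3-D feature map whose inner product equals (x . z)^2."""
    x1, x2 = x
    return np.array([x1 * x1, np.sqrt(2.0) * x1 * x2, x2 * x2])

def poly_kernel(x, z):
    """Degree-2 polynomial kernel, evaluated in the original 2-D space."""
    return float(np.dot(x, z)) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

explicit = float(np.dot(phi(x), phi(z)))  # inner product after mapping to 3-D
implicit = poly_kernel(x, z)              # same value, computed in 2-D

print(explicit, implicit)
```

This is the second purpose listed above: the dual problem only needs inner products, so the kernel replaces them directly and the high-dimensional mapping is never computed.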

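  1.4 Introducing slack variables

    Purpose: keep a few outliers from distorting the result.

    Principle: the same idea as the penalty coefficient in other algorithms: outliers are allowed, but they are penalized. For example, an outlier is a point that strays from its own class, so its functional margin y(w·x + b) falls below 1 and can even become negative; the further it falls, the heavier the penalty.

    Details: see the derivation below.

Figure (handwritten notes): introducing slack variables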
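
In the standard soft-margin formulation the slack of sample i is ξ_i = max(0, 1 - y_i(w·x_i + b)), and the objective gains the penalty term C·Σξ_i. A minimal sketch with made-up numbers follows; the function name and the data are illustrative only.

```python
import numpy as np

def slack_and_penalty(X, y, w, b, C):
    """Return the slack of every sample and the total penalty C * sum(slack)."""
    margins = y * (X @ w + b)             # functional margin of each sample
    xi = np.maximum(0.0, 1.0 - margins)   # zero for points on or outside the margin
    return xi, C * xi.sum()

# Three samples of class +1 measured against the hyperplane x1 = 0
X = np.array([[2.0, 0.0],    # well outside the margin
              [0.5, 0.0],    # inside the margin, still correctly classified
              [-1.0, 0.0]])  # on the wrong side: an outlier
y = np.array([1.0, 1.0, 1.0])
w = np.array([1.0, 0.0])
b = 0.0

xi, penalty = slack_and_penalty(X, y, w, b, C=10.0)
print(xi, penalty)
```

A larger C makes outliers more expensive and pushes the solution toward the hard-margin case; a smaller C tolerates more of them.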

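  1.5 The SMO algorithm

Figure (handwritten notes): the SMO algorithm, part 1

Figure (handwritten notes): the SMO algorithm, part 2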
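
As a compact reference, one SMO step on a pair $(\alpha_i, \alpha_j)$ can be written as below. The sign convention for $\eta$ follows the code later in the post ($\eta = 2K_{ij} - K_{ii} - K_{jj}$, which is negative in the usual case); the symbols may differ from the handwritten notes.

```latex
% Prediction and error
f(x) = \sum_{k} \alpha_k y_k K(x_k, x) + b, \qquad E_k = f(x_k) - y_k

% Second derivative along the constraint line
\eta = 2K_{ij} - K_{ii} - K_{jj}

% Unclipped update of \alpha_j
\alpha_j^{\text{unc}} = \alpha_j^{\text{old}} - \frac{y_j\,(E_i - E_j)}{\eta}

% Box constraints
y_i \ne y_j:\quad L = \max(0,\ \alpha_j^{\text{old}} - \alpha_i^{\text{old}}), \quad
                  H = \min(C,\ C + \alpha_j^{\text{old}} - \alpha_i^{\text{old}})
y_i = y_j:\quad   L = \max(0,\ \alpha_i^{\text{old}} + \alpha_j^{\text{old}} - C), \quad
                  H = \min(C,\ \alpha_i^{\text{old}} + \alpha_j^{\text{old}})

% Clipping, then the matching update of \alpha_i
\alpha_j^{\text{new}} = \min\bigl(H,\ \max(L,\ \alpha_j^{\text{unc}})\bigr), \qquad
\alpha_i^{\text{new}} = \alpha_i^{\text{old}} + y_i y_j\,(\alpha_j^{\text{old}} - \alpha_j^{\text{new}})

% Candidate thresholds
b_1 = b - E_i - y_i(\alpha_i^{\text{new}} - \alpha_i^{\text{old}})K_{ii}
            - y_j(\alpha_j^{\text{new}} - \alpha_j^{\text{old}})K_{ij}
b_2 = b - E_j - y_i(\alpha_i^{\text{new}} - \alpha_i^{\text{old}})K_{ij}
            - y_j(\alpha_j^{\text{new}} - \alpha_j^{\text{old}})K_{jj}
```

The new threshold is $b_1$ if $0 < \alpha_i^{\text{new}} < C$, otherwise $b_2$ if $0 < \alpha_j^{\text{new}} < C$, otherwise the average of the two.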

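    Note: The material goes on to cover how to choose the parameters optimally, but I could not fully follow that part and did not much feel like reading on, so let's analyze how SMO actually works from the code below instead!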

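2. Implementation

    For the code I strongly recommend:

  It gives very detailed pseudocode, which makes the program easy to read.

  2.1 SMO implementation

    2.1.1 Simplified SMO

    Simplified version: this addresses the last sentence of the SMO theory section, the question of how to choose the pair optimally. The simplified version picks the second multiplier at random and does not optimize the choice.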

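```python
import numpy as np
import matplotlib.pyplot as plt


# Load a tab-separated data set: two features and a label (+1 or -1) per line.
def loadDataSet(fileName):
    dataMat = []
    labelMat = []
    with open(fileName, 'r') as fr:
        for line in fr:
            lineArr = line.strip().split('\t')
            dataMat.append([float(lineArr[0]), float(lineArr[1])])
            labelMat.append(float(lineArr[2]))
    # np.asmatrix replaces np.mat, which newer NumPy releases no longer provide
    a = np.asmatrix(dataMat)
    b = np.asmatrix(labelMat).transpose()
    DataLabel = np.array(np.hstack((a, b)))  # features and label side by side, for plotting
    return dataMat, labelMat, DataLabel


# Pick a random index j in [0, m) that differs from i.
def selectJrand(i, m):
    j = i
    while (j == i):
        j = int(np.random.uniform(0, m))
    return j


# Clip aj into the interval [L, H].
def clipAlpha(aj, H, L):
    if aj > H:
        aj = H
    elif aj < L:
        aj = L
    return aj


# Simplified SMO: the second multiplier j is chosen at random.
# NOTE: the source listing is truncated between clipAlpha and the KKT check, and again
# after the start of the b update. Those parts (signature, initialization, outer loop,
# end of the b update, return) are reconstructed on the assumption that they follow
# the book's smoSimple.
def smoSimple(dataMatIn, classLabels, C, toler, maxIter):
    dataMatraix = np.asmatrix(dataMatIn)
    labelMatraix = np.asmatrix(classLabels).transpose()
    b = 0
    m, n = dataMatraix.shape
    alphas = np.asmatrix(np.zeros((m, 1)))
    iter = 0
    while (iter < maxIter):
        alphaPairsChanged = 0
        for i in range(m):
            fXi = float(np.multiply(alphas, labelMatraix).transpose()
                        * (dataMatraix * dataMatraix[i, :].T)) + b
            Ei = fXi - float(labelMatraix[i])
            # Optimize alpha_i only if it violates the KKT conditions by more than toler
            if (((labelMatraix[i] * Ei < -toler) and (alphas[i] < C)) or
                    ((labelMatraix[i] * Ei > toler) and (alphas[i] > 0))):
                j = selectJrand(i, m)  # random index in [0, m), different from i
                fXj = float(np.multiply(alphas, labelMatraix).transpose()
                            * (dataMatraix * dataMatraix[j, :].T)) + b
                Ej = fXj - float(labelMatraix[j])
                alphaIold = alphas[i].copy()
                alphaJold = alphas[j].copy()
                # Bounds that keep both alphas inside [0, C]
                if (labelMatraix[i] != labelMatraix[j]):
                    L = max(0, alphas[j] - alphas[i])
                    H = min(C, C + alphas[j] - alphas[i])
                else:
                    L = max(0, alphas[i] + alphas[j] - C)
                    H = min(C, alphas[i] + alphas[j])
                if (L == H):
                    print('L==H')
                    continue
                # eta = 2*K_ij - K_ii - K_jj (linear kernel)
                eta = (2.0 * dataMatraix[i, :] * dataMatraix[j, :].T
                       - dataMatraix[i, :] * dataMatraix[i, :].T
                       - dataMatraix[j, :] * dataMatraix[j, :].T)
                if (eta >= 0):  # was "eta > 0"; eta == 0 would divide by zero below
                    print("eta>=0")
                    continue
                # New value of alpha_j, then clip it to [L, H]
                alphas[j] -= labelMatraix[j] * (Ei - Ej) / eta
                alphas[j] = clipAlpha(alphas[j], H, L)
                if (abs(alphas[j] - alphaJold) < 0.00001):
                    print("aj not moving")
                    continue  # added: without it an unchanged pair is counted as changed
                # Move alpha_i by the same amount in the opposite direction
                alphas[i] += labelMatraix[j] * labelMatraix[i] * (alphaJold - alphas[j])
                # Candidate thresholds b1 and b2
                b1 = (b - Ei
                      - labelMatraix[i] * (alphas[i] - alphaIold) * dataMatraix[i, :] * dataMatraix[i, :].T
                      - labelMatraix[j] * (alphas[j] - alphaJold) * dataMatraix[i, :] * dataMatraix[j, :].T)
                b2 = (b - Ej
                      - labelMatraix[i] * (alphas[i] - alphaIold) * dataMatraix[i, :] * dataMatraix[j, :].T
                      - labelMatraix[j] * (alphas[j] - alphaJold) * dataMatraix[j, :] * dataMatraix[j, :].T)
                # Choose b from b1 and b2
                if (0 < alphas[i]) and (alphas[i] < C):
                    b = b1
                elif (0 < alphas[j]) and (alphas[j] < C):
                    b = b2
                else:
                    b = (b1 + b2) / 2.0
                alphaPairsChanged += 1
        # Count only consecutive passes in which nothing changed
        if (alphaPairsChanged == 0):
            iter += 1
        else:
            iter = 0
    return b, alphas
```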
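
Once the multipliers are known, the weight vector of a linear SVM follows from w = Σ α_i y_i x_i. main.py refers to a helper called `calsWs` whose body is not shown in the post; the function below is a hypothetical stand-in written with plain arrays, not the author's implementation, and the toy data is made up.

```python
import numpy as np

def calc_ws(alphas, X, y):
    """Weight vector w = sum_i alpha_i * y_i * x_i of a linear SVM."""
    alphas = np.asarray(alphas, dtype=float).ravel()
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).ravel()
    return X.T @ (alphas * y)

# Toy problem: one point per class, mirrored across the line x1 = 0
X_demo = np.array([[1.0, 0.0], [-1.0, 0.0]])
y_demo = np.array([1.0, -1.0])
alphas_demo = np.array([0.5, 0.5])

w_demo = calc_ws(alphas_demo, X_demo, y_demo)
print(w_demo)
```

Only samples with α_i > 0 contribute to the sum, so after training it is enough to keep the support vectors.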

    2.1.2 Result plot

main.py file

```python
import svm
import matplotlib.pyplot as plt
import numpy as np

if __name__ == '__main__':
    fig = plt.figure()
    axis = fig.add_subplot(111)
    dataMat, labelMat, DataLabel = svm.loadDataSet("testSet.txt")
    # b, alphas = svm.smoSimple(dataMat, labelMat, 0.6, 0.001, 40)
    # ws = svm.calsWs(alphas, dataMat, labelMat)

    # Split the samples by label. Boolean masks avoid the dummy [0, 0, 0] row that
    # the original stacked onto first (it showed up as a spurious point at the origin).
    pData0 = DataLabel[DataLabel[:, -1] == 1]
    pData1 = DataLabel[DataLabel[:, -1] == -1]
    axis.scatter(pData0[:, 0], pData0[:, 1], marker='v')
    axis.scatter(pData1[:, 0], pData1[:, 1], marker='s')

    # Separating line with hard-coded coefficients (presumably from an earlier training run)
    xdata = np.linspace(2.0, 8.0, 20)
    ydata = xdata * (0.81445 / 0.27279) - (3.837 / 0.27279)
    axis.plot(xdata, ydata, 'r')

    plt.show()  # blocks until the window is closed; fig.show() returns at once in a script

    # print("alphas = ", alphas[alphas > 0])
```
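
The hard-coded line in main.py is just the decision boundary w0·x + w1·y + b = 0 solved for y. The sketch below makes that explicit. The split of the constants into `w` and `b` is an assumption read off the expression in main.py (the opposite sign for all three values describes the same line), and `boundary_y` is an illustrative name.

```python
import numpy as np

def boundary_y(x, w, b):
    """Solve w[0]*x + w[1]*y + b = 0 for y (requires w[1] != 0)."""
    return -(w[0] * x + b) / w[1]

# Assumed reading of the constants used in main.py
w = np.array([0.81445, -0.27279])
b = -3.837

xs = np.linspace(2.0, 8.0, 20)
ys = boundary_y(xs, w, b)

# The expression exactly as written in main.py, for comparison
ys_hardcoded = xs * (0.81445 / 0.27279) - (3.837 / 0.27279)
print(np.allclose(ys, ys_hardcoded))
```

Computing the line from the trained `w` and `b` instead of pasting numbers keeps the plot in sync when the random choices in the simplified SMO lead to a slightly different solution.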

svm.py file: identical to the simplified SMO listing in section 2.1.1 above, so it is not repeated here.

    

    2.1.3 Nonlinear classification (kernel functions)

   Code:

```python
import numpy as np
import matplotlib.pyplot as plt


# Load a tab-separated data set: two features and a label (+1 or -1) per line.
def loadDataSet(fileName):
    dataMat = []
    labelMat = []
    with open(fileName, 'r') as fr:
        for line in fr:
            lineArr = line.strip().split('\t')
            dataMat.append([float(lineArr[0]), float(lineArr[1])])
            labelMat.append(float(lineArr[2]))
    # np.asmatrix replaces np.mat, which newer NumPy releases no longer provide
    a = np.asmatrix(dataMat)
    b = np.asmatrix(labelMat).T
    DataLabel = np.array(np.hstack((a, b)))
    return dataMat, labelMat, DataLabel


# Pick a random index j in [0, m) that differs from i.
def selectJrand(i, m):
    j = i
    while (j == i):
        j = int(np.random.uniform(0, m))
    return j


# Clip aj into the interval [L, H].
def clipAlpha(aj, H, L):
    if aj > H:
        aj = H
    elif aj < L:
        aj = L
    return aj


# smoSimple is unchanged from section 2.1.1 and is omitted here.


# State shared by the full SMO.
# NOTE: the source listing is truncated here. __init__, clacEk and the first lines of
# selecJ are reconstructed on the assumption that they follow the book's optStruct.
class optStruct:
    def __init__(self, dataMatIn, classLabels, C, toler, kTup):
        self.X = dataMatIn
        self.labelMat = classLabels
        self.C = C
        self.tol = toler
        self.m = dataMatIn.shape[0]
        self.alphas = np.asmatrix(np.zeros((self.m, 1)))
        self.b = 0
        self.eCache = np.asmatrix(np.zeros((self.m, 2)))  # column 0: valid flag, column 1: E
        self.K = np.asmatrix(np.zeros((self.m, self.m)))  # precomputed kernel matrix
        for i in range(self.m):
            self.K[:, i] = kernelTrans(self.X, self.X[i, :], kTup)

    # Prediction error E_k = f(x_k) - y_k
    def clacEk(self, k):
        fXk = float(np.multiply(self.alphas, self.labelMat).T * self.K[:, k] + self.b)
        return fXk - float(self.labelMat[k])

    # Choose the second multiplier j
    def selecJ(self, i, Ei):
        maxK = -1
        maxDeltaE = 0
        Ej = 0
        self.eCache[i] = [1, Ei]  # mark E_i as valid
        validEcacheList = np.nonzero(self.eCache[:, 0].A)[0]
        if (len(validEcacheList) > 1):  # heuristic choice
            for k in validEcacheList:
                if (k == i): continue  # k must differ from i
                Ek = self.clacEk(k)  # error of candidate k
                deltaE = abs(Ei - Ek)  # step size is proportional to |Ei - Ek|
                if (deltaE > maxDeltaE):
                    maxK = k
                    maxDeltaE = deltaE
                    Ej = Ek
            return maxK, Ej
        else:
            j = selectJrand(i, self.m)  # random choice
            Ej = self.clacEk(j)
            return j, Ej

    def updataEk(self, k):
        Ek = self.clacEk(k)
        self.eCache[k] = [1, Ek]  # was [k, Ek]; column 0 is a valid flag, so index 0 was never marked

    # Try to optimize the pair that starts with alpha_i; return 1 if a pair changed, else 0
    def innerL(self, i):
        Ei = self.clacEk(i)
        if ((self.labelMat[i] * Ei < -self.tol and self.alphas[i] < self.C) or
                (self.labelMat[i] * Ei > self.tol and self.alphas[i] > 0)):
            j, Ej = self.selecJ(i, Ei)  # choose j
            alphaIold = self.alphas[i].copy()
            alphaJold = self.alphas[j].copy()
            # Compute L and H
            if (self.labelMat[i] != self.labelMat[j]):
                L = max(0, self.alphas[j] - self.alphas[i])
                H = min(self.C, self.C + self.alphas[j] - self.alphas[i])
            else:
                L = max(0, self.alphas[j] + self.alphas[i] - self.C)
                H = min(self.C, self.alphas[i] + self.alphas[j])
            if (L == H): return 0
            # eta = 2.0 * self.X[i,:]*self.X[j,:].T - self.X[i,:]*self.X[i,:].T - \
            #       self.X[j,:]*self.X[j,:].T
            eta = 2.0 * self.K[i, j] - self.K[i, i] - self.K[j, j]  # the kernel is applied here
            if (eta >= 0): return 0
            self.alphas[j] -= self.labelMat[j] * (Ei - Ej) / eta
            self.alphas[j] = clipAlpha(self.alphas[j], H, L)
            self.updataEk(j)  # refresh the cached error for the new alpha_j
            if (abs(self.alphas[j] - alphaJold) < 0.00001):
                print('J not move')
                return 0  # added: an unchanged pair must not be counted as changed
            self.alphas[i] += self.labelMat[j] * self.labelMat[i] * (alphaJold - self.alphas[j])
            self.updataEk(i)  # refresh the cached error for the new alpha_i
            b1 = self.b - Ei - self.labelMat[i] * (self.alphas[i] - alphaIold) * \
                 self.K[i, i] - self.labelMat[j] * (self.alphas[j] - alphaJold) * self.K[i, j]
            b2 = self.b - Ej - self.labelMat[i] * (self.alphas[i] - alphaIold) * \
                 self.K[i, j] - self.labelMat[j] * (self.alphas[j] - alphaJold) * self.K[j, j]
            # NOTE: the rest of this method and the head of smoP are truncated in the
            # source and reconstructed after the book's version.
            if (self.alphas[i] > 0 and self.alphas[i] < self.C):
                self.b = b1
            elif (self.alphas[j] > 0 and self.alphas[j] < self.C):
                self.b = b2
            else:
                self.b = (b1 + b2) / 2.0
            return 1
        else:
            return 0


# Full SMO outer loop. The visible part of the source accesses the solver state through
# the name "self"; a local variable named opt is used here instead.
def smoP(dataMatIn, classLabels, C, toler, maxIter, kTup=('lin', 0)):
    opt = optStruct(np.asmatrix(dataMatIn), np.asmatrix(classLabels).transpose(), C, toler, kTup)
    iter = 0
    entireSet = True
    alphaPairsChanged = 0
    while (iter < maxIter) and ((alphaPairsChanged > 0) or (entireSet)):
        alphaPairsChanged = 0
        if entireSet:  # go over all
            for i in range(opt.m):
                alphaPairsChanged += opt.innerL(i)
                print('fullSet iter:%d, i=%d, pair change:%d' % (iter, i, alphaPairsChanged))
            iter += 1
        else:  # go over non-bound (railed) alphas
            nonBoundIs = np.nonzero((opt.alphas.A > 0) * (opt.alphas.A < C))[0]
            for i in nonBoundIs:
                alphaPairsChanged += opt.innerL(i)
                print('nonBound iter:%d, i=%d, pair change:%d' % (iter, i, alphaPairsChanged))
            iter += 1
        if entireSet:
            entireSet = False  # toggle entire set loop
        elif (alphaPairsChanged == 0):
            entireSet = True
        print("iteration number: %d" % iter)
    return opt.b, opt.alphas


# Build one column of the kernel matrix: K(x_j, A) for every row x_j of X.
# Note the exact form of the kernel formula used here; see the blog post for an explanation.
def kernelTrans(X, A, kTup):
    m, n = X.shape
    K = np.asmatrix(np.zeros([m, 1]))
    if (kTup[0] == 'lin'):
        K = X * A.T
    elif (kTup[0] == 'rbf'):
        for j in range(m):
            deltaRow = X[j, :] - A
            K[j] = deltaRow * deltaRow.T
        K = np.exp(K / (-1 * kTup[1] ** 2))
    else:
        raise NameError('Houston We have a problem')
    return K


def testRbf(k1=1.3):
    dataArr, labelArr, dataLbel = loadDataSet('testSetRBF.txt')
    b, alphas = smoP(dataArr, labelArr, 200, 0.0001, 10000, ('rbf', k1))
    dataMat = np.asmatrix(dataArr)
    labelMat = np.asmatrix(labelArr).T
    svInd = np.nonzero(alphas.A > 0)[0]  # indices of the support vectors
    sVs = dataMat[svInd]  # the support vectors themselves
    labelSV = labelMat[svInd]
    print("There are %d Support Vector" % (sVs.shape[0]))
    m, n = dataMat.shape
    errorCount = 0
    for i in range(m):
        kernelEval = kernelTrans(sVs, dataMat[i, :], ('rbf', k1))
        predict = kernelEval.T * np.multiply(labelSV, alphas[svInd]) + b
        if (np.sign(predict) != np.sign(labelArr[i])): errorCount += 1
    print("The training error is: %f" % (float(errorCount / m)))
    dataArr, labelArr, datalabel = loadDataSet('testSetRBF2.txt')
    errorCount = 0
    dataMat = np.asmatrix(dataArr)
    labelMat = np.asmatrix(labelArr).T
    m, n = dataMat.shape
    for i in range(m):
        kernelEval = kernelTrans(sVs, dataMat[i, :], ('rbf', k1))
        predict = kernelEval.T * np.multiply(labelSV, alphas[svInd]) + b
        if (np.sign(predict) != np.sign(labelArr[i])): errorCount += 1
    print("The test error is: %f" % (float(errorCount / m)))


if __name__ == '__main__':
    testRbf(1.3)
```
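
The RBF branch of `kernelTrans` computes exp(-||x_j - A||² / σ²) one row at a time. Note that it divides by σ², not by the 2σ² that many references use, so the value passed as `kTup[1]` is not directly comparable with a σ from those references. The same column can be computed without a Python loop; the sketch below uses plain arrays and illustrative function names, and checks the vectorized version against a row-by-row one.

```python
import numpy as np

def rbf_column(X, a, sigma):
    """exp(-||x_j - a||^2 / sigma^2) for every row x_j of X, without a Python loop."""
    X = np.asarray(X, dtype=float)
    a = np.asarray(a, dtype=float).reshape(1, -1)
    sq_dist = np.sum((X - a) ** 2, axis=1)  # squared distance of each row to a
    return np.exp(-sq_dist / sigma ** 2)

def rbf_column_loop(X, a, sigma):
    """Row-by-row reference version that mirrors the loop in kernelTrans."""
    X = np.asarray(X, dtype=float)
    a = np.asarray(a, dtype=float).ravel()
    out = np.zeros(X.shape[0])
    for j in range(X.shape[0]):
        delta = X[j] - a
        out[j] = np.exp(-np.dot(delta, delta) / sigma ** 2)
    return out

X_demo = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]])
a_demo = np.array([0.0, 0.0])

k_fast = rbf_column(X_demo, a_demo, sigma=1.3)
k_slow = rbf_column_loop(X_demo, a_demo, sigma=1.3)
print(k_fast)
```

A point compared with itself always gives 1, and the value decays toward 0 with distance; a smaller σ makes the decay faster and usually yields more support vectors.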

 

    2.1.4 Handwritten digit recognition test

This listing repeats the code from section 2.1.3 unchanged (loadDataSet, selectJrand, clipAlpha, the SMO routines, kernelTrans and testRbf) and appends the two functions below.

```python
from os import listdir


# Load every digit file in dirName. The digit is the part of the file name before
# the underscore. This is a two-class task: digit 9 gets label -1, every other
# digit gets label +1.
# img2vector is not defined in this listing; it is presumably the helper from the
# book's kNN chapter that flattens a 32x32 text image into a 1x1024 vector.
def loadImage(dirName):
    hwLabel = []
    trainingFileList = listdir(dirName)
    m = len(trainingFileList)
    trainingMat = np.zeros((m, 1024))
    for i in range(m):
        fileNameStr = trainingFileList[i]
        fileStr = fileNameStr.split('.')[0]
        classNumStr = int(fileStr.split('_')[0])
        if (classNumStr == 9):
            hwLabel.append(-1)
        else:
            hwLabel.append(1)
        trainingMat[i, :] = img2vector('%s/%s' % (dirName, fileNameStr))
    return trainingMat, np.array(hwLabel)


def testDigits(kTup=('rbf', 10)):
    dataArr, labelArr = loadImage('trainingDigits')
    b, alphas = smoP(dataArr, labelArr, 200, 0.0001, 10000, kTup)
    dataMat = np.asmatrix(dataArr)
    labelMat = np.asmatrix(labelArr).T
    svInd = np.nonzero(alphas.A > 0)[0]
    sVs = dataMat[svInd]
    labelSV = labelMat[svInd]
    print("there are %d Support Vectors" % (int(sVs.shape[0])))
    m, n = dataMat.shape
    errorCount = 0
    for i in range(m):
        kernelEval = kernelTrans(sVs, dataMat[i, :], kTup)
        predict = kernelEval.T * np.multiply(labelSV, alphas[svInd]) + b
        if (np.sign(predict) != np.sign(labelArr[i])): errorCount += 1
    print("The training error is: %f" % (float((errorCount) / m)))
    dataArr, labelArr = loadImage('testDigits')
    errorCount = 0
    dataMat = np.asmatrix(dataArr)
    labelMat = np.asmatrix(labelArr).T
    m, n = dataMat.shape
    for i in range(m):
        kernelEval = kernelTrans(sVs, dataMat[i, :], kTup)
        predict = kernelEval.T * np.multiply(labelSV, alphas[svInd]) + b
        if (np.sign(predict) != np.sign(labelArr[i])): errorCount += 1
    print("The test error is: %f" % (float((errorCount) / m)))
```
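
`loadImage` calls `img2vector`, which the post does not show. A minimal sketch of such a helper follows, assuming the digit files hold 32 lines of 32 characters, each '0' or '1', as in the book's handwritten-digit data set. The file layout, the helper body, and the demo file are assumptions, not the author's code.

```python
import os
import tempfile

import numpy as np

def img2vector(filename):
    """Flatten a 32x32 text image of '0'/'1' characters into a 1x1024 row vector."""
    vec = np.zeros((1, 1024))
    with open(filename) as fr:
        for i in range(32):
            line = fr.readline()
            for j in range(32):
                vec[0, 32 * i + j] = int(line[j])
    return vec

# Demo on a synthetic file: the first row is all ones, the rest all zeros
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "9_0.txt")
    rows = ["1" * 32 if r == 0 else "0" * 32 for r in range(32)]
    with open(path, "w") as f:
        f.write("\n".join(rows) + "\n")
    demo_vec = img2vector(path)
    # Same labelling rule as loadImage: digit 9 -> -1, anything else -> +1
    digit = int(os.path.basename(path).split('.')[0].split('_')[0])
    demo_label = -1 if digit == 9 else 1

print(demo_vec.shape, demo_vec.sum(), demo_label)
```

Because the labels separate "9" from "not 9", the classifier built by `testDigits` is binary; recognizing all ten digits would need one such classifier per digit or another multi-class scheme.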

 

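3. References

   References:

    Note: The topics covered by the reference links are listed below, in the same order as the links. All of them are the hard work of experts, and the order implies no ranking.

    Point-to-plane distance, Lagrangian quadratic optimization, derivation of the quadratic curve formula, derivation of the SMO formulas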

    

    

    

    

 

Reposted from: https://www.cnblogs.com/wjy-lulu/p/7977952.html
