For example, the hinge loss f(w) = max(1 − y⟨w, x⟩, 0) and the ReLU activation used in neural networks are not differentiable. The second difference is that SD is not a descent method; that is, f(x_{t+1}) > f(x_t) can occur. No matter what step size is used, the objective may stay flat or even increase from one iteration to the next. In fact, a common misconception is that in SD the subgradient tells us which direction to move in order to decrease the function value.
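A minimal Python sketch of this non-descent behavior, using f(x) = |x| (convex but non-differentiable at 0) with a fixed step size; the starting point and step size are illustrative choices, not from the original text:

```python
def f(x):
    # f(x) = |x|: convex, but not differentiable at x = 0
    return abs(x)

def subgrad(x):
    # A valid subgradient of |x|: sign(x) for x != 0; any value in [-1, 1] works at 0
    if x > 0:
        return 1.0
    if x < 0:
        return -1.0
    return 0.0

x, step = 0.3, 1.0  # constant step size, chosen for illustration
for t in range(4):
    x_new = x - step * subgrad(x)
    print(f"iter {t}: f = {f(x):.2f} -> {f(x_new):.2f}")  # iter 0: 0.30 -> 0.70, an increase
    x = x_new
```

The very first step overshoots past the minimizer (0.3 → −0.7) and the objective rises from 0.30 to 0.70; the iterates then oscillate. This is why analyses of subgradient methods typically track the best iterate seen so far, min_t f(x_t), rather than assuming monotone decrease.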