Keywords: machine learning; convex Q-learning; semi-definite programming; data-driven batch optimization; dynamic process control. Computer Aided Chemical Engineering. doi:10.1016/B978-0-323-85159-6.50056-7. Sophie Sitter, Damien van de Berg, Max Mowbray, Antonio del Rio Chanona.
Convex Q-learning is a recent approach to reinforcement learning, motivated by the possibility of a firmer theory for convergence, and the possibility of making use of greater a priori knowledge regarding policy or value function structure. This paper explores algorithm design in the continuous time...
Active-Learning a Convex Body in Low Dimensions. Source: Semantic Scholar. Authors: S. Har-Peled, M. Jones, S. Rahul. Abstract: Consider a set \(P\subseteq \mathbb{R}^d\) of \(n\) points, and a convex body \(C\) provided via a separation oracle. The task at hand is to decide for...
To address flight control design for morphing quadrotors, this paper combines model-free control techniques (e.g., deep reinforcement learning, DRL) with the convex combination (CC) technique, and proposes a convex-combined-DRL (cc-DRL) flight control algorithm ...
Nonconvex-nonconcave minimax optimization has received intense attention over the last decade due to its broad applications in machine learning. Most existing algorithms rely on one-sided information, such as the convexity (resp. concavity) of the primal (resp. dual) functions, or other specific ...
F. Facchinei, G. Scutari, and S. Sagratella, “Parallel selective algorithms for nonconvex big...
J. Sun, Q. Qu, and J. Wright. When are nonconvex problems not scary? arXiv preprint arXiv:1510.06096, 2015.
The developed convergence guarantee covers a variety of nonconvex functions such as piecewise linear functions, the \(\ell_q\) quasi-norm, the Schatten-\(q\) quasi-norm (\(0<q<1\)), the minimax concave penalty (MCP), and the smoothly clipped absolute deviation (SCAD) penalty. It also allows non...
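As a rough illustration of three of the penalties named in the snippet above, the following sketch evaluates them for scalar and vector inputs. The snippet itself gives no formulas; the forms below are the commonly used textbook definitions, and the parameter names `lam`, `gamma`, and `a` (with the default `a = 3.7`) are assumptions made for this example, not values taken from the paper.

```python
def lq_penalty(x, q):
    """Sum of |x_i|**q, i.e. the l_q quasi-norm raised to the power q (0 < q < 1)."""
    if not 0 < q < 1:
        raise ValueError("q must lie strictly between 0 and 1")
    return sum(abs(xi) ** q for xi in x)


def mcp(t, lam, gamma):
    """Minimax concave penalty for a scalar t (assumed form, gamma > 0)."""
    t = abs(t)
    if t <= gamma * lam:
        return lam * t - t * t / (2.0 * gamma)
    # Beyond gamma * lam the penalty is constant.
    return gamma * lam * lam / 2.0


def scad(t, lam, a=3.7):
    """Smoothly clipped absolute deviation penalty for a scalar t (assumed form, a > 2)."""
    t = abs(t)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2.0 * a * lam * t - t * t - lam * lam) / (2.0 * (a - 1.0))
    # Beyond a * lam the penalty is constant.
    return lam * lam * (a + 1.0) / 2.0
```

Both MCP and SCAD behave like \(\lambda|t|\) near zero and flatten out for large \(|t|\), which is what makes them nonconvex.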
We study the problem of two-machine no-wait flowshop scheduling with learning effect and convex resource-dependent processing times. Under the condition of the due-date assignment with common flow allowance (i.e. slack (SLK) due-date assignment), we provide a bi-criteria analysis where the ...
Given a matrix \(B\) composed of its set of column vectors \(\{B\} = \{b_1, \ldots, b_Q\}\), the convex cone determined by \(\{B\}\) is \(C\{B\} = \{\sum_{q=1}^{Q} \alpha_q b_q \mid \alpha_q \ge 0\}\) (3). Definition 2. A non-zero vector \(z\) is a lateral edge ... be expressed as a trivial combination of \(\{B\}\)...
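To make the convex cone in Eq. (3) of the snippet above concrete, here is a minimal sketch that forms a conic combination \(\sum_q \alpha_q b_q\) and rejects negative coefficients. The function name `conic_combination` and the example vectors are hypothetical and chosen only for illustration; the sketch does not test whether an arbitrary vector belongs to the cone, which would require solving a non-negative least-squares or linear feasibility problem.

```python
def conic_combination(columns, alphas):
    """Return sum_q alphas[q] * columns[q], requiring every coefficient >= 0."""
    if len(columns) != len(alphas):
        raise ValueError("need exactly one coefficient per column vector")
    if any(a < 0 for a in alphas):
        raise ValueError("coefficients of a conic combination must be non-negative")
    dim = len(columns[0])
    point = [0.0] * dim
    for b, a in zip(columns, alphas):
        for i in range(dim):
            point[i] += a * b[i]
    return point


# Hypothetical generators b1 = (1, 0) and b2 = (1, 1) in the plane.
B = [[1.0, 0.0], [1.0, 1.0]]
z = conic_combination(B, [2.0, 3.0])  # z = 2*b1 + 3*b2
```

Every vector produced this way lies in \(C\{B\}\) by construction, since the coefficients are checked to be non-negative.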