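2.2.5 Reinforcement Learning: The DQN algorithm is used here; see other blog posts for details. 2.3 Cross-lingual policy transfer: The purpose of cross-lingual policy transfer here is to handle the active learning problem for languages with little data. The authors use transfer learning: they learn a good policy on a data-rich dataset and then apply that policy to ...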
In this paper we design and evaluate a Deep-Reinforcement Learning agent that optimizes routing. Our agent adapts automatically to current traffic conditions and proposes tailored configurations that attempt to minimize the network delay. Experiments show very promising performance. Moreover, this approac...
This paper aims to examine the potential of using the emerging deep reinforcement learning techniques in missile guidance applications. To this end, a Markovian decision process that enables the application of reinforcement learning theory to solve the guidance problem is formulated. A heuristic way is...
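Predictive autoscaling is an important mechanism that supports autonomous adjustment of computing resources in response to fluctuating workload demands in the cloud. In recent research, reinforcement learning (RL) has emerged as a promising approach for learning resource management policies that guide scaling operations in dynamic and uncertain cloud environments. However, RL methods face the following challenges in predictive autoscaling: inaccurate decisions, low sample efficiency, ...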
By formulating the selection process as a policy-based decision-making approach, the search for dominant failure modes becomes a systematic decision process for each failure stage (Guan et al., 2023). Deep reinforcement learning (DRL) is well-suited for this application due to its excellent ...
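2. Deep RL in behavior decision-making and motion planning: The typical pipeline takes a stream of sensor data as input, supplemented by global route planning information, and after processing produces the control outputs (steering angle, acceleration). This processing is usually layered, because driving actions are naturally hierarchical: first a high-level decision over discrete states (behavior decision: lane change, car following, left turn), then an action in a continuous state space (motion planning, which provides ... that satisfies the behavior...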
The main contribution of this letter is to develop a deep reinforcement learning-based control-aware scheduling (DEEPCAS) algorithm to tackle these issues. We use the following (optimal) design strategy: first, we synthesize an optimal controller for each subsystem; next, we design a learning ...
A deep-learning approach for reconstructing 3D turbulent flows from 2D observation data. Mustafa Z. Yousif, Linqi Yu, Sergio Hoyas, Ricardo Vinuesa & HeeChang Lim. Scientific Reports volume 13, Article number: 2529 (2023)
1. A Survey of Deep Learning Applications to Autonomous Vehicle Control 3. Reinforcement Learning: Reinforcement learning (RL) is a third machine learning paradigm, alongside supervised learning and unsupervised learning. RL uses an agent to carry out an action policy. The agent's goal is to maximize, over its life...
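Double Q-Network: The idea is not new. Modeled on Double Q-learning, one Q-network selects the action and another Q-network evaluates it; the two take turns, which addresses the upward-bias problem, and the results are good. As the saying goes, three cobblers together match a Zhuge Liang: just as a double-check at work can cut the probability of a mistake to roughly its square. The 2015 paper co-authored by Silver, Deep Reinforcement Learning with Double Q-learning ...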
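The split between selecting and evaluating an action can be illustrated with a minimal Python sketch. It is not taken from the paper or from any library: the function names `double_q_target` and `single_q_target`, the plain-list Q-values, and the discount `gamma` are all assumptions made for illustration, and the networks are replaced by hard-coded numbers.

```python
def single_q_target(reward, done, q_next, gamma=0.99):
    """Standard Q-learning style target: one set of Q-values both
    selects and evaluates the next action (hypothetical helper)."""
    if done:
        return reward
    return reward + gamma * max(q_next)


def double_q_target(reward, done, q_select_next, q_eval_next, gamma=0.99):
    """Double Q-learning style target (hypothetical helper).

    q_select_next: next-state Q-values from the network that picks the action.
    q_eval_next:   next-state Q-values from the network that scores that action.
    """
    if done:
        return reward
    # Selection: argmax over the first network's estimates.
    best_action = max(range(len(q_select_next)), key=lambda a: q_select_next[a])
    # Evaluation: value of that same action under the second network.
    return reward + gamma * q_eval_next[best_action]


if __name__ == "__main__":
    # Made-up Q-values for three actions in the next state.
    q_select = [1.0, 3.0, 2.0]   # selecting network prefers action 1
    q_eval = [0.5, 1.5, 4.0]     # evaluating network overrates action 2

    print(double_q_target(1.0, False, q_select, q_eval, gamma=0.9))
    print(single_q_target(1.0, False, q_eval, gamma=0.9))
```

In this toy case the single-network target takes the maximum of one noisy estimate, while the double target only credits the value that the second network assigns to the first network's choice, so an overestimate has to appear in both networks to inflate the target.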