After that, a new method based on the correlation coefficient and variance is put forward to calculate the attribute weights. Building on the conventional Mahalanobis distance, a q-rung orthopair fuzzy Mahalanobis distance is developed. Then, regret-based dominated relation and re...
Regret-based dynamics have been introduced and studied in the context of discrete-time repeated play. Here we carry out the corresponding analysis in continuous time. We observe that, in contrast to (smooth) fictitious play or to evolutionary models, the appropriate state space for this analysis ...
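Paper link: Evolving Curricula with Regret-Based Environment Design. Introduction: The problem studied in this work falls under open-ended reinforcement learning. The setting is somewhat similar to multi-task reinforcement learning, in that a general-purpose agent is trained to solve a range of tasks, but the task set is not a fixed handful of tasks; it is an open space of environments, in which different...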
Regret-based pruning in extensive-form games. Source: 掌桥科研. Authors: N Brown, T Sandholm. Abstract: Counterfactual Regret Minimization (CFR) is a leading algorithm for finding a Nash equilibrium in large zero-sum imperfect-information games. CFR is an iterative algorithm that repeatedly ...
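As background for the CFR abstract above, here is a minimal sketch of regret matching, the per-decision-point update rule that CFR is commonly built on. It is not the pruning technique of the paper, and the function names `regret_matching_strategy` and `update_regrets` are hypothetical, chosen only for illustration.

```python
def regret_matching_strategy(cumulative_regrets):
    """Return a strategy proportional to the positive cumulative regrets.

    If no action has positive regret, fall back to the uniform strategy.
    """
    positive = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    n = len(cumulative_regrets)
    return [1.0 / n] * n


def update_regrets(cumulative_regrets, action_utilities, strategy):
    """Add this iteration's regret for each action.

    The regret of an action is its utility minus the expected utility
    of the strategy that was actually played.
    """
    expected = sum(p * u for p, u in zip(strategy, action_utilities))
    return [r + (u - expected)
            for r, u in zip(cumulative_regrets, action_utilities)]


if __name__ == "__main__":
    regrets = [0.0, 0.0]
    strategy = regret_matching_strategy(regrets)      # uniform at the start
    regrets = update_regrets(regrets, [1.0, 0.0], strategy)
    print(regret_matching_strategy(regrets))
```

Actions with non-positive cumulative regret receive zero probability under this rule.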
Previous work has studied SSGs with uncertain payoffs modeled by interval uncertainty and provided maximin-based robust solutions. In contrast, in this work we propose the use of the less conservative minimax regret decision criterion for such payoff-uncertain SSGs and present the first algorithms ...
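To make the minimax regret criterion mentioned above concrete, the sketch below applies it to a small finite payoff table. This is an illustrative assumption: the abstract concerns interval uncertainty in SSG payoffs, which needs the paper's own algorithms, whereas the example uses a hypothetical table of three actions and two scenarios. The function name `minimax_regret` is likewise hypothetical.

```python
def minimax_regret(payoffs):
    """Pick the action whose worst-case regret is smallest.

    payoffs[a][s] is the payoff of action a under scenario s.
    Returns (index of the chosen action, list of max regrets per action).
    """
    n_scenarios = len(payoffs[0])
    # Best achievable payoff in each scenario, over all actions.
    best = [max(row[s] for row in payoffs) for s in range(n_scenarios)]
    # Worst-case regret of each action across scenarios.
    max_regrets = [max(best[s] - row[s] for s in range(n_scenarios))
                   for row in payoffs]
    choice = min(range(len(payoffs)), key=lambda a: max_regrets[a])
    return choice, max_regrets


if __name__ == "__main__":
    # Hypothetical payoff table: 3 actions (rows) x 2 scenarios (columns).
    table = [[10, 0],
             [6, 5],
             [2, 8]]
    print(minimax_regret(table))
```

Unlike maximin, which looks only at each action's worst payoff, this criterion compares each action with the best action available in the same scenario.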
Many multi-agent coordination problems can be represented as DCOPs. Motivated by task allocation in disaster response, we extend standard DCOP models to consider uncertain task rewards where the outcome of completing a task depends on its current state,
Source conference: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Paris, France, June 28 - July 1, 2009. Date: 2009/01/01. Recommended research topics: Growing Digital Library Regret-based Online Ranking NoRegret KLRank
In this work, we test both the effectiveness of regret-based elicitation, and user comprehension and acceptance of minimax regret in user studies. We report on a study involving 40 users interacting with the UTPref Recommendation System, which helps students navigate and find rental accommodation. ...
• Both utility- and regret-based choice models are estimated. • Differences in model fit between regret- and utility-based models are small. • Parameter interpretation and policy implications differ more substantially. Keywords: politicians' preferences; road pricing; stated choice; random regret ...
Autonomous Driving using Safe Reinforcement Learning by Incorporating a Regret-based Human Lane-Changing Decision Model. Source: arXiv.org. Authors: D Chen, L Jiang, Y Wang, Z Li. Abstract: It is expected that many human drivers will still prefer to drive themselves even if the self-...