NeurIPS accepted 1,900 papers this year. How am I supposed to read them?
- 2020-10-13 12:16:00
- Liu Daniu (reposted)
Recently, a moderately popular question appeared on Zhihu:
Under that question, many leading researchers have already offered advice on how to read papers.
Indeed, NeurIPS 2020 accepted nearly two thousand papers this year. What does that number actually mean?
Reportedly, one of the AI community's stars, Dr. Zhang Xiangyu of Megvii, read 1,800 papers in three years.
That is already a staggering pace. Even at that rate, getting through the NeurIPS 2020 papers alone would take three years; what hope do the rest of us have?
On how to read papers, AI 科技评论 (AI Technology Review) previously ran the article "Andrew Ng teaches you to read papers: slow, sustained learning is the right way", which is worth revisiting.
In Andrew Ng's view, reading papers is not about speed; high-quality, sustained reading is the right path.
Today, taking NeurIPS 2020's nearly two thousand accepted papers as the example, AI 科技评论 offers two shortcuts for reading them.
1、Read the big names' papers:
See the article "NeurIPS 2020 paper acceptance rankings! Google first with 169 papers, Stanford second, Tsinghua first in China".
In that article, AI 科技评论 listed the papers of leading AI researchers such as the three deep learning pioneers, Zhou Zhihua, and Fei-Fei Li; papers produced by their teams are, on average, of dependably high quality.
2、Read by topic:
This is the obvious choice, and what most readers already do. Today's article is exactly that: a simple categorized count of the NeurIPS 2020 papers for reference.
Notes:
1、Topics were chosen from those commonly encountered in practice; the list is not exhaustive.
2、Counts overlap: for example, the paper《Semi-Supervised Neural Architecture Search》is counted under both semi-supervised learning and NAS.
3、The counting was done by "human" intelligence; please blame any omissions or errors on AI.
4、Follow-up updates to these statistics will be posted in the AI 科技评论 column on Zhihu.
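The overlap rule in note 2 can be sketched as a tiny tagging script: a paper whose title matches several topic keywords is counted once under each topic. The keyword map and titles below are illustrative placeholders, not the article's actual data or method.

```python
# Illustrative sketch of overlapping topic counts: one paper may
# increment several topic counters (see note 2 above).
from collections import Counter

# Hypothetical keyword map; the article's real topics are broader.
TOPIC_KEYWORDS = {
    "Semi-supervised learning": ["semi-supervised"],
    "NAS": ["architecture search"],
    "Attention": ["attention"],
}

def count_by_topic(titles):
    counts = Counter()
    for title in titles:
        t = title.lower()
        for topic, keywords in TOPIC_KEYWORDS.items():
            if any(k in t for k in keywords):
                counts[topic] += 1  # the same paper can count toward multiple topics
    return counts

titles = [
    "Semi-Supervised Neural Architecture Search",
    "Bayesian Attention Modules",
]
print(count_by_topic(titles))
```

With these two titles, the first paper is counted under both semi-supervised learning and NAS, matching the example in note 2.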
Prelude
1、Shortest paper title:
《Choice Bandits》
2、Most coauthors (31): Google Brain + 29 from OpenAI + a Johns Hopkins all-star crew
《Language Models are Few-Shot Learners》
3、Imitating "Attention Is All You Need"?
4、Five papers related to COVID-19:
《When and How to Lift the Lockdown? Global COVID-19 Scenario Analysis and Policy Assessment using Compartmental Gaussian Processes》
《Causal analysis of Covid-19 spread in Germany》
《CogMol: Target-Specific and Selective Drug Design for COVID-19》
《On the robustness of estimates of the effectiveness of non-pharmaceutical interventions against COVID-19 transmission》
《Interpretable Sequence Learning for COVID-19 Forecasting》
Also: COVID-19 Open Data, an open time-series dataset on the pandemic: https://github.com/GoogleCloudPlatform/covid-19-open-data
5、Five "Rethinking" papers:
《Rethinking the Value of Labels for Improving Class-Imbalanced Learning》
《Rethinking Pre-training and Self-training》
This Google Brain paper has been up on arXiv since June 11.
Paper link: https://arxiv.org/pdf/2006.06882
《Rethinking pooling in graph neural networks》
《Rethinking Learnable Tree Filter for Generic Feature Transform》
《Rethinking Importance Weighting for Deep Learning under Distribution Shift》
6、Papers with "Beyond" in the title:
This year's ACL 2020 best-paper title also contains the word "Beyond"; perhaps one of the papers below will share in that luck and pick up an award at NeurIPS 2020. (No liability assumed if not.)
The first of them opens with "Beyond accuracy", exactly matching the opening of the ACL 2020 best paper's title.
7、Papers with amusing titles:
《Teaching a GAN What Not to Learn》
Siddarth Asokan (Indian Institute of Science) · Chandra Seelamantula (IISc Bangalore)
《Self-supervised learning through the eyes of a child》
Emin Orhan (New York University) · Vaibhav Gupta (New York University) · Brenden Lake (New York University)
《How hard is to distinguish graphs with graph neural networks?》
Andreas Loukas (EPFL)
《Tree! I am no Tree! I am a low dimensional Hyperbolic Embedding》
Rishi S Sonthalia (University of Michigan) · Anna Gilbert (University of Michigan)
8、ReLU: 7 papers
2
NLP-related
1、BERT: 7 papers
2、Attention: 24 papers. (Attention is of course not limited to NLP, but it is grouped under NLP here; likewise below.)
1、Auto Learning Attention
Benteng Ma (Northwestern Polytechnical University) · Jing Zhang (The University of Sydney) · Yong Xia (Northwestern Polytechnical University, Research & Development Institute of Northwestern Polytechnical University in Shenzhen) · Dacheng Tao (University of Sydney)
2、Bayesian Attention Modules
Xinjie Fan (UT Austin) · Shujian Zhang (UT Austin) · Bo Chen (Xidian University) · Mingyuan Zhou (University of Texas at Austin)
3、Improving Natural Language Processing Tasks with Human Gaze-Guided Neural Attention
Ekta Sood (University of Stuttgart, Simtech ) · Simon Tannert (Institute for Natural Language Processing, University of Stuttgart) · Philipp Mueller (VIS, University of Stuttgart) · Andreas Bulling (University of Stuttgart)
4、Prophet Attention: Predicting Attention with Future Attention for Improved Image Captioning
Fenglin Liu (Peking University) · Xuancheng Ren (Peking University) · Xian Wu (Tencent Medical AI Lab) · Shen Ge (Tencent Medical AI Lab) · Wei Fan (Tencent) · Yuexian Zou (Peking University) · Xu Sun (Peking University)
5、Kalman Filtering Attention for User Behavior Modeling in CTR Prediction
Hu Liu (JD.com) · Jing LU (Business Growth BU JD.com) · Xiwei Zhao (JD.com) · Sulong Xu (JD.com) · Hao Peng (JD.com) · Yutong Liu (JD.com) · Zehua Zhang (JD.com) · Jian Li (JD.com) · Junsheng Jin (JD.com) · Yongjun Bao (JD.com) · Weipeng Yan (JD.com)
6、RANet: Region Attention Network for Semantic Segmentation
Dingguo Shen (Shenzhen University) · Yuanfeng Ji (City University of Hong Kong) · Ping Li (The Hong Kong Polytechnic University) · Yi Wang (Shenzhen University) · Di Lin (Tianjin University)
7、SE(3)-Transformers: 3D Roto-Translation Equivariant Attention Networks
Fabian Fuchs (University of Oxford) · Daniel Worrall (University of Amsterdam) · Volker Fischer (Robert Bosch GmbH, Bosch Center for Artificial Intelligence) · Max Welling (University of Amsterdam / Qualcomm AI Research)
8、Complementary Attention Self-Distillation for Weakly-Supervised Object Detection
Zeyi Huang (carnegie mellon university) · Yang Zou (Carnegie Mellon University) · B. V. K. Vijaya Kumar (CMU, USA) · Dong Huang (Carnegie Mellon University)
9、Modern Hopfield Networks and Attention for Immune Repertoire Classification
Michael Widrich (LIT AI Lab / University Linz) · Bernhard Schäfl (JKU Linz) · Milena Pavlović (Department of Informatics, University of Oslo) · Hubert Ramsauer (LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria) · Lukas Gruber (Johannes Kepler University) · Markus Holzleitner (LIT AI Lab / University Linz) · Johannes Brandstetter (LIT AI Lab / University Linz) · Geir Kjetil Sandve (Department of Informatics, University of Oslo) · Victor Greiff (Department of Immunology, University of Oslo) · Sepp Hochreiter (LIT AI Lab / University Linz / IARAI) · Günter Klambauer (LIT AI Lab / University Linz)
10、Untangling tradeoffs between recurrence and self-attention in artificial neural networks
Giancarlo Kerg (MILA) · Bhargav Kanuparthi (Montreal Institute for Learning Algorithms) · Anirudh Goyal ALIAS PARTH GOYAL (Université de Montréal) · Kyle Goyette (University of Montreal) · Yoshua Bengio (Mila / U. Montreal) · Guillaume Lajoie (Mila, Université de Montréal)
11、RATT: Recurrent Attention to Transient Tasks for Continual Image Captioning
Riccardo Del Chiaro (University of Florence) · Bartłomiej Twardowski (Computer Vision Center, UAB) · Andrew D Bagdanov (University of Florence) · Joost van de Weijer (Computer Vision Center Barcelona)
12、Multi-Task Temporal Shift Attention Networks for On-Device Contactless Vitals Measurement
Xin Liu (University of Washington ) · Josh Fromm (OctoML) · Shwetak Patel (University of Washington) · Daniel McDuff (Microsoft Research)
13、SAC: Accelerating and Structuring Self-Attention via Sparse Adaptive Connection
Xiaoya Li (Shannon.AI) · Yuxian Meng (Shannon.AI) · Mingxin Zhou (Shannon.AI) · Qinghong Han (Shannon.AI) · Fei Wu (Zhejiang University) · Jiwei Li (Shannon.AI)
14、Fast Transformers with Clustered Attention
Apoorv Vyas (Idiap Research Institute) · Angelos Katharopoulos (Idiap) · François Fleuret (University of Geneva)
15、Sparse and Continuous Attention Mechanisms
André Martins () · Marcos Treviso (Instituto de Telecomunicacoes) · António Farinhas (Instituto Superior Técnico) · Vlad Niculae (Instituto de Telecomunicações) · Mario Figueiredo (University of Lisbon) · Pedro Aguiar (Instituto Superior Técnico)
16、Learning to Execute Programs with Instruction Pointer Attention Graph Neural Networks
David Bieber (Google Brain) · Charles Sutton (Google) · Hugo Larochelle (Google Brain) · Daniel Tarlow (Google Brain)
17、Neural encoding with visual attention
Meenakshi Khosla (Cornell University) · Gia Ngo (Cornell University) · Keith Jamison (Cornell University) · Amy Kuceyeski (Cornell University) · Mert Sabuncu (Cornell)
18、Deep Reinforcement Learning with Stacked Hierarchical Attention for Text-based Games
Yunqiu Xu (University of Technology Sydney) · Meng Fang (Tencent) · Ling Chen (University of Technology Sydney, Australia) · Yali Du (University College London) · Joey Tianyi Zhou (IHPC, A*STAR) · Chengqi Zhang (University of Technology Sydney)
19、Object-Centric Learning with Slot Attention
Francesco Locatello (ETH Zürich - MPI Tübingen) · Dirk Weissenborn (Google) · Thomas Unterthiner (Google Research, Brain Team) · Aravindh Mahendran (Google) · Georg Heigold (Google) · Jakob Uszkoreit (Google, Inc.) · Alexey Dosovitskiy (Google Research) · Thomas Kipf (Google Research)
20、SMYRF - Efficient attention using asymmetric clustering
Giannis Daras (National Technical University of Athens) · Nikita Kitaev (University of California, Berkeley) · Augustus Odena (Google Brain) · Alexandros Dimakis (University of Texas, Austin)
21、Focus of Attention Improves Information Transfer in Visual Features
Matteo Tiezzi (University of Siena) · Stefano Melacci (University of Siena) · Alessandro Betti (University of Siena) · Marco Maggini (University of Siena) · Marco Gori (University of Siena)
22、AttendLight: Universal Attention-Based Reinforcement Learning Model for Traffic Signal Control
Afshin Oroojlooy (SAS Institute, Inc) · Mohammadreza Nazari (SAS Institute Inc.) · Davood Hajinezhad (SAS Institute Inc.) · Jorge Silva (SAS)
23、Multi-agent Trajectory Prediction with Fuzzy Query Attention
Nitin Kamra (University of Southern California) · Hao Zhu (Peking University) · Dweep Kumarbhai Trivedi (University of Southern California) · Ming Zhang (Peking University) · Yan Liu (University of Southern California)
24、Limits to Depth Efficiencies of Self-Attention
Yoav Levine (HUJI) · Noam Wies (Hebrew University of Jerusalem) · Or Sharir (Hebrew University of Jerusalem) · Hofit Bata (Hebrew University of Jerusalem) · Amnon Shashua (Hebrew University of Jerusalem)
3、Transformer: 14 papers
1、Fast Transformers with Clustered Attention
Apoorv Vyas (Idiap Research Institute) · Angelos Katharopoulos (Idiap)· François Fleuret (University of Geneva)
2、Deep Transformers with Latent Depth
Xian Li (Facebook) · Asa Cooper Stickland (University of Edinburgh) · Yuqing Tang (Facebook AI) · Xiang Kong (Carnegie Mellon University)
3、Cross Transformers: spatially-aware few-shot transfer
Carl Doersch (DeepMind) · Ankush Gupta (DeepMind) · Andrew Zisserman (DeepMind & University of Oxford)
4、SE(3)-Transformers: 3D Roto-Translation Equivariant Attention Networks
Fabian Fuchs (University of Oxford) · Daniel Worrall (University of Amsterdam) · Volker Fischer (Robert Bosch GmbH, Bosch Center for Artificial Intelligence) · Max Welling (University of Amsterdam / Qualcomm AI Research)
5、Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing
Zihang Dai (Carnegie Mellon University) · Guokun Lai (Carnegie Mellon University) · Yiming Yang (CMU) · Quoc V Le (Google)
6、Adversarial Sparse Transformer for Time Series Forecasting
Sifan Wu (Tsinghua University) · Xi Xiao (Tsinghua University) · Qianggang Ding (Tsinghua University) · Peilin Zhao (Tencent AI Lab) · Ying Wei (Tencent AI Lab) · Junzhou Huang (University of Texas at Arlington / Tencent AI Lab)
7、Accelerating Training of Transformer-Based Language Models with Progressive Layer Dropping
Minjia Zhang (Microsoft) · Yuxiong He (Microsoft)
8、COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning
Mohammadreza Zolfaghari (University of Freiburg) · Simon Ging (Uni Freiburg) · Hamed Pirsiavash (University of Maryland, Baltimore County) · Thomas Brox (University of Freiburg)
9、Cascaded Text Generation with Markov Transformers
Yuntian Deng (Harvard University) · Alexander Rush (Cornell University)
10、GROVER: Self-Supervised Message Passing Transformer on Large-scale Molecular Graphs
Yu Rong (Tencent AI Lab) · Yatao Bian (Tencent AI Lab) · Tingyang Xu (Tencent AI Lab) · Weiyang Xie (Tencent AI Lab) · Ying WEI (Tencent AI Lab) · Wenbing Huang (Tsinghua University) · Junzhou Huang (University of Texas at Arlington / Tencent AI Lab)
11、Learning to Communicate in Multi-Agent Systems via Transformer-Guided Program Synthesis
Jeevana Priya Inala (MIT) · Yichen Yang (MIT) · James Paulos (University of Pennsylvania) · Yewen Pu (MIT) · Osbert Bastani (University of Pennysylvania) · Vijay Kumar (University of Pennsylvania) · Martin Rinard (MIT) · Armando Solar-Lezama (MIT)
12、Measuring Systematic Generalization in Neural Proof Generation with Transformers
Nicolas Gontier (Mila, Polytechnique Montréal) · Koustuv Sinha (McGill University / Mila / FAIR) · Siva Reddy (McGill University) · Chris Pal (Montreal Institute for Learning Algorithms, École Polytechnique, Université de Montréal)
13、O(n) Connections are Expressive Enough: Universal Approximability of Sparse Transformers
Chulhee Yun (MIT) · Yin-Wen Chang (Google Inc.) · Srinadh Bhojanapalli (Google AI) · Ankit Singh Rawat (Google Research) · Sashank Reddi (Google) · Sanjiv Kumar (Google Research)
14、MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
Wenhui Wang (MSRA) · Furu Wei (Microsoft Research Asia) · Li Dong (Microsoft Research) · Hangbo Bao (Harbin Institute of Technology) · Nan Yang (Microsoft Research Asia) · Ming Zhou (Microsoft Research)
4、Pre-training: 5 papers
1、 Pre-training via Paraphrasing
Mike Lewis (Facebook AI Research) · Marjan Ghazvininejad (Facebook AI Research) · Gargi Ghosh (Facebook) · Armen Aghajanyan (Facebook) · Sida Wang (Facebook AI Research) · Luke Zettlemoyer (University of Washington and Allen Institute for Artificial Intelligence)
2、 Pre-Training Graph Neural Networks: A Contrastive Learning Framework with Augmentations
Yuning You (Texas A&M University) · Tianlong Chen (Unversity of Texas at Austin) · Yongduo Sui (University of Science and Technology of China) · Ting Chen (Google) · Zhangyang Wang (University of Texas at Austin) · Yang Shen (Texas A&M University)
3、Rethinking Pre-training and Self-training
Barret Zoph (Google Brain) · Golnaz Ghiasi (Google) · Tsung-Yi Lin (Google Brain) · Yin Cui (Google) · Hanxiao Liu (Google Brain) · Ekin Dogus Cubuk (Google Brain) · Quoc V Le (Google)
4、MPNet: Masked and Permuted Pre-training for Language Understanding
Kaitao Song (Nanjing University of Science and technology) · Xu Tan (Microsoft Research) · Tao Qin (Microsoft Research) · Jianfeng Lu (Nanjing University of Science and Technology) · Tie-Yan Liu (Microsoft Research Asia)
5、Adversarial Contrastive Learning: Harvesting More Robustness from Unsupervised Pre-Training
Ziyu Jiang (Texas A&M University) · Tianlong Chen (Unversity of Texas at Austin) · Ting Chen (Google) · Zhangyang Wang (University of Texas at Austin)
3
CV-related
Object detection: 12 papers
1、A Ranking-based, Balanced Loss Function for Both Classification and Localisation in Object Detection
Kemal Oksuz (Middle East Technical University) · Baris Can Cam (Roketsan) · Emre Akbas (Middle East Technical University) · Sinan Kalkan (Middle East Technical University)
2、UWSOD: Toward Fully-Supervised-Level Performance Weakly Supervised Object Detection
Yunhang Shen (Xiamen University) · Rongrong Ji (Xiamen University, China) · Zhiwei Chen (Xiamen University) · Yongjian Wu (Tencent Technology (Shanghai) Co.,Ltd) · Feiyue Huang (Tencent)
3、Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection
Xiang Li (NJUST) · Wenhai Wang (Nanjing University) · Lijun Wu (Sun Yat-sen University) · Shuo Chen (Nanjing University of Science and Technology) · Xiaolin Hu (Tsinghua University) · Jun Li (Nanjing University of Science and Technology) · Jinhui Tang (Nanjing University of Science and Technology) · Jian Yang (Nanjing University of Science and Technology)
4、Every View Counts: Cross-View Consistency in 3D Object Detection with Hybrid-Cylindrical-Spherical Voxelization
Qi Chen (Johns Hopkins University) · Lin Sun (Samsung, Stanford, HKUST) · Ernest Cheung (Samsung) · Alan Yuille (Johns Hopkins University)
5、Complementary Attention Self-Distillation for Weakly-Supervised Object Detection
Zeyi Huang (carnegie mellon university) · Yang Zou (Carnegie Mellon University) · B. V. K. Vijaya Kumar (CMU, USA) · Dong Huang (Carnegie Mellon University)
6、Few-Cost Salient Object Detection with Adversarial-Paced Learning
Dingwen Zhang (Xidian University) · HaiBin Tian (Xidian University) · Jungong Han (University of Warwick)
7、Bridging Visual Representations for Object Detection
Cheng Chi (University of Chinese Academy of Sciences) · Fangyun Wei (Microsoft Research Asia) · Han Hu (Microsoft Research Asia)
8、Fine-Grained Dynamic Head for Object Detection
Lin Song (Xian Jiaotong University) · Yanwei Li (The Chinese University of Hong Kong) · Zhengkai Jiang (Institute of Automation,Chinese Academy of Sciences) · Zeming Li (Megvii(Face++) Inc) · Hongbin Sun (Xian Jiaotong University) · Jian Sun (Megvii, Face++) · Nanning Zheng (Xian Jiaotong University)
9、Detection as Regression: Certified Object Detection with Median Smoothing
Ping-yeh Chiang (University of Maryland, College Park) · Michael Curry (University of Maryland) · Ahmed Abdelkader (University of Maryland, College Park) · Aounon Kumar (University of Maryland, College Park) · John Dickerson (University of Maryland) · Tom Goldstein (University of Maryland)
10、RepPoints v2: Verification Meets Regression for Object Detection
Yihong Chen (Peking University) · Zheng Zhang (MSRA) · Yue Cao (Microsoft Research) · Liwei Wang (Peking University) · Stephen Lin (Microsoft Research) · Han Hu (Microsoft Research Asia)
11、CoADNet: Collaborative Aggregation-and-Distribution Networks for Co-Salient Object Detection
Qijian Zhang (City University of Hong Kong) · Runmin Cong (Beijing Jiaotong University) · Junhui Hou (City University of Hong Kong, Hong Kong) · Chongyi Li ( Nanyang Technological University) · Yao Zhao (Beijing Jiaotong University)
12、Restoring Negative Information in Few-Shot Object Detection
Yukuan Yang (Tsinghua University) · Fangyun Wei (Microsoft Research Asia) · Miaojing Shi (Kings College London) · Guoqi Li (Tsinghua University)
Video object segmentation: 3 papers
1、Video Object Segmentation with Adaptive Feature Bank and Uncertain-Region Refinement
Yongqing Liang (Louisiana State University) · Xin Li (Louisiana State University) · Navid Jafari (Louisiana State University) · Jim Chen (Northeastern University)
2、Make One-Shot Video Object Segmentation Efficient Again
Tim Meinhardt (TUM) · Laura Leal-Taixé (TUM)
3、Delving into the Cyclic Mechanism in Semi-supervised Video Object Segmentation
Yuxi Li (Shanghai Jiao Tong University) · Jinlong Peng (Tencent Youtu Lab) · Ning Xu (Adobe Research) · John See (Multimedia University) · Weiyao Lin (Shanghai Jiao Tong university)
Instance segmentation: 2 papers
1、Deep Variational Instance Segmentation
Jialin Yuan (Oregon State University) · Chao Chen (Stony Brook University) · Fuxin Li (Oregon State University)
2、DFIS: Dynamic and Fast Instance Segmentation
Xinlong Wang (University of Adelaide) · Rufeng Zhang (Tongji University) · Tao Kong (Bytedance) · Lei Li (ByteDance AI Lab) · Chunhua Shen (University of Adelaide)
Person re-identification:
4
All Kinds of Learning
1、Reinforcement learning: 94 papers
1、 Reinforcement Learning for Control with Multiple Frequencies
Jongmin Lee (KAIST) · ByungJun Lee (KAIST) · Kee-Eung Kim (KAIST)
2、 Reinforcement Learning with General Value Function Approximation: Provably Efficient Approach via Bounded Eluder Dimension
Ruosong Wang (Carnegie Mellon University) · Russ Salakhutdinov (Carnegie Mellon University) · Lin Yang (UCLA)
3、 Reinforcement Learning in Factored MDPs: Oracle-Efficient Algorithms and Tighter Regret Bounds for the Non-Episodic Setting
Ziping Xu (University of Michigan) · Ambuj Tewari (University of Michigan)
4、 Reinforcement Learning with Feedback Graphs
Christoph Dann (Carnegie Mellon University) · Yishay Mansour (Google) · Mehryar Mohri (Courant Inst. of Math. Sciences & Google Research) · Ayush Sekhari (Cornell University) · Karthik Sridharan (Cornell University)
5、 Reinforcement Learning with Augmented Data
Misha Laskin (UC Berkeley) · Kimin Lee (UC Berkeley) · Adam Stooke (UC Berkeley) · Lerrel Pinto (New York University) · Pieter Abbeel (UC Berkeley & covariant.ai) · Aravind Srinivas (UC Berkeley)
6、 Reinforcement Learning with Combinatorial Actions: An Application to Vehicle Routing
Arthur Delarue (MIT) · Ross Anderson (Google Research) · Christian Tjandraatmadja (Google)
7、Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model
Gen Li (Tsinghua University) · Yuting Wei (Carnegie Mellon University) · Yuejie Chi (CMU) · Yuantao Gu (Tsinghua University) · Yuxin Chen (Princeton University)
8、Almost Optimal Model-Free Reinforcement Learning via Reference-Advantage Decomposition
Zihan Zhang (Tsinghua University) · Yuan Zhou (UIUC) · Xiangyang Ji (Tsinghua University)
9、Effective Diversity in Population Based Reinforcement Learning
Jack Parker-Holder (University of Oxford) · Aldo Pacchiano (UC Berkeley) · Krzysztof M Choromanski (Google Brain Robotics) · Stephen J Roberts (University of Oxford)
10、A Boolean Task Algebra for Reinforcement Learning
Geraud Nangue Tasse (University of the Witwatersrand) · Steven James (University of the Witwatersrand) · Benjamin Rosman (University of the Witwatersrand / CSIR)
11、Knowledge Transfer in Multi-Task Deep Reinforcement Learning for Continuous Control
Zhiyuan Xu (Syracuse University) · Kun Wu (Syracuse University) · Zhengping Che (DiDi AI Labs, Didi Chuxing) · Jian Tang (DiDi AI Labs, DiDi Chuxing) · Jieping Ye (Didi Chuxing)
12、Multi-task Batch Reinforcement Learning with Metric Learning
Jiachen Li (University of California, San Diego) · Quan Vuong (University of California San Diego) · Shuang Liu (University of California, San Diego) · Minghua Liu (UCSD) · Kamil Ciosek (Microsoft) · Henrik Christensen (UC San Diego) · Hao Su (UCSD)
13、On the Stability and Convergence of Robust Adversarial Reinforcement Learning: A Case Study on Linear Quadratic Systems
Kaiqing Zhang (University of Illinois at Urbana-Champaign (UIUC)) · Bin Hu (University of Illinois at Urbana-Champaign) · Tamer Basar (University of Illinois at Urbana-Champaign)
14、Towards Playing Full MOBA Games with Deep Reinforcement Learning
Deheng Ye (Tencent) · Guibin Chen (Tencent) · Wen Zhang (Tencent) · chen sheng (qq) · Bo Yuan (Tencent) · Bo Liu (Tencent) · Jia Chen (Tencent) · Hongsheng Yu (Tencent) · Zhao Liu (Tencent) · Fuhao Qiu (Tencent AI Lab) · Liang Wang (Tencent) · Tengfei Shi (Tencent) · Yinyuting Yin (Tencent) · Bei Shi (Tencent AI Lab) · Lanxiao Huang (Tencent) · qiang fu (Tencent AI Lab) · Wei Yang (Tencent AI Lab) · Wei Liu (Tencent AI Lab)
15、Promoting Coordination through Policy Regularization in Multi-Agent Deep Reinforcement Learning
Julien Roy (Mila) · Paul Barde (Quebec AI institute - Ubisoft La Forge) · Félix G Harvey (Polytechnique Montréal) · Derek Nowrouzezahrai (McGill University) · Chris Pal (MILA, Polytechnique Montréal, Element AI)
16、Confounding-Robust Policy Evaluation in Infinite-Horizon Reinforcement Learning
Nathan Kallus (Cornell University) · Angela Zhou (Cornell University)
17、Learning Retrospective Knowledge with Reverse Reinforcement Learning
Shangtong Zhang (University of Oxford) · Vivek Veeriah (University of Michigan) · Shimon Whiteson (University of Oxford)
18、Combining Deep Reinforcement Learning and Search for Imperfect-Information Games
Noam Brown (Facebook AI Research) · Anton Bakhtin (Facebook AI Research) · Adam Lerer (Facebook AI Research) · Qucheng Gong (Facebook AI Research)
19、POMO: Policy Optimization with Multiple Optima for Reinforcement Learning
Yeong-Dae Kwon (Samsung SDS) · Jinho Choo (Samsung SDS) · Byoungjip Kim (Samsung SDS) · Iljoo Yoon (Samsung SDS) · Youngjune Gwon (Samsung SDS) · Seungjai Min (Samsung SDS)
20、Self-Paced Deep Reinforcement Learning
Pascal Klink (TU Darmstadt) · Carlo D'Eramo (TU Darmstadt) · Jan Peters (TU Darmstadt & MPI Intelligent Systems) · Joni Pajarinen (TU Darmstadt)
21、Efficient Model-Based Reinforcement Learning through Optimistic Policy Search and Planning
Sebastian Curi (ETHz) · Felix Berkenkamp (Bosch Center for Artificial Intelligence) · Andreas Krause (ETH Zurich)
22、Weakly-Supervised Reinforcement Learning for Controllable Behavior
Lisa Lee (CMU / Google Brain / Stanford) · Ben Eysenbach (Carnegie Mellon University) · Russ Salakhutdinov (Carnegie Mellon University) · Shixiang (Shane) Gu (Google Brain) · Chelsea Finn (Stanford)
23、MOReL: Model-Based Offline Reinforcement Learning
Rahul Kidambi (Cornell University) · Aravind Rajeswaran (University of Washington) · Praneeth Netrapalli (Microsoft Research) · Thorsten Joachims (Cornell)
24、Security Analysis of Safe and Seldonian Reinforcement Learning Algorithms
Pinar Ozisik (UMass Amherst) · Philip Thomas (University of Massachusetts Amherst)
25、Model-based Adversarial Meta-Reinforcement Learning
Zichuan Lin (Tsinghua University) · Garrett W. Thomas (Stanford University) · Guangwen Yang (Tsinghua University) · Tengyu Ma (Stanford University)
26、Safe Reinforcement Learning via Curriculum Induction
Matteo Turchetta (ETH Zurich) · Andrey Kolobov (Microsoft Research) · Shital Shah (Microsoft) · Andreas Krause (ETH Zurich) · Alekh Agarwal (Microsoft Research)
27、Conservative Q-Learning for Offline Reinforcement Learning
Aviral Kumar (UC Berkeley) · Aurick Zhou (University of California, Berkeley) · George Tucker (Google Brain) · Sergey Levine (UC Berkeley)
28、Munchausen Reinforcement Learning
Nino Vieillard (Google Brain) · Olivier Pietquin (Google Research Brain Team) · Matthieu Geist (Google Brain)
29、Non-Crossing Quantile Regression for Distributional Reinforcement Learning
Fan Zhou (Shanghai University of Finance and Economics) · Jianing Wang (Shanghai University of Finance and Economics) · Xingdong Feng (Shanghai University of Finance and Economics)
30、Online Decision Based Visual Tracking via Reinforcement Learning
ke Song (Shandong university) · Wei Zhang (Shandong University) · Ran Song (School of Control Science and Engineering, Shandong University) · Yibin Li (Shandong University)
31、Discovering Reinforcement Learning Algorithms
Junhyuk Oh (DeepMind) · Matteo Hessel (Google DeepMind) · Wojciech Czarnecki (DeepMind) · Zhongwen Xu (DeepMind) · Hado van Hasselt (DeepMind) · Satinder Singh (DeepMind) · David Silver (DeepMind)
32、Shared Experience Actor-Critic for Multi-Agent Reinforcement Learning
Filippos Christianos (University of Edinburgh) · Lukas Schäfer (University of Edinburgh) · Stefano Albrecht (University of Edinburgh)
33、The LoCA Regret: A Consistent Metric to Evaluate Model-Based Behavior in Reinforcement Learning
Harm Van Seijen (Microsoft Research) · Hadi Nekoei (MILA) · Evan Racah (Mila, Université de Montréal) · Sarath Chandar (Mila / École Polytechnique de Montréal)
34、Leverage the Average: an Analysis of KL Regularization in Reinforcement Learning
Nino Vieillard (Google Brain) · Tadashi Kozuno (Okinawa Institute of Science and Technology) · Bruno Scherrer (INRIA) · Olivier Pietquin (Google Research Brain Team) · Remi Munos (DeepMind) · Matthieu Geist (Google Brain)
35、Task-agnostic Exploration in Reinforcement Learning
Xuezhou Zhang (UW-Madison) · Yuzhe Ma (University of Wisconsin-Madison) · Adish Singla (MPI-SWS)
36、Generating Adjacency-Constrained Subgoals in Hierarchical Reinforcement Learning
Tianren Zhang (Tsinghua University) · Shangqi Guo (Tsinghua University) · Tian Tan (Stanford University) · Xiaolin Hu (Tsinghua University) · Feng Chen (Tsinghua University)
37、Storage Efficient and Dynamic Flexible Runtime Channel Pruning via Deep Reinforcement Learning
Jianda Chen (Nanyang Technological University) · Shangyu Chen (Nanyang Technological University, Singapore) · Sinno Jialin Pan (Nanyang Technological University, Singapore)
38、Multi-Task Reinforcement Learning with Soft Modularization
Ruihan Yang (UC San Diego) · Huazhe Xu (UC Berkeley) · YI WU (UC Berkeley) · Xiaolong Wang (UCSD/UC Berkeley)
39、Weighted QMIX: Improving Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
Tabish Rashid (University of Oxford) · Gregory Farquhar (University of Oxford) · Bei Peng (University of Oxford) · Shimon Whiteson (University of Oxford)
40、MDP Homomorphic Networks: Group Symmetries in Reinforcement Learning
Elise van der Pol (University of Amsterdam) · Daniel Worrall (University of Amsterdam) · Herke van Hoof (University of Amsterdam) · Frans Oliehoek (TU Delft) · Max Welling (University of Amsterdam / Qualcomm AI Research)
41、On Efficiency in Hierarchical Reinforcement Learning
Zheng Wen (DeepMind) · Doina Precup (DeepMind) · Morteza Ibrahimi (DeepMind) · Andre Barreto (DeepMind) · Benjamin Van Roy (Stanford University) · Satinder Singh (DeepMind)
42、Variational Policy Gradient Method for Reinforcement Learning with General Utilities
Junyu Zhang (Princeton University) · Alec Koppel (U.S. Army Research Laboratory) · Amrit Singh Bedi (US Army Research Laboratory) · Csaba Szepesvari (DeepMind / University of Alberta) · Mengdi Wang (Princeton University)
43、Model-based Reinforcement Learning for Semi-Markov Decision Processes with Neural ODEs
Jianzhun Du (Harvard University) · Joseph Futoma (Harvard University) · Finale Doshi-Velez (Harvard)
44、DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction
Aviral Kumar (UC Berkeley) · Abhishek Gupta (University of California, Berkeley) · Sergey Levine (UC Berkeley)
45、Neurosymbolic Reinforcement Learning with Formally Verified Exploration
Greg Anderson (University of Texas at Austin) · Abhinav Verma (Rice University) · Isil Dillig (UT Austin) · Swarat Chaudhuri (The University of Texas at Austin)
46、Generalized Hindsight for Reinforcement Learning
Alexander Li (UC Berkeley) · Lerrel Pinto (New York University) · Pieter Abbeel (UC Berkeley & covariant.ai)
47、Meta-Gradient Reinforcement Learning with an Objective Discovered Online
Zhongwen Xu (DeepMind) · Hado van Hasselt (DeepMind) · Matteo Hessel (Google DeepMind) · Junhyuk Oh (DeepMind) · Satinder Singh (DeepMind) · David Silver (DeepMind)
48、TorsionNet: A Reinforcement Learning Approach to Sequential Conformer Search
Tarun Gogineni (University of Michigan) · Ziping Xu (University of Michigan) · Exequiel Punzalan (University of Michigan) · Runxuan Jiang (University of Michigan) · Joshua Kammeraad (University of Michigan) · Ambuj Tewari (University of Michigan) · Paul Zimmerman (University of Michigan)
49、Learning to Dispatch for Job Shop Scheduling via Deep Reinforcement Learning
Cong Zhang (Nanyang Technological University) · Wen Song (Institute of Marine Scinece and Technology, Shandong University) · Zhiguang Cao (National University of Singapore) · Jie Zhang (Nanyang Technological University) · Puay Siew Tan (SIMTECH) · Xu Chi (Singapore Institute of Manufacturing Technology, A-Star)
50、Is Plug-in Solver Sample-Efficient for Feature-based Reinforcement Learning?
Qiwen Cui (Peking University) · Lin Yang (UCLA)
51、Instance-based Generalization in Reinforcement Learning
Martin Bertran (Duke University) · Natalia L Martinez (Duke University) · Mariano Phielipp (Intel AI Labs) · Guillermo Sapiro (Duke University)
52、Preference-based Reinforcement Learning with Finite-Time Guarantees
Yichong Xu (Carnegie Mellon University) · Ruosong Wang (Carnegie Mellon University) · Lin Yang (UCLA) · Aarti Singh (CMU) · Artur Dubrawski (Carnegie Mellon University)
53、Learning to Decode: Reinforcement Learning for Decoding of Sparse Graph-Based Channel Codes
Salman Habib (New Jersey Institute of Tech) · Allison Beemer (New Jersey Institute of Technology) · Joerg Kliewer (New Jersey Institute of Technology)
54、BAIL: Best-Action Imitation Learning for Batch Deep Reinforcement Learning
Xinyue Chen (NYU Shanghai) · Zijian Zhou (NYU Shanghai) · Zheng Wang (NYU Shanghai) · Che Wang (New York University) · Yanqiu Wu (New York University) · Keith Ross (NYU Shanghai)
55、Task-Agnostic Online Reinforcement Learning with an Infinite Mixture of Gaussian Processes
Mengdi Xu (Carnegie Mellon University) · Wenhao Ding (Carnegie Mellon University) · Jiacheng Zhu (Carnegie Mellon University) · ZUXIN LIU (Carnegie Mellon University) · Baiming Chen (Tsinghua University) · Ding Zhao (Carnegie Mellon University)
56、On Reward-Free Reinforcement Learning with Linear Function Approximation
Ruosong Wang (Carnegie Mellon University) · Simon Du (Institute for Advanced Study) · Lin Yang (UCLA) · Russ Salakhutdinov (Carnegie Mellon University)
57、Near-Optimal Reinforcement Learning with Self-Play
Yu Bai (Salesforce Research) · Chi Jin (Princeton University) · Tiancheng Yu (MIT)
58、Robust Multi-Agent Reinforcement Learning with Model Uncertainty
Kaiqing Zhang (University of Illinois at Urbana-Champaign (UIUC)) · TAO SUN (Amazon.com) · Yunzhe Tao (Amazon Artificial Intelligence) · Sahika Genc (Amazon Artificial Intelligence) · Sunil Mallya (Amazon AWS) · Tamer Basar (University of Illinois at Urbana-Champaign)
59、Towards Minimax Optimal Reinforcement Learning in Factored Markov Decision Processes
Yi Tian (MIT) · Jian Qian (MIT) · Suvrit Sra (MIT)
60、Scalable Multi-Agent Reinforcement Learning for Networked Systems with Average Reward
Guannan Qu (California Institute of Technology) · Yiheng Lin (California Institute of Technology) · Adam Wierman (California Institute of Technology) · Na Li (Harvard University)
61、Constrained episodic reinforcement learning in concave-convex and knapsack settings
Kianté Brantley (The University of Maryland College Park) · Miro Dudik (Microsoft Research) · Thodoris Lykouris (Microsoft Research NYC) · Sobhan Miryoosefi (Princeton University) · Max Simchowitz (Berkeley) · Aleksandrs Slivkins (Microsoft Research) · Wen Sun (Microsoft Research NYC)
62、Sample Efficient Reinforcement Learning via Low-Rank Matrix Estimation
Devavrat Shah (Massachusetts Institute of Technology) · Dogyoon Song (Massachusetts Institute of Technology) · Zhi Xu (MIT) · Yuzhe Yang (MIT)
63、Trajectory-wise Multiple Choice Learning for Dynamics Generalization in Reinforcement Learning
Younggyo Seo (KAIST) · Kimin Lee (UC Berkeley) · Ignasi Clavera Gilaberte (UC Berkeley) · Thanard Kurutach (University of California Berkeley) · Jinwoo Shin (KAIST) · Pieter Abbeel (UC Berkeley & covariant.ai)
64、Cooperative Heterogeneous Deep Reinforcement Learning
Han Zheng (UTS) · Pengfei Wei (National University of Singapore) · Jing Jiang (University of Technology Sydney) · Guodong Long (University of Technology Sydney (UTS)) · Qinghua Lu (Data61, CSIRO) · Chengqi Zhang (University of Technology Sydney)
65、Implicit Distributional Reinforcement Learning
Yuguang Yue (University of Texas at Austin) · Zhendong Wang (University of Texas, Austin) · Mingyuan Zhou (University of Texas at Austin)
66、Efficient Exploration of Reward Functions in Inverse Reinforcement Learning via Bayesian Optimization
Sreejith Balakrishnan (National University of Singapore) · Quoc Phong Nguyen (National University of Singapore) · Bryan Kian Hsiang Low (National University of Singapore) · Harold Soh (National University Singapore)
67、EPOC: A Provably Correct Policy Gradient Approach to Reinforcement Learning
Alekh Agarwal (Microsoft Research) · Mikael Henaff (Microsoft) · Sham Kakade (University of Washington) · Wen Sun (Microsoft Research NYC)
68、Provably Efficient Reinforcement Learning with Kernel and Neural Function Approximations
Zhuoran Yang (Princeton) · Chi Jin (Princeton