NeurIPS accepted 1,900 papers this year. How should I read them?

2020-10-13 12:16:00
刘大牛

Recently, a mildly trending question appeared on Zhihu:

Under that question, many prominent researchers have already offered advice on how to read papers.

Indeed, NeurIPS 2020 accepted nearly two thousand papers this year. What does that actually mean?

Reportedly, one of the AI community's stars, Dr. Zhang Xiangyu of Megvii, read 1,800 papers in three years.

That is already a formidable pace, yet even at that rate it would take him three years just to get through the NeurIPS 2020 papers. Where does that leave everyone else?

On the subject of reading papers, AI Technology Review previously published the article "Andrew Ng teaches you to read papers: slow, sustained learning is the right way", which is worth revisiting.

In Andrew Ng's view, reading papers is not about speed; sustained, high-quality reading is the right approach.

Today, taking the nearly two thousand accepted papers of NeurIPS 2020 as its subject, AI Technology Review offers readers two aids for paper reading.

1、Read the papers of leading researchers:

See the article "The big NeurIPS 2020 acceptance ranking! Google first with 169 papers, Stanford second, Tsinghua first in China".

In that article, AI Technology Review collected the accepted papers of prominent AI researchers such as the three giants of deep learning, Zhou Zhihua, and Fei-Fei Li. On average, papers produced by such teams come with a strong guarantee of quality.

2、Read by topic:

This is the obvious choice, and it is what most people are already doing. In today's article, AI Technology Review sorts the NeurIPS 2020 papers into a simple set of categories with counts, for your reference.

Notes:

1、Topics were chosen from those we encounter day to day; completeness is not guaranteed.

2、Counts overlap and papers may be double-counted: for example, 《Semi-Supervised Neural Architecture Search》 is counted once under semi-supervised learning and once under NAS.

3、The counting was done with "artificial" intelligence (that is, by hand); please blame any omissions or mistakes on the AI.

4、Follow-up additions to these statistics will be continuously updated on AI Technology Review's Zhihu column; you are welcome to follow it there.
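For readers who want to reproduce or extend this kind of tally themselves, the keyword-based, overlapping counting described in notes 1 and 2 can be sketched in a few lines of Python. The topic keyword lists and sample titles below are illustrative stand-ins, not the actual lists used for this article:

```python
from collections import defaultdict

# Hypothetical topic -> keyword lists; the real article's topics were chosen by hand.
TOPICS = {
    "Attention": ["attention"],
    "Transformer": ["transformer"],
    "NAS": ["architecture search"],
    "Semi-supervised": ["semi-supervised"],
}

def count_by_topic(titles):
    """Assign each title to every topic whose keywords it matches.

    A title that matches several topics lands in several buckets,
    which is exactly the double-counting behavior described in note 2.
    """
    buckets = defaultdict(list)
    for title in titles:
        lower = title.lower()
        for topic, keywords in TOPICS.items():
            if any(k in lower for k in keywords):
                buckets[topic].append(title)
    return buckets

titles = [
    "Semi-Supervised Neural Architecture Search",
    "Fast Transformers with Clustered Attention",
]
buckets = count_by_topic(titles)
# The first title is counted under both "NAS" and "Semi-supervised";
# the second under both "Transformer" and "Attention".
```

Because the buckets intentionally overlap, summing the per-topic counts will exceed the number of distinct papers, just as in the statistics below.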



Prelude

1、The paper with the shortest title:

《Choice Bandits》

2、The paper with the most co-authors (31): Google Brain + 29 authors from OpenAI + a Johns Hopkins dream team

《Language Models are Few-Shot Learners》

3、Imitating "Attention Is All You Need"?

4、Five papers related to COVID-19:

《When and How to Lift Restrictions? Global COVID-19 Scenario Analysis and Policy Assessment Based on Regional Gaussian Processes》

《A Causal Analysis of the Spread of COVID-19 in Germany》

《CogMol: Target-Specific and Selective Drug Design for COVID-19》

《On the Robustness of Estimates of the Effectiveness of Non-Pharmaceutical Interventions Against COVID-19 Transmission》

《Interpretable Sequence Learning for COVID-19 Forecasting》

Also of note: COVID-19 Open Data, an open time-series dataset on the pandemic: https://github.com/GoogleCloudPlatform/covid-19-open-data

5、Five "Rethinking" papers:

《Rethinking the Value of Labels for Improving Class-Imbalanced Learning》

《Rethinking Pre-training and Self-training》

This Google Brain paper has been available on arXiv since June 11.

Paper link: https://arxiv.org/pdf/2006.06882

《Rethinking Pooling in Graph Neural Networks》

《Rethinking the Learnable Tree Filter for Generic Feature Transform》

《Rethinking Importance Weighting for Deep Learning under Distribution Shift》



6、Papers with "Beyond" in the title:

The title of this year's ACL 2020 best paper also contains the word "Beyond", so perhaps one of the papers below will catch some of that luck and take a major award at NeurIPS 2020. (We accept no responsibility if it does not.)



The first of these opens with "Beyond accuracy", exactly matching the opening of the ACL 2020 best paper's title.

7、Papers with particularly fun titles:

《Teaching a GAN What Not to Learn》

Siddarth Asokan (Indian Institute of Science) · Chandra Seelamantula (IISc Bangalore)

《Self-supervised learning through the eyes of a child》

Emin Orhan (New York University) · Vaibhav Gupta (New York University) · Brenden Lake (New York University)

《How hard is to distinguish graphs with graph neural networks?》

Andreas Loukas (EPFL)

《Tree! I am no Tree! I am a low dimensional Hyperbolic Embedding》

Rishi S Sonthalia (University of Michigan) · Anna Gilbert (University of Michigan)

8、ReLU: 7 papers



2

NLP-related



1、BERT: 7 papers





2、Attention: 24 papers. Attention is of course not used only in NLP, but for now we group these under the NLP category; the same applies below.

1、Auto Learning  Attention

Benteng Ma (Northwestern Polytechnical University) · Jing Zhang (The University of Sydney) · Yong Xia (Northwestern Polytechnical University, Research & Development Institute of Northwestern Polytechnical University in Shenzhen) · Dacheng Tao (University of Sydney)

2、Bayesian  Attention  Modules

Xinjie Fan (UT Austin) · Shujian Zhang (UT Austin) · Bo Chen (Xidian University) · Mingyuan Zhou (University of Texas at Austin)

3、Improving Natural Language Processing Tasks with Human Gaze-Guided Neural  Attention

Ekta Sood (University of Stuttgart, Simtech ) · Simon Tannert (Institute for Natural Language Processing, University of Stuttgart) · Philipp Mueller (VIS, University of Stuttgart) · Andreas Bulling (University of Stuttgart)

4、Prophet Attention: Predicting Attention with Future  Attention  for Improved Image Captioning

Fenglin Liu (Peking University) · Xuancheng Ren (Peking University) · Xian Wu (Tencent Medical AI Lab) · Shen Ge (Tencent Medical AI Lab) · Wei Fan (Tencent) · Yuexian Zou (Peking University) · Xu Sun (Peking University)

5、Kalman Filtering  Attention  for User Behavior Modeling in CTR Prediction

Hu Liu (JD.com) · Jing LU (Business Growth BU JD.com) · Xiwei Zhao (JD.com) · Sulong Xu (JD.com) · Hao Peng (JD.com) · Yutong Liu (JD.com) · Zehua Zhang (JD.com) · Jian Li (JD.com) · Junsheng Jin (JD.com) · Yongjun Bao (JD.com) · Weipeng Yan (JD.com)

6、RANet: Region  Attention  Network for Semantic Segmentation

Dingguo Shen (Shenzhen University) · Yuanfeng Ji (City University of Hong Kong) · Ping Li (The Hong Kong Polytechnic University) · Yi Wang (Shenzhen University) · Di Lin (Tianjin University)

7、SE(3)-Transformers: 3D Roto-Translation Equivariant Attention Networks

Fabian Fuchs (University of Oxford) · Daniel Worrall (University of Amsterdam) · Volker Fischer (Robert Bosch GmbH, Bosch Center for Artificial Intelligence) · Max Welling (University of Amsterdam / Qualcomm AI Research)

8、Complementary  Attention  Self-Distillation for Weakly-Supervised Object Detection

Zeyi Huang (carnegie mellon university) · Yang Zou (Carnegie Mellon University) · B. V. K. Vijaya Kumar (CMU, USA) · Dong Huang (Carnegie Mellon University)

9、Modern Hopfield Networks and  Attention  for Immune Repertoire Classification

Michael Widrich (LIT AI Lab / University Linz) · Bernhard Schäfl (JKU Linz) · Milena Pavlović (Department of Informatics, University of Oslo) · Hubert Ramsauer (LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria) · Lukas Gruber (Johannes Kepler University) · Markus Holzleitner (LIT AI Lab / University Linz) · Johannes Brandstetter (LIT AI Lab / University Linz) · Geir Kjetil Sandve (Department of Informatics, University of Oslo) · Victor Greiff (Department of Immunology, University of Oslo) · Sepp Hochreiter (LIT AI Lab / University Linz / IARAI) · Günter Klambauer (LIT AI Lab / University Linz)

10、Untangling tradeoffs between recurrence and  self-attention  in artificial neural networks

Giancarlo Kerg (MILA) · Bhargav Kanuparthi (Montreal Institute for Learning Algorithms) · Anirudh Goyal ALIAS PARTH GOYAL (Université de Montréal) · Kyle Goyette (University of Montreal) · Yoshua Bengio (Mila / U. Montreal) · Guillaume Lajoie (Mila, Université de Montréal)

11、RATT: Recurrent  Attention  to Transient Tasks for Continual Image Captioning

Riccardo Del Chiaro (University of Florence) · Bartłomiej Twardowski (Computer Vision Center, UAB) · Andrew D Bagdanov (University of Florence) · Joost van de Weijer (Computer Vision Center Barcelona)

12、Multi-Task Temporal Shift  Attention  Networks for On-Device Contactless Vitals Measurement

Xin Liu (University of Washington ) · Josh Fromm (OctoML) · Shwetak Patel (University of Washington) · Daniel McDuff (Microsoft Research)

13、SAC: Accelerating and Structuring  Self-Attention  via Sparse Adaptive Connection

Xiaoya Li (Shannon.AI) · Yuxian Meng (Shannon.AI) · Mingxin Zhou (Shannon.AI) · Qinghong Han (Shannon.AI) · Fei Wu (Zhejiang University) · Jiwei Li (Shannon.AI)

14、Fast Transformers with Clustered  Attention

Apoorv Vyas (Idiap Research Institute) · Angelos Katharopoulos (Idiap) · François Fleuret (University of Geneva)

15、Sparse and Continuous  Attention  Mechanisms

André Martins () · Marcos Treviso (Instituto de Telecomunicacoes) · António Farinhas (Instituto Superior Técnico) · Vlad Niculae (Instituto de Telecomunicações) · Mario Figueiredo (University of Lisbon) · Pedro Aguiar (Instituto Superior Técnico)

16、Learning to Execute Programs with Instruction Pointer  Attention  Graph Neural Networks

David Bieber (Google Brain) · Charles Sutton (Google) · Hugo Larochelle (Google Brain) · Daniel Tarlow (Google Brain)

17、Neural encoding with visual  attention

Meenakshi Khosla (Cornell University) · Gia Ngo (Cornell University) · Keith Jamison (Cornell University) · Amy Kuceyeski (Cornell University) · Mert Sabuncu (Cornell)

18、Deep Reinforcement Learning with Stacked Hierarchical  Attention  for Text-based Games

Yunqiu Xu (University of Technology Sydney) · Meng Fang (Tencent) · Ling Chen (University of Technology Sydney, Australia) · Yali Du (University College London) · Joey Tianyi Zhou (IHPC, A*STAR) · Chengqi Zhang (University of Technology Sydney)

19、Object-Centric Learning with Slot  Attention

Francesco Locatello (ETH Zürich - MPI Tübingen) · Dirk Weissenborn (Google) · Thomas Unterthiner (Google Research, Brain Team) · Aravindh Mahendran (Google) · Georg Heigold (Google) · Jakob Uszkoreit (Google, Inc.) · Alexey Dosovitskiy (Google Research) · Thomas Kipf (Google Research)

20、SMYRF - Efficient  attention  using asymmetric clustering

Giannis Daras (National Technical University of Athens) · Nikita Kitaev (University of California, Berkeley) · Augustus Odena (Google Brain) · Alexandros Dimakis (University of Texas, Austin)

21、Focus of  Attention  Improves Information Transfer in Visual Features

Matteo Tiezzi (University of Siena) · Stefano Melacci (University of Siena) · Alessandro Betti (University of Siena) · Marco Maggini (University of Siena) · Marco Gori (University of Siena)

22、AttendLight: Universal Attention-Based Reinforcement Learning Model for Traffic Signal Control

Afshin Oroojlooy (SAS Institute, Inc) · Mohammadreza Nazari (SAS Institute Inc.) · Davood Hajinezhad (SAS Institute Inc.) · Jorge Silva (SAS)

23、Multi-agent Trajectory Prediction with Fuzzy Query  Attention

Nitin Kamra (University of Southern California) · Hao Zhu (Peking University) · Dweep Kumarbhai Trivedi (University of Southern California) · Ming Zhang (Peking University) · Yan Liu (University of Southern California)

24、Limits to Depth Efficiencies of  Self-Attention

Yoav Levine (HUJI) · Noam Wies (Hebrew University of Jerusalem) · Or Sharir (Hebrew University of Jerusalem) · Hofit Bata (Hebrew University of Jerusalem) · Amnon Shashua (Hebrew University of Jerusalem)

3、Transformer: 14 papers

1、Fast  Transformers  with Clustered Attention

Apoorv Vyas (Idiap Research Institute) · Angelos Katharopoulos (Idiap) · François Fleuret (University of Geneva)

2、Deep  Transformers  with Latent Depth

Xian Li (Facebook) · Asa Cooper Stickland (University of Edinburgh) · Yuqing Tang (Facebook AI) · Xiang Kong (Carnegie Mellon University)

3、CrossTransformers: spatially-aware few-shot transfer

Carl Doersch (DeepMind) · Ankush Gupta (DeepMind) · Andrew Zisserman (DeepMind & University of Oxford)

4、SE(3)-Transformers: 3D Roto-Translation Equivariant Attention Networks

Fabian Fuchs (University of Oxford) · Daniel Worrall (University of Amsterdam) · Volker Fischer (Robert Bosch GmbH, Bosch Center for Artificial Intelligence) · Max Welling (University of Amsterdam / Qualcomm AI Research)

5、Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing

Zihang Dai (Carnegie Mellon University) · Guokun Lai (Carnegie Mellon University) · Yiming Yang (CMU) · Quoc V Le (Google)

6、Adversarial Sparse  Transformer  for Time Series Forecasting

Sifan Wu (Tsinghua University) · Xi Xiao (Tsinghua University) · Qianggang Ding (Tsinghua University) · Peilin Zhao (Tencent AI Lab) · Ying Wei (Tencent AI Lab) · Junzhou Huang (University of Texas at Arlington / Tencent AI Lab)

7、Accelerating Training of Transformer-Based Language Models with Progressive Layer Dropping

Minjia Zhang (Microsoft) · Yuxiong He (Microsoft)

8、COOT: Cooperative Hierarchical  Transformer  for Video-Text Representation Learning

Mohammadreza Zolfaghari (University of Freiburg) · Simon Ging (Uni Freiburg) · Hamed Pirsiavash (University of Maryland, Baltimore County) · Thomas Brox (University of Freiburg)

9、Cascaded Text Generation with Markov  Transformers

Yuntian Deng (Harvard University) · Alexander Rush (Cornell University)

10、GROVER: Self-Supervised Message Passing  Transformer  on Large-scale Molecular Graphs

Yu Rong (Tencent AI Lab) · Yatao Bian (Tencent AI Lab) · Tingyang Xu (Tencent AI Lab) · Weiyang Xie (Tencent AI Lab) · Ying WEI (Tencent AI Lab) · Wenbing Huang (Tsinghua University) · Junzhou Huang (University of Texas at Arlington / Tencent AI Lab)

11、Learning to Communicate in Multi-Agent Systems via Transformer-Guided Program Synthesis

Jeevana Priya Inala (MIT) · Yichen Yang (MIT) · James Paulos (University of Pennsylvania) · Yewen Pu (MIT) · Osbert Bastani (University of Pennysylvania) · Vijay Kumar (University of Pennsylvania) · Martin Rinard (MIT) · Armando Solar-Lezama (MIT)

12、Measuring Systematic Generalization in Neural Proof Generation with  Transformers

Nicolas Gontier (Mila, Polytechnique Montréal) · Koustuv Sinha (McGill University / Mila / FAIR) · Siva Reddy (McGill University) · Chris Pal (Montreal Institute for Learning Algorithms, École Polytechnique, Université de Montréal)

13、O(n) Connections are Expressive Enough: Universal Approximability of Sparse Transformers

Chulhee Yun (MIT) · Yin-Wen Chang (Google Inc.) · Srinadh Bhojanapalli (Google AI) · Ankit Singh Rawat (Google Research) · Sashank Reddi (Google) · Sanjiv Kumar (Google Research)

14、MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained  Transformers

Wenhui Wang (MSRA) · Furu Wei (Microsoft Research Asia) · Li Dong (Microsoft Research) · Hangbo Bao (Harbin Institute of Technology) · Nan Yang (Microsoft Research Asia) · Ming Zhou (Microsoft Research)

4、Pre-training: 5 papers

1、 Pre-training  via Paraphrasing

Mike Lewis (Facebook AI Research) · Marjan Ghazvininejad (Facebook AI Research) · Gargi Ghosh (Facebook) · Armen Aghajanyan (Facebook) · Sida Wang (Facebook AI Research) · Luke Zettlemoyer (University of Washington and Allen Institute for Artificial Intelligence)

2、 Pre-Training  Graph Neural Networks: A Contrastive Learning Framework with Augmentations

Yuning You (Texas A&M University) · Tianlong Chen (Unversity of Texas at Austin) · Yongduo Sui (University of Science and Technology of China) · Ting Chen (Google) · Zhangyang Wang (University of Texas at Austin) · Yang Shen (Texas A&M University)

3、Rethinking  Pre-training  and Self-training

Barret Zoph (Google Brain) · Golnaz Ghiasi (Google) · Tsung-Yi Lin (Google Brain) · Yin Cui (Google) · Hanxiao Liu (Google Brain) · Ekin Dogus Cubuk (Google Brain) · Quoc V Le (Google)

4、MPNet: Masked and Permuted  Pre-training  for Language Understanding

Kaitao Song (Nanjing University of Science and technology) · Xu Tan (Microsoft Research) · Tao Qin (Microsoft Research) · Jianfeng Lu (Nanjing University of Science and Technology) · Tie-Yan Liu (Microsoft Research Asia)

5、Adversarial Contrastive Learning: Harvesting More Robustness from Unsupervised  Pre-Training

Ziyu Jiang (Texas A&M University) · Tianlong Chen (Unversity of Texas at Austin) · Ting Chen (Google) · Zhangyang Wang (University of Texas at Austin)

3



CV-related

Object detection: 12 papers

1、A Ranking-based, Balanced Loss Function for Both Classification and Localisation in  Object Detection

Kemal Oksuz (Middle East Technical University) · Baris Can Cam (Roketsan) · Emre Akbas (Middle East Technical University) · Sinan Kalkan (Middle East Technical University)

2、UWSOD: Toward Fully-Supervised-Level Performance Weakly Supervised Object Detection

Yunhang Shen (Xiamen University) · Rongrong Ji (Xiamen University, China) · Zhiwei Chen (Xiamen University) · Yongjian Wu (Tencent Technology (Shanghai) Co.,Ltd) · Feiyue Huang (Tencent)

3、Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense  Object Detection

Xiang Li (NJUST) · Wenhai Wang (Nanjing University) · Lijun Wu (Sun Yat-sen University) · Shuo Chen (Nanjing University of Science and Technology) · Xiaolin Hu (Tsinghua University) · Jun Li (Nanjing University of Science and Technology) · Jinhui Tang (Nanjing University of Science and Technology) · Jian Yang (Nanjing University of Science and Technology)

4、Every View Counts: Cross-View Consistency in 3D  Object Detection  with Hybrid-Cylindrical-Spherical Voxelization

Qi Chen (Johns Hopkins University) · Lin Sun (Samsung, Stanford, HKUST) · Ernest Cheung (Samsung) · Alan Yuille (Johns Hopkins University)

5、Complementary Attention Self-Distillation for Weakly-Supervised  Object Detection

Zeyi Huang (carnegie mellon university) · Yang Zou (Carnegie Mellon University) · B. V. K. Vijaya Kumar (CMU, USA) · Dong Huang (Carnegie Mellon University)

6、Few-Cost Salient  Object Detection  with Adversarial-Paced Learning

Dingwen Zhang (Xidian University) · HaiBin Tian (Xidian University) · Jungong Han (University of Warwick)

7、Bridging Visual Representations for  Object Detection

Cheng Chi (University of Chinese Academy of Sciences) · Fangyun Wei (Microsoft Research Asia) · Han Hu (Microsoft Research Asia)

8、Fine-Grained Dynamic Head for  Object Detection

Lin Song (Xian Jiaotong University) · Yanwei Li (The Chinese University of Hong Kong) · Zhengkai Jiang (Institute of Automation,Chinese Academy of Sciences) · Zeming Li (Megvii(Face++) Inc) · Hongbin Sun (Xian Jiaotong University) · Jian Sun (Megvii, Face++) · Nanning Zheng (Xian Jiaotong University)

9、Detection as Regression: Certified  Object Detection  with Median Smoothing

Ping-yeh Chiang (University of Maryland, College Park) · Michael Curry (University of Maryland) · Ahmed Abdelkader (University of Maryland, College Park) · Aounon Kumar (University of Maryland, College Park) · John Dickerson (University of Maryland) · Tom Goldstein (University of Maryland)

10、RepPoints v2: Verification Meets Regression for  Object Detection

Yihong Chen (Peking University) · Zheng Zhang (MSRA) · Yue Cao (Microsoft Research) · Liwei Wang (Peking University) · Stephen Lin (Microsoft Research) · Han Hu (Microsoft Research Asia)

11、CoADNet: Collaborative Aggregation-and-Distribution Networks for Co-Salient  Object Detection

Qijian Zhang (City University of Hong Kong) · Runmin Cong (Beijing Jiaotong University) · Junhui Hou (City University of Hong Kong, Hong Kong) · Chongyi Li ( Nanyang Technological University) · Yao Zhao (Beijing Jiaotong University)

12、Restoring Negative Information in Few-Shot  Object Detection

Yukuan Yang (Tsinghua University) · Fangyun Wei (Microsoft Research Asia) · Miaojing Shi (Kings College London) · Guoqi Li (Tsinghua University)

Object segmentation: 3 papers

1、Video Object Segmentation with Adaptive Feature Bank and Uncertain-Region Refinement

Yongqing Liang (Louisiana State University) · Xin Li (Louisiana State University) · Navid Jafari (Louisiana State University) · Jim Chen (Northeastern University)



2、Make One-Shot Video  Object Segmentation  Efficient Again



Tim Meinhardt (TUM) · Laura Leal-Taixé (TUM)



3、Delving into the Cyclic Mechanism in Semi-supervised Video  Object Segmentation



Yuxi Li (Shanghai Jiao Tong University) · Jinlong Peng (Tencent Youtu Lab) · Ning Xu (Adobe Research) · John See (Multimedia University) · Weiyao Lin (Shanghai Jiao Tong university)



Instance segmentation: 2 papers

1、Deep Variational  Instance Segmentation

Jialin Yuan (Oregon State University) · Chao Chen (Stony Brook University) · Fuxin Li (Oregon State University)

2、DFIS: Dynamic and Fast  Instance Segmentation

Xinlong Wang (University of Adelaide) · Rufeng Zhang (Tongji University) · Tao Kong (Bytedance) · Lei Li (ByteDance AI Lab) · Chunhua Shen (University of Adelaide)

Person re-identification:

4

Various kinds of learning

1、Reinforcement learning: 94 papers

1、 Reinforcement Learning  for Control with Multiple Frequencies

Jongmin Lee (KAIST) · ByungJun Lee (KAIST) · Kee-Eung Kim (KAIST)

2、 Reinforcement Learning  with General Value Function Approximation: Provably Efficient Approach via Bounded Eluder Dimension

Ruosong Wang (Carnegie Mellon University) · Russ Salakhutdinov (Carnegie Mellon University) · Lin Yang (UCLA)

3、 Reinforcement Learning  in Factored MDPs: Oracle-Efficient Algorithms and Tighter Regret Bounds for the Non-Episodic Setting

Ziping Xu (University of Michigan) · Ambuj Tewari (University of Michigan)

4、 Reinforcement Learning  with Feedback Graphs

Christoph Dann (Carnegie Mellon University) · Yishay Mansour (Google) · Mehryar Mohri (Courant Inst. of Math. Sciences & Google Research) · Ayush Sekhari (Cornell University) · Karthik Sridharan (Cornell University)

5、 Reinforcement Learning  with Augmented Data

Misha Laskin (UC Berkeley) · Kimin Lee (UC Berkeley) · Adam Stooke (UC Berkeley) · Lerrel Pinto (New York University) · Pieter Abbeel (UC Berkeley & covariant.ai) · Aravind Srinivas (UC Berkeley)

6、 Reinforcement Learning  with Combinatorial Actions: An Application to Vehicle Routing

Arthur Delarue (MIT) · Ross Anderson (Google Research) · Christian Tjandraatmadja (Google)

7、Breaking the Sample Size Barrier in Model-Based  Reinforcement Learning with a Generative Model

Gen Li (Tsinghua University) · Yuting Wei (Carnegie Mellon University) · Yuejie Chi (CMU) · Yuantao Gu (Tsinghua University) · Yuxin Chen (Princeton University)

8、Almost Optimal Model-Free  Reinforcement Learning  via Reference-Advantage Decomposition

Zihan Zhang (Tsinghua University) · Yuan Zhou (UIUC) · Xiangyang Ji (Tsinghua University)

9、Effective Diversity in Population Based  Reinforcement Learning

Jack Parker-Holder (University of Oxford) · Aldo Pacchiano (UC Berkeley) · Krzysztof M Choromanski (Google Brain Robotics) · Stephen J Roberts (University of Oxford)

10、A Boolean Task Algebra for  Reinforcement Learning

Geraud Nangue Tasse (University of the Witwatersrand) · Steven James (University of the Witwatersrand) · Benjamin Rosman (University of the Witwatersrand / CSIR)

11、Knowledge Transfer in Multi-Task  Deep Reinforcement Learning  for Continuous Control

Zhiyuan Xu (Syracuse University) · Kun Wu (Syracuse University) · Zhengping Che (DiDi AI Labs, Didi Chuxing) · Jian Tang (DiDi AI Labs, DiDi Chuxing) · Jieping Ye (Didi Chuxing)

12、Multi-task Batch  Reinforcement Learning  with Metric Learning

Jiachen Li (University of California, San Diego) · Quan Vuong (University of California San Diego) · Shuang Liu (University of California, San Diego) · Minghua Liu (UCSD) · Kamil Ciosek (Microsoft) · Henrik Christensen (UC San Diego) · Hao Su (UCSD)

13、On the Stability and Convergence of Robust Adversarial Reinforcement Learning: A Case Study on Linear Quadratic Systems

Kaiqing Zhang (University of Illinois at Urbana-Champaign (UIUC)) · Bin Hu (University of Illinois at Urbana-Champaign) · Tamer Basar (University of Illinois at Urbana-Champaign)

14、Towards Playing Full MOBA Games with  Deep Reinforcement Learning

Deheng Ye (Tencent) · Guibin Chen (Tencent) · Wen Zhang (Tencent) · chen sheng (qq) · Bo Yuan (Tencent) · Bo Liu (Tencent) · Jia Chen (Tencent) · Hongsheng Yu (Tencent) · Zhao Liu (Tencent) · Fuhao Qiu (Tencent AI Lab) · Liang Wang (Tencent) · Tengfei Shi (Tencent) · Yinyuting Yin (Tencent) · Bei Shi (Tencent AI Lab) · Lanxiao Huang (Tencent) · qiang fu (Tencent AI Lab) · Wei Yang (Tencent AI Lab) · Wei Liu (Tencent AI Lab)

15、Promoting Coordination through Policy Regularization in Multi-Agent Deep Reinforcement Learning

Julien Roy (Mila) · Paul Barde (Quebec AI institute - Ubisoft La Forge) · Félix G Harvey (Polytechnique Montréal) · Derek Nowrouzezahrai (McGill University) · Chris Pal (MILA, Polytechnique Montréal, Element AI)

16、Confounding-Robust Policy Evaluation in Infinite-Horizon  Reinforcement Learning

Nathan Kallus (Cornell University) · Angela Zhou (Cornell University)

17、Learning Retrospective Knowledge with Reverse  Reinforcement Learning

Shangtong Zhang (University of Oxford) · Vivek Veeriah (University of Michigan) · Shimon Whiteson (University of Oxford)

18、Combining  Deep Reinforcement Learning  and Search for Imperfect-Information Games

Noam Brown (Facebook AI Research) · Anton Bakhtin (Facebook AI Research) · Adam Lerer (Facebook AI Research) · Qucheng Gong (Facebook AI Research)

19、POMO: Policy Optimization with Multiple Optima for  Reinforcement Learning

Yeong-Dae Kwon (Samsung SDS) · Jinho Choo (Samsung SDS) · Byoungjip Kim (Samsung SDS) · Iljoo Yoon (Samsung SDS) · Youngjune Gwon (Samsung SDS) · Seungjai Min (Samsung SDS)

20、Self-Paced  Deep Reinforcement Learning

Pascal Klink (TU Darmstadt) · Carlo D'Eramo (TU Darmstadt) · Jan Peters (TU Darmstadt & MPI Intelligent Systems) · Joni Pajarinen (TU Darmstadt)

21、Efficient Model-Based  Reinforcement Learning  through Optimistic Policy Search and Planning

Sebastian Curi (ETHz) · Felix Berkenkamp (Bosch Center for Artificial Intelligence) · Andreas Krause (ETH Zurich)

22、Weakly-Supervised  Reinforcement Learning  for Controllable Behavior

Lisa Lee (CMU / Google Brain / Stanford) · Ben Eysenbach (Carnegie Mellon University) · Russ Salakhutdinov (Carnegie Mellon University) · Shixiang (Shane) Gu (Google Brain) · Chelsea Finn (Stanford)

23、MOReL: Model-Based Offline  Reinforcement Learning

Rahul Kidambi (Cornell University) · Aravind Rajeswaran (University of Washington) · Praneeth Netrapalli (Microsoft Research) · Thorsten Joachims (Cornell)

24、Security Analysis of Safe and Seldonian  Reinforcement Learning  Algorithms

Pinar Ozisik (UMass Amherst) · Philip Thomas (University of Massachusetts Amherst)

25、Model-based Adversarial  Meta-Reinforcement Learning

Zichuan Lin (Tsinghua University) · Garrett W. Thomas (Stanford University) · Guangwen Yang (Tsinghua University) · Tengyu Ma (Stanford University)

26、Safe  Reinforcement Learning  via Curriculum Induction

Matteo Turchetta (ETH Zurich) · Andrey Kolobov (Microsoft Research) · Shital Shah (Microsoft) · Andreas Krause (ETH Zurich) · Alekh Agarwal (Microsoft Research)

27、Conservative Q-Learning for Offline  Reinforcement Learning

Aviral Kumar (UC Berkeley) · Aurick Zhou (University of California, Berkeley) · George Tucker (Google Brain) · Sergey Levine (UC Berkeley)

28、Munchausen  Reinforcement Learning

Nino Vieillard (Google Brain) · Olivier Pietquin (Google Research Brain Team) · Matthieu Geist (Google Brain)

29、Non-Crossing Quantile Regression for Distributional  Reinforcement Learning

Fan Zhou (Shanghai University of Finance and Economics) · Jianing Wang (Shanghai University of Finance and Economics) · Xingdong Feng (Shanghai University of Finance and Economics)

30、Online Decision Based Visual Tracking via  Reinforcement Learning

ke Song (Shandong university) · Wei Zhang (Shandong University) · Ran Song (School of Control Science and Engineering, Shandong University) · Yibin Li (Shandong University)

31、Discovering  Reinforcement Learning  Algorithms

Junhyuk Oh (DeepMind) · Matteo Hessel (Google DeepMind) · Wojciech Czarnecki (DeepMind) · Zhongwen Xu (DeepMind) · Hado van Hasselt (DeepMind) · Satinder Singh (DeepMind) · David Silver (DeepMind)

32、Shared Experience Actor-Critic for Multi-Agent  Reinforcement Learning

Filippos Christianos (University of Edinburgh) · Lukas Schäfer (University of Edinburgh) · Stefano Albrecht (University of Edinburgh)

33、The LoCA Regret: A Consistent Metric to Evaluate Model-Based Behavior in Reinforcement Learning

Harm Van Seijen (Microsoft Research) · Hadi Nekoei (MILA) · Evan Racah (Mila, Université de Montréal) · Sarath Chandar (Mila / École Polytechnique de Montréal)

34、Leverage the Average: an Analysis of KL Regularization in  Reinforcement Learning

Nino Vieillard (Google Brain) · Tadashi Kozuno (Okinawa Institute of Science and Technology) · Bruno Scherrer (INRIA) · Olivier Pietquin (Google Research Brain Team) · Remi Munos (DeepMind) · Matthieu Geist (Google Brain)

35、Task-agnostic Exploration in  Reinforcement Learning

Xuezhou Zhang (UW-Madison) · Yuzhe Ma (University of Wisconsin-Madison) · Adish Singla (MPI-SWS)

36、Generating Adjacency-Constrained Subgoals in Hierarchical  Reinforcement Learning

Tianren Zhang (Tsinghua University) · Shangqi Guo (Tsinghua University) · Tian Tan (Stanford University) · Xiaolin Hu (Tsinghua University) · Feng Chen (Tsinghua University)

37、Storage Efficient and Dynamic Flexible Runtime Channel Pruning via Deep Reinforcement Learning

Jianda Chen (Nanyang Technological University) · Shangyu Chen (Nanyang Technological University, Singapore) · Sinno Jialin Pan (Nanyang Technological University, Singapore)

38、Multi-Task  Reinforcement Learning  with Soft Modularization

Ruihan Yang (UC San Diego) · Huazhe Xu (UC Berkeley) · YI WU (UC Berkeley) · Xiaolong Wang (UCSD/UC Berkeley)

39、Weighted QMIX: Improving Monotonic Value Function Factorisation for Deep Multi-Agent  Reinforcement Learning

Tabish Rashid (University of Oxford) · Gregory Farquhar (University of Oxford) · Bei Peng (University of Oxford) · Shimon Whiteson (University of Oxford)

40、MDP Homomorphic Networks: Group Symmetries in  Reinforcement Learning

Elise van der Pol (University of Amsterdam) · Daniel Worrall (University of Amsterdam) · Herke van Hoof (University of Amsterdam) · Frans Oliehoek (TU Delft) · Max Welling (University of Amsterdam / Qualcomm AI Research)

41、On Efficiency in Hierarchical  Reinforcement Learning

Zheng Wen (DeepMind) · Doina Precup (DeepMind) · Morteza Ibrahimi (DeepMind) · Andre Barreto (DeepMind) · Benjamin Van Roy (Stanford University) · Satinder Singh (DeepMind)

42、Variational Policy Gradient Method for  Reinforcement Learning  with General Utilities

Junyu Zhang (Princeton University) · Alec Koppel (U.S. Army Research Laboratory) · Amrit Singh Bedi (US Army Research Laboratory) · Csaba Szepesvari (DeepMind / University of Alberta) · Mengdi Wang (Princeton University)

43、Model-based  Reinforcement Learning  for Semi-Markov Decision Processes with Neural ODEs

Jianzhun Du (Harvard University) · Joseph Futoma (Harvard University) · Finale Doshi-Velez (Harvard)

44、DisCor: Corrective Feedback in  Reinforcement Learning  via Distribution Correction

Aviral Kumar (UC Berkeley) · Abhishek Gupta (University of California, Berkeley) · Sergey Levine (UC Berkeley)

45、Neurosymbolic  Reinforcement Learning  with Formally Verified Exploration

Greg Anderson (University of Texas at Austin) · Abhinav Verma (Rice University) · Isil Dillig (UT Austin) · Swarat Chaudhuri (The University of Texas at Austin)

46、Generalized Hindsight for  Reinforcement Learning

Alexander Li (UC Berkeley) · Lerrel Pinto (New York University) · Pieter Abbeel (UC Berkeley & covariant.ai)

47、Meta-Gradient  Reinforcement Learning  with an Objective Discovered Online

Zhongwen Xu (DeepMind) · Hado van Hasselt (DeepMind) · Matteo Hessel (Google DeepMind) · Junhyuk Oh (DeepMind) · Satinder Singh (DeepMind) · David Silver (DeepMind)

48、TorsionNet: A  Reinforcement Learning  Approach to Sequential Conformer Search

Tarun Gogineni (University of Michigan) · Ziping Xu (University of Michigan) · Exequiel Punzalan (University of Michigan) · Runxuan Jiang (University of Michigan) · Joshua Kammeraad (University of Michigan) · Ambuj Tewari (University of Michigan) · Paul Zimmerman (University of Michigan)

49、Learning to Dispatch for Job Shop Scheduling via Deep  Reinforcement Learning

Cong Zhang (Nanyang Technological University) · Wen Song (Institute of Marine Scinece and Technology, Shandong University) · Zhiguang Cao (National University of Singapore) · Jie Zhang (Nanyang Technological University) · Puay Siew Tan (SIMTECH) · Xu Chi (Singapore Institute of Manufacturing Technology, A-Star)

50、Is Plug-in Solver Sample-Efficient for Feature-based Reinforcement Learning?

Qiwen Cui (Peking University) · Lin Yang (UCLA)

51、Instance-based Generalization in  Reinforcement Learning

Martin Bertran (Duke University) · Natalia L Martinez (Duke University) · Mariano Phielipp (Intel AI Labs) · Guillermo Sapiro (Duke University)

52、Preference-based  Reinforcement Learning  with Finite-Time Guarantees

Yichong Xu (Carnegie Mellon University) · Ruosong Wang (Carnegie Mellon University) · Lin Yang (UCLA) · Aarti Singh (CMU) · Artur Dubrawski (Carnegie Mellon University)

53、Learning to Decode:  Reinforcement Learning  for Decoding of Sparse Graph-Based Channel Codes

Salman Habib (New Jersey Institute of Tech) · Allison Beemer (New Jersey Institute of Technology) · Joerg Kliewer (New Jersey Institute of Technology)

54、BAIL: Best-Action  Imitation Learning  for Batch Deep  Reinforcement Learning

Xinyue Chen (NYU Shanghai) · Zijian Zhou (NYU Shanghai) · Zheng Wang (NYU Shanghai) · Che Wang (New York University) · Yanqiu Wu (New York University) · Keith Ross (NYU Shanghai)

55、Task-Agnostic Online  Reinforcement Learning  with an Infinite Mixture of Gaussian Processes

Mengdi Xu (Carnegie Mellon University) · Wenhao Ding (Carnegie Mellon University) · Jiacheng Zhu (Carnegie Mellon University) · ZUXIN LIU (Carnegie Mellon University) · Baiming Chen (Tsinghua University) · Ding Zhao (Carnegie Mellon University)

56、On Reward-Free  Reinforcement Learning  with Linear Function Approximation

Ruosong Wang (Carnegie Mellon University) · Simon Du (Institute for Advanced Study) · Lin Yang (UCLA) · Russ Salakhutdinov (Carnegie Mellon University)

57、Near-Optimal  Reinforcement Learning  with Self-Play

Yu Bai (Salesforce Research) · Chi Jin (Princeton University) · Tiancheng Yu (MIT )

58、Robust Multi-Agent  Reinforcement Learning  with Model Uncertainty

Kaiqing Zhang (University of Illinois at Urbana-Champaign (UIUC)) · TAO SUN (Amazon.com) · Yunzhe Tao (Amazon Artificial Intelligence) · Sahika Genc (Amazon Artificial Intelligence) · Sunil Mallya (Amazon AWS) · Tamer Basar (University of Illinois at Urbana-Champaign)

59、Towards Minimax Optimal  Reinforcement Learning  in Factored Markov Decision Processes

Yi Tian (MIT) · Jian Qian (MIT) · Suvrit Sra (MIT)

60、Scalable Multi-Agent  Reinforcement Learning  for Networked Systems with Average Reward

Guannan Qu (California Institute of Technology) · Yiheng Lin (California Institute of Technology) · Adam Wierman (California Institute of Technology) · Na Li (Harvard University)

61、Constrained episodic  reinforcement learning  in concave-convex and knapsack settings

Kianté Brantley (The University of Maryland College Park) · Miro Dudik (Microsoft Research) · Thodoris Lykouris (Microsoft Research NYC) · Sobhan Miryoosefi (Princeton University) · Max Simchowitz (Berkeley) · Aleksandrs Slivkins (Microsoft Research) · Wen Sun (Microsoft Research NYC)

62、Sample Efficient  Reinforcement Learning  via Low-Rank Matrix Estimation

Devavrat Shah (Massachusetts Institute of Technology) · Dogyoon Song (Massachusetts Institute of Technology) · Zhi Xu (MIT) · Yuzhe Yang (MIT)

63、Trajectory-wise Multiple Choice Learning for Dynamics Generalization in Reinforcement Learning

Younggyo Seo (KAIST) · Kimin Lee (UC Berkeley) · Ignasi Clavera Gilaberte (UC Berkeley) · Thanard Kurutach (University of California Berkeley) · Jinwoo Shin (KAIST) · Pieter Abbeel (UC Berkeley & covariant.ai)

64、Cooperative Heterogeneous  Deep Reinforcement Learning

Han Zheng (UTS) · Pengfei Wei (National University of Singapore) · Jing Jiang (University of Technology Sydney) · Guodong Long (University of Technology Sydney (UTS)) · Qinghua Lu (Data61, CSIRO) · Chengqi Zhang (University of Technology Sydney)

65、Implicit Distributional  Reinforcement Learning

Yuguang Yue (University of Texas at Austin) · Zhendong Wang (University of Texas, Austin) · Mingyuan Zhou (University of Texas at Austin)

66、Efficient Exploration of Reward Functions in Inverse  Reinforcement Learning  via Bayesian Optimization

Sreejith Balakrishnan (National University of Singapore) · Quoc Phong Nguyen (National University of Singapore) · Bryan Kian Hsiang Low (National University of Singapore) · Harold Soh (National University Singapore)

67、EPOC: A Provably Correct Policy Gradient Approach to  Reinforcement Learning

Alekh Agarwal (Microsoft Research) · Mikael Henaff (Microsoft) · Sham Kakade (University of Washington) · Wen Sun (Microsoft Research NYC)

68、Provably Efficient  Reinforcement Learning  with Kernel and Neural Function Approximations

Zhuoran Yang (Princeton) · Chi Jin (Princeton)
