
Bernoulli Society East Asia and Pacific Region Seminar Series | Statistical Inference for Intelligent Robots

Published: 2022-03-09

Time:

March 10, 2022, 08:00-10:00 (London time, Thursday)

March 10, 2022, 16:00-18:00 (Beijing time, Thursday)

Tencent Meeting ID: 874-650-891


Speaker 1: Chengchun Shi (史成春)

Talk title: Statistical inference in reinforcement learning


About the speaker:

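Chengchun Shi is an Assistant Professor in the Department of Statistics at the London School of Economics and Political Science. More than ten of his first-author, peer-reviewed papers have been accepted by top statistics journals (AOS, JRSSB, and JASA), and he has also published papers at top machine learning conferences (ICML and NeurIPS). Since 2022 he has served as an associate editor of JRSSB and the Journal of Nonparametric Statistics. His current research focuses on developing statistical learning methods for reinforcement learning and complex data. He received the Royal Statistical Society Research Prize in 2021 and has won the IMS Travel Award in two consecutive years.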


Abstract:

Reinforcement learning (RL) is concerned with how intelligent agents take actions in a given environment to maximize the cumulative reward they receive. In healthcare, applying RL algorithms could help patients improve their health status. On ride-sharing platforms, applying RL algorithms could increase drivers' income and customer satisfaction. RL has arguably been one of the most vibrant research frontiers in machine learning over the last few years. Nevertheless, statistics as a field, as opposed to computer science, has only recently begun to engage with reinforcement learning both in depth and in breadth. In today's talk, I will discuss some of my recent work on developing statistical inferential tools for reinforcement learning, with applications to mobile health and ride-sharing companies. The talk will cover several papers published in highly ranked statistical journals (JASA and JRSSB) and a top machine learning conference (ICML).



Speaker 2: Xiaodong Yan (严晓东)

Talk title: Statistical inference for reinforcement learning from the perspective of two-armed bandit process


About the speaker:

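Xiaodong Yan is a Future Scholar of Shandong University and an associate research fellow at the Institute for Financial Studies, Shandong University. He earned his Ph.D. through a joint training program between Yunnan University and The Hong Kong Polytechnic University, and was a postdoctoral fellow at the University of Alberta, Canada. He is a council member of the High-Dimensional Data Statistics Branch of the Chinese Association for Applied Statistics, executive deputy secretary-general of the Shandong Provincial Big Data Program Development Committee, deputy secretary-general of the Shandong Applied Statistics Society, and a member of the first group of provincial policy-based agricultural insurance consulting experts appointed by the Shandong Provincial Department of Finance. He has published nearly 20 papers in leading international journals (AOS, JASA, and JOE) and other high-level journals (IJOF, Statistica Sinica, and JMA), and received the 2020 Yunnan Province Outstanding Doctoral Dissertation Award. He currently leads projects funded by the National Natural Science Foundation of China and by provincial natural science and social science foundations.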


Abstract:

Motivated by the study of the asymptotic behaviour of the two-armed bandit problem, we obtain several nonlinear central limit theorems whose limits are identified explicitly and depend heavily on the structure of the events or the integrating functions. This demonstrates the key signature of the nonlinear structure. It also lays the theoretical foundation for statistical inference in determining the arm that offers a higher chance of reward. This presentation also proposes a strategic sampling procedure to construct a test statistic for treatment effects and employs nonlinear limit theory to study its asymptotic behaviour, which we refer to as the strategic central limit theorem (strategic CLT). We also provide a general strategic sampling-based bootstrap to recover the limit distribution of the proposed statistic, making it usable on observational datasets and extensible to other hypothesis tests. The theoretical results yield the explicit density function of the limit distribution, known as the spike distribution, which is more sharply peaked than the standard normal density. Simulation studies provide supporting evidence that the proposed spike statistic performs well in finite samples and is especially powerful when the sample size is small. A real data example is provided for illustration.


All faculty and students are warmly welcome to attend!

