Talk title: Non-nested model selection based on empirical likelihood ratio tests
Abstract: We propose an empirical likelihood ratio (ELR) test for comparing any two supervised learning models, which may be nested, non-nested, overlapping, mis-specified, or correctly specified. The test compares the prediction losses of the models based on cross-validation and allows for heteroscedasticity of the errors. We establish the asymptotic null and alternative distributions of the ELR test for comparing two nonparametric learning models under a general framework of convex loss functions. However, the cross-validated prediction losses require repeatedly fitting the models with one observation left out, which leads to a heavy computational burden. We therefore introduce an easy-to-implement ELR test that requires fitting the models only once and shares the same asymptotics as the original test. The proposed tests are applied to compare additive models with varying-coefficient models. Furthermore, a scalable distributed ELR test is proposed for testing the importance of a group of variables in possibly mis-specified additive models with massive data. It is shown that the distributed ELR test performs the same as the ideal ELR test computed with the full data on one machine. Simulations show that the proposed tests work well and have favorable finite-sample performance compared with some existing approaches. The methodology is validated in an empirical application.
Speaker bio: Jiancheng Jiang is a professor in the Department of Mathematics and Statistics at the University of North Carolina at Charlotte, USA. His research covers biostatistics, financial econometrics, nonparametric statistics, and data science. He has published more than 50 papers in leading international statistics journals, including the Annals of Statistics, Biometrika, the Journal of the American Statistical Association, and the Journal of the Royal Statistical Society, and serves as an associate editor of Statistica Sinica and other journals. (Dr. Jiancheng Jiang is a professor of statistics at the University of North Carolina at Charlotte, USA. He was appointed chair professor at Nankai University from 2017 to 2020, and since 2017 he has served as the statistics program coordinator at UNC Charlotte and as an associate editor of Statistica Sinica and other journals. He has been awarded several NSF/NIH grants since 2004. His research interests range from (bio)statistics to econometrics and data science.)