
Academic Seminar, No. 22

Title: Effect Size Heterogeneity Matters in High Dimensions

Speaker: Weijie Su, Assistant Professor, University of Pennsylvania

Host: Professor Jinyuan Chang, School of Statistics, Southwestern University of Finance and Economics

Time: 10:00-11:20, Friday, June 5, 2020

Live-streaming platform and meeting ID: Tencent Meeting, 872 707 851


Abstract:

In high-dimensional linear regression, would increasing true effect sizes always lead to better model selection, while the other conditions (such as the sparsity level) are held fixed? In this paper, we answer this question in the negative in a certain regime of sparsity for the Lasso method, by introducing a new notion we term effect size heterogeneity. Roughly speaking, a regression coefficient vector has high effect size heterogeneity if the nonzero entries of this vector have significantly different magnitudes, and vice versa. From the perspective of this new measure, we prove that in the regime of linear sparsity, false and true positive rates achieve the optimal trade-off uniformly along the Lasso path when this measure is maximal, in the sense that all nonzero effect sizes have very different magnitudes, and the worst-case trade-off is achieved when it is minimal, in the sense that all nonzero effect sizes are about equal. Moreover, we demonstrate that the Lasso path produces an optimal ranking of explanatory variables, measured by the rank of the first false variable, when the effect size heterogeneity is maximal, and vice versa. Taken together, the two findings suggest that effect size heterogeneity should serve as a complementary measure to the sparsity of regression coefficients in the analysis of high-dimensional regression problems. In the case of low effect size heterogeneity, variables with comparable effect sizes, no matter how large they are, would metaphorically compete with each other along the Lasso path, degrading the Lasso's variable selection performance. Our proofs use techniques from approximate message passing theory as well as a novel argument for estimating the rank of the first false variable.
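The abstract's central claim, that the Lasso ranks variables better when the nonzero effect sizes differ greatly in magnitude, can be probed in a small simulation. The sketch below is not part of the talk; the dimensions, signal strengths, and the use of scikit-learn's lasso_path are illustrative assumptions. It compares the average rank at which the first false variable enters the Lasso path for equal versus geometrically decaying coefficients.

```python
# Minimal simulation sketch (illustrative assumptions only, not the talk's experiments):
# compare where the first false variable enters the Lasso path under low vs. high
# effect size heterogeneity.
import numpy as np
from sklearn.linear_model import lasso_path

rng = np.random.default_rng(0)
n, p, k = 400, 1000, 20   # sample size, dimension, sparsity (assumed values)

def first_false_rank(beta_true, n_trials=5):
    """Average rank at which the first null variable enters the Lasso path."""
    ranks = []
    for _ in range(n_trials):
        X = rng.standard_normal((n, p)) / np.sqrt(n)
        y = X @ beta_true + rng.standard_normal(n)
        _, coefs, _ = lasso_path(X, y, n_alphas=200)   # coefs: (p, n_alphas), lambda decreasing
        entered, order = set(), []
        for j in range(coefs.shape[1]):                # walk down the path
            for var in np.flatnonzero(coefs[:, j]):
                if var not in entered:
                    entered.add(var)
                    order.append(var)
        # rank of the first variable outside the true support (fallback p if none enters)
        ranks.append(next((i + 1 for i, v in enumerate(order) if v >= k), p))
    return float(np.mean(ranks))

beta_low = np.zeros(p)
beta_low[:k] = 5.0                                     # equal magnitudes: low heterogeneity
beta_high = np.zeros(p)
beta_high[:k] = 5.0 * 2.0 ** -np.arange(k)             # decaying magnitudes: high heterogeneity

print("low heterogeneity, mean rank of first false variable :", first_false_rank(beta_low))
print("high heterogeneity, mean rank of first false variable:", first_false_rank(beta_high))
```

Under these assumed settings, the high-heterogeneity design typically lets many true variables enter the path before the first false one, while the equal-magnitude design admits a false variable much earlier, which is the qualitative phenomenon the talk formalizes.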


About the speaker:

Weijie Su is an Assistant Professor in the Wharton Statistics Department at the University of Pennsylvania, where he co-directs Penn Research in Machine Learning. Prior to joining Penn, he received his Ph.D. in Statistics from Stanford University in 2016 and his B.S. in Mathematics from Peking University in 2011. His research interests span high-dimensional statistics, mathematical optimization, privacy-preserving data analysis, multiple hypothesis testing, and deep learning theory. He is a recipient of the Stanford Theodore W. Anderson Dissertation Award in 2016, an NSF CAREER Award in 2019, and an Alfred P. Sloan Research Fellowship and a Facebook Faculty Research Award in 2020.

