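Title: Robust Estimation and Inference under Huber’s Contamination Model
Speaker: Zhao Ren, Associate Professor, University of Pittsburgh
Host: Jinyuan Chang, Professor, School of Statistics
Time: Friday, March 26, 2021, 10:00-11:00 a.m.
Live platform and meeting ID: Tencent Meeting, 537 472 552
Abstract: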
This talk describes some new challenges and results in statistical inference for regression and nonparametric estimation under the celebrated Huber’s contamination model, with a focus on the influence of contamination on the minimax rates.
In the first part of the talk, we study the robust estimation and inference problem for linear models in the increasing dimension regime. Given random design, we consider the setting in which the conditional distributions of the error terms are contaminated by an arbitrary distribution (possibly depending on the covariates) with proportion ε, and are otherwise allowed to be heavy-tailed and asymmetric. We show that simple robust M-estimators such as Huber and smoothed Huber, with an additional intercept included in the model, achieve the minimax rates of convergence under the ℓ2 loss. In addition, two types of confidence intervals with root-n consistency are provided by a multiplier bootstrap technique when the necessary condition on the contamination proportion, ε = o(1/√n), holds. For larger ε, we further propose a debiasing procedure to reduce the potential bias caused by contamination, and prove the validity of the debiased confidence interval. Our method extends in a straightforward way to the communication-efficient distributed estimation and inference setting.
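As a rough illustration of the setting described above, and not the speaker's actual procedure, the following Python sketch simulates a linear model whose errors are contaminated with proportion ε and fits a Huber M-estimator with an intercept by iteratively reweighted least squares. The function names `simulate` and `huber_fit`, the contaminating distribution, the sample sizes, and the fixed threshold `tau = 1.345` are all assumptions chosen for the example; the abstract does not specify how the tuning parameter is selected.

```python
import numpy as np

def simulate(n=2000, d=5, eps=0.1, seed=0):
    """Linear model with Huber-contaminated errors (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))              # random design
    slopes = np.arange(1, d + 1, dtype=float)    # true slope coefficients
    noise = rng.standard_normal(n)               # clean N(0, 1) errors
    bad = rng.random(n) < eps                    # contaminated with prob. eps
    # Assumed contaminating distribution; it depends on the covariates.
    noise[bad] = 20.0 + 5.0 * np.abs(X[bad, 0])
    return X, X @ slopes + noise, slopes

def huber_fit(X, y, tau=1.345, n_iter=200, tol=1e-10):
    """Huber M-estimator with intercept, via iteratively reweighted LS.

    tau is fixed here because the clean noise has unit scale in this
    simulation; in practice it would need to be tuned.
    """
    n = X.shape[0]
    Z = np.column_stack([np.ones(n), X])         # add intercept column
    beta = np.linalg.lstsq(Z, y, rcond=None)[0]  # start from OLS
    for _ in range(n_iter):
        r = y - Z @ beta
        # Huber weights: 1 if |r| <= tau, tau / |r| otherwise.
        w = np.minimum(1.0, tau / np.maximum(np.abs(r), 1e-12))
        sw = np.sqrt(w)
        beta_new = np.linalg.lstsq(Z * sw[:, None], y * sw, rcond=None)[0]
        done = np.linalg.norm(beta_new - beta) < tol
        beta = beta_new
        if done:
            break
    return beta                                  # beta[0] is the intercept

X, y, slopes = simulate()
Z = np.column_stack([np.ones(len(y)), X])
ols = np.linalg.lstsq(Z, y, rcond=None)[0]
hub = huber_fit(X, y)

# l2 error of the slope estimates (intercept excluded).
ols_err = np.linalg.norm(ols[1:] - slopes)
huber_err = np.linalg.norm(hub[1:] - slopes)
print(f"OLS slope error:   {ols_err:.3f}")
print(f"Huber slope error: {huber_err:.3f}")
```

In this sketch the intercept absorbs the shift induced by the one-sided contamination, so the slope estimates are compared without it. The bootstrap confidence intervals and the debiasing step mentioned in the abstract are not reproduced here.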
In the second part of the talk, we address the problem of density function estimation in R^d under L_p losses (1 ≤ p < ∞) with contaminated data. We investigate the effects of the contamination proportion ε, among other key quantities, on the corresponding minimax rates of convergence for both structured and unstructured contamination over a scale of anisotropic Nikol’skii classes: for structured contamination, ε always appears linearly in the optimal rates, while for unstructured contamination, the leading term of the optimal rate involving ε also depends on the smoothness of the target density class and the specific loss function. The corresponding adaptation theory is also investigated by establishing L_p risk oracle inequalities via novel Goldenshluger-Lepski-type methods. An interesting feature is that in certain situations adaptive estimation can become a much harder task in the presence of contamination.
Based on joint work with Wen-Xin Zhou and Peiliang Zhang.
Speaker bio:
Zhao Ren is an Associate Professor in the Department of Statistics at the University of Pittsburgh. Prior to joining Pitt, Dr. Ren obtained his Ph.D. in Statistics from Yale University in 2014. He is broadly interested in high-dimensional statistical inference, covariance/precision matrix estimation, graphical models, robust statistics, statistical machine learning, nonparametric function estimation, and applications in statistical genomics.