Talk title: A decentralized parameter fusion for heterogeneous scattered data
Speaker: Prof. Wenxuan Zhong (Department of Statistics, University of Georgia, USA)
This talk focuses on the problem of parametric data fusion for scattered data, i.e., data collected and stored in local data centers. The problem is challenging owing to two distinguishing features of scattered data: (1) each data center can communicate only with its neighboring data centers; (2) the data have heterogeneous distributions across local centers. Most existing methods for scattered data do not take this heterogeneity into account. In addition, their performance relies heavily on the assumption that the models across all data centers are identical. Our empirical studies demonstrate that these methods perform unsatisfactorily when this assumption is invalid and/or data heterogeneity is present. To surmount these challenges, we propose a general statistical model that accommodates across-center heterogeneity through center-specific models and integrates the center-specific models through common parameters. We estimate the parameters of the model by combining the idea of minimum average variance estimation with decentralized computing. The proposed algorithm, named Decentralized Parametric Data Fusion (D-PDF), adaptively fuses the parameters of all the center-specific models without requiring the explicit form of these models. We establish theoretical guarantees on the convergence and efficiency of the proposed algorithm. We evaluate the method empirically and observe its benefits on a variety of synthetic and real-world examples.