Dai, Hongsheng and Pollock, Murray and Roberts, Gareth (2023) Bayesian Fusion: Scalable unification of distributed statistical analyses. Journal of the Royal Statistical Society Series B: Statistical Methodology, 85 (1). pp. 84-107. DOI https://doi.org/10.1093/jrsssb/qkac007
Abstract
There has recently been considerable interest in addressing the problem of unifying distributed statistical analyses into a single coherent inference. This problem naturally arises in a number of situations, including in big-data settings, when working under privacy constraints, and in Bayesian model choice. The majority of existing approaches have relied upon convenient *approximations* of the distributed analyses. Although typically computationally efficient, and readily scaling with respect to the number of analyses being unified, approximate approaches can have significant shortcomings: the quality of the inference can degrade rapidly with the number of analyses being unified, and the resulting inference can be substantially biased even when unifying a small number of analyses that do not concur. In contrast, the recent *Fusion* approach of Dai et al. (2019) is a rejection sampling scheme which is readily parallelisable and is *exact* (avoiding any form of approximation other than Monte Carlo error), albeit limited in applicability to unifying a small number of low-dimensional analyses. In this paper we introduce a practical *Bayesian Fusion* approach. We extend the theory underpinning the Fusion methodology and, by embedding it within a sequential Monte Carlo algorithm, we are able to recover the *correct* target distribution. By means of extensive guidance on the implementation of the approach, we demonstrate theoretically and empirically that Bayesian Fusion is robust to increasing numbers of analyses, and coherently unifies analyses which do not concur. This is achieved while being computationally competitive with approximate schemes.
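As a toy illustration of the unification problem described above (not the paper's algorithm), consider combining C independent analyses whose sub-posteriors are one-dimensional Gaussians: the unified target is proportional to the product of the sub-posterior densities, which for Gaussians is available in closed form via precision weighting. The function name and inputs below are illustrative only.

```python
import numpy as np

def fuse_gaussians(means, variances):
    """Mean and variance of the normalised product of 1-D Gaussian densities.

    The product of N(x; m_c, s_c^2) over c is itself Gaussian, with
    precision equal to the sum of the sub-posterior precisions and mean
    equal to the precision-weighted average of the sub-posterior means.
    """
    precisions = 1.0 / np.asarray(variances, dtype=float)
    fused_var = 1.0 / precisions.sum()
    fused_mean = fused_var * (np.asarray(means, dtype=float) * precisions).sum()
    return fused_mean, fused_var

# Three sub-posteriors that do not quite concur (different means):
m, v = fuse_gaussians([0.0, 1.0, 2.0], [1.0, 1.0, 1.0])
# With equal variances the fused mean is the simple average of the
# sub-posterior means, and the fused variance shrinks to 1/3.
```

Outside such conjugate special cases no closed form exists, which is what motivates Monte Carlo unification schemes such as Fusion and the sequential Monte Carlo construction introduced in this paper.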
| Item Type: | Article |
|---|---|
| Uncontrolled Keywords: | Bayesian inference; Distributed data; Fork-and-join; Langevin diffusion; Sequential Monte Carlo |
| Divisions: | Faculty of Science and Health > Mathematics, Statistics and Actuarial Science, School of |
| SWORD Depositor: | Unnamed user with email elements@essex.ac.uk |
| Depositing User: | Unnamed user with email elements@essex.ac.uk |
| Date Deposited: | 20 Feb 2023 15:35 |
| Last Modified: | 16 May 2024 21:37 |
| URI: | http://repository.essex.ac.uk/id/eprint/34292 |
Available files
Filename: qkac007.pdf
Licence: Creative Commons: Attribution 4.0