Dai, Hongsheng and Pollock, Murray and Roberts, Gareth (2023) Bayesian Fusion: Scalable unification of distributed statistical analyses. Journal of the Royal Statistical Society Series B: Statistical Methodology, 85 (1). pp. 84-107. DOI https://doi.org/10.1093/jrsssb/qkac007
Abstract
There has recently been considerable interest in addressing the problem of unifying distributed statistical analyses into a single coherent inference. This problem naturally arises in a number of situations, including in big-data settings, when working under privacy constraints, and in Bayesian model choice. The majority of existing approaches have relied upon convenient *approximations* of the distributed analyses. Although typically computationally efficient, and readily scaling with respect to the number of analyses being unified, approximate approaches can have significant shortcomings: the quality of the inference can degrade rapidly with the number of analyses being unified, and the resulting inference can be substantially biased even when unifying a small number of analyses that do not concur. In contrast, the recent *Fusion* approach of Dai et al. (2019) is a rejection sampling scheme which is readily parallelisable and is *exact* (avoiding any form of approximation other than Monte Carlo error), albeit limited in applicability to unifying a small number of low-dimensional analyses. In this paper we introduce a practical *Bayesian Fusion* approach. We extend the theory underpinning the Fusion methodology and, by embedding it within a sequential Monte Carlo algorithm, we are able to recover the *correct* target distribution. By means of extensive guidance on the implementation of the approach, we demonstrate theoretically and empirically that Bayesian Fusion is robust to increasing numbers of analyses, and coherently unifies analyses which do not concur. This is achieved while being computationally competitive with approximate schemes.
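As a toy illustration of the unification problem described above (not the paper's algorithm), consider combining C independent analyses whose sub-posteriors are one-dimensional Gaussians: the unified target is proportional to the product of the sub-posterior densities, which for Gaussians is available in closed form via precision weighting. The function name and inputs below are illustrative only.

```python
import numpy as np

def fuse_gaussians(means, variances):
    """Mean and variance of the normalised product of 1-D Gaussian densities.

    The product of N(x; m_c, s_c^2) over c is itself Gaussian, with
    precision equal to the sum of the sub-posterior precisions and mean
    equal to the precision-weighted average of the sub-posterior means.
    """
    precisions = 1.0 / np.asarray(variances, dtype=float)
    fused_var = 1.0 / precisions.sum()
    fused_mean = fused_var * (np.asarray(means, dtype=float) * precisions).sum()
    return fused_mean, fused_var

# Three sub-posteriors that do not quite concur (different means):
m, v = fuse_gaussians([0.0, 1.0, 2.0], [1.0, 1.0, 1.0])
# With equal variances the fused mean is the simple average of the
# sub-posterior means, and the fused variance shrinks to 1/3.
```

Outside such conjugate special cases no closed form exists, which is what motivates Monte Carlo unification schemes such as Fusion and the sequential Monte Carlo construction introduced in this paper.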
| Item Type: | Article |
|---|---|
| Uncontrolled Keywords: | Bayesian inference; Distributed data; Fork-and-join; Langevin diffusion; Sequential Monte Carlo |
| Divisions: | Faculty of Science and Health > Mathematics, Statistics and Actuarial Science, School of |
| SWORD Depositor: | Unnamed user with email elements@essex.ac.uk |
| Depositing User: | Unnamed user with email elements@essex.ac.uk |
| Date Deposited: | 20 Feb 2023 15:35 |
| Last Modified: | 16 May 2024 21:37 |
| URI: | http://repository.essex.ac.uk/id/eprint/34292 |
Available files
Filename: qkac007.pdf
Licence: Creative Commons: Attribution 4.0