Dai, Hongsheng and Pollock, Murray and Roberts, Gareth (2020) Bayesian Fusion: Scalable unification of distributed statistical analyses. Working Paper. under review. (Submitted)
Abstract
There has recently been considerable interest in addressing the problem of unifying distributed statistical analyses into a single coherent inference. This problem naturally arises in a number of situations, including in big-data settings, when working under privacy constraints, and in Bayesian model choice. The majority of existing approaches have relied upon convenient approximations of the distributed analyses. Although typically computationally efficient, and readily scaling with respect to the number of analyses being unified, approximate approaches can have significant shortcomings -- the quality of the inference can degrade rapidly with the number of analyses being unified, and can be substantially biased even when unifying a small number of analyses that do not concur. In contrast, the recent Fusion approach of Dai et al. (2019) is a rejection sampling scheme which is readily parallelisable and is exact (avoiding any form of approximation other than Monte Carlo error), albeit limited in applicability to unifying a small number of low-dimensional analyses. In this paper we introduce a practical Bayesian Fusion approach. We extend the theory underpinning the Fusion methodology and, by embedding it within a sequential Monte Carlo algorithm, we are able to recover the correct target distribution. By means of extensive guidance on the implementation of the approach, we demonstrate theoretically and empirically that Bayesian Fusion is robust to increasing numbers of analyses and coherently unifies analyses which do not concur. This is achieved while being computationally competitive with approximate schemes.
Item Type: | Monograph (Working Paper) |
---|---|
Uncontrolled Keywords: | Big Data; Distributed Analysis; Parallel Computation; Path-space Rejection Sampling; Sequential Monte Carlo; Unification Monte Carlo |
Divisions: | Faculty of Science and Health; Faculty of Science and Health > Mathematics, Statistics and Actuarial Science, School of |
SWORD Depositor: | Unnamed user with email elements@essex.ac.uk |
Depositing User: | Unnamed user with email elements@essex.ac.uk |
Date Deposited: | 20 Nov 2019 11:56 |
Last Modified: | 16 May 2024 20:05 |
URI: | http://repository.essex.ac.uk/id/eprint/25975 |
Available files
Filename: Bayesian_Fusion_arxiv_version.pdf