Research Repository

Bayesian Fusion: Scalable unification of distributed statistical analyses

Dai, Hongsheng and Pollock, Murray and Roberts, Gareth (2020) Bayesian Fusion: Scalable unification of distributed statistical analyses. Working Paper. Under review. (Submitted)

Abstract

There has recently been considerable interest in addressing the problem of unifying distributed statistical analyses into a single coherent inference. This problem naturally arises in a number of situations, including in big-data settings, when working under privacy constraints, and in Bayesian model choice. The majority of existing approaches have relied upon convenient approximations of the distributed analyses. Although typically computationally efficient, and readily scaling with respect to the number of analyses being unified, approximate approaches can have significant shortcomings: the quality of the inference can degrade rapidly with the number of analyses being unified, and the inference can be substantially biased even when unifying a small number of analyses that do not concur. In contrast, the recent Fusion approach of Dai et al. (2019) is a rejection sampling scheme which is readily parallelisable and is exact (avoiding any form of approximation other than Monte Carlo error), albeit limited in applicability to unifying a small number of low-dimensional analyses. In this paper we introduce a practical Bayesian Fusion approach. We extend the theory underpinning the Fusion methodology and, by embedding it within a sequential Monte Carlo algorithm, we are able to recover the correct target distribution. Alongside extensive guidance on the implementation of the approach, we demonstrate theoretically and empirically that Bayesian Fusion is robust to increasing numbers of analyses and coherently unifies analyses which do not concur. This is achieved while being computationally competitive with approximate schemes.
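
For orientation, the unification problem described in the abstract is usually stated as sampling from a product of densities. The notation below (C analyses, sub-posterior densities f_c, parameter x of dimension d) is an assumption introduced here for illustration and does not appear in the abstract itself:

```latex
% Fusion target: the unified density is proportional to the product of
% the C distributed (sub-posterior) densities, each defined on the same
% parameter x in R^d. C, f_c and d are illustrative symbols.
\[
  f(x) \;\propto\; \prod_{c=1}^{C} f_c(x), \qquad x \in \mathbb{R}^d .
\]
```

Under this reading, an exact method draws samples from f itself up to Monte Carlo error, whereas an approximate method replaces each f_c (or the product) with a more convenient surrogate.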

Item Type: Monograph (Working Paper)
Uncontrolled Keywords: Big Data, Distributed Analysis, Parallel Computation, Path-space Rejection Sampling, Sequential Monte Carlo, Unification Monte Carlo
Divisions: Faculty of Science and Health > Mathematical Sciences, Department of
Depositing User: Elements
Date Deposited: 20 Nov 2019 11:56
Last Modified: 08 Feb 2021 15:15
URI: http://repository.essex.ac.uk/id/eprint/25975
