Research Repository

Correlates of record linkage and estimating risks of non-linkage biases in business data sets

Moore, Jamie C and Smith, Peter WF and Durrant, Gabriele B (2018) 'Correlates of record linkage and estimating risks of non-linkage biases in business data sets.' Journal of the Royal Statistical Society: Series A (Statistics in Society), 181 (4). 1211 - 1230. ISSN 0964-1998

Moore et al. 2018b.pdf - Published Version
Available under License Creative Commons Attribution.

Abstract

Researchers often utilize data sets that link information from multiple sources, but non‐linkage biases caused by linked and non‐linked subject differences are little understood, especially in business data sets. We address these knowledge gaps by studying biases in linkable 2010 UK Small Business Survey data sets. We identify correlates of business linkage propensity, and also for the first time its components: consent to linkage and register identifier appendability. As well, we take a novel approach to evaluating non‐linkage bias risks, by computing data set representativeness indicators (comparable, decomposable sample subset similarity measures). We find that the main impacts on linkage propensities and bias risks are due to consenter–non‐consenter differences explicable given business survey response processes, and differences between subjects with and without identifiers caused by register undercoverage of very small businesses. We then discuss consequences for the analysis of linked business data sets, and implications of the evaluation methods we introduce for linked data set producers and users.

Item Type: Article
Divisions: Faculty of Social Sciences > Institute for Social and Economic Research
Depositing User: Elements
Date Deposited: 22 Nov 2019 15:16
Last Modified: 22 Nov 2019 16:15
URI: http://repository.essex.ac.uk/id/eprint/25983
