Research Repository

Automatic Methods for Coding Historical Occupation Descriptions to Standard Classifications

Kirby, G and Carson, J and Dunlop, F and Dibben, C and Dearle, A and Williamson, L and Garrett, EM and Reid, A (2015) 'Automatic Methods for Coding Historical Occupation Descriptions to Standard Classifications.' In: Bloothooft, G and Christen, P and Mandemakers, K and Schraagen, M (eds.) Population Reconstruction. Springer, pp. 43-60. ISBN 978-3-319-19883-5

Full text not available from this repository.


The increasing availability of digitised registration records presents a significant opportunity for research in many fields, including human geography, genealogy and medicine. Re-examining original records allows researchers to study relationships between factors such as occupation, cause of death, illness and geographic region. This can be facilitated by coding these factors to standard classifications. This chapter describes work to develop a method for automatically coding the occupations from 29 million Scottish birth, death and marriage records, containing around 50 million occupation descriptions, to standard classifications. A range of approaches using text processing and supervised machine learning is evaluated, achieving classification performance of 75% micro-precision/recall, 61% macro-precision and 66% macro-recall on a smaller test set. Further development that may be needed for classification of the full data set is discussed.
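
The difference between the micro- and macro-averaged figures quoted above can be illustrated with a minimal sketch. It assumes each occupation description receives exactly one true code and one predicted code; under that assumption micro-precision and micro-recall are both equal to overall accuracy, which would explain why they are reported as a single figure. The function name `precision_recall` and the example labels are hypothetical and are not taken from the chapter.

```python
from collections import Counter


def precision_recall(y_true, y_pred):
    """Micro- and macro-averaged precision/recall for single-label coding.

    Hypothetical helper for illustration; not the chapter's evaluation code.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def ratio(num, den):
        # Convention assumed here: a class with no predictions
        # (or no true instances) scores 0 rather than being skipped.
        return num / den if den else 0.0

    total_tp = sum(tp.values())
    return {
        # Micro: pool counts over all classes, so frequent codes dominate.
        "micro_precision": ratio(total_tp, total_tp + sum(fp.values())),
        "micro_recall": ratio(total_tp, total_tp + sum(fn.values())),
        # Macro: average per-class scores, so rare codes weigh equally.
        "macro_precision": sum(ratio(tp[c], tp[c] + fp[c]) for c in labels) / len(labels),
        "macro_recall": sum(ratio(tp[c], tp[c] + fn[c]) for c in labels) / len(labels),
    }


# Made-up example: three occupation classes, one of them rare.
y_true = ["farmer", "farmer", "farmer", "miner", "weaver"]
y_pred = ["farmer", "farmer", "miner", "miner", "farmer"]
scores = precision_recall(y_true, y_pred)
```

Because macro-averaging gives a rarely occurring occupation code the same weight as a common one, macro figures lower than the micro figure (as in the results above) typically indicate weaker performance on infrequent classes.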

Item Type: Book Section
Subjects: D History General and Old World > D History (General)
H Social Sciences > H Social Sciences (General)
H Social Sciences > HA Statistics
Divisions: Faculty of Humanities > History, Department of
Depositing User: Jim Jamieson
Date Deposited: 10 Aug 2015 09:09
Last Modified: 17 Aug 2017 17:34
