Machlanski, Damian (2024) Understanding hyperparameters in machine learning for causal estimation from observational data. Doctoral thesis, University of Essex.
Abstract
Causal analysis is fundamental to science and decision-making. It unravels the structure of the process underlying the data and estimates the effectiveness of interventions. Deriving causal notions from randomised experiments is well understood and relatively simple analytically. However, there is also considerable interest in the analysis of far more widely available non-experimental observational data, which is substantially more challenging. Causal analysis has been studied extensively in the statistics literature and, more recently, by computer scientists as well, since the robustness and performance of existing statistical approaches to causal estimation can be improved by machine learning (ML). However, despite growing evidence that hyperparameters of ML methods are critical for strong predictive performance, this aspect is neglected in the causality literature. To make matters worse, the use of observational data, while convenient due to their availability, creates serious obstacles for model evaluation and tuning. The relationship between hyperparameters and model performance is understudied in ML, let alone in significantly more challenging causal settings. This work fills this gap by investigating the interplay of challenges posed by causal settings, ML methods, hyperparameter selection, and estimation performance, all within the context of two causal estimation tasks: treatment effect estimation and causal structure learning. This direction has led to several original contributions, including a novel estimation method and the first extensive performance evaluations conducted from the perspective of hyperparameters. The results support the central claim of this thesis: hyperparameters play a pivotal role in causal estimation performance, but crucially, their optimisation is significantly limited by incomplete observations within observational causal data. Consequently, this work calls for more careful treatment of hyperparameters in practice and more fundamental research into causal hyperparameter optimisation to harness the full potential of ML in causal estimation.
Item Type: | Thesis (Doctoral) |
---|---|
Uncontrolled Keywords: | causality; causal inference; causal discovery; machine learning; observational data; hyperparameters |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Divisions: | Faculty of Science and Health > Computer Science and Electronic Engineering, School of |
Depositing User: | Damian Machlanski |
Date Deposited: | 30 Sep 2024 13:26 |
Last Modified: | 30 Sep 2024 13:26 |
URI: | http://repository.essex.ac.uk/id/eprint/39288 |
Available files
Filename: MACHLANSKI Thesis (final).pdf