Extensions to Bayesian tree-based machine learning algorithms

Prado, Estevão B.

Prado, Estevão B. (2022) Extensions to Bayesian tree-based machine learning algorithms. PhD thesis, National University of Ireland Maynooth.

Available under License Creative Commons Attribution Non-commercial Share Alike.

Abstract

Bayesian additive regression trees (BART) is a Bayesian tree-based algorithm that can provide high predictive accuracy in both classification and regression problems. Unlike other machine learning algorithms based on an ensemble of trees, such as random forests and gradient boosting, BART does not grow its trees by greedy, purely algorithmic recursive partitioning. Rather, it is a fully Bayesian model built upon a likelihood function and carefully specified prior distributions. In this thesis, we propose methodological extensions to BART that address two main limitations of tree-based methods: the limited ability to fit smooth functions, which is inherent in how tree-based methods are built, and the lack of adequate mechanisms for quantifying, in an interpretable fashion, the impact of certain inputs of primary interest on the output. Firstly, we present an extension that deals with linear effects at the terminal-node level. By considering piecewise linear functions instead of piecewise constants, local linearities are captured more efficiently and fewer trees are required to achieve equal or better performance than BART. Secondly, motivated by an agricultural application, we develop a semi-parametric BART model in which marginal genotype and environment effects are estimated along with their interactions. Lastly, motivated by data collected in 2019 under the seventh cycle of the quadrennial Trends in International Mathematics and Science Study (TIMSS), we extend semi-parametric models based on BART, which generally assume that the sets of covariates in the linear predictor and in the BART component are mutually exclusive, to account for shared covariates. In particular, we change the tree-generation moves in BART to deal with bias/confounding between the parametric and non-parametric components, even when they have covariates in common.
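
The contrast between piecewise-constant and piecewise-linear terminal nodes can be illustrated with a small sketch. This is not the algorithm developed in the thesis: it uses a single tree with one fixed split, ordinary least squares in place of priors and posterior sampling, and simulated data. All function names, the split point, and the data-generating function are assumptions made for illustration only.

```python
import random


def fit_constant(xs, ys):
    """Leaf model that always predicts the leaf mean (piecewise-constant leaf)."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_linear(xs, ys):
    """Leaf model that predicts with a least-squares line (piecewise-linear leaf)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return lambda x: mean_y + slope * (x - mean_x)


def fit_stump(xs, ys, split, fit_leaf):
    """Fit one leaf model on each side of a fixed split point (hypothetical helper)."""
    left = [(x, y) for x, y in zip(xs, ys) if x <= split]
    right = [(x, y) for x, y in zip(xs, ys) if x > split]
    left_model = fit_leaf(*zip(*left))
    right_model = fit_leaf(*zip(*right))
    return lambda x: left_model(x) if x <= split else right_model(x)


def true_function(x):
    """Assumed ground truth: linear on each side of 0.5, with different slopes."""
    return 2.0 * x if x <= 0.5 else 1.0 - 3.0 * (x - 0.5)


def mse(model, xs):
    """Mean squared error of the model against the noise-free truth."""
    return sum((model(x) - true_function(x)) ** 2 for x in xs) / len(xs)


# Simulated data: 200 uniform inputs with Gaussian noise on the response.
random.seed(0)
xs = [random.random() for _ in range(200)]
ys = [true_function(x) + random.gauss(0.0, 0.05) for x in xs]

mse_constant = mse(fit_stump(xs, ys, 0.5, fit_constant), xs)
mse_linear = mse(fit_stump(xs, ys, 0.5, fit_linear), xs)
print(f"constant leaves: {mse_constant:.4f}, linear leaves: {mse_linear:.4f}")
```

With the same single split, the linear leaves recover the local slopes that the constant leaves can only approximate by a step, which is the intuition behind needing fewer trees when the terminal nodes carry linear predictors.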

Item Type:	Thesis (PhD)
Keywords:	Extensions; Bayesian; tree-based; machine learning; algorithms
Academic Unit:	Faculty of Science and Engineering > Research Institutes > Hamilton Institute
Item ID:	17285
Depositing User:	IR eTheses
Date Deposited:	06 Jun 2023 15:02
Use Licence:	This item is available under a Creative Commons Attribution Non Commercial Share Alike Licence (CC BY-NC-SA).

MURAL - Maynooth University Research Archive Library
