Doctoral Degrees (Statistics and Actuarial Science)
Permanent URI for this collection
Browse
Browsing Doctoral Degrees (Statistics and Actuarial Science) by Subject "Bayesian statistical decision theory"
Now showing 1 - 2 of 2
Results Per Page
Sort Options
- ItemBayesian approaches of Markov models embedded in unbalanced panel data(Stellenbosch : Stellenbosch University, 2012-12) Muller, Christoffel Joseph Brand; Mostert, Paul J.; Stellenbosch University. Faculty of Economic and Management Sciences. Dept. of Statistics and Actuarial Science.ENGLISH ABSTRACT: Multi-state models are used in this dissertation to model panel data, also known as longitudinal or cross-sectional time-series data. These are data sets which include units that are observed across two or more points in time. These models have been used extensively in medical studies where the disease states of patients are recorded over time. A theoretical overview of the current multi-state Markov models when applied to panel data is presented and based on this theory, a simulation procedure is developed to generate panel data sets for given Markov models. Through the use of this procedure a simulation study is undertaken to investigate the properties of the standard likelihood approach when fitting Markov models and then to assess its shortcomings. One of the main shortcomings highlighted by the simulation study, is the unstable estimates obtained by the standard likelihood models, especially when fitted to small data sets. A Bayesian approach is introduced to develop multi-state models that can overcome these unstable estimates by incorporating prior knowledge into the modelling process. Two Bayesian techniques are developed and presented, and their properties are assessed through the use of extensive simulation studies. Firstly, Bayesian multi-state models are developed by specifying prior distributions for the transition rates, constructing a likelihood using standard Markov theory and then obtaining the posterior distributions of the transition rates. A selected few priors are used in these models. Secondly, Bayesian multi-state imputation techniques are presented that make use of suitable prior information to impute missing observations in the panel data sets. Once imputed, standard likelihood-based Markov models are fitted to the imputed data sets to estimate the transition rates. Two different Bayesian imputation techniques are presented. The first approach makes use of the Dirichlet distribution and imputes the unknown states at all time points with missing observations. The second approach uses a Dirichlet process to estimate the time at which a transition occurred between two known observations and then a state is imputed at that estimated transition time. The simulation studies show that these Bayesian methods resulted in more stable results, even when small samples are available.
- ItemA framework for estimating risk(Stellenbosch : Stellenbosch University, 2008-03) Kroon, Rodney Stephen; Steel, S. J.; Stellenbosch University. Faculty of Economic and Management Sciences. Dept. of Statistics and Actuarial Science.We consider the problem of model assessment by risk estimation. Various approaches to risk estimation are considered in a uni ed framework. This a discussion of various complexity dimensions and approaches to obtaining bounds on covering numbers is also presented. The second type of training sample interval estimator discussed in the thesis is Rademacher bounds. These bounds use advanced concentration inequalities, so a chapter discussing such inequalities is provided. Our discussion of Rademacher bounds leads to the presentation of an alternative, slightly stronger, form of the core result used for deriving local Rademacher bounds, by avoiding a few unnecessary relaxations. Next, we turn to a discussion of PAC-Bayesian bounds. Using an approach developed by Olivier Catoni, we develop new PAC-Bayesian bounds based on results underlying Hoe ding's inequality. By utilizing Catoni's concept of \exchangeable priors", these results allowed the extension of a covering number-based result to averaging classi ers, as well as its corresponding algorithm- and data-dependent result. The last contribution of the thesis is the development of a more exible shell decomposition bound: by using Hoe ding's tail inequality rather than Hoe ding's relative entropy inequality, we extended the bound to general loss functions, allowed the use of an arbitrary number of bins, and introduced between-bin and within-bin \priors". Finally, to illustrate the calculation of these bounds, we applied some of them to the UCI spam classi cation problem, using decision trees and boosted stumps. framework is an extension of a decision-theoretic framework proposed by David Haussler. Point and interval estimation based on test samples and training samples is discussed, with interval estimators being classi ed based on the measure of deviation they attempt to bound. The main contribution of this thesis is in the realm of training sample interval estimators, particularly covering number-based and PAC-Bayesian interval estimators. The thesis discusses a number of approaches to obtaining such estimators. The rst type of training sample interval estimator to receive attention is estimators based on classical covering number arguments. A number of these estimators were generalized in various directions. Typical generalizations included: extension of results from misclassi cation loss to other loss functions; extending results to allow arbitrary ghost sample size; extending results to allow arbitrary scale in the relevant covering numbers; and extending results to allow arbitrary choice of in the use of symmetrization lemmas. These extensions were applied to covering number-based estimators for various measures of deviation, as well as for the special cases of misclassi - cation loss estimators, realizable case estimators, and margin bounds. Extended results were also provided for strati cation by (algorithm- and datadependent) complexity of the decision class. In order to facilitate application of these covering number-based bounds,