Masters Degrees (Statistics and Actuarial Science)
Browsing Masters Degrees (Statistics and Actuarial Science) by Subject "Biplots"
- An application of copulas to improve PCA biplots for multivariate extremes (Stellenbosch : Stellenbosch University, 2018-12) Perrang, Justin; Van der Merwe, Carel Johannes; Stellenbosch University. Faculty of Economic and Management Sciences. Dept. of Statistics and Actuarial Science. ENGLISH SUMMARY: Principal Component Analysis (PCA) biplots are a valuable means of visualising high-dimensional data. The application of PCA biplots to multivariate data across a wide variety of research areas is well documented. However, the application of biplots to financial data is limited. This is partly because PCA is an inadequate means of dimension reduction for multivariate data that is subject to extremes, and extreme observations are common in financial data. Hence, the purpose of this research is to develop a method that adapts PCA biplots to multivariate data containing extreme observations. This is achieved by fitting an elliptical copula to the data and deriving a correlation matrix from the copula parameters. The copula parameters are estimated from the extreme observations only, so the derived correlation matrices capture the dependencies among extreme observations. Finally, applying PCA to such an “extremal” correlation matrix preserves the relationships underlying the extremes more efficiently, and a more refined PCA biplot can be constructed.
- An application of geometric data analysis techniques to South African crime data (Stellenbosch : Stellenbosch University, 2016-12) Gurr, Benjamin William; Le Roux, Niel J.; Stellenbosch University. Faculty of Economic and Management Sciences. Dept. of Statistics & Actuarial Science. ENGLISH SUMMARY: Due to the high levels of violent crime in South Africa, improved methods of analysis are required in order to better scrutinize these statistics. This study diverges from traditional multivariate data analysis and provides alternative methods for analyzing crime data in South Africa. It explores the application of several geometric data analysis (GDA) methods to the study of crime in South Africa, including correspondence analysis, the correspondence analysis biplot, and the log-ratio biplot. Chapter 1 discusses the importance of data visualization in modern-day statistics, as well as geometric data analysis and its role as a multivariate analytical tool. Chapter 2 provides the motivation for the choice of subject matter. As South Africa is recognized as having the eighth highest homicide rate in the world, along with a generally high level of violent crime, the analysis is conducted on reported violent crime statistics in South Africa. Possible data collection challenges are also discussed in Chapter 2. The study is conducted on the violent crime statistics in South Africa for the 2004-2013 reporting period, the structure and details of which are discussed in Chapter 3. For the results of this study to be comparable, it is imperative that all crimes included are well defined. Chapter 3 therefore places a large emphasis on stating the exact definitions of the various crimes used in this study, as recorded by the South African Police Service. The more common approaches to graphically representing crime data in South Africa are explored in Chapter 4. 
Chapter 4 also marks the beginning of the analysis of the South African crime data for the 2004-2013 reporting period. Univariate graphical techniques (line graphs and bar plots) are used to analyze the data. However, as is to be expected, they are hampered by serious limitations. In an attempt to improve on the analysis, focus is shifted to geometric data analysis techniques. The general methodologies of correspondence analysis, biplots, and correspondence analysis biplots are discussed in Chapter 5, covering both the algorithms and the construction of the associated figures. These methodologies are applied in Chapter 6, and the results suggest some improvement upon those of Chapter 4. These techniques provide a geometric setting where both the crimes and the provinces can be represented in a single diagram, and where the relationships between both sets of variables can be analyzed. The correspondence analysis biplot proved to have some advantages over the correspondence analysis maps, as it can display numerous metrics, provides multiple calibrated axes, and allows for greater manipulation of the figure itself. Chapter 7 introduces the concept of compositional data and the log-ratio biplot. The log-ratio biplot combines the functionality of the biplot with a measure of comparability expressed as a ratio. It proved useful in the analysis of the South African crime data because it expresses differences on a ratio scale as multiplicative differences. Additionally, log-ratio analysis has the property of being sub-compositionally coherent. Chapter 8 provides the summary and conclusions of this study. It was found that Gauteng categorically has the largest number of reported violent crimes over the reported period (2004-2013). 
However, the Western Cape proved to have the highest violent crime rates per capita of all the South African provinces. It was noted that over the past decade South Africa has experienced a downward trend in the number of reported murders. However, there has been a spike in the number of reported cases of murder in more recent years. This spike is mostly driven by the large increases in reported murder cases in the Western Cape, Gauteng and KwaZulu-Natal. The most notable trend seen in the South African crime data is the rapid increase in the number of reported cases of drug-related crime over the reported period across all provinces, but more noticeably in the Western Cape and Gauteng. On the whole, a majority of the South African provinces share similar violent crime profiles; however, Gauteng and the Western Cape deviate from the other provinces. This is due to Gauteng’s high association with robbery with aggravating circumstances and the Western Cape’s high association with drug-related crime. This study presents some evidence that the use of geometric data analysis techniques provides an improvement upon traditional reporting methods for the South African crime data. Geometric data analysis and its related methods should thus form an integral part of any study conducted into the topic at hand.
- Categorical CVA biplots (Stellenbosch : Stellenbosch University, 2020-12) Rodwell, David Timothy; Van der Merwe, Carel Johannes; Stellenbosch University. Faculty of Economic and Management Sciences. Dept. of Statistics and Actuarial Science. ENGLISH ABSTRACT: In the modern era a great amount of emphasis is placed on data visualisation, especially in cases where a large amount of data is present. Usually, in these instances, the data is of a high-dimensional nature and cannot be visualised using conventional means. Fortunately, there has been a recent surge in the use of biplots to visualise multivariate data, where a biplot can be described as a generalisation of a scatterplot. Moreover, these biplots use dimension reduction techniques to construct a two-dimensional representation of the data with non-orthogonal axes. However, at present there is no effective biplot construction technique that adequately separates classes when categorical data is present. Hence, this research builds upon an existing biplot construction technique by using elements from Canonical Variate Analysis (CVA) and non-linear Principal Component Analysis (PCA) to develop a technique that can perform class separation when both numerical and categorical data are present. This novel biplot construction methodology forms the crux of this research assignment. Subsequently, the feasibility of this method was explored by considering the well-known Iris data set, where two variables are binned to form categorical variables. It is shown that this novel method improves upon existing biplot construction in terms of classification accuracy and class separation. However, it is noted that this method can be extended by incorporating CVA in the iterative algorithm which solves for the optimal categorical level scores. A web-based Shiny application was built as a supplement to this paper, and can be found at https://davidrodwell.shinyapps.io/CategoricalCVABiplotApp/. 
Here the user can interact with the data sets, proposed methodology, and functionalities presented in this research.
- Estimating expected exposure profiles using biplot interpolation (Stellenbosch : Stellenbosch University, 2020-03) Van Niekerk, Francesca; Van der Merwe, Carel Johannes; Stellenbosch University. Faculty of Economic and Management Sciences. Dept. of Statistics and Actuarial Science. ENGLISH SUMMARY: Accounting standards provide guidelines on how to determine the fair value of a financial asset or liability held at fair value. When considering the fair value of derivative instruments, some additional adjustments need to be made for counterparty credit risk. For interest rate swaps, in particular, one needs to calculate the effective exposure of the swap in order to make these adjustments. One of the most popular methods, albeit computationally intensive, is to calculate these exposures through Monte Carlo simulation. In this study an alternative method of calculating the effective exposure using biplot interpolation is proposed. In the proposed method, the effective exposure profile is approximated analytically by fitting a beta function. The parameters of this beta function are estimated through biplot interpolation, which in turn yields an approximation of the exposure profile. When the performance of the biplot interpolation approach was tested using a standard interval testing approach, the interpolated profile provided a reasonable approximation of the true profile.
- Exploding biplots with density axes in Plotly (Stellenbosch : Stellenbosch University, 2020-12) Sandilands, Delia; Van der Merwe, Carel Johannes; Stellenbosch University. Faculty of Economic and Management Sciences. Dept. of Statistics and Actuarial Science. ENGLISH SUMMARY: Biplots are visual approximations of multidimensional data sets and are very useful in data-driven procedures. However, certain issues occur when dealing with a large number of observations or variables. A proposed solution for a large number of variables is to move the variable axes orthogonally to the edges of the plot with an automatic procedure. This unclutters the center of the plot and makes it easier to interpret specific observations. When dealing with a large number of observations, the data points become indistinguishable from one another, which makes the plot difficult to interpret. This problem was addressed by creating densities on the variable axes. Along with these proposed enhancements, an interactive element to biplot plotting was introduced via the use of Plotly in R. This allows the user to dissect the biplot even further. A web-based Shiny application was built as a supplement to this paper, and can be found at https://carelvdmerwe.shinyapps.io/ExplodingBiplots/. Here the user can interact with the data sets, proposed methodology, and functionalities presented in this research.
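
The summary of the crime-data thesis above notes that log-ratio analysis expresses differences on a ratio scale and is sub-compositionally coherent. The following is a minimal Python sketch of that property only, not the thesis's own code: the function names (`close`, `log_ratio`, `clr`) and the counts are hypothetical and chosen purely for illustration. It shows that the log-ratio between two parts is unchanged when a category is dropped and the remaining parts are rescaled.

```python
import math

def close(parts):
    """Rescale positive parts so that they sum to 1 (a composition)."""
    total = sum(parts)
    return [p / total for p in parts]

def log_ratio(composition, i, j):
    """Natural log of the ratio between parts i and j."""
    return math.log(composition[i] / composition[j])

def clr(composition):
    """Centred log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(p) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

# Hypothetical counts for four categories (illustrative, not real data).
counts = [120.0, 60.0, 30.0, 90.0]

full = close(counts)       # composition over all four categories
sub = close(counts[:3])    # sub-composition: fourth category dropped

# The proportions differ between the two compositions,
# but the log-ratio of the first two parts does not.
full_lr = log_ratio(full, 0, 1)
sub_lr = log_ratio(sub, 0, 1)

# Differences of centred log-ratios recover the same pairwise log-ratio.
clr_full = clr(full)
clr_diff = clr_full[0] - clr_full[1]
```

Because the pairwise log-ratio depends only on the ratio of the two parts, a difference of `log(2)` here reads as "twice as many", whichever other categories are included in the analysis.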