Masters Degrees (Computer Science)
Browsing Masters Degrees (Computer Science) by Title
Now showing 1 - 20 of 42
- Item: Application of statistical pattern recognition and deep learning for morphological classification in radio astronomy (Stellenbosch : Stellenbosch University, 2022-04) Becker, Adolf Burger; Grobler, Trienko; Stellenbosch University. Faculty of Science. Dept. of Computer Science. ENGLISH ABSTRACT: The morphological classification of radio sources is important to gain a full understanding of galaxy evolution processes and their relation with local environmental properties. Furthermore, the complex nature of the problem, its appeal for citizen scientists and the large data rates generated by existing and upcoming radio telescopes combine to make the morphological classification of radio sources an ideal test case for the application of machine learning techniques. One approach that has shown great promise recently is Convolutional Neural Networks (CNNs). The literature, however, lacks two major things when it comes to CNNs and radio galaxy morphological classification. Firstly, a proper analysis to identify whether overfitting occurs when training CNNs to perform radio galaxy morphological classification is needed. Secondly, a comparative study regarding the practical applicability of the CNN architectures in the literature is required. Both of these shortcomings are addressed in this thesis. Multiple performance metrics are used for the latter comparative study, such as inference time, model complexity, computational complexity and mean per-class accuracy. A ranking system based upon recognition and computational performance is proposed. MCRGNet, ATLAS and ConvXpress (a novel classifier) are the architectures that best balance computational requirements with recognition performance.
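As a concrete illustration of the kind of classifier compared in the thesis above, the sketch below shows a small convolutional network for radio-source cutouts. It is a hedged example only: the input resolution, layer sizes and four-class output are assumptions, and it is not MCRGNet, ATLAS or ConvXpress.

```python
# Minimal sketch of a CNN morphological classifier (hypothetical sizes; PyTorch).
import torch
import torch.nn as nn

class SmallRadioCNN(nn.Module):
    def __init__(self, num_classes: int = 4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, 64), nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, x):  # x: (batch, 1, 64, 64) grayscale source cutouts
        return self.classifier(self.features(x))

model = SmallRadioCNN()
logits = model(torch.randn(8, 1, 64, 64))  # -> (8, 4) class scores
```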
- Item: Automatic assignment of diagnosis codes to free-form text medical notes (Stellenbosch : Stellenbosch University, 2021-12) Strydom, Stefan; Van der Merwe, Brink; Stellenbosch University. Faculty of Science. Dept. of Mathematical Sciences. Division Computer Science. ENGLISH ABSTRACT: Clinical coding is the process of describing and categorising healthcare episodes according to standardised ontologies. The coded data have important downstream applications, including population morbidity studies, health systems planning and reimbursement. Clinical codes are generally assigned by specialist human coders based on information contained in free-form text clinical notes. This process is expensive, time-consuming, subject to human error and burdens scarce clinical human resources with administrative roles. An accurate automatic coding system can alleviate these problems. Clinical coding is a challenging task for machine learning systems. The source texts are often long, have a highly specialised vocabulary and contain non-standard clinician shorthand, and the code sets can contain tens of thousands of codes. We review previous work on clinical auto-coding systems and perform an empirical analysis of widely used and current state-of-the-art machine learning approaches to the problem. We propose a novel attention mechanism that takes the text description of clinical codes into account. We also construct a small pre-trained transformer model that achieves state-of-the-art performance on the MIMIC-II and MIMIC-III ICD-9 auto-coding tasks. To the best of our knowledge, it is the first successful application of a pre-trained transformer model to this task.
- Item: Automatic Prediction of Comment Quality (Stellenbosch : Stellenbosch University, 2016-03) Brand, Dirk Johannes; Van der Merwe, Brink; Kroon, R. S. (Steve); Cleophas, Loek; Stellenbosch University. Faculty of Science. Department of Mathematical Sciences (Computer Science). ENGLISH ABSTRACT: The problem of identifying and assessing the quality of short texts (e.g. comments, reviews or web searches) has been intensively studied. There are great benefits to being able to analyse short texts. As an example, advertisers might be interested in the sentiment of product reviews on e-commerce sites to more efficiently pair marketing material to content. Analysing short texts is a difficult problem, because traditional machine learning models generally perform better on data sets with larger samples, which often translates to more features. More data allow for better estimation of parameters for these models. Short texts generally do not have much content, but still carry high variability in that they may still consist of a large corpus of words. This thesis investigates various methods for feature extraction for short texts in the context of online user comments. These methods include the leading manual feature extraction techniques for short texts, N-gram models and techniques based on word embeddings. The effect of using different kernels for a support vector classifier is also investigated. The investigation is centred around two data sets, one provided by News24 and the other extracted from Slashdot.org. It was found that N-gram models performed relatively well, mostly outperforming manual feature extraction techniques.
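For readers unfamiliar with the N-gram baseline referred to in the abstract above, the following sketch shows one conventional way to combine N-gram features with a support vector classifier. The toy comments, labels and kernel choice are illustrative assumptions, not the News24 or Slashdot.org data.

```python
# Minimal sketch: N-gram (unigram + bigram) TF-IDF features feeding an SVM classifier.
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC

comments = ["great insightful comment", "spam spam click here", "thoughtful reply with sources"]
labels = [1, 0, 1]  # 1 = good quality, 0 = poor quality (toy labels)

clf = Pipeline([
    ("ngrams", TfidfVectorizer(ngram_range=(1, 2))),  # N-gram feature extraction
    ("svm", SVC(kernel="linear")),                     # the kernel is one axis of the comparison
])
clf.fit(comments, labels)
print(clf.predict(["another spam comment click here"]))
```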
- Item: Automatic recognition and interpretation of finite state automata diagrams (Stellenbosch : Stellenbosch University, 2015-12) Babalola, Olusola Tope; Van Zijl, Lynette; Stellenbosch University. Faculty of Science. Department of Mathematical Sciences (Computer Science). ENGLISH ABSTRACT: An application capable of reading graphically-encoded information is beneficial to blind or visually impaired students. Such a system needs to recognize and understand visual markings and their arrangement as presented in a diagram image. In that light, this thesis examines the practical possibility of a real-world system for the automatic recognition and interpretation of machine-printed Finite State Automata diagrams. The suggested system uses known image processing and pattern recognition methods to extract the visual markings from the diagram image pixels. A second stage, to interpret the meaning of the diagram, is based on modeling the language of Finite State Automata diagrams using Constraint Multiset Grammars. Our results show that a practical application for automatic interpretation of Finite State Automata diagrams is possible.
- Item: Building identification in aerial imagery using deep learning (Stellenbosch : Stellenbosch University, 2024-03) Nakiranda, Proscovia; Grobler, Trienko; Stellenbosch University. Faculty of Science. Dept. of Computer Science. ENGLISH ABSTRACT: Advancements in the field of remote sensing have facilitated the effortless retrieval of information about any location on Earth at any given time. This has resulted in a variety of developments, including the identification of economic activities taking place in a particular area. The task of identifying buildings is one significant application of remote sensing imagery, as it is crucial for assessing resource distribution, especially in low-resource or data-scarce areas. Advances in machine learning and computation resources allow for automatic analysis of collected remote sensing data, eliminating the need for human intervention. The task of building identification falls under computer vision and is an example of a task that can be automated. Several machine learning architectures (models) have been proposed for building identification. However, choosing the appropriate one can be challenging due to limitations such as difficulty in accurately identifying boundary or near-boundary pixels, resource requirements, and overall model accuracy. Therefore, conducting a comparative study is necessary to evaluate the performance of the building identification models. In this thesis, we carry out a comparative study of four state-of-the-art models used for the building identification task. We evaluate their performance both qualitatively and quantitatively. Furthermore, we investigate the effect of multitask learning on the models’ performance in building identification. The thesis concludes by providing our research findings and outlining prospective future research avenues. Moreover, it provides a thorough overview of the fundamental theory underpinning remote sensing and machine learning.
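One standard way to quantify building-identification quality, in the spirit of the quantitative evaluation mentioned above, is per-image intersection over union (IoU) between predicted and reference building masks. The sketch below is illustrative only and is not claimed to be the thesis's exact metric.

```python
# Per-image IoU for binary building masks (illustrative).
import numpy as np

def iou(pred_mask: np.ndarray, true_mask: np.ndarray) -> float:
    pred = pred_mask.astype(bool)
    true = true_mask.astype(bool)
    intersection = np.logical_and(pred, true).sum()
    union = np.logical_or(pred, true).sum()
    return float(intersection) / union if union > 0 else 1.0

pred = np.array([[1, 1, 0], [0, 1, 0], [0, 0, 0]])
true = np.array([[1, 0, 0], [0, 1, 1], [0, 0, 0]])
print(iou(pred, true))  # 2 / 4 = 0.5
```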
- Item: Combining reverse debugging and live programming towards visual thinking in computer programming (Stellenbosch : Stellenbosch University, 2015-03) Coetzee, Abraham Liebrecht; Van Zijl, Lynette; Hoffmann, McElory R.; Stellenbosch University. Faculty of Science. Mathematical Sciences (Computer Science). ENGLISH ABSTRACT: Interaction plays a key role in the process of learning, and a learner’s abilities are enhanced when multiple cognitive functions work in parallel, especially those related to language and visuals. Time is the most fundamental variable that governs the interaction between programmer and computer, and the substantial temporal separation of cause and effect leads to poor mental models. Furthermore, programmers do not have means by which to express their mental models. The feasibility of combining reverse debugging and live programming was therefore investigated. This combination was found to be feasible, and a reverse debugger with higher levels of liveness was created for the Python programming language. It establishes a foundation for combining language and visual models as aids in computer programming education.
- Item: Combining tree kernels and text embeddings for plagiarism detection (Stellenbosch : Stellenbosch University, 2018-03) Thom, Jacobus Daniel; Van der Merwe, A. B.; Kroon, R. S. (Steve); Stellenbosch University. Faculty of Science. Dept. of Mathematical Sciences (Computer Science). ENGLISH ABSTRACT: The internet allows for vast amounts of information to be accessed with ease. Consequently, it becomes much easier to plagiarize any of this information as well. Most plagiarism detection techniques rely on n-grams to find similarities between suspicious documents and possible sources. N-grams, due to their simplicity, do not make full use of all the syntactic and semantic information contained in sentences. We therefore investigated two methods, namely tree kernels applied to the parse trees of sentences and text embeddings, to utilize more syntactic and semantic information respectively. A plagiarism detector was developed using these techniques and its effectiveness was tested on the PAN 2009 and 2011 external plagiarism corpora. The detector achieved results that were on par with the state of the art for both PAN 2009 and PAN 2011. This indicates that the combination of tree kernel and text embedding techniques is a viable method of plagiarism detection.
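The text-embedding side of such a detector can be reduced to comparing sentence vectors, as sketched below. The embedding vectors are assumed to come from some external model (hypothetical inputs), and the tree-kernel component over parse trees is not shown.

```python
# Flag suspicious/source sentence pairs whose embeddings are highly similar (illustrative).
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def flag_similar(suspicious: list, source: list, threshold: float = 0.9):
    """Return index pairs (i, j) of suspicious/source sentence vectors above the threshold."""
    return [(i, j)
            for i, u in enumerate(suspicious)
            for j, v in enumerate(source)
            if cosine(u, v) >= threshold]

# Example with random stand-in embeddings; real vectors would come from an embedding model.
rng = np.random.default_rng(0)
print(flag_similar([rng.normal(size=8)], [rng.normal(size=8)], threshold=0.1))
```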
- Item: Concrete and symbolic linearisability checking of non-blocking concurrent data structures (Stellenbosch : Stellenbosch University, 2021-12) Du Toit, Nicole Cathryn; Inggs, Cornelia P.; Stellenbosch University. Faculty of Science. Dept. of Mathematical Sciences. Division Computer Science. ENGLISH ABSTRACT: Non-blocking concurrent data structures were developed as a more efficient alternative to lock-based concurrent data structures; in non-blocking concurrent data structures, hardware-level atomic instructions are used instead of higher-level, expensive locking mechanisms. Lock-free algorithms, however, are notoriously hard to design and prone to subtle concurrency errors that are difficult to pick up. Linearisability is the standard correctness condition for non-blocking concurrent data structures; a data structure is linearisable if each concurrent execution of the data structure corresponds to an execution of its correct sequential specification. In this thesis, the focus is on the linearisability checking of non-blocking data structures using a model checker. The approaches for checking linearisability using a model checker can be broadly categorised into linearisation-point and automatic linearisability checking. The state-of-the-art strategies were implemented using the Java PathFinder model checker as basis. The linearisation-point linearisability checking strategy of Vechev et al. was extended to include data structures with operations that act generically on the data structure, and not just on one element in the data structure. An improved version of Doolan et al.’s external automatic checker was implemented and the idea of an external checker was extended to the improved linearisation-point checking strategy. The lazy read optimisation proposed by Long et al., and a hash optimisation proposed in this thesis, were implemented for the automatic checker, and the effectiveness and benefit of the optimisations determined. The performance-limiting factor of the automatic checker was investigated and the claims made by Vechev et al., Liu et al., and Doolan et al. confirmed or refuted. The concrete checker’s usefulness in finding linearisability errors is constrained by the user’s ability to hand-craft test cases in which errors are present. A new symbolic linearisability checker was developed, the major novel contribution of this thesis, which integrates linearisability checking into Symbolic PathFinder, a symbolic model checker. The symbolic checker performs linearisability checking on all possible test cases and program paths; it verifies the linearisability of a data structure in general, constrained only by a user-defined number of operations to be executed by each thread. Finally, extensive evaluations and comparisons of all checkers were performed, on the same model checking framework and hardware, considering their required manual input, resource usage, scalability, and ability to find errors.
- Item: Coverage directed algorithms for test suite construction from LR-automata (Stellenbosch : Stellenbosch University, 2022-04) Rossouw, Christoffel Jacobus; Fischer, Bernd; Stellenbosch University. Faculty of Science. Dept. of Computer Science. ENGLISH ABSTRACT: Bugs in software can have disastrous results in terms of both economic cost and human lives. Parsers can have bugs, like any other type of software, and must therefore be thoroughly tested in order to ensure that a parser recognizes its intended language accurately. However, parsers often need to recognize many different variations and combinations of grammar structures, which can make it time-consuming and difficult to construct test suites by hand. We therefore require automated methods of test suite construction for these systems. Currently, the majority of test suite construction algorithms focus on the grammar describing the language to be recognized by the parser. In this thesis we take a different approach. We consider the LR-automaton that recognizes the target language and use the context information encoded in the automaton. Specifically, we define a new class of algorithms and coverage criteria over a variant of the LR-automaton, which we call an LR-graph. We define methods of constructing positive test suites, using paths over this LR-graph, as well as mutations on valid paths to construct negative test suites. We evaluate the performance of our new algorithms against other state-of-the-art algorithms. We do this by comparing coverage achieved over various systems, some smaller systems used in a university-level compilers course and other larger, real-world systems. We find that our algorithms perform well over these systems when compared to algorithms that produce test suites of equivalent size. Our evaluation has uncovered a problem in grammar-based testing algorithms that we call bias. Bias can lead to significant variation in coverage achieved over a system, which can in turn lead to a flawed comparison of two algorithms or unrealized performance when a test suite is used in practice. We therefore define bias and measure it for all grammar-based test suite construction algorithms we use in this thesis.
- Item: Creating 3D models using reconstruction techniques (Stellenbosch : Stellenbosch University, 2018-12) Martin, Javonne Jason; Kroon, R. S. (Steve); De Villiers, H. A. C.; Stellenbosch University. Faculty of Science. Dept. of Mathematical Sciences. Division Computer Science. ENGLISH ABSTRACT: Virtual reality models of real world environments have a number of compelling applications, such as preserving the architecture and designs of older buildings. This can be achieved by using 3D artists to reconstruct the environment, but that is a long and expensive process. This thesis therefore investigates various techniques and approaches used in 3D reconstruction of environments using a single RGB-D camera and aims to reconstruct the 3D environment to generate a 3D model. This would allow non-technical users to reconstruct environments and use these models in business and simulations, such as selling real estate, and modifying pre-existing structures for renovation and planning. With recent improvements in virtual reality technology such as the Oculus Rift and HTC Vive, a user can be immersed into virtual reality environments created from real world structures. A system based on Kinect Fusion is implemented to reconstruct an environment and track the motion of the camera within the environment. The system is designed as a series of self-contained subsystems, which allows each of the subsystems to be modified, expanded upon or easily replaced by alternative methods. The system is made available as an open-source C++ project using Nvidia’s CUDA framework to aid reproducibility and provides a platform for future research. The system makes use of the Kinect sensor to capture information about the environment. A coarse-to-fine least squares approach is used to estimate the motion of the camera. In addition, the system employs a frame-to-model approach that uses a view of the estimated reconstruction of the model as the reference frame and the incoming scene data as the target. This minimises the drift with respect to the true trajectory of the camera. The model is built using a volumetric approach, with volumetric information implicitly stored as a truncated signed distance function. The system filters out noise in the raw sensor data by using a bilateral filter. A point cloud is extracted from the volume using an orthogonal ray caster, which enables an improved hole-filling approach. This allows the system to extract both the explicit and implicit structure from the volume. The 3D reconstruction is followed by mesh generation based on the point cloud. This is achieved by using an approach related to Delaunay triangulation, the ball-pivot algorithm. The resulting system processes frames at 30 Hz, enabling real-time point cloud generation, while the mesh generation occurs offline. The system is initially tested using Blender to generate synthetic data, followed by a series of real world tests. The synthetic data is used to test the presented system’s motion tracking against the ground truth. While the presented system suffers from the effects of drift over long frame sequences, it is shown to be capable of tracking the motion of the camera. This thesis finds that the ball-pivot algorithm can generate the edges and faces for synthetic point clouds, but it performs poorly on the noisy synthetic and real world data sets. Based on these results, it is recommended that the point cloud be preprocessed to remove noise before it is provided to the mesh generation algorithm, and that an alternative mesh generation technique that is more robust to noise be employed.
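The volumetric fusion step described above can be summarised by the standard truncated signed distance function (TSDF) update: each voxel keeps a truncated distance and a weight, and fuses new measurements as a weighted running average. The sketch below is a simplified single-voxel version under assumed truncation and weight limits, not the thesis's CUDA implementation.

```python
# Simplified per-voxel TSDF fusion (illustrative constants).
import numpy as np

def update_voxel(tsdf: float, weight: float, sdf_measurement: float,
                 truncation: float = 0.05, max_weight: float = 100.0):
    """Fuse one new signed-distance measurement into a voxel's running average."""
    d = float(np.clip(sdf_measurement, -truncation, truncation))  # truncate the distance
    new_tsdf = (tsdf * weight + d) / (weight + 1.0)               # weighted running average
    new_weight = min(weight + 1.0, max_weight)                    # cap the accumulated weight
    return new_tsdf, new_weight

print(update_voxel(tsdf=0.02, weight=10.0, sdf_measurement=-0.01))
```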
- Item: A deep framework for predictive maintenance (2021-12-01) Steyl, Charl Cilliers; Hoffmann, McElory R.; Grobler, Trienko Lups; Herbst, Barend Mattheus. ENGLISH ABSTRACT: Predictive maintenance (PdM) is a well-known maintenance approach that comprises two problems, machine prognostic modelling and maintenance scheduling. The objective of prognostic modelling is to predict faults in machine components such as aircraft engines, lithium-ion batteries or bearings. The objective of maintenance scheduling is to reduce the cost of performing maintenance once the future degradation behaviour of a component has been established. Sensors are used to monitor the degradation behaviour of components as they change over time. Supervised learning is a suitable solution for prognostic modelling problems, especially with the increase in sensor readings being collected with Internet of Things (IoT) devices. Prognostic modelling can be formulated as remaining useful life (RUL) estimation or machine state estimation; the former is a regression problem and the latter a classification problem. Long short-term memory (LSTM) recurrent neural networks (RNNs) are an extension of traditional RNNs that are effective at interpreting trends in the sensor readings and making longer-term estimations. An LSTM uses a window of sequential sensor readings when making prognostic estimates, which makes it less sensitive to local sensor variations and results in improved prognostic model performance. In this study we create a framework for implementing PdM approaches. The work consists of a codebase which can be used to create testable, comparable and repeatable prognostic modelling results and maintenance scheduling simulations. The codebase is designed to be extensible, to allow future researchers to standardise prognostic modelling results. The codebase is used to compare the prognostic modelling performance of an LSTM with traditional supervised prognostic modelling approaches such as random forests (RFs), gradient-boosted (GB) trees and support vector machines (SVMs). The prognostic models are tested on three well-known prognostic datasets: the Commercial Modular Aero-Propulsion System Simulation (C-MAPSS) aircraft engine, Center for Advanced Life Cycle Engineering (CALCE) battery, and Intelligent Maintenance Systems (IMS) bearing datasets. During the study we highlight factors that influence prognostic model performance, such as the effect of de-noising sensor readings and the size of the sample window used by the LSTM when making estimations. The results of the prognostic models are compared with previous studies, and the LSTM shows improved performance on the considered cases. The developed prognostic models are used to perform preventative maintenance scheduling with assumed costs in two simulations. The objective is first to compare the efficacy of traditional maintenance approaches, such as a mean time between failure (MTBF) strategy, with a PdM strategy, and second to investigate the effect of using a better-performing prognostic model (such as the LSTM) in a PdM strategy. The improvements are measured by the reduction in costs. Key words: predictive maintenance; remaining useful life; machine state estimation; preventative maintenance scheduling.
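As an illustration of the windowed LSTM formulation described above, the sketch below regresses remaining useful life from a fixed-length window of multivariate sensor readings. The sensor count, window length and hidden size are assumptions, not the framework's actual configuration.

```python
# Minimal sketch of a windowed LSTM regressor for remaining useful life (PyTorch).
import torch
import torch.nn as nn

class RULLSTM(nn.Module):
    def __init__(self, num_sensors: int = 14, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(num_sensors, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):             # x: (batch, window_length, num_sensors)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])  # predict RUL from the final time step

model = RULLSTM()
window = torch.randn(32, 30, 14)      # 32 windows of 30 sensor readings each
rul_estimate = model(window)          # -> (32, 1)
```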
- Item: Design and evaluation of a formula cache for SMT-based bounded model checking tools (Stellenbosch : Stellenbosch University, 2018-03) Breytenbach, Jean Anré; Fischer, Bernd; Stellenbosch University. Faculty of Science. Dept. of Mathematical Sciences (Computer Science). ENGLISH ABSTRACT: Program verification is a computationally expensive and time-consuming process. Bounded model checking is a branch of program verification that produces first-order logic (FOL) formulas to be checked for satisfiability by an SMT solver. These formulas encode state transitions to states where property violations will occur, and the SMT solver attempts to find a list of variable assignments that would create a path to one of these states. Bounded model checking tools create these formulas by iteratively increasing an unwind bound k that dictates the number of transitions that can be present in a path; for each unwind bound k, all possible paths of length k are generated. Any state containing a property violation that is generated during unwind bound k − 1 should also be present during unwind bound k, with perhaps a longer path to reach it. Many of the same paths are therefore checked during each subsequent iteration, causing the SMT solver to potentially perform duplicate work. This thesis seeks to build and evaluate a formula cache in which to store parts of the formula for which the satisfiability is already known. During subsequent iterations the formula can be sliced by removing the parts that are found in the cache, providing smaller formulas for the SMT solver which should take less time to solve. Similar formula caches have already proven successful in the field of symbolic execution. Multiple techniques are described to modify the formulas to increase the likelihood of finding a match in the cache, and these techniques are combined in a multitude of ways to generate a collection of caching strategies. These strategies are then evaluated against multiple data sets to find the best-performing strategies and to identify the types of problems that benefit the most from caching. The results are then compared to the results of caching formulas during symbolic execution to gain insight into how these different approaches effectively implement caching.
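The caching idea above can be pictured as memoising satisfiability verdicts keyed by a normalised form of each sub-formula, so that later unwindings can skip solver calls for parts already seen. The sketch below is a deliberately naive illustration; whitespace normalisation stands in for the thesis's formula-modification techniques.

```python
# Toy formula cache: memoise satisfiability verdicts for normalised formula text.
import hashlib

class FormulaCache:
    def __init__(self):
        self._store = {}  # maps formula digest -> "sat" / "unsat"

    @staticmethod
    def _key(formula: str) -> str:
        normalised = " ".join(formula.split())  # stand-in for real normalisation
        return hashlib.sha256(normalised.encode()).hexdigest()

    def lookup(self, formula: str):
        return self._store.get(self._key(formula))

    def record(self, formula: str, verdict: str):
        self._store[self._key(formula)] = verdict

cache = FormulaCache()
cache.record("(assert (> x 0))", "sat")
print(cache.lookup("(assert  (> x 0))"))  # "sat" -- solver call avoided
```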
- Item: Detecting and quantifying resource contention in concurrent programs (Stellenbosch : Stellenbosch University, 2016-03) Venter, Dirk Willem; Inggs, Cornelia P.; Stellenbosch University. Faculty of Science. Department of Mathematical Sciences (Computer Science). ENGLISH ABSTRACT: Parallel programs, both shared-memory and message-passing programs, typically require the sharing of resources, for example software resources such as shared mutual-exclusion locks, and hardware resources such as caches and memory. Shared resources can only be used by one thread or process at a time. The competition for limited resources is called resource contention. The result of resource contention is delays while waiting for access to a resource and/or extra computational overhead to resolve the request for a resource. Thus, the performance of the program can be improved by identifying and reducing contention for shared resources. This study investigates the effect of individual types of contention for hardware and software resources in detail and discusses the three tools that were developed to identify and quantify the sources of contention in concurrent programs.
- Item: Ergo: a gesture-based computer interaction device (Stellenbosch : Stellenbosch University, 2024-03) Kane, Boyd Robert; Grobler, Trienko Lups; Stellenbosch University. Faculty of Science. Dept. of Computer Science. ENGLISH ABSTRACT: This thesis presents Ergo, a bespoke glove-based sensor suite designed to fully replace the regular QWERTY keyboard in terms of both number of input keys and speed of input. Ergo collects acceleration data from each of the user's 10 fingertips 40 times per second and, at the same rate, predicts which of 50 different hand gestures is being performed. The user does not need to explicitly mark the start or end of each gesture, as Ergo is able to automatically distinguish between intentional gestures made by the user and other non-gesture movements. When a known gesture is detected, a corresponding keystroke is emitted, allowing the user to “type” on their computer by performing gestures in sequence. Five different classification models are evaluated (Hidden Markov Models, Support Vector Machines, Cumulative Sum, and two different neural network architectures) and neural networks are found to be the most effective. The difference in difficulty between classification tasks which either do or do not include observations without intentional movement is also evaluated. The additional requirement to distinguish intentional gestures from other hand movements is found to increase the difficulty of the task significantly.
- Item: Explaining neural networks used for modeling credit risk (Stellenbosch : Stellenbosch University, 2021-03) Mohamed, Zhunaid; Visser, Willem; Herbst, B. M.; Hoffman, McElory; Stellenbosch University. Faculty of Science. Dept. of Mathematical Sciences. Division Computer Science. ENGLISH ABSTRACT: Calculating risk before providing loans is a common problem that credit companies face. The most common solution is for credit employees to manually assess the risk of a client by reviewing their credit portfolio. This can be a slow process and is prone to human error. Recently, credit companies have been adopting machine learning techniques in order to automate this process; however, this has been limited to linear techniques because interpretability is a strict requirement. Neural networks could provide significant improvements to the way credit risk is modeled; however, these are still seen as black boxes. In this work we compare various techniques which claim to provide interpretability for these black boxes. We also use these techniques to provide explanations of a neural network trained on credit data that has been provided to us.
- Item: An extension of the linear regression model for improved vessel trajectory prediction utilising a priori AIS Information (Stellenbosch : Stellenbosch University, 2022-04) Burger, Christiaan Neil; Grobler, Trienko Lups; Kleynhans, Waldo; Stellenbosch University. Faculty of Science. Dept. of Computer Science. ENGLISH ABSTRACT: As maritime activities increase globally, there is a greater dependency on technology for the monitoring, control and surveillance of vessel activity. One of the most prominent systems for monitoring vessel activity is the Automatic Identification System (AIS). An increase in both vessels fitted with AIS transponders and satellite and terrestrial receivers has resulted in a significant increase in AIS messages received globally. The resulting rich spatial and temporal data source related to vessel activity provides analysts with the ability to perform enhanced vessel movement analytics, of which a pertinent example is the improvement of vessel location predictions. In this thesis, we propose a novel method for predicting future locations of vessels by making use of historic AIS data. The proposed method extends a linear regression model (LRM) with historic AIS movement data in the form of a priori generated spatial maps of the course over ground (LRMAC). The LRMAC has low complexity, is easy to implement programmatically, and attains accurate prediction results. We first compare the LRM with a Discrete Kalman Filter (DKF) on linear trajectories, and then extend the LRM to form the LRMAC. The LRMAC is compared to another method in the literature called the Single Point Neighbour Search (SPNS). For the use case of predicting cargo and tanker vessel trajectories, with a prediction horizon of up to six hours, the LRMAC has an improved execution time and performance compared to the SPNS.
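For orientation, the LRM baseline amounts to fitting position linearly in time over recent AIS fixes and extrapolating, as in the sketch below (toy coordinates; the a priori course-over-ground maps that define the LRMAC are not shown).

```python
# Minimal linear-regression extrapolation of a vessel track (illustrative data).
import numpy as np

def predict_position(times, lats, lons, horizon_s: float):
    """Fit lat/lon linearly in time and extrapolate `horizon_s` seconds past the last fix."""
    t = np.asarray(times, dtype=float)
    lat_coef = np.polyfit(t, np.asarray(lats, dtype=float), deg=1)
    lon_coef = np.polyfit(t, np.asarray(lons, dtype=float), deg=1)
    t_future = t[-1] + horizon_s
    return np.polyval(lat_coef, t_future), np.polyval(lon_coef, t_future)

# Example: fixes 10 minutes apart, predicted one hour after the last fix.
print(predict_position([0, 600, 1200], [-34.10, -34.08, -34.06], [18.40, 18.44, 18.48], 3600))
```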
- Item: The generation of longest optimal box repetition-free words (Stellenbosch : Stellenbosch University, 2022-04) Habeck, Manfred; Grobler, Trienko; Van Zijl, Lynette; Geldenhuys, Jaco; Stellenbosch University. Faculty of Science. Dept. of Computer Science. ENGLISH ABSTRACT: This thesis focuses on a specific problem within the field of combinatorial generation, namely the generation of box repetition-free words. A box is a word over a given alphabet in which the first symbol is the same as the last symbol. For example, the word abaca is a box. A box can contain other boxes: the box abaca contains the boxes aba and aca. Boxes can overlap, such as aba and aca in abaca. This work investigates the generation of the longest possible sequence of symbols, over a given alphabet, which does not contain any repeating boxes. We show that an exhaustive enumeration based on a brute-force approach with backtracking is not feasible. That is, we check whether adding a symbol to a word would create a repeating box; if not, another symbol is added recursively. This method will eventually find all valid words, but takes an unreasonable amount of time for larger alphabets. As a non-enumerative attempt to find individual valid words, the Monte Carlo tree search is used. The search is based on the assumption that prefixes with good past results will also give good results in the future. Based on an analysis of the properties of box repetition-free words, a new search is devised: factors of words are mapped onto a graph, and all non-optimal edges are removed. It is then shown that any Hamiltonian path on this graph will result in a longest optimal word. The results of this work show that backtracking fails to generate longest optimal words within a reasonable time for any alphabet with more than three symbols. The Monte Carlo tree search performs better than backtracking, finding optimal words for an alphabet size of four, but failing for larger alphabets. The new method outperforms both and, with a small optimization, is shown to generate longest optimal words up to an alphabet size of six.
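The brute-force backtracking approach described above can be sketched as follows: grow a word one symbol at a time and backtrack as soon as the suffix just completed repeats a box already present in the word. This is an illustrative reading of the definition (boxes of length at least two, with overlapping occurrences counted as repetitions) and, as the abstract notes, is only practical for very small alphabets.

```python
# Exhaustive backtracking search for a longest box-repetition-free word (illustrative).
def creates_repeated_box(word: str) -> bool:
    """Only boxes ending at the newly added symbol need checking; earlier ones were checked before."""
    last = word[-1]
    for start in range(len(word) - 1):
        if word[start] == last:                         # factor word[start:] is a box
            box = word[start:]
            if word.find(box, 0, len(word) - 1) != -1:  # the same box occurs earlier in the word
                return True
    return False

def longest_box_free(alphabet: str, word: str = "") -> str:
    best = word
    for symbol in alphabet:
        candidate = word + symbol
        if not creates_repeated_box(candidate):
            extension = longest_box_free(alphabet, candidate)
            if len(extension) > len(best):
                best = extension
    return best

print(longest_box_free("ab"))  # feasible only for very small alphabets
```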
- Item: Hierarchical text classification with transformer-based language models (Stellenbosch : Stellenbosch University, 2024-03) Du Toit, Jaco; Dunaiski, Marcel; Stellenbosch University. Faculty of Science. Dept. of Computer Science. ENGLISH ABSTRACT: Hierarchical text classification (HTC) is a natural language processing (NLP) task which has the objective of classifying text documents into a set of classes from a structured class hierarchy. For example, news articles can be classified into a hierarchical class set which comprises broad categories such as “Politics” and “Sport” at higher levels, with associated finer-grained categories such as “Europe” and “Cycling” at lower levels. In recent years many different NLP approaches have been significantly improved through the use of transformer-based pre-trained language models (PLMs). PLMs are typically trained on large amounts of textual data through self-supervised tasks such that they acquire language understanding capabilities which can be used to solve various NLP tasks, including HTC. In this thesis, we propose three new approaches for leveraging transformer-based PLMs to improve classification performance on HTC tasks. Our first approach formulates how hierarchy-aware prompts can be applied to discriminative language models such that HTC tasks can scale to problems with very large hierarchical class structures. Our second approach uses label-wise attention mechanisms to obtain label-specific document representations which are used to fine-tune PLMs for HTC tasks. Furthermore, we propose a label-wise attention mechanism which splits the attention mechanisms into the different levels of the class hierarchy and leverages the predictions of all ancestor levels during the prediction of classes at a particular level. The third approach combines features extracted from a PLM and a topic model to train a classifier which comprises convolutional layers followed by a label-wise attention mechanism. We evaluate all three approaches comprehensively and show that our first two proposed approaches obtain state-of-the-art performance on three HTC benchmark datasets. Our results show that the use of prompts and label-wise attention mechanisms to fine-tune PLMs are very effective techniques for classifying text documents into hierarchical class sets. Furthermore, we show that these techniques are able to effectively leverage the language understanding capabilities of PLMs and incorporate the hierarchical class structure information to improve classification performance. We also introduce three new HTC benchmark datasets which comprise the titles and abstracts of research publications from the Web of Science publication database with associated categories. The first two datasets use journal- and citation-based classification schemas respectively, while the third dataset combines these classifications with the aim of removing documents and classes which do not have a clear overlap between the two schemas. We show that this results in a more consistent classification of the publications. Finally, we perform experiments on these three datasets with the best-performing approaches proposed in this thesis to provide a baseline for future research.
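To make the label-wise attention idea above concrete, the sketch below gives a single-level version in which every class attends over the token representations produced by a PLM encoder and receives its own document vector. The dimensions are assumptions, and the hierarchy-aware prompting and per-level attention of the thesis are not shown.

```python
# Single-level label-wise attention head over PLM token states (illustrative; PyTorch).
import torch
import torch.nn as nn

class LabelWiseAttention(nn.Module):
    def __init__(self, hidden_size: int, num_labels: int):
        super().__init__()
        self.label_queries = nn.Parameter(torch.randn(num_labels, hidden_size))
        self.classifier = nn.Linear(hidden_size, 1)

    def forward(self, token_states):                      # (batch, seq_len, hidden)
        scores = token_states @ self.label_queries.T      # (batch, seq_len, num_labels)
        attn = torch.softmax(scores, dim=1)               # attention over tokens, per label
        label_docs = attn.transpose(1, 2) @ token_states  # (batch, num_labels, hidden)
        return self.classifier(label_docs).squeeze(-1)    # (batch, num_labels) logits

head = LabelWiseAttention(hidden_size=768, num_labels=50)
logits = head(torch.randn(4, 128, 768))                   # token states from any PLM encoder
```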
- Item: Impendulo: A Tool for Analysing Programmer Behaviour (Stellenbosch : Stellenbosch University, 2015-03) Jordaan, Pieter; Visser, Willem; Geldenhuys, Jaco; Stellenbosch University. Faculty of Science. Department of Mathematical Sciences (Computer Science). ENGLISH ABSTRACT: Automated submission systems for Computer Science courses are commonplace today; however, these systems are typically focused solely on grading submissions, and their ability to provide analysis and feedback is vastly underutilised. This thesis seeks to address this by presenting Impendulo, a platform for analysing programmer behaviour. Impendulo allows one to study student performance at both a group and an individual level. This is achieved through the use of a customisable set of software analysis tools which Impendulo runs on user submissions. The results produced by the tools are parsed by Impendulo to produce rich feedback in the form of interactive charts, annotated code and detailed reports. In order to ascertain whether Impendulo's approach is viable, experimental trials were conducted with it throughout its development process. These trials consisted of a group of students using the Impendulo system while solving a set of programming problems. After each trial, the results were studied and used to determine which aspects of the system needed to be improved. At the end of Impendulo's development, all of the experiments were studied again to gain insight into the tested students' programming.
- Item: Implementation of the Cavalieri Integral (2023-02) van Zyl, Christoff; Grobler, Trienko; Stellenbosch University. Faculty of Science. Dept. of Computer Science. ENGLISH ABSTRACT: Cavalieri integration in Rⁿ presents a novel visualization mechanism for weighted integration and challenges the notion of strictly rectangular integration strips. It does so by concealing the integrator inside the boundary curves of the integral. This thesis investigates the Cavalieri integral as a superset of Riemann integration in Rⁿ⁻¹, whereby the integral is defined by a translational region in Rⁿ⁻¹ which uniquely defines the integrand, integrator and integration region. In R², this refined translational-region definition allows for the visualization of Riemann-Stieltjes integrals along with other forms of weighted integration, such as the Riemann–Liouville fractional integral and the convolution operator. The programmatic implementation of such visualizations and the computation of integral values are also investigated, relying on numeric integration, algorithmic differentiation and numeric root finding. For the R³ case, such visualizations over polygonal regions require a mechanism for the triangulation of a set of nested polygons, and transformations which allow repeated integration to be used to compute the integral value over the produced triangular regions with standard one-dimensional integration routines.
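As a small numeric companion to the weighted integrals mentioned above, the sketch below approximates a Riemann-Stieltjes integral of f with respect to an integrator g by summing f at midpoints against increments of g; the Cavalieri construction itself (hiding g in the boundary curves) is not reproduced here.

```python
# Midpoint Riemann-Stieltjes sum of f with respect to g over [a, b] (illustrative).
import numpy as np

def riemann_stieltjes(f, g, a: float, b: float, n: int = 10_000) -> float:
    x = np.linspace(a, b, n + 1)
    midpoints = 0.5 * (x[:-1] + x[1:])
    return float(np.sum(f(midpoints) * np.diff(g(x))))

# Example: integral of x d(x^2) over [0, 1] = integral of 2x^2 dx = 2/3.
print(riemann_stieltjes(lambda x: x, lambda x: x**2, 0.0, 1.0))
```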