Masters Degrees (Computer Science)

Permanent URI for this collection

https://scholar.sun.ac.za/handle/10019.1/96339

Browse

Now showing 1 - 4 of 4

Coverage directed algorithms for test suite construction from LR-automata
(Stellenbosch : Stellenbosch University, 2022-04) Rossouw, Christoffel Jacobus; Fischer, Bernd; Stellenbosch University. Faculty of Science. Dept. of Computer Science.
ENGLISH ABSTRACT: Bugs in software can have disastrous results in terms of both economic cost and human lives. Parsers can have bugs, like any other type of software, and must therefore be thoroughly tested in order to ensure that a parser recognizes its intended language accurately. However, parsers often need to recognize many different variations and combinations of grammar structures which can make it time consuming and difficult to construct test suites by hand. We therefore require automated methods of test suite construction for these systems. Currently, the majority of test suite construction algorithms focus on the grammar describing the language to be recognized by the parser. In this thesis we take a different approach. We consider the LR-automaton that recognizes the target language and use the context information encoded in the automaton. Specifically, we define a new class of algorithm and coverage criteria over a variant of the LR-automaton that we define, called an LR-graph. We define methods of constructing positive test suites, using paths over this LR-graph, as well as mutations on valid paths to construct negative test suites. We evaluate the performance of our new algorithms against other state-of-the-art algorithms. We do this by comparing coverage achieved over various systems, some smaller systems used in a university level compilers course and other larger, real-world systems. We find good performance of our algorithms over these systems, when compared to algorithms that produce test suites of equivalent size. Our evaluation has uncovered a problem in grammar-based testing algorithms that we call bias. Bias can lead to significant variation in coverage achieved over a system, which can in turn lead to a flawed comparison of two algorithms or unrealized performance when a test suite is used in practice. We therefore define bias and measure it for all grammar-based test suite construction algorithms we use in this thesis.
Design and evaluation of a formula cache for SMT-based bounded model checking tools
(Stellenbosch : Stellenbosch University, 2018-03) Breytenbach, Jean Anré; Fischer, Bernd; Stellenbosch University. Faculty of Science. Dept. of Mathematical Sciences (Computer Science)
ENGLISH ABSTRACT : Program verification is a computationally expensive and time-consuming process. Bounded model checking is a branch of program verification that produces FOL formulas to be checked for satisfiability by an SMT solver. These formulas encode state transitions to states where property violations will occur and the SMT solver attempts to find a list of variable assignments that would create a path to one of these states. Bounded model checking tools create these formulas by iteratively increasing an unwind bound k that dictates the number of transitions that can be present in a path, for each unwind bound k all possible paths of length k are generated. Any state containing a property violation that is generated during the unwind bound k − 1 should also be present during the unwind bound k with perhaps a longer path to reach it. This creates many of the same paths being checked during each subsequent iteration causing the SMT solver to potentially perform duplicate work. This thesis seeks to build and evaluate a formula cache in which to store parts of the formula for which the satisfiability is already known. During subsequent iterations the formula can be sliced by removing the parts that are found in the cache, providing smaller formulas for the SMT solver which should take less time to solve. Similar formula caches have already proven successful in the field of symbolic execution. Multiple techniques are described to modify the formulas to increase the likelihood of finding a match in the cache and these techniques are combined in a multitude of ways to generate a collection of caching strategies. These strategies are then evaluated against multiple data sets to find the best performing strategies and to identify the types of problems that benefit the most from caching. The results are then compared to the results of caching formulas during symbolic execution to gain insight as to how these different approaches effectively implement caching.
Scaling the ConceptCloud browser to very large semi-structured data sets: architecture and data completion
(Stellenbosch : Stellenbosch University, 2020-12) Berndt, Joshua; Fischer, Bernd; Britz, K.; Stellenbosch University. Faculty of Science. Dept. of Mathematical Sciences. Division Computer Science.
ENGLISH ABSTRACT: Semi-structured data sets such as product reviews or event log data are simultaneously becoming more widely used and ever larger. This thesis describes ConceptCloud, a exible, interactive browser for semi-structured datasets, with a focus on the improvements made to accommodate larger datasets, more intuitive data representation and the enrichment of the underlying data by way of data-imputation. ConceptCloud makes use of an intuitive tag cloud visualisation viewer in combination with an underlying concept lattice to provide a formal structure for navigation through datasets without prior knowledge of the structure of the data or compromising scalability. This scalability is achieved by the implementation of architectural changes to increase the system's resource efficiency. These changes are demonstrated by way of a case study on a dataset of wine reviews. Semi-structured data sets such as product reviews or event log data often contain a geolocation aspect: for example, the location of the winery for wine reviews, or the accident location for traffic data. In this thesis, I describe ConceptCloud extensions which allow for the rendering of specialised geolocation data while providing alternate navigation paths through the dataset. I show that using biclusters can make the navigation bidirectional, and demonstrate this approach on a crime data set making use of a geolocation specialised map viewer. Semi-structured data often contains implicit information which will be useful in driving data exploration if made explicit. I take advantage of domain ontologies to both allow implicit data in each input data set to be made explicit and verify and correct inconsistencies allowing for better data exploration. I demonstrate this approach with a continuation of the wine case study.
Test case generation for context free grammars
(Stellenbosch : Stellenbosch University, 2018-03) Esterhuizen, M. H.; Fischer, Bernd; Stellenbosch University. Faculty of Science. Dept. of Mathematical Sciences (Computer Science)
ENGLISH ABSTRACT : Software testing, despite decades of ongoing research, still forms a significant part of the development cycle. When the input domain of a software system must satisfy structural constraints, such as those specified by file formats or command line arguments, test input construction has the potential to be an extremely time consuming process. In this work we investigate how structured test inputs may be constructed through the use of formal grammars and how the input domain of a software system may be adequately represented by the constructed test inputs. We develop a framework which facilitates the automatic generation of test inputs and use it to implement a number of test input generation techniques to satisfy specific coverage criteria. We also introduce a method of test case case generation that exploits the structure of LR-automata to generate sets of passing and failing test cases. The investigations conducted in this work focusses on primarily two areas. Firstly, we investigate how the LR-automata based method of generating test inputs compares to the existing methods. Secondly we seek to verify that the LR-automata and existing methods of test input generation conform to their expected running times. The framework and algorithms are evaluated, in the context of two university level compiler courses as a method to assess parsers, generated from handwritten grammars, submitted by students and compared to the method of assessing submissions with test inputs constructed manually. We find that, in our sample set, assessment based on the positive test inputs generated by the LR-automata based method correlates most closely to assessment based on manually constructed test inputs. This indicates that the method would be most appropriate as a replacement for manually constructing test inputs. The results obtained from assessment based on negative test inputs generated by both, grammar an LR-automata, based methods is found to be unsatisfactory as there is no correlation between them and the results obtained through assessment with a set of manually constructed negative test inputs. An assessment of the performance of the algorithms is also given. We find that the running time of the existing methods of test input generation, based on context free grammars, varies linearly with respect to the size of the grammar and that the running time of the LR-automata based methods varies linearly with respect to the size of the constructed parsing automata.

Browse

Browsing Masters Degrees (Computer Science) by browse.metadata.advisor "Fischer, Bernd"

Results Per Page

Sort Options