Test case generation for context free grammars

Esterhuizen, Marvin (2018-03)

Thesis (MSc)--Stellenbosch University, 2018.

Thesis

ENGLISH ABSTRACT : Software testing, despite decades of ongoing research, still forms a significant part of the development cycle. When the input domain of a software system must satisfy structural constraints, such as those specified by file formats or command line arguments, test input construction has the potential to be an extremely time-consuming process. In this work we investigate how structured test inputs may be constructed through the use of formal grammars, and how the input domain of a software system may be adequately represented by the constructed test inputs. We develop a framework which facilitates the automatic generation of test inputs and use it to implement a number of test input generation techniques that satisfy specific coverage criteria. We also introduce a method of test case generation that exploits the structure of LR-automata to generate sets of passing and failing test cases. The investigation conducted in this work focuses primarily on two areas. Firstly, we investigate how the LR-automata based method of generating test inputs compares to the existing methods. Secondly, we seek to verify that the LR-automata based and existing methods of test input generation conform to their expected running times. The framework and algorithms are evaluated, in the context of two university-level compiler courses, as a method of assessing parsers generated from handwritten grammars submitted by students, and are compared to the method of assessing submissions with manually constructed test inputs. We find that, in our sample set, assessment based on the positive test inputs generated by the LR-automata based method correlates most closely with assessment based on manually constructed test inputs. This indicates that the method would be the most appropriate replacement for manually constructing test inputs.
The results obtained from assessment based on negative test inputs generated by both the grammar-based and LR-automata based methods are found to be unsatisfactory, as there is no correlation between them and the results obtained through assessment with a set of manually constructed negative test inputs. An assessment of the performance of the algorithms is also given. We find that the running time of the existing methods of test input generation, based on context-free grammars, varies linearly with respect to the size of the grammar, and that the running time of the LR-automata based methods varies linearly with respect to the size of the constructed parsing automata.
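To make the grammar-based generation idea concrete, the following is a minimal sketch, in the spirit of classical rule-coverage generation (Purdom-style), of deriving positive test inputs from a context-free grammar so that every production rule is exercised at least once. The toy arithmetic grammar and all function names here are illustrative assumptions, not the thesis's actual framework or grammars.

```python
# Illustrative sketch (hypothetical grammar and names, not the thesis framework):
# generate positive test inputs from a context-free grammar so that every
# production rule appears in at least one generated sentence (rule coverage).
from typing import Dict, List, Tuple

Grammar = Dict[str, List[List[str]]]  # nonterminal -> list of alternatives

GRAMMAR: Grammar = {                  # toy arithmetic expression grammar
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "E", ")"], ["id"]],
}
START = "E"

def shortest_yields(g: Grammar) -> Dict[str, List[str]]:
    """Fixpoint: shortest terminal string derivable from each nonterminal."""
    best: Dict[str, List[str]] = {}
    changed = True
    while changed:
        changed = False
        for nt, alts in g.items():
            for rhs in alts:
                parts, ok = [], True
                for sym in rhs:
                    if sym in g:          # nonterminal: need a known yield
                        if sym not in best:
                            ok = False
                            break
                        parts.append(best[sym])
                    else:                 # terminal: yields itself
                        parts.append([sym])
                if ok:
                    cand = [t for p in parts for t in p]
                    if nt not in best or len(cand) < len(best[nt]):
                        best[nt] = cand
                        changed = True
    return best

def shortest_contexts(g: Grammar, start: str,
                      yields: Dict[str, List[str]]
                      ) -> Dict[str, Tuple[List[str], List[str]]]:
    """Fixpoint: terminal (prefix, suffix) with start =>* prefix NT suffix."""
    ctx: Dict[str, Tuple[List[str], List[str]]] = {start: ([], [])}
    changed = True
    while changed:
        changed = False
        for nt, alts in g.items():
            if nt not in ctx:
                continue
            pre, suf = ctx[nt]
            for rhs in alts:
                for i, sym in enumerate(rhs):
                    if sym not in g:
                        continue
                    left = [t for s in rhs[:i]
                            for t in (yields[s] if s in g else [s])]
                    right = [t for s in rhs[i + 1:]
                             for t in (yields[s] if s in g else [s])]
                    cand = (pre + left, right + suf)
                    if (sym not in ctx or
                            len(cand[0]) + len(cand[1])
                            < len(ctx[sym][0]) + len(ctx[sym][1])):
                        ctx[sym] = cand
                        changed = True
    return ctx

def rule_covering_tests(g: Grammar, start: str) -> List[str]:
    """One positive test input per production rule."""
    yields = shortest_yields(g)
    ctx = shortest_contexts(g, start, yields)
    tests = []
    for nt, alts in g.items():
        pre, suf = ctx[nt]
        for rhs in alts:
            body = [t for s in rhs for t in (yields[s] if s in g else [s])]
            tests.append(" ".join(pre + body + suf))
    return tests

for t in rule_covering_tests(GRAMMAR, START):
    print(t)
```

For this toy grammar the sketch produces sentences such as `id + id`, `id * id` and `( id )`, which together use all six productions. The thesis's LR-automata based method additionally covers states and transitions of the constructed parsing automaton, which this grammar-level sketch does not attempt.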

AFRIKAANSE OPSOMMING : Sagteware toetsing, ten spyte van dekades van voortgesette navorsing, vorm steeds 'n belangrike deel van die ontwikkelingsiklus. Wanneer die insetdomein van 'n sagteware stelsel strukturele beperkings moet bevredig, soos wat gespesifiseer word deur lêerformate of bevellyn argumente, het toetsinsetkonstruksie die potensiaal om 'n uiters tydrowende proses te wees. In hierdie werk ondersoek ons hoe gestruktureerde toetsinsette deur die gebruik van formele grammatikas gebou kan word en hoe die insetdomein van 'n sagteware sisteem voldoende verteenwoordig word deur die gekonstrueerde toetsinsette. Ons ontwikkel 'n raamwerk wat die outomatiese generering van toetsinsette fasiliteer en gebruik dit om 'n aantal toetsinvoer-genereringtegnieke te implementeer om spesifieke dekkingskriteria te bevredig. Ons stel ook 'n metode van toetsgevalgenerering bekend wat die struktuur van 'n LR-outomaat gebruik om stelle positiewe en negatiewe toetsgevalle te genereer. Die ondersoek wat in hierdie werk uitgevoer word, fokus hoofsaaklik op twee gebiede. Eerstens ondersoek ons hoe die LR-outomaat gebaseerde metode om toetsinsette te genereer met die bestaande metodes vergelyk. Tweedens poog ons om te verifieer dat die LR-outomaat gebaseerde en bestaande metodes van toetsinvoergenerering met hul verwagte looptye ooreenstem. Die raamwerk en algoritmes word geëvalueer in die konteks van twee universiteitsvlak kompileerderkursusse, as 'n metode om sintaksontleders te assesseer wat gegenereer is vanaf handgeskrewe grammatikas deur studente voorgelê, en word vergelyk met die metode om voorleggings met handgeskrewe toetsinsette te assesseer. Ons vind dat, in ons steekproef, assessering gebaseer op positiewe toetsinsette wat deur die LR-outomaat gebaseerde metode gegenereer word, die beste korreleer met assessering gebaseer op handopgestelde toetsinsette. Dit dui aan dat die metode die mees geskikte plaasvervanger sal wees vir toetsing deur middel van handopgestelde toetsinsette.
Die resultate verkry uit assessering gebaseer op negatiewe toetsinsette, gegenereer deur beide die grammatika gebaseerde en LR-outomaat gebaseerde metodes, word as onbevredigend bevind aangesien daar geen korrelasie is tussen hulle en die resultate wat verkry word deur assessering met 'n stel handopgestelde negatiewe toetsinsette nie. 'n Beoordeling van die werkverrigting van die algoritmes word ook gegee. Ons vind dat die looptyd van die bestaande metodes van toetsinvoergenerering, gebaseer op konteksvrye grammatikas, lineêr varieer met betrekking tot die grootte van die grammatika, en dat die looptyd van die LR-outomaat gebaseerde metodes lineêr varieer met betrekking tot die grootte van die gekonstrueerde ontledingsoutomate.

Please refer to this item in SUNScholar by using the following persistent URL: http://hdl.handle.net/10019.1/103802