A generic framework for aspect-based sentiment analysis.
dc.contributor.advisor | Van Vuuren, Jan Harm | en_ZA |
dc.contributor.advisor | Nel, Gerrit Stephanus | en_ZA |
dc.contributor.author | van Zyl, Bianca Jordan | en_ZA |
dc.contributor.other | Stellenbosch University. Faculty of Engineering. Dept. of Industrial Engineering. | en_ZA |
dc.date.accessioned | 2023-02-09T11:11:29Z | en_ZA |
dc.date.accessioned | 2023-05-18T07:04:25Z | en_ZA |
dc.date.available | 2023-02-09T11:11:29Z | en_ZA |
dc.date.available | 2023-05-18T07:04:25Z | en_ZA |
dc.date.issued | 2023-02 | en_ZA |
dc.description | Thesis (MEng)--Stellenbosch University, 2023. | en_ZA |
dc.description.abstract | ENGLISH ABSTRACT: With the increasing volume and complexity of user-generated content shared via the Internet, the need has arisen for automated methods capable of extracting meaningful insights from unstructured text data. Sentiment analysis is a form of text analysis involving the computational identification of the polarity of an opinion expressed by an author of a given piece of text. While much of the existing work in this field focusses on document-level or sentence-level analysis, in which an entire document or sentence, respectively, is viewed as a single information unit and is assumed to contain at most one expression of sentiment, aspect-based sentiment analysis involves a more fine-grained approach, facilitating the discovery of multiple topics, and the sentiment polarities towards these topics, present in a document containing text data. The most promising approaches towards aspect-based sentiment analysis to date are those based on supervised machine learning. Many of the methodologies in the literature are, however, focussed on the application of specific machine learning models, on only specific sub-tasks of the problem, or on a specific domain of application. In this thesis, a generic framework is proposed for aspect-based sentiment analysis, the aim of which is to guide a user through the process of gaining insights from an unstructured text data set from any domain. This is achieved by facilitating the development of machine learning models for each task of aspect-based sentiment analysis in respect of the specific data set under analysis. As such, the goal of the framework is to guide the user through the data preparation, model development, and deployment stages of an aspect-based sentiment analysis project, rather than fully automating the process, thereby allowing for a greater degree of generality. 
Such a framework may aid organisations in leveraging unstructured text data to better understand customer or public sentiment, proactively identify areas for improvement, and support appropriate decision making. An instantiation of the proposed framework is implemented in Python as a proof-of-concept demonstration of the generality and utility of the framework. The instantiation is verified to ensure its quality and correct working, after which it is validated in respect of two popular benchmark data sets in the restaurant and laptop review domains, respectively. The results achieved by applying the framework instantiation to these benchmark data sets are promising and demonstrate the value of a structured model development process. The framework instantiation is also applied to a real-world case study to demonstrate its practical applicability in a business context. | en_ZA |
dc.description.abstract | AFRIKAANS OPSOMMING: Met die toenemende volume en kompleksiteit van gebruiker-gegenereerde inhoud wat via die internet gedeel word, het die behoefte ontstaan vir outomatiese metodes wat daartoe in staat is om betekenisvolle insigte uit ongestruktureerde teksdata te onttrek. Sentimentanalise is ’n vorm van teksanalise wat gemik is op die berekeningsidentifikasie van die polariteit van ’n mening wat deur ’n outeur van ’n gegewe stuk teks uitgespreek word. Terwyl baie van die bestaande werk in hierdie veld op dokumentvlak- of sinsvlak-analise fokus, waarin ’n hele dokument of sin, onderskeidelik, as ’n enkele inligtingseenheid beskou word en daar aanvaar word dat hierdie eenheid hoogstens een uitdrukking van sentiment kan bevat, behels aspek-gebaseerde sentimentanalise ’n fyner benadering wat die ontdekking van veelvuldige onderwerpe fasiliteer, sowel as die ontdekking van sentimentpolariteite teenoor hierdie onderwerpe teenwoordig in ’n dokument wat teksdata bevat. Die mees belowende benaderings tot aspek-gebaseerde sentimentanalise tot dusver is dié wat op masjienleer onder toesig gebaseer is. Baie van die metodologieë in die literatuur is egter gefokus op die toepassing van spesifieke masjienleermodelle, slegs op spesifieke sub-take van die probleem, of op ’n spesifieke toepassingsgebied. In hierdie tesis word ’n generiese raamwerk vir aspek-gebaseerde sentimentanalise daargestel wat ten doel het om ’n gebruiker te lei deur die proses om insigte uit ’n ongestruktureerde teksdatastel afkomstig uit enige gebied te verkry. Hierdie doel word bereik deur die ontwikkeling van masjienleermodelle vir elke taak van aspek-gebaseerde sentimentanalise ten opsigte van ’n spesifieke datastel wat ontleed word. 
As sodanig is die doel van die raamwerk om die gebruiker deur die datavoorbereiding-, modelontwikkeling- en ontplooiingstadiums van ’n aspek-gebaseerde sentimentanaliseprojek te lei, eerder as om die proses ten volle te outomatiseer, en sodoende ’n groter mate van algemeenheid moontlik te maak. So ’n raamwerk kan organisasies help om ongestruktureerde teksdata te benut om kliënte- of publieke sentiment beter te verstaan, proaktief areas vir verbetering te identifiseer, en toepaslike besluitneming te ondersteun. ’n Instansiasie van die voorgestelde raamwerk word in Python geïmplementeer as ’n bewys-van-konsep demonstrasie van die algemeenheid en bruikbaarheid van die raamwerk. Die instansiasie word geverifieer om die kwaliteit en korrekte werking daarvan te verseker, waarna dit gevalideer word ten opsigte van twee gewilde maatstafdatastelle in onderskeidelik die restaurant- en skootrekenaar-resensiegebiede. Die resultate van die raamwerk-instansiasie soos op hierdie maatstafdatastelle toegepas, is belowend en demonstreer die waarde van ’n gestruktureerde modelontwikkelingsproses. Die raamwerk-instansiasie word ook in ’n werklike gevallestudie toegepas om die praktiese toepaslikheid daarvan in ’n besigheidskonteks te demonstreer. | af_ZA |
dc.description.version | Masters | en_ZA |
dc.format.extent | xxiv, 255 pages : illustrations. | en_ZA |
dc.identifier.uri | http://hdl.handle.net/10019.1/127103 | en_ZA |
dc.language.iso | en_ZA | en_ZA |
dc.publisher | Stellenbosch : Stellenbosch University | en_ZA |
dc.rights.holder | Stellenbosch University | en_ZA |
dc.subject.lcsh | Natural language processing (Computer science) | en_ZA |
dc.subject.lcsh | Sentiment analysis | en_ZA |
dc.subject.lcsh | Machine learning | en_ZA |
dc.subject.lcsh | Decision support systems | en_ZA |
dc.title | A generic framework for aspect-based sentiment analysis. | en_ZA |
dc.type | Thesis | en_ZA |
Files
Original bundle
- Name: vanzyl_generic_2023.pdf
- Size: 21.05 MB
- Format: Adobe Portable Document Format