A study of the data mining of meeting minutes of construction projects

Date
2020-12
Publisher
Stellenbosch : Stellenbosch University
Abstract
ENGLISH ABSTRACT: This research is motivated by the increased use of big data and the need to reduce the cost and time overruns experienced in the construction industry. During the construction period of a project, numerous factors contribute to its outcome. Simply knowing some of these factors may not contribute to the successful completion of the project. Being able to use the known and the unknown factors to create a model that can predict the outcome of a project will enable the project management team to make informed decisions. This research aims to determine whether the information currently being recorded in site progress meeting minutes is sufficient for use in data mining applications that predict project outcomes, and to establish whether new knowledge can be obtained from this process. Data mining to aid project management in the construction industry has seen limited application, especially in South Africa. Data mining is part of the Knowledge Discovery in Data (KDD) process, which is used to learn new information from data. The research starts with a literature review to identify a list of factors that influence project outcomes, both positively and negatively. Of the identified project outcome factors, the two highlighted most often are leadership and planning. These two overarching categories were used to determine whether and how influencing attributes are recorded in the site meeting minutes. The current uses of data mining in the construction industry were investigated to determine how data mining and KDD have been implemented in the industry. Although KDD has been applied in the construction industry, no information was found about its application in the South African construction industry. Possible reasons why it has not yet been implemented include concerns about copyright, privacy and data security, and a lack of incentives to implement data mining.
The meeting minutes of several projects were investigated and data mined to determine whether they can be used to predict the outcome of future projects. The two overarching categories above were used to identify the information present in the meeting minutes. These attributes were then used as the data mining features. Two data mining applications were used so that their results could be compared and validated. The most accurate data mining models were created using the Random Forest data mining algorithm. The prediction models are able to predict the outcome of future projects with a high degree of certainty.
AFRIKAANS ABSTRACT (English translation): This research was motivated by the growing application of data in various industries as well as by the need for successful projects. During the construction period, numerous factors play a role in the outcome of a project. It is important to know what these factors are and how they can be used together with the unknown factors to predict the outcome of a project. Sufficient data can enable the project management team to make informed decisions. The aim of this research is to determine whether the information currently recorded in the minutes of project progress meetings can be used to predict the outcome of projects, and to establish whether new knowledge can be obtained during this process. To date, data mining in the construction industry has been of limited extent in South Africa. Data mining is part of the Knowledge Discovery in Data (KDD) process, which is used to learn new information from data. A literature review was carried out to identify a list of factors that influence project outcomes, both positively and negatively. Of the identified project outcome factors, the two highlighted most often are leadership and planning. These two overarching categories were then used to determine how the attributes, as well as which attributes, in the minutes of the project progress meetings address the outcomes of the project. The current use of data mining in the construction industry was investigated to determine how it is implemented in the industry. Although KDD has been applied in the construction industry, no information could be found about its application in the South African construction industry. An investigation of the progress meeting minutes of several projects was carried out.
The minutes were used in the KDD process, including data mining, to determine whether they can be used to predict the outcome of future projects. Two different data mining applications were used to compare and validate the results. The most accurate data mining models were created using the Random Forest algorithm. The prediction models are able to predict the outcome of future projects with a high degree of accuracy.
Description
Thesis (MEng)--Stellenbosch University, 2020.
Keywords
Construction Industry, UCTD, Construction projects, Engineering and construction projects, Construction prices, Data mining