Browsing by Author "Middleton, Francesca"
- Array completion methods for thermodynamic data generation (Stellenbosch : Stellenbosch University, 2023-12) Middleton, Francesca; Cripwell, Jamie Theo; Stellenbosch University. Faculty of Engineering. Dept. of Chemical Engineering. ENGLISH ABSTRACT: This investigation considered the viability of array completion methods (ACMs), a class of machine learning method, for pseudo-data generation for thermodynamic properties. The purpose of the pseudo-data generation was to aid thermodynamic model development, such as that of complex equations of state used in the development and optimisation of processes in the chemical engineering industry. The excess enthalpy of binary liquid mixtures was used for this investigation. This property has significant variations in behaviour that are difficult to predict accurately. Excess enthalpy data are expensive to produce with experimental methods, and array completion aims to reduce this expense. The ACM was chosen over other machine learning methods because it is purely data-driven, and therefore does not require descriptors, and because it works well with sparse datasets. ACMs operate solely on the data available within the array, making data quality a critical factor for optimal outcomes. A meticulous data collection effort was undertaken to achieve the overarching goal of pseudo-data generation. Reliable excess enthalpy data were collected for binary liquid mixtures encompassing various temperature conditions. The array of excess enthalpy data had 4 dimensions or ways, including the mixtures’ two components on the first two ways and the mixtures’ composition and temperature conditions on the third and fourth ways, respectively. The study involved the exploration of three ACMs, using singular value decomposition (SVD) for 2-way or matrix completion methods (MCM) and higher-order SVD (HOSVD) for 3- and 4-way completion.
When used in conjunction with UNIFAC predictions, the MCM outperformed the standalone UNIFAC model. Notably, it was found that a rank of ses for the decomposition suffices for completing the excess enthalpy data array. The research demonstrated, however, that the 3-way and 4-way ACMs did not apply to the excess enthalpy data. The MCM was, therefore, applied on 2-way or matrix slices of the array formed at discrete temperature and composition conditions. The slices were related via a constraint when completing matrix slices in parallel, ensuring smooth predictions across composition. This adjustment to the MCM significantly improved prediction quality and allowed the MCM to be successfully applied to matrices of constant temperature and composition conditions. The optimal pattern of missing entries for pseudo-data generation was found to be randomly missing entries, as opposed to systematically missing entries. Therefore, the concept of targeted measurements is proposed. This involves directing thermodynamic experiments towards creating randomly missing patterns of entries in arrays. This also fills sparse areas of the arrays, allowing the MCM to produce better quality pseudo-data at a lower cost than experimentation. This circumvention of the limitations imposed by data sparsity could enrich the training data for thermodynamic models and enhance their predictive capabilities. The efficacy of the MCM was also found to rely on initial guesses for missing entries in an array. The research demonstrated the synergy of ACMs with UNIFAC, where the group contribution method provided initial guesses for the MCM, resulting in a hybrid thermodynamic-machine learning method. These informed initial guesses also provided insight into the interpretation of pseudo-data sets, as UNIFAC provides informed estimations for the dataset and can, thus, provide quick checks to users of the MCM.
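The idea of completing a matrix slice from informed initial guesses can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes a generic iterative truncated-SVD imputation scheme, and the function name `complete_matrix`, the synthetic rank-2 data, and the noisy stand-in for UNIFAC predictions are all hypothetical choices made for illustration.

```python
import numpy as np


def complete_matrix(observed, mask, initial_guess, rank, n_iter=500, tol=1e-8):
    """Fill missing entries by repeatedly projecting onto a rank-`rank` SVD.

    observed      : 2-D array; only entries where mask is True are used
    mask          : boolean array, True where an entry was measured
    initial_guess : 2-D array of starting values for the missing entries
                    (in the abstract's setting these would come from UNIFAC)
    """
    filled = np.where(mask, observed, initial_guess).astype(float)
    for _ in range(n_iter):
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        # Keep only the leading `rank` singular triplets.
        low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank, :]
        # Measured entries stay fixed; missing entries take the low-rank value.
        updated = np.where(mask, observed, low_rank)
        change = np.linalg.norm(updated - filled)
        filled = updated
        if change < tol:
            break
    return filled


# Synthetic demonstration (assumed data, not excess enthalpy measurements).
rng = np.random.default_rng(0)
truth = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 30))  # exactly rank 2
mask = rng.random(truth.shape) < 0.8                         # randomly missing entries
guess = truth + rng.normal(scale=0.5, size=truth.shape)      # stand-in for model predictions

completed = complete_matrix(truth, mask, guess, rank=2)

err_guess = np.linalg.norm((guess - truth)[~mask])
err_completed = np.linalg.norm((completed - truth)[~mask])
print(f"error of initial guesses:   {err_guess:.4f}")
print(f"error after completion:     {err_completed:.4f}")
```

In this sketch the measured entries are never altered, so the only quantities being estimated are the missing ones. The smoothness constraint that links slices across composition, described above, is not modelled here.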
The efficacy of the MCM for varied thermodynamic complexity was also investigated, using the mathematical and thermodynamic descriptions of the data. This included investigating behaviour for the functional groups present in a mixture and other measures of the complexity of mixture behaviour. The MCM recognised underlying patterns inherent in thermodynamic theory, and grouped systems based on their behaviour. The mixture complexity played a small role in prediction accuracy, as mixtures of varied complexity required the same rank for optimal completion. It was, instead, clear that the distribution of data and the presence of similar mixtures played a more pivotal role in predicting the accuracy of the pseudo-data generated. The implications of the study extend to future research. While effective, the MCM employed in this study warrants further refinement, possibly by incorporating fundamental knowledge and robust statistical motivations. This research contributes to understanding how ACMs can be used for pseudo-data generation for composition-dependent thermodynamic properties. The investigation used the excess enthalpy of binary liquid mixtures, a difficult-to-predict property, and succeeded, demonstrating the MCM's efficacy.