Graph-based semi-supervised learning for the detection of potential disease causing genes

dc.contributor.advisor: Van Vuuren, J. H. [en_ZA]
dc.contributor.author: Van Zyl, G. [en_ZA]
dc.contributor.other: Stellenbosch University. Faculty of Engineering. Dept. of Industrial Engineering. [en_ZA]
dc.date.accessioned: 2020-10-07T18:20:55Z
dc.date.accessioned: 2021-01-31T19:35:33Z
dc.date.available: 2020-10-07T18:20:55Z
dc.date.available: 2021-01-31T19:35:33Z
dc.date.issued: 2020-12
dc.description: Thesis (PhD)--Stellenbosch University, 2020. [en_ZA]
dc.description.abstract: ENGLISH ABSTRACT: It is widely believed that almost all diseases are, to some extent, influenced by individuals’ genetic make-up. Ample insight into this relationship may usher in a new age where preventative, precision medicine is the norm. The identification of human genes associated with diseases (disease genes in short) is a central step in the realisation of this ambition. The development of computational approaches aimed at identifying putative disease genes among a large pool of candidates, so as to reduce the number of alternatives to be explored in further validation experiments and functional studies, has therefore become one of the fundamental problems in bioinformatics. In the realm of bioinformatics, disease gene classification is primarily based on the principle that “the network neighbour of a disease gene is likely to cause the same or a similar disease.” In this dissertation, a novel computational approach to the disease gene identification problem is proposed. This methodological framework utilises the aforementioned principle and exploits both the modular nature of biological networks and the abundance of available data related to the similarities between genes within a semi-supervised machine learning paradigm. The proposed disease gene identification methodology is demonstrated practically and found to exhibit significant classification abilities. In addition, the framework is successfully applied to obtain ranked sets of putative disease gene predictions, a number of which are verified by retrieving evidence of their involvement in the origins of diseases from the literature. [en_ZA]
dc.description.abstract: AFRIKAANS SUMMARY (translated from Afrikaans): It is generally believed that nearly all diseases are, to some extent, influenced by the genetic make-up of individuals. Good insight into this relationship may lead to a new era in which preventative, precision medicine is the norm. The identification of human genes that are associated with diseases (disease genes in short) is a central step in the realisation of this ideal. The development of computational approaches aimed at identifying putative disease genes among a large number of candidates, so as to reduce the number of alternatives that must be investigated in further validation experiments and functional studies, is therefore one of the fundamental problems in bioinformatics. In the field of bioinformatics, the classification of disease genes is primarily based on the principle that “the network neighbour gene of a disease gene is likely to cause the same or a similar disease.” In this dissertation, a new computational approach to the disease gene identification problem is put forward. This methodological framework makes use of the aforementioned principle and exploits both the modular nature of biological networks and the abundance of available data related to the similarities between genes within a semi-supervised machine learning paradigm. The proposed methodology for the identification of disease genes is demonstrated practically, and it is found that the methodology possesses significant classification abilities. In addition, the framework is successfully applied to produce rankings of putative disease genes, a number of which are verified by substantiating evidence of their involvement in the origins of diseases from the literature. [af_ZA]
dc.description.version: Doctoral [en_ZA]
dc.format.extent: 411 pages : illustrations [en_ZA]
dc.identifier.uri: http://hdl.handle.net/10019.1/109105
dc.language.iso: en_ZA [en_ZA]
dc.publisher: Stellenbosch : Stellenbosch University [en_ZA]
dc.rights.holder: Stellenbosch University [en_ZA]
dc.subject: Bioinformatics [en_ZA]
dc.subject: Protein-protein interaction network [en_ZA]
dc.subject: Linkage mapping (Genetics) [en_ZA]
dc.subject: Heterogeneous distributed computing systems [en_ZA]
dc.subject: UCTD [en_ZA]
dc.title: Graph-based semi-supervised learning for the detection of potential disease causing genes [en_ZA]
dc.type: Thesis [en_ZA]
Files
Original bundle
Name: vanzyl_graph_2020.pdf
Size: 260.48 MB
Format: Adobe Portable Document Format