The utilisation of whole exome sequencing to dissect the genetic aetiology of familial Parkinson’s disease in a South African Afrikaner family

Sebate, Boiketlo (2019-04)

Thesis (MSc)--Stellenbosch University, 2019.

Thesis

ENGLISH ABSTRACT: Parkinson’s disease (PD) is a complex neurodegenerative disorder, the aetiology of which is thought to involve an interaction of genetic, biological and environmental factors. Its cardinal motor features (tremor, muscular rigidity, bradykinesia and abnormal gait) occur relatively late in the course of the disease, as a result of the loss of over 60% of the dopaminergic neurons in the substantia nigra pars compacta. While most reported PD cases are sporadic, 5-10% of all cases are caused by mutations in one of several genes, including Parkin, PINK1, LRRK2, SNCA, SYNJ1, DJ-1 and EIF4G1. These PD genes were discovered using first-generation sequencing technologies, which were expensive and time-consuming. The development of high-throughput next-generation sequencing (NGS) technologies such as whole exome sequencing (WES) has fast-tracked the discovery of disease-causing genetic mutations in various Mendelian disorders such as PD. WES screens only the protein-coding regions of the genome to locate mutations that can disrupt cellular processes and lead to disease. To date, WES has identified several PD-causing genes, including CHCHD2, VPS35 and LRP10, implicating the dysfunction of pathways regulating mitochondrial, lysosomal and synaptic function. The aim of the present study was to combine WES technology and functional studies to identify a novel pathogenic mutation in a gene that could be implicated in the autosomal dominant form of PD in a South African Afrikaner family. To do this, a comprehensive filtration strategy was applied, combining various bioinformatic tools in a step-wise analysis approach to assist in the filtration, interpretation and prioritisation of the NGS results. The bioinformatic tools used included SIFT, PolyPhen-2, MutationTaster, CADD, GERP++, Allen Brain Atlas, PANTHER and SWISS-MODEL.
Once a single candidate gene was selected from the computational prioritisation, its protein expression was investigated in a disease-relevant cell model using Western blotting. WES was conducted on three affected individuals, yielding over 20,000 variants each. After quality control (sequence alignment, alignment postprocessing, variant detection and quality evaluation), bioinformatic filtering for co-segregating non-synonymous variants yielded nine variants. Sanger sequencing was used to verify these variants, and four variants in the genes ACTN3, CDC27, POU2F1 and TUBB6 were found to be sequencing artefacts. Five variants in the genes RFT1, NRXN2α, TEP1, CCNF and CFAP65 were found to be present in all four affected PD individuals of the family. Only one variant, p.G849D in NRXN2α, fulfilled the various prioritisation criteria. The mutation was not present in the unaffected family members, in 671 South African PD patients or in 192 ethnically matched controls. Multiple online population frequency databases also showed that the variant had not been previously reported in any population. A search of a web-based database containing the exomes of 3000 patients with neurological disorders identified 50 PD patients carrying 24 other variants in the NRXN2α gene, including indel mutations and premature stop codons. Amongst the five verified candidates, the p.G849D NRXN2α variant was predicted to be pathogenic by all four functional prediction tools and had the highest Combined Annotation Dependent Depletion (CADD) score (29.50). It also had a very high GERP++ score, indicating that the evolutionary constraint acting on this site is predicted to be very strong. The p.G849D NRXN2α amino acid change was the most severe: from glycine, a small, non-polar amino acid lacking a side chain, to aspartic acid, a larger, negatively charged amino acid.
Homology modelling of the mutant versus the wild-type protein revealed no change in the protein secondary structure, but biochemically the substitution could lead to unwanted interactions with neighbouring residues, which could affect the function and activity of the protein. NRXN2α was also found to be highly expressed in the substantia nigra, the main region of the brain affected in PD pathogenesis. It is associated with pathways related to calcium channel regulation, transmembrane signalling receptor activity, neuronal cell adhesion, synaptic organisation and neuroligin family protein binding at the synapse, which makes it a plausible candidate gene for PD. For the functional studies, SH-SY5Y neuroblastoma cells, a commonly used in vitro model for PD, were utilised. We investigated the endogenous NRXN2α levels in the SH-SY5Y cells and found that they produced detectable levels of the NRXN2α protein and stably expressed it. In summary, by integrating WES and in vitro studies, we identified p.G849D in NRXN2α, a variant possibly associated with the autosomal dominant form of PD in a South African Afrikaner family. To our knowledge, this is the first report of an association between NRXN2α and PD. As a candidate, NRXN2α is well suited for future functional characterisation studies of the mutant that will elucidate the impact of variants in this gene and their relative contribution to the disease phenotype. Further study of NRXN2α in PD may provide critical insight into novel disease mechanisms or genetic interactions with established PD mechanisms. Ultimately, this could lead to the development of improved therapeutic modalities for this debilitating disorder.
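
The step-wise filtration described in the abstract (keep non-synonymous variants, require co-segregation with disease, exclude variants seen in reference populations, then rank by CADD score) can be illustrated with a minimal Python sketch. This is not the pipeline used in the thesis: the `Variant` fields, the `prioritise` function, the individual IDs, the placeholder gene labels and all scores below are hypothetical and chosen only to show the logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str             # placeholder label, not a real result
    consequence: str      # e.g. "missense", "synonymous"
    carriers: frozenset   # IDs of sequenced individuals carrying the variant
    population_af: float  # allele frequency in a reference database (0.0 = unreported)
    cadd: float           # CADD PHRED-scaled score (illustrative values only)


# Assumed set of consequence labels treated as non-synonymous in this sketch.
NON_SYNONYMOUS = {"missense", "nonsense", "frameshift"}


def prioritise(variants, affected, unaffected, max_af=0.0):
    """Keep co-segregating, rare, non-synonymous variants; rank by CADD."""
    kept = []
    for v in variants:
        if v.consequence not in NON_SYNONYMOUS:
            continue                      # drop synonymous changes
        if not affected <= v.carriers:
            continue                      # must be in every affected individual
        if v.carriers & unaffected:
            continue                      # must be absent from unaffected relatives
        if v.population_af > max_af:
            continue                      # drop variants seen in reference populations
        kept.append(v)
    return sorted(kept, key=lambda v: v.cadd, reverse=True)


# Toy input: hypothetical individuals and variants.
affected = frozenset({"A1", "A2", "A3"})
unaffected = frozenset({"U1", "U2"})
variants = [
    Variant("GENE_A", "missense", frozenset({"A1", "A2", "A3"}), 0.0, 29.5),
    Variant("GENE_B", "synonymous", frozenset({"A1", "A2", "A3"}), 0.0, 12.0),
    Variant("GENE_C", "missense", frozenset({"A1", "A2"}), 0.0, 25.0),
    Variant("GENE_D", "missense", frozenset({"A1", "A2", "A3", "U1"}), 0.0, 27.0),
    Variant("GENE_E", "missense", frozenset({"A1", "A2", "A3"}), 0.02, 31.0),
    Variant("GENE_F", "nonsense", frozenset({"A1", "A2", "A3"}), 0.0, 22.0),
]

ranked = prioritise(variants, affected, unaffected)
print([v.gene for v in ranked])
```

In this toy input only GENE_A and GENE_F pass every filter. A real analysis would read annotated variant calls from file and would also weigh conservation, expression and pathway evidence, as the abstract describes.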

AFRIKAANS ABSTRACT (English translation): Parkinson’s disease (PD) is a complex neurodegenerative disease, the aetiology of which is an interaction between genetic, biological and environmental factors. The cardinal motor symptoms are described as muscular rigidity, bradykinesia and abnormal gait; they are observed relatively late in the course of the disease, as a result of the 60% loss of dopaminergic neurons in the substantia nigra pars compacta. While most PD cases are sporadic, 5-10% of all cases are caused by several genes, including Parkin, PINK1, LRRK2, SNCA, SYNJ1, DJ-1 and EIF4G1. These familial PD genes were discovered using first-generation sequencing technologies, which were expensive and time-consuming. The development of high-throughput sequencing (NGS) technologies, such as whole exome sequencing (WES), has accelerated the discovery of disease-causing genetic mutations in various Mendelian disorders such as PD. WES screens only the protein-coding regions of the genome in order to detect mutations that can disrupt cellular processes and lead directly to disease. To date, WES has identified several PD-causing genes, including CHCHD2, VPS35 and LRP10, which are implicated in the dysfunction of pathways regulating mitochondrial, lysosomal and synaptic function. The aim of the present study is to combine WES technology and functional studies to identify a novel pathogenic mutation in a gene that may be implicated in the autosomal dominant form of PD in a South African Afrikaner family. To do this, a comprehensive filtering strategy was applied. This strategy combines efficient new bioinformatic tools, each addressing a different task, in a step-by-step analysis approach to assist with the filtering, interpretation and prioritisation of the NGS results.
The bioinformatic tools used include SIFT, PolyPhen-2, MutationTaster, CADD, GERP++, Allen Brain Atlas, PANTHER and SWISS-MODEL. Once a single candidate gene had been identified from the computational prioritisation, a functional analysis was performed to investigate its expression in a disease-relevant cell model. WES was performed on three affected individuals, yielding more than 20,000 variants. Quality control (sequence alignment, alignment processing, variant detection and quality evaluation) and filtering for only the co-segregating, non-synonymous variants through bioinformatic analysis yielded nine variants. Sanger sequencing was used to confirm the nine WES variants, and four variants in the genes ACTN3, CDC27, POU2F1 and TUBB6 were found to be sequencing artefacts. Five variants in the genes RFT1, NRXN2α, TEP1, CCNF and CFAP65 were present in all four affected PD individuals of the family. Only one variant, p.G849D in NRXN2α, met the various prioritisation criteria. The mutation was not present in the unaffected family members or in local patients and controls. It was also not found in online population frequency databases. Using a web-based database containing the exomes of 3000 neurological patients, 50 PD patients were identified with 24 other mutations in the NRXN2α gene, including indel mutations and premature stop codons. Of the five verified candidates, the p.G849D NRXN2α variant was predicted to be pathogenic by all four functional prediction tools, with the highest CADD score. It was also found to have a very high GERP++ score, indicating that the level of evolutionary constraint acting on this site is very high. The p.G849D NRXN2α amino acid change was the most severe: a small, non-polar amino acid without a side chain is changed to aspartic acid, a larger, negatively charged amino acid.
Homology modelling of the mutant versus the wild type revealed no change in the protein secondary structure, but biochemically the change could lead to unwanted interactions with the neighbouring residues, which could possibly affect the function and activity of the protein. NRXN2α was also highly expressed in the pars reticulata of the substantia nigra, the main region of the brain affected in PD pathogenesis. It is associated with pathways related to calcium channel regulation, transmembrane signalling receptor activity, neuronal adhesion, synaptic organisation and neuroligin family protein binding. For the functional studies, SH-SY5Y cells, an established and commonly used in vitro model for PD, were used. We investigated the endogenous NRXN2α levels in the SH-SY5Y cells and found that they produce detectable levels of the NRXN2α protein and stably express this protein. By integrating whole exome sequencing and functional studies, we identified p.G849D in NRXN2α, a variant that may be associated with an autosomal dominant form of PD in a South African Afrikaner family. To our knowledge, this is the first report of an association between NRXN2α and PD. As a candidate, NRXN2α is suitable for future functional mutation characterisation studies that will clarify the impact of variants in this gene and their relative contribution to the disease phenotype. Further study of NRXN2α in PD may provide critical insight into novel disease mechanisms or genetic interactions with established PD mechanisms. This could possibly lead to the development of improved therapeutic modalities for this debilitating disease.

Please refer to this item in SUNScholar by using the following persistent URL: http://hdl.handle.net/10019.1/105628