Identification of novel Parkinson’s disease genes in the South African population using a whole exome sequencing approach

Glanzmann, Brigitte (2016-03)

Thesis (PhD)--Stellenbosch University, 2016.


ENGLISH SUMMARY: Parkinson’s disease (PD) is a progressive and severely debilitating neurodegenerative disorder that is characterised by a range of motor symptoms and the selective loss of dopaminergic neurons in the substantia nigra. While the aetiology of PD remains poorly understood, it is hypothesised to involve a combination of environmental, genetic and cellular factors that independently or collectively contribute to neurodegeneration and ultimately disease. To date, a number of genes, including Parkin, PINK1, LRRK2, SNCA, DJ-1, ATP13A2 and VPS35, have been directly associated with disease, and investigations of their functions have provided significant insights into the pathobiology of PD. However, these genes do not play a significant role in the South African PD cohort and for this reason, novel genes and pathogenic mutations must be investigated and identified. This will aid in the early diagnosis of patients and ultimately in the design of more effective therapeutic strategies to treat this debilitating and poorly understood chronic systemic disorder. The present study aimed to identify novel PD-causing mutations in the South African Afrikaner population using a genealogical and whole exome sequencing (WES) approach. The Afrikaner are unique to South Africa and are known to have undergone a bottleneck in the 1800s, which has led to genetic founder effects for a number of different disorders in this particular group. Additionally, we aimed to determine whether the identified putative disease-causing mutation(s) could be attributed to the development of PD in other South African ethnic groups. A total of 458 patients were recruited, of whom 148 self-identified as Afrikaner. From these, a total of 48 Afrikaner probands were subjected to extensive genealogical analyses and 40 of them could be traced back to a single common couple.
For this reason, it was hypothesised that the disorder in these patients may be due to a genetic founder effect. The use of a whole genome SNP array confirmed the relatedness of the individuals to varying degrees (8 to 12 generations back) and subsequently three of the probands and one affected sibling were selected for WES. The selected individuals were sequenced using the Illumina HiSeq 2000™ and approximately 78 000 variants were identified for each individual. Numerous bioinformatics tools were used to scrutinise the variants but none was able to produce a candidate list of plausible disease-causing variants. All variants identified were either present at high frequency, did not co-segregate with the disorder or were artefacts. In order to facilitate and expedite the variant prioritisation process, a novel method for the filtration of WES data was designed in-house. This strategy, named TAPER™ (Tool for Automated selection and Prioritisation for Efficient Retrieval of sequence variants), implements a set of logical steps by which to prioritise candidate variants that could be pathogenic. It is primarily aimed at supporting resource-constrained scientific environments with limited bioinformatics capacity. As a proof of concept, various independent WES datasets for PD, severe intellectual disability and microcephaly, as well as ataxia and myoclonic epilepsy, were used, and TAPER™ was able to successfully prioritise and identify the causal variants in each case. Through the use of TAPER™, two putative candidate variants in SYNJ1 and USP17 were identified. The homozygous V1405I variant in SYNJ1 was found only in the affected sibling pair and in none of the 458 patients and 690 control individuals that had been screened. This variant is predicted to be deleterious across multiple platforms, has a CADD score of 29.40 and may alter synaptic vesicle recycling.
The homozygous C357S variant in USP17 was found in 18/458 probands (12 Afrikaner, two white and four mixed ancestry) but was also identified in 0.14% of the controls (1/184 Afrikaner, 0/160 white, 0/180 mixed ancestry and 0/160 black). This variant is also predicted to be deleterious across multiple platforms and has a CADD score of 34.89. In summary, the results of the present study reveal that PD in the 40 South African Afrikaner patients studied is not due to a founder effect, but highlight two variants of interest for future studies. Further work is necessary to analyse both of these variants and to assess their possible effect on protein structure and function. The discovery of novel PD-causing genes is important as this allows for the generation of disease-linked protein networks, thereby facilitating identification of additional disease genes and subsequently providing insights into the underlying pathobiology. Moreover, this knowledge is critical for the development of improved treatment strategies and drug interventions that will ultimately prevent or halt neuronal cell loss in susceptible individuals. Although the present study did not conclusively identify a novel PD-causing gene, it does provide a solid foundation for future work in our laboratory in the challenging and rapidly evolving research area of WES and bioinformatics, and its application to studies on PD.

AFRIKAANS SUMMARY: Parkinson’s disease (PD) is a severely debilitating neurodegenerative disease that is characterised by a variety of symptoms and ultimately causes impairment of movement. This condition results from selective degeneration of the dopaminergic neurons of the substantia nigra pars compacta in the midbrain. This leads to pathological symptoms, namely bradykinesia, resting tremor, postural instability and rigidity. Initially, it was hypothesised that people who develop PD had been exposed to environmental triggers that cause the onset of the disease. However, recent evidence indicates that both environmental and genetic factors play a role in the pathogenesis of the disease. Currently, seven genes (Parkin, PINK1, LRRK2, SNCA, DJ-1, ATP13A2 and VPS35) are directly implicated in PD. The aim of the present study was to identify PD-causing mutations in the South African Afrikaner population using a genealogical and whole exome sequencing (WES) approach. The Afrikaner are unique to South Africa and underwent a genetic bottleneck in the 1800s that led to genetic founder effects. In addition, we aimed to determine whether the identified putative disease-causing mutation(s) could be attributed to the development of PD in other South African ethnic groups. A total of 458 patients were recruited for the study, of whom 148 were self-identified Afrikaners. A total of 48 Afrikaner probands were subjected to genealogical analysis and 40 of them could be traced back to a single common ancestor. It was therefore hypothesised that the individuals are related to one another and that PD is due to a founder effect. The use of a whole genome SNP array confirmed the relatedness of the individuals to varying degrees (between 8 and 12 generations) and accordingly three of the probands and one affected blood relative were selected for WES.
The selected exomes were sequenced using the Illumina HiSeq 2000™ and approximately 78 000 variants were identified for each individual. Various bioinformatics tools were used to study the variants obtained by WES, but none was able to identify a putative list of plausible disease-causing variants. To support the variant identification process, a new method for the filtration of WES data was developed, namely TAPER™ (Tool for Automated selection and Prioritisation for Efficient Retrieval of sequence variants). TAPER™ implements a set of logical steps by which candidate variants associated with the disease are selected; it aims to support scientific environments with limited bioinformatics capacity. Furthermore, the success of TAPER™ was evaluated on existing datasets as a proof of concept. Using TAPER™, two probable candidate variants in SYNJ1 and USP17 were identified. The V1405I variant in SYNJ1 was found only in a pair of affected blood relatives and in none of the 458 patients or 690 screened control individuals. This variant is predicted to be deleterious and has a CADD score of 29.40. The C357S variant in USP17 was found in the homozygous state in 18/458 probands (12 Afrikaner, two white and four mixed ancestry). However, it was also identified in 0.14% of the control individuals (1/184 Afrikaner, 0/160 white, 0/180 mixed ancestry and 0/160 black), who were obtained from the Western Province Blood Transfusion Service. This variant is predicted to be deleterious and has a CADD score of 34.89. The results of the present study show that PD in the South African Afrikaner does not originate from a founder, but highlight two variants of interest. Further work is needed to analyse each of the variants and to investigate their possible pathogenicity.
The discovery of new PD-causing genes is important because it aids the development of disease-related protein networks, thereby helping to identify additional genes in key disease processes and consequently providing core biological insight into the underlying processes. Although the present study did not identify a new PD-causing gene, it does offer a firm platform for future research in the challenging and rapidly changing fields of WES and bioinformatics and their application to PD studies.

Please refer to this item in SUNScholar by using the following persistent URL: http://hdl.handle.net/10019.1/98301