Computational analysis of the immunogenicity and sequence diversity of Mycobacterium tuberculosis PPE_MPTR proteins

Colic, Antoinette (2017-03)

Thesis (MSc)--Stellenbosch University, 2017.

Thesis

ENGLISH ABSTRACT: Mycobacterium tuberculosis presents a substantial health risk to humans, particularly in Africa. Prevention of infectious diseases via vaccination is the most effective strategy for decreasing prevalence; however, the current BCG vaccine against tuberculosis has shown varying levels of efficacy. M. tuberculosis infection represents an ongoing interaction between the host and the bacterium, and the molecular mechanisms contributing to pathogenesis are not yet fully understood. A deeper understanding of host-pathogen interactions is an important step towards developing new and more effective vaccines and thereby combating the disease. Protective immunity against M. tuberculosis is induced by stimulating antigen-specific T-cells, which recognise peptide antigens presented by HLA molecules on infected cells. Identifying epitopes that are capable of binding to HLA molecules and eliciting T-cell responses forms part of the development of subunit vaccines. The function of the PE/PPE proteins is a poorly understood area of mycobacterial biology. These proteins are a large, genetically diverse family of immunogenic proteins that are predicted to play a role in modulating host immune responses. In particular, the PPE major polymorphic tandem repeat (PPE_MPTR) proteins are a subgroup of the PE/PPE proteins that is restricted to pathogenic mycobacterial species and represents one of the most genetically diverse sets of proteins within the M. tuberculosis proteome. While many studies have investigated the presence of T-cell epitopes within the PE/PPE family of proteins, none has focused specifically on the PPE_MPTR subfamily. Because of the extreme variation in both the length and genetic diversity of the PPE_MPTR proteins, it has been speculated that they may represent a source of antigenic variation that allows the organism to escape antigen-specific host responses.
Given the hyper-variable nature of the PPE_MPTR proteins and their possible role in host-pathogen interactions, genetic diversity within the PPE_MPTR proteins may differentially modulate the human immune response. Furthermore, epitopes within the PPE_MPTR proteins may be possible subunit vaccine candidates for M. tuberculosis. Conventional experimental techniques used to identify potential epitopes are often time-consuming and expensive. Several computational tools exist to predict the binding of peptide sequences to various HLA alleles. Using a collection of known M. tuberculosis epitopes from the Immune Epitope Database (IEDB), the current open-source HLA class II prediction tools were evaluated, and the results were used to inform an in silico identification of human CD4+ T-cell epitopes within the PPE_MPTR proteins. Characterisation of the genetic diversity of these proteins is also an essential step in improving our understanding of this protein family. Publicly available whole-genome sequence data from strains belonging to various lineages were used to investigate the level of sequence diversity within the ppe_mptr genes and the impact of genetic variants on epitope density. To date, this study is the most comprehensive analysis of the genetic variation of the ppe_mptr genes. Predicted epitopes were filtered using a reverse vaccinology approach in order to identify possible subunit vaccine candidates for M. tuberculosis. Findings from the epitope prediction analysis support the hypothesis that the PPE_MPTR proteins are involved in host-pathogen interactions. Genetic variation results indicate that certain PPE_MPTR proteins are highly variable while others are relatively conserved across strains, and that genetically diverse regions are less likely to contain epitopes. Therefore, no evidence to support antigenic variation was found.
Areas of high and low epitope density correlate with non-repeat and repeat regions within the genome, respectively, and epitopes within the PPE_MPTR proteins are therefore conserved, non-repeating peptides. This is consistent with previous literature on the conservation of reported M. tuberculosis epitopes within clinical strains. Further studies are therefore needed to determine the role of the variable copy number of repeats found within the PPE_MPTR proteins. Possible vaccine candidates within the PPE_MPTR proteins with high predicted population coverage in African countries have been identified.
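
The comparison of epitope density between repeat and non-repeat regions described above can be illustrated with a minimal Python sketch. This is not the thesis's actual pipeline. It assumes that "epitope density" means the fraction of residues covered by at least one predicted epitope, and the function names, interval convention and toy coordinates are all hypothetical.

```python
from typing import List, Tuple

# Half-open [start, end) intervals over 0-based residue positions (assumed convention).
Interval = Tuple[int, int]


def coverage_mask(length: int, intervals: List[Interval]) -> List[bool]:
    """Mark each residue that falls inside at least one interval."""
    mask = [False] * length
    for start, end in intervals:
        for pos in range(max(start, 0), min(end, length)):
            mask[pos] = True
    return mask


def epitope_density(
    length: int, epitopes: List[Interval], repeats: List[Interval]
) -> Tuple[float, float]:
    """Return (density in repeat regions, density in non-repeat regions).

    Density is taken to be the fraction of residues covered by at least one
    predicted epitope; this definition is an assumption for illustration.
    """
    in_epitope = coverage_mask(length, epitopes)
    in_repeat = coverage_mask(length, repeats)

    def fraction(flag: bool) -> float:
        positions = [i for i in range(length) if in_repeat[i] == flag]
        if not positions:
            return 0.0
        return sum(in_epitope[i] for i in positions) / len(positions)

    return fraction(True), fraction(False)


# Hypothetical 100-residue protein: residues 40-99 are tandem repeats,
# and three 15-residue epitopes have been predicted.
toy_epitopes = [(0, 15), (10, 25), (50, 65)]
toy_repeats = [(40, 100)]
repeat_density, non_repeat_density = epitope_density(100, toy_epitopes, toy_repeats)
print(f"repeat: {repeat_density}, non-repeat: {non_repeat_density}")
```

In this toy example the non-repeat region has the higher density, which mirrors the direction of the reported finding. Real input would come from the output of an HLA class II prediction tool and from repeat annotations of each ppe_mptr gene.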

AFRIKAANS ABSTRACT: Mycobacterium tuberculosis poses a considerable health risk to humans, especially in Africa. Early vaccination is the most successful strategy for combating infectious diseases. Unfortunately, BCG (Bacillus Calmette–Guérin), the only vaccine against tuberculosis, has achieved variable success in this task. M. tuberculosis infections represent an ongoing struggle between host and bacterium, the molecular mechanisms of which, as they contribute to pathogenesis, have not yet been fully described. A more in-depth understanding of the host-pathogen interaction will therefore promote the development of effective vaccines. Protective immunity against M. tuberculosis is induced by stimulating antigen-specific T-cells that recognise peptide antigens presented by HLA (human leucocyte antigen) molecules. The identification of epitopes that bind to HLA molecules and elicit T-cell responses forms part of the development of subunit vaccines. The contribution and function of PE/PPE proteins in mycobacterial biology are not yet fully understood. PE/PPE proteins come from a large, genetically diverse family of immunogenic proteins that are predicted to play a role in modulating the host immune response. The PPE “major polymorphic tandem repeat” (PPE_MPTR) proteins, which form a subgroup of the PE/PPE proteins, are restricted to the pathogenic mycobacterial species and represent the most genetically diverse set of proteins in the M. tuberculosis proteome. Although much research has been carried out on the presence of T-cell epitopes within the PE/PPE family of proteins, no study has yet focused specifically on the PPE_MPTR subfamily.
Because of the high variation in both the length and genetic diversity of PPE_MPTR proteins, it is speculated that PPE_MPTR is a source of antigenic variation that enables the organism to evade the antigen-specific host response. The high variation of the PPE_MPTR proteins and their possible role in host-pathogen interaction may modulate the host immune response. Epitopes within the PPE_MPTR proteins may therefore be good candidates for subunit vaccines against M. tuberculosis. Traditional methods used to identify potential epitopes are often time-consuming and expensive. Computer-based techniques have therefore been developed to predict the binding of peptides to various HLA alleles. Several open-source HLA class II prediction techniques were used to identify CD4+ T-cell epitopes in silico within the PPE_MPTR proteins, making use of a collection of known M. tuberculosis epitopes obtained from the Immune Epitope Database (IEDB). The characterisation of PPE_MPTR proteins is an essential step in developing knowledge about this protein family. Public whole-genome data from strains belonging to various lineages were used to determine variation within the ppe_mptr genes, and the impact of genetic variants on epitope density was also determined. The current study, which investigates PPE_MPTR genetic variation, is the most comprehensive analysis to date. Predicted epitopes were selected using a reverse vaccinology approach to identify subunit vaccines against M. tuberculosis. PPE_MPTR protein epitope predictions support the hypothesis of host-pathogen interaction. Analyses of the genetic variation indicate that certain PPE_MPTR proteins are highly variable, while others remain relatively conserved across different strains. Thus, no evidence supporting antigenic variation was found.
Areas of high and low epitope density correlate well with non-repeating and repeating parts of the genome, respectively. Epitopes within the PPE_MPTR proteins are therefore conserved, non-repeating peptides. This is in line with previous literature indicating the conservation of M. tuberculosis epitopes within clinical strains. Further research is needed to determine the role of the variation in the number of repeats within the PPE_MPTR proteins. Vaccine candidates offering high predicted coverage across African countries have been identified within the PPE_MPTR proteins.

Please refer to this item in SUNScholar by using the following persistent URL: http://hdl.handle.net/10019.1/101469