Copy number variations in South African Nguni cattle : prevalence, characterization and genetic diversity

Wang, Magretha Diane (2016-12)

Thesis (PhDAgric)--Stellenbosch University, 2016.

Thesis

ENGLISH ABSTRACT: Copy Number Variations (CNVs) comprise deletions, duplications and insertions larger than 1kb that occur within genomes. The identification of CNVs within regions of the bovine genome that are important for adaptation suggests that they play a potential role in breed formation and adaptation. South African Nguni cattle are adapted to the harsh environmental conditions of Southern Africa and demonstrate an enhanced ability to endure them. This study investigated the prevalence of CNVs in the genome of South African Nguni cattle. CNV occurrence and distribution within Nguni subpopulations were assessed and comparisons with other South African cattle breeds were performed. The dynamics between CNVs and haplotype blocks (HPBs), correlations amongst CNVs and the genic locality of CNVs were investigated with the objective of determining CNV prevalence in adaptation. In the first experiment, the Illumina BovineSNP50 beadchip was used to genotype 492 South African Nguni cattle sampled nationwide. PennCNV software identified 334 CNV regions (CNVRs) of between 30kb and 1Mb in length. Population structure analysis was performed using ADMIXTURE software and HPBs were identified using PLINK software. Five subpopulations were evident, with some degree of CNV segregation amongst them. CNVRs covered or lay within 10Mb of 289 genes, of which 149, 28, 44, 2 and 14 genes were exclusive to the respective five subpopulations. Some degree of overlap between CNVRs and the 541 HPBs was evident. In the second experiment, 59 Nguni genotypes were analyzed using the Bovine 50K Beadchip in conjunction with six other South African breeds. PennCNV software identified 356 unique CNVRs. One hundred and sixty-three CNVRs identified in more than one animal were utilized as genetic markers to assess within- and between-breed genetic diversity (GD). Between-breed-group GD scores were 2.510, 6.115 and 4.233 for the Sanga, Taurine and composite breeds respectively.
One hundred and two (Taurine) and seven (Sanga and composite) of the CNVRs demonstrated a significant (p≤0.05) association with one another. PANTHER overrepresentation analyses demonstrated significant representation of a number of processes, functions, components and proteins by correlated CNVR genes. CNVR-based phylogenetic analysis clustered animals of the same breed group together. In the third experiment, 24 Nguni animals were sequenced at 7X coverage using Illumina next-generation sequencing (NGS) technologies. Reads were mapped to the UMD3.1 reference genome and RAPTR-SV software was utilized to identify CNVs. The identified CNVs were filtered according to the number of reads supporting each event, at low (F10), medium (F45) and high (F75) stringencies. Adjacent and overlapping CNVs were merged to form 399, 55 and 23 unique CNVRs that covered or lay within 1Mb of 358, 51 and 23 genes at F10, F45 and F75 stringencies respectively. NGS tools identified smaller CNVs than those reported from SNP data. Despite discrepancies between array and NGS methods, CNVR genes represented the same specific ontologies. The study demonstrated CNVRs to be prevalent in South African Nguni cattle, with a potential role in breed formation and adaptation. CNVR GD scores, population structure, distribution and incidence dynamics were thus ascertained for the South African Nguni.
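The filtering and merging steps described in the third experiment can be illustrated with a minimal Python sketch. It is not the thesis pipeline or the RAPTR-SV interface. The `CNV` record, the `min_support` threshold and the `max_gap` parameter are hypothetical names introduced here, and the abstract does not state how the F10, F45 and F75 labels map to read counts. Coordinates are assumed to be 1-based and inclusive, so two calls count as adjacent when one starts directly after the other ends.

```python
from collections import namedtuple

# Hypothetical record for one CNV call: chromosome, start, end (1-based,
# inclusive) and the number of reads supporting the event.
CNV = namedtuple("CNV", ["chrom", "start", "end", "support"])


def filter_by_support(cnvs, min_support):
    """Keep only calls supported by at least `min_support` reads (assumed rule)."""
    return [c for c in cnvs if c.support >= min_support]


def merge_into_cnvrs(cnvs, max_gap=1):
    """Merge overlapping or adjacent calls on the same chromosome into regions.

    A call joins the current region when its start lies no more than
    `max_gap` bases past the region's end; max_gap=1 merges calls that
    overlap or sit directly next to each other.
    """
    regions = []
    for c in sorted(cnvs, key=lambda c: (c.chrom, c.start)):
        last = regions[-1] if regions else None
        if last and last[0] == c.chrom and c.start <= last[2] + max_gap:
            last[2] = max(last[2], c.end)  # extend the open region
        else:
            regions.append([c.chrom, c.start, c.end])  # start a new region
    return [tuple(r) for r in regions]


# Illustrative, made-up calls (not data from the thesis).
calls = [
    CNV("chr1", 100, 200, 12),
    CNV("chr1", 150, 300, 50),
    CNV("chr1", 301, 400, 80),
    CNV("chr1", 1000, 1100, 5),
    CNV("chr2", 100, 200, 90),
]

lenient = merge_into_cnvrs(filter_by_support(calls, 10))
strict = merge_into_cnvrs(filter_by_support(calls, 75))
```

Raising the support threshold removes weakly supported calls before merging, which can split or drop regions. This matches the direction of the trend in the abstract, where stricter filtering yields fewer CNVRs.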

AFRIKAANS SUMMARY (English translation): Copy Number Variations (CNVs) consist of genomic deletions, duplications or insertions larger than 1kb in the genome. The identification of CNVs within regions of the bovine genome that are important for adaptation means that they could play a potential role in breed formation and adaptation. South African Nguni cattle are adapted to and able to withstand the harsh climatic conditions experienced in Southern Africa. This study investigated the presence of CNVs in the genome of South African Nguni cattle. The occurrence and distribution of CNVs within the Nguni subpopulations were assessed and compared with other South African cattle breeds. The dynamics between CNVs and haplotype blocks (HPBs), the correlation between different CNVs and their position on the genome were studied with the aim of determining CNV prevalence in adaptation. In the first experiment, the Illumina BovineSNP50 beadchip was used to genotype 492 Nguni cattle sampled nationwide. Three hundred and thirty-four CNV regions (CNVRs) with lengths between 30kb and 1Mb were identified with PennCNV software. Population structure analysis and HPB evaluation were carried out with ADMIXTURE and PLINK software respectively. Five subpopulations were clearly distinguishable, with a certain degree of CNV segregation. The CNVRs were observed within 10Mb of 298 genes; and 149, 28, 44, 2 and 14 genes respectively could be attributed to each of the five subpopulations. A certain degree of overlap could be observed between the CNVRs and the 541 HPBs. In the second experiment, genotypes of 59 Nguni cattle were analysed with the Bovine 50K Beadchip, together with six other South African cattle breeds. PennCNV identified 356 unique CNVRs. Genetic diversity (GD) was determined on the basis of 163 CNVRs that occurred in more than one animal. The GD scores between the different cattle breed groups were 2.510, 6.115 and 4.233 for the Sanga, Taurine and composite breeds respectively.
A total of 102 (Taurine) and seven (Sanga and composite) CNVRs showed a significant association (p≤0.05) with one another. Overrepresentation analysis with PANTHER software demonstrated an overwhelming representation of processes, functions, components and proteins correlated with the CNVR genes. CNVR phylogenetic trees grouped animals of the same breed type together. It was also notable that specific CNVs could distinguish between different breeds. In the third experiment, the genomes of 24 Nguni cattle were sequenced (at 7X coverage) using Illumina next-generation sequencing (NGS) technology. Reads were aligned to the UMD3.1 reference genome, and RAPTR-SV software was used to identify CNVs. The CNVs were filtered on the number of reads supporting the sequence, at low (F10), medium (F45) and high (F75) stringencies. Adjacent and overlapping CNVs were merged to form 399, 55 and 23 unique CNVRs spread across 1Mb. Approximately 358, 51 and 23 genes could be identified at F10, F45 and F75 respectively. NGS technology could identify smaller CNVs compared with data from SNPs. Despite discrepancies between the two methods, the same specific ontologies were represented by the CNVR genes. Overall, this study demonstrates that CNVRs occur commonly in South African Nguni cattle, with potential roles in breed formation and adaptation. CNVR GD scores, population structure, distribution and incidence dynamics were thus established for the South African Nguni.

Please refer to this item in SUNScholar by using the following persistent URL: http://hdl.handle.net/10019.1/100361