Proteogenomics of the Spotted Hyena (Crocuta crocuta)

Date
2021-03
Publisher
Stellenbosch : Stellenbosch University
Abstract
ENGLISH ABSTRACT: The spotted hyena (Crocuta crocuta) is an important yet understudied organism that could provide insights into disease resistance, pathogen movement and disease evolution. Spotted hyenas live in matrilineally controlled, transient, clan-like groups that feed on a variety of organic matter and thereby control the spread of pathogenic infections within an environment. As a result, they appear to possess a high degree of resistance to pathogens. In this project, RNA-Seq data were used to assemble a transcriptome for the spotted hyena, and tissue samples were used to acquire protein data via MS/MS analysis. The aim of the study was to produce an accurate assembly from the transcriptomic data and then to validate that assembly with proteomic data as further evidence of its quality. The assembly was produced with the Trinity de novo assembly tool and assessed with the BUSCO and TransRate analysis tools. Orthology detection was carried out with ProteinOrtho against closely related species (tiger, house cat, leopard, cheetah). Finally, LC-MS/MS data (from peripheral, abdominal, head and thoracic lymph tissue, as well as lung and liver tissue), together with fractionated data from the sample containing the most diverse spectra, were searched against both the assembly itself and the translated genome data from the NCBI. These searches were used to assess the proteomic data and to determine, from the number of spectra in the initial analysis compared with the fractionated analysis, whether fractionation succeeded in diversifying the sample. They were also used to determine whether the translated transcriptome assembly could be successfully aligned against the proteomic data.
The quality control results showed that the assembly was of appropriate quality when compared with the standards found within the NCBI and with those described by the quality analysis tools. This, coupled with the analysis of the proteomic data, suggests that the assembly is usable, though it requires further refinement. More data need to be included in the assembly before it can be considered a fully viable model assembly; however, the current results are promising.
AFRIKAANS SUMMARY: Although the current timeline prevented it, had hyena sequence data been available before this project, the analysis would have been extended further. The first step would have been more extensive trimming followed by a quality assessment step, to determine whether the trimming succeeds in improving the quality of the assembly from the outset. The availability of a genome would call for a different assembler, possibly a reference-based assembly tool, which would use the genome to produce a better assembly. A further assessment using a collection of assembly tools could be beneficial, since a single tool is probably insufficient to capture all the data. Testing a suitable tool for assembly reconciliation would follow the previous step, allowing the researcher to examine whether the assemblies give better results combined than when each is used separately. Quality testing would retain BUSCO and TransRate, although these could not be applied as easily to a reference-based analysis. In that case it would be best to perform a comparative step against the NCBI assembly, or to investigate tools better suited to this type of analysis, although TransRate could still be used, since it assesses the assembly by mapping against the original FASTQ files. Several other tools exist for genome assessment, such as GAGE, but it is uncertain whether they can be correctly applied to an RNA-Seq assembly or to a reconciled assembly built from RNA-Seq data. After reconciliation and quality assessment, further analysis using the protein data is needed. This step would include the NCBI protein data from the start of the analysis.
This may be simpler, since proteogenomic research is carried out with RNA, DNA and protein data together, rather than starting from RNA-Seq data or genomic data alone. One method involves determining the level of overlap between the two protein sets, as well as between the protein sets and the different assemblies, as a form of comparative analysis. The control in this case could be a more closely related organism, such as a member of the Felidae family, or an even more distantly related species, such as human, for which an extensive assembly is available.
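As a rough illustration of the overlap comparison between two protein sets mentioned above, the following minimal sketch counts shared and unique identifiers and reports a Jaccard index. The function name `overlap_summary` and the protein identifiers are hypothetical placeholders, not data or methods from the thesis; the Jaccard index is one possible overlap measure chosen here for illustration.

```python
def overlap_summary(set_a, set_b):
    """Summarise the overlap between two collections of protein identifiers."""
    a, b = set(set_a), set(set_b)
    shared = a & b
    union = a | b
    return {
        "shared": len(shared),      # identifiers found in both sets
        "only_a": len(a - b),       # identifiers unique to the first set
        "only_b": len(b - a),       # identifiers unique to the second set
        # Jaccard index: shared / union, defined as 0.0 for two empty sets
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Hypothetical identifiers standing in for proteins matched against the
# transcriptome assembly and against the NCBI protein data.
assembly_hits = {"P1", "P2", "P3", "P4"}
ncbi_hits = {"P3", "P4", "P5"}

summary = overlap_summary(assembly_hits, ncbi_hits)
print(summary)
```

The same function could be applied to any pair of identifier sets, for example a protein set against the identifiers recovered from one of the assemblies, provided the identifiers have first been mapped to a common naming scheme.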
Description
Thesis (MMed)--Stellenbosch University, 2021.
Keywords
Spotted hyena, UCTD, Proteomics, Disease resistance, Pathogenic microorganisms, Crocuta crocuta