ABSTRACT
The gene sequences of growth hormone (GH), insulin-like growth factor 1 (IGF-1) and myostatin (MSTN) were downloaded from the National Center for Biotechnology Information (NCBI) database through Entrez as non-redundant reference sequences in FASTA format, using the GenBank accession number of each gene. The sequences were submitted in FASTA format to several computational tools and online programs for multiple sequence alignment, phylogenetic tree construction, BLAST-like alignment tool (BLAT) analysis and Basic Local Alignment Search Tool (BLAST) analysis. These tools were used to determine the gene number per genome, the exon type and number per gene, and the number of codons per gene; the gaps within the alignment of the three species for each gene; the single nucleotide polymorphisms (SNPs) in the pairwise alignments of chicken with rabbit, chicken with sheep, and rabbit with sheep; the conserved regions in those pairwise alignments and in the three-way alignment of chicken, rabbit and sheep; and other parameters. Results indicated that the higher body weight and size of sheep might be due to its two categories of GH, high GH gene number, high GH exon number and long GH sequence length, which resulted in a longer predicted coding sequence and a larger predicted peptide size. Rabbit and chicken did not share a common ancestor for GH; the GH gene was distributed over more chromosomes in the sheep genome than in the rabbit and chicken genomes, and in all three species it was found on the negative (opposite) strand. Rabbit and sheep shared a common ancestor for the IGF-1 and MSTN genes. Rabbit and sheep IGF-1 had a longer gene length, a higher gene number and exon number, a longer predicted coding sequence and a larger predicted peptide size, which might explain their higher body weight at maturity compared with chicken. It is concluded that the number of GH genes in sheep is higher than in chicken and in rabbit, that the number of IGF-1 genes in rabbit and sheep is higher than in chicken, and that the number of myostatin genes in rabbit is higher than in chicken and sheep. The number of genes in the chicken, rabbit and sheep genomes plays a key role in establishing effective gene function. The genes share some conserved regions, but the length/size, gene number, exon number, exon type, number of genes per genome, DNA strand, codons per gene, gaps and SNPs varied greatly among chicken, rabbit and sheep. The similarities in conserved regions reflect conserved functions, while the differences reflect different expression pathways among the three species. The myostatin gene depresses animal growth through the action of the IGF-1 gene. Frameshift mutations in the upstream or downstream regions of the genes lead to differential gene regulation without changing the structure and function of the protein. It is recommended that chicken and sheep growth be improved by increasing the length/size and number of their IGF-1 genes through gene modification techniques, and that rabbit growth be improved by increasing the length/size and number of its GH genes in the same way. Some missing nucleotides in the rabbit and chicken GH and IGF-1 gene sequences could be modified by gene modification technology and used as markers for economic traits. Further experiments are recommended to knock off or knock out some of the extra sequence regions of the sheep GH and IGF-1 genes that are not found in the aligned rabbit and chicken GH and IGF-1 genes.
TABLE OF CONTENTS
Title Page
Declaration
Certification
Dedication
Acknowledgements
Abstract
Table of Contents
List of Tables
List of Figures
List of Appendixes
CHAPTER ONE
1.0 INTRODUCTION
1.1 Justification
1.2 Objectives of the Study
1.3 Hypotheses
CHAPTER TWO
2.0 LITERATURE REVIEW
2.1 Genes and Mutations
2.2 Phylogenic Analysis
2.3 Growth Hormone Gene (GH)
2.4 Myostatin Gene (MSTN)
2.4.1 Mutation of myostatin gene
2.5 Insulin-like Growth Factor-1 Gene (IGF-1)
2.5.1 Expressions of insulin-like growth factor-1 gene (IGF-1) and potential as a marker assisted selection
2.5.2 IGF Family and other expressions roles
2.6 On-line Tools/Programs
CHAPTER THREE
3.0 MATERIALS AND METHODS
3.1 Research Centre/Location
3.2 Genes Sequence Download and Alignment MEGA7 Computational Tool
3.3 Evolutionary Phylogenetic Tree Using MEGA7 Computational Tool
3.4 Vista Computational Tool
3.5 BLAT Analysis of UCSC Genome Browser Computational Tool
3.6 Genscan Computational Tool
3.7 The BLAST Analysis of Ensembl Computational Tool and Vista Genome
CHAPTER FOUR
4.0 RESULTS
4.1: GENSCAN Output Result of GH Gene
4.2: GENSCAN Output Result of Myostatin Gene
4.3: GENSCAN Output Result of IGF-1 Gene
4.4: MEGA7 Alignment Result
4.5: BLAT Analysis Result Using Genome Browser
4.6: BLAST Analysis Result Using Ensembl Browser and Vista Genome
4.7: Phylogenetic Analysis Result of Growth Hormone Gene of Chicken, Rabbit and Sheep
4.8: Phylogenetic Analysis Result of Myostatin Gene of Chicken, Rabbit and Sheep
4.9: Phylogenetic Analysis Result of Insulin-like Growth Factor-1 Gene of Chicken, Rabbit and Sheep
4.10: Alignment Result of Rabbit GH X Chicken GH: 1-3507bp
4.11: Alignment Result of Rabbit Myostatin X Sheep Myostatin X Chicken Myostatin: 1-5493bp
4.12: Alignment Result of Rabbit IGF-1 X Chicken IGF-1 Gene: 1-48428bp
4.13: Alignment Result of Sheep IGF-1 X Chicken IGF-1 Gene: 1-48428bp
4.14: Alignment Result of Sheep IGF-1 X Rabbit IGF-1 Gene: 1-76791bp
CHAPTER FIVE
5.0 DISCUSSION
5.1 GENSCAN Output Result of GH Gene
5.1.1 Regulatory elements and expressions of GH, IGF-1 and MSTN genes
5.1.2 Growth hormone genes, regulatory elements and transcription
5.2 GENSCAN Output Result of Myostatin Gene
5.3 GENSCAN Output Result of IGF-1 Gene
5.3.1 IGF-1 gene of chicken
5.3.2 IGF-1 gene of rabbit
5.3.3 IGF-1 gene of sheep
5.4 MEGA7 Alignment Result
5.4.1 Growth hormone gene (GH) deletions and codons
5.4.2 Insulin-like growth factor-1 gene (IGF-1) deletions and codons
5.4.3 Myostatin (MSTN) deletions and codons
5.4.4 Growth hormone gene (GH) SNP and conservations
5.4.5 Insulin-like growth factor-1 gene (IGF-1) SNP and conservations
5.4.6 Myostatin gene (MSTN) SNP and conservations
5.5 BLAT Analysis Result Using Genome Browser
5.5.1 Growth hormone (GH) gene
5.5.2 Insulin-like growth factor-1 (IGF-1) gene
5.5.3 Myostatin (MSTN) gene
5.6 BLAST Analysis Result Using Ensembl Browser and Vista Genome
5.6.1 Chicken genome
5.6.2 Rabbit genome
5.6.3 Sheep genome
5.7 Phylogenetic Analysis Result of Growth Hormone Gene of Chicken, Rabbit and Sheep
5.8 Phylogenetic Analysis Result of Myostatin Gene of Chicken, Rabbit and Sheep
5.9 Phylogenetic Analysis Result of Insulin-like Growth Factor-1 Gene of Chicken, Rabbit and Sheep
5.10 Alignment of Rabbit GH X Chicken GH: 1-3507bp
5.11 Alignment of Rabbit Myostatin X Sheep Myostatin X Chicken Myostatin: 1-5493bp
5.12 Alignment of Rabbit IGF-1 X Chicken IGF-1 Gene: 1-48428bp
5.13 Alignment of Sheep IGF-1 X Chicken IGF-1 Gene: 1-48428bp
5.14 Alignment of Sheep IGF-1 X Rabbit IGF-1 Gene: 1-76791bp
CHAPTER SIX
6.0 SUMMARY, CONCLUSION AND RECOMMENDATION
6.1 Summary
6.2 Conclusion
6.3 Recommendations
REFERENCES
LIST OF TABLES
Table 3.1: Accession Numbers of GH, IGF-1 and Myostatin Genes of Chicken, Rabbit and Sheep
Table 4.1: GENSCAN Output Result of GH Gene
Table 4.2: GENSCAN Output Result of Myostatin Gene
Table 4.3: GENSCAN Output Result of IGF-1 Gene
Table 4.4: MEGA7 Alignment Result
Table 4.5: BLAT Analysis Result Using Genome Browser
Table 4.6: BLAST Analysis Result Using Ensembl Browser and Vista Genome
CHAPTER ONE
1.0 INTRODUCTION
Bioinformatics is the science of storing, extracting, organizing, analyzing, interpreting and utilizing information from biological sequences and molecules (Khalid, 2010). It is often defined as the application of computational techniques to understand and organize the information associated with biological macromolecules (Luscombe et al., 2001), and it has been fueled mainly by advances in DNA sequencing and mapping techniques (Khalid, 2010). Over the past few decades, rapid developments in genomics, other molecular research technologies and information technology have combined to produce a tremendous amount of information related to molecular biology. The primary goal of bioinformatics is to increase the understanding of biological processes (Khalid, 2010). As biology increasingly becomes a technology-driven science, databases have become indispensable for storing not only data but also the results of experiments generated by research projects around the world (Hey et al., 2009). A biological database is a collection of information or data from a biological system, stored in a computer-readable format. Some databases are also called data repositories if they function as a place where large biological datasets can be stored and retrieved by users. Sharing data between scientists accelerates discovery and has the potential to greatly advance a scientific field as a whole; this is known as the Fourth Paradigm of Data-Driven Scientific Discovery (Hey et al., 2009). There are two types of biological databases: public databases that are freely accessible online, and private databases that require payment for access (Dutilh and Keșmir, 2016).
The genome of a species encodes genes and other functional elements, interspersed with non-functional nucleotides in a single uninterrupted string of DNA (IHGSC, 2001).
Recognizing protein-coding genes typically relies on finding stretches of nucleotides free of stop codons, called Open Reading Frames (ORFs), that are too long to have likely occurred by chance. Since stop codons occur at a frequency of roughly 1 in 20 codons in random sequence, ORFs of at least 60 amino acids will occur frequently by chance (5% under a simple Poisson model), and even ORFs of 150 amino acids will appear by chance in a large genome (0.05%). This poses a huge challenge for higher eukaryotes, in which genes are typically broken into many small exons (on average 125 nucleotides long for internal exons in mammals) (IHGSC, 2001). Some regions within a protein sequence are more conserved than others during evolution (Dutilh and Keșmir, 2016). These regions are generally important for the function of a protein, the maintenance of its three-dimensional structure, or other features related to its localization or modification. By analyzing the constant and variable properties of such groups of similar sequences, it is possible to derive a signature for a protein family or domain that distinguishes its members from unrelated proteins; sequence alignment is what allows these signatures to be discovered (Dutilh and Keșmir, 2016). Sequence alignment is defined as the bioinformatics task of locating equivalent regions of two or more sequences and aligning their nucleotide or amino acid residues side by side to maximize their similarity (Dutilh and Keșmir, 2016). Multiple sequence alignments allow conserved sequence regions to be identified. This is very useful in designing experiments to test and modify the function of specific proteins, in predicting the function and structure of proteins, and in identifying new members of protein families (Dutilh and Keșmir, 2016).
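The chance-ORF figures quoted above can be checked with a short calculation. The following is a minimal sketch, not part of the original analysis: it assumes codons are independent and uses the rounded stop-codon frequency of 1 in 20 given in the text (the exact value for uniformly random codons is 3 in 64). The function names are illustrative only. Under these assumptions the Poisson approximation gives about 5% for 60 codons and about 0.05% for 150 codons, in line with the values cited.

```python
import math

# Rounded stop-codon frequency used in the text (assumption: independent,
# uniformly random codons; the exact value would be 3/64).
P_STOP = 1 / 20


def orf_chance_exact(n_codons: int, p_stop: float = P_STOP) -> float:
    """Probability that n_codons independent random codons contain no stop codon."""
    return (1.0 - p_stop) ** n_codons


def orf_chance_poisson(n_codons: int, p_stop: float = P_STOP) -> float:
    """Poisson approximation: probability of zero stops when n_codons * p_stop are expected."""
    return math.exp(-n_codons * p_stop)


if __name__ == "__main__":
    for n in (60, 150):
        print(
            f"{n} codons: exact {orf_chance_exact(n):.4%}, "
            f"Poisson {orf_chance_poisson(n):.4%}"
        )
```

The exact product is always slightly smaller than the Poisson value, so the quoted percentages are mild overestimates of how often such ORFs arise by chance.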
DNA sequencing is a technique by which the exact order of nucleotides within a DNA molecule is determined (Mayor et al., 2000). Comparative data analysis provides the opportunity to determine what is shared and what is unique to each species (Mayor et al., 2000).
Growth in animals is controlled by a complex system in which the somatotropic axis plays a key role. The genes that operate in the somatotropic axis are responsible for postnatal growth, mainly GH, which acts on the growth of bones and muscles and is mediated by IGF-1 (Sellier, 2000). The growth hormone (GH) and insulin-like growth factor 1 (IGF-1) genes are candidate genes for growth in cattle, since they play a key role in growth regulation and development (Hossner et al., 1997; Tuggle and Trenkle, 1996). Effects of GH on growth are observed in several tissues, including bone, muscle and adipose tissue. These effects result both from the direct action of GH on the partitioning of nutrients and on cell multiplication, and from IGF-1-mediated action that stimulates cell proliferation and the metabolic processes associated with protein deposition (Boyd and Bauman, 1989). IGF-1 stimulates protein metabolism and is important for the function of some organs, being considered a factor in cell proliferation and differentiation (Andrea et al., 2005). Polymorphisms in the GH gene have been used as genetic markers associated with performance and production traits such as body weight, birth weight and weaning weight in goats (Wickramaratne et al., 2010). The rabbit GH gene was sequenced by Wallis and Wallis (1995) and has been investigated as a gene associated with the market weight of commercial rabbits (Fontanesi et al., 2012). Mutations of the GH gene that affect important production traits have been described in goats (Malveiro et al., 2001) and poultry (Feng et al., 1997).
In chickens divergently selected for high or low growth rates, IGF-1 mRNA levels were significantly higher in the high growth rate line than in the low growth rate line (Beccavin et al., 2001). The growth hormone receptor (GHR) and the growth hormone-insulin-like growth factor-1 (GH-IGF-1) system control the number of follicles in animals that are recruited to the rapid growth phase (Roberts et al., 1994; Monget et al., 2002). It is also known that the GH-IGF-1 system has been modified as a result of selection for enhanced growth rate (Ballard et al., 1990; Ge et al., 2001). The insulin-like growth factor-1 gene (IGF-1) is a candidate gene for growth, body composition and metabolism, skeletal characteristics, and growth of adipose tissue and fat deposition in chickens (Zhou et al., 2005). Earlier research on GHR, IGF-1 and IGFBP-3 in cattle, goats and chickens showed genetic polymorphisms and their association with production traits (Liu et al., 2010). The IGF-1 gene is essential for normal embryonic and postnatal growth in mammals (Bian et al., 2008).
Myostatin (MSTN), previously called growth differentiation factor 8 (GDF8), is a member of the transforming growth factor-β (TGF-β) superfamily and a negative regulator of muscle growth (McPherron et al., 1997). It negatively regulates both embryonic development and adult homeostasis of skeletal muscle (Tu et al., 2014). It controls the growth of muscle cells by inhibiting the transcriptional activity of MyoD family members, and its expression is negatively correlated with muscle weight (Weber et al., 2005). Mutations in the myostatin gene have also been shown to cause double muscling in humans and other species (Clop et al., 2006). These findings suggest that strategies for inhibiting myostatin function may be applied to improve animal growth. Homozygous and heterozygous cattle carrying mutations in bases of the conserved region of the MSTN gene show stronger muscling, increased birth weight and obvious double-hip muscle characteristics (Casas et al., 1999). As a candidate gene for the double-hip muscle trait in pigs, the MSTN gene has an important impact on the amount of lean meat and on fat deposition (Sonstegard et al., 1998). The rabbit is a high-quality and efficient meat-producing livestock species as well as a common experimental animal. Therefore, providing information on the genetic basis and regulatory mechanism of its skeletal muscle growth and development has important theoretical and practical significance (Qiao, 2014). In an F2 resource population of chickens, SNPs of the myostatin gene were associated with increases in abdominal fat weight, abdominal fat percentage, birth weight and breast muscle percentage (Zhiliang et al., 2004). Notably, these data suggest that myostatin could be an ideal molecular marker for marker-assisted selection for skeletal muscle and adipose growth in chicken breeding programs. A TTTTA deletion in the MSTN gene was reported to be unique to goats when compared with sheep, cattle, water buffalo, domestic yak, pigs and humans (Grisolia et al., 2009; Zhang et al., 2013). Khichar et al. (2016) found an important effect of a 5-base pair (bp) deletion on the early body weight and size of goats.
1.1 Justification
Identification of a candidate gene is a powerful method for understanding the direct genetic basis of the expression of quantitative traits and their differences between individuals (Rothschild and Soller, 1997; Nagaraja et al., 2000). Mutations in bases of the conserved region of the MSTN gene in chicken, rabbit and goat can activate or inhibit the gene's expression product, leading to a loss or gain of its function in inhibiting muscle growth, which can result in excessive muscle development (Lee and McPherron, 1999). Indeed, there have been several recent examples in which comparative sequence data have led to the discovery and understanding of the function of previously undefined genes. The complete human/mouse orthologous-sequence dataset proved particularly valuable in the characterization of gene families in humans and mice (Dehal et al., 2001). For instance, a computational comparison of olfactory receptor gene families on human chromosome 19 indicated that humans have approximately 49 olfactory receptor genes there, of which only 22 had maintained an open reading frame and appeared functional. This contrasts with the vast majority of the homologous mouse genes, which have retained an open reading frame. This finding of reduced olfactory receptor diversity in humans is consistent with the reduced olfactory needs and capabilities of humans relative to rodents (Pennacchio and Rubin, 2003). The growth hormone (GH) gene, which encodes a single polypeptide produced in the anterior pituitary gland, is a promising candidate gene marker for improving milk and meat production in goats and other farm animals (Min et al., 2005). IGF-1 is a mediator of many biological effects: it increases the absorption of glucose, stimulates myogenesis and the production of progesterone, inhibits apoptosis, participates in the activation of cell cycle genes, increases the synthesis of lipids, and is involved in the synthesis of DNA, protein and RNA and in cell proliferation (Mohammadi et al., 2011).
The increasing availability of genomic sequence from multiple organisms has provided biomedical scientists with a large dataset for orthologous-sequence comparisons. The rationale for using cross-species sequence comparisons to identify biologically active regions of a genome is based on the observation that sequences that perform important functions are frequently conserved between evolutionarily distant species, distinguishing them from nonfunctional surrounding sequences (Pennacchio and Rubin, 2003). Sequence alignment is a good way of predicting the function of a gene or protein. Moreover, sequences contain much more information, such as the organism from which the gene or protein is derived and the evolutionary relationships of the gene or species with other genes or species. Much of this information can only be discovered by finding homologs of the gene or protein in other species (Dutilh and Keșmir, 2016).
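The idea that conserved stretches stand out from their less conserved surroundings can be illustrated with a sliding-window identity scan over two sequences that have already been aligned. The following is a minimal sketch under stated assumptions: the two aligned strings are invented toy data (not real chicken, rabbit or sheep sequences), the window size and the 70% threshold are arbitrary choices, and the function names are hypothetical. It is not the procedure used by any of the tools named in this thesis.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned columns where both sequences carry the same base (gaps never match)."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    if not a:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


def sliding_identity(a: str, b: str, window: int = 10, step: int = 10):
    """Return (start, percent identity) for each window along the alignment."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return [
        (start, percent_identity(a[start:start + window], b[start:start + window]))
        for start in range(0, len(a) - window + 1, step)
    ]


def conserved_windows(a: str, b: str, window: int = 10, step: int = 10,
                      threshold: float = 70.0):
    """Return (start, end, percent identity) for windows at or above the threshold."""
    return [
        (start, start + window, pid)
        for start, pid in sliding_identity(a, b, window, step)
        if pid >= threshold
    ]


# Toy pre-aligned pair (hypothetical data, '-' marks an alignment gap).
SEQ_A = "ATGGCTACAGGCTCCCGGAC--GTCCTGCT"
SEQ_B = "ATGGCTACAGACTCTTTGACAAGTCATGCT"

if __name__ == "__main__":
    for start, end, pid in conserved_windows(SEQ_A, SEQ_B):
        print(f"columns {start}-{end}: {pid:.0f}% identity")
```

Gapped columns are counted as non-matching here, which is one of several reasonable conventions; excluding them from the denominator would raise the identity of windows that contain gaps.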
This study is therefore justified: a comparative genomic analysis to assess the similarities and differences among chicken, rabbit and sheep in three growth genes, namely growth hormone (GH), myostatin (MSTN) and insulin-like growth factor-1 (IGF-1), will identify the similarities or differences in the rate of increase in growth and body size to maturity, final body size at maturity, and body conformation at maturity. The analysis of sequences conserved among these three species will further enrich the available information on biologically active sequences in these species.
1.2 Objectives of the Study
The main objective of this study was to determine the gene sequence conservation and variation of the GH, IGF-1 and MSTN genes in chicken, rabbit and sheep that can be utilized for the improvement of animal production. The specific objectives were:
1. To determine the distribution of the GH, IGF-1 and MSTN genes in the genomes of chicken, rabbit and sheep, including their chromosomal locations and DNA strands.
2. To determine the similarities and differences in the nucleotide sequences of the GH, IGF-1 and MSTN genes in chicken, rabbit and sheep.
3. To establish the kinds of conserved nucleotide regions, gaps and single nucleotide polymorphisms in the GH, IGF-1 and MSTN genes that affect the codons of chicken, rabbit and sheep.
4. To determine the evolutionary nucleotide substitutions and relationships of the GH, IGF-1 and MSTN genes in chicken, rabbit and sheep.
5. To establish the nucleotide sequence variation in the GH, IGF-1 and MSTN genes that results in weight differences between chicken, rabbit and sheep at maturity.
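The quantities named in objectives 3 and 4 (conserved columns, gaps, substituted positions and nucleotide substitution distances) can be made concrete with a small sketch. The three aligned sequences below are invented toy data, not the GenBank sequences used in this study, and the function names are hypothetical. The Jukes-Cantor correction is shown only as one common substitution model; it is an assumption here and is not claimed to be the model applied in the analysis.

```python
import math
from collections import Counter


def classify_columns(seqs):
    """Count alignment columns that are gapped, fully conserved, or variable."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences must have equal length")
    counts = Counter(gap=0, conserved=0, variable=0)
    for column in zip(*seqs):
        if "-" in column:
            counts["gap"] += 1
        elif len(set(column)) == 1:
            counts["conserved"] += 1
        else:
            counts["variable"] += 1
    return counts


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites among columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance; undefined for p >= 0.75."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Toy three-way alignment (hypothetical data, '-' marks an alignment gap).
CHICKEN = "ATGGCTCCAG-GTTC"
RABBIT = "ATGGCTACAGAGTTC"
SHEEP = "ATGGCCACAGAGATC"

if __name__ == "__main__":
    print(dict(classify_columns([CHICKEN, RABBIT, SHEEP])))
    p = p_distance(RABBIT, SHEEP)
    print(f"rabbit vs sheep: p = {p:.4f}, JC69 = {jukes_cantor(p):.4f}")
```

Variable columns in a between-species alignment are substitution candidates rather than SNPs in the strict population sense; the sketch simply counts positions where the aligned bases differ.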
1.3 Hypotheses
H0: Variation in the nucleotide sequences and conservation of these genes has no specific effect on the growth, development and size at maturity of chicken, rabbit and sheep.
Ha: Variation in the nucleotide sequences and conservation of these genes has a specific effect on the growth, development and size at maturity of chicken, rabbit and sheep.