Discovery of single nucleotide polymorphisms

Genome-wide genetic marker discovery in South African indigenous cattle breeds using next generation sequencing

Industry Sector: Cattle and Small Stock

Research Focus Area: Livestock production with global competitiveness: Animal growth, nutrition and management

Research Institute: Agricultural Research Council – Animal Production Institute

Researcher: Dr. Avhashoni Zwane

Title   Initials        Surname             Highest Qualification
Prof    Azwihangwisi    Maiwashe            PhD
Prof    Este            Van Marle-Koster    PhD
Prof    Jerry           Taylor              PhD
Prof    Mahlako         Makgahlela          PhD
Dr      Ananyo          Choudhury           PhD
Dr      Farai           Muchadeyi           PhD

Year of completion: 2018

Aims Of The Project

  • To conduct a genome-wide search for new SNPs in local cattle breeds
  • To validate newly identified SNPs using Run 5 data from the 1000 Bull Genomes Project and perform functional annotation and enrichment analysis
  • To identify selective sweeps and a panel of SNP markers to discriminate between the three indigenous breeds

Executive Summary

South African (SA) livestock has played an important role in food security and the country’s sustainability. Given the importance of indigenous cattle breeds in SA, it is crucial that these breeds be included in the generation of genotypic and sequence data. Genomic data provide opportunities for various genetic investigations, including the identification of breed-informative markers, selective sweeps and genome-wide association studies (GWAS). In this study, sequence data were generated and used in combination with genotypic data to conduct SNP discovery in the three indigenous SA breeds, Afrikaner (AFR), Drakensberger (DRA) and Nguni (NGI), to study potential selective sweeps, and to identify a panel of breed-specific markers. Commercial bovine SNP assays (BovineSNP50 and GGP-80K) were used to identify the breed-informative markers, while pooled samples from each breed were used for sequencing. Sequencing of the three breeds generated approximately 1.8 billion high-quality paired-end reads (184 Gbp), of which 99% mapped to the bovine reference genome (UMD 3.1), with an average coverage of 21.1-fold. A total of 17.6 million variants were identified across the three breeds, with the highest number in NGI (12,514,597), followed by AFR (11,165,172) and DRA (7,049,802). In total, 89% of the variants were SNPs and 11% were indels. On average, 85% of the SNPs identified were shared with breeds in the 1000 Bull Genomes Project data, and the remaining 15% were unique to the SA indigenous breeds. The novel SNPs were then annotated to identify the genes enriched for them. In total, 461, 478 and 542 genomic regions identified from the top 5% of windows were enriched for novel variants (p < 0.001). A total of 174 putative breed-specific SNPs were identified across the breeds and achieved 100% overall breed allocation using PCA and GeneClass 2.
This study provides the first analysis of sequence data to discover SNPs in indigenous SA cattle breeds and the results provide insight into the genetic composition of the breeds and offer the potential for further applications in their genetic improvement.
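
The split between shared and novel SNPs reported above can be illustrated with a small sketch. This is not the pipeline used in the study, which the summary does not describe. It assumes each variant is keyed by a (chromosome, position, reference allele, alternate allele) tuple, and the function name `split_known_novel` and all data values are hypothetical.

```python
def split_known_novel(discovered, reference):
    """Partition discovered variants by presence in a reference catalogue.

    Each variant is a (chromosome, position, ref_allele, alt_allele) tuple;
    this keying scheme is an assumption made for illustration.
    """
    reference = set(reference)  # set lookup keeps membership tests cheap
    known = [v for v in discovered if v in reference]
    novel = [v for v in discovered if v not in reference]
    return known, novel


# Hypothetical toy data, not taken from the study.
discovered = [
    ("1", 1050, "A", "G"),
    ("1", 2200, "C", "T"),
    ("2", 310, "G", "A"),
    ("3", 77, "T", "C"),
]
reference = {
    ("1", 1050, "A", "G"),
    ("2", 310, "G", "A"),
    ("5", 900, "C", "G"),
}

known, novel = split_known_novel(discovered, reference)
shared_fraction = len(known) / len(discovered)
print(f"known: {len(known)}, novel: {len(novel)}, shared: {shared_fraction:.0%}")
```

On real data the same comparison would be run per breed against the reference catalogue, and the per-breed shared fractions averaged to obtain a figure comparable to the 85% reported here.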



A.A. Zwane1,2, A. Choudhury3, M.L. Makgahlela1, E. van Marle-Köster2, A. Maiwashe1,5 and J.F. Taylor4
1Department of Animal Breeding and Genetics, ARC-API, P/Bag X2, Irene, 0062
2Department of Animal and Wildlife Sciences, University of Pretoria, P/Bag X20, Hatfield, Pretoria, 0028
3Sydney Brenner Institute of Molecular Bioscience, University of the Witwatersrand, P/Bag 3, Wits, Gauteng, 2050
4Division of Animal Sciences, University of Missouri, 920 East Campus Drive, Columbia, MO 65211-5300, USA
5Department of Animal, Wildlife and Grassland Sciences, University of the Free State, Bloemfontein 9300, South Africa
#Corresponding author:

Background: Whole-genome sequencing now provides a suitable platform to examine the entire genome for the identification of selective sweeps. Indigenous South African (SA) breeds including Afrikaner (AFR), Drakensberger (DRA), and Nguni (NGI) are important genetic resources for SA cattle production. These breeds were subjected to strong selection leading to changes in their morphology, physiology and behaviour.

Aim: The aim of this study was to identify selective sweeps that shaped phenotypic diversity among indigenous SA breeds.
Methodologies: Whole-genome sequencing of pooled DNA from AFR, DRA and NGI was performed on an Illumina HiSeq 2000, and 17.6 million variants were discovered across the breeds. To identify selective sweep regions, the SNPs were used to calculate Z-transformed pooled heterozygosity (ZHp) scores in each of the three breeds using a 150 kb sliding window. The results were used to plot the distribution of SNP counts within the windows. Selective sweep regions were defined as windows with low ZHp scores, using a threshold of ZHp ≤ −4. The Animal QTL database (Animal QTLdb) was used to determine the gene ontology of the genes identified in selective sweep regions.
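
The ZHp calculation can be sketched as follows. The abstract does not give the formula, so this sketch assumes the commonly used definition of pooled heterozygosity, Hp = 2·ΣnMAJ·ΣnMIN / (ΣnMAJ + ΣnMIN)², where the sums run over the major and minor allele read counts of all SNPs in a window, followed by ZHp = (Hp − mean Hp) / sd Hp across windows. The step size (half the window), the function names and the toy read counts are all assumptions and are not taken from the study.

```python
import numpy as np


def pooled_heterozygosity(n_major, n_minor):
    """Hp for one window from per-SNP major/minor allele read counts."""
    s_maj = float(np.sum(n_major))
    s_min = float(np.sum(n_minor))
    total = s_maj + s_min
    return 2.0 * s_maj * s_min / (total * total)


def zhp_scores(positions, n_major, n_minor, window=150_000, step=75_000):
    """Return window starts, Hp and ZHp for one chromosome of one breed pool.

    window=150_000 follows the text; step=75_000 is an assumed value.
    Windows that contain no SNPs are skipped.
    """
    positions = np.asarray(positions)
    n_major = np.asarray(n_major)
    n_minor = np.asarray(n_minor)
    starts, hp = [], []
    start, last = 0, positions.max()
    while start <= last:
        mask = (positions >= start) & (positions < start + window)
        if mask.any():
            starts.append(start)
            hp.append(pooled_heterozygosity(n_major[mask], n_minor[mask]))
        start += step
    hp = np.array(hp)
    # Z-transform across windows (population standard deviation).
    z = (hp - hp.mean()) / hp.std()
    return np.array(starts), hp, z


# Hypothetical read counts for five SNPs on one chromosome.
pos = [1_000, 90_000, 160_000, 240_000, 310_000]
maj = [12, 15, 30, 28, 11]
mnr = [10, 9, 0, 1, 12]
starts, hp, z = zhp_scores(pos, maj, mnr)
candidates = starts[z <= -4]  # threshold used in the text
print(starts, hp.round(3), z.round(2), candidates)
```

A strongly negative ZHp marks a window whose pooled heterozygosity is far below the genome-wide mean, which is the pattern expected after a sweep. In practice the mean and standard deviation would be computed over all autosomal windows of a breed rather than over a single chromosome.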

Results: In total, 688 candidate selective sweeps (ZHp ≤ −4) were identified across the three breeds, including 223 putative selective sweeps with ZHp ≤ −5. About 93 regions had extremely low scores (ZHp ≤ −6). These regions carry signatures of selection. Using Animal QTLdb, several genes associated with phenotypic variation in livestock species were identified, e.g., ESM1, CNOT6, ASIC5, KIT and MITF (Zielak-Steciwko et al., 2014; Fallahsharoudi et al., 2016).

Discussion: The ability to detect selective sweep regions provided useful genomic information for these breeds, and functional analysis of these regions revealed the presence of genes of biological and economic importance.
Conclusions and recommendations: This study provides broad insight into the recent selection events and artificial selection processes that have shaped the livestock genome. More work is needed to characterise the genomic regions and genes identified in this study.

Please contact the Primary Researcher, Dr. Avhashoni Zwane, if you need a copy of the comprehensive report of this project.