Discovery of single nucleotide polymorphisms

Genome-wide genetic marker discovery in South African indigenous cattle breeds using next generation sequencing

Industry Sector: Cattle and Small Stock

Research Focus Area: Livestock production with global competitiveness: Animal growth, nutrition and management

Research Institute: Agriculture Research Institute – Animal Production Institute

Researcher: Dr. Avhashoni Zwane

Title  Name          Surname           Highest Qualification
Prof   Azwihangwisi  Maiwashe          PhD
Prof   Este          Van Marle-Koster  PhD
Prof   Jerry         Taylor            PhD
Prof   Mahlako       Makgahlela        PhD
Dr     Ananyo        Choudhury         PhD
Dr     Farai         Muchadeyi         PhD

Aims Of The Project

  • To conduct a genome wide search for new SNPs in local cattle breeds
  • To validate newly identified SNPs using Run 5 data from the 1000 Bull Genomes Project and perform functional annotation and enrichment analysis
  • To identify selective sweeps and a panel of SNP markers to discriminate between the three indigenous breeds

Executive Summary

South African (SA) livestock has played an important role in the country's food security and sustainability. Given the importance of indigenous cattle breeds in SA, it is crucial that these breeds be included in the generation of genotypic and sequence data. Genomic data provide opportunities for various genetic investigations, including identification of breed-informative markers, detection of selective sweeps and genome-wide association studies (GWAS). In this study, sequence data were generated and used in combination with genotypic data to conduct SNP discovery in the three indigenous SA breeds, Afrikaner (AFR), Drakensberger (DRA) and Nguni (NGI), to study potential selective sweeps and to identify a panel of breed-specific markers. Commercial bovine SNP assays (BovineSNP50 and GGP-80K) were used to identify the breed-informative markers, while pooled samples per breed were used for sequencing. Sequencing of the three breeds generated approximately 1.8 billion high-quality paired-end reads (184 Gbp), of which 99% mapped to the bovine reference genome (UMD 3.1), with an average coverage of 21.1-fold. A total of 17.6 million variants were identified across the three breeds, with more variants identified in NGI (12,514,597) than in AFR (11,165,172) and DRA (7,049,802). In total, 89% of the variants were SNPs and 11% were indels. On average, 85% of the SNPs identified were shared with breeds in the 1000 Bull Genomes Project data, and the remaining 15% were unique to the SA indigenous breeds. These novel SNPs were further annotated to identify the genes enriched for them. In total, 461, 478 and 542 genomic regions identified from the top 5% of windows were enriched for novel variants (p < 0.001). A total of 174 putative breed-specific SNPs were identified across the breeds and achieved 100% overall breed allocation using PCA and GeneClass 2.
This study provides the first analysis of sequence data to discover SNPs in indigenous SA cattle breeds and the results provide insight into the genetic composition of the breeds and offer the potential for further applications in their genetic improvement.
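
As an illustration of what selecting breed-informative SNPs can look like, the following is a minimal sketch and not the method used in the project, which this summary does not describe. It assumes that per-breed allele frequencies are already available and keeps a SNP when one breed's frequency differs from every other breed's by at least a chosen threshold. The SNP names, the frequencies, the function name `breed_informative` and the threshold `min_delta` are all hypothetical.

```python
# Toy allele frequencies per breed (hypothetical values, not project data).
FREQS = {
    "snp_a": {"AFR": 0.95, "DRA": 0.10, "NGI": 0.05},
    "snp_b": {"AFR": 0.50, "DRA": 0.45, "NGI": 0.55},
    "snp_c": {"AFR": 0.02, "DRA": 0.08, "NGI": 0.90},
}


def breed_informative(freqs, min_delta=0.7):
    """Map each informative SNP to the breed it singles out.

    A SNP is kept when one breed's allele frequency differs from the
    frequency in every other breed by at least min_delta (an assumed,
    illustrative criterion).
    """
    panel = {}
    for snp, by_breed in freqs.items():
        for breed, freq in by_breed.items():
            others = [v for b, v in by_breed.items() if b != breed]
            if others and all(abs(freq - v) >= min_delta for v in others):
                panel[snp] = breed
                break  # at most one breed is recorded per SNP
    return panel


if __name__ == "__main__":
    print(breed_informative(FREQS))
```

With the toy values above, `snp_a` is assigned to AFR and `snp_c` to NGI, while `snp_b` is dropped because its frequencies are similar in all three breeds. In practice, a candidate panel would still need to be checked with assignment tools such as those named in the summary.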

Please contact the Primary Researcher, Dr. Avhashoni Zwane, if you need a copy of the comprehensive report of this project.