For populations to maintain optimal fitness, harmful mutations must be efficiently purged from the genome. Yet, under circumstances that diminish the effectiveness of natural selection, such as the process of plant and animal domestication, deleterious mutations are predicted to accumulate. Here, we compared the load of deleterious mutations in 21 accessions from natural populations and 19 domesticated accessions of the common sunflower using whole-transcriptome single nucleotide polymorphism data. Although we find that genetic diversity has been greatly reduced during domestication, the remaining mutations were disproportionally biased toward nonsynonymous substitutions. Bioinformatically predicted deleterious mutations affecting protein function were especially strongly over-represented. We also identify similar patterns in two other domesticated species of the sunflower family (globe artichoke and cardoon), indicating that this phenomenon is not due to idiosyncrasies of sunflower domestication or the sunflower genome. Finally, we provide unequivocal evidence that deleterious mutations accumulate in low recombining regions of the genome, due to the reduced efficacy of purifying selection. These results represent a conundrum for crop improvement efforts. Although the elimination of harmful mutations should be a long-term goal of plant and animal breeding programs, it will be difficult to weed them out because of limited recombination.
Usage Notes:
HA412_trinity_noAltSplice_400bpmin.fa
Link to reference transcriptome described in Renaut et al. 2013 (NatCom), used for all alignments, and previously deposited in Dryad as part of http://dx.doi.org/10.5061/dryad.9q1n4.
new_provean_results_all
Results of PROVEAN analyses identifying deleterious non-synonymous mutations. The first column identifies each non-synonymous amino acid (AA) change and its position (e.g. P113A). The correspondence between these AA changes and the SNPs identified in the dataset can be found in the file new_snp_table_effect. The second column is the PROVEAN score. The third column is the gene name.
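As an illustration of how the first two columns might be interpreted, here is a minimal Python sketch. It assumes the AA change label has the form reference residue, position, alternate residue (as in P113A). It also assumes a cutoff of -2.5, which is PROVEAN's commonly cited default and is not stated in these notes. The names parse_aa_change, is_predicted_deleterious, and PROVEAN_CUTOFF are hypothetical helpers, not part of the deposited files.

```python
import re
from typing import NamedTuple

# Assumed label format: one-letter reference residue, 1-based position,
# one-letter alternate residue (e.g. "P113A"). "*" is allowed for stop.
AA_CHANGE = re.compile(r"^([A-Z*])(\d+)([A-Z*])$")

# Assumption: PROVEAN's commonly cited default cutoff; the usage notes
# do not say which threshold was applied to this dataset.
PROVEAN_CUTOFF = -2.5


class AAChange(NamedTuple):
    ref: str
    position: int
    alt: str


def parse_aa_change(label: str) -> AAChange:
    """Split a label such as 'P113A' into (ref, position, alt)."""
    match = AA_CHANGE.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized AA change label: {label!r}")
    return AAChange(match.group(1), int(match.group(2)), match.group(3))


def is_predicted_deleterious(score: float, cutoff: float = PROVEAN_CUTOFF) -> bool:
    """Scores at or below the cutoff are treated as predicted deleterious."""
    return score <= cutoff


if __name__ == "__main__":
    change = parse_aa_change("P113A")
    print(change.ref, change.position, change.alt)
    print(is_predicted_deleterious(-3.1))
```

The field delimiter and the presence of a header row in new_provean_results_all are not documented here, so reading the file itself is left to the user.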
new_sift_results_all
Results of SIFT analyses identifying deleterious non-synonymous mutations. The first column identifies each non-synonymous AA change and its position (e.g. S11T). Columns 2-6 are the SIFT statistics. The last column is the gene name.
unique_orf
Unique (longest) open reading frames identified in the reference transcriptome.
snp_table_all3
Genotypes and positions of all SNPs identified in the dataset.
new_snp_table_effect
List of all SNPs. Columns 4-43 indicate whether the SNP was noncoding (nc), non-synonymous (ns), synonymous (s), an alternate stop codon (STOP), or homozygous for the reference allele (0). Column 44 indicates whether the mutation was analyzed by PROVEAN. Column 45 gives the PROVEAN score, if applicable. Columns 46-50 give the frequency of the alternate allele in different classes of individuals.
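The column layout above can be sketched in Python as follows. This is a rough example only: it assumes each row has already been split into a list of at least 43 string fields, and it makes no claim about the file's delimiter, header row, or the contents of columns 1-3, none of which are documented here. The names EFFECT_COLUMNS and tally_effects are hypothetical.

```python
from collections import Counter
from typing import Sequence

# Columns 4-43 (1-based, inclusive) hold the per-SNP effect codes,
# which corresponds to the 0-based slice [3:43].
EFFECT_COLUMNS = slice(3, 43)


def tally_effects(fields: Sequence[str]) -> Counter:
    """Count the effect codes (nc, ns, s, STOP, 0) in columns 4-43 of one row."""
    if len(fields) < 43:
        raise ValueError(f"expected at least 43 fields, got {len(fields)}")
    return Counter(fields[EFFECT_COLUMNS])


if __name__ == "__main__":
    # Synthetic row for illustration; the values are made up, not real data.
    row = (
        ["contig_x", "101", "A"]      # columns 1-3 (contents assumed)
        + ["ns"] * 5 + ["0"] * 35     # columns 4-43: effect codes
        + ["1", "-3.2"]               # columns 44-45: PROVEAN flag and score
        + ["0.1"] * 5                 # columns 46-50: allele frequencies
    )
    print(tally_effects(row))
```

Counting per row in this way gives, for each SNP, how many of the 40 effect columns carry each code.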
table_S1_24_02_15
Table S1 with information about samples (location, sequencing stats, etc.)