Search

Search Results

Dryad Logo
Dryad
Sylvester, Emma V.A.; Bentzen, Paul; Bradbury, Ian R.; Clément, Marie; Pearce, Jon; Horne, John; Beiko, Robert G.; Sylvester, Emma V. A. 2017-07-27 Genetic population assignment used to inform wildlife management and conservation efforts requires panels of highly informative genetic markers and sensitive assignment tests. We explored the utility of machine-learning algorithms (random forest, regularized random forest, and guided regularized random forest) compared with FST ranking for selection of single nucleotide polymorphisms (SNP) for fine-scale population assignment. We applied these methods to an unpublished SNP dataset for Atlantic salmon (Salmo salar) and a published SNP data set for Alaskan Chinook salmon (Oncorhynchus tshawytscha). In each species, we identified the minimum panel size required to obtain a self-assignment accuracy of at least 90% using each method to create panels of 50-700 markers Panels of SNPs identified using random forest-based methods performed up to 7.8 and 11.2 percentage points better than FST-selected panels of similar size for the Atlantic salmon and Chinook salmon data, respectively. Self-assignment accuracy ≥90% was obtained with panels of 670 and 384 SNPs for each dataset, respectively, a level of accuracy never reached for these species using FST-selected panels. Our results demonstrate a role for machine-learning approaches in marker selection across large genomic datasets to improve assignment for management and conservation of exploited populations. https://creativecommons.org/publicdomain/zero/1.0/

Map search instructions

1.Turn on the map filter by clicking the “Limit by map area” toggle.
2.Move the map to display your area of interest. Holding the shift key and clicking to draw a box allows for zooming in on a specific area. Search results change as the map moves.
3.Access a record by clicking on an item in the search results or by clicking on a location pin and the linked record title.
Note: Clusters are intended to provide a visual preview of data location. Because there is a maximum of 50 records displayed on the map, they may not be a completely accurate reflection of the total number of search results.