High-throughput sequencing methods for genotyping genome-wide markers are being rapidly adopted for phylogenetics of non-model organisms in conservation and biodiversity studies. However, the reproducibility of SNP genotyping and the degree of marker overlap or compatibility between datasets from different methodologies have not been tested in non-model systems. Using double-digest restriction-site associated DNA sequencing (ddRAD), we sequenced a common set of 22 specimens from the butterfly genus Speyeria on two different Illumina platforms, using two variations of library preparation. We then used a de novo approach to bioinformatic locus assembly and SNP discovery for subsequent phylogenetic analyses. We found a high rate of locus recovery despite differences in library preparation and sequencing platforms, as well as overall high levels of data compatibility after data processing and filtering. These results provide the first application of next-generation sequencing (NGS) methods for phylogenetic reconstruction in Speyeria, and support the use and long-term viability of SNP genotyping applications in non-model systems.
Usage Notes:
NextSeq_raw_fastq
Zipped file containing 24 raw fastq files of 11 butterfly species. A ddRAD library was prepared using PstI and MspI restriction enzymes, following the protocol of Peterson et al. 2012, and then sequenced on an Illumina NextSeq 500.
HiSeq_raw_fastq
Zipped file containing 24 raw fastq files of 11 butterfly species. A double-enzyme genotyping-by-sequencing (GBS) library was prepared using PstI and MspI restriction enzymes, following the protocol of Poland et al. 2012, and then sequenced on an Illumina HiSeq 2000.
all_5%_structure_input
Input file for STRUCTURE analysis of the combined HiSeq + NextSeq dataset, filtered at a minor allele frequency threshold of 5%.
all_5%.str
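For readers unfamiliar with minor allele frequency (MAF) filtering, the 5% filter mentioned for the STRUCTURE input can be illustrated with a minimal Python sketch. This is not the pipeline used to produce the deposited files. The genotype encoding (count of alternate alleles per diploid individual, with None for missing calls), the function names, and the choice to keep SNPs whose MAF is at or above the threshold are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def minor_allele_frequency(genotypes: Sequence[Optional[int]]) -> Optional[float]:
    """Return the minor allele frequency of one biallelic SNP.

    Assumed encoding (hypothetical, not taken from the dataset):
    each entry is the number of alternate alleles in a diploid
    individual (0, 1, or 2), or None when the genotype is missing.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return None  # no called genotypes, so the frequency is undefined
    alt_freq = sum(called) / (2 * len(called))
    # The minor allele is whichever of the two alleles is rarer.
    return min(alt_freq, 1.0 - alt_freq)


def filter_by_maf(snps: Sequence[Sequence[Optional[int]]],
                  threshold: float = 0.05) -> List[int]:
    """Return the indices of SNPs whose MAF is at or above the threshold."""
    kept = []
    for index, genotypes in enumerate(snps):
        maf = minor_allele_frequency(genotypes)
        if maf is not None and maf >= threshold:
            kept.append(index)
    return kept


# Toy data: five SNPs scored in a handful of individuals.
example_snps = [
    [0] * 10,            # monomorphic, MAF 0.0
    [0, 1, 2, None, 1],  # MAF 0.5 among the four called individuals
    [2] * 9 + [None],    # fixed for the alternate allele, MAF 0.0
    [0] * 19 + [1],      # one alternate allele in 40, MAF 0.025
    [0, 0, 0, 0, 1],     # one alternate allele in 10, MAF 0.1
]
kept_indices = filter_by_maf(example_snps, threshold=0.05)
```

In this toy example only the second and fifth SNPs pass the filter. Real tools differ in how they treat missing data and whether the threshold is inclusive, so the parameters of the software actually used should be checked before comparing results.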
processed nexus input files
Zipped file containing all 21 processed nexus files used for phylogenetic analyses.