I will be posting portions of all 10 chapters of my upcoming textbook, Applied Population Genetics, as early draft chapters to this website over the spring semester.
I’ve been working on integrating the Swift language into my analysis workflow, but much of what I do relies on the GNU Scientific Library (GSL) for matrix analysis and other tools. Here is a quick tutorial on how to install GSL on a clean OS X platform.
- It is easiest if you have Xcode installed. You can get it from the App Store for free; download and install it first.
- Download the latest version of the GSL library. You can grab it by:
- Looking for your nearest mirror site listed at http://www.gnu.org/prep/ftp.html and connecting to it.
- Opening the directory gsl/, where all the versions are listed. Scroll down and grab
- Open the terminal (Utilities -> Terminal.app) and type:
- Unpack the archive by:
tar zxvf gsl-latest.tar.gz, then
cd gsl-1.16/ (or whatever the version actually is; it will probably be some number larger than 1.16).
- Inside that folder will be a README file (which you probably won’t read) and an INSTALL file (which you should read). The INSTALL file will tell you to:
sudo make install. This last command will require you to type in your password, as it installs files into the base system.
- All the libraries and header files will be installed into the
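The INSTALL file remains the authority on the build steps, but they usually follow the standard GNU sequence of configure, make, and install. As a minimal sketch, assuming that sequence applies to the version you downloaded, the function below prints the commands for review rather than running them. The name gsl_build_steps is a hypothetical helper, and the default directory gsl-1.16 is only the example version used above.

```shell
#!/bin/sh
# Sketch of the usual GNU build sequence for an unpacked GSL source tree.
# Assumption: the INSTALL file in your download describes these same steps;
# read it first, since options may differ between versions.

gsl_build_steps() {
    # $1: directory of the unpacked source (defaults to the example gsl-1.16)
    src_dir="${1:-gsl-1.16}"
    echo "cd ${src_dir}"
    echo "./configure"        # inspect the system and generate Makefiles
    echo "make"               # compile the library
    echo "sudo make install"  # copy the libraries and headers into the system
}

# Print the steps; nothing is executed against the source tree here.
gsl_build_steps "$@"
```

Run the printed commands one at a time in the terminal so that any error from configure or make is visible before the install step asks for your password.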
This is the main package that provides data types and routines for spatial analysis of genetic marker data. The previous version is currently available on CRAN, and you can install it from within your R environment by invoking the command
If you want to keep up with the latest developments of this package, you can use the version found on GitHub. Install it from within R as:
and that should get you up to date. If you fork the package, you’ll need a fully working LaTeX install and a few other tools to build it.
The Users Manual for the package with several examples can be found here
I have started a GitHub account for this package. You can get access to the whole codebase, read about it on the wiki, and contribute to the project from its repo at https://github.com/dyerlab.
Pollen-mediated gene flow is a major driver of spatial genetic structure in plant populations. Both individual plant characteristics and site-specific features of the landscape can modify the perceived attractiveness of plants to their pollinators and thus play an important role in shaping spatial genetic variation. Most studies of landscape-level genetic connectivity in plants have focused on the effects of interindividual distance using spatial and increasingly ecological separation, yet have not incorporated individual plant characteristics or other at-site ecological variables. Using spatially explicit simulations, we first tested the extent to which the inclusion of at-site variables influencing local pollination success improved the statistical characterization of genetic connectivity based upon examination of pollen pool genetic structure. The addition of at-site characteristics provided better models than those that only considered interindividual spatial distance (e.g. IBD). Models parameterized using conditional genetic covariance (e.g. population graphs) also outperformed those assuming panmixia. In a natural population of Cornus florida L. (Cornaceae), we showed that the addition of at-site characteristics (clumping of primary canopy opening above each maternal tree and maternal tree floral output) provided significantly better models describing gene flow than models including only between-site spatial (IBD) and ecological (isolation by resistance) variables. Overall, our results show that including interindividual and local ecological variation greatly aids in characterizing landscape-level measures of contemporary gene flow.
The manner by which pollinators move across a landscape and their resulting preferences and/or avoidances of travel through particular habitat types can have a significant impact on plant population genetic structure and population-level connectivity. We examined the spatial genetic structure of the understory tree Cornus florida (Cornaceae) adults (N_Adults = 452) and offspring (N_Offspring = 736) across two mating events to determine the extent to which pollen pool genetic covariance is influenced by intervening forest architecture. Resident adults showed no spatial partitioning, but genotypes were positively autocorrelated up to a distance of 35 m, suggesting a pattern of restricted seed dispersal. In the offspring, selfing rates were small (s_m = 0.035), whereas both biparental inbreeding (s_b = 0.16 under open canopy, s_b = 0.11 under closed canopy) and correlated paternity (r_p = 0.21 under open canopy, r_p = 0.07 under closed canopy) were significantly influenced by primary canopy opening above individual mothers. The spatial distribution of genetic covariance in pollen pool composition was quantified for each reproductive event using Pollination Graphs, a network method based upon multivariate conditional genetic covariance. The georeferenced graph topology revealed a significant positive relationship between genetic covariance and pollinator movement through C. florida canopies, a negative relationship with open primary canopy (e.g., roads under open canopies and fields with no primary canopy), and no relationship with either conifer or mixed hardwood canopy species cover. These results suggest that both resident genetic structure within stands and genetic connectivity between sites in C. florida populations are influenced by spatial heterogeneity of mating individuals and quality of intervening canopy cover.
Habitat fragmentation and landscape topology may influence the genetic structure and connectivity between natural populations. Six microsatellite loci were used to infer the population structure of 35 populations (N = 788) of the alpine Arabian burnet moth Reissita simonyi (Lepidoptera, Zygaenidae) in Yemen and Oman. Due to the patchy distribution of larval food plants, R. simonyi is not continuously distributed throughout the studied area and the two recognized subspecies of this endemic species (Reissita s. simonyi/R. s. yemenicola) are apparently discretely distributed. All microsatellites showed prevalence of null alleles and therefore a thorough investigation of the impact of null alleles on different population genetic parameters (FST, inbreeding coefficients, and Population Graph topologies) is given. In general, null alleles reduced genetic covariance and independence of allele frequencies, resulting in a more connected genetic topology in Population Graphs and an overestimation of pairwise FST values and inbreeding coefficients. Despite the presence of null alleles, Population Graphs also showed a much higher genetic connectivity within subspecies (and lower genetic differentiation, via FST) than between, supporting the existing taxonomic distinction. Partial Mantel tests showed that both geographical distance and altitude were highly correlated with the observed distribution of genetic structure within R. simonyi. In conclusion, we identified geographical and altitudinal distances in R. simonyi as well as an intervening desert area to be the main factors for spatial genetic structure in this species and show that the taxonomic division into two subspecies is confirmed by genetic analysis.
Patterns of spatial genetic structure produced following the expansion of an invasive species into novel habitats reflect demographic processes that have shaped the genetic structure we see today. We examined 359 individuals from 23 populations over 370 km within the James River Basin of Virginia, USA as well as four populations outside of the basin. Population diversity levels and genetic structure were quantified using several analyses. Within the James River Basin there was evidence for three separate introductions and a zone of secondary contact between two distinct lineages, suggesting a relatively recent expansion within the basin. Microstegium vimineum possesses a mixed-mating system advantageous to invasion, and populations with low diversity were found, suggesting a recent founder event and self-fertilization. However, surprisingly high levels of diversity were found in some populations, suggesting that out-crossing does occur. Understanding how invasive species spread and the genetic consequences following expansion may provide insights into the cause of invasiveness and can ultimately lead to better management strategies for control and eradication.
Landscape genetics is a burgeoning field of interest that focuses on how site-specific factors influence the distribution of genetic variation and the genetic connectivity of individuals and populations. In this manuscript, we focus on two methodological extensions for landscape genetic analyses: the use of conditional genetic distance (cGD) derived from population networks and the utility of extracting potentially confounding effects caused by correlations between phylogeographic history and contemporary ecological factors. Individual-based simulations show that when describing the spatial distribution of genetic variation, cGD consistently outperforms the traditional genetic distance measure of linearized FST under both 1- and 2-dimensional stepping stone models and Cavalli-Sforza and Edwards’ chord distance Dc in 1-dimensional landscapes. To show how to identify and extract the effects of phylogeographic history prior to embarking on landscape genetic analyses, we use nuclear genotypic data from the Sonoran desert succulent Euphorbia lomelii (Euphorbiaceae), for which a detailed phylogeographic history has previously been determined. For E. lomelii, removing the effect of phylogeographic history significantly influences our ability to infer both the identity and the relative importance of spatial and bio-climatic variables in subsequent landscape genetic analyses. We close by discussing the utility of cGD in landscape genetic analyses.
To examine the generality of population-level impacts of ancient vicariance identified for numerous arid-adapted animal taxa along the Baja peninsula, we tested phylogeographical hypotheses in a similarly distributed desert plant, Euphorbia lomelii (Euphorbiaceae). In light of fossil data indicating marked changes in the distributions of Baja floristic assemblages throughout the Holocene and earlier, we also examined evidence for range expansion over more recent temporal scales. Two classes of complementary analytical approaches (hypothesis-testing and hypothesis-generating) were used to exploit phylogeographical signal from chloroplast DNA sequence data and genotypic data from six codominant nuclear intron markers. Sequence data are consistent with a scenario of mid-peninsular vicariance originating c. 1 million years ago (Ma). Alternative vicariance scenarios representing earlier splitting events inferred for some animals (e.g. Isthmus of La Paz inundation, c. 3 Ma; Sea of Cortez formation, c. 5 Ma) were rejected. Nested clade phylogeographical analysis corroborated coalescent simulation-based inferences. Nuclear markers broadened the temporal spectrum over which phylogeographical scenarios could be addressed, and provided strong evidence for recent range expansions along the north–south axis of the Baja peninsula. In contrast to previous plant studies in this region, however, the expansions do not appear to have been in a strictly northward direction. These findings contribute to a growing appreciation of the complexity of organismal responses to past climatic and geological changes, even when taxa have evolved in the same landscape context.
The analysis of genetic marker data is increasingly being conducted in the context of the spatial arrangement of strata (e.g. populations), necessitating a more flexible set of analysis tools. GeneticStudio consists of four interacting programs: (i) Geno, a spreadsheet-like interface for the analysis of spatially explicit marker-based genetic variation; (ii) Graph, software for the analysis of Population Graph and network topologies; (iii) Manteller, a general-purpose matrix analysis program; and (iv) SNPFinder, a program for identifying single nucleotide polymorphisms. The GeneticStudio suite is available as source code as well as binaries for OSX and Windows and is distributed under the GNU General Public License.
This manuscript explores the simultaneous evolution of population genetic parameters and topological features within a population graph through a series of Monte Carlo simulations. I show that node centrality and graph breadth are significantly correlated with the population genetic parameters FST and M (ρ = -0.95 and ρ = -0.98, respectively), which are commonly used in quantifying among-population genetic structure and isolation by distance. Next, the topological consequences of migration patterns are examined by contrasting N-island and stepping stone models of gene movement. Finally, I show how variation in migration rate influences the rate of formation of specific topological features, with particular emphasis on the phase transition that occurs when populations begin to become fixed due to restricted movement of genes among populations. I close by discussing the utility of this method for the analysis of intra-specific genetic variation.
Patterns of intraspecific genetic variation result from interactions among both historical and contemporary evolutionary processes. Traditionally, population geneticists have used methods such as F-statistics, pairwise isolation by distance models, spatial autocorrelation and coalescent models to analyse this variation and to gain insight about causal evolutionary processes. Here we introduce a novel approach (Population Graphs) that focuses on the analysis of marker-based population genetic data within a graph theoretic framework. This method can be used to estimate traditional population genetic summary statistics, but its primary focus is on characterizing the complex topology resulting from historical and contemporary genetic interactions among populations. We introduce the application of Population Graphs by examining the range-wide population genetic structure of a Sonoran Desert cactus (Lophocereus schottii). With this data set, we evaluate hypotheses regarding historical vicariance, isolation by distance, population-level assignment and the importance of specific populations to species-wide genetic connectivity. We close by discussing the applicability of Population Graphs for addressing a wide range of population genetic and phylogeographical problems.