martes, 9 de mayo de 2017

niceOverPlot, or when the number of dimensions does matter


  Hi there!

    Over the last few months, my lab-mate Irene Villa (see more of her work here!)
and I have been discussing ecological niche overlap. The niche concept dates back to ideas first proposed by ornithologist J. Grinnell (1917). Later on, G.E. Hutchinson
(1957) defined the ecological niche of a species as the n-dimensional hyper-volume of ecological factors that define a space where the species can exist indefinitely. These ecological factors can be either abiotic (fundamental niche) or biotic variables, and the intersection of both defines the realized niche, or potential ecological range where the species can survive. The last necessary element for the existence of a species in a particular place (actual presence or distribution) is related to the “movement” ability or dispersal capacity of the species to arrive to those places where ecological factors are suitable. This theory is nicely summarized in the BAM diagram (Biotic-Abiotic-Movement; Soberón, 2007).

    In the last decades, with the development of cartographic information about climatic variables (mainly temperature and precipitation) and mathematical algorithms, many research efforts have been directed towards modeling the ecological niche of species from presence records and bioclimatic information (Elith & Leathwick, 2009). As these techniques were developed, many researchers focused on the comparison of ecological niches of two species, and the degree of overlap between them. A wide range of literature has been published about this topic (Peterson, 2011) but, in the last years, a general consensus has been reached about the methods to measure this overlap. Briefly, the method consists of two steps: 1) building an environmental space using bioclimatic values in the presence records and background points with any multivariate technique (usually Principal Component Analysis), and 2) representing both species niches in this space via kernel density, comparing them using D or I indexes and applying similarity or identity tests to assess significance (Warren et al., 2008).

    Following this general approach, the R package ecospat (Di Cola et al., 2016) provides an easy way to perform a Principal Component Analysis and to compare two species niches using D overlapping index, identity tests, etc. However, Irene and I realized that, when using this approach, in many cases for which niche overlap was not obtained, this was due to the lack of overlap for just one of the two PC axes, while a complete overlap was retrieved for the other axis. Taking into account that both axes do not usually contribute equally to explain environmental space (the first axis usually explains much more variability than the following ones), we worked on a function using the ggplot2 R package to create a nice plot for visual exploration of the Principal Component scores in both axes. This plot allows us to examine the factors that are or are not producing niche overlap, since it shows three plots in one: each axis separately and the joint distribution.


   Here is the function code (too large to print here). You can download and run it in an R console. The following is an example with two wall lizards from a Mediterranean area. You can download ecological values for the 19 bioclimatic variables at presence points of these two species here.


   
 ##########################
 #######   EXAMPLE   ######  
 ##########################
library(ecospat)
library(ggplot2)
library(grid)
library(gridExtra)
library(gtable)
library(RColorBrewer) 
   
 # Read wall lizards ecological data  
   
 data_Ph_Pl<-read.csv("./data_Ph_Pl.csv")  
   
 # Note: This data represent the values for the 19 bioclimatic variables at occurrence   
 # points for two Mediterranean wall lizards, Iberian wall lizard (Podarcis hispanicus,   
 # mainly in Iberian Peninsula) and Lilford's wall lizard (Podarcis lilfordi, from the   
 # Balearic Islands). Data were downloaded from GBIF* and Worldclim (http://worldclim.org/).  
 # The first 1125 rows are the bioclimatic values for P.hispanicus presences, while the  
 # next 44 rows represent bioclimatic values for P.lilifordi. The following rows represent   
 # available bioclimatic conditions (background) obtained from 10000 random points   
 # generated in a buffer of 400 km approximatively around presence points. Some inaccuracies   
 # (such as 2 Lilford's wall lizard presences in Iberian Peninsula) were not removed (this is   
 # just an example!!).  
 # *Data searches:  
 # Podarcis hispanicus: GBIF.org (30th April 2017) GBIF Occurrence Download http://doi.org/10.15468/dl.yx4fbg  
 # Podarcis lilfordi: GBIF.org (30th April 2017) GBIF Occurrence Download http://doi.org/10.15468/dl.qqmzq3  
   
 # From this data, we perform a Principal Component Analysis using "dudi.pca" function from "ecospat" package.  
 # We will choose 2 axes for the environmental space representation.  
   
 pca_Ph_Pl <-dudi.pca(na.omit(data_Ph_Pl, center = T, scale = T, scannf = F, nf = 2))  
 2  
   
 # Now, we can use directly this result with the niceOverPlot function to represent a   
 # 2D environmental space in a central plot, and each environmental gradient represented   
 # by each axis at top and right. We must provide to the function with the number of presences  
 # of each species in the same order as in the input data (nº of presences  
 # for Sp1 and nº of presences for Sp2)  
   
 niceOverPlot(pca_Ph_Pl, n1=1125 , n2= 44)  

Results from niceOverPlot (Blue area: P. lilifordi; Pink area: P. hispanicus)

      

  This plot could help us to interpret the results obtained from niche overlap analysis for these two species. In this example, we can see that, in 2D environmental space, the bioclimatic preferences of the two wall lizard species do not overlap. But you can notice that this lack of overlap is due to Axis2 of the PCA, which represents around 18% of contribution in environmental space. Now, we can focus on the bioclimatic variables that are involved in this axis to understand which factors are preventing the overlap.
In upcoming posts, we will discuss some alternative ways to address the D overlap index to take into account cases like our example.

Have fun!

Podarcis lilifordi, from the Balearic Islands

 References


-Soberón, J. (2007). Grinnellian and Eltonian niches and geographic distributions of species. Ecology letters, 10(12), 1115-1123.

-Elith, J., & Leathwick, J. R. (2009). Species distribution models: ecological explanation and prediction across space and time. Annual review of ecology, evolution, and systematics, 40, 677-697.

-Peterson, A. T. (2011). Ecological niche conservatism: A time‐structured review of evidence. Journal of Biogeography, 38(5), 817-827.

-Di Cola, V., Broennimann, O., Petitpierre, B., Breiner, F. T., D'Amen, M., Randin, C., ... & Pellissier, L. (2016). ecospat: an R package to support spatial analyses and modeling of species niches and distributions. Ecography.

6 comentarios:

  1. Nice blog post, I wonder if this could be done for bacterial "species" from environmental samples where you would have a large number of species.

    ResponderEliminar
    Respuestas
    1. Hi Ido!

      Yeah, it would be cool! Currently the function is prepared to plot only two "species" (or groups), but it could be modified...! You could also use other traits as morphological characters in order to built PCA axes...
      Cheers!

      Eliminar
  2. Hi,
    Nice post, thanks for sharing! I think I'm going to look into editing the code for >2 taxa

    ResponderEliminar
    Respuestas
    1. Hi!
      It would be really cool! Actually, we were thinking about that, but we decided first to publish this first version of the function... please, let us know when the edition is ready! Our next steps would need this >2 species plot!

      Cheers!

      Eliminar
  3. Nice plots indeed!
    And, most importantly, a useful contribution for the interpretation of these analyses results!
    Thanks Javi and Irene!
    Javi (another one)

    ResponderEliminar