domingo, 14 de abril de 2019

Understanding Bayesian Inference with a simple example in R!



 Hi there!

Last summer, the Royal Botanical Garden (Madrid, Spain) hosted the first edition of MadPhylo, a workshop about Bayesian Inference in phylogeny using RevBayes. It was a pleasure for me to be part of the organization staff with John Huelsenbeck, Brian Moore, Sebastian Hoena, Mike May, Isabel Sanmartin and Tamara Villaverde. Next edition of Madphylo will be held June 10, 2019 to June 19, 2019at the Real Jardín Botánico de Madrid. If you are interested in Bayesian Inference and phylogeny just can't miss it! You'll learn the RevBayes language, a programming language to perform phylogeny (and other) analyses under a  Bayesian framework! Apply here!


 

The first days were focused to explain how we can use the Bayesian framework to estimate the parameters of a model. I'm not an expert in Bayesian Inference at all, but in this post I'll try to reproduce one of the first Madphylo tutorials in R language. As a simple example, we'll use a coin flipping experiment. We can model this experiment with a binomial distribution:

binomial(n, p)

where n is the number of tosses and the parameter p is the probability to obtain a head. Given the number of tosses and the number of heads (h), we can use Bayesian Inference to compute the probability to obtain a head (p).

If we use Bayes’ theorem, we have that the probability of a specific value of p given the number of heads (h) is:

where Pr(h|p) is the likelihood of the data given a value of p, Pr(p) is the prior probability for p and Pr(h) is the marginal likelihood for the data. The prior probability of our parameter represent our believe about what values can take the parameter. If we think our parameter can take whatever value with the same probability, we use an uniform (flat) prior.

We can use a Markov Chain Monte Carlo (MCMC) to introduce many different values of p and get the posterior probability of these values. To do this, we can use the Metropolis-Hastings MCMC algorithm, with the next steps (simplified):

- Step 1) Set an initial value for p.

 p <- runif(1, 0, 1)  

- Step 2) Propose a new value of p, called p'.

 p_prime <- p + runif(1, -0.05, 0.05)  

- Step 3) Compute the acceptance probability of this new value for the parameter. We have to check if the new value improves the posterior probability given our data. This can be seen as the ratio: Pr(p'|h) / Pr(p|h).



It can be simplify as:




 
The advantage of this method is that we avoid to compute the marginal likelihood, that is often difficult to obtain with more complex models. Let’s stop here a little bit to explain each term of this equation.

Since our main model is a binomial model (coin toss), the likelihood function Pr(h|p) can be defined in R language as:

 # help(dbinom)  
 likelihood <- function(h, n, p){  
  lh <- dbinom(h, n, p)  
  lh  
 }  

Pr(p) is our prior probability, that is, our believe about what values can take p. The only thing that we know is that it must be a value between 0 and 1, since it is a probability. If we think that all values have the same probability, we can define a flat prior using the beta distribution. A beta distribution with parameters β(1,1) is a flat distribution between 0 and 1 (you can learn more about beta distribution here). Then, in R we can obtain Pr(p) under a β(1,1) as:

 # help(dbeta)  
 dbeta(p, 1, 1)  

Now, the acceptance probability (R, see equations in Step 3) will be the minimum value: 1 or the ratio of posterior probabilities given the different p. We express this equation in R language as follows:

 R <- likelihood(h,n,p_prime)/likelihood(h,n,p) * (dbeta(p_prime,1,1)/dbeta(p,1,1))  

- Step 4) Next, we generate a uniform random number between 0 and 1. If this number is < R, we will accept the new value for p (p') and we update the value of p = p'. If not, the change is rejected.

  if (R > 1) {R <- 1}  
   random <- runif (1, 0, 1)  
  if (random < R) {  
   p <- p_prime  
  }  

- Step 5) Now we record the current value of p.

 posterior[i,1] <- log(likelihood(h, n, p))  
 posterior[i,2] <- p  

Finally, we should repeat this loop many times to obtain a good estimate of p. This can be easily done in R using a For loop (check the full code below).
 
  Following the example studied in MadPhylo with RevBayes, I wrote some code to reproduce this example with R. We toss a coin 100 times, and we obtain 73 heads. Is my coin biased? Let's check what is the probability to obtain a head given the data:

 # Set the numer of tosses.  
 n <- 100
 # Set the number of heads obtained.  
 h <- 73  
 # Define our likelihood function.   
 # Since our model is a binomial model, we can use:  
 likelihood <- function(h,n,p){  
  lh <- dbinom(h,n,p)  
  lh  
 }  
 # Set the starting value of p  
 p <- runif(1,0,1)  
 # Create an empty data.frame to store the accepted p values for each iteration.  
 # Remember: "the posterior probability is just an updated version of the prior"  
 posterior <- data.frame()  
 # Set the lenght of the loop (Marcov Chain, number of iterations).  
 nrep <- 5000  
 # Start the loop (MCMC)  
 for (i in 1:nrep) {  
  # Obtain a new proposal value for p  
  p_prime <- p + runif(1, -0.05,0.05)  
  # Avoid values out of the range 0 - 1  
  if (p_prime < 0) {p_prime <- abs(p_prime)}  
  if (p_prime > 1) {p_prime <- 2 - p_prime}  
  # Compute the acceptance proability using our likelihood function and the  
  # beta(1,1) distribution as our prior probability.  
  R <- likelihood(h,n,p_prime)/likelihood(h,n,p) * (dbeta(p_prime,1,1)/dbeta(p,1,1))  
  # Accept or reject the new value of p  
  if (R > 1) {R <- 1}  
   random <- runif (1,0,1)  
  if (random < R) {  
   p <- p_prime  
  }  
  # Store the likelihood of the accepted p and its value  
  posterior[i,1] <- log(likelihood(h, n, p))  
  posterior[i,2] <- p  
  print(i)  
 }  

Let's plot some results...


 par(mfrow= c(1,2))  
 prior <- rbeta(5000, 1,1)  
 plot(1:5000 ,posterior$V2, cex=0, xlab = "generations", ylab = "p",  
    main = "trace of MCMC\n accepted values of parameter p\n prior = beta(1,1) generations = 5000")  
 lines(1:5000, posterior$V2, cex=0)  
 abline(h=mean(posterior$V2), col="red")  
 plot(density(posterior$V2), xlim = c(min(min(prior),min((posterior$V2))), max(max(prior),max((posterior$V2)))),   
    ylim = c(0, max(max(density(prior)$y),max((density(posterior$V2)$y)))), main= "prior VS posterior\n prior= beta(1,1)",  
    lwd=3, col="red")  
 lines(density(prior), lwd=3, lty=2, col="blue")  
 legend("topleft", legend=c("prior density","posterior density"),  
     col=c("blue","red"), lty=c(3,1), lwd=c(3,3), cex = 1)  


Well, we can see that the probability to obtain a head given our data is around 0.7, so our coin must be a fake!

 You can play with the code and explorewith a different number of tosses, or the effect of a different prior for p...





If you want to learn more about Bayesian Inference, I recommend you these YouTube videos by Rasmus Bååth, or this wonderful book/course by Richard McElreath
 

That's all!

Enjoy!



domingo, 25 de noviembre de 2018

Plotting wind highways using rWind



Hi there!

Our manuscript about rWind R package has been recently accepted for publication in Ecography! As you know, rWind is a tool used to download and manage wind data, with some utilities that make easy to include wind information in ecological or evolutionary analyses (or others!).

Though there are several examples in the Supporting material of the publication, and others in the vignette included in the CRAN version, here you have a full tutorial with the main utilities of rWind... enjoy!

Computing wind connectivity of Azores Archipelago


In this example, we will compute wind connectivity of Azores Islands with different points of Europe, Africa and North America during June 2018. I will divide each task of this tutorial in different sections for clarification.

 

Downloading wind time series using lubridate and wind.dl_2()


In the last version of rWind (1.0.4), we can use lubridate to create a list of dates which will be used latter with wind.dl_2 to download wind time series. The object obtained with wind.dl_2 is an "rWind_series" "list". This object can be easily used with other rWind functions as we'll see later.

 # Make sure that you have the last version of rWind  
   
 devtools::install_github("jabiologo/rWind")  
 library(rWind)  
 library(lubridate)  
   
 # Here, we use ymd_hms from lubridate package to create a sequence of dates  
   
 dt <- seq(ymd_hms(paste(2018,6,1,12,00,00, sep="-")),  
      ymd_hms(paste(2018,6,30,12,00,00, sep="-")),by="1 days")  
   
 # Now we can use wind.dl_2 with this sequence of dates. Have into account  
 # that it could take a while, they are 30 datasets and it's a big area.   
   
 ww <- wind.dl_2(dt,-85,5,20,60)  


Obtaining maximum wind speed during the time series with dplyr


We can use the rWind function called "tidy" to play with the "rWind_series" "list" object to obtain, for example, the maximum speed reported for each cell in the study area during the time series.

 # After tidy our wind data, we can use dplyr utilities to easily  
 # obtain several metrics from our wind data. Notice that wind average  
 # is a special case, and it will be discussed later  
   
 t_ww <- tidy(ww)  
   
 library(dplyr)  
 g_ww <- t_ww %>% group_by(lat, lon)  
   
 # Now, we select only the maximum speed values for each coordinate  
   
 max_ww <- g_ww %>% summarise(speed = max(speed))  
 maxw <- cbind(max_ww$lon, max_ww$lat, max_ww$speed)  
 head(maxw)  
   
 # We can also check the maximum speed value in the whole study area   
   
 max(maxw[,3])  
   
 # Finally, we can transform this data into a raster layer  
   
 rmax <- rasterFromXYZ(maxw)  
 acol <- colorRampPalette(c("white", "blue", "darkblue"))  
 plot(rmax, col=acol(1000), main= "Maximum wind speed reported",  
     xlab="Longitude", ylab="Lattitude")  
   
 library(rworldmap)  
 lines(getMap(resolution = "high"), lwd=2)  



Computing averages of wind speeds and directions


Since wind data is a vectorial magnitude given in two components (U and V), to compute speed and direction averages we should use both vectors. To make it easier, rWind includes wind.mean function. Notice that wind.mean is specially developed to work with wind.dl_2 outputs, so it will require "rWind_series" object as input.
 # We can easily use wind.mean() function  
   
 w_mean <- wind.mean(ww)  
   
 # And then use wind2raster directly to create a raster layer  
   
 r_mean <- wind2raster(w_mean)  

 # We can plot a subset of this raster around Azores Islands.  
 # Using "arrowDir" we can include arrows for wind direction with   
 # "Arrowhead" function from "shape" R package  
   
 plot(r_mean$wind.speed, main= "Wind speed and direction average",
      col=acol(1000), xlim=c(-38,-18), ylim=c(33,43))  
 alpha <- arrowDir(w_mean)  
 library(shape)  
 Arrowhead(w_mean$lon, w_mean$lat, angle=alpha,   
      arr.length = 0.2, arr.type="curved")  
 lines(getMap(resolution = "low"), lwd=4)  


Wind connectivity between Azores and mainland


To complete this tutorial, we will use the downloaded wind data to compute wind connectivity between several points in mainland and Azores Archipelago. This kind of measures can be useful in order to establish theories about island colonization by plants or other organisms that could be influenced by wind currents. We will also use this connectivity to compute least cost paths between two locations, in order to virtually plot a wind highway connecting mainland with Azores islands.

 # First, we define our locations. 7 locations in the East   
 # and 7 in the West. The last one is a point in the Azores islands  
   
 loc <- matrix(c(-7, -4, -1, -9, -6,-10,-15,   
         -60, -56,-63,-76,-76,-81,-80, -27,  
         55, 50, 45, 40, 35, 30, 25,   
         55, 50, 45, 40, 35, 30, 25, 39), 15, 2)  
   
 colnames(loc) <- c("lon", "lat")  
 rownames(loc) <- c(seq(1,14,1), "Azores")  
   
 # Check how "loc" is looking   
   
 tail(loc)  
   
 # You can plot them  
 #plot(r_mean$wind.speed, col=acol(1000))  
 #lines(getMap(resolution = "low"), lwd=2)  
 #points(loc, col="red4", pch = 19, cex = 2)  


Now, we'll execute the next tasks:
  1. We'll use wind2raster to obtain raster layers of wind direction and speed of each day in the time series.
  2. Then, for each day, we'll compute the conductance matrix, a matrix with connectivity values between every single cell of the entire study area.
  3. Later, with each conductance matrix, we'll compute the cost between our locations and will store them in a cost_list object.
  4. Finally, for each day, if the Cost is not Infinite, we'll get the shortest path between two selected locations, and we'll store it into a “paths” object.

 # 1) Raster layers from wind time series  
   
 layers <- wind2raster(ww)  
   
 # 2) Conductance from wind direction and speed  
   
 Conductance <- flow.dispersion(layers, type="passive",  
                 output="transitionLayer")  
   
 # 3) and 4) Using a loop to compute cost between  
 # locations and least cost paths between location 11  
 # and Azores islands (as an example).  
   
 cost_list <- array(NA_real_, dim=c(15,15,30))  
 paths <- list(1:30)  
   
 library(gdistance)  
   
 for (i in 1:30){  
  cost_list[,,i] <- costDistance(Conductance[[i]],loc)  
  if (costDistance(Conductance[[i]],loc[11,] , loc[15,]) != Inf){  
   paths[[i]] <- shortestPath(Conductance[[i]], loc[11,], loc[15,],  
                 output="SpatialLines")  
   print(i)  
  }  
 }


Now, we will manage a little bit the results to plot them in a cool way.



 # Transform costs in connectivity  
   
 connectivity <- 1/cost_list  
   
 # Obtain connectivity averages  
   
 conn_avg <- apply(connectivity, c(1, 2), mean, na.rm = TRUE)  
   
 # Here we filter only connectivity from mainland to Azores  
 # As connectivity has arbitrary magnitude and units, we will   
 # multiply connectivity values 1000 times to avoid decimal zeros.  
   
 conn_azores <- conn_avg[,15]  
 conn_azores <- conn_azores *1000  
   
 # Finally, we can plot connectivity values relative to the width  
 # of the connections between mainland and Azores. You can power  
 # the values to increase the differences between them.  
   
 plot(r_mean$wind.speed, main= "Connectivity from mainland to Azores",
     col=acol(1000))  
 lines(getMap(resolution = "low"), lwd=1)  
 for (i in 1:14){  
  lines(rbind(loc[i,], loc[15,]), lwd = conn_azores[i]^3, col="red4")  
 }  
 points(loc, col="red", pch = 19, cex = 1.5)  
 text( loc+2, labels=c(round(conn_azores[1:14], 2),"Azores"),  
    col="white", cex= 1.5, font=2)  

Last but not least, we can use the shortest paths stored for each day to plot a density heatmap. This can help us to visualize wind corridors across our study area (our wind highways!).

 # We'll use spatstat library to convert the shortest  
 # paths lines into an psp object.  
   
 library(spatstat)  
   
 paths_clean <- paths[!sapply(paths, is.null)]  
 paths_merged <- paths_clean[[1]]  
 for (h in 2:length(paths_clean)) {  
  paths_merged <- rbind(paths_merged,paths_clean[[h]])  
 }  
 paths_psp <- as(paths_merged, "psp")  
   
 # Now, we can obtain a kernel density of the paths  
 # and convert it into a raster layer that can be plotted.  
 # You can play with sigma value to modify the smoothing.  
   
 lines_kernel <- density(paths_psp, sigma=1, dimyx=c(350,410))  
 kernel <- raster(lines_kernel)  
 #kernel[kernel<(maxValue(kernel)*0.1)]<-NA  
   
 ext <- extent(r_mean$wind.speed)  
 kernel <- extend(kernel, ext)  
 kernel[is.na(kernel)] <- minValue(kernel)  
   
 acol <- colorRampPalette(c("grey40","darkblue","red2",
                               "orange","yellow","white"))  
 plot(kernel, col=acol(1000), zlim=c(-0.01,5),  
       main = "Least cost paths density", xlab="Longitude",  
       ylab="Lattitude")  
 lines(getMap(resolution = "high"), lwd=2)





Isn't it cool?

 That's all for now!





lunes, 23 de julio de 2018

Antipredator behavior with R (or why wildebeest should stay together)



Hi there!

 In 2010, when I was studying my Biology Degree at Universidad Complutense in Madrid, I fell in love with a documentary miniseries called Great Migrations (National Geographic). Their episodes talk about awesome migrations of animals around the globe. One of these chapters is “Great Migrations: Science Of Migrations”. It shows how scientists study the patterns and processes of animal migration. One of these researchers is John Fryxell, from Gelph University in Canada. John explains how mathematical models can help to understand movement patterns. Through simulations, he shows how wildebeests maintaining a clustered movement pattern can more effectively avoid to be hunted by a virtual lion (here is the Spanish version in YouTube). I have to confess that each time I see those images, I think that I would be completely fulfilled with this kind of jobs.





I modified a little bit my previous “go” function to reproduce the experiment shown by John. This is a small tribute to this documentary series and to all those people who inspire and motivate us and arouse our curiosity, as John.

You can find the code to perform these simulations with R below. To perform the gif of this post, you can follow this procedures. And if you want to be updated with my posts, find me on my new Twitter account @javi_ferlop

That is all, enjoy!




 # First, we prepare packages and raster layers to be used later. We also load  
 # the species class to define the hunter inital location and his environmental  
 # optimum (grass cover in this case), as in previous posts.  
   
 library (raster)  
 library(dismo)  
   
 gc <- raster(nrows = 100, ncols = 100, xmn = 0, xmx = 100, ymn = 0, ymx = 100)  
 gc[] <- runif(10000, -80, 180)  
 gc <- focal(gc, w = matrix(1, 5, 5), mean)  
 gc <- focal(gc, w = matrix(1, 5, 5), mean)  
   
 species <- setClass("species", slots = c(x = "numeric", y = "numeric",   
                      opt = "numeric"))  
   
 # Then we will load the hunt function. This is pretty similar to my previous  
 # "go" function. However, I added a "preys" attribute, a set of locations  
 # of available preys. After each movement of the hunter, the function checks  
 # if the hunter is in the same cell as any prey. If yes, it records this  
 # location and delete this prey from the available preys. The output is a 4  
 # column dataframe: the first two are the location of the hunter at each  
 # iteration. The last two show if there is any capture at this iteration.   
   
 hunt <- function (sp, env, n, sigma, theta_x, alpha_x, theta_y, alpha_y, preys) {  
  track <- data.frame() #create an empty data.frame to store each step coordinates  
  track[1,1] <- sp@x  #the first position will be the initial position of the specimen defined by the species class  
  track[1,2] <- sp@y  #the first position will be the initial position of the specimen defined by the species class  
  hunted <- data.frame()#create an empty data.frame to store hunted preys  
    
    
  for (step in 2:n) {  
     
   #update preys locations   
     
   prey_cells <- cellFromXY(env, as.matrix(preys + step))  
     
   #First, the function searches in the adjacent cells what have the value that best fits with the species requirements  
   lon_candidate<--9999  
   lat_candidate<--9999  
   #randomization/atraction routine  
   while ( is.na(extract(env, matrix(c(lon_candidate,lat_candidate),1,2)))) {  
    lon_candidate <- track[step-1,1]+ (sigma * rnorm(1)) + (alpha_x * ( theta_x - track[step-1,1]))  
    lat_candidate <- track[step-1,2]+ (sigma * rnorm(1)) + (alpha_y * ( theta_y - track[step-1,2]))  
   }  
     
   neig <- adjacent(env,   
            cellFromXY(env, matrix(c(lon_candidate,  
                        lat_candidate), 1,2)),   
            directions=8, pairs=FALSE )  
     
   options <- data.frame()  
   for (i in 1:length(neig)){  
    options[i,1]<-neig[i]  
    options[i,2]<- sp@opt - env[neig[i]]  
   }  
   option <- c(na.omit(options[abs(options$V2) == min(abs(na.omit(options$V2))), 1 ]),   
         na.omit(options[abs(options$V2) == min(abs(na.omit(options$V2))), 1 ]))  
     
   new_cell <- sample(option,1)  
   new_coords <- xyFromCell(env,new_cell)  
     
   track[step,1] <- new_coords[1]  
   track[step,2] <- new_coords[2]  
     
   if (new_cell %in% prey_cells) {  
    hunted[step,1] <- new_coords[1]  
    hunted[step,2] <- new_coords[2]  
    preys <- preys[-match(new_cell,prey_cells),]  
   }  
   else {  
    hunted[step,1] <- NA  
    hunted[step,2] <- NA  
   }  
  }  
  return(cbind(track,hunted))  
 }  
   
 # Later we define our predator, in this case, a lion! The inital possition will  
 # be at the middle of the space (50,50). We will also select the environmental  
 # optimum of our lion, 60% of grass cover (our lion hates bare ground xD)  
   
 lion <- species(x= 50, y =50, opt= 60)  
   
 # Now, we will set the locations of a bunch of wildebeests using "randomPoints"   
 # from "dismo" package. As our wildebeests will be running around, we will   
 # initially locate them at the bottom left.  
   
 r <- raster(nrows=500, ncols=500, xmn=-500, xmx=0, ymn=-500,ymx=0)  
 wb <- randomPoints(r, 10000)  
   
 # Here we have our wildebeests ready to run!  
   
 plot(wb)  
   
 # Let start the hunting!   
   
 hunting <- hunt(lion, gc, 600, 2, 0, 0, 0, 0, wb)  
   
 # In "hunting" object is stored each location of our lion during the 600   
 # iterations, as well as each time she catches a wildebeest. To see only the   
 # catches, you can use:  
   
 na.omit(hunting)  
   
 # Pretty awesome, right? But, what happens if we group our wildebeests in   
 # clusters? To do it, we will define another small function that creates   
 # clusters of points:  
   
 clus <- function(n,m1,m2){  
  df<-data.frame()  
  for (i in 1:n){  
     
   a <- rnorm(100,m1[i],1)  
   b <- rnorm(100,m2[i],1)  
   df <- rbind(df,data.frame(a,b))  
  }  
    
  return(df)  
 }  
   
 wb_clustered <- clus(100, runif(100,-500,0),runif(100,-500,0))  
   
 # Look at the wildebeests locations now!  
   
 plot(wb_clustered)  
   
 # Ok, let's go again!  
   
 hunting_clustered <- hunt(lion, gc, 600, 2, 0, 0, 0, 0, wb_clustered)  
   
 # Let's chek how many catches he have now:  
   
 na.omit(hunting_clustered)  
   
 # And comparing both strategies...  
   
 nrow(na.omit(hunting))  
 nrow(na.omit(hunting_clustered))  
   
 # ...looks like wildebeests should stay together!!  
 # You can run these simulations n times to see if the results are robust.  
   
 # Finally we can print each iteration to make a cool gif ;)  
   
 for (i in 1:600) {  
  id <- sprintf("%03d", i)  
  png(paste("lion1_",id,".png", sep=""), width=1300, height=690, units="px",  
    pointsize = 16)  
    
  par(mfrow=c(1,2))  
    
  plot(gc, xlim = c(6,94), ylim = c(6,94), main="random", legend=FALSE)  
  points(hunting[i,1:2], cex=1.5, col="red", pch = 0, lwd =3)  
  points(hunting[1:i,3:4], cex=1, col="blue", pch = 3, lwd =3)  
  points (wb +i, pch = 20)  
    
  plot(gc, xlim = c(6,94), ylim = c(6,94), main="clustered", legend=FALSE)  
  points(hunting_clustered[i,1:2], cex=1.5, col="red", pch = 0, lwd =3)  
  points(hunting_clustered[1:i,3:4], cex=1, col="blue", pch = 3, lwd =3)  
  points (wb_clustered +i, pch = 20)  
    
  dev.off()  
 }  
   
   

miércoles, 4 de julio de 2018

New rWind release on CRAN! (v1.0.2)



Hi there!

Just a few lines to inform you about the new release of rWind R package (v1.0.2). This version have several new features. Here you have an example of one of them, the function wind.dl_2 to download time series of wind data. In this example we create a bunch of PNG files to be converted later into a GIF of wind speed of east Asia region.

Enjoy!




 # We will create a gif of wind speed for east Asia during the end of June (2018)  
   
 # First, we should load the packages that we will use. Use install.packages() to   
 # download and install the new version of rWind (v1.0.2)  
   
 install.package("rWind")  
   
 library(rWind)  
   
 library(fields)  
 library(shape)  
 library(rworldmap)  
 library(lubridate)  
   
 # Now, we use lubridate package to create a sequence of dates/times (each three  
 # hours)  
   
 dt <- seq(ymd_hms(paste(2018,6,25,00,00,00, sep="-")),  
      ymd_hms(paste(2018,7,4,21,00,00, sep="-")),by="3 hours")  
   
 # Now, we use the new function wind.dl_2 to download the whole time series of  
 # wind data. We use the "dt" object created with lubridate to provide the input   
 # to wind.dl_2. Since it's a large area and many days, it could take a while...  
   
 wind_series <- wind.dl_2(dt,90,150,5,40)  
   
 # Next, we can use wind2raster function from rWind package directly over the   
 # output from wind.dl_2, it has been adapted to work with this lists of wind  
 # data.  
   
 wind_series_layer <- wind2raster(wind_series)  
   
 # Finally, we will create a bunch of PNG files to be converted later in a GIF  
   
 id<-0  
 for (i in 1:72) {  
  id <- sprintf("%03d", i)  
  png(paste("asia",id,".png", sep=""), width=1000, height=600, units="px",  
    pointsize=18)  
  image.plot(wind_series_layer[[i]]$wind.speed, col=bpy.colors(1000),  
        zlim=c(0,18), main =wind_series[[i]]$time[1])  
  lines(getMap(resolution = "low"), lwd=3)  
  dev.off()  
 }  
   

martes, 1 de mayo de 2018

Simulating animal movements and habitat use


 Hi there!

 I was training some ways to simulate animal (or other organisms) movements having into account habitat suitability. To do this, I used my previous eWalk model as the underlying process to simulate random or directional walks. This model is based on Brownian / Ornstein–Uhlenbeck process. You can find more about eWalk model here!

 Today, I will add one more element to this movement simulations. In this case, we will have into account the habitat or environmental preferences of the simulated species, to perform a simulation like this:



 First, we will create a raster layer as a random environmental variable, for example tree cover.

   
 library (raster)  
 library (dismo)  
   
 tc <- raster(nrows=100, ncols=100, xmn=0, xmx=100, ymn=0,ymx=100)  
 tc[] <- runif(10000, -80, 180)  
 tc <- focal(tc, w=matrix(1, 5, 5), mean)  
 tc <- focal(tc, w=matrix(1, 5, 5), mean)  
 plot(tc)  

 Second, we will define the species class. The species will be defined by their position (coordinates), and their optimum for the environmental variable.

   
 species <- setClass("species", slots=c(x="numeric", y="numeric", opt="numeric"))  


 Here we will define the Red deer species as a specimen in the coordinates (50,50) and an optimum of 80 for the environmental variable (tree cover). In the same way, we will define the Egyptian mongoose as a specimen in the coordinates (50,50) and an optimum of 30 for the tree cover variable.

   
 Red_deer <- species(x= 50, y =50, opt= 90)  
 Egyptian_mongoose <- species(x= 50, y =50, opt= 30)  
   

 Now, we will load the "go" function (I do not have a name for it yet). It require a species (sp), a raster layer with any environmental variable (env), number of iterations (n), a Brownian motion parameter (that is, how random is the movement of your species), a geographical optimum (the wanted destination of your species theta_x and theta_y), and the attraction strength or "interest" of the species to get this position (alpha_x and alpha_y). The syntaxis should be something like this:

   
 path <- go (sp, env, n, sigma, theta_x, alpha_x, theta_y, alpha_y)  
   

 Here is the function to load (I will comment the function in a future post):

   
 go <- function (sp, env, n, sigma, theta_x, alpha_x, theta_y, alpha_y) {  
  track <- data.frame()  
  track[1,1] <- sp@x  
  track[1,2] <- sp@y  
  for (step in 2:n) {  
   neig <- adjacent(env,   
            cellFromXY(env, matrix(c(track[step-1,1],  
                        track[step-1,2]), 1,2)),   
            directions=8, pairs=FALSE )  
   options <- data.frame()  
   for (i in 1:length(neig)){  
    options[i,1]<-neig[i]  
    options[i,2]<- sp@opt - env[neig[i]]  
   }  
   option <- c(options[abs(na.omit(options$V2)) == min(abs(na.omit(options$V2))), 1 ],   
         options[abs(na.omit(options$V2)) == min(abs(na.omit(options$V2))), 1 ])  
   new_cell <- sample(option,1)  
   new_coords <- xyFromCell(env,new_cell)  
   lon_candidate<--9999  
   lat_candidate<--9999  
     
   while ( is.na(extract(env, matrix(c(lon_candidate,lat_candidate),1,2)))) {  
    lon_candidate <- new_coords[1]+ (sigma * rnorm(1)) + (alpha_x * ( theta_x - new_coords[1]))  
    lat_candidate <- new_coords[2]+ (sigma * rnorm(1)) + (alpha_y * ( theta_y - new_coords[2]))  
   }  
   track[step,1] <- lon_candidate  
   track[step,2] <- lat_candidate  
  }  
  return(track)  
 }  
   
   


 Well, now we can perform a simple experiment with our two specimens. We will simulate random movement of these two species having into account their environmental optimums. The "go" function will return us the track or the path followed by each specimen (coordinates by each step).

   
 deer_simul <- go (Red_deer, tc, 100, 2, 90, 0, 90, 0)  
 mongoose_simul <- go (Egyptian_mongoose, tc, 100, 2, 90, 0, 90, 0)  
   

 We can plot the paths...

   
 plot(tc)  
 lines(deer_simul, lwd=1.5, col="red")  
 points(deer_simul, cex=0.3, col="red")  
 lines(mongoose_simul, lwd=1.5, col="blue")  
 points(mongoose_simul, cex=0.3, col="blue")  
 legend("topleft", legend=c("deer","mongoose"), col=c("red","blue"),  
     lty=c(1,1), lwd=c(2,2))  
   

 To test if each species is actually "searching" their environmental optimum, we can extract the environmental values by step for each species and plot their density distributions.

   
 plot(density(extract(tc, deer_simul)),lwd=3, col="red", xlim=c(20,80),   
    ylim=c(0,max(c(density(extract(tc, deer_simul))$y,  
           density(extract(tc, mongoose_simul))$y))),  
    main="locations density distribution", xlab="tree cover")  
 lines(density(extract(tc, mongoose_simul)),lwd=3, col="blue")  
 legend("topleft", legend=c("deer","mongoose"), col=c("red","blue"),  
     lty=c(1,1), lwd=c(3,3))  
   

 So, we can see that the deer is actually using patches with a higher value of tree cover than the mongoose... our simulation worked! You can use the code of this post to perform a GIF like the one above.

 That's all I have to say about this for now... In the next posts we will simulate more animal movements and migrations!


PD: Let me show you a nice song about the mongoose...