lunes, 23 de julio de 2018

Antipredator behavior with R (or why wildebeest should stay together)

Hi there!

 In 2010, when I was studying my Biology Degree at Universidad Complutense in Madrid, I fell in love with a documentary miniseries called Great Migrations (National Geographic). Their episodes talk about awesome migrations of animals around the globe. One of these chapters is “Great Migrations: Science Of Migrations”. It shows how scientists study the patterns and processes of animal migration. One of these researchers is John Fryxell, from Gelph University in Canada. John explains how mathematical models can help to understand movement patterns. Through simulations, he shows how wildebeests maintaining a clustered movement pattern can more effectively avoid to be hunted by a virtual lion (here is the Spanish version in YouTube). I have to confess that each time I see those images, I think that I would be completely fulfilled with this kind of jobs.

I modified a little bit my previous “go” function to reproduce the experiment shown by John. This is a small tribute to this documentary series and to all those people who inspire and motivate us and arouse our curiosity, as John.

You can find the code to perform these simulations with R below. To perform the gif of this post, you can follow this procedures. And if you want to be updated with my posts, find me on my new Twitter account @javi_ferlop

That is all, enjoy!

 # First, we prepare packages and raster layers to be used later. We also load  
 # the species class to define the hunter inital location and his environmental  
 # optimum (grass cover in this case), as in previous posts.  
 library (raster)  
 gc <- raster(nrows = 100, ncols = 100, xmn = 0, xmx = 100, ymn = 0, ymx = 100)  
 gc[] <- runif(10000, -80, 180)  
 gc <- focal(gc, w = matrix(1, 5, 5), mean)  
 gc <- focal(gc, w = matrix(1, 5, 5), mean)  
 species <- setClass("species", slots = c(x = "numeric", y = "numeric",   
                      opt = "numeric"))  
 # Then we will load the hunt function. This is pretty similar to my previous  
 # "go" function. However, I added a "preys" attribute, a set of locations  
 # of available preys. After each movement of the hunter, the function checks  
 # if the hunter is in the same cell as any prey. If yes, it records this  
 # location and delete this prey from the available preys. The output is a 4  
 # column dataframe: the first two are the location of the hunter at each  
 # iteration. The last two show if there is any capture at this iteration.   
 hunt <- function (sp, env, n, sigma, theta_x, alpha_x, theta_y, alpha_y, preys) {  
  track <- data.frame() #create an empty data.frame to store each step coordinates  
  track[1,1] <- sp@x  #the first position will be the initial position of the specimen defined by the species class  
  track[1,2] <- sp@y  #the first position will be the initial position of the specimen defined by the species class  
  hunted <- data.frame()#create an empty data.frame to store hunted preys  
  for (step in 2:n) {  
   #update preys locations   
   prey_cells <- cellFromXY(env, as.matrix(preys + step))  
   #First, the function searches in the adjacent cells what have the value that best fits with the species requirements  
   #randomization/atraction routine  
   while (, matrix(c(lon_candidate,lat_candidate),1,2)))) {  
    lon_candidate <- track[step-1,1]+ (sigma * rnorm(1)) + (alpha_x * ( theta_x - track[step-1,1]))  
    lat_candidate <- track[step-1,2]+ (sigma * rnorm(1)) + (alpha_y * ( theta_y - track[step-1,2]))  
   neig <- adjacent(env,   
            cellFromXY(env, matrix(c(lon_candidate,  
                        lat_candidate), 1,2)),   
            directions=8, pairs=FALSE )  
   options <- data.frame()  
   for (i in 1:length(neig)){  
    options[i,2]<- sp@opt - env[neig[i]]  
   option <- c(na.omit(options[abs(options$V2) == min(abs(na.omit(options$V2))), 1 ]),   
         na.omit(options[abs(options$V2) == min(abs(na.omit(options$V2))), 1 ]))  
   new_cell <- sample(option,1)  
   new_coords <- xyFromCell(env,new_cell)  
   track[step,1] <- new_coords[1]  
   track[step,2] <- new_coords[2]  
   if (new_cell %in% prey_cells) {  
    hunted[step,1] <- new_coords[1]  
    hunted[step,2] <- new_coords[2]  
    preys <- preys[-match(new_cell,prey_cells),]  
   else {  
    hunted[step,1] <- NA  
    hunted[step,2] <- NA  
 # Later we define our predator, in this case, a lion! The inital possition will  
 # be at the middle of the space (50,50). We will also select the environmental  
 # optimum of our lion, 60% of grass cover (our lion hates bare ground xD)  
 lion <- species(x= 50, y =50, opt= 60)  
 # Now, we will set the locations of a bunch of wildebeests using "randomPoints"   
 # from "dismo" package. As our wildebeests will be running around, we will   
 # initially locate them at the bottom left.  
 r <- raster(nrows=500, ncols=500, xmn=-500, xmx=0, ymn=-500,ymx=0)  
 wb <- randomPoints(r, 10000)  
 # Here we have our wildebeests ready to run!  
 # Let start the hunting!   
 hunting <- hunt(lion, gc, 600, 2, 0, 0, 0, 0, wb)  
 # In "hunting" object is stored each location of our lion during the 600   
 # iterations, as well as each time she catches a wildebeest. To see only the   
 # catches, you can use:  
 # Pretty awesome, right? But, what happens if we group our wildebeests in   
 # clusters? To do it, we will define another small function that creates   
 # clusters of points:  
 clus <- function(n,m1,m2){  
  for (i in 1:n){  
   a <- rnorm(100,m1[i],1)  
   b <- rnorm(100,m2[i],1)  
   df <- rbind(df,data.frame(a,b))  
 wb_clustered <- clus(100, runif(100,-500,0),runif(100,-500,0))  
 # Look at the wildebeests locations now!  
 # Ok, let's go again!  
 hunting_clustered <- hunt(lion, gc, 600, 2, 0, 0, 0, 0, wb_clustered)  
 # Let's chek how many catches he have now:  
 # And comparing both strategies...  
 # ...looks like wildebeests should stay together!!  
 # You can run these simulations n times to see if the results are robust.  
 # Finally we can print each iteration to make a cool gif ;)  
 for (i in 1:600) {  
  id <- sprintf("%03d", i)  
  png(paste("lion1_",id,".png", sep=""), width=1300, height=690, units="px",  
    pointsize = 16)  
  plot(gc, xlim = c(6,94), ylim = c(6,94), main="random", legend=FALSE)  
  points(hunting[i,1:2], cex=1.5, col="red", pch = 0, lwd =3)  
  points(hunting[1:i,3:4], cex=1, col="blue", pch = 3, lwd =3)  
  points (wb +i, pch = 20)  
  plot(gc, xlim = c(6,94), ylim = c(6,94), main="clustered", legend=FALSE)  
  points(hunting_clustered[i,1:2], cex=1.5, col="red", pch = 0, lwd =3)  
  points(hunting_clustered[1:i,3:4], cex=1, col="blue", pch = 3, lwd =3)  
  points (wb_clustered +i, pch = 20)  

miércoles, 4 de julio de 2018

New rWind release on CRAN! (v1.0.2)

Hi there!

Just a few lines to inform you about the new release of rWind R package (v1.0.2). This version have several new features. Here you have an example of one of them, the function wind.dl_2 to download time series of wind data. In this example we create a bunch of PNG files to be converted later into a GIF of wind speed of east Asia region.


 # We will create a gif of wind speed for east Asia during the end of June (2018)  
 # First, we should load the packages that we will use. Use install.packages() to   
 # download and install the new version of rWind (v1.0.2)  
 # Now, we use lubridate package to create a sequence of dates/times (each three  
 # hours)  
 dt <- seq(ymd_hms(paste(2018,6,25,00,00,00, sep="-")),  
      ymd_hms(paste(2018,7,4,21,00,00, sep="-")),by="3 hours")  
 # Now, we use the new function wind.dl_2 to download the whole time series of  
 # wind data. We use the "dt" object created with lubridate to provide the input   
 # to wind.dl_2. Since it's a large area and many days, it could take a while...  
 wind_series <- wind.dl_2(dt,90,150,5,40)  
 # Next, we can use wind2raster function from rWind package directly over the   
 # output from wind.dl_2, it has been adapted to work with this lists of wind  
 # data.  
 wind_series_layer <- wind2raster(wind_series)  
 # Finally, we will create a bunch of PNG files to be converted later in a GIF  
 for (i in 1:72) {  
  id <- sprintf("%03d", i)  
  png(paste("asia",id,".png", sep=""), width=1000, height=600, units="px",  
  image.plot(wind_series_layer[[i]]$wind.speed, col=bpy.colors(1000),  
        zlim=c(0,18), main =wind_series[[i]]$time[1])  
  lines(getMap(resolution = "low"), lwd=3)  

martes, 1 de mayo de 2018

Simulating animal movements and habitat use

 Hi there!

 I was training some ways to simulate animal (or other organisms) movements having into account habitat suitability. To do this, I used my previous eWalk model as the underlying process to simulate random or directional walks. This model is based on Brownian / Ornstein–Uhlenbeck process. You can find more about eWalk model here!

 Today, I will add one more element to this movement simulations. In this case, we will have into account the habitat or environmental preferences of the simulated species, to perform a simulation like this:

 First, we will create a raster layer as a random environmental variable, for example tree cover.

 library (raster)  
 library (dismo)  
 tc <- raster(nrows=100, ncols=100, xmn=0, xmx=100, ymn=0,ymx=100)  
 tc[] <- runif(10000, -80, 180)  
 tc <- focal(tc, w=matrix(1, 5, 5), mean)  
 tc <- focal(tc, w=matrix(1, 5, 5), mean)  

 Second, we will define the species class. The species will be defined by their position (coordinates), and their optimum for the environmental variable.

 species <- setClass("species", slots=c(x="numeric", y="numeric", opt="numeric"))  

 Here we will define the Red deer species as a specimen in the coordinates (50,50) and an optimum of 80 for the environmental variable (tree cover). In the same way, we will define the Egyptian mongoose as a specimen in the coordinates (50,50) and an optimum of 30 for the tree cover variable.

 Red_deer <- species(x= 50, y =50, opt= 90)  
 Egyptian_mongoose <- species(x= 50, y =50, opt= 30)  

 Now, we will load the "go" function (I do not have a name for it yet). It require a species (sp), a raster layer with any environmental variable (env), number of iterations (n), a Brownian motion parameter (that is, how random is the movement of your species), a geographical optimum (the wanted destination of your species theta_x and theta_y), and the attraction strength or "interest" of the species to get this position (alpha_x and alpha_y). The syntaxis should be something like this:

 path <- go (sp, env, n, sigma, theta_x, alpha_x, theta_y, alpha_y)  

 Here is the function to load (I will comment the function in a future post):

 go <- function (sp, env, n, sigma, theta_x, alpha_x, theta_y, alpha_y) {  
  track <- data.frame()  
  track[1,1] <- sp@x  
  track[1,2] <- sp@y  
  for (step in 2:n) {  
   neig <- adjacent(env,   
            cellFromXY(env, matrix(c(track[step-1,1],  
                        track[step-1,2]), 1,2)),   
            directions=8, pairs=FALSE )  
   options <- data.frame()  
   for (i in 1:length(neig)){  
    options[i,2]<- sp@opt - env[neig[i]]  
   option <- c(options[abs(na.omit(options$V2)) == min(abs(na.omit(options$V2))), 1 ],   
         options[abs(na.omit(options$V2)) == min(abs(na.omit(options$V2))), 1 ])  
   new_cell <- sample(option,1)  
   new_coords <- xyFromCell(env,new_cell)  
   while (, matrix(c(lon_candidate,lat_candidate),1,2)))) {  
    lon_candidate <- new_coords[1]+ (sigma * rnorm(1)) + (alpha_x * ( theta_x - new_coords[1]))  
    lat_candidate <- new_coords[2]+ (sigma * rnorm(1)) + (alpha_y * ( theta_y - new_coords[2]))  
   track[step,1] <- lon_candidate  
   track[step,2] <- lat_candidate  

 Well, now we can perform a simple experiment with our two specimens. We will simulate random movement of these two species having into account their environmental optimums. The "go" function will return us the track or the path followed by each specimen (coordinates by each step).

 deer_simul <- go (Red_deer, tc, 100, 2, 90, 0, 90, 0)  
 mongoose_simul <- go (Egyptian_mongoose, tc, 100, 2, 90, 0, 90, 0)  

 We can plot the paths...

 lines(deer_simul, lwd=1.5, col="red")  
 points(deer_simul, cex=0.3, col="red")  
 lines(mongoose_simul, lwd=1.5, col="blue")  
 points(mongoose_simul, cex=0.3, col="blue")  
 legend("topleft", legend=c("deer","mongoose"), col=c("red","blue"),  
     lty=c(1,1), lwd=c(2,2))  

 To test if each species is actually "searching" their environmental optimum, we can extract the environmental values by step for each species and plot their density distributions.

 plot(density(extract(tc, deer_simul)),lwd=3, col="red", xlim=c(20,80),   
    ylim=c(0,max(c(density(extract(tc, deer_simul))$y,  
           density(extract(tc, mongoose_simul))$y))),  
    main="locations density distribution", xlab="tree cover")  
 lines(density(extract(tc, mongoose_simul)),lwd=3, col="blue")  
 legend("topleft", legend=c("deer","mongoose"), col=c("red","blue"),  
     lty=c(1,1), lwd=c(3,3))  

 So, we can see that the deer is actually using patches with a higher value of tree cover than the mongoose... our simulation worked! You can use the code of this post to perform a GIF like the one above.

 That's all I have to say about this for now... In the next posts we will simulate more animal movements and migrations!

PD: Let me show you a nice song about the mongoose...

miércoles, 14 de marzo de 2018

Nice ggplot with sad data: something happens with women in science

   Last March 8, millions of women in more than 170 countries around the world joined street protests calling for "a society free of sexist oppression, exploitation and violence". Spanish strike was one of the most numerous, where around 5.3 million women joined the strike. Today, I will use some data (download) regarding the scientific career in Spain (2016) to make a plot using ggplot2 and discuss about it. Here you have the graph and the code:

 dat <- read.csv("wis.csv")  
 dat$pos <-  
  factor(dat$pos, levels = c("pre", "post", "ryc", "ct", "ic" , "pi"))  
 ggplot(dat, aes(x = pos, y = per)) +  
  geom_line(size = 3, aes(group = sex, color = sex)) +  
  geom_point() +  
  geom_text(aes(label = round(per * 100, 1)), vjust = -1, size = 4) +  
   title = "Women in Science \n Spanish National Research Council",  
   x = "research stage",  
   y = "sex ratio",  
   color = " \n",  
   subtitle = "(2016)"  
  ) +  
  scale_color_manual(labels = c("women", "men"),  
            values = c("purple", "orange")) +  
   breaks = c("pre", "post", "ryc", "ct", "ic" , "pi"),  
   labels = c(  
    "Ramón y Cajal",  
    "Profesor de\nInvestigación"  
  ) +  
  scale_y_continuous(labels = percent) +  
   text = element_text(size = 15),  
   legend.text = element_text(size = 15),  
   axis.text.x = element_text(face = "bold", size = 11, angle = 30),  
   axis.title = element_text(size = 14, face = "bold"),  
   plot.title = element_text(hjust = 0.5, face = "bold", size = 20),  
   plot.subtitle = element_text(hjust = 0.5, size = 19)  

In X axis, we can see the stages at the Spanish National Research Council (known in Spanish as CSIC) positions, from young PhD candidates (left - “Predoctoral”) to the highest level, Research Professor (right - “Profesor de Investigación”). It’s easy to understand: in Spain, there are more women starting a scientific career than men, but only few of them get to the highest positions in CSIC.

We usually call this the “scissors graph”, since the two curves soon cross each other changing the trend shown at the beginning. For the last stages this imbalance between males and females can be due to the age (mean age of Research Professor in CSIC is around 58 years old, and things have changed a lot in Spain since 30 years ago). However, the graph shows that for the earlier stages this imbalance is still being a problem: just from PhD candidate stage to first postdoctoral position, women lose around 15% positions in favor of men. And this is not only happening in Spain. Only active politics and special measures orientated to these first stages could change the trend for the highest positions in the future!

If you want to know a little bit more about women in science in Spain, here you have a poster made by some friends from the Ecology Department in Universidad de Alcalá.

sábado, 9 de diciembre de 2017

Brownian Motion GIF with R and ImageMagick

Hi there!

Last Monday we celebrated a “Scientific Marathon” at Royal Botanic Garden in Madrid, a kind of mini-conference to talk about our research. I was talking about the relation between fungal spore size and environmental variables such as temperature and precipitation. To make my presentation more friendly, I created a GIF to explain the Brownian Motion model. In evolutionary biology, we can use this model to simulate the random variation of a continuous trait through time. Under this model, we can notice how closer species tend to maintain closer trait values due to shared evolutionary history. You have a lot of information about Brownian Motion models in evolutionary biology everywhere!
Here I will show you how I built a GIF to explain Brownian Motion in my talk using R and ImageMagick.

 # First, we simulate continuous trait evolution by adding in each iteration  
 # a random number from a normal distribution with mean equal to 0 and standard  
 # deviation equal to 1. We simulate a total of 4 processes, to obtain at first  
 # two species and a specieation event at the middle of the simulation, obtaining  
 # a total of 3 species at the end.  
 df1<- data.frame(0,0)  
 names(df1)<- c("Y","X")  
 for (g in 1:750){  
 df1[g,2] <- g  
 df1[g,1] <- y  
 y <- y + rnorm(1,0,1)  
 #plot(df1$X,df1$Y, ylim=c(-100,100), xlim=c(0,1500), cex=0)  
 #lines(df1$X,df1$Y, col="red")  
 df2<- data.frame(0,0)  
 names(df2)<- c("Y","X")  
 for (g in 1:1500){  
  df2[g,2] <- g  
  df2[g,1] <- y  
  y <- y + rnorm(1,0,1)  
 #lines(df2$X,df2$Y, col="blue")  
 df3<- data.frame(750,df1[750,1])  
 names(df3)<- c("Y","X")  
 for (g in 750:1500){  
  df3[g-749,2] <- g  
  df3[g-749,1] <- y  
  y <- y + rnorm(1,0,1)  
 #lines(df3$X,df3$Y, col="green")  
 df4<- data.frame(750,df1[750,1])  
 names(df4)<- c("Y","X")  
 for (g in 750:1500){  
  df4[g-749,2] <- g  
  df4[g-749,1] <- y  
  y <- y + rnorm(1,0,1)  
 #lines(df4$X,df4$Y, col="orange")  

 # Now, we have to plot each simmulation lapse and store them in our computer.  
 # I added some code to make lighter the gif (plotting just odd generations) and   
 # to add a label at the speciation time. Note that, since Brownan Model is a   
 # stocasthic process, my simulation will be different from yours.  
 # You should adjust labels or repeat the simulation process if you don't   
 # like the shape of your plot.  
 parp<-rep(0:1, times=7, each= 15)  
 parp<- c(parp, rep(0, 600))  
 for (q in 1:750){  
  if ( q %% 2 == 1) {  
  id <- sprintf("%04d", q+749)  
  png(paste("bm",id,".png", sep=""), width=900, height=570, units="px",   
  par(omd = c(.05, 1, .05, 1))  
  plot(df1$X,df1$Y, ylim=c(-70,70), xlim=c(0,1500), cex=0,   
     main=paste("Brownian motion model \n generation=", 749 + q) ,   
     xlab="generations", ylab="trait value", font.lab=2, cex.lab=1.5 )  
 lines(df1$X,df1$Y, col="red", lwd=4)  
 lines(df2$X[1:(q+749)],df2$Y[1:(q+749)], col="blue", lwd=4)  
 lines(df3$X[1:q],df3$Y[1:q], col="green", lwd=4)  
 lines(df4$X[1:q],df4$Y[1:q], col="orange", lwd=4)  
 if (parp[q]==0)  
 text(750, 65,labels="speciation event", cex= 1.5, col="black", font=2)  
 if (parp[q]==0)  
 arrows(750, 60, 750, 35, length = 0.20, angle = 30, lwd = 3)  

 Now, you just have to use ImageMagick to put all the PNG files together in a GIF using a command like this in a terminal:

 convert -delay 10 *.png bm.gif  

Et voilà!