Home ranges in R: MCP versus kernel density

Categories: MASS, home range, movement ecology, ecology tutorial

Estimate animal home ranges in R with minimum convex polygons and kernel density. Why the 100% MCP inflates with sample size and how bandwidth drives the KDE.

Author: Tidy Ecology

Published: 2026-07-09

A home range is the area an animal uses over some period. From a set of relocations (GPS fixes, VHF triangulations, resightings) the question is how to turn the point cloud into an area. Two estimators dominate the older literature and still anchor most workflows: the minimum convex polygon (MCP) and the kernel density estimator (KDE). They answer slightly different questions and they fail in different ways. This post builds both from base R plus MASS, on a synthetic bilobed range, and shows the two failure modes you have to watch: the MCP grows with sample size and chases outliers, while the KDE hands the whole result to one bandwidth choice.

A synthetic range with two centres and a few excursions

Real ranges are rarely a single blob. We simulate 200 relocations from a mixture of two activity centres, then add six long excursions to stand in for the occasional foray outside the core. Coordinates are in kilometres and the data are illustrative, not a real site.

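library(MASS); library(ggplot2)

# Palette and plot theme used by the two figures
te_ink <- "#16241d"; te_body <- "#2c3a31"; te_forest <- "#275139"
te_label <- "#46604a"; te_paper <- "#f5f4ee"; te_line <- "#dad9ca"
te_faint <- "#5d6b61"; te_gold <- "#cda23f"; te_red <- "#b5534e"
theme_te <- function(base_size = 12) {
  theme_minimal(base_size = base_size) +
    theme(text = element_text(colour = te_body),
          plot.title = element_text(colour = te_ink, face = "bold"),
          axis.title = element_text(colour = te_label),
          axis.text = element_text(colour = te_faint),
          panel.grid.major = element_line(colour = te_line, linewidth = 0.3),
          panel.grid.minor = element_blank(), legend.key = element_blank(),
          plot.background = element_rect(fill = te_paper, colour = NA),
          panel.background = element_rect(fill = te_paper, colour = NA))
}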

set.seed(4809)
n_core <- 200
comp <- rbinom(n_core, 1, 0.40)
xA <- rnorm(sum(comp == 0), 0.0, 1.20); yA <- rnorm(sum(comp == 0), 0.0, 1.20)
xB <- rnorm(sum(comp == 1), 4.0, 0.90); yB <- rnorm(sum(comp == 1), 2.5, 0.90)
core <- data.frame(x = c(xA, xB), y = c(yA, yB))

n_exc <- 6
ang <- runif(n_exc, 0, 2 * pi); rad <- runif(n_exc, 7, 10)
exc <- data.frame(x = 2 + rad * cos(ang), y = 1 + rad * sin(ang))

loc <- rbind(core, exc)
loc$kind <- c(rep("relocation", n_core), rep("excursion", n_exc))
nrow(loc)

[1] 206

Minimum convex polygon

The MCP (Mohr 1947) is the smallest convex polygon that contains a chosen fraction of the relocations. The 100% version is just the convex hull of every point; chull returns the indices of the hull vertices in clockwise order, and the shoelace formula turns those vertices into an area.

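mcp_area <- function(x, y) {
  h <- chull(x, y); xh <- x[h]; yh <- y[h]   # hull vertices, in order
  # shoelace formula: half the absolute sum of cross products of successive vertices
  0.5 * abs(sum(xh * c(yh[-1], yh[1]) - c(xh[-1], xh[1]) * yh))
}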
a100 <- mcp_area(loc$x, loc$y)

# 95% MCP: drop the 5% of points farthest from the centroid
cx <- mean(loc$x); cy <- mean(loc$y)
d2 <- (loc$x - cx)^2 + (loc$y - cy)^2
keep <- d2 <= quantile(d2, 0.95)
a95mcp <- mcp_area(loc$x[keep], loc$y[keep])

round(c(mcp100 = a100, mcp95 = a95mcp, ratio = a100 / a95mcp), 2)

mcp100  mcp95  ratio 
136.54  35.63   3.83

The 100% hull covers 136.5 km squared; dropping the 5% of fixes farthest from the centroid cuts it to 35.6 km squared. A factor of almost four (3.83) separates them, and the six excursions drive the gap. That sensitivity to a handful of extreme fixes is the first problem with the MCP: the estimate is defined by its most peripheral points, so one long foray can double the reported range. The 95% rule trims that, but the 5% threshold is a convention with no biological content.
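The centroid-distance trimming used above extends to any percentage. The following is a minimal sketch under that same trimming rule (keep the fraction p of fixes closest to the arithmetic centroid). The helper name mcp_pct and the toy data are illustrative assumptions, not part of the post's workflow, and other peeling rules exist.

```r
# Percent MCP by centroid-distance trimming (hypothetical helper, illustrative data)
mcp_pct <- function(x, y, p = 1) {
  d2 <- (x - mean(x))^2 + (y - mean(y))^2      # squared distance to the centroid
  keep <- d2 <= quantile(d2, p)                # nearest fraction p of the fixes
  h <- chull(x[keep], y[keep])                 # hull vertex indices, in order
  xh <- x[keep][h]; yh <- y[keep][h]
  0.5 * abs(sum(xh * c(yh[-1], yh[1]) - c(xh[-1], xh[1]) * yh))  # shoelace area
}

set.seed(1)
x <- c(rnorm(100), 12)   # 100 core fixes plus one long foray
y <- c(rnorm(100), 0)
areas <- sapply(c(1, 0.95, 0.5), function(p) mcp_pct(x, y, p))
names(areas) <- c("mcp100", "mcp95", "mcp50")
round(areas, 2)
```

Because the kept sets are nested as p falls, the areas can only stay equal or shrink from 100% to 50%; how fast they shrink depends on how far the trimmed fixes sit from the core.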

The MCP grows with the number of relocations

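The deeper problem is subtler. Because a convex hull can never shrink when a point is added, MCP area is non-decreasing in sample size; for a range without a hard edge it keeps creeping upward instead of settling to a stable value. We subsample the core relocations, without replacement, at increasing sizes and average the area over 60 random draws at each size, then do the same for a 95% kernel range for comparison. At n = 200 every draw is the full core set, so that last point is a single value and not a true average.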

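# density level whose upper set holds a share p of the volume;
# z is the density grid, cell the area of one grid cell
iso_threshold <- function(z, cell, p) {
  zv <- sort(as.vector(z), decreasing = TRUE)
  zv[which(cumsum(zv) * cell >= p)[1]]   # NA if the grid captures less than p
}
# isopleth area: number of cells at or above that level, times the cell area
iso_area <- function(z, cell, p) sum(z >= iso_threshold(z, cell, p)) * cell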

set.seed(7788)
ns <- seq(20, n_core, by = 20); R <- 60
xlim0 <- range(core$x) + c(-1, 1) * 2; ylim0 <- range(core$y) + c(-1, 1) * 2
acc <- lapply(ns, function(nn) {
  am <- ak <- numeric(R)
  for (r in seq_len(R)) {
    s <- core[sample(nrow(core), nn), ]
    am[r] <- mcp_area(s$x, s$y)
    kk <- kde2d(s$x, s$y, n = 80, lims = c(xlim0, ylim0))
    cc <- diff(kk$x[1:2]) * diff(kk$y[1:2])
    ak[r] <- iso_area(kk$z, cc, 0.95)
  }
  data.frame(n = nn, mcp = mean(am), kde = mean(ak))
})
acc <- do.call(rbind, acc)
round(c(mcp_n20 = acc$mcp[1], mcp_n200 = acc$mcp[10],
        kde_n20 = acc$kde[1], kde_n200 = acc$kde[10]), 1)

 mcp_n20 mcp_n200  kde_n20 kde_n200 
    19.1     38.2     53.1     45.9

From 20 to 200 relocations the mean MCP area climbs from 19.1 to 38.2 km squared, a doubling with no sign of levelling off. The kernel range moves the other way, from 53.1 to 45.9 km squared: the reference bandwidth shrinks as the sample grows (in proportion to n^(-1/5)), so small samples are smoothed more heavily and the isopleth tightens as fixes accumulate, settling where the MCP inflates. This is why MCP areas from studies with different tracking effort are not comparable: more fixes mean a larger reported range, all else equal. Reviews of home-range methods have made this point repeatedly (Harris et al. 1990; Powell 2000).

acc_long <- rbind(
  data.frame(n = acc$n, area = acc$mcp, estimator = "100% MCP"),
  data.frame(n = acc$n, area = acc$kde, estimator = "95% KDE"))
ggplot(acc_long, aes(n, area, colour = estimator)) +
  geom_line(linewidth = 0.8) + geom_point(size = 1.8) +
  scale_colour_manual(values = c("100% MCP" = te_red, "95% KDE" = te_forest), name = NULL) +
  labs(title = "Home-range area against number of relocations",
       x = "number of relocations", y = expression(area ~ (km^2))) +
  theme_te(13)

Figure 1: Mean home-range area against the number of relocations, averaged over 60 random subsamples. The 100% MCP line rises steadily from about 19 to 38 km squared, while the 95% kernel line stays between about 46 and 53 km squared and does not increase.
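The monotone growth of the hull can also be seen without any averaging. This sketch follows one nested sequence of fixes on illustrative Gaussian data, not the post's random subsamples; hull_area is a hypothetical name that repeats the shoelace calculation so the block runs on its own.

```r
# Hull area along one nested sequence of fixes (illustrative Gaussian data)
hull_area <- function(x, y) {
  h <- chull(x, y); xh <- x[h]; yh <- y[h]
  0.5 * abs(sum(xh * c(yh[-1], yh[1]) - c(xh[-1], xh[1]) * yh))
}

set.seed(42)
x <- rnorm(500); y <- rnorm(500)
sizes <- seq(10, 500, by = 10)
# the first n fixes are always a subset of the first n + 10
area_n <- sapply(sizes, function(n) hull_area(x[1:n], y[1:n]))

# the hull never shrinks as fixes are added (tolerance guards rounding error)
all(diff(area_n) >= -1e-12)
round(area_n[sizes %in% c(10, 100, 500)], 2)
```

Averaging over random subsamples, as the post does, smooths this staircase into the rising curve of Figure 1.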

Kernel density and volume-based isopleths

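The KDE replaces the hard hull with a smooth utilisation distribution: a density surface over the plane, estimated by placing a kernel on each relocation and averaging. MASS::kde2d does this on a grid, and by default it picks a bandwidth from the normal reference rule for each axis. The h that kde2d takes is four times the standard deviation of its Gaussian kernel, so the bandwidths printed below correspond to kernel standard deviations of about 0.86 and 0.65 km.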

xr <- range(loc$x) + c(-1, 1) * 2; yr <- range(loc$y) + c(-1, 1) * 2
href <- c(bandwidth.nrd(loc$x), bandwidth.nrd(loc$y))
k <- kde2d(loc$x, loc$y, n = 120, lims = c(xr, yr), h = href)
cell <- diff(k$x[1:2]) * diff(k$y[1:2])
round(c(h_x = href[1], h_y = href[2], integral = sum(k$z) * cell), 3)

     h_x      h_y integral 
   3.429    2.614    1.000
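How kde2d scales h matters when comparing bandwidths with other software. The check below is a sketch on illustrative data: it rebuilds the surface by hand with a Gaussian kernel of standard deviation h / 4 on each axis and compares it with kde2d's output. The object names are assumptions made for the example.

```r
library(MASS)

set.seed(1)
x <- rnorm(50); y <- rnorm(50)                 # illustrative fixes
h <- c(bandwidth.nrd(x), bandwidth.nrd(y))     # kde2d's default bandwidths
k <- kde2d(x, y, h = h, n = 25, lims = c(-4, 4, -4, 4))

# product-Gaussian kernel with standard deviation h / 4 on each axis,
# averaged over the fixes and evaluated at every grid node
sx <- h[1] / 4; sy <- h[2] / 4
manual <- sapply(k$y, function(gy)
  sapply(k$x, function(gx) mean(dnorm(gx, x, sx) * dnorm(gy, y, sy))))

max_diff <- max(abs(manual - k$z))
max_diff < 1e-10   # TRUE when the two surfaces agree
```

A bandwidth quoted as a kernel standard deviation by another package therefore needs multiplying by four before it is passed to kde2d.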

To get a home range from a density surface you take an isopleth: the contour enclosing a given share of the total volume. The 95% isopleth is the smallest region holding 95% of the utilisation, and the 50% isopleth marks the core. Sort the grid densities from high to low, accumulate their volume, and read off the density level where the running total crosses the target.

thr95 <- iso_threshold(k$z, cell, 0.95); thr50 <- iso_threshold(k$z, cell, 0.50)
a95kde <- iso_area(k$z, cell, 0.95); a50kde <- iso_area(k$z, cell, 0.50)
round(c(kde95 = a95kde, kde50 = a50kde), 2)

kde95 kde50 
59.50 13.86

The 95% kernel range is 59.5 km squared and the 50% core is 13.9 km squared. A single convex polygon has to bridge the gap between the two centres; the kernel surface does not, so the 50% isopleth can split into a separate core on each centre while the 95% isopleth wraps both lobes.
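The volume-threshold logic can be checked against a density with a known answer. For a standard bivariate normal the 95% isopleth is a disc of area pi * qchisq(0.95, df = 2), so a grid estimate should land close to that. This is a sketch with assumed names (iso_area_grid repeats the sort-and-accumulate rule in one function) and an assumed grid spacing of 0.05.

```r
# Grid isopleth area versus the exact value for a standard bivariate normal
iso_area_grid <- function(z, cell, p) {
  zv <- sort(as.vector(z), decreasing = TRUE)
  thr <- zv[which(cumsum(zv) * cell >= p)[1]]   # density level at volume share p
  sum(z >= thr) * cell                          # cells at or above that level
}

g <- seq(-6, 6, by = 0.05)
z <- outer(dnorm(g), dnorm(g))      # exact density evaluated on the grid
cell <- diff(g[1:2])^2

grid_area  <- iso_area_grid(z, cell, 0.95)
exact_area <- pi * qchisq(0.95, df = 2)
round(c(grid = grid_area, exact = exact_area), 2)
```

The remaining gap between the two numbers comes from the grid resolution, which is worth keeping in mind when choosing n in kde2d.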

hull_idx <- chull(loc$x, loc$y)
hull_df  <- loc[c(hull_idx, hull_idx[1]), c("x", "y")]
grid_df  <- expand.grid(x = k$x, y = k$y); grid_df$z <- as.vector(k$z)

ggplot() +
  geom_point(data = loc, aes(x, y, shape = kind, colour = kind), size = 1.6, alpha = 0.8) +
  geom_path(data = hull_df, aes(x, y), colour = te_red, linewidth = 0.7) +
  geom_contour(data = grid_df, aes(x, y, z = z), breaks = thr95, colour = te_forest, linewidth = 0.8) +
  geom_contour(data = grid_df, aes(x, y, z = z), breaks = thr50, colour = te_gold, linewidth = 0.8) +
  scale_colour_manual(values = c(relocation = te_faint, excursion = te_red), name = NULL) +
  scale_shape_manual(values = c(relocation = 16, excursion = 4), name = NULL) +
  coord_equal() +
  labs(title = "Home range: 100% MCP versus kernel isopleths",
       x = "easting (km)", y = "northing (km)") +
  theme_te(13)

Figure 2: Relocations with the 100% MCP hull (red), the 95% kernel isopleth (green) and the 50% core (gold). Crosses mark the six excursions that stretch the hull. The hull reaches out to the scattered excursions, the green contour wraps both lobes of the bilobed cloud, and the gold contours sit on the two dense centres.
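Whether an isopleth is one outline or several can be read from the contour lines themselves. The sketch below uses illustrative data with two well-separated centres and counts the outlines of the 50% isopleth with grDevices::contourLines. The data and object names are assumptions, and the threshold is found by the same sort-and-accumulate rule as iso_threshold.

```r
library(MASS)

set.seed(3)
# two well-separated activity centres, 200 fixes each (illustrative)
x <- c(rnorm(200, 0, 1), rnorm(200, 6, 1))
y <- c(rnorm(200, 0, 1), rnorm(200, 4, 1))
k <- kde2d(x, y, n = 100, lims = c(range(x) + c(-3, 3), range(y) + c(-3, 3)))
cell <- diff(k$x[1:2]) * diff(k$y[1:2])

# density level of the 50% isopleth, by the volume rule
zv  <- sort(as.vector(k$z), decreasing = TRUE)
thr <- zv[which(cumsum(zv) * cell >= 0.50)[1]]

# contourLines() returns one list element per contour line at that level
pieces <- contourLines(k$x, k$y, k$z, levels = thr)
length(pieces)                            # number of separate outlines
sapply(pieces, function(p) length(p$x))   # vertices per outline
```

Each element of pieces carries x and y vertex vectors, so the outlines can be passed to a polygon area routine or exported as separate core areas.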

Bandwidth is the choice that matters

The kernel result is only as good as the bandwidth. The normal reference rule assumes a single roughly Gaussian blob, so for a multimodal range it tends to oversmooth: the surface spreads across the gap and the isopleth inflates. Scaling the reference bandwidth up and down shows how much rides on it.

bw_area <- function(mult) {
  kk <- kde2d(loc$x, loc$y, n = 120, lims = c(xr, yr), h = href * mult)
  cc <- diff(kk$x[1:2]) * diff(kk$y[1:2])
  iso_area(kk$z, cc, 0.95)
}
round(sapply(c(0.6, 1.0, 1.5), bw_area), 2)

[1] 45.15 59.50 81.28

At 0.6 times the reference the 95% range is 45.2 km squared; at the reference it is 59.5; at 1.5 times it is 81.3. The estimate grows by a factor of 1.8 across a plausible band of smoothing, with the data unchanged. This is the central trade-off: too small a bandwidth breaks the range into islands around individual fixes, too large a bandwidth smears it into one oversized blob. Least-squares cross-validation and plug-in selectors try to choose objectively, and their behaviour has been studied at length (Worton 1989; Seaman & Powell 1996), but no rule removes the judgement entirely.
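To see how a data-driven selector differs from the reference rule on multimodal data, the sketch below compares MASS::bandwidth.nrd with the Sheather-Jones plug-in selector MASS::width.SJ on a single bimodal coordinate. Both return a bandwidth on the same scale that kde2d expects. The data are illustrative, and applying a one-dimensional selector to each axis separately is an assumption made for simplicity, not a true bivariate selector.

```r
library(MASS)

set.seed(11)
# one bimodal coordinate: two centres 6 km apart (illustrative)
x <- c(rnorm(150, 0, 1), rnorm(150, 6, 1))

h_ref <- bandwidth.nrd(x)   # normal reference rule, kde2d's default
h_sj  <- width.SJ(x)        # Sheather-Jones plug-in, same scale as h_ref
round(c(reference = h_ref, plug_in = h_sj, ratio = h_sj / h_ref), 2)
```

One way to use the result would be kde2d(x, y, h = c(width.SJ(x), width.SJ(y))); the isopleth area should then be reported together with the selector that produced it.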

Which to use

The MCP is quick, needs no tuning and is still the standard for a crude outer boundary, but its area depends on sample size and on the most extreme fixes, so it is a poor choice for comparing ranges across animals or studies with uneven effort. The KDE gives a proper utilisation distribution with a defensible core, handles multimodal ranges, and is far less sensitive to sample size, at the cost of a bandwidth decision that changes the answer. Report which estimator and, for the kernel, which bandwidth selector you used; without that, a home-range area is not reproducible. Both estimators also assume the relocations are an unbiased sample of use, which autocorrelated tracking data and gappy fix schedules can violate.

References

Mohr 1947 American Midland Naturalist 37(1):223-249 (10.2307/2421652)

Worton 1989 Ecology 70(1):164-168 (10.2307/1938423)

Harris, Cresswell, Forde, Trewhella, Woollard & Wray 1990 Mammal Review 20(2-3):97-123 (10.1111/j.1365-2907.1990.tb00106.x)

Seaman & Powell 1996 Ecology 77(7):2075-2085 (10.2307/2265701)

Powell 2000, in Research Techniques in Animal Ecology (Boitani & Fuller, eds), Columbia University Press, ISBN 978-0-231-11341-2

Silverman 1986 Density Estimation for Statistics and Data Analysis, Chapman and Hall, ISBN 978-0-412-24620-3
