ordisurf in R: when a straight envfit arrow lies about your gradient

Ordination

Gradient analysis

vegan

envfit draws a straight arrow that assumes one direction of increase. For a humped, unimodal variable that arrow points nowhere. Fit a smooth GAM surface with ordisurf() in R to recover the structure envfit misses.

Author

Tidy Ecology

Published

2026-06-21

The envfit and PERMANOVA tutorial fit environmental variables to an ordination as arrows. An arrow is a straight line: it says the variable increases steadily in one direction across the ordination and decreases in the opposite one. That is a strong assumption, and for many environmental variables it is wrong. Elevation, soil pH, and disturbance often have a unimodal relationship with composition: the community at the middle of the range differs from the community at either end, but the two ends resemble each other. There is no single direction of increase for an arrow to point along, and envfit quietly reports almost nothing.

ordisurf() is the fix. Instead of a vector, it fits a smooth surface, a generalized additive model (Wood 2017), over the ordination and draws it as contour lines. A surface can bend, peak, and curve, so it captures structure that a straight arrow cannot. This post fits both methods to the same ordination, with one monotone variable and one humped variable, and shows exactly where the arrow succeeds and where it fails.

Two variables, two shapes

The community sits on two latent gradients, the stronger of which drives most of the species turnover, so an NMDS recovers it as the first axis. Onto that we attach two environmental variables with deliberately different shapes. Moisture increases steadily along the primary gradient: monotone, the kind of variable an arrow is built for. Elevation peaks in the middle of the gradient and falls off toward both ends: a symmetric hump.

library(vegan)
library(mgcv)
library(ggplot2)

set.seed(24)

n  <- 80
g1 <- runif(n, -2, 2)       # primary gradient: most turnover
g2 <- runif(n, -1.2, 1.2)   # weaker secondary gradient

# 18 species with 2D Gaussian niches
n_sp <- 18
o1 <- runif(n_sp, -2.2, 2.2); o2 <- runif(n_sp, -1.3, 1.3)
w1 <- 0.9; w2 <- 0.8; peak <- 22
lambda <- matrix(0, n, n_sp)
for (j in seq_len(n_sp))
  lambda[, j] <- peak * exp(-((g1 - o1[j])^2)/(2*w1^2) - ((g2 - o2[j])^2)/(2*w2^2))
comm <- matrix(rpois(length(lambda), lambda), nrow = n)
colnames(comm) <- paste0("sp", sprintf("%02d", seq_len(n_sp)))

# moisture: monotone in the primary gradient (an arrow should work)
moisture  <- 50 + 12 * g1 + rnorm(n, 0, 6)
# elevation: humped in the primary gradient (an arrow should fail)
elevation <- 400 - 70 * g1^2 + rnorm(n, 0, 20)

c(moisture_vs_g1 = cor(moisture, g1), elevation_vs_g1 = cor(elevation, g1))

 moisture_vs_g1 elevation_vs_g1 
     0.91568791      0.06258176

The two correlations with the underlying gradient tell the whole story in advance. Moisture correlates near 0.9 with the primary gradient. Elevation correlates near zero, not because it is unrelated but because a symmetric hump has no linear trend: every rise on the left is cancelled by a matching fall on the right. A method that only looks for linear trend will see moisture clearly and elevation not at all.

The ordination and the linear fit

Run the NMDS, then fit both variables with envfit.

set.seed(7)
mds <- metaMDS(comm, distance = "bray", k = 2, trymax = 100,
               autotransform = FALSE, trace = FALSE)

set.seed(42)
ef <- envfit(mds, data.frame(moisture = moisture, elevation = elevation),
             permutations = 999)
ef


***VECTORS

             NMDS1    NMDS2     r2 Pr(>r)    
moisture   0.99939 -0.03496 0.8142  0.001 ***
elevation  0.21798  0.97595 0.0303  0.310    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Permutation: free
Number of permutations: 999

The stress is low, so the two-dimensional picture is trustworthy. The envfit table splits the two variables cleanly. Moisture gets a large r-squared (near 0.81) and a significant p-value: the arrow is real and worth drawing. Elevation gets an r-squared near 0.03 and a non-significant p-value. Taken at face value, envfit says elevation has no relationship with the ordination. That conclusion is wrong, and the next step shows why.

The smooth surface

ordisurf() fits a GAM smoother over the ordination scores and returns the fitted surface. Call it with plot = FALSE to keep the object without drawing the base graphic, then read the model summary.

os_moist <- ordisurf(mds ~ moisture,  plot = FALSE)
os_elev  <- ordisurf(mds ~ elevation, plot = FALSE)

summary(os_elev)


Family: gaussian 
Link function: identity 

Formula:
y ~ s(x1, x2, k = 10, bs = "tp", fx = FALSE)

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  309.500      3.546   87.28   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
           edf Ref.df     F p-value    
s(x1,x2) 6.757      9 48.54  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.847   Deviance explained =   86%
-REML = 400.97  Scale est. = 1006      n = 80

Look at the deviance explained for elevation: about 86%. The smooth term is significant at p below 0.001. The variable that envfit called nothing explains the large majority of the surface once you let the fit bend. Putting the two methods side by side makes the disagreement concrete.

sm_m <- summary(os_moist); sm_e <- summary(os_elev)
data.frame(
  variable          = c("moisture", "elevation"),
  shape             = c("monotone", "humped"),
  envfit_r2         = round(ef$vectors$r, 3),
  envfit_p          = ef$vectors$pvals,
  ordisurf_dev_expl = round(100 * c(sm_m$dev.expl, sm_e$dev.expl), 1),
  ordisurf_edf      = round(c(sm_m$s.table[1, "edf"], sm_e$s.table[1, "edf"]), 2),
  row.names = NULL
)

   variable    shape envfit_r2 envfit_p ordisurf_dev_expl ordisurf_edf
1  moisture monotone     0.814    0.001              85.6         6.65
2 elevation   humped     0.030    0.310              86.0         6.76

For moisture the two methods agree: high envfit r-squared, high ordisurf deviance explained. The arrow and the surface tell the same story because the variable really does climb in one direction. For elevation they part company completely: envfit near zero, ordisurf near 86%. Notice that both surfaces carry a high effective degrees of freedom (edf near 6.7), not just the humped one. That is worth understanding: the NMDS embedding itself is curved, so even a monotone variable needs a slightly wavy surface to fit well in ordination space. A high edf flags that a straight arrow is a simplification; it does not by itself tell you the arrow will fail. What makes the arrow fail for elevation specifically is that the response is non-monotone, so no single direction summarises it.

Drawing the surface

To put the surface in a ggplot rather than the base-graphics default, pull the fitted grid out of the ordisurf object. One trap worth flagging: ordiArrowMul(), the helper that scales envfit arrows to the plot, returns Inf when there is no active base plot, so under ggplot you scale the arrow by hand instead.

# tidy the fitted surface grid
surf_df <- function(os) {
  g <- expand.grid(NMDS1 = os$grid$x, NMDS2 = os$grid$y)
  g$z <- as.vector(os$grid$z)
  g[!is.na(g$z), ]
}
site <- as.data.frame(scores(mds, display = "sites"))

# manual arrow scaling (ordiArrowMul returns Inf under ggplot)
arr <- as.data.frame(scores(ef, "vectors")); arr$var <- rownames(arr)
mul <- 0.85 * max(abs(site[, c("NMDS1", "NMDS2")])) /
              max(sqrt(rowSums(arr[, 1:2]^2)))
arr_e <- arr[arr$var == "elevation", ]

ggplot() +
  geom_contour_filled(data = surf_df(os_elev), aes(NMDS1, NMDS2, z = z),
                      alpha = 0.65, bins = 8) +
  geom_point(data = site, aes(NMDS1, NMDS2), colour = "#16241d",
             size = 1.7, alpha = 0.7) +
  geom_segment(data = arr_e, aes(0, 0, xend = NMDS1*mul, yend = NMDS2*mul),
               arrow = arrow(length = unit(0.22, "cm")),
               linewidth = 1, colour = "#b5534e") +
  geom_text(data = arr_e, aes(NMDS1*mul, NMDS2*mul, label = "envfit arrow"),
            colour = "#b5534e", vjust = -0.5, size = 3.4) +
  scale_fill_brewer(palette = "YlGn", name = "elevation\n(fitted)") +
  labs(x = "NMDS1", y = "NMDS2") +
  theme_minimal(base_size = 13) +
  theme(panel.grid.minor = element_blank())

Figure 1: Elevation as an ordisurf surface. The contours reveal a central hump that the short, non-significant envfit arrow completely misses.

The contours make the hump obvious: fitted elevation is highest through the centre of the first axis and drops toward both edges. The envfit arrow is short (its length scales with r-squared) and points off at an angle that means nothing, because the linear fit found no real direction. If you had only run envfit, you would have dropped elevation from the analysis and missed one of the strongest patterns in the data.

For moisture the same plot would show contours running across the ordination, perpendicular to an arrow that points firmly along the first axis: the two methods drawing the same conclusion. The contrast between the two variables is the lesson. envfit and ordisurf are not competitors where one is always better; they encode different assumptions, and the gap between them is itself diagnostic.

When to reach for ordisurf

Use envfit for a quick scan and for variables you expect to change monotonically across the gradient; it is fast, and its arrows are easy to read. But treat a non-significant envfit result as a question, not an answer, especially for variables like elevation, pH, depth, or disturbance that ecology gives good reason to expect peak somewhere in the middle of their range (ter Braak and Prentice 1988). Fit those with ordisurf. When the arrow and the surface agree, the arrow was a fair summary. When they disagree, the surface is almost always the one telling the truth, because it did not have to assume a direction that was not there.

References

ter Braak, C.J.F. & Prentice, I.C. (1988). A theory of gradient analysis. Advances in Ecological Research, 18, 271-317. https://doi.org/10.1016/S0065-2504(08)60183-X

Wood, S.N. (2017). Generalized Additive Models: An Introduction with R, 2nd edition. Chapman & Hall/CRC, Boca Raton. https://doi.org/10.1201/9781315370279