Start here: ecological data analysis in R

A guided reading order for ecological data analysis in R and QGIS: diversity, ordination, GLMs, mixed models and spatial methods, from first steps onward.

This page is a reading order, not a feed. The tutorials below are grouped by the stage of a typical analysis, so you can follow a path instead of picking posts at random. Each section is roughly self-contained; jump to the stage you need, or read top to bottom.

If you are new to R for ecology, start with the foundations, then move to whichever data type you work with: community tables (diversity, ordination), counts and presence-absence (GLMs), grouped or repeated measurements (mixed models), or spatial layers (GIS).

Foundations

Get the basics of estimation and a reproducible setup in place before modelling anything.

Standard errors and confidence intervals - what SE actually measures, t-based CIs for a mean, and why coverage matters.
Bootstrap confidence intervals - intervals when there is no formula: percentile and BCa, with worked ecological cases.
A reproducible statistical workflow - where to put set.seed, recording package versions with sessionInfo, relative paths, and clean-session rendering.
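
As a taste of what the foundations posts cover, here is a minimal base-R sketch of a standard error and a t-based 95% confidence interval for a mean. The data are simulated and the variable names (height, ci) are made up for illustration; they are not taken from the tutorials.

```r
# Hypothetical data: simulated shrub heights (cm) for 20 plots.
set.seed(42)                      # fix the seed so the simulated values repeat
height <- rnorm(20, mean = 150, sd = 25)

n    <- length(height)
xbar <- mean(height)
se   <- sd(height) / sqrt(n)      # standard error of the mean

# 95% t-based interval: mean +/- t quantile (n - 1 df) times SE
t_crit <- qt(0.975, df = n - 1)
ci <- c(lower = xbar - t_crit * se, upper = xbar + t_crit * se)
ci

# Cross-check against the interval reported by t.test()
t.test(height)$conf.int
```

The hand-computed interval should match the one from t.test, which is a quick way to confirm the formula before applying it to real data.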

Diversity and community description

Summarise what is in your samples: richness, evenness, and the structure of an assemblage.

Diversity indices in R - richness, Shannon and Simpson from a site-by-species matrix.
When not to use Shannon - why a single index conflates richness and evenness, and what Hill numbers give you instead.
Rarefaction and accumulation curves - comparing richness fairly across uneven sampling effort.
Species abundance distributions - rank-abundance and fitting SAD models.
Functional diversity - diversity from continuous and categorical traits.
Phylogenetic diversity with picante - Faith's PD and mean pairwise distance.
Beta diversity partitioning - splitting turnover from nestedness in the Baselga framework.
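
To make the link between the classic indices and Hill numbers concrete, the sketch below computes both for a single sample in base R. The counts are invented for illustration, and the tutorials themselves may use a package such as vegan instead.

```r
# Hypothetical counts for one site (five species); not real data.
counts <- c(sp_a = 40, sp_b = 30, sp_c = 20, sp_d = 9, sp_e = 1)
p <- counts / sum(counts)         # relative abundances

richness <- sum(counts > 0)       # number of species present
shannon  <- -sum(p * log(p))      # Shannon entropy H' (natural log)
simpson  <- 1 - sum(p^2)          # Gini-Simpson index

# Hill numbers: effective numbers of species of order q
hill <- c(q0 = richness,          # q = 0: richness
          q1 = exp(shannon),      # q = 1: exponential of Shannon entropy
          q2 = 1 / sum(p^2))      # q = 2: inverse Simpson concentration
hill
```

Because Hill numbers are all in units of effective species, the drop from q0 to q2 shows how strongly the sample is dominated by a few species, which a lone Shannon value does not reveal.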

Ordination and multivariate structure

Explore gradients and group differences in multivariate community data.

Choosing a dissimilarity index - the decision that sits underneath every ordination and PERMANOVA.
NMDS ordination - non-metric ordination of community composition.
PCA on environmental data - ordination for abiotic variables, with standardisation.
Constrained ordination with dbRDA - relating composition to measured predictors.
capscale vs dbRDA - comparing capscale and dbrda, the two distance-based constrained ordination functions in vegan.
RDA vs CCA and gradient length - choosing a linear or unimodal method from DCA gradient length.
Variance partitioning - how much variation environment and space each explain.
Non-linear gradients with ordisurf - a smooth GAM surface where a straight arrow would mislead.
envfit and PERMANOVA - fitting environmental vectors, testing group differences with adonis2, and checking group dispersion with betadisper.
Pairwise PERMANOVA - which groups differ after a significant omnibus test.
Common PERMANOVA mistakes - dispersion vs location, permutation structure, and unbalanced designs.
Hierarchical clustering and dendrograms - grouping sites from a Bray-Curtis distance.
Mantel tests - correlating two distance matrices.
Indicator species analysis - which species characterise which groups.
Co-occurrence null models - testing whether species associations differ from random.

Regression, GLMs and the modelling workflow

Model how a response depends on predictors, for continuous, count and presence-absence data.

Classical tests as linear models - the t-test and ANOVA seen as one framework.
Contrasts and post-hoc tests - which groups differ after a significant ANOVA.
Logistic regression for presence-absence - binomial GLMs with a logit link.
GLMs for count data - why not to log-transform counts, and what to fit instead.
Zero-inflated count models - ZINB and hurdle models for excess zeros.
Offsets for rates and densities - modelling per-effort rates correctly in a Poisson GLM.
GAM species response curves - smooth response shapes with mgcv.
GLM residual diagnostics - why Pearson and deviance residuals mislead, and what to check.
Predicting on the response scale - building CIs on the link scale, then back-transforming.
Interaction terms in GLMs - reading an interaction through predictions, not coefficients.
Contrasts after a GLM - marginal means and pairwise contrasts with multiplicity correction.
Collinearity and VIF - how correlated predictors inflate standard errors.
Model selection and AIC - what AIC measures and how to use it.
Power analysis by simulation - estimating power when there is no closed-form formula.
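
Two ideas from this list, offsets and back-transformed intervals, fit in one short base-R sketch. It assumes simulated counts with unequal sampling effort; the variable names, the effort unit and the coefficient values are all hypothetical.

```r
set.seed(1)
# Simulated data (hypothetical): counts along a scaled elevation gradient,
# collected with unequal effort per site
n      <- 60
elev   <- runif(n, 0, 1)
effort <- runif(n, 1, 5)          # e.g. trap-nights; an assumed unit
count  <- rpois(n, lambda = effort * exp(0.5 + 1.2 * elev))
dat    <- data.frame(count, elev, effort)

# Poisson GLM with log(effort) as an offset, so the model describes a rate
fit <- glm(count ~ elev + offset(log(effort)), family = poisson, data = dat)

# Predict the rate per unit effort: effort = 1 makes the offset log(1) = 0
newdat <- data.frame(elev = seq(0, 1, length.out = 5), effort = 1)
pred   <- predict(fit, newdata = newdat, type = "link", se.fit = TRUE)

# Wald interval on the link (log) scale, then back-transform with the inverse link
inv <- family(fit)$linkinv
newdat$rate  <- inv(pred$fit)
newdat$lower <- inv(pred$fit - 1.96 * pred$se.fit)
newdat$upper <- inv(pred$fit + 1.96 * pred$se.fit)
newdat
```

Building the interval on the link scale keeps the limits positive after back-transformation, which an interval built directly on the response scale does not guarantee.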

Mixed models and correlated data

Handle grouping, pseudoreplication and correlation that ordinary regression ignores.

Pseudoreplication in ecology - why subsamples are not replicates, and what it does to your error rate.
GLMMs for nested counts - random intercepts for nested count data.
GLS for spatial correlation - correcting standard errors when residuals are spatially correlated.
Random slopes in mixed models - when a shared-slope model underestimates the fixed-effect SE.
Repeated measures and temporal correlation - corAR1 for measurements correlated over time.
Variance structure and heteroscedasticity - varIdent and varPower for non-constant variance.
Marginal vs conditional R-squared - separating the variance explained by the fixed effects alone from the variance explained by fixed and random effects together.

Spatial data and GIS

Work with coordinates, rasters and the link between QGIS and R.

Richness mapping with sf - spatial vector work and mapping in R.
Raster basics with terra - raster structure and map algebra.
QGIS to R spatial join - a hybrid workflow between the QGIS GUI and sf.
Spatial autocorrelation and Moran’s I - testing for spatial dependence with spdep.
Kriging and spatial interpolation - continuous surfaces from scattered point samples.

Communicating results

Publication-quality ggplot figures - physical size, resolution and clean export with ggsave.

Common pitfalls

A few posts above are dedicated to mistakes that are easy to make and hard to spot. If something looks wrong, these are the ones to reread: pseudoreplication, common PERMANOVA mistakes, collinearity and VIF, when not to use Shannon, GLM residual diagnostics, and offsets for rates and densities.