R
What is R and why use it
R is a programming language and software environment for statistical computing and graphics. It’s free, open-source, and has become a go-to tool for many scientists.
Why R? It’s well suited to the kind of data we often work with. R comes with powerful statistical tools and can create high-quality graphs. Because an analysis is written as code, it can be rerun and checked by others, which makes your work reproducible, and that is crucial in science.
Another big plus is R’s active community. There are packages (libraries) for almost everything you might need to do, from analyzing satellite imagery to modeling species distributions.
Your first R command
Let’s try something simple. In the Console (bottom-left part of RStudio), type:
1 + 1
Hit Enter, and you should see:
[1] 2
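There you go - you’ve run your first R command! The [1] at the start of the output is just an index: it tells you that 2 is the first element of the result. This is the basic idea: you tell R to do something, and it gives you a result. We’ll build on this to do more complex things as we go along.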
Basic R Concepts
At its simplest, R works like a calculator. You can perform basic arithmetic operations directly in the console; each result is printed on the line after the command:
5 + 3
[1] 8
10 - 4
[1] 6
6 * 7
[1] 42
15 / 3
[1] 5
2^3
[1] 8
R follows the standard order of operations (PEMDAS), allowing for more complex calculations:
(10 + 5) * 2
[1] 30
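To make the order of operations concrete, here is a short sketch comparing expressions with and without parentheses. The numbers are arbitrary and chosen only for illustration; the text after each # is a comment that R ignores.

```r
# Multiplication happens before addition unless parentheses say otherwise.
10 + 5 * 2      # 20
(10 + 5) * 2    # 30

# Exponents are evaluated before multiplication.
2 * 3^2         # 18
(2 * 3)^2       # 36

# In R, the exponent is also applied before a leading minus sign.
-2^2            # -4
(-2)^2          # 4

# Chained exponents group from right to left: 2^(3^2).
2^3^2           # 512
```

When in doubt, adding parentheses costs nothing and makes the intended order obvious to anyone reading the code.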
R Packages
R has a lot of built-in functions, but you can extend its capabilities by installing packages (some relevant ones are listed at the end of this page). These are collections of functions that others have written to do specific tasks. For example, the tidyverse package includes a bunch of functions for data manipulation and visualization.
To install a package, you can use the install.packages() function. For example, to install the tidyverse package, you would run:
install.packages("tidyverse")
To load a package, you use the library() function. For example, to load the tidyverse package, you would run:
library(tidyverse)
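Installing a package is a one-time step, but you will need to load the packages you want to use each time you start a new R session. For that reason, it is common to put the library() calls at the top of each R script.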
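One possible way to combine the two steps is sketched below: check whether a package is already installed, install it only if it is missing, and then load it. The helper name load_or_install is hypothetical (it is not part of R or of any package); requireNamespace(), install.packages(), and library() are standard R functions. The sketch also uses <- to give the helper a name, and it assumes an internet connection whenever an install is actually needed.

```r
# Hypothetical helper (not part of base R): install a package only if it is
# missing, then load it.
load_or_install <- function(pkg) {
  # requireNamespace() returns FALSE when the package is not installed.
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # character.only = TRUE tells library() that pkg holds the name as text.
  library(pkg, character.only = TRUE)
}

# "stats" ships with R, so this call never triggers a download.
load_or_install("stats")

# For the package discussed above, the call would be:
# load_or_install("tidyverse")
```

Running install.packages() by hand once and keeping only library() in your scripts works just as well; the helper is simply a convenience when sharing scripts with others.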
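Note: to open the help page for any function, type ? followed by the function’s name in the console, for example ?install.packages.
A function call has two parts: the name of the function and, inside the parentheses, its arguments. In install.packages("tidyverse"), install.packages is the function and "tidyverse" is the argument.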
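As a brief illustration of function calls and arguments, the sketch below uses two functions that come with R, round() and seq(). The input values are arbitrary examples.

```r
# One argument given by position (the number) and one given by name (digits).
round(3.14159, digits = 2)

# Arguments can be matched by position or by name; these two calls give the
# same sequence: 1 4 7 10.
seq(1, 10, 3)
seq(from = 1, to = 10, by = 3)

# args() shows which arguments a function accepts.
args(seq.int)

# ?round would open the full help page for round() in RStudio's Help pane.
```

Naming arguments takes a little more typing, but it makes the call easier to read and protects you from passing values in the wrong order.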
Further Reading
If you want to explore more on your own:
- R for Data Science (https://r4ds.had.co.nz/) is a good, comprehensive intro to R; much of the content in this lab manual builds on it.
- Cheatsheets are SUPER useful references: https://rstudio.github.io/cheatsheets/
- Swirl (https://swirlstats.com/) lets you learn R interactively, right in the R console.
Biodiversity Conservation and Ecological Analysis
Core Statistical Packages
vegan: Community ecology package for diversity analysis, ordination methods, and dissimilarity analyses.
MASS: Functions and datasets for Venables and Ripley’s “Modern Applied Statistics with S”, including robust regression and discriminant analysis.
car: Companion to Applied Regression, providing functions for regression diagnostics and variable selection.
lme4: Linear mixed-effects models, crucial for analyzing nested data structures common in ecology.
Data Manipulation and Visualization
tidyverse: Collection of R packages for data science, including ggplot2 for visualization and dplyr for data manipulation.
reshape2: Flexibly restructure and aggregate data.
lubridate: Functions to work with dates and times, useful for phenology studies.
sf: Simple features for R, handling spatial vector data.
Biodiversity Metrics
BiodiversityR: GUI for biodiversity and community ecology analysis.
entropart: Entropy partitioning to measure diversity.
FD: Measuring functional diversity from multiple traits.
betapart: Partitioning beta diversity into turnover and nestedness components.
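To give a sense of what a diversity metric is, here is a minimal sketch that computes species richness, the Shannon index, and the Gini-Simpson index using only base R. The species names and counts are made up for illustration. In practice you would more likely use a dedicated function such as diversity() from the vegan package rather than writing the formulas by hand.

```r
# Hypothetical counts of individuals for five species at one site
# (made-up numbers, not real data).
counts <- c(sp_a = 10, sp_b = 10, sp_c = 10, sp_d = 10, sp_e = 0)

# Drop absent species, then convert counts to proportions.
present <- counts[counts > 0]
p <- present / sum(present)

richness <- length(present)      # number of species present
shannon  <- -sum(p * log(p))     # Shannon index (natural log)
simpson  <- 1 - sum(p^2)         # Gini-Simpson index

richness
shannon
simpson
```

With four equally abundant species, the Shannon index equals log(4) and the Gini-Simpson index equals 0.75, which is a handy sanity check on the formulas.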
Species Distribution and Niche Modeling
dismo: Species distribution modeling.
sdm: Species distribution modeling, integrating various modeling methods.
ENMeval: Evaluating and tuning ecological niche models.
biomod2: Ensemble platform for species distribution modeling.
Phylogenetic Analysis
ape: Analyses of Phylogenetics and Evolution.
phytools: Phylogenetic tools for comparative biology.
picante: Integrating phylogenies and ecology.
Ecological Network Analysis
bipartite: Visualizing bipartite networks and calculating indices.
igraph: Network analysis and visualization.
Environmental Data
raster: Reading, writing, manipulating, analyzing and modeling of spatial data.
ncdf4: Interface to Unidata netCDF format data files.
rgbif: Interface to the Global Biodiversity Information Facility API.
Multivariate Analysis
mvabund: Statistical methods for analyzing multivariate abundance data.
indicspecies: Assessing the strength and significance of relationships between species and groups of sites.