for (counter in sequence) {
  # Body of the loop
  # Commands to be repeated
}
Coding Fundamentals Review
Follow this guide if you are having trouble with last week's activity.
For Loops
Anatomy of a For Loop
A for loop is built from several key components, each serving a specific purpose:
Essential Components
- The for Statement
  - Begins with the keyword for
  - Tells R you want to repeat some commands
  - Always followed by parentheses containing the counter and sequence
- The Counter
  - A variable (often i) that keeps track of where you are in the loop
  - Gets updated automatically in each iteration
  - Can be used inside the loop to:
    - Access elements in sequences
    - Store results
    - Control calculations
- The Sequence
  - Defines what values the counter will take
  - Common forms:
    - 1:n for numbers 1 through n
    - seq(length(x)) for the positions 1 through the length of vector x (seq_along(x) does the same and also behaves correctly when x is empty)
    - A vector of specific values
- The Body
  - Commands between curly braces { }
  - Code that will be executed in each iteration
  - Can use the counter to:
    - Access elements: x[i]
    - Store results: output[i] <- result
    - Perform calculations
- Initialization (often needed)
  - Objects created before the loop starts
  - Usually vectors or matrices to store results
  - Must be the right size for your expected output
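To see how the components listed above fit together, here is a minimal sketch that squares each element of a vector. The names x and output and the values used are illustrative assumptions, not part of the original activity.

```r
# Initialization: input vector and a result vector of the right size
x <- c(3, 5, 7)
output <- numeric(length(x))

# for statement: counter i takes each value of the sequence 1, 2, 3
for (i in seq_along(x)) {
  # Body: access an element with x[i], store the result in output[i]
  output[i] <- x[i]^2
}

print(output)  # [1]  9 25 49
```

Creating output before the loop means each iteration only fills in one slot rather than growing the vector.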
Basic Structure Examples
# Simple counting loop
for (i in 1:5) {
  print(i)
}
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
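The next three lines of output ("a", "b", "c") match a loop whose sequence is a character vector rather than a range of numbers. The code for that loop is not shown here, so the following is only a sketch of what would print it, assuming a hypothetical vector named letters_to_print:

```r
# Looping directly over the values of a character vector
# (letters_to_print is a hypothetical name chosen for illustration)
letters_to_print <- c("a", "b", "c")

for (letter in letters_to_print) {
  print(letter)
}
```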
[1] "a"
[1] "b"
[1] "c"
Example
Goal: Create a for loop that takes a vector of temperatures temps <- c(20, 22, 24, 26, 28)
and calculates the average temperature up to each point. The output should be 20, 21, 22, 23, 24 (each number being the running average up to that point).
Step 1: Understand Input and Desired Output
First, let’s map out what we know:
Input: 20, 22, 24, 26, 28
Output: 20, 21, 22, 23, 24
# Breaking down the output:
20 = 20 # First value: just 20
21 = (20 + 22)/2 # Average of first two values
22 = (20 + 22 + 24)/3 # Average of first three values
23 = (20 + 22 + 24 + 26)/4 # Average of first four values
24 = (20 + 22 + 24 + 26 + 28)/5 # Average of all values
Step 2: Identify Loop Components
We need to determine:
- What are we looping over? We need to loop over the positions in the temperature vector (1 to 5)
- What needs to be calculated in each iteration? The average of temperatures from position 1 to current position.
- Do we need to store results? Yes, we’ll need a vector to store our running averages
Step 3: Initialize Storage
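A minimal sketch of this step, assuming the result vector is called running_avg (a name chosen here for illustration). It creates one empty numeric slot per temperature:

```r
# Input vector from the goal statement
temps <- c(20, 22, 24, 26, 28)

# Storage: a numeric vector of zeros, same length as temps
# (running_avg is an assumed name)
running_avg <- numeric(length(temps))

print(running_avg)  # [1] 0 0 0 0 0
```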
Step 4: Build the Loop
Let’s build this step by step:
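One way the loop could look, assuming the storage vector is named running_avg as a hypothetical choice. Each iteration takes the slice of temps from position 1 to the current position and stores its mean:

```r
temps <- c(20, 22, 24, 26, 28)
running_avg <- numeric(length(temps))  # assumed name for the storage vector

for (i in seq_along(temps)) {
  # Average of all temperatures from position 1 up to position i
  running_avg[i] <- mean(temps[1:i])
}

print(running_avg)  # [1] 20 21 22 23 24
```

Using seq_along(temps) rather than 1:5 keeps the loop correct if the vector changes length.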
Step 5: Understanding Each Iteration
Let’s break down what happens in each iteration:
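A rough way to watch the iterations is to print the counter, the slice being averaged, and the resulting mean on each pass. This is a tracing sketch only; the name values_so_far is an assumption made for readability.

```r
temps <- c(20, 22, 24, 26, 28)

for (i in seq_along(temps)) {
  # The part of the vector seen so far (values_so_far is an assumed name)
  values_so_far <- temps[1:i]
  cat("i =", i, "| values:", values_so_far, "| mean:", mean(values_so_far), "\n")
}
```

On the third pass, for example, i is 3, the slice is 20 22 24, and the mean is 22, which matches the third entry of the desired output.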
Functions
Anatomy of a Function
A function in R consists of several key components that work together to create reusable code:
function_name <- function(argument1, argument2, ...) {
  # Documentation
  # Function body
  return(output)
}
- Function Name
  - Should be descriptive and follow naming conventions
  - Usually uses verbs to describe action
  - Examples: calculate_mean, standardize_values, convert_temperature
- Arguments
  - Input parameters the function needs
  - Can have default values: function(x, na.rm = TRUE)
  - Should have meaningful names
  - Can be required or optional
- Documentation
  - Comments explaining what the function does
  - Description of arguments
  - Description of return value
  - Examples of usage
- Function Body
  - Code that performs the operations
  - Can be multiple lines
  - Should include error checking when needed
  - Should be clear and well-commented
- Return Value
  - What the function outputs
  - Can be explicit using return()
  - Or implicit (last evaluated expression)
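The components above can be seen together in a small sketch. It borrows the name convert_temperature from the examples list; the argument names, the default value, and the supported units are assumptions made for illustration.

```r
convert_temperature <- function(celsius, to = "fahrenheit") {
  # Convert temperatures from Celsius to another unit
  #
  # Args:
  #   celsius: A numeric vector of temperatures in degrees Celsius (required)
  #   to: Target unit, "fahrenheit" or "kelvin" (optional, default "fahrenheit")
  #
  # Returns:
  #   A numeric vector of converted temperatures

  # Error checking
  if (!is.numeric(celsius)) {
    stop("celsius must be numeric")
  }

  # Function body
  if (to == "fahrenheit") {
    result <- celsius * 9 / 5 + 32
  } else if (to == "kelvin") {
    result <- celsius + 273.15
  } else {
    stop("to must be 'fahrenheit' or 'kelvin'")
  }

  # Explicit return value
  return(result)
}

convert_temperature(c(0, 100))         # [1]  32 212
convert_temperature(0, to = "kelvin")  # [1] 273.15
```

Leaving out return(result) and ending the function with result alone would give the same values, since R returns the last evaluated expression.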
Example
Let’s create a function that converts values to percentages of their maximum.
Starting code that needs to be converted to a function:
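The starting code itself is not shown here. Based on the pattern described in Step 1, it plausibly repeats one calculation for three vectors, along the lines of the sketch below. The values in a, b, and c and the names of the result variables are made up for illustration.

```r
# Hypothetical example data (values are invented)
a <- c(5, 10, 20, NA)
b <- c(2, 4, 8)
c <- c(1, 5, 10)  # a variable named c does not stop the function c() from working

# The same calculation copied three times: a sign that a function is needed
a_pct <- a / max(a, na.rm = TRUE) * 100
b_pct <- b / max(b, na.rm = TRUE) * 100
c_pct <- c / max(c, na.rm = TRUE) * 100

print(a_pct)  # [1]  25  50 100  NA
```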
Step 1: Identify Pattern
The pattern here is:
Take a vector of numbers (a, b, c)
Divide by its maximum (max)
Multiply by 100 to get percentages (*100)
Handle NA values appropriately (na.rm = TRUE)
Step 2: Design Function Structure
Let’s think about:
What does it do? → Converts to percentage of maximum
What input does it need? → A numeric vector
What should it return? → A vector of percentages
Optional: What options might be useful? → NA handling
Step 3: Create and Document Function
convert_to_percent_of_max <- function(x, na.rm = TRUE) {
  # Convert numeric values to percentages of their maximum
  #
  # Args:
  #   x: A numeric vector to be converted
  #   na.rm: Logical, should NA values be ignored when finding the maximum? (default = TRUE)
  #
  # Returns:
  #   A numeric vector where each value is expressed as a percentage
  #   of the maximum value in the input vector

  # Check input type
  if (!is.numeric(x)) {
    stop("x must be a numeric vector")
  }

  # Calculate percentages
  result <- x / max(x, na.rm = na.rm) * 100
  return(result)
}
# Options: add more arguments, add more error checking as needed
Step 4: Test the Function
# Test with simple vector
test_vector <- c(1, 2, 3, 4, 5)
convert_to_percent_of_max(test_vector)
[1] 20 40 60 80 100
# Test with NA values
test_with_na <- c(1, 2, NA, 4, 5)
convert_to_percent_of_max(test_with_na)
[1] 20 40 NA 80 100
# Test error handling
# Should produce error:
# convert_to_percent_of_max(c("a", "b", "c"))
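The commented-out call can be run without stopping a script by wrapping it in tryCatch(), which captures the error message instead. The sketch below assumes a version of the function that checks its input with is.numeric(); the check and the wording of its message are illustrative.

```r
# Assumed version of the function with an explicit input check
convert_to_percent_of_max <- function(x, na.rm = TRUE) {
  if (!is.numeric(x)) {
    stop("x must be a numeric vector")
  }
  x / max(x, na.rm = na.rm) * 100
}

# Capture the error message rather than halting
error_message <- tryCatch(
  convert_to_percent_of_max(c("a", "b", "c")),
  error = function(e) conditionMessage(e)
)

print(error_message)  # [1] "x must be a numeric vector"
```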
Other Key Concepts
- Generalization
  - Function takes any numeric vector, not just specific variables
  - Makes code reusable and efficient
- Error Handling
  - Checks input type
  - Handles NA values through parameter
  - Provides informative error messages
- Documentation
  - Clearly explains purpose
  - Describes arguments
  - States what is returned
  - Could include examples
- Flexibility
  - Optional parameters (na.rm)
  - Could be extended for more options