Coding Fundamentals Review

Follow this guide if you are having trouble with last week's activity.

For Loops

Anatomy of a For Loop

A for loop is built from several key components, each serving a specific purpose:

for(counter in sequence) {
    # Body of the loop
    # Commands to be repeated
}

Essential Components

  1. The for Statement
    • Begins with the keyword for
    • Tells R you want to repeat some commands
    • Always followed by parentheses containing the counter and sequence
  2. The Counter
    • A variable (often i) that keeps track of where you are in the loop
    • Gets updated automatically in each iteration
    • Can be used inside the loop to:
      • Access elements in sequences
      • Store results
      • Control calculations
  3. The Sequence
    • Defines what values the counter will take
    • Common forms:
      • 1:n for numbers 1 through n
      • seq_along(x) for positions 1 through length(x) of vector x
      • A vector of specific values
  4. The Body
    • Commands between curly braces { }
    • Code that will be executed in each iteration
    • Can use the counter to:
      • Access elements: x[i]
      • Store results: output[i] <- result
      • Perform calculations
  5. Initialization (often needed)
    • Objects created before the loop starts
    • Usually vectors or matrices to store results
    • Must be the right size for your expected output
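
Sequences built from a vector's length behave the same on a non-empty vector but differ on an empty one. The sketch below compares `1:length()` with base R's `seq_along()`; the vectors `x` and `empty` and the counters `runs_colon` and `runs_seq` are hypothetical and not part of the activity.

```r
# Hypothetical example vectors (not from the activity)
x <- c(10, 20, 30)
empty <- numeric(0)

# On a non-empty vector, both forms give the positions 1, 2, 3
1:length(x)       # 1 2 3
seq_along(x)      # 1 2 3

# On an empty vector, 1:length() counts backwards from 1 to 0,
# while seq_along() gives no positions at all
1:length(empty)   # 1 0
seq_along(empty)  # integer(0)

# Count how many times each loop body runs on the empty vector
runs_colon <- 0
for (i in 1:length(empty)) runs_colon <- runs_colon + 1

runs_seq <- 0
for (i in seq_along(empty)) runs_seq <- runs_seq + 1

runs_colon  # 2
runs_seq    # 0
```

For the fixed-length vectors in this guide both forms work, so the difference only matters when the input might be empty.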

Basic Structure Examples

# Simple counting loop
for(i in 1:5) {
    print(i)
}
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5

# Loop over vector elements
x <- c("a", "b", "c")
for(letter in x) {
    print(letter)
}
[1] "a"
[1] "b"
[1] "c"

# Loop with storage
numbers <- 1:5
result <- numeric(length(numbers))  # Initialization
for(i in seq_along(numbers)) {
    result[i] <- numbers[i] * 2
}
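
The storage loop prints nothing on its own, so it helps to inspect `result` afterwards. The sketch below reruns the loop and compares it with R's vectorized multiplication, which gives the same values without a loop; the name `vectorized` is only for illustration.

```r
numbers <- 1:5
result <- numeric(length(numbers))  # pre-allocated storage, all zeros

for (i in seq_along(numbers)) {
    result[i] <- numbers[i] * 2
}

print(result)
# [1]  2  4  6  8 10

# Same values without a loop (illustrative name)
vectorized <- numbers * 2
all(result == vectorized)  # TRUE
```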

Example

Goal: Create a for loop that takes a vector of temperatures temps <- c(20, 22, 24, 26, 28) and calculates the average temperature up to each point. The output should be 20, 21, 22, 23, 24 (each number being the running average up to that point).

Step 1: Understand Input and Desired Output

First, let’s map out what we know:

Input:  20, 22, 24, 26, 28
Output: 20, 21, 22, 23, 24

# Breaking down the output:
20 = 20                          # First value: just 20
21 = (20 + 22)/2                 # Average of first two values
22 = (20 + 22 + 24)/3            # Average of first three values
23 = (20 + 22 + 24 + 26)/4       # Average of first four values
24 = (20 + 22 + 24 + 26 + 28)/5  # Average of all values

Step 2: Identify Loop Components

We need to determine:

    1. What are we looping over? We need to loop over the positions in the temperature vector (1 to 5).
    2. What needs to be calculated in each iteration? The average of temperatures from position 1 to the current position.
    3. Do we need to store results? Yes, we’ll need a vector to store our running averages.

Step 3: Initialize Storage

# Create our input vector
temps <- c(20, 22, 24, 26, 28)

# Create a vector to store results
# It should be the same length as our input vector
running_avg <- numeric(length(temps))

Step 4: Build the Loop

Let’s build this step by step:

# First, create our storage vector
running_avg <- numeric(length(temps))

# Now build the loop
for(i in seq_along(temps)) {
    # Each iteration will calculate the average up to position i
    running_avg[i] <- mean(temps[1:i])
}

# Look at our results
print(running_avg)
[1] 20 21 22 23 24
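
One way to check the loop is to compute the same running average without a loop. The sketch below uses base R's `cumsum()` (cumulative sum) divided by the position of each value; the name `check` is hypothetical.

```r
temps <- c(20, 22, 24, 26, 28)

# Loop version from this step
running_avg <- numeric(length(temps))
for (i in seq_along(temps)) {
    running_avg[i] <- mean(temps[1:i])
}

# Cross-check: cumulative sum divided by how many values so far
check <- cumsum(temps) / seq_along(temps)

all.equal(running_avg, check)  # TRUE
```

If the two results disagree, the loop body or the storage vector is the first place to look.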

Step 5: Understanding Each Iteration

Let’s break down what happens in each iteration:

# Iteration 1 (i = 1):
mean(temps[1:1])    # means mean(c(20)) = 20
[1] 20

# Iteration 2 (i = 2):
mean(temps[1:2])    # means mean(c(20, 22)) = 21
[1] 21

# Iteration 3 (i = 3):
mean(temps[1:3])    # means mean(c(20, 22, 24)) = 22
[1] 22

# And so on...
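
To see every iteration rather than only the first three, the loop can report its own progress. This is a small sketch using `cat()` and `paste()`; the variable `values_so_far` and the message wording are illustrative.

```r
temps <- c(20, 22, 24, 26, 28)
running_avg <- numeric(length(temps))

for (i in seq_along(temps)) {
    values_so_far <- temps[1:i]            # elements 1 through i
    running_avg[i] <- mean(values_so_far)

    # Report the position, the values used, and the resulting average
    cat("i =", i,
        "| values:", paste(values_so_far, collapse = ", "),
        "| mean:", running_avg[i], "\n")
}
```

Printing inside a loop like this is a common way to debug: if one line of output looks wrong, that iteration is where the problem is.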

Functions

Anatomy of a Function

A function in R consists of several key components that work together to create reusable code:

function_name <- function(argument1, argument2, ...) {
    # Documentation
    # Function body
    return(output)
}

  1. Function Name
    • Should be descriptive and follow naming conventions
    • Usually uses verbs to describe action
    • Examples: calculate_mean, standardize_values, convert_temperature
  2. Arguments
    • Input parameters the function needs
    • Can have default values: function(x, na.rm = TRUE)
    • Should have meaningful names
    • Can be required or optional
  3. Documentation
    • Comments explaining what the function does
    • Description of arguments
    • Description of return value
    • Examples of usage
  4. Function Body
    • Code that performs the operations
    • Can be multiple lines
    • Should include error checking when needed
    • Should be clear and well-commented
  5. Return Value
    • What the function outputs
    • Can be explicit using return()
    • Or implicit (last evaluated expression)
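
The components above can be seen together in a short sketch. The functions `add_tax` and `add_tax_implicit` and the default `rate` are hypothetical; they only illustrate a required argument, an optional argument with a default value, and the two ways of returning a value.

```r
# Hypothetical function: 'price' is required, 'rate' is optional
add_tax <- function(price, rate = 0.10) {
    # Explicit return
    return(price * (1 + rate))
}

# Same calculation with an implicit return (last evaluated expression)
add_tax_implicit <- function(price, rate = 0.10) {
    price * (1 + rate)
}

add_tax(100)               # uses the default rate
add_tax(100, rate = 0.25)  # overrides the default
```

Both versions give the same result for the same inputs; which style to use is mostly a matter of convention.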

Example

Let’s create a function that converts values to percentages of their maximum.

Starting code that needs to be converted to a function:

a / max(a, na.rm = TRUE) * 100
b / max(b, na.rm = TRUE) * 100
c / max(c, na.rm = TRUE) * 100

Step 1: Identify Pattern

The pattern here is:

  • Take a vector of numbers (a, b, c)

  • Divide by its maximum (max)

  • Multiply by 100 to get percentages (*100)

  • Handle NA values appropriately (na.rm = TRUE)

Step 2: Design Function Structure

Let’s think about:

  • What does it do? → Converts to percentage of maximum

  • What input does it need? → A numeric vector

  • What should it return? → A vector of percentages

  • Optional: What options might be useful? → NA handling

Step 3: Create and Document Function

convert_to_percent_of_max <- function(x, na.rm = TRUE) {
    # Convert numeric values to percentages of their maximum
    #
    # Args:
    #   x: A numeric vector to be converted
    #   na.rm: Logical, should NA values be ignored when finding
    #          the maximum? (default = TRUE)
    #
    # Returns:
    #   A numeric vector where each value is expressed as a percentage
    #   of the maximum value in the input vector

    # Calculate percentages; na.rm is passed on to max()
    result <- x / max(x, na.rm = na.rm) * 100

    return(result)
}

# Options: add more arguments, add more error checking (e.g. input type)

Step 4: Test the Function

# Test with simple vector
test_vector <- c(1, 2, 3, 4, 5)
convert_to_percent_of_max(test_vector)
[1]  20  40  60  80 100

# Test with NA values
test_with_na <- c(1, 2, NA, 4, 5)
convert_to_percent_of_max(test_with_na)
[1]  20  40  NA  80 100

# Test error handling
# Should produce error:
# convert_to_percent_of_max(c("a", "b", "c"))
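
For the error test to fail with a clear message, the function can check its input before calculating. The sketch below is one possible extension, not the version used in the activity; the name `convert_to_percent_of_max_checked` and the wording of the error message are assumptions.

```r
# Hypothetical extended version with an input check
convert_to_percent_of_max_checked <- function(x, na.rm = TRUE) {
    # Stop early with an informative message if x is not numeric
    if (!is.numeric(x)) {
        stop("x must be a numeric vector, not ", class(x)[1])
    }
    x / max(x, na.rm = na.rm) * 100
}

convert_to_percent_of_max_checked(c(1, 2, NA, 4, 5))
# [1]  20  40  NA  80 100

# tryCatch() captures the error so the script keeps running
msg <- tryCatch(
    convert_to_percent_of_max_checked(c("a", "b", "c")),
    error = function(e) conditionMessage(e)
)
msg
# [1] "x must be a numeric vector, not character"
```

Checking with `is.numeric()` first means the user sees a message about their input rather than a less direct error from the division itself.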

Other Key Concepts

  1. Generalization
    • Function takes any numeric vector, not just specific variables
    • Makes code reusable and efficient
  2. Error Handling
    • Can check input type (e.g. with is.numeric())
    • Handles NA values through parameter
    • Can provide informative error messages (e.g. with stop())
  3. Documentation
    • Clearly explains purpose
    • Describes arguments
    • States what is returned
    • Could include examples
  4. Flexibility
    • Optional parameters (na.rm)
    • Could be extended for more options
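
Generalization pays off when the same function is applied to several vectors at once. The sketch below assumes a compact `convert_to_percent_of_max()` with an `na.rm` argument, and a hypothetical data frame `df` standing in for the original `a`, `b`, and `c`.

```r
# Compact version of the function, assumed for this sketch
convert_to_percent_of_max <- function(x, na.rm = TRUE) {
    x / max(x, na.rm = na.rm) * 100
}

# Hypothetical data standing in for a, b, and c
df <- data.frame(
    a = c(1, 2, 4),
    b = c(10, NA, 5),
    c = c(3, 6, 9)
)

# lapply() calls the function once per column; assigning into
# df_percent[] keeps the data frame structure
df_percent <- df
df_percent[] <- lapply(df, convert_to_percent_of_max)
df_percent
```

One call now replaces the three repeated lines from the starting code, and the same pattern works for any number of numeric columns.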