The companion script below can be copied into your script editor (which we are about to learn about) so you can follow along. You can combine the companion scripts from each section into one script as your notes for the day.
# ========================================
# Variables, Environments, and Plots Comprehensive Companion Script
# ========================================

# ----------------------------------------
# Variables and the Environment
# ----------------------------------------

# Variables in R store data that you can use throughout your analysis
# The Environment pane in RStudio shows all the variables you've created

# Creating Variables using '<-'
x <- 5
y <- "Hello"
z <- c(1, 2, 3, 4, 5)
df <- data.frame(id = 1:3, value = c("A", "B", "C"))

# '<-' is the standard assignment operator in R
average <- mean(c(1, 3, 7, 7, 2))
print(average)

# Note: While '=' can be used for basic assignments, '<-' is preferred for clarity and consistency

# Multiple Choice Q1: Starting with an otherwise empty environment, what is the value of 'x' after running the following code?
x <- 10
y <- 5
x <- y
# a) 10
# b) 5
# c) 15
# d) Error

# ----------------------------------------
# Variable Names and Good Practice
# ----------------------------------------

# Rules for variable names:
# - Can include letters, numbers, underscores (_), and periods (.)
# - Must begin with a letter or a period (if period, can't be followed by a number)
# - Are case-sensitive
# - Cannot be reserved words (if, else, TRUE, FALSE, NULL, etc.)
# - Cannot contain special characters or spaces

# Good practice: Use descriptive names
max_temperature <- 30
min_temperature <- 15

# Multiple Choice Q2: Which of the following is NOT a valid variable name in R?
# a) my_variable
# b) .my_variable
# c) 1st_variable
# d) myVariable

# ----------------------------------------
# Viewing the Environment
# ----------------------------------------

# List all variables in the environment
ls()

# View the structure of a variable
str(df)

# Multiple Choice Q3: Starting with an otherwise empty environment, how many variables will be in the environment after running the following code?
x <- 1
y <- 2
z <- x + y
rm(y)
# a) 1
# b) 2
# c) 3
# d) 4

# ----------------------------------------
# Removing Variables
# ----------------------------------------

# Remove a specific variable
rm(x)

# Remove all variables
rm(list = ls())

# Remove variables matching a pattern
rm(list = ls(pattern = "temp"))
ls()

# Multiple Choice Q4: What will be the output of ls() after running the following code?
a <- 1
b <- 2
temp_1 <- 3
temp_2 <- 4
rm(list = ls(pattern = "temp"))
# a) a, b
# b) a, b, temp_1, temp_2
# c) temp_1, temp_2
# d) An empty list

# ----------------------------------------
# Basic Plotting
# ----------------------------------------

# R has powerful built-in plotting capabilities

# Scatter Plot
x <- 1:5
y <- c(2, 4, 6, 8, 10)
plot(x, y, main = "Scatter Plot", xlab = "X axis", ylab = "Y axis")

# Line Plot
plot(x, y, type = "l", main = "Line Plot", xlab = "X axis", ylab = "Y axis")

# Bar Plot
barplot(y, names.arg = x, main = "Bar Plot", xlab = "X axis", ylab = "Y axis")

# Histogram
hist(y, main = "Histogram", xlab = "Values")

# ----------------------------------------
# Food for Thought
# ----------------------------------------

# 1. How does the use of descriptive variable names improve code readability and maintainability?
# 2. How can the ability to remove variables from the environment be useful in data analysis workflows?

# ----------------------------------------
# Challenges
# ----------------------------------------

# 1. Create a vector of 10 random numbers between 1 and 100. Calculate and store the mean, median, and standard deviation in separate variables.
# 2. Create a data frame with columns for 'name', 'age', and 'height' for 5 individuals. Use str() to examine its structure.
# 3. Generate a sequence of numbers from 1 to 50. Create a histogram of these numbers with appropriate title and axis labels.
# 4. Write a script that creates several variables, then removes all variables that start with the letter 'a'.
Variables and the Environment
Variables in R store data that you can use throughout your analysis. The Environment pane (the Environment tab in RStudio) lists every variable you have created in your current environment.
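What the Environment pane displays can also be queried from the console. The following is a minimal sketch using only base R functions (ls(), str(), class(), rm()); the variable names are illustrative, and the result of ls() will also include anything else already defined in your session.

```r
# Create two variables
x <- 5
y <- "Hello"

# List the names of the variables currently in the environment;
# the result includes "x" and "y" plus anything defined earlier
ls()

# Inspect a single variable
str(x)     # compact summary of type and value
class(y)   # "character"

# Remove a variable and confirm it is gone
rm(y)
"y" %in% ls()   # FALSE
```

After running rm(y), the entry for y should also disappear from the Environment pane.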
Creating Variables using <-
To create a variable, use the assignment operator <-:
x <- 5
y <- "Hello"
z <- c(1, 2, 3, 4, 5)
df <- data.frame(id = 1:3, value = c("A", "B", "C"))
In R, <- is the standard assignment operator. It assigns the value on the right to the variable on the left:
average <- mean(c(1, 3, 7, 7, 2))
print(average)
While = can be used for basic assignments, <- remains the conventional choice and reads as idiomatic to other R programmers. It marks assignment unambiguously, whereas = can be confused with the equality comparison == or with naming an argument in a function call, where = is the norm. The arrow form also works in both directions (x <- 5 and 5 -> x are equivalent), whereas = always assigns to the name on its left. Finally, using <- keeps your code consistent with older R code and follows the recommendations of popular R style guides such as those by Google and Hadley Wickham.
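The difference between assigning a variable and naming a function argument is easiest to see by running both forms. This is a small sketch in base R; the variable names are illustrative, and exists(..., inherits = FALSE) is used only to check whether a variable is present in the current environment.

```r
# Both arrow directions create the same value
a <- 5
5 -> b
identical(a, b)   # TRUE

# Start from a state where 'x' is not defined in this environment
if (exists("x", inherits = FALSE)) rm(x)

# Inside a function call, '=' names the argument; no variable is created
m1 <- mean(x = c(1, 3, 7, 7, 2))
exists("x", inherits = FALSE)   # FALSE

# Inside a function call, '<-' creates 'x' AND passes its value along
m2 <- mean(x <- c(1, 3, 7, 7, 2))
exists("x", inherits = FALSE)   # TRUE

m1 == m2   # TRUE, both are 4
```

Assigning inside a function call is shown here only to illustrate the distinction; in everyday code it is usually clearer to assign on a separate line.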
Variable Names and Good Practice
In R, variable naming follows specific rules to ensure proper functionality and readability. Here are the key rules and additional context:
Allowed Characters: Variable names can include letters, numbers, underscores (_), and periods (.).
Starting Character: Names must begin with a letter or a period. If a name starts with a period, the period cannot be immediately followed by a digit (.2way is invalid, while .way2 is valid).
Case Sensitivity: Variable names are case-sensitive, meaning age, Age, and AGE would be considered distinct variables.
Reserved Words: Variable names cannot be reserved words in R, such as if, else, TRUE, FALSE, NULL, Inf, NaN, and NA.
Length: There is no practical limit on the length of variable names (R accepts names up to 10,000 bytes), but overly long names make code difficult to read and maintain.
Special Characters and Spaces: Variable names cannot contain special characters (e.g., @, #, $, %) or spaces.
Adhering to these rules helps maintain code clarity and prevents errors related to invalid variable names. Additionally, following naming conventions, such as using descriptive names and consistent casing (e.g., camelCase or snake_case), can further enhance code readability and maintainability.
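One way to check candidate names against the rules above is base R's make.names(), which rewrites an invalid name into a valid one; a name that comes back unchanged was already valid. The helper is_valid_name below is a hypothetical convenience wrapper written for this illustration, not a function that ships with R.

```r
# Hypothetical helper: a name is valid if make.names() leaves it unchanged
is_valid_name <- function(name) {
  make.names(name) == name
}

candidates <- c("my_variable", ".my_variable", "1st_variable",
                "myVariable", ".2way", "max temp", "if")

# Show each candidate next to its validity
data.frame(name = candidates, valid = is_valid_name(candidates))
# Valid:   my_variable, .my_variable, myVariable
# Invalid: 1st_variable (starts with a digit), .2way (period then digit),
#          max temp (contains a space), if (reserved word)

# Names are case-sensitive: these are two distinct variables
age <- 30
Age <- 31
age == Age   # FALSE
```

make.names() can also be called directly to see how R would repair an invalid name, which is what happens to column names when a file is read with the default settings of read.csv().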
Tip
It is good practice to use descriptive variable names, like max_temperature or min_temperature, instead of single letters like x or y.