3.7 Vectors and Other Variables
Variables in R can hold different types of data. One of the most fundamental data structures is the vector — an ordered sequence of elements of the same data type. Unlike a mathematical vector (which implies direction and magnitude), an R vector is simply a sequence of values.
3.7.1 Storing Many Numbers as a Vector
Use the c() function (“combine”) to create a vector from individual values.
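The code for this step is not shown, so the following is a minimal sketch of a `c()` call that produces the output printed below. The variable name `numeric_vector` is an assumption, chosen to mirror `logical_vector` later in this section.

```r
# Combine five numbers into a single vector.
# numeric_vector is an assumed name, not taken from the original text.
numeric_vector <- c(10, 20, 30, 40, 50)
print(numeric_vector)
```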
## [1] 10 20 30 40 50
3.7.2 Storing Text Data
Vectors can also hold text (called “character” data in R). Enclose text values in quotes.
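As a minimal sketch of a character vector that produces the output printed below (the variable name `character_vector` is an assumption, not taken from the original text):

```r
# Each text value is wrapped in quotes; c() combines them into one vector.
# character_vector is an assumed name used only for illustration.
character_vector <- c("Alice", "Bob", "Charlie", "David")
print(character_vector)
```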
## [1] "Alice" "Bob" "Charlie" "David"
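Because every element of a vector must share one data type, combining numbers, text, and logical values does not raise an error. Instead, R converts everything to a single common type. The sketch below uses illustrative values and an assumed variable name, `mixed`, to show the effect:

```r
# Mixing a number, a string, and a logical value in one call to c()
mixed <- c(1, "two", TRUE)

# Every element has been converted to text
print(mixed)         # [1] "1"    "two"  "TRUE"
print(class(mixed))  # [1] "character"
```

If a vector that should be numeric turns out to be of class "character", a stray text value in the input is a likely cause.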
3.7.3 Storing “True or False” Data
Logical values (TRUE and FALSE) are used for conditions and filtering. While R allows the abbreviations T and F, avoid using them — unlike TRUE and FALSE, they can be overwritten by accident (e.g., T <- 5), which leads to subtle bugs.
# Store a vector of logical values
logical_vector <- c(TRUE, FALSE, TRUE, TRUE)
print(logical_vector)
## [1] TRUE FALSE TRUE TRUE
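The risk of using `T` and `F` can be sketched as follows. The variable names `overwritten` and `restored` are illustrative assumptions. The example creates its own `T` and removes it afterwards, so it leaves the session unchanged.

```r
# TRUE is a reserved word, so the following line would fail with an error:
# TRUE <- 5

# T is an ordinary variable, so it can be reassigned without any warning
T <- 5
overwritten <- isTRUE(T)  # FALSE: T now holds the number 5

# Remove our copy so that T refers to R's built-in value again
rm(T)
restored <- isTRUE(T)     # TRUE

print(overwritten)
print(restored)
```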
3.7.4 Indexing Vectors
Use square brackets ([]) to access or modify individual elements. Note that R indexing starts at 1, not 0.
# Accessing individual elements
numbers <- c(10, 20, 30, 40, 50)
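print(numbers[1]) # Access the first element
## [1] 10
print(numbers[2:4]) # Access the second through fourth elements
## [1] 20 30 40
numbers[3] <- 35 # Modify the third element
print(numbers)
## [1] 10 20 35 40 50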
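Square brackets also accept a logical vector, which is one common way to do the filtering mentioned in Section 3.7.3. The following is a minimal sketch, not the only approach. The threshold of 25 and the names `is_large` and `large_numbers` are illustrative assumptions.

```r
numbers <- c(10, 20, 30, 40, 50)

# A comparison returns one TRUE or FALSE per element
is_large <- numbers > 25
print(is_large)       # [1] FALSE FALSE  TRUE  TRUE  TRUE

# Indexing with a logical vector keeps only the elements marked TRUE
large_numbers <- numbers[is_large]
print(large_numbers)  # [1] 30 40 50
```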
3.7.5 Missing Values (NA)
Real-world data frequently contains missing values. In R, missing values are represented by NA (Not Available). Understanding how R handles NA is essential because missing values can silently affect calculations.
# A vector with a missing value
temps <- c(72, 68, NA, 75, 71)
# mean() returns NA if any value is missing
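mean(temps)
## [1] NA
# Use na.rm = TRUE to remove missing values before computing the mean
mean(temps, na.rm = TRUE)
## [1] 71.5
# Use is.na() to check which values are missing
is.na(temps)
## [1] FALSE FALSE TRUE FALSE FALSE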
Many R functions return NA by default when the input contains missing values — this is a safety feature that forces you to make an explicit decision about how to handle them. The na.rm = TRUE argument (available in functions like mean(), sum(), sd()) removes NA values before computing the result. We will explore more sophisticated approaches to missing data in Chapter 5.
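The same pattern carries over to the other functions named above. As a minimal sketch using the `temps` values from this section (the names `total`, `spread`, and `n_missing` are illustrative assumptions), `is.na()` returns TRUE for each missing element, so summing its result counts the missing values:

```r
temps <- c(72, 68, NA, 75, 71)

# Without na.rm = TRUE, a single NA makes the whole result NA
print(sum(temps))                  # [1] NA

# With na.rm = TRUE, the NA is dropped before the calculation
total  <- sum(temps, na.rm = TRUE) # 286
spread <- sd(temps, na.rm = TRUE)  # standard deviation of the 4 known values

# TRUE counts as 1 when summed, so this counts the missing values
n_missing <- sum(is.na(temps))     # 1

print(total)
print(spread)
print(n_missing)
```

Checking `n_missing` before using `na.rm = TRUE` shows how much data is being dropped from the calculation.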