5.5 Data Reshaping with tidyr
tidyr provides functions for reshaping data between wide and long formats (Wickham et al. 2024). The two key functions are pivot_longer() (wide to long) and pivot_wider() (long to wide).
5.5.1 Pivoting Longer with pivot_longer()
pivot_longer() transforms data from wide format — where multiple columns represent levels of a variable — into long format, where those levels are gathered into a single column.
Example: Survey satisfaction scores in wide format, with groups as columns:
# Hypothetical survey data in a wide format
library(tidyr)  # provides pivot_longer() and pivot_wider(); also re-exports tibble()
survey_data <- tibble(
  year = c(2021, 2022),
  A = c(8.2, 8.4),
  B = c(7.8, 7.9)
)
survey_data

# Transforming the wide-format survey data into a longer format
long_survey_data <- survey_data |>
  pivot_longer(
    cols = -year,
    names_to = "Group",
    values_to = "Satisfaction_Score"
  )
long_survey_data

Here, pivot_longer() takes the group columns (A and B), gathers their names into a Group column, and their values into a Satisfaction_Score column. The cols = -year argument means “pivot all columns except year.”
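As a small sketch of the column selection described above, the columns to pivot can also be named explicitly instead of excluding year. The object names by_exclusion and by_name are illustrative only and do not appear elsewhere in this chapter; the data is rebuilt here so the snippet runs on its own.

```r
library(tibble)
library(tidyr)

# Same hypothetical survey data as in the example above
survey_data <- tibble(
  year = c(2021, 2022),
  A = c(8.2, 8.4),
  B = c(7.8, 7.9)
)

# Select the columns to pivot by excluding year ...
by_exclusion <- survey_data |>
  pivot_longer(
    cols = -year,
    names_to = "Group",
    values_to = "Satisfaction_Score"
  )

# ... or by naming the group columns directly
by_name <- survey_data |>
  pivot_longer(
    cols = c(A, B),
    names_to = "Group",
    values_to = "Satisfaction_Score"
  )

# Both selections should describe the same set of columns
all.equal(by_exclusion, by_name)

# 2 rows x 2 pivoted columns gives 4 rows; the columns are
# year, Group, and Satisfaction_Score
dim(by_exclusion)
```

Excluding year is convenient when there are many group columns; naming the columns is more explicit and keeps working as intended if further identifier columns are added to the data later.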
5.5.2 Pivoting Wider with pivot_wider()
pivot_wider() is the inverse — it spreads a long-format dataset back into wide format.
# Transforming the long-format data back into a wide format
wide_survey_data_again <- long_survey_data |>
  pivot_wider(
    names_from = Group,
    values_from = Satisfaction_Score
  )
wide_survey_data_again

names_from specifies which column provides the new column names, and values_from specifies which column provides the cell values. The result matches the original wide-format structure.
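One way to check the claim that the two functions are inverses is a round trip: pivot longer, pivot wider again, and compare with the starting data. The following is a minimal sketch under that assumption; round_trip is an illustrative name, and the data is rebuilt so the snippet is self-contained.

```r
library(tibble)
library(tidyr)

# Same hypothetical survey data as in the examples above
survey_data <- tibble(
  year = c(2021, 2022),
  A = c(8.2, 8.4),
  B = c(7.8, 7.9)
)

# Wide -> long -> wide in a single pipeline
round_trip <- survey_data |>
  pivot_longer(
    cols = -year,
    names_to = "Group",
    values_to = "Satisfaction_Score"
  ) |>
  pivot_wider(
    names_from = Group,
    values_from = Satisfaction_Score
  )

# Compare column names and contents with the original data
names(round_trip)
all.equal(survey_data, round_trip)
```

For this small table the round trip should recover the original columns and values. With real data, the comparison can fail if a combination of identifier and group occurs more than once, because pivot_wider() then cannot place a single value in each cell.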