5.5 Data Reshaping with tidyr

tidyr provides functions for reshaping data between wide and long formats (Wickham et al. 2024). The two key functions are pivot_longer() (wide to long) and pivot_wider() (long to wide).

5.5.1 Pivoting Longer with pivot_longer()

pivot_longer() transforms data from wide format — where multiple columns represent levels of a variable — into long format, where those levels are gathered into a single column.

Example: Survey satisfaction scores in wide format, with groups as columns:

# Load tidyr, which provides the pivot functions and re-exports tibble()
library(tidyr)

# Hypothetical survey data in a wide format
survey_data <- tibble(
  year = c(2021, 2022),
  A = c(8.2, 8.4),
  B = c(7.8, 7.9)
)
survey_data

# Transforming the wide-format survey data into a longer format
long_survey_data <- survey_data |>
  pivot_longer(
    cols = -year,
    names_to = "Group",
    values_to = "Satisfaction_Score"
  )
long_survey_data

Here, pivot_longer() takes the group columns (A and B), gathers their names into a Group column, and their values into a Satisfaction_Score column. The cols = -year argument means “pivot all columns except year.”
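
As a brief sketch of how cols selects columns, the same reshape can also be written by naming the columns to pivot explicitly. The object names survey_data_demo, by_exclusion, and by_name are hypothetical and used only for illustration; the data are the same hypothetical scores as above.

```r
library(tidyr)

# Hypothetical copy of the wide-format survey data from the example above
survey_data_demo <- tibble(
  year = c(2021, 2022),
  A = c(8.2, 8.4),
  B = c(7.8, 7.9)
)

# Select the columns to pivot by excluding year ...
by_exclusion <- survey_data_demo |>
  pivot_longer(
    cols = -year,
    names_to = "Group",
    values_to = "Satisfaction_Score"
  )

# ... or by naming the group columns explicitly
by_name <- survey_data_demo |>
  pivot_longer(
    cols = c(A, B),
    names_to = "Group",
    values_to = "Satisfaction_Score"
  )

# Each of the 2 original rows yields one row per pivoted column
nrow(by_exclusion)

# Compare the two results
all.equal(by_exclusion, by_name)
```

Excluding year is convenient when there are many group columns, while naming the columns is more explicit and should be unaffected if other identifier columns are added later.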

5.5.2 Pivoting Wider with pivot_wider()

pivot_wider() is the inverse — it spreads a long-format dataset back into wide format.

# Transforming the long-format data back into a wide format
wide_survey_data_again <- long_survey_data |>
  pivot_wider(
    names_from = Group,
    values_from = Satisfaction_Score
  )
wide_survey_data_again

names_from specifies which column provides the new column names, and values_from specifies which column provides the cell values. The result has the same columns (year, A, and B) as the original wide-format survey_data.
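
One way to check that the two functions are inverses here is a round trip, sketched below under the assumption that the data are the same hypothetical scores as above. The names survey_data_demo and round_trip are hypothetical.

```r
library(tidyr)

# Hypothetical copy of the wide-format survey data
survey_data_demo <- tibble(
  year = c(2021, 2022),
  A = c(8.2, 8.4),
  B = c(7.8, 7.9)
)

# Pivot longer, then immediately pivot wider again
round_trip <- survey_data_demo |>
  pivot_longer(
    cols = -year,
    names_to = "Group",
    values_to = "Satisfaction_Score"
  ) |>
  pivot_wider(
    names_from = Group,
    values_from = Satisfaction_Score
  )

# Inspect the column names and compare with the starting data
names(round_trip)
all.equal(round_trip, survey_data_demo)
```

A round trip like this is only expected to reproduce the input when each combination of year and Group occurs once; with duplicated combinations, pivot_wider() cannot place a single value in each cell.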