Homework: Exploring the Absenteeism Data

Using the Absenteeism at Work dataset loaded in this chapter, complete the following tasks in a Quarto document. Each task should be its own section with a heading, code, and a brief written interpretation.

  1. Summary statistics by season. The Seasons variable is coded as 1 = summer, 2 = autumn, 3 = winter, 4 = spring. Use filter() to create a subset for one season (e.g., filter(work, Seasons == 1) for summer), then run describe() on the Absenteeism time in hours column. Repeat for each season. Which season has the highest mean absenteeism?

  2. Histogram of transportation expense. Create a histogram of the Transportation expense variable. Describe the shape of the distribution: is it symmetric, left-skewed, or right-skewed?

  3. Boxplot of absenteeism hours by season. Create a boxplot showing absenteeism hours for each season. Label the seasons by name (not number). Are there noticeable differences across seasons?

  4. Your observations. In 3–5 sentences, describe one pattern or question that emerged from your exploration. What would you want to investigate further?
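
Hint for tasks 1 and 3. The subsetting and labelling steps can be sketched on a small made-up data frame. This is only an illustration, not a solution: the data frame toy, its hours column, and the season_name column are hypothetical names, and the values are invented. The sketch uses base R so that it runs without extra packages; in your own document you would use filter() and describe() as the task asks. The column names in your work data frame depend on how the file was read in, and a name containing spaces must be wrapped in backticks.

```r
# Toy data standing in for the real dataset (values are invented for illustration)
toy <- data.frame(
  Seasons = c(1, 1, 2, 2, 3, 3, 4, 4),
  hours   = c(2, 4, 8, 3, 1, 16, 8, 2)
)

# Recode the numeric season codes as a labelled factor,
# following the coding given in task 1
toy$season_name <- factor(
  toy$Seasons,
  levels = c(1, 2, 3, 4),
  labels = c("Summer", "Autumn", "Winter", "Spring")
)

# Subset one season (base-R equivalent of filter(toy, Seasons == 1))
summer <- toy[toy$Seasons == 1, ]
mean(summer$hours)

# Boxplot of hours grouped by the named season factor
boxplot(hours ~ season_name, data = toy,
        xlab = "Season", ylab = "Absenteeism (hours)")
```

Because season_name is a factor with explicit labels, the boxplot axis shows season names in the order given in levels rather than the numeric codes.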

Submission: Render your Quarto document to Word format and submit the rendered file. See the Quarto appendix for detailed instructions on creating and rendering Quarto documents.

The skills in these chapters are not preliminary; they are the practical backbone of every analysis in this textbook and in professional BI work.