2 Case Study: Introducing the Data

Before we learn to analyze data, we need to understand the data we’ll be analyzing. Throughout this book, every technique — from visualization to predictive modeling to dashboard design — will be applied to a single, real-world dataset. This chapter introduces that dataset, the business problem it represents, and the questions we’ll spend the rest of the book answering.

Chapter Goals

After completing this chapter, readers will be able to:

  1. Describe workplace absenteeism as a business problem, including its costs and operational impact.
  2. Explain the structure of the Absenteeism at Work dataset — what each record represents and what variables are captured.
  3. Distinguish among the types of data in the dataset (categorical, numerical, ordinal, and binary) and identify examples of each.
  4. Articulate the key business questions that the dataset can help answer, and connect those questions to the analytical techniques covered in later chapters.