2.5 Questions to Explore

Now that you are familiar with the dataset, consider the questions that will guide our analysis throughout the rest of the book:

  • Which factors are most strongly associated with high absenteeism? Do employees with longer commutes, higher workloads, or more children tend to miss more work? We will begin exploring these relationships visually in Chapters 7–8 and build formal models in Chapters 9–10.

  • Are there patterns in the timing of absences? Do certain days of the week, months, or seasons show higher absenteeism? Data visualization techniques (Chapters 7–8) will help us see these patterns clearly.

  • Can we predict which employees are at risk of excessive absenteeism? Using regression and classification (Chapters 9–10) and data mining techniques (Chapters 11–12), we will build models that identify at-risk employees based on their characteristics and work context.

  • Can we detect unusual patterns that signal a deeper problem? Anomaly detection methods (Chapters 11–12) can flag employees or time periods with unexpectedly high absence rates.

  • What would an effective HR dashboard look like? In Chapters 15–16, we will design and build an interactive dashboard that presents key absenteeism metrics to decision-makers in real time.

These are not hypothetical exercises: they are the kinds of questions that HR analysts and BI professionals work on every day. By the end of this book, you will have the skills to answer them.