10.3 Homework Assignment: Regression Models
In this case study, you are assigned to analyze the “Absenteeism at Work Data Set” using regression models, with a specific focus on predicting absenteeism. You will replicate the analysis from the previous case but with a key difference: use Number.of.absence as the dependent variable for your models. This will allow you to concentrate specifically on the factors influencing the frequency of absenteeism.
Assignment Instructions
You are to produce a comprehensive Quarto document, rendered to a DOCX file formatted as a memo to your boss. This document should cover the following steps, providing a thorough analysis from data preparation to modeling and diagnostics:
Data Preparation: Refer to the earlier section of the case for detailed steps on data loading, filtering, grouping, summarizing, selecting relevant columns, recoding, and cleaning to prepare the dataset for analysis.
Defining High Absence: Create a new variable that classifies each employee as ‘High Absenteeism’ or ‘Low Absenteeism’, using the median number of absences per month of service as the cutoff.
Linear Model Prediction: Use a linear regression model to predict absenteeism, with Number.of.absence as the dependent variable as specified above. Identify which variables affect absenteeism and evaluate their statistical significance.
Refine the Model: Apply stepwise regression guided by the Akaike information criterion (AIC), adding or removing variables to balance model fit against model complexity.
Logistic Model Prediction: Implement logistic regression to assess the probability of employees falling into either the ‘High Absenteeism’ or ‘Low Absenteeism’ categories.
Model Diagnostics: Conduct diagnostics to evaluate the fit of the logistic regression model and identify any issues.
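The median split, linear model, and stepwise refinement steps listed above can be sketched in base R as follows. This is only a minimal illustration under stated assumptions, not the case's reference solution: the data frame is simulated, every column name except Number.of.absence is a hypothetical placeholder, and the split is applied to Number.of.absence directly (if your prepared data uses a rate per month of service, substitute that column).

```r
# Simulated stand-in for the prepared employee-level data (NOT the real data set).
# Age, Service.time and Distance are hypothetical placeholder column names.
set.seed(42)
n <- 200
employees <- data.frame(
  Age          = round(runif(n, 25, 58)),
  Service.time = round(runif(n, 1, 29)),
  Distance     = round(runif(n, 5, 52))
)
employees$Number.of.absence <- rpois(
  n, lambda = exp(0.5 + 0.03 * employees$Service.time)
)

# Median split: values above the median are labelled 'High Absenteeism'.
cutoff <- median(employees$Number.of.absence)
employees$High.absence <- factor(
  ifelse(employees$Number.of.absence > cutoff,
         "High Absenteeism", "Low Absenteeism"),
  levels = c("Low Absenteeism", "High Absenteeism")
)

# Linear model with Number.of.absence as the dependent variable.
full_lm <- lm(Number.of.absence ~ Age + Service.time + Distance,
              data = employees)
print(summary(full_lm))  # coefficient estimates, standard errors, p-values

# Stepwise selection by AIC, starting from the full model.
# With no scope given, step() treats the starting model as the upper bound.
step_lm <- step(full_lm, direction = "both", trace = 0)
print(formula(step_lm))  # predictors retained after selection
print(c(full = AIC(full_lm), stepwise = AIC(step_lm)))
```

In your own memo, replace the simulated data frame with the data set produced by the data preparation step and list the actual candidate predictors in the model formula.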
Your analysis must clearly explain the results, detailing the significance of the predictors and their relationships with absenteeism. Include every step from data preparation to final diagnostics, so that the memo gives your boss a comprehensive view of the factors affecting absenteeism in the workplace, and present your findings and methods clearly and professionally.
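For the logistic regression and diagnostics steps, one possible starting point in base R is sketched below. It rests on assumptions: the data are simulated, Age and Service.time are hypothetical placeholder predictors, and the diagnostics shown (likelihood-ratio test, McFadden pseudo R-squared, and a confusion matrix at a 0.5 threshold) are common choices rather than the ones the case necessarily prescribes.

```r
# Simulated stand-in for the prepared data (NOT the real data set).
# Age and Service.time are hypothetical placeholder column names.
set.seed(42)
n <- 200
employees <- data.frame(
  Age          = round(runif(n, 25, 58)),
  Service.time = round(runif(n, 1, 29))
)
employees$Number.of.absence <- rpois(
  n, lambda = exp(0.5 + 0.03 * employees$Service.time)
)

# Binary outcome: 1 = 'High Absenteeism' (above the median), 0 = 'Low Absenteeism'.
employees$High.absence <- as.integer(
  employees$Number.of.absence > median(employees$Number.of.absence)
)

# Logistic regression for the probability of high absenteeism.
logit_fit <- glm(High.absence ~ Age + Service.time,
                 data = employees, family = binomial)
print(summary(logit_fit))
odds_ratios <- exp(coef(logit_fit))  # multiplicative change in odds per unit

# Diagnostic 1: likelihood-ratio test against the intercept-only model.
null_fit <- update(logit_fit, . ~ 1)
lr_test  <- anova(null_fit, logit_fit, test = "Chisq")
print(lr_test)

# Diagnostic 2: McFadden pseudo R-squared from the residual and null deviances.
pseudo_r2 <- 1 - logit_fit$deviance / logit_fit$null.deviance

# Diagnostic 3: confusion matrix and accuracy at a 0.5 probability threshold.
pred_prob  <- fitted(logit_fit)
pred_class <- as.integer(pred_prob > 0.5)
confusion  <- table(Observed = employees$High.absence, Predicted = pred_class)
accuracy   <- mean(pred_class == employees$High.absence)
print(confusion)
print(c(pseudo_r2 = pseudo_r2, accuracy = accuracy))
```

The 0.5 threshold is a convention, not a requirement; a different cutoff may suit the memo better if the two classes are unbalanced or if one type of misclassification is more costly.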