9.7 AI in Modeling

AI is transforming the modeling process itself. Automated Machine Learning (AutoML) platforms can automatically test hundreds of model configurations (selecting algorithms, tuning hyperparameters, and engineering features) in the time it would take an analyst to manually evaluate a handful (Baratchi et al. 2024). Platforms like H2O.ai and Google AutoML make this capability accessible, and the tidymodels framework in R supports tuning and comparing models in code.
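To make "testing model configurations" concrete, here is a minimal sketch of one piece of what an AutoML platform automates: trying several values of a single hyperparameter and keeping the one with the lowest cross-validated error. It is written in Python using only the standard library. The data are synthetic, and the function names (fit_ridge_1d, cv_rmse) and the penalty grid are assumptions made for illustration. Real platforms search over many algorithms, features, and hyperparameters at once.

```python
import random
import statistics


def fit_ridge_1d(xs, ys, penalty):
    """Closed-form ridge fit for a single predictor (intercept left unpenalized)."""
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / (sxx + penalty)  # a larger penalty shrinks the slope toward zero
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def cv_rmse(xs, ys, penalty, k=5):
    """Mean held-out RMSE across k folds; row i belongs to fold i % k."""
    fold_scores = []
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        test = [i for i in range(len(xs)) if i % k == fold]
        model = fit_ridge_1d([xs[i] for i in train], [ys[i] for i in train], penalty)
        sq_err = [(model(xs[i]) - ys[i]) ** 2 for i in test]
        fold_scores.append(statistics.fmean(sq_err) ** 0.5)
    return statistics.fmean(fold_scores)


# Synthetic data (not from any real study): y depends linearly on x plus noise.
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(60)]
ys = [3.0 + 2.0 * x + rng.gauss(0, 1.5) for x in xs]

# The "search space": one hyperparameter, six candidate values (hypothetical grid).
penalty_grid = [0.0, 0.1, 1.0, 10.0, 100.0, 1000.0]
results = {penalty: cv_rmse(xs, ys, penalty) for penalty in penalty_grid}
best_penalty = min(results, key=results.get)

for penalty, score in results.items():
    print(f"penalty={penalty:>7}: CV RMSE = {score:.3f}")
print("selected penalty:", best_penalty)
```

The loop selects whichever penalty scores best on held-out folds. It has no way of knowing whether a ridge model, or this outcome variable, was the right choice to begin with.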

Large language models are adding another layer of assistance. An analyst can describe a modeling task in natural language — “build a logistic regression predicting customer churn using these five variables” — and an AI assistant like Claude Code can generate the R code, explain the output, and suggest improvements (Tornede et al. 2024).

However, AutoML and AI assistants do not eliminate the need for human judgment. They can optimize within a framework, but they cannot tell you whether the framework is right. Is churn the right outcome to predict, or should you be modeling customer lifetime value instead? Are the features ethically appropriate, or do they encode protected characteristics through proxies? Is the model interpretable enough for the stakeholders who will act on it?
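The proxy question can be given a rough first pass in code. The sketch below, in standard-library Python, flags any candidate feature whose correlation with a protected attribute exceeds a cutoff. Everything in it is an assumption for illustration: the data are synthetic, the feature names (neighborhood_index, monthly_usage) are hypothetical, and the 0.5 cutoff is arbitrary. A simple correlation screen can miss nonlinear or combined proxies, so it supports the analyst's judgment rather than replacing it.

```python
import random


def pearson(a, b):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


rng = random.Random(7)

# Synthetic protected attribute: group membership coded 0/1 for 200 customers.
protected = [i % 2 for i in range(200)]

# Hypothetical candidate features. The first is built to track the protected
# attribute closely; the second is generated independently of it.
features = {
    "neighborhood_index": [2.0 * g + rng.gauss(0, 0.5) for g in protected],
    "monthly_usage": [rng.gauss(50, 10) for _ in protected],
}

THRESHOLD = 0.5  # arbitrary cutoff chosen for this illustration

correlations = {name: pearson(values, protected) for name, values in features.items()}
flagged = [name for name, r in correlations.items() if abs(r) >= THRESHOLD]

for name, r in correlations.items():
    print(f"{name}: correlation with protected attribute = {r:+.2f}")
print("flagged as possible proxies:", flagged)
```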

The principles covered in this chapter — parsimony, generalizability, validation, and ethical responsibility — are the lens through which AI-generated models must be evaluated. AI accelerates the mechanics of modeling; human judgment determines whether the model serves its intended purpose.

Example: AI-Assisted Model Building

Prompt to Claude Code: Using the mtcars dataset, build a model to predict mpg. Try a few approaches and tell me which one generalizes best.

The AI might generate code that fits linear regression, ridge regression, and a random forest, evaluates each with cross-validation, and reports comparative RMSE values. The analyst’s job is to ask: Does the best-performing model make sense for my audience? Can I explain it to a non-technical stakeholder? Are the features it relies on available in the production environment?
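A comparison of this kind can be sketched as follows. This is not the code an assistant would actually return. It is a simplified, standard-library Python illustration under stated assumptions: the data are a synthetic stand-in loosely shaped like a weight versus fuel-economy relationship (not the mtcars values), and three simple models (a mean-only baseline, a one-predictor linear regression, and a 3-nearest-neighbor average) stand in for the linear, ridge, and random forest models named above.

```python
import random
import statistics


def fit_mean(xs, ys):
    """Baseline: always predict the training mean."""
    mean_y = statistics.fmean(ys)
    return lambda x: mean_y


def fit_linear(xs, ys):
    """Ordinary least squares with one predictor."""
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return lambda x: y_bar + slope * (x - x_bar)


def fit_knn(xs, ys, k=3):
    """Average the outcomes of the k training points closest to x."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return statistics.fmean([y for _, y in nearest])

    return predict


def cross_validated_rmse(fit, xs, ys, folds=4):
    """Pool squared errors over all held-out rows, then take the root mean."""
    sq_err = []
    for fold in range(folds):
        train = [i for i in range(len(xs)) if i % folds != fold]
        test = [i for i in range(len(xs)) if i % folds == fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        sq_err.extend((model(xs[i]) - ys[i]) ** 2 for i in test)
    return statistics.fmean(sq_err) ** 0.5


# Synthetic stand-in data (NOT mtcars): 32 cars, heavier cars get lower mpg.
rng = random.Random(2024)
weight = [rng.uniform(1.5, 5.5) for _ in range(32)]
mpg = [37.0 - 5.0 * w + rng.gauss(0, 2.0) for w in weight]

candidates = {"mean-only": fit_mean, "linear": fit_linear, "3-nearest": fit_knn}
scores = {name: cross_validated_rmse(fit, weight, mpg) for name, fit in candidates.items()}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name:>10}: CV RMSE = {score:.2f}")
print("lowest cross-validated RMSE:", best)
```

The output is a ranking by held-out error and nothing more. Whether the top-ranked model can be explained to stakeholders, or relies on inputs that will exist in production, remains the analyst's call.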