7.4 Design Principles

Good visualization design is about communicating information clearly, not decorating it. Edward Tufte’s principle of maximizing the data-ink ratio (the proportion of a graphic’s ink devoted to displaying data) remains the cornerstone: every visual element should convey data, and anything that does not, which Tufte calls chartjunk, should be removed (Tufte 2001).

7.4.1 Key Guidelines

Simplify. Remove non-essential graphics, heavy grid lines, and decorative borders. Every element should earn its place by conveying data. Compare theme_gray() (ggplot2’s default, with a grey background and white grid lines) to theme_minimal() (no background fill, light grey grid lines). The minimal theme puts the data front and center, because the grey background adds no information.
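As a minimal sketch of that comparison, the snippet below draws the same scatter plot under both themes. The choice of the built-in mtcars data set, the axis labels, and the object names (p, p_gray, p_minimal, p_clean) are illustrative assumptions rather than part of the original text.

```r
library(ggplot2)

# Base scatter plot on the built-in mtcars data (an illustrative choice).
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(x = "Weight (1,000 lb)", y = "Miles per gallon")

# Same data, two themes.
p_gray    <- p + theme_gray()     # ggplot2 default: grey panel, white grid lines
p_minimal <- p + theme_minimal()  # no background fill, light grid lines

# Optional further simplification: drop the minor grid lines as well.
p_clean <- p_minimal + theme(panel.grid.minor = element_blank())
```

Printing p_gray and p_minimal side by side makes the difference easy to judge. Whether to also remove the minor grid lines, as in p_clean, depends on how precisely readers need to read values off the axes.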

Use color purposefully. Color should encode meaning, not decorate. Use it to distinguish categories, highlight a key data point, or show a gradient, and use it consistently across related charts. Avoid using more than five to seven colors in a single chart; beyond that, the distinctions become difficult to perceive.
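One way to highlight a key data point is to color a single bar and leave the rest neutral. The sketch below assumes a small invented data set (sales) and arbitrary hex colors; none of these names or values come from the original text.

```r
library(ggplot2)

# Hypothetical revenue figures, invented purely for illustration.
sales <- data.frame(
  region  = c("North", "South", "East", "West"),
  revenue = c(42, 57, 38, 49)
)

# Flag the one bar the reader should notice; all other bars stay neutral.
sales$highlight <- sales$region == "South"

p_color <- ggplot(sales, aes(x = region, y = revenue, fill = highlight)) +
  geom_col() +
  # Map the logical flag to an accent color and a muted grey; hide the legend,
  # since the flag itself carries no information the reader needs to decode.
  scale_fill_manual(values = c(`TRUE` = "#D55E00", `FALSE` = "grey70"),
                    guide = "none") +
  labs(x = NULL, y = "Revenue ($M)") +
  theme_minimal()
```

Here color carries exactly one message (which region matters), which keeps the chart well under the five-to-seven color limit.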

Label directly. Where possible, label data points or lines directly rather than relying on a separate legend. This eliminates the back-and-forth between the data and the legend, reducing cognitive load. In ggplot2, the geom_text() and annotate() functions support direct labeling.
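A possible way to label lines directly with geom_text() is sketched below. The data frame trend, its values, and the helper line_ends are hypothetical and exist only to make the example runnable.

```r
library(ggplot2)

# Hypothetical yearly values for two product lines (illustrative data only).
trend <- data.frame(
  year    = rep(2019:2023, times = 2),
  product = rep(c("Basic", "Premium"), each = 5),
  units   = c(10, 12, 13, 15, 18, 6, 9, 13, 18, 24)
)

# One row per series, taken at the final year, to position the labels.
line_ends <- trend[trend$year == max(trend$year), ]

p_labels <- ggplot(trend, aes(x = year, y = units, colour = product)) +
  geom_line() +
  # Place each series name just to the right of its last point.
  geom_text(data = line_ends, aes(label = product),
            hjust = 0, nudge_x = 0.1) +
  # Extend the x-axis so the labels have room on the right.
  scale_x_continuous(limits = c(2019, 2024.5), breaks = 2019:2023) +
  theme_minimal() +
  # The direct labels make the legend redundant.
  theme(legend.position = "none")
```

With many series, end-of-line labels can overlap; in that case a legend, or labeling only the series of interest, may be the better trade-off.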

Design for accessibility. Use colorblind-friendly palettes (e.g., scale_fill_viridis_d() in ggplot2), high-contrast color schemes, and legible font sizes. Approximately 8% of men and about 0.5% of women have some form of color vision deficiency, so a visualization that relies solely on red-green distinctions excludes a significant portion of your audience.
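The following sketch applies scale_fill_viridis_d() together with a larger base font. It uses the mpg data set that ships with ggplot2; the variable choices, the end = 0.9 setting, and the font size of 14 are assumptions made for illustration.

```r
library(ggplot2)

# Stacked bar chart of vehicle classes, filled by drive train.
p_access <- ggplot(mpg, aes(x = class, fill = drv)) +
  geom_bar() +
  # Discrete viridis palette; end = 0.9 trims the lightest yellow, which can
  # be hard to see against a white background.
  scale_fill_viridis_d(name = "Drive train", end = 0.9) +
  labs(x = "Vehicle class", y = "Number of models") +
  # A larger base font size improves legibility on slides and small screens.
  theme_minimal(base_size = 14)
```

The viridis palette varies in lightness as well as hue, so the categories remain distinguishable in greyscale and for most forms of color vision deficiency.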

Balance aesthetics and function. An attractive chart that misleads is worse than an ugly chart that informs. Design choices should enhance comprehension, not compete with the data.

7.4.2 Common Visualization Pitfalls

Even well-intentioned visualizations can mislead. Watch for these common problems:

  • Truncated axes: Starting a y-axis at a value other than zero can exaggerate small differences. A bar chart showing revenue of $99M vs. $101M looks dramatic if the axis starts at $98M: the second bar is drawn three times as tall as the first, although the actual difference is only about 2%.
  • Dual y-axes: Charts with two different y-axis scales can imply a relationship between variables that does not exist. The scales can be manipulated to make any two trends appear correlated.
  • 3D effects: Three-dimensional bars, pie charts, and other 3D effects distort proportions and make accurate comparison difficult. Prefer 2D charts unless the data are genuinely three-dimensional.
  • Cherry-picked time ranges: Selecting a start date that coincides with a trough (or peak) can make a trend look more dramatic than it is. Always consider whether the time range tells the full story.
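The truncated-axis example from the list above can be checked numerically and reproduced in ggplot2. This is a sketch under stated assumptions: the data frame revenue, the labels "A" and "B", and the axis limits of 98 to 102 are hypothetical, chosen to match the $99M vs. $101M figures.

```r
library(ggplot2)

# Hypothetical revenue for two units, matching the figures in the text.
revenue <- data.frame(unit = c("A", "B"), millions = c(99, 101))

# Actual relative difference between the two values (about 2%).
pct_diff <- (101 - 99) / 99 * 100

# Apparent bar-height ratio when the axis starts at 98 instead of 0.
ratio_truncated <- (101 - 98) / (99 - 98)  # 3: bar B looks three times as tall
ratio_zero      <- 101 / 99                # about 1.02 with a zero baseline

p_base <- ggplot(revenue, aes(x = unit, y = millions)) +
  geom_col() +
  labs(x = NULL, y = "Revenue ($M)") +
  theme_minimal()

# Misleading version: coord_cartesian() zooms the view to 98-102
# without removing any data, so only the tops of the bars are visible.
p_truncated <- p_base + coord_cartesian(ylim = c(98, 102))

# Honest version: keep the zero baseline explicit.
p_zero <- p_base + expand_limits(y = 0)
```

Comparing p_truncated with p_zero shows how the same two numbers can look either dramatically different or nearly identical. If small differences are the point of the chart, plotting the difference itself, or using points rather than bars, is usually a safer choice than truncating a bar axis.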