7.9 Glossary of Terms
Aesthetic: A visual property of a plot element (e.g., position, color, shape, size) that is mapped to a variable in the data using aes() in ggplot2.
Box Plot: A chart that displays the distribution of a continuous variable through its quartiles, showing the median, interquartile range, and outliers. Useful for comparing distributions across categories.
Chart Junk: Non-essential visual elements in a chart — decorative graphics, heavy grid lines, unnecessary labels — that do not convey data and distract from the message.
Composition: A data relationship showing how parts contribute to a whole. Visualized with stacked bar charts or pie charts.
Correlation: A data relationship showing the association between two variables. Visualized with scatter plots.
Distribution: The spread of data points across a range of values. Visualized with histograms and box plots.
Explanatory Visualization: A visualization designed to communicate a specific finding to an audience. Emphasizes clarity, simplicity, and narrative impact.
Exploratory Visualization: A visualization created during analysis to discover patterns, test hypotheses, and identify anomalies. Emphasizes speed and flexibility over polish.
Faceting: The technique of creating multiple small plots (small multiples) from subsets of the data, enabling visual comparison across categories.
Geom: A geometric object in ggplot2 that represents data visually, such as points (geom_point()), lines (geom_line()), bars (geom_bar()), or boxes (geom_boxplot()).
ggplot2: An R package implementing the Grammar of Graphics, enabling the construction of complex, layered visualizations through a consistent and composable syntax.
Grammar of Graphics: A framework developed by Leland Wilkinson that decomposes statistical graphics into fundamental components (data, aesthetics, geoms, scales, facets, themes), enabling systematic and flexible visualization construction.
Heat Map: A visualization that uses color intensity to represent values across two dimensions, useful for identifying clusters and gradients in large datasets.
Histogram: A chart that displays the distribution of a continuous variable by dividing the data into bins and showing the frequency of values in each bin.
Data-Ink Ratio: Edward Tufte’s principle that the proportion of ink in a graphic devoted to displaying data should be maximized. Visual elements that do not convey data (chart junk) should be removed.
Scale: A ggplot2 component that controls how data values are mapped to aesthetic properties, for example mapping a variable to a color gradient or transforming an axis to a logarithmic scale.
Plotly: An R package (and cross-language library) that adds interactivity to visualizations, such as hover tooltips, zooming, and panning. The ggplotly() function converts most ggplot2 plots into interactive versions.
Scatter Plot: A chart that plots individual data points on two continuous axes, used to visualize correlation between variables.
Tableau: A widely used GUI-based visualization platform for creating interactive dashboards and explanatory visualizations in business settings.
Theme: A ggplot2 component that controls non-data visual elements such as background color, grid lines, font sizes, and axis styling.
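
As a quick reference, the ggplot2 terms defined in this glossary (aesthetic, geom, scale, faceting, theme) can be seen working together in a single layered plot. The sketch below is illustrative only: the mpg dataset (bundled with ggplot2) and the choice of variables are assumptions made for the example, not something taken from the chapter.

```r
# Illustrative sketch: the dataset and variable choices are assumptions.
library(ggplot2)

p <- ggplot(mpg, aes(x = displ, y = hwy, color = class)) + # aesthetics: map variables to position and color
  geom_point() +        # geom: draw one point per observation
  scale_x_log10() +     # scale: put the x axis on a logarithmic scale
  facet_wrap(~ drv) +   # faceting: one small plot per drive type
  theme_minimal()       # theme: styling of non-data elements
```

Printing p (for example with print(p)) renders the plot. Each line after the first adds one component, so removing a line such as facet_wrap(~ drv) shows what that component contributes.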