3.15 Glossary of Terms

  1. AI Coding Assistant: A software tool that uses artificial intelligence to help write, debug, and explain code. AI coding assistants include GUI-based tools (ChatGPT, Claude web) and CLI- or editor-based tools (Claude Code, GitHub Copilot) that generate code from natural language descriptions.

  2. Argument: A value passed to a function, either as the data to operate on or as an option that controls its behavior. For example, in mean(x, na.rm = TRUE), x and na.rm = TRUE are both arguments.

  3. Assignment Operator (<-): The symbol used in R to assign a value to a variable. For example, x <- 10 stores the value 10 in the variable x.

  4. Comment: Text in code that follows a # symbol and is ignored by R through the end of the line. Comments are used to document and explain code.

  5. Conditional Statement: A programming construct (e.g., if-else) that executes different blocks of code depending on whether a condition is true or false.

  6. Console: The panel in RStudio where R commands are executed and output is displayed. Commands can be typed directly into the console or sent from a script.

  7. CRAN (Comprehensive R Archive Network): The official repository for R packages, hosting thousands of community-contributed extensions to R.

  8. Data Frame: R’s primary data structure for tabular data. A data frame is a collection of vectors of equal length, where each vector represents a column.

  9. Factor: A data type in R used to represent categorical variables. Factors store both the values and the predefined levels (categories) of the variable.

  10. Formula: An R expression using the ~ symbol to specify relationships between variables, commonly used in statistical modeling (e.g., y ~ x1 + x2).

  11. Function: A reusable block of code that takes inputs (arguments), performs operations, and returns a result. R has thousands of built-in functions, and users can define their own.

  12. IDE (Integrated Development Environment): A software application that provides tools for writing, running, and debugging code. RStudio is a widely used IDE for R.

  13. List: A flexible data structure in R that can hold elements of different data types, including vectors, data frames, and other lists.

  14. Loop: A programming construct that repeats a block of code multiple times. R supports for, while, and repeat loops.

  15. NA (Not Available): R’s representation of a missing value. Many functions return NA when input contains missing values unless explicitly told to ignore them (e.g., na.rm = TRUE).

  16. Package: A collection of R functions, data, and documentation that extends R’s capabilities. Packages are installed from CRAN using install.packages() and loaded with library().

  17. Pipe Operator (|>): An operator that passes the result of one expression as the first argument to the next function, enabling readable left-to-right chains of operations. Built into R 4.1 and later.

  18. Posit Cloud: A cloud-based platform that provides browser access to RStudio without requiring local installation. Formerly called RStudio Cloud.

  19. R: A programming language and software environment for statistical computing and graphics, widely used in data science, business intelligence, and academic research.

  20. RStudio: An integrated development environment (IDE) for R that provides a user-friendly interface with panels for writing scripts, viewing output, managing files, and inspecting data.

  21. Script: A file containing a sequence of R commands that can be executed together. Scripts make analyses reproducible and shareable.

  22. Variable: A named storage location in R that holds a value. Variables can store numbers, text, logical values, vectors, data frames, and other objects.

  23. Vector: An ordered sequence of elements of the same data type. Vectors are R’s most basic data structure and are commonly created with the c() function.

  24. Workspace: R’s current working environment, where all variables, functions, and datasets created during a session are stored. Use ls() to list the objects in the workspace and rm() to remove them.
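
The short R sketch below ties several of the terms above together in one script: assignment, vectors, NA, arguments, factors, data frames, lists, functions, conditional statements, loops, the pipe operator, formulas, and the workspace. It is only an illustration. The object names (scores, group, df, info, label_score) and the numbers are made up for this example, and the pipe line assumes R 4.1 or later.

```r
# Illustrative sketch only: all object names and values here are hypothetical.

# Assignment operator and vector: c() combines values of one type.
scores <- c(82, 91, NA, 75)          # NA marks a missing value

# Arguments: scores is the data, na.rm = TRUE tells mean() to skip NA.
avg <- mean(scores, na.rm = TRUE)    # 82.66667

# Factor: a categorical variable with predefined levels.
group <- factor(c("a", "b", "a", "b"), levels = c("a", "b"))

# Data frame: equal-length vectors stored as columns.
df <- data.frame(score = scores, group = group)

# List: elements of different types, including a data frame.
info <- list(name = "demo", values = scores, table = df)

# Function with a conditional statement (if / else if / else).
label_score <- function(x, cutoff = 80) {
  if (is.na(x)) {
    "missing"
  } else if (x >= cutoff) {
    "high"
  } else {
    "low"
  }
}

# Loop: apply the function to each element in turn.
labels <- character(length(scores))
for (i in seq_along(scores)) {
  labels[i] <- label_score(scores[i])
}
# labels is now "high" "high" "missing" "low"

# Pipe operator (R 4.1+): the left-hand result becomes the first argument.
rounded <- scores |> mean(na.rm = TRUE) |> round(1)

# Formula: score ~ group reads as "score by group".
# The formula interface of aggregate() drops rows with NA by default.
group_means <- aggregate(score ~ group, data = df, FUN = mean)

# Workspace: list the objects created so far, then remove one.
ls()
rm(info)
```

Saving these lines in a script file and running them from the RStudio console makes the example reproducible. The lines beginning with # are comments and are ignored by R.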