8 Generating Tables in APA Style

When presenting data in scientific papers, following the table guidelines of the American Psychological Association (APA) keeps results clear and consistent. In this section, we’ll demonstrate how to generate APA-style tables in R using kable() and the kableExtra package.

8.1 Summary Statistics

First, let’s calculate summary statistics for the twinData dataset (loaded below after attaching OpenMx) and present them in an APA-style table.

# Load necessary libraries
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(kableExtra)
## 
## Attaching package: 'kableExtra'
## 
## The following object is masked from 'package:dplyr':
## 
##     group_rows
library(NlsyLinks)
library(discord)
library(BGmisc)
library(OpenMx)
## 
## Attaching package: 'OpenMx'
## 
## The following object is masked from 'package:BGmisc':
## 
##     vech
library(conflicted) # to handle conflicts
conflicted::conflicts_prefer(OpenMx::vech,dplyr::filter) # Resolve conflicts
## [conflicted] Will prefer OpenMx::vech over any other package.
## [conflicted] Will prefer dplyr::filter over any other package.
data(twinData)

# Drop the original age column so it does not clash with the age column created from age1 and age2
df_long <- twinData %>% select(-age)
# Convert wide data to long form
df_long <- df_long %>%
  pivot_longer(
    cols = matches('1$|2$'), # Select columns ending in '1' or '2'
    cols_vary = "slowest", # Keep rows from the same input columns together in the output
    names_to = c(".value", "twin"), # Split the column names into variable and twin number
    names_pattern = "(.*)(1|2)" # Capture the variable and twin number
  )
# Add 'sex' and 'zyg' columns based on 'zygosity'
df_long <- df_long %>%
  mutate(sex = case_when(
    zygosity %in% c("MZFF", "DZFF") ~ "F",
    zygosity %in% c("MZMM", "DZMM") ~ "M",
    TRUE ~ "OS"
  ),
  zyg = case_when(
    zygosity %in% c("MZFF", "MZMM") ~ "MZ",
    zygosity %in% c("DZFF", "DZMM", "DZOS") ~ "DZ",
    TRUE ~ NA_character_
  ))
# Calculate summary statistics
summary_stats_long <- df_long %>%
  summarise(across(where(is.numeric), list(
    Mean = ~mean(., na.rm = TRUE),
    SD = ~sd(., na.rm = TRUE),
    Median = ~median(., na.rm = TRUE),
    Min = ~min(., na.rm = TRUE),
    Max = ~max(., na.rm = TRUE),
    IQR = ~IQR(., na.rm = TRUE)
  ), .names = "{col}_{fn}")) %>%
  pivot_longer(
    cols = everything(),
    names_to = c("Variable", "Statistic"),
    names_sep = "_"
  ) %>%
  pivot_wider(
    names_from = Statistic,
    values_from = value
  )
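
To see what the ".value" sentinel does in pivot_longer(), here is a minimal sketch on a hypothetical two-family dataset (toy_wide and its values are invented for illustration and are not taken from twinData). Each column name is split by names_pattern into a measure name, which becomes an output column, and a twin number, which is stored in the twin column.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: one row per twin pair, measures suffixed with 1 or 2
toy_wide <- tibble::tibble(
  fam = c(1, 2),
  wt1 = c(60, 70),
  wt2 = c(62, 75),
  ht1 = c(1.60, 1.75),
  ht2 = c(1.62, 1.80)
)

toy_long <- toy_wide %>%
  pivot_longer(
    cols = matches("[12]$"),          # wt1, wt2, ht1, ht2
    names_to = c(".value", "twin"),   # ".value" turns the first capture group into column names
    names_pattern = "(.*)(1|2)"       # e.g. "wt1" -> measure "wt", twin "1"
  )

print(toy_long)
```

Each family now contributes two rows, one per twin, with wt and ht as ordinary columns and twin stored as a character label.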

Now, let’s create an APA-style table for these summary statistics.

# Generate APA-style table
summary_stats_long %>%
  kable(caption = "Summary Statistics for Twin Data",
        col.names = c("Variable", "Mean", "SD", "Median", "Min", "Max", "IQR"),
        format = "html") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"),
                full_width = FALSE,
                position = "left") %>%
  add_header_above(c(" " = 1, "Summary Statistics" = 6)) %>%
  footnote(general = "Note. This table presents the summary statistics for the variables in the twin dataset.",
           general_title = " ",
           footnote_as_chunk = TRUE)
Table 8.1: Summary Statistics for Twin Data

          Summary Statistics
Variable         Mean            SD     Median      Min        Max        IQR
fam       1904.500000  1099.3470505  1904.5000   1.0000  3808.0000  1903.5000
part         1.933298     0.2648394     2.0000   0.0000     2.0000     0.0000
wt          63.881744    11.7100387    62.0000  34.0000   127.0000    16.0000
ht           1.677868     0.0958175     1.6799   1.3398     1.9900     0.1501
htwt        22.612309     3.1794125    22.2041  13.2964    46.2493     3.8303
bmi         21.764600     0.9408264    21.7019  18.1125    26.8383     1.2029
age         34.453494    14.1704368    30.0000  17.0000    88.0000    19.0000

Note. This table presents the summary statistics for the variables in the twin dataset.

The table above reports the mean, standard deviation, median, minimum, maximum, and interquartile range for each numeric variable in the twin dataset. Because fam is a family identifier rather than a measurement, its summary statistics are not substantively meaningful.
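
APA style generally reports descriptive statistics rounded to about two decimal places, whereas the table above prints the raw values. One way to round, sketched here under the assumption that two decimals suit every column, is the digits argument of kable(). The data frame toy_stats is a hypothetical stand-in that copies the wt and ht means and standard deviations from the table above, and format = "pipe" is used only so that the result is readable in the console.

```r
library(knitr)

# Hypothetical stand-in for two rows of the summary table (values copied from Table 8.1)
toy_stats <- data.frame(
  Variable = c("wt", "ht"),
  Mean = c(63.881744, 1.677868),
  SD = c(11.7100387, 0.0958175)
)

# digits = 2 rounds every numeric column before the table is formatted
apa_rounded <- kable(
  toy_stats,
  digits = 2,
  format = "pipe",
  caption = "Summary Statistics for Twin Data"
)

print(apa_rounded)
```

The same digits argument could be added to the kable() call used earlier in this section, in which case the HTML table would be rounded in the same way.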

8.2 Frequency Tables

Next, we’ll create frequency tables for categorical variables such as zygosity and sex, and format them in APA style.

# Calculate frequency tables
frequency_tables <- df_long %>%
  select(zyg, sex) %>%
  pivot_longer(cols = everything(), names_to = "Variable", values_to = "Category") %>%
  group_by(Variable, Category) %>%
  # Keep the grouping by Variable so percentages are computed within each variable
  summarise(Count = n(), .groups = "drop_last") %>%
  mutate(Percentage = round((Count / sum(Count)) * 100, 2)) %>%
  ungroup()
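
The percentage step relies on a detail of summarise(): after group_by(Variable, Category), it drops only the last grouping level by default, so the result is still grouped by Variable and sum(Count) is a per-variable total. A minimal sketch on hypothetical data (toy and its values are invented for illustration) makes this explicit by passing .groups = "drop_last":

```r
library(dplyr)

# Hypothetical toy data in the same long layout: one row per observation
toy <- tibble::tibble(
  Variable = c("sex", "sex", "sex", "zyg", "zyg", "zyg"),
  Category = c("F", "F", "M", "MZ", "DZ", "DZ")
)

toy_freq <- toy %>%
  group_by(Variable, Category) %>%
  summarise(Count = n(), .groups = "drop_last") %>%            # result is still grouped by Variable
  mutate(Percentage = round(Count / sum(Count) * 100, 2)) %>%  # sum(Count) is computed within each Variable
  ungroup()

print(toy_freq)
```

Within each variable the percentages add up to 100 (up to rounding). Had the grouping been dropped entirely before mutate(), each percentage would instead be a share of all rows across both variables.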

Now, let’s format these frequencies as an APA-style table.