8 Generating Tables in APA Style

When presenting data in scientific papers, following the guidelines of the American Psychological Association (APA) helps ensure clarity and consistency. In this section, we’ll demonstrate how to generate APA-style tables in R using kable() and the kableExtra package.

8.1 Summary Statistics

First, let’s calculate summary statistics for the twin data and present them in an APA-style table.

# Load necessary libraries
library(tidyverse)

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(kableExtra)

## 
## Attaching package: 'kableExtra'
## 
## The following object is masked from 'package:dplyr':
## 
##     group_rows

library(NlsyLinks)
library(discord)
library(BGmisc)
library(OpenMx)

## 
## Attaching package: 'OpenMx'
## 
## The following object is masked from 'package:BGmisc':
## 
##     vech

library(conflicted) # to handle conflicts
conflicted::conflicts_prefer(OpenMx::vech, dplyr::filter) # Resolve conflicts

## [conflicted] Will prefer OpenMx::vech over any other package.
## [conflicted] Will prefer dplyr::filter over any other package.

data(twinData)

df_long <- twinData %>% select(-age)

# Convert wide data to long form
df_long <- df_long %>%
  pivot_longer(
    cols = matches('1$|2$'), # Select columns ending in '1' or '2'
    cols_vary = "slowest", # Keep rows from the same source columns together (twin 1 rows, then twin 2 rows)
    names_to = c(".value", "twin"), # Split the column names into variable and twin number
    names_pattern = "(.*)(1|2)" # Capture the variable and twin number
  )
# Add 'sex' and 'zyg' columns based on 'zygosity'
df_long <- df_long %>%
  mutate(sex = case_when(
    zygosity %in% c("MZFF", "DZFF") ~ "F",
    zygosity %in% c("MZMM", "DZMM") ~ "M",
    TRUE ~ "OS"
  ),
  zyg = case_when(
    zygosity %in% c("MZFF", "MZMM") ~ "MZ",
    zygosity %in% c("DZFF", "DZMM", "DZOS") ~ "DZ",
    TRUE ~ NA_character_
  ))
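
The wide-to-long step above relies on the special ".value" entry in names_to. As a minimal sketch of how it behaves, the following uses a small made-up data frame; toy_wide, toy_long, and all values in them are hypothetical and are not taken from twinData.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide data: one row per family, one column per variable and twin
toy_wide <- tibble(
  fam = c(1, 2),
  wt1 = c(60, 70), wt2 = c(62, 75),
  ht1 = c(1.60, 1.75), ht2 = c(1.62, 1.80)
)

toy_long <- toy_wide %>%
  pivot_longer(
    cols = matches("1$|2$"),         # columns ending in 1 or 2
    names_to = c(".value", "twin"),  # ".value" keeps "wt" and "ht" as column names
    names_pattern = "(.*)(1|2)"      # group 1: variable name, group 2: twin number
  )

toy_long
```

Under these assumptions, the result has one row per twin (four rows here), with a twin column holding "1" or "2" and one column each for wt and ht.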

# Calculate summary statistics
summary_stats_long <- df_long %>%
  summarise(across(where(is.numeric), list(
    Mean = ~mean(., na.rm = TRUE),
    SD = ~sd(., na.rm = TRUE),
    Median = ~median(., na.rm = TRUE),
    Min = ~min(., na.rm = TRUE),
    Max = ~max(., na.rm = TRUE),
    IQR = ~IQR(., na.rm = TRUE)
  ), .names = "{.col}_{.fn}")) %>%
  pivot_longer(
    cols = everything(),
    names_to = c("Variable", "Statistic"),
    names_sep = "_"
  ) %>%
  pivot_wider(
    names_from = Statistic,
    values_from = value
  )
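
The summarise-then-reshape pattern above can be easier to follow on a tiny example. The sketch below uses a made-up data frame (toy and its values are hypothetical) and only two statistics. It also makes one assumption explicit: splitting on "_" works only while no variable name contains an underscore itself.

```r
library(dplyr)
library(tidyr)

# Hypothetical data with two numeric variables
toy <- tibble(
  wt = c(60, 62, 70, 75),
  ht = c(1.60, 1.62, 1.75, 1.80)
)

toy_stats <- toy %>%
  # Step 1: one wide row with columns wt_Mean, wt_SD, ht_Mean, ht_SD
  summarise(across(everything(), list(
    Mean = ~mean(.x, na.rm = TRUE),
    SD   = ~sd(.x, na.rm = TRUE)
  ), .names = "{.col}_{.fn}")) %>%
  # Step 2: split each column name into a variable and a statistic
  pivot_longer(
    cols = everything(),
    names_to = c("Variable", "Statistic"),
    names_sep = "_"   # assumes variable names contain no underscore
  ) %>%
  # Step 3: one row per variable, one column per statistic
  pivot_wider(names_from = Statistic, values_from = value)

toy_stats
```

If a dataset did include names such as body_wt, a separator that cannot occur in the names (for example "__" in both .names and names_sep) would be one way to keep the split unambiguous.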

Now, let’s create an APA-style table for these summary statistics.

# Generate APA-style table
summary_stats_long %>%
  kable(caption = "Summary Statistics for Twin Data",
        col.names = c("Variable", "Mean", "SD", "Median", "Min", "Max", "IQR"),
        format = "html") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"),
                full_width = FALSE,
                position = "left") %>%
  add_header_above(c(" " = 1, "Summary Statistics" = 6)) %>%
  footnote(general = "Note. This table presents the summary statistics for the variables in the twin dataset.",
           general_title = " ",
           footnote_as_chunk = TRUE)

Table 8.1: Summary Statistics for Twin Data
	Summary Statistics
Variable	Mean	SD	Median	Min	Max	IQR
fam	1904.500000	1099.3470505	1904.5000	1.0000	3808.0000	1903.5000
part	1.933298	0.2648394	2.0000	0.0000	2.0000	0.0000
wt	63.881744	11.7100387	62.0000	34.0000	127.0000	16.0000
ht	1.677868	0.0958175	1.6799	1.3398	1.9900	0.1501
htwt	22.612309	3.1794125	22.2041	13.2964	46.2493	3.8303
bmi	21.764600	0.9408264	21.7019	18.1125	26.8383	1.2029
age	34.453494	14.1704368	30.0000	17.0000	88.0000	19.0000
Note. This table presents the summary statistics for the variables in the twin dataset.

The table above reports the mean, standard deviation, median, minimum, maximum, and interquartile range for each numeric variable in the twin dataset. Note that fam is a family identifier, so its summary statistics are not substantively meaningful.
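
APA-style tables usually report values rounded to a small, consistent number of decimals and avoid shading or vertical rules. As a hedged sketch of one way to move closer to that look, the example below passes digits = 2 to kable() and uses kable_classic() from kableExtra in place of the striped Bootstrap styling. The data frame toy_stats and its values are made up for illustration, and two decimals is an assumption rather than a fixed APA rule.

```r
library(dplyr)       # loaded here only for the %>% pipe
library(kableExtra)

# Hypothetical summary values, for illustration only
toy_stats <- data.frame(
  Variable = c("wt", "ht"),
  Mean = c(63.8817, 1.6779),
  SD = c(11.7100, 0.0958)
)

apa_table <- toy_stats %>%
  kable(format = "html",
        digits = 2,   # round numeric columns to two decimals
        caption = "Summary Statistics (Illustrative Values)") %>%
  kable_classic(full_width = FALSE) %>%   # horizontal rules only, no striping
  footnote(general = "Values are rounded to two decimals.",
           general_title = "Note.",
           footnote_as_chunk = TRUE)

apa_table
```

The same digits argument could be added to the kable() call used for the twin data, so that every column shows the same precision.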

8.2 Frequency Tables

Next, we’ll create frequency tables for categorical variables such as zygosity and sex, and format them in APA style.

# Calculate frequency tables
frequency_tables <- df_long %>%
  select(zyg, sex) %>%
  pivot_longer(cols = everything(), names_to = "Variable", values_to = "Category") %>%
  group_by(Variable, Category) %>%
  summarise(Count = n()) %>%
  mutate(Percentage = round((Count / sum(Count)) * 100, 2)) %>%
  ungroup()

## `summarise()` has grouped output by 'Variable'. You can override using the
## `.groups` argument.
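
The message above is informational: after summarise(), only the last grouping variable (Category) is dropped, so the result is still grouped by Variable. That is what makes sum(Count) a per-variable total in the mutate() step. A minimal sketch on made-up data (toy and its values are hypothetical) states this choice explicitly with .groups = "drop_last", which also silences the message.

```r
library(dplyr)

# Hypothetical long-format data: two variables with their categories stacked
toy <- tibble(
  Variable = c("sex", "sex", "sex", "zyg", "zyg", "zyg"),
  Category = c("F", "F", "M", "MZ", "DZ", "DZ")
)

toy_freq <- toy %>%
  group_by(Variable, Category) %>%
  summarise(Count = n(), .groups = "drop_last") %>%   # still grouped by Variable
  mutate(Percentage = round(Count / sum(Count) * 100, 2)) %>%   # share within each Variable
  ungroup()

toy_freq
```

With this grouping, the percentages sum to roughly 100 within each variable rather than across the whole table.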

Next, let’s format these frequencies as an APA-style table.