Introduction
Unlike Tolstoy, where only happy families are alike, all
pedigrees are alike – or at least, all simulated pedigrees are alike.
The simulatePedigree function generates a pedigree with a
user-specified number of generations and individuals per generation.
This function provides users the opportunity to test family models in
pedigrees with a customized pedigree length and width.
These pedigrees can be simulated as a function of several parameters, including the number of children per mate, generations, sex ratio of newborns, and mating rate. Given that large family pedigrees are difficult to collect or access, simulated pedigrees serve as an efficient tool for researchers. These simulated pedigrees are useful for building family-based statistical models, and evaluating their statistical properties, such as power, bias, and computational efficiency.
To illustrate this functionality, let us generate a pedigree. This
pedigree has a total of four generations (Ngen), in which each person
who “mates” grows a family with four offspring (kpc). In our scenario,
the number of male and female newborns is equal, but this ratio can be
adjusted via sexR. In this illustration, 70% of individuals will mate
and bear offspring (marR). Such a pedigree structure can be simulated
by running the following code:
## Loading Required Libraries
library(BGmisc)
set.seed(5)
df_ped <- simulatePedigree(
  kpc = 4,
  Ngen = 4,
  sexR = .5,
  marR = .7
)
summary(df_ped)
#>      fam                  ID              gen            dadID
#>  Length:57          Min.   : 10011   Min.   :1.000   Min.   : 10012
#>  Class :character   1st Qu.: 10036   1st Qu.:3.000   1st Qu.: 10024
#>  Mode  :character   Median :100312   Median :3.000   Median : 10037
#>                     Mean   : 59171   Mean   :3.298   Mean   : 42859
#>                     3rd Qu.:100416   3rd Qu.:4.000   3rd Qu.:100311
#>                     Max.   :100432   Max.   :4.000   Max.   :100320
#>                                                      NA's   :13
#>      momID             spID            sex
#>  Min.   : 10011   Min.   : 10011   Length:57
#>  1st Qu.: 10022   1st Qu.: 10025   Class :character
#>  Median : 10036   Median : 10036   Mode  :character
#>  Mean   : 42859   Mean   : 40124
#>  3rd Qu.:100316   3rd Qu.:100311
#>  Max.   :100318   Max.   :100320
#>  NA's   :13       NA's   :33
The simulation output is a data.frame with 57 rows and 7 columns. Each
row corresponds to a simulated individual.
df_ped[21, ]
#>      fam     ID gen dadID momID   spID sex
#> 21 fam 1 100312   3 10024 10022 100317   M
The columns represent, respectively, the individual’s family ID (fam), the individual’s personal ID (ID), the generation the individual is in (gen), the IDs of their father and mother (dadID, momID), the ID of their spouse (spID), and the biological sex of the individual (sex).
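As a quick way to get a feel for these columns, the sketch below tabulates a few of them with base R. It re-creates the pedigree from the example above so that it can be run on its own. The variable names n_no_parents and n_with_spouse are introduced here for illustration only. Treating a missing dadID and momID as "founder or married-in spouse" is an interpretation of the NA counts in the summary, not something stated by the package.

```r
library(BGmisc)

# Re-create the pedigree from the example above
set.seed(5)
df_ped <- simulatePedigree(kpc = 4, Ngen = 4, sexR = .5, marR = .7)

# Number of simulated individuals in each generation
table(df_ped$gen)

# Individuals with no recorded parents (assumed here to be founders
# and people who married into the family)
n_no_parents <- sum(is.na(df_ped$dadID) & is.na(df_ped$momID))

# Individuals with a recorded spouse
n_with_spouse <- sum(!is.na(df_ped$spID))

# Sex distribution across the whole pedigree
table(df_ped$sex)
```

Exact counts depend on the seed and possibly on the installed package version, so they are not listed here.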
Summarizing Pedigrees
summarizeFamilies(df_ped, famID = "fam")$family_summary
#>       fam count gen_mean gen_median gen_min gen_max    gen_sd spID_mean
#>    <char> <int>    <num>      <num>   <num>   <num>     <num>     <num>
#> 1:  fam 1    57 3.298246          3       1       4 0.8229935   40123.5
#>    spID_median spID_min spID_max  spID_sd
#>          <num>    <num>    <num>    <num>
#> 1:     10035.5    10011   100320 43476.96
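The gen_* columns in this table can be cross-checked with base R. The sketch below assumes they are plain summaries of df_ped$gen for the single simulated family. That is an inference from the output above, not a description of how summarizeFamilies computes them. The name base_summary is illustrative.

```r
library(BGmisc)

# Re-create the pedigree from the example above
set.seed(5)
df_ped <- simulatePedigree(kpc = 4, Ngen = 4, sexR = .5, marR = .7)

# Base R summaries of the generation column for the whole family
base_summary <- c(
  count      = nrow(df_ped),
  gen_mean   = mean(df_ped$gen),
  gen_median = median(df_ped$gen),
  gen_min    = min(df_ped$gen),
  gen_max    = max(df_ped$gen),
  gen_sd     = sd(df_ped$gen)
)
base_summary
```

If the assumption holds, these values should line up with the corresponding columns of family_summary.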
Plotting Pedigree
Pedigrees are visual diagrams that represent family relationships
across generations. They are commonly used in genetics to trace the
inheritance of specific traits or conditions. This vignette will guide
you through visualizing simulated pedigrees using the plotPedigree
function. This function is a wrapper for kinship2’s base R plotting.
Single Pedigree Visualization
To visualize a single simulated pedigree, use the plotPedigree()
function.
# Plot the simulated pedigree
plotPedigree(df_ped)
#> Did not plot the following people: 10032
#> $plist
#> $plist$n
#> [1] 2 7 19 28
#>
#> $plist$nid
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
#> [1,] 2 1 0 0 0 0 0 0 0 0 0 0 0 0
#> [2,] 6 4 5 3 9 7 8 0 0 0 0 0 0 0
#> [3,] 18 17 19 22 21 26 23 10 12 13 14 16 15 24
#> [4,] 38 39 40 42 41 43 45 48 47 50 52 53 30 31
#> [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26]
#> [1,] 0 0 0 0 0 0 0 0 0 0 0 0
#> [2,] 0 0 0 0 0 0 0 0 0 0 0 0
#> [3,] 20 25 28 29 27 0 0 0 0 0 0 0
#> [4,] 32 33 34 35 36 37 44 46 49 51 54 55
#> [,27] [,28]
#> [1,] 0 0
#> [2,] 0 0
#> [3,] 0 0
#> [4,] 56 57
#>
#> $plist$pos
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#> [1,] 1.550317e+01 16.503171 0.000000 0.000000 0.000000 0.000000 0.00000
#> [2,] 8.255043e+00 9.255043 14.147242 15.147242 18.805200 19.805200 20.80520
#> [3,] 2.351008e+00 3.351008 5.751008 6.751008 8.585014 9.585014 10.58501
#> [4,] -1.257081e-13 1.000000 2.000000 3.000000 4.000000 5.000000 6.00000
#> [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15]
#> [1,] 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000
#> [2,] 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000
#> [3,] 12.13453 13.13453 14.13453 15.13453 16.32945 17.32945 18.98794 19.98794
#> [4,] 7.00000 8.00000 9.00000 10.00000 11.00000 12.00000 13.00000 14.00000
#> [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25]
#> [1,] 0.00000 0.00000 0.00000 0.00000 0 0 0 0 0 0
#> [2,] 0.00000 0.00000 0.00000 0.00000 0 0 0 0 0 0
#> [3,] 20.98794 21.98794 23.86104 24.86104 0 0 0 0 0 0
#> [4,] 15.00000 16.00000 17.00000 18.00000 19 20 21 22 23 24
#> [,26] [,27] [,28]
#> [1,] 0 0 0
#> [2,] 0 0 0
#> [3,] 0 0 0
#> [4,] 25 26 27
#>
#> $plist$fam
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
#> [1,] 0 0 0 0 0 0 0 0 0 0 0 0 0 0
#> [2,] 0 1 1 0 0 1 1 0 0 0 0 0 0 0
#> [3,] 0 1 1 0 1 0 1 3 3 3 0 0 3 5
#> [4,] 1 1 1 1 3 3 3 3 5 5 5 5 10 10
#> [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26]
#> [1,] 0 0 0 0 0 0 0 0 0 0 0 0
#> [2,] 0 0 0 0 0 0 0 0 0 0 0 0
#> [3,] 0 5 5 5 0 0 0 0 0 0 0 0
#> [4,] 10 10 12 12 12 12 15 15 15 15 18 18
#> [,27] [,28]
#> [1,] 0 0
#> [2,] 0 0
#> [3,] 0 0
#> [4,] 18 18
#>
#> $plist$spouse
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
#> [1,] 1 0 0 0 0 0 0 0 0 0 0 0 0 0
#> [2,] 1 0 1 0 1 0 0 0 0 0 0 0 0 0
#> [3,] 1 0 1 0 1 0 0 0 0 1 0 1 0 0
#> [4,] 0 0 0 0 0 0 0 0 0 0 0 0 0 0
#> [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26]
#> [1,] 0 0 0 0 0 0 0 0 0 0 0 0
#> [2,] 0 0 0 0 0 0 0 0 0 0 0 0
#> [3,] 1 0 0 1 0 0 0 0 0 0 0 0
#> [4,] 0 0 0 0 0 0 0 0 0 0 0 0
#> [,27] [,28]
#> [1,] 0 0
#> [2,] 0 0
#> [3,] 0 0
#> [4,] 0 0
#>
#>
#> $x
#> [1] 1.650317e+01 1.550317e+01 1.514724e+01 9.255043e+00 1.414724e+01
#> [6] 8.255043e+00 1.980520e+01 2.080520e+01 1.880520e+01 1.213453e+01
#> [11] NA 1.313453e+01 1.413453e+01 1.513453e+01 1.732945e+01
#> [16] 1.632945e+01 3.351008e+00 2.351008e+00 5.751008e+00 1.998794e+01
#> [21] 8.585014e+00 6.751008e+00 1.058501e+01 1.898794e+01 2.098794e+01
#> [26] 9.585014e+00 2.486104e+01 2.198794e+01 2.386104e+01 1.200000e+01
#> [31] 1.300000e+01 1.400000e+01 1.500000e+01 1.600000e+01 1.700000e+01
#> [36] 1.800000e+01 1.900000e+01 -1.257081e-13 1.000000e+00 2.000000e+00
#> [41] 4.000000e+00 3.000000e+00 5.000000e+00 2.000000e+01 6.000000e+00
#> [46] 2.100000e+01 8.000000e+00 7.000000e+00 2.200000e+01 9.000000e+00
#> [51] 2.300000e+01 1.000000e+01 1.100000e+01 2.400000e+01 2.500000e+01
#> [56] 2.600000e+01 2.700000e+01
#>
#> $y
#> [1] 1 1 2 2 2 2 2 2 2 3 NA 3 3 3 3 3 3 3 3 3 3 3 3 3 3
#> [26] 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
#> [51] 4 4 4 4 4 4 4
#>
#> $boxw
#> [1] 0.5158615
#>
#> $boxh
#> [1] 0.08681352
#>
#> $call
#> kinship2::plot.pedigree(x = p3, cex = cex, col = col, symbolsize = symbolsize,
#> branch = branch, packed = packed, align = align, width = width,
#> density = density, angle = angle, keep.par = keep.par, pconnect = pconnect,
#> mar = mar)
In the resulting plot, biological males are represented by squares, while biological females are represented by circles, following the standard pedigree conventions.
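The long list printed after the plotting call appears to be the function’s return value: the layout information produced by kinship2::plot.pedigree. Assuming it is printed only because it is returned visibly, assigning the result to a variable should still draw the plot while keeping the console clean. The name ped_layout is illustrative.

```r
library(BGmisc)

# Re-create the pedigree from the example above
set.seed(5)
df_ped <- simulatePedigree(kpc = 4, Ngen = 4, sexR = .5, marR = .7)

# Draw the plot and keep the returned layout list instead of printing it
ped_layout <- plotPedigree(df_ped)

# Inspect only the pieces of interest
names(ped_layout)   # components of the returned list
ped_layout$plist$n  # number of plotted individuals per generation row
```

The x and y components hold the plotting coordinates of each individual, which can be useful when annotating the figure afterwards.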
Visualizing Multiple Pedigrees Side-by-Side
If you wish to compare different pedigrees side by side, you can plot them together. For instance, let’s visualize pedigrees for families spanning three and four generations, respectively.
set.seed(8)
# Simulate a family with 3 generations
df_ped_3 <- simulatePedigree(Ngen = 3)
# Simulate a family with 4 generations
df_ped_4 <- simulatePedigree(Ngen = 4)
# Set up plotting parameters for side-by-side display
par(mfrow = c(1, 2))
# Plot the 3-generation pedigree
plotPedigree(df_ped_3, width = 3)
#> $plist
#> $plist$n
#> [1] 2 5 6
#>
#> $plist$nid
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] 2 1 0 0 0 0
#> [2,] 3 5 4 6 7 0
#> [3,] 8 10 11 9 12 13
#>
#> $plist$pos
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] 1.166667e+00 2.166667 0 0 0 0
#> [2,] 2.047042e-09 1.000000 2 3 4 0
#> [3,] 0.000000e+00 1.000000 2 3 4 5
#>
#> $plist$fam
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] 0 0 0 0 0 0
#> [2,] 1 1 0 0 1 0
#> [3,] 2 2 2 4 4 4
#>
#> $plist$spouse
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] 1 0 0 0 0 0
#> [2,] 0 1 0 1 0 0
#> [3,] 0 0 0 0 0 0
#>
#>
#> $x
#> [1] 2.166667e+00 1.166667e+00 2.047042e-09 2.000000e+00 1.000000e+00
#> [6] 3.000000e+00 4.000000e+00 0.000000e+00 3.000000e+00 1.000000e+00
#> [11] 2.000000e+00 4.000000e+00 5.000000e+00
#>
#> $y
#> [1] 1 1 2 2 2 2 2 3 3 3 3 3 3
#>
#> $boxw
#> [1] 0.2060484
#>
#> $boxh
#> [1] 0.05787568
#>
#> $call
#> kinship2::plot.pedigree(x = p3, cex = cex, col = col, symbolsize = symbolsize,
#> branch = branch, packed = packed, align = align, width = width,
#> density = density, angle = angle, keep.par = keep.par, pconnect = pconnect,
#> mar = mar)
# Plot the 4-generation pedigree
plotPedigree(df_ped_4, width = 1)
#> $plist
#> $plist$n
#> [1] 2 5 10 12
#>
#> $plist$nid
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
#> [1,] 2 1 0 0 0 0 0 0 0 0 0 0
#> [2,] 3 5 4 6 7 0 0 0 0 0 0 0
#> [3,] 8 9 11 15 14 13 10 12 17 16 0 0
#> [4,] 18 21 23 22 25 26 19 20 24 27 28 29
#>
#> $plist$pos
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#> [1,] 6.399999e+00 7.399999 0.000000 0.000000 0.000000 0.000000 0.000000
#> [2,] 3.299999e+00 4.299999 6.699999 7.699999 8.699999 0.000000 0.000000
#> [3,] 9.333331e-01 1.933333 2.933333 3.933333 4.933333 6.066666 7.066666
#> [4,] 1.854016e-14 1.000000 2.000000 3.000000 4.000000 5.000000 6.000000
#> [,8] [,9] [,10] [,11] [,12]
#> [1,] 0.000000 0.000000 0.00000 0 0
#> [2,] 0.000000 0.000000 0.00000 0 0
#> [3,] 8.066666 9.066666 10.06667 0 0
#> [4,] 7.000000 8.000000 9.00000 10 11
#>
#> $plist$fam
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
#> [1,] 0 0 0 0 0 0 0 0 0 0 0 0
#> [2,] 0 1 0 1 1 0 0 0 0 0 0 0
#> [3,] 0 1 1 1 0 0 3 3 3 0 0 0
#> [4,] 1 1 1 4 4 4 6 6 6 9 9 9
#>
#> $plist$spouse
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
#> [1,] 1 0 0 0 0 0 0 0 0 0 0 0
#> [2,] 1 0 1 0 0 0 0 0 0 0 0 0
#> [3,] 1 0 0 1 0 1 0 0 1 0 0 0
#> [4,] 0 0 0 0 0 0 0 0 0 0 0 0
#>
#>
#> $x
#> [1] 7.399999e+00 6.399999e+00 3.299999e+00 6.699999e+00 4.299999e+00
#> [6] 7.699999e+00 8.699999e+00 9.333331e-01 1.933333e+00 7.066666e+00
#> [11] 2.933333e+00 8.066666e+00 6.066666e+00 4.933333e+00 3.933333e+00
#> [16] 1.006667e+01 9.066666e+00 1.854016e-14 6.000000e+00 7.000000e+00
#> [21] 1.000000e+00 3.000000e+00 2.000000e+00 8.000000e+00 4.000000e+00
#> [26] 5.000000e+00 9.000000e+00 1.000000e+01 1.100000e+01
#>
#> $y
#> [1] 1 1 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4
#>
#> $boxw
#> [1] 0.4533065
#>
#> $boxh
#> [1] 0.08681352
#>
#> $call
#> kinship2::plot.pedigree(x = p3, cex = cex, col = col, symbolsize = symbolsize,
#> branch = branch, packed = packed, align = align, width = width,
#> density = density, angle = angle, keep.par = keep.par, pconnect = pconnect,
#> mar = mar)
By examining the side-by-side plots, you can contrast and analyze the structures of different families, tracing the inheritance of specific traits or conditions if needed.
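Because par(mfrow = c(1, 2)) changes the layout for every later plot on the same device, it is usually worth restoring the previous settings once the comparison is drawn. The sketch below shows the general base R pattern. Two simple placeholder plots stand in for the pedigree plots so that it runs without any package, and old_par is an illustrative name.

```r
# Switch to a 1 x 2 layout and remember the previous settings;
# par() returns the old values of the parameters it changes
old_par <- par(mfrow = c(1, 2))

# Placeholder panels; in practice these would be the two plotPedigree() calls
plot(1:3, main = "Panel 1")
plot(1:4, main = "Panel 2")

# Restore the previous layout so later plots use the full device again
par(old_par)
```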