Repeatedly removes structural leaf nodes from a pedigree until no further
trimming is possible or a stopping condition is reached. After each removal
pass, parent ID columns are updated so that references to removed individuals
are set to NA.
Usage
trimPedigree(
  ped,
  personID = "ID",
  momID = "momID",
  dadID = "dadID",
  include_terminal = TRUE,
  include_founder_singletons = TRUE,
  max_iter = Inf,
  min_size = 0L,
  remove_ids = NULL,
  keep_var = NULL,
  keep_vals = NULL,
  verbose = FALSE
)
Arguments
- ped
A pedigree dataset. Must contain person ID, mother ID, and father ID columns (by default ID, momID, and dadID).
- personID
Character. Name of the column in ped that holds the person ID.
- momID
Character. Name of the column in ped that holds the mother ID.
- dadID
Character. Name of the column in ped that holds the father ID.
- include_terminal
Logical. If TRUE (default), flag individuals with no children (outdegree 0) as leaves.
- include_founder_singletons
Logical. If TRUE (default), also flag founders with exactly one child (indegree 0, outdegree 1) as leaves.
- max_iter
Integer or Inf. Maximum number of trimming iterations. Defaults to Inf, which trims until another stopping condition applies.
- min_size
Integer. Minimum number of individuals to retain. Trimming stops before any removal that would reduce the pedigree below this size. Defaults to 0L.
- remove_ids
Character vector of additional individual IDs to remove before any leaf-based trimming. Defaults to NULL.
- keep_var
Character. Optional column name of a phenotypic variable. Passed to findLeaves at every iteration so that individuals with protected phenotype values are never removed.
- keep_vals
Optional vector of phenotype values that protect an individual from removal. See findLeaves for full details.
- verbose
Logical. If TRUE, print counts of each leaf type.
Value
A trimmed pedigree data.frame with the same columns as the
input. Parent ID columns (momID, dadID) are updated to
NA for any references to removed individuals.
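The parent-ID update described above can be illustrated with a small base R sketch. The helper name drop_and_unlink is hypothetical and is not part of the package; it only shows the assumed behaviour of dropping rows and then setting any momID or dadID entry that pointed at a dropped individual to NA.

```r
# Hypothetical helper (not the package's code): remove the rows for `ids`
# and blank out parent references that pointed at those individuals.
drop_and_unlink <- function(ped, ids,
                            personID = "ID", momID = "momID", dadID = "dadID") {
  kept <- ped[!(ped[[personID]] %in% ids), , drop = FALSE]
  kept[[momID]][kept[[momID]] %in% ids] <- NA
  kept[[dadID]][kept[[dadID]] %in% ids] <- NA
  kept
}

ped <- data.frame(
  ID    = 1:4,
  dadID = c(NA, NA, 1, 1),
  momID = c(NA, NA, 2, 2)
)

# Remove individual 1: children 3 and 4 lose their father reference
# but keep their mother reference.
out <- drop_and_unlink(ped, ids = 1)
out$dadID  # NA NA NA
out$momID  # NA  2  2
```

The columns themselves are kept, so the result has the same structure as the input, only with fewer rows.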
Details
The trimming process peels the pedigree from the outside in: first removing the outermost leaves, then re-evaluating the remaining structure so that newly exposed leaves can be removed in subsequent iterations.
Iteration stops when any of the following conditions is met:
- No leaf nodes remain.
- The number of iterations reaches max_iter.
- Removing the next batch of leaves would reduce the pedigree below min_size rows.
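The peeling loop and its three stopping conditions can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package implementation: find_leaves_sketch and trim_sketch are hypothetical names, the column names ID, momID, and dadID are hard-coded, both leaf types are always flagged, and remove_ids, keep_var, and keep_vals are ignored.

```r
# Hypothetical stand-in for leaf detection: terminal individuals
# (outdegree 0) plus founders with exactly one child.
find_leaves_sketch <- function(ped) {
  ids <- ped$ID
  # Outdegree: number of rows that name this individual as a parent.
  outdeg <- vapply(
    ids,
    function(i) sum(ped$momID %in% i | ped$dadID %in% i),
    integer(1)
  )
  founder <- is.na(ped$momID) & is.na(ped$dadID)
  ids[outdeg == 0 | (founder & outdeg == 1)]
}

# Hypothetical peeling loop mirroring the stopping conditions listed above.
trim_sketch <- function(ped, max_iter = Inf, min_size = 0L) {
  iter <- 0
  while (iter < max_iter) {                           # max_iter reached
    leaves <- find_leaves_sketch(ped)
    if (length(leaves) == 0) break                    # no leaves remain
    if (nrow(ped) - length(leaves) < min_size) break  # would go below min_size
    ped <- ped[!(ped$ID %in% leaves), , drop = FALSE]
    ped$momID[ped$momID %in% leaves] <- NA
    ped$dadID[ped$dadID %in% leaves] <- NA
    iter <- iter + 1
  }
  ped
}

ped <- data.frame(
  ID    = 1:6,
  dadID = c(NA, NA, 1, 1, 3, NA),
  momID = c(NA, NA, 2, 2, 4, NA)
)

one_pass <- trim_sketch(ped, max_iter = 1)
one_pass$ID  # 1 2 3 4  (individuals 5 and 6 had no children)

capped <- trim_sketch(ped, min_size = 2)
capped$ID    # 1 2      (a third pass would leave fewer than 2 rows)
```

In this sketch the min_size check is applied to the whole batch of leaves, so a pass is either applied in full or not at all, which matches the "before any removal" wording of the min_size argument.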
See also
findLeaves to preview which individuals would be removed.
Examples
if (FALSE) { # \dontrun{
  ped <- data.frame(
    ID = 1:6,
    dadID = c(NA, NA, 1, 1, 3, NA),
    momID = c(NA, NA, 2, 2, 4, NA)
  )
  trimPedigree(ped, verbose = TRUE)
  trimPedigree(ped, min_size = 2, verbose = TRUE)
} # }