Resolve duplicated species names by summarizing their traits with user-provided functions, one for each common class of variable (numeric, character, and logical).
lcvp_solve_dups(
x,
duplicated_col,
fixed_cols = NULL,
func_numeric = mean,
func_character = .keep_all,
func_logical = any
)
x: A data.frame.
duplicated_col: The position (column number) of the column containing the duplicated names to be resolved.
fixed_cols: The positions of the columns that should be left out of the summarizing process. This normally applies to columns whose values are fixed across repeated names.
func_numeric: A function to summarize numeric columns. The default returns the mean.
func_character: A function to summarize character or factor columns. The default keeps all unique strings, separated by commas.
func_logical: A function to summarize logical columns. The default returns TRUE if any value is TRUE.
A data.frame with the same number of columns as x, in which rows with duplicated names are combined according to the functions provided.
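To make the shape of the returned value concrete, here is a minimal base-R sketch of what combining duplicated rows by column class could look like. It is not the package's implementation and makes no lcvplants call. The toy data and the helpers `keep_unique` and `summarize_col` are hypothetical, and `keep_unique` is only a stand-in for the default character summary described above.

```r
# Toy data with one duplicated name; no lcvplants call is made here.
x <- data.frame(
  Species = c("Aa bb", "Aa bb", "Cc dd"),
  Trait1  = c(1, 3, 5),
  Trait2  = c("a", "b", "a"),
  Trait3  = c(FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the default character summary:
# unique strings joined by a comma.
keep_unique <- function(v) paste(unique(v), collapse = ", ")

# Hypothetical helper: pick a summary function by column class.
summarize_col <- function(v) {
  if (is.numeric(v)) {
    mean(v)
  } else if (is.logical(v)) {
    any(v)
  } else {
    keep_unique(as.character(v))
  }
}

# Split rows by name, summarize each trait column, and bind the results.
groups <- split(x, x$Species)
combined <- do.call(rbind, lapply(groups, function(g) {
  data.frame(
    Species = g$Species[1],
    lapply(g[-1], summarize_col),
    stringsAsFactors = FALSE
  )
}))
rownames(combined) <- NULL
combined
```

In this sketch the two "Aa bb" rows collapse into one, so the result has two rows and the same four columns as the input.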
The function combines the rows of x that share a duplicated name in duplicated_col. User-defined functions used to combine the information in x should take a vector (of length > 2) of the corresponding class (numeric, character, or logical) and return a single value of that class. Factors are converted to characters.
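The contract described above (a vector in, a single value of the same class out) can be illustrated with a few custom summarizers. This is a sketch only: the names `median_no_na`, `collapse_unique`, and `all_true` are hypothetical and not part of lcvplants.

```r
# Hypothetical summarizers; the names are illustrative only.

# Numeric: median, ignoring missing values.
median_no_na <- function(v) median(v, na.rm = TRUE)

# Character: unique values, sorted and joined with "; ".
collapse_unique <- function(v) paste(sort(unique(v)), collapse = "; ")

# Logical: TRUE only if every value is TRUE.
all_true <- function(v) all(v)

# Each takes a vector and returns one value of the same class.
median_no_na(c(1.5, NA, 2.5))      # 2
collapse_unique(c("b", "a", "b"))  # "a; b"
all_true(c(TRUE, FALSE))           # FALSE
```

Functions like these could then be supplied through the documented arguments, for example `lcvp_solve_dups(x, 1, func_numeric = median_no_na, func_character = collapse_unique, func_logical = all_true)`.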
# Ensure that the LCVP package is available before running the example.
# If it is not, see the `lcvplants` package vignette for details
# on installing the required data package.
if (requireNamespace("LCVP", quietly = TRUE)) { # Do not run this
  # Create a data.frame with duplicated names and different traits
  splist <- sample(apply(LCVP::tab_lcvp[1:100, 2:3], 1, paste, collapse = " "))
  search <- lcvp_search(splist)
  x <- data.frame("Species" = search$Output.Taxon,
                  "Trait1" = runif(length(splist)),
                  "Trait2" = sample(c("a", "b"), length(splist), replace = TRUE),
                  "Trait3" = sample(c(TRUE, FALSE), length(splist), replace = TRUE))

  # Solve with default parameters
  lcvp_solve_dups(x, 1)

  # Summarize numbers using the median
  lcvp_solve_dups(x, 1, func_numeric = median)

  # Get one of the character values at random
  lcvp_solve_dups(x, 1, func_character = function(x) sample(x, 1))
}