Solve duplicated names by summarizing traits — lcvp_solve

Resolve duplicated species names by summarizing traits with user-provided functions for common classes of variables (numeric, character, and logical).

lcvp_solve_dups(
  x,
  duplicated_col,
  fixed_cols = NULL,
  func_numeric = mean,
  func_character = .keep_all,
  func_logical = any
)

Arguments

x: data.frame.
duplicated_col: The position (index) of the column containing the duplicated names to be resolved.
fixed_cols: The positions of the columns to exclude from summarizing. This normally applies to columns whose values are fixed across repeated names.
func_numeric: A function to summarize numeric columns. The default returns the mean.
func_character: A function to summarize character or factor columns. The default keeps all unique strings, separated by commas.
func_logical: A function to summarize logical columns. The default returns TRUE if any value is TRUE.
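
As a rough illustration of what the three defaults do to the values belonging to one duplicated name, the base R sketch below applies them to small vectors. The helper `keep_all_sketch` is a hypothetical stand-in for the internal `.keep_all`, and the ", " separator is an assumption; the package's own implementation may differ.

```r
# Values that might belong to a single duplicated species name
trait_num <- c(1.5, 2.5, 4.0)
trait_chr <- c("a", "b", "a")
trait_lgl <- c(FALSE, TRUE, FALSE)

# Hypothetical stand-in for the internal .keep_all default:
# keep every unique string, joined by a comma (separator is assumed)
keep_all_sketch <- function(x) paste(unique(x), collapse = ", ")

num_out <- mean(trait_num)             # numeric columns: mean of the values
chr_out <- keep_all_sketch(trait_chr)  # character columns: unique strings joined
lgl_out <- any(trait_lgl)              # logical columns: TRUE if any value is TRUE
```

Each call collapses a vector into one value, which is the shape of result expected from any function supplied through `func_numeric`, `func_character`, or `func_logical`.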

Value

A data.frame with the same number of columns as x, in which duplicated rows are combined according to the functions provided.

Details

The function combines the rows in x that share a name in duplicated_col. User-defined functions used to combine the information in x should take a vector (of length > 2) of the corresponding class (numeric, character, or logical) and return a single value of the same class. Factors are converted to characters.
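
The contract described above (a vector in, a single value of the same class out) can be sketched with two custom summarizers. Both `most_frequent` and `all_true` are hypothetical names introduced here for illustration and are not part of the package.

```r
# Hypothetical summarizer for character columns:
# return the most frequent string (ties go to the first one seen)
most_frequent <- function(x) {
  x <- as.character(x)  # mirror the factor-to-character conversion
  counts <- table(factor(x, levels = unique(x)))
  names(counts)[which.max(counts)]
}

# Hypothetical summarizer for logical columns:
# TRUE only when every value is TRUE
all_true <- function(x) all(x)

# Each function returns exactly one value of the input's class
most_frequent(c("tree", "shrub", "tree"))
all_true(c(TRUE, FALSE))
```

Functions written this way could then be supplied as `func_character = most_frequent` or `func_logical = all_true`.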

Author

Bruno Vilela & Alexander Ziska

Examples

# Ensure that the LCVP package is available before running the example.
# If it is not, see the `lcvplants` package vignette for details
# on installing the required data package.
if (requireNamespace("LCVP", quietly = TRUE)) { # Do not run this

# Create a data.frame with duplicated names and different traits
splist <- sample(apply(LCVP::tab_lcvp[1:100, 2:3], 1, paste, collapse = " "))
search <- lcvp_search(splist)

x <- data.frame("Species" = search$Output.Taxon,
                "Trait1" = runif(length(splist)),
                "Trait2" = sample(c("a", "b"), length(splist), replace = TRUE),
                "Trait3" = sample(c(TRUE, FALSE), length(splist), replace = TRUE))

# Solve with default parameters
lcvp_solve_dups(x, 1)

# Summarize numbers using the median
lcvp_solve_dups(x, 1, func_numeric = median)

# Pick one of the character values at random
lcvp_solve_dups(x, 1, func_character = function(x){sample(x, 1)})

}