Parse taxonomic information from full taxonomic paths

taxoparser(taxopath, sep.level, sep.info)

Arguments

taxopath

a vector containing full taxonomic paths to parse

sep.level

a character string separating the taxonomic levels in `taxopath`. `NA` is not allowed.

sep.info

a character string separating each taxon name from its taxonomic rank in `taxopath`. `NA` is not allowed.

Value

a list of vectors containing the parsed taxa as values and the corresponding taxonomic ranks as names.

Details

The taxonomic path should include both the taxon names and their associated taxonomic ranks (full names or abbreviations, as in QIIME or UNITE outputs). The function uses this information, together with the two separators, to parse each path by decreasing level of taxonomic resolution. The taxonomic information should follow a standard structure across samples (e.g. a standard taxonomy as in GenBank, SILVA or BOLD, by decreasing level of taxonomic resolution); the function does not infer missing taxonomic ranks.
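
To make the role of the two separators concrete, here is a minimal base-R sketch of this kind of parsing. It is not the package's implementation: `parse_path_sketch` is a hypothetical helper, the path in `toy` is made up, and the sketch assumes every level is written as taxon, then `sep.info`, then rank. It does not handle rank-first paths such as the qiime-like ones.

```r
# Hypothetical helper, NOT the taxoparser implementation: split each path
# into levels with sep.level, then split each level into taxon and rank
# with sep.info (assumed order: taxon first, rank second).
parse_path_sketch <- function(taxopath, sep.level, sep.info) {
  lapply(strsplit(as.character(taxopath), sep.level, fixed = TRUE),
         function(levels) {
           parts <- strsplit(levels, sep.info, fixed = TRUE)
           taxa  <- vapply(parts, function(p) p[1], character(1))
           ranks <- vapply(parts, function(p) p[2], character(1))
           # Taxa become the values, ranks become the names
           setNames(taxa, ranks)
         })
}

# Made-up path for illustration only
toy <- "Eukaryota@superkingdom:Arthropoda@phylum:Insecta@class"
parsed <- parse_path_sketch(toy, sep.level = ":", sep.info = "@")
parsed[[1]]
```

Each element of `parsed` is a named character vector, which mirrors the structure described under Value. The separators are matched literally here (`fixed = TRUE`), which is a choice of this sketch rather than documented behavior of `taxoparser`.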

Author

Lucie Zinger

Examples


data(soil_euk)

# Parse taxonomic path

## an NCBI-like full taxonomic path
taxoparsed <- taxoparser(taxopath = soil_euk$motus$path,
                         sep.info = "@",
                         sep.level= ":")

## a QIIME/UNITE-like full taxonomic path
arthropoda <- subset_metabarlist(soil_euk,
                                 table = "motus",
                                 indices = grepl("Arthropoda", soil_euk$motus$path))
#> Warning: Some PCRs in out have a number of reads of zero in table `reads`!

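# Prefix each rank column with its one-letter rank code, then join the levels with ";"
rank_codes <- c("p", "c", "o", "f", "g", "s")
qiimepath <- apply(arthropoda$motus[, grep("[msry]_name", colnames(arthropoda$motus))], 1,
                   function(x) {
                     paste(sapply(seq_along(x), function(y) {
                       paste(rank_codes[y], x[y], sep = "_")
                     }), collapse = ";")})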

taxoparsed <- taxoparser(taxopath = qiimepath,
                         sep.info = "_",
                         sep.level= ";")