Selecting a given taxonomic annotation between those obtained from two different databases
taxodecider(metabarlist, best.db, sim.scores, lineage, threshold)
`metabarlist`: a `metabarlist` object.
`best.db`: a vector of two database names, with the more reliable database (in terms of taxonomic information) listed first.
`sim.scores`: a vector of two column names in the `motus` table corresponding to the similarity scores of each database. This should be provided in the same order as `best.db`.
`lineage`: a vector of two column names in the `motus` table corresponding to the full taxonomic lineage obtained for each database. This should be provided in the same order as `best.db`.
`threshold`: a similarity score threshold above which the annotation of the best database is kept if both databases yield high similarity scores.
a `metabarlist` object whose `motus` data frame includes the preferred taxonomic assignments.
The function `taxodecider` allows users to choose between two taxonomic annotations based on their similarity scores and on a preference for a given database (e.g. one with more reliable taxonomy). All taxonomic information should be stored in the `motus` table.
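The selection rule described above can be sketched in base R on a toy table. This is only one plausible reading of the description, not the package's implementation: the helper `choose_db`, the column names, and the values are all hypothetical, and the handling of ties and missing scores is an assumption.

```r
# Hypothetical helper, not part of metabaR: choose one database per MOTU.
choose_db <- function(score_best, score_other, threshold) {
  # Assumption: a missing score counts as "no match", so comparisons never return NA.
  score_best[is.na(score_best)] <- -Inf
  score_other[is.na(score_other)] <- -Inf
  both_high <- score_best > threshold & score_other > threshold
  # Keep the preferred database when both scores exceed the threshold or on ties;
  # otherwise keep whichever database has the higher score.
  ifelse(both_high | score_best >= score_other, "best", "other")
}

# Toy MOTU table with made-up column names and values.
motus <- data.frame(
  sim_a = c(0.98, 0.70, 0.95, NA),
  sim_b = c(0.99, 0.85, 0.60, 0.80),
  lineage_a = c("A1", "A2", "A3", NA),
  lineage_b = c("B1", "B2", "B3", "B4"),
  stringsAsFactors = FALSE
)

choice <- choose_db(motus$sim_a, motus$sim_b, threshold = 0.9)
motus$lineage_kept <- ifelse(choice == "best", motus$lineage_a, motus$lineage_b)
```

In this sketch the first MOTU keeps the preferred database even though the other one scores slightly higher, because both scores exceed the threshold. The remaining rows fall back to the higher-scoring annotation.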
# \donttest{
data(soil_euk)
dir <- tempdir()
# Base URL of the external example data
url <- "https://raw.githubusercontent.com/metabaRfactory/metabaR_external_data/master/"

# SILVA annotation table
silva_file <- "lit_euk---ssu---otus.csv"
silva_url <- paste0(url, silva_file)
silva_path <- file.path(dir, silva_file)
download.file(silva_url, silva_path)

# Sequence cluster map
clust_file <- "lit_euk---ssu---sequence_cluster_map---litiere_euk_cl97_agg_filt.clstr"
clust_url <- paste0(url, clust_file)
clust_path <- file.path(dir, clust_file)
download.file(clust_url, clust_path)

# SILVA taxonomy file
taxonomy_file <- "tax_slv_ssu_138.1.txt"
taxonomy_url <- paste0(url, taxonomy_file)
taxonomy_path <- file.path(dir, taxonomy_file)
download.file(taxonomy_url, taxonomy_path)
soil_euk <- silva_annotator(
metabarlist = soil_euk,
silva.path = silva_path,
clust.path = clust_path,
taxonomy.path = taxonomy_path)
#> Warning: Some PCRs in metabarlist have a number of reads of zero in table `reads`!
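# Divide the SILVA similarity by 100 (presumably percent to proportion) so it is on the same scale as `threshold`
soil_euk$motus$similarity <- soil_euk$motus$similarity / 100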
soil_euk2 <- taxodecider(
metabarlist = soil_euk,
best.db = c("silva", "embl"),
sim.scores = c("similarity", "best_identity.order_filtered_embl_r136_noenv_EUK"),
lineage = c("lineage_silva", "path"),
threshold = 0.9
)
#> Warning: Some PCRs in metabarlist have a number of reads of zero in table `reads`!
# }