Aggregate PCR replicates in a metabarlist object.
aggregate_pcrs(metabarlist, replicates = NULL, FUN = FUN_agg_pcrs_sum)
FUN_agg_pcrs_sum(metabarlist, replicates)
FUN_agg_pcrs_mean(metabarlist, replicates)
FUN_agg_pcrs_prob(metabarlist, replicates)
metabarlist: a metabarlist object.
replicates: a vector containing the sample names to which each PCR replicate belongs and within which replicates should be aggregated. Default is the `sample_id` column from the `pcrs` table.
FUN: a replicate aggregation function. Default is the sum of reads per MOTU across replicates.
A metabarlist where the table `reads` contains MOTU abundances aggregated according to FUN. The number of rows in the resulting `reads` and `pcrs` tables equals that of the `samples` table.
The function aggregate_pcrs is typically used at the end of the data filtration process. It aggregates reads and PCR-related information at the sample level. Users are free to supply their own aggregation method, but the following are commonly used and are therefore predefined:
FUN_agg_pcrs_sum: the reads of PCR replicates are summed for each MOTU.
FUN_agg_pcrs_mean: the reads of PCR replicates are averaged for each MOTU. Results are rounded so as to obtain genuine count data.
FUN_agg_pcrs_prob: the probability of detection is returned for each MOTU. This method is often used in studies dealing with ancient DNA (e.g. Pansu et al. 2015) or diet (e.g. Deagle et al. 2019).
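The three predefined aggregations can be illustrated on a toy read matrix in base R. This is only a sketch and not the package's implementation: the objects `reads` and `replicates` are made up for illustration, and it assumes that the "probability of detection" is the fraction of replicates in which a MOTU has at least one read.

```r
# Toy read table: 4 PCR replicates (rows) x 3 MOTUs (columns).
reads <- matrix(c(10, 0, 5,
                  20, 0, 1,
                   0, 2, 7,
                   4, 0, 9),
                nrow = 4, byrow = TRUE,
                dimnames = list(paste0("pcr", 1:4), paste0("motu", 1:3)))

# Sample to which each PCR replicate belongs (hypothetical names).
replicates <- c("s1", "s1", "s2", "s2")

# Sum: total reads per MOTU within each sample.
agg_sum <- rowsum(reads, replicates)

# Number of replicates per sample, in the row order returned by rowsum().
n_rep <- as.vector(table(replicates)[rownames(agg_sum)])

# Mean: summed reads divided by the number of replicates, rounded to counts.
agg_mean <- round(agg_sum / n_rep)

# Detection probability (assumed definition): share of replicates
# in which the MOTU has at least one read.
agg_prob <- rowsum((reads > 0) * 1, replicates) / n_rep

agg_sum
agg_mean
agg_prob
```

Each result has one row per sample and one column per MOTU, which is the shape the aggregated `reads` table is described as having.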
After aggregation, the information contained in the `pcrs` table is averaged if numeric. If non-numeric, the information is dereplicated when it is identical across replicates, and concatenated otherwise.
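The rule for the `pcrs` table can be sketched in base R as follows. This is an illustration under assumptions, not the package's code: the toy data frame, the helper name `collapse_column`, and the `"|"` separator used for concatenation are all hypothetical.

```r
# Toy PCR metadata: two replicates for each of two samples.
pcrs <- data.frame(
  sample_id = c("s1", "s1", "s2", "s2"),
  nb_reads  = c(100, 300, 50, 150),
  plate     = c("A", "A", "A", "B"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: collapse one column within one sample.
# Numeric values are averaged; other values are dereplicated, so a single
# value remains when replicates agree and distinct values are concatenated
# (the "|" separator is an assumption).
collapse_column <- function(x) {
  if (is.numeric(x)) {
    mean(x)
  } else {
    paste(unique(x), collapse = "|")
  }
}

# Apply the helper to every column, sample by sample.
groups <- split(pcrs[, c("nb_reads", "plate")], pcrs$sample_id)
agg <- do.call(rbind, lapply(groups, function(d) {
  as.data.frame(lapply(d, collapse_column), stringsAsFactors = FALSE)
}))

agg
```

In this toy case, sample `s1` keeps a single plate value because both replicates agree, while `s2` receives the concatenation of its two distinct plate values.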
aggregate_pcrs: Aggregate PCR replicates in a metabarlist object.
FUN_agg_pcrs_sum: Aggregate PCR replicates in a metabarlist object by summing MOTU read counts across PCR replicates.
FUN_agg_pcrs_mean: Aggregate PCR replicates in a metabarlist object by averaging MOTU read counts across PCR replicates.
FUN_agg_pcrs_prob: Aggregate PCR replicates in a metabarlist object by computing the probability of MOTU occurrence across PCR replicates.
Deagle, B. E., Thomas, A. C., McInnes, J. C., Clarke, L. J., Vesterinen, E. J., Clare, E. L., ... & Eveson, J. P. (2019). Counting with DNA in metabarcoding studies: How should we convert sequence reads to dietary data? Molecular Ecology, 28(2), 391-406.
Pansu, J., Giguet-Covex, C., Ficetola, G. F., Gielly, L., Boyer, F., Zinger, L., ... & Choler, P. (2015). Reconstructing long-term human impacts on plant communities: An ecological approach based on lake sediment DNA. Molecular Ecology, 24(7), 1485-1498.
data(soil_euk)
## With default function (sum reads across replicates)
soil_euk_ag <- aggregate_pcrs(soil_euk)
#> Warning: Some PCRs in out have a number of reads of zero in table `reads`!
summary_metabarlist(soil_euk)
#> $dataset_dimension
#> n_row n_col
#> reads 384 12647
#> motus 12647 15
#> pcrs 384 11
#> samples 64 8
#>
#> $dataset_statistics
#> nb_reads nb_motus avg_reads sd_reads avg_motus sd_motus
#> pcrs 3538913 12647 9215.919 10283.45 333.6849 295.440
#> samples 2797294 12382 10926.930 10346.66 489.5117 239.685
#>
summary_metabarlist(soil_euk_ag)
#> $dataset_dimension
#> n_row n_col
#> reads 96 12647
#> motus 12647 15
#> pcrs 96 11
#> samples 64 8
#>
#> $dataset_statistics
#> nb_reads nb_motus avg_reads sd_reads avg_motus sd_motus
#> pcrs 3538913 12647 36863.68 25728.92 765.1146 585.5052
#> samples 2797294 12382 43707.72 24514.09 1115.1406 376.4573
#>
## With the predefined function FUN_agg_pcrs_prob
soil_euk_ag <- aggregate_pcrs(soil_euk, FUN = FUN_agg_pcrs_prob)
#> Warning: Some PCRs in out have a number of reads of zero in table `reads`!
summary_metabarlist(soil_euk)
#> $dataset_dimension
#> n_row n_col
#> reads 384 12647
#> motus 12647 15
#> pcrs 384 11
#> samples 64 8
#>
#> $dataset_statistics
#> nb_reads nb_motus avg_reads sd_reads avg_motus sd_motus
#> pcrs 3538913 12647 9215.919 10283.45 333.6849 295.440
#> samples 2797294 12382 10926.930 10346.66 489.5117 239.685
#>
summary_metabarlist(soil_euk_ag) ## read statistics are not meaningful here, since `reads` now holds detection probabilities
#> $dataset_dimension
#> n_row n_col
#> reads 96 12647
#> motus 12647 15
#> pcrs 96 11
#> samples 64 8
#>
#> $dataset_statistics
#> nb_reads nb_motus avg_reads sd_reads avg_motus sd_motus
#> pcrs 32033.75 12647 333.6849 269.9321 765.1146 585.5052
#> samples 31328.75 12382 489.5117 188.7353 1115.1406 376.4573
#>
## With a custom function (here equivalent to FUN_agg_pcrs_sum,
## i.e. summing abundances of all MOTUs across replicates)
soil_euk_ag <- aggregate_pcrs(soil_euk,
                              FUN = function(metabarlist, replicates) {
                                rowsum(metabarlist$reads, replicates)
                              })
#> Warning: Some PCRs in out have a number of reads of zero in table `reads`!
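A custom function does not have to be a sum. The sketch below keeps, for each MOTU, the maximum read count observed across the replicates of a sample. It assumes, based on the custom example above, that FUN receives the metabarlist and the `replicates` vector and returns a samples-by-MOTUs matrix. The name `FUN_agg_pcrs_max` is hypothetical and not part of the package, and the function is exercised here on a plain list that only mimics the `reads` element of a metabarlist.

```r
# Hypothetical custom aggregation: per-MOTU maximum across replicates.
# Assumes a read table with more than one MOTU (column).
FUN_agg_pcrs_max <- function(metabarlist, replicates) {
  reads <- metabarlist$reads
  # Row indices of the PCR replicates belonging to each sample.
  groups <- split(seq_len(nrow(reads)), replicates)
  # One vector of column-wise maxima per sample, bound into a matrix.
  out <- t(vapply(groups, function(idx) {
    apply(reads[idx, , drop = FALSE], 2, max)
  }, numeric(ncol(reads))))
  colnames(out) <- colnames(reads)
  out
}

# Stand-in object with only a `reads` element (not a real metabarlist).
toy <- list(reads = matrix(c(10, 0, 5,
                             20, 0, 1,
                              0, 2, 7,
                              4, 0, 9),
                           nrow = 4, byrow = TRUE,
                           dimnames = list(paste0("pcr", 1:4),
                                           paste0("motu", 1:3))))
toy_replicates <- c("s1", "s1", "s2", "s2")

agg_max <- FUN_agg_pcrs_max(toy, toy_replicates)
agg_max

# With the package loaded, the call would presumably look like:
# soil_euk_ag <- aggregate_pcrs(soil_euk, FUN = FUN_agg_pcrs_max)
```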