Aggregate PCR replicates in a metabarlist object.

aggregate_pcrs(metabarlist, replicates = NULL, FUN = FUN_agg_pcrs_sum)

FUN_agg_pcrs_sum(metabarlist, replicates)

FUN_agg_pcrs_mean(metabarlist, replicates)

FUN_agg_pcrs_prob(metabarlist, replicates)

Arguments

metabarlist

a metabarlist object

replicates

a vector giving, for each PCR replicate, the name of the sample it belongs to and within which it should be aggregated. Default is the `sample_id` column from the `pcrs` table.

FUN

a replicate aggregation function. Default is the sum of reads per MOTU across replicates.

Value

A metabarlist in which the `reads` table contains MOTU abundances aggregated according to FUN. The number of rows of the produced `reads` and `pcrs` tables is equal to that of the `samples` table.

Details

The function aggregate_pcrs is typically used at the end of the data filtration process and aggregates reads and PCR-related information at the sample level. Users are free to supply their own aggregation method, but the following are commonly used and are therefore predefined:

  • FUN_agg_pcrs_sum: the reads of PCR replicates are summed for each MOTU.

  • FUN_agg_pcrs_mean: the reads of PCR replicates are averaged for each MOTU. Results are rounded so as to obtain genuine count data.

  • FUN_agg_pcrs_prob: the probability of detection is returned for each MOTU. This method is often used in studies dealing with ancient DNA (e.g. Pansu et al. 2015) or diet (e.g. Deagle et al. 2019).
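The three strategies above can be illustrated on a toy read table with base R only. This is a minimal sketch and not the package's implementation: the matrix, the sample names, and the object names `agg_sum`, `agg_mean`, and `agg_prob` are made up for illustration, and the probability of detection is assumed here to be the proportion of replicates in which a MOTU has at least one read.

```r
# Toy read table (hypothetical data): 4 PCR replicates (rows) x 3 MOTUs (columns).
reads <- matrix(c(10, 0, 4,
                  20, 0, 0,
                   0, 5, 2,
                   4, 5, 0),
                nrow = 4, byrow = TRUE,
                dimnames = list(paste0("pcr", 1:4), paste0("motu", 1:3)))

# Sample to which each PCR replicate belongs (hypothetical names).
replicates <- c("s1", "s1", "s2", "s2")

# Sum: reads are summed per MOTU within each sample.
agg_sum <- rowsum(reads, replicates)

# Number of replicates per sample, in the row order returned by rowsum().
n_rep <- as.vector(table(replicates)[rownames(agg_sum)])

# Mean: reads are averaged per MOTU, then rounded to keep count-like values.
agg_mean <- round(agg_sum / n_rep)

# Probability of detection (assumed definition): proportion of replicates
# of a sample in which the MOTU has at least one read.
agg_prob <- rowsum((reads > 0) * 1, replicates) / n_rep
```

Each result has one row per sample and one column per MOTU, which is the shape of the aggregated `reads` table.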

After aggregation, the information contained in the `pcrs` table is averaged if numeric. If non-numeric, it is dereplicated when equal across replicates, or concatenated when not.
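The rule for the `pcrs` table can be sketched for a single column as follows. The helper `aggregate_column` is hypothetical and is not part of the package; the `"|"` separator used for concatenation is also an assumption, since the separator actually used is not stated here.

```r
# Hypothetical helper, not part of the package: aggregate one column of
# PCR-level information within each sample.
aggregate_column <- function(x, replicates) {
  groups <- split(x, replicates)
  if (is.numeric(x)) {
    # Numeric information is averaged across replicates.
    return(sapply(groups, mean))
  }
  sapply(groups, function(v) {
    u <- unique(as.character(v))
    # Identical values are dereplicated; differing values are concatenated
    # (the "|" separator is an assumption made for this sketch).
    if (length(u) == 1) u else paste(u, collapse = "|")
  })
}

replicates <- c("s1", "s1", "s2", "s2")
num_out <- aggregate_column(c(1, 3, 2, 2), replicates)
chr_out <- aggregate_column(c("plate1", "plate1", "plate1", "plate2"), replicates)
```

Here `num_out` holds one averaged value per sample, while `chr_out` keeps a single value for `s1` (equal across replicates) and a concatenated value for `s2`.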

Functions

  • aggregate_pcrs: Aggregate PCR replicates in a metabarlist object.

  • FUN_agg_pcrs_sum: Aggregate PCR replicates in a metabarlist object by summing MOTU read counts across PCR replicates.

  • FUN_agg_pcrs_mean: Aggregate PCR replicates in a metabarlist object by averaging MOTU read counts across PCR replicates.

  • FUN_agg_pcrs_prob: Aggregate PCR replicates in a metabarlist object by computing the probability of MOTU occurrence across PCR replicates.

References

Deagle, B. E., Thomas, A. C., McInnes, J. C., Clarke, L. J., Vesterinen, E. J., Clare, E. L., ... & Eveson, J. P. (2019). Counting with DNA in metabarcoding studies: How should we convert sequence reads to dietary data? Molecular Ecology, 28(2), 391-406.

Pansu, J., Giguet-Covex, C., Ficetola, G. F., Gielly, L., Boyer, F., Zinger, L., ... & Choler, P. (2015). Reconstructing long-term human impacts on plant communities: An ecological approach based on lake sediment DNA. Molecular Ecology, 24(7), 1485-1498.

Author

Lucie Zinger, Frédéric Boyer

Examples


data(soil_euk)

## With default function (sum reads across replicates)
soil_euk_ag <- aggregate_pcrs(soil_euk)
#> Warning: Some PCRs in out have a number of reads of zero in table `reads`!
summary_metabarlist(soil_euk)
#> $dataset_dimension
#>         n_row n_col
#> reads     384 12647
#> motus   12647    15
#> pcrs      384    11
#> samples    64     8
#> 
#> $dataset_statistics
#>         nb_reads nb_motus avg_reads sd_reads avg_motus sd_motus
#> pcrs     3538913    12647  9215.919 10283.45  333.6849  295.440
#> samples  2797294    12382 10926.930 10346.66  489.5117  239.685
#> 
summary_metabarlist(soil_euk_ag)
#> $dataset_dimension
#>         n_row n_col
#> reads      96 12647
#> motus   12647    15
#> pcrs       96    11
#> samples    64     8
#> 
#> $dataset_statistics
#>         nb_reads nb_motus avg_reads sd_reads avg_motus sd_motus
#> pcrs     3538913    12647  36863.68 25728.92  765.1146 585.5052
#> samples  2797294    12382  43707.72 24514.09 1115.1406 376.4573
#> 

## With the predefined FUN_agg_pcrs_prob function
soil_euk_ag <- aggregate_pcrs(soil_euk, FUN = FUN_agg_pcrs_prob)
#> Warning: Some PCRs in out have a number of reads of zero in table `reads`!
summary_metabarlist(soil_euk)
#> $dataset_dimension
#>         n_row n_col
#> reads     384 12647
#> motus   12647    15
#> pcrs      384    11
#> samples    64     8
#> 
#> $dataset_statistics
#>         nb_reads nb_motus avg_reads sd_reads avg_motus sd_motus
#> pcrs     3538913    12647  9215.919 10283.45  333.6849  295.440
#> samples  2797294    12382 10926.930 10346.66  489.5117  239.685
#> 
summary_metabarlist(soil_euk_ag) ## read statistics do not make much sense in this case (values are detection probabilities, not counts).
#> $dataset_dimension
#>         n_row n_col
#> reads      96 12647
#> motus   12647    15
#> pcrs       96    11
#> samples    64     8
#> 
#> $dataset_statistics
#>         nb_reads nb_motus avg_reads sd_reads avg_motus sd_motus
#> pcrs    32033.75    12647  333.6849 269.9321  765.1146 585.5052
#> samples 31328.75    12382  489.5117 188.7353 1115.1406 376.4573
#> 

## With a custom function (here equivalent to FUN_agg_pcrs_sum,
## i.e. summing abundances of all MOTUs across replicates)
soil_euk_ag <- aggregate_pcrs(soil_euk,
                              FUN = function(metabarlist, replicates){
                                 rowsum(metabarlist$reads, replicates)})
#> Warning: Some PCRs in out have a number of reads of zero in table `reads`!
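As a further illustration of a user-defined aggregation, the sketch below keeps the maximum read count per MOTU across the replicates of each sample. The name `FUN_agg_pcrs_max` is hypothetical and does not exist in the package. The function follows the same contract as the custom example above, which is an assumption drawn from that example: it takes `metabarlist` and `replicates` and returns a matrix with one row per sample. It is exercised here on a plain list standing in for a metabarlist, not on real data.

```r
# Hypothetical custom aggregation function (not part of the package):
# for each MOTU, keep the maximum read count across PCR replicates.
FUN_agg_pcrs_max <- function(metabarlist, replicates) {
  reads <- metabarlist$reads
  # Row indices of the PCR replicates belonging to each sample.
  groups <- split(seq_len(nrow(reads)), replicates)
  # One row per sample, one column per MOTU.
  do.call(rbind, lapply(groups, function(idx) {
    apply(reads[idx, , drop = FALSE], 2, max)
  }))
}

# Toy stand-in for a metabarlist: only the `reads` element is provided.
toy <- list(reads = matrix(c(10, 0, 4,
                             20, 0, 0,
                              0, 5, 2,
                              4, 5, 0),
                           nrow = 4, byrow = TRUE,
                           dimnames = list(paste0("pcr", 1:4),
                                           paste0("motu", 1:3))))
toy_max <- FUN_agg_pcrs_max(toy, c("s1", "s1", "s2", "s2"))
```

With a real metabarlist, such a function would be passed as `FUN = FUN_agg_pcrs_max` in the same way as the custom function above.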