Data from a DNA metabarcoding experiment assessing the diversity of soil eukaryotes in two tropical forests in French Guiana.
data(soil_euk)
An object of class metabarlist; see check_metabarlist.
Samples were collected at two sampling sites in contrasting habitats:
The Mana site, located in a white sand forest characterized by highly oligotrophic soils and tree species adapted to the harsh local conditions.
The Petit Plateau site, located in pristine rainforest (Nouragues Natural Reserve) characterized by soils rich in clay and organic matter.
At each site, sample collection was conducted at 16 sampling points separated from one another by 20 m and arranged in a grid across a 1 ha plot. At each sampling point, two types of environmental material were collected:
soil: a composite sample of 5 soil cores
litter: 1 m² of surface leaf litter from the forest floor
A total of 64 DNA extracts (16 sampling points x 2 sites x 2 types of environmental matrix, i.e. soil and litter) were thus produced, in addition to four DNA extraction controls (one per site and environmental matrix).
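The sampling design described above can be enumerated in base R as a quick check of the extract counts. This is only a minimal sketch: the labels used for sites, sampling points and materials are illustrative placeholders and are not taken from the `soil_euk` tables.

```r
# Enumerate the sampling design: 2 sites x 16 sampling points x 2 materials.
# All labels below are illustrative placeholders, not identifiers from soil_euk.
design <- expand.grid(
  site     = c("Mana", "Petit Plateau"),
  point    = seq_len(16),
  material = c("soil", "litter"),
  stringsAsFactors = FALSE
)

# One row per biological DNA extract
n_extracts <- nrow(design)

# One extraction control per combination of site and material
n_extraction_controls <- nrow(unique(design[, c("site", "material")]))

n_extracts
n_extraction_controls
```

The two values correspond to the 64 DNA extracts and the four extraction controls stated in the text.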
For each DNA extract, a short region of the 18S rRNA (primer pair Euka02 in Taberlet et al. 2018) was amplified by PCR in quadruplicate, following the protocol described in Zinger et al. (2019). The resulting PCR products were pooled and sequenced on an Illumina HiSeq platform using paired-end technology.
The total experiment hence resulted in 384 PCR products consisting of:
256 PCR products obtained from biological samples (16 sampling points x 2 sites x 2 environmental matrices x 4 PCR replicates).
16 PCR products obtained from extraction negative controls (4 extraction negative controls x 4 PCR replicates).
32 PCR products corresponding to PCR negative controls (8 PCR negative controls x 4 PCR replicates).
48 sequencing negative controls (i.e. 48 unused tag combinations).
32 PCR products corresponding to PCR positive controls, i.e. PCR amplifications of a DNA template composed of a mixture of DNA from 16 plant species (8 PCR positive controls at different dilutions x 4 PCR replicates).
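The breakdown above can be tallied in a few lines of base R to confirm that the five categories add up to 384 PCR products. The names of the vector elements are illustrative labels chosen for this sketch, not the values stored in the `pcrs` table.

```r
# Each template was amplified in quadruplicate
n_replicates <- 4

# Number of PCR products per category (element names are illustrative)
pcr_counts <- c(
  sample              = 16 * 2 * 2 * n_replicates,  # points x sites x matrices
  extraction_control  = 4 * n_replicates,
  pcr_negative        = 8 * n_replicates,
  sequencing_negative = 48,                         # unused tag combinations
  pcr_positive        = 8 * n_replicates
)

pcr_counts
sum(pcr_counts)
```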
The retrieved data were then processed using the OBITools (Boyer et al. 2016) and SUMACLUST (Mercier et al. 2013) packages. Briefly, paired-end reads were assembled, assigned to their respective samples/marker and dereplicated. Low-quality sequences (containing Ns, shorter than 50 bp or singletons) were excluded; the remaining sequences were clustered into molecular operational taxonomic units (MOTUs) using SUMACLUST at a sequence similarity threshold of 0.97. The representative sequence of each MOTU (the most abundant one) was assigned a taxonomic clade using a database built from the EMBL (release 136) with the ecoPCR program (Ficetola et al. 2010). Taxonomic assignments obtained from the SILVAngs pipeline (Quast et al. 2013; using default parameters for the taxonomic identification) are also available at https://github.com/metabaRfactory/metabaR_external_data (lit_euk---ssu* files).
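To make the three exclusion criteria concrete, the sketch below applies them to a small invented table of dereplicated sequences in base R. It is not the OBITools workflow used to produce the data; the sequences, counts and column names are made up for illustration.

```r
# Toy dereplicated sequences with their read counts (invented data)
seqs <- data.frame(
  sequence = c(
    strrep("ACGT", 20),               # 80 bp, 12 reads: kept
    paste0(strrep("ACGT", 15), "N"),  # contains an N: dropped
    strrep("AC", 10),                 # 20 bp, too short: dropped
    strrep("GGCA", 25)                # 100 bp but seen once: dropped
  ),
  count = c(12, 30, 8, 1),
  stringsAsFactors = FALSE
)

# The three criteria named in the text
has_n        <- grepl("N", seqs$sequence, fixed = TRUE)
too_short    <- nchar(seqs$sequence) < 50
is_singleton <- seqs$count == 1

# Keep only sequences that fail none of the criteria
kept <- seqs[!(has_n | too_short | is_singleton), ]
nrow(kept)
```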
The data `soil_euk` is a metabarlist containing four tables:
`reads`: a numeric matrix containing MOTU abundances (expressed as a number of reads) for each PCR (i.e. technical replicates of both biological samples and positive and negative controls)
`motus`: a dataframe containing MOTU characteristics (e.g. taxonomy, sequence) for each MOTU
`pcrs`: a dataframe containing information on each PCR (e.g. control type, PCR well, etc.)
`samples`: a dataframe containing information on each environmental sample (e.g. habitat type, etc.)
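The way the four tables relate to one another can be illustrated with a toy object built in base R. This is a plain list, not a real metabarlist, and every identifier, column name and value in it is invented for the sketch. It assumes that PCRs index the rows of `reads`, MOTUs index its columns, and each PCR points to a row of `samples`. For the real data, `check_metabarlist` is the function referred to above.

```r
# Toy object mimicking the four-table layout (all names and values invented)
toy <- list(
  reads = matrix(
    c(10, 0, 3,
       0, 5, 1),
    nrow = 2, byrow = TRUE,
    dimnames = list(c("pcr_1", "pcr_2"), c("motu_a", "motu_b", "motu_c"))
  ),
  motus = data.frame(
    sequence = c("ACGT", "GGCA", "TTAG"),
    row.names = c("motu_a", "motu_b", "motu_c"),
    stringsAsFactors = FALSE
  ),
  pcrs = data.frame(
    sample_id = c("sample_1", "sample_1"),  # assumed link to the samples table
    type = c("sample", "sample"),
    row.names = c("pcr_1", "pcr_2"),
    stringsAsFactors = FALSE
  ),
  samples = data.frame(
    habitat = "white sand forest",
    row.names = "sample_1",
    stringsAsFactors = FALSE
  )
)

# Cross-table consistency: PCRs index rows of reads, MOTUs index its columns,
# and every PCR refers to a known sample
tables_agree <- identical(rownames(toy$reads), rownames(toy$pcrs)) &&
  identical(colnames(toy$reads), rownames(toy$motus)) &&
  all(toy$pcrs$sample_id %in% rownames(toy$samples))
tables_agree
```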
Boyer, F., Mercier, C., Bonin, A., Le Bras, Y., Taberlet, P., & Coissac, E. (2016). obitools: a unix-inspired software package for DNA metabarcoding. Molecular Ecology Resources, 16(1), 176-182.
Ficetola, G. F., Coissac, E., Zundel, S., Riaz, T., Shehzad, W., Bessière, J., ... & Pompanon, F. (2010). An in silico approach for the evaluation of DNA barcodes. BMC Genomics, 11(1), 434.
Mercier, C., Boyer, F., Bonin, A., & Coissac, E. (2013, November). SUMATRA and SUMACLUST: fast and exact comparison and clustering of sequences. In Programs and Abstracts of the SeqBio 2013 workshop. Abstract (pp. 27-29).
Taberlet, P., Bonin, A., Zinger, L., & Coissac, E. (2018). Environmental DNA: For Biodiversity Research and Monitoring. Oxford University Press.
Zinger, L., Taberlet, P., Schimann, H., Bonin, A., Boyer, F., De Barba, M., ... & Chave, J. (2019). Body size determines soil community assembly in a tropical forest. Molecular Ecology, 28(3), 528-543.
Quast, C., Pruesse, E., Yilmaz, P., Gerken, J., Schweer, T., Yarza, P., ... & Glöckner, F. O. (2013). The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Research, 41(D1), D590-D596.
data(soil_euk)
summary_metabarlist(soil_euk)
#> $dataset_dimension
#>         n_row n_col
#> reads     384 12647
#> motus   12647    15
#> pcrs      384    11
#> samples    64     8
#>
#> $dataset_statistics
#>         nb_reads nb_motus avg_reads sd_reads avg_motus sd_motus
#> pcrs     3538913    12647  9215.919 10283.45  333.6849  295.440
#> samples  2797294    12382 10926.930 10346.66  489.5117  239.685
#>
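The average read counts in this summary can be cross-checked by hand from the totals. The sketch below assumes that the `pcrs` row averages over all 384 PCRs and that the `samples` row averages over the 256 PCRs obtained from biological samples. That reading is an assumption, but the arithmetic is consistent with it.

```r
# Figures copied from the summary output
nb_reads_pcrs    <- 3538913
n_pcrs           <- 384
nb_reads_samples <- 2797294
n_sample_pcrs    <- 16 * 2 * 2 * 4  # assumed: 256 PCRs from biological samples

avg_reads_pcrs    <- round(nb_reads_pcrs / n_pcrs, 3)
avg_reads_samples <- round(nb_reads_samples / n_sample_pcrs, 3)

avg_reads_pcrs
avg_reads_samples
```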