I have made major changes to the taxonomic table of the database, which is now compatible with SILVA taxonomy. SILVA taxonomy does not match the taxonomy in the List of Prokaryotic names with Standing in Nomenclature (LPSN), but the change was necessary: new studies in FoodMicrobionet are processed using SILVA v138, and this was causing inconsistencies in the higher level taxonomy of taxa (i.e. the same genus might have a different lineage depending on when it was added to the database). This, in turn, would prevent correct aggregation at levels higher than genus.
In case you want to know, I compared taxonomic assignments done with SILVA v132 and v138 for five studies targeting different 16S rRNA gene regions. The same sequence was assigned to a different “taxon” in as many as 70% of the cases. However, when doing comparisons at the genus level, >96% of the sequences were assigned to the same genus using either version of the database. Mismatches were mostly due to sequences (actually Amplicon Sequence Variants) which, when tested with Seqmatch, consistently had a Sab<0.80 with the best match. Again, the best way to compare studies is to reprocess data for the same target using exactly the same pipeline, but this is time consuming. Doing comparisons at the genus level is still a reasonable alternative: Article A comparison of bioinformatic approaches for 16S rRNA gene p…
The updated version of the last public release of FoodMicrobionet should be available here shortly.
As usual, we are open to collaborations: if you are interested in obtaining data from FoodMicrobionet 3.2.6, please contact me.
The latest additions to FoodMicrobionet were processed using SILVA 138 SSU. This version introduces several differences from the previous one, especially in higher level taxonomy (2/3 of the taxonomic paths have changed). I have compared taxonomic assignments made with SILVA v132 and v138 using 5 recent studies targeting different 16S rRNA gene regions (V1-V3, V3-V4, V4-V5) and found that:
- ≥95% of the sequences in each study are identified at the genus or species level in the same way
- overall, the matching identifications at the genus level range from 70 to 90%; differences are usually due to sequences of poor quality (which receive ambiguous identifications with either BLASTn or SEQMATCH)
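The kind of comparison summarised above can be sketched in a few lines of Python. This is only an illustration, not the pipeline actually used for FoodMicrobionet: the function name `genus_agreement`, the ASV identifiers and the genus assignments are all made up, and sequences left unassigned at the genus level are (arbitrarily) counted as mismatches.

```python
def genus_agreement(old, new):
    """Fraction of shared sequences assigned to the same genus in both tables.

    `old` and `new` map a sequence (ASV) identifier to a genus name,
    or to None when no genus-level assignment was possible.
    """
    shared = old.keys() & new.keys()
    if not shared:
        raise ValueError("the two tables have no sequences in common")
    # Assumption: an unassigned genus (None) never counts as a match.
    same = sum(
        1 for asv in shared
        if old[asv] is not None and old[asv] == new[asv]
    )
    return same / len(shared)


# Made-up assignments for four ASVs with two versions of a reference database.
v132 = {"ASV1": "Lactococcus", "ASV2": "Pseudomonas", "ASV3": "Leuconostoc", "ASV4": None}
v138 = {"ASV1": "Lactococcus", "ASV2": "Pseudomonas", "ASV3": "Weissella", "ASV4": None}

agreement = genus_agreement(v132, v138)
print(f"Genus-level agreement: {agreement:.0%}")
```

Whether unassigned sequences should count as mismatches or be excluded is a choice that changes the percentage, so it is worth stating explicitly when reporting this kind of figure.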
Overall, while the best way to compare results of different studies is to re-analyse the data using the same pipeline and the same version of the taxonomic database, I still feel that comparing different studies at the genus level is a reasonable compromise. In addition, with FoodMicrobionet you always have the option of selecting studies which are as close as possible in terms of target, platform and pipeline. However, due to the changes in the higher level taxonomy, I have decided to make the higher level taxonomy (i.e. above the genus level) compatible with SILVA 138 SSU, even if this is sometimes in contrast with NCBI taxonomy or LPSN.
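To see why a single reference lineage matters for aggregation, here is a minimal Python sketch. It is not the code used to update FoodMicrobionet: `harmonise`, `aggregate_by_family`, the record layout and the placeholder names (`GenusA`, `Family_old`, and so on) are hypothetical and only serve to show the effect.

```python
from collections import Counter


def harmonise(records, reference):
    """Replace the lineage above genus with the reference one, when available."""
    return [
        {**rec, "lineage": reference.get(rec["genus"], rec["lineage"])}
        for rec in records
    ]


def aggregate_by_family(records):
    """Sum counts by family (assumed to be the last element of the lineage)."""
    counts = Counter()
    for rec in records:
        counts[rec["lineage"][-1]] += rec["count"]
    return dict(counts)


# The same (placeholder) genus, added at two different times with two lineages.
records = [
    {"genus": "GenusA", "lineage": ("Order_old", "Family_old"), "count": 10},
    {"genus": "GenusA", "lineage": ("Order_new", "Family_new"), "count": 5},
]
# Hypothetical reference table: one lineage per genus.
reference = {"GenusA": ("Order_new", "Family_new")}

before = aggregate_by_family(records)
after = aggregate_by_family(harmonise(records, reference))
print(before)  # the genus is split between two families
print(after)   # all counts fall under a single family
```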
One last issue is the new classification of the former genus Lactobacillus. The new classification has been incorporated in LPSN and NCBI taxonomy, but not in Florilege or in SILVA, and searches with the old species names still work. I have therefore decided to leave things as they are (and to add a small hidden switch in the code of ShinyFMBN which allows you to convert old names into the new ones).
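The idea of an optional name switch can be illustrated as follows. This is not the ShinyFMBN code: the function `convert_name`, the flag `use_new_names` and the lookup table are assumptions made for the example, and the table lists only a handful of well-known reassignments from the reclassification of the genus.

```python
# Partial, illustrative lookup table: old species name -> new species name.
NEW_NAMES = {
    "Lactobacillus plantarum": "Lactiplantibacillus plantarum",
    "Lactobacillus casei": "Lacticaseibacillus casei",
    "Lactobacillus brevis": "Levilactobacillus brevis",
    "Lactobacillus sakei": "Latilactobacillus sakei",
}


def convert_name(name, use_new_names=False):
    """Return the new name only when the switch is on and a mapping exists."""
    if not use_new_names:
        return name
    # Names without an entry (e.g. species that stayed in Lactobacillus)
    # are returned unchanged.
    return NEW_NAMES.get(name, name)


print(convert_name("Lactobacillus plantarum"))
print(convert_name("Lactobacillus plantarum", use_new_names=True))
print(convert_name("Lactobacillus delbrueckii", use_new_names=True))
```

Keeping the switch off by default leaves the old names as the stored values, so existing searches keep working, and the conversion is applied only on request.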