The availability of Next Generation Sequencing platforms and powerful bioinformatics tools for sequence and data analysis is boosting the number of studies based on high-throughput sequencing of amplicons from food microbial communities. Although most journals require the deposit of sequences in public repositories (such as the NCBI Sequence Read Archive), accessing and using these data requires time and bioinformatics skills.
Among the many tools used to visualize data from sequencing-based studies, network analysis is particularly appealing because it provides a way to quickly visualize relationships between food matrices and microbes identified as OTUs (Operational Taxonomic Units). Although network analysis tools are closer to exploratory data analysis than to inferential statistics, they are relatively easy to use and the visualizations they provide are intuitive. Cytoscape is perhaps the most frequently used tool (and provides plugins for the creation and analysis of co-occurrence and correlation networks), but other, more powerful tools are available to expert users. With this in mind, it is tempting to imagine a database and a representation tool that would allow users to carry out meta-analyses of food microbial communities rapidly and efficiently. Such a tool may benefit the scientific community and industrial or public stakeholders in several ways:
- providing access to a large set of curated data on the occurrence of different taxa in foods, facilitating the process of formulating and validating hypotheses on the structure and dynamics of food bacterial communities and writing original articles and reviews;
- fostering open access to microbial ecology data;
- improving our understanding of the ecology of spoilage-associated and beneficial microorganisms;
- providing information on the structure of bacterial communities in raw materials and in fermented and spoiled foods, which can be used for food process development.
A small consortium of research groups (a list is published here) has therefore agreed to share data from published and unpublished studies to create a demo for the initiative.
FoodMicrobionet has been created with Gephi (a network visualization tool which was originally developed for the social sciences and which provides more control over visualisation than Cytoscape). As of 23/2/2018, two versions have been released. To learn more about the development of FoodMicrobionet tools click here.
The figure on the left shows a static representation of the whole network for FoodMicrobionet v 1. The colour of nodes represents food categories or OTUs (different colours are used for different families). Here and in the figures below, the relative importance of the OTUs in the dataset is shown by the size of their nodes, which is related to the weighted degree (i.e. the weighted sum of all edges of a node: for samples this sums to 100 by default, while for OTUs it is the weighted sum of all edges connecting the OTU to samples). The thickness of the edge connecting a given sample with a given OTU represents the frequency (%) of that OTU in the sample. A Yifan Hu layout has been applied to the network to highlight similarities among samples and proximities between samples and the OTUs that dominate their microbiota, and to identify core microbial communities in different samples.
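As a rough illustration of how the weighted degree behaves, the sketch below sums edge weights per node on a tiny, made-up edge list. The sample names, OTU names and weights are hypothetical and are not taken from FoodMicrobionet; Gephi computes this statistic itself, so the code only mirrors the arithmetic.

```python
from collections import defaultdict

# Hypothetical edge list: (sample, OTU, weight as % of the sample's reads).
edges = [
    ("cheese_1", "Lactococcus", 70.0),
    ("cheese_1", "Streptococcus", 30.0),
    ("milk_1", "Lactococcus", 40.0),
    ("milk_1", "Pseudomonas", 60.0),
]

def weighted_degree(edge_list):
    """Sum the weights of all edges touching each node (sample or OTU)."""
    totals = defaultdict(float)
    for sample, otu, weight in edge_list:
        totals[sample] += weight  # for a sample, the weights sum to 100
        totals[otu] += weight     # for an OTU, the sum grows with each sample
    return dict(totals)

degrees = weighted_degree(edges)
print(degrees["cheese_1"])     # 100.0
print(degrees["Lactococcus"])  # 110.0
```

In this toy example every sample node has a weighted degree of 100, while an OTU found in several samples accumulates a larger value and would be drawn as a larger node.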
Although the figure might be pretty, by itself it provides little insight. The real power of the network lies in the possibility of rapidly filtering and processing data to obtain visualisations at different levels of depth, and of exporting the filtered data tables (nodes and edges) for further processing.
By itself, Gephi offers rapid selection tools to the user. Two examples of "on the fly" selection are provided below: one in which all sample nodes in which Pseudomonas appears are selected, and one in which all OTUs occurring in a single sample are selected.
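Outside Gephi, the same two selections can be approximated on an exported edge table. The following is a minimal sketch that assumes edges are available as (sample, OTU, weight) tuples and that the first word of an OTU label is its genus; the data and the function names are hypothetical.

```python
# Hypothetical edge table rows: (sample, OTU label, weight in %).
edges = [
    ("cheese_1", "Lactococcus lactis", 80.0),
    ("cheese_1", "Pseudomonas fragi", 20.0),
    ("milk_1", "Pseudomonas fluorescens", 55.0),
    ("milk_1", "Lactococcus lactis", 45.0),
    ("ham_1", "Brochothrix thermosphacta", 100.0),
]

def samples_with_genus(edge_list, genus):
    """Return the sorted samples linked to at least one OTU of the given genus."""
    return sorted({s for s, otu, _ in edge_list if otu.split()[0] == genus})

def otus_in_sample(edge_list, sample):
    """Return the OTUs linked to one sample, most abundant first."""
    hits = [(otu, w) for s, otu, w in edge_list if s == sample]
    return [otu for otu, _ in sorted(hits, key=lambda h: h[1], reverse=True)]

print(samples_with_genus(edges, "Pseudomonas"))  # ['cheese_1', 'milk_1']
print(otus_in_sample(edges, "milk_1"))
```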
Simple or complex filters can be used to select subsets of the microbiota in samples belonging to a given food category. In the examples below two networks, for cheese and cow milk, are shown. In both cases filters for the dominant OTUs were applied to hide OTUs appearing only in a limited number of samples. The colour of sample nodes is set to grey scale, while the colours of OTU nodes match families.
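The logic of such a filter can be sketched in a few lines. The example below keeps only the edges of one food category and then hides OTUs found in fewer than a chosen number of samples; the data, the `min_samples` threshold and the function name are assumptions made for illustration, not the filters actually used to produce the figures.

```python
from collections import defaultdict

# Hypothetical sample metadata and edge table (sample, OTU, weight in %).
sample_category = {"s1": "cheese", "s2": "cheese", "s3": "cheese", "s4": "cow milk"}
edges = [
    ("s1", "Lactococcus", 90.0), ("s1", "Hafnia", 10.0),
    ("s2", "Lactococcus", 60.0), ("s2", "Streptococcus", 40.0),
    ("s3", "Lactococcus", 50.0), ("s3", "Streptococcus", 50.0),
    ("s4", "Pseudomonas", 100.0),
]

def filter_network(edge_list, categories, category, min_samples):
    """Keep edges of one food category whose OTU occurs in >= min_samples samples."""
    in_category = [e for e in edge_list if categories[e[0]] == category]
    samples_per_otu = defaultdict(set)
    for sample, otu, _ in in_category:
        samples_per_otu[otu].add(sample)
    kept = {otu for otu, found in samples_per_otu.items() if len(found) >= min_samples}
    return [e for e in in_category if e[1] in kept]

cheese_core = filter_network(edges, sample_category, "cheese", min_samples=2)
print(sorted({otu for _, otu, _ in cheese_core}))  # ['Lactococcus', 'Streptococcus']
```

Here Hafnia is hidden because it occurs in a single cheese sample, and Pseudomonas is excluded because its only sample belongs to a different category.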
The structure of the data tables used in FoodMicrobionet is evolving. In version 1.0, edge tables include fields for source (food) and target (OTU) nodes, weight (the frequency of the OTU in the sample, as %) and a number of fields for filtering purposes. Node tables include metadata for each OTU and food sample (including labels, taxonomic lineages and links out to other resources). These metadata can be used to select sample and OTU nodes based on different properties and to apply partition and ranking styles to nodes. Specifications for nodes and edges can be found here.
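To make the table layout more concrete, here is a small sketch of node and edge tables written as CSV. The `Source`, `Target` and `Weight` columns for edges and `Id` and `Label` for nodes follow the column names that Gephi's spreadsheet import recognises; every other field name and all values are hypothetical placeholders, not the actual FoodMicrobionet specification.

```python
import csv
import io

# Hypothetical node table: one food sample and one OTU.
nodes = [
    {"Id": "s1", "Label": "raw cow milk", "node_type": "sample", "lineage": ""},
    {"Id": "o1", "Label": "Lactococcus lactis", "node_type": "OTU",
     "lineage": "Firmicutes;Bacilli;Lactobacillales;Streptococcaceae;Lactococcus"},
]

# Hypothetical edge table: the weight is the frequency (%) of the OTU in the sample,
# and food_category stands in for the extra fields used for filtering.
edges = [{"Source": "s1", "Target": "o1", "Weight": 42.5, "food_category": "cow milk"}]

def to_csv(rows):
    """Serialise a list of dicts to CSV text, using the first row's keys as header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

node_csv = to_csv(nodes)
edge_csv = to_csv(edges)
print(edge_csv)
```

Keeping the filtering fields as plain columns means they travel with the table and remain available for partitioning or ranking after import.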
In the future, more fields will be added with information on ecologically relevant properties of foods (aW, pH, temperature of storage or of production, main ingredients).
Furthermore, interactive visualisations of the network, or of sub-networks extracted by filtering, can be easily obtained by using the Sigmajs exporter plugin of Gephi or similar tools. An example of an interactive graph can be found here. The web visualisation is relatively easy to explore even for inexperienced users. A user manual is available here.
Note: FMBN is undergoing a major change. Visit the version pages for updates. Although web visualisations will still be provided whenever possible, visualisation and processing will mostly be carried out through Shiny apps.
Future plans include:
- disseminating the results of the initiative in journals and at scientific meetings;
- improving the accessibility of FoodMicrobionet by creating interactive apps;
- finding sponsorships from publishers, scientific societies and stakeholders;
- participating in calls for funding within the framework of Horizon 2020.
Are you interested? Please feel free to contact us by e-mail.