Import/export scripts

This set of R (R Core Team, 2015) scripts helps reshape OTU abundance tables for import into network analysis software such as Gephi (and hence into FoodMicrobionet). I have tried to make the scripts as foolproof as possible, but there is no guarantee that they will perform correctly if you do not follow the instructions.

To use the scripts you need R; RStudio is optional. It is convenient to keep the scripts and files in a directory on your desktop called “OTUTables”. Please note: the scripts have been tested using R and RStudio on systems running MacOS11 or Windows 8. If the scripts misbehave on your system, please let me know (send along the R workspace of the sessions in which you observed problems and describe the problem as clearly as you can).

The scripts on this page are saved with a .txt extension (this makes uploading and downloading in WordPress easier). Just open them in a text editor, then copy and paste the contents into a new document in R or into the script window of RStudio (File->New file->R Script).

If you want to open them as R scripts, save a copy and change the extension to .R. You can then open the scripts using File->Open document in R or File->Open file… in RStudio.

The OTUtabtoedgefilev2.R script takes a tab-delimited OTU abundance table (examples here and here) as input and reads it into a data frame in wide format (i.e. OTUs in the first column, as character, followed by one column of abundances for each sample). It checks that sample names conform to the syntax for variable names in R (i.e. they must start with a letter or with a dot not followed by a number, and can contain only letters, numbers, dots and underscores). It then builds the edges table, retaining only sample-OTU combinations with abundance >0. This script uses only functions in the R base package.
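The steps described above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the OTUtabtoedgefilev2.R script itself: the toy table, the OTU names and the object names (otu_tab, edges) are hypothetical, and the column names Source, Target and Weight are borrowed from the edge table structure used by FoodMicrobionet.

```r
# Hypothetical tab-delimited input: OTUs in the first column, one column per sample.
raw <- "OTU\tS1\tS2\nLactobacillus\t60\t0\nLeuconostoc\t40\t25\nPseudomonas\t0\t75"
otu_tab <- read.delim(text = raw, check.names = FALSE, stringsAsFactors = FALSE)

# Check that sample names are syntactically valid R names.
sample_names <- names(otu_tab)[-1]
stopifnot(all(sample_names == make.names(sample_names)))

# Build one block of edges per sample, then stack the blocks.
edge_list <- lapply(sample_names, function(s) {
  data.frame(Source = s, Target = otu_tab$OTU, Weight = otu_tab[[s]],
             stringsAsFactors = FALSE)
})
edges <- do.call(rbind, edge_list)

# Keep only sample-OTU combinations with abundance > 0.
edges <- edges[edges$Weight > 0, ]
rownames(edges) <- NULL
```

Reading with check.names = FALSE keeps the sample names exactly as they appear in the file, so the check fails visibly instead of R silently renaming the columns.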

A simpler script (OTUtabedgemeltv3.R) uses the “melt” function of the reshape2 package (Wickham, 2007) to transform the table into long format, suitable for use as an edges table.
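A minimal sketch of the same transformation with melt is shown here. It is not the OTUtabedgemeltv3.R script: the toy table and object names are hypothetical, and the final filter on abundance >0 is an assumption carried over from the base R approach.

```r
library(reshape2)

# Hypothetical wide table: OTUs in the first column, one column per sample.
otu_tab <- data.frame(
  OTU = c("Lactobacillus", "Leuconostoc", "Pseudomonas"),
  S1 = c(60, 40, 0),
  S2 = c(0, 25, 75),
  stringsAsFactors = FALSE
)

# Melt to long format: one row per sample-OTU combination.
edges <- melt(otu_tab, id.vars = "OTU",
              variable.name = "Source", value.name = "Weight")

# Rename and reorder columns to the Source/Target/Weight layout.
names(edges)[names(edges) == "OTU"] <- "Target"
edges$Source <- as.character(edges$Source)

# Assumption: drop zero-abundance combinations, as in the base R version.
edges <- edges[edges$Weight > 0, c("Source", "Target", "Weight")]
```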

Please note: in both cases, duplicate OTU entries are not summed. Duplicates must be removed separately (the weights of duplicate entries should be summed) before importing the edges table.
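One possible way to sum the weights of duplicate entries before import is the base R aggregate function. The toy edge table below is hypothetical and this is only a sketch; check the result against your own data.

```r
# Hypothetical edge table in which S1-Leuconostoc appears twice.
edges <- data.frame(
  Source = c("S1", "S1", "S1", "S2"),
  Target = c("Leuconostoc", "Leuconostoc", "Lactobacillus", "Pseudomonas"),
  Weight = c(10, 30, 60, 100),
  stringsAsFactors = FALSE
)

# Sum the weights of rows sharing the same sample and OTU.
edges_summed <- aggregate(Weight ~ Source + Target, data = edges, FUN = sum)
```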

In both cases the OTU abundance table and the edge table are saved as .Rdata and tab-delimited .txt files with the following names:

  1. OTU abundance table: name.Rdata, name.txt where name is the filename of the original abundance table
  2. edge file: nameedge.Rdata, nameedge.txt where name is the filename of the original abundance table
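The naming scheme above could be reproduced along these lines. This is a sketch under assumptions, not the code used in the scripts: the input file name is hypothetical, “name” is assumed to be the file name without its extension, and the files are written to a temporary directory so that nothing is overwritten.

```r
# Hypothetical input file name; "name" is assumed to exclude the extension.
input_file <- "mytable.txt"
name <- tools::file_path_sans_ext(basename(input_file))

# Toy tables standing in for the real abundance and edge tables.
otu_tab <- data.frame(OTU = c("Lactobacillus", "Leuconostoc"),
                      S1 = c(60, 40), stringsAsFactors = FALSE)
edges <- data.frame(Source = "S1", Target = otu_tab$OTU,
                    Weight = otu_tab$S1, stringsAsFactors = FALSE)

# Write to a temporary directory so the sketch does not overwrite anything.
out_dir <- tempdir()
save(otu_tab, file = file.path(out_dir, paste0(name, ".Rdata")))
write.table(otu_tab, file = file.path(out_dir, paste0(name, ".txt")),
            sep = "\t", quote = FALSE, row.names = FALSE)
save(edges, file = file.path(out_dir, paste0(name, "edge.Rdata")))
write.table(edges, file = file.path(out_dir, paste0(name, "edge.txt")),
            sep = "\t", quote = FALSE, row.names = FALSE)
```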

Note: in FoodMicrobionet the sum of abundances is standardized to 100 before import. I have tested the scripts with a few input formats for abundance tables. If you notice that the format you use returns wrong results for the edge table, let me know (and send the abundance table).
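If your abundances are raw counts, a standardization to 100 might look like the sketch below. It assumes the sum is taken per sample (i.e. per column of the wide table); the toy table and object names are hypothetical.

```r
# Hypothetical wide table with raw counts.
otu_tab <- data.frame(
  OTU = c("Lactobacillus", "Leuconostoc", "Pseudomonas"),
  S1 = c(120, 80, 0),
  S2 = c(0, 50, 150),
  stringsAsFactors = FALSE
)

# Divide each sample column by its total and rescale to 100.
abund <- as.matrix(otu_tab[, -1])
std <- sweep(abund, 2, colSums(abund), "/") * 100

# Reattach the OTU column.
otu_std <- data.frame(OTU = otu_tab$OTU, std, stringsAsFactors = FALSE)
```

A sample whose abundances sum to zero would produce NaN values here, so such columns should be removed first.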

You can perform the reverse operation (i.e. transform an edge table into an OTU abundance table) using this script, which makes use of the dcast function of the reshape2 package. Be extra careful with the structure of the file. A newer version of the script, which comes in handy when using FMBN 2.0 edge tables and can generate files for use with the bipartite package of R (more on this later), is available here. Here too, be extra careful with the structure of the input file: at least 4 columns are required (Source: the sample name; Target: the OTU lineage in the format used in FoodMicrobionet; OTULabel: a character column with unique labels for each OTU; Weight: the abundance, in %). Be careful with this script, however: it must be run in pieces.
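The core of the reverse operation can be illustrated with dcast as follows. This is a simplified sketch, not either of the scripts linked above: the toy edge table is hypothetical, uses plain genus names instead of FoodMicrobionet lineages, and omits the OTULabel column.

```r
library(reshape2)

# Hypothetical edge table with the Source, Target and Weight columns.
edges <- data.frame(
  Source = c("S1", "S1", "S2", "S2"),
  Target = c("Lactobacillus", "Leuconostoc", "Leuconostoc", "Pseudomonas"),
  Weight = c(60, 40, 25, 75),
  stringsAsFactors = FALSE
)

# Cast to wide format: one row per OTU, one column per sample.
# Missing sample-OTU combinations are filled with 0.
otu_tab <- dcast(edges, Target ~ Source, value.var = "Weight",
                 fun.aggregate = sum, fill = 0)
```

With fun.aggregate = sum, duplicate sample-OTU rows in the edge table are summed rather than counted, which is usually what is wanted for abundances.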

Literature

Wickham H. (2007). Reshaping Data with the reshape Package. Journal of Statistical Software, 21(12), 1-20. URL http://www.jstatsoft.org/v21/i12/.

R Core Team (2015). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/