Table2itol
Generating iToL annotations from Spreadsheet or CSV files
table2itol
About
Interactive Tree of Life (iTOL) is a popular tool for
displaying phylogenetic trees and associated information. The table2itol.R
script makes it easy to generate iTOL annotations from spreadsheet files.
Features
- Works with CSV, OpenOffice, LibreOffice and Microsoft Excel files.
- Supports iTOL domains, colour strips, simple bars, gradients, binary data, heat maps, and texts.
- Partially supports iTOL branch annotation (currently work in progress).
- By default selects the appropriate visualisation from the data type of each input column, but this can be modified by the user.
- Provides carefully chosen colour vectors for up to 40 levels and optionally combines them with symbols for maximizing contrast.
- The default colour vectors can be replaced by user-defined colour vectors.
- Can be used either interactively on any operating system on which R is running, or non-interactively using the command line of a UNIX-like system.
Prerequisites
- A recent (>= 3.2.0) version of R.
- The optparse package for R if you want to run the script in non-interactive mode or if you want to read the help message (which is placed in tests/table2itol_help.txt, too).
- The plotrix package for R if you want to generate branch annotations from continuous numeric data.
- The readxl package for R if you want to apply the script to Microsoft Excel files.
- The readODS package for R if you want to apply the script to LibreOffice or OpenOffice ods files.
- The yaml package for R if you want to define colour vectors yourself.
Please note that explaining how to correctly install R is beyond the scope of
this manual, and please do not contact the table2itol.R authors about this
issue. There is plenty of online material available elsewhere. As for the
installation of R packages see the FAQ below.
Installation
First, obtain the script as indicated on its GitHub page.
Command-line use
The following explanations are for non-experts; there is nothing special about running this script in command-line mode on UNIX-like systems. First, if necessary, make the script executable:
chmod +x table2itol.R
Then call:
./table2itol.R
to obtain the help message. If this yields an error, see the troubleshooting chapter.
Optionally place the script in a folder that is contained in the $PATH
variable, e.g.
install table2itol.R ~/bin
or even
sudo install table2itol.R /usr/local/bin
if you have sudo permissions. Then you can call the script by just entering
table2itol.R
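Calling the script by its bare name only works if the install directory is listed in $PATH. A minimal sketch for checking this, assuming a POSIX shell; the helper name dir_in_path is hypothetical and not part of table2itol:

```shell
# Hypothetical helper (not part of table2itol): succeeds if the given
# directory is one of the colon-separated entries of $PATH.
dir_in_path() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Report whether ~/bin, the install target used above, is on the search path.
if dir_in_path "$HOME/bin"; then
  echo "$HOME/bin is in PATH"
else
  echo "$HOME/bin is not in PATH; add it in your shell start-up file"
fi
```

If the directory is missing from $PATH, calling the script with its full path still works.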
Interactive use
Open R or RStudio or whatever interface to R you are using, then enter at the console:
source("table2itol.R")
provided the script is located in the current working directory as given by
getwd(). Alternatively, first use setwd() to move to the directory in which
table2itol.R resides or enter the full path to the location of the script.
When loading the script it shows the usual help message and an indication that you are running it in interactive mode. When doing so, you might need to modify the arguments of the function much like command-line users might need to apply certain command-line options. For instance, in analogy to entering:
./table2itol.R --na-strings X --identifier Tip --label Name ann1.tsv ann2.tsv
on the command line of a UNIX-like system, you would enter within R the following:
source("table2itol.R")
create_itol_files(infiles = c("ann1.tsv", "ann2.tsv"),
identifier = "Tip", label = "Name", na.strings = "X")
The analogy should be obvious, hence for details on the arguments of
create_itol_files see the help message. The arguments of the function are
identical to the long versions of the script's options, except that dashes are
replaced by dots to yield syntactic names. The sole mandatory
argument of the function is infiles, whose value is identical to the
positional arguments of the script. With some basic knowledge of R it is thus
easy to set up customized scripts that set the arguments for your input files
and generate the intended output.
Examples
Exemplars for input table files are found within the tests/INPUT folder. A
list of examples for calling table2itol.R is found in tests/examples.txt.
Experts only: On a UNIX-like system you can run these examples by calling
tests/run_tests.sh provided a modern
Bash is installed. The versions of R and
the R packages used for testing by the maintainer are found in the file
tests/R_settings.txt.
Troubleshooting
Some commonly encountered error messages are mentioned in the following. Note that you might actually get these error messages in a language other than English (e.g., your own language) or with other minor modifications.
Command-line use
Bad interpreter
/usr/local/bin/Rscript: bad interpreter: No such file or directory
Solution: Enter
locate Rscript
and watch the output. If it is empty, you must install
R first. If you instead obtained a location such
as /usr/bin/Rscript you could do the following:
sudo ln -s /usr/bin/Rscript /usr/local/bin/Rscript
if you had sudo permissions. Alternatively, within the first line of the script
replace /usr/local/bin/Rscript by /usr/bin/Rscript or wherever your
Rscript executable is located. A third option is to leave the script as-is and
enter Rscript table2itol.R instead of ./table2itol.R or whatever location of
the script you are using. But this is less convenient in the long run.
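To see the mismatch at a glance, the interpreter named in the first line of the script can be compared with the Rscript executable the shell finds. The following is a sketch under stated assumptions: the helper name shebang_interpreter is hypothetical, and command -v is used here in place of locate, so only directories in $PATH are searched.

```shell
# Hypothetical helper (not part of table2itol): print the interpreter path
# named in the first line ("#!...") of a script, without any arguments.
shebang_interpreter() {
  head -n 1 "$1" | sed -n 's/^#![[:space:]]*\([^[:space:]]*\).*/\1/p'
}

script=table2itol.R   # adjust to the location of your copy
if [ -f "$script" ]; then
  wanted=$(shebang_interpreter "$script")
  found=$(command -v Rscript || true)
  echo "script expects: $wanted"
  echo "Rscript found:  ${found:-nowhere in PATH}"
fi
```

If the two paths differ, any of the three options described above should resolve the error.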
Please note that explaining how to correctly install R is beyond the scope of
this manual, and please do not contact the table2itol.R authors about this
issue. There is plenty of online material available elsewhere.
Command-line or interactive use
Missing R package
there is no package called 'optparse'
Solution: Install the optparse
package for R. (It is not an absolute requirement in interactive mode but
without it you would not see the help message. However, this message is placed
in tests/table2itol_help.txt anyway.)
there is no package called 'plotrix'
Solution: Install the plotrix package for R. (It is only needed if you want to create branch annotations from continuous numeric data.)
there is no package called 'readODS'
Solution: Install the readODS package for R. (It is only needed if you want to apply the script to ods files.)
there is no package called 'readxl'
Solution: Install the readxl package for R. (It is only needed if you want to apply the script to Microsoft Excel files.)
there is no package called 'yaml'
Solution: Install the yaml package for R. (It is only needed if you want to use the script in conjunction with your own colour vectors.)
Please note that explaining how to correctly install R is beyond the scope of
this manual, and please do not contact the table2itol.R authors about this
issue. There is plenty of online material available elsewhere. As for the
installation of R packages see the FAQ below.
Outdated R version
need a newer version of R, 3.2.0 or higher
Solution: Install a newer version of R.
Please note that explaining how to correctly install R is beyond the scope of
this manual, and please do not contact the table2itol.R authors about this
issue. There is plenty of online material available elsewhere.
The script does not generate enough output files
Solution: Watch the warnings and error messages generated by the script. Without
any input files, the script should not generate any output. The script would
also skip input files or single tables if they failed to contain columns you
have requested. You can use the --abort option to let the script immediately
stop in such cases, then look up the last error message in this manual. But even
without --abort the script generates warnings when data sets get skipped.
The script generates too many output files
Solution: Accept as a design decision that the script generates one file for
each input column (except for the tip identifier column and when a heat map is
created). Since you can still decide to not upload (some of) the generated files
to iTOL and also deselect data sets within iTOL, we believe it would not make
much sense to also include a selection mechanism within the table2itol.R
script. As a last resort you could also reduce the number of input columns.
However, if you are mainly concerned about the script cluttering up your working
directory with files, simply consider using the --directory option to place
all output files in a dedicated directory. An empty argument to this option
causes the script to place every output file in the directory in which the
respective input file resides.
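As an illustration, the --directory option could be combined with the options from the earlier example roughly as follows. This is a sketch only: the directory name itol_out is an arbitrary choice, and the input files and column names are the ones used earlier in this manual, not requirements.

```shell
# Sketch only: the output directory name "itol_out" is an arbitrary choice.
outdir=itol_out
mkdir -p "$outdir"

# Run the script only if it is present and executable in the current directory.
if [ -x ./table2itol.R ]; then
  ./table2itol.R --directory "$outdir" --identifier Tip --label Name ann1.tsv ann2.tsv
fi

# Count the files now in the output directory.
ls -1 "$outdir" | wc -l
```

Keeping the output in one directory also makes it easy to delete all generated files before a fresh run.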
A column is requested but missing
selected column 'ID' does not exist
Solution: Use the --identifier option to set the name of the tip identifier
column.
selected column 'Label' does not exist
Solution: Use the --label option to set the name of the tip label column.
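Before setting --identifier or --label it can help to look up the column names actually present in the input. A minimal sketch, assuming a tab-separated file whose first row holds the column names; the helper name list_columns is hypothetical, and quoted fields that contain tabs are not handled:

```shell
# Hypothetical helper (not part of table2itol): print each column name of a
# tab-separated file on its own line, taken from the first (header) row.
list_columns() {
  head -n 1 "$1" | tr '\t' '\n'
}

infile=ann1.tsv   # example file name from this manual
if [ -f "$infile" ]; then
  list_columns "$infile"
fi
```

The names printed this way are the candidates for the values of --identifier and --label.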
A name clash of output file names occurs
name clash: file [...] has already been generated
Solution: Name the columns distinctly in distinct tables within the same file
