VICTOR FAQ

  1. How is my privacy respected?
  2. Who can use this service?
  3. What does "VICTOR" mean?
  4. How do I cite this service?
  5. How do I describe the methods and the results?
  6. What are the main advantages of using this service?
  7. How long does a job take?
  8. Which kind of data should be uploaded to the server?
  9. Which distance formula should be preferred?
  10. How can I use the files attached to the result e-mails?

Answers

1. How is my privacy respected?

See the corresponding entry in the gene phylogeny FAQ.

2. Who can use this service?

See the corresponding entry in the gene phylogeny FAQ.

3. What does "VICTOR" mean?

"VICTOR" stands for "Virus Classification and Tree Building Online Resource".

Use VICTOR if you want to infer phylogenies from the genome or proteome sequences of (prokaryotic and potentially other) viruses and/or obtain estimates for taxon boundaries at distinct ranks.

Work on VICTOR has been funded by the German Research Council as part of the SFB TRR 51.

4. How do I cite this service?

All relevant citations are listed in the result e-mails sent by this service. The main VICTOR publication can be downloaded as a preprint, which can be cited. The paper is currently under review.

5. How do I describe the methods and the results?

Near the middle of the main text of each result e-mail you will find suggested wording for the corresponding parts of the methods and results sections. You only need to format and arrange these suggestions according to the instructions for authors of your chosen journal. You might also need to rephrase them slightly to avoid being falsely flagged by plagiarism scanners. Watch out for instructions enclosed in square brackets; these mark passages whose content frequently needs to be adapted as well.

6. What are the main advantages of using this service?

Based on the results of the VICTOR service, users can make an informed decision on the evolutionary relationships between prokaryotic viruses. The method was thoroughly optimized against a large reference dataset of genome-sequenced taxa recognized by the International Committee on Taxonomy of Viruses (ICTV) and showed high agreement with the ICTV classification, particularly at the species and genus levels. See the VICTOR references for details.

Using VICTOR is simple. Data can be uploaded in several formats, and the results are returned in informative e-mails.

Technically, it should not be a problem to apply VICTOR to the genomes or proteomes of other kinds of viruses. VICTOR might even work well for them, both phylogenetically and regarding the estimates for taxon boundaries; it has just not yet been tested in this respect.

7. How long does a job take?

Once your VICTOR submission has received a free computation slot on the server, the running time of your job is expected to be as shown below. To check whether free slots are still available, see the payload progress bar at the end of the VICTOR submission page.

[Table: Estimated running time of VICTOR submissions, depending on data type]
If sufficient server resources are available, VICTOR switches to a fast track mode, thus reducing the overall running time of your submission by a factor of about 4.

8. Which kind of data should be uploaded to the server?

You should upload FASTA files, GenBank files and/or GenBank accession IDs. Each uploaded file should contain the genome and/or proteome sequence of a single virus. Likewise, each line of uploaded GenBank accession IDs should correspond to the genome and/or proteome sequence of a single virus.

Analysis is performed either at the genome level or at the proteome level; the two cannot be mixed. Incomplete genomes can be analysed, but a different distance formula should then be preferred (see question 9).

At least four usable genomes or proteomes must be uploaded; otherwise phylogenies cannot be inferred.

A length check ensures that genomes of cellular organisms are not processed by VICTOR. If you think this length check prevents you from analysing viruses with VICTOR, please contact the authors.
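
The upload rules above can be checked before submitting. The following is a minimal Python sketch and not part of VICTOR itself. The function name `check_submission` and the level labels "genome" and "proteome" are assumptions made for illustration. What counts as a "usable" sequence is decided by the server, not by this check.

```python
MIN_SEQUENCES = 4  # minimum number of genomes or proteomes stated in this FAQ


def check_submission(levels):
    """Return a list of problems found in a planned upload.

    `levels` holds one entry per virus, either "genome" or "proteome"
    (hypothetical labels chosen for this sketch).
    """
    problems = []
    # Flag labels this sketch does not know about.
    unknown = sorted(set(levels) - {"genome", "proteome"})
    if unknown:
        problems.append("unknown level(s): " + ", ".join(unknown))
    # At least four sequences are needed to infer a phylogeny.
    if len(levels) < MIN_SEQUENCES:
        problems.append(
            "only %d sequence(s); at least %d are needed"
            % (len(levels), MIN_SEQUENCES)
        )
    # Genome-level and proteome-level data cannot be analysed together.
    if {"genome", "proteome"} <= set(levels):
        problems.append("genome and proteome data cannot be mixed")
    return problems
```

An empty list means the planned upload passes these two basic rules; it says nothing about the server-side length check.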

9. Which distance formula should be preferred?

VICTOR sends e-mails containing the results of applying distinct distance formulas in otherwise identical GBDP runs, that is, one tree and one set of clustering results per formula. The VICTOR study indicates that formula d6 should be preferred when amino-acid sequences of prokaryotic viruses are analysed, unless the data set contains incomplete proteome sequences, in which case d4 is the formula of choice. Likewise, formula d0 should be preferred when nucleotide sequences of prokaryotic viruses are analysed, unless the data set contains incomplete genome sequences, in which case d4 is again the formula of choice.

The meaning of the three formulas is described in the GGDC FAQ, which for historical reasons uses different terms: GGDC formula 1 is VICTOR d0, GGDC formula 2 is VICTOR d4, and GGDC formula 3 is VICTOR d6.
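
The recommendations above amount to a small decision rule, sketched here in Python. This is only a restatement of the FAQ text for prokaryotic viruses. The function name `preferred_formula` and the labels "nucleotide" and "amino_acid" are assumptions made for this sketch, not part of the VICTOR service.

```python
# Correspondence between VICTOR and GGDC formula names, as given in this FAQ.
VICTOR_TO_GGDC = {"d0": 1, "d4": 2, "d6": 3}


def preferred_formula(sequence_type, all_complete):
    """Return the VICTOR distance formula recommended in this FAQ.

    sequence_type: "nucleotide" or "amino_acid" (hypothetical labels).
    all_complete:  True if no incomplete genome or proteome is in the data set.
    """
    if sequence_type not in ("nucleotide", "amino_acid"):
        raise ValueError("sequence_type must be 'nucleotide' or 'amino_acid'")
    if not all_complete:
        # Incomplete sequences: d4 is the formula of choice in both cases.
        return "d4"
    # Complete sequences: d0 for nucleotides, d6 for amino acids.
    return "d0" if sequence_type == "nucleotide" else "d6"
```

For example, `preferred_formula("amino_acid", True)` gives "d6", and `VICTOR_TO_GGDC["d6"]` gives 3, the formula number to look up in the GGDC FAQ.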

10. How can I use the files attached to the result e-mails?

The files attached by the service use the following standardized file extensions:

pdf
PDF file depicting a midpoint-rooted phylogenetic tree. This is not necessarily a publication-ready figure.
phy
Phylogenetic tree in Newick format, labels cleaned.
tsv
Tab-separated file containing the affiliations to clusters at the species, genus and family rank.

In the tip labels of the phylogenetic trees, the marks for cluster affiliation at the species (S), genus (G) and family (F) level are found after an "@" sign.
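
To work with these marks programmatically, the tip labels can be read from the Newick file and split at the "@" sign. The Python sketch below is a minimal illustration under stated assumptions: labels are unquoted and contain no parentheses, commas, colons or semicolons, and the example tree with its placeholder marks "x" and "y" is made up. The actual layout of the S, G and F marks after the "@" sign is not reproduced here, and the function names are hypothetical.

```python
import re


def tip_labels(newick):
    """Extract tip labels from a Newick string (simple, unquoted labels only).

    A tip label directly follows "(" or "," and ends at the next
    ":", ",", ")" or ";". Internal node labels follow ")" and are skipped.
    """
    found = re.findall(r"[(,]\s*([^(),:;\s][^(),:;]*)", newick)
    return [label.strip() for label in found]


def split_marks(label):
    """Split a tip label into (name, marks) at the last "@" sign."""
    name, sep, marks = label.rpartition("@")
    if not sep:
        # No "@" present: the whole label is the name.
        return label, ""
    return name, marks


# Made-up example tree; "x" and "y" stand in for the real cluster marks.
example = "((virusA@x:0.1,virusB@x:0.2):0.05,virusC@y:0.3);"
pairs = [split_marks(label) for label in tip_labels(example)]
```

For real trees with quoted or otherwise complex labels, a dedicated Newick parser would be the safer choice.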