Usage

phigaro -f filename.fasta -o folder/basename

Full list of options

phigaro -h      

usage: phigaro [-h] [-V] -f FASTA_FILE [-c CONFIG] [-p] [-e EXTENSION [EXTENSION ...]] [-o OUTPUT] [--not-open] [-t THREADS]  
               [-S SUBSTITUTE_OUTPUT] [-d] [-m MODE]

Phigaro is a scalable command-line tool for predicting phages and prophages from nucleic acid sequences

optional arguments:
  -h, --help            show this help message and exit
  -V, --version         show program's version number and exit
  -f FASTA_FILE, --fasta-file FASTA_FILE
                        Assembly scaffolds/contigs or full genomes, required
  -c CONFIG, --config CONFIG
                        Path to the config file, not required
  -p, --print-vogs      Print phage vogs for each region
  -e EXTENSION [EXTENSION ...], --extension EXTENSION [EXTENSION ...]
                        Type of the output: html, tsv, gff, bed or stdout. Default is html. You can specify several file      
                        formats with a space as a separator. Example: -e tsv html stdout.
  -o OUTPUT, --output OUTPUT
                        Output filename for html and txt outputs. Required by default, but not required for stdout only       
                        output.
  --not-open            Do not open html file automatically, if html output type is specified.
  -t THREADS, --threads THREADS
                        Num of threads (default is num of CPUs=4)
  --no-cleanup          Do not delete any temporary files that were generated by Phigaro (HMMER & Prodigal outputs and some others).
  -S SUBSTITUTE_OUTPUT, --substitute-output SUBSTITUTE_OUTPUT
                        If you have precomputed prodigal and/or hmmer data you can provide paths to the files in the
                        following format: program:address/to/the/file. In place of program you should write hmmer or
                        prodigal. If you need to provide both files you should pass them separately as two parameters.
  --save-fasta          Save all phage fasta sequences in a fasta file.           
  -d, --delete-shorts   Exclude sequences with length < 20000 automatically.
  -m MODE, --mode MODE  You can launch Phigaro at one of 3 modes: basic, abs, without_gc. Default is basic. Read more about   
                        modes at https://github.com/bobeobibo/phigaro/
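
As a rough sketch of the -S option, the commands below pass precomputed Prodigal and HMMER results to Phigaro. This assumes each file is given with its own -S flag, which is one reading of "separately as two parameters" in the help text. The precomputed/ paths are hypothetical placeholders, and the input and output names are reused from the basic usage example.

```shell
# Hypothetical paths to precomputed results; replace with your own files.
prodigal_out="precomputed/sample.prodigal"
hmmer_out="precomputed/sample.hmmer"

# Assumption: one program:path pair per -S flag, so two files need two flags.
set -- phigaro -f filename.fasta -o folder/basename \
    -S "prodigal:$prodigal_out" -S "hmmer:$hmmer_out"

# Show the command, then run it only if phigaro is installed and the input exists.
echo "$*"
if command -v phigaro >/dev/null 2>&1 && [ -f filename.fasta ]; then
    "$@"
fi
```

Building the command with set -- keeps each argument quoted separately, so paths containing spaces are passed through intact.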

Running time depends on the size of your input data and the number of CPUs used. The running time for a metagenomic assembly file of 150MB is about 20 minutes.

Mode Description

Here is a short description of Phigaro modes. A more detailed description can be found in the publication.
$Tr(x)$ - triangular function, $\mathbf{1}_{pVOG}(gene_i)$ - indicator function, $GC(x)$ - GC content of x, $mean\_gc$ - constant.

basic

abs

without_gc


Output

The output can be annotated prophage genome maps (html), gff3 or bed files, or a tabular format (text file or stdout).
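
Several formats can be requested in one run. The following is a minimal sketch that uses only flags listed in the help text (-e, -o, --not-open). The input file and output prefix are placeholders taken from the basic usage example, not real data.

```shell
# Placeholder input and output prefix; adjust to your data.
fasta="filename.fasta"
out="folder/basename"

# -e accepts several space-separated formats; --not-open keeps the html
# report from being opened automatically.
set -- phigaro -f "$fasta" -e html tsv gff -o "$out" --not-open

# Show the command, then run it only if phigaro is installed and the input exists.
echo "$*"
if command -v phigaro >/dev/null 2>&1 && [ -f "$fasta" ]; then
    "$@"
fi
```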


Test data

Test data is available in the test_data folder. To run Phigaro on the test data, enter the following command from your Phigaro folder:

phigaro -f test_data/Bacillus_anthracis_str_ames.fna -o test_data/Bacillus_anthracis_str_ames -p --not-open

This command generates the Bacillus_anthracis_str_ames.phg.html file in the test_data folder. If the output file is not specified with -o, the following output is generated:

scaffold        begin   end     taxonomy
NC_003997.3     451613  457261  Siphoviridae
NC_003997.3     460328  482139  Siphoviridae
NC_003997.3     3460450 3482979 Siphoviridae
NC_003997.3     3495703 3505502 Siphoviridae
NC_003997.3     3749518 3776811 Siphoviridae
NC_003997.3     3779698 3784171 Siphoviridae
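
The tabular output is easy to post-process with standard tools. As an illustration only, the sketch below appends a length column to each predicted region, computed as end minus begin. Whether the coordinates are inclusive is not stated here, so treat the lengths as approximate. The region_lengths function is a hypothetical helper, and the rows are copied from the table above.

```shell
# Rows copied from the table above; in practice, read the table written by Phigaro.
regions=$(cat <<'EOF'
scaffold        begin   end     taxonomy
NC_003997.3     451613  457261  Siphoviridae
NC_003997.3     460328  482139  Siphoviridae
NC_003997.3     3460450 3482979 Siphoviridae
NC_003997.3     3495703 3505502 Siphoviridae
NC_003997.3     3749518 3776811 Siphoviridae
NC_003997.3     3779698 3784171 Siphoviridae
EOF
)

# Hypothetical helper: skip the header line, then print scaffold, begin, end
# and the difference end - begin as an approximate region length.
region_lengths() {
    awk 'NR > 1 { print $1, $2, $3, $3 - $2 }'
}

printf '%s\n' "$regions" | region_lengths
```

awk splits on any run of whitespace by default, so the same helper should work whether the columns are separated by tabs or by spaces.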