Inputs & Parameters
vConTACT3 has many options but requires only one of two inputs: a single nucleotide file in FASTA format, or, if genes have already been called, a protein FASTA file plus a gene-to-genome mapping. All other options have defaults, which are used when not specified.
Inputs
--nucleotide Path to a FASTA-formatted nucleotide file. Selecting this option will enable the gene-calling tool and disable --proteins.
Note
Avoid using mmseq in the input filename (e.g. my-mmseqs-results.fna), as the temporary file cleanup may
inadvertently match your input file. See FAQ & Troubleshooting for details.
--proteins
FASTA file of predicted proteins. Requires --gene2genome, while --len-nucleotide is optional (see below).
--gene2genome
TSV or parquet file linking protein IDs to genome IDs. Required when using --proteins. Expected columns:
protein_id, genome_id, and optionally keywords (filled with None if absent).
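A gene-to-genome table can often be derived directly from the protein FASTA headers. The sketch below is one way to do that, assuming Prodigal-style protein IDs of the form <genome_id>_<gene_number>. That naming scheme is an assumption about your gene caller, not a vConTACT3 requirement, and the function name is hypothetical. The optional keywords column is left out.

```python
import csv


def write_gene2genome(protein_fasta, out_tsv):
    """Write a protein_id/genome_id TSV from protein FASTA headers.

    Assumption: each protein ID looks like '<genome_id>_<gene_number>'.
    Returns the number of rows written.
    """
    rows = 0
    with open(protein_fasta) as fasta, open(out_tsv, "w", newline="") as out:
        writer = csv.writer(out, delimiter="\t", lineterminator="\n")
        writer.writerow(["protein_id", "genome_id"])
        for line in fasta:
            if not line.startswith(">"):
                continue
            fields = line[1:].split()
            if not fields:
                continue  # skip headers with no ID
            # The ID is the first whitespace-delimited token after '>'.
            protein_id = fields[0]
            # Drop the trailing '_<gene_number>' to recover the genome ID.
            genome_id = protein_id.rsplit("_", 1)[0]
            writer.writerow([protein_id, genome_id])
            rows += 1
    return rows
```

If your protein IDs follow a different convention, only the line that derives genome_id needs to change.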
--len-nucleotide
TSV or parquet file mapping genome IDs to nucleotide lengths. Optional, and only applicable when using
--proteins; when --nucleotide is provided, genome lengths are computed automatically from the sequences.
If omitted, Size (Kb) will be NaN in the output and the ANI export will be disabled. Accepts either a
length column in base pairs (converted to Kb automatically) or a Size (Kb) column.
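If the original nucleotide sequences are still available, a length table can be generated from them. The following is a minimal sketch that writes lengths in base pairs under a length column. The genome_id column name and the function name are assumptions made for illustration; check the expected ID column against your vConTACT3 version.

```python
import csv


def write_genome_lengths(nucleotide_fasta, out_tsv):
    """Write a genome_id/length TSV (lengths in base pairs) from a FASTA file.

    Returns a dict mapping each genome ID to its length.
    """
    lengths = {}
    current = None
    with open(nucleotide_fasta) as fasta:
        for line in fasta:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # The ID is the first whitespace-delimited token after '>'.
                fields = line[1:].split()
                current = fields[0] if fields else None
                if current is not None:
                    lengths.setdefault(current, 0)
            elif current is not None:
                lengths[current] += len(line)
    with open(out_tsv, "w", newline="") as out:
        writer = csv.writer(out, delimiter="\t", lineterminator="\n")
        # 'genome_id' is an assumed column name; 'length' is in base pairs.
        writer.writerow(["genome_id", "length"])
        for genome_id, n_bases in lengths.items():
            writer.writerow([genome_id, n_bases])
    return lengths
```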
--output Path to the output directory. Defaults to vConTACT3_results/.
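The input rules above can be checked before launching a run. The sketch below is a hypothetical pre-flight helper, not part of vConTACT3: it validates the combination of input flags described in this section and returns them as an argument list. It deliberately omits the executable name, so prepend whatever command you use to invoke vConTACT3. The mmseq filename check is applied to every input path as a precaution.

```python
from pathlib import Path


def build_input_args(nucleotide=None, proteins=None, gene2genome=None,
                     len_nucleotide=None, output="vConTACT3_results/"):
    """Return the input flags as an argument list, or raise ValueError."""
    if nucleotide and proteins:
        raise ValueError("--nucleotide disables --proteins; pass only one")
    if not nucleotide and not proteins:
        raise ValueError("pass --nucleotide, or --proteins with --gene2genome")
    if proteins and not gene2genome:
        raise ValueError("--proteins requires --gene2genome")
    if nucleotide and len_nucleotide:
        raise ValueError("--len-nucleotide only applies with --proteins")

    flags = [
        ("--nucleotide", nucleotide),
        ("--proteins", proteins),
        ("--gene2genome", gene2genome),
        ("--len-nucleotide", len_nucleotide),
    ]
    args = []
    for flag, value in flags:
        if value is None:
            continue
        # Input filenames containing 'mmseq' can collide with temp-file cleanup.
        if "mmseq" in Path(value).name.lower():
            raise ValueError(f"rename {value}: filename contains 'mmseq'")
        args += [flag, str(value)]
    return args + ["--output", str(output)]
```

For example, build_input_args(proteins="proteins.faa", gene2genome="g2g.tsv") yields the flags for a protein-mode run with the default output directory.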
Key Parameters
Though not required, these are the most frequently used parameters.
--threads Number of CPU cores to use. Defaults to all available cores.
--max-iterations Iterations to use when resolving mixed-realm components/clusters. Increase to reduce the chance of encountering them. Default: 3.
--reduce-memory Reduce memory usage by downcasting arrays to float16 (~50% savings).
--distance-metric The distance metric used between genomes in the gene sharing network. Options: SqRoot (default), VirClust, Shorter, Jaccard.
--breaks Splits large networks/graphs into smaller chunks during export.
--db-path Path to a specific database version file or directory. Defaults to using the latest version.
--db-domain Specify domain: archaea, bacteria, prokaryotes, or eukaryotes.
--exports Specify which export types to generate (e.g., graphml, cytoscape, profiles). See Exports for details.
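To give a feel for what a --distance-metric choice such as Jaccard measures, the sketch below computes the textbook Jaccard distance between two genomes represented as sets of shared gene-cluster IDs. This illustrates the general definition only. How vConTACT3 computes its metrics internally, and how SqRoot, VirClust, and Shorter are defined, is not covered here; the function name and the example IDs are hypothetical.

```python
def jaccard_distance(genes_a, genes_b):
    """Textbook Jaccard distance between two collections of gene-cluster IDs.

    0.0 means identical gene content, 1.0 means nothing shared.
    """
    a, b = set(genes_a), set(genes_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here: two empty sets count as identical
    return 1.0 - len(a & b) / len(union)


# Two genomes sharing 2 of 4 distinct gene clusters.
genome_1 = {"pc1", "pc2", "pc3"}
genome_2 = {"pc2", "pc3", "pc4"}
print(jaccard_distance(genome_1, genome_2))  # 0.5
```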
Advanced Parameters
--verbose Increase logging verbosity (INFO, WARN, ERROR, DEBUG).
--keep-temp Preserve intermediate files (generally those from MMseqs2).
For a full list of command-line options, see the CLI Reference.