Inputs & Parameters
===================

vConTACT3 has numerous options, but it requires only one of two inputs: a single nucleotide file in FASTA format, or, when using pre-called genes, a protein file plus a gene-to-genome mapping. All other options have defaults, which are used when not specified.

Inputs
------

**--nucleotide**
    Path to a FASTA-formatted nucleotide file. Selecting this option enables the gene-calling tool and disables ``--proteins``.

    .. note::
        Avoid using ``mmseq`` in the input filename (e.g. ``my-mmseqs-results.fna``), as the temporary file cleanup may inadvertently match files with that name. See :doc:`FAQ & Troubleshooting ` for details.

**--proteins**
    FASTA file of predicted proteins. Requires ``--gene2genome``; ``--len-nucleotide`` is optional (see below).

**--gene2genome**
    TSV or parquet file linking protein IDs to genome IDs. Required when using ``--proteins``. Expected columns: ``protein_id``, ``genome_id``, and optionally ``keywords`` (filled with ``None`` if absent).

**--len-nucleotide**
    TSV or parquet file mapping genome IDs to nucleotide lengths. Only applicable when using ``--proteins``; when ``--nucleotide`` is provided, genome lengths are computed automatically from the sequences. This file is optional even in protein mode. If it is omitted, ``Size (Kb)`` will be ``NaN`` in the output and the ``ANI`` export will be disabled. Accepts either a ``length`` column in base pairs (converted to Kb automatically) or a ``Size (Kb)`` column.

**--output**
    Path to the output directory. Defaults to ``vConTACT3_results/``.

Key Parameters
--------------

None of these parameters are required, but they are the most frequently used.

**--threads**
    Number of CPU cores to use. Defaults to all available cores.

**--max-iterations**
    Number of iterations to use when resolving mixed-realm components/clusters. Increase this value to reduce the chance of encountering them. Default: 3.

**--reduce-memory**
    Reduce memory usage by downcasting arrays to ``float16`` (~50% savings).

**--distance-metric**
    The distance metric used between genomes in the gene-sharing network. Options: ``SqRoot`` (default), ``VirClust``, ``Shorter``, ``Jaccard``.

**--breaks**
    Splits large networks/graphs into smaller chunks during export.

**--db-path**
    Path to a specific database version file or directory. Defaults to the latest version.

**--db-domain**
    Specify the domain: ``archaea``, ``bacteria``, ``prokaryotes``, or ``eukaryotes``.

**--exports**
    Specify which export types to generate (e.g., ``graphml``, ``cytoscape``, ``profiles``). See :doc:`Exports ` for details.

Advanced Parameters
-------------------

**--verbose**
    Increase logging verbosity (INFO, WARN, ERROR, DEBUG).

**--keep-temp**
    Preserve intermediate files (generally for MMSeqs2).

For a full list of command-line options, see the :doc:`CLI Reference `.
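
The column rules described above for ``--gene2genome`` and ``--len-nucleotide`` can be checked before a run with a short script. The following is a minimal sketch using only the Python standard library, not vConTACT3's own parsing code. The helper names ``read_gene2genome`` and ``read_genome_sizes_kb`` are hypothetical, the ``genome_id`` column name in the length file is an assumption (the text above does not name it), and 1 Kb is taken to be 1,000 bp. Only the TSV form is handled.

```python
import csv
import io


def read_gene2genome(handle):
    """Read a gene-to-genome TSV; fill an absent ``keywords`` column with "None"."""
    reader = csv.DictReader(handle, delimiter="\t")
    # protein_id and genome_id are the required columns named in the docs.
    missing = {"protein_id", "genome_id"} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing required column(s): {sorted(missing)}")
    rows = []
    for row in reader:
        # keywords is optional; mirror the documented "None" fill.
        row.setdefault("keywords", "None")
        rows.append(row)
    return rows


def read_genome_sizes_kb(handle, id_column="genome_id"):
    """Return {genome ID: size in Kb} from a length TSV.

    Assumption: the ID column is called ``genome_id``.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    fields = set(reader.fieldnames or [])
    if id_column not in fields:
        raise ValueError(f"missing ID column: {id_column!r}")
    if "Size (Kb)" in fields:
        # Already in Kb; use the value as given.
        return {row[id_column]: float(row["Size (Kb)"]) for row in reader}
    if "length" in fields:
        # Base pairs to Kb, assuming 1 Kb = 1,000 bp.
        return {row[id_column]: float(row["length"]) / 1000.0 for row in reader}
    raise ValueError("expected a 'length' or 'Size (Kb)' column")


# Small in-memory examples standing in for real files.
gene2genome = read_gene2genome(io.StringIO("protein_id\tgenome_id\np1\tgA\n"))
sizes_kb = read_genome_sizes_kb(io.StringIO("genome_id\tlength\ngA\t48502\n"))
```

Running a check like this first surfaces a missing column as an explicit error instead of leaving ``Size (Kb)`` as ``NaN`` in the output.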