Release Notes

Tracking changes and improvements across software versions.

v3.2.0 (2026-03-26)

  • Added VOG marker search module (markers.py): searches user proteins against VOGDB via MMseqs2 profile search

  • Added VOG-based host domain override: per-realm votes of user sequences can override the DB default domain when ≥80% of VOG hits vote for a different domain

  • Added VOG realm and taxonomy support throughout resolver and GBGA: VOG hits now contribute to realm identification and rank-level taxonomy assignments

  • Filled in mixed-realm Archaea distance thresholds that were previously disabled
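
The ≥80% override rule in the VOG-based host domain entry above can be illustrated with a short sketch. This is not the vConTACT3 implementation: the function name `pick_domain`, the input shape (one domain label per VOG hit), and the behaviour on empty input are all assumptions made for illustration, and the per-realm bookkeeping mentioned in the entry is left out.

```python
from collections import Counter


def pick_domain(hit_domains, db_default, threshold=0.8):
    """Hypothetical helper: choose a host domain from per-hit domain votes.

    hit_domains: list of domain labels, one per VOG hit (assumed input shape).
    db_default:  the domain configured for the database.
    threshold:   fraction of hits that must agree before overriding the default.
    """
    if not hit_domains:
        # Assumption: with no VOG hits there is nothing to override with.
        return db_default

    # Find the most frequently voted domain and how many hits voted for it.
    domain, count = Counter(hit_domains).most_common(1)[0]

    # Override only when the winner differs from the default AND clears the threshold.
    if domain != db_default and count / len(hit_domains) >= threshold:
        return domain
    return db_default
```

With eight of ten hits voting for a different domain the default is replaced; with seven of ten it is kept.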

v3.1.8 (2025-12-20)

  • vConTACT3 paper published in Nature Biotechnology! Update relevant locations

  • Updated exports to use lazy importing, hopefully reducing the resource requirements of just loading vcontact3
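
For readers unfamiliar with lazy importing, the general idea is sketched below. The release note does not say how vcontact3 does it, so this is one common pattern and not the package's code: the `LazyModule` class is hypothetical, and `json` stands in for a heavy export dependency.

```python
import importlib


class LazyModule:
    """Hypothetical proxy that imports the real module on first attribute access."""

    def __init__(self, name):
        self._name = name
        self._module = None  # real module is not loaded yet

    def __getattr__(self, attr):
        # __getattr__ only runs for names not found on the proxy itself,
        # i.e. anything that lives on the real module.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


# Creating the proxy is cheap; nothing is imported at this point.
json_lazy = LazyModule("json")

# The first attribute access triggers the real import.
text = json_lazy.dumps({"a": 1})
```

The import cost is paid only by code paths that actually use the dependency, which is what reduces the footprint of simply loading the package.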

v3.1.7 (2025-10-24)

  • Update database versions allowed

v3.1.6 (2025-10-22)

  • Added extensive docstrings for all (?) functions and classes. Type hints have not yet been updated

  • Added bypass to allow vcontact3 installation when ete3 package is incorrectly installed

v3.1.5 (2025-09-25)

  • Rewrite upper and lower taxonomy assignments due to bug in propagating labels hierarchically. It is strongly recommended to update to this version.

  • Changed how novel prediction placeholders are structured. Upper ranks (e.g. kingdom, phylum, class) now use "unplaced_<rank>_of_<parent_rank>" and only lower ranks (e.g. order, family, subfamily, genus) start numbering.
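
The placeholder scheme described above can be sketched as follows. Only the "unplaced_<rank>_of_<parent_rank>" template comes from the release note. The function name, the rank tuples (built from the "e.g." lists, so possibly incomplete), and in particular the numbered format used for lower ranks are assumptions for illustration.

```python
# Ranks taken from the examples in the release note; the real lists may differ.
UPPER_RANKS = ("kingdom", "phylum", "class")
LOWER_RANKS = ("order", "family", "subfamily", "genus")


def placeholder(rank, parent_rank, number=None):
    """Hypothetical helper: build a placeholder label for a novel prediction."""
    if rank in UPPER_RANKS:
        # Template stated in the release note.
        return f"unplaced_{rank}_of_{parent_rank}"
    if rank in LOWER_RANKS:
        if number is None:
            raise ValueError("lower ranks need a running number")
        # ASSUMPTION: the note only says lower ranks "start numbering";
        # this exact format is invented for the sketch.
        return f"novel_{rank}_{number}"
    raise ValueError(f"unknown rank: {rank}")
```

For example, `placeholder("phylum", "kingdom")` gives `unplaced_phylum_of_kingdom`.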

v3.1.4 (2025-08-17)

  • Fix versioning for Bioconda install, which prevented version from being identified from git tag (pip was still fine)

  • Expose option to allow users to set maximum iterations to resolve multi-realm components

v3.1.3 (2025-06-25)

  • Update build and versioning system for package

  • Add Singelaviria

  • Update docs with recent changes

v3.1.2 (2025-06-18)

  • Add funding acknowledgements to docs

  • Add & expose centroid export option

v3.1.1 (2025-06-13)

  • Fix downloading from list

v3.1.0 (2025-06-13)

  • Add ANI export option

  • Add vclust as "optional" dependency

  • Switch from using setup.py to pyproject.toml

  • Update DB versions in docs

  • Restructured how package is run through the main function

  • Implement new build and versioning system. Instead of manually encoding version in several locations (= source of failure), git version tag is 'single source of truth'

  • Update recent changes to options and other changes in docs

v3.0.5 (2025-06-05)

  • Implemented new DB versioning system which identifies installed package version and evaluates it against DBs and their compatibility range

  • Added docs site to README

  • Change realm logic and which realms are handled by GBGA

  • Begin adding mixed-realm distance thresholds, allowing for user (and environmental) situations where multiple realms cannot be reliably disentangled

  • Fixed edge value reporting during identification of mixed-realm components, which prevented breaking at the appropriate limit

  • Fix category issue where newly added categories weren't being identified

  • Change fastcluster version to improve compatibility
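
The DB versioning entry at the top of this release describes checking the installed package version against each database's compatibility range. A minimal sketch of such a check is shown below. The function names and the idea of a simple inclusive min/max range are assumptions, not the vConTACT3 implementation, and pre-release tags such as "3.0.0b74" are deliberately not handled.

```python
def parse_version(text):
    """Turn '3.1.8' or 'v3.1.8' into (3, 1, 8).

    Sketch only: suffixes like 'b74' are not supported and raise ValueError.
    """
    return tuple(int(part) for part in text.lstrip("v").split("."))


def is_compatible(installed, min_version, max_version):
    """Hypothetical check: is the installed version inside an inclusive range?"""
    # Tuples compare element by element, so (3, 1, 8) < (3, 2, 0).
    return parse_version(min_version) <= parse_version(installed) <= parse_version(max_version)
```

Comparing integer tuples avoids the usual pitfall of string comparison, where "3.10.0" would sort before "3.9.0".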

v3.0.3 (2025-05-29)

  • Remove unnecessary imports and legacy code

v3.0.2 (2025-05-29)

  • Add back in NetworKit for calculating connected components of extremely dense and large graphs (in order to establish multi-realm components)

v3.0.1 (2025-05-29)

  • Burn and rebuild resolver + GBGA modules. Use sparse matrices over 3rd party networks. Update iteration logic to resolve components. H5 for files. Integrate all network layers into single view. Fix gene + component realm identification.

  • [extensive other modifications]

v3.0.0b74 (2024-12-30)

  • Apply default realm to final predictions. It was calculated and used, but ignored when writing the final file

v3.0.0b (2024-11-02)

  • Addition of Monodnaviria for eukaryotes

v3.0.0b (2024-10-31)

  • Addition of Monodnaviria for prokaryotes

Future Work

These are planned updates and/or features. They are actively being worked on, but there is no guarantee that they will be implemented, and they should not be treated as commitments.

  • Enable newick export handling

  • More "useful" interactive network/graph with D3js

  • Improved memory handling

  • Filtering exports on user-defined labels (e.g. "Rudiviridae" would filter profiles, graphs, networks, etc)

  • No-database / reference-free mode — allow vConTACT3 to run without a reference database, performing de novo clustering and network construction on user genomes only (no taxonomy assignment). Useful for exploratory analyses or datasets from under-represented taxa.

  • Unified database support — the unified (cross-domain) database is partially implemented; remaining work is automatic host-domain detection so that the correct realm-level thresholds are selected without requiring the user to specify --db-domain explicitly.

  • User-supplied custom reference databases — allow users to provide their own curated set of reference genomes (formatted as a vConTACT3-compatible database) to supplement or replace the built-in RefSeq releases. This enables targeted analyses against taxon-specific reference sets.

Development priorities are based on community feedback and internal benchmarking. Contributions or feature requests are welcome via the issue tracker.