Release Notes
Tracking changes and improvements across software versions.
v3.2.0 (2026-03-26)
Added VOG marker search module (markers.py): searches user proteins against VOGDB via MMseqs2 profile search
Added VOG-based host domain override: per-realm votes of user sequences can override the DB default domain when ≥80% of VOG hits vote for a different domain
Added VOG realm and taxonomy support throughout resolver and GBGA: VOG hits now contribute to realm identification and rank-level taxonomy assignments
Filled in mixed-realm Archaea distance thresholds that were previously disabled
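The ≥80% override rule can be illustrated with a minimal sketch. This is not the code in markers.py. The function name, the flat list of per-hit domain votes, and the tie handling are assumptions made for illustration, and the per-realm grouping of votes is left out.

```python
from collections import Counter


def override_domain(default_domain, hit_domains, threshold=0.8):
    """Pick a host domain from per-hit votes (hypothetical helper).

    default_domain: domain configured for the database.
    hit_domains: one domain label per VOG hit (assumed input shape).
    threshold: fraction of hits needed to override the default.
    """
    if not hit_domains:
        # No evidence from VOG hits: keep the database default.
        return default_domain

    votes = Counter(hit_domains)
    top_domain, top_count = votes.most_common(1)[0]

    # Override only when a *different* domain reaches the threshold.
    if top_domain != default_domain and top_count / len(hit_domains) >= threshold:
        return top_domain
    return default_domain


if __name__ == "__main__":
    hits = ["archaea"] * 8 + ["bacteria"] * 2
    print(override_domain("bacteria", hits))
```

With 8 of 10 hits voting for a different domain the default is replaced. With 7 of 10 it is kept.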
v3.1.8 (2025-12-20)
vConTACT3 paper published in Nature Biotechnology! Updated relevant locations
Updated exports to use lazy importing, which should reduce the resources required just to load vcontact3
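For readers unfamiliar with lazy importing, the general idea is sketched below. This is a generic illustration, not how vcontact3 implements it. The `LazyModule` class is a hypothetical helper, and `json` stands in for a heavy dependency.

```python
import importlib


class LazyModule:
    """Proxy that defers importing a module until it is first used."""

    def __init__(self, name):
        self._name = name
        self._module = None  # nothing imported yet

    def __getattr__(self, attr):
        # __getattr__ runs only for attributes not found on the proxy
        # itself, i.e. the attributes of the wrapped module.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


# Creating the proxy is cheap; the real import happens on first use.
json_lazy = LazyModule("json")

if __name__ == "__main__":
    print(json_lazy.dumps({"loaded": True}))
```

The import cost is paid only by code paths that use the module, so commands that never touch an export backend do not load it.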
v3.1.7 (2025-10-24)
Update database versions allowed
v3.1.6 (2025-10-22)
Added extensive docstrings for all (?) functions and classes. Type hints have not yet been updated
Added bypass to allow vcontact3 installation when ete3 package is incorrectly installed
v3.1.5 (2025-09-25)
Rewrite upper and lower taxonomy assignments due to bug in propagating labels hierarchically. It is strongly recommended to update to this version.
Changed how novel prediction placeholders are structured. Upper ranks (e.g. kingdom, phylum, class) now use "unplaced_<rank>_of_<parent_rank>" and only lower ranks (e.g. order, family, subfamily, genus) start numbering.
v3.1.4 (2025-08-17)
Fix versioning bug in the Bioconda install that prevented the version from being identified from the git tag (pip installs were unaffected)
Expose option to allow users to set maximum iterations to resolve multi-realm components
v3.1.3 (2025-06-25)
Update build and versioning system for package
Add Singelaviria
Update docs with recent changes
v3.1.2 (2025-06-18)
Add funding acknowledgements to docs
Add & expose centroid export option
v3.1.1 (2025-06-13)
Fix downloading from list
v3.1.0 (2025-06-13)
Add ANI export option
Add vclust as "optional" dependency
Switch from using setup.py to pyproject.toml
Update DB versions in docs
Restructured how package is run through the main function
Implement new build and versioning system. Instead of manually encoding version in several locations (= source of failure), git version tag is 'single source of truth'
Update recent changes to options and other changes in docs
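With the git tag as the single source of truth, a package typically reads its version at runtime from installed metadata rather than from a hard-coded string. The sketch below shows that pattern with the standard library. The build backend plugin that writes the tag into the metadata is not named in these notes, and the distribution name "vcontact3" and the fallback string are assumptions.

```python
from importlib.metadata import PackageNotFoundError, version


def get_version(dist_name, fallback="0+unknown"):
    """Return the installed version of a distribution.

    The version string is whatever the build system recorded at
    install time (for tag-based builds, the git tag).
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # Not installed, e.g. running from a bare source checkout.
        return fallback


if __name__ == "__main__":
    # "vcontact3" is assumed to be the installed distribution name.
    print(get_version("vcontact3"))
```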
v3.0.5 (2025-06-05)
Implemented new DB versioning system which identifies installed package version and evaluates it against DBs and their compatibility range
Added docs site to README
Change realm logic and which realms are handled by GBGA
Begin adding mixed-realm distance thresholds, allowing for user (and environmental) situations where multiple realms cannot be reliably disentangled
Fixed edge value reporting during identification of mixed-realm components, which prevented breaking at the appropriate limit
Fix category issue where newly added categories weren't being identified
Change fastcluster version to improve compatibility
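The DB compatibility check can be pictured as comparing the installed package version against a minimum and maximum version declared per database. The sketch below is an illustration under assumptions: the database names, the ranges, and the function names are hypothetical, and only plain numeric versions such as "3.1.8" are handled (pre-release tags such as "3.0.0b74" would need a fuller parser).

```python
def parse_version(text):
    """Turn '3.1.8' (or 'v3.1.8') into the tuple (3, 1, 8)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def compatible_dbs(package_version, db_ranges):
    """List databases whose [min, max] range contains the package version.

    db_ranges maps a database name to a (min_version, max_version)
    pair, both inclusive. The mapping shape is an assumption.
    """
    pkg = parse_version(package_version)
    return [
        name
        for name, (low, high) in db_ranges.items()
        if parse_version(low) <= pkg <= parse_version(high)
    ]


if __name__ == "__main__":
    # Hypothetical database names and compatibility ranges.
    ranges = {"db_old": ("3.0.0", "3.0.5"), "db_new": ("3.0.5", "3.2.0")}
    print(compatible_dbs("3.1.8", ranges))
```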
v3.0.3 (2025-05-29)
Remove unnecessary imports and legacy code
v3.0.2 (2025-05-29)
Add back in NetworKit for processing extremely dense and large graphs for calculating connected components (in order to establish multi-realm components)
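To show what a multi-realm component means in practice, here is a small pure-Python sketch. It uses a union-find structure as a stand-in for NetworKit and is not the package's implementation. The node numbering, the realm labels, and both function names are hypothetical.

```python
from collections import defaultdict


def connected_components(n_nodes, edges):
    """Group nodes 0..n_nodes-1 into connected components (union-find)."""
    parent = list(range(n_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(list)
    for node in range(n_nodes):
        groups[find(node)].append(node)
    return list(groups.values())


def multi_realm_components(components, realm_of):
    """Keep only components whose members span more than one realm."""
    return [c for c in components if len({realm_of[n] for n in c}) > 1]


if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (3, 4)]
    realm_of = {0: "realm_A", 1: "realm_A", 2: "realm_B",
                3: "realm_A", 4: "realm_A", 5: "realm_B"}
    comps = connected_components(6, edges)
    print(multi_realm_components(comps, realm_of))
```

Only the component containing nodes from both realms is reported; single-realm components and isolated nodes are dropped.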
v3.0.1 (2025-05-29)
Burn and rebuild resolver + GBGA modules. Use sparse matrices over 3rd party networks. Update iteration logic to resolve components. H5 for files. Integrate all network layers into single view. Fix gene + component realm identification.
[extensive other modifications]
v3.0.0b74 (2024-12-30)
Apply default realm to final predictions. It was calculated and used, but ignored when writing the final file
v3.0.0b (2024-11-02)
Addition of Monodnaviria for eukaryotes
v3.0.0b (2024-10-31)
Addition of Monodnaviria for prokaryotes
Future Work
These are planned updates and/or features. They are actively being worked on, but there is no guarantee that they will be implemented, and they should not be treated as commitments.
Enable newick export handling
More "useful" interactive network/graph with D3js
Improved memory handling
Filtering exports on user-defined labels (e.g. "Rudiviridae" would filter profiles, graphs, networks, etc)
No-database / reference-free mode: allow vConTACT3 to run without a reference database, performing de novo clustering and network construction on user genomes only (no taxonomy assignment). Useful for exploratory analyses or datasets from under-represented taxa.
Unified database support: the unified (cross-domain) database is partially implemented; remaining work is automatic host-domain detection so that the correct realm-level thresholds are selected without requiring the user to specify --db-domain explicitly.
User-supplied custom reference databases: allow users to provide their own curated set of reference genomes (formatted as a vConTACT3-compatible database) to supplement or replace the built-in RefSeq releases. This enables targeted analyses against taxon-specific reference sets.
Development priorities are based on community feedback and internal benchmarking. Contributions or feature requests are welcome via the issue tracker.