Databases
vConTACT3 relies on curated reference databases for clustering and taxonomic assignment. These databases are archived at Zenodo.
Download Instructions:
Use vcontact3 prepare_databases to download specific versions:
vcontact3 prepare_databases --list-versions
vcontact3 prepare_databases --get-version 228 --set-location ./db
Use vcontact3 prepare_databases to get the latest version:
vcontact3 prepare_databases --get-version 'latest' --set-location ./latest
By default, the latest version of the database is downloaded. To download a different one, pass any version listed by --list-versions to --get-version.
Additionally, you can specify the download location:
vcontact3 prepare_databases --get-version "latest" --set-location /path/to/download/location
To download a database directly instead, use wget to fetch the latest or an earlier version from Zenodo.
Extract the archive, place the resulting directory where you want it, and point vConTACT3 to that directory when you run it.
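For the manual route, a minimal sketch of the extraction step might look like the following. It assumes the archive has already been downloaded (the Zenodo URL is not reproduced here) and that it unpacks into a single version directory, as the layout described in this section suggests. The function name extract_db and the example paths are hypothetical and are not part of vConTACT3.

```shell
# Hypothetical helper: unpack a manually downloaded database archive.
# Assumption: the archive is a gzip-compressed tarball such as v228.tar.gz.
extract_db() {
    archive=$1   # path to the downloaded .tar.gz file
    dest=$2      # directory that will hold the extracted database

    # Create the destination if needed, then extract into it.
    mkdir -p "$dest" && tar -xzf "$archive" -C "$dest"
}
```

For example, extract_db v228.tar.gz ./db would place the archive's contents under ./db, which is then the directory to point vConTACT3 to.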
Note: vConTACT3 (since version 3.0.5) has a database-filtering restriction to prevent users from downloading and using databases that are incompatible with their version of the tool. If you cannot download (or don't see) a version you want to use, please try updating your vConTACT3 version. However, if you want to KEEP using older databases, we recommend creating a separate installation for the newer version.
More information:
When vConTACT3 downloads the database, it will download it as a {version}.tar.gz file and then decompress it into the specified directory. If no download location is specified, it will download into the current directory.
Database downloads are structured as follows:
/path/to/download/location/
├── v220.tar.gz
├── v220/
└── 220.json
The "220.json" file contains the relative paths to the necessary files within the v220 directory. The goal of this organization is to allow users to specify the same download location with every new version. vConTACT3 is smart enough to identify the latest versions from the downloaded files. For example:
vcontact3 prepare_databases --get-version "220" --set-location /path/to/download/location
vcontact3 prepare_databases --get-version "221" --set-location /path/to/download/location
vcontact3 prepare_databases --get-version "222" --set-location /path/to/download/location
vcontact3 prepare_databases --get-version "223" --set-location /path/to/download/location
After downloading the databases, the target directory (e.g., /path/to/download/location) will contain compressed archives, extracted folders, and version metadata files:
/path/to/download/location/
├── v220.tar.gz
├── v221.tar.gz
├── v222.tar.gz
├── v223.tar.gz
├── v220/
├── v221/
├── v222/
├── v223/
├── 220.json
├── 221.json
├── 222.json
└── 223.json
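To check which downloaded version is the newest in a shared location, the listing above can be inspected with a short script. This is only a sketch, not vConTACT3's own logic: it assumes each downloaded version leaves a metadata file named with the bare version number (220.json, 221.json, ...) and that a higher number means a newer release. The function name latest_db_version is hypothetical.

```shell
# Hypothetical helper: print the highest version number that has a
# <number>.json metadata file in the given download location.
latest_db_version() {
    db_dir=$1
    for f in "$db_dir"/*.json; do
        [ -e "$f" ] || continue              # no .json files at all
        name=$(basename "$f" .json)
        case $name in
            ''|*[!0-9]*) continue ;;         # skip names that are not purely numeric
        esac
        echo "$name"
    done | sort -n | tail -n 1               # numeric sort, keep the largest
}
```

Running latest_db_version /path/to/download/location prints nothing if no matching metadata files are found.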
Database Updates
The vConTACT3 team periodically releases database updates to reflect additional sequence data and taxonomic updates in NCBI. When building each database version, benchmarks are performed and distance thresholds are adjusted to maintain or improve agreement with the updated taxonomy. For some database releases, this includes adding newly established realms (e.g., Singelaviria), which requires additional processing time. As a result, older versions of vConTACT3 will not be compatible with these newer databases.
Another reason for updates is to incorporate code changes, such as adding indexes for faster processing or new files for better exports. Unfortunately, these are nearly always breaking changes. The team tries to avoid breaking changes, but it is not always possible.
Current Version:
v232 - vConTACT3 Reference Database
Compatible with vConTACT3 3.2.0+
Breaking change: this version introduces VOG markers. Only vConTACT3 versions 3.2.0+ are compatible!
Version History:
v230 - vConTACT3 Reference Database
Compatible with vConTACT3 versions from 3.1.0 up to, but not including, 3.2.0
v228 - vConTACT3 Reference Database
Compatible with vConTACT3 versions from 3.0.1 up to, but not including, 3.1.0
v223 - vConTACT3 Reference Database
Compatible with vConTACT3 3.0.0 beta versions
v220 - vConTACT3 Reference Database
Compatible with vConTACT3 3.0.1 beta versions
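The compatibility list above can be read as a lookup from the installed tool version to a database version. The sketch below encodes that reading for the non-beta releases only. It is an illustration of the list, not the filter that vConTACT3 itself applies, and it assumes three-part version numbers such as 3.1.0. The function names version_ge and compatible_db are hypothetical.

```shell
# Hypothetical helper: succeed when dotted version $1 is >= dotted version $2.
# Both versions are sorted numerically field by field; if $2 sorts first
# (or the two are identical), then $1 is at least as new as $2.
version_ge() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" \
        | sort -t. -k1,1n -k2,2n -k3,3n | head -n 1)
    [ "$lowest" = "$2" ]
}

# Hypothetical helper: print the database version listed as compatible
# with a given vConTACT3 version (beta releases are not covered).
compatible_db() {
    tool=$1
    if version_ge "$tool" 3.2.0; then
        echo 232
    elif version_ge "$tool" 3.1.0; then
        echo 230
    elif version_ge "$tool" 3.0.1; then
        echo 228
    else
        echo "unknown"
        return 1
    fi
}
```

For example, compatible_db 3.1.4 prints 230, while a version below 3.0.1 prints "unknown" and returns a non-zero status.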