SemiBin2
Summary: Use the SemiBin2 command. As of version 2.2, only SemiBin2 is installed.
History
Starting with version 1.5 (officially SemiBin2 beta, released March 2023), installing the SemiBin package installed two scripts: SemiBin and SemiBin2.
They had the same functionality, but slightly different interfaces.
As of version 2.0 (released October 2023), the older SemiBin command was no longer recommended (except for backwards compatibility), and new projects were advised to use SemiBin2.
In version 2.1 (released March 2024), we deprecated the SemiBin command and introduced a more explicit SemiBin1 command for backwards compatibility.
In version 2.2 (released March 2025), only SemiBin2 is installed. The SemiBin and SemiBin1 commands are no longer available.
Upgrading to SemiBin2
- If you are using the easy_* workflows, then they will probably continue to work exactly the same (except that you will get better results faster).
- Outputs are now always in a directory called output_bins (unless you explicitly ask for the pre-reclustered bins to be written out with the --write-pre-reclustering-bins option).
- By default, bins are written to files named SemiBin_{label}.fa.gz (compressed with gzip, as the name indicates; you can change the compression with the --compression flag, including setting compression=none if you prefer no compression).
The last two points may require some minor modifications to wrapper scripts.
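As a rough illustration of the kind of change a wrapper script might need, the sketch below collects bins from the output_bins directory and reads them as gzip-compressed FASTA. It assumes the default naming described above (SemiBin_{label}.fa.gz); the function names iter_bins and count_contigs are hypothetical helpers, not part of SemiBin.

```python
import gzip
from pathlib import Path


def iter_bins(output_dir, tag="SemiBin"):
    """Yield (label, path) for each bin under output_dir/output_bins.

    Assumes the default naming scheme: {tag}_{label}.fa.gz.
    """
    bins_dir = Path(output_dir) / "output_bins"
    prefix = f"{tag}_"
    suffix = ".fa.gz"
    for path in sorted(bins_dir.glob(f"{prefix}*{suffix}")):
        # Strip the tag prefix and the extension to recover the bin label.
        label = path.name[len(prefix):-len(suffix)]
        yield label, path


def count_contigs(path):
    """Count FASTA records (header lines) in a gzip-compressed bin file."""
    with gzip.open(path, "rt") as handle:
        return sum(1 for line in handle if line.startswith(">"))
```

A wrapper could then loop over iter_bins(output_dir) and call count_contigs(path) on each bin. If you run with a different --compression setting, the suffix and the opener would need to change accordingly.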
Longer list of differences between SemiBin2 and SemiBin1
The biggest difference is that the default training mode is now self-supervised.
- Output bins are now in a directory called output_bins (in SemiBin1, it actually depended on which parameters were used).
- Output filenames are now anvi'o compatible (effectively, the default value of --tag-output is SemiBin), see discussion at #123.
- --compression defaults to gz (instead of none)
- ORF finder defaults to the fast-naive internal ORF finder
- --write-pre-reclustering-bins is False by default
- To train in semi-supervised mode, you must use the train_semi subcommand (and there is no train subcommand, you must be specific: train_semi or train_self).
A few arguments that were previously deprecated have been removed:
- --recluster: it already did nothing, as reclustering is the default
- --mode: use --train-from-many instead
- --training-type: use --semi-supervised to use semi-supervised learning instead
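When migrating older wrapper scripts, it can help to scan their command lines for these removed arguments. The following is a minimal sketch under that assumption: find_removed_args and REMOVED_ARGS are hypothetical names, and the advice strings simply restate the list above. It only reports the removed arguments and does not try to translate their values.

```python
# Removed arguments and the advice from the list above.
# This table and the helper below are illustrative, not part of SemiBin.
REMOVED_ARGS = {
    "--recluster": "drop it; reclustering is the default",
    "--mode": "use --train-from-many instead",
    "--training-type": "use --semi-supervised for semi-supervised learning",
}


def find_removed_args(argv):
    """Return (argument, advice) pairs for removed arguments found in argv."""
    hits = []
    for token in argv:
        # Accept both "--flag value" and "--flag=value" spellings.
        name = token.split("=", 1)[0]
        if name in REMOVED_ARGS:
            hits.append((name, REMOVED_ARGS[name]))
    return hits
```

Calling find_removed_args on the argument list of an old invocation returns the arguments that need attention, in the order they appear.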