Outputs of SemiBin

Single sample/co-assembly binning

output_bins: directory of the final reconstructed bins (after reclustering by default).
model.pt: saved deep learning model (self-supervised by default).
data.csv/data_split.csv: data used to train the deep learning model.
*_data_cov.csv/*_data_split_cov.csv: coverage data generated from the depth file.
cannot/cannot.txt: cannot-link file used in training (only present when using semi-supervised mode).
bins_info.tsv: table with basic information on each bin (name, total number of basepairs, number of contigs, N50, and L50; more columns may be added in the future).
contig_bins.tsv: table mapping each contig to its assigned bin.
SemiBinRun.log: run log (timestamped messages, including any fatal error that aborted the run). This is the first place to look when troubleshooting.
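
For readers unfamiliar with the N50 and L50 columns of bins_info.tsv, the sketch below computes both from a list of contig lengths using the common definition. The function name n50_l50 is hypothetical and not part of SemiBin, and it is an assumption that SemiBin breaks ties and rounds in exactly the same way.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig lengths.

    N50: length of the contig at which the running total, taken from the
    longest contig downwards, first reaches half of the total bin size.
    L50: how many contigs were needed to reach that point.
    Illustrative only; not SemiBin's own implementation.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if 2 * running >= total:
            return length, count


# Hypothetical bin with five contigs (lengths in basepairs).
n50, l50 = n50_l50([100, 200, 300, 400, 500])
print(n50, l50)  # 400 2
```

In the example the bin totals 1500 bp; the two longest contigs (500 and 400) together reach 900 bp, which is the first point at or past half of the total, so N50 is 400 and L50 is 2.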

If --write-pre-reclustering-bins is passed, pre-reclustering bins are also written to a separate output_prerecluster_bins directory.

Multi-sample binning

bins: reconstructed bins from all samples.
samples/*.fasta: contig FASTA file for each sample, split out of the input whole_contig.fna.
samples/*_data_cov.csv: same as in single sample/co-assembly binning.
samples/{sample-name}/: SemiBin output directory for each sample (same contents as in single sample/co-assembly binning).
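
When several samples are binned in one run, it can be handy to summarize the per-sample results in one pass. The sketch below walks samples/{sample-name}/ and counts contigs per bin from each contig_bins.tsv. It rests on assumptions that are not stated above: that each per-sample directory contains a contig_bins.tsv, that the file is tab-separated with a header row, and that the first column is the contig name and the second the bin label. The helper names are hypothetical.

```python
import csv
from collections import Counter
from pathlib import Path


def count_contigs_per_bin(tsv_path, has_header=True):
    """Count contigs per bin in one contig_bins.tsv-style file.

    Assumption: tab-separated rows, contig name in the first column and
    bin label in the second. Check the actual file before relying on this.
    """
    counts = Counter()
    with open(tsv_path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        if has_header:
            next(reader, None)  # skip the assumed header row
        for row in reader:
            if len(row) >= 2:
                counts[row[1]] += 1
    return counts


def summarize_samples(output_dir):
    """Map each sample name to its per-bin contig counts."""
    samples_dir = Path(output_dir) / "samples"
    summary = {}
    if not samples_dir.is_dir():
        return summary
    for sample_dir in sorted(samples_dir.iterdir()):
        table = sample_dir / "contig_bins.tsv"
        if sample_dir.is_dir() and table.is_file():
            summary[sample_dir.name] = count_contigs_per_bin(table)
    return summary
```

Calling summarize_samples("output") on a hypothetical output directory returns a dictionary keyed by sample name; per-sample FASTA and coverage files that sit directly under samples/ are skipped because they are not directories.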