CBcalc - Compositional Bias calculation

Input form

FASTA sequence:

Enter a sequence or choose a FASTA file (may be gzipped; max 50 MiB).

List of sites:

Enter sites here (one per line) or choose a text file.

Method(s) of compositional bias calculation:

M0 - Bernoulli model-based method;
MM - maximum order Markov chain model-based method;
PBM - the method of Pevzner and co-authors;
BCK - the method of Burge and co-authors (recommended).

Guidelines

Input sequence

The input sequence should be in FASTA format. It may contain multiple entries, which are presumed to be separate parts of a single sequence. Any non-ACGT symbol is treated as a sequence break, except for space, newline, and gap ("-") symbols, which are ignored.
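The rules above can be sketched in Python as follows. This is only an illustration, not CBcalc's actual code: the function name split_into_segments is hypothetical, and two behaviours are assumptions not stated in the guidelines (lowercase letters are upper-cased, and a FASTA header line starts a new segment).

```python
import re


def split_into_segments(fasta_text):
    """Return the uninterrupted ACGT runs found in FASTA-formatted text."""
    pieces = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Assumption: entries are separate parts, so a header acts as a break.
            pieces.append("N")
        else:
            pieces.append(line)
    # Assumption: lowercase input is accepted and upper-cased.
    joined = "".join(pieces).upper()
    # Spaces and gap symbols are ignored (newlines are already gone).
    joined = re.sub(r"[ -]", "", joined)
    # Any other non-ACGT symbol is a sequence break.
    return [segment for segment in re.split(r"[^ACGT]+", joined) if segment]


segments = split_into_segments(">a\nAC-G T\nNNAC\n>b\nGGX T")
print(segments)
```

Under these assumptions, the gap and the space inside "AC-G T" are dropped, while N and X split the sequence into separate segments.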

The input sequence can be loaded from a FASTA file. The file may be compressed with gzip (.gz extension) to reduce its size. The maximum accepted size is 50 MiB (mebibytes; 1 MiB = 1024×1024 bytes).
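For preparing input locally, a minimal sketch of reading such a file might look like this. The function name read_fasta_file is hypothetical, and it is an assumption that the 50 MiB limit applies to the file as stored on disk (that is, to the compressed size for .gz files).

```python
import gzip
import os

MAX_BYTES = 50 * 1024 * 1024  # 50 MiB


def read_fasta_file(path):
    """Read a plain or gzip-compressed FASTA file as text."""
    # Assumption: the limit is checked against the on-disk size.
    if os.path.getsize(path) > MAX_BYTES:
        raise ValueError("file is larger than 50 MiB")
    # The .gz extension selects gzip decompression.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return handle.read()
```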

Note: if a file is selected, the text field content is ignored.

Input list of sites

CBcalc can handle both continuous and bipartite sites whose length does not exceed 10 bases. A bipartite site contains two continuous parts separated by a multi-N spacer of fixed length. The length of a bipartite site is the sum of the lengths of its parts. The spacer (gap) length should not exceed 16. Sites may contain any DNA nucleotide symbols (A C G T W S M K R Y B D H V N). Empty lines are ignored.

The list of sites can be loaded from an ASCII text file (not RTF, DOC, DOCX, ODT, etc.). It should contain one site per line. Empty lines are ignored.
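The site rules in this section can be checked before submission with a sketch like the one below. The names parse_site and parse_site_list are hypothetical. How CBcalc recognises the spacer is not specified here, so the code assumes that a spacer is a single internal run of two or more N symbols, and that lowercase input is upper-cased.

```python
import re

IUPAC_DNA = set("ACGTWSMKRYBDHVN")
MAX_SITE_LENGTH = 10  # bases, spacer not counted
MAX_GAP_LENGTH = 16


def parse_site(line):
    """Return (left part, gap length, right part) for one site."""
    site = line.strip().upper()  # assumption: case-insensitive input
    if not site or not set(site) <= IUPAC_DNA:
        raise ValueError(f"not a DNA site: {line!r}")
    # Assumption: a spacer is an internal run of at least two N symbols.
    runs = [m for m in re.finditer(r"N{2,}", site)
            if m.start() > 0 and m.end() < len(site)]
    if len(runs) > 1:
        raise ValueError(f"more than one spacer: {line!r}")
    if runs:
        left, right = site[:runs[0].start()], site[runs[0].end():]
        gap = runs[0].end() - runs[0].start()
    else:
        left, gap, right = site, 0, ""
    # The site length is the sum of the lengths of its parts.
    if len(left) + len(right) > MAX_SITE_LENGTH:
        raise ValueError(f"site longer than 10 bases: {line!r}")
    if gap > MAX_GAP_LENGTH:
        raise ValueError(f"gap longer than 16: {line!r}")
    return left, gap, right


def parse_site_list(text):
    """Parse one site per line, ignoring empty lines."""
    return [parse_site(line) for line in text.splitlines() if line.strip()]


print(parse_site_list("GAATTC\n\nCCANNNNNNTGG\n"))
```

A continuous site is returned with a gap of 0 and an empty right part, so both kinds of site share one representation.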

Note: if a file is selected, the text field content is ignored.

Methods of calculation of compositional biases

Four methods are implemented: M0, MM, PBM, and BCK. M0 is based on a Bernoulli model of the genome sequence. MM uses Markov chains of the maximum applicable order (L-2, where L is the site length) as the sequence model. PBM was designed by Pevzner et al. to improve on MM. BCK was suggested by Burge et al.; it takes into account the observed frequencies of all subsites of a site.
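To make the first two models concrete, here is a minimal sketch of the textbook expected-count estimates for a Bernoulli model and for a Markov chain of order L-2. It is not CBcalc's implementation: the function names are hypothetical, only continuous sites made of A, C, G, T are handled, a single strand is counted, and details such as strand handling or ambiguous symbols may differ in the real tool. PBM and BCK are not sketched.

```python
from collections import Counter


def count_words(segments, length):
    """Count all overlapping words of the given length within each segment."""
    counts = Counter()
    for segment in segments:
        for i in range(len(segment) - length + 1):
            counts[segment[i:i + length]] += 1
    return counts


def expected_m0(site, segments):
    """Bernoulli estimate: positions times the product of base frequencies."""
    base_counts = count_words(segments, 1)
    total = sum(base_counts.values())
    positions = sum(max(len(s) - len(site) + 1, 0) for s in segments)
    probability = 1.0
    for base in site:
        probability *= base_counts[base] / total
    return positions * probability


def expected_mm(site, segments):
    """Order L-2 Markov estimate: N(prefix) * N(suffix) / N(core)."""
    if len(site) < 3:
        raise ValueError("this sketch needs a site of at least 3 bases")
    longer = count_words(segments, len(site) - 1)
    core = count_words(segments, len(site) - 2)[site[1:-1]]
    if core == 0:
        return 0.0
    return longer[site[:-1]] * longer[site[1:]] / core


segments = ["ACGTACGT"]
observed = count_words(segments, 3)["ACG"]
print(observed, expected_m0("ACG", segments), expected_mm("ACG", segments))
```

Comparing the observed count with an expected count, for example as an observed-to-expected ratio, is one common way to express a bias; whether CBcalc reports exactly this quantity is an assumption here.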

Stand-alone version and source code

CBcalc has a stand-alone version. The source code is available on GitHub.

© Rusinov, 2016