An 'indivisible unit' of the hydrophobic interaction is an atomic group that includes one carbon or sulfur atom and covalently bound hydrogen atoms. In proteins and nucleic acids those groups are -CH3, -CH2-, -CH=, -SH, and -S- groups.

The program uses two variants of the list of non-polar groups. The extended list includes all such groups in proteins and nucleic acids. The strict list includes only those groups of the extended list whose main atom is covalently bound only to other carbon or sulfur atoms (and thus is not covalently bound to any polar atom).

The center of a non-polar group is the center of the carbon or sulfur atom. Hydrogen atoms are not considered in calculations.
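As an illustration of how group centers might be selected, here is a minimal sketch. The names `STRICT_EXCERPT` and `nonpolar_centers`, the tuple layout of an atom record, and the coordinates are all hypothetical; the lookup set holds only a few rows copied from the tables at the end of this document, not a full list.

```python
# Hypothetical excerpt of a (residue, atom) lookup list; a real run would
# load every row of the chosen table (strict or extended).
STRICT_EXCERPT = {
    ("ALA", "CB"),
    ("CYS", "SG"),
    ("LEU", "CB"), ("LEU", "CG"), ("LEU", "CD1"), ("LEU", "CD2"),
}

def nonpolar_centers(atoms, allowed):
    """Keep the atoms whose (residue, atom name) pair is on the list.

    Each atom is (residue, atom_name, (x, y, z)). The returned coordinates
    are the group centers; hydrogen atoms are simply never listed.
    """
    return [(res, name, xyz) for res, name, xyz in atoms if (res, name) in allowed]

# Made-up coordinates, for illustration only.
atoms = [
    ("LEU", "N",   (0.0, 0.0, 0.0)),
    ("LEU", "CA",  (1.5, 0.0, 0.0)),
    ("LEU", "CB",  (2.0, 1.4, 0.0)),
    ("LEU", "CD1", (3.1, 2.9, 1.0)),
    ("SER", "CB",  (6.0, 1.0, 0.0)),
]
centers = nonpolar_centers(atoms, STRICT_EXCERPT)
print([name for _, name, _ in centers])
```

Switching between the strict and extended variants then amounts to passing a different lookup set.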

Each non-polar group corresponds to one vertex of the graph.
Two vertices are connected by an edge (and the corresponding groups are assumed to interact) if
(a) the distance between the centers of the groups does not exceed the threshold value d;
(b) the interaction of these groups is not prevented by any other (polar or non-polar) group.
We say that a group C prevents the interaction between non-polar groups A and B,
if the ball with the center C and the radius R_{C} includes the intersection of the balls
with the centers A and B and the radii R_{A} and R_{B}, respectively.
In the program all radii are equal to 2.7 Å, which corresponds to the minimal possible distance
from a carbon-generated non-polar group to the oxygen atom of a water molecule.
The value of d is a user parameter. The maximal reasonable value of d is 5.4 Å,
which corresponds to the sum of R_{A} and R_{B} radii in the case of carbon non-polar groups
(roughly speaking, the hydrophobic interaction occurs if a water molecule cannot be placed between two
non-polar groups).
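Conditions (a) and (b) can be sketched in code as follows. This is an illustrative reading of the definitions above, not the program's actual source: the function names are hypothetical, all radii are taken equal to 2.7 Å as stated, the centers of A and B are assumed distinct, and the naive triple loop ignores the spatial indexing a real implementation would need. The containment test relies on the fact that the point of the lens-shaped intersection farthest from C lies either on the rim circle of the lens or at the point of one sphere directly opposite C.

```python
import math
from itertools import combinations

R = 2.7  # radius of every group in angstroms (value from the text)

def prevents(c, a, b, r=R):
    """True if ball(c, r) contains the intersection of ball(a, r) and ball(b, r)."""
    d = math.dist(a, b)
    if d > 2 * r:
        return False  # balls do not overlap; cannot occur when d_max <= 2r
    far = 0.0
    # Farthest point from c on each sphere, relevant only if it lies in the lens.
    for p, q in ((a, b), (b, a)):
        n = math.dist(p, c)
        if n > 0:
            tip = tuple(pi + r * (pi - ci) / n for pi, ci in zip(p, c))
            if math.dist(tip, q) <= r:
                far = max(far, n + r)
    # Farthest point from c on the rim circle where the two spheres meet.
    mid = tuple((x + y) / 2 for x, y in zip(a, b))
    rim = math.sqrt(max(r * r - (d / 2) ** 2, 0.0))
    axis = tuple((y - x) / d for x, y in zip(a, b))
    w = tuple(ci - mi for ci, mi in zip(c, mid))
    h = sum(wi * ai for wi, ai in zip(w, axis))
    radial = math.sqrt(max(sum(wi * wi for wi in w) - h * h, 0.0))
    far = max(far, math.hypot(h, radial + rim))
    return far <= r

def build_graph(nonpolar, polar=(), d_max=5.4):
    """Return the set of index pairs (i, j), i < j, that satisfy (a) and (b)."""
    edges = set()
    for i, j in combinations(range(len(nonpolar)), 2):
        a, b = nonpolar[i], nonpolar[j]
        if math.dist(a, b) > d_max:
            continue  # condition (a) fails
        blockers = [p for k, p in enumerate(nonpolar) if k not in (i, j)]
        blockers.extend(polar)
        if not any(prevents(c, a, b) for c in blockers):
            edges.add((i, j))  # condition (b) holds
    return edges

# Two groups 5 angstroms apart, with and without a polar group between them.
print(build_graph([(0, 0, 0), (5, 0, 0)]))
print(build_graph([(0, 0, 0), (5, 0, 0)], polar=[(2.5, 0.5, 0)]))
```

In the second call the polar group sits between the two non-polar groups and its ball covers their whole intersection, so no edge is created.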

Edges of the graph reflect two substantially different types of relations between atomic groups: (i) 'fixed' groups (including covalently bound ones); (ii) groups that are located close to each other due to hydrophobic interactions. Two non-polar groups A and B are 'fixed' if the distance between them is fixed by chemical bonds, or by any other interaction whose energy substantially exceeds the free energy of the hydrophobic interaction between the groups. Examples of fixed non-polar groups are two groups whose central atoms are covalently bound to the same third atom, or any two groups of an aromatic ring. In the described algorithm and program these two types of edges are not distinguished.

The graph of interactions of non-polar groups is denoted by Γ.

Fig. 1. Connected 2-edge subgraphs (thick lines) representing (a) a (2, 1)-cut; (b) a non-(2, 1)-cut. Vertices of the 1-neighbourhood of the subgraphs are in dark grey. In (a), after the removal of the thick edges from the 1-neighbourhood, the resulting subgraph decomposes into 2 connected components. In (b), the analogous procedure gives a connected subgraph.
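The test illustrated in Fig. 1 can be sketched as below. This is only a reading of the caption under stated assumptions: K is taken to be the number of edges in the candidate subgraph, L the radius of its neighbourhood, and the L-neighbourhood is taken to be the subgraph induced on all vertices within L edges of the candidate. The function names and the adjacency-dictionary representation are hypothetical.

```python
from collections import deque

def l_neighbourhood(adj, seeds, L):
    """Vertices lying within L edges of any seed vertex (breadth-first search)."""
    dist = {v: 0 for v in seeds}
    queue = deque(seeds)
    while queue:
        v = queue.popleft()
        if dist[v] == L:
            continue
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return set(dist)

def is_cut(adj, sub_edges, L=1):
    """True if removing sub_edges disconnects their L-neighbourhood."""
    removed = {frozenset(e) for e in sub_edges}
    seeds = {v for e in removed for v in e}
    region = l_neighbourhood(adj, seeds, L)
    start = next(iter(region))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for w in adj[v]:
            # Stay inside the neighbourhood and skip the candidate edges.
            if w in region and frozenset((v, w)) not in removed and w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) < len(region)

# Path 0-1-2-3: removing the middle edge splits the neighbourhood.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
# Cycle 0-1-2-3-0: removing one edge leaves it connected.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(is_cut(path, [(1, 2)]), is_cut(cycle, [(0, 1)]))
```

Restricting the connectivity check to the neighbourhood keeps the cost of each test independent of the size of the whole graph.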

The algorithm starts with the enumeration of all subgraphs consisting of one edge.
Then for each subgraph Δ all

The algorithm implements the above-mentioned intuitive concept of a hydrophobic cluster as follows: (i) weak interactions between the hydrophobic clusters (not yet found) are searched for; in terms of the graph of interactions, these weak interactions correspond to

Fig. 2. A planar illustration of the (K, L)-cut algorithm. (a) The initial distribution of non-polar groups. (b) The graph of interactions of non-polar groups. (c) All (1, 1)-cuts in the graph. The edges of the (1, 1)-cuts are shown as thick lines. (d) The graph after removal of all (1, 1)-cuts. An example of one (2, 1)-cut is shown (thick edges). The clusters found after the removal of the (1, 1)-cuts and one (2, 1)-cut are labeled.

The algorithm works as follows.
At the first step the graph Γ of interactions of the non-polar groups is constructed
(see Fig. 2b).
At the second step all

For graphs of interacting non-polar groups the algorithm is linear in the number N of atoms in a given structure, because the number of interactions of a non-polar group with other groups is bounded. The constant in this linear function is rather large and depends exponentially on K. Fortunately, small values K=1, 2, 3 are expected to be sufficient for reasonable hydrophobic cluster detection.

In the program implementation the values K=1, L=1, m=3 are fixed.

(i) The number of non-polar groups in the cluster.

(ii) The mean number of interactions of one non-polar group in the cluster. This parameter indirectly characterizes the shape of the cluster: the more compact the packing of hydrophobic groups and the closer the shape of the cluster is to an ideal ball, the higher the mean number of interactions per non-polar group.

(iii) Semiaxes of the ellipsoid of inertia of the cluster. In calculating the ellipsoid, the groups of the cluster are treated as point masses in space, each of unit mass. Ratios of the semiaxes reflect the spatial shape of the cluster. For example, if all three semiaxes are approximately equal, the shape of the cluster can be assumed to be close to a ball.
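The three parameters above could be computed along the following lines. This is a sketch under assumptions that the text does not confirm: interactions are counted only between members of the same cluster, and the ellipsoid of inertia is taken in its classical form, the surface x·I·x = 1 for the inertia tensor I about the centroid, so that each semiaxis equals 1/sqrt of a principal moment. The program may normalize differently; only the ratios of semiaxes matter for the shape. All function names are hypothetical.

```python
import math

def symmetric_eigenvalues_3x3(a):
    """Eigenvalues of a symmetric 3x3 matrix (closed-form trigonometric method)."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0:
        return [a[0][0], a[1][1], a[2][2]]  # already diagonal
    q = (a[0][0] + a[1][1] + a[2][2]) / 3
    p = math.sqrt((sum((a[i][i] - q) ** 2 for i in range(3)) + 2 * p1) / 6)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)] for i in range(3)]
    det = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
           - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
           + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    phi = math.acos(max(-1.0, min(1.0, det / 2))) / 3
    e1 = q + 2 * p * math.cos(phi)
    e3 = q + 2 * p * math.cos(phi + 2 * math.pi / 3)
    return [e1, 3 * q - e1 - e3, e3]

def cluster_parameters(points, edges):
    """points: {group id: (x, y, z)} for one cluster; edges: pairs of group ids.

    Returns (number of groups, mean interactions per group, semiaxes sorted
    from longest to shortest).
    """
    n = len(points)
    inner = sum(1 for i, j in edges if i in points and j in points)
    mean_interactions = 2 * inner / n  # every edge involves two groups
    centroid = [sum(p[k] for p in points.values()) / n for k in range(3)]
    rel = [[p[k] - centroid[k] for k in range(3)] for p in points.values()]
    total = sum(x * x + y * y + z * z for x, y, z in rel)
    # Inertia tensor of unit point masses about the centroid.
    tensor = [[(total if r == c else 0.0) - sum(v[r] * v[c] for v in rel)
               for c in range(3)] for r in range(3)]
    moments = symmetric_eigenvalues_3x3(tensor)
    semiaxes = sorted((1 / math.sqrt(m) if m > 1e-12 else math.inf for m in moments),
                      reverse=True)
    return n, mean_interactions, semiaxes

# Six made-up points, symmetric about the origin.
octahedron = {0: (1, 0, 0), 1: (-1, 0, 0), 2: (0, 1, 0),
              3: (0, -1, 0), 4: (0, 0, 1), 5: (0, 0, -1)}
print(cluster_parameters(octahedron, [(0, 2), (2, 4), (4, 0)]))
```

For the symmetric example all three semiaxes come out equal, which corresponds to the ball-like case described in (iii). A zero principal moment (all points on one line) is reported as an infinite semiaxis.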

Residue | Atom (in PDB notation) | |

ALA | CB | |

ARG | CB | |

ARG | CG | |

ASN | CB | |

ASP | CB | |

CYS | CB | |

CYS | SG | |

GLU | CB | |

GLU | CG | |

GLN | CB | |

GLN | CG | |

HIS | CB | |

HIS | CG | |

HIS | CD2 | |

HIS | CE1 | |

ILE | CB | |

ILE | CG1 | |

ILE | CD1 | |

ILE | CG2 | |

LEU | CB | |

LEU | CG | |

LEU | CD1 | |

LEU | CD2 | |

LYS | CB | |

LYS | CG | |

LYS | CD | |

MET | CB | |

MET | CG | |

MET | SD | |

MET | CE | |

PHE | CB | |

PHE | CG | |

PHE | CD2 | |

PHE | CE2 | |

PHE | CZ | |

PHE | CE1 | |

PHE | CD1 | |

PRO | CB | |

PRO | CG | |

THR | CG2 | |

TRP | CB | |

TRP | CG | |

TRP | CD2 | |

TRP | CE3 | |

TRP | CZ3 | |

TRP | CH2 | |

TRP | CZ2 | |

TRP | CE2 | |

TYR | CB | |

TYR | CG | |

TYR | CD2 | |

TYR | CE2 | |

TYR | CE1 | |

TYR | CD1 | |

VAL | CB | |

VAL | CG1 | |

VAL | CG2 | |

N | C2* | N is for any DNA nucleotide |

T | C5M | |

C | C5 |

Residue | Atom (in PDB notation) | |

X | C | X is for any amino acid residue |

X | CA | |

ALA | CB | |

ARG | CB | |

ARG | CG | |

ARG | CD | |

ARG | CZ | |

ASN | CB | |

ASN | CG | |

ASN | ND2 | |

ASP | CB | |

ASP | CG | |

CYS | CB | |

CYS | SG | |

GLU | CB | |

GLU | CG | |

GLU | CD | |

GLN | CB | |

GLN | CG | |

GLN | CD | |

HIS | CB | |

HIS | CG | |

HIS | CD2 | |

HIS | CE1 | |

ILE | CB | |

ILE | CG1 | |

ILE | CD1 | |

ILE | CG2 | |

LEU | CB | |

LEU | CG | |

LEU | CD1 | |

LEU | CD2 | |

LYS | CB | |

LYS | CG | |

LYS | CD | |

LYS | CE | |

MET | CB | |

MET | CG | |

MET | SD | |

MET | CE | |

PHE | CB | |

PHE | CG | |

PHE | CD2 | |

PHE | CE2 | |

PHE | CZ | |

PHE | CE1 | |

PHE | CD1 | |

PRO | CB | |

PRO | CG | |

PRO | CD | |

SER | CB | |

THR | CB | |

THR | CG2 | |

TRP | CB | |

TRP | CG | |

TRP | CD2 | |

TRP | CE3 | |

TRP | CZ3 | |

TRP | CH2 | |

TRP | CZ2 | |

TRP | CE2 | |

TRP | CD1 | |

TYR | CB | |

TYR | CG | |

TYR | CD2 | |

TYR | CE2 | |

TYR | CZ | |

TYR | CE1 | |

TYR | CD1 | |

VAL | CB | |

VAL | CG1 | |

VAL | CG2 | |

N | C5* | N is for any DNA nucleotide |

N | C4* | |

N | C3* | |

N | C2* | |

N | C1* | |

T | C2 | |

T | C4 | |

T | C5 | |

T | C5M | |

T | C6 | |

C | C2 | |

C | C4 | |

C | C5 | |

C | C6 | |

A | C8 | |

A | C5 | |

A | C6 | |

A | C2 | |

A | C4 | |

G | C8 | |

G | C5 | |

G | C6 | |

G | C2 | |

G | C4 |