ECCC 2 - Canonical numbering of 3-D Molecules

Chr. Benecke, A. Kerber, R. Laue

Example for canonically numbering 2-D Structures


The structure we now want to number canonically is m-bromobenzamide:

The actual (not canonical) numbering is shown above.


Building the initial classification:
In this example, we will not use information about the aromatic ring as described in the previous chapter. In addition, the H-atoms are not used.
After the preclassification, five classes are built as follows:

Class 1 : 10 (Br atom)
Class 2 : 1 3 4 5 (C atoms with one H-atom connected)
Class 3 : 2 6 7 (C atoms without H-atoms connected)
Class 4 : 9 (N atom)
Class 5 : 8 (O atom)
Memorized Class : -

According to the algorithm, we go through all the classes and memorize those consisting of only one atom. This fills the queue of memorized classes, and we end up with:

Memorized Class : 1 4 5
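
The preclassification and the collection of single-atom classes can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the names LABELS, LABEL_ORDER and preclassify are hypothetical, and the label order is an assumption chosen solely to reproduce the class order listed above.

```python
from collections import defaultdict

# Atom labels in the original numbering: "CH" is a carbon with one H-atom,
# "C" a carbon without H-atoms (hypothetical encoding for this sketch).
LABELS = {1: "CH", 2: "C", 3: "CH", 4: "CH", 5: "CH",
          6: "C", 7: "C", 8: "O", 9: "N", 10: "Br"}

# Assumed order of the labels, chosen to match the class list in the text.
LABEL_ORDER = ["Br", "CH", "C", "N", "O"]

def preclassify(labels, label_order):
    """Group atoms with the same label; return the classes in label order."""
    groups = defaultdict(list)
    for atom in sorted(labels):
        groups[labels[atom]].append(atom)
    return [groups[label] for label in label_order]

classes = preclassify(LABELS, LABEL_ORDER)

# Memorize (1-based) every class that consists of a single atom.
memorized = [i for i, members in enumerate(classes, start=1)
             if len(members) == 1]

print(classes)
print(memorized)
```

With these labels the sketch yields the five classes above and the memorized list 1 4 5.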

Now, we start the iterated classification algorithm with the first class memorized (Class 1) and try to refine the other classes according to the connection information of atom number 10:


Iterated Classification:
Since atom 10 is only connected to atom 2 from class 3, only this class can be split. As described before, we build one class containing atom 2 and another class containing the remaining atoms from class 3: atoms 6 and 7. The class containing atom 2 must be added to the list of memorized classes. Thus we get the following class list:

Class 1 : 10
Class 2 : 1 3 4 5
Class 3 : 2
Class 4 : 9
Class 5 : 8
Class 6 : 6 7
Memorized Class : 1 4 5 3
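
A single refinement step of this kind might look as follows in Python. The bond list is reconstructed from the connectivity statements in this example and the structure of m-bromobenzamide, since the figure is not reproduced here. The names BONDS, NEIGHBORS and refine are hypothetical, and letting the connected part keep the old class number is an assumption that matches this particular step.

```python
# Bonds between non-hydrogen atoms (original numbering), reconstructed
# from the text: ring 1-2-3-4-5-6, Br (10) on atom 2, C(=O)N on atom 6.
BONDS = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1),
         (2, 10), (6, 7), (7, 8), (7, 9)]

NEIGHBORS = {}
for a, b in BONDS:
    NEIGHBORS.setdefault(a, set()).add(b)
    NEIGHBORS.setdefault(b, set()).add(a)

def refine(classes, atom):
    """Split each class into atoms bonded to `atom` and the rest.

    Assumption: the bonded part keeps the class number, the rest is
    appended as a new class. Returns the new class list and the
    1-based numbers of classes that became single-atom classes.
    """
    result = [list(c) for c in classes]
    new_singletons = []
    for i, members in enumerate(classes):
        near = [m for m in members if m in NEIGHBORS[atom]]
        far = [m for m in members if m not in NEIGHBORS[atom]]
        if near and far:  # only a mixed class can be split
            result[i] = near
            result.append(far)
            for number, part in ((i + 1, near), (len(result), far)):
                if len(part) == 1:
                    new_singletons.append(number)
    return result, new_singletons

classes = [[10], [1, 3, 4, 5], [2, 6, 7], [9], [8]]
memorized = [1, 4, 5]

classes, new_singletons = refine(classes, 10)  # refine with atom 10
memorized += new_singletons

print(classes)
print(memorized)
```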


The next atom to be used for iterated classification is the one in class 4: atom 9. The only atom connected to atom 9 is atom 7, which occurs in class 6, so only this class has to be split. Since class 6 consists of only two atoms, two new classes with only one atom each are created and, according to the previously described algorithm, the new class containing atom 7 is put into the memorized class list first. The resulting classification is:

Class 1 : 10
Class 2 : 1 3 4 5
Class 3 : 2
Class 4 : 9
Class 5 : 8
Class 6 : 6
Class 7 : 7
Memorized Class : 1 4 5 3 7 6


Trying to use class 5 for a refinement fails, since no class contains both an atom connected to atom 8 and an atom not connected to atom 8. So the classes do not change; only the memorized class list is updated.

Memorized Class : 1 4 5 3 7 6
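
The condition used here (a class can only be split if it contains both an atom connected to the refining atom and one that is not) can be written as a small predicate. This is a sketch under the same assumptions as the rest of the example: the bond list is reconstructed from the text, and BONDS, NEIGHBORS and splittable are hypothetical names.

```python
# Bonds between non-hydrogen atoms (original numbering), reconstructed
# from the connectivity statements in the text.
BONDS = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1),
         (2, 10), (6, 7), (7, 8), (7, 9)]

NEIGHBORS = {}
for a, b in BONDS:
    NEIGHBORS.setdefault(a, set()).add(b)
    NEIGHBORS.setdefault(b, set()).add(a)

def splittable(classes, atom):
    """Return the 1-based numbers of all classes that `atom` can split."""
    nbrs = NEIGHBORS[atom]
    return [number for number, members in enumerate(classes, start=1)
            if any(m in nbrs for m in members)
            and any(m not in nbrs for m in members)]

# Classification after the refinement with atom 9.
classes = [[10], [1, 3, 4, 5], [2], [9], [8], [6], [7]]

by_atom_8 = splittable(classes, 8)  # class 5: nothing to split
by_atom_2 = splittable(classes, 2)  # class 3: only class 2 is mixed

print(by_atom_8, by_atom_2)
```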


Class 3 contains atom 2. The only class that can be split now is class 2: atoms 1 and 3 are connected to atom 2, atoms 4 and 5 are not. Since neither of the resulting classes consists of only one member, no new class has to be memorized:

Class 1 : 10
Class 2 : 1 3
Class 3 : 2
Class 4 : 9
Class 5 : 8
Class 6 : 6
Class 7 : 7
Class 8 : 4 5
Memorized Class : 1 4 5 3 7 6


Class 7 cannot be used for a refinement either, since every atom connected to atom 7 is already the only member of its class. The memorized class list is now:

Memorized Class : 1 4 5 3 7 6


Class 6, the last class in our list of memorized classes, now brings the algorithm to its end. The atoms connected to atom 6 (the only member of class 6) that are not already members of a single-atom class are atoms 5 and 1. Thus class 2 and class 8 can both be split using this information. The final classification is now:

Class 1 : 10
Class 2 : 1
Class 3 : 2
Class 4 : 9
Class 5 : 8
Class 6 : 6
Class 7 : 7
Class 8 : 4
Class 9 : 3
Class 10 : 5
Memorized Class : 1 4 5 3 7 6 2 9 8 10
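
The whole iteration can be sketched as a queue-driven loop. Again this is only an illustration under stated assumptions, not the authors' code: the bond list is reconstructed from the text, the names are hypothetical, and the sketch tracks atoms rather than class numbers. It always queues the connected part of a split first, as described for atom 7, so the order of the last two atoms (4 and 5) may differ from the listing above.

```python
from collections import deque

# Bonds between non-hydrogen atoms (original numbering), reconstructed
# from the connectivity statements in the text.
BONDS = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1),
         (2, 10), (6, 7), (7, 8), (7, 9)]

NEIGHBORS = {}
for a, b in BONDS:
    NEIGHBORS.setdefault(a, set()).add(b)
    NEIGHBORS.setdefault(b, set()).add(a)

def iterated_classification(initial_classes):
    """Refine the classes with every memorized single-atom class in turn.

    Returns the final classes and the atoms in the order in which their
    single-atom classes were taken from the queue.
    """
    classes = [list(c) for c in initial_classes]
    queue = deque(i for i, c in enumerate(classes) if len(c) == 1)
    order = []
    while queue:
        atom = classes[queue.popleft()][0]
        order.append(atom)
        nbrs = NEIGHBORS[atom]
        # Visit only the classes that exist at the start of this step.
        for i in range(len(classes)):
            near = [m for m in classes[i] if m in nbrs]
            far = [m for m in classes[i] if m not in nbrs]
            if near and far:
                classes[i] = near
                classes.append(far)
                # Assumption: the connected part is queued first.
                if len(near) == 1:
                    queue.append(i)
                if len(far) == 1:
                    queue.append(len(classes) - 1)
    return classes, order

final_classes, order = iterated_classification(
    [[10], [1, 3, 4, 5], [2, 6, 7], [9], [8]])

print(final_classes)
print(order)
```

If every class ends up with exactly one atom, as it does here, no backtracking is required.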


Canonical numbering:
According to the list of memorized classes, the atoms in our molecule have to be numbered in the following order:

10 - 9 - 8 - 2 - 7 - 6 - 1 - 3 - 4 - 5
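
Turning this order into new atom numbers is a simple relabeling. The following Python sketch assumes that the first atom in the order receives number 1. The bond list is reconstructed from the text, and ORDER, BONDS, new_number and renumbered_bonds are hypothetical names.

```python
# Order of the atoms (original numbering) taken from the text.
ORDER = [10, 9, 8, 2, 7, 6, 1, 3, 4, 5]

# Bonds between non-hydrogen atoms (original numbering), reconstructed
# from the connectivity statements in the text.
BONDS = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1),
         (2, 10), (6, 7), (7, 8), (7, 9)]

# Map each original atom number to its canonical number (starting at 1).
new_number = {old: new for new, old in enumerate(ORDER, start=1)}

# Rewrite the bond list with canonical numbers, in sorted form.
renumbered_bonds = sorted(
    tuple(sorted((new_number[a], new_number[b]))) for a, b in BONDS)

print(new_number)
print(renumbered_bonds)
```

In this relabeling the Br atom becomes atom 1, the N atom atom 2 and the O atom atom 3.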

So our structure from the beginning of this example looks like:


As you can see, the backtracking algorithm is not needed in this example because the initial classification is already so effective. So once more we have to emphasize that the performance of the algorithm depends simply on how good the description of the molecule is. Of course, chemists can probably add many more properties to the atoms and define an appropriate order on them.



© Chr. Benecke, Oct. 1995