The Todd-Coxeter algorithm
One problem in group theory is to understand a group from a presentation. This is not an easy problem in general, and there are some notable obstructions. For instance, there is no hope to find an algorithm which can tell whether a particular word is the identity since the word problem is undecidable.
However, for finite presentations, those with a finite set of generators and a finite set of relations between them, the Todd-Coxeter algorithm is able to give a list of all elements of a finite group in an (unknown) finite time. In fact, the algorithm is able to do more: enumerate the cosets of a finite-index finitely generated subgroup. Or, more specificially, it is able to find a faithful group action with a given finitely generated stabilizer.
What we will do in this note is describe the Todd-Coxeter algorithm, give a rather short implementation of it, and prove its correctness. But, to simplify the proof of the algorithm, let us first take a diversion through the theory of group actions.
There is an online implementation of the algorithm at Math tools on this site.
1. Classification of group actions
A -set for a set and a group is a homomorphism — that is, a left group action. Maps between -sets are functions on the underlying sets which commute with the respective group actions.
An invariant subset of is one for which for all , and a reducible -set is one which has a nonempty invariant proper subset. An irreducible -set is also known as transitive, and is characterized by each pair having some with .
If is a nonempty proper invariant subset, then is as well. This is because if , then , so . The orbit of an element is the set , which is an invariant subset and irreducible. Having the same orbit is an equivalence relation, and so decomposes into a disjoint union of irreducible -sets, one per equivalence class. For this reason, we will now focus on classifying an irreducible -set.
A pointed -set is one with a distinguished element , called the basepoint, and maps between pointed -sets are -set maps which map the basepoint to the basepoint. A pointed -set has a corresponding subgroup consisting of all elements with . This is actually a functor. For , the induced map is simply an inclusion, since for stabilizing , so is stabilized as well.
Let be the stabilizer for an irreducible pointed -set . Define the -set map by , where is the set of left cosets of , with the -action defined by left multiplication, and with as the basepoint. This is well-defined since is the stabilizer of , and injective for the same reason. Given , let be such that , and so , hence is surjective. Therefore, is a -set isomorphism. Thus, every irreducible pointed -set is isomorphic to the coset -set for some subgroup . This completes the classification.
In case this doesn’t seem recognizable: a corollary is that the cardinality of an orbit is the index of the stabilizer, so the cardinality of an orbit times the order of the stabilizer equals the order of .
There are still a few questions which should be answered. What is the importance of the basepoint? Suppose , and is the stabilizer of , is the stabilizer of , and is such that . Then for , , so is in . By symmetry, and are conjugate subgroups of one another.
What are the automorphisms of an irreducible -set? Let be an automorphism (not pointed — if it were pointed, then it would be the identity). Since for all , there is some such that , so is determined by . Furthermore, for , since is both and , we have for some , which implies is in the normalizer of . Conversely, a map is well-defined for . The only such maps which are the identity are when , hence is isomorphic to . In particular, when is a normal subgroup, is itself. The converse follows at least for finite, since then is a stabilizer for all of .
While it is something of a tautology to say that is a -set, it is worth pointing out because whenever a pointed -set has as a stabilizer, and is a normal subgroup, then that -set can also be regarded as a -set with the same basepoint. In general, if is normal subgroup which is a subset of the stabilizer, then the -set may also be regarded as a -set.
To relate this to the theory of covering spaces, there is a universal -set, namely as a -set. Whenever is a subgroup of , then the quotient map is the unique pointed -set map between them.
A construction which will prove useful is a sort of quotient. Suppose is a normal subgroup of and is a -set. Then is a -set as well. As an aside: this is related to the induction/restriction adjunction. Restriction of to an -set is a functor, and the irreducible -subsets correspond to the elements of the above quotient. Induction of an -set to a -set is to take with equivalent to and with the -action defined by multiplication in the left component.
2. Describing a -set with a Schreier graph
In this section, we will discuss a way to describe a -set when is generated by a set .
A Schreier graph for a pointed -set and generating set is a labeled directed graph with the vertex set and a -labeled edge for each and .
For a given label , every vertex has exactly one outgoing edge and one incoming edge with that label. This is because the acts on bijectively. For this reason, we can describe a path starting from using only edge labels and the direction of travel along each edge. For , will denote following the edge leaving a particular vertex, and the following the edge entering that vertex. A path from a vertex has each or in , and we follow the path in right-to-left order, so that is the vertex at the end of the path.
A relation in is a word equaling the identity, and a relator in at is a loop starting and ending at . Every relation corresponds to a relator at each vertex.
The set of relators at a particular vertex, taken as words in , is the stabilizer of that vertex. This gives a way to see whether a particular graph is the Schreier graph of a -set with a particular stabilizer: check that each vertex has the correct number of incoming and outgoing edges, check that each relation in is a relator in the graph, and check that the set of relators at the basepoint equals the stabilizer of the basepoint.
If is a Schreier graph , and if every element of a normal subgroup of is a relator in at , then is a Schreier graph of as well.
Suppose is a free group. The Schreier graph for as a -set is a tree, which is a loop-free graph. Given a normal subgroup of , the quotient -set has each element of form a loop in , and each relator in is an element of , or in other words a relation in . Suppose is the normal closure of some set of generators. Each generator is a relator in at every vertex, because conjugation brings a relator from the basepoint to any other vertex. It is sufficient to check each that each generator is a relator at every vertex to see that is a subset of the stabilizer.
Suppose is any subgroup of . We may regard it as a subgroup of instead by finding generators of . Then, the Schreier graph for is , so the generators of are relators at the basepoint.
3. The algorithm
With these facts in mind, the Todd-Coxeter algorithm is fairly simple. Given the following data:
- a finite set of generators,
- a finite set of relations, and
- a finite set of generators for a subgroup
- Begin with the Schreier graph of the free group generated by , with vertices labeled by positive integers. Let be the basepoint. All of the vertices are unlabeled and unmarked, but is labeled with .
- For each generator in , follow the path from , labeling the unlabeled vertices with fresh integers along the way, and then identify with the end vertex.
For each unlabeled unmarked vertex :
- Label with a fresh integer.
- Follow from each generator and generator’s inverse, labeling unlabeled neighbors with fresh integers.
- For each relation in , follow the path from , labeling unlabeled vertices with fresh integers along the way, and then identify with the end vertex.
- Mark .
To identify two vertices of a Schreier graph, remove the vertex with the greater label, and move all the edges from the removed vertex to the remaining vertex. Then, wherever the there are too many edges of a particular label around that vertex, identify the vertices at the other end of those edges. Since the set of labeled integers at any particular moment is finite, this identification process must eventually terminate. The result is a Schreier graph of a free group.
Notice the following facts about this algorithm:
- At any moment, except while in the process of identifying vertices, the graph is some Schreier graph for the free group generated by .
- Each vertex of the graph is eventually visited.
- For any particular vertex, eventually every relation from is a relator.
- Every generator of is a relator at the basepoint.
- Every vertex has a label which eventually stabilizes since they are a decreasing sequence of positive integers.
- Every path starting at eventually stabilizes in that all of the vertex numbers along the path eventually stabilize.
Thus, the result may be regarded as a -set. Since the graph is path-connected, it corresponds to an irreducible -set, and if is finite, then the algorithm must terminate: the graph has no more vertices than , so there is some point where all of the neighbors of those vertices have stabilized. There is a path between the basepoint and any labeled vertex, so at this point such a path remains within the stabilized vertices.
The question remains whether the result actually is isomorphic to itself. Each step of the algorithm creates a new Schreier graph whose stabilizer contains the stabilizer of the previous Schreier graph, and so there is an increasing chain of subgroups, each of which is inside the limiting . A relator in is a path, and so there is some step of the algorithm such that that path in that step’s Schreier graph is a loop, and so it is an element of that step’s stabilizer, and hence of .
That is it for correctness.
Just for the sake of some intuition: the way I think about the algorithm is that we are creating a covering space of the wedge sum of circles, one circle per generator of . We lazily quotient the covering space by the relations of and the generators of to produce a covering space whose fiber is in one-to-one correspondence with .
One thing to notice about the algorithm is that there is no need for to be finite, other than to guarantee it will terminate. There is still a limit graph , but unfortunately there is no way to tell when the graph has stabilized!
4. Implementation
Now we give a concrete implementation of the algorithm in Python. For simplicity, we will just compute with the presentation . You can download tc.py.
The main data structures are
idents = [] neighbors = [] to_visit = 0The idents variable is a list of numbers, with the property that idents[i] <= i. This is a union-find data structure for keeping track of the vertex quotients for the Schreier graph so far. The find function looks up the current-lowest label for a particular labeled vertex:
def find(c): c2 = idents[c] if c == c2: return c else: c2 = find(c2) idents[c] = c2 return c2
For vertices which have not been removed, the neighbors data structure contains a list of all the neighboring vertices, with None for non-existent neighbors — non-existent neighbors are presumed to be a portion of the free group’s universal Schreier graph, namely a tree. Each entry is another vertex label. Vertex labels may be out of date, so find must be run on them to ensure validity.
The to_visit variable keeps track of which vertices have been marked. Any vertex whose label is less than to_visit is considered marked.
We record the relations and generators in the following structures, the details of which are not that important:
ngens = 2 rels = [ (1, 0), # a^-1a (3, 2), # b^-1b (0, 0, 0), #a^3 (2, 2), # b^2 (0, 2, 0, 2) # abab ] hgens = [ (2,), # b ]We use 0 and 1 for and , and 2 and 3 for and . For simplicity of the core algorithm, we introduce and as explicit relations.
The identification procedure is called unify.
def unify(c1, c2): c1 = find(c1) c2 = find(c2) if c1 == c2: return c1, c2 = min(c1, c2), max(c1, c2) idents[c2] = c1 for d in xrange(2*ngens): n1 = neighbors[c1][d] n2 = neighbors[c2][d] if n1 == None: neighbors[c1][d] = n2 elif n2 != None: unify(n1, n2)It takes two vertices, and makes the one with the greater label refer to the lesser one with idents[c2] = c1. Then, it moves all of the neighbors of the deleted vertex by recursively calling unify. In case the neighbor being replaced is non-existent, which is the base case of the recursion, we just record it as a neighbor. There is no need to record this identification on the other end of the edge because the other end is referring to c2, which is now c1.
For following paths, we have the following two functions:
def follow(c, d): c = find(c) ns = neighbors[c] if ns[d] == None: ns[d] = new() return find(ns[d]) def followp(c, ds): c = find(c) for d in reversed(ds): c = follow(c, d) return cThe first takes a vertex and finds the neighbor in the d direction, creating a new vertex in that direction if needed, and the second follows a list of directions to find the end of a path. The follow function creates new neighbors with the following function:
def new(): c = len(idents) idents.append(c) neighbors.append((2*ngens)*[None]) return cThis just creates a fresh label so that idents[c] == c, and initializes the neighbors array to point to non-existent vertices. There is no need for follow or new to record the neighbors on the new vertex, since the relations such as will eventually take care of this.
Finally, there is the core algorithm:
start = new() for hgen in hgens: unify(followp(start, hgen), start) while to_visit < len(idents): c = find(to_visit) if c == to_visit: for rel in rels: unify(followp(c, rel), c) to_visit += 1It creates the start vertex, adds all of the relations for as relators at this basepoint, and then for each unmarked vertex, adds all of the relations from rels. Notice how unify is being used to turn paths into relators.
After this, the data structures contain the Schreier graph for . This can be interpreted as a permutation representation, for instance using
cosets = [c for i, c in enumerate(idents) if i == c] perms = [[cosets.index(follow(c, 2*d)) for i, c in enumerate(cosets)] for d in range(ngens)]to enumerate the cosets (which are vertices which have the least label in their equivalency classes), and then taking edges of the Schreier graph for each generator to construct a permutation.
The permutations can be written in cycle notation fairly easily. One way is with
def cycle(perm): parts = [] for i in range(len(perm)): part = [str(i+1)] k = perm[i] while k != i: if k < i: break part.append(str(k+1)) k = perm[k] else: parts.append(" ".join(part)) return "("+")(".join(parts)+")" for d in range(ngens): print "g%d ="%d, cycle(perms[d])
For these particular relations, the output is
g0 = (1 2 3) g1 = (1)(2 3)which means there are three cosets. The first generator acts cyclically on the three cosets, and the second generator merely swaps the non-subgroup cosets. This actually gives a faithful representation of .
Replacing with the trivial subgroup, we instead get
g0 = (1 2 4)(3 5 6) g1 = (1 3)(2 6)(4 5)which means has six elements.
5. Applications
There are a number of things one can deduce from the result of the algorithm.
- Of course, if it terminates, we deduce is finite (and know what it equals).
- A permutation representation of , given by a permutation of for each generator of .
- Whether is a normal subgroup. This can be determined by seeing whether each generator’s permutation is an element of , since then , implying is normal.
- An algorithm for the word problem for the group. Given two words, follow the graph from the basepoint to see whether they end on the same vertex.
6. References
- Artin, Michael. Algebra. — Contains the older-style algorithm of maintaining relation tables.
- Brown, Ken. The Todd-Coxeter procedure. http://www.math.cornell.edu/~kbrown/7350/toddcox.pdf — Gives the description of the algorithm from the point of view of Schreier graphs and describes a proof of termination.
- Sher and Daverman. Handbook of Geometric Topology — Ran into this while preparing these notes; it was the only reference I found which had a topological way of understanding the algorithm. I was going to describe it more topologically, but I realized -sets might be clearer for a symbolic explanation.