| 1 | \documentclass[11pt]{report} |
|---|
| 2 | |
|---|
| 3 | \input{defs} |
|---|
| 4 | |
|---|
| 5 | |
|---|
| 6 | \setlength\overfullrule{5pt} |
|---|
| 7 | \tolerance=10000 |
|---|
| 8 | \sloppy |
|---|
| 9 | \hfuzz=10pt |
|---|
| 10 | |
|---|
| 11 | \makeindex |
|---|
| 12 | |
|---|
| 13 | \begin{document} |
|---|
| 14 | |
|---|
| 15 | \title{A Generic Programming Implementation of Strongly Connected Components} |
|---|
| 16 | \author{Jeremy G. Siek} |
|---|
| 17 | |
|---|
| 18 | \maketitle |
|---|
| 19 | |
|---|
| 20 | \section{Introduction} |
|---|
| 21 | |
|---|
| 22 | This paper describes the implementation of the |
|---|
| 23 | \code{strong\_components()} function of the Boost Graph Library. The |
|---|
| 24 | function computes the strongly connects components of a directed graph |
|---|
| 25 | using Tarjan's DFS-based |
|---|
| 26 | algorithm~\cite{tarjan72:dfs_and_linear_algo}. |
|---|
| 27 | |
|---|
| 28 | A \keyword{strongly connected component} (SCC) of a directed graph |
|---|
| 29 | $G=(V,E)$ is a maximal set of vertices $U$ which is in $V$ such that |
|---|
| 30 | for every pair of vertices $u$ and $v$ in $U$, we have both a path |
|---|
| 31 | from $u$ to $v$ and path from $v$ to $u$. That is to say that $u$ and |
|---|
| 32 | $v$ are reachable from each other. |
|---|
| 33 | |
|---|
| 34 | cross edge (u,v) is an edge from one subtree to another subtree |
|---|
| 35 | -> discover_time[u] > discover_time[v] |
|---|
| 36 | |
|---|
| 37 | Lemma 10. Let $v$ and $w$ be vertices in $G$ which are in the same |
|---|
| 38 | SCC and let $F$ be any depth-first forest of $G$. Then $v$ and $w$ |
|---|
| 39 | have a common ancestor in $F$. Also, if $u$ is the common ancestor of |
|---|
| 40 | $u$ and $v$ with the latest discover time then $w$ is also in the same |
|---|
| 41 | SCC as $u$ and $v$. |
|---|
| 42 | |
|---|
| 43 | Proof. |
|---|
| 44 | |
|---|
| 45 | If there is a path from $v$ to $w$ and if they are in different DFS |
|---|
| 46 | trees, then the discover time for $w$ must be earlier than for $v$. |
|---|
| 47 | Otherwise, the tree that contains $v$ would have extended along the |
|---|
| 48 | path to $w$, putting $v$ and $w$ in the same tree. |
|---|
| 49 | |
|---|
| 50 | |
|---|
| 51 | The following is an informal description of Tarjan's algorithm for |
|---|
| 52 | computing strongly connected components. It is basically a variation |
|---|
| 53 | on depth-first search, with extra actions being taken at the |
|---|
| 54 | ``discover vertex'' and ``finish vertex'' event points. It may help to |
|---|
| 55 | think of the actions taken at the ``discover vertex'' event point as |
|---|
| 56 | occuring ``on the way down'' a DFS-tree (from the root towards the |
|---|
| 57 | leaves), and actions taken a the ``finish vertex'' event point as |
|---|
| 58 | occuring ``on the way back up''. |
|---|
| 59 | |
|---|
| 60 | There are three things that need to happen on the way down. For each |
|---|
| 61 | vertex $u$ visited we record the discover time $d[u]$, push vertex $u$ |
|---|
| 62 | onto a auxiliary stack, and set $root[u] = u$. The root field will |
|---|
| 63 | end up mapping each vertex to the topmost vertex in the same strongly |
|---|
| 64 | connected component. By setting $root[u] = u$ we are starting with |
|---|
| 65 | each vertex in a component by itself. |
|---|
| 66 | |
|---|
| 67 | Now to describe what happens on the way back up. Suppose we have just |
|---|
| 68 | finished visiting all of the vertices adjacent to some vertex $u$. We |
|---|
| 69 | then scan each of the adjacent vertices again, checking the root of |
|---|
| 70 | each for which one has the earliest discover time, which we will call |
|---|
| 71 | root $a$. We then compare $a$ with vertex $u$ and consider the |
|---|
| 72 | following cases: |
|---|
| 73 | |
|---|
| 74 | \begin{enumerate} |
|---|
| 75 | \item If $d[a] < d[u]$ then we know that $a$ is really an ancestor of |
|---|
| 76 | $u$ in the DFS tree and therefore we have a cycle and $u$ must be in |
|---|
| 77 | a SCC with $a$. We then set $root[u] = a$ and continue our way back up |
|---|
| 78 | the DFS. |
|---|
| 79 | |
|---|
| 80 | \item If $a = u$ then we know that $u$ must be the topmost vertex of a |
|---|
| 81 | subtree that defines a SCC. All of the vertices in this subtree are |
|---|
| 82 | further down on the stack than vertex $u$ so we pop the vertices off |
|---|
| 83 | of the stack until we reach $u$ and mark each one as being in the |
|---|
| 84 | same component. |
|---|
| 85 | |
|---|
| 86 | \item If $d[a] > d[u]$ then the adjacent vertices are in different |
|---|
| 87 | strongly connected components. We continue our way back up the |
|---|
| 88 | DFS. |
|---|
| 89 | |
|---|
| 90 | \end{enumerate} |
|---|
| 91 | |
|---|
| 92 | |
|---|
| 93 | |
|---|
| 94 | @d Build a list of vertices for each strongly connected component |
|---|
| 95 | @{ |
|---|
| 96 | template <typename Graph, typename ComponentMap, typename ComponentLists> |
|---|
| 97 | void build_component_lists |
|---|
| 98 | (const Graph& g, |
|---|
| 99 | typename graph_traits<Graph>::vertices_size_type num_scc, |
|---|
| 100 | ComponentMap component_number, |
|---|
| 101 | ComponentLists& components) |
|---|
| 102 | { |
|---|
| 103 | components.resize(num_scc); |
|---|
| 104 | typename graph_traits<Graph>::vertex_iterator vi, vi_end; |
|---|
| 105 | for (tie(vi, vi_end) = vertices(g); vi != vi_end; ++vi) |
|---|
| 106 | components[component_number[*vi]].push_back(*vi); |
|---|
| 107 | } |
|---|
| 108 | @} |
|---|
| 109 | |
|---|
| 110 | |
|---|
| 111 | \bibliographystyle{abbrv} |
|---|
| 112 | \bibliography{jtran,ggcl,optimization,generic-programming,cad} |
|---|
| 113 | |
|---|
| 114 | \end{document} |
|---|