| 1 | <HTML> |
|---|
| 2 | <!-- |
|---|
| 3 | -- Copyright (c) Jeremy Siek 2000 |
|---|
| 4 | -- |
|---|
| 5 | -- Distributed under the Boost Software License, Version 1.0. |
|---|
| 6 | -- (See accompanying file LICENSE_1_0.txt or copy at |
|---|
| 7 | -- http://www.boost.org/LICENSE_1_0.txt) |
|---|
| 8 | --> |
|---|
| 9 | <Head> |
|---|
| 10 | <Title>Kevin Bacon Example</Title> |
|---|
| 11 | <BODY BGCOLOR="#ffffff" LINK="#0000ee" TEXT="#000000" VLINK="#551a8b" |
|---|
| 12 | ALINK="#ff0000"> |
|---|
| 13 | <IMG SRC="../../../boost.png" |
|---|
| 14 | ALT="C++ Boost" width="277" height="86"> |
|---|
| 15 | |
|---|
| 16 | <BR Clear> |
|---|
| 17 | |
|---|
| 18 | <H1>Six Degrees of Kevin Bacon</H1> |
|---|
| 19 | |
|---|
| 20 | <P> |
|---|
| 21 | A fun application of graph theory comes up in the popular game ``Six |
|---|
| 22 | Degrees of Kevin Bacon''. The idea of the game is to connect an actor |
|---|
| 23 | to Kevin Bacon through a trail of actors who appeared together in |
|---|
| 24 | movies, and do so in less than six steps. For example, Theodore |
|---|
| 25 | Hesburgh (President Emeritus of the University of Notre Dame) was in |
|---|
| 26 | the movie ``Rudy'' with the actor Gerry Becker, who was in the movie |
|---|
| 27 | ``Sleepers'' with Kevin Bacon. Why Kevin Bacon? For some reason the |
|---|
| 28 | three students who invented the game, Mike Ginelli, Craig Fass, and |
|---|
| 29 | Brian Turtle decided that Kevin Bacon is the center of the |
|---|
| 30 | entertainment world. The Kevin Bacon game is quite similar to the <a |
|---|
| 31 | href="http://www.oakland.edu/~grossman/erdoshp.html">Erdös |
|---|
| 32 | Number</a> which has ``been a part of the folklore of |
|---|
| 33 | mathematicians throughout the world for many years''. |
|---|
| 34 | |
|---|
| 35 | <P> |
|---|
| 36 | The ``Six Degrees of Kevin Bacon'' game is really a graph problem. If |
|---|
| 37 | you assign each actor to a vertex, and add an edge between two actors |
|---|
| 38 | if they have appeared together in a movie, then you have a graph that |
|---|
| 39 | represents the data for this game. Then the problem of finding a trail |
|---|
| 40 | of actors to Kevin Bacon becomes a traditional graph problem: that of |
|---|
| 41 | finding a <I>path</I> between two vertices. Since we wish to find a |
|---|
| 42 | path that is shorter than six steps, ideally we would find the |
|---|
| 43 | <I>shortest path</I> between the vertices. One might think to apply |
|---|
| 44 | Dijkstra's shortest path algorithm to this problem, but that would be |
|---|
| 45 | overkill since Dijkstra's algorithm is meant for situations when each |
|---|
| 46 | edge has an associated length (or weight) and the goal is to find the |
|---|
| 47 | path with the shortest cumulative length. Since we are only concerned |
|---|
| 48 | with finding the shortest paths in terms of the number of edges the |
|---|
| 49 | breadth-first search algorithm will solve the problem (and use less |
|---|
| 50 | time than Dijkstra's). |
|---|
| 51 | |
|---|
| 52 | <P> |
|---|
| 53 | |
|---|
| 54 | <H2>Input File and Graph Setup</H2> |
|---|
| 55 | |
|---|
| 56 | <P> |
|---|
| 57 | For this example we will use a much smaller graph than all movies and |
|---|
| 58 | all actors. The full source code for this example is in <a |
|---|
| 59 | href="../example/kevin-bacon.cpp"><TT>examples/kevin-bacon.cpp</TT></a>. |
|---|
| 60 | We have supplied a file <TT>kevin-bacon.dat</TT> which contains a list |
|---|
| 61 | of actor pairs who appeared in the same movie. Each line of the file |
|---|
| 62 | contains an actor's name, a movie, and another actor that appeared in |
|---|
| 63 | the movie. The ``;'' character is used as a separator. Here is an |
|---|
| 64 | excerpt. |
|---|
| 65 | |
|---|
| 66 | <P> |
|---|
| 67 | <PRE> |
|---|
| 68 | Patrick Stewart;Prince of Egypt, The (1998);Steve Martin |
|---|
| 69 | </PRE> |
|---|
| 70 | |
|---|
| 71 | <P> |
|---|
| 72 | Our first task will be to read in the file and create a graph from |
|---|
| 73 | it. We use a <TT>std::ifstream</TT> to input the file. |
|---|
| 74 | |
|---|
| 75 | <P> |
|---|
| 76 | <PRE> |
|---|
| 77 | std::ifstream datafile("./kevin-bacon.dat"); |
|---|
| 78 | if (!datafile) { |
|---|
| 79 | std::cerr << "No ./kevin-bacon.dat file" << std::endl; |
|---|
| 80 | return EXIT_FAILURE; |
|---|
| 81 | } |
|---|
| 82 | </PRE> |
|---|
| 83 | |
|---|
| 84 | <P> |
|---|
| 85 | We will want to attach the actor's names to the vertices in the graph, |
|---|
| 86 | and the movie names to the edges. We use the <TT>property</TT> class to |
|---|
| 87 | specify the addition of these vertex and edge properties to the graph. |
|---|
| 88 | We choose to use an undirected graph, because the relationship of |
|---|
| 89 | actors appearing together in a movie is symmetric. |
|---|
| 90 | |
|---|
| 91 | <P> |
|---|
| 92 | <PRE> |
|---|
| 93 | using namespace boost; |
|---|
| 94 | typedef adjacency_list<vecS, vecS, undirectedS, |
|---|
| 95 | property<vertex_name_t, string>, |
|---|
| 96 | property<edge_name_t, string > > |
|---|
| 97 | > Graph; |
|---|
| 98 | </PRE> |
|---|
| 99 | |
|---|
| 100 | <P> |
|---|
| 101 | To access the properties, we will need to obtain property map |
|---|
| 102 | objects from the graph. The following code shows how this is done. |
|---|
| 103 | |
|---|
| 104 | <P> |
|---|
| 105 | <PRE> |
|---|
| 106 | property_map<Graph, vertex_name_t>::type actor_name = get(vertex_name, g); |
|---|
| 107 | property_map<Graph, edge_name_t>::type connecting_movie = get(edge_name, g); |
|---|
| 108 | </PRE> |
|---|
| 109 | |
|---|
| 110 | <P> |
|---|
| 111 | Next we can start to loop through the file, parsing each line into a |
|---|
| 112 | sequence of tokens using the <a href="../../tokenizer/index.html">Boost |
|---|
| 113 | Tokenizer Library</a>. |
|---|
| 114 | |
|---|
| 115 | <P> |
|---|
| 116 | <PRE> |
|---|
| 117 | for (std::string line; std::getline(datafile, line);) { |
|---|
| 118 | char_delimiters_separator<char> sep(false, "", ";"); |
|---|
| 119 | tokenizer<> line_toks(line, sep); |
|---|
| 120 | ... |
|---|
| 121 | } |
|---|
| 122 | </PRE> |
|---|
| 123 | |
|---|
| 124 | <P> |
|---|
| 125 | Each line of the input corresponds to an edge in the graph, with the |
|---|
| 126 | two actors as the vertices incident to the edge, and the movie name |
|---|
| 127 | will be a property attached to the edge. One challenge in creating the |
|---|
| 128 | graph from this format of input is that the edges are specified by a |
|---|
| 129 | pair of actor names. As we add vertices to the graph, we'll need to |
|---|
| 130 | create a map from actor names to their vertices so that later |
|---|
| 131 | appearances of the same actor (on a different edge) can be linked with |
|---|
| 132 | the correct vertex in the graph. The <a |
|---|
| 133 | href="http://www.sgi.com/tech/stl/Map.html"><TT>std::map</TT></a> |
|---|
| 134 | can be used to implement this mapping. |
|---|
| 135 | |
|---|
| 136 | <P> |
|---|
| 137 | <PRE> |
|---|
| 138 | typedef graph_traits<Graph>::vertex_descriptor Vertex; |
|---|
| 139 | typedef std::map<string, Vertex> NameVertexMap; |
|---|
| 140 | NameVertexMap actors; |
|---|
| 141 | </PRE> |
|---|
| 142 | |
|---|
| 143 | <P> |
|---|
| 144 | The first token will be an actor's name. If the actor is not already |
|---|
| 145 | in our actor's map we will add a vertex to the graph, set the name |
|---|
| 146 | property of the vertex, and record the vertex descriptor in the map. |
|---|
| 147 | If the actor is already in the graph, we will retrieve the vertex |
|---|
| 148 | descriptor from the map's <TT>pos</TT> iterator. |
|---|
| 149 | |
|---|
| 150 | <P> |
|---|
| 151 | <PRE> |
|---|
| 152 | NameVertexMap::iterator pos; |
|---|
| 153 | bool inserted; |
|---|
| 154 | Vertex u, v; |
|---|
| 155 | |
|---|
| 156 | tokenizer<>::iterator i = line_toks.begin(); |
|---|
| 157 | std::string actors_name = *i++; |
|---|
| 158 | |
|---|
| 159 | tie(pos, inserted) = actors.insert(std::make_pair(actors_name, Vertex())); |
|---|
| 160 | if (inserted) { |
|---|
| 161 | u = add_vertex(g); |
|---|
| 162 | actor_name[u] = actors_name; |
|---|
| 163 | pos->second = u; |
|---|
| 164 | } else |
|---|
| 165 | u = pos->second; |
|---|
| 166 | </PRE> |
|---|
| 167 | |
|---|
| 168 | <P> |
|---|
| 169 | The second token is the name of the movie, and the third token is the |
|---|
| 170 | other actor. We use the same technique as above to insert the second |
|---|
| 171 | actor into the graph. |
|---|
| 172 | |
|---|
| 173 | <P> |
|---|
| 174 | <PRE> |
|---|
| 175 | std::string movie_name = *i++; |
|---|
| 176 | |
|---|
| 177 | tie(pos, inserted) = actors.insert(std::make_pair(*i, Vertex())); |
|---|
| 178 | if (inserted) { |
|---|
| 179 | v = add_vertex(g); |
|---|
| 180 | actor_name[v] = *i; |
|---|
| 181 | pos->second = v; |
|---|
| 182 | } else |
|---|
| 183 | v = pos->second; |
|---|
| 184 | </PRE> |
|---|
| 185 | |
|---|
| 186 | <P> |
|---|
| 187 | The final step is to add an edge connecting the two actors, and record |
|---|
| 188 | the name of the connecting movie. |
|---|
| 189 | |
|---|
| 190 | <P> |
|---|
| 191 | <PRE> |
|---|
| 192 | graph_traits<Graph>::edge_descriptor e; |
|---|
| 193 | tie(e, inserted) = add_edge(u, v, g); |
|---|
| 194 | if (inserted) |
|---|
| 195 | connecting_movie[e] = movie_name; |
|---|
| 196 | </PRE> |
|---|
| 197 | |
|---|
| 198 | <P> |
|---|
| 199 | |
|---|
| 200 | <H2>A Custom Visitor Class</H2> |
|---|
| 201 | |
|---|
| 202 | <P> |
|---|
| 203 | We create a custom visitor class to record the Kevin Bacon |
|---|
| 204 | numbers. The Kevin Bacon number will be the shortest distances from |
|---|
| 205 | each actor to Kevin Bacon. This distance has also been referred to as |
|---|
| 206 | the ``Bacon Number'' in memory of the ``Erdos Number'' used to rank |
|---|
| 207 | mathematicians according to how many publications separate them from |
|---|
| 208 | Paul Erdos. The shortest distance to from Kevin Bacon to each actor |
|---|
| 209 | will follow the breadth-first tree. The BFS algorithm identifies edges |
|---|
| 210 | that belong to the breadth-first tree and calls our visitor's |
|---|
| 211 | <tt>tree_edge</tt> function. We record the distance to the target |
|---|
| 212 | vertex as one plus the distance to the source vertex. |
|---|
| 213 | |
|---|
| 214 | |
|---|
| 215 | <PRE> |
|---|
| 216 | template <typename DistanceMap> |
|---|
| 217 | class bacon_number_recorder : public default_bfs_visitor |
|---|
| 218 | { |
|---|
| 219 | public: |
|---|
| 220 | bacon_number_recorder(DistanceMap dist) : d(dist) { } |
|---|
| 221 | |
|---|
| 222 | template <typename Edge, typename Graph> |
|---|
| 223 | void tree_edge(Edge e, const Graph& g) const |
|---|
| 224 | { |
|---|
| 225 | typename graph_traits<Graph>::vertex_descriptor |
|---|
| 226 | u = source(e, g), v = target(e, g); |
|---|
| 227 | d[v] = d[u] + 1; |
|---|
| 228 | } |
|---|
| 229 | private: |
|---|
| 230 | DistanceMap d; |
|---|
| 231 | }; |
|---|
| 232 | // Convenience function |
|---|
| 233 | template <typename DistanceMap> |
|---|
| 234 | bacon_number_recorder<DistanceMap> |
|---|
| 235 | record_bacon_number(DistanceMap d) |
|---|
| 236 | { |
|---|
| 237 | return bacon_number_recorder<DistanceMap>(d); |
|---|
| 238 | } |
|---|
| 239 | </PRE> |
|---|
| 240 | |
|---|
| 241 | <P> |
|---|
| 242 | We will use a vector to store the bacon numbers. |
|---|
| 243 | |
|---|
| 244 | <P> |
|---|
| 245 | <PRE> |
|---|
| 246 | std::vector<int> bacon_number(num_vertices(g)); |
|---|
| 247 | </PRE> |
|---|
| 248 | |
|---|
| 249 | <H2>Apply the Breadth-First Search</H2> |
|---|
| 250 | |
|---|
| 251 | <P> |
|---|
| 252 | The BFS algorithm requires a source vertex as input; of course this |
|---|
| 253 | will be Kevin Bacon. We now call the BGL BFS algorithm, passing in the |
|---|
| 254 | graph, source vertex, and the visitor. |
|---|
| 255 | |
|---|
| 256 | <P> |
|---|
| 257 | <PRE> |
|---|
| 258 | Vertex src = actors["Kevin Bacon"]; |
|---|
| 259 | bacon_number[src] = 0; |
|---|
| 260 | |
|---|
| 261 | breadth_first_search(g, src, visitor(record_bacon_number(&bacon_number[0]))); |
|---|
| 262 | </PRE> |
|---|
| 263 | |
|---|
| 264 | <P> |
|---|
| 265 | We can output the Bacon number for each actor simply by looping |
|---|
| 266 | through all the vertices in the graph and use them to index into the |
|---|
| 267 | <TT>bacon_number</TT> vector. |
|---|
| 268 | |
|---|
| 269 | <P> |
|---|
| 270 | <PRE> |
|---|
| 271 | graph_traits<Graph>::vertex_iterator i, end; |
|---|
| 272 | for (boost::tie(i, end) = vertices(g); i != end; ++i) |
|---|
| 273 | std::cout << actor_name[*i] << "'s bacon number is " |
|---|
| 274 | << bacon_number[*i] << std::endl; |
|---|
| 275 | </PRE> |
|---|
| 276 | |
|---|
| 277 | <P> |
|---|
| 278 | Note that vertex descriptor objects can not always be used as indices |
|---|
| 279 | into vectors or arrays such as <TT>bacon_number</TT>. This is valid |
|---|
| 280 | with the <TT>adjacency_list</TT> class with <TT>VertexList=vecS</TT>, |
|---|
| 281 | but not with other variations of <TT>adjacency_list</TT>. A more |
|---|
| 282 | generic way to index based on vertices is to use the ID property |
|---|
| 283 | map (<TT>vertex_index_t</TT>) in coordination with the <A |
|---|
| 284 | HREF="../../property_map/iterator_property_map.html"><TT>iterator_property_map</TT></a>. |
|---|
| 285 | |
|---|
| 286 | <P> |
|---|
| 287 | Here are some excepts from the output of the program. |
|---|
| 288 | <PRE> |
|---|
| 289 | William Shatner's bacon number is 2 |
|---|
| 290 | Denise Richards's bacon number is 1 |
|---|
| 291 | Kevin Bacon's bacon number is 0 |
|---|
| 292 | Patrick Stewart's bacon number is 2 |
|---|
| 293 | Steve Martin's bacon number is 1 |
|---|
| 294 | ... |
|---|
| 295 | </PRE> |
|---|
| 296 | |
|---|
| 297 | |
|---|
| 298 | |
|---|
| 299 | |
|---|
| 300 | <br> |
|---|
| 301 | <HR> |
|---|
| 302 | <TABLE> |
|---|
| 303 | <TR valign=top> |
|---|
| 304 | <TD nowrap>Copyright © 2000-2001</TD><TD> |
|---|
| 305 | <A HREF="../../../people/jeremy_siek.htm">Jeremy Siek</A>, Indiana University (<A HREF="mailto:jsiek@osl.iu.edu">jsiek@osl.iu.edu</A>) |
|---|
| 306 | </TD></TR></TABLE> |
|---|
| 307 | |
|---|
| 308 | </BODY> |
|---|
| 309 | </HTML> |
|---|
| 310 | <!-- LocalWords: gif Hesburgh Ginelli Fass Erd vertices cerr namespace vecS |
|---|
| 311 | --> |
|---|
| 312 | <!-- LocalWords: undirectedS Tokenizer sep tokenizer toks NameVertexMap pos |
|---|
| 313 | --> |
|---|
| 314 | <!-- LocalWords: bool Erdos BFS typename DistanceMap bfs const num BGL src |
|---|
| 315 | --> |
|---|
| 316 | <!-- LocalWords: indices Shatner's Richards's siek htm |
|---|
| 317 | --> |
|---|