Representing A Graph:
Last updated
Last updated
In this section we will look at two common abstract representations of graphs: the adjacency matrix and the unfortunately named adjacency “list”. The adjacency list is really a mapping, so in this section we also consider two possible representations in Python of the adjacency list: one an object-oriented approach with a Python dict
as its underlying data type, and one just using a plain dict
directly.
It’s useful to be familiar with many ways to represent graphs as you will encounter them everywhere. Ultimately though, we see the adjacency list representation using a pure map type (such as a dict
in Python) as the most intuitive and flexible.
One of the easiest ways to implement a graph is to use a two-dimensional matrix. In this matrix implementation, each of the rows and columns represent a vertex in the graph. The value that is stored in the cell at the intersection of row and column indicates if there is an edge from vertex to vertex . When two vertices are connected by an edge, we say that they are adjacent. The diagram below illustrates the adjacency matrix for the example graph we presented earlier. A value in a cell represents the weight of the edge from vertex to vertex .
The advantage of the adjacency matrix is that it is simple, and for small graphs it is easy to see which nodes are connected to other nodes. However, notice that most of the cells in the matrix are empty. Because most of the cells are empty we say that this matrix is “sparse.” A matrix is not a very efficient way to store sparse data. In fact, in Python you must go out of your way to even create a matrix structure like the one above.
The adjacency matrix is a good implementation for a graph when the number of edges is large. But what do we mean by large? How many edges would be needed to fill the matrix? Since there is one row and one column for every vertex in the graph, the number of edges required to fill the matrix is . A matrix is full when every vertex is connected to every other vertex. There are few real problems that approach this sort of connectivity. The problems we will look at in this section all involve graphs that are sparsely connected.
A more space-efficient way to implement a sparsely connected graph is to use the unfortunately named adjacency list structure. In an adjacency list implementation we keep a master collection of all the vertices in the Graph object and then each vertex object in the graph maintains a list of the other vertices that it is connected to. In our implementation of the Vertex
class we will use a dictionary rather than a list as our master collection, where the dictionary keys are the vertices, and the values are the weights. The diagram below shows an adjacency list representation of the example graph we have been discussing.
The advantage of the adjacency list implementation is that it allows us to compactly represent a sparse graph. The adjacency list also allows us to easily find all the links that are directly connected to a particular vertex.
Using dictionaries, it is easy to implement the adjacency list in Python. In this implementation we create two classes: Graph
, which holds the master list of vertices, and Vertex
, which will represent each vertex in the graph.
Each Vertex
uses a dictionary to keep track of the vertices to which it is connected, and the weight of each edge. If we weren’t concerned with edge weights, we could use a set in place of a dictionary. This dictionary is called neighbors
.
In the code below, the add_neighbor
method is used to add a connection from this vertex to another. The get_connections
method returns all of the vertices in the adjacency list, as represented by the neighbors
instance variable. The get_weight
method returns the weight of the edge from this vertex to the vertex passed as a parameter.
The Graph
class, shown below, contains a dictionary that maps vertex names to vertex objects. Graph
also provides methods for adding vertices to a graph and connecting one vertex to another. The get_vertices
method returns the names of all of the vertices in the graph. In addition, we have implemented the __iter__
method to make it easy to iterate over all the vertex objects in a particular graph. Together, the two methods allow you to iterate over the vertices in a graph by name, or by the objects themselves.
Using the Graph
and Vertex
classes just defined, the following Python session creates our example graph. First we create six vertices numbered 0 through 5. Then we display the vertex dictionary. Notice that for each key 0 through 5 we have created an instance of a Vertex
. Next, we add the edges that connect the vertices together. Finally, a nested loop verifies that each edge in the graph is properly stored.
As with our node and edges representation of trees, you may wonder whether it is worthwhile to wrap the underlying structure of our graph with classes as we have above. This will depend on the context, but our general preference is actually to work directly with the data structure. Doing so allows us to more easily print or otherwise debug a graph, create a graph as a literal composite type, and serialize a graph for instance to JSON.
The example graph above would simply look like:
For the remainder of the chapter, we will use dictionaries directly in most cases.