forked from aliev/runestone
-
Notifications
You must be signed in to change notification settings - Fork 0
/
graphs.txt
2259 lines (1740 loc) · 85.8 KB
/
graphs.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
Graphs
======
Objectives
----------
- To learn what a graph is and how it is used.
- To implement the **graph** abstract data type using multiple internal
representations.
- To see how graphs can be used to solve a wide variety of problems
In this chapter we will study graphs. Graphs are a more general
structure than the trees we studied in the last chapter; in fact you can
think of a tree as a special kind of graph. Graphs can be used to
represent many interesting things about our world, including systems of
roads, airline flights from city to city, how the Internet is connected,
or even the sequence of classes you must take to complete a major in
computer science. We will see in this chapter that once we have a good
representation for a problem, we can use some standard graph algorithms
to solve what otherwise might seem to be a very difficult problem.
While it is relatively easy for humans to look at a road map and
understand the relationships between different places, a computer has no
such knowledge. However, we can also think of a road map as a graph.
When we do so we can have our computer do interesting things for us. If
you have ever used one of the Internet map sites, you know that a
computer can find the shortest, quickest, or easiest path from one place
to another.
As a student of computer science you may wonder about the courses you
must take in order to get a major. A graph is good way to represent the
prerequisites and other interdependencies among courses.
:ref:`Figure 1 <fig1>` shows another graph. This one represents the courses
and the order in which they must be taken to complete a major in
computer science at Luther College.
.. _fig1:
.. figure:: Graphs/CS-Prereqs.png
:align: center
:alt: image
Figure 1: Prerequisites for a Computer Science Major
Vocabulary and Definitions
--------------------------
Now that we have looked at some examples of graphs, we will more
formally define a graph and its components. We already know some of
these terms from our discussion of trees.
Vertex
A vertex (also called a “node”) is a fundamental part of a graph. It
can have a name, which we will call the “key.” A vertex may also
have additional information. We will call this additional
information the “payload.”
Edge
An edge (also called an “arc”) is another fundamental part of a
graph. An edge connects two vertices to show that there is a
relationship between them. Edges may be one-way or two-way. If the
edges in a graph are all one-way, we say that the graph is a
**directed graph**, or a **digraph**. The class prerequisites graph
shown above is clearly a digraph since you must take some classes
before others.
Weight
Edges may be weighted to show that there is a cost to go from one
vertex to another. For example in a graph of roads that connect one
city to another, the weight on the edge might represent the distance
between the two cities.
With those definitions in hand we can formally define a graph. A graph
can be represented by :math:`G` where :math:`G =(V,E)`. For the
graph :math:`G`, :math:`V` is a set of vertices and :math:`E` is a
set of edges. Each edge is a tuple :math:`(v,w)` where
:math:`w,v \in V`. We can add a third component to the edge tuple to
represent a weight. A subgraph :math:`s` is a set of edges :math:`e`
and vertices :math:`v` such that :math:`e \subset E` and
:math:`v \subset V`.
:ref:`Figure 2 <fig_dgsimple>` shows another example of a simple weighted
digraph. Formally we can represent this graph as the set of six
vertices:
.. math::
V = \left\{ V0,V1,V2,V3,V4,V5 \right\}
and the set of nine edges:
.. math::
E = \left\{ \begin{array}{l}(v0,v1,5), (v1,v2,4), (v2,v3,9), (v3,v4,7), (v4,v0,1), \\
(v0,v5,2),(v5,v4,8),(v3,v5,3),(v5,v2,1)
\end{array} \right\}
.. _fig_dgsimple:
.. figure:: Graphs/digraph.png
:align: center
:alt: image
Figure 2: A Simple Example of a Directed Graph
The example graph in :ref:`Figure x <fig_dgsimple>` helps illustrate two other
key graph terms:
Path
A path in a graph is a sequence of vertices that are connected by
edges. Formally we would define a path as
:math:`w_1, w_2, ..., w_n` such that
:math:`(w_i, w_{i+1}) \in E` for all :math:`1 \le i \le n-1`.
The unweighted path length is the number of edges in the path,
specifically :math:`n-1`. The weighted path length is the sum of
the weights of all the edges in the path. For example in
:ref:`Figure x <fig_dgsimple>` the path from :math:`V3` to :math:`V1` is
the sequence of vertices :math:`(V3,V4,V0,V1)`. The edges are
:math:`\left\{(v3,v4,7),(v4,v0,1),(v0,v1,5) \right\}`.
Cycle
A cycle in a directed graph is a path that starts and ends at the
same vertex. For example, in :ref:`Figure x <fig_dgsimple>` the path
:math:`(V5,V2,V3,V5)` is a cycle. A graph with no cycles is called
an **acyclic graph**. A directed graph with no cycles is called a
**directed acyclic graph** or a **DAG**. We will see that we can
solve several important problems if the problem can be represented
as a DAG.
The Graph Abstract Data Type
----------------------------
The graph abstract data type (ADT) is defined as follows:
- ``Graph()`` creates a new, empty graph.
- ``addVertex(vert)`` adds an instance of ``Vertex`` to the graph.
- ``addEdge(fromVert, toVert)`` Adds a new, directed edge to the graph
that connects two vertices.
- ``addEdge(fromVert, toVert, weight)`` Adds a new, weighted, directed
edge to the graph that connects two vertices.
- ``getVertex(vertKey)`` finds the vertex in the graph named
``vertKey``.
- ``getVertices()`` returns the list of all vertices in the graph.
- ``in`` returns ``True`` for a statement of the form
``vertex in graph``, if the given vertex is in the graph, ``False``
otherwise.
Beginning with the formal definition for a graph there are several ways
we can implement the graph ADT in Python. We will see that there are
trade-offs in using different representations to implement the ADT
described above. There are two well-known implementations of a graph,
the **adjacency matrix** and the **adjacency list**. We will explain
both of these options, and then implement one as a Python class.
An Adjacency Matrix
~~~~~~~~~~~~~~~~~~~
One of the easiest ways to implement a graph is to use a two-dimensional
matrix. In this matrix implementation, each of the rows and columns
represent a vertex in the graph. The value that is stored in the cell at
the intersection of row :math:`v` and column :math:`w` indicates if
there is an edge from vertex :math:`v` to vertex :math:`w`. When two
vertices are connected by an edge, we say that they are **adjacent**.
:ref:`Figure 1 <fig_adjmat>` illustrates the adjacency matrix for the graph in
:ref:`Figure x <fig_dgsimple>`. A value in a cell represents the weight of the
edge from vertex :math:`v` to vertex :math:`w`.
.. _fig_adjmat:
.. figure:: Graphs/adjMat.png
:align: center
:alt: image
{An Adjacency Matrix Representation for a Graph}
The advantage of the adjacency matrix is that it is simple, and for
small graphs it is easy to see which nodes are connected to other nodes.
However, notice that most of the cells in the matrix are empty. Because
most of the cells are empty we say that this matrix is “sparse.” A
matrix is not a very efficient way to store sparse data. In fact, in
Python you must go out of your way to even create a matrix structure
like the one in :ref:`Figure 1 <fig_adjmat>`.
The adjacency matrix is a good implementation for a graph when the
number of edges is large. But what do we mean by large? How many edges
would be needed to fill the matrix? Since there is one row and one
column for every vertex in the graph, the number of edges required to
fill the matrix is :math:`|V|^2`. A matrix is full when every vertex
is connected to every other vertex. There are few real problems that
approach this sort of connectivity. The problems we will look at in this
chapter all involve graphs that are sparsely connected.
An Adjacency List
~~~~~~~~~~~~~~~~~
A more space-efficient way to implement a sparsely connected graph is to
use an adjacency list. In an adjacency list implementation we keep a
master list of all the vertices in the Graph object and then each vertex
object in the graph maintains a list of the other vertices that it is
connected to. In our implementation of the ``Vertex`` class we will use
a dictionary rather than a list where the dictionary keys are the
vertices, and the values are the weights. :ref:`Figure 2 <fig_adjlist>`
illustrates the adjacency list representation for the graph in
:ref:`Figure x <fig_dgsimple>`.
.. _fig_adjlist:
.. figure:: Graphs/adjlist.png
:align: center
:alt: image
{An Adjacency List Representation of a Graph}
The advantage of the adjacency list implementation is that it allows us
to compactly represent a sparse graph. The adjacency list also allows us
to easily find all the links that are directly connected to a particular
vertex.
Implementation
~~~~~~~~~~~~~~
Using dictionaries, it is easy to implement the adjacency list in
Python. In our implementation of the Graph abstract data type we will
create two classes, ``Graph``, which holds the master list of vertices,
and ``Vertex``, which will represent each vertex in the graph.
Each ``Vertex`` uses a dictionary to keep track of the vertices to which
it is connected, and the weight of each edge. This dictionary is called
``connectedTo``. Listing {lst:vertex} shows the code for the ``Vertex``
class. The constructor simply initializes the ``id``, which will
typically be a string, and the ``connectedTo`` dictionary. The
``addNeighbor`` method is used add a connection from this vertex to
another. The ``getConnections`` method returns all of the vertices in
the adjacency list, as represented by the ``connectedTo`` instance
variable. The ``getWeight`` method returns the weight of the edge from
this vertex to the vertex passed as a parameter.
::
class Vertex:
def __init__(self,key):
self.id = key
self.connectedTo = {}
def addNeighbor(self,nbr,weight=0):
self.connectedTo[nbr] = weight
def __str__(self):
return str(self.id) + ' connectedTo: '
+ str([x.id for x in self.connectedTo])
def getConnections(self):
return self.connectedTo.keys()
def getId(self):
return self.id
def getWeight(self,nbr):
return self.connectedTo[nbr]
The ``Graph`` class, shown in Listing {lst:graph}, contains a dictionary
that maps vertex names to vertex objects. In :ref:`Figure 2 <fig_adjlist>` this
dictionary object is represented by the shaded gray box. ``Graph`` also
provides methods for adding vertices to a graph and connecting one
vertex to another. The ``getVertices`` method returns the names of all
of the vertices in the graph. In addition, we have implemented the
{\_\_iter\_\_} method to make it easy to iterate over all the vertex
objects in a particular graph. Together, the two methods allow you to
iterate over the vertices in a graph by name, or by the objects
themselves.
::
class Graph:
def __init__(self):
self.vertList = {}
self.numVertices = 0
def addVertex(self,key):
self.numVertices = self.numVertices + 1
newVertex = Vertex(key)
self.vertList[key] = newVertex
return newVertex
def getVertex(self,n):
if n in self.vertList:
return self.vertList[n]
else:
return None
def __contains__(self,n):
return n in self.vertList
def addEdge(self,f,t,cost=0):
if f not in self.vertList:
nv = self.addVertex(f)
if t not in self.vertList:
nv = self.addVertex(t)
self.vertList[f].addNeighbor(self.vertList[t],
cost)
def getVertices(self):
return self.vertList.keys()
def __iter__(self):
return iter(self.vertList.values())
Using the ``Graph`` and ``Vertex`` classes just defined, the following
Python session creates the graph in :ref:`Figure x <fig_dgsimple>`. First we
create six vertices numbered 0 through 5. Then we display the vertex
dictionary. Notice that for each key 0 through 5 we have created an
instance of a ``Vertex``. Next, we add the edges that connect the
vertices together. Finally, a nested loop verifies that each edge in the
graph is properly stored. You should check the output of the edge list
at the end of this session against :ref:`Figure x <fig_dgsimple>`.
::
>>> g = Graph()
>>> for i in range(6):
... g.addVertex(i)
>>> g.vertList
{0: <adjGraph.Vertex instance at 0x41e18>,
1: <adjGraph.Vertex instance at 0x7f2b0>,
2: <adjGraph.Vertex instance at 0x7f288>,
3: <adjGraph.Vertex instance at 0x7f350>,
4: <adjGraph.Vertex instance at 0x7f328>,
5: <adjGraph.Vertex instance at 0x7f300>}
>>> g.addEdge(0,1,5)
>>> g.addEdge(0,5,2)
>>> g.addEdge(1,2,4)
>>> g.addEdge(2,3,9)
>>> g.addEdge(3,4,7)
>>> g.addEdge(3,5,3)
>>> g.addEdge(4,0,1)
>>> g.addEdge(5,4,8)
>>> g.addEdge(5,2,1)
>>> for v in g:
... for w in v.getConnections():
... print("( %s , %s )" % (v.getId(), w.getId()))
...
( 0 , 5 )
( 0 , 1 )
( 1 , 2 )
( 2 , 3 )
( 3 , 4 )
( 3 , 5 )
( 4 , 0 )
( 5 , 4 )
( 5 , 2 )
Breadth First Search
--------------------
The Word Ladder Problem
~~~~~~~~~~~~~~~~~~~~~~~
To begin our study of graph algorithms let’s consider the following
puzzle called a word ladder. Transform the word “FOOL” into the word
“SAGE”. In a word ladder puzzle you must make the change occur gradually
by changing one letter at a time. At each step you must transform one
word into another word, you are not allowed to transform a word into a
non-word. The word ladder puzzle was invented in 1878 by Lewis Carroll,
the author of *Alice in Wonderland*. The following sequence of words
shows one possible solution to the problem posed above.
====== ==
FOOL
POOL
POLL
POLE
PALE
SALE
SAGE
====== ==
There are many variations of the word ladder puzzle. For example you
might be given a particular number of steps in which to accomplish the
transformation, or you might need to use a particular word. In this
section we are interested in figuring out the smallest number of
transformations needed to turn the starting word into the ending word.
Not surprisingly, since this chapter is on graphs, we can solve this
problem using a graph algorithm. Here is an outline of where we are
going:
- Represent the relationships between the words as a graph.
- Use the graph algorithm known as breadth first search to find an
efficient path from the starting word to the ending word.
Building the Word Ladder Graph
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Our first problem is to figure out how to turn a large collection of
words into a graph. What we would like is to have an edge from one word
to another if the two words are only different by a single letter. If we
can create such a graph, then any path from one word to another is a
solution to the word ladder puzzle. :ref:`Figure 3 <fig_wordladder>` shows a
small graph of some words that solve the FOOL to SAGE word ladder
problem. Notice that the graph is an undirected graph and that the edges
are unweighted.
.. _fig_wordladder:
.. figure:: Graphs/wordgraph.png
:align: center
:alt: image
{A Small Word Ladder Graph}
We could use several different approaches to create the graph we need to
solve this problem. Let’s start with the assumption that we have a list
of words that are all the same length. As a starting point, we can
create a vertex in the graph for every word in the list. To figure out
how to connect the words, we could compare each word in the list with
every other. When we compare we are looking to see how many letters are
different. If the two words in question are different by only one
letter, we can create an edge between them in the graph. For a small set
of words that approach would work fine; however let’s suppose we have a
list of 5,110 words. Roughly speaking, comparing one word to every other
word on the list is an :math:`O(n^2)` algorithm. For 5,110 words,
:math:`n^2` is more than 26 million comparisons.
We can do much better by using the following approach. Suppose that we
have a huge number of buckets, each of them with a four-letter word on
the outside, except that one of the letters in the label has been
replaced by an underscore. For example, consider
:ref:`Figure 4 <fig_wordbucket>`, we might have a bucket labeled “pop\_.” As we
process each word in our list we compare the word with each bucket,
using the ‘\_’ as a wildcard, so both “pope” and “pops” would match
“pop\_.” Every time we find a matching bucket, we put our word in that
bucket. Once we have all the words in the appropriate buckets we know
that all the words in the bucket must be connected.
|image| {Word Buckets for Words That are Different by One Letter}
.. _fig_wordbucket:
In Python, we can implement the scheme we have just described by using a
dictionary. The labels on the buckets we have just described are the
keys in our dictionary. The value stored for that key is a list of
words. Once we have the dictionary built we can create the graph. We
start our graph by creating a vertex for each word in the graph. Then we
create edges between all the vertices we find for words found under the
same key in the dictionary. Listing {lst:laddergraph} shows the Python
code required to build the graph.
::
from pythonds.graphs import Graph
def buildGraph(wordFile):
d = {}
g = Graph()
wfile = open(wordFile,'r')
# create buckets of words that differ by one letter
for line in wfile:
word = line[:-1]
for i in range(len(word)):
bucket = word[:i] + '_' + word[i+1:]
if bucket in d:
d[bucket].append(word)
else:
d[bucket] = [word]
# add vertices and edges for words in the same bucket
for bucket in d.keys():
for word1 in d[bucket]:
for word2 in d[bucket]:
if word1 != word2:
g.addEdge(word1,word2)
return g
Since this is our first real-world graph problem, you might be wondering
how sparse is the graph? The list of four-letter words we have for this
problem is 5,110 words long. If we were to use an adjacency matrix, the
matrix would have 5,110 \* 5,110 = 26,112,100 cells. The graph
constructed by the ``buildGraph`` function has exactly 53,286 edges, so
the matrix would have only 0.20% of the cells filled! That is a very
sparse matrix indeed.
Implementing Breadth First Search
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
With the graph constructed we can now turn our attention to the
algorithm we will use to find the shortest solution to the word ladder
problem. The graph algorithm we are going to use is called the “breadth
first search” algorithm. **Breadth first search** (**BFS**) is one of
the easiest algorithms for searching a graph. It also serves as a
prototype for several other important graph algorithms that we will
study later.
Given a graph :math:`G` and a starting vertex :math:`s`, a breadth
first search proceeds by exploring edges in the graph to find all the
vertices in :math:`G` for which there is a path from :math:`s`. The
remarkable thing about a breadth first search is that it finds *all* the
vertices that are a distance :math:`k` from :math:`s` before it
finds *any* vertices that are a distance :math:`k+1`. One good way to
visualize what the breadth first search algorithm does is to imagine
that it is building a tree, one level of the tree at a time. A breadth
first search adds all children of the starting vertex before it begins
to discover any of the grandchildren.
To keep track of its progress, BFS colors each of the vertices white,
gray, or black. All the vertices are initialized to white when they are
constructed. A white vertex is an undiscovered vertex. When a vertex is
initially discovered it is colored gray, and when BFS has completely
explored a vertex it is colored black. This means that once a vertex is
colored black, it has no white vertices adjacent to it. A gray node, on
the other hand, may have some white vertices adjacent to it, indicating
that there are still additional vertices to explore. {cormen-algorithms}
The breadth first search algorithm shown in Listing {lst:bfs} uses the
adjacency list graph representation we developed earlier in
Listings {lst:vertex} and {lst:graph}. In addition it uses a ``Queue``,
a crucial point as we will see, to decide which vertex to explore next.
In addition the BFS algorithm uses an extended version of the ``Vertex``
class. This new vertex class adds three new instance variables:
distance, predecessor, and color. Each of these instance variables also
has the appropriate getter and setter methods. The code for this
expanded Vertex class is included in the ``pythonds`` package, but we
will not show it to you here as there is nothing new to learn by seeing
the additional instance variables.
BFS begins at the starting vertex ``s`` and colors ``start`` gray to
show that it is currently being explored. Two other values, the distance
and the predecessor, are initialized to 0 and ``None`` respectively for
the starting vertex. Finally, ``start`` is placed on a ``Queue``. The
next step is to begin to systematically explore vertices at the front of
the queue. We explore each new node at the front of the queue by
iterating over its adjacency list. As each node on the adjacency list is
examined its color is checked. If it is white, the vertex is unexplored,
and four things happen:
#. The new, unexplored vertex ``nbr``, is colored gray.
#. The predecessor of ``nbr`` is set to the current node ``currentVert``
#. The distance to ``nbr`` is set to the distance to ``currentVert + 1``
#. ``nbr`` is added to the end of a queue. Adding ``nbr`` to the end of
the queue effectively schedules this node for further exploration,
but not until all the other vertices on the adjacency list of
``currentVert`` have been explored.
::
from pythonds.graphs import Graph, Vertex
from pythonds.basic import Queue
def bfs(g,start):
start.setDistance(0)
start.setPred(None)
vertQueue = Queue()
vertQueue.enqueue(start)
while (vertQueue.size() > 0): #// \label{lst:bfs:while}
currentVert = vertQueue.dequeue()
for nbr in currentVert.getConnections(): #// \label{lst:bfs:for}
if (nbr.getColor() == 'white'):
nbr.setColor('gray')
nbr.setDistance(currentVert.getDistance() + 1)
nbr.setPred(currentVert)
vertQueue.enqueue(nbr)
currentVert.setColor('black')
Let’s look at how the ``bfs`` function would construct the breadth first
tree corresponding to the graph in :ref:`Figure 3 <fig_wordladder>`. Starting
from fool we take all nodes that are adjacent to fool and add them to
the tree. The adjacent nodes include pool, foil, foul, and cool. Each of
these nodes are added to the queue of new nodes to expand.
:ref:`Figure 5 <fig_bfs1>` shows the state of the in-progress tree along with the
queue after this step.
.. figure:: Graphs/bfs1.png
:align: center
:alt: image
.. _fig_bfs1:
In the next step ``bfs`` removes the next node (pool) from the front of
the queue and repeats the process for all of its adjacent nodes.
However, when ``bfs`` examines the node cool, it finds that the color of
cool has already been changed to gray. This indicates that there is a
shorter path to cool and that cool is already on the queue for further
expansion. The only new node added to the queue while examining pool is
poll. The new state of the tree and queue is shown in :ref:`Figure 6 <fig_bfs2>`.
.. figure:: Graphs/bfs2.png
:align: center
:alt: image
{The Second Step in the Breadth First Search}
.. _fig_bfs2:
The next vertex on the queue is foil. The only new node that foil can
add to the tree is fail. As ``bfs`` continues to process the queue,
neither of the next two nodes add anything new to the queue or the tree.
:ref:`Figure 7 <fig_bfs3>` shows the tree and the queue after expanding all the
vertices on the second level of the tree.
[Breadth First Search Tree After Completing One Level] {
.. _fig_bfs3:
.. figure:: Graphs/bfs3.png
:align: center
:alt: image
[Final Breadth First Search Tree]
.. _fig_bfsDone:
.. figure:: Graphs/bfsDone.png
:align: center
:alt: image
image
}{Constructing the Breadth First Search Tree}
.. _fig_bfscontruct:
You should continue to work through the algorithm on your own so that
you are comfortable with how it works. :ref:`Figure 8 <fig_bfsDone>` shows the
final breadth first search tree after all the vertices in
:ref:`Figure 3 <fig_wordladder>` have been expanded. The amazing thing about the
breadth first search solution is that we have not only solved the
FOOL–SAGE problem we started out with, but we have solved many other
problems along the way. We can start at any vertex in the breadth first
search tree and follow the predecessor arrows back to the root to find
the shortest word ladder from any word back to fool. The function in
Listing {lst:traversebfs} shows how to follow the predecessor links to
print out the word ladder.
::
def traverse(y):
x = y
while (x.getPred()):
print(x.getId())
x = x.getPred()
print(x.getId())
traverse(g.getVertex('sage'))
Breadth First Search Analysis
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
{sec:analysis:bfs}
Before we continue with other graph algorithms let us analyze the run
time performance of the breadth first search algorithm. The first thing
to observe is that the while loop on line {lst:bfs:while} is executed,
at most, one time for each vertex in the graph :math:`|V|`. You can
see that this is true because a vertex must be white before it can be
examined and added to the queue. This gives us :math:`O(V)` for the
while loop. The for loop, which is nested inside the while, on
line {lst:bfs:for} is executed at most once for each edge in the graph,
:math:`|E|`. The reason is that every vertex is dequeued at most once
and we examine an edge from node :math:`u` to node :math:`v` only
when node :math:`u` is dequeued. This gives us :math:`O(E)` for the
for loop. combining the two loops gives us :math:`O(V + E)`.
Of course doing the breadth first search is only part of the task.
Following the links from the starting node to the goal node is the other
part of the task. The worst case for this would be if the graph was a
single long chain. In this case traversing through all of the vertices
would be :math:`O(V)`. The normal case is going to be some fraction of
:math:`|V|` but we would still write :math:`O(V)`.
Finally, at least for this problem, there is the time required to build
the initial graph. We leave the analysis of the ``buildGraph`` function
as an exercise for you.
Depth First Search
------------------
The Knight’s Tour Problem
~~~~~~~~~~~~~~~~~~~~~~~~~
Another classic problem that we can use to illustrate a second common
graph algorithm is called the “knight’s tour.” {gordon-kt} The knight’s
tour puzzle is played on a chess board with a single chess piece, the
knight. The object of the puzzle is to find a sequence of moves that
allow the knight to visit every square on the board exactly once. One
such sequence is called a “tour.” The knight’s tour puzzle has
fascinated chess players, mathematicians and computer scientists alike
for many years. The upper bound on the number of possible legal tours
for an eight-by-eight chessboard is known to be
:math:`1.305 \times 10^{35}`; however, there are even more possible
dead ends. Clearly this is a problem that requires some real brains,
some real computing power, or both.
Although researchers have studied many different algorithms to solve the
knight’s tour problem, a graph search is one of the easiest to
understand and program. Once again we will solve the problem using two
main steps:
- Represent the legal moves of a knight on a chessboard as a graph.
- Use a graph algorithm to find a path of length
:math:`rows \times columns - 1` where every vertex on the graph is
visited exactly once.
Building the Knight’s Tour Graph
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To represent the knight’s tour problem as a graph we will use the
following two ideas: Each square on the chessboard can be represented as
a node in the graph. Each legal move by the knight can be represented as
an edge in the graph. :ref:`Figure 10 <fig_knightmoves>` illustrates the legal
moves by a knight and the corresponding edges in a graph.
|image1| {Legal Moves for a Knight on Square 12, and the
Corresponding Graph}
.. _fig_knightmoves:
To build the full graph for an n-by-n board we can use the Python
function shown in Listing {lst:ktbuild}. The ``knightGraph`` function
makes one pass over the entire board. At each square on the board the
``knightGraph`` function calls a helper, ``genLegalMoves``, to create a
list of legal moves for that position on the board. All legal moves are
then converted into edges in the graph. Another helper function
``posToNodeId`` converts a location on the board in terms of a row and a
column into a linear vertex number similar to the vertex numbers shown
in :ref:`Figure 10 <fig_knightmoves>`.
::
from pythonds.graphs import Graph
def knightGraph(bdSize):
ktGraph = Graph()
for row in range(bdSize):
for col in range(bdSize):
nodeId = posToNodeId(row,col,bdSize)
newPositions = genLegalMoves(row,col,bdSize)
for e in newPositions:
nid = posToNodeId(e[0],e[1])
ktGraph.addEdge(nodeId,nid)
return ktGraph
The ``genLegalMoves`` function takes the position of the knight on the
board and generates each of the eight possible moves. The ``legalCoord``
helper function makes sure that a particular move that is generated is
still on the board.
::
def genLegalMoves(x,y,bdSize):
newMoves = []
moveOffsets = [(-1,-2),(-1,2),(-2,-1),(-2,1),
( 1,-2),( 1,2),( 2,-1),( 2,1)]:
for i in moveOffsets:
newX = x + i[0]
newY = y + i[1]
if legalCoord(newX,bdSize) and \
legalCoord(newY,bdSize):
newMoves.append((newX,newY))
return newMoves
def legalCoord(x,bdSize):
if x >= 0 and x < bdSize:
return True
else:
return False
:ref:`Figure 11 <fig_bigknight>` shows the complete graph of possible moves on an
eight-by-eight board. There are exactly 336 edges in the graph. Notice
that the vertices corresponding to the edges of the board have fewer
connections (legal moves) than the vertices in the middle of the board.
Once again we can see how sparse the graph is. If the graph was fully
connected there would be 4,096 edges. Since there are only 336 edges,
the adjacency matrix would be only 8.2 percent full.
.. figure:: Graphs/bigknight.png
:align: center
:alt: image
image
{All Legal Moves for a Knight on an :math:`8 \times 8` Chessboard}
.. _fig_bigknight:
Implementing Knight’s Tour
~~~~~~~~~~~~~~~~~~~~~~~~~~
The search algorithm we will use to solve the knight’s tour problem is
called **depth first search**(**DFS**). {cormen-algorithms} Whereas the
breadth first search algorithm discussed in the previous section builds
a search tree one level at a time, a depth first search creates a search
tree by exploring one branch of the tree as deeply as possible. In this
section we will look at two algorithms that implement a depth first
search. The first algorithm we will look at directly solves the knight’s
tour problem by explicitly forbidding a node to be visited more than
once. The second implementation is more general, but allows nodes to be
visited more than once as the tree is constructed. The second version is
used in subsequent sections to develop additional graph algorithms.
The depth first exploration of the graph is exactly what we need in
order to find a path that has exactly 63 edges. We will see that when
the depth first search algorithm finds a dead end (a place in the graph
where there are no more moves possible) it backs up the tree to the next
deepest vertex that allows it to make a legal move.
The ``knightTour`` function takes four parameters: ``n``, the current
depth in the search tree; ``path``, a list of vertices visited up to
this point; ``u``, the vertex in the graph we wish to explore; and
``limit`` the number of nodes in the path. The ``knightTour`` function
is recursive. When the ``knightTour`` function is called, it first
checks the base case condition. If we have a path that contains 64
vertices, we return from ``knightTour`` with a status of ``True``,
indicating that we have found a successful tour. If the path is not long
enough we continue to explore one level deeper by choosing a new vertex
to explore and calling ``knightTour`` recursively for that vertex.
DFS also uses colors to keep track of which vertices in the graph have
been visited. Unvisited vertices are colored white, and visited vertices
are colored gray. If all neighbors of a particular vertex have been
explored and we have not yet reached our goal length of 64 vertices, we
have reached a dead end. When we reach a dead end we must backtrack.
Backtracking happens when we return from ``knightTour`` with a status of
``False``. In the breadth first search we used a queue to keep track of
which vertex to visit next. Since depth first search is recursive, we
are implicitly using a stack to help us with our backtracking. When we
return from a call to ``knightTour`` with a status of ``False``, in line
{lst:kt:bt}, we remain inside the ``while`` loop and look at the next
vertex in ``nbrList}.
::
from pythonds.graphs import Graph, Vertex
def knightTour(n,path,u,limit):
u.setColor('gray')
path.append(u)
if n < limit:
nbrList = list(u.getConnections()) #// \label{lst:kt:oba}
i = 0
done = False
while i < len(nbrList) and not done:
if nbrList[i].getColor() == 'white':
done = knightTour(n+1, #// \label{lst:kt:bt}
path,
nbrList[i],
limit)
i = i + 1
if not done: # prepare to backtrack
path.pop()
u.setColor('white')
else:
done = True
return done
Let's look at a simple example of \texttt``knightTour} in action. You
can refer to :ref:`Figure 20 <fig_ktexamp>` to follow the steps of the search. For
this example we will assume that the call to the ``getConnections``
method on line {lst:kt:oba} of Listing {lst:knight} orders the nodes in
alphabetical order. We begin by calling ``knightTour(0,path,A,6)``
``knightTour`` starts with node A. The nodes adjacent to A are B and D.
Since B is before D alphabetically, DFS selects B to expand next as
shown in :ref:`Figure 13 <fig_ktb>`. Exploring B happens when ``knightTour`` is
called recursively. B is adjacent to C and D, so ``knightTour`` elects
to explore C next. However, as you can see in :ref:`Figure 14 <fig_ktc>` node C is
a dead end with no adjacent white nodes. At this point we change the
color of node C back to white. The call to ``knightTour`` returns a
value of ``False``. The return from the recursive call effectively
backtracks the search to vertex B (see :ref:`Figure 15 <fig_ktd>`). The next
vertex on the list to explore is vertex D, so ``knightTour`` makes a
recursive call moving to node D. From vertex D on,
``knightTour} can continue to make recursive calls until we
get to node C again. However, this time when we get to node C the
test \texttt``n < limit} fails so we know that we have exhausted all the
nodes in the graph. At this point we can return ``True`` to indicate
that we have made a successful tour of the graph. When we return the
list, ``path`` has the values ``[A,B,D,E,F,C]``, which is the the order
we need to traverse the graph to visit each node exactly once.
[Start with node A] {
.. _fig_kta:
.. figure:: Graphs/ktdfsa.png
:align: center
:alt: image
image
}[Explore B] {
.. _fig_ktb:
.. figure:: Graphs/ktdfsb.png
:align: center
:alt: image
image
}[node C is a dead end] {
.. _fig_ktc:
.. figure:: Graphs/ktdfsc.png
:align: center
:alt: image
image
} [backtrack to B] {
.. _fig_ktd:
.. figure:: Graphs/ktdfsd.png
:align: center
:alt: image
image
}[] {
.. _fig_kte:
.. figure:: Graphs/ktdfse.png
:align: center
:alt: image
image
}[] {
.. _fig_ktf:
.. figure:: Graphs/ktdfsf.png
:align: center
:alt: image
image
} [] {
.. _fig_ktg:
.. figure:: Graphs/ktdfsg.png
:align: center
:alt: image
image
}[finish] {
.. _fig_kth:
.. figure:: Graphs/ktdfsh.png
:align: center
:alt: image
image
} {Finding a Path through a Graph with ``knightTour``}
.. _fig_ktexamp:
:ref:`Figure 21 <fig_tour>` shows you what a complete tour around an
eight-by-eight board looks like. There are many possible tours; some are
symmetric. With some modification you can make circular tours that start
and end at the same square.
.. figure:: Graphs/completeTour.png
:align: center
:alt: image
image
{A Complete Tour of the Board}
.. _fig_tour:
Knight’s Tour Analysis
~~~~~~~~~~~~~~~~~~~~~~
There is one last interesting topic regarding the knight’s tour problem,
then we will move on to the general version of the depth first search.
The topic is performance. In particular, ``knightTour`` is very
sensitive to the method you use to select the next vertex to visit. For
example, on a five-by-five board you can produce a path in about 1.5
seconds on a reasonably fast computer. But what happens if you try an