forked from aliev/runestone
-
Notifications
You must be signed in to change notification settings - Fork 0
/
trees.txt
2819 lines (2240 loc) · 109 KB
/
trees.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
Trees
=====
{chap:tree}
Objectives
----------
- To understand what a tree data structure is and how it is used.
- To see how trees can be used to implement a map data structure.
- To implement trees using a list.
- To implement trees using classes and references.
- To implement trees as a recursive data structure.
- To implement a priority queue using a heap.
Examples of Trees
-----------------
{sec:treeexamp} Now that we have studied linear data structures like
stacks and queues and have some experience with recursion, we will look
at a common data structure called the **tree**. Trees are used in many
areas of computer science, including operating systems, graphics,
database systems, and computer networking. Tree data structures have
many things in common with their botanical cousins. A tree data
structure has a root, branches, and leaves. The difference between a
tree in nature and a tree in computer science is that a tree data
structure has its root at the top and its leaves on the bottom.
Before we begin our study of tree data structures, let’s look at a few
common examples. Our first example of a tree is a classification tree
from biology. Figure {fig:biotree} shows an example of the biological
classification of some animals. From this simple example, we can learn
about several properties of trees. The first property this example
demonstrates is that trees are hierarchical. By hierarchical, we mean
that trees are structured in layers with the more general things near
the top and the more specific things near the bottom. The top of the
hierarchy is the Kingdom, the next layer of the tree (the “children” of
the layer above) is the Phylum, then the Class, and so on. However, no
matter how deep we go in the classification tree, all the organisms are
still animals.
.. figure:: Trees/biology.png
:align: center
:alt: image
image
{Taxonomy of Some Common Animals Shown as a Tree} {fig:biotree}
Notice that you can start at the top of the tree and follow a path made
of circles and arrows all the way to the bottom. At each level of the
tree we might ask ourselves a question and then follow the path that
agrees with our answer. For example we might ask, “Is this animal a
Chordate or an Arthropod?” If the answer is “Chordate” then we follow
that path and ask, “Is this Chordate a Mammal?” If not, we are stuck
(but only in this simplified example). When we are at the Mammal level
we ask, “Is this Mammal a Primate or a Carnivore?” We can keep following
paths until we get to the very bottom of the tree where we have the
common name.
A second property of trees is that all of the children of one node are
independent of the children of another node. For example, the Genus
Felis has the children Domestica and Leo. The Genus Musca also has a
child named Domestica, but it is a different node and is independent of
the Domestica child of Felis. This means that we can change the node
that is the child of Musca without affecting the child of Felis.
A third property is that each leaf node is unique. We can specify a path
from the root of the tree to a leaf that uniquely identifies each
species in the animal kingdom; for example, Animalia
:math:`\rightarrow` Chordate :math:`\rightarrow` Mammal
:math:`\rightarrow` Carnivora :math:`\rightarrow` Felidae
:math:`\rightarrow `Felis:math:`\rightarrow` Domestica.
Another example of a tree structure that you probably use every day is a
file system. In a file system, directories, or folders, are structured
as a tree. Figure {fig:filetree} illustrates a small part of a Unix file
system hierarchy.
.. figure:: Trees/directory.png
:align: center
:alt: image
image
{A Small Part of the Unix File System Hierarchy} {fig:filetree}
The file system tree has much in common with the biological
classification tree. You can follow a path from the root to any
directory. That path will uniquely identify that subdirectory (and all
the files in it). Another important property of trees, derived from
their hierarchical nature, is that you can move entire sections of a
tree (called a **subtree**) to a different position in the tree without
affecting the lower levels of the hierarchy. For example, we could take
the entire subtree staring with /etc/, detach etc/ from the root and
reattach it under usr/. This would change the unique pathname to httpd
from /etc/httpd to /usr/etc/httpd, but would not affect the contents or
any children of the httpd directory.
A final example of a tree is a web page. The following is an example of
a simple web page written using HTML. Figure {fig:html} shows the tree
that corresponds to each of the HTML tags used to create the page.
::
[label=lst:html,float=htbp,language=html,frame=none,numbers=none]
<html xmlns="http://www.w3.org/1999/xhtml"
xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type"
content="text/html; charset=utf-8" />
<title>simple</title>
</head>
<body>
<h1>A simple web page</h1>
<ul>
<li>List item one</li>
<li>List item two</li>
</ul>
<h2><a href="http://www.cs.luther.edu">Luther CS </a><h2>
</body>
</html>
.. figure:: Trees/htmltree
:align: center
:alt: image
image
{A Tree Corresponding to the Markup Elements of a Web Page}
{fig:html}
{} The HTML source code and the tree accompanying the source illustrate
another hierarchy. Notice that each level of the tree corresponds to a
level of nesting inside the HTML tags. The first tag in the source is
``<html>`` and the last is ``</html>`` All the rest of the tags in the
page are inside the pair. If you check, you will see that this nesting
property is true at all levels of the tree.
Vocabulary and Definitions
--------------------------
{sec:treevocab}
Now that we have looked at examples of trees, we will formally define a
tree and its components.
Node
A node is a fundamental part of a tree. It can have a name, which we
call the “key.” A node may also have additional information. We call
this additional information the “payload.” While the payload
information is not central to many tree algorithms, it is often
critical in applications that make use of trees.
Edge
An edge is another fundamental part of a tree. An edge connects two
nodes to show that there is a relationship between them. Every node
(except the root) is connected by exactly one incoming edge from
another node. Each node may have several outgoing edges.
Root
The root of the tree is the only node in the tree that has no
incoming edges. In Figure {fig:filetree}, / is the root of the tree.
Path
A path is an ordered list of nodes that are connected by edges. For
example,
Mammal:math:`\rightarrow`Carnivora:math:`\rightarrow`Felidae:math:`\rightarrow`Felis:math:`\rightarrow`Domestica
is a path.
Children
The set of nodes :math:`c` that have incoming edges from the same
node to are said to be the children of that node. In Figure
{fig:filetree}, nodes log/, spool/, and yp/ are the children of node
var/.
Parent
A node is the parent of all the nodes it connects to with outgoing
edges. In Figure {fig:filetree} the node var/ is the parent of nodes
log/, spool/, and yp/.
Sibling
Nodes in the tree that are children of the same parent are said to
be siblings. The nodes etc/ and usr/ are siblings in the filesystem
tree.
Subtree
A subtree is a set of nodes and edges comprised of a parent and all
the descendants of that parent.
Leaf Node
A leaf node is a node that has no children. For example, Human and
Chimpanzee are leaf nodes in Figure {fig:biotree}.
Level
The level of a node :math:`n` is the number of edges on the path
from the root node to :math:`n`. For example, the level of the
Felis node in Figure {fig:biotree} is five. By definition, the level
of the root node is zero.
Height
The height of a tree is equal to the maximum level of any node in
the tree. The height of the tree in Figure {fig:filetree} is two.
With the basic vocabulary now defined, we can move on to a formal
definition of a tree. In fact, we will provide two definitions of a
tree. One definition involves nodes and edges. The second definition,
which will prove to be very useful, is a recursive definition.
{} *Definition One:* A tree consists of a set of nodes and a set of
edges that connect pairs of nodes. A tree has the following properties:
- One node of the tree is designated as the root node.
- Every node :math:`n`, except the root node, is connected by an edge
from exactly one other node :math:`p`, where :math:`p` is the
parent of :math:`n`.
- A unique path traverses from the root to each node.
- If each node in the tree has a maximum of two children, we say that
the tree is a **binary tree**.
Figure {fig:nodeedgetree} illustrates a tree that fits definition one.
The arrowheads on the edges indicate the direction of the connection.
.. figure:: Trees/treedef1.png
:align: center
:alt: image
image
{A Tree Consisting of a Set of Nodes and Edges} {fig:nodeedgetree}
*Definition Two:* A tree is either empty or consists of a root and zero
or more subtrees, each of which is also a tree. The root of each subtree
is connected to the root of the parent tree by an edge.
Figure {fig:rectree} illustrates this recursive definition of a tree.
Using the recursive definition of a tree, we know that the tree in
Figure {fig:rectree} has at least four nodes, since each of the
triangles representing a subtree must have a root. It may have many more
nodes than that, but we do not know unless we look deeper into the tree.
.. figure:: Trees/TreeDefRecursive.png
:align: center
:alt: image
image
{A Recursive Definition of a Tree} {fig:rectree}
Implementation
--------------
Keeping in mind the definitions from the previous section, we can use
the following functions to create and manipulate a binary tree:
- ``BinaryTree()`` creates a new instance of a binary tree.
- ``getLeftChild()`` returns the binary tree corresponding to the left
child of the current node.
- ``getRightChild()`` returns the binary tree corresponding to the
right child of the current node.
- ``setRootVal(val)`` stores the object in parameter ``val`` in the
current node.
- ``getRootVal()`` returns the object stored in the current node.
- ``insertLeft(val)`` creates a new binary tree and installs it as the
left child of the current node.
- ``insertRight(val)`` creates a new binary tree and installs it as the
right child of the current node.
The key decision in implementing a tree is choosing a good internal
storage technique. Python allows us two very interesting possibilities,
so we will examine both before choosing one. The first technique we will
call “list of lists,” the second technique we will call “nodes and
references.”
List of Lists Representation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
{sec:bintree} In a tree represented by a list of lists, we will begin
with Python’s list data structure and write the functions defined above.
Although writing the interface as a set of operations on a list is a bit
different from the other abstract data types we have implemented, it is
interesting to do so because it provides us with a simple recursive data
structure that we can look at and examine directly. In a list of lists
tree, we will store the value of the root node as the first element of
the list. The second element of the list will itself be a list that
represents the left subtree. The third element of the list will be
another list that represents the right subtree. To illustrate this
storage technique, let’s look at an example. Figure {fig:smalltree}
shows a simple tree and the corresponding list implementation.
.. figure:: Trees/smalltree.png
:align: center
:alt: image
image
{{fig:smalltreeG} A small tree}
::
myTree = ['a', #root
['b', #left subtree
['d' [], []],
['e' [], []] ],
['c', #right subtree
['f' [], []],
[] ]
]
{{fig:smalltreeL} The list representation of the tree}
{Representing a Tree as a List of Lists} {fig:smalltree}
Notice that we can access subtrees of the list using standard list
slices. The root of the tree is ``myTree[0]``, the left subtree of the
root is ``myTree[1]``, and the right subtree is ``myTree[2]``. The
following Python session illustrates creating a simple tree using a
list. Once the tree is constructed, we can access the root and the left
and right subtrees. One very nice property of this list of lists
approach is that the structure of a list representing a subtree adheres
to the structure defined for a tree; the structure itself is recursive!
A subtree that has a root value and two empty lists is a leaf node.
Another nice feature of the list of lists approach is that it
generalizes to a tree that has many subtrees. In the case where the tree
is more than a binary tree, another subtree is just another list.
{
::
>>> myTree = ['a', ['b', ['d',[],[]], ['e',[],[]] ], \
['c', ['f',[],[]], []] ]
>>> myTree
['a', ['b', ['d',[],[]], ['e',[],[]]],
['c', ['f',[],[]], []]]
>>> myTree[1]
['b', ['d',[],[]], ['e',[],[]]]
>>> myTree[0]
'a'
>>> myTree[2]
['c', ['f', [], []], []]
}
Let’s formalize this definition of the tree data structure by providing
some functions that make it easy for us to use lists as trees. Note that
we are not going to define a binary tree class. The functions we will
write will just help us manipulate a standard list as though we are
working with a tree.
::
[caption=List Functions,label=lst:treefuncs,float=htbp,index={BinaryTree}]
def BinaryTree(r):
return [r, [], []]
The ``BinaryTree`` function simply constructs a list with a root node
and two empty sublists for the children. To add a left subtree to the
root of a tree, we need to insert a new list into the second position of
the root list. We must be careful. If the list already has something in
the second position, we need to keep track of it and push it down the
tree as the left child of the list we are adding. Listing {lst:linsleft}
shows the Python code for inserting a left child.
::
[caption=Insert a Left Subtree,label=lst:linsleft,float=htbp,index={insertLeft}]
def insertLeft(root,newBranch):
t = root.pop(1)
if len(t) > 1:
root.insert(1,[newBranch,t,[]])
else:
root.insert(1,[newBranch, [], []])
return root
Notice that to insert a left child, we first obtain the (possibly empty)
list that corresponds to the current left child. We then add the new
left child, installing the old left child as the left child of the new
one. This allows us to splice a new node into the tree at any position.
The code for ``insertRight`` is similar to ``insertLeft`` and is shown
in Listing {lst:linsright}.
::
[caption=Insert a Right Subtree,label=lst:linsright,float=htbp,index={insertright}]
def insertRight(root,newBranch):
t = root.pop(2)
if len(t) > 1:
root.insert(2,[newBranch,[],t])
else:
root.insert(2,[newBranch,[],[]])
return root
To round out this set of tree-making functions, let’s write a couple of
access functions for getting and setting the root value, as well as
getting the left or right subtrees.
::
[caption=Access Functions for Parts of the Tree,label=lst:lab,float=htbp,index={getRootVal,setRootVal,getLeftChild,getRightChild}]
def getRootVal(root):
return root[0]
def setRootVal(root,newVal):
root[0] = newVal
def getLeftChild(root):
return root[1]
def getRightChild(root):
return root[2]
The Python session in Figure {fig:makeTreess} exercises the tree
functions we have just written. You should type in this code and try it
out for yourself. One of the exercises asks you to draw the tree
structure resulting from this set of calls.
{A Python Session to Illustrate Basic Tree Functions }
{fig:makeTreess}
Nodes and References
~~~~~~~~~~~~~~~~~~~~
Our second method to represent a tree uses nodes and references. In this
case we will define a class that has attributes for the root value, as
well as the left and right subtrees. Since this representation more
closely follows the object-oriented programming paradigm, we will
continue to use this representation for the remainder of the chapter.
Using nodes and references, we might think of the tree as being
structured like the one shown in Figure {fig:treerec}.
.. figure:: Trees/treerecs.png
:align: center
:alt: image
image
{A Simple Tree Using a Nodes and References Approach} {fig:treerec}
We will start out with a simple class definition for the nodes and
references approach as shown in Listing {lst:nar}. The important thing
to remember about this representation is that the attributes ``left``
and ``right`` will become references to other instances of the
``BinaryTree`` class. For example, when we insert a new left child into
the tree we create another instance of ``BinaryTree`` and modify
``self.leftChild`` in the root to reference the new tree.
::
[caption=A Simple Class Definition,label=lst:nar,float=htbp,index={BinaryTree}]
class BinaryTree:
def __init__(self,rootObj):
self.key = rootObj
self.leftChild = None
self.rightChild = None
Notice that in Listing {lst:nar}, the constructor function expects to
get some kind of object to store in the root. Just like you can store
any object you like in a list, the root object of a tree can be a
reference to any object. For our early examples, we will store the name
of the node as the root value. Using nodes and references to represent
the tree in Figure {fig:treerec}, we would create six instances of the
BinaryTree class.
Next let’s look at the functions we need to build the tree beyond the
root node. To add a left child to the tree, we will create a new binary
tree object and set the ``left`` attribute of the root to refer to this
new object. The code for ``insertLeft`` is shown in
Listing {lst:inleft}.
::
[caption=Insert a New Left Child,label=lst:inleft,float=htb]
def insertLeft(self,newNode):
if self.leftChild == None:
self.leftChild = BinaryTree(newNode)
else: #// \label{lst:inilinsrt}
t = BinaryTree(newNode)
t.left = self.leftChild
self.leftChild = t
We must consider two cases for insertion. The first case is
characterized by a node with no existing left child. When there is no
left child, simply add a node to the tree. The second case is
characterized by a node with an existing right child. In the second
case, we insert a node and push the existing child down one level in the
tree. The second case is handled by the ``else`` statement on line
{lst:inilinsrt} of Listing {lst:inleft}.
The code for ``insertRight`` must consider a symmetric set of cases.
There will either be no right child, or we must insert the node between
the root and an existing right child. The insertion code is shown in
Listing {lst:insrt}.
::
[caption=Code to Insert a Right Child,label=lst:insrt,float=htb]
def insertRight(self,newNode):
if self.rightChild == None:
self.rightChild = BinaryTree(newNode)
else:
t = BinaryTree(newNode)
t.right = self.rightChild
self.rightChild = t
To round out the definition for a simple binary tree data structure, we
will write access functions for the left and right children, as well as
the root values.
::
[caption=Access Methods for the Binary Tree Class,label=lst:btaccess,float=htb]
def getRightChild(self):
return self.rightChild
def getLeftChild(self):
return self.leftChild
def setRootVal(self,obj):
self.key = obj
def getRootVal(self):
return self.key
Now that we have all the pieces to create and manipulate a binary tree,
let’s use them to check on the structure a bit more. Let’s make a simple
tree with node a as the root, and add nodes b and c as children. The
following Python session creates the tree and looks at the some of the
values stored in ``key``, ``left``, and ``right``. Notice that both the
left and right children of the root are themselves distinct instances of
the ``BinaryTree`` class. As we said in our original recursive
definition for a tree, this allows us to treat any child of a binary
tree as a binary tree itself. {
::
>>> from pythonds.trees import BinaryTree
>>> r = BinaryTree('a')
>>> r.getRootVal()
'a'
>>> print(r.getLeftChild())
None
>>> r.insertLeft('b')
>>> print(r.getLeftChild())
<__main__.BinaryTree instance at 0x6b238>
>>> print(r.getLeftChild().getRootVal())
b
>>> r.insertRight('c')
>>> print(r.getRightChild())
<__main__.BinaryTree instance at 0x6b9e0>
>>> print(r.getRightChild().getRootVal())
c
>>> r.getRightChild().setRootVal('hello')
>>> print(r.getRightChild().getRootVal())
hello
>>>
}
Binary Tree Applications
------------------------
{sec:bintreeapps}
Parse Tree
~~~~~~~~~~
{sec:parsetree} With the implementation of our tree data structure
complete, we now look at an example of how a tree can be used to solve
some real problems. In this section we will look at parse trees. Parse
trees can be used to represent real-world constructions like sentences
(see Figure {fig:nlparse}), or mathematical expressions.
.. figure:: Trees/nlParse.png
:align: center
:alt: image
image
{A Parse Tree for a Simple Sentence} {fig:nlparse}
Figure {fig:nlparse} shows the hierarchical structure of a simple
sentence. Representing a sentence as a tree structure allows us to work
with the individual parts of the sentence by using subtrees.
.. figure:: Trees/meParse.png
:align: center
:alt: image
image
{Parse Tree for :math:`((7+3)*(5-2))`} {fig:meparse}
We can also represent a mathematical expression such as
:math:`((7 + 3) * (5 - 2))` as a parse tree, as shown in
Figure {fig:meparse}. We have already looked at fully parenthesized
expressions, so what do we know about this expression? We know that
multiplication has a higher precedence than either addition or
subtraction. Because of the parentheses, we know that before we can do
the multiplication we must evaluate the parenthesized addition and
subtraction expressions. The hierarchy of the tree helps us understand
the order of evaluation for the whole expression. Before we can evaluate
the top-level multiplication, we must evaluate the addition and the
subtraction in the subtrees. The addition, which is the left subtree,
evaluates to 10. The subtraction, which is the right subtree, evaluates
to 3. Using the hierarchical structure of trees, we can simply replace
an entire subtree with one node once we have evaluated the expressions
in the children. Applying this replacement procedure gives us the
simplified tree shown in Figure {fig:mesimple}.
.. figure:: Trees/meSimple.png
:align: center
:alt: image
image
{A Simplified Parse Tree for :math:`((7+3)*(5-2))`} {fig:mesimple}
In the rest of this section we are going to examine parse trees in more
detail. In particular we will look at
- How to build a parse tree from a fully parenthesized mathematical
expression.
- How to evaluate the expression stored in a parse tree.
- How to recover the original mathematical expression from a parse
tree.
The first step in building a parse tree is to break up the expression
string into a list of tokens. There are four different kinds of tokens
to consider: left parentheses, right parentheses, operators, and
operands. We know that whenever we read a left parenthesis we are
starting a new expression, and hence we should create a new tree to
correspond to that expression. Conversely, whenever we read a right
parenthesis, we have finished an expression. We also know that operands
are going to be leaf nodes and children of their operators. Finally, we
know that every operator is going to have both a left and a right child.
Using the information from above we can define four rules as follows:
#. If the current token is a ``'('``, add a new node as the left child
of the current node, and descend to the left child.
#. If the current token is in the list ``['+','-','/','*']``, set the
root value of the current node to the operator represented by the
current token. Add a new node as the right child of the current node
and descend to the right child.
#. If the current token is a number, set the root value of the current
node to the number and return to the parent.
#. If the current token is a ``')'``, go to the parent of the current
node.
Before writing the Python code, let’s look at an example of the rules
outlined above in action. We will use the expression
:math:`(3 + (4 * 5))`. We will parse this expression into the
following list of character tokens ``['(', '3', '+',``
``'(', '4', '*', '5' ,')',')']``. Initially we will start out with a
parse tree that consists of an empty root node. Figure {fig:bldExpstep}
illustrates the structure and contents of the parse tree, as each new
token is processed.
[] { {fig:bldExpa}
.. figure:: Trees/buildExp1.png
:align: center
:alt: image
image
}[] { {fig:bldExpb}
.. figure:: Trees/buildExp2.png
:align: center
:alt: image
image
}[] { {fig:bldExpc}
.. figure:: Trees/buildExp3.png
:align: center
:alt: image
image
}[] { {fig:bldExpd}
.. figure:: Trees/buildExp4.png
:align: center
:alt: image
image
} [] { {fig:bldExpe}
.. figure:: Trees/buildExp5.png
:align: center
:alt: image
image
}[] { {fig:bldExpf}
.. figure:: Trees/buildExp6.png
:align: center
:alt: image
image
}[] { {fig:bldExpg}
.. figure:: Trees/buildExp7.png
:align: center
:alt: image
image
}[] { {fig:bldExph}
.. figure:: Trees/buildExp8.png
:align: center
:alt: image
image
} {Tracing Parse Tree Construction} {fig:bldExpstep}
Using Figure {fig:bldExpstep}, let’s walk through the example step by
step:
a) Create an empty tree.
b) Read ( as the first token. By rule 1, create a new node as the left
child of the root. Make the current node this new child.
c) Read 3 as the next token. By rule 3, set the root value of the
current node to 3 and go back up the tree to the parent.
d) Read + as the next token. By rule 2, set the root value of the
current node to + and add a new node as the right child. The new
right child becomes the current node.
e) Read a ( as the next token. By rule 1, create a new node as the left
child of the current node. The new left child becomes the current
node.
f) Read a 4 as the next token. By rule 3, set the value of the current
node to 4. Make the parent of 4 the current node.
g) Read \* as the next token. By rule 2, set the root value of the
current node to \* and create a new right child. The new right child
becomes the current node.
h) Read 5 as the next token. By rule 3, set the root value of the
current node to 5. Make the parent of 5 the current node.
i) Read ) as the next token. By rule 4 we make the parent of \* the
current node.
j) Read ) as the next token. By rule 4 we make the parent of + the
current node. At this point there is no parent for + so we are done.
From the example above, it is clear that we need to keep track of the
current node as well as the parent of the current node. The tree
interface provides us with a way to get children of a node, through the
``getLeftChild`` and ``getRightChild`` methods, but how can we keep
track of the parent? A simple solution to keeping track of parents as we
traverse the tree is to use a stack. Whenever we want to descend to a
child of the current node, we first push the current node on the stack.
When we want to return to the parent of the current node, we pop the
parent off the stack.
Using the rules described above, along with the ``Stack`` and
``BinaryTree`` operations, we are now ready to write a Python function
to create a parse tree. The code for our parse tree builder is presented
in Listing {lst:buildexp}.
::
[caption=Code to Create a Parse Tree,label=lst:buildexp,float=htbp,index={buildParseTree}]
from pythonds.basic import Stack
from pythonds.trees import BinaryTree
def buildParseTree(fpexp):
fplist = fpexp.split()
pStack = Stack()
eTree = BinaryTree('')
pStack.push(eTree)
currentTree = eTree
for i in fplist:
if i == '(': #// \label{lst:ptlp}
currentTree.insertLeft('')
pStack.push(currentTree)
currentTree = currentTree.getLeftChild()
elif i not in '+-*/)': #// \label{lst:ptoper}
currentTree.setRootVal(eval(i))
parent = pStack.pop()
currentTree = parent
elif i in '+-*/': #// \label{lst:ptopnd}
currentTree.setRootVal(i)
currentTree.insertRight('')
pStack.push(currentTree)
currentTree = currentTree.getRightChild()
elif i == ')': #// \label{lst:ptrp}
currentTree = pStack.pop()
else:
raise ValueError("Unknown Operator: " + i)
return eTree
The four rules for building a parse tree are coded as the first four
clauses of the ``if`` statement on lines {lst:ptlp}, {lst:ptoper},
{lst:ptopnd}, and {lst:ptrp} of Listing {lst:buildexp}. In each case you
can see that the code implements the rule, as described above, with a
few calls to the ``BinaryTree`` or ``Stack`` methods. The only error
checking we do in this function is in the ``else`` clause, where we
raise a ``ValueError`` exception if we get a token from the list that we
do not recognize.
Now that we have built a parse tree, what can we do with it? As a first
example, we will write a function to evaluate the parse tree, returning
the numerical result. To write this function, we will make use of the
hierarchical nature of the tree. Look back at Figure {fig:meparse}.
Recall that we can replace the original tree with the simplified tree
shown in Figure {fig:mesimple}. This suggests that we can write an
algorithm that evaluates a parse tree by recursively evaluating each
subtree.
As we have done with past recursive algorithms, we will begin the design
for the recursive evaluation function by identifying the base case. A
natural base case for recursive algorithms that operate on trees is to
check for a leaf node. In a parse tree, the leaf nodes will always be
operands. Since numerical objects like integers and floating points
require no further interpretation, the ``evaluate`` function can simply
return the value stored in the leaf node. The recursive step that moves
the function toward the base case is to call ``evaluate`` on both the
left and the right children of the current node. The recursive call
effectively moves us down the tree, toward a leaf node.
To put the results of the two recursive calls together, we can simply
apply the operator stored in the parent node to the results returned
from evaluating both children. In the example from Figure {fig:mesimple}
we see that the two children of the root evaluate to themselves, namely
10 and 3. Applying the multiplication operator gives us a final result
of 30.
The code for a recursive ``evaluate`` function is shown in
Listing {lst:eteval}. First, we obtain references to the left and the
right children of the current node. If both the left and right children
evaluate to ``None``, then we know that the current node is really a
leaf node. This check is on line {lst:etbc}. If the current node is not
a leaf node, look up the operator in the current node and apply it to
the results from recursively evaluating the left and right children.
To implement, we use a dictionary with the keys ``'+', '-', '*'``, and
``'/'``. The values stored in the dictionary are functions from Python’s
operator module. The operator module provides us with the functional
versions of many commonly used operators. When we look up an operator in
the dictionary, the corresponding function object is retrieved. Since
the retrieved object is a function, we can call it in the usual way
``function(param1,param2)``. So the lookup ``opers['+'](2,2)`` is
equivalent to ``operator.add(2,2)``.
::
[caption=A Recursive Function to Evaluate a Binary Parse Tree,label=lst:eteval,float=htbp,index={evaluate}]
def evaluate(parseTree):
opers = {'+':operator.add, '-':operator.sub,
'*':operator.mul, '/':operator.truediv}
leftC = parseTree.getLeftChild()
rightC = parseTree.getRightChild()
if leftC and rightC: #// \label{lst:etbc}
fn = opers[parseTree.getRootVal()]
return fn(evaluate(leftC),evaluate(rightC)) #//\label{lst:evalexprec}
else:
return parseTree.getRootVal()
Finally, we will trace the ``evaluate`` function on the parse tree we
created in Figure {fig:bldExpstep}. When we first call ``evaluate``, we
pass the root of the entire tree as the parameter ``parseTree``. Then we
obtain references to the left and right children to make sure they
exist. The recursive call takes place on line {lst:evalexprec}. We begin
by looking up the operator in the root of the tree, which is ``'+'``.
The ``'+'`` operator maps to the ``operator.add`` function call, which
takes two parameters. As usual for a Python function call, the first
thing Python does is to evaluate the parameters that are passed to the
function. In this case both parameters are recursive function calls to
our ``evaluate`` function. Using left-to-right evaluation, the first
recursive call goes to the left. In the first recursive call the
``evaluate`` function is given the left subtree. We find that the node
has no left or right children, so we are in a leaf node. When we are in
a leaf node we just return the value stored in the leaf node as the
result of the evaluation. In this case we return the integer 3.
At this point we have one parameter evaluated for our top-level call to
``operator.add``. But we are not done yet. Continuing the left-to-right
evaluation of the parameters, we now make a recursive call to evaluate
the right child of the root. We find that the node has both a left and a
right child so we look up the operator stored in this node, ``'*'``, and
call this function using the left and right children as the parameters.
At this point you can see that both recursive calls will be to leaf
nodes, which will evaluate to the integers four and five respectively.
With the two parameters evaluated, we return the result of
``operator.mul(4,5)``. At this point we have evaluated the operands for
the top level ``'+'`` operator and all that is left to do is finish the
call to ``operator.add(3,20)``. The result of the evaluation of the
entire expression tree for :math:`(3 + (4 * 5))` is 23.
Tree Traversals
~~~~~~~~~~~~~~~
{sec:traverse} Now that we have examined the basic functionality of our
tree data structure, it is time to look at some additional usage
patterns for trees. These usage patterns can be divided into the three
ways that we access the nodes of the tree. There are three commonly used
patterns to visit all the nodes in a tree. The difference between these
patterns is the order in which each node is visited. We call this
visitation of the nodes a “traversal.” The three traversals we will look
at are called **preorder**, **inorder**, and **postorder**. Let’s start
out by defining these three traversals more carefully, then look at some
examples where these patterns are useful.
preorder
In a preorder traversal, we visit the root node first, then
recursively do a preorder traversal of the left subtree, followed by
a recursive preorder traversal of the right subtree.
inorder
In an inorder traversal, we recursively do an inorder traversal on
the left subtree, visit the root node, and finally do a recursive
inorder traversal of the right subtree.
postorder
In a postorder traversal, we recursively do a postorder traversal of
the left subtree and the right subtree followed by a visit to the
root node.
Let’s look at some examples that illustrate each of these three kinds of
traversals. First let’s look at the preorder traversal. As an example of
a tree to traverse, we will represent this book as a tree. The book is
the root of the tree, and each chapter is a child of the root. Each
section within a chapter is a child of the chapter, and each subsection
is a child of its section, and so on. Figure {fig:booktree} shows a
limited version of a book with only two chapters. Note that the
traversal algorithm works for trees with any number of children, but we
will stick with binary trees for now.
.. figure:: Trees/booktree.png
:align: center
:alt: image
image
{Representing a Book as a Tree} {fig:booktree}
Suppose that you wanted to read this book from front to back. The
preorder traversal gives you exactly that ordering. Starting at the root
of the tree (the Book node) we will follow the preorder traversal
instructions. We recursively call ``preorder`` on the left child, in
this case Chapter1. We again recursively call ``preorder`` on the left
child to get to Section 1.1. Since Section 1.1 has no children, we do
not make any additional recursive calls. When we are finished with
Section 1.1, we move up the tree to Chapter 1. At this point we still
need to visit the right subtree of Chapter 1, which is Section 1.2. As
before we visit the left subtree, which brings us to Section 1.2.1, then
we visit the node for Section 1.2.2. With Section 1.2 finished, we
return to Chapter 1. Then we return to the Book node and follow the same
procedure for Chapter 2.
The code for writing tree traversals is surprisingly elegant, largely
because the traversals are written recursively. Listing {lst:preorder}
shows the Python code for a preorder traversal of a binary tree.
You may wonder, what is the best way to write an algorithm like preorder
traversal? Should it be a function that simply uses a tree as a data
structure, or should it be a method of the tree data structure itself?
Listing {lst:preordext} shows a version of the preorder traversal
written as an external function that takes a binary tree as a parameter.
The external function is particularly elegant because our base case is
simply to check if the tree exists. If the tree parameter is ``None``,