<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Extending SWIG to support new languages</title>
<link rel="stylesheet" type="text/css" href="style.css">
</head>
<body bgcolor="#ffffff">
<H1><a name="Extending"></a>38 Extending SWIG to support new languages</H1>
<!-- INDEX -->
<div class="sectiontoc">
<ul>
<li><a href="#Extending_nn2">Introduction</a>
<li><a href="#Extending_nn3">Prerequisites</a>
<li><a href="#Extending_nn4">The Big Picture</a>
<li><a href="#Extending_nn5">Execution Model</a>
<ul>
<li><a href="#Extending_nn6">Preprocessing</a>
<li><a href="#Extending_nn7">Parsing</a>
<li><a href="#Extending_nn8">Parse Trees</a>
<li><a href="#Extending_nn9">Attribute namespaces</a>
<li><a href="#Extending_nn10">Symbol Tables</a>
<li><a href="#Extending_nn11">The %feature directive</a>
<li><a href="#Extending_nn12">Code Generation</a>
<li><a href="#Extending_nn13">SWIG and XML</a>
</ul>
<li><a href="#Extending_nn14">Primitive Data Structures</a>
<ul>
<li><a href="#Extending_nn15">Strings</a>
<li><a href="#Extending_nn16">Hashes</a>
<li><a href="#Extending_nn17">Lists</a>
<li><a href="#Extending_nn18">Common operations</a>
<li><a href="#Extending_nn19">Iterating over Lists and Hashes</a>
<li><a href="#Extending_nn20">I/O</a>
</ul>
<li><a href="#Extending_nn21">Navigating and manipulating parse trees</a>
<li><a href="#Extending_nn22">Working with attributes</a>
<li><a href="#Extending_nn23">Type system</a>
<ul>
<li><a href="#Extending_nn24">String encoding of types</a>
<li><a href="#Extending_nn25">Type construction</a>
<li><a href="#Extending_nn26">Type tests</a>
<li><a href="#Extending_nn27">Typedef and inheritance</a>
<li><a href="#Extending_nn28">Lvalues</a>
<li><a href="#Extending_nn29">Output functions</a>
</ul>
<li><a href="#Extending_nn30">Parameters</a>
<li><a href="#Extending_nn31">Writing a Language Module</a>
<ul>
<li><a href="#Extending_nn32">Execution model</a>
<li><a href="#Extending_starting_out">Starting out</a>
<li><a href="#Extending_nn34">Command line options</a>
<li><a href="#Extending_nn35">Configuration and preprocessing</a>
<li><a href="#Extending_nn36">Entry point to code generation</a>
<li><a href="#Extending_nn37">Module I/O and wrapper skeleton</a>
<li><a href="#Extending_nn38">Low-level code generators</a>
<li><a href="#Extending_configuration_files">Configuration files</a>
<li><a href="#Extending_nn40">Runtime support</a>
<li><a href="#Extending_nn41">Standard library files</a>
<li><a href="#Extending_nn42">User examples</a>
<li><a href="#Extending_test_suite">Test driven development and the test-suite</a>
<ul>
<li><a href="#Extending_running_test_suite">Running the test-suite</a>
</ul>
<li><a href="#Extending_nn43">Documentation</a>
<li><a href="#Extending_prerequisites">Prerequisites for adding a new language module to the SWIG distribution</a>
<li><a href="#Extending_coding_style_guidelines">Coding style guidelines</a>
</ul>
<li><a href="#Extending_debugging_options">Debugging Options</a>
<li><a href="#Extending_nn46">Guide to parse tree nodes</a>
<li><a href="#Extending_further_info">Further Development Information</a>
</ul>
</div>
<!-- INDEX -->
<H2><a name="Extending_nn2"></a>38.1 Introduction</H2>
<p>
This chapter describes SWIG's internal organization and the process by which
new target languages can be developed. First, a brief word of warning: SWIG
is continually evolving. The information in this chapter is mostly up to
date, but changes are ongoing, so expect a few inconsistencies.
</p>
<p>
Also, this chapter is not meant to be a hand-holding tutorial. As a starting point,
you should probably look at one of SWIG's existing modules.
</p>
<H2><a name="Extending_nn3"></a>38.2 Prerequisites</H2>
<p>
In order to extend SWIG, it is useful to have the following background:
</p>
<ul>
<li>An understanding of the C API for the target language.
<li>A good grasp of the C++ type system.
<li>An understanding of typemaps and some of SWIG's advanced features.
<li>Some familiarity with writing C++ (language modules are currently written in C++).
</ul>
<p>
Since SWIG is essentially a specialized C++ compiler, it may be useful
to have some prior experience with compiler design (perhaps even a
compilers course) to better understand certain parts of the system. A
number of books will also be useful. For example, "The C Programming
Language" by Kernighan and Ritchie (a.k.a. "K&amp;R") and the C++ standard,
"ISO/IEC 14882 Programming Languages - C++", will be of great use.
</p>
<p>
Also, it is useful to keep in mind that SWIG primarily operates as an
extension of the C++ <em>type</em> system. At first glance, this might not be
obvious, but almost all SWIG directives as well as the low-level generation of
wrapper code are driven by C++ datatypes.
</p>
<H2><a name="Extending_nn4"></a>38.3 The Big Picture</H2>
<p>
SWIG is a special purpose compiler that parses C++ declarations to
generate wrapper code. To make this conversion possible, SWIG makes
three fundamental extensions to the C++ language:
</p>
<ul>
<li><b>Typemaps</b>. Typemaps are used to define the
conversion/marshalling behavior of specific C++ datatypes. All type conversion in SWIG is
based on typemaps. Furthermore, the association of typemaps to datatypes utilizes an advanced pattern matching
mechanism that is fully integrated with the C++ type system.
</li>
<li><b>Declaration Annotation</b>. To customize wrapper code
generation, most declarations can be annotated with special features.
For example, you can make a variable read-only, you can ignore a
declaration, you can rename a member function, you can add exception
handling, and so forth. Virtually all of these customizations are built on top of a low-level
declaration annotator that can attach arbitrary attributes to any declaration.
Code generation modules can look for these attributes to guide the wrapping process.
</li>
<li><b>Class extension</b>. SWIG allows classes and structures to be extended with new
methods and attributes (the <tt>%extend</tt> directive). This has the effect of altering
the API in the target language and can be used to generate OO interfaces to C libraries.
</ul>
<p>
It is important to emphasize that virtually all SWIG features reduce to one of these three
fundamental concepts. The type system and pattern matching rules also play a critical
role in making the system work. For example, both typemaps and declaration annotation are
based on pattern matching and interact heavily with the underlying type system.
</p>
<H2><a name="Extending_nn5"></a>38.4 Execution Model</H2>
<p>
When you run SWIG on an interface, processing is handled in stages by a series of system components:
</p>
<ul>
<li>An integrated C preprocessor reads a collection of configuration
files and the specified interface file into memory. The preprocessor
performs the usual functions including macro expansion and file
inclusion. However, the preprocessor also performs some transformations of the
interface. For instance, <tt>#define</tt> statements are sometimes transformed into
<tt>%constant</tt> declarations. In addition, information related to file/line number
tracking is inserted.
</li>
<li>A C/C++ parser reads the preprocessed input and generates a full
parse tree of all of the SWIG directives and C declarations found.
The parser is responsible for many aspects of the system including
renaming, declaration annotation, and template expansion. However, the parser
does not produce any output nor does it interact with the target
language module as it runs. SWIG is not a one-pass compiler.
</li>
<li>A type-checking pass is made. This adjusts all of the C++ typenames to properly
handle namespaces, typedefs, nested classes, and other issues related to type scoping.
</li>
<li>A semantic pass is made on the parse tree to collect information
related to properties of the C++ interface. For example, this pass
would determine whether or not a class allows a default constructor.
</li>
<li>A code generation pass is made using a specific target language
module. This phase is responsible for generating the actual wrapper
code. All of SWIG's user-defined modules are invoked during this
latter stage of compilation.
</li>
</ul>
<p>
The next few sections briefly describe some of these stages.
</p>
<H3><a name="Extending_nn6"></a>38.4.1 Preprocessing</H3>
<p>
The preprocessor plays a critical role in the SWIG implementation. This is because a lot
of SWIG's processing and internal configuration is managed not by code written in C, but
by configuration files in the SWIG library. In fact, when you
run SWIG, parsing starts with a small interface file like this (note: this explains
the cryptic error messages that new users sometimes get when SWIG is misconfigured or installed
incorrectly):
</p>
<div class="code">
<pre>
%include "swig.swg" // Global SWIG configuration
%include "<em>langconfig.swg</em>" // Language specific configuration
%include "yourinterface.i" // Your interface file
</pre>
</div>
<p>
The <tt>swig.swg</tt> file contains global configuration information. In addition, this file
defines many of SWIG's standard directives as macros. For instance, part
of <tt>swig.swg</tt> looks like this:
</p>
<div class="code">
<pre>
...
/* Code insertion directives such as %wrapper %{ ... %} */
#define %begin %insert("begin")
#define %runtime %insert("runtime")
#define %header %insert("header")
#define %wrapper %insert("wrapper")
#define %init %insert("init")
/* Access control directives */
#define %immutable %feature("immutable","1")
#define %mutable %feature("immutable")
/* Directives for callback functions */
#define %callback(x) %feature("callback") `x`;
#define %nocallback %feature("callback");
/* %ignore directive */
#define %ignore %rename($ignore)
#define %ignorewarn(x) %rename("$ignore:" x)
...
</pre>
</div>
<p>
The fact that most of the standard SWIG directives are macros is
intended to simplify the implementation of the internals. For instance,
rather than having to support dozens of special directives, it is
easier to have a few basic primitives such as <tt>%feature</tt> or
<tt>%insert</tt>.
</p>
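<p>
The idea of reducing many directives to a few primitives can be sketched as a plain lookup table.
This is only an illustration: the table copies a handful of the definitions quoted above, the function
names <tt>directiveMacros</tt> and <tt>expandDirective</tt> are hypothetical, and SWIG's actual preprocessor
performs general macro expansion rather than a table lookup.
</p>

```cpp
#include <map>
#include <string>

// Toy model: a few directive macros copied from the swig.swg excerpt above.
// SWIG's real preprocessor does general macro expansion; this is a lookup only.
inline const std::map<std::string, std::string> &directiveMacros() {
  static const std::map<std::string, std::string> table = {
      {"%header", "%insert(\"header\")"},
      {"%wrapper", "%insert(\"wrapper\")"},
      {"%init", "%insert(\"init\")"},
      {"%immutable", "%feature(\"immutable\",\"1\")"},
      {"%mutable", "%feature(\"immutable\")"},
      {"%ignore", "%rename($ignore)"},
  };
  return table;
}

// Returns the primitive form of a directive, or the input unchanged when the
// name is not in the table (hypothetical helper, not part of SWIG).
inline std::string expandDirective(const std::string &directive) {
  const std::map<std::string, std::string> &table = directiveMacros();
  const auto it = table.find(directive);
  return it == table.end() ? directive : it->second;
}
```

<p>
With this table, every convenience directive ends up as either <tt>%insert</tt>, <tt>%feature</tt>, or
<tt>%rename</tt>, which is the small set of primitives the rest of the system has to understand.
</p>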
<p>
The <em><tt>langconfig.swg</tt></em> file is supplied by the target
language. This file contains language-specific configuration
information. More often than not, this file provides run-time wrapper
support code (e.g., the type-checker) as well as a collection of
typemaps that define the default wrapping behavior. Note: the name of this
file depends on the target language and is usually something like <tt>python.swg</tt>
or <tt>perl5.swg</tt>.
</p>
<p>
As a debugging aid, the text that SWIG feeds to its C++ parser can be
obtained by running <tt>swig -E interface.i</tt>. This output
probably isn't too useful in general, but it will show how macros have
been expanded as well as everything else that goes into the low-level
construction of the wrapper code.
</p>
<H3><a name="Extending_nn7"></a>38.4.2 Parsing</H3>
<p>
The current C++ parser handles a subset of C++. Most incompatibilities with C are due to
subtle aspects of how SWIG parses declarations. Specifically, SWIG expects all C/C++ declarations to follow this general form:
</p>
<div class="diagram">
<pre>
<em>storage</em> <em>type</em> <em>declarator</em> <em>initializer</em>;
</pre>
</div>
<p>
<tt><em>storage</em></tt> is a keyword such as <tt>extern</tt>,
<tt>static</tt>, <tt>typedef</tt>, or <tt>virtual</tt>. <tt><em>type</em></tt> is a primitive
datatype such as <tt>int</tt> or <tt>void</tt>. <tt><em>type</em></tt> may be optionally
qualified with a qualifier such as <tt>const</tt> or <tt>volatile</tt>. <tt><em>declarator</em></tt>
is a name with additional type-construction modifiers attached to it (pointers, arrays, references,
functions, etc.). Examples of declarators include <tt>*x</tt>, <tt>**x</tt>, <tt>x[20]</tt>, and
<tt>(*x)(int,double)</tt>. The <tt><em>initializer</em></tt> may be a value assigned using <tt>=</tt> or
a body of code enclosed in braces <tt>{ ... }</tt>.
</p>
<p>
This declaration format covers most common C++ declarations. However, the C++ standard
is somewhat more flexible in the placement of the parts. For example, it is technically legal, although
uncommon, to write something like <tt>int typedef const a</tt> in your program. SWIG simply
doesn't bother to deal with this case.
</p>
<p>
The other significant difference between C++ and SWIG is in the
treatment of typenames. In C++, if you have a declaration like this,
</p>
<div class="code">
<pre>
int blah(Foo *x, Bar *y);
</pre>
</div>
<p>
it won't parse correctly unless <tt>Foo</tt> and <tt>Bar</tt> have
been previously defined as types either using a <tt>class</tt>
definition or a <tt>typedef</tt>. The reasons for this are subtle,
but this treatment of typenames is normally integrated at the level of the C
tokenizer---when a typename appears, a different token is returned to the parser
instead of an identifier.
</p>
<p>
SWIG does not operate in this manner--any legal identifier can be used
as a type name. The reason for this is primarily motivated by the use
of SWIG with partially defined data. Specifically,
SWIG is supposed to be easy to use on interfaces with missing type information.
</p>
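<p>
The contrast between the two treatments of typenames can be sketched as follows. This is a simplified model,
not SWIG's tokenizer: the function names are hypothetical, the "strict" rule stands in for a conventional
front end that consults a table of previously declared types, and the "lenient" rule stands in for the
behavior described above, where any legal identifier is accepted in a type position.
</p>

```cpp
#include <cctype>
#include <set>
#include <string>

// True when `word` is spelled like a C identifier. Keywords are not excluded
// in this toy version.
inline bool isLegalIdentifier(const std::string &word) {
  if (word.empty()) return false;
  const unsigned char first = static_cast<unsigned char>(word[0]);
  if (!(std::isalpha(first) || first == '_')) return false;
  for (const char c : word) {
    const unsigned char u = static_cast<unsigned char>(c);
    if (!(std::isalnum(u) || u == '_')) return false;
  }
  return true;
}

// Conventional rule: a name is a type only if a class definition or typedef
// has already been seen for it.
inline bool strictAcceptsAsType(const std::string &word,
                                const std::set<std::string> &declaredTypes) {
  return declaredTypes.count(word) > 0;
}

// Lenient rule modelled on the text above: no prior declaration is required.
inline bool lenientAcceptsAsType(const std::string &word) {
  return isLegalIdentifier(word);
}
```

<p>
Under the lenient rule, <tt>int blah(Foo *x, Bar *y);</tt> is acceptable even when <tt>Foo</tt> and
<tt>Bar</tt> have never been declared, which is what makes interfaces with missing type information usable.
</p>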
<p>
Because of the different treatment of typenames, the most serious
limitation of the SWIG parser is that it can't process type declarations where
an extra (and unnecessary) grouping operator is used. For example:
</p>
<div class="code">
<pre>
int (x); /* A variable x */
int (y)(int); /* A function y */
</pre>
</div>
<p>
The placing of extra parentheses in type declarations like this is
already recognized by the C++ community as a potential source of
strange programming errors. For example, Scott Meyers' "Effective STL"
discusses this problem in a section on avoiding C++'s "most vexing
parse."
</p>
<p>
The parser is also unable to handle declarations with no return type or bare argument names.
For example, in an old C program, you might see things like this:
</p>
<div class="code">
<pre>
foo(a,b) {
...
}
</pre>
</div>
<p>
In this case, the return type as well as the types of the arguments
are taken by the C compiler to be an <tt>int</tt>. However, SWIG
interprets the above code as an abstract declarator for a function
returning a <tt>foo</tt> and taking types <tt>a</tt> and <tt>b</tt> as
arguments.
</p>
<H3><a name="Extending_nn8"></a>38.4.3 Parse Trees</H3>
<p>
The SWIG parser produces a complete parse tree of the input file before any wrapper code
is actually generated. Each item in the tree is known as a "Node". Each node is identified
by a symbolic tag. Furthermore, a node may have an arbitrary number of children.
The parse tree structure and tag names of an interface can be displayed using <tt>swig -debug-tags</tt>.
For example:
</p>
<div class="shell">
<pre>
$ <b>swig -c++ -python -debug-tags example.i</b>
. top (example.i:1)
. top . include (example.i:1)
. top . include . typemap (/r0/beazley/Projects/lib/swig1.3/swig.swg:71)
. top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/swig.swg:71)
. top . include . typemap (/r0/beazley/Projects/lib/swig1.3/swig.swg:83)
. top . include . typemap . typemapitem (/r0/beazley/Projects/lib/swig1.3/swig.swg:83)
. top . include (example.i:4)
. top . include . insert (/r0/beazley/Projects/lib/swig1.3/python/python.swg:7)
. top . include . insert (/r0/beazley/Projects/lib/swig1.3/python/python.swg:8)
. top . include . typemap (/r0/beazley/Projects/lib/swig1.3/python/python.swg:19)
...
. top . include (example.i:6)
. top . include . module (example.i:2)
. top . include . insert (example.i:6)
. top . include . include (example.i:9)
. top . include . include . class (example.h:3)
. top . include . include . class . access (example.h:4)
. top . include . include . class . constructor (example.h:7)
. top . include . include . class . destructor (example.h:10)
. top . include . include . class . cdecl (example.h:11)
. top . include . include . class . cdecl (example.h:11)
. top . include . include . class . cdecl (example.h:12)
. top . include . include . class . cdecl (example.h:13)
. top . include . include . class . cdecl (example.h:14)
. top . include . include . class . cdecl (example.h:15)
. top . include . include . class (example.h:18)
. top . include . include . class . access (example.h:19)
. top . include . include . class . cdecl (example.h:20)
. top . include . include . class . access (example.h:21)
. top . include . include . class . constructor (example.h:22)
. top . include . include . class . cdecl (example.h:23)
. top . include . include . class . cdecl (example.h:24)
. top . include . include . class (example.h:27)
. top . include . include . class . access (example.h:28)
. top . include . include . class . cdecl (example.h:29)
. top . include . include . class . access (example.h:30)
. top . include . include . class . constructor (example.h:31)
. top . include . include . class . cdecl (example.h:32)
. top . include . include . class . cdecl (example.h:33)
</pre>
</div>
<p>
Even for the simplest interface, the parse tree structure is larger than you might expect. For example, in the
above output, a substantial number of nodes are actually generated by the <tt>python.swg</tt> configuration file
which defines typemaps and other directives. The contents of the user-supplied input file don't appear until the end
of the output.
</p>
<p>
The contents of each parse tree node consist of a collection of attribute/value
pairs. Internally, the nodes are simply represented by hash tables. A display of
the entire parse-tree structure can be obtained using <tt>swig -debug-top &lt;n&gt;</tt>, where <tt>n</tt> is
the stage being processed.
There are a number of other parse tree display options. For example, <tt>swig -debug-module &lt;n&gt;</tt> will
avoid displaying system parse information and only display the parse tree pertaining to the user's module at
stage <tt>n</tt> of processing.
</p>
<div class="shell">
<pre>
$ swig -c++ -python -debug-module 4 example.i
+++ include ----------------------------------------
| name - "example.i"
+++ module ----------------------------------------
| name - "example"
|
+++ insert ----------------------------------------
| code - "\n#include \"example.h\"\n"
|
+++ include ----------------------------------------
| name - "example.h"
+++ class ----------------------------------------
| abstract - "1"
| sym:name - "Shape"
| name - "Shape"
| kind - "class"
| symtab - 0x40194140
| sym:symtab - 0x40191078
+++ access ----------------------------------------
| kind - "public"
|
+++ constructor ----------------------------------------
| sym:name - "Shape"
| name - "Shape"
| decl - "f()."
| code - "{\n nshapes++;\n }"
| sym:symtab - 0x40194140
|
+++ destructor ----------------------------------------
| sym:name - "~Shape"
| name - "~Shape"
| storage - "virtual"
| code - "{\n nshapes--;\n }"
| sym:symtab - 0x40194140
|
+++ cdecl ----------------------------------------
| sym:name - "x"
| name - "x"
| decl - ""
| type - "double"
| sym:symtab - 0x40194140
|
+++ cdecl ----------------------------------------
| sym:name - "y"
| name - "y"
| decl - ""
| type - "double"
| sym:symtab - 0x40194140
|
+++ cdecl ----------------------------------------
| sym:name - "move"
| name - "move"
| decl - "f(double,double)."
| parms - double ,double
| type - "void"
| sym:symtab - 0x40194140
|
+++ cdecl ----------------------------------------
| sym:name - "area"
| name - "area"
| decl - "f(void)."
| parms - void
| storage - "virtual"
| value - "0"
| type - "double"
| sym:symtab - 0x40194140
|
+++ cdecl ----------------------------------------
| sym:name - "perimeter"
| name - "perimeter"
| decl - "f(void)."
| parms - void
| storage - "virtual"
| value - "0"
| type - "double"
| sym:symtab - 0x40194140
|
+++ cdecl ----------------------------------------
| sym:name - "nshapes"
| name - "nshapes"
| decl - ""
| storage - "static"
| type - "int"
| sym:symtab - 0x40194140
|
+++ class ----------------------------------------
| sym:name - "Circle"
| name - "Circle"
| kind - "class"
| bases - 0x40194510
| symtab - 0x40194538
| sym:symtab - 0x40191078
+++ access ----------------------------------------
| kind - "private"
|
+++ cdecl ----------------------------------------
| name - "radius"
| decl - ""
| type - "double"
|
+++ access ----------------------------------------
| kind - "public"
|
+++ constructor ----------------------------------------
| sym:name - "Circle"
| name - "Circle"
| parms - double
| decl - "f(double)."
| code - "{ }"
| sym:symtab - 0x40194538
|
+++ cdecl ----------------------------------------
| sym:name - "area"
| name - "area"
| decl - "f(void)."
| parms - void
| storage - "virtual"
| type - "double"
| sym:symtab - 0x40194538
|
+++ cdecl ----------------------------------------
| sym:name - "perimeter"
| name - "perimeter"
| decl - "f(void)."
| parms - void
| storage - "virtual"
| type - "double"
| sym:symtab - 0x40194538
|
+++ class ----------------------------------------
| sym:name - "Square"
| name - "Square"
| kind - "class"
| bases - 0x40194760
| symtab - 0x40194788
| sym:symtab - 0x40191078
+++ access ----------------------------------------
| kind - "private"
|
+++ cdecl ----------------------------------------
| name - "width"
| decl - ""
| type - "double"
|
+++ access ----------------------------------------
| kind - "public"
|
+++ constructor ----------------------------------------
| sym:name - "Square"
| name - "Square"
| parms - double
| decl - "f(double)."
| code - "{ }"
| sym:symtab - 0x40194788
|
+++ cdecl ----------------------------------------
| sym:name - "area"
| name - "area"
| decl - "f(void)."
| parms - void
| storage - "virtual"
| type - "double"
| sym:symtab - 0x40194788
|
+++ cdecl ----------------------------------------
| sym:name - "perimeter"
| name - "perimeter"
| decl - "f(void)."
| parms - void
| storage - "virtual"
| type - "double"
| sym:symtab - 0x40194788
</pre>
</div>
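<p>
To make the "node is a tagged hash table with children" picture concrete, here is a minimal standalone model.
It assumes a hypothetical <tt>ToyNode</tt> struct built from standard containers (SWIG's own <tt>Node</tt>
is a <tt>Hash</tt> object, not this struct) and produces tag paths in the style of the <tt>-debug-tags</tt>
listing, without the file and line information.
</p>

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a parse tree node: a symbolic tag, a table of
// attribute/value pairs, and any number of children.
struct ToyNode {
  std::string tag;
  std::map<std::string, std::string> attrs;
  std::vector<ToyNode> children;
};

// Depth-first walk that records one ". top . include . class" style path
// per node.
inline void collectTagPaths(const ToyNode &node, const std::string &parentPath,
                            std::vector<std::string> &lines) {
  const std::string path =
      parentPath.empty() ? ". " + node.tag : parentPath + " . " + node.tag;
  lines.push_back(path);
  for (const ToyNode &child : node.children) {
    collectTagPaths(child, path, lines);
  }
}

inline std::vector<std::string> tagPaths(const ToyNode &root) {
  std::vector<std::string> lines;
  collectTagPaths(root, "", lines);
  return lines;
}
```

<p>
Because every node has the same shape, a single recursive function is enough to traverse the whole tree;
only the tag and the attribute keys distinguish a class from a declaration.
</p>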
<H3><a name="Extending_nn9"></a>38.4.4 Attribute namespaces</H3>
<p>
Attributes of parse tree nodes are often prepended with a namespace qualifier.
For example, the attributes
<tt>sym:name</tt> and <tt>sym:symtab</tt> are attributes related to
symbol table management and are prefixed with <tt>sym:</tt>. As a
general rule, only those attributes which are directly related to the raw declaration
appear without a prefix (type, name, declarator, etc.).
</p>
<p>
Target language modules may add additional attributes to nodes to assist the generation
of wrapper code. The convention for doing this is to place these attributes in a namespace
that matches the name of the target language. For example, <tt>python:foo</tt> or
<tt>perl:foo</tt>.
</p>
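<p>
As a small illustration of the naming convention, the following helper splits an attribute key into its
namespace and local name. The function <tt>splitAttribute</tt> is hypothetical and is not part of SWIG's API;
it only demonstrates how keys such as <tt>sym:name</tt> or <tt>python:foo</tt> are structured.
</p>

```cpp
#include <string>
#include <utility>

// Splits "sym:name" into {"sym", "name"}. Keys without a prefix, such as
// "type", yield an empty namespace. Hypothetical helper for illustration.
inline std::pair<std::string, std::string> splitAttribute(const std::string &key) {
  const std::string::size_type colon = key.find(':');
  if (colon == std::string::npos) {
    return {"", key};
  }
  return {key.substr(0, colon), key.substr(colon + 1)};
}
```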
<H3><a name="Extending_nn10"></a>38.4.5 Symbol Tables</H3>
<p>
During parsing, all symbols are managed in the space of the target
language. The <tt>sym:name</tt> attribute of each node contains the symbol name
selected by the parser. Normally, <tt>sym:name</tt> and <tt>name</tt>
are the same. However, the <tt>%rename</tt> directive can be used to
change the value of <tt>sym:name</tt>. You can see the effect of
<tt>%rename</tt> by trying it on a simple interface and dumping the
parse tree. For example:
</p>
<div class="code">
<pre>
%rename(foo_i) foo(int);
%rename(foo_d) foo(double);
void foo(int);
void foo(double);
void foo(Bar *b);
</pre>
</div>
<p>
There are various <tt>-debug-</tt> options that can be useful for debugging and analysing the parse tree.
For example, the <tt>-debug-top &lt;n&gt;</tt> or <tt>-debug-module &lt;n&gt;</tt> options will
dump the entire parse tree or just the module subtree at one of the four <tt>n</tt> stages of processing.
The parse tree can be viewed after the final stage of processing by running SWIG as follows:
</p>
<div class="shell">
<pre>
$ swig -debug-top 4 example.i
...
+++ cdecl ----------------------------------------
| sym:name - "foo_i"
| name - "foo"
| decl - "f(int)."
| parms - int
| type - "void"
| sym:symtab - 0x40165078
|
+++ cdecl ----------------------------------------
| sym:name - "foo_d"
| name - "foo"
| decl - "f(double)."
| parms - double
| type - "void"
| sym:symtab - 0x40165078
|
+++ cdecl ----------------------------------------
| sym:name - "foo"
| name - "foo"
| decl - "f(p.Bar)."
| parms - Bar *
| type - "void"
| sym:symtab - 0x40165078
</pre>
</div>
<p>
All symbol-related conflicts and complaints about overloading are based on <tt>sym:name</tt> values.
For instance, the following example uses <tt>%rename</tt> in reverse to generate a name clash.
</p>
<div class="code">
<pre>
%rename(foo) foo_i(int);
%rename(foo) foo_d(double);
void foo_i(int);
void foo_d(double);
void foo(Bar *b);
</pre>
</div>
<p>
When you run SWIG on this you now get:
</p>
<div class="shell">
<pre>
$ ./swig example.i
example.i:6. Overloaded declaration ignored. foo_d(double )
example.i:5. Previous declaration is foo_i(int )
example.i:7. Overloaded declaration ignored. foo(Bar *)
example.i:5. Previous declaration is foo_i(int )
</pre>
</div>
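<p>
The rule that conflicts are decided on <tt>sym:name</tt> rather than <tt>name</tt> can be sketched with a
small table keyed by the target-language symbol. This is a simplified model under stated assumptions:
<tt>SymDecl</tt> and <tt>findIgnored</tt> are hypothetical names, the first declaration to claim a symbol
always wins, and the handling of genuinely overloaded functions is left out.
</p>

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical record holding only the two attributes that matter here.
struct SymDecl {
  std::string name;     // name as written in the C/C++ declaration
  std::string symName;  // sym:name chosen for the target language
};

// Returns one message per declaration whose sym:name was already taken.
inline std::vector<std::string> findIgnored(const std::vector<SymDecl> &decls) {
  std::map<std::string, const SymDecl *> table;  // sym:name -> first owner
  std::vector<std::string> messages;
  for (const SymDecl &d : decls) {
    const auto result = table.emplace(d.symName, &d);
    if (!result.second) {
      messages.push_back(d.name + " ignored, previous declaration is " +
                         result.first->second->name);
    }
  }
  return messages;
}
```

<p>
Feeding it the three declarations from the reversed <tt>%rename</tt> example (all with <tt>sym:name</tt>
<tt>foo</tt>) reports two ignored declarations, while the earlier example that renames to <tt>foo_i</tt>
and <tt>foo_d</tt> reports none.
</p>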
<H3><a name="Extending_nn11"></a>38.4.6 The %feature directive</H3>
<p>
A number of SWIG directives such as <tt>%exception</tt> are implemented using the
low-level <tt>%feature</tt> directive. For example:
</p>
<div class="code">
<pre>
%feature("except") getitem(int) {
try {
$action
} catch (badindex) {
...
}
}
...
class Foo {
public:
Object *getitem(int index) throw(badindex);
...
};
</pre>
</div>
<p>
The behavior of <tt>%feature</tt> is very easy to describe--it simply
attaches a new attribute to any parse tree node that matches the
given prototype. When a feature is added, it shows up as an attribute in the <tt>feature:</tt> namespace.
You can see this when running with the <tt>-debug-top 4</tt> option. For example:
</p>
<div class="shell">
<pre>
+++ cdecl ----------------------------------------
| sym:name - "getitem"
| name - "getitem"
| decl - "f(int).p."
| parms - int
| type - "Object"
| feature:except - "{\n try {\n $action\n } catc..."
| sym:symtab - 0x40168ac8
|
</pre>
</div>
<p>
Feature names are completely arbitrary and a target language module can be
programmed to respond to any feature name that it wants to recognize. The
data stored in a feature attribute is usually just a raw unparsed string.
For example, the exception code above is simply
stored without any modifications.
</p>
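<p>
A rough model of what attaching a feature amounts to is shown below. It assumes a hypothetical
<tt>FeatureDecl</tt> record and matches on the declaration name plus a plain parameter string; SWIG's real
matching works on its encoded types and is more flexible than this exact comparison.
</p>

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical declaration record: a name, a parameter list, and attributes.
struct FeatureDecl {
  std::string name;
  std::string parms;
  std::map<std::string, std::string> attrs;
};

// Stores `value` under "feature:<featureName>" on every declaration whose
// name and parameter list match. The value is kept as a raw, unparsed string.
// Returns the number of declarations that were annotated.
inline int applyFeature(std::vector<FeatureDecl> &decls,
                        const std::string &featureName,
                        const std::string &name, const std::string &parms,
                        const std::string &value) {
  int matched = 0;
  for (FeatureDecl &d : decls) {
    if (d.name == name && d.parms == parms) {
      d.attrs["feature:" + featureName] = value;
      ++matched;
    }
  }
  return matched;
}
```

<p>
A language module would then only need to check whether the attribute it cares about (for example
<tt>feature:except</tt>) is present on the node it is wrapping.
</p>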
<H3><a name="Extending_nn12"></a>38.4.7 Code Generation</H3>
<p>
Language modules work by defining handler functions that know how to respond to
different types of parse-tree nodes. These handlers simply look at the
attributes of each node in order to produce low-level code.
</p>
<p>
In reality, the generation of code is somewhat more subtle than simply
invoking handler functions. This is because parse-tree nodes might be
transformed. For example, suppose you are wrapping a class like this:
</p>
<div class="code">
<pre>
class Foo {
public:
virtual int *bar(int x);
};
</pre>
</div>
<p>
When the parser constructs a node for the member <tt>bar</tt>, it creates a raw "cdecl" node with the following
attributes:
</p>
<div class="diagram">
<pre>
nodeType : cdecl
name : bar
type : int
decl : f(int).p
parms : int x
storage : virtual
sym:name : bar
</pre>
</div>
<p>
To produce wrapper code, this "cdecl" node undergoes a number of transformations. First, the node is recognized as a function declaration. This adjusts some of the type information--specifically, the declarator is joined with the base datatype to produce this:
</p>
<div class="diagram">
<pre>
nodeType : cdecl
name : bar
type : p.int <-- Notice change in return type
decl : f(int).p
parms : int x
storage : virtual
sym:name : bar
</pre>
</div>
<p>
Next, the context of the node indicates that the node is really a
member function. This produces a transformation to a low-level
accessor function like this:
</p>
<div class="diagram">
<pre>
nodeType : cdecl
name : bar
type : p.int
decl : f(int).p
parms : Foo *self, int x <-- Added parameter
storage : virtual
wrap:action : result = (arg1)->bar(arg2) <-- Action code added
sym:name : Foo_bar <-- Symbol name changed
</pre>
</div>
<p>
In this transformation, notice how an additional parameter was added
to the parameter list and how the symbol name of the node has suddenly
changed into an accessor using the naming scheme described in the
"SWIG Basics" chapter. A small fragment of "action" code has also
been generated--notice how the <tt>wrap:action</tt> attribute defines
the access to the underlying method. The data in this transformed
node is then used to generate a wrapper.
</p>
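<p>
The two transformations described above can be imitated on a plain attribute map. This is a sketch under
several assumptions: the node is modelled as a <tt>std::map</tt>, the function name
<tt>toMemberAccessor</tt> is hypothetical, a trailing <tt>.p</tt> on the declarator is the only return
modifier handled, and arguments are counted by commas (so a parameter list of <tt>void</tt> is not
special-cased).
</p>

```cpp
#include <algorithm>
#include <map>
#include <string>

using Attrs = std::map<std::string, std::string>;

// Toy version of the cdecl -> member accessor transformation.
inline Attrs toMemberAccessor(Attrs node, const std::string &className) {
  // Step 1: a trailing ".p" on the declarator means the function returns a
  // pointer, so fold it into the return type (int becomes p.int).
  const std::string decl = node["decl"];
  const std::string suffix = ".p";
  const bool returnsPointer =
      decl.size() >= suffix.size() &&
      decl.compare(decl.size() - suffix.size(), suffix.size(), suffix) == 0;
  if (returnsPointer) {
    node["type"] = "p." + node["type"];
  }

  // Step 2: build the call through the object pointer. arg1 is the object,
  // the original parameters become arg2, arg3, ...
  const std::string parms = node["parms"];
  std::string callArgs;
  if (!parms.empty()) {
    const int count =
        1 + static_cast<int>(std::count(parms.begin(), parms.end(), ','));
    for (int i = 0; i < count; ++i) {
      if (i > 0) callArgs += ",";
      callArgs += "arg" + std::to_string(i + 2);
    }
  }
  node["wrap:action"] =
      "result = (arg1)->" + node["name"] + "(" + callArgs + ")";

  // Step 3: prepend the object parameter and switch to the accessor name.
  node["parms"] = parms.empty() ? className + " *self"
                                : className + " *self, " + parms;
  node["sym:name"] = className + "_" + node["sym:name"];
  return node;
}
```

<p>
Applied to the <tt>bar</tt> node with class name <tt>Foo</tt>, this yields <tt>p.int</tt> as the type,
<tt>Foo *self, int x</tt> as the parameters, and <tt>Foo_bar</tt> as the symbol name, while <tt>name</tt>,
<tt>decl</tt>, and <tt>storage</tt> are left untouched.
</p>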
<p>
Language modules work by registering handler functions for dealing with
various types of nodes at different stages of transformation. This is done by
inheriting from a special <tt>Language</tt> class and defining a collection
of virtual methods. For example, the Python module defines a class as
follows:
</p>
<div class="code">
<pre>
class PYTHON : public Language {
protected:
public :
virtual void main(int, char *argv[]);
virtual int top(Node *);
virtual int functionWrapper(Node *);
virtual int constantWrapper(Node *);
virtual int variableWrapper(Node *);
virtual int nativeWrapper(Node *);
virtual int membervariableHandler(Node *);
virtual int memberconstantHandler(Node *);
virtual int memberfunctionHandler(Node *);
virtual int constructorHandler(Node *);
virtual int destructorHandler(Node *);
virtual int classHandler(Node *);
virtual int classforwardDeclaration(Node *);
virtual int insertDirective(Node *);
virtual int importDirective(Node *);
};
</pre>
</div>
<p>
The role of these functions is described shortly.
</p>
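<p>
The handler mechanism can be pictured with a reduced, standalone class hierarchy. Everything here is
hypothetical (<tt>TaggedNode</tt>, <tt>MiniLanguage</tt>, <tt>EchoLanguage</tt>, and the <tt>emit</tt>
dispatcher); it borrows two handler names from the listing above but does not reproduce how SWIG's
<tt>Language</tt> class routes nodes, which involves more intermediate steps.
</p>

```cpp
#include <string>
#include <vector>

// Hypothetical minimal node: just a tag and a name.
struct TaggedNode {
  std::string tag;
  std::string name;
};

// Base class with default handlers that do nothing. A concrete language
// overrides only the handlers it cares about.
class MiniLanguage {
public:
  virtual ~MiniLanguage() = default;
  virtual int functionWrapper(const TaggedNode &) { return 0; }
  virtual int classHandler(const TaggedNode &) { return 0; }

  // Routes a node to a handler based on its tag. Unknown tags are skipped.
  int emit(const TaggedNode &n) {
    if (n.tag == "cdecl") return functionWrapper(n);
    if (n.tag == "class") return classHandler(n);
    return 0;
  }
};

// Example "language" that records what it would generate.
class EchoLanguage : public MiniLanguage {
public:
  std::vector<std::string> log;

  int functionWrapper(const TaggedNode &n) override {
    log.push_back("wrap function " + n.name);
    return 1;
  }
  int classHandler(const TaggedNode &n) override {
    log.push_back("wrap class " + n.name);
    return 1;
  }
};
```

<p>
The design choice this illustrates is that the tree traversal lives in the base class, so a new target
language only supplies the code that differs from the defaults.
</p>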
<H3><a name="Extending_nn13"></a>38.4.8 SWIG and XML</H3>
<p>
Much of SWIG's current parser design was originally motivated by
interest in using XML to represent SWIG parse trees. Although XML is
not currently used in any direct manner, the parse tree structure, use
of node tags, attributes, and attribute namespaces are all influenced
by aspects of XML parsing. Therefore, in trying to understand SWIG's
internal data structures, it may be useful to keep XML in the back of
your mind as a model.
</p>
<H2><a name="Extending_nn14"></a>38.5 Primitive Data Structures</H2>
<p>
Most of SWIG is constructed using three basic data structures:
strings, hashes, and lists. These data structures are dynamic in the same way as
similar structures found in many scripting languages. For instance,
you can have containers (lists and hash tables) of mixed types and
certain operations are polymorphic.
</p>
<p>
This section briefly describes the basic structures so that later
sections of this chapter make more sense.
</p>
<p>
When describing the low-level API, the following type name conventions are
used:
</p>
<ul>
<li><tt>String</tt>. A string object.
<li><tt>Hash</tt>. A hash object.
<li><tt>List</tt>. A list object.
<li><tt>String_or_char</tt>. A string object or a <tt>char *</tt>.
<li><tt>Object_or_char</tt>. An object or a <tt>char *</tt>.
<li><tt>Object</tt>. Any object (string, hash, list, etc.)
</ul>
<p>
In most cases, other typenames in the source are aliases for one of these
primitive types. Specifically:
</p>
<div class="code">
<pre>
typedef String SwigType;
typedef Hash Parm;
typedef Hash ParmList;
typedef Hash Node;
typedef Hash Symtab;
typedef Hash Typetab;
</pre>
</div>
<H3><a name="Extending_nn15"></a>38.5.1 Strings</H3>
<p>
<b><tt>String *NewString(const String_or_char *val)</tt></b>
</p>
<div class="indent">
Creates a new string with initial value <tt>val</tt>. <tt>val</tt> may
be a <tt>char *</tt> or another <tt>String</tt> object. If you want
to create an empty string, use "" for val.