@@ -10,16 +10,204 @@ Data Serialization
What is data serialization?
***************************

- Data serialization is the concept of converting structured data into a format
+ Data serialization is the process of converting structured data into a format
that allows it to be shared or stored in such a way that its original
- structure to be recovered. In some cases, the secondary intention of data
+ structure can be recovered or reconstructed. In some cases, the secondary intention of data
serialization is to minimize the size of the serialized data which then
minimizes disk space or bandwidth requirements.
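As a rough illustration of the size point above, the sketch below serializes one record twice with the standard library's `json` module, once pretty-printed and once with compact separators, and compares the byte counts. The `record` dictionary is invented for the example and is not part of the guide.

```python
import json

# Hypothetical record, made up for this illustration
record = {"Type": "A", "field1": "value1", "field2": "value2", "field3": "value3"}

# Human-readable form: indentation and spaces cost extra bytes
pretty = json.dumps(record, indent=4)

# Compact form: no whitespace after the item and key separators
compact = json.dumps(record, separators=(",", ":"))

pretty_size = len(pretty.encode("utf-8"))
compact_size = len(compact.encode("utf-8"))

# Both forms decode back to the same structure
same_structure = json.loads(pretty) == json.loads(compact) == record

print(pretty_size, compact_size, same_structure)
```

The structure survives either way; only the storage or bandwidth cost differs.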

+ ********************
+ Flat vs. Nested data
+ ********************

- ******
- Pickle
- ******
+ Before beginning to serialize data, it is important to identify or decide how the
+ data should be structured during serialization: flat or nested.
+ The differences between the two styles are shown in the examples below.
+
+ Flat style:
+
+ .. code-block:: python
+
+     {"Type": "A", "field1": "value1", "field2": "value2", "field3": "value3"}
+
+
+ Nested style:
+
+ .. code-block:: python
+
+     {"A":
+         {"field1": "value1", "field2": "value2", "field3": "value3"}}
+
+
+ For more reading on the two styles, please see the discussions on the
+ `Python mailing list <https://mail.python.org/pipermail/python-list/2010-October/590762.html>`__,
+ the `IETF mailing list <https://www.ietf.org/mail-archive/web/json/current/msg03739.html>`__, and
+ `Stack Exchange <https://softwareengineering.stackexchange.com/questions/350623/flat-or-nested-json-for-hierarchal-data>`__.
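To make the relationship between the two styles concrete, here is a minimal sketch that converts the flat record shown earlier into its nested form and back. The helper names `flat_to_nested` and `nested_to_flat` are hypothetical, and the sketch assumes the flat record stores its type under the key "Type".

```python
def flat_to_nested(flat, type_key="Type"):
    """Lift the value stored under type_key to the outer level."""
    fields = {k: v for k, v in flat.items() if k != type_key}
    return {flat[type_key]: fields}


def nested_to_flat(nested, type_key="Type"):
    """Inverse of flat_to_nested for a nested dict with exactly one entry."""
    type_value, fields = next(iter(nested.items()))
    return {type_key: type_value, **fields}


flat = {"Type": "A", "field1": "value1", "field2": "value2", "field3": "value3"}

nested = flat_to_nested(flat)
restored = nested_to_flat(nested)

print(nested)
print(restored == flat)
```

Either shape carries the same information; the choice mainly affects how consumers look records up.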
+
+ ****************
+ Serializing Text
+ ****************
+
+ =======================
+ Simple file (flat data)
+ =======================
+
+ If the data to be serialized is located in a file and contains flat data, Python offers two built-in tools: repr to write it out and ast.literal_eval to read it back.
+
+ repr
+ ----
+
+ The built-in repr function takes a single object and returns a printable representation of it:
+
+ .. code-block:: python
+
+     # input as flat text
+     a = {"Type": "A", "field1": "value1", "field2": "value2", "field3": "value3"}
+
+     # returns a printable representation of the input;
+     # the output can be written to a file as well
+     print(repr(a))
+
+     # write content to a file using repr (the file must be opened for writing)
+     with open('/tmp/file.py', 'w') as f:
+         f.write(repr(a))
+
+     # the same input can also be read back from the file as text
+     with open('/tmp/file.py', 'r') as f:
+         text = f.read()
+
+
+ ast.literal_eval
+ ----------------
+
+ The literal_eval function safely parses and evaluates a string containing a Python literal.
+ Supported data types are: strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and None.
+
+ .. code-block:: python
+
+     import ast
+
+     with open('/tmp/file.py', 'r') as f:
+         inp = ast.literal_eval(f.read())
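Put together, repr and ast.literal_eval form a round trip. The sketch below writes a dictionary to a temporary file and reads it back, then shows that literal_eval refuses input that is not a plain literal. The sample `data` and the use of a temporary path (instead of /tmp/file.py) are assumptions made so the example is self-contained.

```python
import ast
import os
import tempfile

# Hypothetical sample data built only from literal-friendly types
data = {"Type": "A", "field1": "value1", "count": 3, "tags": ["x", "y"], "parent": None}

# Use a temporary path so the sketch does not depend on an existing file
fd, path = tempfile.mkstemp(suffix=".py")
os.close(fd)
try:
    # Serialize: write the printable representation
    with open(path, "w") as f:
        f.write(repr(data))

    # Deserialize: parse the text back into Python objects
    with open(path, "r") as f:
        restored = ast.literal_eval(f.read())
finally:
    os.remove(path)

# Anything beyond a literal (here, a function call) is rejected
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True

print(restored == data, rejected)
```

This only works for objects whose repr is itself a valid literal; arbitrary class instances generally do not round-trip this way.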
+
+ ====================
+ CSV file (flat data)
+ ====================
+
+ The csv module in Python implements classes to read and write tabular
+ data in CSV format.
+
+ Simple example for reading:
+
+ .. code-block:: python
+
+     # Reading CSV content from a file
+     import csv
+     with open('/tmp/file.csv', newline='') as f:
+         reader = csv.reader(f)
+         for row in reader:
+             print(row)
+
+ Simple example for writing:
+
+ .. code-block:: python
+
+     # Writing CSV content to a file
+     import csv
+     # iterable: any iterable of rows, each row a sequence of field values
+     with open('/tmp/file.csv', 'w', newline='') as f:
+         writer = csv.writer(f)
+         writer.writerows(iterable)
+
+
+ The module's contents, functions, and examples can be found
+ `in the Python documentation <https://docs.python.org/3/library/csv.html>`__.
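For flat records stored as dictionaries, the same module's DictWriter and DictReader classes keep the field names in a header row. The following is a minimal sketch that uses an in-memory buffer rather than a file on disk; the `rows` data is invented for the example.

```python
import csv
import io

# Hypothetical flat records; every value is a string, as CSV has no types
rows = [
    {"Type": "A", "field1": "value1", "field2": "value2"},
    {"Type": "B", "field1": "value3", "field2": "value4"},
]

# In-memory text buffer standing in for a file opened with newline=''
buffer = io.StringIO(newline="")

# Write a header row followed by one line per record
writer = csv.DictWriter(buffer, fieldnames=["Type", "field1", "field2"])
writer.writeheader()
writer.writerows(rows)

# Rewind and read the records back, keyed by the header row
buffer.seek(0)
restored = [dict(row) for row in csv.DictReader(buffer)]

print(restored)
```

Note that CSV stores text only: numbers written this way come back as strings and need explicit conversion.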
+
+ ==================
+ YAML (nested data)
+ ==================
+
+ There are many third party modules to parse and read/write YAML file
+ structures in Python. One such example is below.
+
+ .. code-block:: python
+
+     # Reading YAML content from a file using the safe_load method
+     # (recent PyYAML releases require a Loader argument for yaml.load)
+     import yaml
+     with open('/tmp/file.yaml', 'r', newline='') as f:
+         try:
+             print(yaml.safe_load(f))
+         except yaml.YAMLError as ymlexcp:
+             print(ymlexcp)
+
+ Documentation on the third party module can be found
+ `in the PyYAML Documentation <https://pyyaml.org/wiki/PyYAMLDocumentation>`__.
+
+ =======================
+ JSON file (nested data)
+ =======================
+
+ Python's json module can be used to read and write JSON files.
+ Example code is below.
+
+ Reading:
+
+ .. code-block:: python
+
+     # Reading JSON content from a file
+     import json
+     with open('/tmp/file.json', 'r') as f:
+         data = json.load(f)
+
+ Writing:
+
+ .. code-block:: python
+
+     # Writing JSON content to a file using the dump method
+     import json
+     with open('/tmp/file.json', 'w') as f:
+         json.dump(data, f, sort_keys=True)
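The string-based counterparts, json.dumps and json.loads, make it easy to check what survives a round trip. This sketch uses made-up nested data and also shows one caveat: JSON has no tuple type, so tuples come back as lists.

```python
import json

# Hypothetical nested record, invented for this illustration
nested = {"A": {"field1": "value1", "field2": [1, 2, 3], "field3": {"inner": None}}}

# Serialize to a string; sort_keys gives a stable key order
text = json.dumps(nested, sort_keys=True, indent=2)

# Deserialize back into dicts, lists, strings, numbers, and None
restored = json.loads(text)

# Caveat: a tuple is written as a JSON array and read back as a list
as_list = json.loads(json.dumps({"point": (1, 2)}))

print(restored == nested)
print(as_list)
```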
+
+ =================
+ XML (nested data)
+ =================
+
+ XML parsing in Python is possible using the `xml` package.
+
+ Example:
+
+ .. code-block:: python
+
+     # reading XML content from a file
+     import xml.etree.ElementTree as ET
+     tree = ET.parse('country_data.xml')
+     root = tree.getroot()
+
+ More documentation on using the `xml.dom` and `xml.sax` packages can be found
+ `in the Python XML library documentation <https://docs.python.org/3/library/xml.html>`__.
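Since the example above depends on a country_data.xml file being present, here is a self-contained sketch that builds a small document in memory with ElementTree, serializes it to a string, and parses it back. The element and attribute names (`record`, `Type`, `field1`, `field2`) are assumptions chosen to mirror the earlier flat and nested examples.

```python
import xml.etree.ElementTree as ET

# Build a small tree: one root element with an attribute and two children
root = ET.Element("record", attrib={"Type": "A"})
for name, value in [("field1", "value1"), ("field2", "value2")]:
    child = ET.SubElement(root, name)
    child.text = value

# Serialize the tree to a text string
xml_text = ET.tostring(root, encoding="unicode")

# Parse the string back and collect the child elements into a dict
parsed = ET.fromstring(xml_text)
fields = {child.tag: child.text for child in parsed}

print(xml_text)
print(parsed.attrib["Type"], fields)
```

As with CSV, element text comes back as strings, so any typed values need converting after parsing.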
+
+
+ *******
+ Binary
+ *******
+
+ =======================
+ NumPy Array (flat data)
+ =======================
+
+ A NumPy array can be serialized to, and deserialized from, a byte representation.
+
+ Example:
+
+ .. code-block:: python
+
+     import numpy as np
+
+     # Converting NumPy array to byte format
+     array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
+     byte_output = array.tobytes()
+
+     # Converting byte format back to NumPy array; raw bytes carry no
+     # dtype or shape, so both are passed back in explicitly
+     array_format = np.frombuffer(byte_output, dtype=array.dtype).reshape(array.shape)
+
+
+
+ ====================
+ Pickle (nested data)
+ ====================

The native data serialization module for Python is called `Pickle
<https://docs.python.org/2/library/pickle.html>`_.
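A minimal round trip with the module looks roughly like the sketch below. The `nested` data is made up for the example; it includes a tuple and a set to show that pickle, unlike JSON, preserves those Python types. Unpickling can execute arbitrary code, so this pattern is only appropriate for data from a trusted source.

```python
import pickle

# Hypothetical nested data containing types that JSON cannot represent
nested = {"A": {"field1": "value1", "numbers": (1, 2, 3), "seen": {"x", "y"}}}

# Serialize to a bytes object using the newest protocol available
payload = pickle.dumps(nested, protocol=pickle.HIGHEST_PROTOCOL)

# Deserialize; only do this with data you trust
restored = pickle.loads(payload)

print(type(payload).__name__, restored == nested)
```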