docs/t-sql/statements/create-external-file-format-transact-sql.md
Lines changed: 12 additions & 12 deletions
@@ -156,14 +156,14 @@ Specifies the field terminator for data of type string in the text-delimited fil
156
156
- STRING_DELIMITER = '0x7E0x7E' -- Two tildes (for example, ~~)
157
157
158
158
FIRST_ROW = *First_row_int*
159
-
Specifies the row number that is read first in all files during a PolyBase load. This parameter can take values 1-15. If the value is set to 2, the first row in every file (header row) will be skipped. Rows are skipped based on the existence of row terminators (/r/n, /r, /n). When this option is used for export, rows are added to the data to ensure it can be read back with no data loss. If the value is set to >2, the first row exported is the Column names of the external table.
159
+
Specifies the row number that is read first in all files during a PolyBase load. This parameter can take values 1-15. If the value is set to 2, the first row in every file (header row) will be skipped. Rows are skipped based on the existence of row terminators (/r/n, /r, /n). When this option is used for export, rows are added to the data to make sure the file can be read with no data loss. If the value is set to >2, the first row exported is the Column names of the external table.
160
160
161
161
DATE\_FORMAT = *datetime_format*
162
-
Specifies a custom format for all date and time data that might appear in a delimited text file. If the source file uses default datetime formats, this option is not necessary. Only one custom datetime format is allowed per file. You cannot specify multiple custom datetime formats per file. However, you can use multiple datetime formats, if each one is the default format for its respective data type in the external table definition.
162
+
Specifies a custom format for all date and time data that might appear in a delimited text file. If the source file uses default datetime formats, this option isn't necessary. Only one custom datetime format is allowed per file. You can't specify more than one custom datetime formats per file. However, you can use more than one datetime formats, if each one is the default format for its respective data type in the external table definition.
163
163
164
-
PolyBase only uses the custom date format for importing the data. It does not use the custom format for writing data to an external file.
164
+
PolyBase only uses the custom date format for importing the data. It doesn't use the custom format for writing data to an external file.
165
165
166
-
When DATE_FORMAT is not specified or is the empty string, PolyBase uses the following default formats:
166
+
When DATE_FORMAT isn't specified or is the empty string, PolyBase uses the following default formats:
167
167
168
168
- DateTime: 'yyyy-MM-dd HH:mm:ss'
169
169
@@ -185,7 +185,7 @@ PolyBase only uses the custom date format for importing the data. It does not us
185
185
186
186
- Milliseconds (fffffff) are not required.
187
187
188
-
- Am, pm (tt) is not required. The default is AM.
188
+
- Am, pm (tt) isn't required. The default is AM.
189
189
190
190
|Date Type|Example|Description|
191
191
|---------------|-------------|-----------------|
@@ -248,11 +248,11 @@ PolyBase only uses the custom date format for importing the data. It does not us
248
248
Store all missing values as NULL. Any NULL values that are stored by using the word NULL in the delimited text file are imported as the string 'NULL'.
249
249
250
250
Encoding = {'UTF8' | 'UTF16'}
251
-
In Azure SQL Data Warehouse, PolyBase can read UTF8 and UTF16-LE encoded delimited text files. In SQL Server and PDW, PolyBase does not support reading UTF16 encoded files.
251
+
In Azure SQL Data Warehouse, PolyBase can read UTF8 and UTF16-LE encoded delimited text files. In SQL Server and PDW, PolyBase doesn't support reading UTF16 encoded files.
252
252
253
253
DATA_COMPRESSION = *data_compression_method*
254
-
Specifies the data compression method for the external data. When DATA_COMPRESSION is not specified, the default is uncompressed data.
255
-
In order to work properly, Gzip compressed files must have the ".gz" file extension.
254
+
Specifies the data compression method for the external data. When DATA_COMPRESSION isn't specified, the default is uncompressed data.
255
+
To work properly, Gzip compressed files must have the ".gz" file extension.
256
256
257
257
The DELIMITEDTEXT format type supports these compression methods:
258
258
@@ -309,7 +309,7 @@ PolyBase only uses the custom date format for importing the data. It does not us
309
309
## Examples
310
310
311
311
### A. Create a DELIMITEDTEXT external file format
312
-
This example creates an external file format named *textdelimited1* for a text-delimited file. The options listed for FORMAT\_OPTIONS specify that the fields in the file should be separated using a pipe character '|'. The text file is also compressed with the Gzip codec. If DATA\_COMPRESSION is not specified, the text file is uncompressed.
312
+
This example creates an external file format named *textdelimited1* for a text-delimited file. The options listed for FORMAT\_OPTIONS specify that the fields in the file should be separated using a pipe character '|'. The text file is also compressed with the Gzip codec. If DATA\_COMPRESSION isn't specified, the text file is uncompressed.
313
313
314
314
For a delimited text file, the data compression method can either be the default Codec, 'org.apache.hadoop.io.compress.DefaultCodec', or the Gzip Codec, 'org.apache.hadoop.io.compress.GzipCodec'.
315
315
@@ -325,7 +325,7 @@ WITH (
325
325
```
326
326
327
327
### B. Create an RCFile external file format
328
-
This example creates an external file format for a RCFile that uses the serialization/deserialization method org.apache.hadoop.hive.serde2.columnar.LazyBinaryColumnarSerDe. It also specifies to use the Default Codec for the data compression method. If DATA_COMPRESSION is not specified, the default is no compression.
328
+
This example creates an external file format for a RCFile that uses the serialization/deserialization method org.apache.hadoop.hive.serde2.columnar.LazyBinaryColumnarSerDe. It also specifies to use the Default Codec for the data compression method. If DATA_COMPRESSION isn't specified, the default is no compression.
329
329
330
330
```
331
331
CREATE EXTERNAL FILE FORMAT rcfile1
@@ -337,7 +337,7 @@ WITH (
337
337
```
338
338
339
339
### C. Create an ORC external file format
340
-
This example creates an external file format for an ORC file that compresses the data with the org.apache.io.compress.SnappyCodec data compression method. If DATA_COMPRESSION is not specified, the default is no compression.
340
+
This example creates an external file format for an ORC file that compresses the data with the org.apache.io.compress.SnappyCodec data compression method. If DATA_COMPRESSION isn't specified, the default is no compression.
341
341
342
342
```
343
343
CREATE EXTERNAL FILE FORMAT orcfile1
@@ -348,7 +348,7 @@ WITH (
348
348
```
349
349
350
350
### D. Create a PARQUET external file format
351
-
This example creates an external file format for a Parquet file that compresses the data with the org.apache.io.compress.SnappyCodec data compression method. If DATA_COMPRESSION is not specified, the default is no compression.
351
+
This example creates an external file format for a Parquet file that compresses the data with the org.apache.io.compress.SnappyCodec data compression method. If DATA_COMPRESSION isn't specified, the default is no compression.