Commit c473714

Adding SQL Hints recipe
1 parent 83a2377 commit c473714

5 files changed

+384
-0
lines changed

README.md

Lines changed: 1 addition & 0 deletions
@@ -35,6 +35,7 @@ The cookbook is a living document. :seedling:
 1. [Working with Dates and Timestamps](other-builtin-functions/01_date_time/01_date_time.md)
 2. [Building the Union of Multiple Streams](other-builtin-functions/02_union-all/02_union-all.md)
 3. [Filtering out Late Data](other-builtin-functions/03_current_watermark/03_current_watermark.md)
+4. [Overriding table options](other-builtin-functions/04_override_table_options/04_override_table_options.md)

 ### User-Defined Functions (UDFs)
 1. [Extending SQL with Python UDFs](udfs/01_python_udfs/01_python_udfs.md)
Lines changed: 61 additions & 0 deletions
@@ -0,0 +1,61 @@
# 04 Overriding table options

![Flink Version Badge](https://img.shields.io/badge/Flink%20Version-1.11%2B-lightgrey)

> :bulb: This example will show how you can override table options that have been defined via a DDL by using Hints.

This recipe uses the `2015 Flight Delays and Cancellations` dataset, which can be found on [Kaggle](https://www.kaggle.com/usdot/flight-delays).

As explained before in the [creating tables recipe](../../foundations/01_create_table/01_create_table.md), you create tables in Flink SQL by using a SQL DDL. For example, you would use the following DDL to create a table `airports` which reads the available airports from the provided CSV file.

> :warning: Make sure that the value for `path` is correct for your local environment.

```sql
CREATE TABLE `airports` (
  `IATA_CODE` CHAR(3),
  `AIRPORT` STRING,
  `CITY` STRING,
  `STATE` CHAR(2),
  `COUNTRY` CHAR(3),
  `LATITUDE` DOUBLE NULL,
  `LONGITUDE` DOUBLE NULL,
  PRIMARY KEY (`IATA_CODE`) NOT ENFORCED
) WITH (
  'connector' = 'filesystem',
  'path' = 'file:///flink-sql-cookbook/other-builtin-functions/04_override_table_options/airports.csv',
  'format' = 'csv'
);
```

After creating this table, you would normally query it using something like:

```sql
SELECT * FROM `airports`;
```

However, this currently doesn't work because there is an improperly formatted line in the CSV file. The CSV format has an option to ignore parsing errors, but setting it permanently means you need to alter the table.
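
For comparison, altering the table could look like the statement below. This is only a sketch, assuming your Flink version supports `ALTER TABLE ... SET` for table options. It changes the definition stored in the catalog, so the option then applies to every later query on `airports`, which is more than a one-off look at the file needs.

```sql
-- Sketch: permanently add the CSV option to the table definition.
-- Assumption: the table `airports` was created with the DDL above.
ALTER TABLE `airports` SET ('csv.ignore-parse-errors' = 'true');
```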

You can also override the defined table options using [SQL Hints](https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/table/sql/queries/hints/). Your SQL statement would then look like:

```sql
SELECT * FROM `airports` /*+ OPTIONS('csv.ignore-parse-errors'='true') */;
```
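
If the hint is rejected, you may first need to enable dynamic table options. The sketch below assumes you are working in the SQL Client and that the configuration key is `table.dynamic-table-options.enabled`. Whether the setting is needed at all, and whether `SET` expects quoted or unquoted keys, depends on the Flink version, so check the documentation for the release you run.

```sql
-- Assumption: only needed on Flink versions where OPTIONS hints are off by default.
SET 'table.dynamic-table-options.enabled' = 'true';

-- The hint applies to this query only; the table definition is unchanged.
SELECT * FROM `airports` /*+ OPTIONS('csv.ignore-parse-errors'='true') */;
```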

Since the CSV format option `csv.ignore-parse-errors` sets fields to null in case of errors, you can also quickly identify which fields can't be parsed using:

```sql
SELECT * FROM `airports` /*+ OPTIONS('csv.ignore-parse-errors'='true') */ WHERE `LATITUDE` IS NULL;
```
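
The same idea extends to more than one column. As a sketch, the following query checks both coordinate columns from the DDL above and returns only the columns needed to locate the offending rows. A `NULL` may also come from a value that is simply missing in the file, so treat the result as a list of candidates rather than proof of a parse error.

```sql
-- Sketch: list airports whose coordinates could not be read.
SELECT `IATA_CODE`, `AIRPORT`, `LATITUDE`, `LONGITUDE`
FROM `airports` /*+ OPTIONS('csv.ignore-parse-errors'='true') */
WHERE `LATITUDE` IS NULL OR `LONGITUDE` IS NULL;
```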

You can apply SQL Hints to any table option. For example, if your SQL job that reads from Kafka has crashed, you can override the default reading position:

```sql
SELECT * FROM `your_kafka_topic` /*+ OPTIONS('scan.startup.mode'='group-offsets') */;
```
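
A hint can carry more than one option, separated by commas. As a sketch, assuming the Kafka connector options `scan.startup.mode` and `scan.startup.timestamp-millis` and the same placeholder table `your_kafka_topic`, the following query starts reading from a point in time instead of from the committed group offsets. The timestamp is an arbitrary example value (2021-01-01 00:00:00 UTC in epoch milliseconds).

```sql
-- Sketch: two options in one hint; replace the timestamp with your own.
SELECT *
FROM `your_kafka_topic`
  /*+ OPTIONS('scan.startup.mode'='timestamp', 'scan.startup.timestamp-millis'='1609459200000') */;
```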

Tables, views and functions are all registered in the catalog, which is a collection of metadata. Using SQL Hints, you can override the table options defined in that metadata for an individual query, without changing the catalog entry itself.

## Example Output

![04_override_table_options.screen01](04_override_table_options.screen01.png)
![04_override_table_options.screen02](04_override_table_options.screen02.png)