sql | ||
---|---|---|
|
Observable Framework includes built-in support for client-side SQL powered by DuckDB. You can use SQL to query data from CSV, TSV, JSON, Apache Arrow, and Apache Parquet files, which can either be static or generated by data loaders.
To use SQL, first register the desired tables in the page’s front matter using the sql option. Each key is a table name, and each value is the path to the corresponding data file. For example, to register a table named gaia
from a Parquet file:
---
sql:
gaia: ./lib/gaia-sample.parquet
---
To run SQL queries, create a SQL fenced code block (```sql
). For example, to query the first 10 rows from the gaia
table:
```sql
SELECT * FROM gaia ORDER BY phot_g_mean_mag LIMIT 10
```
This produces a table:
SELECT * FROM gaia ORDER BY phot_g_mean_mag LIMIT 10
To refer to the results of a query in JavaScript, use the id
directive. For example, to refer to the results of the previous query as top10
:
```sql id=top10
SELECT * FROM gaia ORDER BY phot_g_mean_mag LIMIT 10
```
SELECT * FROM gaia ORDER BY phot_g_mean_mag LIMIT 10
This returns an array of 10 rows, inspected here:
top10
When a SQL code block uses the id
directive, the results are not displayed by default. You can display them by adding the display
directive, which produces the table shown above.
```sql id=top10 display
SELECT * FROM gaia ORDER BY phot_g_mean_mag LIMIT 10
```
The id
directive is often a simple identifier such as top10
above, but it supports destructuring assignment, so you can refer to individual rows and columns using array and object patterns. For example, to pull out the top row:
```sql id=[top]
SELECT * FROM gaia ORDER BY phot_g_mean_mag LIMIT 1
```
SELECT * FROM gaia ORDER BY phot_g_mean_mag LIMIT 1
top
Or to pull out the minimum value of the phot_g_mean_mag
column:
```sql id=[{min}]
SELECT MIN(phot_g_mean_mag) AS min FROM gaia
```
SELECT MIN(phot_g_mean_mag) AS min FROM gaia
min
For complex destructuring patterns, you may need to quote the id
directive. For example, to pull out the column named min(phot_g_mean_mag)
to the variable named min
, say id="[{'min(phot_g_mean_mag)': min}]"
. Or to pull out the min
and max
columns, say id="[{min, max}]"
.
For dynamic or interactive queries that respond to user input, you can interpolate values into SQL queries using inline expressions ${…}
. For example, to show the stars around a given brightness:
const mag = view(Inputs.range([6, 20], {label: "Magnitude"}));
SELECT * FROM gaia WHERE phot_g_mean_mag BETWEEN ${mag - 0.1} AND ${mag + 0.1};
The value of a SQL code block is an Apache Arrow table. This format is supported by Observable Plot, so you can use SQL and Plot together to visualize data. For example, below we count the number of stars in each 2°×2° bin of the sky (where ra
is right ascension and dec
is declination, representing a point on the celestial sphere in the equatorial coordinate system), and then visualize the resulting heatmap using a raster mark.
SELECT
floor(ra / 2) * 2 + 1 AS ra,
floor(dec / 2) * 2 + 1 AS dec,
count() AS count
FROM
gaia
GROUP BY
1,
2
Plot.plot({
aspectRatio: 1,
x: {domain: [0, 360]},
y: {domain: [-90, 90]},
marks: [
Plot.frame({fill: 0}),
Plot.raster(bins, {
x: "ra",
y: "dec",
fill: "count",
width: 360 / 2,
height: 180 / 2,
imageRendering: "pixelated"
})
]
})
SQL fenced code blocks are shorthand for the sql
tagged template literal. You can invoke the sql
tagged template literal directly like so:
const rows = await sql`SELECT random() AS random`;
rows[0].random
The sql
tagged template literal is available by default in Markdown, but you can also import it explicitly as:
import {sql} from "npm:@observablehq/duckdb";