Skip to content

Latest commit

 

History

History
158 lines (119 loc) · 4.81 KB

sql.md

File metadata and controls

158 lines (119 loc) · 4.81 KB
sql
gaia
./lib/gaia-sample.parquet

SQL

Observable Framework includes built-in support for client-side SQL powered by DuckDB. You can use SQL to query data from CSV, TSV, JSON, Apache Arrow, and Apache Parquet files, which can either be static or generated by data loaders.

To use SQL, first register the desired tables in the page’s front matter using the sql option. Each key is a table name, and each value is the path to the corresponding data file. For example, to register a table named gaia from a Parquet file:

---
sql:
  gaia: ./lib/gaia-sample.parquet
---

SQL code blocks

To run SQL queries, create a SQL fenced code block (```sql). For example, to query the first 10 rows from the gaia table:

```sql
SELECT * FROM gaia ORDER BY phot_g_mean_mag LIMIT 10
```

This produces a table:

SELECT * FROM gaia ORDER BY phot_g_mean_mag LIMIT 10

To refer to the results of a query in JavaScript, use the id directive. For example, to refer to the results of the previous query as top10:

```sql id=top10
SELECT * FROM gaia ORDER BY phot_g_mean_mag LIMIT 10
```
SELECT * FROM gaia ORDER BY phot_g_mean_mag LIMIT 10

This returns an array of 10 rows, inspected here:

top10

When a SQL code block uses the id directive, the results are not displayed by default. You can display them by adding the display directive, which produces the table shown above.

```sql id=top10 display
SELECT * FROM gaia ORDER BY phot_g_mean_mag LIMIT 10
```

The id directive is often a simple identifier such as top10 above, but it supports destructuring assignment, so you can refer to individual rows and columns using array and object patterns. For example, to pull out the top row:

```sql id=[top]
SELECT * FROM gaia ORDER BY phot_g_mean_mag LIMIT 1
```
SELECT * FROM gaia ORDER BY phot_g_mean_mag LIMIT 1
top

Or to pull out the minimum value of the phot_g_mean_mag column:

```sql id=[{min}]
SELECT MIN(phot_g_mean_mag) AS min FROM gaia
```
SELECT MIN(phot_g_mean_mag) AS min FROM gaia
min

For complex destructuring patterns, you may need to quote the id directive. For example, to pull out the column named min(phot_g_mean_mag) to the variable named min, say id="[{'min(phot_g_mean_mag)': min}]". Or to pull out the min and max columns, say id="[{min, max}]".

For dynamic or interactive queries that respond to user input, you can interpolate values into SQL queries using inline expressions ${…}. For example, to show the stars around a given brightness:

const mag = view(Inputs.range([6, 20], {label: "Magnitude"}));
SELECT * FROM gaia WHERE phot_g_mean_mag BETWEEN ${mag - 0.1} AND ${mag + 0.1};

The value of a SQL code block is an Apache Arrow table. This format is supported by Observable Plot, so you can use SQL and Plot together to visualize data. For example, below we count the number of stars in each 2°×2° bin of the sky (where ra is right ascension and dec is declination, representing a point on the celestial sphere in the equatorial coordinate system), and then visualize the resulting heatmap using a raster mark.

SELECT
  floor(ra / 2) * 2 + 1 AS ra,
  floor(dec / 2) * 2 + 1 AS dec,
  count() AS count
FROM
  gaia
GROUP BY
  1,
  2
Plot.plot({
  aspectRatio: 1,
  x: {domain: [0, 360]},
  y: {domain: [-90, 90]},
  marks: [
    Plot.frame({fill: 0}),
    Plot.raster(bins, {
      x: "ra",
      y: "dec",
      fill: "count",
      width: 360 / 2,
      height: 180 / 2,
      imageRendering: "pixelated"
    })
  ]
})

SQL literals

SQL fenced code blocks are shorthand for the sql tagged template literal. You can invoke the sql tagged template literal directly like so:

const rows = await sql`SELECT random() AS random`;
rows[0].random

The sql tagged template literal is available by default in Markdown, but you can also import it explicitly as:

import {sql} from "npm:@observablehq/duckdb";