This is a Trino plugin that provides a connector to read from and write to HTTP APIs given an OpenAPI specification.
To run a Docker container with the connector, run the following:
docker run \
-d \
--name trino-openapi \
-e OPENAPI_SPEC_LOCATION=http://api.example/api/spec.yaml \
-e OPENAPI_BASE_URI=http://api.example/api/v1 \
-p 8080:8080 \
nineinchnick/trino-openapi:1.66
Then use your favourite SQL client to connect to Trino running at http://localhost:8080
Download one of the ZIP packages, unzip it and copy the trino-openapi-1.66
directory to the plugin directory on every node in your Trino cluster. Create a
openapi.properties
file in your Trino catalog directory and set all the
required properties.
connector.name=openapi
spec-location=http://api.example/api/spec.yaml
base-uri=http://api.example/api/v1
authentication.type=http
authentication.username=${ENV:OPENAPI_USERNAME}
authentication.password=${ENV:OPENAPI_PASSWORD}
After reloading Trino, you should be able to connect to the openapi
catalog.
Configuration property | Container environmental variable | Description |
---|---|---|
spec-location | OPENAPI_SPEC_LOCATION |
URL or filename containing the OpenAPI specification, either JSON or YAML |
base-uri | OPENAPI_BASE_URI |
Base URL for the API, often includes API version |
authentication.type | OPENAPI_AUTH_TYPE |
Default authentication type if not set in the specification. One of: none , http , api_key , oauth . |
authentication.scheme | OPENAPI_AUTH_SCHEME |
Authentication scheme for the http authentication type. One of: basic , bearer . |
authentication.token-endpoint | OPENAPI_TOKEN_ENDPOINT |
OAuth token endpoint URL |
authentication.client-id | OPENAPI_CLIENT_ID |
OAuth Client ID |
authentication.client-secret | OPENAPI_CLIENT_SECRET |
OAuth Client secret |
authentication.grant-type | OPENAPI_GRANT_TYPE |
OAuth grant type |
authentication.username | OPENAPI_USERNAME |
Username used for the http and oauth authentication types |
authentication.password | OPENAPI_PASSWORD |
Password used for the http and oauth authentication types |
authentication.bearer-token | OPENAPI_BEARER_TOKEN |
Bearer token for http authentication |
authentication.api-key-name | OPENAPI_API_KEY_NAME |
API key name |
authentication.api-key-value | OPENAPI_API_KEY_VALUE |
API key value |
max-requests-per-second | OPENAPI_MAX_REQUESTS_PER_SECOND |
Maximum HTTP requests per second executed from a single Trino node |
max-splits-per-second | OPENAPI_MAX_SPLITS_PER_SECOND |
Maximum number of splits per second generated when executing a query |
domain-expansion-limit | OPENAPI_DOMAIN_EXPANSION_LIMIT |
Maximum number of discrete domain values generated from range predicates, which correlates with the number of HTTP requests executed |
The connector is using the Airlift HTTP client, which can be configured with
additional
properties
prefixed with openApi
, like so:
openApi.http-client.log.enabled=true
openApi.http-client.log.path=logs
This connector has been tested with the following APIs. The Docker image includes additional catalog files that will be used if the first of their environmental variable is set.
Name | Environmental variables |
---|---|
OpenAPI Petstore | PETSTORE_URL |
Starburst Galaxy | GALAXY_URL , GALAXY_TOKEN |
Jira Cloud platform | JIRA_URL |
GitHub REST API | GITHUB_TOKEN |
DataDog API | DATADOG_URL |
The following rules are used to map the OpenAPI schema to SQL tables and columns:
- Paths are mapped to tables; table names are paths, with
camelCase
converted tosnake_case
and special characters (like/
) converted to underscores. - SQL operations are mapped to HTTP methods:
- SELECT uses GET or POST, if no GET is available;
- INSERT uses POST or PUT;
- UPDATE uses PATCH or POST;
- DELETE uses DELETE.
- All request parameters are mapped to columns, including path, query, and header parameters.
- Fields of the HTTP OK (200) response type are mapped to columns.
- Fields of the request body are also mapped to columns, if a POST request is used for SELECT.
- All columns are disambiguated - if a field with the same name but a different
data type appears in the response and parameters, it'll be mapped to multiple
columns, with numeric suffixes (
_2
,_3
, etc.). Request body fields have a_req
suffix.
OpenAPI allows using custom extensions - adding custom fields anywhere in the
schema as long as they're prefixed with an X
. Such custom extensions are used
to fine-tune this connector.
If the original service cannot be modified to include an extension in a generated OpenAPI schema, save it locally and modify as needed.
APIs can use 4 different types of pagination:
- Offset - every response can include an offset parameter, telling how many results to skip.
- Page - every response can include a page parameter; the number of results per page can be configurable with another parameter.
- Cursor/token - the response includes the value of a cursor or token, that needs to be included in the next request.
- Next page URL - the response includes an URL of the next set of results.
The connector currently supports only the Page
pagination.
To enable pagination, add a x-pagination
section in the path's operation section:
paths:
/records:
get:
responses:
# ...
x-pagination:
pageParam: "page"
limitParam: "per-page"
resultsPath: "$response.body#/workflows"
totalResultsPath: "$response.body#/total_count"
Run all the unit tests:
mvn test
Creates a deployable zip file:
mvn clean package
Unzip the archive from the target directory to use the connector in your Trino cluster.
unzip target/*.zip -d ${PLUGIN_DIRECTORY}/
mv ${PLUGIN_DIRECTORY}/trino-openapi-* ${PLUGIN_DIRECTORY}/trino-openapi
To test and debug the connector locally, run the OpenApiQueryRunner
class
located in tests:
mvn test-compile exec:java -Dexec.mainClass="pl.net.was.OpenApiQueryRunner" -Dexec.classpathScope=test
And then run the Trino CLI using trino --server localhost:8080 --no-progress
and query it:
trino> show catalogs;
Catalog
---------
openapi
system
(2 rows)
trino> show tables from openapi.default;
Table
------------
single_row
(1 row)
trino> select * from openapi.default.single_row;
id | type | name
----+---------------+---------
x | default-value | my-name
(1 row)