Once the storage step is done loading enriched events into our storage target of choice, we can start to use that data to build intelligence.
Whilst it is possible to query the events directly, it is often more convenient to transform and aggregate the events into a set of tables that are then used by various data consumers within the business. This has a number of advantages:
- It guarantees that all users within the business are using the same basic business logic (e.g. identity stitching and sessionization);
- It is easier and faster to run queries against the modeled data;
- It is possible to connect a BI or pivoting tool directly to the aggregate data.
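To make the transform-and-aggregate idea concrete, here is a minimal sketch of the kind of sessionization aggregate such a model produces. It uses SQLite purely for illustration (a real deployment targets Redshift, Snowflake or BigQuery); the column names follow Snowplow's atomic event fields, but the table contents are invented for this example.

```python
import sqlite3

# Illustrative only: collapse raw page-view events into one row per session,
# the kind of derived table a Snowplow data model builds for consumers.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (
    domain_userid    TEXT,
    domain_sessionid TEXT,
    derived_tstamp   TEXT,
    page_urlpath     TEXT
);
INSERT INTO events VALUES
    ('user-1', 'sess-a', '2021-01-01T10:00:00', '/home'),
    ('user-1', 'sess-a', '2021-01-01T10:05:00', '/pricing'),
    ('user-2', 'sess-b', '2021-01-01T11:00:00', '/home');
""")

# One row per session: first/last event timestamps and a page-view count.
rows = conn.execute("""
    SELECT domain_userid,
           domain_sessionid,
           MIN(derived_tstamp) AS session_start,
           MAX(derived_tstamp) AS session_end,
           COUNT(*)            AS page_views
    FROM events
    GROUP BY domain_userid, domain_sessionid
    ORDER BY domain_sessionid
""").fetchall()

for row in rows:
    print(row)
```

Every consumer querying this sessions table inherits the same sessionization logic, rather than re-deriving it per query.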
The new generation of Snowplow officially-supported SQL data models covers working with Snowplow web data. There are two ways to run the Web model, depending on your choice of tooling for executing SQL against your database.
The SQL-Runner version:
- Supports Redshift, Snowflake and BigQuery.
- Can be found in the dedicated data models GitHub repository.
The dbt version:
- Supports Redshift only, with BigQuery and Snowflake support to follow.
- Can be found in the dedicated dbt-snowplow-web GitHub repository.
- Leverages functionality provided by the snowplow-utils dbt package.
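As a sketch of how the dbt route is typically wired up (the package name and version range here are illustrative — check the dbt-snowplow-web repository for the current ones), the model is declared as a dependency in your dbt project's `packages.yml`:

```yaml
# packages.yml — illustrative; verify the package name and version
# range against the dbt-snowplow-web repository before using.
packages:
  - package: snowplow/snowplow_web
    version: [">=0.1.0", "<0.2.0"]
```

Running `dbt deps` then pulls the package (and its snowplow-utils dependency) into the project so the models can be run with `dbt run`.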
The first generation of Snowplow officially-supported SQL data models for working with Snowplow mobile data can be found in the dedicated data models GitHub repository. Redshift, BigQuery and Snowflake are supported using SQL-Runner, with a dbt-based model to follow soon.
The data models are copyright 2016-2021 Snowplow Analytics Ltd.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this software except in compliance with the License.
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.