Commit

Merge pull request #59 from mohb-ellakani/master
Added new page for the materialization tutorial
bcogrel authored Jul 31, 2023
2 parents fb8d2d1 + 161e669 commit fccbe7f
Showing 3 changed files with 104 additions and 25 deletions.
7 changes: 7 additions & 0 deletions .vuepress/config.js
@@ -222,6 +222,13 @@ function genTutorialSidebarConfig() {
      'interact/jupyter.md',
    ]
  },
  {
    title: 'Materialize',
    collapsable: false,
    children: [
      'materialization/materialization.md'
    ]
  },
  {
    title: 'Mapping',
    collapsable: false,
52 changes: 27 additions & 25 deletions tutorial/README.md
@@ -4,10 +4,10 @@ In this tutorial, we will see how to design a Virtual Knowledge Graph (VKG) spec

## Requirements

* [Java 11](http://www.oracle.com/technetwork/java/javase/downloads/index.html)
* Latest version of Ontop from [GitHub](https://github.com/ontop/ontop/releases) or [SourceForge](https://sourceforge.net/projects/ontop4obda/files/)
* H2 with preloaded datasets [h2.zip](h2.zip)
* [Git](https://git-scm.com/)
- [Java 11](http://www.oracle.com/technetwork/java/javase/downloads/index.html)
- Latest version of Ontop from [GitHub](https://github.com/ontop/ontop/releases) or [SourceForge](https://sourceforge.net/projects/ontop4obda/files/)
- H2 with preloaded datasets [h2.zip](h2.zip)
- [Git](https://git-scm.com/)

## Clone this repository

@@ -21,26 +21,28 @@ cd ontop-tutorial
## Program

1. [Basics of VKG Modeling](basic/setup.md)
* [Mapping the first data source](basic/university-1.md)
* [Mapping the second data source](basic/university-2.md)
* [Mapping the first data source](basic/university-1.md)
* [Mapping the second data source](basic/university-2.md)
2. [Deploying an Ontop SPARQL endpoint](endpoint)
* [Using Ontop CLI](endpoint/endpoint-cli.md)
* [Using Ontop Docker image](endpoint/endpoint-docker.md)
* [Using Ontop CLI](endpoint/endpoint-cli.md)
* [Using Ontop Docker image](endpoint/endpoint-docker.md)
3. [Interacting with an Ontop SPARQL endpoint](interact/cli.md)
* [Command Line Tools (curl, http)](interact/cli.md)
* [Python and Jupyter Notebook](interact/jupyter.md)
4. [Mapping Engineering](mapping)
* [Role of primary keys](mapping/primary-keys.md)
* [Role of foreign keys](mapping/foreign-keys.md)
* [Choice of the URI templates](mapping/uri-templates.md)
* [Bonus: existential reasoning](mapping/existential.md)
5. [Lenses](lenses)
* [Basic Lens](lenses/basic-lens.md)
* [Join Lens](lenses/join-lens.md)
* [SQL Lens](lenses/sql-lens.md)
* [Union Lens](lenses/union-lens.md)
* [Flatten Lens](lenses/flatten-lens.md)
6. [Federating multiple databases](federation)
* [Ontop with Dremio](federation/dremio/README.md)
* [Ontop with Denodo](federation/denodo/README.md)
7. [External tutorials](external-tutorials)
* [Command Line Tools (curl, http)](interact/cli.md)
* [Python and Jupyter Notebook](interact/jupyter.md)
4. [Materialization using Ontop](materialization/materialization.md)
* [How to materialize data into a graph database using Ontop](materialization/materialization.md)
5. [Mapping Engineering](mapping)
* [Role of primary keys](mapping/primary-keys.md)
* [Role of foreign keys](mapping/foreign-keys.md)
* [Choice of the URI templates](mapping/uri-templates.md)
* [Bonus: existential reasoning](mapping/existential.md)
6. [Lenses](lenses)
* [Basic Lens](lenses/basic-lens.md)
* [Join Lens](lenses/join-lens.md)
* [SQL Lens](lenses/sql-lens.md)
* [Union Lens](lenses/union-lens.md)
* [Flatten Lens](lenses/flatten-lens.md)
7. [Federating multiple databases](federation)
* [Ontop with Dremio](federation/dremio/README.md)
* [Ontop with Denodo](federation/denodo/README.md)
8. [External tutorials](external-tutorials)
70 changes: 70 additions & 0 deletions tutorial/materialization/materialization.md
@@ -0,0 +1,70 @@
# How to deploy your Knowledge Graph in a graph database with Ontop

In this tutorial, we present two ways to materialize your Knowledge Graph using Ontop.

## How to materialize data into a graph database using Ontop

1. ### Materialize in RDF files and load into a triplestore

For the first solution, you will need the following prerequisites:

- Access to a relational database (in our example PostgreSQL)
- [Mapping](../glossary/#mapping) ([R2RML](../glossary/#r2rml) or [OBDA](../glossary/#obda_mapping_format) files)
- [Ontop](https://ontop-vkg.org/guide/cli.html#setup-ontop-cli)

Using the CLI command _ontop-materialize_ ([https://ontop-vkg.org/guide/cli#ontop-materialize](https://ontop-vkg.org/guide/cli#ontop-materialize)), you can [materialize](../glossary/#materialization) your KG into one or multiple files. For simplicity, we keep the default option and only materialize it into one file.


_./ontop materialize -m mapping.ttl -p credentials.properties -f turtle -o materialized-triples.ttl_


After running the command, we have all the content of our KG copied to the file _materialized-triples.ttl_.
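If the materialization has to run from a script rather than by hand, the same command can be assembled programmatically. The following is a minimal Python sketch that only reuses the flags shown above (`-m`, `-p`, `-f`, `-o`); the function names `build_materialize_command` and `materialize` are our own and not part of Ontop, and the sketch assumes the `ontop` launcher sits in the current directory.

```python
import shlex
import subprocess
from pathlib import Path


def build_materialize_command(mapping, properties, output,
                              rdf_format="turtle", ontop_bin="./ontop"):
    """Return the argument list for `ontop materialize`.

    Only the flags used in this tutorial are set. `ontop_bin` is an
    assumption: adjust it to wherever the Ontop CLI is installed.
    """
    return [
        ontop_bin, "materialize",
        "-m", str(mapping),
        "-p", str(properties),
        "-f", rdf_format,
        "-o", str(output),
    ]


def materialize(mapping, properties, output, **kwargs):
    """Run the materialization and return the path of the output file."""
    cmd = build_materialize_command(mapping, properties, output, **kwargs)
    # check=True raises CalledProcessError if Ontop exits with an error.
    subprocess.run(cmd, check=True)
    return Path(output)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try.
    cmd = build_materialize_command(
        "mapping.ttl", "credentials.properties", "materialized-triples.ttl"
    )
    print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.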

Now we load this file into the triplestore of our choice; in this case, we use [GraphDB](https://www.ontotext.com/products/graphdb/download/). This graph database offers [several ways to load files](https://graphdb.ontotext.com/documentation/10.2/loading-and-updating-data.html). Here, since our file is only 200 MB, we go for the simplest option and load it directly from the UI.

Once this is done, we can query this KG using GraphDB.

2. ### Deploy a VKG and fetch its content from the graph database

For the second solution, we make use of the concept of KG virtualization.

We deploy the KG as a virtual KG first and then query it from the graph database. In this way, you can retrieve the triples and store them locally in the graph database.

Triples are directly streamed to the graph database: no intermediate file storage is involved, making this solution more direct than the previous one.

Let’s deploy the KG as a virtual KG using the _ontop-endpoint_ command:

_./ontop endpoint -m mapping.ttl -p credentials.properties_

Now Ontop is deployed as a [SPARQL endpoint](../glossary/#sparql_endpoint) available at [http://localhost:8080/sparql](http://localhost:8080/sparql).

Now let’s switch to GraphDB. To fetch and insert all the triples from the VKG exposed by Ontop, we run the following SPARQL INSERT query from GraphDB itself:

INSERT {
  ?s ?p ?o
}
WHERE {
  SERVICE <http://localhost:8080/sparql> {
    ?s ?p ?o
  }
}

This query materializes the same triples as with the first approach.
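The query can also be submitted without the GraphDB UI. The sketch below sends it as a standard SPARQL 1.1 Protocol update (a form-encoded `update` parameter in a POST request) using only the Python standard library. It rests on assumptions that are not part of this tutorial: GraphDB listening on its default port 7200, a repository with the hypothetical name `my-repo`, and the RDF4J-style `/repositories/<id>/statements` update URL. Check these against your own installation before relying on it.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Ontop endpoint from the tutorial.
ONTOP_ENDPOINT = "http://localhost:8080/sparql"

# Assumption: default GraphDB port and a hypothetical repository "my-repo".
GRAPHDB_UPDATE_URL = "http://localhost:7200/repositories/my-repo/statements"

INSERT_QUERY = f"""INSERT {{
  ?s ?p ?o
}}
WHERE {{
  SERVICE <{ONTOP_ENDPOINT}> {{
    ?s ?p ?o
  }}
}}"""


def build_update_request(update_url, query):
    """Build a SPARQL 1.1 Protocol update request (form-encoded POST)."""
    body = urlencode({"update": query}).encode("utf-8")
    return Request(
        update_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def run_update(update_url, query):
    """Send the update and return the HTTP status code of the response."""
    with urlopen(build_update_request(update_url, query)) as response:
        return response.status


if __name__ == "__main__":
    # Only show what would be sent; call run_update(...) to execute it.
    request = build_update_request(GRAPHDB_UPDATE_URL, INSERT_QUERY)
    print(request.get_method(), request.full_url)
```

Note that `localhost` in the `SERVICE` clause is resolved by GraphDB, not by the machine running this script, so the Ontop endpoint must be reachable from the GraphDB host.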

## Choosing the Right Approach for Your Use Case

1\. **Small Dataset, Easy Communication:** If your dataset isn't large and you can easily set up communication between the Ontop SPARQL endpoint and the graph database, go with solution #2. It avoids dealing with files and intermediate storage.

2\. **Large Dataset, Efficient Loading:** For very large datasets, choose the most efficient loading solution supported by the triplestore, even if it requires more effort to set up.

3\. **Materializing Fragments of the KG:** Solution #2 allows easy materialization of specific fragments of the Knowledge Graph by adapting the SPARQL query. You can have hybrid KGs with some parts stored in the graph database and the rest kept virtual.

4\. **Advantage of Keeping Data Virtual:** Keeping data virtual works well for large volumes of constantly updated sensor data. It is better to keep this part virtual and store the rich contextual information in the graph database.
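To illustrate point 3, here is a small Python sketch that generates the INSERT query either for the whole VKG or for a fragment restricted to the instances of one class. The helper names and the class IRI `http://example.org/voc#Student` are hypothetical and serve only as an illustration; restricting by class is just one possible way of selecting a fragment.

```python
def _check_iri(iri):
    """Reject characters that are not allowed inside a SPARQL IRI reference."""
    if any(ch in iri for ch in '<>"{}|^`\\ \n\t'):
        raise ValueError(f"Not a valid IRI: {iri!r}")
    return iri


def build_fragment_insert(service_url, class_iri=None):
    """Build an INSERT query that copies triples from the Ontop endpoint.

    With class_iri=None the whole VKG is copied (the query of solution #2).
    Otherwise only triples whose subject is an instance of that class are copied.
    """
    _check_iri(service_url)
    if class_iri is None:
        where = "?s ?p ?o"
    else:
        where = f"?s a <{_check_iri(class_iri)}> .\n    ?s ?p ?o"
    return (
        "INSERT {\n"
        "  ?s ?p ?o\n"
        "}\n"
        "WHERE {\n"
        f"  SERVICE <{service_url}> {{\n"
        f"    {where}\n"
        "  }\n"
        "}"
    )


if __name__ == "__main__":
    # Hypothetical class IRI, used only for illustration.
    print(build_fragment_insert(
        "http://localhost:8080/sparql",
        "http://example.org/voc#Student",
    ))
```

The part of the KG that is not selected by the query stays virtual and can still be reached through the Ontop endpoint.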

## Ontology Usage

If you're familiar with Ontop, you might have noticed that we didn't use an ontology in this example. Providing an ontology to Ontop can result in a significantly larger KG, because Ontop's built-in reasoning adds inferred triples. However, GraphDB also supports reasoning, so inference can be deferred to GraphDB, which keeps materialization simpler and faster. If your graph database doesn't support reasoning, Ontop can handle it.

## Mapping Options

Ontop supports any R2RML mapping as well as its native format (.obda). You can create these mappings manually or use Ontopic Studio, a no-code environment for designing knowledge graphs and managing large mappings.
