graph2nosql

A simple Python interface to store and interact with knowledge graphs in your favourite NoSQL DB.

Knowledge Graphs are the up and coming tool to index knowlegde and make it understandable to your LLM applications. Working with Graph Databases are a pain though.

This Python interface aims to solve this by offering a set of basic functions to store and manage your (knowledge graph) in your existing NoSQL DB. From experience Document based databases offer an exteremely attractive performance an price position comparing to some existing specialized databases. I found this to be attractive for simple graph storage use cases in which no fully structured query language is required.

This repository mostly caters own use and is not regularly updated or maintained.

Implemented Databases for graph storage:

Performance Benchmark

Approximate latency performance benchmark comparing tool and technology. Benchmarking framework can be found in ./benchmarks.

Values are processing seconds -> lower = better

Feature	Firestore	MongoDB	Neo4j
Adding 100 Nodes	3.03	2.51	1.91
Query 100 individual nodes	0.94	1.10	7.12
Count 2nd degree connection of given node	0.8	tbd	10.5
Count 3rd degree connection of given node	10.9	tbd	13.3

Getting Started

graph2nosql.py is the abstract class defining the available operations.

Create an .env that stores your secrets & env vars.
Use a database object to interact with your nosql db.

Initialize knowledge graph object

Every knowledge graph store object is a child of NoSQLKnowledgeGraph in ./base/operations.py.

The graph contains three data objects: NodeData, EdgeData and CommunityData. Their respective attributes are defined in ./data_model/datamodel.py.

from databases.firestore_kg import FirestoreKG

secrets = dotenv_values("../.env")
credentials, _ = google.auth.load_credentials_from_file(secrets["GCP_CREDENTIAL_FILE"])

fskg = FirestoreKG(gcp_credential_file=secret["GCP_CREDENTIAL_FILE"],
        gcp_project_id=str(secrets["GCP_PROJECT_ID"]),
        firestore_db_id=str(secrets["WIKIDATA_FS_DB"]),
        node_collection_id=str(secrets["NODE_COLL_ID"]),
        edges_collection_id=str(secrets["EDGES_COLL_ID"]),
        community_collection_id=str(secrets["COMM_COLL_ID"])
        )

Add nodes

node_data_1 = NodeData(
    node_uid="test_egde_node_1",
    node_title="Test Node 1",
    node_type="Person",
    node_description="This is a test node",
    node_degree=0,
    document_id="doc_1",
    edges_to=[],
    edges_from=[],
    embedding=[0.1, 0.2, 0.3],
)

node_data_2 = NodeData(
    node_uid="test_egde_node_2",
    node_title="Test Node 2",
    node_type="Person",
    node_description="This is another test node",
    node_degree=0,
    document_id="doc_2",
    edges_to=[],
    edges_from=[],
    embedding=[0.4, 0.5, 0.6],
)

self.kg.add_node(node_uid="test_egde_node_1"node_data=node_data_1)
self.kg.add_node(node_uid="test_egde_node_2"node_data=node_data_2)

Add directed and undirected edges

edge_data1 = EdgeData(
    source_uid="test_egde_node_1",
    target_uid="test_egde_node_2",
    description="This is a test egde description",
    directed=True
)

edge_data2 = EdgeData(
    source_uid="test_egde_node_3",
    target_uid="test_egde_node_2",
    description="This is a test egde description",
    directed=False
)

self.kg.add_edge(edge_data=edge_data1)
self.kg.add_edge(edge_data=edge_data2)

Contributing

If you decide to add new DB operations, please add corresponding tests to graph2nosql_tests.py
If you decide to write an implementation for another NoSQL db please make sure all tests in graph2nosql_tests.py succeed.

Name		Name	Last commit message	Last commit date
Latest commit History 93 Commits
.github/workflows		.github/workflows
.idx		.idx
base		base
benchmarks		benchmarks
databases		databases
datamodel		datamodel
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
__init__.py		__init__.py
requirements.txt		requirements.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

graph2nosql

Implemented Databases for graph storage:

Performance Benchmark

Getting Started

Initialize knowledge graph object

Add nodes

Add directed and undirected edges

Contributing

About

Releases

Packages

Contributors 2

Languages

License

jakobap/graph2nosql

Folders and files

Latest commit

History

Repository files navigation

graph2nosql

Implemented Databases for graph storage:

Performance Benchmark

Getting Started

Initialize knowledge graph object

Add nodes

Add directed and undirected edges

Contributing

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages