TotemAPI

TotemAPI is a Python application that turns any book or online text into data you can access through an API. It works like a library in API form: you give it a PDF, a text file, or a link, and it exposes the contents through an API.

CURRENT INFO:

The project is in its very early stages, and I'm currently working on the ingestor. I will be refining the ingestor using LLMs and other methods, such as writing new code for it directly.

Why focus on the ingestor? I want good-quality conversion from text files to LLM-accessible files, and the ingestor is the key to that.

You can currently test the ingestor, which is just a basic LangChain implementation.

I added a Query-Chat-to-PDF bot so you can play with the data from the ingestor. Enjoy! 🥹

To run:

Change the path in ingestor.py to your file path.
Run: pipenv shell
Run: python ingestor.py
After ingestion is complete, run: python totem.py to try the chatbot with your data.

1. Build the Text Ingestor

The first step involves building a text ingestor module. This module is responsible for extracting the content from various sources, such as PDFs, text files, or online links, and converting it into a readable format. This process ensures that the extracted text can be further processed and utilized within the API.
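
As a rough illustration of the ingestion step, here is a minimal sketch using only the Python standard library. It is not the LangChain-based code in ingestor.py; the names `normalize`, `html_to_text`, and `ingest` are hypothetical, and it assumes sources are local UTF-8 `.txt` or `.html` files. PDF and remote-link handling are left out, since they would need a third-party parser and an HTTP client.

```python
from html.parser import HTMLParser
from pathlib import Path

# Tags treated as paragraph boundaries (an assumption made for this sketch).
BLOCK_TAGS = {"p", "div", "br", "li", "h1", "h2", "h3", "h4", "h5", "h6"}


class _TextOnly(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            if self._skip_depth:
                self._skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def normalize(text: str) -> str:
    """Trim each line and collapse runs of blank lines into a single one."""
    out = []
    for line in (raw.strip() for raw in text.splitlines()):
        if line or (out and out[-1]):
            out.append(line)
    return "\n".join(out).strip()


def html_to_text(html: str) -> str:
    """Strip markup from an HTML string and return normalized plain text."""
    parser = _TextOnly()
    parser.feed(html)
    parser.close()
    return normalize("".join(parser.parts))


def ingest(path: str) -> str:
    """Return readable text for a supported local file, chosen by extension."""
    source = Path(path)
    suffix = source.suffix.lower()
    if suffix == ".txt":
        return normalize(source.read_text(encoding="utf-8"))
    if suffix in (".html", ".htm"):
        return html_to_text(source.read_text(encoding="utf-8"))
    raise ValueError(f"unsupported source type: {suffix!r}")
```

Dispatching on the file extension keeps each source type behind one function, so a PDF branch could be added later without touching the callers.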

2. Build the Cell String

The second step focuses on constructing a cell string. A cell string is an intermediate data structure that organizes the extracted text into manageable units, such as paragraphs, chapters, or sections. This step facilitates easy navigation and retrieval of specific information within the API.
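
The cell string format is not pinned down yet, so the following is only one possible shape for it: a list of paragraph-sized cells, each carrying its position in the document. The `Cell` class and `build_cells` function are hypothetical names, and splitting on blank lines is an assumption; chapters or sections would need a different splitting rule.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    index: int  # position of the cell within the document
    text: str   # the paragraph's text, stripped of surrounding whitespace


def build_cells(text: str) -> list[Cell]:
    """Split text into paragraph cells wherever one or more blank lines occur."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    return [Cell(index=i, text=p) for i, p in enumerate(paragraphs)]


if __name__ == "__main__":
    sample = "Chapter 1\n\nFirst paragraph.\n\n\nSecond paragraph.\n"
    for cell in build_cells(sample):
        print(cell.index, cell.text)
```

Giving every cell a stable index is what makes later retrieval simple: an endpoint can return a cell by number, and a search can report which cells matched.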

3. Build the API

The final step is to construct the API itself using FastAPI, a Python web framework designed for building APIs with high performance and simplicity. In this step, the cell string generated in the previous step is used to create endpoints that allow users to access the transformed text data. These endpoints can be utilized to search for specific content, retrieve sections, or perform other desired operations on the text.
