RepoToText is a web app that scrapes a GitHub repository and converts its files into a single organized .txt file. Enter the URL of a GitHub repository and, optionally, a documentation URL. The app retrieves the contents of the repository, including all files and directories, fetches the documentation from the provided URL, and combines everything into one text file with the documentation at the top. The .txt file is saved in the /data folder, named with the user, repo, and timestamp. You can then upload this file to a chatbot (GPT-4, Claude Opus, etc.) and use it to interact with the entire GitHub repo.
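To illustrate the user + repo + timestamp naming, here is a minimal sketch in Python. The helper names (`parse_repo_url`, `output_path`), the underscore-separated file name, and the timestamp format are assumptions made for this example and may not match what RepoToText.py actually writes.

```python
from datetime import datetime
from pathlib import Path
from urllib.parse import urlparse


def parse_repo_url(repo_url: str) -> tuple[str, str]:
    """Split a URL such as https://github.com/user/repo into (user, repo)."""
    parts = [p for p in urlparse(repo_url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"not a repository URL: {repo_url!r}")
    user, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[:-4]  # drop the clone suffix
    return user, repo


def output_path(repo_url: str, now: datetime, data_dir: str = "data") -> Path:
    """Build <data_dir>/<user>_<repo>_<timestamp>.txt (hypothetical layout)."""
    user, repo = parse_repo_url(repo_url)
    stamp = now.strftime("%Y%m%d_%H%M%S")
    return Path(data_dir) / f"{user}_{repo}_{stamp}.txt"
```

Calling `output_path("https://github.com/octocat/Hello-World", datetime.now())` would give a path inside `data/` whose name starts with `octocat_Hello-World_`.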
To run the application using Docker, follow these steps:
- Clone the repository.
- Create a `.env` file in the root folder and set the environment variable `GITHUB_API_KEY` in it.
- Build the Docker images with `docker compose build`.
- Start the containers with `docker compose up`.
- Access the application at http://localhost:3000 in a web browser and enter the GitHub repository URL and documentation URL (if available).
- Choose all files or select specific file types.
- Click the "Submit" button to initiate the scraping process. The converted text will be displayed in the output area and saved in the /data folder.
- You can also click the "Copy Text" button to copy the generated text to the clipboard.
To run the application locally without Docker, follow these steps:
- Clone the repository and navigate to the project directory.
- Create a `.env` file in the root folder and set the environment variable `GITHUB_API_KEY` in it.
- Set up a Python virtual environment: run `python3 -m venv venv`, then `source venv/bin/activate` (on Windows, use `venv\Scripts\activate`).
- Install the required dependencies: `pip install -r requirements.txt`
- Start the Flask server: `python RepoToText.py`
- Navigate to the `src` directory and start the React frontend: run `npm install`, then `npm start`.
- Access the application at http://localhost:3000 in a web browser and enter the GitHub repository URL and documentation URL (if available).
- Choose all files or select specific file types.
- Click the "Submit" button to initiate the scraping process. The converted text will be displayed in the output area and saved in the /data folder.
- You can also click the "Copy Text" button to copy the generated text to the clipboard.
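If the server fails to authenticate, it helps to confirm that `GITHUB_API_KEY` is actually readable. The sketch below is one way to check this from Python using only the standard library. It is not taken from RepoToText.py: the functions `load_env_file` and `github_api_key` are hypothetical, and the parser only handles plain `KEY=VALUE` lines.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Remove surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def github_api_key(path: str = ".env") -> str:
    """Prefer the process environment, then fall back to the .env file."""
    key = os.environ.get("GITHUB_API_KEY") or load_env_file(path).get("GITHUB_API_KEY")
    if not key:
        raise RuntimeError("GITHUB_API_KEY is not set; add it to the .env file")
    return key
```

Running `github_api_key()` from the project root raises a `RuntimeError` when the key is missing, which is easier to diagnose than a failed API call later on.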
FolderToText.py is a script that turns a local folder, or local files, into a .txt in the same way RepoToText.py does. Choose your files with "Browse" (you can keep adding files by clicking "Browse" again). Once all of your files are selected and uploaded, type in the file extensions you want to copy, separated by commas. Example: .py, .js, .md, .ts. You can also turn this filter off, in which case every file you uploaded is added to the .txt. Last, enter the file name you want and the output path. The file will be written with your chosen file name and a timestamp.
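The comma-separated extension filter can be sketched as follows. This is an illustration under assumptions, not the code in FolderToText.py: `parse_extensions` and `select_files` are hypothetical names, and matching is done case-insensitively on the file suffix.

```python
from pathlib import Path


def parse_extensions(raw: str) -> set:
    """Turn a string like '.py , .js , md' into {'.py', '.js', '.md'}."""
    extensions = set()
    for token in raw.split(","):
        token = token.strip().lower()
        if token:
            # Accept entries written with or without the leading dot.
            extensions.add(token if token.startswith(".") else "." + token)
    return extensions


def select_files(paths, raw_extensions=None):
    """Keep paths with a wanted suffix; keep everything when the filter is off."""
    if not raw_extensions:
        return list(paths)
    wanted = parse_extensions(raw_extensions)
    return [p for p in paths if Path(p).suffix.lower() in wanted]
```

Passing `None` (or an empty string) as `raw_extensions` corresponds to turning the filter off, so every selected file is kept.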
- Creates a .txt with ('''---) separating each file from the repo.
- Each file from the repo has a header after ('''---) with the file path as the title.
- The .txt file is saved in the /data folder.
- You can add a URL to a documentation page, and the documentation page will be added to the top of the .txt file (great to use for tech that came out after Sep 2021).
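The output layout described in the list above can be sketched like this. The separator string comes from the description, but the exact placement of newlines and of the path header is an assumption, and `build_text` is a hypothetical helper rather than a function from the project.

```python
SEPARATOR = "'''---"


def build_text(files, documentation: str = "") -> str:
    """Join (path, content) pairs into one text, documentation first.

    Each file section starts with the separator line, then the file path
    as its title, then the file content.
    """
    sections = []
    if documentation:
        sections.append(documentation.rstrip() + "\n")
    for path, content in files:
        sections.append(f"{SEPARATOR}\n{path}\n{content.rstrip()}\n")
    # A blank line between sections keeps the result easy to scan.
    return "\n".join(sections)
```

Because every section begins with the same marker, a reader (or a chatbot) can find a given file by searching for the separator followed by its path.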
- Frontend: React.js
- Backend: Python Flask
- Containerization: Docker
- GitHub API: PyGithub library
- Additional Python libraries: beautifulsoup4, requests, flask_cors, retry
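For readers unfamiliar with PyGithub, the following sketch shows one way a repository tree can be walked with it. It is not the project's implementation: `open_repo` and `iter_repo_files` are hypothetical names, `open_repo` assumes a PyGithub version that provides `Auth.Token`, and files that the API returns without inline content (very large files, for example) are not handled.

```python
def open_repo(full_name: str, token: str):
    """Return a PyGithub Repository for 'user/repo' (requires PyGithub)."""
    # Imported lazily so the walker below can be used without PyGithub installed.
    from github import Auth, Github

    return Github(auth=Auth.Token(token)).get_repo(full_name)


def iter_repo_files(repo, root: str = ""):
    """Yield (path, text) for every file under `root`.

    `repo` only needs a get_contents(path) method whose items expose
    .type, .path and .decoded_content, as PyGithub's ContentFile does.
    """
    pending = repo.get_contents(root)
    if not isinstance(pending, list):
        pending = [pending]  # get_contents returns a single item for a file path
    pending = list(pending)
    while pending:
        item = pending.pop(0)
        if item.type == "dir":
            # Queue the directory's children instead of recursing.
            pending.extend(repo.get_contents(item.path))
        else:
            yield item.path, item.decoded_content.decode("utf-8", errors="replace")
```

A typical call would be `for path, text in iter_repo_files(open_repo("user/repo", token)): ...`, where `token` is the value of `GITHUB_API_KEY`.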
- Add Docker to project
- Add Dark Mode
- Build web app for (https://repototext.com/)
- FIX: Broken file types: .ipynb
- FIX: FolderToText - fix so a user can pick one folder (currently only working when user selects individual files)
- Add in the ability to work with private repositories
- Add ability to store change history and update .txt to reflect working changes
- Add function to make sure .txt is current repo version
- Adjust UI for flow, including changing the textarea output width and adding file management and history UI