Turn an entire GitHub repo into a single organized .txt file to use with LLMs (GPT-4, Claude Opus, Gemini, etc.)

mixelpixx/RepoToText

 
 

RepoToText

RepoToText is a web app that scrapes a GitHub repository and converts its files into a single organized .txt file. Enter the URL of a GitHub repository and, optionally, a documentation URL. The app retrieves the contents of the repository, including all files and directories, fetches the documentation from the provided URL, and combines everything into one text file with the documentation at the top. The .txt file is saved in the /data folder and named with user + repo + timestamp info. You can then upload this file to an LLM chatbot (GPT-4, Claude Opus, etc.) and use it to interact with the entire GitHub repo.
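
As a rough illustration of the "user + repo + timestamp" naming described above, the sketch below parses a repository URL and builds an output path. The function names `parse_repo_url` and `output_path`, the underscore-separated layout, and the timestamp format are assumptions made for this example, not the app's actual code.

```python
from datetime import datetime
from pathlib import PurePosixPath
from urllib.parse import urlparse


def parse_repo_url(url: str) -> tuple[str, str]:
    """Return (owner, repo) from a URL like https://github.com/owner/repo."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"not a repository URL: {url!r}")
    owner, repo = parts[0], parts[1]
    # Drop a trailing ".git" in case a clone URL was pasted.
    if repo.endswith(".git"):
        repo = repo[:-4]
    return owner, repo


def output_path(url: str, now: datetime, data_dir: str = "data") -> str:
    """Build a hypothetical output file name: <owner>_<repo>_<timestamp>.txt."""
    owner, repo = parse_repo_url(url)
    stamp = now.strftime("%Y%m%d_%H%M%S")
    return str(PurePosixPath(data_dir) / f"{owner}_{repo}_{stamp}.txt")
```

Passing the timestamp in as an argument, rather than calling `datetime.now()` inside the function, keeps the naming logic easy to check in isolation.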

Running the Application with Docker

To run the application using Docker, follow these steps:

  1. Clone the repository. Create a .env file in the root folder.
  2. Set up the environment variable GITHUB_API_KEY in the .env file.
  3. Build the Docker images with docker compose build.
  4. Start the containers with docker compose up.
  5. Access the application (http://localhost:3000) in a web browser and enter the GitHub repository URL and documentation URL (if available).
  6. Choose all files or select specific file types.
  7. Click the "Submit" button to initiate the scraping process. The converted text will be displayed in the output area and saved in the /data folder.
  8. You can also click the "Copy Text" button to copy the generated text to the clipboard.

Running the Application Locally

To run the application locally without Docker, follow these steps:

  1. Clone the repository and navigate to the project directory.
  2. Create a .env file in the root folder and set up the environment variable GITHUB_API_KEY.
  3. Set up a Python virtual environment:
    python3 -m venv venv
    source venv/bin/activate  # On Windows use `venv\Scripts\activate`
  4. Install the required dependencies:
    pip install -r requirements.txt
  5. Start the Flask server:
    python RepoToText.py
  6. Navigate to the src directory and start the React frontend:
    npm install
    npm start
  7. Access the application at http://localhost:3000 in a web browser and enter the GitHub repository URL and documentation URL (if available).
  8. Choose all files or select specific file types.
  9. Click the "Submit" button to initiate the scraping process. The converted text will be displayed in the output area and saved in the /data folder.
  10. You can also click the "Copy Text" button to copy the generated text to the clipboard.
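
Both setups above depend on GITHUB_API_KEY being defined in the .env file. Before starting the server, a small standard-library check along these lines can confirm the key is readable. This is a sketch only: the helper names `load_env_file` and `github_api_key` are hypothetical, and the app itself may load the file differently (for example through a dotenv package or Docker Compose).

```python
import os


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    with open(path, encoding="utf-8") as fh:
        for raw in fh:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Strip surrounding whitespace and optional quotes from the value.
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def github_api_key(path: str = ".env") -> str:
    """Prefer the process environment, then fall back to the .env file."""
    key = os.environ.get("GITHUB_API_KEY")
    if not key and os.path.exists(path):
        key = load_env_file(path).get("GITHUB_API_KEY")
    if not key:
        raise RuntimeError("GITHUB_API_KEY is not set")
    return key
```

Calling `github_api_key()` from the project root either returns the token or raises a clear error, which is easier to diagnose than a failed GitHub request later on.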

FolderToText

FolderToText.py is a script that turns a local folder, or local files, into a .txt file in the same way RepoToText.py does. Choose your files with "Browse" (you can keep adding files by clicking "Browse" again). Once all of your files are selected, type the file extensions you want to include, separated by commas. Example: .py , .js , .md , .ts. You can also turn this filter off, in which case every file you selected is added to the .txt. Finally, enter the output file name and the output path. The file is written with your chosen file name plus a timestamp.
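
The comma-separated extension filter can be pictured roughly as follows. This is a minimal sketch, not the script's real code: `parse_extensions` and `select_files` are hypothetical names, and the case-insensitive matching is an assumption.

```python
from pathlib import Path


def parse_extensions(text: str) -> set[str]:
    """Turn '.py , .js , .md' into {'.py', '.js', '.md'}."""
    exts = set()
    for item in text.split(","):
        item = item.strip().lower()
        if item:
            # Accept entries typed with or without the leading dot.
            exts.add(item if item.startswith(".") else "." + item)
    return exts


def select_files(paths: list[str], ext_text: str, filter_on: bool = True) -> list[str]:
    """Keep paths whose suffix is listed; keep everything if the filter is off."""
    if not filter_on:
        return list(paths)
    exts = parse_extensions(ext_text)
    return [p for p in paths if Path(p).suffix.lower() in exts]
```

For example, `select_files(["a.py", "notes.txt"], ".py , .js")` keeps only `a.py`, while passing `filter_on=False` returns both files unchanged.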

Info

  • Creates a .txt with ('''---) separating each file from the repo.
  • Each file from the repo has a header after ('''---) with the file path as the title.
  • The .txt file is saved in the /data folder.
  • You can add a URL to a documentation page, and the documentation page will be added to the top of the .txt file (great to use for tech that came out after Sep 2021).
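
To make the layout in the list above concrete, here is a small sketch that assembles a text in that shape: documentation first, then each file introduced by the ('''---) separator and its path. The function name `build_text` and the exact blank-line spacing are assumptions for illustration, since the list above does not specify them.

```python
SEPARATOR = "'''---"


def build_text(files: dict[str, str], documentation: str = "") -> str:
    """Join documentation and files into one string (hypothetical layout)."""
    sections = []
    if documentation:
        # Documentation, when provided, goes at the top of the output.
        sections.append(documentation.rstrip())
    for path, content in files.items():
        # Separator line, then the file path as a header, then the file body.
        sections.append(f"{SEPARATOR}\n{path}\n\n{content.rstrip()}")
    return "\n\n".join(sections) + "\n"
```

A consistent separator like this lets a reader, or an LLM, tell where one file ends and the next begins.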

Tech Used

  • Frontend: React.js
  • Backend: Python Flask
  • Containerization: Docker
  • GitHub API: PyGithub library
  • Additional Python libraries: beautifulsoup4, requests, flask_cors, retry
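
Since the backend relies on PyGithub, a repository walk might look roughly like the sketch below. `collect_files` and `open_repo` are hypothetical helpers, not functions from this project; the sketch assumes a recent PyGithub release that provides `Auth.Token`, and it ignores edge cases such as very large or binary files.

```python
def collect_files(repo, extensions=None):
    """Walk a PyGithub-style repository object and return {path: text}."""
    files = {}
    pending = list(repo.get_contents(""))  # entries at the repository root
    while pending:
        item = pending.pop(0)
        if item.type == "dir":
            # Queue the directory's entries instead of recursing.
            pending.extend(repo.get_contents(item.path))
        elif extensions is None or item.path.endswith(tuple(extensions)):
            files[item.path] = item.decoded_content.decode("utf-8", errors="replace")
    return files


def open_repo(full_name: str, token: str):
    """Open 'owner/name' with a personal access token (needs PyGithub installed)."""
    # Imported lazily so collect_files can be tried without PyGithub.
    from github import Auth, Github

    return Github(auth=Auth.Token(token)).get_repo(full_name)
```

Usage would be along the lines of `collect_files(open_repo("mixelpixx/RepoToText", token), [".py", ".md"])`, where `token` is the value of GITHUB_API_KEY.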

TODO

  • Add Docker to project
  • Add Dark Mode
  • Build web app for (https://repototext.com/)
  • FIX: Broken file types: .ipynb
  • FIX: FolderToText - fix so a user can pick one folder (currently only working when user selects individual files)
  • Add in the ability to work with private repositories
  • Add ability to store change history and update .txt to reflect working changes
  • Add function to make sure .txt is current repo version
  • Adjust UI for flow, including change textarea output width, adding file management and history UI

Languages

  • Python 52.6%
  • JavaScript 27.6%
  • CSS 12.3%
  • HTML 7.5%