Dec 11, 2024 update: Added support for Claude and Llama3 models, along with improved image handling capabilities and a new GUI manager!
TypeGPT is a Python application that lets you invoke various AI models and LLMs from any text field in your operating system. Whether you're in a chat app, document, or code editor, you can interact with ChatGPT, Google Gemini, Claude, or Llama3 (via Ollama) using keyboard shortcuts.
- Global Accessibility: Invoke AI models from any text input field across your system.
- Multiple AI Models: Support for ChatGPT, Google Gemini, Claude, and Llama3.
- GUI Manager: Easy-to-use interface for managing API keys and program status.
- Keyboard Shortcuts: Use simple keyboard shortcuts to communicate with AI models.
- Clipboard Integration: Utilize the clipboard for larger text inputs and image pasting.
- Screenshot Capability: Capture and include screenshots in your queries.
- Support for image-based queries across the vision-capable models (O1 is text-only)
- Paste images directly using clipboard (Cmd/Ctrl+V)
- Screenshot capture for visual context (/see command)
- Automatic image format conversion and base64 encoding
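As a rough illustration of the base64 encoding step listed above, the sketch below turns raw image bytes into a data URL. The helper name `to_data_url` is hypothetical and this is not TypeGPT's actual code; the format-conversion step (for example with Pillow) is left out so the example stays standard-library only.

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URL (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# The 8-byte PNG file signature, used here as stand-in image data.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
url = to_data_url(PNG_SIGNATURE)
```

Whether a given API expects a full data URL or only the bare base64 string depends on the provider, so treat the prefix as an assumption to verify.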
- ChatGPT: Uses GPT-4 Turbo with vision capabilities
- Gemini: Uses gemini-1.0-pro-vision-latest for enhanced visual understanding
- Claude: Implements Claude 3.5 Sonnet with multimodal support
- Llama3: Local inference through Ollama with image processing
- O1: Access to OpenAI's O1 preview model (text-only)
- Line Mode: Standard text input mode activated with /a
- Screenshot Mode: Visual input mode activated with /see
- Clipboard Support: Paste text or images with Cmd/Ctrl+V
- Send Query: Cmd+Shift+Enter (Mac) or Ctrl+Shift+Enter (Windows/Linux)
- Cancel Input: Esc
- Paste Content: Cmd/Ctrl+V
- Real-time program status monitoring
- Easy API key management
- Start/Stop controls for TypeGPT
- Automatic key file creation and management
- Visual feedback for program state
- Python 3.x
- Tkinter support (usually included with Python)
- Administrative privileges for keyboard monitoring
- Python 3.x
- Accessibility permissions (required for keyboard monitoring)
- Tkinter support
- Administrative privileges
- Automatic permission checking on startup
- Graceful error handling for API failures
- Clear error messages for missing API keys
- Automatic restart capability
- keys.txt: Stores API keys in the format:
OPENAI_API_KEY=your_key_here
GEMINI_API_KEY=your_key_here
ANTHROPIC_API_KEY=your_key_here
- system_prompt.txt: Customizes AI behavior (optional)
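The NAME=value layout of keys.txt is simple to read back. The following is a minimal sketch of one way to parse it, not the parser TypeGPT ships with; `parse_keys` is a hypothetical name, and skipping blank lines and `#` comments is an assumption.

```python
def parse_keys(text: str) -> dict[str, str]:
    """Parse NAME=value lines into a dict (hypothetical helper)."""
    keys = {}
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines, '#' comments, and lines without '=' are ignored.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        keys[name.strip()] = value.strip()
    return keys


sample = "OPENAI_API_KEY=abc\nGEMINI_API_KEY=def\n\nANTHROPIC_API_KEY=ghi\n"
keys = parse_keys(sample)
```

Splitting on the first `=` only means a key value that itself contains `=` is kept intact.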
- Minimal CPU usage when idle
- Efficient clipboard handling
- Optimized image processing
- Local processing for screenshots and keyboard monitoring
Before you can run the application, ensure you have the following installed:
- Python 3.x
- Required packages (install via pip):
pip install pynput requests pyperclip google-generativeai anthropic Pillow
You also need to have API keys for the AI services you plan to use. You can get yours at:
- ChatGPT: https://openai.com/api/
- Google Gemini: https://ai.google.dev/aistudio
- Claude: https://www.anthropic.com/
- Llama3: Ensure you have Ollama installed and running locally (http://localhost:11434)
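To check that a local Ollama instance responds, a request can be sent to its generate endpoint. This sketch uses only the standard library rather than the `requests` package TypeGPT installs; the function names and the `llama3` model tag are assumptions, and the code is not taken from TypeGPT.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"


def build_ollama_request(prompt, model="llama3", images=None):
    """Build a non-streaming POST request for Ollama (hypothetical helper)."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if images:
        # Ollama accepts images as a list of base64-encoded strings.
        payload["images"] = list(images)
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_ollama(prompt, model="llama3", timeout=60):
    """Send the prompt and return the generated text; needs Ollama running."""
    request = build_ollama_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["response"]
```

Calling `ask_ollama("Say hello")` raises `urllib.error.URLError` if nothing is listening on port 11434, which is a quick way to tell that Ollama is not running.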
- Clone the repository:
git clone https://github.com/olyaiy/TypeGPT.git
cd TypeGPT
- Run the GUI manager to set up your API keys:
python typegpt_gui.py
- Launch the GUI manager:
python typegpt_gui.py
- Enter your API keys in the "API Keys" tab
- Use the "Program Status" tab to start/stop TypeGPT
- Alternatively, start TypeGPT directly from the command line:
python TypeGPT.py
Use the following commands in any text field:
- /a: Start listening for input (line mode)
- /see: Capture screenshot for visual queries
- /stop: Stop listening
- /quit: Quit the program
- /restart: Restart the program
- /chatgpt: Switch to ChatGPT model
- /gemini: Switch to Google Gemini model
- /claude: Switch to Claude model
- /llama3: Switch to Llama3 model
- /o1: Switch to OpenAI's O1 model
- /check: Check which model is currently active
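The commands above could be recognized by checking whether the most recently typed text ends with one of them. The sketch below is an illustration under that assumption, not TypeGPT's actual dispatch logic; `find_command`, `Session`, and the default model are hypothetical, and /see, /quit, and /restart are recognized but left unhandled.

```python
MODELS = ("chatgpt", "gemini", "claude", "llama3", "o1")
CONTROL = ("/a", "/see", "/stop", "/quit", "/restart", "/check")
ALL_COMMANDS = CONTROL + tuple("/" + name for name in MODELS)


def find_command(typed: str):
    """Return the command the typed text ends with, or None."""
    for command in ALL_COMMANDS:
        if typed.endswith(command):
            return command
    return None


class Session:
    """Tracks the active model and whether input is being captured."""

    def __init__(self, model="chatgpt"):  # default model is an assumption
        self.model = model
        self.listening = False

    def apply(self, command):
        if command[1:] in MODELS:
            self.model = command[1:]
        elif command == "/a":
            self.listening = True
        elif command == "/stop":
            self.listening = False
        elif command == "/check":
            return f"Active model: {self.model}"
        return None


session = Session()
session.apply(find_command("switching now /claude"))
status = session.apply("/check")
```

Matching on the end of the buffer keeps the check cheap enough to run on every keystroke, though it would also fire on ordinary text that happens to end in a command.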
- Type /a to start input mode
- Type your prompt
- Press Cmd+Shift+Enter (Mac) or Ctrl+Shift+Enter (Windows/Linux) to send
- Wait for the response to be typed out
- API Keys: Copy keys.template.txt to keys.txt and update it with your API keys. The keys.txt file is gitignored for security.
- System Prompt: Modify the system_prompt.txt file to customize the behavior and responses of the AI to suit your needs.
- AI Model Versions: You can change the versions of the AI models in the api_calls.py file. Currently, the defaults are:
- ChatGPT: gpt-4-turbo
- Gemini: gemini-1.0-pro-vision-latest
- Claude: claude-3-5-sonnet-20240620
- Llama3: Uses the local Ollama instance
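One way to keep such defaults overridable is a small lookup table. This sketch only mirrors the model identifiers listed above; it does not reproduce the contents of api_calls.py, and `DEFAULT_MODELS` and `resolve_model` are hypothetical names.

```python
DEFAULT_MODELS = {
    "chatgpt": "gpt-4-turbo",
    "gemini": "gemini-1.0-pro-vision-latest",
    "claude": "claude-3-5-sonnet-20240620",
    # Llama3 runs through the local Ollama instance; no identifier is given
    # above for it, so it is deliberately left out of this table.
}


def resolve_model(provider, overrides=None):
    """Return the model identifier for a provider, honoring overrides."""
    merged = {**DEFAULT_MODELS, **(overrides or {})}
    try:
        return merged[provider]
    except KeyError:
        raise ValueError(f"No model configured for {provider!r}") from None


claude_model = resolve_model("claude")
custom_model = resolve_model("chatgpt", {"chatgpt": "my-custom-model"})
```

Passing an `overrides` dict lets a single provider be changed without editing the table itself.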
Contributions are very welcome! Please fork the repository and submit pull requests with your proposed changes.
We plan on adding support for more AI models and improving the user interface. If you have any further ideas, we'd love to hear them!
Distributed under the Apache 2.0 License. See LICENSE for more information.
TypeGPT requires accessibility permissions to monitor keyboard input:
- When you first run the application, you'll be prompted to grant accessibility permissions
- Open System Preferences/Settings
- Navigate to Security & Privacy > Privacy > Accessibility
- Click the lock icon to make changes
- Add and enable your Terminal application or Python.app
- Restart TypeGPT