CodeMerge is a command-line interface (CLI) tool built with Rust for merging and analyzing codebases. It leverages Rust for performance-critical tasks like file I/O, token counting, and caching, while also incorporating a modern CLI design to offer flexible options. The tool allows you to merge files, display file tree structures, and calculate token counts—all while respecting .gitignore files and offering various filtering and caching options.
-
File Merging: Combines the content of multiple files into a single output, configurable via globs and ignores.
-
File Tree Generation: Visualizes the project's file structure with token counts for each file and directory.
-
Token Counting: Calculates the number of tokens in files using the
gpt-tokenizer
library, useful for estimating costs associated with large language models. -
.gitignore
Support: Respects .gitignore files for comprehensive filtering, handling multiple .gitignore files within nested directories. -
Flexible Filtering: Supports glob patterns, budget limits (minimum and maximum token counts), and the ability to ignore specific file types.
-
Output Formatting: Presents results in a clear, human-readable format.
-
Caching:
- Cache Providers: Choose among three caching strategies – SQLite, RocksDB, or disable caching altogether using the "none" provider.
- Configuration Options: Easily specify a custom cache directory using the
--cache_dir
flag, disable caching via--no_cache
, or clear the cache with--clear_cache
. - Usage Integration: Cached file results optimize repeated runs by avoiding redundant file reads.
-
Performance: Uses Rust for high-speed file processing, token counting, and caching to enhance overall efficiency.
To install CodeMerge on Linux or macOS, run:
curl -sL https://raw.githubusercontent.com/gelleson/codemerge/main/scripts/install.sh | sudo bash
After installation, use the codemerge
command directly.
- Clone the Repository:
git clone https://github.com/gelleson/codemerge.git cd codemerge
-
Build and Install from Source:
-
Install Rust: Ensure you have Rust installed. You can install Rust using
rustup
:curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
-
Build and Install:
cargo build --release sudo mv target/release/codemerge /usr/local/bin
This will build the project and install the CodeMerge binary.
-
CodeMerge provides several commands along with new caching capabilities. Run codemerge --help
to see the full list of options.
Merge the contents of multiple files into one output:
codemerge merge [options]
Options include:
<path>
: The directory or file paths to merge.--ignores, -i
: Glob patterns for files or directories to ignore.--filters, -f
: Glob patterns for files or directories to include (defaults to**
).--format
: Output format (default:text
).--output
: Path to save the merged output; if not provided, output is sent to stdout.--max-budget, -hb
: Maximum token budget (default: 10,000).--min-budget, -lb
: Minimum token budget (default: 0).--limit-by-high-budget, -lhb
: Apply maximum budget filtering.--limit-by-low-budget, -llb
: Apply minimum budget filtering.- Caching Options:
--cache_provider
: Choose a cache provider (sqlite, rocksdb, or none). (Default issqlite
.)--cache_dir
: Specify a custom directory for cache storage.--no_cache
: Disable caching completely.--clear_cache
: Clear cache before processing.
Example:
codemerge merge ./src --filters "*.ts" --output merged_code.txt
Display a visual representation of the file tree along with token counts:
codemerge tree [options]
Options include similar filtering and budgeting flags as in the merge command.
Example:
codemerge tree
Count tokens in files, which is useful when estimating costs for large language model interactions:
codemerge tokens [options]
Options include:
<path>
: The target directory.--ignores, -i
: Glob patterns for files/directories to ignore.--filters, -f
: Glob patterns for files/directories to include (default:**
).--total, -n
: Maximum number of files to display (default: all).- Budget and caching options as seen in the merge command.
Manage cache operations independently using a dedicated subcommand. This allows you to inspect or clear the cache without running a file merge or tree operation.
Usage:
codemerge cache [--provider ] [--dir <cache_directory>]
Where <operation>
can be:
clear
: Deletes all cache entries.info
: Displays cache provider and cache directory information.
Example:
codemerge cache clear --provider sqlite codemerge cache info --dir /custom/cache/dir
CodeMerge now integrates a caching layer to speed up file processing by avoiding repeated disk I/O for unchanged files. The caching module supports:
- SQLite Cache: Stores cached file data in a local SQLite database (
cache.db
in the cache directory). - RocksDB Cache: Uses RocksDB for a high-performance, embedded key-value store.
- None: Disables caching altogether, so every run reads all files from disk.
Cache configuration options include specifying a custom cache directory, disabling caching entirely, or clearing the cache before executing a command. Cache choice can be controlled via the --cache_provider
flag.
The initial version of CodeMerge was developed in Go (available on the /go
branch) but is no longer maintained. The current version is a complete rewrite in Rust that offers improved performance, safety, and additional features like caching.
Contributions are welcome! Please open an issue or submit a pull request.