genmd
- Combined Source Code Markdown Generator
genmd
is a versatile Bash script designed to generate comprehensive Markdown documentation from your source code files. It scans directories, includes or excludes files based on specified patterns, and consolidates the content into a structured Markdown file. Additionally, it can generate a visual directory structure and manage configurations through dedicated files.
One of the main purposes of this is to create a markdown file containing the directory hierarchy and source code for anything which is plain text. Handy for taking related files you are working on and grouping them together in a single markdown file which has your files wrapped in markdown tags indicating their name and location in the project followed by a block quote containing the type of file and contents. This makes it easier for a Large Language Model (LLM) to consume and work with, and you don't have to copy and paste your files.
By saving the configuration and settings of any particular run of the command, you can re-run it again to capture the same files for review as they change. This is especially handy for feeding models with large context windows your project organization and code.
-
Clone the Repository:
git clone https://github.com/unixwzrd/venvutil.git
-
Navigate to the Script Directory:
cd genmd
-
Make the Script Executable:
chmod +x genmd
-
Move the Script to a Directory in Your PATH (Optional):
sudo mv genmd /usr/local/bin/
You may want to find a location and just make symlinks to the utilities in the bin directory since many of the scripts depend on some of the included scripts in the shinclude
directory. Just create the symlinks in your favorite bin directory and the scripts will find the proper included scripts. It also depends on the filetree
script, also in the bin directory of this repository.
genmd [options]
genmd
scans the specified directory (default is the current directory) and generates a Markdown file containing the source code and directory structure based on the provided options.
-
-d [level]
: Set Debug Level- Description: Sets the debug level for the script's output.
- Levels:
1
: DEBUG92
: DEBUG83
: DEBUG74
: DEBUG65
: DEBUG56
: DEBUG47
: DEBUG38
: DEBUG29
: DEBUG110
: DEBUG20
: INFO30
: WARNING40
: ERROR50
: CRITICAL99
: SILENT
-
-h
: Show Help- Description: Displays the help message with usage instructions.
-
-e [patterns]
: Exclude Directories- Description: Excludes directories matching the given patterns.
- Separator: Multiple patterns can be separated by
|
.
-
-f [patterns]
: Exclude Files- Description: Excludes files matching the given patterns.
- Separators: Multiple patterns can be separated by
|
or spaces.
-
-p [patterns]
: Exclude Additional Patterns- Description: Excludes additional patterns matching the given strings.
- Separators: Multiple patterns can be separated by
|
or spaces.
-
-i [patterns]
: Include Files- Description: Includes only files matching the given patterns.
- Separators: Multiple patterns can be separated by
|
or spaces.
-
-o [filename]
: Output File- Description: Specifies the output Markdown file.
- Default:
combined_source.md
- Note: If specified, a corresponding
.grc
configuration file will be used or created in theGENMD_BASE/utils/etc
directory.
-
-L **Follow Symlinks**
- Description: Follow symlinks when scanning directories.
-
-c [filename]
: Load Configuration- Description: Loads configuration from a
.grc
file. - Note: If the filename does not end with
.grc
,.grc
will be appended automatically. The configuration file should reside in theGENMD_BASE/utils/etc
directory.
- Description: Loads configuration from a
-
-g
: Ignore .gitignore- Description: By default,
genmd
uses patterns from.gitignore
to exclude files and directories. Use this flag to disable this behavior.
- Description: By default,
-
-C
: No Configuration- Description: Run without sourcing the
.grc
configuration file. This allows you to executegenmd
using only the provided command-line options and.gitignore
patterns (if enabled).
- Description: Run without sourcing the
-
-s [modes]
: Show Settings- Description: Displays or saves settings based on the specified modes.
- Modes:
info
,cfg
,md
,env
,all
-
-n
: Dry Run- Description: Simulates the actions without writing to any files; outputs the files that would be processed.
-
-b
: Remove Lines with Whitespace Only- Description: Removes all lines containing only whitespace from the Markdown file.
-
-l
: Add Line Numbers- Description: Adds line numbers to each file in the Markdown file.
-
-z [tool]
: Compress Output- Description: Compresses the final output using the specified compression utility (e.g., gzip, xz, bzip2).
- Note: The compressed file will be named as
output.md.<extension>
where extension is the default extension for the compression tool.
All short options have corresponding long options with double dashes (--
):
--debug [level]
--help
--exclude [patterns]
--file [patterns]
--pattern [patterns]
--include [patterns]
--output [filename]
--config [filename]
--no-gitignore
--follow-links
--no-config
--settings [modes]
--dry-run
--remove-blanks
--line-numbers
--compress [tool]
-
Basic Usage with Exclusions and Inclusions:
genmd -e "node_modules|dist" -f "*.log *.tmp" -i "*css *.js" -s "info,md" -o project_overview.md
-
Using Long Options and Dry Run:
genmd --exclude "node_modules|dist" --file "*.log *.tmp" --include "info" --dry-run
-
Setting Multiple Modes and Debug Level:
genmd -s info,md -d 25
-
Generate Markdown with Default Settings (Using .grc and .gitignore):
genmd -d 4 -e "utils _includes _data _posts js collaborates projects" \ -f "*impression* professional.md *.png" \ -i "css liquid" \ -s all \ -o my_test_file
-
Generate Markdown Without Sourcing .grc Configuration (Using Only .gitignore and Command-Line Options):
genmd -d 4 -e "utils _includes _data _posts js collaborates projects" \ -f "*impression* professional.md *.png" \ -i "css liquid" \ -s all \ -o my_test_file \ --no-config
-
Generate Markdown Without Using .gitignore (Using Only .grc Configuration and Command-Line Options):
genmd -d 4 -e "utils _includes _data _posts js collaborates projects" \ -f "*impression* professional.md *.png" \ -i "css liquid" \ -s all \ -o my_test_file \ --no-gitignore
-
Generate Markdown Using Only Command-Line Options (Ignoring Both .grc and .gitignore):
genmd -d 4 -e "utils _includes _data _posts js collaborates projects" \ -f "*impression* professional.md *.png" \ -i "css liquid" \ -s all \ -o my_test_file \ --no-config \ --no-gitignore
genmd
utilizes several environment variables to set default patterns and directories:
-
GENMD_BASE
: Base Directory- Description: The base directory to search for files.
- Default: Current directory (
.
)
-
GENMD_DIR_EXCLUDES
: Default Directory Exclusions- Description: A default list of directory patterns to exclude from the generated Markdown.
- Default Value:
tmp .git log __pycache__ .vscode
-
GENMD_FILE_EXCLUDES
: Default File Exclusions- Description: A default list of file patterns to exclude from the generated Markdown.
- Default Value:
*.ico *.svg *.png *.pdf *.jpg *.htaccess *.webp *.jekyll .DS_Store *.JPG
-
GENMD_PATTERN_EXCLUDES
: Default Additional Pattern Exclusions- Description: A default list of additional patterns to exclude from the generated Markdown.
- Default Value: (Empty)
-
GENMD_FILE_INCLUDES
: Default File Inclusions- Description: A default list of file patterns to include in the generated Markdown.
- Default Value: (Empty)
-
PAGER
: Pager for Output- Description: The pager to use for output.
- Default Value:
less -R
genmd
supports loading configurations from .grc
files, allowing you to save and reuse your settings:
-
Loading a Configuration File:
genmd -c myconfig
- If
myconfig
does not end with.grc
,.grc
will be appended automatically. - The configuration file should reside in the
GENMD_BASE/utils/etc
directory.
- If
-
Saving Configuration:
- Use the
cfg
mode with the-s
option to save current settings to a configuration file, output will be written toSTDOUT
. - When using the
-o
option to specify an output file, a corresponding.grc
configuration file will be created in theGENMD_BASE/utils/etc
directory, matching the output filename.
- Use the
- create_date: Stores the creation date of the configuration, formatted as
YYYY-MM-DD HH:MM:SS
. This variable is automatically generated and saved in the configuration file.
With the integration of .gitignore
patterns by default, genmd
simplifies exclusion patterns, leveraging existing Git configurations to reduce redundancy in your .grc
files.
To integrate your project's .gitignore
patterns into the markdown generation process, use the --no-gitignore
flag. This allows genmd
to automatically exclude files and directories as specified in .gitignore
, reducing the need to duplicate exclusion patterns in your .grc
files.
-
Example Command:
genmd -d 4 -e "utils _includes _data _posts js collaborates projects" -f "*impression* professional.md *.png" -i "css liquid" -s all -o my_test_file --no-gitignore
The -s
or --settings
option allows you to manage how settings are displayed or saved:
-
Modes:
info
: Displays the current settings toSTDERR
.cfg
: Saves the current settings to a.grc
configuration file.md
: Appends the current settings to the output Markdown file.env
: Outputs the current settings as environment variables to a separate script (genmd_env.sh
).all
: Performsinfo
,md
, andenv
actions simultaneously.
-
Example:
genmd -s all
- Enhanced logging capabilities to allow adjustable verbosity levels, providing more control over the output detail.
The -n
or --dry-run
option allows you to simulate the actions of the script without making any changes. It will display the files that would be processed and the actions that would be taken.
-
Example:
genmd --dry-run -e "node_modules|dist" --file "*.log *.tmp" --include "info"
-
Verbose Output (
-v
or--verbose
):-
Description: Enables detailed output messages to help you understand what the script is doing.
-
Usage:
genmd -v
-
-
Debug Levels (
-d
or--debug
):- Description: Sets the level of debugging information.
- Default:
30
- Levels:
1
: DEBUG92
: DEBUG83
: DEBUG74
: DEBUG65
: DEBUG56
: DEBUG47
: DEBUG38
: DEBUG29
: DEBUG110
: DEBUG20
: INFO30
: WARNING40
: ERROR50
: CRITICAL99
: SILENT
-
Usage:
genmd -d 29
genmd
organizes its files and directories within the GENMD_BASE
directory. The default structure includes:
-
Configuration Files:
utils/etc/genmd.grc
: Default configuration file.utils/etc
: Default directory for configuration files.
-
Output Files:
utils/output
: Default directory for output Markdown files.
-
Custom Configurations:
- When using the
-o
option to specify an output file, a corresponding.grc
configuration file will be created or used in theGENMD_BASE/utils/etc
directory, matching the output filename.
- When using the
-
Example .grc Configuration File:
This will preserve the existing settings and merged settings from the command line. It will be written to the utils/etc directory
export GENMD_DIR_EXCLUDES=".git utils _includes _site conf docs _data _posts js collaborates projects"
export GENMD_FILE_EXCLUDES="*impression* professional.md *.png"
export GENMD_FILE_INCLUDES=".sh genmd filettree"
export GENMD_PATTERN_EXCLUDES=""
export GENMD_BASE="."
output_filename="./utils/output/yet-another-test.md"
settings_modes="info md cfg env"
dry_run=false
debug_level=40
verbose=false
use_gitignore=true
remove_blanks=false
add_line_numbers=false
compress=false
compression_tool=gzip
create_date="2023-02-20 14:30:00"
genmd
requires the following dependencies to function correctly:
-
Bash: Version 4.0 or higher.
-
filetree
Command:- Description: Used to generate a visual representation of the project's directory structure.
- Location: Included in this repository.
- Repository: https://github.com/unixwzrd/venvutil
- Requires the Rich Python library to be installed.
-
Included Scripts:
- errno.sh and util_funcs.sh are required scripts included in this repository.
-
Other Utilities:
- Ensure that all required utilities used within the script are installed and accessible in your system's PATH.
Michael Sullivan
Email: [email protected]
Website: https://unixwzrd.ai/
GitHub: https://github.com/unixwzrd
This project is licensed under the Apache License, Version 2.0.