gitoxide

gix is a command-line interface (CLI) to access git repositories in various ways best described as low-level for use by experts or those validating functionality in real-world scenarios. Performance and efficiency are staples of the implementation.

ein is reserved for one-off tools that are useful to many, and will one day implement a truly unique workflow with the potential to become the preferred way to interact with git repositories.

Please note that all functionality comes from the gitoxide-core library, which mirrors these capabilities and itself relies on all gix-* crates. It's not meant for consumption; for application development, please use gix.

the ein program - convenient and for humans
- init - initialize a new non-bare repository with a main branch
- clone - initialize a local copy of a remote repository
- tools
  - organize - find all git repositories and place them in directories according to their remote paths
  - find - find all git repositories in a given directory - useful for tools like skim
  - estimate-hours - estimate the time invested into a repository by evaluating commit dates.
    - Based on the git-hours algorithm.
    - See the discussion for some performance data.
the gix program (plumbing) - lower level commands for use during development
- As its main purpose is to help running the latest improvements in the real world, it's self-documenting without duplicating its features here. Use gix --help to start discovery.

gix

The top-level crate that acts as hub to all functionality provided by the gix-* plumbing crates.

utilities for applications to make long running operations interruptible gracefully and to support timeouts in servers.
handle core.repositoryFormatVersion and extensions
support for unicode-precomposition of command-line arguments (needs explicit use in parent application)
strict object creation (validate objects referenced by newly created objects exist)
strict hash verification (validate that objects actually have the hashes they claim to have)
Repository
- discovery
  - option to not cross file systems (default)
  - handle git-common-dir
  - support for GIT_CEILING_DIRECTORIES environment variable
  - handle other non-discovery modes and provide control over environment variable usage required in applications
- rev-parse
  - handle relative paths as relative to working directory
  - handle upstream and push resolution.
- rev-walk
  - include tips
  - exclude commits
- instantiation
- access to refs and objects
- create a pathspec-search from a set of strings
  - allow constructing Pathspecs from data structures instead of requiring them to be passed as strings.
- credentials
  - run git credential directly
  - use credential helper configuration to obtain credentials with gix_credentials::helper::Cascade
- traverse
  - commit graphs
  - make git-notes accessible
  - tree entries
- diffs/changes
  - tree with other tree
    - respect case-sensitivity of host filesystem.
    - a way to access various diff related settings or use them
    - respect diff.*.textconv, diff.*.cachetextconv and external diff viewers with diff.*.command, along with support for reading diff gitattributes.
    - rewrite tracking
      - deviation - git keeps up to four candidates whereas we use the first-found candidate that matches the similarity percentage. This can lead to different sources being found. As such, we also don't consider the filename at all.
      - handle binary files correctly, and apply filters for that matter
      - computation limit with observable reduction of precision when it is hit, for copies and renames separately
      - by identity
        - renames (sym-links are only ever compared by identity)
        - copies
      - by similarity - similarity factor controllable separately from renames
        - renames
        - copies
      - 'find-copies-harder' - find copies with the source being the entire tree.
  - tree or index with working tree
    - rename tracking
    - submodule status (recursive)
  - diffs between modified blobs with various algorithms
  - tree with index (via index-from-tree and index)
    - rename tracking
    - submodule status (recursive)
- initialize
  - Proper configuration depending on platform (e.g. ignorecase, filemode, …)
- Id
  - short hashes with detection of ambiguity.
- Commit
  - git describe like functionality, with optional commit-graph acceleration
  - create new commit from tree
- Objects
  - lookup
  - peel to object kind
  - create signed commits and tags
  - trees
    - lookup path
- references
  - peel to end
  - ref-log access
  - remote name
  - find remote itself
    - respect branch.<name>.merge in the returned remote.
- remotes
  - clone
    - shallow
      - include-tags when shallow is used (needs separate fetch)
      - prune non-existing shallow commits
    - bundles
  - fetch
    - shallow (remains shallow, options to adjust shallow boundary)
    - a way to auto-explode small packs to keep them from piling up
    - 'ref-in-want'
    - 'wanted-ref'
    - standard negotiation algorithms consecutive, skipping and noop.
  - push
  - ls-refs
  - ls-refs with ref-spec filter
  - list, find by name
  - create in memory
  - groups
  - remote and branch files
- execute hooks
- refs
  - run transaction hooks and handle special repository states like quarantine
  - support for different backends like files and reftable
- main or linked worktree
  - add files with .gitignore handling
  - checkout with conversions like clean + smudge as in .gitattributes
  - diff index with working tree
  - sparse checkout support
  - read per-worktree config if extensions.worktreeConfig is enabled.
  - index
    - tree from index
    - index from tree
- worktrees
  - open a repository with worktrees
    - read locked state
    - obtain 'prunable' information
  - proper handling of worktree related refs
  - create a byte stream and create archives for such a stream, including worktree filters and conversions
  - create, move, remove, and repair
  - access exclude information
  - access attribute information
  - respect core.worktree configuration
    - deviation
      - The delicate interplay between GIT_COMMON_DIR and GIT_WORK_TREE isn't implemented.
- config
  - read the primitive types boolean, integer, string
  - read and interpolate trusted paths
  - low-level API for more elaborate access to all details of git-config files
  - a way to make changes to individual configuration files in memory
  - write configuration back
  - auto-refresh configuration values after they changed on disk
  - facilities to apply the url-match algorithm and to normalize urls before comparison.
- mailmap
- object replacements (git replace)
- read git configuration
- merging
- stashing
- Use Commit Graph to speed up certain queries
- subtree
- interactive rebase status/manipulation
- submodules
  - handle 'old' form for reading and detect old form
  - list
  - edit
API documentation
- Some examples

gix-actor

read and write a signature that uniquely identifies an actor within a git repository
a way to parse name <email> tuples (instead of full signatures) to facilitate parsing commit trailers.
a way to write only actors, useful for commit trailers.

gix-hash

types to represent hash digests to identify git objects.
used to abstract over different kinds of hashes, like SHA1 and the upcoming SHA256
API documentation
- Some examples

gix-chunk

decode the chunk file table of contents and provide convenient API
write the table of contents
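
As a rough illustration of what decoding and writing such a table of contents involves, here is a minimal standalone sketch. It assumes the layout described in Git's chunk-format documentation: rows of 12 bytes (a 4-byte chunk id followed by an 8-byte big-endian offset), terminated by a row whose id is all zeros and whose offset marks the end of the chunk data. The names `TocEntry`, `decode_toc` and `encode_toc` are made up for this example and are not the gix-chunk API.

```rust
/// One row of a chunk-file table of contents (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub struct TocEntry {
    pub id: [u8; 4],
    pub offset: u64,
}

/// Decode 12-byte rows until the all-zero terminator id is seen.
/// Returns the entries plus the terminator offset (the end of the last chunk),
/// or `None` if the terminator is missing or offsets go backwards.
pub fn decode_toc(data: &[u8]) -> Option<(Vec<TocEntry>, u64)> {
    let mut entries: Vec<TocEntry> = Vec::new();
    for row in data.chunks_exact(12) {
        let mut id = [0u8; 4];
        id.copy_from_slice(&row[..4]);
        let mut raw_offset = [0u8; 8];
        raw_offset.copy_from_slice(&row[4..]);
        let offset = u64::from_be_bytes(raw_offset);
        if entries.last().map_or(false, |e| e.offset > offset) {
            return None;
        }
        if id == [0u8; 4] {
            return Some((entries, offset));
        }
        entries.push(TocEntry { id, offset });
    }
    None
}

/// Write the rows followed by the terminating all-zero id.
pub fn encode_toc(entries: &[TocEntry], end_offset: u64) -> Vec<u8> {
    let mut out = Vec::with_capacity((entries.len() + 1) * 12);
    for entry in entries {
        out.extend_from_slice(&entry.id);
        out.extend_from_slice(&entry.offset.to_be_bytes());
    }
    out.extend_from_slice(&[0u8; 4]);
    out.extend_from_slice(&end_offset.to_be_bytes());
    out
}
```

Encoding two entries and decoding the result yields the same entries and end offset, while a buffer cut off before the terminator row is rejected.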

gix-hashtable

hashmap
hashset

gix-utils

filesystem
- probe capabilities
- symlink creation and removal
- file snapshots

gix-fs

probe capabilities
symlink creation and removal
file snapshots
stack abstraction

gix-object

decode (zero-copy) borrowed objects
- commit
  - parse trailers
- tree
encode owned objects
- commit
- tree
- tag
  - name validation
transform borrowed to owned objects
API documentation
- Some examples

gix-pack

packs
- traverse pack index
- 'object' abstraction
  - decode (zero copy)
  - verify checksum
- simple and fast pack traversal
  - fast pack traversal works with ref-deltas
- decode
  - full objects
  - deltified objects
- decode
  - decode a pack from Read input
    - Add support for zlib-ng for 20% faster decompression performance
    - Read to Iterator of entries
      - read as is, verify hash, and restore partial packs
  - create index from pack alone (much faster than git)
    - resolve 'thin' packs
- encode
  - Add support for zlib-ng for 2.5x compression performance
  - objects to entries iterator
    - input objects as-is
    - pack only changed objects as derived from input
    - base object compression
    - delta compression
      - respect the delta=false attribute
    - create 'thin' pack, i.e. deltas that are based on objects the other side has.
    - parallel implementation that scales perfectly
  - entries to pack data iterator
  - write index along with the new pack
- verify pack with statistics
  - brute force - less memory
  - indexed - optimal speed, but more memory
- advanced
  - Multi-Pack index file (MIDX)
    - read
    - write
    - verify
  - 'bitmap' file
  - special handling for networked packs
  - detect and retry packed object reading
API documentation
- Some examples

gix-odb

loose object store
- traverse
- read
  - into memory
  - streaming
  - verify checksum
- streaming write for blobs
- buffer write for small in-memory objects/non-blobs to bring IO down to open-read-close == 3 syscalls
- read object header (size + kind) without full decompression
dynamic store
- auto-refresh of on-disk state
- handles alternates
- multi-pack indices
- perfect scaling with cores
- support for pack caches, object caches and MRU for best per-thread performance.
- prefix/short-id lookup, with optional listing of ambiguous objects.
- object replacements (git replace)
- high-speed packed object traversal without wasted CPU time
  - user defined filters
- read object header (size + kind) without full decompression
sink
- write objects and obtain id
alternates
- resolve links between object databases
- safe with cycles and recursive configurations
- multi-line with comments and quotes
promisor
- It's vague, but these seem to be like index files that allow fetching objects from a server on demand.
API documentation
- Some examples
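
To make the prefix/short-id lookup with ambiguity listing more concrete, here is a minimal sketch over a sorted list of hex ids. It is an illustration only: the `Lookup` enum and `lookup_prefix` function are hypothetical names, and a real object database would search pack indices and loose objects rather than an in-memory slice of strings.

```rust
/// Outcome of a short-id lookup (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub enum Lookup<'a> {
    NotFound,
    Unique(&'a str),
    Ambiguous(Vec<&'a str>),
}

/// `sorted_ids` must hold lowercase hex ids in ascending order.
pub fn lookup_prefix<'a>(sorted_ids: &[&'a str], prefix: &str) -> Lookup<'a> {
    // Binary search for the first id that is not smaller than the prefix.
    let start = sorted_ids.partition_point(|id| *id < prefix);
    // All ids sharing the prefix are adjacent in a sorted list.
    let matches: Vec<&'a str> = sorted_ids[start..]
        .iter()
        .copied()
        .take_while(|id| id.starts_with(prefix))
        .collect();
    match matches.len() {
        0 => Lookup::NotFound,
        1 => Lookup::Unique(matches[0]),
        _ => Lookup::Ambiguous(matches),
    }
}
```

Returning every candidate in the ambiguous case is what lets a caller print the list of objects a too-short prefix could refer to.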

gix-diff

Check out the performance discussion as well.

tree
- changes needed to obtain other tree
patches
- There are various ways to generate a patch from two blobs.
- text
- binary
lines
- Simple line-by-line diffs powered by the imara-diff crate.
generic rename tracker to find renames and copies
- find by exact match
- find by similarity check
- heuristics to find best candidate
- find by basename to help detecting simple moves
blob
- a choice of to-worktree, to-git and to-worktree-if-needed conversions
- textconv filters
- special handling of files beyond the big-file threshold.
- detection of binary files by looking at header (first 8k bytes)
- caching of diff-able data
- prepare invocation of external diff program
  - pass meta-info
working with hunks of data
API documentation
- Examples
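
The "find by similarity check" idea can be sketched in a few lines. The example below is a deliberately simplified stand-in and not the algorithm gix-diff uses: it scores two texts by the share of lines they have in common and, in line with the first-found-candidate behaviour noted in the gix section, returns the first deleted blob that reaches the threshold. `similarity` and `find_rename_source` are hypothetical names.

```rust
use std::collections::HashMap;

/// Share of lines (0.0 to 1.0) that two texts have in common, counted as a multiset.
pub fn similarity(a: &str, b: &str) -> f32 {
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for line in a.lines() {
        *counts.entry(line).or_insert(0) += 1;
    }
    let mut common = 0usize;
    for line in b.lines() {
        if let Some(n) = counts.get_mut(line) {
            if *n > 0 {
                *n -= 1;
                common += 1;
            }
        }
    }
    let total = a.lines().count().max(b.lines().count());
    if total == 0 {
        1.0
    } else {
        common as f32 / total as f32
    }
}

/// Index of the first deleted blob that is similar enough to the added one.
pub fn find_rename_source(deleted: &[&str], added: &str, threshold: f32) -> Option<usize> {
    deleted
        .iter()
        .position(|old| similarity(old, added) >= threshold)
}
```

Taking the first match keeps the search cheap, at the cost of possibly picking a different source than a best-of-several strategy would.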

gix-traverse

Check out the performance discussion as well.

trees
- nested traversal
commits
- ancestor graph traversal similar to git rev-list
- commitgraph support
API documentation
- Examples

gix-url

As documented here: https://www.git-scm.com/docs/git-clone#_git_urls
parse
- ssh URLs and SCP like syntax
- file, git, and SSH
- paths (OS paths, without need for UTF-8)
username expansion for ssh and git urls
convert URL to string
API documentation
- Some examples

gix-packetline

PKT-Line
encode
decode (zero-copy)
error line
V2 additions
side-band mode
Read from packet line with (optional) progress support via sidebands
Write with built-in packet line encoding
async support
API documentation
- Some examples
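
For readers unfamiliar with the wire format, the following sketch shows the core of pkt-line encoding and decoding: each data line is prefixed with its total length (payload plus the 4 prefix bytes) as four hex digits, and `0000` is the flush packet. This is a minimal standalone illustration, not the gix-packetline API; `encode`, `decode` and `Line` are names chosen for this example, and the V2 special packets `0001` and `0002` are simply rejected here.

```rust
/// Largest payload that fits into one packet line (65520 minus the 4-byte prefix).
pub const MAX_DATA_LEN: usize = 65516;
/// The flush packet.
pub const FLUSH: &[u8] = b"0000";

#[derive(Debug, PartialEq)]
pub enum Line<'a> {
    Flush,
    Data(&'a [u8]),
}

/// Prefix `data` with its length as four lowercase hex digits.
pub fn encode(data: &[u8]) -> Option<Vec<u8>> {
    if data.is_empty() || data.len() > MAX_DATA_LEN {
        return None;
    }
    let mut out = format!("{:04x}", data.len() + 4).into_bytes();
    out.extend_from_slice(data);
    Some(out)
}

/// Decode one line without copying and return it along with the remaining input.
pub fn decode<'a>(input: &'a [u8]) -> Option<(Line<'a>, &'a [u8])> {
    if input.len() < 4 || !input[..4].iter().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let hex = std::str::from_utf8(&input[..4]).ok()?;
    let len = usize::from_str_radix(hex, 16).ok()?;
    match len {
        0 => Some((Line::Flush, &input[4..])),
        // 0001 and 0002 are V2 special packets, 0003 and 0004 are invalid: unsupported here.
        1..=4 => None,
        _ if len > input.len() => None,
        _ => Some((Line::Data(&input[4..len]), &input[len..])),
    }
}
```

Because `decode` hands back slices into the input, reading a stream of lines needs no allocation, which is the point of the zero-copy item above.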

gix-transport

No matter what we do here, timeouts must be supported to prevent hanging forever and to make interrupts destructor-safe.
client
- general purpose connect(…) for clients
  - file:// launches service application
  - ssh:// launches service application in a remote shell using ssh
  - git:// establishes a tcp connection to a git daemon
  - http(s):// establishes connections to web server
    - via curl (blocking only)
    - via reqwest (blocking only)
  - pass context for scheme specific configuration, like timeouts
- git://
  - V1 handshake
    - send values + receive data with sidebands
    - ~~support for receiving 'shallow' refs in case the remote repository is shallow itself (I presume)~~
      - Since V2 doesn't seem to support that, let's skip this until there is an actual need. No completionist :D
  - V2 handshake
    - send command request, receive response with sideband support
- http(s)://
  - set identity for basic authentication
  - V1 handshake
    - send values + receive data with sidebands
  - V2 handshake
    - send command request, receive response with sideband support
  - ~~'dumb'~~ - we opt out, as using this protocol seems too slow to be useful, unless it downloads entire packs for clones?
- authentication failures are communicated by io::ErrorKind::PermissionDenied, allowing other layers to retry with authentication
- async support
server
- general purpose accept(…) for servers
API documentation
- Some examples

Advanced HTTP transport features

| feature                                      | curl | reqwest |
|----------------------------------------------|------|---------|
| async                                        |      |         |
| proxy support                                | X    |         |
| custom request configuration via fn(request) |      | X       |
| proxy authentication                         | X    |         |
| reauthentication after redirect              |      |         |

gix-protocol

abstract over protocol versions to allow delegates to deal only with a single way of doing things
credentials
- via gix-credentials
- via pure Rust implementation if no git is installed
handshake
- parse initial response of V1 and V2 servers
ls-refs
- parse V1 refs as provided during handshake
- parse V2 refs
- handle empty refs, AKA PKT-LINE(zero-id SP "capabilities^{}" NUL capability-list)
fetch
- detailed progress
- control credentials provider to fill, approve and reject
- initialize and validate command arguments and features sanely
- abort early for ls-remote capabilities
- packfile negotiation
  - delegate can support all fetch features, including shallow, deepen, etc.
  - receive parsed shallow refs
push
API documentation
- Some examples

gix-attributes

parse .gitattributes files
an attributes stack for matching paths to their attributes, with support for built-in binary macro for -text -diff -merge

gix-ignore

parse .gitignore files
an attributes stack for checking if paths are excluded

gix-quote

ansi-c
- quote
- unquote
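
As an illustration of what ANSI-C style quoting and unquoting means for paths, here is a small sketch that handles only a subset of escapes: backslash, double quote, `\n`, `\t`, and three-digit octal for every other byte outside printable ASCII. It is not the gix-quote implementation, and Git itself knows more escapes than this subset; `quote` and `unquote` are names chosen for the example.

```rust
/// Wrap `input` in double quotes and escape special bytes.
pub fn quote(input: &[u8]) -> String {
    let mut out = String::from("\"");
    for &b in input {
        match b {
            b'\\' => out.push_str("\\\\"),
            b'"' => out.push_str("\\\""),
            b'\n' => out.push_str("\\n"),
            b'\t' => out.push_str("\\t"),
            0x20..=0x7e => out.push(b as char),
            // Everything else becomes a three-digit octal escape.
            _ => out.push_str(&format!("\\{:03o}", b)),
        }
    }
    out.push('"');
    out
}

/// Reverse `quote`, returning `None` on missing quotes or unknown escapes.
pub fn unquote(input: &str) -> Option<Vec<u8>> {
    let inner = input.strip_prefix('"')?.strip_suffix('"')?;
    let bytes = inner.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] != b'\\' {
            out.push(bytes[i]);
            i += 1;
            continue;
        }
        match *bytes.get(i + 1)? {
            b'\\' => out.push(b'\\'),
            b'"' => out.push(b'"'),
            b'n' => out.push(b'\n'),
            b't' => out.push(b'\t'),
            b'0'..=b'3' => {
                let digits = bytes.get(i + 1..i + 4)?;
                let text = std::str::from_utf8(digits).ok()?;
                out.push(u8::from_str_radix(text, 8).ok()?);
                i += 2; // skip the two additional octal digits
            }
            _ => return None,
        }
        i += 2; // skip the backslash and the escape character
    }
    Some(out)
}
```

Working on bytes rather than `str` on the unquoted side matters because quoted paths may contain bytes that are not valid UTF-8.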

gix-mailmap

parsing
lookup and mapping of author names

gix-path

transformations to and from bytes
conversions between different platforms
virtual canonicalization for more concise paths via absolutize()
more flexible canonicalization with symlink resolution for paths which are partially virtual via realpath()
spec
- parse
- check for match

gix-pathspec

parse single
parse file line by line (with or without quoting, NUL and LF/CRLF line separation) (see --pathspec-from-file and --pathspec-file-nul)
matching of paths with git-attributes support
programmatic creation of pathspecs
TryIntoPathspec trait to parse strings or accept ready-made pathspecs as well, for use in APIs

gix-refspec

parse
matching of references and object names
- for fetch
- for push
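
The sketch below illustrates parsing and matching for the most common fetch refspec shape, `[+]<src>:<dst>` with at most one `*` on each side. It is a simplified assumption-laden example, not the gix-refspec API: forms without a destination, negative refspecs and object names are not handled, and `RefSpec`, `parse` and `map` are hypothetical names.

```rust
/// A parsed `[+]<src>:<dst>` refspec (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub struct RefSpec<'a> {
    pub force: bool,
    pub src: &'a str,
    pub dst: &'a str,
}

pub fn parse<'a>(spec: &'a str) -> Option<RefSpec<'a>> {
    let (force, rest) = match spec.strip_prefix('+') {
        Some(rest) => (true, rest),
        None => (false, spec),
    };
    let (src, dst) = rest.split_once(':')?;
    // Either both sides carry exactly one `*`, or neither does.
    let stars = |s: &str| s.matches('*').count();
    if stars(src) != stars(dst) || stars(src) > 1 {
        return None;
    }
    Some(RefSpec { force, src, dst })
}

impl<'a> RefSpec<'a> {
    /// Map a source ref name to its destination, if the source side matches.
    pub fn map(&self, name: &str) -> Option<String> {
        match (self.src.split_once('*'), self.dst.split_once('*')) {
            (Some((prefix, suffix)), Some((dst_prefix, dst_suffix))) => {
                // Whatever the `*` matched on the source side is copied to the destination.
                let middle = name.strip_prefix(prefix)?.strip_suffix(suffix)?;
                Some(format!("{}{}{}", dst_prefix, middle, dst_suffix))
            }
            _ if name == self.src => Some(self.dst.to_string()),
            _ => None,
        }
    }
}
```

With `+refs/heads/*:refs/remotes/origin/*`, the name `refs/heads/main` maps to `refs/remotes/origin/main`, while a tag name does not match at all.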

gix-command

execute commands directly
execute commands with sh
support for GIT_EXEC_PATH environment variable with gix-sec filter

gix-prompt

open prompts for usernames for example
secure prompts for password
use askpass program if available
signal handling (resetting and restoring terminal settings)
windows prompts for cmd.exe and mingw terminals

gix-note

A mechanism to associate metadata with any object, and keep revisions of it using git itself.

CRUD for git notes

gix-negotiate

algorithms
- noop
- consecutive
- skipping

gix-fetchhead

parse FETCH_HEAD information back entirely
write typical fetch-head lines

gix-discover

check if a git directory is a git repository
find a git repository by searching upward
- define ceilings that should not be surpassed
- prevent crossing file-systems (non-windows only)
handle linked worktrees
a way to handle safe.directory
- note that it's less critical to support it as gitoxide allows access but prevents untrusted configuration from becoming effective.
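
Upward discovery with a ceiling can be pictured with the sketch below. It is a simplification under stated assumptions: a directory counts as a repository when it contains `.git/HEAD` as a file (real discovery also handles bare repositories, `.git` files of linked worktrees, and more), the starting directory is always inspected, and the search stops before looking into the ceiling directory. `discover_upwards` is a hypothetical name, not the gix-discover API.

```rust
use std::path::{Path, PathBuf};

/// Walk from `start` towards the filesystem root and return the first directory
/// that looks like the working tree of a repository.
pub fn discover_upwards(start: &Path, ceiling: Option<&Path>) -> Option<PathBuf> {
    for (depth, dir) in start.ancestors().enumerate() {
        // Never look into the ceiling directory, unless the search started there.
        if depth > 0 && ceiling == Some(dir) {
            return None;
        }
        if dir.join(".git").join("HEAD").is_file() {
            return Some(dir.to_path_buf());
        }
    }
    None
}
```

A caller that wants to honor several ceilings could run the same check against a list of paths; the single optional ceiling merely keeps the example short.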

gix-date

parse git dates
serialize Time

gix-credentials

launch git credentials helpers with a given action
- built-in git credential program
- as scripts
- as absolute paths to programs with optional arguments
- program name with optional arguments, transformed into git credential-<name>
helper::main() for easy custom credential helper programs written in Rust

gix-filter

Provide base-implementations for dealing with smudge and clean filters as well as filter processes, facilitating their development.

clean filter base
smudge filter base
filter process base

gix-sec

Provides a trust model to share across gitoxide crates. It helps configuring how to interact with external processes, among other things.

integrations
- gix-config
- gix

gix-rebase

obtain rebase status
drive a rebase operation

gix-sequencer

Handle human-aided operations which cannot be completed in one command invocation.

gix-lfs

Implement git large file support using the process protocol and make it flexible enough to handle a variety of cases. Make it the best-performing implementation and the most convenient one.

gix-glob

parse pattern
a type for pattern matching of paths and non-paths, optionally case-insensitively.

gix-status

differences between index and worktree to turn index into worktree
- rename tracking
- untracked files
- support for fs-monitor for modification checks
differences between index and index to learn what changed
- rename tracking

gix-worktree-state

handle the working tree/checkout
- checkout an index of files, executables and symlinks just as fast as git
  - forbid symlinks in directories
  - handle submodules
  - handle sparse directories
  - handle sparse index
  - linear scaling with multi-threading up to IO saturation
- supported attributes to affect working tree and index contents
  - eol
  - working-tree-encoding
  - …more
- filtering
  - text
  - ident
  - filter processes
  - single-invocation clean/smudge filters
access to per-path information, like .gitignore and .gitattributes in a manner well suited for efficient lookups
- exclude information
- attributes

gix-worktree

A stack to efficiently generate attribute lists for matching paths against.

gix-revision

describe() (similar to git name-rev)
parse specifications
- parsing and navigation
- revision ranges
- full date parsing support (depends on gix-date)

gix-revwalk

primitives to help with graph traversal, along with commit-graph acceleration.

gix-submodule

read .gitmodules files, access all their fields, and apply overrides
check if a submodule is 'active'
CRUD for submodules
try to handle all the nifty interactions and be a little more comfortable than what git offers, laying a foundation for smarter git submodules.

gix-bitmap

A plumbing crate with shared functionality regarding EWAH compressed bitmaps, as well as other kinds of bitmap implementations.

EWAH
- Array type to read and write bits
  - execute closure for each true bit
- decode on-disk representation
- encode on-disk representation
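
The "execute closure for each true bit" item can be sketched as follows. The code assumes the 64-bit EWAH word layout as I understand it from Git's implementation: a run-length word stores the running bit in bit 0, the number of repeated all-zero or all-one words in the next 32 bits, and the number of literal words that follow in the upper 31 bits. It operates on already decoded `u64` words and ignores the on-disk header and byte order; `for_each_set_bit` is a hypothetical name, not the gix-bitmap API.

```rust
/// Call `f` with the index of every set bit of an EWAH-compressed word sequence.
/// Returns `None` if a run-length word announces more literal words than exist.
pub fn for_each_set_bit(words: &[u64], mut f: impl FnMut(usize)) -> Option<()> {
    let mut bit = 0usize; // index of the first bit of the current uncompressed word
    let mut i = 0usize;
    while i < words.len() {
        let rlw = words[i];
        i += 1;
        let run_bit = (rlw & 1) == 1;
        let run_len = ((rlw >> 1) & 0xffff_ffff) as usize;
        let literals = (rlw >> 33) as usize;
        // A run of all-one words sets every bit in its range, a run of zeros sets none.
        if run_bit {
            for b in bit..bit + run_len * 64 {
                f(b);
            }
        }
        bit += run_len * 64;
        // Literal words are stored verbatim.
        for &word in words.get(i..i + literals)? {
            let mut w = word;
            while w != 0 {
                f(bit + w.trailing_zeros() as usize);
                w &= w - 1; // clear the lowest set bit
            }
            bit += 64;
        }
        i += literals;
    }
    Some(())
}
```

Visiting bits this way never materializes the uncompressed bitmap, so long runs of zeros cost nothing beyond reading one word.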

gix-dir

A git directory walk.

list untracked files
list ignored files
collapsing of untracked and ignored directories
pathspec based filtering
multi-threaded initialization of the icase hash table is always used to accelerate index lookups for performance, even if ignoreCase = false
special handling of submodules (for now, submodules or nested repositories are detected, but they can't be walked into naturally)
accelerated walk with untracked-cache (as provided by UNTR extension of gix_index::File)

gix-index

The git staging area.

read
- V2 - the default, including long-paths support
- V3 - extended flags
- V4 - delta-compression for paths
- TODO(perf): multi-threaded implementation should boost performance, spends most time in storing paths, has barely any benefit right now.
- optional threading
  - concurrent loading of index extensions
  - threaded entry reading
- extensions
  - TREE for speeding up tree generation
  - REUC resolving undo
  - UNTR untracked cache
  - FSMN file system monitor cache V1 and V2
  - EOIE end of index entry
  - IEOT index entry offset table
  - 'link' base indices to take information from, split index
  - 'sdir' sparse directory entries - marker
- verification of entries and extensions as well as checksum
- expand sparse directory entries using information of the tree itself
write
- V2
- V3 - extension bits
- V4
- extensions
  - TREE
  - REUC
  - UNTR
  - FSMN
  - EOIE
  - 'sdir'
  - 'link'
    - note that we currently dissolve any shared index we read, so this extension is removed when writing.
stat update
- optional threaded stat based on thread_cost (aka preload)
handling of .gitignore and system file exclude configuration
lookups that ignore the case
- multi-threaded lookup table generation with the same algorithm as the one used by Git
- expand sparse folders (don't know how this relates to traversals right now)
maintain extensions when altering the cache
- TREE for speeding up tree generation
- REUC resolving undo
- UNTR untracked cache
- FSMN file system monitor cache V1 and V2
- EOIE end of index entry
- IEOT index entry offset table
- 'link' base indices to take information from, split index
- 'sdir' sparse directory entries
add and remove entries
API documentation
- Some examples
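
Reading any of the versions listed above starts with the fixed 12-byte header of the index file: the signature `DIRC`, a 4-byte big-endian version number, and a 4-byte big-endian entry count. The sketch below decodes just that header; `Header` and `decode_header` are names invented for this example and do not reflect the gix-index API.

```rust
/// The fixed-size header at the start of an index file (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub struct Header {
    pub version: u32,
    pub num_entries: u32,
}

pub fn decode_header(data: &[u8]) -> Result<Header, String> {
    if data.len() < 12 {
        return Err("index header needs 12 bytes".into());
    }
    if &data[..4] != b"DIRC" {
        return Err("missing DIRC signature".into());
    }
    // Both numbers are stored in network byte order.
    let be32 = |at: usize| u32::from_be_bytes([data[at], data[at + 1], data[at + 2], data[at + 3]]);
    let version = be32(4);
    if !(2..=4).contains(&version) {
        return Err(format!("unsupported index version {}", version));
    }
    Ok(Header {
        version,
        num_entries: be32(8),
    })
}
```

Everything after the header depends on the version, which is why it is validated before any entry is touched.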

gix-commitgraph

read-only access
- Graph lookup of commit information to obtain timestamps, generation and parents, and extra edges
- Corrected generation dates
- Bloom filter index
- Bloom filter data
create and update graphs and graph files
API documentation
- Some examples

gix-tempfile

See its README.md.

gix-lock

See its README.md.

gix-config-value

parse
- boolean
- integer
- color
  - ANSI code output for terminal colors
- path (incl. resolution)
- date
- permission (https://github.com/git/git/blob/71a8fab31b70c417e8f5b5f716581f89955a7082/setup.c#L1526:L1526)
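
To give an idea of what parsing the boolean and integer types involves, here is a small sketch based on the rules in the git-config documentation: `true`, `yes`, `on`, `1` are true, `false`, `no`, `off`, `0` and the empty string are false (case-insensitively), and integers may carry a `k`, `m` or `g` suffix that scales by powers of 1024. It covers only these literal forms and is not the gix-config-value API; `parse_bool` and `parse_integer` are hypothetical names.

```rust
/// Parse a git-config boolean given as one of the literal words.
pub fn parse_bool(value: &str) -> Option<bool> {
    match value.to_ascii_lowercase().as_str() {
        "true" | "yes" | "on" | "1" => Some(true),
        "false" | "no" | "off" | "0" | "" => Some(false),
        _ => None,
    }
}

/// Parse a git-config integer with an optional `k`, `m` or `g` suffix.
pub fn parse_integer(value: &str) -> Option<i64> {
    let value = value.trim();
    let (digits, factor) = match value.chars().last()? {
        'k' | 'K' => (&value[..value.len() - 1], 1024),
        'm' | 'M' => (&value[..value.len() - 1], 1024 * 1024),
        'g' | 'G' => (&value[..value.len() - 1], 1024 * 1024 * 1024),
        _ => (value, 1),
    };
    // `checked_mul` turns an overflow into `None` instead of wrapping around.
    digits.parse::<i64>().ok()?.checked_mul(factor)
}
```

Returning `Option` instead of a default keeps the decision about invalid values with the caller, who knows which key was being read.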

gix-config

read
- zero-copy parsing with event emission
- all config values as per the gix-config-value crate
- includeIf
  - gitdir, gitdir/i, and onbranch
  - hasconfig
access values and sections by name and sub-section
edit configuration in memory, non-destructively
- cross-platform newline handling
write files back for lossless round-trips.
- keep comments and whitespace, and only change lines that are affected by actual changes, to allow truly non-destructive editing
cascaded loading of various configuration files into one
- load from environment variables
- load from well-known sources for global configuration
- load repository configuration with all known sources
API documentation
- Some examples

gix-worktree-stream

encode git-tree as stream of bytes (with large file support and actual streaming)
produce a stream of entries
add custom entries to the stream
respect export-ignore git attribute
apply standard worktree conversion to simulate an actual checkout
support for submodule inclusion
API documentation
- Some examples

gix-archive

write_to() for creating an archive with various container formats
- tar and tar.gz
- zip
add prefix and modification date
API documentation
- Some examples

gix-bundle

create a bundle from an archive
- respect export-ignore and export-subst
extract a branch from a bundle into a repository
API documentation
- Some examples

gix-validate

validate ref names
validate submodule names
validate tag names

gix-fsck

validate connectivity and find missing objects starting from…
- commits
- tags
- tree-cache in the index or any entry within
validate object hashes during connectivity traversal
progress reporting and interruptability
skipList to exclude objects which are known to be broken
validate blob hashes (connectivity check)
identify objects that exist but are not reachable (i.e. what remains after a full graph traversal from all valid starting points)
write dangling objects to the .git/lost-found directory structure
strict mode, to check for tree objects with g+w permissions
consider reflog entries from ref starting points
when reporting reachable objects, provide the path through which they are reachable, i.e. ref-log@{3} -> commit -> tree -> path-in-tree
limit search to ODB without alternates (default is equivalent to git fsck --full due to ODB implementation)
all individual checks available in git fsck (too many to print here)

gix-ref

Prepare code for arrival of longer hashes like Sha256. It's part of the V2 proposal but should work for loose refs as well.
Stores
- disable transactions during quarantine
- namespaces
  - a server-side feature to transparently isolate refs in a single shared repository, allowing all forks to live in the same condensed repository.
- loose file
  - ref validation
  - find single ref by name
  - special handling of FETCH_HEAD and MERGE_HEAD
  - iterate refs with optional prefix
  - worktree support
    - support multiple bases and classify refs
    - support for ref iteration merging common and private refs seamlessly.
    - avoid packing refs which are worktree private
  - ~~symbolic ref support, using symbolic links~~
    - This is a legacy feature which is not in use anymore.
  - transactions
    - delete, create or update single ref or multiple refs while handling the reflog
    - set any valid ref value (not just object ids)
    - reflog changes can be entirely disabled (i.e. for bare repos)
    - rename or copy references
    - transparent handling of packed-refs during deletion
    - writing loose refs into packed-refs and optionally delete them
    - initial transaction optimization (a faster way to create clones with a lot of refs)
  - log
    - forward iteration
    - backward iteration
    - expire
  - ref
    - peel to id
  - packed
    - find single ref by name
    - iterate refs with optional prefix
    - handle unsorted packed refs and those without a header
- reftable
  - see here for a Go/C implementation
API documentation
- Some examples

gix-features

io-pipe feature toggle
- a unix like pipeline for bytes
parallel feature toggle
- When on…
  - in_parallel
  - join
- When off all functions execute serially
fast-sha1
- provides a faster SHA1 implementation using CPU intrinsics
API documentation

gix-tui

a terminal user interface seeking to replace and improve on tig
Can display complex history in novel ways to make it graspable. Maybe this post can be an inspiration.

gix-tix

A re-implementation of a minimal tig like UI that aims to be fast and to the point.

gix-lfs

Definitely optimize for performance and see how we fare compared to oxen. Right now, git lfs is 40x slower, due to sequential uploads and lack of fast compression. It seems this can be greatly improved to get close to 6min for 200k images (1.4GB). GitHub seems to cap upload speeds to 100kb/s, which is one major reason it's so slow, and uploads can only happen sequentially as git-lfs doesn't use the new filter-process protocol, which would allow parallelization. Oxen uses XXH3 (30gb/s), which greatly outperforms SHA1. However, it doesn't look like the hash is necessarily the bottleneck in typical benchmarks.
