soldni
🏳️‍🌈 vibing!

Organizations

@Georgetown-IR-Lab @allenai


FULL v0, Cursor, Manus, Same.dev, Lovable, Devin, Replit Agent, Windsurf Agent & VSCode Agent (And other Open Sourced) System Prompts, Tools & AI Models.

29,058 9,125 Updated Apr 24, 2025

Data mapping framework for Rust stuff

Rust 2 Updated Apr 22, 2025

Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus.

Python 157 22 Updated Jun 18, 2024

c/ua is the Docker Container for Computer-Use AI Agents.

Python 4,722 176 Updated Apr 24, 2025

PyTorch building blocks for the OLMo ecosystem

Python 197 35 Updated Apr 24, 2025

GhoulBoii's Firefox Dots

CSS 6 Updated Jan 15, 2025

OLMost every training recipe you need to perform data interventions with the OLMo family of models.

Python 23 5 Updated Apr 23, 2025

Curated list of datasets and tools for post-training.

2,970 257 Updated Jan 29, 2025

Versatile typeface for code, from code.

JavaScript 20,127 605 Updated Apr 21, 2025

👻 Ghostty is a fast, feature-rich, and cross-platform terminal emulator that uses platform-native UI and GPU acceleration.

Zig 29,874 796 Updated Apr 23, 2025

😸 Soothing pastel theme for the high-spirited!

TypeScript 16,393 302 Updated Apr 23, 2025

A more intuitive version of du in Rust

Rust 9,650 211 Updated Apr 20, 2025

A curated list of resources and examples of ASCII Art

114 8 Updated Apr 24, 2024

A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.

Python 2,567 203 Updated Feb 12, 2025

Toolkit for linearizing PDFs for LLM datasets/training

Python 11,887 809 Updated Apr 23, 2025

LLM.swift is a simple and readable library that allows you to interact with large language models locally with ease for macOS, iOS, watchOS, tvOS, and visionOS.

C 576 62 Updated Apr 16, 2025

Large Language Model (LLM) module for the Spezi Ecosystem

Swift 221 27 Updated Apr 24, 2025

BPE modification that implements removing of the intermediate tokens during tokenizer training.

Python 25 2 Updated Nov 25, 2024

A curated list of awesome model based RL resources (continually updated)

1,090 64 Updated Feb 17, 2025

Dockerized iCloud Client - make a local copy of your iCloud documents and photos, and keep it automatically up-to-date.

Python 1,397 56 Updated Apr 7, 2025

Tools for shrinking fastText models (in gensim format)

Jupyter Notebook 178 13 Updated May 3, 2024

Rust 3 2 Updated Jun 10, 2024

[ACL 2024] This is the code repo for our ACL’24 paper "Cleaner Pretraining Corpus Curation with Neural Web Scraping".

Python 225 20 Updated Aug 28, 2024

GitHub Action to build and push Docker images with Buildx

TypeScript 4,717 607 Updated Apr 24, 2025

Fast bare-bones BPE for modern tokenizer training

Python 154 4 Updated Apr 2, 2025

A JavaScript text differencing implementation.

JavaScript 8,508 510 Updated Apr 11, 2025

Scroll two or more areas simultaneously

JavaScript 237 33 Updated Apr 17, 2015

A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks.

Python 293 18 Updated Mar 27, 2025

Run a TryCloudflare tunnel to your Flask app right from code.

Python 37 17 Updated Feb 9, 2025

A PyTorch native library for large-scale model training

Python 3,627 343 Updated Apr 24, 2025