Stars
An open-source PAM tool alternative to CyberArk. A widely popular open-source bastion host.
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Convert PDF to markdown + JSON quickly with high accuracy
Machine learning software for extracting information from scholarly documents
#1 Locally hosted web application that allows you to perform various operations on PDF files
Hazelcast is a unified real-time data platform combining stream processing with a fast data store, allowing customers to act instantly on data-in-motion for real-time insights.
Apache Polaris, the interoperable, open source catalog for Apache Iceberg
😎 Awesome lists about all kinds of interesting topics
Focalboard is an open source, self-hosted alternative to Trello, Notion, and Asana.
ReaLTaiizor is a .NET WinForms control library that offers a wide range of components and is user-friendly and design-focused.
Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies such as Teradata, Apache Spark and/or Hadoop. Kylo is licen…
The open source Solver AI for Java, Python and Kotlin to optimize scheduling and routing. Solve the vehicle routing problem, employee rostering, task assignment, maintenance scheduling and other pl…
File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.
Docs2KG: Unified Knowledge Graph Construction from Heterogeneous Documents Assisted by Large Language Models
Open Source Continuous File Synchronization
OpenRefine is a free, open source power tool for working with messy data and improving it
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
A super fast graph database that uses GraphBLAS under the hood for its sparse adjacency matrix graph representation. Our goal is to provide the best Knowledge Graph for LLM (GraphRAG).
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
Build multi-modal Agents with memory, knowledge, tools and reasoning. Chat with them using a beautiful Agent UI.
Elegant Scraper and Crawler Framework for Golang
An easy-to-use, distributed, extensible task/job queue framework for #golang
Distributed Task Queue (development branch)
Ip2region (2.0 - xdb) is an offline IP address manager framework and locator that supports billions of data segments with ten-microsecond search performance. xdb engine implementation for many programming…
Dataframes powered by a multithreaded, vectorized query engine, written in Rust
🚀 10x easier, 🚀 140x lower storage cost, 🚀 high performance, 🚀 petabyte scale - Elasticsearch/Splunk/Datadog alternative for 🚀 (logs, metrics, traces, RUM, Error tracking, Session replay).
ZincSearch. A lightweight alternative to Elasticsearch that requires minimal resources, written in Go.
Get up and running with Llama 3.3, Mistral, Gemma 2, and other large language models.