Starred repositories
A Multi-Generator Ensemble Framework for Natural Language to SQL
Fluss is a streaming storage built for real-time analytics.
🤖 Open-source GenBI AI Agent that empowers data-driven teams to chat with their data to generate Text-to-SQL, charts, spreadsheets, reports, dashboards and BI. 📈📊📋🧑💻
World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
Open, Multi-modal Catalog for Data & AI
Examples and guides for using the OpenAI API
The universal tool suite for vector database management. Manage Pinecone, Chroma, Qdrant, Weaviate and more vector databases with ease.
Curated tutorials and resources for Large Language Models, AI Painting, and more.
Curated tutorials and resources for Large Language Models, Text2SQL, Text2DSL, Text2API, Text2Vis and more.
Workshop material for Spring AI and Azure OpenAI Service
Java version of LangChain, while empowering LLM for Big Data.
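airda (Air Data Agent) is a multi-agent system for data analysis. It understands data development and data analysis requirements, understands the data, and generates SQL and Python code for tasks such as data querying, data visualization, and machine learning.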
ChatOllama is an open-source chatbot based on LLMs. It supports a wide range of language models as well as knowledge base management.
The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, No-code agent builder, MCP compatibility, and more.
Scripts and baselines for Spider: Yale's complex and cross-domain semantic parsing and text-to-SQL challenge
Evaluate the accuracy of LLM-generated outputs
Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using RAG 🔄.
The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance …
Source code for Twitter's Recommendation Algorithm
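A collection of introductory notes, paper study notes, and interview materials for AI fields such as natural language processing, recommender systems, and search engines (What You Don't Know About NLP, What You Don't Know About Recommender Systems, NLP Interview Q&A, Recommender Systems Interview Q&A, Search Engine Interview Q&A)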
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
A Classical Chinese (文言文) programming language. A programming language for the ancient Chinese.
Jeff Dean's latency numbers plotted over time
A curated list to learn about distributed systems