forked from stanfordnlp/dspy
Showing 13 changed files with 1,379 additions and 65 deletions.
@@ -0,0 +1,77 @@
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%load_ext autoreload\n",
    "%autoreload 2\n",
    "import sys; sys.path.append('/future/u/okhattab/repos/public/tmp/dspy')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"../../docs/images/DSPy8.png\" alt=\"DSPy7 Image\" height=\"150\"/>\n",
    "\n",
    "## Guide: **DSPy Assertions**\n",
    "\n",
    "[<img align=\"center\" src=\"https://colab.research.google.com/assets/colab-badge.svg\" />](https://colab.research.google.com/github/stanfordnlp/dspy/blob/main/docs/guides/signatures.ipynb)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Quick Recap\n",
    "\n",
    "This guide assumes you followed the [intro tutorial]() to build your first few DSPy programs.\n",
    "\n",
    "Remember that a **DSPy program** is just Python code that calls one or more DSPy modules, like `dspy.Predict` or `dspy.ChainOfThought`, to use LMs."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1) What is a DSPy Assertion?\n",
    "\n",
    "While we prepare this guide, please [read the DSPy assertions paper](https://arxiv.org/abs/2312.13382) and follow the examples in it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Install `dspy-ai` if needed. Then set up a default language model.\n",
    "# TODO: Add a graceful line for OPENAI_API_KEY.\n",
    "\n",
    "try: import dspy\n",
    "except ImportError:\n",
    "    %pip install dspy-ai\n",
    "    import dspy\n",
    "\n",
    "dspy.configure(lm=dspy.OpenAI(model='gpt-3.5-turbo-1106'))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
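
Since the "What is a DSPy Assertion?" cell above currently defers to the paper, here is a minimal illustrative sketch of the idea: a soft constraint (`dspy.Suggest`) attached to a module's output, with backtracking enabled via `activate_assertions()`. The module name, the `question -> answer` signature, and the 280-character limit are assumptions chosen for this example, not part of the commit.

```python
import dspy

class ShortAnswer(dspy.Module):
    """Illustrative module: answer a question while nudging the LM to stay brief."""

    def __init__(self):
        super().__init__()
        self.generate = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        prediction = self.generate(question=question)
        # A soft constraint: if it fails, DSPy can backtrack and retry the
        # generation with this feedback message added to the prompt.
        dspy.Suggest(len(prediction.answer) <= 280, "Keep the answer under 280 characters.")
        return prediction

# Wrapping the program with assertion handling enables the retry behavior.
program = ShortAnswer().activate_assertions()
# prediction = program(question="What do DSPy assertions do?")  # requires a configured LM
```

`dspy.Assert` follows the same pattern but treats a constraint that keeps failing as a hard error rather than a suggestion.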
@@ -0,0 +1,48 @@
## Setting up an MLC language model

### Prerequisites

Install the required packages using the following commands:

```shell
pip install --no-deps --pre --force-reinstall mlc-ai-nightly-cu118 mlc-chat-nightly-cu118 -f https://mlc.ai/wheels
pip install transformers
git lfs install
```

Adjust the pip wheels according to your OS/platform by referring to the provided commands in [MLC packages](https://mlc.ai/package/).

### Running MLC Llama-2 models

1. Create a directory for prebuilt models:

```shell
mkdir -p dist/prebuilt
```

2. Clone the necessary libraries from the repository:

```shell
git clone https://github.com/mlc-ai/binary-mlc-llm-libs.git dist/prebuilt/lib
cd dist/prebuilt
```

3. Choose a Llama-2 model from [MLC LLMs](https://huggingface.co/mlc-ai) and clone the model repository:

```shell
git clone https://huggingface.co/mlc-ai/mlc-chat-Llama-2-7b-chat-hf-q4f16_1
```

### Sending requests to the server

Initialize the `ChatModuleClient` within your program with the desired parameters. Here's an example call:

```python
model = 'dist/prebuilt/mlc-chat-Llama-2-7b-chat-hf-q4f16_1'
model_path = 'dist/prebuilt/lib/Llama-2-7b-chat-hf-q4f16_1-cuda.so'

llama = dspy.ChatModuleClient(model=model, model_path=model_path)
```
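
After the client is constructed, it can be plugged into DSPy like any other language model. Below is a minimal sketch that assumes the `llama` client above was created successfully; the `question -> answer` signature and the question text are illustrative choices, not part of the original docs.

```python
import dspy

# Use the MLC-backed client as the default LM for all DSPy modules.
dspy.configure(lm=llama)

# A tiny illustrative program: a single Predict module over a string signature.
qa = dspy.Predict("question -> answer")
prediction = qa(question="In one sentence, what is MLC LLM?")
print(prediction.answer)
```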

Please refer to the [official MLC repository](https://github.com/mlc-ai/mlc-llm) for more detailed [docs](https://mlc.ai/mlc-llm/docs/get_started/try_out.html).
@@ -0,0 +1,60 @@
## Launching a Text Generation Inference (TGI) Server

### Prerequisites

- Docker must be installed on your system. If you don't have Docker installed, you can get it from [here](https://docs.docker.com/get-docker/).

### Setting up the Text-Generation-Inference Server

1. Clone the Text-Generation-Inference repository from GitHub by executing the following command:

```bash
git clone https://github.com/huggingface/text-generation-inference.git
```

2. Change into the cloned repository directory:

```bash
cd text-generation-inference
```

3. Execute the Docker command under the "Get Started" section to run the server:

```bash
model=meta-llama/Llama-2-7b-hf # set to the specific Hugging Face model ID you wish to use.
num_shard=1 # set to the number of shards you wish to use.
volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run

docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference:latest --model-id $model --num-shard $num_shard
```

This command will start the server and make it accessible at `http://localhost:8080`.

If you want to connect to [Meta Llama 2 models](https://huggingface.co/meta-llama), make sure to use version 0.9.3 (or higher) of the Docker image (ghcr.io/huggingface/text-generation-inference:0.9.3) and pass in your Hugging Face token as an environment variable:

```bash
docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data -e HUGGING_FACE_HUB_TOKEN={your_token} ghcr.io/huggingface/text-generation-inference:latest --model-id $model --num-shard $num_shard
```

### Sending requests to the server

After setting up the text-generation-inference server and ensuring that it displays "Connected" when it's running, you can interact with it using the `HFClientTGI`.

Initialize the `HFClientTGI` within your program with the desired parameters. Here is an example call:

```python
lm = dspy.HFClientTGI(model="meta-llama/Llama-2-7b-hf", port=8080, url="http://localhost")
```
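
As an illustrative way to exercise the TGI-backed client without changing a globally configured LM, the sketch below scopes it to one block with `dspy.context`; the signature and question text are assumptions made for this example, not part of the original docs.

```python
import dspy

tgi_llama = dspy.HFClientTGI(model="meta-llama/Llama-2-7b-hf", port=8080, url="http://localhost")

# Scope this LM to the block below; any globally configured LM stays untouched.
with dspy.context(lm=tgi_llama):
    cot = dspy.ChainOfThought("question -> answer")
    print(cot(question="In one sentence, what does a TGI server do?").answer)
```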

Customize the `model`, `port`, and `url` according to your requirements. The `model` parameter should be set to the specific Hugging Face model ID you wish to use.

### FAQs

1. If your model does not require sharding, you still need to set a value for `num_shard`, but you do not need to pass the `--num-shard` flag on the command line.

2. If requests run into "token exceeded" errors, you can set the following command-line parameters to adjust the input length and token limit:
   - `--max-input-length`: Set the maximum allowed input length for the text.
   - `--max-total-tokens`: Set the maximum total tokens allowed for text generation.

Please refer to the [official TGI repository](https://github.com/huggingface/text-generation-inference) for detailed docs.
@@ -0,0 +1,31 @@
## Launching a vLLM Server

### Setting up the vLLM Server

Follow these steps to set up the vLLM Server:

1. Build the server from source by following the instructions provided in the [Build from Source guide](https://vllm.readthedocs.io/en/latest/getting_started/installation.html#build-from-source).

2. Start the server by running the following command, and specify your desired model, host, and port using the appropriate arguments. The default server address is `http://localhost:8000`.

Example command:

```bash
python -m vllm.entrypoints.api_server --model mosaicml/mpt-7b --port 8000
```

This will launch the vLLM server.

### Sending requests to the server

After setting up the vLLM server and ensuring that it displays "Connected" when it's running, you can interact with it using the `HFClientVLLM`.

Initialize the `HFClientVLLM` within your program with the desired parameters. Here is an example call:

```python
lm = dspy.HFClientVLLM(model="mosaicml/mpt-7b", port=8000, url="http://localhost")
```
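
For a quick sanity check that the vLLM server is reachable, the client can also be called directly. This is a sketch under the assumption that DSPy's HF client classes expose the usual callable LM interface returning a list of completions; the prompt string is illustrative.

```python
import dspy

vllm_lm = dspy.HFClientVLLM(model="mosaicml/mpt-7b", port=8000, url="http://localhost")

# Calling the client directly should return a list of raw completions.
completions = vllm_lm("The capital of France is")
print(completions[0])

# The same client can then serve as the default LM for DSPy programs.
dspy.configure(lm=vllm_lm)
```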

Customize the `model`, `port`, `url`, and `max_tokens` according to your requirements. The `model` parameter should be set to the specific Hugging Face model ID you wish to use.

Please refer to the [official vLLM repository](https://github.com/vllm-project/vllm) for more detailed information and documentation.