diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 98095b869..0bd9514db 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -1,4 +1,4 @@ -# Contibuting +# Contributing ## Finding Issues diff --git a/README.md b/README.md index ff5882289..bc0cc2ed6 100644 --- a/README.md +++ b/README.md @@ -114,7 +114,7 @@ If you're new to DSPy, it's probably best to go in sequential order. You will pr 4. **[Optimizers (formerly Teleprompters)](https://dspy-docs.vercel.app/docs/building-blocks/optimizers)** -6. **[DSPy Assertions](docs/assertions.md)** +6. **[DSPy Assertions](https://dspy-docs.vercel.app/docs/building-blocks/assertions)** ### C) Examples @@ -139,6 +139,8 @@ You can find other examples tweeted by [@lateinteraction](https://twitter.com/la - [DSPy on BIG-Bench Hard Example, by Chris Levy](https://drchrislevy.github.io/posts/dspy/dspy.html) - [Using Ollama with DSPy for Mistral (quantized) by @jrknox1977](https://gist.github.com/jrknox1977/78c17e492b5a75ee5bbaf9673aee4641) - [Using DSPy, "The Unreasonable Effectiveness of Eccentric Automatic Prompts" (paper) by VMware's Rick Battle & Teja Gollapudi, and interview at TheRegister](https://www.theregister.com/2024/02/22/prompt_engineering_ai_models/) +- Typed DSPy (contributed by [@normal-computing](https://github.com/normal-computing)) + - [Using DSPy to train GPT-3.5 on HumanEval by @thomasahle](https://github.com/stanfordnlp/dspy/blob/main/examples/functional/functional.ipynb) There are also recent cool examples at [Weaviate's DSPy cookbook](https://github.com/weaviate/recipes/tree/main/integrations/dspy) by Connor Shorten. [See tutorial on YouTube](https://www.youtube.com/watch?v=CEuUG4Umfxs). @@ -274,9 +276,35 @@ compiled_rag = teleprompter.compile(RAG(), trainset=my_rag_trainset) If we now use `compiled_rag`, it will invoke our LM with rich prompts with few-shot demonstrations of chain-of-thought retrieval-augmented question answering on our data. +## 5) Pydantic Types +Sometimes you need more than just string inputs/outputs. +Assume, for example, you need to find structured flight information in an email: -## 5) FAQ: Is DSPy right for me? +```python +import datetime +import dspy +from dspy import InputField, OutputField, Signature +from pydantic import BaseModel, Field + +class TravelInformation(BaseModel): + origin: str = Field(pattern=r"^[A-Z]{3}$") + destination: str = Field(pattern=r"^[A-Z]{3}$") + date: datetime.date + confidence: float = Field(gt=0, lt=1) + +class TravelSignature(Signature): + """ Extract all travel information in the given email """ + email: str = InputField() + flight_information: list[TravelInformation] = OutputField() + +predictor = dspy.TypedPredictor(TravelSignature) +predictor(email='...') +``` + +This will output a list of `TravelInformation` objects. + +## 6) FAQ: Is DSPy right for me? The **DSPy** philosophy and abstraction differ significantly from other libraries and frameworks, so it's usually straightforward to decide when **DSPy** is (or isn't) the right framework for your usecase. diff --git a/docs-page/README.md b/docs-page/README.md deleted file mode 100644 index 524a59361..000000000 --- a/docs-page/README.md +++ /dev/null @@ -1,41 +0,0 @@ -# DSPy Documentation - -This website is built using [Docusaurus](https://docusaurus.io/), a modern static website generator. - -### Installation - -``` -$ yarn -``` - -### Local Development - -``` -$ yarn start -``` - -This command starts a local development server and opens up a browser window. Most changes are reflected live without having to restart the server.
- -### Build - -``` -$ yarn build -``` - -This command generates static content into the `build` directory and can be served using any static contents hosting service. - -### Deployment - -Using SSH: - -``` -$ USE_SSH=true yarn deploy -``` - -Not using SSH: - -``` -$ GIT_USER= yarn deploy -``` - -If you are using GitHub pages for hosting, this command is a convenient way to build the website and push to the `gh-pages` branch. diff --git a/docs-page/api/hosting_language_models_locally/HFModel.md b/docs-page/api/hosting_language_models_locally/HFModel.md deleted file mode 100644 index 52fdf624b..000000000 --- a/docs-page/api/hosting_language_models_locally/HFModel.md +++ /dev/null @@ -1,5 +0,0 @@ -Initialize `HFModel` within your program with the desired model to load in. Here's an example call: - - ```python - llama = dspy.HFModel(model = 'meta-llama/Llama-2-7b-hf') - ``` \ No newline at end of file diff --git a/docs-page/api/hosting_language_models_locally/MLC.md b/docs-page/api/hosting_language_models_locally/MLC.md deleted file mode 100644 index 87bf65d0d..000000000 --- a/docs-page/api/hosting_language_models_locally/MLC.md +++ /dev/null @@ -1,48 +0,0 @@ -## Setting up an MLC language model - -### Prerequisites - -Install the required packages using the following commands: - -```shell -pip install --no-deps --pre --force-reinstall mlc-ai-nightly-cu118 mlc-chat-nightly-cu118 -f https://mlc.ai/wheels -pip install transformers -git lfs install -``` - -Adjust the pip wheels according to your OS/platform by referring to the provided commands in [MLC packages](https://mlc.ai/package/). - - -### Running MLC Llama-2 models - -1. Create a directory for prebuilt models: - -```shell -mkdir -p dist/prebuilt -``` - -2. Clone the necessary libraries from the repository: - -```shell -git clone https://github.com/mlc-ai/binary-mlc-llm-libs.git dist/prebuilt/lib -cd dist/prebuilt -``` - -3. Choose a Llama-2 model from [MLC LLMs](https://huggingface.co/mlc-ai) and clone the model repository: - -```shell -git clone https://huggingface.co/mlc-ai/mlc-chat-Llama-2-7b-chat-hf-q4f16_1 -``` - -### Sending requests to the server - -Initialize the `ChatModuleClient` within your program with the desired parameters. Here's an example call: - -```python -model = 'dist/prebuilt/mlc-chat-Llama-2-7b-chat-hf-q4f16_1' -model_path = 'dist/prebuilt/lib/Llama-2-7b-chat-hf-q4f16_1-cuda.so' - -llama = dspy.ChatModuleClient(model=model, model_path=model_path) -``` - -Please refer to the [official MLC repository](https://github.com/mlc-ai/mlc-llm) for more detailed [docs](https://mlc.ai/mlc-llm/docs/get_started/try_out.html). diff --git a/docs-page/api/hosting_language_models_locally/Ollama.md b/docs-page/api/hosting_language_models_locally/Ollama.md deleted file mode 100644 index 516489c9d..000000000 --- a/docs-page/api/hosting_language_models_locally/Ollama.md +++ /dev/null @@ -1,43 +0,0 @@ -## Running LLMs through Ollama - -#### Adapted from documentation provided by https://github.com/insop - -Ollama is a good software tool that allows you to run LLMs locally, such as Mistral, Llama2, and Phi. -The following are the instructions to install and run Ollama. - -## Prerequisites - -Install Ollama by following the instructions from this page: - -- https://ollama.ai - -Download model: `ollama pull` - -Download a model by running the `ollama pull` command. You can download Mistral, Llama2, and Phi. 
- -```bash -# download mistral -ollama pull mistral -``` - -Here is the list of other models you can download: -- https://ollama.ai/library - -## Running Ollama model - -Run model: `ollama run` - -You can test a model by running the model with the `ollama run` command. - -```bash -# run mistral -ollama run mistral -``` - -## Sending requests to the server - -Here is the code to load a model through Ollama: - -```python -lm = dspy.OllamaLocal(model='mistral') -``` \ No newline at end of file diff --git a/docs-page/api/hosting_language_models_locally/TGI.md b/docs-page/api/hosting_language_models_locally/TGI.md deleted file mode 100644 index 3ecc6aff6..000000000 --- a/docs-page/api/hosting_language_models_locally/TGI.md +++ /dev/null @@ -1,60 +0,0 @@ -## Launching a Text Generation Inference (TGI) Server - -### Prerequisites - -- Docker must be installed on your system. If you don't have Docker installed, you can get it from [here](https://docs.docker.com/get-docker/). - -### Setting up the Text-Generation-Inference Server - -1. Clone the Text-Generation-Inference repository from GitHub by executing the following command: - -```bash -git clone https://github.com/huggingface/text-generation-inference.git -``` - -2. Change into the cloned repository directory: - -```bash -cd text-generation-inference -``` - -3. Execute the Docker command under the "Get Started" section to run the server: - -```bash -model=mosaicml/mpt-30b # set to the specific Hugging Face model ID you wish to use. -num_shard=1 # set to the number of shards you wish to use. -volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run - -docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference:latest --model-id $model --num-shard $num_shard -``` - -This command will start the server and make it accessible at `http://localhost:8080`. - -If you want to connect to private HuggingFace models such as [Meta Llama 2 models](https://huggingface.co/meta-llama), make sure to use version 9.3 (or higher) of the docker image (ghcr.io/huggingface/text-generation-inference:0.9.3) and pass in your huggingface token as an environment variable. - -```bash -docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data -e HUGGING_FACE_HUB_TOKEN={your_token} ghcr.io/huggingface/text-generation-inference:latest --model-id $model --num-shard $num_shard -``` - -### Sending requests to the server - -After setting up the text-generation-inference server and ensuring that it displays "Connected" when it's running, you can interact with it using the `HFClientTGI`. - -Initialize the `HFClientTGI` within your program with the desired parameters. Here is an example call: - - ```python - lm = dspy.HFClientTGI(model="meta-llama/Llama-2-7b-hf", port=8080, url="http://localhost") - ``` - - Customize the `model`, `port`, and `url` according to your requirements. The `model` parameter should be set to the specific Hugging Face model ID you wish to use. - - -### FAQs - -1. If your model doesn't require any shards, you still need to set a value for `num_shard`, but you don't need to include the parameter `--num-shard` on the command line. - -2. If your model runs into any "token exceeded" issues, you can set the following parameters on the command line to adjust the input length and token limit: - - `--max-input-length`: Set the maximum allowed input length for the text. - - `--max-total-tokens`: Set the maximum total tokens allowed for text generation. 
- -Please refer to the [official TGI repository](https://github.com/huggingface/text-generation-inference) for detailed docs. diff --git a/docs-page/api/hosting_language_models_locally/_category_.json b/docs-page/api/hosting_language_models_locally/_category_.json deleted file mode 100644 index f729af67a..000000000 --- a/docs-page/api/hosting_language_models_locally/_category_.json +++ /dev/null @@ -1,8 +0,0 @@ -{ - "label": "Hosting Local Language Models in DSPy", - "position": 3, - "link": { - "type": "generated-index", - "description": "Hosting Local Language Models in DSPy" - } -} \ No newline at end of file diff --git a/docs-page/api/hosting_language_models_locally/vLLM.md b/docs-page/api/hosting_language_models_locally/vLLM.md deleted file mode 100644 index 4ac911f85..000000000 --- a/docs-page/api/hosting_language_models_locally/vLLM.md +++ /dev/null @@ -1,31 +0,0 @@ -## Launching a vLLM Server - -### Setting up the vLLM Server - -Follow these steps to set up the vLLM Server: - -1. Build the server from source by following the instructions provided in the [Build from Source guide](https://vllm.readthedocs.io/en/latest/getting_started/installation.html#build-from-source). - -2. Start the server by running the following command, and specify your desired model, host, and port using the appropriate arguments. The default server address is http://localhost:8000. - -Example command: - -```bash - python -m vllm.entrypoints.api_server --model mosaicml/mpt-7b --port 8000 -``` - -This will launch the vLLM server. - -### Sending requests to the server - -After setting up the vLLM server and ensuring that it displays "Connected" when it's running, you can interact with it using the `HFClientVLLM`. - -Initialize the `HFClientVLLM` within your program with the desired parameters. Here is an example call: - -```python - lm = dspy.HFClientVLLM(model="mosaicml/mpt-7b", port=8000, url="http://localhost") -``` - -Customize the `model`, `port`, `url`, and `max_tokens` according to your requirements. The `model` parameter should be set to the specific Hugging Face model ID you wish to use. - -Please refer to the [official vLLM repository](https://github.com/vllm-project/vllm) for more detailed information and documentation. 
diff --git a/docs-page/api/intro.md b/docs-page/api/intro.md deleted file mode 100644 index 165f76c2d..000000000 --- a/docs-page/api/intro.md +++ /dev/null @@ -1,5 +0,0 @@ ---- -sidebar_position: 1 ---- - -# API References \ No newline at end of file diff --git a/docs-page/docs/deep-dive/language_model_clients/local_models/_category_.json b/docs-page/docs/deep-dive/language_model_clients/local_models/_category_.json deleted file mode 100644 index e1d91e8ce..000000000 --- a/docs-page/docs/deep-dive/language_model_clients/local_models/_category_.json +++ /dev/null @@ -1,8 +0,0 @@ -{ - "label": "Local Language Model Clients", - "position": 1, - "link": { - "type": "generated-index", - "description": "Local Language Model Clients in DSPy" - } -} \ No newline at end of file diff --git a/docs-page/.gitignore b/docs/.gitignore similarity index 100% rename from docs-page/.gitignore rename to docs/.gitignore diff --git a/docs/DSPy-preprint.pdf b/docs/DSPy-preprint.pdf deleted file mode 100644 index df4b2e025..000000000 Binary files a/docs/DSPy-preprint.pdf and /dev/null differ diff --git a/docs/README.md b/docs/README.md new file mode 100644 index 000000000..7c4caca54 --- /dev/null +++ b/docs/README.md @@ -0,0 +1,23 @@ +# DSPy Documentation + +This website is built using [Docusaurus](https://docusaurus.io/), a modern static website generator. + +## Contributing to the `docs` Folder + +This guide is for contributors looking to make changes to the documentation in the `dspy/docs` folder. + +1. **Pull the up-to-date version of the website**: Please pull the latest version of the live documentation site via its subtree repository with the following command: + +```bash +#Ensure you are in the top-level dspy/ folder +git subtree pull --prefix=docs https://github.com/krypticmouse/dspy-docs master +``` + +2. **Push your new changes on a new branch**: Feel free to add or edit existing documentation and open a PR for your changes. Once your PR is reviewed and approved, the changes will be ready to merge into main. + +3. **Updating the website**: Once your changes are merged to main, they need to be pushed to the subtree repository that hosts the live documentation site. This step will eventually be done automatically, but for now, please run the following command to push the updated `docs` content to the website subtree repository: + +```bash +#Ensure you are in the top-level dspy/ folder +git subtree push --prefix=docs https://github.com/krypticmouse/dspy-docs master +``` diff --git a/docs/assertions.md b/docs/api/assertions.md similarity index 99% rename from docs/assertions.md rename to docs/api/assertions.md index a1601227d..07dfa55f4 100644 --- a/docs/assertions.md +++ b/docs/api/assertions.md @@ -1,11 +1,14 @@ -# DSPy Assertions -## Introduction +--- +sidebar_position: 7 +--- + +# DSPy Assertions Language models (LMs) have transformed how we interact with machine learning, offering vast capabilities in natural language understanding and generation. However, ensuring these models adhere to domain-specific constraints remains a challenge. Despite the growth of techniques like fine-tuning or โ€œprompt engineeringโ€, these approaches are extremely tedious and rely on heavy, manual hand-waving to guide the LMs in adhering to specific constraints. Even DSPy's modularity of programming prompting pipelines lacks mechanisms to effectively and automatically enforce these constraints. 
To address this, we introduce DSPy Assertions, a feature within the DSPy framework designed to automate the enforcement of computational constraints on LMs. DSPy Assertions empower developers to guide LMs towards desired outcomes with minimal manual intervention, enhancing the reliability, predictability, and correctness of LM outputs. -### dspy.Assert and dspy.Suggest API +## dspy.Assert and dspy.Suggest API We introduce two primary constructs within DSPy Assertions: @@ -255,4 +258,4 @@ compiled_with_assertions_baleen = teleprompter.compile(student = baleen, teacher #Compilation + Inference with Assertions compiled_baleen_with_assertions = teleprompter.compile(student=baleen_with_assertions, teacher = baleen_with_assertions, trainset=trainset, valset=devset) -``` +``` \ No newline at end of file diff --git a/docs/api/intro.md b/docs/api/intro.md new file mode 100644 index 000000000..7859c7d0e --- /dev/null +++ b/docs/api/intro.md @@ -0,0 +1,7 @@ +--- +sidebar_position: 1 +--- + +# API References + +Welcome to the API References for DSPy! This is where you'll find easy-to-understand information about all the parts of DSPy that you can use in your projects. We've got guides on different tools and helpers that DSPy has, like modules and optimizers. Everything is sorted so you can quickly find what you need. If you're making something and need to quickly get started with DSPy to do certain tasks, this place will show you how to set it up and get it working just right. \ No newline at end of file diff --git a/docs/api/language_model_clients/Anyscale.md b/docs/api/language_model_clients/Anyscale.md new file mode 100644 index 000000000..c9485420e --- /dev/null +++ b/docs/api/language_model_clients/Anyscale.md @@ -0,0 +1,31 @@ +--- +sidebar_position: 6 +--- + +# dspy.Anyscale + +### Usage + +```python +lm = dspy.Anyscale(model="mistralai/Mistral-7B-Instruct-v0.1") +``` + +### Constructor + +The constructor initializes the base class `LM` and verifies the `api_key` for using the Anyscale API. +We expect the following environment variables to be set: +- `ANYSCALE_API_KEY`: API key for Anyscale. +- `ANYSCALE_API_BASE`: API base URL for Anyscale. + + +```python +class Anyscale(HFModel): + def __init__(self, model, **kwargs): +``` + +**Parameters:** +- `model` (_str_): models hosted on Anyscale. + +### Methods + +Refer to [`dspy.OpenAI`](https://dspy-docs.vercel.app/api/language_model_clients/OpenAI) documentation. diff --git a/docs/api/language_model_clients/AzureOpenAI.md b/docs/api/language_model_clients/AzureOpenAI.md new file mode 100644 index 000000000..1f173eb5e --- /dev/null +++ b/docs/api/language_model_clients/AzureOpenAI.md @@ -0,0 +1,56 @@ +--- +sidebar_position: 2 +--- + +# dspy.AzureOpenAI + +### Usage + +```python +lm = dspy.AzureOpenAI(api_base='...', api_version='2023-12-01-preview', model='gpt-3.5-turbo') +``` + +### Constructor + +The constructor initializes the base class `LM` and verifies the provided arguments like the `api_provider`, `api_key`, and `api_base` to set up OpenAI request retrieval through Azure. The `kwargs` attribute is initialized with default values for relevant text generation parameters needed for communicating with the GPT API, such as `temperature`, `max_tokens`, `top_p`, `frequency_penalty`, `presence_penalty`, and `n`.
+ +```python +class AzureOpenAI(LM): + def __init__( + self, + api_base: str, + api_version: str, + model: str = "gpt-3.5-turbo-instruct", + api_key: Optional[str] = None, + model_type: Literal["chat", "text"] = None, + **kwargs, + ): +``` + + + +**Parameters:** +- `api_base` (str): Azure Base URL. +- `api_version` (str): Version identifier for Azure OpenAI API. +- `api_key` (_Optional[str]_, _optional_): API provider authentication token. Retrieves from `AZURE_OPENAI_KEY` environment variable if None. +- `model_type` (_Literal["chat", "text"]_): Specified model type to use, defaults to 'chat'. +- `**kwargs`: Additional language model arguments to pass to the API provider. + +### Methods + +#### `__call__(self, prompt: str, only_completed: bool = True, return_sorted: bool = False, **kwargs) -> List[Dict[str, Any]]` + +Retrieves completions from Azure OpenAI Endpoints by calling `request`. + +Internally, the method handles the specifics of preparing the request prompt and corresponding payload to obtain the response. + +After generation, the completions are post-processed based on the `model_type` parameter. If the parameter is set to 'chat', the generated content look like `choice["message"]["content"]`. Otherwise, the generated text will be `choice["text"]`. + +**Parameters:** +- `prompt` (_str_): Prompt to send to Azure OpenAI. +- `only_completed` (_bool_, _optional_): Flag to return only completed responses and ignore completion due to length. Defaults to True. +- `return_sorted` (_bool_, _optional_): Flag to sort the completion choices using the returned averaged log-probabilities. Defaults to False. +- `**kwargs`: Additional keyword arguments for completion request. + +**Returns:** +- `List[Dict[str, Any]]`: List of completion choices. \ No newline at end of file diff --git a/docs/api/language_model_clients/Cohere.md b/docs/api/language_model_clients/Cohere.md new file mode 100644 index 000000000..f3a39e1e6 --- /dev/null +++ b/docs/api/language_model_clients/Cohere.md @@ -0,0 +1,34 @@ +--- +sidebar_position: 3 +--- + +# dsp.Cohere + +### Usage + +```python +lm = dsp.Cohere(model='command-nightly') +``` + +### Constructor + +The constructor initializes the base class `LM` and verifies the `api_key` to set up Cohere request retrieval. + +```python +class Cohere(LM): + def __init__( + self, + model: str = "command-nightly", + api_key: Optional[str] = None, + stop_sequences: List[str] = [], + ): +``` + +**Parameters:** +- `model` (_str_): Cohere pretrained models. Defaults to `command-nightly`. +- `api_key` (_Optional[str]_, _optional_): API provider from Cohere. Defaults to None. +- `stop_sequences` (_List[str]_, _optional_): List of stopping tokens to end generation. + +### Methods + +Refer to [`dspy.OpenAI`](https://dspy-docs.vercel.app/api/language_model_clients/OpenAI) documentation. diff --git a/docs/api/language_model_clients/Databricks.md b/docs/api/language_model_clients/Databricks.md new file mode 100644 index 000000000..e6201e1dc --- /dev/null +++ b/docs/api/language_model_clients/Databricks.md @@ -0,0 +1,43 @@ +--- +sidebar_position: 8 +--- + +# dspy.Databricks + +### Usage +```python +lm = dspy.Databricks(model="databricks-mpt-30b-instruct") +``` + +### Constructor + +The constructor inherits from the `GPT3` class and verifies the Databricks authentication credentials for using Databricks Model Serving API through the OpenAI SDK. +We expect the following environment variables to be set: +- `openai.api_key`: Databricks API key. 
+- `openai.base_url`: Databricks Model Endpoint url + +The `kwargs` attribute is initialized with default values for relevant text generation parameters needed for communicating with the Databricks OpenAI SDK, such as `temperature`, `max_tokens`, `top_p`, and `n`. However, it removes the `frequency_penalty` and `presence_penalty` arguments as these are not currently supported by the Databricks API. + +```python +class Databricks(GPT3): + def __init__( + self, + model: str, + api_key: Optional[str] = None, + api_base: Optional[str] = None, + model_type: Literal["chat", "text"] = None, + **kwargs, + ): +``` + +**Parameters:** +- `model` (_str_): models hosted on Databricks. +- `stop` (_List[str]_, _optional_): List of stopping tokens to end generation. +- `api_key` (_Optional[str]_): Databricks API key. Defaults to None +- `api_base` (_Optional[str]_): Databricks Model Endpoint url Defaults to None. +- `model_type` (_Literal["chat", "text", "embeddings"]_): Specified model type to use. +- `**kwargs`: Additional language model arguments to pass to the API provider. + +### Methods + +Refer to [`dspy.OpenAI`](https://dspy-docs.vercel.app/api/language_model_clients/OpenAI) documentation. \ No newline at end of file diff --git a/docs/api/language_model_clients/HFClientVLLM.md b/docs/api/language_model_clients/HFClientVLLM.md new file mode 100644 index 000000000..347ce89eb --- /dev/null +++ b/docs/api/language_model_clients/HFClientVLLM.md @@ -0,0 +1,23 @@ +--- +sidebar_position: 5 +--- + +# dspy.HFClientVLLM + +### Usage + +```python +lm = dspy.HFClientVLLM(model="meta-llama/Llama-2-7b-hf", port=8080, url="http://localhost") +``` + +### Prerequisites + +Refer to the [vLLM Server](https://dspy-docs.vercel.app/api/language_model_clients/HFClientVLLM) section of the `Using Local Models` documentation. + +### Constructor + +Refer to [`dspy.TGI`](https://dspy-docs.vercel.app/api/language_model_clients/TGI) documentation. Replace with `HFClientVLLM`. + +### Methods + +Refer to [`dspy.OpenAI`](https://dspy-docs.vercel.app/api/language_model_clients/OpenAI) documentation. \ No newline at end of file diff --git a/docs/api/language_model_clients/OpenAI.md b/docs/api/language_model_clients/OpenAI.md new file mode 100644 index 000000000..414a3afdc --- /dev/null +++ b/docs/api/language_model_clients/OpenAI.md @@ -0,0 +1,54 @@ +--- +sidebar_position: 1 +--- + +# dspy.OpenAI + +### Usage + +```python +lm = dspy.OpenAI(model='gpt-3.5-turbo') +``` + +### Constructor + +The constructor initializes the base class `LM` and verifies the provided arguments like the `api_provider`, `api_key`, and `api_base` to set up OpenAI request retrieval. The `kwargs` attribute is initialized with default values for relevant text generation parameters needed for communicating with the GPT API, such as `temperature`, `max_tokens`, `top_p`, `frequency_penalty`, `presence_penalty`, and `n`. + +```python +class OpenAI(LM): + def __init__( + self, + model: str = "text-davinci-002", + api_key: Optional[str] = None, + api_provider: Literal["openai"] = "openai", + model_type: Literal["chat", "text"] = None, + **kwargs, + ): +``` + + + +**Parameters:** +- `api_key` (_Optional[str]_, _optional_): API provider authentication token. Defaults to None. +- `api_provider` (_Literal["openai"]_, _optional_): API provider to use. Defaults to "openai". +- `model_type` (_Literal["chat", "text"]_): Specified model type to use. +- `**kwargs`: Additional language model arguments to pass to the API provider. 
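For quick orientation, here is a minimal sketch of how these constructor arguments are typically combined in a program; the specific generation kwargs values below are illustrative assumptions, not library defaults.

```python
import dspy

# Generation kwargs are forwarded to the OpenAI API; the values here are illustrative only.
turbo = dspy.OpenAI(model='gpt-3.5-turbo', max_tokens=250, temperature=0.7)

# Register the client as the default LM so DSPy modules can use it implicitly.
dspy.settings.configure(lm=turbo)

# The client can also be called directly with a raw prompt; it returns a list of completions.
print(turbo("Summarize DSPy in one sentence."))
```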
+ +### Methods + +#### `__call__(self, prompt: str, only_completed: bool = True, return_sorted: bool = False, **kwargs) -> List[Dict[str, Any]]` + +Retrieves completions from OpenAI by calling `request`. + +Internally, the method handles the specifics of preparing the request prompt and corresponding payload to obtain the response. + +After generation, the completions are post-processed based on the `model_type` parameter. If the parameter is set to 'chat', the generated content look like `choice["message"]["content"]`. Otherwise, the generated text will be `choice["text"]`. + +**Parameters:** +- `prompt` (_str_): Prompt to send to OpenAI. +- `only_completed` (_bool_, _optional_): Flag to return only completed responses and ignore completion due to length. Defaults to True. +- `return_sorted` (_bool_, _optional_): Flag to sort the completion choices using the returned averaged log-probabilities. Defaults to False. +- `**kwargs`: Additional keyword arguments for completion request. + +**Returns:** +- `List[Dict[str, Any]]`: List of completion choices. \ No newline at end of file diff --git a/docs/api/language_model_clients/TGI.md b/docs/api/language_model_clients/TGI.md new file mode 100644 index 000000000..c44902349 --- /dev/null +++ b/docs/api/language_model_clients/TGI.md @@ -0,0 +1,34 @@ +--- +sidebar_position: 4 +--- + +# dspy.HFClientTGI + +### Usage + +```python +lm = dspy.HFClientTGI(model="meta-llama/Llama-2-7b-hf", port=8080, url="http://localhost") +``` + +### Prerequisites + +Refer to the [Text Generation-Inference Server](https://dspy-docs.vercel.app/docs/deep-dive/language_model_clients/local_models/HFClientTGI) section of the `Using Local Models` documentation. + +### Constructor + +The constructor initializes the `HFModel` base class and configures the client for communicating with the TGI server. It requires a `model` instance, communication `port` for the server, and the `url` for the server to host generate requests. Additional configuration can be provided via keyword arguments in `**kwargs`. + +```python +class HFClientTGI(HFModel): + def __init__(self, model, port, url="http://future-hgx-1", **kwargs): +``` + +**Parameters:** +- `model` (_HFModel_): Instance of Hugging Face model connected to the TGI server. +- `port` (_int_): Port for TGI server. +- `url` (_str_): Base URL where the TGI server is hosted. +- `**kwargs`: Additional keyword arguments to configure the client. + +### Methods + +Refer to [`dspy.OpenAI`](https://dspy-docs.vercel.app/api/language_model_clients/OpenAI) documentation. \ No newline at end of file diff --git a/docs/api/language_model_clients/Together.md b/docs/api/language_model_clients/Together.md new file mode 100644 index 000000000..c6baccf6d --- /dev/null +++ b/docs/api/language_model_clients/Together.md @@ -0,0 +1,32 @@ +--- +sidebar_position: 7 +--- + +# dspy.Together + +### Usage + +```python +lm = dspy.Together(model="mistralai/Mistral-7B-v0.1") +``` + +### Constructor + +The constructor initializes the base class `LM` and verifies the `api_key` for using Together API. +We expect the following environment variables to be set: +- `TOGETHER_API_KEY`: API key for Together. +- `TOGETHER_API_BASE`: API base URL for Together. + + +```python +class Together(HFModel): + def __init__(self, model, **kwargs): +``` + +**Parameters:** +- `model` (_str_): models hosted on Together. +- `stop` (_List[str]_, _optional_): List of stopping tokens to end generation. 
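As a rough sketch of how the pieces above fit together (the endpoint URL and stop sequences below are placeholders, not real values):

```python
import os
import dspy

# The client reads its credentials and endpoint from the environment; these are placeholders.
os.environ["TOGETHER_API_KEY"] = "your-together-api-key"
os.environ["TOGETHER_API_BASE"] = "https://your-together-endpoint"

# `stop` is optional; the sequence below is purely illustrative.
lm = dspy.Together(model="mistralai/Mistral-7B-v0.1", stop=["\n\n"])
dspy.settings.configure(lm=lm)
```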
+ +### Methods + +Refer to [`dspy.OpenAI`](https://dspy-docs.vercel.app/api/language_model_clients/OpenAI) documentation. \ No newline at end of file diff --git a/docs/api/language_model_clients/_category_.json b/docs/api/language_model_clients/_category_.json new file mode 100644 index 000000000..3f6129bae --- /dev/null +++ b/docs/api/language_model_clients/_category_.json @@ -0,0 +1,8 @@ +{ + "label": "Language Model API Clients", + "position": 4, + "link": { + "type": "generated-index", + "description": "This documentation provides an overview of the DSPy Language Model Clients." + } +} \ No newline at end of file diff --git a/docs-page/api/modules/ChainOfThought.md b/docs/api/modules/ChainOfThought.md similarity index 99% rename from docs-page/api/modules/ChainOfThought.md rename to docs/api/modules/ChainOfThought.md index f17919fa7..ab9bd0e5f 100644 --- a/docs-page/api/modules/ChainOfThought.md +++ b/docs/api/modules/ChainOfThought.md @@ -1,5 +1,7 @@ # dspy.ChainOfThought +### Constructor + The constructor initializes the `ChainOfThought` class and sets up its attributes. It inherits from the `Predict` class and adds specific functionality for chain of thought processing. Internally, the class initializes the `activated` attribute to indicate if chain of thought processing has been selected. It extends the `signature` to include additional reasoning steps and an updated `rationale_type` when chain of thought processing is activated. diff --git a/docs-page/api/modules/ChainOfThoughtWithHint.md b/docs/api/modules/ChainOfThoughtWithHint.md similarity index 100% rename from docs-page/api/modules/ChainOfThoughtWithHint.md rename to docs/api/modules/ChainOfThoughtWithHint.md diff --git a/docs-page/api/modules/MultiChainComparison.md b/docs/api/modules/MultiChainComparison.md similarity index 100% rename from docs-page/api/modules/MultiChainComparison.md rename to docs/api/modules/MultiChainComparison.md diff --git a/docs-page/api/modules/Predict.md b/docs/api/modules/Predict.md similarity index 100% rename from docs-page/api/modules/Predict.md rename to docs/api/modules/Predict.md diff --git a/docs-page/api/modules/ProgramOfThought.md b/docs/api/modules/ProgramOfThought.md similarity index 100% rename from docs-page/api/modules/ProgramOfThought.md rename to docs/api/modules/ProgramOfThought.md diff --git a/docs-page/api/modules/ReAct.md b/docs/api/modules/ReAct.md similarity index 100% rename from docs-page/api/modules/ReAct.md rename to docs/api/modules/ReAct.md diff --git a/docs-page/api/modules/Retrieve.md b/docs/api/modules/Retrieve.md similarity index 100% rename from docs-page/api/modules/Retrieve.md rename to docs/api/modules/Retrieve.md diff --git a/docs-page/docs/deep-dive/modules/_category_.json b/docs/api/modules/_category_.json similarity index 86% rename from docs-page/docs/deep-dive/modules/_category_.json rename to docs/api/modules/_category_.json index 3488b4208..5cf44ce64 100644 --- a/docs-page/docs/deep-dive/modules/_category_.json +++ b/docs/api/modules/_category_.json @@ -1,6 +1,6 @@ { "label": "Modules", - "position": 3, + "position": 1, "link": { "type": "generated-index", "description": "Modules in DSPy" diff --git a/docs/api/optimizers/BootstrapFewShot.md b/docs/api/optimizers/BootstrapFewShot.md new file mode 100644 index 000000000..14b2deff5 --- /dev/null +++ b/docs/api/optimizers/BootstrapFewShot.md @@ -0,0 +1,63 @@ +--- +sidebar_position: 2 +--- + +# teleprompt.BootstrapFewShot + +### Constructor + +The constructor initializes the `BootstrapFewShot` class and 
sets up parameters for bootstrapping. + +```python +class BootstrapFewShot(Teleprompter): + def __init__(self, metric=None, teacher_settings={}, max_bootstrapped_demos=4, max_labeled_demos=16, max_rounds=1): + self.metric = metric + self.teacher_settings = teacher_settings + + self.max_bootstrapped_demos = max_bootstrapped_demos + self.max_labeled_demos = max_labeled_demos + self.max_rounds = max_rounds +``` + +**Parameters:** +- `metric` (_callable_, _optional_): Metric function to evaluate examples during bootstrapping. Defaults to `None`. +- `teacher_settings` (_dict_, _optional_): Settings for teacher predictor. Defaults to empty dictionary. +- `max_bootstrapped_demos` (_int_, _optional_): Maximum number of bootstrapped demonstrations per predictor. Defaults to 4. +- `max_labeled_demos` (_int_, _optional_): Maximum number of labeled demonstrations per predictor. Defaults to 16. +- `max_rounds` (_int_, _optional_): Maximum number of bootstrapping rounds. Defaults to 1. + +### Method + +#### `compile(self, student, *, teacher=None, trainset, valset=None)` + +This method compiles the BootstrapFewShot instance by performing bootstrapping to refine the student predictor. + +This process includes preparing the student and teacher predictors, which involves creating predictor copies, verifying the student predictor is uncompiled, and compiling the teacher predictor with labeled demonstrations via LabeledFewShot if the teacher predictor hasn't been compiled. + +The next stage involves preparing predictor mappings by validating that both the student and teacher predictors have the same program structure and the same signatures but are different objects. + +The final stage is performing the bootstrapping iterations. + +**Parameters:** +- `student` (_Teleprompter_): Student predictor to be compiled. +- `teacher` (_Teleprompter_, _optional_): Teacher predictor used for bootstrapping. Defaults to `None`. +- `trainset` (_list_): Training dataset used in bootstrapping. +- `valset` (_list_, _optional_): Validation dataset used in compilation. Defaults to `None`. + +**Returns:** +- The compiled `student` predictor after bootstrapping with refined demonstrations. + +### Example + +```python +#Assume defined trainset +#Assume defined RAG class +... + +#Define teleprompter and include teacher +teacher = dspy.OpenAI(model='gpt-3.5-turbo', api_key = openai.api_key, api_provider = "openai", model_type = "chat") +teleprompter = BootstrapFewShot(teacher_settings=dict({'lm': teacher})) + +# Compile! +compiled_rag = teleprompter.compile(student=RAG(), trainset=trainset) +``` diff --git a/docs/api/optimizers/BootstrapFewShotWithRandomSearch.md b/docs/api/optimizers/BootstrapFewShotWithRandomSearch.md new file mode 100644 index 000000000..1a36af2a8 --- /dev/null +++ b/docs/api/optimizers/BootstrapFewShotWithRandomSearch.md @@ -0,0 +1,59 @@ +--- +sidebar_position: 4 +--- + +# teleprompt.BootstrapFewShotWithRandomSearch + +### Constructor + +The constructor initializes the `BootstrapFewShotWithRandomSearch` class and sets up its attributes. It inherits from the `BootstrapFewShot` class and introduces additional attributes for the random search process. 
+ +```python +class BootstrapFewShotWithRandomSearch(BootstrapFewShot): + def __init__(self, metric, teacher_settings={}, max_bootstrapped_demos=4, max_labeled_demos=16, max_rounds=1, num_candidate_programs=16, num_threads=6): + self.metric = metric + self.teacher_settings = teacher_settings + self.max_rounds = max_rounds + + self.num_threads = num_threads + + self.min_num_samples = 1 + self.max_num_samples = max_bootstrapped_demos + self.num_candidate_sets = num_candidate_programs + self.max_num_traces = 1 + int(max_bootstrapped_demos / 2.0 * self.num_candidate_sets) + + self.max_bootstrapped_demos = self.max_num_traces + self.max_labeled_demos = max_labeled_demos + + print("Going to sample between", self.min_num_samples, "and", self.max_num_samples, "traces per predictor.") + print("Going to sample", self.max_num_traces, "traces in total.") + print("Will attempt to train", self.num_candidate_sets, "candidate sets.") +``` + +**Parameters:** +- `metric` (_callable_, _optional_): Metric function to evaluate examples during bootstrapping. Defaults to `None`. +- `teacher_settings` (_dict_, _optional_): Settings for teacher predictor. Defaults to empty dictionary. +- `max_bootstrapped_demos` (_int_, _optional_): Maximum number of bootstrapped demonstrations per predictor. Defaults to 4. +- `max_labeled_demos` (_int_, _optional_): Maximum number of labeled demonstrations per predictor. Defaults to 16. +- `max_rounds` (_int_, _optional_): Maximum number of bootstrapping rounds. Defaults to 1. +- `num_candidate_programs` (_int_): Number of candidate programs to generate during random search. +- `num_threads` (_int_): Number of threads used for evaluation during random search. + +### Method + +Refer to [teleprompt.BootstrapFewShot](https://dspy-docs.vercel.app/docs/deep-dive/teleprompter/bootstrap-fewshot) documentation. + +## Example + +```python +#Assume defined trainset +#Assume defined RAG class +... + +#Define teleprompter and include teacher +teacher = dspy.OpenAI(model='gpt-3.5-turbo', api_key = openai.api_key, api_provider = "openai", model_type = "chat") +teleprompter = BootstrapFewShotWithRandomSearch(teacher_settings=dict({'lm': teacher})) + +# Compile! +compiled_rag = teleprompter.compile(student=RAG(), trainset=trainset) +``` \ No newline at end of file diff --git a/docs/api/optimizers/BootstrapFinetune.md b/docs/api/optimizers/BootstrapFinetune.md new file mode 100644 index 000000000..86540aa46 --- /dev/null +++ b/docs/api/optimizers/BootstrapFinetune.md @@ -0,0 +1,56 @@ +--- +sidebar_position: 5 +--- + +# teleprompt.BootstrapFinetune + +### Constructor + +### `__init__(self, metric=None, teacher_settings={}, multitask=True)` + +The constructor initializes a `BootstrapFinetune` instance and sets up its attributes. It defines the teleprompter as a `BootstrapFewShot` instance for the finetuning compilation. + +```python +class BootstrapFinetune(Teleprompter): + def __init__(self, metric=None, teacher_settings={}, multitask=True): +``` + +**Parameters:** +- `metric` (_callable_, _optional_): Metric function to evaluate examples during bootstrapping. Defaults to `None`. +- `teacher_settings` (_dict_, _optional_): Settings for teacher predictor. Defaults to empty dictionary. +- `multitask` (_bool_, _optional_): Enable multitask fine-tuning. Defaults to `True`. 
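The `metric` argument can be any callable with the usual DSPy metric signature. A small sketch follows; the exact-match metric shown is a toy example written for illustration, not a built-in.

```python
from dspy.teleprompt import BootstrapFinetune

# A toy metric for illustration: compare the predicted answer with the gold answer.
def exact_match_metric(example, prediction, trace=None):
    return example.answer.strip().lower() == prediction.answer.strip().lower()

teleprompter = BootstrapFinetune(metric=exact_match_metric, multitask=True)
```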
+ +### Method + +#### `compile(self, student, *, teacher=None, trainset, valset=None, target='t5-large', bsize=12, accumsteps=1, lr=5e-5, epochs=1, bf16=False)` + +This method first compiles for bootstrapping with the `BootstrapFewShot` teleprompter. It then prepares fine-tuning data by generating prompt-completion pairs for training and performs finetuning. After compilation, the LMs are set to the finetuned models and the method returns a compiled and fine-tuned predictor. + +**Parameters:** +- `student` (_Predict_): Student predictor to be fine-tuned. +- `teacher` (_Predict_, _optional_): Teacher predictor to help with fine-tuning. Defaults to `None`. +- `trainset` (_list_): Training dataset for fine-tuning. +- `valset` (_list_, _optional_): Validation dataset for fine-tuning. Defaults to `None`. +- `target` (_str_, _optional_): Target model for fine-tuning. Defaults to `'t5-large'`. +- `bsize` (_int_, _optional_): Batch size for training. Defaults to `12`. +- `accumsteps` (_int_, _optional_): Gradient accumulation steps. Defaults to `1`. +- `lr` (_float_, _optional_): Learning rate for fine-tuning. Defaults to `5e-5`. +- `epochs` (_int_, _optional_): Number of training epochs. Defaults to `1`. +- `bf16` (_bool_, _optional_): Enable mixed-precision training with BF16. Defaults to `False`. + +**Returns:** +- `compiled2` (_Predict_): A compiled and fine-tuned `Predict` instance. + +### Example + +```python +#Assume defined trainset +#Assume defined RAG class +... + +#Define teleprompter +teleprompter = BootstrapFinetune(teacher_settings=dict({'lm': teacher})) + +# Compile! +compiled_rag = teleprompter.compile(student=RAG(), trainset=trainset, target='google/flan-t5-base') +``` \ No newline at end of file diff --git a/docs/api/optimizers/Ensemble.md b/docs/api/optimizers/Ensemble.md new file mode 100644 index 000000000..42dbc0dde --- /dev/null +++ b/docs/api/optimizers/Ensemble.md @@ -0,0 +1,47 @@ +--- +sidebar_position: 3 +--- + +# teleprompt.Ensemble + +### Constructor + +The constructor initializes the `Ensemble` class and sets up its attributes. This teleprompter is designed to create ensembled versions of multiple programs, reducing various outputs from different programs into a single output. + +```python +class Ensemble(Teleprompter): + def __init__(self, *, reduce_fn=None, size=None, deterministic=False): +``` + +**Parameters:** +- `reduce_fn` (_callable_, _optional_): Function used to reduce multiple outputs from different programs into a single output. A common choice is `dspy.majority`. Defaults to `None`. +- `size` (_int_, _optional_): Number of programs to randomly select for ensembling. If not specified, all programs will be used. Defaults to `None`. +- `deterministic` (_bool_, _optional_): Specifies whether ensemble should operate deterministically. Currently, setting this to `True` will raise an error as this feature is pending implementation. Defaults to `False`. + +### Method + +#### `compile(self, programs)` + +This method compiles an ensemble of programs into a single program that when run, can either randomly sample a subset of the given programs to produce outputs or use all of them. The multiple outputs can then be reduced into a single output using the `reduce_fn`. + +**Parameters:** +- `programs` (_list_): List of programs to be ensembled. + +**Returns:** +- `EnsembledProgram` (_Module_): An ensembled version of the input programs. 
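For reference, a custom `reduce_fn` only needs to accept the list of outputs produced by the sampled programs and return a single one. The helper below is a hypothetical sketch, not part of DSPy, that simply prefers the first output with a non-empty `answer` field:

```python
# Hypothetical reduce_fn: pick the first output that has a non-empty `answer` field.
def first_answered(outputs):
    for output in outputs:
        if getattr(output, "answer", None):
            return output
    return outputs[0]
```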
+ +### Example + +```python +import dspy +from dspy.teleprompt import Ensemble + +# Assume a list of programs +programs = [program1, program2, program3, ...] + +# Define Ensemble teleprompter +teleprompter = Ensemble(reduce_fn=dspy.majority, size=2) + +# Compile to get the EnsembledProgram +ensembled_program = teleprompter.compile(programs) +``` \ No newline at end of file diff --git a/docs/api/optimizers/LabeledFewShot.md b/docs/api/optimizers/LabeledFewShot.md new file mode 100644 index 000000000..d58f688aa --- /dev/null +++ b/docs/api/optimizers/LabeledFewShot.md @@ -0,0 +1,58 @@ +--- +sidebar_position: 1 +--- + +# teleprompt.LabeledFewShot + +### Constructor + +The constructor initializes the `LabeledFewShot` class and sets up its attributes, particularly defining `k` number of samples to be used by the predictor. + +```python +class LabeledFewShot(Teleprompter): + def __init__(self, k=16): + self.k = k +``` + +**Parameters:** +- `k` (_int_): Number of samples to be used for each predictor. Defaults to 16. + +### Method + +#### `compile(self, student, *, trainset)` + +This method compiles the `LabeledFewShot` instance by configuring the `student` predictor. It assigns subsets of the `trainset` in each student's predictor's `demos` attribute. If the `trainset` is empty, the method returns the original `student`. + +**Parameters:** +- `student` (_Teleprompter_): Student predictor to be compiled. +- `trainset` (_list_): Training dataset for compiling with student predictor. + +**Returns:** +- The compiled `student` predictor with assigned training samples for each predictor or the original `student` if the `trainset` is empty. + +### Example + +```python +import dspy + +#Assume defined trainset +class RAG(dspy.Module): + def __init__(self, num_passages=3): + super().__init__() + + #declare retrieval and predictor modules + self.retrieve = dspy.Retrieve(k=num_passages) + self.generate_answer = dspy.ChainOfThought(GenerateAnswer) + + #flow for answering questions using predictor and retrieval modules + def forward(self, question): + context = self.retrieve(question).passages + prediction = self.generate_answer(context=context, question=question) + return dspy.Prediction(context=context, answer=prediction.answer) + +#Define teleprompter +teleprompter = LabeledFewShot() + +# Compile! +compiled_rag = teleprompter.compile(student=RAG(), trainset=trainset) +``` \ No newline at end of file diff --git a/docs/api/optimizers/_category_.json b/docs/api/optimizers/_category_.json new file mode 100644 index 000000000..5d7edc9df --- /dev/null +++ b/docs/api/optimizers/_category_.json @@ -0,0 +1,8 @@ +{ + "label": "Optimizers", + "position": 2, + "link": { + "type": "generated-index", + "description": "Teleprompters are powerful optimizers (included in DSPy) that can learn to bootstrap and select effective prompts for the modules of any program. (The \"tele-\" in the name means \"at a distance\", i.e., automatic prompting at a distance.)\n\nThis documentation provides an overview of the DSPy Teleprompters." 
+ } +} \ No newline at end of file diff --git a/docs/api/retrieval_model_clients/AzureCognitiveSearch.md b/docs/api/retrieval_model_clients/AzureCognitiveSearch.md new file mode 100644 index 000000000..b8c353f31 --- /dev/null +++ b/docs/api/retrieval_model_clients/AzureCognitiveSearch.md @@ -0,0 +1,34 @@ +--- +sidebar_position: 3 +--- + +# retrieve.AzureCognitiveSearch + +### Constructor + +The constructor initializes an instance of the `AzureCognitiveSearch` class and sets up parameters for sending queries and retreiving results with the Azure Cognitive Search server. + +```python +class AzureCognitiveSearch: + def __init__( + self, + search_service_name: str, + search_api_key: str, + search_index_name: str, + field_text: str, + field_score: str, # required field to map with "score" field in dsp framework + ): +``` + +**Parameters:** +- `search_service_name` (_str_): Name of Azure Cognitive Search server. +- `search_api_key` (_str_): API Authentication token for accessing Azure Cognitive Search server. +- `search_index_name` (_str_): Name of search index in the Azure Cognitive Search server. +- `field_text` (_str_): Field name that maps to DSP "content" field. +- `field_score` (_str_): Field name that maps to DSP "score" field. + +### Methods + +Refer to [ColBERTv2](/api/retrieval_model_clients/ColBERTv2) documentation. Keep in mind there is no `simplify` flag for AzureCognitiveSearch. + +AzureCognitiveSearch supports sending queries and processing the received results, mapping content and scores to a correct format for the Azure Cognitive Search server. \ No newline at end of file diff --git a/docs/api/retrieval_model_clients/ChromadbRM.md b/docs/api/retrieval_model_clients/ChromadbRM.md new file mode 100644 index 000000000..9b40759a5 --- /dev/null +++ b/docs/api/retrieval_model_clients/ChromadbRM.md @@ -0,0 +1,65 @@ +--- +sidebar_position: 2 +--- + +# retrieve.ChromadbRM + +### Constructor + +Initialize an instance of the `ChromadbRM` class, with the option to use OpenAI's embeddings or any alternative supported by chromadb, as detailed in the official [chromadb embeddings documentation](https://docs.trychroma.com/embeddings). + +```python +ChromadbRM( + collection_name: str, + persist_directory: str, + embedding_function: Optional[EmbeddingFunction[Embeddable]] = OpenAIEmbeddingFunction(), + k: int = 7, +) +``` + +**Parameters:** +- `collection_name` (_str_): The name of the chromadb collection. +- `persist_directory` (_str_): Path to the directory where chromadb data is persisted. +- `embedding_function` (_Optional[EmbeddingFunction[Embeddable]]_, _optional_): The function used for embedding documents and queries. Defaults to `DefaultEmbeddingFunction()` if not specified. +- `k` (_int_, _optional_): The number of top passages to retrieve. Defaults to 7. + +### Methods + +#### `forward(self, query_or_queries: Union[str, List[str]], k: Optional[int] = None) -> dspy.Prediction` + +Search the chromadb collection for the top `k` passages matching the given query or queries, using embeddings generated via the specified `embedding_function`. + +**Parameters:** +- `query_or_queries` (_Union[str, List[str]]_): The query or list of queries to search for. +- `k` (_Optional[int]_, _optional_): The number of results to retrieve. If not specified, defaults to the value set during initialization. + +**Returns:** +- `dspy.Prediction`: Contains the retrieved passages, each represented as a `dotdict` with a `long_text` attribute. 
+ +### Quickstart with OpenAI Embeddings + +ChromadbRM have the flexibility from a variety of embedding functions as outlined in the [chromadb embeddings documentation](https://docs.trychroma.com/embeddings). While different options are available, this example demonstrates how to utilize OpenAI embeddings specifically. + +```python +from dspy.retrieve import ChromadbRM +import os +import openai +from chromadb.utils.embedding_functions import OpenAIEmbeddingFunction + +embedding_function = OpenAIEmbeddingFunction( + api_key=os.environ.get('OPENAI_API_KEY'), + model_name="text-embedding-ada-002" +) + +retriever_model = ChromadbRM( + 'your_collection_name', + '/path/to/your/db', + embedding_function=embedding_function, + k=5 +) + +results = retriever_model("Explore the significance of quantum computing", k=5) + +for result in results: + print("Document:", result.long_text, "\n") +``` \ No newline at end of file diff --git a/docs/api/retrieval_model_clients/ColBERTv2.md b/docs/api/retrieval_model_clients/ColBERTv2.md new file mode 100644 index 000000000..2dd31bef8 --- /dev/null +++ b/docs/api/retrieval_model_clients/ColBERTv2.md @@ -0,0 +1,51 @@ +--- +sidebar_position: 1 +--- + +# dspy.ColBERTv2 + +### Constructor + +The constructor initializes the `ColBERTv2` class instance and sets up the request parameters for interacting with the ColBERTv2 server. + +```python +class ColBERTv2: + def __init__( + self, + url: str = "http://0.0.0.0", + port: Optional[Union[str, int]] = None, + post_requests: bool = False, + ): +``` + +**Parameters:** +- `url` (_str_): URL for ColBERTv2 server. +- `port` (_Union[str, int]_, _Optional_): Port endpoint for ColBERTv2 server. Defaults to `None`. +- `post_requests` (_bool_, _Optional_): Flag for using HTTP POST requests. Defaults to `False`. + +### Methods + +#### `__call__(self, query: str, k: int = 10, simplify: bool = False) -> Union[list[str], list[dotdict]]` + +Enables making queries to the ColBERTv2 server for retrieval. Internally, the method handles the specifics of preparing the request prompt and corresponding payload to obtain the response. The function handles the retrieval of the top-k passages based on the provided query. + +**Parameters:** +- `query` (_str_): Query string used for retrieval. +- `k` (_int_, _optional_): Number of passages to retrieve. Defaults to 10. +- `simplify` (_bool_, _optional_): Flag for simplifying output to a list of strings. Defaults to False. + +**Returns:** +- `Union[list[str], list[dotdict]]`: Depending on `simplify` flag, either a list of strings representing the passage content (`True`) or a list of `dotdict` instances containing passage details (`False`). 
+ +### Quickstart + +```python +import dspy + +colbertv2_wiki17_abstracts = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts') + +retrieval_response = colbertv2_wiki17_abstracts('When was the first FIFA World Cup held?', k=5) + +for result in retrieval_response: + print("Text:", result['text'], "\n") +``` diff --git a/docs/api/retrieval_model_clients/FaissRM.md b/docs/api/retrieval_model_clients/FaissRM.md new file mode 100644 index 000000000..2ef9dadc1 --- /dev/null +++ b/docs/api/retrieval_model_clients/FaissRM.md @@ -0,0 +1,62 @@ +--- +sidebar_position: 4 +--- + +# retrieve.FaissRM + +### Constructor + +Initialize an instance of FaissRM by providing it with a vectorizer and a list of strings + +```python +FaissRM( + document_chunks: List[str], + vectorizer: dsp.modules.sentence_vectorizer.BaseSentenceVectorizer, + k: int = 3 +) +``` + +**Parameters:** +- `document_chunks` (_List[str]_): a list of strings that comprises the corpus to search. You cannot add/insert/upsert to this list after creating this FaissRM object. +- `vectorizer` (_dsp.modules.sentence_vectorizer.BaseSentenceVectorizer_, _optional_): If not provided, a dsp.modules.sentence_vectorizer.SentenceTransformersVectorizer object is created and used. +- `k` (_int_, _optional_): The number of top passages to retrieve. Defaults to 3. + +### Methods + +#### `forward(self, query_or_queries: Union[str, List[str]]) -> dspy.Prediction` + +Search the FaissRM vector database for the top `k` passages matching the given query or queries, using embeddings generated via the vectorizer specified at FaissRM construction time + +**Parameters:** +- `query_or_queries` (_Union[str, List[str]]_): The query or list of queries to search for. + +**Returns:** +- `dspy.Prediction`: Contains the retrieved passages, each represented as a `dotdict` with a `long_text` attribute and an `index` attribute. The `index` attribute is the index in the document_chunks array provided to this FaissRM object at construction time. + +### Quickstart with the default vectorizer + +The **FaissRM** module provides a retriever that uses an in-memory Faiss vector database. This module does not include a vectorizer; instead it supports any subclass of **dsp.modules.sentence_vectorizer.BaseSentenceVectorizer**. If a vectorizer is not provided, an instance of **dsp.modules.sentence_vectorizer.SentenceTransformersVectorizer** is created and used by **FaissRM**. 
Note that the default embedding model for **SentenceTransformersVectorizer** is **all-MiniLM-L6-v2** + + +```python +import dspy +from dspy.retrieve import FaissRM + +document_chunks = [ + "The superbowl this year was played between the San Francisco 49ers and the Kanasas City Chiefs", + "Pop corn is often served in a bowl", + "The Rice Bowl is a Chinese Restaurant located in the city of Tucson, Arizona", + "Mars is the fourth planet in the Solar System", + "An aquarium is a place where children can learn about marine life", + "The capital of the United States is Washington, D.C", + "Rock and Roll musicians are honored by being inducted in the Rock and Roll Hall of Fame", + "Music albums were published on Long Play Records in the 70s and 80s", + "Sichuan cuisine is a spicy cuisine from central China", + "The interest rates for mortgages is considered to be very high in 2024", +] + +frm = FaissRM(document_chunks) +turbo = dspy.OpenAI(model="gpt-3.5-turbo") +dspy.settings.configure(lm=turbo, rm=frm) +print(frm(["I am in the mood for Chinese food"])) +``` \ No newline at end of file diff --git a/docs/api/retrieval_model_clients/_category_.json b/docs/api/retrieval_model_clients/_category_.json new file mode 100644 index 000000000..0c3ec89a3 --- /dev/null +++ b/docs/api/retrieval_model_clients/_category_.json @@ -0,0 +1,8 @@ +{ + "label": "Retrieval Model Clients", + "position": 3, + "link": { + "type": "generated-index", + "description": "This documentation provides an overview of the DSPy Retrieval Model Clients." + } +} \ No newline at end of file diff --git a/docs-page/babel.config.js b/docs/babel.config.js similarity index 100% rename from docs-page/babel.config.js rename to docs/babel.config.js diff --git a/docs/custom.css b/docs/custom.css deleted file mode 100644 index c653343f1..000000000 --- a/docs/custom.css +++ /dev/null @@ -1,169 +0,0 @@ -.green-title { - color: green; - display: inline; - font-size: small; -} - -.platform-badge, -.version-badge { - padding: 5px; - border-radius: 5px; - margin-left: 10px; - display: inline; -} - -.platform-badge { - background-color: #007bff; - color: white; - font-size: smaller; -} - -.version-badge { - background-color: #f39d12ae; - color: rgb(0, 0, 0); - font-size: 10px; - padding: 10px; -} - -.team-members { - display: flex; - justify-content: space-around; -} - -.team-member { - text-align: center; - /* flex: 1; */ - width: 40px; - /* height: 40px; */ -} - -.title-separator, -.comments-separator { - border-top: 1px solid #ccc; - margin: 10px 0; -} - -.comments-section { - margin-top: 10px; - font-size: smaller; -} - -.warning-message { - background-color: #f44336; - color: white; - padding: 10px; - border-radius: 5px; - display: none; -} - -.warning-message.show { - display: block; -} - -.success-message { - background-color: #4caf50; - color: white; - padding: 10px; - border-radius: 5px; -} - -.priority-section { - display: flex; - align-items: center; -} - -.warning-message-below { - background-color: rgba(244, 67, 54, 0.7); /* Red with alpha 0.7 */ - color: white; - padding: 10px; - border-radius: 5px; - margin-top: 10px; - font-size: smaller; -} - -.collapsible-info { - background-color: #2baa55ce; /* Blue */ - color: white; - padding: 10px; - border-radius: 5px; - margin-top: 10px; - display: block; - font-size: smaller; -} - -.collapsible-info.show { - display: block; -} -.title-badge { - font-size: 1.2em; - background-color: #4caf50; /* Green */ - color: white; - padding: 10px; - border-radius: 5px; - display: inline; -} - 
-.model-image { - width: 130px; /* Set the width */ - max-width: 100%; /* Make sure it scales down if the container is smaller */ - height: auto; /* Maintain aspect ratio */ - display: block; /* Change from inline to block */ - margin: auto; /* Center the image */ -} - -.model-badge { - font-size: 1.2em; - background-color: #4caf50; /* Green */ - color: white; - padding: 10px; - border-radius: 5px; - display: inline; - transition: background-color 0.3s ease; /* Smooth transition */ -} - -.model-badge:hover { - background-color: #4a53ec; /* Darker green */ -} - -.title-platform-section { - display: flex; - align-items: center; -} - -.status-priority-section { - display: flex; - align-items: center; - justify-content: space-between; -} - -.card-grid { - display: flex; - flex-wrap: wrap; - gap: 16px; -} - -.card { - flex: 1; - border: 1px solid #ccc; - border-radius: 8px; - padding: 16px; - margin: 16px; - min-width: calc(45% - 20px); /* For 3 cards per row */ - max-width: calc(45% - 20px); - max-height: fit-content; - margin: 10px; -} - -button { - background-color: #023b04c0; - color: white; - padding: 14px 20px; - margin: 8px 0; - border: none; - cursor: pointer; -} - -button:disabled { - background-color: #ccc; - cursor: not-allowed; -} diff --git a/docs-page/docs/building-blocks/1-language_models.md b/docs/docs/building-blocks/1-language_models.md similarity index 90% rename from docs-page/docs/building-blocks/1-language_models.md rename to docs/docs/building-blocks/1-language_models.md index 1457f6cac..cb03ce29b 100644 --- a/docs-page/docs/building-blocks/1-language_models.md +++ b/docs/docs/building-blocks/1-language_models.md @@ -10,7 +10,7 @@ Let's first make sure you can set up your language model. DSPy support clients f ## Setting up the LM client. -You can just call the constructor that connects to the LM. Then, use `dspy.configure` to declare this as the default LM. +You can just call the constructor that connects to the LM. Then, use `dspy.configure` to declare this as the default LM. For example, to use OpenAI language models, you can do it as follows. @@ -141,31 +141,31 @@ lm = dspy.{provider_listed_below}(model="your model", model_request_kwargs="..." You need to host these models on your own GPU(s). Below, we include pointers for how to do that. -1. `dspy.HFClientTGI`: for HuggingFace models through the Text Generation Inference (TGI) system. [Tutorial: How do I install and launch the TGI server?](/api/hosting_language_models_locally/TGI) +1. `dspy.HFClientTGI`: for HuggingFace models through the Text Generation Inference (TGI) system. [Tutorial: How do I install and launch the TGI server?](https://dspy-docs.vercel.app/docs/deep-dive/language_model_clients/local_models/HFClientTGI) ```python tgi_llama2 = dspy.HFClientTGI(model="meta-llama/Llama-2-7b-hf", port=8080, url="http://localhost") ``` -2. `dspy.HFClientVLLM`: for HuggingFace models through vLLM. [Tutorial: How do I install and launch the vLLM server?](/api/hosting_language_models_locally/vLLM) +2. `dspy.HFClientVLLM`: for HuggingFace models through vLLM. [Tutorial: How do I install and launch the vLLM server?](https://dspy-docs.vercel.app/docs/deep-dive/language_model_clients/local_models/HFClientVLLM) ```python vllm_llama2 = dspy.HFClientVLLM(model="meta-llama/Llama-2-7b-hf", port=8080, url="http://localhost") ``` -3. `dspy.HFModel` (experimental) [Tutorial: How do I initialize models using HFModel](/api/hosting_language_models_locally/HFModel) +3. 
`dspy.HFModel` (experimental) [Tutorial: How do I initialize models using HFModel](https://dspy-docs.vercel.app/api/local_language_model_clients/HFModel) ```python llama = dspy.HFModel(model = 'meta-llama/Llama-2-7b-hf') ``` -4. `dspy.Ollama` (experimental) for open source models through [Ollama](https://ollama.com). [Tutorial: How do I install and use Ollama on a local computer?](/api/hosting_language_models_locally/Ollama)\n", +4. `dspy.Ollama` (experimental) for open source models through [Ollama](https://ollama.com). [Tutorial: How do I install and use Ollama on a local computer?](https://dspy-docs.vercel.app/api/local_language_model_clients/Ollama)\n", ```python mistral_ollama = dspy.OllamaLocal(model='mistral') ``` -5. `dspy.ChatModuleClient` (experimental): [How do I install and use MLC?](/api/hosting_language_models_locally/MLC) +5. `dspy.ChatModuleClient` (experimental): [How do I install and use MLC?](https://dspy-docs.vercel.app/api/local_language_model_clients/MLC) ```python model = 'dist/prebuilt/mlc-chat-Llama-2-7b-chat-hf-q4f16_1' diff --git a/docs-page/docs/building-blocks/2-signatures.md b/docs/docs/building-blocks/2-signatures.md similarity index 98% rename from docs-page/docs/building-blocks/2-signatures.md rename to docs/docs/building-blocks/2-signatures.md index c79c5c085..5ce2e56fb 100644 --- a/docs-page/docs/building-blocks/2-signatures.md +++ b/docs/docs/building-blocks/2-signatures.md @@ -155,6 +155,6 @@ Prediction( ## Using signatures to build modules & compiling them -While signatures are covenient for prototyping with structured inputs/outputs, that's not the main reason to use them! +While signatures are convenient for prototyping with structured inputs/outputs, that's not the main reason to use them! You should compose multiple signatures into bigger [DSPy modules] and [compile] these modules into optimized prompts and finetunes. diff --git a/docs-page/docs/building-blocks/3-modules.md b/docs/docs/building-blocks/3-modules.md similarity index 97% rename from docs-page/docs/building-blocks/3-modules.md rename to docs/docs/building-blocks/3-modules.md index 54ecb13da..dad351596 100644 --- a/docs-page/docs/building-blocks/3-modules.md +++ b/docs/docs/building-blocks/3-modules.md @@ -114,7 +114,7 @@ We also have some function-style modules: 6. **`dspy.majority`**: Can do basic voting to return the most popular response from a set of predictions. -Check out further examples in [each module's respective guide](/api/category/modules/). +Check out further examples in [each module's respective guide](https://dspy-docs.vercel.app/api/category/modules). ## How do I compose multiple modules into a bigger program? 
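A minimal sketch of such a composition (the class name, field names, and `k` value below are illustrative, not fixed by DSPy) declares its sub-modules in `__init__` and calls them freely inside `forward`:

```python
import dspy


class SimpleRAG(dspy.Module):
    """Sketch: retrieve a few passages, then answer with chain of thought."""

    def __init__(self, num_passages=3):
        super().__init__()
        # Declare the sub-modules this program composes.
        self.retrieve = dspy.Retrieve(k=num_passages)
        self.generate_answer = dspy.ChainOfThought("context, question -> answer")

    def forward(self, question):
        # Plain Python control flow: one module's output feeds the next.
        context = self.retrieve(question).passages
        return self.generate_answer(context=context, question=question)


# Assuming an LM and RM have already been set with dspy.settings.configure(...):
# prediction = SimpleRAG()(question="When was the first FIFA World Cup held?")
# print(prediction.answer)
```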
diff --git a/docs-page/docs/building-blocks/4-data.md b/docs/docs/building-blocks/4-data.md similarity index 100% rename from docs-page/docs/building-blocks/4-data.md rename to docs/docs/building-blocks/4-data.md diff --git a/docs-page/docs/building-blocks/5-metrics.md b/docs/docs/building-blocks/5-metrics.md similarity index 100% rename from docs-page/docs/building-blocks/5-metrics.md rename to docs/docs/building-blocks/5-metrics.md diff --git a/docs-page/docs/building-blocks/6-optimizers.md b/docs/docs/building-blocks/6-optimizers.md similarity index 95% rename from docs-page/docs/building-blocks/6-optimizers.md rename to docs/docs/building-blocks/6-optimizers.md index db6223e59..6c9c3603a 100644 --- a/docs-page/docs/building-blocks/6-optimizers.md +++ b/docs/docs/building-blocks/6-optimizers.md @@ -23,7 +23,7 @@ If you happen to have a lot of data, DSPy can leverage that. But you can start s Traditional deep neural networks (DNNs) can be optimized with gradient descent, given a loss function and some training data. -DSPy programs consist of multiple calls to LMs, stacked togther as [DSPy modules]. Each DSPy module has internal parameters of three kinds: (1) the LM weights, (2) the instructions, and (3) demonstrations of the input/output behavior. +DSPy programs consist of multiple calls to LMs, stacked together as [DSPy modules]. Each DSPy module has internal parameters of three kinds: (1) the LM weights, (2) the instructions, and (3) demonstrations of the input/output behavior. Given a metric, DSPy can optimize all of these three with multi-stage optimization algorithms. These can combine gradient descent (for LM weights) and discrete LM-driven optimization, i.e. for crafting/updating instructions and for creating/validating demonstrations. DSPy Demonstrations are like few-shot examples, but they're far more powerful. They can be created from scratch, given your program, and their creation and selection can be optimized in many effective ways. diff --git a/docs-page/docs/building-blocks/7-assertions.md b/docs/docs/building-blocks/7-assertions.md similarity index 100% rename from docs-page/docs/building-blocks/7-assertions.md rename to docs/docs/building-blocks/7-assertions.md diff --git a/docs-page/docs/building-blocks/_category_.json b/docs/docs/building-blocks/_category_.json similarity index 100% rename from docs-page/docs/building-blocks/_category_.json rename to docs/docs/building-blocks/_category_.json diff --git a/docs-page/docs/building-blocks/solving_your_task.md b/docs/docs/building-blocks/solving_your_task.md similarity index 82% rename from docs-page/docs/building-blocks/solving_your_task.md rename to docs/docs/building-blocks/solving_your_task.md index b4ba1f14e..cdfc86865 100644 --- a/docs-page/docs/building-blocks/solving_your_task.md +++ b/docs/docs/building-blocks/solving_your_task.md @@ -8,7 +8,7 @@ Using DSPy well for solving a new task is just doing good machine learning with What this means is that it's an iterative process. You make some initial choices, which will be sub-optimal, and then you refine them incrementally. -As we discuss below, you will define your task and the metrics you want to maximize, and prepare a few example inputs โ€” typically without labels (or only with labels for the final outputs, if your metric requires them). 
Then, you build your pipeline by selecting built-in layers [(`modules`)](/docs/building-blocks/modules) to use, giving each layer a [`signature` (input/output spec)](/docs/building-blocks/signatures), and then calling your modules freely in your Python code. Lastly, you use a DSPy [`optimizer`](/docs/building-blocks/optimizers) to compile your code into high-quality instructions, automatic few-shot examples, or updated LM weights for your LM. +As we discuss below, you will define your task and the metrics you want to maximize, and prepare a few example inputs — typically without labels (or only with labels for the final outputs, if your metric requires them). Then, you build your pipeline by selecting built-in layers [(`modules`)](https://dspy-docs.vercel.app/docs/building-blocks/modules) to use, giving each layer a [`signature` (input/output spec)](https://dspy-docs.vercel.app/docs/building-blocks/signatures), and then calling your modules freely in your Python code. Lastly, you use a DSPy [`optimizer`](https://dspy-docs.vercel.app/docs/building-blocks/optimizers) to compile your code into high-quality instructions, automatic few-shot examples, or updated LM weights for your LM. ## 1) Define your task. @@ -31,7 +31,7 @@ What should your DSPy program do? Can it just be a simple chain-of-thought step? Is there a typical workflow for solving your problem in multiple well-defined steps? Or do you want a fully open-ended LM (or open-ended tool use with agents) for your task? -Think about this space but always start simple. Almost every task should probably start with just a single [dspy.ChainofThought](/api/modules/ChainOfThought) module, and then add complexity incrementally as you go. +Think about this space but always start simple. Almost every task should probably start with just a single [dspy.ChainOfThought](https://dspy-docs.vercel.app/api/modules/ChainOfThought) module, and then add complexity incrementally as you go. Then write your (initial) DSPy program. Again: start simple, and let the next few steps guide any complexity you will add. @@ -39,7 +39,7 @@ By this point, you probably have a few examples of the task you're trying to solve. -Run them through your pipeline. Consider using a large and powerful LM at this point, or a couple of different LMs, just to understand what's possible. (DSPy will make swapping these LMs pretty easy - [LM Guide](/docs/building-blocks/language_models).) +Run them through your pipeline. Consider using a large and powerful LM at this point, or a couple of different LMs, just to understand what's possible. (DSPy will make swapping these LMs pretty easy - [LM Guide](https://dspy-docs.vercel.app/docs/building-blocks/language_models).) At this point, you're still using your pipeline zero-shot, so it will be far from perfect. DSPy will help you optimize the instructions, few-shot examples, and even weights of your LM calls below, but understanding where things go wrong in zero-shot usage will go a long way. @@ -47,7 +47,7 @@ Record the interesting (both easy and hard) examples you try: even if you don't ## 4) Define your data. -Now it's time to more formally declare your training and validation data for DSPy evaluation and optimization - [Data Guide](/docs/building-blocks/data). +Now it's time to more formally declare your training and validation data for DSPy evaluation and optimization - [Data Guide](https://dspy-docs.vercel.app/docs/building-blocks/data). 
You can use DSPy optimizers usefully with as few as 10 examples, but having 50-100 examples (or even better, 300-500 examples) goes a long way. @@ -62,7 +62,7 @@ If there's data whose licenses are permissive enough, we suggest you use them. O What makes outputs from your system good or bad? Invest in defining metrics and improving them over time incrementally. It's really hard to consistently improve what you aren't able to define. -A metric is just a function that will take examples from your data and take the output of your system, and return a score that quantifies how good the output is - [Metric Guide](/docs/building-blocks/metrics). +A metric is just a function that will take examples from your data and take the output of your system, and return a score that quantifies how good the output is - [Metric Guide](https://dspy-docs.vercel.app/docs/building-blocks/metrics). For simple tasks, this could be just "accuracy" or "exact match" or "F1 score". This may be the case for simple classification or short-form QA tasks. @@ -79,17 +79,17 @@ Look at the outputs and the metric scores. This will probably allow you to spot ## 7) Compile with a DSPy optimizer. -Given some data and a metric, we can now optimize the program you built - [Optimizer Guide](/docs/building-blocks/optimizers). +Given some data and a metric, we can now optimize the program you built - [Optimizer Guide](https://dspy-docs.vercel.app/docs/building-blocks/optimizers). DSPy includes many optimizers that do different things. Remember: DSPy optimizers will create examples of each step, craft instructions, and/or update LM weights. In general, you don't need to have labels for your pipeline steps, but your data examples need to have input values and whatever labels your metric requires (e.g., no labels if your metric is reference-free, but final output labels otherwise in most cases). Here's the general guidance on getting started: -* If you have very little data, e.g. 10 examples of your task, use [`BootstrapFewShot`](/docs/deep-dive/teleprompter/bootstrap-fewshot) +* If you have very little data, e.g. 10 examples of your task, use [`BootstrapFewShot`](https://dspy-docs.vercel.app/docs/deep-dive/teleprompter/bootstrap-fewshot) -* If you have slightly more data, e.g. 50 examples of your task, use [`BootstrapFewShotWithRandomSearch`](/docs/deep-dive/teleprompter/bootstrap-fewshot). +* If you have slightly more data, e.g. 50 examples of your task, use [`BootstrapFewShotWithRandomSearch`](https://dspy-docs.vercel.app/docs/deep-dive/teleprompter/bootstrap-fewshot). -* If you have more data than that, e.g. 300 examples or more, use [`BayesianSignatureOptimizer`](/docs/deep-dive/teleprompter/signature-optimizer). +* If you have more data than that, e.g. 300 examples or more, use [`BayesianSignatureOptimizer`](https://dspy-docs.vercel.app/docs/deep-dive/teleprompter/signature-optimizer). * If you have been able to use one of these with a large LM (e.g., 7B parameters or above) and need a very efficient program, compile that down to a small LM with `BootstrapFinetune`. @@ -97,7 +97,7 @@ Here's the general guidance on getting started: At this point, you are either very happy with everything (we've seen quite a few people get it right on first try with DSPy) or, more likely, you've made a lot of progress but you don't like something about the final program or the metric. -At this point, go back to step 1 and revisit the major questions. Did you define your task well? Do you need to collect (or find online) more data for your problem? 
Do you want to update your metric? And do you want to use a more sophisticated optimizer? Do you need to consider advanced features like [DSPy Assertions](/docs/building-blocks/assertions)? Or, perhaps most importantly, do you want to add some more complexity or steps in your DSPy program itself? Do you want to use multiple optimizers in a sequence? +At this point, go back to step 1 and revisit the major questions. Did you define your task well? Do you need to collect (or find online) more data for your problem? Do you want to update your metric? And do you want to use a more sophisticated optimizer? Do you need to consider advanced features like [DSPy Assertions](https://dspy-docs.vercel.app/docs/building-blocks/assertions)? Or, perhaps most importantly, do you want to add some more complexity or steps in your DSPy program itself? Do you want to use multiple optimizers in a sequence? Iterative development is key. DSPy gives you the pieces to do that incrementally: iterating on your data, your program structure, your assertions, your metric, and your optimization steps. diff --git a/docs-page/docs/cheatsheet.md b/docs/docs/cheatsheet.md similarity index 100% rename from docs-page/docs/cheatsheet.md rename to docs/docs/cheatsheet.md diff --git a/docs-page/docs/deep-dive/_category_.json b/docs/docs/deep-dive/_category_.json similarity index 100% rename from docs-page/docs/deep-dive/_category_.json rename to docs/docs/deep-dive/_category_.json diff --git a/docs-page/docs/deep-dive/data-handling/_category_.json b/docs/docs/deep-dive/data-handling/_category_.json similarity index 100% rename from docs-page/docs/deep-dive/data-handling/_category_.json rename to docs/docs/deep-dive/data-handling/_category_.json diff --git a/docs-page/docs/deep-dive/data-handling/built-in-datasets.mdx b/docs/docs/deep-dive/data-handling/built-in-datasets.mdx similarity index 100% rename from docs-page/docs/deep-dive/data-handling/built-in-datasets.mdx rename to docs/docs/deep-dive/data-handling/built-in-datasets.mdx diff --git a/docs-page/docs/deep-dive/data-handling/examples.mdx b/docs/docs/deep-dive/data-handling/examples.mdx similarity index 100% rename from docs-page/docs/deep-dive/data-handling/examples.mdx rename to docs/docs/deep-dive/data-handling/examples.mdx diff --git a/docs-page/docs/deep-dive/data-handling/img/data-loading.png b/docs/docs/deep-dive/data-handling/img/data-loading.png similarity index 100% rename from docs-page/docs/deep-dive/data-handling/img/data-loading.png rename to docs/docs/deep-dive/data-handling/img/data-loading.png diff --git a/docs-page/docs/deep-dive/data-handling/loading-custom-data.mdx b/docs/docs/deep-dive/data-handling/loading-custom-data.mdx similarity index 100% rename from docs-page/docs/deep-dive/data-handling/loading-custom-data.mdx rename to docs/docs/deep-dive/data-handling/loading-custom-data.mdx diff --git a/docs-page/docs/deep-dive/language_model_clients/_category_.json b/docs/docs/deep-dive/language_model_clients/_category_.json similarity index 100% rename from docs-page/docs/deep-dive/language_model_clients/_category_.json rename to docs/docs/deep-dive/language_model_clients/_category_.json diff --git a/docs-page/docs/deep-dive/language_model_clients/custom-lm-client.mdx b/docs/docs/deep-dive/language_model_clients/custom-lm-client.mdx similarity index 100% rename from docs-page/docs/deep-dive/language_model_clients/custom-lm-client.mdx rename to docs/docs/deep-dive/language_model_clients/custom-lm-client.mdx diff --git 
a/docs-page/docs/deep-dive/language_model_clients/local_models/HFClientTGI.mdx b/docs/docs/deep-dive/language_model_clients/local_models/HFClientTGI.mdx similarity index 96% rename from docs-page/docs/deep-dive/language_model_clients/local_models/HFClientTGI.mdx rename to docs/docs/deep-dive/language_model_clients/local_models/HFClientTGI.mdx index 0a0f4531b..2550ed3c5 100644 --- a/docs-page/docs/deep-dive/language_model_clients/local_models/HFClientTGI.mdx +++ b/docs/docs/deep-dive/language_model_clients/local_models/HFClientTGI.mdx @@ -4,7 +4,7 @@ import AuthorDetails from '@site/src/components/AuthorDetails'; ### Prerequisites - Launching TGI Server locally -Refer to the [Text Generation-Inference Server API](/api/hosting_language_models_locally/TGI) for setting up the TGI server locally. +Refer to the [Text Generation-Inference Server API](/api/local_language_model_clients/TGI) for setting up the TGI server locally. ```bash #Example TGI Server Launch diff --git a/docs-page/docs/deep-dive/language_model_clients/local_models/HFClientVLLM.mdx b/docs/docs/deep-dive/language_model_clients/local_models/HFClientVLLM.mdx similarity index 96% rename from docs-page/docs/deep-dive/language_model_clients/local_models/HFClientVLLM.mdx rename to docs/docs/deep-dive/language_model_clients/local_models/HFClientVLLM.mdx index 66eb63f1d..059c2a9be 100644 --- a/docs-page/docs/deep-dive/language_model_clients/local_models/HFClientVLLM.mdx +++ b/docs/docs/deep-dive/language_model_clients/local_models/HFClientVLLM.mdx @@ -4,7 +4,7 @@ import AuthorDetails from '@site/src/components/AuthorDetails'; ### Prerequisites - Launching vLLM Server locally -Refer to the [vLLM Server API](/api/hosting_language_models_locally/vLLM) for setting up the vLLM server locally. +Refer to the [vLLM Server API](/api/local_language_model_clients/vLLM) for setting up the vLLM server locally. 
```bash #Example vLLM Server Launch diff --git a/docs-page/docs/deep-dive/language_model_clients/remote_models/Anyscale.mdx b/docs/docs/deep-dive/language_model_clients/remote_models/Anyscale.mdx similarity index 100% rename from docs-page/docs/deep-dive/language_model_clients/remote_models/Anyscale.mdx rename to docs/docs/deep-dive/language_model_clients/remote_models/Anyscale.mdx diff --git a/docs-page/docs/deep-dive/language_model_clients/remote_models/Cohere.mdx b/docs/docs/deep-dive/language_model_clients/remote_models/Cohere.mdx similarity index 100% rename from docs-page/docs/deep-dive/language_model_clients/remote_models/Cohere.mdx rename to docs/docs/deep-dive/language_model_clients/remote_models/Cohere.mdx diff --git a/docs-page/docs/deep-dive/language_model_clients/remote_models/OpenAI.mdx b/docs/docs/deep-dive/language_model_clients/remote_models/OpenAI.mdx similarity index 100% rename from docs-page/docs/deep-dive/language_model_clients/remote_models/OpenAI.mdx rename to docs/docs/deep-dive/language_model_clients/remote_models/OpenAI.mdx diff --git a/docs-page/docs/deep-dive/language_model_clients/remote_models/Together.mdx b/docs/docs/deep-dive/language_model_clients/remote_models/Together.mdx similarity index 100% rename from docs-page/docs/deep-dive/language_model_clients/remote_models/Together.mdx rename to docs/docs/deep-dive/language_model_clients/remote_models/Together.mdx diff --git a/docs-page/docs/deep-dive/language_model_clients/remote_models/_category_.json b/docs/docs/deep-dive/language_model_clients/remote_models/_category_.json similarity index 100% rename from docs-page/docs/deep-dive/language_model_clients/remote_models/_category_.json rename to docs/docs/deep-dive/language_model_clients/remote_models/_category_.json diff --git a/docs-page/api/modules/_category_.json b/docs/docs/deep-dive/modules/_category_.json similarity index 100% rename from docs-page/api/modules/_category_.json rename to docs/docs/deep-dive/modules/_category_.json diff --git a/docs-page/docs/deep-dive/modules/assertions.mdx b/docs/docs/deep-dive/modules/assertions.mdx similarity index 100% rename from docs-page/docs/deep-dive/modules/assertions.mdx rename to docs/docs/deep-dive/modules/assertions.mdx diff --git a/docs-page/docs/deep-dive/modules/chain-of-thought-with-hint.mdx b/docs/docs/deep-dive/modules/chain-of-thought-with-hint.mdx similarity index 100% rename from docs-page/docs/deep-dive/modules/chain-of-thought-with-hint.mdx rename to docs/docs/deep-dive/modules/chain-of-thought-with-hint.mdx diff --git a/docs-page/docs/deep-dive/modules/guide.mdx b/docs/docs/deep-dive/modules/guide.mdx similarity index 86% rename from docs-page/docs/deep-dive/modules/guide.mdx rename to docs/docs/deep-dive/modules/guide.mdx index db7cb7bac..29c349e32 100644 --- a/docs-page/docs/deep-dive/modules/guide.mdx +++ b/docs/docs/deep-dive/modules/guide.mdx @@ -14,7 +14,7 @@ Remember that **DSPy program** is just Python code that calls one or more **DSPy A **DSPy module** is a building block for programs that use LMs. -- Each built-in module abstracts a **prompting technique** (like chain of thought or ReAct). Crucially, they are generalized to handle any [DSPy Signature](/docs/building-blocks/2-signatures.md). +- Each built-in module abstracts a **prompting technique** (like chain of thought or ReAct). Crucially, they are generalized to handle any [DSPy Signature](https://dspy-docs.vercel.app/docs/building-blocks/signatures). 
- A DSPy module has **learnable parameters** (i.e., the little pieces comprising the prompt and the LM weights) and can be invoked (called) to process inputs and return outputs. @@ -22,15 +22,15 @@ A **DSPy module** is a building block for programs that use LMs. ### 2) What DSPy Modules are currently built-in? -1. **[`dspy.Predict`](/api/modules/Predict)**: +1. **[`dspy.Predict`](https://dspy-docs.vercel.app/api/modules/Predict)**: -2. **[`dspy.ChainOfThought`](/api/modules/ChainOfThought)**: +2. **[`dspy.ChainOfThought`](https://dspy-docs.vercel.app/api/modules/ChainOfThought)**: -3. **[`dspy.ProgramOfThought`](/api/modules/ProgramOfThought)**: +3. **[`dspy.ProgramOfThought`](https://dspy-docs.vercel.app/api/modules/ProgramOfThought)**: -4. **[`dspy.ReAct`](/api/modules/ReAct)**: +4. **[`dspy.ReAct`](https://dspy-docs.vercel.app/api/modules/ReAct)**: -5. **[`dspy.MultiChainComparison`](/api/modules/MultiChainComparison)**: +5. **[`dspy.MultiChainComparison`](https://dspy-docs.vercel.app/api/modules/MultiChainComparison)**: We also have some function-style modules: @@ -41,7 +41,7 @@ We also have some function-style modules: Let's start with the most fundamental one, `dspy.Predict`. Internally, all of the others are just built using it! -We'll assume you are already at least a little familiar with [DSPy signatures](/docs/building-blocks/2-signatures.md), which are declarative specs for defining the behavior of any module we use in DSPy. +We'll assume you are already at least a little familiar with [DSPy signatures](https://dspy-docs.vercel.app/docs/building-blocks/signatures), which are declarative specs for defining the behavior of any module we use in DSPy. To use a module, we first **declare** it by giving it a signature. Then we **call** the module with the input arguments, and extract the output fields! @@ -127,7 +127,7 @@ True The others are very similar, `dspy.ReAct` and `dspy.ProgramOfThought` etc. They mainly change the internal behavior with which your signature is implemented! -Check out further examples in [each module's respective guide](/api/category/modules/). +Check out further examples in [each module's respective guide](https://dspy-docs.vercel.app/docs/category/modules). ### 5) How do I compose multiple modules into a bigger program? 
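As a rough illustration (the signatures and names below are invented for this sketch), composing modules is just the declare-then-call pattern applied to several sub-modules chained in plain Python:

```python
import dspy


class OutlineThenAnswer(dspy.Module):
    """Sketch: draft an outline first, then answer the question using it."""

    def __init__(self):
        super().__init__()
        self.make_outline = dspy.ChainOfThought("question -> outline")
        self.answer = dspy.ChainOfThought("question, outline -> answer")

    def forward(self, question):
        # The first sub-module's output becomes an input field of the second.
        outline = self.make_outline(question=question).outline
        return self.answer(question=question, outline=outline)


# response = OutlineThenAnswer()(question="Why do leaves change color in autumn?")
# print(response.answer)
```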
diff --git a/docs-page/docs/deep-dive/modules/program-of-thought.mdx b/docs/docs/deep-dive/modules/program-of-thought.mdx similarity index 100% rename from docs-page/docs/deep-dive/modules/program-of-thought.mdx rename to docs/docs/deep-dive/modules/program-of-thought.mdx diff --git a/docs-page/docs/deep-dive/modules/react.mdx b/docs/docs/deep-dive/modules/react.mdx similarity index 100% rename from docs-page/docs/deep-dive/modules/react.mdx rename to docs/docs/deep-dive/modules/react.mdx diff --git a/docs-page/docs/deep-dive/modules/retrieve.mdx b/docs/docs/deep-dive/modules/retrieve.mdx similarity index 100% rename from docs-page/docs/deep-dive/modules/retrieve.mdx rename to docs/docs/deep-dive/modules/retrieve.mdx diff --git a/docs-page/docs/deep-dive/retrieval_models_clients/Azure.mdx b/docs/docs/deep-dive/retrieval_models_clients/Azure.mdx similarity index 100% rename from docs-page/docs/deep-dive/retrieval_models_clients/Azure.mdx rename to docs/docs/deep-dive/retrieval_models_clients/Azure.mdx diff --git a/docs-page/docs/deep-dive/retrieval_models_clients/ChromadbRM.mdx b/docs/docs/deep-dive/retrieval_models_clients/ChromadbRM.mdx similarity index 100% rename from docs-page/docs/deep-dive/retrieval_models_clients/ChromadbRM.mdx rename to docs/docs/deep-dive/retrieval_models_clients/ChromadbRM.mdx diff --git a/docs-page/docs/deep-dive/retrieval_models_clients/ColBERTv2.mdx b/docs/docs/deep-dive/retrieval_models_clients/ColBERTv2.mdx similarity index 100% rename from docs-page/docs/deep-dive/retrieval_models_clients/ColBERTv2.mdx rename to docs/docs/deep-dive/retrieval_models_clients/ColBERTv2.mdx diff --git a/docs-page/docs/deep-dive/retrieval_models_clients/_category_.json b/docs/docs/deep-dive/retrieval_models_clients/_category_.json similarity index 100% rename from docs-page/docs/deep-dive/retrieval_models_clients/_category_.json rename to docs/docs/deep-dive/retrieval_models_clients/_category_.json diff --git a/docs-page/docs/deep-dive/retrieval_models_clients/custom-rm-client.mdx b/docs/docs/deep-dive/retrieval_models_clients/custom-rm-client.mdx similarity index 100% rename from docs-page/docs/deep-dive/retrieval_models_clients/custom-rm-client.mdx rename to docs/docs/deep-dive/retrieval_models_clients/custom-rm-client.mdx diff --git a/docs-page/docs/deep-dive/retrieval_models_clients/img/io_rm_module.png b/docs/docs/deep-dive/retrieval_models_clients/img/io_rm_module.png similarity index 100% rename from docs-page/docs/deep-dive/retrieval_models_clients/img/io_rm_module.png rename to docs/docs/deep-dive/retrieval_models_clients/img/io_rm_module.png diff --git a/docs-page/docs/deep-dive/signature/_category_.json b/docs/docs/deep-dive/signature/_category_.json similarity index 100% rename from docs-page/docs/deep-dive/signature/_category_.json rename to docs/docs/deep-dive/signature/_category_.json diff --git a/docs-page/docs/deep-dive/signature/executing-signatures.mdx b/docs/docs/deep-dive/signature/executing-signatures.mdx similarity index 100% rename from docs-page/docs/deep-dive/signature/executing-signatures.mdx rename to docs/docs/deep-dive/signature/executing-signatures.mdx diff --git a/docs-page/docs/deep-dive/signature/img/class_based_prompt_creation.png b/docs/docs/deep-dive/signature/img/class_based_prompt_creation.png similarity index 100% rename from docs-page/docs/deep-dive/signature/img/class_based_prompt_creation.png rename to docs/docs/deep-dive/signature/img/class_based_prompt_creation.png diff --git 
a/docs-page/docs/deep-dive/signature/img/dspy_signatures.png b/docs/docs/deep-dive/signature/img/dspy_signatures.png similarity index 100% rename from docs-page/docs/deep-dive/signature/img/dspy_signatures.png rename to docs/docs/deep-dive/signature/img/dspy_signatures.png diff --git a/docs-page/docs/deep-dive/signature/img/prompt_creation.png b/docs/docs/deep-dive/signature/img/prompt_creation.png similarity index 100% rename from docs-page/docs/deep-dive/signature/img/prompt_creation.png rename to docs/docs/deep-dive/signature/img/prompt_creation.png diff --git a/docs-page/docs/deep-dive/signature/understanding-signatures.mdx b/docs/docs/deep-dive/signature/understanding-signatures.mdx similarity index 100% rename from docs-page/docs/deep-dive/signature/understanding-signatures.mdx rename to docs/docs/deep-dive/signature/understanding-signatures.mdx diff --git a/docs-page/docs/deep-dive/teleprompter/_category_.json b/docs/docs/deep-dive/teleprompter/_category_.json similarity index 100% rename from docs-page/docs/deep-dive/teleprompter/_category_.json rename to docs/docs/deep-dive/teleprompter/_category_.json diff --git a/docs-page/docs/deep-dive/teleprompter/bootstrap-fewshot.mdx b/docs/docs/deep-dive/teleprompter/bootstrap-fewshot.mdx similarity index 96% rename from docs-page/docs/deep-dive/teleprompter/bootstrap-fewshot.mdx rename to docs/docs/deep-dive/teleprompter/bootstrap-fewshot.mdx index 5d30cdd09..4d15d6709 100644 --- a/docs-page/docs/deep-dive/teleprompter/bootstrap-fewshot.mdx +++ b/docs/docs/deep-dive/teleprompter/bootstrap-fewshot.mdx @@ -10,7 +10,7 @@ When compiling a DSPy program, we generally invoke a teleprompter, which is an o ## Setting up a Sample Pipeline -We'll be making a basic answer generation pipeline over GSM8K dataset that we saw in the [Minimal Example](/docs/quick-start/minimal-example), we won't be changing anything in it! So let's start by configuring the LM which will be OpenAI LM client with `gpt-3.5-turbo` as the LLM in use. +We'll be making a basic answer generation pipeline over GSM8K dataset that we saw in the [Minimal Example](https://dspy-docs.vercel.app/docs/quick-start/minimal-example), we won't be changing anything in it! So let's start by configuring the LM which will be OpenAI LM client with `gpt-3.5-turbo` as the LLM in use. 
```python import dspy diff --git a/docs-page/docs/deep-dive/teleprompter/img/signature_optimizer.png b/docs/docs/deep-dive/teleprompter/img/signature_optimizer.png similarity index 100% rename from docs-page/docs/deep-dive/teleprompter/img/signature_optimizer.png rename to docs/docs/deep-dive/teleprompter/img/signature_optimizer.png diff --git a/docs-page/docs/deep-dive/teleprompter/img/signature_optimizer_process.png b/docs/docs/deep-dive/teleprompter/img/signature_optimizer_process.png similarity index 100% rename from docs-page/docs/deep-dive/teleprompter/img/signature_optimizer_process.png rename to docs/docs/deep-dive/teleprompter/img/signature_optimizer_process.png diff --git a/docs-page/docs/deep-dive/teleprompter/img/signature_optimizer_process_v2.png b/docs/docs/deep-dive/teleprompter/img/signature_optimizer_process_v2.png similarity index 100% rename from docs-page/docs/deep-dive/teleprompter/img/signature_optimizer_process_v2.png rename to docs/docs/deep-dive/teleprompter/img/signature_optimizer_process_v2.png diff --git a/docs-page/docs/deep-dive/teleprompter/img/signature_optimizer_process_v3.png b/docs/docs/deep-dive/teleprompter/img/signature_optimizer_process_v3.png similarity index 100% rename from docs-page/docs/deep-dive/teleprompter/img/signature_optimizer_process_v3.png rename to docs/docs/deep-dive/teleprompter/img/signature_optimizer_process_v3.png diff --git a/docs-page/docs/deep-dive/teleprompter/img/signature_optimizer_process_v4.png b/docs/docs/deep-dive/teleprompter/img/signature_optimizer_process_v4.png similarity index 100% rename from docs-page/docs/deep-dive/teleprompter/img/signature_optimizer_process_v4.png rename to docs/docs/deep-dive/teleprompter/img/signature_optimizer_process_v4.png diff --git a/docs-page/docs/deep-dive/teleprompter/signature-optimizer.mdx b/docs/docs/deep-dive/teleprompter/signature-optimizer.mdx similarity index 100% rename from docs-page/docs/deep-dive/teleprompter/signature-optimizer.mdx rename to docs/docs/deep-dive/teleprompter/signature-optimizer.mdx diff --git a/docs-page/docs/faqs.md b/docs/docs/faqs.md similarity index 87% rename from docs-page/docs/faqs.md rename to docs/docs/faqs.md index b65b2e517..3308ba5a5 100644 --- a/docs-page/docs/faqs.md +++ b/docs/docs/faqs.md @@ -16,7 +16,7 @@ The **DSPy** philosophy and abstraction differ significantly from other librarie ## Basic Usage -**How should I use DSPy for my task?** We wrote a [seven-step guide](/docs/building-blocks/solving_your_task.md) on this. In short, using DSPy is an iterative process. You first define your task and the metrics you want to maximize, and prepare a few example inputs — typically without labels (or only with labels for the final outputs, if your metric requires them). Then, you build your pipeline by selecting built-in layers (`modules`) to use, giving each layer a `signature` (input/output spec), and then calling your modules freely in your Python code. Lastly, you use a DSPy `optimizer` to compile your code into high-quality instructions, automatic few-shot examples, or updated LM weights for your LM. +**How should I use DSPy for my task?** We wrote an [eight-step guide](https://dspy-docs.vercel.app/docs/building-blocks/solving_your_task) on this. In short, using DSPy is an iterative process. You first define your task and the metrics you want to maximize, and prepare a few example inputs — typically without labels (or only with labels for the final outputs, if your metric requires them). 
Then, you build your pipeline by selecting built-in layers (`modules`) to use, giving each layer a `signature` (input/output spec), and then calling your modules freely in your Python code. Lastly, you use a DSPy `optimizer` to compile your code into high-quality instructions, automatic few-shot examples, or updated LM weights for your LM. **How do I convert my complex prompt into a DSPy pipeline?** See the same answer above. @@ -34,11 +34,11 @@ You can specify the generation of long responses as a `dspy.OutputField`. To ens - **How do I define my own metrics? Can metrics return a float?** -You can define metrics as simply Python functions that process model generations and evaluate them based on user-defined requirements. Metrics can compare existent data (e.g. gold labels) to model predictions or they can be used to assess various components of an output using validation feedback from LMs (e.g. LLMs-as-Judges). Metrics can return `bool`, `int`, and `float` types scores. Check out the official [Metrics docs](/docs/building-blocks/5-metrics.md) to learn more about defining custom metrics and advanced evaluations using AI feedback and/or DSPy programs. +You can define metrics as simply Python functions that process model generations and evaluate them based on user-defined requirements. Metrics can compare existent data (e.g. gold labels) to model predictions or they can be used to assess various components of an output using validation feedback from LMs (e.g. LLMs-as-Judges). Metrics can return `bool`, `int`, and `float` types scores. Check out the official [Metrics docs](https://dspy-docs.vercel.app/docs/building-blocks/metrics) to learn more about defining custom metrics and advanced evaluations using AI feedback and/or DSPy programs. - **How expensive or slow is compiling??** -To reflect compiling metrics, we highlight an experiment for reference, compiling the [`SimplifiedBaleen`](/docs/tutorials/simplified-baleen.md) using the `dspy.BootstrapFewShotWithRandomSearch` optimizer on the `gpt-3.5-turbo-1106` model over 7 candidate programs and 10 threads. We report that compiling this program takes around 6 minutes with 3200 API calls, 2.7 million input tokens and 156,000 output tokens, reporting a total cost of $3 USD (at the current pricing of the OpenAI model). +To reflect compiling metrics, we highlight an experiment for reference, compiling the [`SimplifiedBaleen`](https://dspy-docs.vercel.app/docs/tutorials/simplified-baleen) using the [`dspy.BootstrapFewShotWithRandomSearch`](https://dspy-docs.vercel.app/docs/deep-dive/teleprompter/bootstrap-fewshot) optimizer on the `gpt-3.5-turbo-1106` model over 7 candidate programs and 10 threads. We report that compiling this program takes around 6 minutes with 3200 API calls, 2.7 million input tokens and 156,000 output tokens, reporting a total cost of $3 USD (at the current pricing of the OpenAI model). Compiling DSPy `optimizers` naturally will incur additional LM calls, but we substantiate this overhead with minimalistic executions with the goal of maximizing performance. This invites avenues to enhance performance of smaller models by compiling DSPy programs with larger models to learn enhanced behavior during compile-time and propagate such behavior to the tested smaller model during inference-time. @@ -88,7 +88,7 @@ Modules can be frozen by setting their `._compiled` attribute to be True, indica You can specify JSON-type descriptions in the `desc` field of the long-form signature `dspy.OutputField` (e.g. 
`output = dspy.OutputField(desc='key-value pairs')`). -If you notice outputs are still not conforming to JSON formatting, try Asserting this constraint! Check out [Assertions](/docs/building-blocks/7-assertions.md) (or the next question!) +If you notice outputs are still not conforming to JSON formatting, try Asserting this constraint! Check out [Assertions](https://dspy-docs.vercel.app/docs/building-blocks/assertions) (or the next question!) - **How do I use DSPy assertions?** @@ -127,4 +127,4 @@ If all variables seem stable, you may be experiencing timeouts or backoff errors **How can I add my favorite LM or vector store?** -Check out these walkthroughs on setting up a [Custom LM client](/docs/deep-dive/language_model_clients/custom-lm-client.mdx) and [Custom RM client](/docs/deep-dive/retrieval_models_clients/custom-rm-client.mdx). +Check out these walkthroughs on setting up a [Custom LM client](https://dspy-docs.vercel.app/docs/deep-dive/language_model_clients/custom-lm-client) and [Custom RM client](https://dspy-docs.vercel.app/docs/deep-dive/retrieval_models_clients/custom-rm-client). diff --git a/docs-page/docs/intro.md b/docs/docs/intro.md similarity index 100% rename from docs-page/docs/intro.md rename to docs/docs/intro.md diff --git a/docs-page/docs/quick-start/_category_.json b/docs/docs/quick-start/_category_.json similarity index 100% rename from docs-page/docs/quick-start/_category_.json rename to docs/docs/quick-start/_category_.json diff --git a/docs-page/docs/quick-start/installation.mdx b/docs/docs/quick-start/installation.mdx similarity index 100% rename from docs-page/docs/quick-start/installation.mdx rename to docs/docs/quick-start/installation.mdx diff --git a/docs-page/docs/quick-start/minimal-example.mdx b/docs/docs/quick-start/minimal-example.mdx similarity index 100% rename from docs-page/docs/quick-start/minimal-example.mdx rename to docs/docs/quick-start/minimal-example.mdx diff --git a/docs-page/docs/tutorials/_category_.json b/docs/docs/tutorials/_category_.json similarity index 100% rename from docs-page/docs/tutorials/_category_.json rename to docs/docs/tutorials/_category_.json diff --git a/docs/docs/tutorials/examples.md b/docs/docs/tutorials/examples.md new file mode 100644 index 000000000..a0f51e436 --- /dev/null +++ b/docs/docs/tutorials/examples.md @@ -0,0 +1,28 @@ +--- +sidebar_position: 99998 +--- + +# Community Examples + +The DSPy team believes complexity has to be justified. We take this seriously: we never release a complex tutorial (above) or example (below) _unless we can demonstrate empirically that this complexity has generally led to improved quality or cost._ This kind of rule is rarely enforced by other frameworks or docs, but you can count on it in DSPy examples. + +There's a bunch of examples in the `examples/` directory and in the top-level directory. We welcome contributions! + +You can find other examples tweeted by [@lateinteraction](https://twitter.com/lateinteraction) on Twitter/X. 
+ +**Some other examples (not exhaustive, feel free to add more via PR):** + +- Applying DSPy Assertions + - [Long-form Answer Generation with Citations, by Arnav Singhvi](https://colab.research.google.com/github/stanfordnlp/dspy/blob/main/examples/longformqa/longformqa_assertions.ipynb) + - [Generating Answer Choices for Quiz Questions, by Arnav Singhvi](https://colab.research.google.com/github/stanfordnlp/dspy/blob/main/examples/quiz/quiz_assertions.ipynb) + - [Generating Tweets for QA, by Arnav Singhvi](https://colab.research.google.com/github/stanfordnlp/dspy/blob/main/examples/tweets/tweets_assertions.ipynb) +- [Compiling LCEL runnables from LangChain in DSPy](https://github.com/stanfordnlp/dspy/blob/main/examples/tweets/compiling_langchain.ipynb) +- [AI feedback, or writing LM-based metrics in DSPy](https://github.com/stanfordnlp/dspy/blob/main/examples/tweets/tweet_metric.py) +- [DSPy Optimizers Benchmark on a bunch of different tasks, by Michael Ryan](https://github.com/stanfordnlp/dspy/tree/main/testing/tasks) +- [Indian Languages NLI with gains due to compiling by Saiful Haq](https://github.com/saifulhaq95/DSPy-Indic/blob/main/indicxlni.ipynb) +- [Sophisticated Extreme Multi-Class Classification, IReRa, by Karel Dโ€™Oosterlinck](https://github.com/KarelDO/xmc.dspy) +- [DSPy on BIG-Bench Hard Example, by Chris Levy](https://drchrislevy.github.io/posts/dspy/dspy.html) +- [Using Ollama with DSPy for Mistral (quantized) by @jrknox1977](https://gist.github.com/jrknox1977/78c17e492b5a75ee5bbaf9673aee4641) +- [Using DSPy, "The Unreasonable Effectiveness of Eccentric Automatic Prompts" (paper) by VMware's Rick Battle & Teja Gollapudi, and interview at TheRegister](https://www.theregister.com/2024/02/22/prompt_engineering_ai_models/) + +There are also recent cool examples at [Weaviate's DSPy cookbook](https://github.com/weaviate/recipes/tree/main/integrations/dspy) by Connor Shorten. [See tutorial on YouTube](https://www.youtube.com/watch?v=CEuUG4Umfxs). \ No newline at end of file diff --git a/docs-page/docs/tutorials/other_tutorial.md b/docs/docs/tutorials/other_tutorial.md similarity index 89% rename from docs-page/docs/tutorials/other_tutorial.md rename to docs/docs/tutorials/other_tutorial.md index 4545eea7d..6f678da9c 100644 --- a/docs-page/docs/tutorials/other_tutorial.md +++ b/docs/docs/tutorials/other_tutorial.md @@ -9,8 +9,8 @@ sidebar_position: 99999 | **Level** | **Tutorial** | **Run in Colab** | **Description** | | --- | ------------- | ------------- | ------------- | | Beginner | [**Getting Started**](https://github.com/stanfordnlp/dspy/blob/main/intro.ipynb) | [](https://colab.research.google.com/github/stanfordnlp/dspy/blob/main/intro.ipynb) | Introduces the basic building blocks in DSPy. Tackles the task of complex question answering with HotPotQA. | -| Beginner | [**Minimal Working Example**](/docs/quick-start/minimal-example) | N/A | Builds and optimizes a very simple chain-of-thought program in DSPy for math question answering. Very short. | -| Beginner | [**Compiling for Tricky Tasks**](https://github.com/stanfordnlp/dspy/blob/main/examples/nli/scone/scone.ipynb) | N/A | Teaches LMs to reason about logical statements and negation. Uses GPT-4 to bootstrap few-shot CoT demonstations for GPT-3.5. Establishes a state-of-the-art result on [ScoNe](https://arxiv.org/abs/2305.19426). Contributed by [Chris Potts](https://twitter.com/ChrisGPotts/status/1740033519446057077). 
| +| Beginner | [**Minimal Working Example**](https://dspy-docs.vercel.app/docs/quick-start/minimal-example) | N/A | Builds and optimizes a very simple chain-of-thought program in DSPy for math question answering. Very short. | +| Beginner | [**Compiling for Tricky Tasks**](https://github.com/stanfordnlp/dspy/blob/main/examples/nli/scone/scone.ipynb) | N/A | Teaches LMs to reason about logical statements and negation. Uses GPT-4 to bootstrap few-shot CoT demonstrations for GPT-3.5. Establishes a state-of-the-art result on [ScoNe](https://arxiv.org/abs/2305.19426). Contributed by [Chris Potts](https://twitter.com/ChrisGPotts/status/1740033519446057077). | | Beginner | [**Local Models & Custom Datasets**](https://github.com/stanfordnlp/dspy/blob/main/skycamp2023.ipynb) | [](https://colab.research.google.com/github/stanfordnlp/dspy/blob/main/skycamp2023.ipynb) | Illustrates two different things together: how to use local models (Llama-2-13B in particular) and how to use your own data examples for training and development. | Intermediate | [**The DSPy Paper**](https://arxiv.org/abs/2310.03714) | N/A | Sections 3, 5, 6, and 7 of the DSPy paper can be consumed as a tutorial. They include explained code snippets, results, and discussions of the abstractions and API. | Intermediate | [**DSPy Assertions**](https://arxiv.org/abs/2312.13382) | [](https://colab.research.google.com/github/stanfordnlp/dspy/blob/main/examples/longformqa/longformqa_assertions.ipynb) | Introduces example of applying DSPy Assertions while generating long-form responses to questions with citations. Presents comparative evaluation in both zero-shot and compiled settings. diff --git a/docs-page/docs/tutorials/rag.md b/docs/docs/tutorials/rag.md similarity index 92% rename from docs-page/docs/tutorials/rag.md rename to docs/docs/tutorials/rag.md index 0560d10b5..fe45e8b30 100644 --- a/docs-page/docs/tutorials/rag.md +++ b/docs/docs/tutorials/rag.md @@ -10,7 +10,7 @@ RAG ensures LLMs can dynamically utilize real-time knowledge even if not origina ## Configuring LM and RM -We'll start by setting up the language model (LM) and retrieval model (RM), which **DSPy** supports through multiple [LM](/docs/category/remote-language-model-clients) and [RM](/docs/category/retrieval-model-clients) APIs and [local models hosting](/docs/category/local-language-model-clients). +We'll start by setting up the language model (LM) and retrieval model (RM), which **DSPy** supports through multiple [LM](https://dspy-docs.vercel.app/docs/category/language-model-clients) and [RM](https://dspy-docs.vercel.app/docs/category/retrieval-model-clients) APIs and [local models hosting](https://dspy-docs.vercel.app/docs/category/local-language-model-clients). In this notebook, we'll work with GPT-3.5 (`gpt-3.5-turbo`) and the `ColBERTv2` retriever (a free server hosting a Wikipedia 2017 "abstracts" search index containing the first paragraph of each article from this [2017 dump](https://hotpotqa.github.io/wiki-readme.html)). We configure the LM and RM within DSPy, allowing DSPy to internally call the respective module when needed for generation or retrieval. @@ -47,7 +47,7 @@ len(trainset), len(devset) ## Building Signatures -Now that we have the data loaded, let's start defining the [signatures](/docs/building-blocks/signatures) for the sub-tasks of our pipeline. +Now that we have the data loaded, let's start defining the [signatures](https://dspy-docs.vercel.app/docs/building-blocks/signatures) for the sub-tasks of our pipeline. 
We can identify our simple input `question` and output `answer`, but since we are building out a RAG pipeline, we wish to utilize some contextual information from our ColBERT corpus. So let's define our signature: `context, question --> answer`. @@ -64,7 +64,7 @@ We include small descriptions for the `context` and `answer` fields to define mo ## Building the Pipeline -We will build our RAG pipeline as a [DSPy module](/docs/building-blocks/modules) which will require two methods: +We will build our RAG pipeline as a [DSPy module](https://dspy-docs.vercel.app/docs/building-blocks/modules) which will require two methods: * The `__init__` method will simply declare the sub-modules it needs: `dspy.Retrieve` and `dspy.ChainOfThought`. The latter is defined to implement our `GenerateAnswer` signature. * The `forward` method will describe the control flow of answering the question using the modules we have: Given a question, we'll search for the top-3 relevant passages and then feed them as context for answer generation. @@ -88,7 +88,7 @@ class RAG(dspy.Module): ##### Compiling the RAG program -Having defined this program, let's now **compile** it. [Compiling a program](/docs/building-blocks/optimizers) will update the parameters stored in each module. In our setting, this is primarily in the form of collecting and selecting good demonstrations for inclusion within the prompt(s). +Having defined this program, let's now **compile** it. [Compiling a program](https://dspy-docs.vercel.app/docs/building-blocks/optimizers) will update the parameters stored in each module. In our setting, this is primarily in the form of collecting and selecting good demonstrations for inclusion within the prompt(s). Compiling depends on three things: diff --git a/docs-page/docs/tutorials/simplified-baleen.md b/docs/docs/tutorials/simplified-baleen.md similarity index 97% rename from docs-page/docs/tutorials/simplified-baleen.md rename to docs/docs/tutorials/simplified-baleen.md index 2f9f2e6d6..6a7e94ca2 100644 --- a/docs-page/docs/tutorials/simplified-baleen.md +++ b/docs/docs/tutorials/simplified-baleen.md @@ -10,7 +10,7 @@ The standard approach for this challenge in retrieval-augmented NLP literature i ## Configuring LM and RM -We'll start by setting up the language model (LM) and retrieval model (RM), which **DSPy** supports through multiple [LM](/docs/category/remote-language-model-clients) and [RM](/docs/category/retrieval-model-clients) APIs and [local models hosting](/docs/category/local-language-model-clients). +We'll start by setting up the language model (LM) and retrieval model (RM), which **DSPy** supports through multiple [LM](https://dspy-docs.vercel.app/docs/category/language-model-clients) and [RM](https://dspy-docs.vercel.app/docs/category/retrieval-model-clients) APIs and [local models hosting](https://dspy-docs.vercel.app/docs/category/local-language-model-clients). In this notebook, we'll work with GPT-3.5 (`gpt-3.5-turbo`) and the `ColBERTv2` retriever (a free server hosting a Wikipedia 2017 "abstracts" search index containing the first paragraph of each article from this [2017 dump](https://hotpotqa.github.io/wiki-readme.html)). We configure the LM and RM within DSPy, allowing DSPy to internally call the respective module when needed for generation or retrieval. 
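As a quick sketch, that configuration amounts to a few lines (the ColBERTv2 URL below is the public demo endpoint used in the retrieval quickstart; substitute your own LM client and index as needed):

```python
import dspy

# Language model: GPT-3.5 through the OpenAI client.
turbo = dspy.OpenAI(model='gpt-3.5-turbo')

# Retrieval model: the hosted Wikipedia 2017 "abstracts" ColBERTv2 index.
colbertv2_wiki17_abstracts = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')

# Register both as defaults so DSPy modules can call them implicitly.
dspy.settings.configure(lm=turbo, rm=colbertv2_wiki17_abstracts)
```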
diff --git a/docs/docs_requirements.txt b/docs/docs_requirements.txt deleted file mode 100644 index 9837430cf..000000000 --- a/docs/docs_requirements.txt +++ /dev/null @@ -1,5 +0,0 @@ -mkdocs -mkdocs-gen-files -mkdocs-material -mkdocs-material-extensions -mkdocstrings-python diff --git a/docs-page/docusaurus.config.ts b/docs/docusaurus.config.ts similarity index 100% rename from docs-page/docusaurus.config.ts rename to docs/docusaurus.config.ts diff --git a/docs/guides/README.md b/docs/guides/README.md deleted file mode 100644 index 4f699d07e..000000000 --- a/docs/guides/README.md +++ /dev/null @@ -1,3 +0,0 @@ -For the guides, please visit: - -**https://dspy-docs.vercel.app/docs/category/dspy-building-blocks** diff --git a/docs/guides/assertions.ipynb b/docs/guides/assertions.ipynb deleted file mode 100644 index 711e6e3d6..000000000 --- a/docs/guides/assertions.ipynb +++ /dev/null @@ -1,77 +0,0 @@ -{ - "cells": [ - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "%load_ext autoreload\n", - "%autoreload 2\n", - "import sys; sys.path.append('/future/u/okhattab/repos/public/tmp/dspy')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\"DSPy7\n", - "\n", - "## Guide: **DSPy Assertions**\n", - "\n", - "[](https://colab.research.google.com/github/stanfordnlp/dspy/blob/main/docs/guides/signatures.ipynb)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Quick Recap\n", - "\n", - "This guide assumes you followed the [intro tutorial]() to build your first few DSPy programs.\n", - "\n", - "Remember that a **DSPy program** is just Python code that calls one or more DSPy modules, like `dspy.Predict` or `dspy.ChainOfThought`, to use LMs." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 1) What is a DSPy Assertion?\n", - "\n", - "While we prepare this guide, please [read the DSPy assertions paper](https://arxiv.org/abs/2312.13382) and follow the examples in it." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Install `dspy-ai` if needed. Then set up a default language model.\n", - "# TODO: Add a graceful line for OPENAI_API_KEY.\n", - "\n", - "try: import dspy\n", - "except ImportError:\n", - " %pip install dspy-ai\n", - " import dspy\n", - "\n", - "dspy.configure(lm=dspy.OpenAI(model='gpt-3.5-turbo-1106'))" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "language_info": { - "name": "python" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} diff --git a/docs/guides/language_model_details/launching_mlc.md b/docs/guides/language_model_details/launching_mlc.md deleted file mode 100644 index 87bf65d0d..000000000 --- a/docs/guides/language_model_details/launching_mlc.md +++ /dev/null @@ -1,48 +0,0 @@ -## Setting up an MLC language model - -### Prerequisites - -Install the required packages using the following commands: - -```shell -pip install --no-deps --pre --force-reinstall mlc-ai-nightly-cu118 mlc-chat-nightly-cu118 -f https://mlc.ai/wheels -pip install transformers -git lfs install -``` - -Adjust the pip wheels according to your OS/platform by referring to the provided commands in [MLC packages](https://mlc.ai/package/). - - -### Running MLC Llama-2 models - -1. Create a directory for prebuilt models: - -```shell -mkdir -p dist/prebuilt -``` - -2. 
Clone the necessary libraries from the repository: - -```shell -git clone https://github.com/mlc-ai/binary-mlc-llm-libs.git dist/prebuilt/lib -cd dist/prebuilt -``` - -3. Choose a Llama-2 model from [MLC LLMs](https://huggingface.co/mlc-ai) and clone the model repository: - -```shell -git clone https://huggingface.co/mlc-ai/mlc-chat-Llama-2-7b-chat-hf-q4f16_1 -``` - -### Sending requests to the server - -Initialize the `ChatModuleClient` within your program with the desired parameters. Here's an example call: - -```python -model = 'dist/prebuilt/mlc-chat-Llama-2-7b-chat-hf-q4f16_1' -model_path = 'dist/prebuilt/lib/Llama-2-7b-chat-hf-q4f16_1-cuda.so' - -llama = dspy.ChatModuleClient(model=model, model_path=model_path) -``` - -Please refer to the [official MLC repository](https://github.com/mlc-ai/mlc-llm) for more detailed [docs](https://mlc.ai/mlc-llm/docs/get_started/try_out.html). diff --git a/docs/guides/language_model_details/launching_ollama.md b/docs/guides/language_model_details/launching_ollama.md deleted file mode 100644 index 242fc6ec7..000000000 --- a/docs/guides/language_model_details/launching_ollama.md +++ /dev/null @@ -1,41 +0,0 @@ -## Setting up an Ollama language model - -Ollama is a good software tool that allows you to run LLMs locally, such as Mistral, Llama2, and Phi. -The following are the instructions to install and run Ollama. - -### Prerequisites - -Install Ollama by following the instructions from this page: - -- https://ollama.ai - -Download model: `ollama pull` - -Download a model by running the `ollama pull` command. You can download Mistral, Llama2, and Phi. - -```bash -# download mistral -ollama pull mistral -``` - -Here is the list of other models you can download: -- https://ollama.ai/library - -### Running Ollama model - -Run model: `ollama run` - -You can test a model by running the model with the `ollama run` command. - -```bash -# run mistral -ollama run mistral -``` - -### Sending requests to the server - -Here is the code to load a model through Ollama: - -```python -lm = dspy.OllamaLocal(model='mistral') -``` diff --git a/docs/guides/language_model_details/launching_tgi.md b/docs/guides/language_model_details/launching_tgi.md deleted file mode 100644 index d45c2dea5..000000000 --- a/docs/guides/language_model_details/launching_tgi.md +++ /dev/null @@ -1,60 +0,0 @@ -## Launching a Text Generation Inference (TGI) Server - -### Prerequisites - -- Docker must be installed on your system. If you don't have Docker installed, you can get it from [here](https://docs.docker.com/get-docker/). - -### Setting up the Text-Generation-Inference Server - -1. Clone the Text-Generation-Inference repository from GitHub by executing the following command: - -```bash -git clone https://github.com/huggingface/text-generation-inference.git -``` - -2. Change into the cloned repository directory: - -```bash -cd text-generation-inference -``` - -3. Execute the Docker command under the "Get Started" section to run the server: - -```bash -model=meta-llama/Llama-2-7b-hf # set to the specific Hugging Face model ID you wish to use. -num_shard=1 # set to the number of shards you wish to use. -volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run - -docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference:latest --model-id $model --num-shard $num_shard -``` - -This command will start the server and make it accessible at `http://localhost:8080`. 
- -If you want to connect to [Meta Llama 2 models](https://huggingface.co/meta-llama), make sure to use version 9.3 (or higher) of the docker image (ghcr.io/huggingface/text-generation-inference:0.9.3) and pass in your huggingface token as an environment variable. - -```bash -docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data -e HUGGING_FACE_HUB_TOKEN={your_token} ghcr.io/huggingface/text-generation-inference:latest --model-id $model --num-shard $num_shard -``` - -### Sending requests to the server - -After setting up the text-generation-inference server and ensuring that it displays "Connected" when it's running, you can interact with it using the `HFClientTGI`. - -Initialize the `HFClientTGI` within your program with the desired parameters. Here is an example call: - - ```python - lm = dspy.HFClientTGI(model="meta-llama/Llama-2-7b-hf", port=8080, url="http://localhost") - ``` - - Customize the `model`, `port`, and `url` according to your requirements. The `model` parameter should be set to the specific Hugging Face model ID you wish to use. - - -### FAQs - -1. If your model doesn't require any shards, you still need to set a value for `num_shard`, but you don't need to include the parameter `--num-shard` on the command line. - -2. If your model runs into any "token exceeded" issues, you can set the following parameters on the command line to adjust the input length and token limit: - - `--max-input-length`: Set the maximum allowed input length for the text. - - `--max-total-tokens`: Set the maximum total tokens allowed for text generation. - -Please refer to the [official TGI repository](https://github.com/huggingface/text-generation-inference) for detailed docs. diff --git a/docs/guides/language_model_details/launching_vllm.md b/docs/guides/language_model_details/launching_vllm.md deleted file mode 100644 index f75d538b0..000000000 --- a/docs/guides/language_model_details/launching_vllm.md +++ /dev/null @@ -1,31 +0,0 @@ -## Launching a vLLM Server - -### Setting up the vLLM Server - -Follow these steps to set up the vLLM Server: - -1. Build the server from source by following the instructions provided in the [Build from Source guide](https://vllm.readthedocs.io/en/latest/getting_started/installation.html#build-from-source). - -2. Start the server by running the following command, and specify your desired model, host, and port using the appropriate arguments. The default server address is http://localhost:8000. - -Example command: - -```bash - python -m vllm.entrypoints.openai.api_server --model mosaicml/mpt-7b --port 8000 -``` - -This will launch the vLLM server. - -### Sending requests to the server - -After setting up the vLLM server and ensuring that it displays "Connected" when it's running, you can interact with it using the `HFClientVLLM`. - -Initialize the `HFClientVLLM` within your program with the desired parameters. Here is an example call: - -```python - lm = dspy.HFClientVLLM(model="mosaicml/mpt-7b", port=8000, url="http://localhost") -``` - -Customize the `model`, `port`, `url`, and `max_tokens` according to your requirements. The `model` parameter should be set to the specific Hugging Face model ID you wish to use. - -Please refer to the [official vLLM repository](https://github.com/vllm-project/vllm) for more detailed information and documentation. 
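Whichever of the local servers above is used, the corresponding client still has to be registered with DSPy before modules will pick it up. Here is a minimal sketch, assuming an Ollama server is already running locally with the `mistral` model pulled (see the Ollama guide above):

```python
import dspy

# Assumes `ollama pull mistral` has been run and the Ollama server is up.
local_lm = dspy.OllamaLocal(model='mistral')

# Register it as the default LM for all DSPy modules in this process.
dspy.configure(lm=local_lm)

qa = dspy.Predict('question -> answer')
print(qa(question="What is retrieval-augmented generation?").answer)
```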
diff --git a/docs/guides/language_models.ipynb b/docs/guides/language_models.ipynb deleted file mode 100644 index be7f37318..000000000 --- a/docs/guides/language_models.ipynb +++ /dev/null @@ -1,257 +0,0 @@ -{ - "cells": [ - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "%load_ext autoreload\n", - "%autoreload 2\n", - "import sys; sys.path.append('/future/u/okhattab/repos/public/tmp/dspy')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\"DSPy7\n", - "\n", - "## Guide: **Language Models**\n", - "\n", - "[](https://colab.research.google.com/github/stanfordnlp/dspy/blob/main/docs/guides/language_models.ipynb)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Quick Recap\n", - "\n", - "This guide assumes you followed the [intro tutorial]() to build your first few DSPy programs.\n", - "\n", - "Remember that a **DSPy program** is just Python code that calls one or more DSPy modules, like `dspy.Predict` or `dspy.ChainOfThought`, to use LMs." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 1) Short Intro to LMs in DSPy\n", - "\n" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [], - "source": [ - "# Install `dspy-ai` if needed.\n", - "\n", - "try: import dspy\n", - "except ImportError:\n", - " %pip install dspy-ai\n", - " import dspy" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 2) Supported LM clients.\n", - "\n", - "#### Remote LMs.\n", - "\n", - "These models are managed services. You just need to sign up and obtain an API key.\n", - "\n", - "1. `dspy.OpenAI` for GPT-3.5 and GPT-4.\n", - "\n", - "2. `dspy.Cohere`\n", - "\n", - "3. `dspy.Anyscale` for hosted Llama2 models.\n", - "\n", - "4. `dspy.Together` for hosted various open source models.\n", - "\n", - "#### Local LMs.\n", - "\n", - "You need to host these models on your own GPU(s). Below, we include pointers for how to do that.\n", - "\n", - "4. `dspy.HFClientTGI`: for HuggingFace models through the Text Generation Inference (TGI) system. [Tutorial: How do I install and launch the TGI server?](language_model_details/launching_tgi.md)\n", - "\n", - "5. `dspy.HFClientVLLM`: for HuggingFace models through vLLM. [Tutorial: How do I install and launch the vLLM server?](language_model_details/launching_vllm.md)\n", - "\n", - "6. `dspy.HFModel` (experimental)\n", - "\n", - "7. `dspy.Ollama` (experimental) for open source models through [Ollama](https://ollama.com). [Tutorial: How do I install and use Ollama on a local computer?](language_model_details/launching_ollama.md)\n", - "\n", - "\n", - "8. `dspy.ChatModuleClient` (experimental): [How do I install and use MLC?](language_model_details/launching_mlc.md)\n", - "\n", - "\n", - "\n", - "If there are other clients you want added, let us know!" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 3) Setting up the LM client.\n", - "\n", - "You can just call the constructor that connects to the LM. Then, use `dspy.configure` to declare this as the default LM.\n", - "\n", - "For example, for OpenAI, you can do it as follows." 
- ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [], - "source": [ - "# TODO: Add a graceful line for OPENAI_API_KEY.\n", - "\n", - "gpt3_turbo = dspy.OpenAI(model='gpt-3.5-turbo-1106', max_tokens=300)\n", - "gpt4_turbo = dspy.OpenAI(model='gpt-4-1106-preview', max_tokens=300)\n", - "\n", - "# cohere = dspy.Cohere(...)\n", - "# anyscale = dspy.Anyscale(...)\n", - "# together = dspy.Together(...)\n", - "# ollama = dspy.OllamaLocal(...)\n", - "# tgi_llama2 = dspy.HFClientTGI(model=\"meta-llama/Llama-2-7b-hf\", port=8080, url=\"http://localhost\")\n", - "\n", - "dspy.configure(lm=gpt3_turbo)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 4) Using a different LM within a code block.\n", - "\n", - "The default LM above is GPT-3.5, `gpt3_turbo`. What if I want to run a piece of code with, say, GPT-4 or LLama-2?\n", - "\n", - "Instead of changing the default LM, you can just change it inside a block of code.\n", - "\n", - "**Tip:** Using `dspy.configure` and `dspy.context` is thread-safe!" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "The castle David Gregory inherited has 7 floors.\n", - "The number of floors in the castle David Gregory inherited cannot be determined with the information provided.\n" - ] - } - ], - "source": [ - "qa = dspy.ChainOfThought('question -> answer')\n", - "\n", - "response = qa(question=\"How many floors are in the castle David Gregory inherited?\")\n", - "print(response.answer)\n", - "\n", - "with dspy.context(lm=gpt4_turbo):\n", - " response = qa(question=\"How many floors are in the castle David Gregory inherited?\")\n", - " print(response.answer)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 5) Tips and Tricks.\n", - "\n", - "In DSPy, all LM calls are cached. If you repeat the same call, you will get the same outputs. (If you change the inputs or configurations, you will get new outputs.)\n", - "\n", - "To generate 5 outputs, you can use `n=5` in the module constructor, or pass `config=dict(n=5)` when invoking the module." - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "[\"The specific number of floors in David Gregory's inherited castle is not provided here, so further research would be needed to determine the answer.\",\n", - " 'The castle David Gregory inherited has 4 floors.',\n", - " 'The castle David Gregory inherited has 5 floors.',\n", - " 'David Gregory inherited 10 floors in the castle.',\n", - " 'The castle David Gregory inherited has 5 floors.']" - ] - }, - "execution_count": 13, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "qa = dspy.ChainOfThought('question -> answer', n=5)\n", - "\n", - "response = qa(question=\"How many floors are in the castle David Gregory inherited?\")\n", - "response.completions.answer" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If you just call `qa(...)` in a loop with the same input, it will always return the same value! That's by design.\n", - "\n", - "To loop and generate one output at a time with the same input, bypass the cache by making sure each request is (slightly) unique, as below." 
- ] - }, - { - "cell_type": "code", - "execution_count": 14, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "The specific number of floors in David Gregory's inherited castle is not provided here, so further research would be needed to determine the answer.\n", - "It is not possible to determine the exact number of floors in the castle David Gregory inherited without specific information about the castle's layout and history.\n", - "The castle David Gregory inherited has 5 floors.\n", - "We need more information to determine the number of floors in the castle David Gregory inherited.\n", - "The castle David Gregory inherited has a total of 6 floors.\n" - ] - } - ], - "source": [ - "for idx in range(5):\n", - " response = qa(question=\"How many floors are in the castle David Gregory inherited?\", config=dict(temperature=0.7+0.0001*idx))\n", - " print(response.answer)" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "py39_nov2023", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.9.18" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} diff --git a/docs/guides/modules.ipynb b/docs/guides/modules.ipynb deleted file mode 100644 index e75531241..000000000 --- a/docs/guides/modules.ipynb +++ /dev/null @@ -1,287 +0,0 @@ -{ - "cells": [ - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [], - "source": [ - "%load_ext autoreload\n", - "%autoreload 2\n", - "import sys; sys.path.append('/future/u/okhattab/repos/public/tmp/dspy')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\"DSPy7\n", - "\n", - "## Guide: **DSPy Modules**\n", - "\n", - "[](https://colab.research.google.com/github/stanfordnlp/dspy/blob/main/docs/guides/signatures.ipynb)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Quick Recap\n", - "\n", - "This guide assumes you followed the [intro tutorial]() to build your first few DSPy programs.\n", - "\n", - "Remember that **DSPy program** is just Python code that calls one or more **DSPy modules**, like `dspy.Predict` or `dspy.ChainOfThought`, to use LMs." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 1) What is a DSPy Module?\n", - "\n", - "A **DSPy module** is a building block for programs that use LMs.\n", - "\n", - "- Each built-in module abstracts a **prompting technique** (like chain of thought or ReAct). Crucially, they are generalized to handle any [DSPy Signature]().\n", - "\n", - "- A DSPy module has **learnable parameters** (i.e., the little pieces comprising the prompt and the LM weights) and can be invoked (called) to process inputs and return outputs.\n", - "\n", - "- Multiple modules can be composed into bigger modules (programs). DSPy modules are inspired directly by NN modules in PyTorch, but applied to LM programs." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 2) Why should I use a DSPy Module?\n", - "\n", - "TODO. I typically take this as self-evident, but I'll spell it out here." - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [], - "source": [ - "# Install `dspy-ai` if needed. 
Then set up a default language model.\n", - "# TODO: Add a graceful line for OPENAI_API_KEY.\n", - "\n", - "try: import dspy\n", - "except ImportError:\n", - " %pip install dspy-ai\n", - " import dspy\n", - "\n", - "dspy.configure(lm=dspy.OpenAI(model='gpt-3.5-turbo-1106'))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 3) What DSPy Modules are currently built-in?\n", - "\n", - "1. **`dspy.Predict`**:\n", - "\n", - "2. **`dspy.ChainOfThought`**: \n", - "\n", - "3. **`dspy.ProgramOfThought`**:\n", - "\n", - "4. **`dspy.ReAct`**:\n", - "\n", - "5. **`dspy.MultiChainComparison`**:\n", - "\n", - "\n", - "We also have some function-style modules:\n", - "\n", - "6. **`dspy.majority`**:" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 4) How do I use a built-in module, like `dspy.Predict` or `dspy.ChainOfThought`?\n", - "\n", - "Let's start with the most fundamental one, `dspy.Predict`. Internally, all of the others are just built using it!\n", - "\n", - "We'll assume you are already at least a little familiar with [DSPy signatures](), which are declarative specs for defining the behavior of any module we use in DSPy.\n", - "\n", - "To use a module, we first **declare** it by giving it a signature. Then we **call** the module with the input arguments, and extract the output fields!" - ] - }, - { - "cell_type": "code", - "execution_count": 35, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Positive\n" - ] - } - ], - "source": [ - "sentence = \"it's a charming and often affecting journey.\" # example from the SST-2 dataset.\n", - "\n", - "# 1) Declare with a signature.\n", - "classify = dspy.Predict('sentence -> sentiment')\n", - "\n", - "# 2) Call with input argument(s). \n", - "response = classify(sentence=sentence)\n", - "\n", - "# 3) Access the output.\n", - "print(response.sentiment)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "When we declare a module, we can pass configuration keys to it.\n", - "\n", - "Below, we'll pass `n=5` to request five completions. We can also pass `temperature` or `max_len`, etc.\n", - "\n", - "Let's use `dspy.ChainOfThought`. In many cases, simply swapping `dspy.ChainOfThought` in place of `dspy.Predict` improves quality." 
- ] - }, - { - "cell_type": "code", - "execution_count": 40, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "['One great thing about the ColBERT retrieval model is its superior efficiency and effectiveness compared to other models.',\n", - " 'Its ability to efficiently retrieve relevant information from large document collections.',\n", - " 'One great thing about the ColBERT retrieval model is its superior performance compared to other models and its efficient use of pre-trained language models.',\n", - " 'One great thing about the ColBERT retrieval model is its superior efficiency and accuracy compared to other models.',\n", - " 'One great thing about the ColBERT retrieval model is its ability to incorporate user feedback and support complex queries.']" - ] - }, - "execution_count": 40, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "question = \"What's something great about the ColBERT retrieval model?\"\n", - "\n", - "# 1) Declare with a signature, and pass some config.\n", - "classify = dspy.ChainOfThought('question -> answer', n=5)\n", - "\n", - "# 2) Call with input argument.\n", - "response = classify(question=question)\n", - "\n", - "# 3) Access the outputs.\n", - "response.completions.answer" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's dicuss the output object here.\n", - "\n", - "The `dspy.ChainOfThought` module will generally inject a `rationale` before the output field(s) of your signature.\n", - "\n", - "Let's inspect the (first) rationale and answer!" - ] - }, - { - "cell_type": "code", - "execution_count": 42, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Rationale: produce the answer. We can consider the fact that ColBERT has shown to outperform other state-of-the-art retrieval models in terms of efficiency and effectiveness. It uses contextualized embeddings and performs document retrieval in a way that is both accurate and scalable.\n", - "Answer: One great thing about the ColBERT retrieval model is its superior efficiency and effectiveness compared to other models.\n" - ] - } - ], - "source": [ - "print(f\"Rationale: {response.rationale}\")\n", - "print(f\"Answer: {response.answer}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "This is accessible whether we request one or many completions.\n", - "\n", - "We can also access the different completions as a list of `Prediction`s or as several lists, one for each field." - ] - }, - { - "cell_type": "code", - "execution_count": 45, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "True" - ] - }, - "execution_count": 45, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "response.completions[3].rationale == response.completions.rationale[3]" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 5) How do I use more complex built-in modules?\n", - "\n", - "The others are very similar, `dspy.ReAct` and `dspy.ProgramOfThough` etc. They mainly change the internal behavior with which your signature is implemented!\n", - "\n", - "More example soon!" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 6) How do I compose multiple modules into a bigger program?\n", - "\n", - "DSPy is just Python code that uses modules in any control flow you like. 
(There's some magic internally at `compile` time to trace your LM calls.)\n", - "\n", - "What this means is that, you can just call the modules freely. No weird abstractions for chaining calls.\n", - "\n", - "This is basically PyTorch's design approach for define-by-run / dynamic computation graphs. Refer to the intro tutorials for examples." - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "py39_nov2023", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.9.18" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} diff --git a/docs/guides/optimizers.ipynb b/docs/guides/optimizers.ipynb deleted file mode 100644 index dcdb8c3d0..000000000 --- a/docs/guides/optimizers.ipynb +++ /dev/null @@ -1,168 +0,0 @@ -{ - "cells": [ - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "The autoreload extension is already loaded. To reload it, use:\n", - " %reload_ext autoreload\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "UsageError: unrecognized arguments: import sys; sys.path.append('/future/u/okhattab/repos/public/tmp/dspy')\n" - ] - } - ], - "source": [ - "%load_ext autoreload\n", - "%autoreload 2\n", - "import sys; sys.path.append('/future/u/okhattab/repos/public/tmp/dspy')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\"DSPy7\n", - "\n", - "## Guide: **DSPy Optimizers**\n", - "\n", - "Formerly called **DSPy Teleprompters**. We will be making an official name update.\n", - "\n", - "[](https://colab.research.google.com/github/stanfordnlp/dspy/blob/main/docs/guides/signatures.ipynb)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Quick Recap\n", - "\n", - "This guide assumes you followed the [intro tutorial]() to build your first few DSPy programs.\n", - "\n", - "Remember that a **DSPy program** is just Python code that calls one or more DSPy modules, like `dspy.Predict` or `dspy.ChainOfThought`, to use LMs." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 1) What is a DSPy Optimizer?\n", - "\n", - "A **DSPy optimizer** is an algorithm that can tune the parameters of a DSPy program (i.e., the prompts and the LM weights) to maximize the metrics you specify, like accuracy.\n", - "\n", - "There are many built-in optimizers in DSPy. They apply different strategies to tune your programs. A typical DSPy optimizer takes three things:\n", - "\n", - "- Your **DSPy program**. This may be a single module (e.g., `dspy.Predict`) or a complex multi-module program.\n", - "\n", - "- Your **metric**. This is a function that evaluates the output of your program, and assigns it a score (higher is better).\n", - "\n", - "- A few **training inputs**. This may be very small (i.e., only 5 or 10 examples) or incomplete (only inputs to your program, without any labels).\n", - "\n", - "Your training data could also be large or complete. DSPy can leverage having a lot of data, but you can start small." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 2) **What** does a DSPy Optimizer tune? 
**How** does it tune them?\n", - "\n", - "Traditional deep neural networks (DNNs) can be optimized with gradient descent, given a loss function and some training data.\n", - "\n", - "DSPy programs consist of multiple calls to LMs, stacked togther as [DSPy modules](). Each DSPy module has internal parameters of three kinds: (1) the LM weights, (2) the instructions, and (3) demonstrations of the input/output behavior.\n", - "\n", - "Given a metric, DSPy can optimize all of these three with multi-stage optimization algorithms. These can combine gradient descent (for LM weights) and LM-driven optimization (for the instructions), but primarily rely on discrete optimization for creating and validating demonstrations. DSPy Demonstrations are like few-shot examples, but they're far more powerful. They can be created from scratch, given your program, and their creation and selection can be optimized in many effective ways.\n", - "\n", - "In many cases, we found that compiling leads to better prompts than humans write. Not because DSPy optimizers are more creative than humans, but simply because they can try more things and tune the metrics directly." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Install `dspy-ai` if needed. Then set up a default language model.\n", - "# TODO: Add a graceful line for OPENAI_API_KEY.\n", - "\n", - "try: import dspy\n", - "except ImportError:\n", - " %pip install dspy-ai\n", - " import dspy\n", - "\n", - "dspy.configure(lm=dspy.OpenAI(model='gpt-3.5-turbo-1106'))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 2) What DSPy Optimizers are currently available?\n", - "\n", - "All of these can be accessed via `from dspy.teleprompt import *`.\n", - "\n", - "#### Automatic Few-Shot Learning\n", - "\n", - "1. **`LabeledFewShot`**:\n", - "\n", - "2. **`BootstrapFewShot`**: \n", - "\n", - "3. **`BootstrapFewShotWithRandomSearch`**:\n", - "\n", - "4. **`BootstrapFewShotWithOptuna`**:\n", - "\n", - "\n", - "#### Automatic Instruction Optimization\n", - "\n", - "5. **`SignatureOptimizer`**:\n", - "\n", - "\n", - "#### Automatic Finetuning\n", - "\n", - "6. **`BootstrapFinetune`**:\n", - "\n", - "\n", - "#### Program Transformations\n", - "\n", - "7. **`KNNFewShot`**:\n", - "\n", - "8. 
**`Ensemble`**:\n", - "\n", - "\n", - "#### Which one should I use?\n", - "\n", - "As a rule of thumb, if you don't know where to start, use `BootstrapFewShotWithRandomSearch`.\n", - "\n", - "There are some old docs for:\n", - "\n", - "- [`dspy.teleprompt.LabeledFewShot`](docs/teleprompters.md#telepromptlabeledfewshot)\n", - "- [`dspy.teleprompt.BootstrapFewShot`](docs/teleprompters.md#telepromptbootstrapfewshot)\n", - "- [`dspy.teleprompt.BootstrapFewShotWithRandomSearch`](docs/teleprompters.md#telepromptbootstrapfewshotwithrandomsearch)\n", - "- [`dspy.teleprompt.BootstrapFinetune`](docs/teleprompters.md#telepromptbootstrapfinetune)\n", - "- [`dspy.teleprompt.Ensemble`](docs/teleprompters.md#telepromptensemble)\n", - "- `dspy.teleprompt.kNN`\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [] - } - ], - "metadata": { - "language_info": { - "name": "python" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} diff --git a/docs/guides/signatures.ipynb b/docs/guides/signatures.ipynb deleted file mode 100644 index 7b3d1a82f..000000000 --- a/docs/guides/signatures.ipynb +++ /dev/null @@ -1,334 +0,0 @@ -{ - "cells": [ - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [], - "source": [ - "%load_ext autoreload\n", - "%autoreload 2\n", - "import sys; sys.path.append('/future/u/okhattab/repos/public/tmp/dspy')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "\"DSPy7\n", - "\n", - "## Guide: **DSPy Signatures**\n", - "\n", - "[](https://colab.research.google.com/github/stanfordnlp/dspy/blob/main/docs/guides/signatures.ipynb)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Quick Recap\n", - "\n", - "This guide assumes you followed the [intro tutorial]() to build your first few DSPy programs.\n", - "\n", - "Remember that a **DSPy program** is just Python code that calls one or more **DSPy modules**, like `dspy.Predict` or `dspy.ChainOfThought`, to use LMs." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 1) What is a DSPy Signature?\n", - "\n", - "When we assign tasks to LMs in DSPy, we specify the behavior we need as a Signature.\n", - "\n", - "**A signature is a declarative specification of input/output behavior of a DSPy module.**\n", - "\n", - "You're probably familiar with function signatures. The differences are that:\n", - "\n", - "- While typical function signatures just _describe_ things, DSPy Signatures _define and control the behavior_ of modules.\n", - "\n", - "- The field names matter in DSPy Signatures. You express semantic roles in plain English: a `question` is different from an `answer`, a `sql_query` is different from `python_code`." - ] - }, - { - "cell_type": "code", - "execution_count": 36, - "metadata": {}, - "outputs": [], - "source": [ - "# Install `dspy-ai` if needed. Then set up a default language model.\n", - "# TODO: Add a graceful line for OPENAI_API_KEY.\n", - "\n", - "try: import dspy\n", - "except ImportError:\n", - " %pip install dspy-ai\n", - " import dspy\n", - "\n", - "dspy.configure(lm=dspy.OpenAI(model='gpt-3.5-turbo-1106', max_tokens=300))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 2) Why should I use a DSPy Signature?\n", - "\n", - "**tl;dr** For modular and clean code, in which LM calls can be optimized into high-quality prompts (or automatic finetunes).\n", - "\n", - "**Long Answer:** Most people coerce LMs to do tasks by hacking long, brittle prompts. 
Or by collecting/generating data for fine-tuning.\n", - "\n", - "Writing signatures is far more modular, adaptive, and reproducible than hacking at prompts or finetunes. The DSPy compiler will figure out how to build a highly-optimized prompt for your LM (or finetune your small LM) for your signature, on your data, and within your pipeline. In many cases, we found that compiling leads to better prompts than humans write. Not because DSPy optimizers are more creative than humans, but simply because they can try more things and tune the metrics directly." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 3) **Short** DSPy Signatures\n", - "\n", - "Signatures can be defined as a short string, with argument names that define semantic roles for inputs/outputs.\n", - "\n", - "1. Question Answering: `\"question -> answer\"`\n", - "\n", - "2. Sentiment Classification: `\"sentence -> sentiment\"`\n", - "\n", - "3. Summarization: `\"document -> summary\"`\n", - "\n", - "Your signatures can also have multiple input/output fields.\n", - "\n", - "4. Retrieval-Augmented Question Answering: `\"context, question -> answer\"`\n", - "\n", - "5. Multiple-Choice Question Answering with Reasoning: `\"question, choices -> reasoning, selection\"`\n", - "\n", - "\n", - "**Tip:** For fields, any valid variable names work! Field names should be semantically meaningful, but start simple and don't prematurely optimize keywords! Leave that kind of hacking to the DSPy compiler. For example, for summarization, it's probably fine to say `\"document -> summary\"`, `\"text -> gist\"`, or `\"long_context -> tldr\"`." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 4) Example 1: Sentiment Classification" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "'Positive'" - ] - }, - "execution_count": 4, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "sentence = \"it's a charming and often affecting journey.\" # example from the SST-2 dataset.\n", - "\n", - "classify = dspy.Predict('sentence -> sentiment')\n", - "classify(sentence=sentence).sentiment" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Above, we covered a simple example with `dspy.Predict`.\n", - "\n", - "Below, let's use `dspy.ChainOfThought`." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 5) Example 2: Summarization" - ] - }, - { - "cell_type": "code", - "execution_count": 20, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "The 21-year-old Lee made seven appearances and scored one goal for West Ham last season. He had loan spells in League One with Blackpool and Colchester United, scoring twice for the latter. He has now signed a contract with Barnsley, but the length of the contract has not been revealed.\n" - ] - } - ], - "source": [ - "# Example from the XSum dataset.\n", - "document = \"\"\"The 21-year-old made seven appearances for the Hammers and netted his only goal for them in a Europa League qualification round match against Andorran side FC Lustrains last season. Lee had two loan spells in League One last term, with Blackpool and then Colchester United. He scored twice for the U's but was unable to save them from relegation. The length of Lee's contract with the promoted Tykes has not been revealed. 
Find all the latest football transfers on our dedicated page.\"\"\"\n", - "\n", - "summarize = dspy.ChainOfThought('document -> summary')\n", - "response = summarize(document=document)\n", - "\n", - "print(response.summary)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Many DSPy modules (except `dspy.Predict`) return auxiliary information by expanding your signature under the hood.\n", - "\n", - "For example, `dspy.ChainOfThought` also adds a `rationale` field that includes the LM's reasoning before it generates the output `summary`." - ] - }, - { - "cell_type": "code", - "execution_count": 22, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Rationale: produce the summary. We need to highlight the key points about Lee's performance for West Ham, his loan spells in League One, and his new contract with Barnsley. We also need to mention that his contract length has not been disclosed.\n" - ] - } - ], - "source": [ - "print(\"Rationale:\", response.rationale)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 6) Examples of _long_ DSPy Signatures\n", - "\n", - "For some advanced tasks, you need more verbose signatures. This is typically to:\n", - "\n", - "1. Clarify something about the nature of the task (expressed below as a `docstring`).\n", - "\n", - "2. Supply hints on the nature of an input field, expressed as a `desc` keyword argument for `dspy.InputField`.\n", - "\n", - "2. Supply constraints on an output field, expressed as a `desc` keyword argument for `dspy.OutputField." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 7) Example C: Classification\n", - "\n", - "Notice how the docstring contains (minimal) instructions, which in this case are necessary to have a fully-defined task.\n", - "\n", - "Some optimizers in DSPy, like `SignatureOptimizer`, can take this simple docstring and then generate more effective variants if needed." - ] - }, - { - "cell_type": "code", - "execution_count": 37, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "Prediction(\n", - " sentiment='Fear'\n", - ")" - ] - }, - "execution_count": 37, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "class Emotion(dspy.Signature):\n", - " \"\"\"Classify emotion among sadness, joy, love, anger, fear, surprise.\"\"\"\n", - " \n", - " sentence = dspy.InputField()\n", - " sentiment = dspy.OutputField()\n", - "\n", - "sentence = \"i started feeling a little vulnerable when the giant spotlight started blinding me\" # from dair-ai/emotion\n", - "\n", - "classify = dspy.Predict(Emotion)\n", - "classify(sentence=sentence)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 8) Example D: A metric that evaluates faithfulness to citations" - ] - }, - { - "cell_type": "code", - "execution_count": 40, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "Prediction(\n", - " rationale=\"produce the faithfulness. We know that Lee had two loan spells in League One last term, with Blackpool and then Colchester United. He scored twice for the U's but was unable to save them from relegation. 
However, there is no mention of him scoring three goals for Colchester United.\",\n", - " faithfulness='False'\n", - ")" - ] - }, - "execution_count": 40, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "class CheckCitationFaithfulness(dspy.Signature):\n", - " \"\"\"Verify that the text is based on the provided context.\"\"\"\n", - "\n", - " context = dspy.InputField(desc=\"facts here are assumed to be true\")\n", - " text = dspy.InputField()\n", - " faithfulness = dspy.OutputField(desc=\"True/False indicating if text is faithful to context\")\n", - "\n", - "context = \"The 21-year-old made seven appearances for the Hammers and netted his only goal for them in a Europa League qualification round match against Andorran side FC Lustrains last season. Lee had two loan spells in League One last term, with Blackpool and then Colchester United. He scored twice for the U's but was unable to save them from relegation. The length of Lee's contract with the promoted Tykes has not been revealed. Find all the latest football transfers on our dedicated page.\"\n", - "\n", - "text = \"Lee scored 3 goals for Colchester United.\"\n", - "\n", - "faithfulness = dspy.ChainOfThought(CheckCitationFaithfulness)\n", - "faithfulness(context=context, text=text)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 9) Building modules & compiling them\n", - "\n", - "While signatures are covenient for prototyping with structured inputs/outputs, that's not the main reason to use them!\n", - "\n", - "You should compose multiple signatures into bigger [DSPy modules]() and [compile]() these modules into optimized prompts and finetunes." - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "py39_nov2023", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.9.18" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} diff --git a/docs/index.md b/docs/index.md deleted file mode 100644 index c72e6c223..000000000 --- a/docs/index.md +++ /dev/null @@ -1,58 +0,0 @@ -# ๐ŸŒŸ๐Ÿ‘‹ Welcome to DSPy -- The framework for programmingโ€”not promptingโ€”foundation models ๐ŸŒ๐Ÿš€ - - - - - - - - - -

- - - - -## ๐ŸŽฏ The Vision Behind DSPy - -**DSPy** is a framework for developing **high-quality systems** with LMs. While prompting LMs can quickly build (brittle) demos, the best LM systems generally break down problems into steps and tune the prompts or LM weights of each step well. As a bonus, these systems use small LMs to save costs. - -This is hard as we usually don't have data to tune each of these steps. **DSPy** treats prompts and LM weights as parameters to be optimized in LM pipelines, given the metrics you want to maximize. - -To make this possible: - -- [x] **DSPy** provides **composable and declarative modules** for instructing LMs in a familiar Pythonic syntax. It upgrades "prompting techniques" like chain-of-thought and self-reflection from hand-adapted _string manipulation tricks_ into truly modular _generalized operations that learn to adapt to your task_. - -- [x] **DSPy** introduces an **automatic compiler that teaches LMs** how to conduct the declarative steps in your program. Specifically, the **DSPy compiler** will internally _trace_ your program and then **craft high-quality prompts for large LMs (or train automatic finetunes for small LMs)** to teach them the steps of your task. - -- [x] **DSPy** has many modules and optimizers built-in and we want you to add more. Think of this like PyTorch but for LM pipelines, not DNNs. The **DSPy compiler** _bootstraps_ prompts and finetunes from minimal data **without needing manual labels for the intermediate steps** in your program. Instead of brittle "prompt engineering" with hacky string manipulation, you can explore a systematic space of modular and trainable pieces. - -- [x] For complex tasks, **DSPy** can routinely teach powerful models like `GPT-3.5` and local models like `T5-base` or `Llama2-13b` to be much more reliable at tasks. **DSPy** will compile the _same program_ into different few-shot prompts and/or finetunes for each LM. - -## ๐Ÿš€ Analogy to Neural Networks - -When we build neural networks, we don't write manual _for-loops_ over lists of _hand-tuned_ floats. Instead, you might use a framework like [PyTorch](https://pytorch.org/) to compose declarative layers (e.g., `Convolution` or `Dropout`) and then use optimizers (e.g., SGD or Adam) to learn the parameters of the network. - -Ditto! **DSPy** gives you the right general-purpose modules (e.g., `ChainOfThought`, `Retrieve`, etc.) and takes care of optimizing their prompts _for your program_ and your metric, whatever they aim to do. Whenever you modify your code, your data, or your validation constraints, you can _compile_ your program again and **DSPy** will create new effective prompts that fit your changes. - -**Welcome to the future of LLMs programmig! ๐ŸŒŸ๐ŸŒ** diff --git a/docs/language_models_client.md b/docs/language_models_client.md deleted file mode 100644 index 55a28d1cf..000000000 --- a/docs/language_models_client.md +++ /dev/null @@ -1,313 +0,0 @@ -# LM Modules Documentation - -This documentation provides an overview of the DSPy Language Model Clients. 
- -### Quickstart - -```python -import dspy - -lm = dspy.OpenAI(model='gpt-3.5-turbo') - -prompt = "Translate the following English text to Spanish: 'Hi, how are you?'" -completions = lm(prompt, n=5, return_sorted=False) -for i, completion in enumerate(completions): - print(f"Completion {i+1}: {completion}") -``` - -## Supported LM Clients - -| LM Client | Jump To | -| --- | --- | -| OpenAI | [OpenAI Section](#openai) | -| AzureOpenAI | [Azure OpenAI Section](#azureopenai) | -| Cohere | [Cohere Section](#cohere) | -| TGI | [TGI Section](#tgi) | -| VLLM | [VLLM Section](#vllm) | -| Anyscale | [Anyscale Section](#anyscale) | -| Together | [Together Section](#together) | - -## OpenAI - -### Usage - -```python -lm = dspy.OpenAI(model='gpt-3.5-turbo') -``` - -### Constructor - -The constructor initializes the base class `LM` and verifies the provided arguments like the `api_provider`, `api_key`, and `api_base` to set up OpenAI request retrieval. The `kwargs` attribute is initialized with default values for relevant text generation parameters needed for communicating with the GPT API, such as `temperature`, `max_tokens`, `top_p`, `frequency_penalty`, `presence_penalty`, and `n`. - -```python -class OpenAI(LM): - def __init__( - self, - model: str = "text-davinci-002", - api_key: Optional[str] = None, - api_provider: Literal["openai"] = "openai", - model_type: Literal["chat", "text"] = None, - **kwargs, - ): -``` - - - -**Parameters:** -- `api_key` (_Optional[str]_, _optional_): API provider authentication token. Defaults to None. -- `api_provider` (_Literal["openai"]_, _optional_): API provider to use. Defaults to "openai". -- `model_type` (_Literal["chat", "text"]_): Specified model type to use. -- `**kwargs`: Additional language model arguments to pass to the API provider. - -### Methods - -#### `__call__(self, prompt: str, only_completed: bool = True, return_sorted: bool = False, **kwargs) -> List[Dict[str, Any]]` - -Retrieves completions from OpenAI by calling `request`. - -Internally, the method handles the specifics of preparing the request prompt and corresponding payload to obtain the response. - -After generation, the completions are post-processed based on the `model_type` parameter. If the parameter is set to 'chat', the generated content look like `choice["message"]["content"]`. Otherwise, the generated text will be `choice["text"]`. - -**Parameters:** -- `prompt` (_str_): Prompt to send to OpenAI. -- `only_completed` (_bool_, _optional_): Flag to return only completed responses and ignore completion due to length. Defaults to True. -- `return_sorted` (_bool_, _optional_): Flag to sort the completion choices using the returned averaged log-probabilities. Defaults to False. -- `**kwargs`: Additional keyword arguments for completion request. - -**Returns:** -- `List[Dict[str, Any]]`: List of completion choices. - -## AzureOpenAI - -### Usage - -```python -lm = dspy.AzureOpenAI(api_base='...', api_version='2023-12-01-preview', model='gpt-3.5-turbo') -``` - -### Constructor - -The constructor initializes the base class `LM` and verifies the provided arguments like the `api_provider`, `api_key`, and `api_base` to set up OpenAI request retrieval through Azure. The `kwargs` attribute is initialized with default values for relevant text generation parameters needed for communicating with the GPT API, such as `temperature`, `max_tokens`, `top_p`, `frequency_penalty`, `presence_penalty`, and `n`. 
- -```python -class AzureOpenAI(LM): - def __init__( - self, - api_base: str, - api_version: str, - model: str = "gpt-3.5-turbo-instruct", - api_key: Optional[str] = None, - model_type: Literal["chat", "text"] = None, - **kwargs, - ): -``` - - - -**Parameters:** -- `api_base` (str): Azure Base URL. -- `api_version` (str): Version identifier for Azure OpenAI API. -- `api_key` (_Optional[str]_, _optional_): API provider authentication token. Retrieves from `AZURE_OPENAI_KEY` environment variable if None. -- `model_type` (_Literal["chat", "text"]_): Specified model type to use, defaults to 'chat'. -- `**kwargs`: Additional language model arguments to pass to the API provider. - -### Methods - -#### `__call__(self, prompt: str, only_completed: bool = True, return_sorted: bool = False, **kwargs) -> List[Dict[str, Any]]` - -Retrieves completions from Azure OpenAI Endpoints by calling `request`. - -Internally, the method handles the specifics of preparing the request prompt and corresponding payload to obtain the response. - -After generation, the completions are post-processed based on the `model_type` parameter. If the parameter is set to 'chat', the generated content look like `choice["message"]["content"]`. Otherwise, the generated text will be `choice["text"]`. - -**Parameters:** -- `prompt` (_str_): Prompt to send to Azure OpenAI. -- `only_completed` (_bool_, _optional_): Flag to return only completed responses and ignore completion due to length. Defaults to True. -- `return_sorted` (_bool_, _optional_): Flag to sort the completion choices using the returned averaged log-probabilities. Defaults to False. -- `**kwargs`: Additional keyword arguments for completion request. - -**Returns:** -- `List[Dict[str, Any]]`: List of completion choices. - -## Cohere - -### Usage - -```python -lm = dsp.Cohere(model='command-nightly') -``` - -### Constructor - -The constructor initializes the base class `LM` and verifies the `api_key` to set up Cohere request retrieval. - -```python -class Cohere(LM): - def __init__( - self, - model: str = "command-nightly", - api_key: Optional[str] = None, - stop_sequences: List[str] = [], - ): -``` - -**Parameters:** -- `model` (_str_): Cohere pretrained models. Defaults to `command-nightly`. -- `api_key` (_Optional[str]_, _optional_): API provider from Cohere. Defaults to None. -- `stop_sequences` (_List[str]_, _optional_): List of stopping tokens to end generation. - -### Methods - -Refer to [`dspy.OpenAI`](#openai) documentation. - -## TGI - -### Usage - -```python -lm = dspy.HFClientTGI(model="meta-llama/Llama-2-7b-hf", port=8080, url="http://localhost") -``` - -### Prerequisites - -Refer to the [Text Generation-Inference Server](https://github.com/stanfordnlp/dspy/blob/local_models_docs/docs/using_local_models.md#text-generation-inference-server) section of the `Using Local Models` documentation. - -### Constructor - -The constructor initializes the `HFModel` base class and configures the client for communicating with the TGI server. It requires a `model` instance, communication `port` for the server, and the `url` for the server to host generate requests. Additional configuration can be provided via keyword arguments in `**kwargs`. - -```python -class HFClientTGI(HFModel): - def __init__(self, model, port, url="http://future-hgx-1", **kwargs): -``` - -**Parameters:** -- `model` (_HFModel_): Instance of Hugging Face model connected to the TGI server. -- `port` (_int_): Port for TGI server. -- `url` (_str_): Base URL where the TGI server is hosted. 
-- `**kwargs`: Additional keyword arguments to configure the client. - -### Methods - -Refer to [`dspy.OpenAI`](#openai) documentation. - -## VLLM - -### Usage - -```python -lm = dspy.HFClientVLLM(model="meta-llama/Llama-2-7b-hf", port=8080, url="http://localhost") -``` - -### Prerequisites - -Refer to the [vLLM Server](https://github.com/stanfordnlp/dspy/blob/local_models_docs/docs/using_local_models.md#vllm-server) section of the `Using Local Models` documentation. - -### Constructor - -Refer to the [`dspy.HFClientTGI`](#tgi) documentation, replacing the class with `HFClientVLLM`. - -### Methods - -Refer to [`dspy.OpenAI`](#openai) documentation. - -## Anyscale - -### Usage - -```python -lm = dspy.Anyscale(model="mistralai/Mistral-7B-Instruct-v0.1") -``` - -### Constructor - -The constructor initializes the base class `LM` and verifies the `api_key` for using the Anyscale API. -We expect the following environment variables to be set: -- `ANYSCALE_API_KEY`: API key for Anyscale. -- `ANYSCALE_API_BASE`: API base URL for Anyscale. - - -```python -class Anyscale(HFModel): - def __init__(self, model, **kwargs): -``` - -**Parameters:** -- `model` (_str_): models hosted on Anyscale. - -### Methods - -Refer to [`dspy.OpenAI`](#openai) documentation. - - -## Together - -### Usage - -```python -lm = dspy.Together(model="mistralai/Mistral-7B-v0.1") -``` - -### Constructor - -The constructor initializes the base class `LM` and verifies the `api_key` for using the Together API. -We expect the following environment variables to be set: -- `TOGETHER_API_KEY`: API key for Together. -- `TOGETHER_API_BASE`: API base URL for Together. - - -```python -class Together(HFModel): - def __init__(self, model, **kwargs): -``` - -**Parameters:** -- `model` (_str_): models hosted on Together. -- `stop` (_List[str]_, _optional_): List of stopping tokens to end generation. - -### Methods - -Refer to [`dspy.OpenAI`](#openai) documentation. - - -## Databricks (Model Serving Endpoints) - -### Usage -```python -lm = dspy.Databricks(model="databricks-mpt-30b-instruct") -``` - -### Constructor - -The constructor inherits from the `GPT3` class and verifies the Databricks authentication credentials for using the Databricks Model Serving API through the OpenAI SDK. -We expect the following OpenAI client settings to be set: -- `openai.api_key`: Databricks API key. -- `openai.base_url`: Databricks model endpoint URL. - -The `kwargs` attribute is initialized with default values for relevant text generation parameters needed for communicating with the Databricks OpenAI SDK, such as `temperature`, `max_tokens`, `top_p`, and `n`. However, it removes the `frequency_penalty` and `presence_penalty` arguments as these are not currently supported by the Databricks API. - -```python -class Databricks(GPT3): - def __init__( - self, - model: str, - api_key: Optional[str] = None, - api_base: Optional[str] = None, - model_type: Literal["chat", "text"] = None, - **kwargs, - ): -``` - -**Parameters:** -- `model` (_str_): models hosted on Databricks. -- `stop` (_List[str]_, _optional_): List of stopping tokens to end generation. -- `api_key` (_Optional[str]_): Databricks API key. Defaults to None. -- `api_base` (_Optional[str]_): Databricks model endpoint URL. Defaults to None. -- `model_type` (_Literal["chat", "text", "embeddings"]_): Specified model type to use. -- `**kwargs`: Additional language model arguments to pass to the API provider. - -### Methods - -Refer to [`dspy.OpenAI`](#openai) documentation. 
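As a small usage sketch for the Databricks client above (the token and endpoint URL below are placeholders, not real values):

```python
import dspy

# Placeholder credentials; substitute your own workspace URL and access token.
lm = dspy.Databricks(
    model="databricks-mpt-30b-instruct",
    api_key="dapi-...",  # Databricks personal access token (placeholder)
    api_base="https://your-workspace.cloud.databricks.com/serving-endpoints",  # placeholder endpoint URL
)

# Use it as the default LM for DSPy modules.
dspy.configure(lm=lm)
```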
\ No newline at end of file diff --git a/docs/modules.md b/docs/modules.md deleted file mode 100644 index 2c68cb441..000000000 --- a/docs/modules.md +++ /dev/null @@ -1,431 +0,0 @@ -# dspy.Modules Documentation - -This documentation provides an overview of the DSPy Modules. - -## DSPy Modules - -| Module | Jump To | -| --- | --- | -| Predict | [Predict Section](#dspypredict) | -| Retrieve | [Retrieve Section](#dspyretrieve) | -| ChainOfThought | [ChainOfThought Section](#dspychainofthought) | -| ChainOfThoughtWithHint | [ChainOfThoughtWithHint Section](#dspychainofthoughtwithhint) | -| MultiChainComparison | [MultiChainComparison Section](#dspymultichaincomparison) | -| ReAct | [ReAct Section](#dspyreact) | - -## dspy.Predict - -### Constructor - -The constructor initializes the `Predict` class and sets up its attributes, taking in the `signature` and additional config options. If the `signature` is a string, it processes the input and output fields, generates instructions, and creates a template for the specified `signature` type. - -```python -class Predict(Parameter): - def __init__(self, signature, **config): - self.stage = random.randbytes(8).hex() - self.signature = signature - self.config = config - self.reset() - - if isinstance(signature, str): - inputs, outputs = signature.split("->") - inputs, outputs = inputs.split(","), outputs.split(",") - inputs, outputs = [field.strip() for field in inputs], [field.strip() for field in outputs] - - assert all(len(field.split()) == 1 for field in (inputs + outputs)) - - inputs_ = ', '.join([f"`{field}`" for field in inputs]) - outputs_ = ', '.join([f"`{field}`" for field in outputs]) - - instructions = f"""Given the fields {inputs_}, produce the fields {outputs_}.""" - - inputs = {k: InputField() for k in inputs} - outputs = {k: OutputField() for k in outputs} - - for k, v in inputs.items(): - v.finalize(k, infer_prefix(k)) - - for k, v in outputs.items(): - v.finalize(k, infer_prefix(k)) - - self.signature = dsp.Template(instructions, **inputs, **outputs) -``` - -**Parameters:** -- `signature` (_Any_): Signature of predictive model. -- `**config` (_dict_): Additional configuration parameters for model. - -### Method - -#### `__call__(self, **kwargs)` - -This method serves as a wrapper for the `forward` method. It allows making predictions using the `Predict` class by providing keyword arguments. - -**Paramters:** -- `**kwargs`: Keyword arguments required for prediction. - -**Returns:** -- The result of `forward` method. - -### Examples - -```python -#Define a simple signature for basic question answering -class BasicQA(dspy.Signature): - """Answer questions with short factoid answers.""" - question = dspy.InputField() - answer = dspy.OutputField(desc="often between 1 and 5 words") - -#Pass signature to Predict module -generate_answer = dspy.Predict(BasicQA) - -# Call the predictor on a particular input. -question='What is the color of the sky?' -pred = generate_answer(question=question) - -print(f"Question: {question}") -print(f"Predicted Answer: {pred.answer}") -``` - - -## dspy.Retrieve - -### Constructor - -The constructor initializes the `Retrieve` class and sets up its attributes, taking in `k` number of retrieval passages to return for a query. 
- -```python -class Retrieve(Parameter): - def __init__(self, k=3): - self.stage = random.randbytes(8).hex() - self.k = k -``` - -**Parameters:** -- `k` (_Any_): Number of retrieval responses - -### Method - -#### `__call__(self, *args, **kwargs):` - -This method serves as a wrapper for the `forward` method. It allows making retrievals on an input query using the `Retrieve` class. - -**Parameters:** -- `**args`: Arguments required for retrieval. -- `**kwargs`: Keyword arguments required for retrieval. - -**Returns:** -- The result of the `forward` method. - -### Examples - -```python -query='When was the first FIFA World Cup held?' - -# Call the retriever on a particular query. -retrieve = dspy.Retrieve(k=3) -topK_passages = retrieve(query).passages - -print(f"Top {retrieve.k} passages for question: {query} \n", '-' * 30, '\n') - -for idx, passage in enumerate(topK_passages): - print(f'{idx+1}]', passage, '\n') -``` - -# dspy.ChainOfThought - -The constructor initializes the `ChainOfThought` class and sets up its attributes. It inherits from the `Predict` class and adds specific functionality for chain of thought processing. - -Internally, the class initializes the `activated` attribute to indicate if chain of thought processing has been selected. It extends the `signature` to include additional reasoning steps and an updated `rationale_type` when chain of thought processing is activated. - -```python -class ChainOfThought(Predict): - def __init__(self, signature, rationale_type=None, activated=True, **config): - super().__init__(signature, **config) - - self.activated = activated - - signature = self.signature - *keys, last_key = signature.kwargs.keys() - - DEFAULT_RATIONALE_TYPE = dsp.Type(prefix="Reasoning: Let's think step by step in order to", - desc="${produce the " + last_key + "}. We ...") - - rationale_type = rationale_type or DEFAULT_RATIONALE_TYPE - - extended_kwargs = {key: signature.kwargs[key] for key in keys} - extended_kwargs.update({'rationale': rationale_type, last_key: signature.kwargs[last_key]}) - - self.extended_signature = dsp.Template(signature.instructions, **extended_kwargs) -``` - -**Parameters:** -- `signature` (_Any_): Signature of predictive model. -- `rationale_type` (_dsp.Type_, _optional_): Rationale type for reasoning steps. Defaults to `None`. -- `activated` (_bool_, _optional_): Flag for activated chain of thought processing. Defaults to `True`. -- `**config` (_dict_): Additional configuration parameters for model. - -### Method - -#### `forward(self, **kwargs)` - -This method extends the parent `Predict` class' forward pass while updating the signature when chain of thought reasoning is activated or if the language model is a GPT3 model. - -**Parameters:** -- `**kwargs`: Keyword arguments required for prediction. - -**Returns:** -- The result of the `forward` method. - -### Examples - -```python -#Define a simple signature for basic question answering -class BasicQA(dspy.Signature): - """Answer questions with short factoid answers.""" - question = dspy.InputField() - answer = dspy.OutputField(desc="often between 1 and 5 words") - -#Pass signature to ChainOfThought module -generate_answer = dspy.ChainOfThought(BasicQA) - -# Call the predictor on a particular input. -question='What is the color of the sky?' 
-pred = generate_answer(question=question) - -print(f"Question: {question}") -print(f"Predicted Answer: {pred.answer}") -``` - -## dspy.ChainOfThoughtWithHint - -### Constructor - -The constructor initializes the `ChainOfThoughtWithHint` class and sets up its attributes, inheriting from the `Predict` class. This class enhances the `ChainOfThought` class by offering an additional option to provide hints for reasoning. Two distinct signature templates are created internally depending on the presence of the hint. - -```python -class ChainOfThoughtWithHint(Predict): - def __init__(self, signature, rationale_type=None, activated=True, **config): - super().__init__(signature, **config) - - self.activated = activated - - signature = self.signature - *keys, last_key = signature.kwargs.keys() - - DEFAULT_HINT_TYPE = dsp.Type(prefix="Hint:", desc="${hint}") - - DEFAULT_RATIONALE_TYPE = dsp.Type(prefix="Reasoning: Let's think step by step in order to", - desc="${produce the " + last_key + "}. We ...") - - rationale_type = rationale_type or DEFAULT_RATIONALE_TYPE - - extended_kwargs1 = {key: signature.kwargs[key] for key in keys} - extended_kwargs1.update({'rationale': rationale_type, last_key: signature.kwargs[last_key]}) - - extended_kwargs2 = {key: signature.kwargs[key] for key in keys} - extended_kwargs2.update({'hint': DEFAULT_HINT_TYPE, 'rationale': rationale_type, last_key: signature.kwargs[last_key]}) - - self.extended_signature1 = dsp.Template(signature.instructions, **extended_kwargs1) - self.extended_signature2 = dsp.Template(signature.instructions, **extended_kwargs2) -``` - -**Parameters:** -- `signature` (_Any_): Signature of predictive model. -- `rationale_type` (_dsp.Type_, _optional_): Rationale type for reasoning steps. Defaults to `None`. -- `activated` (_bool_, _optional_): Flag for activated chain of thought processing. Defaults to `True`. -- `**config` (_dict_): Additional configuration parameters for model. - -### Method - -#### `forward(self, **kwargs)` - -This method extends the parent `Predict` class's forward pass, updating the signature dynamically based on the presence of `hint` in the keyword arguments and the `activated` attribute. - -**Parameters:** -- `**kwargs`: Keyword arguments required for prediction. - -**Returns:** -- The result of the `forward` method in the parent `Predict` class. - -### Examples - -```python -#Define a simple signature for basic question answering -class BasicQA(dspy.Signature): - """Answer questions with short factoid answers.""" - question = dspy.InputField() - answer = dspy.OutputField(desc="often between 1 and 5 words") - -#Pass signature to ChainOfThought module -generate_answer = dspy.ChainOfThoughtWithHint(BasicQA) - -# Call the predictor on a particular input alongside a hint. -question='What is the color of the sky?' -hint = "It's what you often see during a sunny day." -pred = generate_answer(question=question, hint=hint) - -print(f"Question: {question}") -print(f"Predicted Answer: {pred.answer}") -``` - - -## dspy.MultiChainComparison - -### Constructor - -The constructor initializes the `MultiChainComparison` class and sets up its attributes. It inherits from the `Predict` class and adds specific functionality for multiple chain comparisons. - -The class incorporates multiple student attempt reasonings and concludes with the selected best reasoning path out of the available attempts. 
- -```python -from .predict import Predict -from ..primitives.program import Module - -import dsp - -class MultiChainComparison(Module): - def __init__(self, signature, M=3, temperature=0.7, **config): - super().__init__() - - self.M = M - signature = Predict(signature).signature - *keys, last_key = signature.kwargs.keys() - - extended_kwargs = {key: signature.kwargs[key] for key in keys} - - for idx in range(M): - candidate_type = dsp.Type(prefix=f"Student Attempt #{idx+1}:", desc="${reasoning attempt}") - extended_kwargs.update({f'reasoning_attempt_{idx+1}': candidate_type}) - - rationale_type = dsp.Type(prefix="Accurate Reasoning: Thank you everyone. Let's now holistically", desc="${corrected reasoning}") - extended_kwargs.update({'rationale': rationale_type, last_key: signature.kwargs[last_key]}) - - signature = dsp.Template(signature.instructions, **extended_kwargs) - self.predict = Predict(signature, temperature=temperature, **config) - self.last_key = last_key -``` - -**Parameters:** -- `signature` (_Any_): Signature of predictive model. -- `M` (_int_, _optional_): Number of student reasoning attempts. Defaults to `3`. -- `temperature` (_float_, _optional_): Temperature parameter for prediction. Defaults to `0.7`. -- `**config` (_dict_): Additional configuration parameters for model. - -### Method - -#### `forward(self, completions, **kwargs)` - -This method aggregates all the student reasoning attempts and calls the predict method with extended signatures to get the best reasoning. - -**Parameters:** -- `completions`: List of completion objects which include student reasoning attempts. -- `**kwargs`: Additional keyword arguments. - -**Returns:** -- The result of the `predict` method for the best reasoning. - -### Examples - -```python -class BasicQA(dspy.Signature): - """Answer questions with short factoid answers.""" - question = dspy.InputField() - answer = dspy.OutputField(desc="often between 1 and 5 words") - -# Example completions generated by a model for reference -completions = [ - dspy.Prediction(rationale="I recall that during clear days, the sky often appears this color.", answer="blue"), - dspy.Prediction(rationale="Based on common knowledge, I believe the sky is typically seen as this color.", answer="green"), - dspy.Prediction(rationale="From images and depictions in media, the sky is frequently represented with this hue.", answer="blue"), -] - -# Pass signature to MultiChainComparison module -compare_answers = dspy.MultiChainComparison(BasicQA) - -# Call the MultiChainComparison on the completions -question = 'What is the color of the sky?' -final_pred = compare_answers(completions, question=question) - -print(f"Question: {question}") -print(f"Final Predicted Answer (after comparison): {final_pred.answer}") -print(f"Final Rationale: {final_pred.rationale}") -``` - -## dspy.ReAct - -### Constructor - -The constructor initializes the `ReAct` class and sets up its attributes. It is specifically designed to compose the interleaved steps of Thought, Action, and Observation. - -Internally, the class follows a sequential process: Thoughts (or reasoning) lead to Actions (such as queries or activities). These Actions then result in Observations (like results or responses), which subsequently feedback into the next Thought. This cycle is maintained for a predefined number of iterations. 
- -```python -import dsp -import dspy -from ..primitives.program import Module -from .predict import Predict - -class ReAct(Module): - def __init__(self, signature, max_iters=5, num_results=3, tools=None): - ... -``` - -**Parameters:** -- `signature` (_Any_): Signature of the predictive model. -- `max_iters` (_int_, _optional_): Maximum number of iterations for the Thought-Action-Observation cycle. Defaults to `5`. -- `num_results` (_int_, _optional_): Number of results to retrieve in the action step. Defaults to `3`. -- `tools` (_List[dspy.Tool]_, _optional_): List of tools available for actions. If none is provided, a default `Retrieve` tool with `num_results` is used. - -### Methods - -#### `_generate_signature(self, iters)` - -Generates a signature for the Thought-Action-Observation cycle based on the number of iterations. - -**Parameters:** -- `iters` (_int_): Number of iterations. - -**Returns:** -- A dictionary representation of the signature. - -#### `act(self, output, hop)` - -Processes an action and returns the observation or final answer. - -**Parameters:** -- `output` (_dict_): Current output from the Thought. -- `hop` (_int_): Current iteration number. - -**Returns:** -- A string representing the final answer or `None`. - -#### `forward(self, **kwargs)` - -Main method to execute the Thought-Action-Observation cycle for a given set of input fields. - -**Parameters:** -- `**kwargs`: Keyword arguments corresponding to input fields. - -**Returns:** -- A `dspy.Prediction` object containing the result of the ReAct process. - -### Examples - -```python -# Define a simple signature for basic question answering -class BasicQA(dspy.Signature): - """Answer questions with short factoid answers.""" - question = dspy.InputField() - answer = dspy.OutputField(desc="often between 1 and 5 words") - -# Pass signature to ReAct module -react_module = dspy.ReAct(BasicQA) - -# Call the ReAct module on a particular input -question = 'What is the color of the sky?' -result = react_module(question=question) - -print(f"Question: {question}") -print(f"Final Predicted Answer (after ReAct process): {result.answer}") -``` \ No newline at end of file diff --git a/docs-page/package-lock.json b/docs/package-lock.json similarity index 100% rename from docs-page/package-lock.json rename to docs/package-lock.json diff --git a/docs-page/package.json b/docs/package.json similarity index 100% rename from docs-page/package.json rename to docs/package.json diff --git a/docs/repo/contributing.md b/docs/repo/contributing.md deleted file mode 100644 index a7e952769..000000000 --- a/docs/repo/contributing.md +++ /dev/null @@ -1,71 +0,0 @@ -# โš™๏ธ Setting-up working envirionment - -## ๐Ÿ’ป Env Setup - -``` -conda create --name dspy python=3.11 -``` - -or - -``` -python3 -m venv dspy -``` - -## ๐Ÿš€ Pre-commit hook - -Before using pre-commit hook you need to install it in your python environment. - -``` -conda install -c conda-forge pre-commit -``` - -go to the root folder and then activate it as follows (it will first download all required dependencies): - -``` -pre-commit install -``` - -> Pre-commit hooks will attept to fix all your files and so you will need to (add + commit) them once the fixes are done ! - -!!! 
info "Optionally" - - Generally the pre-commit will run automatically before each of your commit, - but you can also manually trigger it, as follows: - - ```python - pre-commit run --all-files - ``` - -## ๐Ÿ“ Commit with Style - -Use standarized commit message: - -`{LABEL}(ACRONYM): {message}` - -This is very important for the automatic releases and a clean history on the `main` branch. - -!!! Labels-types - - | Label | Usage | - | ----- | ----- | - | break| `break` is used to identify changes related to old compatibility or functionality that breaks the current usage (major) | - | feat | `feat` is used to identify changes related to new backward-compatible abilities or functionality (minor) | - | init | `init` is used to indentify the starting related to the project (minor) | - | enh | `enh` is used to indentify changes related to amelioration of abilities or functionality (patch) | - | build | `build` (also known as `chore`) is used to identify **development** changes related to the build system (involving scripts, configurations, or tools) and package dependencies (patch) | - | ci | `ci` is used to identify **development** changes related to the continuous integration and deployment system - involving scripts, configurations, or tools (minor) | - | docs | `docs` is used to identify documentation changes related to the project; whether intended externally for the end-users or internally for the developers (patch) | - | perf | `perf` is used to identify changes related to backward-compatible **performance improvements** (patch) | - | refactor | `refactor` is used to identify changes related to modifying the codebase, which neither adds a feature nor fixes a bug - such as removing redundant code, simplifying the code, renaming variables, etc.
i.e. handy for your wip ; ) (patch) | - | style | `style` is used to identify **development** changes related to styling the codebase, regardless of the meaning - such as indentations, semi-colons, quotes, trailing commas, and so on (patch) | - | test | `test` is used to identify **development** changes related to tests - such as refactoring existing tests or adding new tests. (minor) | - | fix | `fix` is used to identify changes related to backward-compatible bug fixes. (patch) | - | ops | `ops` is used to identify changes related to deployment files like `values.yml`, `gateway.yml,` or `Jenkinsfile` in the **ops** directory. (minor) | - | hotfix | `hotfix` is used to identify **production** changes related to backward-compatible bug fixes (patch) | - | revert | `revert` is used to identify backward changes (patch) | - | maint | `maint` is used to identify **maintenance** changes related to project (patch) | - -``` - -``` diff --git a/docs/repo/documentation.md b/docs/repo/documentation.md deleted file mode 100644 index 685bf7dfc..000000000 --- a/docs/repo/documentation.md +++ /dev/null @@ -1,82 +0,0 @@ -# ๐Ÿ“ƒ Documentation - -We are using MkDocs to build the documentation, you can find more info about all possibilites here: [Examples](https://squidfunk.github.io/mkdocs-material/reference/#setting-the-page-title). It is basically a combination of Markdown and Mermaid for graph. - -!!! info -You can read more about [Mermaid](https://github.com/mermaid-js/mermaid) with all its possibilities ! -If you would like to test your mermaid FlowChart online without having to install any required libraries, you can refere to [Online Schema Editor](https://mermaid-js.github.io/mermaid-live-editor) - -## โž• Extending the documentation - -- [x] start from the `dev` branch and create a new branch with the correct naming convention, see: [how to contribute](contributing.md) - -- [x] add additional '.md' files to the documentation directory: `/docs` - -- [x] add the new entry into the navigation bar and connect it with your `md` file. This can be done in: [`root/mkdocs.yml`](mkdocs.yml) - -- [x] You can interactively test the documentation locally by using the following command: `mkdocs serve` - - > You will need to have all local docs-related requirements installed (see: [tool.poetry.group.doc.dependencies]): - - ```python - mkdocs = ">=1.5.3" - mkdocs-material = ">=9.0.6" - mkdocs-material-extensions = ">=1.3.1" - mkdocs-gen-files = "^0.5.0" - mkdocstrings-python = "^1.7.5" - mkdocstrings = {extras = ["python"], version = ">=0.20.0"} - mike = ">=2.0.0" - ``` - -- [x] Once you are done, create a new Merge Request to the `dev` branch. - -- [x] When your MR gets approved merge it into the `dev` following the well know conventions [how to contribute](contributing.md) - -- [x] New documentation will be automatically deployed once your MR gets merged ! - -!!! warning -In some cases you may need to deploy the new doc to Github-pages immediatly, this can be done using the following command: `mkdocs gh-deploy` (while being in a right venv) - -## ๐Ÿ” Documenting code - -Documenting code is done using dedicated docstrings, which are then automatically parsed and converted into the documentation. 
- -In order to document your code, you need to use the following syntax: - -```python -# ๐Ÿ—บ๏ธ PARAGRAPG NAME - -::: dspy.predict.predict.Predict - handler: python - options: - show_root_heading: true - show_source: true -``` - -and the Predict class documentation needs to foollow the Google Style Guide, see: [Google Style Guide](https://google.github.io/styleguide/pyguide.html#38-comments-and-docstrings) - -!!! example - - ```python - """Send a static HTML report along with a message to the Slack channel. - - Args: - project_name (str): The name of the project for which the job was submitted. - html_file_path (str): The file path of the HTML report to be sent. - message (str): The message to send along with the HTML report. - - Example: - ```python - from theparrot.notify import SlackBot - - bot = SlackBot() - bot.send_html_report( - html_file_path="path/to/your/report.html", - message="Check out this report!", - ) - ``` - """ - ``` - -This approach allows to handle documentation directly from the code, which is a great way to keep it up to date. -It also allows to version the documentation, which is a great way to keep track of changes and handle multiple versions of the package. diff --git a/docs/repo/getting_started.md b/docs/repo/getting_started.md deleted file mode 100644 index 0a0bfd730..000000000 --- a/docs/repo/getting_started.md +++ /dev/null @@ -1,42 +0,0 @@ -## ๐Ÿ’ป Installation - -To install python packages, you need: - -```python -pip install dspy-ai -``` - -Or open our intro notebook in Google Colab: [](https://colab.research.google.com/github/stanfordnlp/dspy/blob/main/intro.ipynb) - -!!! info -By default, DSPy depends on `openai==0.28`. However, if you install `openai>=1.0`, the library will use that just fine. Both are supported. - -For the optional Pinecone, Qdrant, [chromadb](https://github.com/chroma-core/chroma), or [marqo](https://github.com/marqo-ai/marqo) retrieval integration(s), include the extra(s) below: - -```python -pip install dspy-ai[pinecone] # or [qdrant] or [chromadb] or [marqo] or [mongodb] -``` - -## โ„น๏ธ Examples - -The DSPy team believes complexity has to be justified. We take this seriously: we never release a complex tutorial (above) or example (below) _unless we can demonstrate empirically that this complexity has generally led to improved quality or cost._ This kind of rule is rarely enforced by other frameworks or docs, but you can count on it in DSPy examples. - -There's a bunch of examples in the `examples/` directory and in the top-level directory. We welcome contributions! - -You can find other examples tweeted by [@lateinteraction](https://twitter.com/lateinteraction) on Twitter/X. - -## ๐Ÿ” Detailed Tutorials - -If you're new to DSPy, it's probably best to go in sequential order. You will probably refer to these guides frequently after that, e.g. to copy/paste snippets that you can edit for your own DSPy programs. - -1. **[DSPy Signatures](docs/guides/signatures.ipynb)** - -2. **[Language Models](docs/guides/language_models.ipynb)** and **[Retrieval Models](docs/guides/retrieval_models.ipynb)** - -3. **[DSPy Modules](docs/guides/modules.ipynb)** - -4. **[DSPy Optimizers](docs/guides/optimizers.ipynb)** - -5. **[DSPy Metrics](docs/guides/metrics.ipynb)** - -6. 
**[DSPy Assertions](docs/guides/assertions.ipynb)** diff --git a/docs/retrieval_models_client.md b/docs/retrieval_models_client.md deleted file mode 100644 index 8f31e0bbd..000000000 --- a/docs/retrieval_models_client.md +++ /dev/null @@ -1,217 +0,0 @@ -# Retriever Modules Documentation - -This documentation provides an overview of the DSPy Retrieval Model Clients. - -## Supported RM Clients - -| RM Client | Jump To | -| --- | --- | -| ColBERTv2 | [ColBERTv2 Section](#ColBERTv2) | -| AzureCognitiveSearch | [AzureCognitiveSearch Section](#AzureCognitiveSearch) | -| ChromadbRM | [ChromadbRM Section](#ChromadbRM) | -| Faiss | [Faiss Section](#Faiss) | - -## ColBERTv2 - -### Quickstart - -```python -import dspy - -colbertv2_wiki17_abstracts = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts') - -retrieval_response = colbertv2_wiki17_abstracts('When was the first FIFA World Cup held?', k=5) - -for result in retrieval_response: - print("Text:", result['text'], "\n") -``` - - -### Constructor - -The constructor initializes the `ColBERTv2` class instance and sets up the request parameters for interacting with the ColBERTv2 server. - -```python -class ColBERTv2: - def __init__( - self, - url: str = "http://0.0.0.0", - port: Optional[Union[str, int]] = None, - post_requests: bool = False, - ): -``` - -**Parameters:** -- `url` (_str_): URL for ColBERTv2 server. -- `port` (_Union[str, int]_, _Optional_): Port endpoint for ColBERTv2 server. Defaults to `None`. -- `post_requests` (_bool_, _Optional_): Flag for using HTTP POST requests. Defaults to `False`. - -### Methods - -#### `__call__(self, query: str, k: int = 10, simplify: bool = False) -> Union[list[str], list[dotdict]]` - -Enables making queries to the ColBERTv2 server for retrieval. Internally, the method handles the specifics of preparing the request prompt and corresponding payload to obtain the response. The function handles the retrieval of the top-k passages based on the provided query. - -**Parameters:** -- `query` (_str_): Query string used for retrieval. -- `k` (_int_, _optional_): Number of passages to retrieve. Defaults to 10. -- `simplify` (_bool_, _optional_): Flag for simplifying output to a list of strings. Defaults to False. - -**Returns:** -- `Union[list[str], list[dotdict]]`: Depending on `simplify` flag, either a list of strings representing the passage content (`True`) or a list of `dotdict` instances containing passage details (`False`). - -## AzureCognitiveSearch - -### Quickstart - -#TODO - -### Constructor - -The constructor initializes an instance of the `AzureCognitiveSearch` class and sets up parameters for sending queries and retreiving results with the Azure Cognitive Search server. - -```python -class AzureCognitiveSearch: - def __init__( - self, - search_service_name: str, - search_api_key: str, - search_index_name: str, - field_text: str, - field_score: str, # required field to map with "score" field in dsp framework - ): -``` - -**Parameters:** -- `search_service_name` (_str_): Name of Azure Cognitive Search server. -- `search_api_key` (_str_): API Authentication token for accessing Azure Cognitive Search server. -- `search_index_name` (_str_): Name of search index in the Azure Cognitive Search server. -- `field_text` (_str_): Field name that maps to DSP "content" field. -- `field_score` (_str_): Field name that maps to DSP "score" field. - -### Methods - -Refer to [ColBERTv2](#ColBERTv2) documentation. Keep in mind there is no `simplify` flag for AzureCognitiveSearch. 
- -AzureCognitiveSearch supports sending queries and processing the received results, mapping content and scores to a correct format for the Azure Cognitive Search server. - -## ChromadbRM - -### Quickstart with OpenAI Embeddings - -ChromadbRM have the flexibility from a variety of embedding functions as outlined in the [chromadb embeddings documentation](https://docs.trychroma.com/embeddings). While different options are available, this example demonstrates how to utilize OpenAI embeddings specifically. - -```python -from dspy.retrieve import ChromadbRM -import os -import openai -from chromadb.utils.embedding_functions import OpenAIEmbeddingFunction - -embedding_function = OpenAIEmbeddingFunction( - api_key=os.environ.get('OPENAI_API_KEY'), - model_name="text-embedding-ada-002" -) - -retriever_model = ChromadbRM( - 'your_collection_name', - '/path/to/your/db', - embedding_function=embedding_function, - k=5 -) - -results = retriever_model("Explore the significance of quantum computing", k=5) - -for result in results: - print("Document:", result.long_text, "\n") -``` - -### Constructor - -Initialize an instance of the `ChromadbRM` class, with the option to use OpenAI's embeddings or any alternative supported by chromadb, as detailed in the official [chromadb embeddings documentation](https://docs.trychroma.com/embeddings). - -```python -ChromadbRM( - collection_name: str, - persist_directory: str, - embedding_function: Optional[EmbeddingFunction[Embeddable]] = OpenAIEmbeddingFunction(), - k: int = 7, -) -``` - -**Parameters:** -- `collection_name` (_str_): The name of the chromadb collection. -- `persist_directory` (_str_): Path to the directory where chromadb data is persisted. -- `embedding_function` (_Optional[EmbeddingFunction[Embeddable]]_, _optional_): The function used for embedding documents and queries. Defaults to `DefaultEmbeddingFunction()` if not specified. -- `k` (_int_, _optional_): The number of top passages to retrieve. Defaults to 7. - -### Methods - -#### `forward(self, query_or_queries: Union[str, List[str]], k: Optional[int] = None) -> dspy.Prediction` - -Search the chromadb collection for the top `k` passages matching the given query or queries, using embeddings generated via the specified `embedding_function`. - -**Parameters:** -- `query_or_queries` (_Union[str, List[str]]_): The query or list of queries to search for. -- `k` (_Optional[int]_, _optional_): The number of results to retrieve. If not specified, defaults to the value set during initialization. - -**Returns:** -- `dspy.Prediction`: Contains the retrieved passages, each represented as a `dotdict` with a `long_text` attribute. - -## Faiss - -### Quickstart with the default vectorizer - -The **FaissRM** module provides a retriever that uses an in-memory Faiss vector database. This module does not include a vectorizer; instead it supports any subclass of **dsp.modules.sentence_vectorizer.BaseSentenceVectorizer**. If a vectorizer is not provided, an instance of **dsp.modules.sentence_vectorizer.SentenceTransformersVectorizer** is created and used by **FaissRM**. 
Note that the default embedding model for **SentenceTransformersVectorizer** is **all-MiniLM-L6-v2** - - -```python -import dspy -from dspy.retrieve import faiss_rm - -document_chunks = [ - "The superbowl this year was played between the San Francisco 49ers and the Kanasas City Chiefs", - "Pop corn is often served in a bowl", - "The Rice Bowl is a Chinese Restaurant located in the city of Tucson, Arizona", - "Mars is the fourth planet in the Solar System", - "An aquarium is a place where children can learn about marine life", - "The capital of the United States is Washington, D.C", - "Rock and Roll musicians are honored by being inducted in the Rock and Roll Hall of Fame", - "Music albums were published on Long Play Records in the 70s and 80s", - "Sichuan cuisine is a spicy cuisine from central China", - "The interest rates for mortgages is considered to be very high in 2024", -] - -frm = faiss_rm.FaissRM(document_chunks) -turbo = dspy.OpenAI(model="gpt-3.5-turbo") -dspy.settings.configure(lm=turbo, rm=frm) -print(frm(["I am in the mood for Chinese food"])) -``` - -### Constructor - -Initialize an instance of FaissRM by providing it with a vectorizer and a list of strings - -```python -FaissRM( - document_chunks: List[str], - vectorizer: dsp.modules.sentence_vectorizer.BaseSentenceVectorizer, - k: int = 3 -) -``` - -**Parameters:** -- `document_chunks` (_List[str]_): a list of strings that comprises the corpus to search. You cannot add/insert/upsert to this list after creating this FaissRM object. -- `vectorizer` (_dsp.modules.sentence_vectorizer.BaseSentenceVectorizer_, _optional_): If not provided, a dsp.modules.sentence_vectorizer.SentenceTransformersVectorizer object is created and used. -- `k` (_int_, _optional_): The number of top passages to retrieve. Defaults to 3. - -### Methods - -#### `forward(self, query_or_queries: Union[str, List[str]]) -> dspy.Prediction` - -Search the FaissRM vector database for the top `k` passages matching the given query or queries, using embeddings generated via the vectorizer specified at FaissRM construction time - -**Parameters:** -- `query_or_queries` (_Union[str, List[str]]_): The query or list of queries to search for. - -**Returns:** -- `dspy.Prediction`: Contains the retrieved passages, each represented as a `dotdict` with a `long_text` attribute and an `index` attribute. The `index` attribute is the index in the document_chunks array provided to this FaissRM object at construction time. 
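Whichever RM client is chosen, the usual pattern is to register it once with `dspy.settings` and then query it through `dspy.Retrieve` inside a program. A minimal sketch, assuming the public ColBERTv2 endpoint from the quickstart above is reachable:

```python
# Minimal sketch: configure a retrieval model client and consume it via dspy.Retrieve.
import dspy

rm = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')
dspy.settings.configure(rm=rm)

retrieve = dspy.Retrieve(k=3)
passages = retrieve("When was the first FIFA World Cup held?").passages
for passage in passages:
    print(passage)
```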
diff --git a/docs-page/sidebars.ts b/docs/sidebars.ts similarity index 100% rename from docs-page/sidebars.ts rename to docs/sidebars.ts diff --git a/docs-page/src/components/AuthorDetails/index.tsx b/docs/src/components/AuthorDetails/index.tsx similarity index 100% rename from docs-page/src/components/AuthorDetails/index.tsx rename to docs/src/components/AuthorDetails/index.tsx diff --git a/docs-page/src/components/AuthorDetails/styles.module.css b/docs/src/components/AuthorDetails/styles.module.css similarity index 100% rename from docs-page/src/components/AuthorDetails/styles.module.css rename to docs/src/components/AuthorDetails/styles.module.css diff --git a/docs-page/src/components/HomepageFeatures/index.tsx b/docs/src/components/HomepageFeatures/index.tsx similarity index 98% rename from docs-page/src/components/HomepageFeatures/index.tsx rename to docs/src/components/HomepageFeatures/index.tsx index a09634beb..5d7eb942d 100644 --- a/docs-page/src/components/HomepageFeatures/index.tsx +++ b/docs/src/components/HomepageFeatures/index.tsx @@ -28,7 +28,7 @@ const FeatureList: FeatureItem[] = [ ), }, { - title: 'Universal Compatibility', + title: 'Cross-LM Compatibility', img: '/img/universal_compatibility.png', description: ( <> diff --git a/docs-page/src/components/HomepageFeatures/styles.module.css b/docs/src/components/HomepageFeatures/styles.module.css similarity index 100% rename from docs-page/src/components/HomepageFeatures/styles.module.css rename to docs/src/components/HomepageFeatures/styles.module.css diff --git a/docs-page/src/css/custom.css b/docs/src/css/custom.css similarity index 100% rename from docs-page/src/css/custom.css rename to docs/src/css/custom.css diff --git a/docs-page/src/pages/index.module.css b/docs/src/pages/index.module.css similarity index 100% rename from docs-page/src/pages/index.module.css rename to docs/src/pages/index.module.css diff --git a/docs-page/src/pages/index.tsx b/docs/src/pages/index.tsx similarity index 100% rename from docs-page/src/pages/index.tsx rename to docs/src/pages/index.tsx diff --git a/docs-page/src/pages/markdown-page.md b/docs/src/pages/markdown-page.md similarity index 100% rename from docs-page/src/pages/markdown-page.md rename to docs/src/pages/markdown-page.md diff --git a/docs-page/static/.nojekyll b/docs/static/.nojekyll similarity index 100% rename from docs-page/static/.nojekyll rename to docs/static/.nojekyll diff --git a/docs-page/static/img/dspy_logo.png b/docs/static/img/dspy_logo.png similarity index 100% rename from docs-page/static/img/dspy_logo.png rename to docs/static/img/dspy_logo.png diff --git a/docs-page/static/img/logo.png b/docs/static/img/logo.png similarity index 100% rename from docs-page/static/img/logo.png rename to docs/static/img/logo.png diff --git a/docs-page/static/img/modular.png b/docs/static/img/modular.png similarity index 100% rename from docs-page/static/img/modular.png rename to docs/static/img/modular.png diff --git a/docs-page/static/img/optimize.png b/docs/static/img/optimize.png similarity index 100% rename from docs-page/static/img/optimize.png rename to docs/static/img/optimize.png diff --git a/docs-page/static/img/undraw_docusaurus_mountain.svg b/docs/static/img/undraw_docusaurus_mountain.svg similarity index 100% rename from docs-page/static/img/undraw_docusaurus_mountain.svg rename to docs/static/img/undraw_docusaurus_mountain.svg diff --git a/docs-page/static/img/undraw_docusaurus_react.svg b/docs/static/img/undraw_docusaurus_react.svg similarity index 100% rename from 
docs-page/static/img/undraw_docusaurus_react.svg rename to docs/static/img/undraw_docusaurus_react.svg diff --git a/docs-page/static/img/undraw_docusaurus_tree.svg b/docs/static/img/undraw_docusaurus_tree.svg similarity index 100% rename from docs-page/static/img/undraw_docusaurus_tree.svg rename to docs/static/img/undraw_docusaurus_tree.svg diff --git a/docs-page/static/img/universal_compatibility.png b/docs/static/img/universal_compatibility.png similarity index 100% rename from docs-page/static/img/universal_compatibility.png rename to docs/static/img/universal_compatibility.png diff --git a/docs/teleprompters.md b/docs/teleprompters.md deleted file mode 100644 index d60ea7f9d..000000000 --- a/docs/teleprompters.md +++ /dev/null @@ -1,283 +0,0 @@ -# Teleprompters Documentation - -Teleprompters are powerful optimizers (included in DSPy) that can learn to bootstrap and select effective prompts for the modules of any program. (The "tele-" in the name means "at a distance", i.e., automatic prompting at a distance.) - -This documentation provides an overview of the DSPy Teleprompters. - -## Teleprompters - -| Module | Jump To | -| --- | --- | -| LabeledFewShot | [LabeledFewShot Section](#telepromptlabeledfewshot) | -| BootstrapFewShot | [BootstrapFewShot Section](#telepromptbootstrapfewshot) | -| Ensemble | [Ensemble Section](#telepromptensemble) | -| BootstrapFewShotWithRandomSearch | [BootstrapFewShotWithRandomSearch Section](#telepromptbootstrapfewshotwithrandomsearch) | -| BootstrapFinetune | [BootstrapFinetune Section](#telepromptbootstrapfinetune) | - -## teleprompt.LabeledFewShot - -### Constructor - -The constructor initializes the `LabeledFewShot` class and sets up its attributes, particularly defining `k` number of samples to be used by the predictor. - -```python -class LabeledFewShot(Teleprompter): - def __init__(self, k=16): - self.k = k -``` - -**Parameters:** -- `k` (_int_): Number of samples to be used for each predictor. Defaults to 16. - -### Method - -#### `compile(self, student, *, trainset)` - -This method compiles the `LabeledFewShot` instance by configuring the `student` predictor. It assigns subsets of the `trainset` in each student's predictor's `demos` attribute. If the `trainset` is empty, the method returns the original `student`. - -**Parameters:** -- `student` (_Teleprompter_): Student predictor to be compiled. -- `trainset` (_list_): Training dataset for compiling with student predictor. - -**Returns:** -- The compiled `student` predictor with assigned training samples for each predictor or the original `student` if the `trainset` is empty. - -### Example - -```python -import dspy - -#Assume defined trainset -class RAG(dspy.Module): - def __init__(self, num_passages=3): - super().__init__() - - #declare retrieval and predictor modules - self.retrieve = dspy.Retrieve(k=num_passages) - self.generate_answer = dspy.ChainOfThought(GenerateAnswer) - - #flow for answering questions using predictor and retrieval modules - def forward(self, question): - context = self.retrieve(question).passages - prediction = self.generate_answer(context=context, question=question) - return dspy.Prediction(context=context, answer=prediction.answer) - -#Define teleprompter -teleprompter = LabeledFewShot() - -# Compile! -compiled_rag = teleprompter.compile(student=RAG(), trainset=trainset) -``` - -## teleprompt.BootstrapFewShot - -### Constructor - -The constructor initializes the `BootstrapFewShot` class and sets up parameters for bootstrapping. 
- -```python -class BootstrapFewShot(Teleprompter): - def __init__(self, metric=None, teacher_settings={}, max_bootstrapped_demos=4, max_labeled_demos=16, max_rounds=1): - self.metric = metric - self.teacher_settings = teacher_settings - - self.max_bootstrapped_demos = max_bootstrapped_demos - self.max_labeled_demos = max_labeled_demos - self.max_rounds = max_rounds -``` - -**Parameters:** -- `metric` (_callable_, _optional_): Metric function to evaluate examples during bootstrapping. Defaults to `None`. -- `teacher_settings` (_dict_, _optional_): Settings for teacher predictor. Defaults to empty dictionary. -- `max_bootstrapped_demos` (_int_, _optional_): Maximum number of bootstrapped demonstrations per predictor. Defaults to 4. -- `max_labeled_demos` (_int_, _optional_): Maximum number of labeled demonstrations per predictor. Defaults to 16. -- `max_rounds` (_int_, _optional_): Maximum number of bootstrapping rounds. Defaults to 1. - -### Method - -#### `compile(self, student, *, teacher=None, trainset, valset=None)` - -This method compiles the BootstrapFewShot instance by performing bootstrapping to refine the student predictor. - -This process includes preparing the student and teacher predictors, which involves creating predictor copies, verifying the student predictor is uncompiled, and compiling the teacher predictor with labeled demonstrations via LabeledFewShot if the teacher predictor hasn't been compiled. - -The next stage involves preparing predictor mappings by validating that both the student and teacher predictors have the same program structure and the same signatures but are different objects. - -The final stage is performing the bootstrapping iterations. - -**Parameters:** -- `student` (_Teleprompter_): Student predictor to be compiled. -- `teacher` (_Teleprompter_, _optional_): Teacher predictor used for bootstrapping. Defaults to `None`. -- `trainset` (_list_): Training dataset used in bootstrapping. -- `valset` (_list_, _optional_): Validation dataset used in compilation. Defaults to `None`. - -**Returns:** -- The compiled `student` predictor after bootstrapping with refined demonstrations. - -### Example - -```python -#Assume defined trainset -#Assume defined RAG class -... - -#Define teleprompter and include teacher -teacher = dspy.OpenAI(model='gpt-3.5-turbo', api_key = openai.api_key, api_provider = "openai", model_type = "chat") -teleprompter = BootstrapFewShot(teacher_settings=dict({'lm': teacher})) - -# Compile! -compiled_rag = teleprompter.compile(student=RAG(), trainset=trainset) -``` - -## teleprompt.Ensemble - -### Constructor - -The constructor initializes the `Ensemble` class and sets up its attributes. This teleprompter is designed to create ensembled versions of multiple programs, reducing various outputs from different programs into a single output. - -```python -class Ensemble(Teleprompter): - def __init__(self, *, reduce_fn=None, size=None, deterministic=False): -``` - -**Parameters:** -- `reduce_fn` (_callable_, _optional_): Function used to reduce multiple outputs from different programs into a single output. A common choice is `dspy.majority`. Defaults to `None`. -- `size` (_int_, _optional_): Number of programs to randomly select for ensembling. If not specified, all programs will be used. Defaults to `None`. -- `deterministic` (_bool_, _optional_): Specifies whether ensemble should operate deterministically. Currently, setting this to `True` will raise an error as this feature is pending implementation. Defaults to `False`. 
- -### Method - -#### `compile(self, programs)` - -This method compiles an ensemble of programs into a single program that when run, can either randomly sample a subset of the given programs to produce outputs or use all of them. The multiple outputs can then be reduced into a single output using the `reduce_fn`. - -**Parameters:** -- `programs` (_list_): List of programs to be ensembled. - -**Returns:** -- `EnsembledProgram` (_Module_): An ensembled version of the input programs. - -### Example - -```python -import dspy -from dspy.teleprompt import Ensemble - -# Assume a list of programs -programs = [program1, program2, program3, ...] - -# Define Ensemble teleprompter -teleprompter = Ensemble(reduce_fn=dspy.majority, size=2) - -# Compile to get the EnsembledProgram -ensembled_program = teleprompter.compile(programs) -``` - -## teleprompt.BootstrapFewShotWithRandomSearch - -### Constructor - -The constructor initializes the `BootstrapFewShotWithRandomSearch` class and sets up its attributes. It inherits from the `BootstrapFewShot` class and introduces additional attributes for the random search process. - -```python -class BootstrapFewShotWithRandomSearch(BootstrapFewShot): - def __init__(self, metric, teacher_settings={}, max_bootstrapped_demos=4, max_labeled_demos=16, max_rounds=1, num_candidate_programs=16, num_threads=6): - self.metric = metric - self.teacher_settings = teacher_settings - self.max_rounds = max_rounds - - self.num_threads = num_threads - - self.min_num_samples = 1 - self.max_num_samples = max_bootstrapped_demos - self.num_candidate_sets = num_candidate_programs - self.max_num_traces = 1 + int(max_bootstrapped_demos / 2.0 * self.num_candidate_sets) - - self.max_bootstrapped_demos = self.max_num_traces - self.max_labeled_demos = max_labeled_demos - - print("Going to sample between", self.min_num_samples, "and", self.max_num_samples, "traces per predictor.") - print("Going to sample", self.max_num_traces, "traces in total.") - print("Will attempt to train", self.num_candidate_sets, "candidate sets.") -``` - -**Parameters:** -- `metric` (_callable_, _optional_): Metric function to evaluate examples during bootstrapping. Defaults to `None`. -- `teacher_settings` (_dict_, _optional_): Settings for teacher predictor. Defaults to empty dictionary. -- `max_bootstrapped_demos` (_int_, _optional_): Maximum number of bootstrapped demonstrations per predictor. Defaults to 4. -- `max_labeled_demos` (_int_, _optional_): Maximum number of labeled demonstrations per predictor. Defaults to 16. -- `max_rounds` (_int_, _optional_): Maximum number of bootstrapping rounds. Defaults to 1. -- `num_candidate_programs` (_int_): Number of candidate programs to generate during random search. -- `num_threads` (_int_): Number of threads used for evaluation during random search. - -### Method - -Refer to [teleprompt.BootstrapFewShot](#telepromptbootstrapfewshot) documentation. - -## Example - -```python -#Assume defined trainset -#Assume defined RAG class -... - -#Define teleprompter and include teacher -teacher = dspy.OpenAI(model='gpt-3.5-turbo', api_key = openai.api_key, api_provider = "openai", model_type = "chat") -teleprompter = BootstrapFewShotWithRandomSearch(teacher_settings=dict({'lm': teacher})) - -# Compile! 
-compiled_rag = teleprompter.compile(student=RAG(), trainset=trainset) -``` - -## teleprompt.BootstrapFinetune - -### Constructor - -### `__init__(self, metric=None, teacher_settings={}, multitask=True)` - -The constructor initializes a `BootstrapFinetune` instance and sets up its attributes. It defines the teleprompter as a `BootstrapFewShot` instance for the finetuning compilation. - -```python -class BootstrapFinetune(Teleprompter): - def __init__(self, metric=None, teacher_settings={}, multitask=True): -``` - -**Parameters:** -- `metric` (_callable_, _optional_): Metric function to evaluate examples during bootstrapping. Defaults to `None`. -- `teacher_settings` (_dict_, _optional_): Settings for teacher predictor. Defaults to empty dictionary. -- `multitask` (_bool_, _optional_): Enable multitask fine-tuning. Defaults to `True`. - -### Method - -#### `compile(self, student, *, teacher=None, trainset, valset=None, target='t5-large', bsize=12, accumsteps=1, lr=5e-5, epochs=1, bf16=False)` - -This method first compiles for bootstrapping with the `BootstrapFewShot` teleprompter. It then prepares fine-tuning data by generating prompt-completion pairs for training and performs finetuning. After compilation, the LMs are set to the finetuned models and the method returns a compiled and fine-tuned predictor. - -**Parameters:** -- `student` (_Predict_): Student predictor to be fine-tuned. -- `teacher` (_Predict_, _optional_): Teacher predictor to help with fine-tuning. Defaults to `None`. -- `trainset` (_list_): Training dataset for fine-tuning. -- `valset` (_list_, _optional_): Validation dataset for fine-tuning. Defaults to `None`. -- `target` (_str_, _optional_): Target model for fine-tuning. Defaults to `'t5-large'`. -- `bsize` (_int_, _optional_): Batch size for training. Defaults to `12`. -- `accumsteps` (_int_, _optional_): Gradient accumulation steps. Defaults to `1`. -- `lr` (_float_, _optional_): Learning rate for fine-tuning. Defaults to `5e-5`. -- `epochs` (_int_, _optional_): Number of training epochs. Defaults to `1`. -- `bf16` (_bool_, _optional_): Enable mixed-precision training with BF16. Defaults to `False`. - -**Returns:** -- `compiled2` (_Predict_): A compiled and fine-tuned `Predict` instance. - -### Example - -```python -#Assume defined trainset -#Assume defined RAG class -... - -#Define teleprompter -teleprompter = BootstrapFinetune(teacher_settings=dict({'lm': teacher})) - -# Compile! -compiled_rag = teleprompter.compile(student=RAG(), trainset=trainset, target='google/flan-t5-base') -``` \ No newline at end of file diff --git a/docs-page/tsconfig.json b/docs/tsconfig.json similarity index 100% rename from docs-page/tsconfig.json rename to docs/tsconfig.json diff --git a/docs/using_local_models.md b/docs/using_local_models.md deleted file mode 100644 index 37c3391ab..000000000 --- a/docs/using_local_models.md +++ /dev/null @@ -1,198 +0,0 @@ -# Using local models within DSPy - -DSPy supports various methods including `built-in wrappers`, `server integration`, and `external package integration` for model loading. This documentation provides a concise introduction on how to load in models within DSPy extending these capabilities for your specific needs. 
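Regardless of the loader, the pattern is the same: construct the wrapper around your locally served model and register it with `dspy.settings`. A minimal sketch, assuming the Llama-2 weights are already downloaded (the model ID is illustrative):

```python
# Minimal sketch of the common pattern for any local loader.
import dspy

llama = dspy.HFModel(model='meta-llama/Llama-2-7b-hf')
dspy.settings.configure(lm=llama)
```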
- -## Local Model Loaders - -| Loaders | Jump To | -| --- | --- | -| HFModel | [HFModel Section](#hfmodel) | -| Cohere | [Cohere Section](#cohere) | -| TGI | [TGI Section](#tgi) | -| VLLM | [VLLM Section](#vllm) | -| MLC LLM | [MLC LLM Section](#mlc-llm) | -| Ollama | [Ollama Section](#ollama) | - - -# HFModel - -Initialize `HFModel` within your program with the desired model to load in. Here's an example call: - - ```python - llama = dspy.HFModel(model = 'meta-llama/Llama-2-7b-hf') - ``` - -# Text-Generation-Inference Server - -## Prerequisites - -- Docker must be installed on your system. If you don't have Docker installed, you can get it from [here](https://docs.docker.com/get-docker/). - -## Setting up the Text-Generation-Inference Server - -1. Clone the Text-Generation-Inference repository from GitHub by executing the following command: - - ``` - git clone https://github.com/huggingface/text-generation-inference.git - ``` - -2. Change into the cloned repository directory: - - ``` - cd text-generation-inference - ``` - -3. Execute the Docker command under the "Get Started" section to run the server: - - ``` - model=meta-llama/Llama-2-7b-hf # set to the specific Hugging Face model ID you wish to use. - num_shard=2 # set to the number of shards you wish to use. - volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run - - docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference:0.9 --model-id $model --num-shard $num_shard - ``` - - This command will start the server and make it accessible at `http://localhost:8080`. - -If you want to connect to [Meta Llama 2 models](https://huggingface.co/meta-llama), make sure to use version 9.3 (or higher) of the docker image (ghcr.io/huggingface/text-generation-inference:0.9.3) and pass in your huggingface token as an environment variable. - - docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data -e HUGGING_FACE_HUB_TOKEN={your_token} ghcr.io/huggingface/text-generation-inference:0.9.3 --model-id $model --num-shard $num_shard - -## Sending requests to the server - -After setting up the text-generation-inference server and ensuring that it displays "Connected" when it's running, you can interact with it using the `HFClientTGI`. - -Initialize the `HFClientTGI` within your program with the desired parameters. Here is an example call: - - ```python - lm = dspy.HFClientTGI(model="meta-llama/Llama-2-7b-hf", port=8080, url="http://localhost") - ``` - - Customize the `model`, `port`, and `url` according to your requirements. The `model` parameter should be set to the specific Hugging Face model ID you wish to use. - - -### FAQs - -1. If your model doesn't require any shards, you still need to set a value for `num_shard`, but you don't need to include the parameter `--num-shard` on the command line. - -2. If your model runs into any "token exceeded" issues, you can set the following parameters on the command line to adjust the input length and token limit: - - `--max-input-length`: Set the maximum allowed input length for the text. - - `--max-total-tokens`: Set the maximum total tokens allowed for text generation. - -Please refer to the [official Text-Generation-Inference repository](https://github.com/huggingface/text-generation-inference) for more detailed information and documentation. - - -# vLLM Server - -## Setting up the vLLM Server - -Follow these steps to set up the vLLM Server: - -1. 
Build the server from source by following the instructions provided in the [Build from Source guide](https://vllm.readthedocs.io/en/latest/getting_started/installation.html#build-from-source). - -2. Start the server by running the following command, and specify your desired model, host, and port using the appropriate arguments. The default server address is http://localhost:8000. - - Example command: - ``` - python -m vllm.entrypoints.openai.api_server --model mosaicml/mpt-7b --port 8000 - ``` - -This will launch the vLLM server. - -## Sending requests to the vLLM server - -After setting up the vLLM server and ensuring that it displays "Connected" when it's running, you can interact with it using the `HFClientVLLM`. - -Initialize the `HFClientVLLM` within your program with the desired parameters. Here is an example call: - - ```python - lm = dspy.HFClientVLLM(model="mosaicml/mpt-7b", port=8000, url="http://localhost") - ``` - - Customize the `model`, `port`, `url`, and `max_tokens` according to your requirements. The `model` parameter should be set to the specific Hugging Face model ID you wish to use. - -Please refer to the [official vLLM repository](https://github.com/vllm-project/vllm) for more detailed information and documentation. - -# MLC LLM - -## Prerequisites - -1. Install the required packages using the following commands: - - ```shell - pip install --no-deps --pre --force-reinstall mlc-ai-nightly-cu118 mlc-chat-nightly-cu118 -f https://mlc.ai/wheels - pip install transformers - git lfs install - ``` - - Adjust the pip wheels according to your OS/platform by referring to the provided commands in [MLC packages](https://mlc.ai/package/). - -## Running MLC Llama-2 models - -1. Create a directory for prebuilt models: - - ```shell - mkdir -p dist/prebuilt - ``` - -2. Clone the necessary libraries from the repository: - - ```shell - git clone https://github.com/mlc-ai/binary-mlc-llm-libs.git dist/prebuilt/lib - cd dist/prebuilt - ``` - -3. Choose a Llama-2 model from [MLC LLMs](https://huggingface.co/mlc-ai) and clone the model repository: - - ```shell - git clone https://huggingface.co/mlc-ai/mlc-chat-Llama-2-7b-chat-hf-q4f16_1 - ``` - -4. Initialize the `ChatModuleClient` within your program with the desired parameters. Here's an example call: - - ```python - llama = dspy.ChatModuleClient(model='dist/prebuilt/mlc-chat-Llama-2-7b-chat-hf-q4f16_1', model_path='dist/prebuilt/lib/Llama-2-7b-chat-hf-q4f16_1-cuda.so') - ``` -Please refer to the [official MLC repository](https://github.com/mlc-ai/mlc-llm) for more detailed information and [documentation](https://mlc.ai/mlc-llm/docs/get_started/try_out.html). - -# Ollama - -Ollama is a good software tool that allows you to run LLMs locally, such as Mistral, Llama2, and Phi. -The following are the instructions to install and run Ollama. - -## Prerequisites - -Install Ollama by following the instructions from this page: - -- https://ollama.ai - -Download model: `ollama pull` - -Download a model by running the `ollama pull` command. You can download Mistral, Llama2, and Phi. - -```bash -# download mistral -ollama pull mistral -``` - -Here is the list of other models you can download: -- https://ollama.ai/library - -## Running Ollama model - -Run model: `ollama run` - -You can test a model by running the model with the `ollama run` command. 
- -```bash -# run mistral -ollama run mistral -``` - -## Sending requests to the server - -Here is the code to load a model through Ollama: - -```python -lm = dspy.OllamaLocal(model='mistral') -``` \ No newline at end of file diff --git a/dsp/modules/azure_openai.py b/dsp/modules/azure_openai.py index d930bec6b..c90f634e4 100644 --- a/dsp/modules/azure_openai.py +++ b/dsp/modules/azure_openai.py @@ -107,6 +107,9 @@ def __init__( kwargs["model"] = model self.kwargs = { + "api_base": api_base, + "api_version": api_version, + "api_key": api_key, "temperature": 0.0, "max_tokens": 150, "top_p": 1, diff --git a/dsp/modules/lm.py b/dsp/modules/lm.py index ad3f883dd..fb0b0aab5 100644 --- a/dsp/modules/lm.py +++ b/dsp/modules/lm.py @@ -101,4 +101,4 @@ def copy(self, **kwargs): kwargs = {**self.kwargs, **kwargs} model = kwargs.pop('model') - return self.__class__(model, **kwargs) + return self.__class__(model=model, **kwargs) diff --git a/dspy/experimental/__init__.py b/dspy/experimental/__init__.py new file mode 100644 index 000000000..b4385639d --- /dev/null +++ b/dspy/experimental/__init__.py @@ -0,0 +1,2 @@ +from .synthetic_data import * +from .synthesizer import * diff --git a/dspy/experimental/synthesizer.py b/dspy/experimental/synthesizer.py new file mode 100644 index 000000000..e247d35f3 --- /dev/null +++ b/dspy/experimental/synthesizer.py @@ -0,0 +1,180 @@ +import dspy +import random +from typing import List +from tqdm import tqdm, trange +from datasets import Dataset + +def format_examples(examples: List[dspy.Example]): + if isinstance(examples, str): + return examples + + formatted_example = "" + + for example in examples: + input_keys = example.inputs().keys() + label_keys = example.labels().keys() + + formatted_example += f"Inputs:\n" + for key in input_keys: + formatted_example += f"{key}: {example[key]}\n" + + formatted_example += f"Outputs:\n" + for key in label_keys: + formatted_example += f"{key}: {example[key]}\n" + + return formatted_example + +class ExplainTask(dspy.Signature): + """Analyze the provided set of datapoints carefully, and prepare a concise, comprehensible summary that captures the essence and purpose of the task these datapoints aim to address. Your summary should illuminate the general objective and the type of problem being solved, offering a clear picture of what the task entails at a high level. Avoid getting into the nuances of individual datapoints, specifics about models, examples, algorithms, or any intricate technicalities. Your explanation should serve to clarify the task's overall goal and its basic premise, without touching on methodologies or solutions.""" + + examples = dspy.InputField( + prefix="Examples Datapoints:-", + desc="List of datapoints to analyze and explain the task.", + format=format_examples, + ) + explanation = dspy.OutputField( + prefix="Task Description:", + desc="Explanation of the task.", + ) + +class GenerateFieldDescription(dspy.Signature): + """Generate a concise and informative description for a given field based on the provided name and task description. 
This description should be no longer than 10 words and should be in simple english.""" + + task_description = dspy.InputField( + prefix="Task Description:", + desc="Description of the task the field is an input to.", + ) + field_name = dspy.InputField( + prefix="Field Name:", + desc="Name of the field to generate synthetic data for.", + ) + field_description = dspy.OutputField( + prefix="Field Description:", + desc="Description of the field.", + ) + +class GenerateInputFieldsData(dspy.Signature): + """Generate synthetic data based on the task description and the given knowledge seed.""" + + knowledge_seed = dspy.InputField( + prefix="Knowledge Seed:", + desc="Seed for the knowledge base search to base the inputs around.", + format=lambda x: str(x), + ) + task_description = dspy.InputField( + prefix="Task Description:", + desc="Description of the task the field is an input to.", + ) + +class GenerateOutputFieldsData(dspy.Signature): + pass + +class Synthesizer: + def __init__(self): + self.explain_task = dspy.Predict(ExplainTask) + self.generate_field_description = dspy.Predict(GenerateFieldDescription) + + self.generate_input_data = GenerateInputFieldsData + self.generate_output_data = GenerateOutputFieldsData + + def _prepare_synthetic_data_predictors(self, input_keys: List[str], output_keys: List[str], task_description: str): + for key in tqdm(input_keys, desc="Preparing Input Fields"): + field_details = self.generate_field_description( + task_description=task_description, + field_name=key, + ) + + field_name = key + field_description = field_details.field_description + + output_field = dspy.OutputField( + prefix=f"{field_name}:", + desc=field_description, + ) + self.generate_input_data = self.generate_input_data.insert( + -1, + field_name, + output_field + ) + + input_field = dspy.InputField( + prefix=f"{field_name}:", + desc=field_description, + ) + self.generate_output_data = self.generate_output_data.insert( + -1, + field_name, + input_field + ) + + for key in tqdm(output_keys, desc="Preparing Output Fields"): + field_details = self.generate_field_description( + task_description=task_description, + field_name=key, + ) + + field_name = key + field_description = field_details.field_description + + output_field = dspy.OutputField( + prefix=f"{field_name}:", + desc=field_description, + ) + self.generate_output_data = self.generate_output_data.insert( + -1, + field_name, + output_field + ) + + return dspy.ChainOfThought(self.generate_input_data), dspy.Predict(self.generate_output_data) + + def generate(self, examples: List[dspy.Example], num_data: int, task_description: str = None, input_keys: str = None, output_keys: str = None) -> List[dspy.Example]: + task_description = task_description or self.explain_task(examples=examples).explanation + self.generate_output_data.__doc__ = task_description + + input_keys = input_keys or [key for key in examples[0].inputs()] + output_keys = output_keys or [key for key in examples[0].labels()] + + self.input_predictor, self.output_predictor = self._prepare_synthetic_data_predictors( + input_keys=input_keys, + output_keys=output_keys, + task_description=task_description, + ) + + data = [] + + for idx in trange(num_data, desc="Generating Synthetic Data"): + inputs = self.input_predictor(task_description=task_description, knowledge_seed=random.randint(0, 1000000), config=dict(temperature=0.7+0.01*idx)) + + input_kwargs = { + key: getattr(inputs, key) + for key in input_keys + } + + outputs = self.output_predictor(**input_kwargs, 
config=dict(temperature=0.7+0.01*idx))
+
+            output_kwargs = {
+                key: getattr(outputs, key)
+                for key in output_keys
+            }
+
+            data.append(dspy.Example(**input_kwargs, **output_kwargs).with_inputs(*input_keys))
+
+        return data
+
+
+    def export(self, data: List[dspy.Example], path: str, mode: str = None, **kwargs):
+        extension = mode or path.split(".")[-1]
+
+        dataset = Dataset.from_list(
+            [example.toDict() for example in data]
+        )
+
+        if extension == "csv":
+            dataset.to_csv(path_or_buf=path, **kwargs)
+
+        elif extension == "json":
+            dataset.to_json(path_or_buf=path, **kwargs)
+
+        elif extension == "arrow" or extension == "hf":
+            dataset.save_to_disk(path)
\ No newline at end of file
diff --git a/dspy/experimental/synthetic_data.py b/dspy/experimental/synthetic_data.py
new file mode 100644
index 000000000..d5177fcac
--- /dev/null
+++ b/dspy/experimental/synthetic_data.py
@@ -0,0 +1,68 @@
+from pydantic import BaseModel
+import dspy
+import random
+from typing import List, Optional
+
+class descriptionSignature(dspy.Signature):
+    field_name = dspy.InputField(desc="name of a field")
+    example = dspy.InputField(desc="an example value for the field")
+    description = dspy.OutputField(desc="a short text only description of what the field contains")
+
+class SyntheticDataGenerator:
+    def __init__(self, schema_class: Optional[BaseModel] = None, examples: Optional[List[dspy.Example]] = None):
+        self.schema_class = schema_class
+        self.examples = examples
+
+    def generate(self, sample_size: int) -> List[dspy.Example]:
+        if not self.schema_class and not self.examples:
+            raise ValueError("Either a schema_class or examples must be provided.")
+        if self.examples and len(self.examples) >= sample_size:
+            print("No additional data generation needed.")
+            return self.examples[:sample_size]
+
+        additional_samples_needed = sample_size - (len(self.examples) if self.examples else 0)
+        generated_examples = self._generate_additional_examples(additional_samples_needed)
+
+        return self.examples + generated_examples if self.examples else generated_examples
+
+    def _define_or_infer_fields(self):
+        if self.schema_class:
+            data_schema = self.schema_class.model_json_schema()
+            properties = data_schema['properties']
+        elif self.examples:
+            inferred_schema = self.examples[0].__dict__['_store']
+            descriptor = dspy.Predict(descriptionSignature)
+            properties = {field: {'description': str((descriptor(field_name=field, example=str(inferred_schema[field]))).description)}
+                          for field in inferred_schema.keys()}
+        else:
+            properties = {}
+        return properties
+
+    def _generate_additional_examples(self, additional_samples_needed: int) -> List[dspy.Example]:
+        properties = self._define_or_infer_fields()
+        class_name = f"{self.schema_class.__name__ if self.schema_class else 'Inferred'}Signature"
+        fields = self._prepare_fields(properties)
+
+        signature_class = type(class_name, (dspy.Signature,), fields)
+        generator = dspy.Predict(signature_class, n=additional_samples_needed)
+        response = generator(sindex=str(random.randint(1, additional_samples_needed)))
+
+        return [dspy.Example({field_name: getattr(completion, field_name) for field_name in properties.keys()})
+                for completion in response.completions]
+
+    def _prepare_fields(self, properties) -> dict:
+        return {
+            '__doc__': f"Generates the following outputs: {{{', '.join(properties.keys())}}}.",
+            'sindex': dspy.InputField(desc="a random string"),
+            **{field_name: dspy.OutputField(desc=properties[field_name].get('description', 'No description'))
+               for field_name in properties.keys()}
+        }
+
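+# Illustration only (hypothetical field names, not part of the module): for an
+# example-inferred schema with fields `question` and `answer`, the signature class
+# built dynamically in `_generate_additional_examples` is roughly equivalent to:
+#
+#     class InferredSignature(dspy.Signature):
+#         """Generates the following outputs: {question, answer}."""
+#         sindex = dspy.InputField(desc="a random string")
+#         question = dspy.OutputField(desc="<field description or 'No description'>")
+#         answer = dspy.OutputField(desc="<field description or 'No description'>")
+#
+# `dspy.Predict(signature_class, n=additional_samples_needed)` then samples that many
+# completions, each of which becomes one synthetic `dspy.Example`.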
+# # Usage example +# # Generating synthetic data via a pydantic model +# generator = SyntheticDataGenerator(schema_class=SyntheticFacts) +# examples = generator.generate(sample_size=6) + +# # Generating synthetic data via existing examples +# generator = SyntheticDataGenerator(examples=existing_examples) +# examples = generator.generate(sample_size=5) diff --git a/dspy/utils/__init__.py b/dspy/utils/__init__.py index c6f239df0..9f8b201f6 100644 --- a/dspy/utils/__init__.py +++ b/dspy/utils/__init__.py @@ -1 +1 @@ -from .dummies import * \ No newline at end of file +from .dummies import * diff --git a/examples/longformqa/longformqa_assertions.ipynb b/examples/longformqa/longformqa_assertions.ipynb index b37be6adf..cb13909bd 100644 --- a/examples/longformqa/longformqa_assertions.ipynb +++ b/examples/longformqa/longformqa_assertions.ipynb @@ -6,6 +6,7 @@ "source": [ "\"DSPy7\n", "\n", + "\n", "## **DSPy Assertions**: Asserting Computational Constraints on Foundation \n", "\n", "### **LongFormQA**: Generating long-form length responses to answer questions" diff --git a/examples/quiz/quiz_assertions.ipynb b/examples/quiz/quiz_assertions.ipynb index 48385d97e..917cd6599 100644 --- a/examples/quiz/quiz_assertions.ipynb +++ b/examples/quiz/quiz_assertions.ipynb @@ -18,10 +18,10 @@ "[](https://colab.research.google.com/github/stanfordnlp/dspy/blob/main/examples/quiz/quiz_assertions.ipynb)\n", "\n", "\n", - "This notebook highlights an example of [**DSPy Assertions**](../../docs/assertions.md), allowing for declaration of computational constraints within DSPy programs. \n", + "This notebook highlights an example of [**DSPy Assertions**](https://dspy-docs.vercel.app/docs/building-blocks/assertions), allowing for declaration of computational constraints within DSPy programs. \n", "\n", "\n", - "This notebook builds upon the foundational concepts of the **DSPy** framework. Prerequisites of following this notebook is having gone through the [DSPy tutorial](../../intro.ipynb), the [**DSPy Assertions documentation**](../../docs/assertions.md) and the introductory DSPy Assertions [tutorial on LongFormQA](../longformqa/longformqa_assertions.ipynb).\n" + "This notebook builds upon the foundational concepts of the **DSPy** framework. Prerequisites of following this notebook is having gone through the [DSPy tutorial](../../intro.ipynb), the [**DSPy Assertions documentation**](https://dspy-docs.vercel.app/docs/building-blocks/assertions) and the introductory DSPy Assertions [tutorial on LongFormQA](../longformqa/longformqa_assertions.ipynb).\n" ] }, { diff --git a/examples/tweets/tweets_assertions.ipynb b/examples/tweets/tweets_assertions.ipynb index 46afc29cc..24b7f37d8 100644 --- a/examples/tweets/tweets_assertions.ipynb +++ b/examples/tweets/tweets_assertions.ipynb @@ -18,10 +18,10 @@ "[](https://colab.research.google.com/github/stanfordnlp/dspy/blob/main/examples/tweets/tweets_assertions.ipynb)\n", "\n", "\n", - "This notebook highlights an example of [**DSPy Assertions**](../../docs/assertions.md), allowing for declaration of computational constraints within DSPy programs. \n", + "This notebook highlights an example of [**DSPy Assertions**](https://dspy-docs.vercel.app/docs/building-blocks/assertions), allowing for declaration of computational constraints within DSPy programs. \n", "\n", "\n", - "This notebook builds upon the foundational concepts of the **DSPy** framework. 
Prerequisites of following this notebook is having gone through the [DSPy tutorial](../../intro.ipynb), the [**DSPy Assertions documentation**](../../docs/assertions.md) and the introductory DSPy Assertions [tutorial on LongFormQA](../longformqa/longformqa_assertions.ipynb).\n" + "This notebook builds upon the foundational concepts of the **DSPy** framework. Prerequisites of following this notebook is having gone through the [DSPy tutorial](../../intro.ipynb), the [**DSPy Assertions documentation**](https://dspy-docs.vercel.app/docs/building-blocks/assertions) and the introductory DSPy Assertions [tutorial on LongFormQA](../longformqa/longformqa_assertions.ipynb).\n" ] }, { diff --git a/pyproject.toml b/pyproject.toml index db7fc3fe4..1c0f75968 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta" [project] name = "dspy-ai" -version = "2.3.4" +version = "2.3.6" description = "DSPy" readme = "README.md" authors = [{ name = "Omar Khattab", email = "okhattab@stanford.edu" }] @@ -272,4 +272,4 @@ convention = "google" [tool.ruff.lint.per-file-ignores] "**/{tests,docs}/*" = ["ALL"] -"**__init__.py" = ["F401"] \ No newline at end of file +"**__init__.py" = ["F401"] diff --git a/setup.py b/setup.py index 96e6ff5ab..d31e9c791 100644 --- a/setup.py +++ b/setup.py @@ -10,7 +10,7 @@ setup( name="dspy-ai", - version="2.3.4", + version="2.3.6", description="DSPy", long_description=long_description, long_description_content_type='text/markdown', diff --git a/skycamp2023.ipynb b/skycamp2023.ipynb index 28cd3c760..5590509da 100644 --- a/skycamp2023.ipynb +++ b/skycamp2023.ipynb @@ -425,6 +425,7 @@ " # TODO: Replace `None` with a call to self.generate_query_from_context to generate a search query.\n", " # Note: In DSPy, always pass keyword arguments (e.g., context=..., question=...) to the modules to avoid ambiguity.\n", " # Note 2: Don't forget to access the field .search_query to extract that from the output of the module.\n", + " # Note 3: Check the following notebook for a completed example: https://github.com/stanfordnlp/dspy/blob/main/skycamp2023_completed.ipynb.\n", " search_query2 = None\n", "\n", " # TODO: Replace `None` with a call to self.retrieve to retrieve passages. Append them to the list `passages`.\n",