Fix typos (stanfordnlp#1707)
* Fix typos

* Revert false positives

* Fix "fullset"

---------

Co-authored-by: arnavsinghvi11 <[email protected]>
szepeviktor and arnavsinghvi11 authored Nov 3, 2024
1 parent 9f43944 commit 649be4a
Showing 43 changed files with 93 additions and 93 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/precommits_check.yml
@@ -24,7 +24,7 @@ jobs:
run: |
echo "Changed files"
echo ${{ steps.files.outputs.all }}
echo "Github Client version"
echo "GitHub Client version"
echo $(gh --version)
- name: Pre-Commit Checks
run: |
2 changes: 1 addition & 1 deletion README.md
@@ -415,7 +415,7 @@ Guidance, LMQL, RELM, and Outlines are all exciting new libraries for controllin
This is very useful in many settings, but it's generally focused on low-level, structured control of a single LM call. It doesn't help ensure the JSON (or structured output) you get is going to be correct or useful for your task.
- In contrast, **DSPy** automatically optimizes the prompts in your programs to align them with various task needs, which may also include producing valid structured ouputs. That said, we are considering allowing **Signatures** in **DSPy** to express regex-like constraints that are implemented by these libraries.
+ In contrast, **DSPy** automatically optimizes the prompts in your programs to align them with various task needs, which may also include producing valid structured outputs. That said, we are considering allowing **Signatures** in **DSPy** to express regex-like constraints that are implemented by these libraries.
</details>
## Testing
6 changes: 3 additions & 3 deletions docs/docs/cheatsheet.md
@@ -230,7 +230,7 @@ def gsm8k_metric(gold, pred, trace=None) -> int:
class FactJudge(dspy.Signature):
"""Judge if the answer is factually correct based on the context."""

- context = dspy.InputField(desc="Context for the prediciton")
+ context = dspy.InputField(desc="Context for the prediction")
question = dspy.InputField(desc="Question to be answered")
answer = dspy.InputField(desc="Answer for the question")
factually_correct = dspy.OutputField(desc="Is the answer factually correct based on the context?", prefix="Factual[Yes/No]:")
@@ -417,7 +417,7 @@ optimized_program = teleprompter.compile(
optimized_program.save(f"mipro_optimized")

# Evaluate optimized program
print(f"Evluate optimized program...")
print(f"Evaluate optimized program...")
evaluate(optimized_program, devset=devset[:])
```

@@ -446,7 +446,7 @@ optimized_program = teleprompter.compile(
optimized_program.save(f"mipro_optimized")

# Evaluate optimized program
print(f"Evluate optimized program...")
print(f"Evaluate optimized program...")
evaluate(optimized_program, devset=devset[:])
```
### Signature Optimizer with Types
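For context on the judge signature shown above: in the cheatsheet it is typically wrapped in a small metric function before being handed to an evaluator. A minimal sketch, assuming the field names from the diff and a chain-of-thought judge (the wrapper below is illustrative, not part of this commit):

```python
import dspy

judge = dspy.ChainOfThought(FactJudge)

def factuality_metric(example, pred, trace=None):
    # Ask the judge whether the predicted answer is grounded in the provided context.
    verdict = judge(
        context=example.context,
        question=example.question,
        answer=pred.answer,
    )
    # The judge answers "Yes" or "No"; turn that into a 0/1 score.
    return int(verdict.factually_correct.strip().lower() == "yes")
```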
4 changes: 2 additions & 2 deletions docs/docs/deep-dive/data-handling/loading-custom-data.md
@@ -86,7 +86,7 @@ Using the Dataset base class now makes loading custom datasets incredibly easy a

!!! caution

- We did not populate `_test` attribute in the above code, which is fine and won't cause any unneccesary error as such. However it'll give you an error if you try to access the test split.
+ We did not populate `_test` attribute in the above code, which is fine and won't cause any unnecessary error as such. However it'll give you an error if you try to access the test split.

```python
dataset.test[:5]
@@ -110,6 +110,6 @@ Using the Dataset base class now makes loading custom datasets incredibly easy a

To prevent that you'll just need to make sure `_test` is not `None` and populated with the appropriate data.

- You can overide the methods in `Dataset` class to customize your class even more.
+ You can override the methods in `Dataset` class to customize your class even more.

In summary, the Dataset base class provides a simplistic way to load and preprocess custom datasets with minimal code!
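To make the `_test` caveat above concrete, here is a minimal sketch of a custom loader built on the `Dataset` base class; the file name and split sizes are hypothetical, and populating `_test` alongside `_train` and `_dev` is what avoids the error described above:

```python
import pandas as pd
from dspy.datasets.dataset import Dataset

class CSVDataset(Dataset):
    def __init__(self, file_path, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Each split is a list of dicts; the base class wraps them into dspy.Example objects.
        rows = pd.read_csv(file_path).to_dict(orient="records")
        self._train = rows[:700]
        self._dev = rows[700:850]
        self._test = rows[850:]  # populated, so dataset.test no longer raises

dataset = CSVDataset("sample.csv")
print(len(dataset.train), len(dataset.dev), len(dataset.test))
```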
@@ -81,7 +81,7 @@ The constructor initializes the `HFModel` base class to support the handling of
- `model` (_str_): ID of Hugging Face model connected to the TGI server.
- `port` (_int_ or _list_): Port for communicating to the TGI server. This can be a single port number (`8080`) or a list of TGI ports (`[8080, 8081, 8082]`) to route the requests to.
- `url` (_str_): Base URL of hosted TGI server. This will often be `"http://localhost"`.
- - `http_request_kwargs` (_dict_): Dictionary of additional keyword agruments to pass to the HTTP request function to the TGI server. This is `None` by default.
+ - `http_request_kwargs` (_dict_): Dictionary of additional keyword arguments to pass to the HTTP request function to the TGI server. This is `None` by default.
- `**kwargs`: Additional keyword arguments to configure the TGI client.

Example of the TGI constructor:
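The constructor example referred to above is collapsed in this view; a minimal sketch of such a call, assuming a locally hosted TGI server and a hypothetical model ID, might look like:

```python
import dspy

# Point the client at a local TGI deployment; a list of ports round-robins requests.
tgi_llama = dspy.HFClientTGI(
    model="meta-llama/Llama-2-7b-hf",      # hypothetical model ID
    port=[8080, 8081, 8082],
    url="http://localhost",
    http_request_kwargs={"timeout": 60},   # optional extra kwargs for the HTTP call
)

dspy.settings.configure(lm=tgi_llama)
```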
2 changes: 1 addition & 1 deletion docs/docs/deep-dive/modules/predict.md
@@ -46,7 +46,7 @@ class Predict(Parameter):

This method serves as a wrapper for the `forward` method. It allows making predictions using the `Predict` class by providing keyword arguments.

- **Paramters:**
+ **Parameters:**
- `**kwargs`: Keyword arguments required for prediction.

**Returns:**
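As a quick illustration of the wrapper described above: calling a `Predict` module with keyword arguments forwards them to `forward`, which returns a `Prediction`. A minimal sketch, assuming an LM has already been configured via `dspy.settings`:

```python
import dspy

# A signature string declares the input and output fields of the predictor.
classify = dspy.Predict("sentence -> sentiment")

# __call__ hands the keyword arguments to forward() and returns a Prediction.
pred = classify(sentence="This book was super fun to read, though not the last chapter.")
print(pred.sentiment)
```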
2 changes: 1 addition & 1 deletion docs/docs/deep-dive/optimizers/bfrs.md
@@ -23,7 +23,7 @@ In terms of API `BootstrapFewShotWithRandomSearch` teleprompter is quite similar

## Working Example

- Let's take the example of optimizing a simple CoT pipeline for GSM8k dataset, we'll take the example in [BootstrapFewShot](/deep-dive/optimizers/bootstrap-fewshot) as our running example for optimizers. We're gonna assume our data and pipeline is same as the on in `BootstrapFewShot` article. So let's start by intializing the optimizer:
+ Let's take the example of optimizing a simple CoT pipeline for GSM8k dataset, we'll take the example in [BootstrapFewShot](/deep-dive/optimizers/bootstrap-fewshot) as our running example for optimizers. We're gonna assume our data and pipeline is same as the on in `BootstrapFewShot` article. So let's start by initializing the optimizer:

```python
from dspy.teleprompt import BootstrapFewShotWithRandomSearch
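# The rest of this snippet is collapsed above. A minimal sketch of the initialization
# and compile step, assuming the gsm8k_metric and CoTPipeline from the BootstrapFewShot
# article are already defined (both names are assumptions, not part of this commit):

teleprompter = BootstrapFewShotWithRandomSearch(
    metric=gsm8k_metric,
    max_bootstrapped_demos=4,
    max_labeled_demos=4,
    num_candidate_programs=8,  # number of randomly sampled candidate programs to evaluate
)

optimized_cot = teleprompter.compile(CoTPipeline(), trainset=trainset)
```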
10 changes: 5 additions & 5 deletions docs/docs/deep-dive/optimizers/miprov2.md
@@ -88,7 +88,7 @@ optimized_program = teleprompter.compile(
optimized_program.save(f"mipro_optimized")

# Evaluate optimized program
print(f"Evluate optimized program...")
print(f"Evaluate optimized program...")
evaluate(optimized_program, devset=devset[:])
```

@@ -119,7 +119,7 @@ zeroshot_optimized_program = teleprompter.compile(
zeroshot_optimized_program.save(f"mipro_zeroshot_optimized")

# Evaluate optimized program
print(f"Evluate optimized program...")
print(f"Evaluate optimized program...")
evaluate(zeroshot_optimized_program, devset=devset[:])
```

@@ -156,7 +156,7 @@ optimized_program = teleprompter.compile(
optimized_program.save(f"mipro_optimized")

# Evaluate optimized program
print(f"Evluate optimized program...")
print(f"Evaluate optimized program...")
evaluate(optimized_program, devset=devset[:])
```

@@ -170,7 +170,7 @@ evaluate(optimized_program, devset=devset[:])
| `prompt_model` | `dspy.LM` | LM specified in `dspy.settings` | Model used for prompt generation. |
| `task_model` | `dspy.LM` | LM specified in `dspy.settings` | Model used for task execution. |
| `auto` | `Optional[str]` | None | If set to `light`, `medium`, or `heavy`, this will automatically configure the following hyperparameters: `num_candidates`, `num_trials`, `minibatch`, and will also cap the size of `valset` up to 100, 300, and 1000 for `light`, `medium`, and `heavy` runs respectively. |
- | `num_candidates` | `int` | `10` | Number of candidate instructions & few-shot examples to generate and evaluate for each predictor. If `num_candidates=10`, this means for a 2 module LM program we'll be optimizing over 10 candidates x 2 modules x 2 variables (few-shot ex. and instructions for each module)= 40 total variables. Therfore, if we increase `num_candidates`, we will probably want to increase `num_trials` as well (see Compile parameters). |
+ | `num_candidates` | `int` | `10` | Number of candidate instructions & few-shot examples to generate and evaluate for each predictor. If `num_candidates=10`, this means for a 2 module LM program we'll be optimizing over 10 candidates x 2 modules x 2 variables (few-shot ex. and instructions for each module)= 40 total variables. Therefore, if we increase `num_candidates`, we will probably want to increase `num_trials` as well (see Compile parameters). |
| `num_threads` | `int` | `6` | Threads to use for evaluation. |
| `max_errors` | `int` | `10` | Maximum errors during an evaluation run that can be made before throwing an Exception. |
| `teacher_settings` | `dict` | `{}` | Settings to use for the teacher model that bootstraps few-shot examples. An example dict would be `{lm=<dspy.LM object>}`. If your LM program with your default model is struggling to bootstrap any examples, it could be worth using a more powerful teacher model for bootstrapping. |
@@ -210,7 +210,7 @@ At a high level, `MIPROv2` works by creating both few-shot examples and new inst
These steps are broken down in more detail below:
1) **Bootstrap Few-Shot Examples**: The same bootstrapping technique used in `BootstrapFewshotWithRandomSearch` is used to create few-shot examples. This works by randomly sampling examples from your training set, which are then run through your LM program. If the output from the program is correct for this example, it is kept as a valid few-shot example candidate. Otherwise, we try another example until we've curated the specified amount of few-shot example candidates. This step creates `num_candidates` sets of `max_bootstrapped_demos` bootstrapped examples and `max_labeled_demos` basic examples sampled from the training set.
2) **Propose Instruction Candidates**. Next, we propose instruction candidates for each predictor in the program. This is done using another LM program as a proposer, which bootstraps & summarizes relevant information about the task to generate high quality instructions. Specifically, the instruction proposer includes (1) a generated summary of properties of the training dataset, (2) a generated summary of your LM program's code and the specific predictor that an instruction is being generated for, (3) the previously bootstrapped few-shot examples to show reference inputs / outputs for a given predictor and (4) a randomly sampled tip for generation (i.e. "be creative", "be concise", etc.) to help explore the feature space of potential instructions.
- 3. **Find an Optimized Combination of Few-Shot Examples & Instructions**. Finally, now that we've created these few-shot examples and instructions, we use Bayesian Optimization to choose which set of these would work best for each predictor in our program. This works by running a series of `num_trials` trials, where a new set of prompts are evaluated over our validation set at each trial. This helps the Bayesian Optimizer learn which combination of prompts work best over time. If `minibatch` is set to `True` (which it is by default), then the new set of prompts are only evaluated on a minibatch of size `minibatch_size` at each trial which generally allows for more efficient exploration / exploitation. The best averaging set of prompts is then evalauted on the full validation set every `minibatch_full_eval_steps` get a less noisey performance benchmark. At the end of the optimization process, the LM program with the set of prompts that performed best on the full validation set is returned.
+ 3. **Find an Optimized Combination of Few-Shot Examples & Instructions**. Finally, now that we've created these few-shot examples and instructions, we use Bayesian Optimization to choose which set of these would work best for each predictor in our program. This works by running a series of `num_trials` trials, where a new set of prompts are evaluated over our validation set at each trial. This helps the Bayesian Optimizer learn which combination of prompts work best over time. If `minibatch` is set to `True` (which it is by default), then the new set of prompts are only evaluated on a minibatch of size `minibatch_size` at each trial which generally allows for more efficient exploration / exploitation. The best averaging set of prompts is then evaluated on the full validation set every `minibatch_full_eval_steps` get a less noisey performance benchmark. At the end of the optimization process, the LM program with the set of prompts that performed best on the full validation set is returned.


For those interested in more details, more information on `MIPROv2` along with a study on `MIPROv2` compared with other DSPy optimizers can be found in [this paper](https://arxiv.org/abs/2406.11695).
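Pulling the parameters and steps above together, a minimal end-to-end sketch of an optimization run, assuming `program`, `trainset`, `metric`, and an `evaluate` helper are already defined as in the snippets above and using the `auto` presets from the parameter table:

```python
import dspy
from dspy.teleprompt import MIPROv2

# "light" caps the search budget; "medium" and "heavy" explore more candidates and trials.
teleprompter = MIPROv2(metric=metric, auto="light", num_threads=6)

optimized_program = teleprompter.compile(
    program.deepcopy(),
    trainset=trainset,
    max_bootstrapped_demos=3,
    max_labeled_demos=4,
    requires_permission_to_run=False,
)

optimized_program.save("mipro_optimized_light")
print("Evaluate optimized program...")
evaluate(optimized_program, devset=devset[:])
```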
2 changes: 1 addition & 1 deletion docs/docs/deep-dive/retrieval_models_clients/MilvusRM.md
@@ -51,7 +51,7 @@ Search the Milvus collection for the top `k` passages matching the given query o
from dspy.retrieve.milvus_rm import MilvusRM
import os

- os.envrion["OPENAI_API_KEY"] = "<YOUR_OPENAI_API_KEY>"
+ os.environ["OPENAI_API_KEY"] = "<YOUR_OPENAI_API_KEY>"

retriever_model = MilvusRM(
collection_name="<YOUR_COLLECTION_NAME>",
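The constructor call above is truncated by the collapsed view; for completeness, once constructed the retriever can be registered as the default RM and queried through `dspy.Retrieve`. A minimal sketch, assuming the Milvus collection already holds embedded passages:

```python
import dspy

# Register the Milvus retriever as the default RM, then query it through dspy.Retrieve.
dspy.settings.configure(rm=retriever_model)

retrieve = dspy.Retrieve(k=3)
results = retrieve("What tasks is DSPy designed for?")
for passage in results.passages:
    print(passage)
```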
@@ -64,7 +64,7 @@ connection_parameters = {
snowpark = Session.builder.configs(connection_parameters).create()

snowflake_retriever = SnowflakeRM(snowflake_session=snowpark,
cortex_search_service="<YOUR_CORTEX_SERACH_SERVICE_NAME>",
cortex_search_service="<YOUR_CORTEX_SEARCH_SERVICE_NAME>",
snowflake_database="<YOUR_SNOWFLAKE_DATABASE_NAME>",
snowflake_schema="<YOUR_SNOWFLAKE_SCHEMA_NAME>",
auto_filter=True,
40 changes: 20 additions & 20 deletions docs/docs/dspy-usecases.md
@@ -58,27 +58,27 @@ WIP. This list mainly includes companies that have public posts or have OKed bei

| **Name** | **Description/Link** |
|---|---|
- | **Stanford CS 224U Homework** | [Github](https://github.com/cgpotts/cs224u/blob/main/hw_openqa.ipynb) |
- | **STORM Report Generation (10,000 GitHub stars)** | [Github](https://github.com/stanford-oval/storm) |
- | **DSPy Redteaming** | [Github](https://github.com/haizelabs/dspy-redteam) |
- | **DSPy Theory of Mind** | [Github](https://github.com/plastic-labs/dspy-opentom) |
- | **Indic cross-lingual Natural Language Inference** | [Github](https://github.com/saifulhaq95/DSPy-Indic/blob/main/indicxlni.ipynb) |
- | **Optimizing LM for Text2SQL using DSPy** | [Github](https://github.com/jjovalle99/DSPy-Text2SQL) |
+ | **Stanford CS 224U Homework** | [GitHub](https://github.com/cgpotts/cs224u/blob/main/hw_openqa.ipynb) |
+ | **STORM Report Generation (10,000 GitHub stars)** | [GitHub](https://github.com/stanford-oval/storm) |
+ | **DSPy Redteaming** | [GitHub](https://github.com/haizelabs/dspy-redteam) |
+ | **DSPy Theory of Mind** | [GitHub](https://github.com/plastic-labs/dspy-opentom) |
+ | **Indic cross-lingual Natural Language Inference** | [GitHub](https://github.com/saifulhaq95/DSPy-Indic/blob/main/indicxlni.ipynb) |
+ | **Optimizing LM for Text2SQL using DSPy** | [GitHub](https://github.com/jjovalle99/DSPy-Text2SQL) |
| **DSPy PII Masking Demo by Eric Ness** | [Colab](https://colab.research.google.com/drive/1KZR1sGTp_RLWUJPAiK1FKPKI-Qn9neUm?usp=sharing) |
- | **DSPy on BIG-Bench Hard Example** | [Github](https://drchrislevy.github.io/posts/dspy/dspy.html) |
- | **Building a chess playing agent using DSPy** | [Github](https://medium.com/thoughts-on-machine-learning/building-a-chess-playing-agent-using-dspy-9b87c868f71e) |
- | **Ittia Research Fact Checking** | [Github](https://github.com/ittia-research/check) |
- | **Strategic Debate via Tree-of-Thought** | [Github](https://github.com/zbambergerNLP/strategic-debate-tot) |
- | **Sanskrit to English Translation App**| [Github](https://github.com/ganarajpr/sanskrit-translator-dspy) |
- | **DSPy for extracting features from PDFs on arXiv**| [Github](https://github.com/S1M0N38/dspy-arxiv) |
- | **DSPygen: DSPy in Ruby on Rails**| [Github](https://github.com/seanchatmangpt/dspygen) |
- | **DSPy Inspector**| [Github](https://github.com/Neoxelox/dspy-inspector) |
- | **DSPy with FastAPI**| [Github](https://github.com/diicellman/dspy-rag-fastapi) |
- | **DSPy for Indian Languages**| [Github](https://github.com/saifulhaq95/DSPy-Indic) |
- | **Hurricane: Blog Posts with Generative Feedback Loops!**| [Github](https://github.com/weaviate-tutorials/Hurricane) |
- | **RAG example using DSPy, Gradio, FastAPI, and Ollama**| [Github](https://github.com/diicellman/dspy-gradio-rag) |
- | **Synthetic Data Generation**| [Github](https://colab.research.google.com/drive/1CweVOu0qhTC0yOfW5QkLDRIKuAuWJKEr?usp=sharing) |
- | **Self Discover**| [Github](https://colab.research.google.com/drive/1GkAQKmw1XQgg5UNzzy8OncRe79V6pADB?usp=sharing) |
+ | **DSPy on BIG-Bench Hard Example** | [GitHub](https://drchrislevy.github.io/posts/dspy/dspy.html) |
+ | **Building a chess playing agent using DSPy** | [GitHub](https://medium.com/thoughts-on-machine-learning/building-a-chess-playing-agent-using-dspy-9b87c868f71e) |
+ | **Ittia Research Fact Checking** | [GitHub](https://github.com/ittia-research/check) |
+ | **Strategic Debate via Tree-of-Thought** | [GitHub](https://github.com/zbambergerNLP/strategic-debate-tot) |
+ | **Sanskrit to English Translation App**| [GitHub](https://github.com/ganarajpr/sanskrit-translator-dspy) |
+ | **DSPy for extracting features from PDFs on arXiv**| [GitHub](https://github.com/S1M0N38/dspy-arxiv) |
+ | **DSPygen: DSPy in Ruby on Rails**| [GitHub](https://github.com/seanchatmangpt/dspygen) |
+ | **DSPy Inspector**| [GitHub](https://github.com/Neoxelox/dspy-inspector) |
+ | **DSPy with FastAPI**| [GitHub](https://github.com/diicellman/dspy-rag-fastapi) |
+ | **DSPy for Indian Languages**| [GitHub](https://github.com/saifulhaq95/DSPy-Indic) |
+ | **Hurricane: Blog Posts with Generative Feedback Loops!**| [GitHub](https://github.com/weaviate-tutorials/Hurricane) |
+ | **RAG example using DSPy, Gradio, FastAPI, and Ollama**| [GitHub](https://github.com/diicellman/dspy-gradio-rag) |
+ | **Synthetic Data Generation**| [GitHub](https://colab.research.google.com/drive/1CweVOu0qhTC0yOfW5QkLDRIKuAuWJKEr?usp=sharing) |
+ | **Self Discover**| [GitHub](https://colab.research.google.com/drive/1GkAQKmw1XQgg5UNzzy8OncRe79V6pADB?usp=sharing) |

TODO: This list in particular is highly incomplete. There are a couple dozen other good ones.

4 changes: 2 additions & 2 deletions docs/docs/quick-start/getting-started-01.md
@@ -167,7 +167,7 @@ pred = cot(**example.inputs())
score = metric(example, pred)

print(f"Question: \t {example.question}\n")
print(f"Gold Reponse: \t {example.response}\n")
print(f"Gold Response: \t {example.response}\n")
print(f"Predicted Response: \t {pred.response}\n")
print(f"Semantic F1 Score: {score:.2f}")
```
@@ -176,7 +176,7 @@ print(f"Semantic F1 Score: {score:.2f}")
```
Question: what are high memory and low memory on linux?
Gold Reponse: "High Memory" refers to the application or user space, the memory that user programs can use and which isn't permanently mapped in the kernel's space, while "Low Memory" is the kernel's space, which the kernel can address directly and is permanently mapped.
Gold Response: "High Memory" refers to the application or user space, the memory that user programs can use and which isn't permanently mapped in the kernel's space, while "Low Memory" is the kernel's space, which the kernel can address directly and is permanently mapped.
The user cannot access the Low Memory as it is set aside for the required kernel programs.
Predicted Response: In Linux, "low memory" refers to the memory that is directly accessible by the kernel and user processes, typically the first 4GB on a 32-bit system. "High memory" refers to memory above this limit, which is not directly accessible by the kernel in a 32-bit environment. This distinction is crucial for memory management, particularly in systems with large amounts of RAM, as it influences how memory is allocated and accessed.
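```

Beyond scoring a single example as above, the getting-started flow typically runs the same metric over the whole dev set with `dspy.Evaluate`; a minimal sketch, assuming `cot`, `devset`, and `metric` are defined as earlier in the guide:

```python
import dspy

# Run the metric over every dev-set example in parallel and print a summary.
evaluate = dspy.Evaluate(
    devset=devset,
    metric=metric,
    num_threads=16,
    display_progress=True,
    display_table=5,  # show the first few per-example rows
)
evaluate(cot)
```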